Update README.md
README.md
CHANGED
@@ -1,15 +1,13 @@
 ---
-language:
-- fr
 widget:
-- text: "generate question: Barack Hussein Obama, né le 4 aout 1961, est un homme politique américain et avocat. Il a été élu
 - text: "question: Quand Barack Obama a t'il été élu président? context: Barack Hussein Obama, né le 4 aout 1961, est un homme politique américain et avocat. Il a été élu en 2009 pour devenir le 44ème président des Etats-Unis d'Amérique. </s>"
 tags:
 - pytorch
 - t5
 - question-generation
 - seq2seq
-license:
 datasets:
 - fquad
 - piaf
@@ -19,21 +17,18 @@ datasets:

 ## Model description

-This model is a T5 Transformers model (airklizz/t5-base-multi-fr-wiki-news) that was fine-tuned in french on 3 different tasks
-
-- question answering
-- answer extraction
-It obtains quite good results on FQuAD validation dataset.

-

-

-

-

-

 ```python
 from transformers import T5ForConditionalGeneration, T5Tokenizer
@@ -45,21 +40,25 @@ tokenizer = T5Tokenizer.from_pretrained("JDBN/t5-base-fr-qg-fquad")

 The initial model used was https://huggingface.co/airKlizz/t5-base-multi-fr-wiki-news. This model was finetuned on a dataset composed of FQuAD and PIAF on the 3 tasks mentioned previously.

-The data were preprocessed like this
-
-
-

 The preprocessing we used was implemented in https://github.com/patil-suraj/question_generation

 ## Eval results

-On FQuAD validation set
 | BLEU_1 | BLEU_2 | BLEU_3 | BLEU_4 | METEOR | ROUGE_L | CIDEr |
 |--------|--------|--------|--------|--------|---------|-------|
 | 0.290 | 0.203 | 0.149 | 0.111 | 0.197 | 0.284 | 1.038 |

-Question Answering metrics
 For these metrics, the performance of this question answering model (https://huggingface.co/illuin/camembert-base-fquad) on the original FQuAD questions and on the T5-generated questions is compared.

 | Questions | Exact Match | F1 Score |
@@ -95,3 +94,5 @@ howpublished={\url{https://github.com/patil-suraj/question_generation}}
 primaryClass={cs.CL}
 }
 ```
@@ -1,15 +1,13 @@
 ---
+language: fr
 widget:
+- text: "generate question: Barack Hussein Obama, né le 4 aout 1961, est un homme politique américain et avocat. Il a été élu <hl> en 2009 <hl> pour devenir le 44ème président des Etats-Unis d'Amérique. </s>"
 - text: "question: Quand Barack Obama a t'il été élu président? context: Barack Hussein Obama, né le 4 aout 1961, est un homme politique américain et avocat. Il a été élu en 2009 pour devenir le 44ème président des Etats-Unis d'Amérique. </s>"
 tags:
 - pytorch
 - t5
 - question-generation
 - seq2seq
 datasets:
 - fquad
 - piaf
@@ -19,21 +17,18 @@ datasets:

 ## Model description

+This model is a T5 Transformers model (airklizz/t5-base-multi-fr-wiki-news) that was fine-tuned in French on 3 different tasks
+* question generation

+* question answering

+* answer extraction

+It obtains quite good results on FQuAD validation dataset.

+## Intended uses & limitations

+This model functions for the 3 tasks mentioned earlier and was not tested on other tasks.

 ```python
 from transformers import T5ForConditionalGeneration, T5Tokenizer
@@ -45,21 +40,25 @@ tokenizer = T5Tokenizer.from_pretrained("JDBN/t5-base-fr-qg-fquad")

 The initial model used was https://huggingface.co/airKlizz/t5-base-multi-fr-wiki-news. This model was finetuned on a dataset composed of FQuAD and PIAF on the 3 tasks mentioned previously.

+The data were preprocessed like this
+* question generation: "generate question: Barack Hussein Obama, né le 4 aout 1961, est un homme politique américain et avocat. Il a été élu <hl> en 2009 <hl> pour devenir le 44ème président des Etats-Unis d'Amérique."
+
+* question answering: "question: Quand Barack Hussein Obama a-t-il été élu président des Etats-Unis d’Amérique? context: Barack Hussein Obama, né le 4 aout 1961, est un homme politique américain et avocat. Il a été élu en 2009 pour devenir le 44ème président des Etats-Unis d’Amérique."
+
+* answer extraction: "extract_answers: Barack Hussein Obama, né le 4 aout 1961, est un homme politique américain et avocat. <hl> Il a été élu en 2009 pour devenir le 44ème président des Etats-Unis d’Amérique <hl>."

 The preprocessing we used was implemented in https://github.com/patil-suraj/question_generation
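
The three input formats listed in the preprocessing examples can be rebuilt with a few lines of string handling. The sketch below is only an illustration: the helper names (`highlight`, `build_qg_input`, `build_qa_input`, `build_ae_input`) are hypothetical and are not taken from the linked repository, and it assumes that the answer or sentence to highlight occurs verbatim in the context.

```python
HL = "<hl>"  # highlight token used in the README examples


def highlight(context: str, span: str) -> str:
    """Wrap the first occurrence of `span` in `context` with <hl> tokens."""
    start = context.find(span)
    if start < 0:
        raise ValueError("span not found in context")
    end = start + len(span)
    return f"{context[:start]}{HL} {span} {HL}{context[end:]}"


def build_qg_input(context: str, answer: str) -> str:
    """Question generation: highlight the answer inside the context."""
    return "generate question: " + highlight(context, answer)


def build_qa_input(question: str, context: str) -> str:
    """Question answering: question first, then the context."""
    return f"question: {question} context: {context}"


def build_ae_input(context: str, sentence: str) -> str:
    """Answer extraction: highlight the sentence to extract answers from."""
    return "extract_answers: " + highlight(context, sentence)


if __name__ == "__main__":
    ctx = "Il a été élu en 2009 pour devenir le 44ème président."
    print(build_qg_input(ctx, "en 2009"))
    print(build_qa_input("Quand a-t-il été élu ?", ctx))
```

The resulting strings would then be tokenized and passed to the model's `generate` method, as in the usage snippet of the README.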

 ## Eval results

+#### On FQuAD validation set
+
 | BLEU_1 | BLEU_2 | BLEU_3 | BLEU_4 | METEOR | ROUGE_L | CIDEr |
 |--------|--------|--------|--------|--------|---------|-------|
 | 0.290 | 0.203 | 0.149 | 0.111 | 0.197 | 0.284 | 1.038 |

+#### Question Answering metrics
+
 For these metrics, the performance of this question answering model (https://huggingface.co/illuin/camembert-base-fquad) on the original FQuAD questions and on the T5-generated questions is compared.

 | Questions | Exact Match | F1 Score |
@@ -95,3 +94,5 @@ howpublished={\url{https://github.com/patil-suraj/question_generation}}
 primaryClass={cs.CL}
 }
 ```
+
+
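
The README does not say which script produced the Exact Match and F1 Score columns of the question answering comparison. As a rough sketch, assuming the usual SQuAD-style definitions, the two metrics could be computed per question as below. The normalization (lowercasing, ASCII punctuation removal, whitespace collapsing) is an assumption, and the function names are hypothetical.

```python
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop ASCII punctuation, and collapse whitespace (assumed normalization)."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

Averaging these per-question scores over a validation set gives corpus-level figures. Comparing the averages obtained with the original questions and with the generated ones is the kind of comparison the table describes.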