Difference between revisions of "User:Capsot/Proposicion"
{|class=wikitable
! Week !! Dates !! Description !! Expected bidix<br/>(without np) !! Expected coverage (%) !! Expected WER (%) !! Testvoc
|-
| 0 || <b>French > Occitan</b> || || ~5,700 || || ||
|-
| 1 || 14 May–20 May || Improving the Occitan monodix<br/>Adding prn, pr, cnj*, basic adv to bidix || ~6,000 || ~84.0% || ||
|-
| 2 || 21 May–27 May || Adding n, adj, adv to the bidix from the French Wiktionnaire || ~12,000 || ~86.0% || ||
|-
| 3 || 28 May–3 June || Adding vblex to the bidix from the French Wiktionnaire<br/>Beginning to add missing words in decreasing order of frequency fra > oci || ~14,000 || ~88.0% || ||
|-
| 4 || 4 June–10 June || Adding words<br/>Transfer rules fra > oci || ~16,000 || ~89.0% || ||
|-
| 5 || <b>11 June–15 June<br>Deliverable #1: French to Occitan translator</b> || Adding words<br/>Transfer rules fra > oci || <b>~18,000</b> || <b>~89.5%</b> || <b>~25%</b> ||
|-
| 6 || 18 June–24 June || Adding words<br/>Transfer rules fra > oci<br/>Beginning of testvoc fra > oci || ~20,000 || ~90.0% || || pr, cnj*, adv, prn, det
|-
| 7 || 25 June–1 July || Adding words<br/>Transfer rules fra > oci<br/>Testvoc fra > oci || ~21,000 || ~90.5% || || vblex
|-
| 8 || 2 July–8 July || Adding words<br/>Transfer rules fra > oci<br/>Testvoc fra > oci || ~22,000 || ~91.0% || || adj
|-
| 9 || <b>9 July–13 July<br>Deliverable #2: French to Occitan translator</b> || Transfer rules fra > oci<br/>Testvoc fra > oci || <b>~22,000</b> || <b>~91.0%</b> || <b>~15%</b> || n
|-
| 0 || <b>Occitan > French</b> || || ~22,000 || || ||
|-
| 10 || 16 July–22 July || Adding missing words in decreasing order of frequency oci > fra<br/>Transfer rules oci > fra<br/>Testvoc oci > fra || ~22,500 || ~88.0% || || pr, cnj*, adv, prn, det
|-
| 13 || 23 July–29 July || Adding words<br/>Transfer rules oci > fra<br/>Testvoc oci > fra || ~23,000 || ~89.0% || || n, adj
|-
| 11 || 30 July–5 August || Adding words<br/>Transfer rules oci > fra<br/>Testvoc oci > fra || ~23,500 || ~90.0% || || vblex
|-
| 12* || 6 August–9 August || Final improvements || || || ||
|-
| 12** || <b>10 August–14 August<br>Deliverable #3:<br> Occitan to French translator</b> || Final evaluation || <b>~23,500</b> || <b>~90.0%</b> || <b>~30%</b> ||
|}

== Coding Challenge ==

As I said before, I have already begun studying the files and how everything works. I have worked on the apertium-oci-fra file and made significant changes to it. My potential mentor, Hèctor Alòs, has been a great guide and supervisor. I have learnt much of the syntax and how it works, and we worked through many technical problems together. I then received much-appreciated help from Shardul Chiplunkar (shardulc; धन्यवाद), Jacob Nordfalk (JacobEo), Tino Didriksen and Ilnar Salimzianov (selimcan; Räxmät!) from the Apertium community. I think I now have a decent grasp of many of the commands and much of the syntax, though I expect there is much more to learn!<br>

As mentioned previously, I have worked mostly on the oci-fra file, trying to understand how things work, and then added many words to fill the gaps that the translation of the James and Mary text showed at first. It is not finished yet, but it looks much better, and even though some sentences still cause trouble I am confident it will be completed soon.<br>

Currently, the translator can generate Aranese and standard Occitan translations.<br>

However, most of the problems encountered so far have come from trying to find the best reference forms to use in the bilingual file and sorting all this out; since Occitan has a great deal of dialectal variation, it is difficult to make sure you choose the right word. One of the main objectives will be improving and expanding the monolingual Occitan dictionary so that it can produce texts in standard Occitan without mixing solutions from different Occitan dialects. At the same time, it will have to be flexible enough to accept other varieties and, later on, even to produce translations in different dialects.

===How to compute the figures===

;Errors (run in apertium-fra-oci/dev)

<pre>
$ bash dev/testvoc/generation.sh fra-oci | wc -l   # in apertium-oci-fra
$ bash dev/testvoc/generation.sh oci-fra | wc -l   # in apertium-oci-fra
</pre>

;Bidix (run in apertium-oci-fra)

<pre>
# count entries that have a left side, skipping proper nouns (np)
$ cat apertium-oci-fra.oci-fra.dix | grep '<l' | grep -v '"np"' | wc -l
</pre>
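
To make it clearer what this pipeline counts, here is a minimal sketch run against a tiny made-up bidix fragment. The three sample entries and the function name `count_bidix_entries` are illustrative assumptions, not taken from apertium-oci-fra, and the filter assumes that each entry sits on one line and that proper nouns carry the tag `n="np"`.

```shell
#!/usr/bin/env bash
# Count bidix entries, skipping proper nouns (np).
# Assumption: one <e> entry per line, with np written as n="np".
count_bidix_entries() {
    # keep lines that have a left side, then count those not tagged np
    grep '<l' "$1" | grep -vc '"np"'
}

# Hypothetical three-entry sample, for illustration only.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<e><p><l>ostal<s n="n"/></l><r>maison<s n="n"/></r></p></e>
<e><p><l>Tolosa<s n="np"/></l><r>Toulouse<s n="np"/></r></p></e>
<e><p><l>cantar<s n="vblex"/></l><r>chanter<s n="vblex"/></r></p></e>
EOF

count_bidix_entries "$sample"   # prints 2: the np entry is skipped
```

Using `grep -c` instead of `wc -l` gives a bare number on every platform, which is easier to paste into the plan table.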

;Coverage (run in apertium-oci-fra)

<pre>
# analyse the corpus and write one lexical unit per line
$ cat ../apertium-fra/corpus/corpus_fra_wp100000.txt | apertium -d . fra-oci-morph | sed 's/\$\W*\^/$\n^/g' > /tmp/fra-oci.coverage.txt
# coverage = lines without '*' (known words) / all lines
$ calc `cat /tmp/fra-oci.coverage.txt | grep -v '\*' | wc -l `/`cat /tmp/fra-oci.coverage.txt | wc -l`

$ cat ../apertium-cat/corpus/corpus_oci_wp100000.txt | apertium -d . oci-fra-morph | sed 's/\$\W*\^/$\n^/g' > /tmp/oci-fra.coverage.txt
$ calc `cat /tmp/oci-fra.coverage.txt | grep -v '\*' | wc -l `/`cat /tmp/oci-fra.coverage.txt | wc -l`
</pre>
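
The same ratio can be computed without `calc`. The following is only a sketch under stated assumptions: the function name `coverage_pct` and the four sample lines are hypothetical, and the input is assumed to be a file like `/tmp/fra-oci.coverage.txt` above, with one lexical unit per line and unknown words marked with `*`. It prints a percentage, to match the coverage column of the work plan, rather than the raw fraction that `calc` returns.

```shell
#!/usr/bin/env bash
# Coverage = lines without '*' (known words) / all lines, as a percentage.
coverage_pct() {
    # LC_ALL=C keeps the decimal point a '.' regardless of locale
    LC_ALL=C awk '
        !/\*/ { known++ }                 # line has no unknown-word mark
        END {
            if (NR == 0) { print "0.0"; exit }
            printf "%.1f\n", 100 * known / NR
        }
    ' "$1"
}

# Hypothetical analyser output, for illustration only.
sample=$(mktemp)
cat > "$sample" <<'EOF'
^la/le<det><def><f><sg>$
^maison/maison<n><f><sg>$
^xyzzy/*xyzzy$
^est/être<vbser><pri><p3><sg>$
EOF

coverage_pct "$sample"   # prints 75.0 (3 of 4 lines are known)
```

In practice the argument would be `/tmp/fra-oci.coverage.txt` or `/tmp/oci-fra.coverage.txt` from the commands above.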
Latest revision as of 20:54, 25 March 2018
Possible mentor: Hèctor Alòs
Contents
- 1 Skills and experience
- 2 Where does this interest in machine translation come from?
- 3 Why are you interested in Apertium?
- 4 Which of the published tasks are you interested in? What do you plan to do?
- 5 Reasons why Google and Apertium should sponsor you
- 6 A description of how and who it will benefit in society
- 7 Work plan
- 8 Coding Challenge
Skills and experience
Where does this interest in machine translation come from?
Why are you interested in Apertium?
Which of the published tasks are you interested in? What do you plan to do?
Reasons why Google and Apertium should sponsor you
A description of how and who it will benefit in society
Work plan
- Note: The French → Occitan part of the project is the main direction.