Difference between revisions of "Occitan and French/Work plan"

Latest revision as of 10:44, 9 August 2018

  • Note: French → Occitan is the main direction of the project.
{|class=wikitable
! Week !! Dates !! Description !! Bidix (excluding np), planned !! Coverage (%), planned !! WER (%), planned !! Testvoc !! Evaluation !! Bidix, actual !! Coverage (%), actual !! WER (%), actual !! Errors !! Done?
|-
| 0 || <b>French > Occitan</b> || || ~5 700 || || || || || || || || ||
|-
| 1 || 14 May–20 May || Improving Occitan monodix<br/>Adding prn, pr, cnj*, basic adv to bidix || ~6,000 || ~84,0% || || || || 7643 || 77,1% || || || ½
|-
| 2 || 21 May–27 May || Adding n, adj, adv to the bidix from the French Wiktionary || ~12,000 || ~86,0% || || || || 12811 || 82,2% || || || ½
|-
| 3 || 28 May–3 June || Adding vblex to the bidix from the French Wiktionary<br/>Beginning to add missing words in decreasing order of frequency fra > oci || ~14,000 || ~88.0% || || || || 14452 || 85,1% || || || ½
|-
| 4 || 4 June–10 June || Adding words<br/>Transfer rules fra > oci || ~16,000 || ~89.0% || || || || 16745 || 89,2% || || ||
|-
| 5 || <b>11 June–15 June<br/>Deliverable #1: French to Occitan translator</b> || Adding words<br/>Transfer rules fra > oci || <b>~18,000</b> || <b>~89.5%</b> || <b>~25%</b> || || || 19897 || 91,1% || (WP) 15,0% || ||
|-
| 6 || 18 June–24 June || Adding words<br/>Transfer rules fra > oci<br/>Begin testvoc fra > oci || ~20,000 || ~90.0% || || pr, cnj*, adv, prn, det || || 20581 || 91,5% || (WP) 12,3% || 0 ||
|-
| 7 || 25 June–1 July || Adding words<br/>Transfer rules fra > oci<br/>Testvoc fra > oci || ~21,000 || ~90.5% || || vblex || || 21823 || 91,8% || (Euro- News) 18,0% || 0 ||
|-
| 8 || 2 July–8 July || Adding words<br/>Transfer rules fra > oci<br/>Testvoc fra > oci || ~22,000 || ~91.0% || || adj || || 22609 || 91,9% || || 0 ||
|-
| 9 || <b>9 July–13 July<br/>Deliverable #2: French to Occitan translator</b> || Transfer rules fra > oci<br/>Testvoc fra > oci || <b>~22,000</b> || <b>~91.0%</b> || <b>~15%</b> || n || || 25045 || 92,1% || (WP) 7,2% || 0 ||
|-
| 0 || <b>Occitan > French</b> || || ~22,000 || || || || || || || || ||
|-
| 10 || 16 July–22 July || Adding missing words in decreasing order of frequency oci > fra<br/>Transfer rules oci > fra<br/>Testvoc oci > fra || ~22,500 || ~88.0% || || pr, cnj*, adv, prn, det || || 25161 || 91,7% || fra>oci (Euro- News) 6,6% || 10 ||
|-
| 13 || 23 July–29 July || Adding words<br/>Transfer rules oci > fra<br/>Testvoc oci > fra || ~23,000 || ~89.0% || || n, adj || || 25504 || 92,1% || || 1 ||
|-
| 11 || 30 July–5 August || Adding words<br/>Transfer rules oci > fra<br/>Testvoc oci > fra || ~23,500 || ~90.0% || || vblex || || 26908 || oci>fra 92,9%<br/>fra>oci 92,3% || fra>oci (WP) 10,0% || 0 || ½
|-
| 12* || 6 August–9 August || Final improvements || || || || || || || || || ||
|-
| 12** || <b>10 August–14 August<br/>Deliverable #3: Occitan to French translator</b> || Final evaluation || <b>~23,500</b> || <b>~90.0%</b> || <b>~30%</b> || || || || || || ||
|}

How to calculate the numbers

Errors (compute in apertium-fra-oci/dev)
$ bash dev/testvoc/generation.sh fra-oci | wc -l  # in apertium-oci-fra
$ bash dev/testvoc/generation.sh oci-fra | wc -l  # in apertium-oci-fra
Bidix (compute in apertium-oci-fra)
$ cat apertium-oci-fra.oci-fra.dix | grep '<l' | grep -v '"cog"' | grep -v "oci@" | wc -l
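
To make the Bidix count easier to check, here is a minimal sketch that wraps the same filter chain in a function and runs it on a tiny made-up file. The function name count_bidix_entries and the four sample lines are hypothetical and only for illustration; real bidix entries may look different. The sketch assumes one entry per line and counts the lines that contain `<l` but neither `"cog"` nor `oci@`.

```shell
# Hypothetical helper: same filters as the Bidix command, applied to any file.
count_bidix_entries() {
    # Keep lines with a left side ('<l'), drop lines containing '"cog"' or 'oci@',
    # then count what is left. tr strips the padding some wc versions add.
    grep '<l' "$1" | grep -v '"cog"' | grep -v 'oci@' | wc -l | tr -d ' '
}

# Made-up sample data (an assumption, not taken from the real dictionary).
sample=$(mktemp)
cat > "$sample" <<'EOF'
<e><p><l>ostal<s n="n"/></l><r>maison<s n="n"/></r></p></e>
<e><p><l>Martin<s n="np"/><s n="cog"/></l><r>Martin<s n="np"/><s n="cog"/></r></p></e>
<e a="oci@example"><p><l>can<s n="n"/></l><r>chien<s n="n"/></r></p></e>
<!-- a comment without an entry -->
EOF

result=$(count_bidix_entries "$sample")
echo "$result"
rm -f "$sample"
```

Only the first sample line passes all three filters, so the sketch prints 1. On the real dictionary the function would be called as count_bidix_entries apertium-oci-fra.oci-fra.dix.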
Coverage (compute in apertium-oci-fra)
$ ./coverage_fra_oci.sh
$ ./coverage_oci_fra.sh
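
The contents of the two coverage scripts are not shown here. As a rough sketch only, and assuming coverage means the share of analysed tokens that carry no unknown-word mark (`*` in the Apertium stream format), the percentage could be computed as below. The function name coverage_percent and the sample tokens are hypothetical, and the input is assumed to have one analysed token per line.

```shell
# Hypothetical helper: percentage of lines (tokens) without the '*' unknown mark.
coverage_percent() {
    total=$(wc -l < "$1")            # all tokens, one per line
    known=$(grep -v -c '\*' "$1")    # tokens whose analysis contains no '*'
    # Guard against an empty file, then print the percentage with one decimal.
    LC_ALL=C awk -v k="$known" -v t="$total" \
        'BEGIN { if (t == 0) print "0.0"; else printf "%.1f\n", 100 * k / t }'
}

# Made-up analyser output (an assumption): three known tokens, one unknown.
sample=$(mktemp)
cat > "$sample" <<'EOF'
^la/le<det><def><f><sg>$
^maison/maison<n><f><sg>$
^zorglub/*zorglub$
^est/être<vbser><pri><p3><sg>$
EOF

result=$(coverage_percent "$sample")
echo "$result"
rm -f "$sample"
```

With three of the four sample tokens known, the sketch prints 75.0. The real scripts may tokenise or count differently, so their figures need not match this sketch exactly.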

See also