Occitan and French/Work plan
		
		
		
		
		
		
Latest revision as of 10:44, 9 August 2018
- Note: French → Occitan is the main direction of the project.
 
| Week | Dates | Description | Planned bidix (excl. np) | Planned coverage (%) | Planned WER (%) | Testvoc | Evaluation | Actual bidix | Actual coverage (%) | WER (%) | Err. | Done? |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | French > Occitan | | ~5,700 | | | | | | | | | |
| 1 | 14 May–20 May | Improving Occitan monodix<br/>Adding prn, pr, cnj*, basic adv to bidix | ~6,000 | ~84.0% | | | | 7643 | 77.1% | | | ½ |
| 2 | 21 May–27 May | Adding n, adj, adv to the bidix from the French Wiktionary | ~12,000 | ~86.0% | | | | 12811 | 82.2% | | | ½ |
| 3 | 28 May–3 June | Adding vblex to the bidix from the French Wiktionary<br/>Beginning to add missing words in decreasing order of frequency fra > oci | ~14,000 | ~88.0% | | | | 14452 | 85.1% | | | ½ |
| 4 | 4 June–10 June | Adding words<br/>Transfer rules fra > oci | ~16,000 | ~89.0% | | | | 16745 | 89.2% | | | ✓ |
| 5 | 11 June–15 June<br/>Deliverable #1: French to Occitan translator | Adding words<br/>Transfer rules fra > oci | ~18,000 | ~89.5% | ~25% | | | 19897 | 91.1% | (WP) 15.0% | | ✓ |
| 6 | 18 June–24 June | Adding words<br/>Transfer rules fra > oci<br/>Begin testvoc fra > oci | ~20,000 | ~90.0% | | pr, cnj*, adv, prn, det | | 20581 | 91.5% | (WP) 12.3% | 0 | ✓ |
| 7 | 25 June–1 July | Adding words<br/>Transfer rules fra > oci<br/>Testvoc fra > oci | ~21,000 | ~90.5% | | vblex | | 21823 | 91.8% | (Euronews) 18.0% | 0 | ✓ |
| 8 | 2 July–8 July | Adding words<br/>Transfer rules fra > oci<br/>Testvoc fra > oci | ~22,000 | ~91.0% | | adj | | 22609 | 91.9% | | 0 | ✓ |
| 9 | 9 July–13 July<br/>Deliverable #2: French to Occitan translator | Transfer rules fra > oci<br/>Testvoc fra > oci | ~22,000 | ~91.0% | ~15% | n | | 25045 | 92.1% | (WP) 7.2% | 0 | ✓ |
| 0 | Occitan > French | | ~22,000 | | | | | | | | | |
| 10 | 16 July–22 July | Adding missing words in decreasing order of frequency oci > fra<br/>Transfer rules oci > fra<br/>Testvoc oci > fra | ~22,500 | ~88.0% | | pr, cnj*, adv, prn, det | | 25161 | 91.7% | fra>oci (Euronews) 6.6% | 10 | ✓ |
| 13 | 23 July–29 July | Adding words<br/>Transfer rules oci > fra<br/>Testvoc oci > fra | ~23,000 | ~89.0% | | n, adj | | 25504 | 92.1% | | 1 | ✓ |
| 11 | 30 July–5 August | Adding words<br/>Transfer rules oci > fra<br/>Testvoc oci > fra | ~23,500 | ~90.0% | | vblex | | 26908 | oci>fra 92.9%<br/>fra>oci 92.3% | fra>oci (WP) 10.0% | 0 | ½ |
| 12* | 6 August–9 August | Final improvements | | | | | | | | | | |
| 12** | 10 August–14 August<br/>Deliverable #3: Occitan to French translator | Final evaluation | ~23,500 | ~90.0% | ~30% | | | | | | | |
How to compute the numbers
- Errors (compute in apertium-fra-oci/dev)
 
$ bash dev/testvoc/generation.sh fra-oci | wc -l # in apertium-oci-fra
$ bash dev/testvoc/generation.sh oci-fra | wc -l # in apertium-oci-fra
- Bidix (compute in apertium-oci-fra)
 
$ cat apertium-oci-fra.oci-fra.dix | grep '<l' | grep -v '"cog"' | grep -v "oci@" | wc -l
- Coverage (compute in apertium-oci-fra)
 
$ ./coverage_fra_oci.sh
$ ./coverage_oci_fra.sh
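
The coverage scripts themselves are not reproduced on this page. As a rough illustration of the kind of figure they report, the sketch below computes naive coverage: the percentage of lexical units the analyser recognised. The function name `coverage` is hypothetical and not part of the repository. It assumes the morphological analysis arrives on standard input with one lexical unit per line, and that unknown words carry an asterisk (for example `^xyzzy/*xyzzy$`).

```shell
#!/usr/bin/env bash
# Hypothetical helper, NOT one of the repository scripts.
# Reads Apertium morphological analysis from stdin, one lexical unit
# per line (assumption), and prints the percentage of units that were
# recognised, i.e. that do not contain the unknown-word mark '*'.
coverage() {
  LC_ALL=C awk '
    /\^/ {                       # count only lines holding a lexical unit
      total++
      if ($0 !~ /\*/) known++    # no asterisk: the analyser knew the word
    }
    END {
      if (total == 0) { print "0.0"; exit }
      printf "%.1f\n", 100 * known / total
    }
  '
}
```

A possible way to use it, assuming the analyser output is first split to one unit per line (for instance with GNU sed, `sed 's/\$\W*\^/$\n^/g'`), is to pipe the result of `apertium -d . fra-oci-morph` through that `sed` command and then into `coverage`. The figure is token-based, so frequent words weigh more than rare ones.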