Difference between revisions of "User:Junzay/Blank handling"

After reformatter:
 
<pre><p><u>Tarro</u> <i>the</i> té <i>de hombre de correo </i> <div><i>Justo coger la gracias prpers tener </i><u> poder no prpers</u></div> </pre>
 
==TODO==

* Fill in the "What works currently" section
* Makefile for deformatter/reformatter code
* Merge tests from https://github.com/junaidiiith/Apertium master into blank-handling, and write tests for transfer/interchunk using the structure of the pretransfer tests
* Fix <code><nowiki>[1]</nowiki></code> being printed twice and <code><nowiki>[4]</nowiki></code> not at all when testing interchunk with apertium-nno-nob:
<pre>$ git log -1
commit 6ec869c012b2965f619e0a0532b8ca4cdf335d18
Author: junaidiiith <junaid695683@gmail.com>
Date: Sun Jul 31 17:51:54 2016 +0530

Transfer and interchunk updated

$ make --quiet
Making all in apertium

$ echo '[1]^gen-prep<pr>{^til<pr>$}$ [3]^n<n><m><sg><def><gen>{^bil<n><m><sg><def>$}$[4]^n<n><nt><sg><ind>{[{2}]^problem<n><nt><sg><ind>$}$[]' \
| apertium/apertium-interchunk /l/n/apertium-nno-nob.nob-nno.t2x /l/n/nob-nno.t2x.bin 2>/dev/null
[1]^n<n><nt><sg><def>{[{2}]^problem<n><nt><sg><ind>$}$[1]^gen-prep<pr>{^til<pr>$}$ [3]^n<n><m><sg><def><gen>{^bil<n><m><sg><def>$}$[]
</pre>
 
* When transfer etc. is working: clean up and merge in changes from SVN
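
The duplicated and missing blanks in the interchunk item above are the kind of regression a test could catch automatically. The following is a minimal Python sketch of such a check, not part of the Apertium code base: the helper name blank_diff is hypothetical, and it assumes that every numbered blank in the input (for example [1]) should appear exactly once in the output, while inline blanks such as [{2}] and the empty [] are ignored.

```python
import re
from collections import Counter

# Matches numbered blanks such as [1]; inline blanks like [{2}] and the
# empty blank [] do not match because they contain no bare digits.
NUMBERED_BLANK = re.compile(r"\[(\d+)\]")


def blank_diff(before: str, after: str):
    """Compare numbered blanks in two stream strings.

    Returns (missing, extra): Counters of blanks that were lost or
    duplicated between `before` and `after`. Hypothetical helper.
    """
    seen_before = Counter(NUMBERED_BLANK.findall(before))
    seen_after = Counter(NUMBERED_BLANK.findall(after))
    # Counter subtraction keeps only positive counts.
    return seen_before - seen_after, seen_after - seen_before


# Input and output copied from the interchunk session in the TODO list.
interchunk_in = (
    "[1]^gen-prep<pr>{^til<pr>$}$ [3]^n<n><m><sg><def><gen>{^bil<n><m><sg><def>$}$"
    "[4]^n<n><nt><sg><ind>{[{2}]^problem<n><nt><sg><ind>$}$[]"
)
interchunk_out = (
    "[1]^n<n><nt><sg><def>{[{2}]^problem<n><nt><sg><ind>$}$"
    "[1]^gen-prep<pr>{^til<pr>$}$ [3]^n<n><m><sg><def><gen>{^bil<n><m><sg><def>$}$[]"
)

missing, extra = blank_diff(interchunk_in, interchunk_out)
print("missing:", dict(missing))  # missing: {'4': 1}
print("extra:", dict(extra))      # extra: {'1': 1}
```

A test for transfer or interchunk could assert that both counters are empty. Whether blanks must also keep their relative order is a separate question that this sketch does not check.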
 
   
 
==See also==
 

Revision as of 22:15, 12 August 2016

GSoC 2016 project

Code at https://github.com/junaidiiith/Apertium / https://github.com/junaidiiith/Apertium_Code

What works currently

The deformatter and the reformatter work, although more testing is still needed. The fst processor distributes the tags to the words efficiently and correctly. Pretransfer works and its testing phase is complete. Transfer, interchunk and postchunk are implemented but still need more testing. This is how the chain works as of now:

Before deformatter:

<p><i>Post man</i> tea <u>pot</u> <div><i>Just see the point she's got it</i><u> I couldn't do it</u></div>

After deformatter:

[2][{5}]Post man[] tea [{11}] pot[3][{8}] Just see the point she's got it[][{9}] I couldn't do it [4]

After pretransfer:

[2][{5}]^Post<n><sg>$ [{5}]^man<n><sg>$[] ^tea<n><sg>$  [{11}]^pot<n><sg>$[3] [{8}]^Just<adv>$ [{8}]^see# the point<vblex><inf>$ [{8}]^prpers<prn><subj><p3><f><sg>$ [{8}]^have got<vblex><pri><p3><sg>$ [{8}]^prpers<prn><obj><p3><nt><sg>$[]  [{9}]^prpers<prn><subj><p1><mf><sg>$ [{9}]^can<vaux><past>$ [{9}]^not<adv>$ [{9}]^do<vbdo><pres>$ [{9}]^prpers<prn><subj><p3><nt><sg>$[4] 
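
Comparing the deformatter output with the pretransfer output shows the key idea of the blank handling: an inline blank such as [{5}] that covers several words is repeated in front of each of them. The Python sketch below only illustrates that idea and is not the project's implementation. The function name distribute_inline_blanks is hypothetical, and it assumes that words are separated by whitespace and that an inline blank covers the text up to the next opening square bracket. The real pipeline works on lexical units, so a multiword such as "see# the point" receives a single blank.

```python
import re

# An inline blank like [{5}] followed by the text it covers, i.e.
# everything up to the next "[" (assumption made for this sketch).
INLINE_SPAN = re.compile(r"\[\{(\d+)\}\]([^\[]*)")


def distribute_inline_blanks(stream: str) -> str:
    """Repeat each inline blank in front of every word it covers."""

    def expand(match: re.Match) -> str:
        blank = "[{" + match.group(1) + "}]"
        covered = match.group(2)
        # Prefix every whitespace-separated word; spacing is left untouched.
        return re.sub(r"\S+", lambda word: blank + word.group(0), covered)

    return INLINE_SPAN.sub(expand, stream)


# Start of the deformatter output shown above.
deformatted = "[2][{5}]Post man[] tea [{11}] pot[3]"
print(distribute_inline_blanks(deformatted))
# [2][{5}]Post [{5}]man[] tea  [{11}]pot[3]
```

Numbered blanks such as [2] and [3] and the empty blank [] are left where they are; only the inline blanks are copied onto the words.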

After transfer:

[2]^Nom_pr_nom_pr_nom_pr_nom<SN><UNDET><m><sg>{[{11}]^tarro<n><3><4>$ [{5}]^de<pr>$[] ^té<n><m><sg>$  [{5}]^de<pr>$ [{5}]^hombre<n><m><sg>$ [{5}]^de<pr>$ [{5}]^correo<n><m><sg>$}$[3] ^adv<adv>{[{8}]^Justo<adv>$[]}$ ^inf<SV><vblex><inf><PD><ND>{[{8}]^coger<vblex><3># la gracia$[]}$ ^prnsubj<SN><tn><p3><f><sg>{[{8}]^prpers<prn><2><p3><4><sg>$[]}$ ^pro_verbcj<SV><vblex><pri><p3><sg>{[{8}]^prpers<prn><pro><p3><m><sg>$ [{8}]^tener<vblex><3><4><5>$[]}$  ^prnsubj<SN><tn><p1><mf><sg>{[{9}]^prpers<prn><2><p1><4><sg>$[]}$ ^mod<SV><vbmod><cni><PD><ND>{[{9}]^poder<vbmod><3><4><5>$[]}$  ^adv<adv><NEG>{[{9}]^no<adv>$[]}$   ^prnsubj<SN><tn><p3><m><sg>{[{9}]^prpers<prn><2><p3><4><sg>$}$ [4]

After interchunk:

[2]^Nom_pr_nom_pr_nom_pr_nom<SN><UNDET><m><sg>{[{11}]^tarro<n><3><4>$ [{5}]^de<pr>$[] ^té<n><m><sg>$  [{5}]^de<pr>$ [{5}]^hombre<n><m><sg>$ [{5}]^de<pr>$ [{5}]^correo<n><m><sg>$}$ [3] ^adv<adv> {[{8}]^Justo<adv>$[]}$  ^inf<SV><vblex><pri><p3><sg>{[{8}]^coger<vblex><3># la gracia$[]}$  ^pro_verbcj<SV><vblex><prs><p3><sg>{[{8}]^prpers<prn><pro><p3><m><sg>$ [{8}]^tener<vblex><3><4><5>$[]}$  ^mod<SV><vbmod><cni><p1><sg>{[{9}]^poder<vbmod><3><4><5>$[]}$  ^adv<adv><NEG>{[{9}]^no<adv>$[]}$   ^prnsubj<SN><tn><p3><m><sg>{[{9}]^prpers<prn><2><p3><4><sg>$}$ [4]

After postchunk:

[2][{11}]^Tarro<n><m><sg>$ [{5}]^de<pr>$ ^té<n><m><sg>$   [{5}]^de<pr>$ [{5}]^hombre<n><m><sg>$ [{5}]^de<pr>$ [{5}]^correo<n><m><sg>$ [3][{8}]^Justo<adv>$ [{8}]^coger<vblex><pri><p3><sg># la gracia$  [{8}]^prpers<prn><pro><p3><m><sg>$ [{8}]^tener<vblex><prs><p3><sg>$[] [{9}]^poder<vbmod><cni><p1><sg>$  [{9}]^no<adv>$[{9}]^prpers<prn><tn><p3><m><sg>$ [4]

After reformatter:

<p><u>Tarro</u> <i>the</i> té <i>de hombre de correo </i> <div><i>Justo coger la gracias prpers tener </i><u> poder no prpers</u></div> 

See also