User:Junzay/Blank handling
GSoC 2016 project
Code at https://github.com/junaidiiith/Apertium / https://github.com/junaidiiith/Apertium_Code
What works currently
The deformatter and the reformatter work for now, though more testing is still needed. The fst processor distributes the tags to the words correctly and efficiently. Pretransfer works and its testing phase is complete. Transfer, interchunk and postchunk are implemented, but still need more testing. This is how the chain works as of now:
Before deformatter:
<p><i>Hello man</i> tea <u>pot</u> <div><i>Just see the point she's got it</i><u> I couldn't do it</u></div>
After deformatter:
[2][{5}]Post man[] tea [{11}] pot[3][{8}] Just see the point she's got it[][{9}] I couldn't do it [4]
After pretransfer:
[2][{5}]^Post<n><sg>$ [{5}]^man<n><sg>$[] ^tea<n><sg>$ [{11}]^pot<n><sg>$[3] [{8}]^Just<adv>$ [{8}]^see# the point<vblex><inf>$ [{8}]^prpers<prn><subj><p3><f><sg>$ [{8}]^have got<vblex><pri><p3><sg>$ [{8}]^prpers<prn><obj><p3><nt><sg>$[] [{9}]^prpers<prn><subj><p1><mf><sg>$ [{9}]^can<vaux><past>$ [{9}]^not<adv>$ [{9}]^do<vbdo><pres>$ [{9}]^prpers<prn><subj><p3><nt><sg>$[4]
After transfer:
[2]^Nom_pr_nom_pr_nom_pr_nom<SN><UNDET><m><sg>{[{11}]^tarro<n><3><4>$ [{5}]^de<pr>$[] ^té<n><m><sg>$ [{5}]^de<pr>$ [{5}]^hombre<n><m><sg>$ [{5}]^de<pr>$ [{5}]^correo<n><m><sg>$}$[3] ^adv<adv>{[{8}]^Justo<adv>$[]}$ ^inf<SV><vblex><inf><PD><ND>{[{8}]^coger<vblex><3># la gracia$[]}$ ^prnsubj<SN><tn><p3><f><sg>{[{8}]^prpers<prn><2><p3><4><sg>$[]}$ ^pro_verbcj<SV><vblex><pri><p3><sg>{[{8}]^prpers<prn><pro><p3><m><sg>$ [{8}]^tener<vblex><3><4><5>$[]}$ ^prnsubj<SN><tn><p1><mf><sg>{[{9}]^prpers<prn><2><p1><4><sg>$[]}$ ^mod<SV><vbmod><cni><PD><ND>{[{9}]^poder<vbmod><3><4><5>$[]}$ ^adv<adv><NEG>{[{9}]^no<adv>$[]}$ ^prnsubj<SN><tn><p3><m><sg>{[{9}]^prpers<prn><2><p3><4><sg>$}$ [4]
After interchunk:
[2]^Nom_pr_nom_pr_nom_pr_nom<SN><UNDET><m><sg>{[{11}]^tarro<n><3><4>$ [{5}]^de<pr>$[] ^té<n><m><sg>$ [{5}]^de<pr>$ [{5}]^hombre<n><m><sg>$ [{5}]^de<pr>$ [{5}]^correo<n><m><sg>$}$ [3] ^adv<adv> {[{8}]^Justo<adv>$[]}$ ^inf<SV><vblex><pri><p3><sg>{[{8}]^coger<vblex><3># la gracia$[]}$ ^pro_verbcj<SV><vblex><prs><p3><sg>{[{8}]^prpers<prn><pro><p3><m><sg>$ [{8}]^tener<vblex><3><4><5>$[]}$ ^mod<SV><vbmod><cni><p1><sg>{[{9}]^poder<vbmod><3><4><5>$[]}$ ^adv<adv><NEG>{[{9}]^no<adv>$[]}$ ^prnsubj<SN><tn><p3><m><sg>{[{9}]^prpers<prn><2><p3><4><sg>$}$ [4]
After postchunk:
[2][{11}]^Tarro<n><m><sg>$ [{5}]^de<pr>$ ^té<n><m><sg>$ [{5}]^de<pr>$ [{5}]^hombre<n><m><sg>$ [{5}]^de<pr>$ [{5}]^correo<n><m><sg>$ [3][{8}]^Justo<adv>$ [{8}]^coger<vblex><pri><p3><sg># la gracia$ [{8}]^prpers<prn><pro><p3><m><sg>$ [{8}]^tener<vblex><prs><p3><sg>$[] [{9}]^poder<vbmod><cni><p1><sg>$ [{9}]^no<adv>$[{9}]^prpers<prn><tn><p3><m><sg>$ [4]
After reformatter:
<p><u>Tarro</u> <i>the</i> té <i>de hombre de correo </i> <div><i>Justo coger la gracias prpers tener </i><u> poder no prpers</u></div>
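The step where an inline tag such as [{5}] is copied onto every word it covers can be illustrated with a small sketch. This is not the code from the repositories above, only a minimal Python illustration under some assumptions: words are split on whitespace (no morphological analysis), an empty [] closes the active inline tag, and a numbered block tag such as [3] is also treated as closing it. The names distribute_inline_tags and render are hypothetical.

```python
import re

# One token is either an inline tag [{n}], a block tag [n] or [], or a word.
TOKEN = re.compile(r"\[\{(\d+)\}\]|\[(\d*)\]|([^\s\[\]]+)")

def distribute_inline_tags(stream):
    """Return (inline_tag_id, word) pairs for a deformatted stream.

    Assumption: [] and [n] both end the currently active inline tag.
    """
    active = None
    pairs = []
    for match in TOKEN.finditer(stream):
        inline_id, _block, word = match.groups()
        if inline_id is not None:
            active = inline_id            # [{n}] opens a new inline tag
        elif word is not None:
            pairs.append((active, word))  # attach the active tag to the word
        else:
            active = None                 # [] or [n] closes the inline tag
    return pairs

def render(pairs):
    """Write the pairs back with the tag repeated in front of each word."""
    return " ".join(f"[{{{tag}}}]{word}" if tag else word for tag, word in pairs)

if __name__ == "__main__":
    sample = "[2][{5}]Post man[] tea [{11}] pot[3]"
    print(render(distribute_inline_tags(sample)))
```

Repeating the tag on each word, as in the pretransfer output above, lets later stages reorder or drop words without losing track of which formatting each word belongs to.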