Difference between revisions of "User:Khannatanmai/Secondary tags features"

(New page to document features related to secondary tags)
 
Line 5:

  * Secondary tags (stags) are ignored while pattern matching for rules.
  * Attribute "tags" (in t1x) gets only primary and not secondary tags. (Ensures no regression)
- * "whole" gets the whole LU including secondary tags (could this be a problem?)
+ * "whole" gets the whole LU including secondary tags.
  * New attribute "stags" gets all secondary tags. (can be used in clip).
  * Secondary tags are added in the output LU from the LU that the lem/lemh is clipped from.
- * If the lem/lemh comes from a variable in the output then the stags come from the LU which the lemma comes from, by tracing its variable assignment in <let>.
+ * If the lem/lemh comes from a variable in the output then the stags come from the LU which the lemma comes from, by tracing its variable assignment in <code><let></code>.
  * '''No regression.''' Stream without secondary tags work as-is.

Revision as of 19:04, 12 May 2020

This page lists the features being added to the pipeline to handle secondary tags. To follow updates on the development, see [[ ]]. This work was done as part of Google Summer of Code 2020. Proposal. Progress.

Module-specific features

Chunker (t1x)

  • Secondary tags (stags) are ignored during pattern matching for rules.
  • The "tags" attribute (in t1x) returns only primary tags, not secondary tags, so existing rules behave as before.
  • "whole" returns the whole LU, including secondary tags.
  • The new "stags" attribute returns all secondary tags and can be used in clip.
  • Secondary tags are added to the output LU from the LU that the lem/lemh is clipped from.
  • If the lem/lemh in the output comes from a variable, the stags are taken from the LU that the lemma originally came from, found by tracing the variable's assignment in <let>.
  • No regression: streams without secondary tags work as-is.
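
The difference between "tags", "stags" and "whole" can be illustrated with a small Python model. This is only a sketch and not the apertium-transfer implementation: the function names split_lu and clip are hypothetical, and it assumes that a secondary tag is any tag containing a colon (as in <sf:perros> in the example on this page).

```python
import re

# Assumption: a secondary tag is any tag with a colon inside, e.g. <sf:perros>.
TAG_RE = re.compile(r"<([^<>]+)>")


def split_lu(lu):
    """Split 'lemma<tag>...' into (lemma, primary tags, secondary tags)."""
    lemma = lu.split("<", 1)[0]
    tags = TAG_RE.findall(lu)
    primary = [t for t in tags if ":" not in t]
    secondary = [t for t in tags if ":" in t]
    return lemma, primary, secondary


def clip(lu, part):
    """Toy model of clipping one part of a lexical unit (hypothetical helper)."""
    lemma, primary, secondary = split_lu(lu)
    if part == "lem":
        return lemma
    if part == "tags":
        # Primary tags only, so rules written before stags existed are unaffected.
        return "".join(f"<{t}>" for t in primary)
    if part == "stags":
        return "".join(f"<{t}>" for t in secondary)
    if part == "whole":
        # The whole LU, secondary tags included.
        return lu
    raise ValueError(f"unknown part: {part}")


lu = "perro<n><m><pl><sf:perros>"
print(clip(lu, "tags"))   # <n><m><pl>
print(clip(lu, "stags"))  # <sf:perros>
print(clip(lu, "whole"))  # perro<n><m><pl><sf:perros>
```

For an LU without secondary tags, "stags" is empty and "tags" and "whole" behave exactly as before, which matches the no-regression point above.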

Example usage (here the secondary tags carry the surface form):

Input:

^El<det><def><m><pl><sf:Los>/The<det><def><m><pl><sf:Los>$ ^perro<n><m><pl><sf:perros>/dog<n><m><pl><sf:perros>$ ^de<pr><sf:del>/of<pr><sf:del>/from<pr><sf:del>$ ^el<det><def><m><sg><sf:del>/the<det><def><m><sg><sf:del>$ ^chico<n><m><sg><sf:chico>/boy<n><sg><sf:chico>$ ^correr<vblex><pri><p3><pl><sf:corren>/run<vblex><pri><p3><pl><sf:corren>$ ^rápido<adj><m><sg><sf:rápido>/fast<adj><sint><m><sg><sf:rápido>$ ^.<sent><sf:.>/.<sent><sf:.>$ ^.<sent><sf:.>/.<sent><sf:.>$[][ ]

Output:

^Det_nom<SN><m><pl>{^the<det><def><3><sf:Los>$ ^dog<n><3><sf:perros>$}$ ^de<PREP>{^of<pr><sf:del>$}$ ^det_nom<SN><m><sg>{^the<det><def><3><sf:del>$ ^boy<n><3><sf:chico>$}$ ^verbcj<SV><vblex><pri><p3><pl>{^run<vblex><pres><sf:corren>$}$ ^adj<SA><m><sg>{^fast<adj><sint><sf:rápido>$}$^punt<sent>{^.<sent><sf:.>$}$^punt<sent>{^.<sent><sf:.>$}$[][ ]
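
Because the surface forms survive the chunker as <sf:...> tags, they can be read back from the output. A minimal Python sketch, assuming the sf: prefix used in this example (the helper name surface_forms is hypothetical), applied to the first chunk of the output above:

```python
import re

# Matches secondary tags of the form <sf:...> and captures the surface form.
SF_RE = re.compile(r"<sf:([^<>]*)>")


def surface_forms(stream):
    """Return the surface forms stored in <sf:...> tags, in stream order."""
    return SF_RE.findall(stream)


chunk = "^Det_nom<SN><m><pl>{^the<det><def><3><sf:Los>$ ^dog<n><3><sf:perros>$}$"
print(" ".join(surface_forms(chunk)))  # Los perros
```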