Difference between revisions of "Weights in the pipeline"


Revision as of 14:22, 22 June 2020

In some cases we want to be able to pass "weights" along in the pipeline.

What are weights? They could be probabilities, scores, lambdas (feature weights), or something else, but we probably want to define what they are.

morph:

^Emplea/emplear<vblex><pri><p3><sg><0.9424>/emplear<vblex><imp><p2><sg><0.2323>$ ^a/a<pr><0.9934>/a<n><m><sg><0.0123>$ ^un/uno<det><ind><m><sg>$ ^70%/70%<num><0.9999>$ ^del/de<pr>+el<det><def><m><sg>$ ^total/total<adj><mf><sg>/total<n><m><sg>$ ^de/de<pr>$ ^asalariados/asalariado<adj><m><pl>/asalariado<n><m><pl>$^./.<sent>$^./.<sent>$
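To make the notation concrete, here is a minimal Python sketch of how a consumer might read the weights out of one lexical unit in the example above. It assumes that a weight is written as a final tag containing a decimal number (as in `<0.9424>`) and that an analysis without such a tag simply has no weight; the function names are made up for illustration, and escaped characters in the stream format are not handled.

```python
import re

# Assumption: a weight is the last tag of an analysis and holds a decimal number.
WEIGHT_TAG = re.compile(r"<(\d+(?:\.\d+)?)>$")


def split_analysis(analysis):
    """Return (analysis without its weight tag, weight or None)."""
    match = WEIGHT_TAG.search(analysis)
    if match is None:
        return analysis, None
    return analysis[:match.start()], float(match.group(1))


def parse_lu(lu):
    """Split '^surface/analysis1/analysis2$' into surface form and weighted analyses.

    Hypothetical helper: it does not handle escaped '/' or '^' characters.
    """
    surface, *analyses = lu.strip("^$").split("/")
    return surface, [split_analysis(a) for a in analyses]


surface, analyses = parse_lu("^a/a<pr><0.9934>/a<n><m><sg><0.0123>$")
for reading, weight in analyses:
    print(surface, reading, weight)
```

Analyses that carry no weight, such as `de<pr>+el<det><def><m><sg>`, come back with `None`, so a later module can tell "no weight given" apart from a low weight.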

tagger:

biltrans:

^Emplear<vblex><pri><p3><sg><0.9424>/Emprar<vblex><pri><p3><sg><0.9424><0.7343>/Ocupar<vblex><pri><p3><sg><0.9424><0.3204>$ ^a<pr>/a<pr><0.8930>/<0.0324>/de<pr><0.2342>$ ^uno<det><ind><m><sg>/un<det><ind><m><sg>$ ^70%<num>/70%<num>$ ^de<pr>/de<pr>$ ^el<det><def><m><sg>/el<det><def><m><sg>$ ^total<n><m><sg>/total<n><m><sg>$ ^de<pr>/de<pr>$ ^asalariado<n><m><pl>/assalariat<n><m><pl>$^.<sent>/.<sent>$^.<sent>/.<sent>$

lexsel:


transfer:

Should we maintain the weights of previous modules throughout the pipe?

  • Pros:
    • If we do this, we can output a confidence for the translation. This could be exposed through the web interface (as GF does with colour coding).
  • Cons:
    • There could be many weights.

Maybe transfer could combine them and output a single one.
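One way this could work is sketched below in Python. It assumes the weights are probabilities from independent modules, so that multiplying them is meaningful; that is only one possible definition, and the page has not settled on it. The helper names are hypothetical, and the reading used is the `Emprar` one from the biltrans example, which carries a morph weight followed by a bilingual-dictionary weight.

```python
import math
import re

# Assumption: weights are the run of numeric tags at the end of a reading.
TRAILING_WEIGHTS = re.compile(r"(?:<\d+(?:\.\d+)?>)+$")
ONE_WEIGHT = re.compile(r"<(\d+(?:\.\d+)?)>")


def extract_weights(reading):
    """Return (reading without weight tags, list of weights in order)."""
    match = TRAILING_WEIGHTS.search(reading)
    if match is None:
        return reading, []
    weights = [float(w) for w in ONE_WEIGHT.findall(match.group(0))]
    return reading[:match.start()], weights


def collapse_weights(reading):
    """Replace all trailing weights with their product (independence assumed)."""
    bare, weights = extract_weights(reading)
    if not weights:
        return reading  # nothing to combine, leave the reading untouched
    combined = math.prod(weights, start=1.0)
    return f"{bare}<{combined:.4f}>"


print(collapse_weights("Emprar<vblex><pri><p3><sg><0.9424><0.7343>"))
```

If the weights turn out to be scores or feature weights rather than probabilities, the product would need to be swapped for whatever combination suits that definition (for example a sum of log-scores).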

Are weights interpretable across lexical units (LUs), or are they only meaningful within a single LU?

In general, I think we don't want to be writing rules that say "if this weight is >= 0.6 then ...". If weights are used, they should probably be combined directly with other weights, or in terms of probability mass.
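As an illustration of the "probability mass" reading, the Python sketch below renormalises the weights inside a single LU so that they sum to 1. This is an assumption about what the weights mean, not a decision recorded on this page. The two values are the ones given for `Emplea` in the morph example; as written they sum to more than 1, so they cannot be read as a probability distribution without a step like this.

```python
def normalise(weights):
    """Scale non-negative weights so they sum to 1 within one lexical unit."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("cannot normalise: weights sum to zero or less")
    return [w / total for w in weights]


# Weights of the two analyses of "Emplea" in the morph example.
emplea = [0.9424, 0.2323]
mass = normalise(emplea)
print(mass)
```

Normalising per LU keeps the ranking of the analyses unchanged, but the resulting numbers are only comparable within that LU, which bears directly on the question of whether weights should be interpretable across LUs.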