Weights in the pipeline

In some cases we want to be able to pass "weights" along in the pipeline.

What are weights? They could be probabilities, scores, lambdas (feature weights), or anything else, but we probably want to define what they are.

morph:

^Emplea/emplear<vblex><pri><p3><sg><0.9424>/emplear<vblex><imp><p2><sg><0.2323>$ ^a/a<pr><0.9934>/a<n><m><sg><0.0123>$ ^un/uno<det><ind><m><sg>$ ^70%/70%<num><0.9999>$ ^del/de<pr>+el<det><def><m><sg>$ ^total/total<adj><mf><sg>/total<n><m><sg>$ ^de/de<pr>$ ^asalariados/asalariado<adj><m><pl>/asalariado<n><m><pl>$^./.<sent>$^./.<sent>$
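To make the proposed notation concrete, here is a minimal sketch of how a module might read weights back out of a lexical unit. It assumes, purely for illustration, that any tag whose content is a decimal number is a weight; the function name `parse_lu` is hypothetical and is not part of any existing Apertium tool. Escaped characters in the stream format are not handled.

```python
import re

# Assumption: a tag containing only a decimal number, e.g. <0.9424>, is a weight.
WEIGHT_TAG = re.compile(r"<(\d+\.\d+)>")

def parse_lu(lu):
    """Split '^surface/analysis1/analysis2$' into the surface form and a
    list of (analysis_without_weights, [weights]) pairs.

    Hypothetical helper: it does not handle escaped '/' or '$'.
    """
    body = lu.strip()
    if not (body.startswith("^") and body.endswith("$")):
        raise ValueError("not a lexical unit: %r" % lu)
    surface, *analyses = body[1:-1].split("/")
    parsed = []
    for analysis in analyses:
        # Collect every weight tag in order, then strip them from the analysis.
        weights = [float(w) for w in WEIGHT_TAG.findall(analysis)]
        plain = WEIGHT_TAG.sub("", analysis)
        parsed.append((plain, weights))
    return surface, parsed

surface, readings = parse_lu("^a/a<pr><0.9934>/a<n><m><sg><0.0123>$")
for analysis, weights in readings:
    print(surface, analysis, weights)
```

Returning a list of weights per analysis, rather than a single number, leaves room for later modules to append their own weight, as in the biltrans output where one analysis carries two.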

tagger:

biltrans:

^Emplear<vblex><pri><p3><sg><0.9424>/Emprar<vblex><pri><p3><sg><0.9424><0.7343>/Ocupar<vblex><pri><p3><sg><0.9424><0.3204>$ ^a<pr>/a<pr><0.8930>/<0.0324>/de<pr><0.2342>$ ^uno<det><ind><m><sg>/un<det><ind><m><sg>$ ^70%<num>/70%<num>$ ^de<pr>/de<pr>$ ^el<det><def><m><sg>/el<det><def><m><sg>$ ^total<n><m><sg>/total<n><m><sg>$ ^de<pr>/de<pr>$ ^asalariado<n><m><pl>/assalariat<n><m><pl>$^.<sent>/.<sent>$^.<sent>/.<sent>$

lexsel:


transfer:

Should we maintain the weights of previous modules throughout the pipe?

Pros:
- If we do this, we can output a confidence for the translation. This could be exposed through the web interface (as GF does with colour coding).
Cons:
- There could be many weights.

Maybe transfer could combine them and output a single one.
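One possible way to collapse several weights into a single one is sketched below. It assumes the weights are probabilities in (0, 1] and that the modules' decisions can be treated as independent, so the weights are multiplied (in log space, to avoid underflow when there are many). Neither assumption is settled on this page, and `combine_weights` is a hypothetical name.

```python
import math

def combine_weights(weights):
    """Combine per-module weights into one score.

    Assumption: weights are independent probabilities, so the combined
    score is their product, computed as exp(sum(log(w))).
    """
    if not weights:
        return 1.0  # an unweighted analysis is treated as certain
    if any(w <= 0.0 for w in weights):
        return 0.0  # a zero weight rules the analysis out entirely
    return math.exp(sum(math.log(w) for w in weights))

# Tagger weight and bilingual-dictionary weight for Emplear -> Emprar.
print(combine_weights([0.9424, 0.7343]))
```

If the weights turn out to be scores or lambdas rather than probabilities, a weighted sum would be the more natural combination, which is one more reason to define what the weights are first.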

Are weights interpretable across lexical units (LUs), or are they restricted to within an LU?

What are the boundaries? The tagger (where we choose the "LU") and lexsel (where we choose the target "LU").
- Unless we have a reweighting step, but even then the boundaries would exist.

In general I think we don't want to be writing rules that say "if this weight is >= 0.6 then ...". If weights are used, they should probably be combined directly with other weights, or handled in terms of probability mass.
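As a rough illustration of working in terms of probability mass rather than a fixed threshold, the sketch below normalises the weights of the readings inside one LU and keeps the best readings until they cover a requested share of the mass. The function names and the 0.9 default are hypothetical, and the sketch assumes the weights are non-negative and comparable within an LU.

```python
def normalise(weights):
    """Rescale non-negative weights so that they sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w / total for w in weights]

def top_mass(readings, mass=0.9):
    """Keep the highest-weighted readings until they cover `mass`.

    `readings` is a list of (analysis, weight) pairs for one LU.
    Returns (analysis, normalised_weight) pairs, best first.
    """
    names = [name for name, _ in readings]
    probs = normalise([w for _, w in readings])
    ranked = sorted(zip(names, probs), key=lambda pair: pair[1], reverse=True)
    kept, covered = [], 0.0
    for name, p in ranked:
        kept.append((name, p))
        covered += p
        if covered >= mass:
            break
    return kept

readings = [("a<pr>", 0.9934), ("a<n><m><sg>", 0.0123)]
print(top_mass(readings, mass=0.9))
```

Because the cut-off is relative to the total mass inside the LU, the same setting behaves sensibly whether a module emits weights near 1 or much smaller ones.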
