Difference between revisions of "User:Francis Tyers/Apertium 4"

Revision as of 21:27, 21 June 2020

Better support for Unicode in lttoolbox.
Use embeddings for morphological disambiguation and lexical selection
Pass the surface form along the pipeline as far as transfer (so that modules can look up surface-form embeddings)
Retire the HMM tagger
Be able to train weights for morphological analysis, morphological disambiguation, lexical selection and transfer end to end.
- e.g. can we treat the modules of the pipeline as a neural net and train their weights via backprop?
Fully functional recursive transfer
Per-session state: this could be stored in something like a special blank that could be updated. It might contain things such as the domain.

@@ Line 27: / Line 27: @@
 * [[apertium-neural]]: There should be a basic NMT implementation that functions in the Apertium ecosystem (C++, autotools, bash, apy, html-tools) for communities that want to build their own NMT systems and still take advantage of our ecosystem. We should be a one-stop shop for MT for marginalised languages.
+== Data creation ==
+* Automatic multiword extraction using parallel corpora
+* Recursive rule extraction
 == End user ==