Bilingual dictionary discovery
This page describes a way of discovering new bilingual or multilingual dictionaries.
We already have apertium-dixtools for crossing dictionaries, but what happens if you want to make a pair where no direct crossing is available, or you want to improve the accuracy of the crossing, or you want to maximise the number of correspondences you can get?
We can try using multiple input dictionaries.
Let's say you want to make a Chuvash--Tatar dictionary, and you have:
- Chuvash--Russian
- Chuvash--Turkish
- Turkish--Russian
- Turkish--Tatar
- Russian--Tatar
You could make a graph out of these dictionaries where each node is a word in a language, and each arc is a dictionary entry linking two words from one of the language pairs. For example: http://i.imgur.com/SFOsRMv.png
You could then cluster the words using some "strongly-connected subgraph"[1] algorithm, and assume that the sets of words within a strongly-connected subgraph are translations of each other. That way you could get кил--йорт without having any direct correspondence.
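A minimal sketch of this construction in Python with networkx, assuming each input dictionary can be read as a list of word pairs. The loose clustering into plain connected components is just one possible reading of "strongly-connected subgraph", and the example entries below mirror the кил/дом/ev/йорт illustration rather than real dictionary data:

<pre>
# A minimal sketch of the graph idea, assuming each input dictionary is
# available as a list of (word, word) pairs.  The entries below follow the
# кил/дом/ev/йорт illustration and are invented, not real dictionary data.
import networkx as nx

dictionaries = {
    ("chv", "rus"): [("кил", "дом")],
    ("chv", "tur"): [("кил", "ev")],
    ("tur", "rus"): [("ev", "дом")],
    ("tur", "tat"): [("ev", "йорт")],
    ("rus", "tat"): [("дом", "йорт")],
}

G = nx.Graph()
for (lang1, lang2), entries in dictionaries.items():
    for word1, word2 in entries:
        # Nodes are (language, word) pairs so that identical strings in
        # different languages stay distinct; each entry becomes one arc.
        G.add_edge((lang1, word1), (lang2, word2))

# The loosest possible clustering: every connected cluster is taken as a
# set of mutual translations, so кил and йорт end up together even though
# no dictionary links them directly.  Stricter criteria (cliques, or
# strongly connected components over a directed graph) can be swapped in.
for cluster in nx.connected_components(G):
    print(cluster)
</pre>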
== vs crossdics ==
кил--йорт are not strongly-connected to each other, but hypothesising an arc between them would make them a strongly-connected subgraph along with ev and дом. The size of the strongly-connected subgraph (here: 4) could be an indicator of the strength of the association, though strongly-connected subgraphs might be too strict a requirement.
You could still get кил--йорт through crossdics: crossdics effectively gives the subgraphs of size 3 (where one arc is hypothesised). But the intersection of two crossdics runs (chv-rus-tat and chv-tur-tat) doesn't necessarily give the subgraphs of size 4, since that would also require the rus-tur connection, which the crossdics intersection doesn't check.
Two things we wouldn't get from crossdics:
- the fact that a correspondence with one arc missing is still stronger evidence than a simple chv-rus-tat crossdics crossing (due to the extra route via tur; see the scoring sketch after this list),
- the possibility of adding translations where both crossings would have lacunae, but double-crossing shows a translation:
  - possibly a bad translation, but if there are no shorter paths for either word, it might be worth adding
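One way to make the "extra route" argument concrete is to count the short simple paths between a candidate pair. This is a hedged sketch reusing the graph G built above; the cutoff and the use of plain path counts are assumptions, not something fixed on this page:

<pre>
import networkx as nx

def route_support(G, source, target, cutoff=2):
    """Count the distinct simple paths of length <= cutoff between two
    words; more (and shorter) routes mean stronger evidence."""
    return sum(1 for _ in nx.all_simple_paths(G, source, target, cutoff=cutoff))

# With the example graph from the previous sketch, кил--йорт is supported
# by two routes (via дом and via ev), which is more evidence than a single
# chv-rus-tat crossing would give:
#   route_support(G, ("chv", "кил"), ("tat", "йорт"))  ->  2
</pre>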
Restrictions on sub-graphs (a sketch applying these follows the list):
- Only one word per input language
- Prune words with only a single output arc.
- Only accept words where there is a cycle(?)
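A sketch of how those restrictions might be applied to the clusters from the first snippet. The exact checks, and whether the cycle test is worth keeping, are left open above, so treat this as one possible reading rather than a fixed recipe:

<pre>
import networkx as nx

def restrict(G, cluster):
    """Apply the restrictions above to one candidate cluster; return the
    surviving words, or None if the cluster is rejected."""
    # Prune words supported by only a single arc (degree in the whole
    # graph is used here; degree within the cluster is another option).
    nodes = [node for node in cluster if G.degree(node) >= 2]
    # Only one word per input language.
    langs = [lang for lang, _word in nodes]
    if len(langs) != len(set(langs)):
        return None
    # Only accept clusters whose surviving words contain a cycle, i.e.
    # the remaining translations are not just a chain.
    if not nx.cycle_basis(G.subgraph(nodes)):
        return None
    return set(nodes)
</pre>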
Some ideas:
- Weighting (see the sketch after this list)
  - Outgoing arcs get 1/number of arcs?
- Using more monolingual data, e.g. each word gets an SL concordance/context vector.
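A hedged sketch of the 1/number-of-arcs weighting: each step away from a word is weighted by the inverse of that word's number of arcs, and a candidate pair is scored by summing over its short routes. The combination rule (product along a path, sum over paths) and the cutoff are assumptions, and the monolingual context-vector idea is not sketched here:

<pre>
import networkx as nx

def path_score(G, path):
    # Each step away from a word contributes 1/degree(word), so routes
    # through promiscuous words (ones with many translations) count less.
    score = 1.0
    for word in path[:-1]:
        score *= 1.0 / G.degree(word)
    return score

def pair_score(G, source, target, cutoff=3):
    """Sum the scores of all short routes between two candidate words."""
    return sum(path_score(G, path)
               for path in nx.all_simple_paths(G, source, target, cutoff=cutoff))
</pre>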