Difference between revisions of "Bilingual dictionary discovery"

Revision as of 08:58, 14 July 2014

vs crossdics

кил--йорт are not strongly connected to each other, but hypothesising an arc between them would make them, together with ev and дом, a strongly connected subgraph. The size of the strongly connected subgraph (here: 4) could be an indicator of the strength of the association, but requiring strong connectivity might be too strict.
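The check described above can be sketched in a few lines. This is only an illustration: the source does not say which arcs actually exist between these four words, so the arc list `ARCS`, the `lang:word` node naming and the direction of the hypothesised arc `HYPOTHESIS` are all assumptions made for the example.

```python
from collections import defaultdict

def reachable(arcs, start):
    """Return the set of nodes reachable from `start` by following arcs."""
    graph = defaultdict(set)
    for src, dst in arcs:
        graph[src].add(dst)
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def strongly_connected(arcs, nodes):
    """True if every node in `nodes` reaches every other one via arcs inside `nodes`."""
    nodes = set(nodes)
    inner = [(s, d) for s, d in arcs if s in nodes and d in nodes]
    return all(reachable(inner, n) >= nodes for n in nodes)

# Assumed toy arcs (not taken from real dictionaries): nothing points at tat:йорт.
NODES = ["chv:кил", "rus:дом", "tur:ev", "tat:йорт"]
ARCS = [
    ("chv:кил", "rus:дом"), ("rus:дом", "chv:кил"),
    ("chv:кил", "tur:ev"), ("tur:ev", "chv:кил"),
    ("tat:йорт", "rus:дом"), ("tat:йорт", "tur:ev"),
]
HYPOTHESIS = ("chv:кил", "tat:йорт")  # the arc we consider adding

before = strongly_connected(ARCS, NODES)                # False
after = strongly_connected(ARCS + [HYPOTHESIS], NODES)  # True
```

With these assumed arcs, the four words only form a strongly connected subgraph once the hypothesised arc is added, which is the situation described for кил--йорт.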

You could still get кил--йорт through crossdics. If crossdics gives the subgraphs of size 3 (where one arc is hypothesised), then intersecting two crossdics runs (chv-rus-tat and chv-tur-tat) doesn't necessarily give the subgraphs of size 4: that would also require the rus-tur connection, which the crossdics intersection doesn't require.
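A rough sketch of intersecting two crossings follows. It does not use the real crossdics tool; `cross` is a hypothetical helper that just joins two dictionaries on a shared pivot word, and all dictionary entries (including the placeholder `other_tat_word`) are invented for illustration.

```python
def cross(src_to_pivot, pivot_to_tgt):
    """Hypothesise (src, tgt) pairs that share at least one pivot word."""
    pairs = set()
    for src, pivots in src_to_pivot.items():
        for pivot in pivots:
            for tgt in pivot_to_tgt.get(pivot, ()):
                pairs.add((src, tgt))
    return pairs

# Invented toy dictionaries, one dict per language pair.
chv_rus = {"кил": {"дом"}}
rus_tat = {"дом": {"йорт", "other_tat_word"}}
chv_tur = {"кил": {"ev"}}
tur_tat = {"ev": {"йорт"}}

via_rus = cross(chv_rus, rus_tat)  # chv-rus-tat crossing
via_tur = cross(chv_tur, tur_tat)  # chv-tur-tat crossing
both = via_rus & via_tur           # pairs supported by both pivots
```

Here `both` contains only the pair кил, йорт. Note that no rus-tur dictionary is consulted anywhere, so the intersection says nothing about whether дом and ev are themselves connected.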

Two things we wouldn't get from crossdics:

the fact that the subgraph, even with one arc missing, is still a stronger association than the simple chv-rus-tat crossdics (due to the extra route via tur),
the possibility of adding translations where both crossings would have lacunae, but doublecrossing shows a translation:
possibly a bad translation, but if there are no shorter paths for either word, it might be worth it
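The second point can be illustrated with a shortest-path search. This is a minimal sketch under some assumptions: translation entries are treated as undirected edges, the words `w1` to `w4` are placeholders rather than real entries, and `shortest_path` is a hypothetical helper, not part of any existing tool.

```python
from collections import deque

def shortest_path(edges, start, goal):
    """Breadth-first search over undirected edges; returns a node list or None."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(graph.get(path[-1], ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

# Placeholder entries: rus:w2 has no tat translation and chv:w1 has no tur
# translation, so both single crossings (via rus, via tur) have a lacuna.
EDGES = [("chv:w1", "rus:w2"), ("rus:w2", "tur:w3"), ("tur:w3", "tat:w4")]

path = shortest_path(EDGES, "chv:w1", "tat:w4")
hops = len(path) - 1  # 3 arcs: a double crossing via rus and tur
```

Because the search is breadth-first, a result with three hops means no shorter route exists for this pair, which is the condition under which the note suggests the translation might be worth keeping.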

Restrictions on sub-graphs:

Only one word per input language
Prune words with only a single output arc.
Only accept words where there is a cycle(?)
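The three restrictions could be expressed as small filters, sketched below. Several details are assumptions: nodes are written as `lang:word` strings, "a single output arc" is read literally as an out-degree of exactly 1, pruning is done in a single pass, and the toy arcs are invented.

```python
from collections import defaultdict

def lang(node):
    """Language code of a 'lang:word' node."""
    return node.split(":", 1)[0]

def one_word_per_language(nodes):
    """True if no language contributes more than one word."""
    langs = [lang(n) for n in nodes]
    return len(langs) == len(set(langs))

def prune_single_out(arcs):
    """Drop every arc touching a word that has exactly one outgoing arc."""
    out_degree = defaultdict(int)
    for src, _ in arcs:
        out_degree[src] += 1
    weak = {n for n, d in out_degree.items() if d == 1}
    return [(s, d) for s, d in arcs if s not in weak and d not in weak]

def in_cycle(arcs, node):
    """True if `node` can get back to itself by following one or more arcs."""
    graph = defaultdict(set)
    for s, d in arcs:
        graph[s].add(d)
    stack, seen = list(graph[node]), set()
    while stack:
        cur = stack.pop()
        if cur == node:
            return True
        if cur not in seen:
            seen.add(cur)
            stack.extend(graph[cur])
    return False

# Invented toy arcs: tur:c is the only word with a single outgoing arc.
ARCS = [
    ("chv:a", "rus:b"), ("chv:a", "tur:c"),
    ("rus:b", "chv:a"), ("rus:b", "tat:d"),
    ("tur:c", "tat:d"),
    ("tat:d", "chv:a"), ("tat:d", "rus:b"),
]
pruned = prune_single_out(ARCS)
```

One thing the sketch makes visible: after the single pass, `chv:a` is itself left with only one outgoing arc, so whether pruning should be repeated until nothing changes is a design choice the notes leave open.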

Some ideas:

Weighting
- Each outgoing arc gets weight 1/(number of outgoing arcs from the same word)?
Using more monolingual data, e.g. each word gets an SL concordance/context vector.
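The weighting idea could look roughly like this. Giving each arc a weight of 1 over its source word's out-degree follows the note above; multiplying the weights along a path to score a route is an additional assumption, and the arcs are invented. The sketch assumes the arc list contains no duplicates.

```python
from collections import Counter
from fractions import Fraction

def arc_weights(arcs):
    """Weight each arc by 1 / (number of outgoing arcs of its source word)."""
    out_degree = Counter(src for src, _ in arcs)
    return {(s, d): Fraction(1, out_degree[s]) for s, d in arcs}

def path_weight(weights, path):
    """Assumed scoring: product of the arc weights along the path."""
    total = Fraction(1)
    for arc in zip(path, path[1:]):
        total *= weights[arc]
    return total

# Invented toy arcs: кил has two translations, дом has one.
ARCS = [("кил", "дом"), ("кил", "ev"), ("дом", "йорт")]
weights = arc_weights(ARCS)
score = path_weight(weights, ["кил", "дом", "йорт"])  # Fraction(1, 2)
```

Under this scheme a word with many translations spreads its weight thinly, so routes through highly ambiguous words score lower than routes through words with a single translation.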

Notes

http://en.wikipedia.org/wiki/Strongly_connected_components

Revision as of 07:34, 14 July 2014 by Unhammer (older edit); revision as of 08:58, 14 July 2014 by Francis Tyers (newer edit).

This page describes a way of discovering new bilingual, or multilingual dictionaries.
