User:Firespeaker/Templatic bidix

I have an idea that I think would make translations better (through more explicit mappings between languages, as well as arbitrary structure mapping) and development easier. It would work by offloading disambiguation and "syntax" to the bidix, which would accept "translation templates" instead of "words".

This would create a few issues:

  • The user would then have to know the languages in depth to even begin working on a bidix. But isn't this already the ideal case?
  • Addition to / rewrite of bidix (maybe best to fork it and release it as something different)
  • Some way to deal with ranking of preference between different possible mappings
    • Tokenisation / longest-match

Test cases

English/Turkic translations mostly

a long example

  • Хип-хоптун алгачкы хореографы, америкалык өнөрпоз жергиликтүү бийчилер менен жолугушуп, хип-хоп аркылуу ич ара араздашууну жөнгө салуу тажрыйбасын көрсөтүп, маданияттын бул түрү аркылуу жаштарды туура жолго салып, ак жолтой келечек курса болот деген көз карашын жайылтууда.
  • The first hip-hop choreographer, an American specialist, met with local dancers, presented his experience in settling internal disagreements through hip-hop, and advanced his stance that through this sort of culture you can set youth on the right path and build a bright future.

mappings needed

  • [1]<n><gen> [2]<det> [3]<n><px3sp> = the [2]<det> [1]<n> [3]<n>
  • хип-хоп<n> = hip-hop<n>
  • алгачкы<det> = first<det>
  • хореограф<n> = choreographer<n>
  • америкалык<adj> = American<adj>
  • өнөрпоз<n> = specialist<n>
  • жергиликтүү<adj> = local<adj>
  • бийчи<n> = dancer<n>
  • <pl> = <pl> (a fall-back default?)
  • ( [1 <n>|(<np>.*)] ~ [2 <n>|(<np>.*)] менен ) жолук<v><coop>[3 _tags_] = [1] meet<v>[3] with<prep> [2]
  • [1 <n>] аркылуу<post> = via<prep> [1]<n>
    • [1 <n>] аркылуу<post> = through<prep> [1]<n>
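
To make the template notation concrete, here is a minimal sketch of how the first mapping above, `[1]<n><gen> [2]<det> [3]<n><px3sp> = the [2]<det> [1]<n> [3]<n>`, might be applied to already-analysed tokens. It assumes exact tag-string matching and a flat word lexicon; the data layout and the names `LEXICON`, `SOURCE_SLOTS`, `TARGET` and `apply_template` are hypothetical, as is the use of `*` to flag a lemma with no entry.

```python
from typing import Dict, List, Optional, Tuple, Union

Token = Tuple[str, str]  # (lemma, tag string), e.g. ("хореограф", "<n><px3sp>")

# Word-level entries taken from the mappings listed above.
LEXICON: Dict[str, str] = {
    "хип-хоп": "hip-hop",
    "алгачкы": "first",
    "хореограф": "choreographer",
}

# Source side: the tags each numbered slot must carry, in source order.
SOURCE_SLOTS: List[str] = ["<n><gen>", "<det>", "<n><px3sp>"]

# Target side: literal words (str) or (slot number, output tags) pairs.
TARGET: List[Union[str, Tuple[int, str]]] = [
    "the", (2, "<det>"), (1, "<n>"), (3, "<n>"),
]


def apply_template(tokens: List[Token]) -> Optional[str]:
    """Reorder and translate `tokens` if they fit the source side, else None."""
    if len(tokens) != len(SOURCE_SLOTS):
        return None
    if any(tags != want for (_, tags), want in zip(tokens, SOURCE_SLOTS)):
        return None
    out = []
    for item in TARGET:
        if isinstance(item, str):
            out.append(item)  # literal target word such as "the"
        else:
            slot, tags = item
            lemma = tokens[slot - 1][0]  # slots are numbered from 1
            # Assumption: a lemma missing from the lexicon is passed through with "*".
            out.append(LEXICON.get(lemma, "*" + lemma) + tags)
    return " ".join(out)
```

Given the tokens `хип-хоп<n><gen> алгачкы<det> хореограф<n><px3sp>`, this returns `the first<det> hip-hop<n> choreographer<n>`; any input that does not fit the source side returns `None`, leaving it to other mappings.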