German/Noun Capitalisation

From Apertium

Latest revision as of 09:58, 18 March 2016

Why apertium-deu has two entries per word:

Both «paradigm» and «Paradigm» need to be analysable, and they need different tags. That creates a "long-distance" relation between the capitalisation of the first letter of the word and the last tag, which means you either put the full word inside the paradigm (ouch, basically a full-form list) or have two entries per word.

Because:

  • if you want to analyse «Kompositumparadigm», then you need «paradigm» (the same goes if you want to analyse typos)
  • while we want to generate «Paradigm», but only when it's the first part of the word
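
As a rough illustration of why both forms end up in the dictionary, here is a toy Python model of a lexicon with two entries per noun. It is not how lttoolbox works internally; the tag names (`<n>`, `<lower>`) and the helpers `analyse` and `analyse_compound` are made up for this sketch and are not the real apertium-deu tag set.

```python
from typing import List, Optional

# Hypothetical entries and tags, for illustration only.
ENTRIES = {
    "Paradigm": "paradigm<n>",            # capitalised entry
    "paradigm": "paradigm<n><lower>",     # lower-case entry, differs in the last tag
    "Kompositum": "kompositum<n>",
    "kompositum": "kompositum<n><lower>",
}


def analyse(surface: str) -> Optional[str]:
    """Exact-case lookup of a single word."""
    return ENTRIES.get(surface)


def analyse_compound(surface: str) -> Optional[List[str]]:
    """Try every split point; both halves are looked up exactly as written."""
    for i in range(1, len(surface)):
        head, tail = surface[:i], surface[i:]
        if head in ENTRIES and tail in ENTRIES:
            return [ENTRIES[head], ENTRIES[tail]]
    return None


print(analyse("Paradigm"))
print(analyse_compound("Kompositumparadigm"))
```

In this toy model, «Kompositumparadigm» can only be split because the lower-case «paradigm» entry exists, while the capitalised entry is the one you would pick when generating a standalone noun.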

Possible alternatives:

  • lower-case entries only for nouns in the monodix, then let transfer capitalise nouns when translating into deu
    • lt-proc already gives analyses for «Paradigm» if only «paradigm» is in the dix
  • making a separate casing module
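
The first alternative could look roughly like the following Python sketch, which capitalises every unit tagged as a noun just before output. Real Apertium transfer rules are written in XML, so this only illustrates the idea; the function name `capitalise_nouns`, the `(surface, tags)` pair format, and the example tags are assumptions for this example.

```python
from typing import List, Tuple


def capitalise_nouns(units: List[Tuple[str, str]]) -> List[str]:
    """Upper-case the first letter of every surface form whose tags start with <n>.

    Each unit is a (surface, tags) pair, e.g. ("paradigm", "<n><nt><sg>").
    The pair format and tag names are hypothetical.
    """
    out = []
    for surface, tags in units:
        if surface and tags.startswith("<n>"):
            surface = surface[0].upper() + surface[1:]
        out.append(surface)
    return out


print(capitalise_nouns([("das", "<det>"), ("paradigm", "<n><nt><sg>")]))
```

With this approach the monodix would only need the lower-case entry, and the casing decision moves to a single place in the pipeline.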