German/Noun Capitalisation


Why apertium-deu has two entries per word:

Both «paradigm» and «Paradigm» need to be analysable, and they need different tags. That creates a "long-distance" relation between (the capitalisation of) the first letter of the word and the last tag. Since a paradigm normally only supplies the endings and their tags, it cannot see the first letter, which leaves two options: either you put the full word inside the paradigm (ouch, basically a full-form list), or you have two entries per word.
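The dependency can be illustrated with a toy model in Python. This is only a sketch and not how apertium-deu is actually written: the paradigm names, the `<lower>` tag and the "en" ending are invented placeholders, and a real monodix is XML compiled by lt-comp. The point is just that the tags live in the paradigm while the capital letter lives in the entry, so getting different tags requires two entries, each pointing at its own paradigm.

```python
# Toy model of a monodix (all names, tags and endings are hypothetical).
# A paradigm is a list of (ending, tags); an entry is (stem, paradigm name).
PARADIGMS = {
    "n_upper": [("", "<n><sg>"), ("en", "<n><pl>")],
    "n_lower": [("", "<n><sg><lower>"), ("en", "<n><pl><lower>")],
}

# Two entries per word: the paradigm cannot see the first letter of the stem,
# so each casing needs its own entry in order to end up with its own tags.
ENTRIES = [
    ("Paradigm", "n_upper"),
    ("paradigm", "n_lower"),
]

def expand(entries, paradigms):
    """Return {surface form: tags} for every entry/ending combination."""
    table = {}
    for stem, paradigm_name in entries:
        for ending, tags in paradigms[paradigm_name]:
            table[stem + ending] = tags
    return table

FORMS = expand(ENTRIES, PARADIGMS)
```

With only one entry and one paradigm, both casings would either share the same tags or one of them would not be analysable at all.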

Because:

  • if you want to analyse «Kompositumparadigm», then you need lower-case «paradigm» (also if you want to analyse typos)
  • while we want to generate capitalised «Paradigm», but only when it's the first part of the word
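The two requirements above can be sketched as a toy compound splitter in Python. It is a minimal illustration under assumptions, not the actual Apertium compounding mechanism: the two word sets and both function names are hypothetical, and the rule "capitalised forms start a word, lower-case forms continue it" is a simplification of the reasoning on this page.

```python
# Hypothetical toy lexicon: capitalised forms may start a word,
# lower-case forms may only continue a compound.
INITIAL = {"Kompositum", "Paradigm"}
NON_INITIAL = {"kompositum", "paradigm"}

def split_tail(rest):
    """Return all segmentations of `rest` into NON_INITIAL parts."""
    if not rest:
        return [[]]
    out = []
    for i in range(1, len(rest) + 1):
        if rest[:i] in NON_INITIAL:
            for tail in split_tail(rest[i:]):
                out.append([rest[:i]] + tail)
    return out

def split_compound(word):
    """Return all readings of `word` as one initial part plus lower-case parts."""
    results = []
    for i in range(1, len(word) + 1):
        head = word[:i]
        if head in INITIAL:
            for tail in split_tail(word[i:]):
                results.append([head] + tail)
    return results
```

In this sketch `split_compound("Kompositumparadigm")` finds the reading Kompositum + paradigm only because the lower-case form is in the lexicon, while `split_compound("KompositumParadigm")` finds nothing, since the capitalised form is not allowed in non-initial position.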

Possible alternatives:

  • lower-case entries only for nouns in the monodix, then let transfer capitalise nouns when translating into deu
    • lt-proc already gives analyses for «Paradigm» if only «paradigm» is in the dix
  • making some other, separate casing module
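The first alternative could look roughly like the following Python sketch. Everything in it is an assumption made for illustration: the lexicon contents, the tags and both function names are hypothetical, the lower-casing fallback only imitates the lt-proc behaviour mentioned above, and the casing step stands in for whatever transfer or a separate module would do.

```python
# Hypothetical lower-case-only lexicon: surface form -> tags.
LEXICON = {"paradigm": "<n>", "klein": "<adj>"}

def analyse(surface):
    """Look up the surface form, falling back to a lower-cased first letter."""
    for candidate in (surface, surface[:1].lower() + surface[1:]):
        if candidate in LEXICON:
            return candidate + LEXICON[candidate]
    return "*" + surface  # toy marker for an unknown word

def capitalise_nouns(lemma, tags):
    """Casing step on the target side: upper-case the first letter of nouns."""
    if tags.startswith("<n>"):
        return lemma[:1].upper() + lemma[1:]
    return lemma
```

Here `analyse("Paradigm")` succeeds with only «paradigm» in the lexicon, and `capitalise_nouns("paradigm", "<n>")` restores the capital on output. A real casing step would also have to leave non-initial compound parts in lower case, which this sketch does not attempt.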