Speling format
See also:
* Speling tools
* Paradigm chopper
Latest revision as of 09:55, 6 October 2014
The Speling format represents "full form" vocabulary lists in a way that makes it easy to generate paradigms and lemma-paradigm pairs for Apertium monodices. It is similar in principle to the "expanded" format of lttoolbox, with the advantage that the part-of-speech is kept separate from the other sub-categories/features.
The format is broadly organised as follows:
lemma; surface form; features; part-of-speech
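As a rough illustration of the field layout above, the following Python sketch splits one line into its four fields. The names `SpelingEntry` and `parse_line` are hypothetical and are not part of any Apertium tool; the sketch also assumes that fields never contain a literal semicolon.

```python
from typing import NamedTuple


class SpelingEntry(NamedTuple):
    """One line of a Speling-format list (hypothetical helper type)."""
    lemma: str
    surface: str
    features: str
    pos: str


def parse_line(line: str) -> SpelingEntry:
    """Split a 'lemma; surface form; features; part-of-speech' line."""
    # Assumption: ';' is only ever used as the field separator.
    fields = [field.strip() for field in line.split(";")]
    if len(fields) != 4:
        raise ValueError(
            f"expected 4 ';'-separated fields, got {len(fields)}: {line!r}"
        )
    return SpelingEntry(*fields)


entry = parse_line("wolf; wolves; pl; n")
print(entry.lemma, entry.surface, entry.features, entry.pos)
```

Stripping whitespace around each field means the sketch accepts lines written with or without a space after each semicolon.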
So for example in English we might see some noun inflection represented as:
house; house; sg; n
house; houses; pl; n
computer; computer; sg; n
computer; computers; pl; n
bird; bird; sg; n
bird; birds; pl; n
wolf; wolf; sg; n
wolf; wolves; pl; n
Or in Spanish:
casa; casa; sg; n.f
casa; casas; pl; n.f
...
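To make the comparison with the lttoolbox "expanded" format more concrete, here is a hedged Python sketch that rewrites a Speling line as a `surface:lemma<tag>...` string. Three things are assumptions made for illustration: the exact output layout, the use of `.` to separate multiple tags within a field, and the ordering of part-of-speech tags before feature tags. The function name `to_expanded` is hypothetical.

```python
def to_expanded(line: str) -> str:
    """Rewrite one Speling line as 'surface:lemma<tag1><tag2>...'."""
    lemma, surface, features, pos = (f.strip() for f in line.split(";"))
    # Assumption: '.' separates tags inside a field, as in 'n.f'.
    # Part-of-speech tags come first, followed by the remaining features.
    tags = [tag for tag in pos.split(".") + features.split(".") if tag]
    return f"{surface}:{lemma}" + "".join(f"<{tag}>" for tag in tags)


print(to_expanded("casa; casas; pl; n.f"))  # casas:casa<n><f><pl>
```

Because the part-of-speech sits in its own field, it can be moved to the front of the tag sequence without knowing which tags are part-of-speech tags and which are features.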
From lists such as these, it is fairly straightforward to generate paradigms for languages with basic inflectional morphology, or partial paradigms for languages with richer morphology.
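One simple way to derive paradigms from such a list is sketched below in Python. It groups forms by lemma and part-of-speech, takes the longest common prefix of the forms as the stem, and treats lemmas with the same set of suffix/feature pairs as sharing a paradigm. This is only an illustrative approach for suffixing morphology, not the method used by any particular Apertium tool, and `build_paradigms` is a hypothetical name.

```python
import os
from collections import defaultdict


def build_paradigms(lines):
    """Group lemmas by their (suffix, features) pattern."""
    # Collect all surface forms for each (lemma, part-of-speech) pair.
    forms = defaultdict(list)
    for line in lines:
        lemma, surface, features, pos = (f.strip() for f in line.split(";"))
        forms[(lemma, pos)].append((surface, features))

    paradigms = defaultdict(list)
    for (lemma, pos), entries in forms.items():
        # Stem = longest common prefix of the lemma and all of its forms.
        stem = os.path.commonprefix([lemma] + [s for s, _ in entries])
        suffixes = tuple(sorted((s[len(stem):], feats) for s, feats in entries))
        # Lemmas with the same signature share a paradigm.
        signature = (pos, lemma[len(stem):], suffixes)
        paradigms[signature].append((lemma, stem))
    return dict(paradigms)


sample = [
    "house; house; sg; n", "house; houses; pl; n",
    "bird; bird; sg; n", "bird; birds; pl; n",
    "wolf; wolf; sg; n", "wolf; wolves; pl; n",
]
for signature, members in build_paradigms(sample).items():
    print(signature, members)
```

With this sample, "house" and "bird" fall into one paradigm (empty suffix for singular, "s" for plural), while "wolf" gets its own paradigm with the stem "wol" and the suffixes "f" and "ves". A prefix-based stem like this only suits suffixing languages; richer morphology would yield partial paradigms at best.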
The format is named after speling.org (http://www.speling.org), which collects full form lists for spell-checkers in several Germanic languages and Finnish.