Difference between revisions of "Beginner's Constraint Grammar HOWTO"

From Apertium
(Replaced content with ' ==Download== ;Apertium ;Constraint grammar ==Install== ;Apertium ;Constraint grammar Category:Documentation')
Line 1: Line 1:
=General for CG=
 
   
  +
==Download==
Constraint Grammar (CG) is a methodological paradigm for [http://en.wikipedia.org/wiki/Natural_language_processing Natural language processing] (NLP). Linguist-written, context dependent rules are compiled into a grammar that assigns grammatical tags ("readings") to words or other tokens in running text. Typical tags address [http://en.wikipedia.org/wiki/Lemmatisation lemmatisation] (lexeme or base form), [http://en.wikipedia.org/wiki/Inflexion inflexion], [http://en.wikipedia.org/wiki/Derivation_%28linguistics%29 derivation], [http://en.wikipedia.org/wiki/Syntactic_function syntactic function], dependency, [http://en.wikipedia.org/wiki/Valency_%28linguistics%29 valency], [http://en.wikipedia.org/wiki/Case_role case roles], [http://en.wikipedia.org/wiki/Semantic semantic] type etc. Each rule either adds, removes, selects or replaces a tag or a set of grammatical tags in a given sentence context. Context conditions can be linked to any tag or tag set of any word anywhere in the sentence, either locally (defined distances) or globally (undefined distances). Context conditions in the same rule may be linked, i.e. conditioned upon each other, negated, or blocked by interfering words or tags. Typical CGs consist of thousands of rules, that are applied set-wise in progressive steps, covering ever more advanced levels of analysis. Within each level, safe rules are used before heuristic rules, and no rule is allowed to remove the last reading of a given kind, thus providing a high degree of robustness.
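The robustness principle described above (a rule may discard readings, but never the last remaining one) can be illustrated with a minimal Python sketch. This is not how vislcg3 is implemented; the data layout (one set of tags per reading) and the function name `remove_readings` are assumptions made purely for illustration.

```python
def remove_readings(readings, target_tag, context_ok):
    """Apply a toy REMOVE rule to the readings of one word.

    readings   -- list of sets of tags, one set per reading
    target_tag -- readings carrying this tag are candidates for removal
    context_ok -- True if the rule's context condition holds for this word
    """
    if not context_ok:
        # The context condition failed, so the rule does not fire.
        return readings
    kept = [r for r in readings if target_tag not in r]
    # Robustness: if the rule would delete every reading, keep them all.
    return kept if kept else readings


# "work" is ambiguous between a noun and a verb reading.
work = [{"N", "NOM", "SG"}, {"V", "INF"}]

# Hypothetical rule: remove the verb reading when a determiner precedes.
print(remove_readings(work, "V", context_ok=True))

# A word whose only reading is a verb keeps that reading.
print(remove_readings([{"V", "INF"}], "V", context_ok=True))
```

A real grammar expresses the context condition declaratively in the rule itself; here it is reduced to a boolean to keep the sketch short.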
 
   
  +
;Apertium
The Constraint Grammar concept was launched by [http://en.wikipedia.org/wiki/Fred_Karlsson Fred Karlsson] in 1990 (Karlsson 1990; Karlsson et al., eds, 1995), and CG taggers and parsers have since been written for a large variety of languages, routinely achieving accuracy F-scores for PoS (word class) of over 99%. A number of syntactic CG systems have reported F-scores of around 95% for syntactic function labels. CG systems can be used to create full syntactic trees in other formalisms by adding small, non-terminal based [http://en.wikipedia.org/wiki/Phrase_structure_grammar phrase structure grammars] or [http://en.wikipedia.org/wiki/Dependency_grammar dependency grammars], and a number of corpus/treebank projects have used Constraint Grammar for automatic annotation. CG methodology has also been used in a number of language technology applications, such as [http://en.wikipedia.org/wiki/Spell_checker spell checkers] and [http://en.wikipedia.org/wiki/Machine_translation machine translation] systems.
 
   
  +
;Constraint grammar
   
  +
==Install==
   
  +
;Apertium
=VISLCG3=
 
 
 
'''What is vislcg3'''
 
 
 
Vislcg3 is the newest parser generation from Odense. Like its predecessor, vislcg, it is open source. Vislcg3 is licensed under the GPL.
 
 
Starting on March 5th 2008, we have migrated to vislcg3. Rule files for vislcg are still available in older revisions. For vislcg3 documentation we recommend the online [http://beta.visl.sdu.dk/cg3.html documentation].
 
 
 
'''Preparations before you install vislcg3'''
 
 
 
On Mac OS X, vislcg3 needs certain libraries in order to run. They are provided by ICU; download the latest version [http://site.icu-project.org/download here]. Save the folder in your home directory and build it with the following commands:
 
 
 
 
cd ~/icu/source
 
 
./runConfigureICU MacOSX
 
 
gnumake # (or if the machine protests, try with make or gmake)
 
 
gnumake check
 
 
sudo gnumake install
 
 
 
 
After installation, the icu folder may be deleted.
 
 
 
 
'''Commands to check out, install and update the vislcg3 program'''
 
 
 
vislcg3 may be checked out, and later on updated, from Odense via svn, or it may be downloaded from sourceforge. Here, we assume you download it from Odense. Run the following commands:
 
 
 
'''Commands to check out and install vislcg3'''
 
 
 
svn co --username anonymous --password anonymous http://beta.visl.sdu.dk/svn/visl/tools/vislcg3/trunk vislcg3
 
 
cd vislcg3/
 
 
./autogen.sh
 
 
make
 
 
test/runall.pl
 
 
sudo make install
 
 
 
 
Now vislcg3 is installed in /usr/local/bin/, and is ready to be used.
 
 
 
'''''Note:''' If you are logged in as a non-admin user, you need to switch to an admin user before you run the last command (the sudo command): su [admin-username] Replace [admin-username] with a username with administrative privileges. Then type in the corresponding password, and continue with the final step above.''
 
 
 
 
cd vislcg3/trunk/
 
 
./compile-mac.sh # (or: ./compile-linux.sh)
 
 
test/runall.pl
 
 
mv vislcg3 ~/bin/
 
 
 
 
Using this method, vislcg3 is installed in your home dir, in ~/bin/.
 
 
'''''Note:''' If you are using this method, there is no need to do the su + sudo steps outlined in the first case.''
 
 
 
 
'''Commands to update'''
 
 
 
If you have already checked out vislcg3, you can simply do the following:
 
 
 
cd vislcg3/
 
 
svn up
 
 
./autogen.sh
 
 
make
 
 
test/runall.pl
 
 
sudo make install
 
 
 
'''Tip:''' vislcg3 is downloaded automatically to victorio every night. If you have access to the svn, you can check whether you have the latest version: compare the output of vislcg3 --version on your machine and on victorio, and repeat the steps above if your version is older.
 
 
 
'''Compilation and usage of CG files'''
 
 
 
The CG .rle files can be run as text files, or compiled. They are compiled with the make TARGET=$LANG command.
 
 
Vislcg3 can be run with this command:
 
 
... | vislcg3 -g src/sme-dis.rle | ...
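The same call can be made from a script. The following Python sketch wraps the command line shown above; it uses only the -g and --trace flags from the flag list on this page. The helper names `build_vislcg3_command` and `disambiguate` are hypothetical, and the sketch assumes that vislcg3 is on your PATH and that the input has already been morphologically analysed.

```python
import subprocess


def build_vislcg3_command(grammar_path, trace=False):
    """Return the argument list for a vislcg3 call (hypothetical helper)."""
    cmd = ["vislcg3", "-g", grammar_path]
    if trace:
        # --trace prints debug output alongside the normal output.
        cmd.append("--trace")
    return cmd


def disambiguate(analysed_text, grammar_path, trace=False):
    """Pipe analysed text through vislcg3 and return its standard output.

    Assumes vislcg3 is installed; raises CalledProcessError on failure.
    """
    result = subprocess.run(
        build_vislcg3_command(grammar_path, trace),
        input=analysed_text,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Show the command that would be executed for the grammar used above.
print(build_vislcg3_command("src/sme-dis.rle", trace=True))
```

Keeping the command construction separate from the call makes it easy to check the arguments without having vislcg3 installed.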
 
 
 
'''Flags'''
 
 
The list of flags can be obtained by vislcg3 --help. That command prints something like this (use the newest version rather than this list):
 
 
 
-bash-3.00$ vislcg3 -h
 
 
VISL CG-3 Disambiguator version 0.9.2.3279
 
 
Usage: vislcg3 [OPTIONS]
 
 
Options:
 
 
'''-h or -? or --help ''' Displays this list.
 
 
'''-V or --version''' Prints version number.
 
 
'''-g or --grammar''' Specifies the grammar file to use for disambiguation.
 
 
'''-p or --vislcg-compat''' Tells the grammar compiler to be compatible with older VISLCG syntax.
 
 
'''--grammar-out''' Writes the compiled grammar back out in textual form to a file.
 
 
'''--grammar-bin ''' Writes the compiled grammar back out in binary form to a file.
 
 
'''--grammar-info ''' Writes the compiled grammar back out in textual form to a file, with lots of statistics and information.
 
 
'''--grammar-only''' Compiles the grammar only.
 
 
'''--trace''' Prints debug output alongside with normal output.
 
 
'''--prefix''' Sets the prefix for mapping. Defaults to @.
 
 
'''--sections''' Number of sections to run. Defaults to running all sections.
 
 
'''--single-run''' Only runs each section once.
 
 
'''--no-mappings''' Disables running any MAP, ADD, or REPLACE rules.
 
 
'''--no-corrections ''' Disables running any SUBSTITUTE or APPEND rules.
 
 
'''--no-before-sections''' Disables running rules from BEFORE-SECTIONS.
 
 
'''--no-sections''' Disables running rules from any SECTION.
 
 
'''--no-after-sections ''' Disables running rules from AFTER-SECTIONS.
 
 
 
'''--num-windows''' Number of windows to keep in before/ahead buffers. Defaults to 2.
 
 
'''--always-span''' Forces all scanning tests to always span across window boundaries.
 
 
'''--soft-limit''' Number of cohorts after which the SOFT-DELIMITERS kick in. Defaults to 300.
 
 
'''--hard-limit''' Number of cohorts after which the window is delimited forcefully. Defaults to 500.
 
 
'''--no-magic-readings''' Prevents running rules on magic readings.
 
 
'''--dep-allow-loops''' Allows the creation of circular dependencies.
 
 
 
'''-O or --stdout''' A file to print output to instead of stdout.
 
 
'''-I or --stdin''' A file to read input from instead of stdin.
 
 
'''-E or --stderr''' A file to print errors to instead of stderr.
 
 
 
'''-C or --codepage-all''' The codepage to use for grammar, input, and output streams. Auto-detects default from environment.
 
 
'''--codepage-grammar''' Codepage to use for grammar. Overrides --codepage-all.
 
 
'''--codepage-input''' Codepage to use for input. Overrides --codepage-all.
 
 
'''--codepage-output''' Codepage to use for output and errors. Overrides --codepage-all.
 
 
 
'''-L or --locale-all''' The locale to use for grammar, input, and output streams. Defaults to en_US_POSIX.
 
 
'''--locale-grammar''' Locale to use for grammar. Overrides --locale-all.
 
 
'''--locale-input''' Locale to use for input. Overrides --locale-all.
 
 
'''--locale-output''' Locale to use for output and errors. Overrides --locale-all.
 
 
 
 
=List of CG systems sorted by language=
 
 
'''Free software'''
 
 
 
 
[http://beta.visl.sdu.dk/cg3.html VISL CG-3] Constraint Grammar compiler/parser
 
 
*[http://en.wikipedia.org/wiki/Northern_Sami_language North] and [http://en.wikipedia.org/wiki/Lule_Sami_language Lule] Sami, [http://en.wikipedia.org/wiki/Faroese_language Faroese], [http://en.wikipedia.org/wiki/Komi_language Komi] and [http://en.wikipedia.org/wiki/Greenlandic_language Greenlandic] from the [http://en.wikipedia.org/wiki/University_of_Troms%C3%B8 University of Tromsø]
 
** Fred Karlsson's original Finnish FinCG is also available from the University of Tromsø as GPL.
 
*[http://en.wikipedia.org/wiki/Norwegian_language Norwegian] Nynorsk and Bokmål online, Oslo-Bergen tagger

*[http://en.wikipedia.org/wiki/Breton_language Breton], Welsh, Irish Gaelic and [http://en.wikipedia.org/wiki/Norwegian_language Norwegian] (converted from the above) in Apertium (see CG in Apertium)
 
 
 
 
'''Non-free software'''
 
 
 
*Basque [http://paginaspersonales.deusto.es/abaitua/konzeptu/nlp/MGnag.html Basque]
 
 
*Catalan [http://mutis.upf.es/cgi-bin/catcg/demo.pl CATCG]
 
 
*Danish [http://beta.visl.sdu.dk/constraint_grammar.html/ DanGram]
 
 
*English [http://www2.lingsoft.fi/cgi-bin/engcg ENGCG], ENGCG-2, [http://beta.visl.sdu.dk/constraint_grammar.html/ VISL-ENGCG]
 
 
*Esperanto [http://beta.visl.sdu.dk/constraint_grammar.html/ EspGram]
 
 
*French [http://beta.visl.sdu.dk/constraint_grammar.html/ FrAG]
 
 
*German [http://beta.visl.sdu.dk/constraint_grammar.html/ GerGram]
 
 
*Irish [https://www.cs.tcd.ie/Elaine.UiDhonnchadha/irish.htm online]
 
 
*Italian [http://beta.visl.sdu.dk/visl/it/parsing/automatic/parse.php ItaGram]
 
 
*Spanish [http://beta.visl.sdu.dk/constraint_grammar.html/ HISPAL]
 
 
*Swedish [http://www2.lingsoft.fi/doc/swecg/intro/ SWECG]
 
 
*Swahili
 
 
*Portuguese [http://beta.visl.sdu.dk/constraint_grammar.html/ PALAVRAS]
 
 
 
 
=Method of annotation=
 
 
Both the morphological and syntactic analysers use rule-based linguistic descriptions. The system works in the following way:
 
 
 
1. Tokenisation;
 
 
2. Lookup of morphological tags;
 
 
* Lexical component;
 
 
* Guesser;
 
 
3. Resolution of morphological ambiguities;
 
 
4. Lookup of syntactic tags;
 
 
5. Resolution of syntactic ambiguities
 
 
 
 
=Tokenisation=
 
 
The tokeniser identifies punctuation and multiword units, and splits enclitic forms into grammatical words.
 
 
 
=Morphological lookup=
 
 
 
This process begins with a lexical analysis based on a large lexicon including all inflected and central derived word forms. The lexical analyser assigns all possible morphological analyses to each word that is in the lexicon, and the remaining words are assigned an analysis by means of the guesser (a heuristic rule-based module). These rules are mainly governed by word shape, and if none of them apply, then a nominal analysis is given.
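The lookup-then-guess behaviour described above can be sketched in a few lines of Python. The lexicon entries, the two word-shape rules and the function names are assumptions chosen for illustration; a real analyser uses a far larger lexicon and rule set.

```python
# Toy lexicon: word form -> all possible analyses (hypothetical entries).
LEXICON = {
    "started": ["start V PAST VFIN", "start PCP2"],
    "work": ["work N NOM SG", "work V INF"],
}


def guess(word_form):
    """Heuristic guesser driven by word shape (toy rules)."""
    if word_form.endswith("ly"):
        return [word_form + " ADV"]
    if word_form.endswith("ing"):
        return [word_form + " PCP1"]
    # No shape rule applied: fall back to a nominal analysis.
    return [word_form + " N NOM SG"]


def lookup(word_form):
    """Return every analysis from the lexicon, or guess if the word is unknown."""
    key = word_form.lower()
    if key in LEXICON:
        return list(LEXICON[key])
    return guess(key)


for form in ["work", "quickly", "blorf"]:
    print(form, lookup(form))
```

Note that the lookup step deliberately returns all analyses; choosing between them is left to the disambiguation stage.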
 
 
 
=Resolution of morphological ambiguities=
 
 
 
The rule-based Constraint Grammar parser is used to resolve some of the ambiguities at this stage. The constraints are partial paraphrases of form definitions of syntactic constructs such as the noun phrase. The English grammar, for example, contains about 1,200 grammar-based constraints, plus 200 heuristic constraints.
 
 
 
=Syntactic lookup=
 
 
All possible syntactic tags are introduced for each word. This could, in some cases, mean that more than ten alternatives are given for one morphological reading.
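As a rough illustration of this step, the Python sketch below attaches every candidate syntactic tag to a morphological reading. The mapping from word class to candidate tags is hypothetical (only the tag names are taken from the ENGCG tag list on this page); a real system derives the candidates from mapping rules.

```python
# Hypothetical mapping from word class to candidate syntactic tags.
CANDIDATES = {
    "N": ["@SUBJ", "@OBJ", "@I-OBJ", "@PCOMPL-S", "@PCOMPL-O",
          "@APP", "@NN>", "@<P"],
    "PREP": ["@<NOM", "@ADVL"],
    "DET": ["@DN>"],
}


def add_syntactic_tags(reading_tags):
    """Return all candidate syntactic tags for one morphological reading."""
    candidates = []
    for tag in reading_tags:
        for candidate in CANDIDATES.get(tag, []):
            if candidate not in candidates:
                candidates.append(candidate)
    return candidates


# A noun reading receives many alternatives, to be pruned later.
print(add_syntactic_tags(["N", "NOM", "SG"]))
print(add_syntactic_tags(["PREP"]))
```

The long list for the noun reading shows why the subsequent syntactic disambiguation stage is needed.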
 
 
 
=Resolution of syntactic ambiguities=
 
 
The parser finally consults a syntactic disambiguation grammar. The English version of the Constraint Grammar contains 800 syntactic constraints, of a similar form to the rules at the morphological resolution stage.
 
 
 
=Syntactic tags=
 
 
The English version of the Constraint Grammar marks the syntactic functions shown in the table below.
 
 
 
'''@+FAUXV''' finite auxiliary verb
 
 
'''@-FAUXV''' nonfinite auxiliary verb
 
 
'''@+FMAINV''' finite main verb
 
 
'''@-FMAINV''' nonfinite main verb
 
 
'''@SUBJ''' subject
 
 
'''@F-SUBJ''' formal subject
 
 
'''@OBJ''' object
 
 
'''@I-OBJ''' indirect object
 
 
'''@PCOMPL-S''' subject complement
 
 
'''@PCOMPL-O''' object complement
 
 
'''@APP''' apposition
 
 
'''@NPHR''' stray nominal
 
 
'''@N''' title
 
 
'''@O-ADVL''' object adverbial
 
 
'''@ADVL''' adverbial
 
 
'''@DN>''' determiner
 
 
'''@NN>''' premodifying noun
 
 
'''@AN>''' premodifying adjective
 
 
'''@QN>''' premodifying quantifier
 
 
'''@GN>''' premodifying genitive
 
 
'''@AD-A>''' premodifying ad-adjective
 
 
'''@<AD-A''' postmodifying ad-adjective
 
 
'''@<NOM-FMAINV''' postmodifying nonfinite verb
 
 
'''@<NOM''' other postmodifier
 
 
'''@<P-FMAINV''' nonfinite verb as complement of preposition
 
'''@<P''' other complement of preposition
 
 
'''@CC''' coordinator
 
 
'''@CS''' subordinator
 
 
'''@INFMARK''' infinitive marker
 
 
(ENGCG tags )
 
 
 
=Example=
 
 
 
As mentioned above, the syntactic tags are distinguished by the use of the `@' sign. The analysis is dependency based, but only partially. As can be seen in the table of ENGCG tags above, dependency relations are shown by the use of the left and right angle brackets, showing that a word is dependent on another to either the right or the left. In the example below, Karlsson is marked as `@<P', meaning that it is the complement of a preposition found earlier in the sentence.
 
 
 
"<*i>"
 
*"i" <*> <NonMod> PRON PERS NOM SG1 SUBJ @SUBJ
 
"<started>"
 
*"start" <SV> <SVO> <P/on> V PAST VFIN @+FMAINV
 
"<work>"
 
*"work" N NOM SG @OBJ
 
"<on>"
 
*"on" PREP @ADVL
 
"<an>"
 
*"an" <Indef> DET CENTRAL ART SG @DN>
 
"<*english>"
 
*"english" <*> <Nominal> A ABS @AN>
 
"<description>"
 
*"description" N NOM SG @<P
 
"<within>"
 
*"within" PREP @<NOM @ADVL
 
"<the>"
 
*"the" <Def> DET CENTRAL ART SG/PL @DN>
 
"<*constraint>"
 
*"constraint" <*> N NOM SG @NN>
 
"<*grammar>"
 
*"grammar" <*> N NOM SG @NN>
 
"<framework>"
 
*"framework" N NOM SG @<P
 
"<proposed>"
 
*"propose" <Vcog> <SVO> <SV> PCP2 @<NOM-FMAINV
 
"<by>"
 
*"by" PREP @ADVL
 
"<*karlsson>"
 
*"karlsson" <*> <Proper> N NOM SG @<P
 
"<$[>"
 
"<1990>"
 
*"1990" <1900> NUM CARD @ADVL
 
"<$;>"
 
"<1994a>"
 
*"1994a" <1994a> NUM CARD @ADVL
 
 
(ENGCG output)
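Output in this format is easy to process further. The Python sketch below reads cohorts as they are laid out in the listing above (a word form line such as "<the>" followed by reading lines), collects the tags that start with `@', and reports the side on which the head is expected according to the angle bracket. The function names are hypothetical, and the parser assumes reading lines may start with an asterisk or whitespace, as they do on this page.

```python
import re


def parse_cohorts(text):
    """Parse word form lines and their readings into a list of dicts."""
    cohorts = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith('"<') and line.endswith('>"'):
            # A word form line such as "<the>" opens a new cohort.
            cohorts.append({"form": line[2:-2], "readings": []})
        elif cohorts:
            # A reading line: optional leading asterisk, then "lemma" and tags.
            match = re.match(r'"([^"]*)"\s*(.*)', line.lstrip("*").strip())
            if match:
                lemma, rest = match.groups()
                tags = rest.split()
                cohorts[-1]["readings"].append({
                    "lemma": lemma,
                    "tags": tags,
                    "syntax": [t for t in tags if t.startswith("@")],
                })
    return cohorts


def head_direction(tag):
    """Return 'left' or 'right' for the side of the head, or None."""
    if not tag.startswith("@"):
        return None
    if tag.startswith("@<"):
        return "left"
    if tag.endswith(">"):
        return "right"
    return None


SAMPLE = '''"<the>"
*"the" <Def> DET CENTRAL ART SG/PL @DN>
"<framework>"
*"framework" N NOM SG @<P
'''

for cohort in parse_cohorts(SAMPLE):
    for reading in cohort["readings"]:
        directions = [head_direction(t) for t in reading["syntax"]]
        print(cohort["form"], reading["syntax"], directions)
```

Tags without an angle bracket, such as @SUBJ, carry no direction and yield None.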
 
 
 
 
 
=Publications=
 
 
 
 
 
'''Early general Constraint Grammar publications:'''
 
 
*Karlsson, Fred (1990). "Constraint grammar as a framework for parsing running text". In: Karlgren, Hans (ed.), Proceedings of 13th International Conference on Computational Linguistics, volume 3, pp. 168-173, Helsinki, Finland.
 
*Karlsson et al. (1995), "Constraint Grammar - A Language-Independent System for Parsing Unrestricted Text". Mouton de Gruyter
 
*Tapanainen, Pasi (1996). "The Constraint Grammar Parser CG-2". No 27, Publications of the Department of General Linguistics, University of Helsinki.
 
 
 
'''Some publications concerning VISL Constraint Grammar systems:'''
 
 
 
*Valverde, Pilar & Bick, Eckhard (2010). "A Web Corpus of Spanish Automatically Annotated with Semantic Roles". In: Sánchez, A. & M. Almela. 2010. A Mosaic of Corpus Linguistics. Selected Approaches. Berlin/Frankfurt: Peter Lang. [Oral presentation at: 1st International Conference on Corpus Linguistics (CILC-09), Murcia May 7-9 2009]
 
*Bick, Eckhard (2009). A Dependency Constraint Grammar for Esperanto. Constraint Grammar Workshop at NODALIDA 2009, Odense. NEALT Proceedings Series, Vol 8,
 
*Bick, Eckhard (2009). Introducing probabilistic information in Constraint Grammar parsing. Proceedings of Corpus Linguistics 2009, Liverpool, UK. Electronically published at ... (forthcoming)
 
*Bick, Eckhard & Valverde, Pilar (2009). Automatic Semantic Role Annotation for Spanish. Proceedings of NODALIDA 2009. NEALT Proceedings Series Vol. 4.
 
*Bick, Eckhard (2007). Automatic Semantic Role Annotation for Portuguese. In: Proceedings of TIL 2007 - 5th Workshop on Information and Human Language Technology / Anais do XXVII Congresso da SBC (Rio de Janeiro, July 5-6, 2007).
 
*Bick, Eckhard (2007), "Functional Aspects in Portuguese NER". In: Diana Santos & Nuno Cardoso (eds.), Reconhecimento de entidades mencionadas em português: Documentação e actas do HAREM, a primeira avaliação conjunta na área..
 
*Bick, Eckhard (2007), Dan2eng: Wide-Coverage Danish-English Machine Translation, In: Bente Maegaard (ed.), Proceedings of Machine Translation Summit XI, 10-14. Sept. 2007, Copenhagen, Denmark.
 
*Bick, Eckhard (2007), Tagging and Parsing an Artificial Language: An Annotated Web-Corpus of Esperanto, In: Proceedings of Corpus Linguistics 2007, Birmingham, UK. Electronically published at (http://ucrel.lancs.ac.uk/publications/CL2007/, Nov. 2007)
 
*Bick, Eckhard & Nygaard, Lars (2007). Using Danish as a CG Interlingua. A Wide-Coverage Norwegian-English Machine Translation System. In: Proceedings of the 16th Nordic Conference of Computational Linguistics. Tartu, Estonia. ISBN 978-9985-4-0514-7
 
*Bick, Eckhard (2006), Noun Sense Tagging: Semantic Prototype Annotation of a Portuguese Treebank, In: Hajic, Jan & Nivre, Joakim (red.), Proceedings of the Fifth Workshop on Treebanks and Linguistic Theories (December 1-2, 2006, Prague, Czech Republic),
 
*Bick, Eckhard (2006), A Constraint Grammar-Based Parser for Spanish. In: Proceedings of TIL 2006 - 4th Workshop on Information and Human Language Technology (Ribeirão Preto, October 27-28, 2006).
 
*Bick, Eckhard (2006), "Functional Aspects in Portuguese NER", in: Renata Vieira et al. (eds.) Computational Processing of the Portuguese Language (Proceedings of PROPOR 2006, Itatiaia, May 15th-17th, 2006),
 
*Bick, Eckhard (2006), "A Constraint Grammar Based Spellchecker for Danish with a Special Focus on Dyslexics". In: Suominen, Mickael et.al. (ed.) A Man of Measure: Festschrift in Honour of Fred Karlsson on his 60th Birthday. Special Supplement to SKY Journal of Linguistics, Vol. 19 (ISSN 1796-279X),
 
*Bick, Eckhard (2005), Turning Constraint Grammar Data into Running Dependency Treebanks, In: Civit, Montserrat & Kübler, Sandra & Martí, Ma. Antònia (red.), Proceedings of TLT 2005 (4th Workshop on Treebanks and Linguistic Theory, Barcelona, December 9th - 10th, 2005),
 
*Bick, Eckhard (2005), Gramática Constritiva na Análise Automática de Sintaxe Portuguesa. In: Berber Sardinha, Tony (ed.), A Língua Portuguesa no Computador [The Portuguese Language on the Computer]. Campinas: Mercado de Letras, São Paulo:
 
*Bick, Eckhard (2004), PaNoLa: Integrating Constraint Grammar and CALL, In: Henrik Holmboe (red.), Nordic Language Technology, Årbog for Nordisk Sprogteknologisk Forskningsprogram 2000-2004 (Yearbook 2003).
 
*Bick, Eckhard (2004), Parsing and evaluating the French Europarl corpus, In: Patrick Paroubek, Isabelle Robba & Anne Vilnat (red.): Méthodes et outils pour l'évaluation des analyseurs syntaxiques (Journée ATALA, May 15, 2004).
 
*Bick, Eckhard (2003). "A Constraint Grammar Based Question-Answering System for Portuguese". In: Fernando Moura Pires & Salvador (eds.) Progress in Artificial Intelligence (Proceedings of EPIA'2003, Beja, Dec. 2003)
 
*Bick, Eckhard (2003), A CG & PSG Hybrid Approach to Automatic Corpus Annotation, in Kiril Simow & Petya Osenova: Proceedings of SProLaC2003 (at Corpus Linguistics 2003, Lancaster),
 
*Bick, Eckhard (2001), En Constraint Grammar Parser for Dansk, in Peter Widell & Mette Kunøe (eds.) 8. Møde om Udforskningen af Dansk Sprog, 12.-13. oktober 2000, pp. 40-50, Århus University
 
*Bick, Eckhard (2000), The Parsing System Palavras - Automatic Grammatical Analysis of Portuguese in a Constraint Grammar Framework, Aarhus: Aarhus University Press (preprint version) -- dr.phil. thesis (cf. the Disputatio for an introduction)
 
*Bick, Eckhard (1998), Tagging Speech Data - Constraint Grammar Analysis of Spoken Portuguese, in: Proceedings of the 17th Scandinavian Conference of Linguistics, (Odense 1998)
 
*Bick, Eckhard (1996), Automatic Parsing of Portuguese. In García, Laura Sánchez (ed.), Anais / II Encontro para o Processamento Computacional de Português Escrito e Falado. Curitiba: CEFET-PR.
 
 
 
'''Other publications concerning Constraint Grammar'''
 
 
 
*Antonsen, Lene & Huhmarniemi, Saara & Trosterud, Trond (2009). Constraint Grammar in Dialogue systems. Constraint Grammar Workshop at NODALIDA 2009, Odense. NEALT Proceedings Series, Vol 8, pp.13-21. Tartu: Tartu University Library.
 
*Antonsen, Lene & Huhmarniemi, Saara & Trosterud, Trond (2009). Interactive pedagogical programs based on Constraint Grammar. Constraint Grammar Workshop at NODALIDA 2009, Odense. NEALT Proceedings Series, Vol 8, pp.10-17. Tartu: Tartu University Library.
 
*Lindström, Liina & Müürisep, Kaili (2009). Parsing Corpus of Estonian Dialects. Constraint Grammar Workshop at NODALIDA 2009, Odense. NEALT Proceedings Series, Vol 8, pp. 22-29. Tartu: Tartu University Library.
 
*Trosterud, Trond (2009). A Constraint Grammar for Faroese. Constraint Grammar Workshop at NODALIDA 2009, Odense. NEALT Proceedings Series, Vol 8, pp.1-7. Tartu: Tartu University Library.
 
*Dhonnchadha, E. Uí (2006). "A Part-of-speech tagger for Irish using Finite-State Morphology and Constraint Grammar Disambiguation". In: Proceedings of LREC'06. Genova, Italy.
 
*Atserias, J. et al. (2006). "FreeLing 1.3: Syntactic and semantic services in an open-source NLP library". In: Proceedings of LREC'06. Genoa, Italy (2006)
 
*Hurskainen, Arvi (2006), Constraint Grammar in Unconventional Use: Handling complex Swahili idioms and proverbs. In: Suominen, Mickael et.al. (ed.) A Man of Measure: Festschrift in Honour of Fred Karlsson on his 60th Birthday. Special Supplement to SKY Journal of Linguistics, Vol. 19, pp. 397-406. Turku: The Linguistic Association of Finland
 
*Müürisep, Kaili and Uibo, Heli. "Shallow Parsing of Spoken Estonian Using Constraint Grammar". In: P.J.Henriksen & P.R.Skadhauge, Proceedings of NODALIDA-2005 special session on treebanking. Copenhagen Studies in Language #33/2006.
 
*Müürisep, Kaili et al. (2003). A New Language for Constraint Grammar: Estonian. In: International Conference Recent Advances in Natural Language Processing. Proceedings. Borovets, Bulgaria, 10-12 September 2003,
 
*Hagen, Kristin & Lane, Pia. & Trosterud, Trond (2001). "En grammatikkontrol for bokmål". In: Kjell Ivar Vannebo & Helge Sandøy (eds.): Språkknyt 3-2001.
 
*Hagen, K., Johannessen, J. B., Nøklestad, A. (2000). "A Constraint-Based Tagger for Norwegian". In: Lindberg, C.-E. og Lund, S.N. (red.): 17th Scandinavian Conference of Linguistics, Odense. Odense Working Papers in Language and Communication, No. 19, vol I.
 
*Arppe, Antti (2000). "Developing a grammar checker for Swedish". In: Nordgård, T. (ed.) Nodalida'99 Proceedings. Department of Linguistics, University of Trondheim.
 
*Birn, Jussi (2000). "Detecting grammar errors with Lingsoft's Swedish grammar checker". In: Nordgård, T. (ed.) Nodalida'99 Proceedings. Department of Linguistics, University of Trondheim.
 
*Lager, Torbjörn (1999). "The µ-TBL System: Logic Programming Tools for Transformation-Based Learning". In: Proceedings of CoNLL'99, Bergen.
 
*Padró, L.(1996). "POS Tagging Using Relaxation Labelling". In: Proceedings of COLING '96. Copenhagen, Denmark.
 
*Hurskainen, Arvi (1996). "Disambiguation of morphological analysis in Bantu languages". In: Proceedings of the 16th conference on Computational Linguistics. Copenhagen:ACL. Vol.1,
 
*Chanod, Jean-Pierre & Tapanainen, Pasi, "Tagging French - comparing a statistical and a constraint- based method", adapted from: Statistical and Constraint- based Taggers for French, Technical report MLTT-016, Rank Xerox Research Centre, Grenoble, 1994
 
*Voutilainen, Atro, Juha Heikkilä, and Arto Anttila (1992). "Constraint Grammar of English - A Performance-Oriented Introduction". No. 21, Publications of the Department of General Linguistics, University of Helsinki.
 
   
  +
;Constraint grammar
   
   

Revision as of 13:56, 30 November 2010

Download

Apertium
Constraint grammar

Install

Apertium
Constraint grammar