Difference between revisions of "Chunking"
Revision as of 15:20, 4 September 2008
<jacobn> But really I have a big problem about all this "shallow transfer".
<spectie> shallow transfer = no parse trees
<spectie> basically
<jimregan2> yep
<jacobn> HOW is reordering of the phrase then going to happen!!
<jimregan2> we use chunking
<jimregan2> first we reorder words in the chunk, then we reorder chunks
<jacobn> Pls tell me 'bout it or point to a web page
<jimregan2> um
<jimregan2> it's easy enough
<jimregan2> first, we match phrase patterns
<jimregan2> adj+noun
<jimregan2> adj+adj+noun
<jimregan2> from these, we make a 'pseudo lemma', with a tag containing the type - normally 'SN' (noun phrase) or SV (verb phrase)
<jimregan2> then, we translate based on these pseudo words
<jimregan2> breaking the language down to its bare essentials, basically
<jimregan2> at the moment, I'm taking the 'hard wired' parts of the english to spanish chunker, and adapting it for french
<jimregan2> changing 'más' to 'plus' in a macro, etc.
<spectie> but the chunks cannot be recursive
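
The two-stage idea from the conversation above (match phrase patterns, build a pseudo lemma tagged SN or SV, reorder words inside each chunk, then reorder the chunks themselves) can be sketched roughly as follows. This is only a toy illustration in Python, not Apertium's actual transfer code or rule format: the `Chunk` class, the `PATTERNS` table, the part-of-speech names and both reordering rules are assumptions made up for the example. Chunks here hold words only, never other chunks, which mirrors the point that chunks cannot be recursive.

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    pseudo_lemma: str  # name built from the matched pattern, e.g. "adj_noun"
    tag: str           # chunk type: "SN" (noun phrase) or "SV" (verb phrase)
    words: list        # flat list of (lemma, pos) pairs; no nested chunks

# Hypothetical phrase patterns, longest first so adj+adj+noun wins over adj+noun.
PATTERNS = [
    (("adj", "adj", "noun"), "SN"),
    (("adj", "noun"), "SN"),
    (("noun",), "SN"),
    (("verb",), "SV"),
]

def chunk(words):
    """Group (lemma, pos) pairs into flat chunks by matching phrase patterns."""
    chunks = []
    i = 0
    while i < len(words):
        for pattern, tag in PATTERNS:
            window = words[i:i + len(pattern)]
            if tuple(pos for _, pos in window) == pattern:
                chunks.append(Chunk("_".join(pattern), tag, list(window)))
                i += len(pattern)
                break
        else:
            # No pattern matched: keep the word as a one-word chunk of its own.
            chunks.append(Chunk(words[i][1], "other", [words[i]]))
            i += 1
    return chunks

def reorder_inside(c):
    """Stage 1 (toy rule): inside an SN chunk, put the noun before its adjectives."""
    if c.tag != "SN":
        return c
    nouns = [w for w in c.words if w[1] == "noun"]
    rest = [w for w in c.words if w[1] != "noun"]
    return Chunk(c.pseudo_lemma, c.tag, nouns + rest)

def reorder_chunks(chunks):
    """Stage 2 (toy rule): swap each adjacent SN SV pair so the verb phrase comes first."""
    out = list(chunks)
    i = 0
    while i + 1 < len(out):
        if out[i].tag == "SN" and out[i + 1].tag == "SV":
            out[i], out[i + 1] = out[i + 1], out[i]
            i += 2
        else:
            i += 1
    return out

def transfer(words):
    """Run both stages and return the reordered lemmas."""
    staged = reorder_chunks([reorder_inside(c) for c in chunk(words)])
    return [lemma for c in staged for lemma, _ in c.words]
```

For example, `transfer([("big", "adj"), ("red", "adj"), ("house", "noun"), ("fall", "verb")])` first builds an SN chunk with pseudo lemma `adj_adj_noun` and an SV chunk, reorders the SN chunk internally, and then swaps the two chunks. The second stage only ever looks at chunk tags, which is what makes the pseudo words useful: the chunk-level rules never need to know how many adjectives a noun phrase contained.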
See also
- Apertium stream format#Chunks
- Preparing to use apertium-transfer-tools
External links
- wikipedia
- Chunking (Natural Language Toolkit)
- CRFChunker (Conditional Random Fields English Phrase Chunker)
- JTextPro (A Java-based Text Processing Toolkit)