Cascaded Interchunk

Chunking

Chunking is based on source language patterns. It is used in language pairs such as English-Esperanto.

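First, words are grouped into chunks by matching patterns of word categories, like adj+noun or adj+adj+noun. Words may also be reordered inside a chunk at this point.

Then, the chunks themselves are reordered by matching patterns of chunk types.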

Each chunk is given a ‘pseudo lemma’ with a tag containing the chunk type – normally ‘SN’ (Noun Phrase) or ‘SV’ (Verb Phrase).
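The grouping step can be sketched in a few lines of Python. This is only an illustration, not Apertium's rule format: the tag names (det, adj, noun, verb, prep), the pattern table, the UNKNOWN fallback type and the way the pseudo lemma is built from the matched pattern are all assumptions made for the example. It assumes patterns are tried left to right, preferring the longest match.

```python
from dataclasses import dataclass
from typing import List, Tuple

# (surface form, part-of-speech tag); the tag names are illustrative only.
Word = Tuple[str, str]


@dataclass
class Chunk:
    pseudo_lemma: str  # hypothetical: built by joining the matched tags
    chunk_type: str    # "SN" (noun phrase) or "SV" (verb phrase)
    words: List[Word]


# Hypothetical pattern table: a sequence of tags mapped to a chunk type.
PATTERNS = [
    (("adj", "adj", "noun"), "SN"),
    (("det", "noun"), "SN"),
    (("adj", "noun"), "SN"),
    (("noun",), "SN"),
    (("verb", "prep"), "SV"),
    (("verb",), "SV"),
]


def chunk(words: List[Word]) -> List[Chunk]:
    """Group tagged words into chunks, longest pattern first."""
    patterns = sorted(PATTERNS, key=lambda p: len(p[0]), reverse=True)
    chunks: List[Chunk] = []
    i = 0
    while i < len(words):
        for tags, chunk_type in patterns:
            window = words[i:i + len(tags)]
            if tuple(tag for _, tag in window) == tags:
                chunks.append(Chunk("_".join(tags), chunk_type, window))
                i += len(tags)
                break
        else:
            # No pattern matched: pass the word through as its own chunk.
            chunks.append(Chunk(words[i][1], "UNKNOWN", [words[i]]))
            i += 1
    return chunks


tagged = [("big", "adj"), ("red", "adj"), ("dog", "noun"), ("barks", "verb")]
for c in chunk(tagged):
    print(c.pseudo_lemma, c.chunk_type, [w for w, _ in c.words])
# adj_adj_noun SN ['big', 'red', 'dog']
# verb SV ['barks']
```

Trying longer patterns first is what makes "big red dog" come out as one adj+adj+noun chunk instead of a stray adjective followed by an adj+noun chunk.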

After this, the translation is done with these pseudo words rather than with the individual words, so each sentence is handled as a short sequence of phrases.

Chunks for an English phrase may look like:

SN (The dog)    SV (played with)    SN (the boy)

"The dog" is a noun phrase, and so is "the boy", so both are chunked as SN.

"played with" is a verb phrase, so it is chunked as SV rather than SN.
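Once a sentence is reduced to chunks like these, later rules can match on the chunk types alone. The Python sketch below illustrates that idea under stated assumptions: the rule table and the `reorder` function are hypothetical, and the single rule (move the verb phrase after the second noun phrase) is invented for the example. It is not a real rule from the English-Esperanto pair.

```python
from typing import Dict, List, Tuple

# (chunk type, text covered by the chunk)
Chunk = Tuple[str, str]

# Hypothetical rule table: a pattern of chunk types mapped to the new
# order of the matched positions. (0, 2, 1) moves the SV to the end.
RULES: Dict[Tuple[str, ...], Tuple[int, ...]] = {
    ("SN", "SV", "SN"): (0, 2, 1),
}


def reorder(chunks: List[Chunk]) -> List[Chunk]:
    """Apply the first matching rule at each position, left to right."""
    out: List[Chunk] = []
    i = 0
    while i < len(chunks):
        for pattern, order in RULES.items():
            window = chunks[i:i + len(pattern)]
            if tuple(kind for kind, _ in window) == pattern:
                out.extend(window[j] for j in order)
                i += len(pattern)
                break
        else:
            # No rule applies here: keep the chunk where it is.
            out.append(chunks[i])
            i += 1
    return out


sentence = [("SN", "The dog"), ("SV", "played with"), ("SN", "the boy")]
print(reorder(sentence))
# [('SN', 'The dog'), ('SN', 'the boy'), ('SV', 'played with')]
```

The rule never looks inside a chunk: it sees only SN and SV, which is what keeps the number of patterns small compared with rules written over individual words.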

This method is used in shallow-transfer translation engines such as Apertium because it does not need parse trees, which are normally used in "deep transfer". See Parse tree on Wikipedia.
