Layouts

From Apertium
Revision as of 11:17, 24 July 2007 by Francis Tyers

Apertium linguistic data packages can be laid out in several ways.

Apertium 1.0

Apertium 2.0

Apertium 3.0

Example using English-Afrikaans:

Filename                        Comment
apertium-en-af.af.dix.xml       Afrikaans monolingual dictionary
apertium-en-af.en-af.dix.xml    English-Afrikaans bilingual dictionary
apertium-en-af.en.dix.xml       English monolingual dictionary
apertium-en-af.symbols.xml      List of grammatical symbols
apertium-en-af.af-en.t1x        Afrikaans-English first stage transfer file (transfer)
apertium-en-af.af-en.t2x        Afrikaans-English second stage transfer file (interchunk)
apertium-en-af.af-en.t3x        Afrikaans-English third stage transfer file (postchunk)
apertium-en-af.en-af.t1x        English-Afrikaans first stage transfer file (transfer)
apertium-en-af.en-af.t2x        English-Afrikaans second stage transfer file (interchunk)
apertium-en-af.en-af.t3x        English-Afrikaans third stage transfer file (postchunk)
modes.xml                       Modes specification file (see modes)
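
The naming pattern in the table can be sketched in a few lines of Python. This is only an illustration: the helper name expected_files and the default basename apertium-LANG1-LANG2 are assumptions made here, not part of any Apertium tool.

```python
def expected_files(lang1, lang2, basename=None):
    """List the file names this layout uses for a language pair.

    Hypothetical helper; the default basename follows the
    apertium-en-af example from the table.
    """
    base = basename or f"apertium-{lang1}-{lang2}"
    files = [
        f"{base}.{lang1}.dix.xml",          # monolingual dictionary, first language
        f"{base}.{lang2}.dix.xml",          # monolingual dictionary, second language
        f"{base}.{lang1}-{lang2}.dix.xml",  # bilingual dictionary
        f"{base}.symbols.xml",              # shared grammatical symbols
    ]
    # Three transfer stages (transfer, interchunk, postchunk) per direction.
    for src, dst in ((lang1, lang2), (lang2, lang1)):
        for stage in ("t1x", "t2x", "t3x"):
            files.append(f"{base}.{src}-{dst}.{stage}")
    files.append("modes.xml")
    return files


if __name__ == "__main__":
    for name in expected_files("en", "af"):
        print(name)
```

Called with "en" and "af", the function returns the eleven file names listed in the table, in a different order.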

Using separate include files

What distinguishes this layout is its use of separate include files. Most packages use them only to define symbols, but the same mechanism could hold, for example, a large list of proper nouns, entries for different domains, or almost anything else.

In all of the dictionary files:

<?xml version="1.0" encoding="UTF-8"?>
<dictionary>
<alphabet>ÄÊËÖÜäêëöüßABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz</alphabet>

  <!-- Symbol definitions -->
  <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="apertium-en-af.symbols.xml"/>

  <pardefs>
    
    ...
  
  </pardefs>
  <section id="main" type="inconditional">

    ...

  </section>
</dictionary>

In the apertium-en-af.symbols.xml file:

<?xml version="1.0" encoding="UTF-8"?> <!-- -*- nxml -*- -->

  <sdefs>
    <sdef n="n"       c="Noun"/>
    <sdef n="m"       c="Masculine"/>
    <sdef n="f"       c="Feminine"/>

    ...
  </sdefs>
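
To see what the include does, the following sketch expands an xi:include element using Python's standard library instead of xmllint. It is a stand-in for illustration only: the cut-down dictionary, the shortened alphabet, and the function names expand_includes and demo are assumptions, and xml.etree.ElementInclude supports only a subset of XInclude.

```python
import tempfile
from pathlib import Path
from xml.etree import ElementInclude, ElementTree

# Cut-down versions of the two files shown above (contents abbreviated).
SYMBOLS = """<?xml version="1.0" encoding="UTF-8"?>
<sdefs>
  <sdef n="n" c="Noun"/>
  <sdef n="m" c="Masculine"/>
  <sdef n="f" c="Feminine"/>
</sdefs>
"""

DICTIONARY = """<?xml version="1.0" encoding="UTF-8"?>
<dictionary>
  <alphabet>abc</alphabet>
  <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="apertium-en-af.symbols.xml"/>
  <pardefs/>
  <section id="main" type="inconditional"/>
</dictionary>
"""


def expand_includes(dix_xml_path):
    """Parse a .dix.xml file and replace each xi:include with the included tree."""
    path = Path(dix_xml_path)
    root = ElementTree.parse(str(path)).getroot()

    def loader(href, parse, encoding=None):
        # Resolve the href relative to the including file, XML includes only.
        if parse != "xml":
            raise ValueError("only XML includes are handled in this sketch")
        return ElementTree.parse(str(path.parent / href)).getroot()

    ElementInclude.include(root, loader=loader)
    return root


def demo():
    """Write both files to a temporary directory and expand the include."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp = Path(tmp)
        (tmp / "apertium-en-af.symbols.xml").write_text(SYMBOLS, encoding="utf-8")
        (tmp / "apertium-en-af.en.dix.xml").write_text(DICTIONARY, encoding="utf-8")
        root = expand_includes(tmp / "apertium-en-af.en.dix.xml")
        tags = [child.tag for child in root]
        symbols = [sdef.get("n") for sdef in root.iter("sdef")]
        return tags, symbols


if __name__ == "__main__":
    print(demo())
```

After expansion the sdefs element sits where the xi:include element was, so the result has the shape of a single-file dictionary.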

Then, in Makefile.am, add the following entries to TARGETS_COMMON, ahead of the existing ones (the trailing backslash continues the line onto them):

TARGETS_COMMON = $(BASENAME).$(LANG1).dix $(BASENAME).$(LANG2).dix $(BASENAME).$(LANG1)-$(LANG2).dix \

And add the following targets underneath:

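# Each .dix is rebuilt when its .dix.xml source or the symbols file changes.
# Assumes the symbols file is named $(BASENAME).symbols.xml, as in the example above.
# Recipe lines must begin with a tab character.
$(BASENAME).$(LANG1).dix: $(BASENAME).$(LANG1).dix.xml $(BASENAME).symbols.xml
	xmllint --xinclude $(BASENAME).$(LANG1).dix.xml > $(BASENAME).$(LANG1).dix
$(BASENAME).$(LANG2).dix: $(BASENAME).$(LANG2).dix.xml $(BASENAME).symbols.xml
	xmllint --xinclude $(BASENAME).$(LANG2).dix.xml > $(BASENAME).$(LANG2).dix
$(BASENAME).$(LANG1)-$(LANG2).dix: $(BASENAME).$(LANG1)-$(LANG2).dix.xml $(BASENAME).symbols.xml
	xmllint --xinclude $(BASENAME).$(LANG1)-$(LANG2).dix.xml > $(BASENAME).$(LANG1)-$(LANG2).dix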
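
The three rules differ only in the dictionary name. As a rough illustration, the Python sketch below prints the commands they would run, assuming BASENAME is apertium-en-af, LANG1 is en and LANG2 is af as in the example above. The helper name xinclude_commands is hypothetical, and the sketch only builds the command strings without running them.

```python
def xinclude_commands(basename, lang1, lang2):
    """Build the xmllint command line for each of the three dictionaries."""
    names = [lang1, lang2, f"{lang1}-{lang2}"]
    return [
        f"xmllint --xinclude {basename}.{name}.dix.xml > {basename}.{name}.dix"
        for name in names
    ]


if __name__ == "__main__":
    for command in xinclude_commands("apertium-en-af", "en", "af"):
        print(command)
```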

You may also want to specify a clean-dicts target:

# rm -f does not fail when a file has not been generated yet.
clean-dicts:
	rm -f $(BASENAME).$(LANG1).dix
	rm -f $(BASENAME).$(LANG2).dix
	rm -f $(BASENAME).$(LANG1)-$(LANG2).dix