Difference between revisions of "Linguistic Resources Document"

Revision as of 13:40, 10 March 2008

A Linguistic Resources Document (LRD) is an XML document consisting of a set of linguistic resources (dictionaries, cross models, corpora, links to other LRDs, etc.).

This document can be used with apertium-crossdics to indicate which resources (dictionaries and cross models) can be crossed.

Structure of the document

Overview

<?xml version="1.0" encoding="UTF-8"?>

<ling-resources>
   <name>...</name>
   <description>...</description>
   
   <resource>
      <property name="..." value="..."/>
      <property name="..." value="..."/>
      <property name="..." value="..."/>
      ...
   </resource>

   <resource-set>
      <name>...</name>
      <description>...</description>

      <resource>
         <property name="..." value="..."/>
         ...
      </resource>
      <resource>
         <property name="..." value="..."/>
         ...
      </resource>
      ...
   </resource-set>  

   <resource>
      <property name="..." value="..."/>
      <property name="..." value="..."/>
      <property name="..." value="..."/>
      ...
   </resource>
   ...  
</ling-resources>
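
As a rough illustration of how this layout can be read, the following Python sketch walks the direct children of `ling-resources` and reports each `resource` and `resource-set` in document order. It is only a sketch: the function name `outline` and the sample document are made up for this example, and it is not how apertium-crossdics itself reads the file.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample following the skeleton above.
SAMPLE_LRD = """
<ling-resources>
   <name>Sample</name>
   <description>Two loose resources and one set</description>
   <resource>
      <property name="name" value="apertium-es"/>
      <property name="type" value="mon"/>
   </resource>
   <resource-set>
      <name>es-ca</name>
      <description>Spanish-Catalan data</description>
      <resource>
         <property name="name" value="apertium-es-ca"/>
         <property name="type" value="bil"/>
      </resource>
   </resource-set>
   <resource>
      <property name="name" value="corpus-es"/>
      <property name="type" value="crp"/>
   </resource>
</ling-resources>
"""

def outline(lrd_text):
    """Return (tag, label) pairs for the direct children that hold resources."""
    root = ET.fromstring(lrd_text.strip())
    entries = []
    for child in root:
        if child.tag == "resource":
            # A resource is labelled by its "name" property, if it has one.
            names = [p.get("value") for p in child.findall("property")
                     if p.get("name") == "name"]
            entries.append(("resource", names[0] if names else None))
        elif child.tag == "resource-set":
            # A resource set is labelled by its <name> child element.
            entries.append(("resource-set", child.findtext("name")))
    return entries

print(outline(SAMPLE_LRD))
```

The `name` and `description` children of the root are skipped because they describe the document rather than a resource.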

Resource

A resource is defined by a set of properties.

<resource>
   <property name="name" value="apertium-es"/>
   <property name="type" value="mon"/>
   <property name="sl" value="es"/>
   <property name="for-crossing" value="yes"/>
   <property name="src" value="apertium-es-ca.es.dix"/>
   <property name="version" value="stable"/>
</resource>

A resource can have the following properties:

  • name: the name of the resource.
  • type: the type of resource. Possible values are:
    • mon: morphological dictionary.
    • bil: bilingual dictionary.
    • crp: corpus.
    • lrd: link to another Linguistic Resources Document.
    • cross-model: cross model document.
  • sl: source language (for example, in morphological and bilingual dictionaries)
  • tl: target language (for example, in bilingual dictionaries)
  • src: source (URL or file path)
  • version: version of the resource (for example, for dictionaries: stable, unstable, pre-alpha, etc.).
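  • more? (this list is not complete; for example, the for-crossing property used in the examples on this page is not described here)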
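
The property list can be turned into a simple check. The Python sketch below converts one `resource` element into a dictionary and reports anything that does not match the list. The helper names are hypothetical, and two of the rules (a bilingual dictionary needs `tl`, every resource needs `src`) are assumptions made for the example, not requirements stated on this page.

```python
import xml.etree.ElementTree as ET

# Type values taken from the list above; treating them as the full set is an assumption.
KNOWN_TYPES = {"mon", "bil", "crp", "lrd", "cross-model"}

def resource_properties(resource_xml):
    """Turn one <resource> element into a {name: value} dictionary."""
    element = ET.fromstring(resource_xml.strip())
    return {p.get("name"): p.get("value") for p in element.findall("property")}

def check_resource(props):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if props.get("type") not in KNOWN_TYPES:
        problems.append("unknown type: %r" % props.get("type"))
    # Assumed rule: a bilingual dictionary should name its target language.
    if props.get("type") == "bil" and "tl" not in props:
        problems.append("bilingual dictionary without tl")
    # Assumed rule: every resource should say where it lives.
    if "src" not in props:
        problems.append("missing src")
    return problems

EXAMPLE = """
<resource>
   <property name="name" value="apertium-es"/>
   <property name="type" value="mon"/>
   <property name="sl" value="es"/>
   <property name="for-crossing" value="yes"/>
   <property name="src" value="apertium-es-ca.es.dix"/>
   <property name="version" value="stable"/>
</resource>
"""

props = resource_properties(EXAMPLE)
print(props["type"], check_resource(props))
```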

Set of resources

It is possible to group a number of resources with the resource-set tag, as follows:

<resource-set>
   <name></name>
   <description></description>
   <resource>
      <property name="" value=""/>
      ...
   </resource>
   <resource>
      <property name="" value=""/>
      ...
   </resource>
   ...
</resource-set>

This grouping is useful for collecting the linguistic data that belongs to a particular language pair.
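
To see which languages a given set covers, one possible approach is to collect the `sl` and `tl` values of the resources inside each `resource-set`. The Python sketch below does that; the function name `languages_per_set` and the sample set are invented for illustration.

```python
import xml.etree.ElementTree as ET

def languages_per_set(lrd_text):
    """Map each resource-set name to the sorted language codes its resources mention."""
    root = ET.fromstring(lrd_text.strip())
    result = {}
    for rset in root.findall("resource-set"):
        langs = set()
        # Look at the properties of the resources directly inside this set.
        for prop in rset.findall("resource/property"):
            if prop.get("name") in ("sl", "tl"):
                langs.add(prop.get("value"))
        result[rset.findtext("name")] = sorted(langs)
    return result

# Hypothetical set for illustration.
SAMPLE = """
<ling-resources>
   <resource-set>
      <name>es-ca data</name>
      <description>Spanish and Catalan resources</description>
      <resource>
         <property name="type" value="mon"/>
         <property name="sl" value="es"/>
      </resource>
      <resource>
         <property name="type" value="bil"/>
         <property name="sl" value="es"/>
         <property name="tl" value="ca"/>
      </resource>
   </resource-set>
</ling-resources>
"""

print(languages_per_set(SAMPLE))
```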

Types of resources

These are some of the resources that can be defined.

Morphological dictionary

<resource>
   <property name="name" value="apertium-es"/>
   <property name="type" value="mon"/>
   <property name="sl" value="es"/>
   <property name="src" value="apertium-es-ca.es.dix"/>
   <property name="version" value="stable"/>
</resource>

Bilingual dictionary

<resource>
   <property name="name" value="apertium-es-ca"/>
   <property name="type" value="bil"/>
   <property name="sl" value="es"/>
   <property name="tl" value="ca"/>
   <property name="src" value="apertium-es-ca.es-ca.dix"/>
   <property name="version" value="stable"/>
</resource>

Cross model

<resource>
   <property name="name" value="cm-es-ca-en"/>
   <property name="type" value="cross-model"/>
   <property name="sl" value="es"/>
   <property name="tl" value="en"/>
   <property name="src" value="cross-model-es-ca-en.xml"/>
</resource>

Corpus

<resource>
   <property name="name" value="corpus-es"/>
   <property name="type" value="crp"/>
   <property name="sl" value="es"/>
   <property name="src" value="es-corpus.crp"/>
</resource>

Linguistic Resources Document

<resource>
   <property name="name" value="other-ling-resources"/>
   <property name="type" value="lrd"/>
   <property name="src" value="other-ling-resources.xml"/>
</resource>
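
Because an `lrd` resource points at another document, a tool that wants every resource has to follow those links. The Python sketch below shows one way this could be done, with a guard against documents that link back to each other. It is an assumption about how such links might be resolved, not a description of any Apertium tool; the `load` callback, the file names and the in-memory documents are all made up.

```python
import xml.etree.ElementTree as ET

def collect_resources(src, load, seen=None):
    """Return property dicts for all non-lrd resources reachable from `src`."""
    seen = set() if seen is None else seen
    if src in seen:          # guard against documents that link to each other
        return []
    seen.add(src)
    root = ET.fromstring(load(src))
    found = []
    for res in root.iter("resource"):   # also finds resources inside a resource-set
        props = {p.get("name"): p.get("value") for p in res.findall("property")}
        if props.get("type") == "lrd":
            # Follow the link and merge in whatever the other document defines.
            found.extend(collect_resources(props["src"], load, seen))
        else:
            found.append(props)
    return found

# In-memory stand-in for files; the file names and contents are hypothetical.
DOCS = {
    "main.xml": """<ling-resources>
        <resource>
           <property name="name" value="corpus-es"/>
           <property name="type" value="crp"/>
        </resource>
        <resource>
           <property name="name" value="other-ling-resources"/>
           <property name="type" value="lrd"/>
           <property name="src" value="other.xml"/>
        </resource>
     </ling-resources>""",
    "other.xml": """<ling-resources>
        <resource>
           <property name="name" value="apertium-es"/>
           <property name="type" value="mon"/>
        </resource>
        <resource>
           <property name="name" value="back-link"/>
           <property name="type" value="lrd"/>
           <property name="src" value="main.xml"/>
        </resource>
     </ling-resources>""",
}

names = [r["name"] for r in collect_resources("main.xml", DOCS.__getitem__)]
print(names)
```

Passing the loader as a function keeps the sketch independent of whether `src` is a file path or a URL.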

Example of LRD

<?xml version="1.0" encoding="UTF-8"?>

<!-- Linguistic resources-->
<ling-resources>
   <name>My linguistic resources</name>
   <description>My linguistic resources: morphological and bilingual dictionaries, cross models, corpora, etc.</description>
   
   <resource-set>
      <name>My linguistic resources for the English-Spanish language pair.</name>
      <description>A description of this resource set</description>

      <!-- cross model en-ca-es -->
      <resource>
         <property name="name" value="cross-model-en-ca-es"/>
         <property name="type" value="cross-model"/>
         <property name="sl" value="en"/>
         <property name="tl" value="es"/>      
         <property name="for-crossing" value="yes"/>
         <property name="src" value="cross-model-es-ca-en.xml"/>
         <property name="version" value="stable"/>
      </resource>
      
      <!-- cross model es-ca-en -->
      <resource>
         <property name="name" value="cross-model-es-ca-en"/>
         <property name="type" value="cross-model"/>
         <property name="sl" value="es"/>
         <property name="tl" value="en"/>      
         <property name="for-crossing" value="yes"/>
         <property name="src" value="cross-model-es-ca-en.xml"/>
         <property name="version" value="stable"/>
      </resource>
      
      <!-- 'es' morphological dictionary -->
      <resource>
         <property name="name" value="apertium-es"/>
         <property name="type" value="mon"/>
         <property name="sl" value="es"/>
         <property name="for-crossing" value="yes"/>
         <property name="src" value="apertium-es-ca.es.dix"/>
         <property name="version" value="stable"/>
      </resource>
      
      <!-- 'en' morphological dictionary -->
      <resource>
         <property name="name" value="apertium-en"/>
         <property name="type" value="mon"/>
         <property name="sl" value="en"/>
         <property name="for-crossing" value="yes"/>
         <property name="src" value="apertium-en-ca.en.metadix"/>
         <property name="version" value="stable"/>
      </resource>
      
      <!-- 'en-ca' bilingual dictionary -->   
      <resource>
         <property name="name" value="apertium-en-ca"/>
         <property name="type" value="bil"/>
         <property name="sl" value="en"/>
         <property name="tl" value="ca"/>
         <property name="for-crossing" value="yes"/>
         <property name="src" value="apertium-en-ca.en-ca.dix"/>
         <property name="version" value="stable"/>
      </resource>
      
      <!-- 'es-ca' bilingual dictionary -->
      <resource>
         <property name="name" value="apertium-es-ca"/>
         <property name="type" value="bil"/>
         <property name="sl" value="es"/>
         <property name="tl" value="ca"/>
         <property name="for-crossing" value="yes"/>
         <property name="src" value="apertium-es-ca.es-ca.dix"/>
         <property name="version" value="stable"/>
      </resource>
   </resource-set>
   
   <!-- Single corpus file -->
   <resource>
      <property name="name" value="corpus-es"/>
      <property name="type" value="crp"/>
      <property name="sl" value="es"/>
      <property name="src" value="corpus-es.crp"/>        
   </resource>
   
   <!-- Repository (files like this) -->
   <resource>
      <property name="name" value="other-resources-1"/>
      <property name="type" value="lrd"/>
      <property name="src" value="other-ling-resources-file.xml"/>        
   </resource>
   
</ling-resources>
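
The example above marks several resources with `for-crossing`. As a hedged illustration of what "can be crossed" might mean in practice, the Python sketch below picks the bilingual dictionaries marked `for-crossing="yes"` and reports pairs that share exactly one language, which can then act as the pivot. This is a simplification written for this page and makes no claim about the algorithm apertium-crossdics uses; the function name and the reduced sample document are hypothetical.

```python
import xml.etree.ElementTree as ET
from itertools import combinations

def crossable_pairs(lrd_text):
    """List (lang_a, pivot, lang_b) triples from bilingual dictionaries marked for crossing."""
    root = ET.fromstring(lrd_text.strip())
    bil = []
    for res in root.iter("resource"):
        props = {p.get("name"): p.get("value") for p in res.findall("property")}
        if props.get("type") == "bil" and props.get("for-crossing") == "yes":
            bil.append((props["sl"], props["tl"]))
    triples = []
    for (a_sl, a_tl), (b_sl, b_tl) in combinations(bil, 2):
        shared = {a_sl, a_tl} & {b_sl, b_tl}
        if len(shared) == 1:             # exactly one language in common: the pivot
            pivot = shared.pop()
            a = a_tl if a_sl == pivot else a_sl
            b = b_tl if b_sl == pivot else b_sl
            triples.append((a, pivot, b))
    return triples

# Reduced version of the example document, keeping only the two bilingual dictionaries.
SAMPLE = """
<ling-resources>
   <resource-set>
      <name>en-es via ca</name>
      <description>Reduced sample</description>
      <resource>
         <property name="type" value="bil"/>
         <property name="sl" value="en"/>
         <property name="tl" value="ca"/>
         <property name="for-crossing" value="yes"/>
      </resource>
      <resource>
         <property name="type" value="bil"/>
         <property name="sl" value="es"/>
         <property name="tl" value="ca"/>
         <property name="for-crossing" value="yes"/>
      </resource>
   </resource-set>
</ling-resources>
"""

print(crossable_pairs(SAMPLE))
```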

See also