Linguistic Resources Document

From Apertium
Latest revision as of 09:19, 6 October 2014

French version: Document de Ressources Linguistiques

A Linguistic Resources Document (LRD) is an XML document consisting of a set of linguistic resources (dictionaries, cross models, corpora, links to other LRDs, etc.).

This document can be used, for example, with apertium-crossdics to indicate which resources (dictionaries and cross models) can be crossed.

Structure of the document

Overview

<?xml version="1.0" encoding="UTF-8"?>

<ling-resources>
   <name>...</name>
   <description>...</description>
   
   <resource>
      <property name="..." value="..."/>
      <property name="..." value="..."/>
      <property name="..." value="..."/>
      ...
   </resource>

   <resource-set>
      <name>...</name>
      <description>...</description>

      <resource>
         <property name="..." value="..."/>
         ...
      </resource>
      <resource>
         <property name="..." value="..."/>
         ...
      </resource>
      ...
   </resource-set>  

   <resource>
      <property name="..." value="..."/>
      <property name="..." value="..."/>
      <property name="..." value="..."/>
      ...
   </resource>
   ...  
</ling-resources>
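
As a rough illustration of how this layout can be consumed, the sketch below parses a small LRD with Python's standard `xml.etree.ElementTree` module and counts its parts. The sample document, the function name `summarize` and the keys of the returned dictionary are made up for this example; they are not part of Apertium or apertium-crossdics.

```python
import xml.etree.ElementTree as ET

# A tiny LRD following the overview above; all names and values are invented.
LRD_XML = """<?xml version="1.0" encoding="UTF-8"?>
<ling-resources>
   <name>Demo resources</name>
   <description>Illustrative only</description>
   <resource>
      <property name="name" value="apertium-es"/>
      <property name="type" value="mon"/>
   </resource>
   <resource-set>
      <name>es-ca pair</name>
      <description>Illustrative set</description>
      <resource>
         <property name="name" value="apertium-es-ca"/>
         <property name="type" value="bil"/>
      </resource>
   </resource-set>
</ling-resources>
"""


def summarize(xml_text):
    """Count the building blocks of an LRD given as a string (hypothetical helper)."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    return {
        "name": root.findtext("name"),
        # Direct children only: resources that are not inside a resource-set.
        "top_level_resources": len(root.findall("resource")),
        "resource_sets": len(root.findall("resource-set")),
        # Every <resource>, whether top-level or nested in a set.
        "all_resources": len(root.findall(".//resource")),
    }


summary = summarize(LRD_XML)
```

Here `findall("resource")` only sees direct children of `<ling-resources>`, while `.//resource` also reaches the resources nested in a `<resource-set>`.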

Resource

A resource is defined by a set of properties, each given as a name/value pair.

<resource>
   <property name="name" value="apertium-es"/>
   <property name="type" value="mon"/>
   <property name="sl" value="es"/>
   <property name="for-crossing" value="yes"/>
   <property name="src" value="apertium-es-ca.es.dix"/>
   <property name="version" value="stable"/>
</resource>

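A resource can have the following properties:

  • name: the name of the resource.
  • type: the type of resource. Possible values are:
    • mon: morphological dictionary.
    • bil: bilingual dictionary.
    • crp: corpus.
    • lrd: link to another Linguistic Resources Document.
    • cross-model: cross model document.
  • sl: source language (for example, in morphological and bilingual dictionaries).
  • tl: target language (for example, in bilingual dictionaries).
  • src: source (URL or file path), for example:
    • file path: /home/user/apertium-es-ca.es.dix
    • URL: http://apertium.svn.sourceforge.net/viewvc/*checkout*/apertium/trunk/apertium-es-ca/apertium-es-ca.es.dix
  • version: version of the resource (for example, for dictionaries: stable, unstable, pre-alpha, etc.).
  • for-crossing: whether the resource can be crossed (the examples on this page use the value yes).
  • This list is not complete; further properties may exist.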
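
The property list above maps naturally onto a dictionary. The following is a minimal sketch, assuming Python's standard `xml.etree.ElementTree`; the helper `resource_properties` and the constant `KNOWN_TYPES` are hypothetical names introduced here, and the set of types is simply copied from the list above.

```python
import xml.etree.ElementTree as ET

# Type values taken from the list above; other tools may accept more.
KNOWN_TYPES = {"mon", "bil", "crp", "lrd", "cross-model"}


def resource_properties(resource):
    """Return the <property> children of a <resource> as a name -> value dict."""
    return {p.get("name"): p.get("value") for p in resource.findall("property")}


resource = ET.fromstring("""
<resource>
   <property name="name" value="apertium-es"/>
   <property name="type" value="mon"/>
   <property name="sl" value="es"/>
   <property name="src" value="apertium-es-ca.es.dix"/>
   <property name="version" value="stable"/>
</resource>
""")

props = resource_properties(resource)
has_known_type = props.get("type") in KNOWN_TYPES
```

If a property name is repeated inside one resource, this sketch keeps the last value; the page does not say how repeated names should be treated.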

Set of resources

It is possible to group a number of resources with the resource-set tag, as follows:

<resource-set>
   <name></name>
   <description></description>
   <resource>
      <property name="" value=""/>
      ...
   </resource>
   <resource>
      <property name="" value=""/>
      ...
   </resource>
   ...
</resource-set>

This grouping is useful, for example, for collecting the linguistic data that belongs to a particular language pair.
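
To show how such grouping might be read back, here is a small sketch that maps each set name to the names of the resources it contains. It assumes Python's standard `xml.etree.ElementTree`; the function `resources_by_set` and the sample data are invented for illustration.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<ling-resources>
   <resource-set>
      <name>es-ca</name>
      <description>Spanish-Catalan data (illustrative)</description>
      <resource>
         <property name="name" value="apertium-es"/>
         <property name="type" value="mon"/>
      </resource>
      <resource>
         <property name="name" value="apertium-es-ca"/>
         <property name="type" value="bil"/>
      </resource>
   </resource-set>
   <resource>
      <property name="name" value="corpus-es"/>
      <property name="type" value="crp"/>
   </resource>
</ling-resources>
"""


def resources_by_set(root):
    """Map each resource-set name to the list of resource names inside it."""
    groups = {}
    for resource_set in root.findall("resource-set"):
        set_name = resource_set.findtext("name", default="")
        names = []
        for resource in resource_set.findall("resource"):
            for prop in resource.findall("property"):
                if prop.get("name") == "name":
                    names.append(prop.get("value"))
        groups[set_name] = names
    return groups


groups = resources_by_set(ET.fromstring(SAMPLE))
```

Resources declared outside any `<resource-set>`, such as `corpus-es` in the sample, are deliberately left out of the result.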

Types of resources

These are some of the resource types that can be defined.

Morphological dictionary

<resource>
   <property name="name" value="apertium-es"/>
   <property name="type" value="mon"/>
   <property name="sl" value="es"/>
   <property name="src" value="apertium-es-ca.es.dix"/>
   <property name="version" value="stable"/>
</resource>

Bilingual dictionary

<resource>
   <property name="name" value="apertium-es-ca"/>
   <property name="type" value="bil"/>
   <property name="sl" value="es"/>
   <property name="tl" value="ca"/>
   <property name="src" value="apertium-es-ca.es-ca.dix"/>
   <property name="version" value="stable"/>
</resource>

Cross model

<resource>
   <property name="name" value="cm-es-ca-en"/>
   <property name="type" value="cross-model"/>
   <property name="sl" value="es"/>
   <property name="tl" value="en"/>
   <property name="src" value="cross-model-es-ca-en.xml"/>
</resource>

Corpus

<resource>
   <property name="name" value="corpus-es"/>
   <property name="type" value="crp"/>
   <property name="sl" value="es"/>
   <property name="src" value="es-corpus.crp"/>
</resource>

Linguistic Resources Document

<resource>
   <property name="name" value="other-ling-resources"/>
   <property name="type" value="lrd"/>
   <property name="src" value="other-ling-resources.xml"/>
</resource>
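
The per-type examples above suggest which properties each type carries. The sketch below turns that reading into a simple check. The requirement table is an assumption inferred from the examples on this page (for instance, that a bilingual dictionary needs both sl and tl); it is not a rule stated by Apertium, and `missing_properties` is a hypothetical helper.

```python
# Properties that every example above includes (assumed to be required).
COMMON = {"name", "type", "src"}

# Language properties per type, as read off the examples above (an assumption).
LANGUAGE_PROPS = {
    "mon": {"sl"},
    "bil": {"sl", "tl"},
    "cross-model": {"sl", "tl"},
    "crp": {"sl"},
    "lrd": set(),
}


def missing_properties(props):
    """Return the sorted names of expected properties absent from props."""
    required = COMMON | LANGUAGE_PROPS.get(props.get("type"), set())
    return sorted(required - set(props))


incomplete_bil = {
    "name": "apertium-es-ca",
    "type": "bil",
    "sl": "es",
    "src": "apertium-es-ca.es-ca.dix",
}
lrd_link = {
    "name": "other-ling-resources",
    "type": "lrd",
    "src": "other-ling-resources.xml",
}

missing_for_bil = missing_properties(incomplete_bil)
missing_for_lrd = missing_properties(lrd_link)
```

An unknown type falls back to the common properties only, so the check stays permissive for types not listed here.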

Example of LRD

<?xml version="1.0" encoding="UTF-8"?>

<!-- Linguistic resources -->
<ling-resources>
   <name>My linguistic resources</name>
   <description>My linguistic resources: morphological and bilingual dictionaries, cross models, corpora, etc.</description>
   
   <resource-set>
      <name>My linguistic resources to get English-Spanish language pair.</name>
      <description>A description of this resource set</description>

      <!-- cross model en-ca-es -->
      <resource>
         <property name="name" value="cross-model-en-ca-es"/>
         <property name="type" value="cross-model"/>
         <property name="sl" value="en"/>
         <property name="tl" value="es"/>      
         <property name="for-crossing" value="yes"/>
         <property name="src" value="cross-model-es-ca-en.xml"/>
         <property name="version" value="stable"/>
      </resource>
      
      <!-- cross model es-ca-en -->
      <resource>
         <property name="name" value="cross-model-es-ca-en"/>
         <property name="type" value="cross-model"/>
         <property name="sl" value="es"/>
         <property name="tl" value="en"/>      
         <property name="for-crossing" value="yes"/>
         <property name="src" value="cross-model-es-ca-en.xml"/>
         <property name="version" value="stable"/>
      </resource>
      
      <!-- 'es' morphological dictionary -->
      <resource>
         <property name="name" value="apertium-es"/>
         <property name="type" value="mon"/>
         <property name="sl" value="es"/>
         <property name="for-crossing" value="yes"/>
         <property name="src" value="apertium-es-ca.es.dix"/>
         <property name="version" value="stable"/>
      </resource>
      
      <!-- 'en' morphological dictionary -->
      <resource>
         <property name="name" value="apertium-en"/>
         <property name="type" value="mon"/>
         <property name="sl" value="en"/>
         <property name="for-crossing" value="yes"/>
         <property name="src" value="apertium-en-ca.en.metadix"/>
         <property name="version" value="stable"/>
      </resource>
      
      <!-- 'en-ca' bilingual dictionary -->   
      <resource>
         <property name="name" value="apertium-en-ca"/>
         <property name="type" value="bil"/>
         <property name="sl" value="en"/>
         <property name="tl" value="ca"/>
         <property name="for-crossing" value="yes"/>
         <property name="src" value="apertium-en-ca.en-ca.dix"/>
         <property name="version" value="stable"/>
      </resource>
      
      <!-- 'es-ca' bilingual dictionary -->
      <resource>
         <property name="name" value="apertium-es-ca"/>
         <property name="type" value="bil"/>
         <property name="sl" value="es"/>
         <property name="tl" value="ca"/>
         <property name="for-crossing" value="yes"/>
         <property name="src" value="apertium-es-ca.es-ca.dix"/>
         <property name="version" value="stable"/>
      </resource>
   </resource-set>
   
   <!-- Single corpus file -->
   <resource>
      <property name="name" value="corpus-es"/>
      <property name="type" value="crp"/>
      <property name="sl" value="es"/>
      <property name="src" value="corpus-es.crp"/>        
   </resource>
   
   <!-- Link to another LRD (a file like this one) -->
   <resource>
      <property name="name" value="other-resources-1"/>
      <property name="type" value="lrd"/>
      <property name="src" value="other-ling-resources-file.xml"/>        
   </resource>
   
</ling-resources>
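
One possible way to pick out the resources marked for crossing in a document like the one above is sketched here. It assumes Python's standard `xml.etree.ElementTree` and treats `for-crossing="yes"` as the marker, as in the example; how apertium-crossdics itself reads the file is not described on this page, and `crossing_candidates` is a hypothetical name. The sample is an abbreviated stand-in for the full example.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<ling-resources>
   <resource-set>
      <name>en-es via ca</name>
      <resource>
         <property name="name" value="cross-model-es-ca-en"/>
         <property name="type" value="cross-model"/>
         <property name="for-crossing" value="yes"/>
      </resource>
      <resource>
         <property name="name" value="apertium-en-ca"/>
         <property name="type" value="bil"/>
         <property name="for-crossing" value="yes"/>
      </resource>
      <resource>
         <property name="name" value="apertium-es-ca"/>
         <property name="type" value="bil"/>
         <property name="for-crossing" value="yes"/>
      </resource>
   </resource-set>
   <resource>
      <property name="name" value="corpus-es"/>
      <property name="type" value="crp"/>
   </resource>
</ling-resources>
"""


def crossing_candidates(root):
    """Group the names of resources with for-crossing="yes" by their type."""
    by_type = {}
    # iter() walks the whole tree, so resources inside sets are included.
    for resource in root.iter("resource"):
        props = {p.get("name"): p.get("value") for p in resource.findall("property")}
        if props.get("for-crossing") == "yes":
            by_type.setdefault(props.get("type"), []).append(props.get("name"))
    return by_type


candidates = crossing_candidates(ET.fromstring(SAMPLE))
```

Resources without a `for-crossing` property, such as the corpus, are skipped rather than treated as errors.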

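See also

  • Using a Linguistic Resources Document with apertium-crossdics (wiki page: Crossdics, section "Using a Linguistic Resources Document")
  • Download an example of a Linguistic Resources Document: http://apertium.svn.sourceforge.net/viewvc/*checkout*/apertium/trunk/apertium-crossdics/resources/ling-resources.xml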