AOT

From Apertium

Latest revision as of 13:02, 3 November 2018

AOT is a morphological analyser (or lemmatiser) for Russian, released under the LGPL. The main website is www.aot.ru, but some of its download links are broken. The archives are also mirrored on SourceForge, inside the rupostagger package downloaded below.

Download

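# Work in a scratch directory (-p avoids an error if it already exists).
mkdir -p /tmp/RML
cd /tmp/RML
# Fetch the SourceForge mirror, which bundles the aot.ru archives.
wget http://heanet.dl.sourceforge.net/project/rupostagger/rupostagger/0.1.02/rupostagger-0.1.02.tar.gz
tar -xzvf rupostagger-0.1.02.tar.gz
cp rupostagger-0.1.02/LemServer/aot.ru/* .
# Unpack the lemmatiser and the Russian morphology sources.
tar -xzvf lemmatizer.tar.gz
tar -xzvf rus-src-morph.tar.gz
# RML must point at the directory holding the unpacked sources.
export RML=/tmp/RML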

Compile

Edit the files "compile_ross.sh" and "compile_morph.sh" and replace "gmake" with "make" (on most Linux systems GNU make is installed as "make").
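The replacement can also be scripted. The following is a minimal sketch, assuming GNU sed (for "-i" and "\b") and assuming that "gmake" appears in the scripts only as a command name; the helper name replace_gmake is made up for this example.

```shell
# Hypothetical helper: rewrite every whole-word "gmake" to "make"
# in the files given as arguments, editing them in place (GNU sed).
replace_gmake() {
    sed -i 's/\bgmake\b/make/g' "$@"
}
```

From the directory containing the scripts it would be called as: replace_gmake compile_ross.sh compile_morph.sh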

Add the line

#include <string.h>

at the top of each of the following files:

  • "Source/MorphWizardLib/FormInfo.h"
  • "Source/AgramtabLib/GerGramTab.cpp"
  • "Source/AgramtabLib/RusGramTab.cpp"
  • "Source/AgramtabLib/EngGramTab.cpp"
  • "Source/AgramtabLib/agramtab_.cpp"
  • "Source/StructDictLib/Ross.h"
  • "Source/StructDictLib/Field.h"
  • "Source/GraphanLib/C_desc.cpp"
  • "Source/common/utilit.cpp"
  • "Source/common/PlmLine.cpp"

This can be done easily with "sed":

sed -i 1i"#include <string.h>" Source/MorphWizardLib/FormInfo.h
sed -i 1i"#include <string.h>" Source/AgramtabLib/GerGramTab.cpp
sed -i 1i"#include <string.h>" Source/AgramtabLib/RusGramTab.cpp
sed -i 1i"#include <string.h>" Source/AgramtabLib/EngGramTab.cpp
sed -i 1i"#include <string.h>" Source/AgramtabLib/agramtab_.cpp
sed -i 1i"#include <string.h>" Source/StructDictLib/Ross.h
sed -i 1i"#include <string.h>" Source/StructDictLib/Field.h
sed -i 1i"#include <string.h>" Source/GraphanLib/C_desc.cpp
sed -i 1i"#include <string.h>" Source/common/utilit.cpp
sed -i 1i"#include <string.h>" Source/common/PlmLine.cpp
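Running the commands above twice inserts the line twice. As a sketch of one way to avoid that (not part of the original instructions), the hypothetical helper below skips any file that already contains the include. It assumes GNU sed and non-empty files, since "1i" has nothing to insert before in an empty file.

```shell
# Hypothetical helper: prepend "#include <string.h>" to each file given
# as an argument, unless the file already contains that line.
add_string_h() {
    for f in "$@"; do
        if ! grep -q '^#include <string\.h>' "$f"; then
            sed -i '1i#include <string.h>' "$f"
        fi
    done
}
```

It would be called with the ten paths listed above as arguments, for example: add_string_h Source/MorphWizardLib/FormInfo.h Source/common/utilit.cpp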


Then:

./compile_morph.sh
./generate_morph_bin.sh Russian

Use

$ echo "язык" | iconv -f utf-8 -t koi8-r | ./Bin/TestLem Russian | head -3 | iconv -f koi8-r
Loading..
Input a word..
>+ {ЯЗЫК, С, "од",  ("мр,им,ед",) } Id=28549 Accented=ЯЗЫ'К
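
Judging by the pipeline above, TestLem reads and writes KOI8-R, so UTF-8 input has to be converted on the way in and out. The same pipeline can be wrapped in a small function, sketched below. The function name lemmatise and the TESTLEM override variable are made up for this example and are not part of AOT; the explicit "-t utf-8" is also an addition, since the original command relies on the locale default.

```shell
# Hypothetical wrapper: analyse one UTF-8 word with TestLem.
# TESTLEM may be set to another path for the binary; by default the
# function expects to be run from the directory containing Bin/.
lemmatise() {
    printf '%s\n' "$1" \
        | iconv -f utf-8 -t koi8-r \
        | "${TESTLEM:-./Bin/TestLem}" Russian \
        | head -3 \
        | iconv -f koi8-r -t utf-8
}
```

Like the original command, the function keeps only the first three lines of output, so the two prompt lines are included and any further analyses of an ambiguous word are cut off. Usage: lemmatise "язык"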

External links

  • Description of morphological tags: http://aot.ru/docs/rusmorph.html