AOT
AOT is a morphological analyser (or lemmatiser) for Russian, released under the LGPL license. The main website is www.aot.ru, but some of its download links are broken. Fortunately, it is mirrored on SourceForge.
Download
mkdir /tmp/RML
cd /tmp/RML
wget http://heanet.dl.sourceforge.net/project/rupostagger/rupostagger/0.1.02/rupostagger-0.1.02.tar.gz
tar -xzvf rupostagger-0.1.02.tar.gz
cp rupostagger-0.1.02/LemServer/aot.ru/* .
tar -xzvf lemmatizer.tar.gz
tar -xzvf rus-src-morph.tar.gz
export RML=/tmp/RML
Compile
Edit the files "compile_ross.sh" and "compile_morph.sh", replacing "gmake" with "make".
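The replacement above can also be scripted. The following is a minimal sketch, assuming GNU sed (for "-i" without a suffix and for the "\b" word boundary) and assuming it is run from the $RML directory; the helper name "use_make" is made up for this example and is not part of AOT.

```shell
# Hypothetical helper: replace every whole-word "gmake" with "make", in place.
# Assumes GNU sed (-i without a backup suffix, \b as a word boundary).
use_make() {
    sed -i 's/\bgmake\b/make/g' "$@"
}

# Apply it to the two build scripts named above, skipping any that are absent.
for script in ./compile_ross.sh ./compile_morph.sh; do
    if [ -f "$script" ]; then
        use_make "$script"
    fi
done
```

Matching on the whole word keeps the substitution from touching longer identifiers that merely contain "gmake".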
Then add the line
#include <string.h>
at the top of each of the following files:
- "Source/MorphWizardLib/FormInfo.h"
- "Source/AgramtabLib/GerGramTab.cpp"
- "Source/AgramtabLib/RusGramTab.cpp"
- "Source/AgramtabLib/EngGramTab.cpp"
- "Source/AgramtabLib/agramtab_.cpp"
- "Source/StructDictLib/Ross.h"
- "Source/StructDictLib/Field.h"
- "Source/GraphanLib/C_desc.cpp"
- "Source/common/utilit.cpp"
- "Source/common/PlmLine.cpp"
This can be done easily with "sed":
sed -i 1i"#include <string.h>" Source/MorphWizardLib/FormInfo.h
sed -i 1i"#include <string.h>" Source/AgramtabLib/GerGramTab.cpp
sed -i 1i"#include <string.h>" Source/AgramtabLib/RusGramTab.cpp
sed -i 1i"#include <string.h>" Source/AgramtabLib/EngGramTab.cpp
sed -i 1i"#include <string.h>" Source/AgramtabLib/agramtab_.cpp
sed -i 1i"#include <string.h>" Source/StructDictLib/Ross.h
sed -i 1i"#include <string.h>" Source/StructDictLib/Field.h
sed -i 1i"#include <string.h>" Source/GraphanLib/C_desc.cpp
sed -i 1i"#include <string.h>" Source/common/utilit.cpp
sed -i 1i"#include <string.h>" Source/common/PlmLine.cpp
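Running the "sed" commands a second time inserts the include line again. If the step may be repeated, a guarded variant can help. The following is only a sketch, assuming GNU sed and assuming it is run from $RML; the function name "add_string_h" is hypothetical.

```shell
# Hypothetical helper: prepend "#include <string.h>" to each file given,
# unless the first line of that file is already exactly that include.
# Assumes GNU sed (one-line "1i" form and -i without a backup suffix).
add_string_h() {
    for f in "$@"; do
        if [ "$(head -n 1 "$f")" != '#include <string.h>' ]; then
            sed -i '1i#include <string.h>' "$f"
        fi
    done
}

# The same ten files as listed above; files that do not exist are skipped.
for f in \
    Source/MorphWizardLib/FormInfo.h \
    Source/AgramtabLib/GerGramTab.cpp \
    Source/AgramtabLib/RusGramTab.cpp \
    Source/AgramtabLib/EngGramTab.cpp \
    Source/AgramtabLib/agramtab_.cpp \
    Source/StructDictLib/Ross.h \
    Source/StructDictLib/Field.h \
    Source/GraphanLib/C_desc.cpp \
    Source/common/utilit.cpp \
    Source/common/PlmLine.cpp
do
    if [ -f "$f" ]; then
        add_string_h "$f"
    fi
done
```

The check looks only at the first line, so it mirrors what the original commands insert and does not try to detect an include placed elsewhere in the file.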
Then:
./compile_morph.sh
./generate_morph_bin.sh Russian
Use
$ echo "язык" | iconv -f utf-8 -t koi8-r | ./Bin/TestLem Russian | head -3 | iconv -f koi8-r
Loading..
Input a word..
>+ {ЯЗЫК, С, "од", ("мр,им,ед",) } Id=28549 Accented=ЯЗЫ'К
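Since TestLem is fed KOI8-R here, the conversion in both directions can be wrapped in a small shell function. This is a sketch under assumptions: the function name "lemmatise" and the TESTLEM variable are invented for this example, the output is converted explicitly to UTF-8 (the command above relies on the locale default instead), and the default binary path assumes the current directory is $RML.

```shell
# Hypothetical wrapper around the pipeline shown above.
# TESTLEM is an assumed variable naming the TestLem binary; it defaults to
# the path used on this page, relative to $RML.
lemmatise() {
    # One word per line in UTF-8, converted to KOI8-R for TestLem,
    # then converted back to UTF-8 for display.
    printf '%s\n' "$@" |
        iconv -f utf-8 -t koi8-r |
        "${TESTLEM:-./Bin/TestLem}" Russian |
        iconv -f koi8-r -t utf-8
}
```

With the binaries built as described, "lemmatise язык" should print the same kind of output as the command above, including the "Loading.." and "Input a word.." lines, since nothing is filtered out.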