Difference between revisions of "User:Rcrowther"

[[Installation (français)|En français]]
==Notes==
{{Main page header}}
If the language pair is used in reverse (lang Y -> lang X), the monodix for lang Y works as an analyser (Left->Right/LR), the bidix is read Right->Left (RL), and the monodix for lang X works as a generator (Right->Left/RL).
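
The direction rule above can be pictured with a toy lookup table. This is only a sketch: real Apertium dictionaries are compiled from XML, and the entries, the function name <code>translate_lemma</code>, and the 'LR'/'RL' argument are assumptions made for illustration.

```python
# Toy model only: real Apertium dictionaries are compiled XML transducers,
# not Python dicts. All names and entries here are hypothetical.

# Bidix entries as (left side = lang X, right side = lang Y) pairs.
BIDIX = [("house", "casa"), ("dog", "perro")]


def translate_lemma(lemma: str, direction: str) -> str:
    """Look up a lemma in the toy bidix, reading it LR (X -> Y) or RL (Y -> X)."""
    if direction == "LR":
        table = {left: right for left, right in BIDIX}
    elif direction == "RL":
        table = {right: left for left, right in BIDIX}
    else:
        raise ValueError(f"unknown direction: {direction!r}")
    return table[lemma]


# Forward pair (X -> Y) reads the entries left to right.
forward = translate_lemma("house", "LR")
# Reverse pair (Y -> X) reads the same entries right to left.
reverse = translate_lemma("casa", "RL")
```

The same entry list serves both directions, which is why one bidix is enough for a pair used either way round.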


== To try Apertium ==
If you are creating a new pair, the necessary modules are the two monodixes and the bidix. Other modules (e.g. Lexical Selection, Chunker stages, Post Generator) refine the translation results.
You can go online to the [https://apertium.org front page] :)


Several applications work from the desktop without a full installation. For these, and for more graphical user interfaces, services, plugins, etc., go to [[Tools]].
In the Wiki, you may find references to the Lexical Selection module being placed *before* the Lexical Transfer ('translation') module. This was the original position. The position of the module is now *after* the Lexical Transfer module. This decision is final (if software is ever 'final'...).


If you would like install instructions for 'Apertium viewer', 'apy' (the Apertium server), etc., go to [[Tools]]. The install instructions are given with the tool descriptions.
Many parts of the Wiki refer to Constraint Grammars (vislcg3 CG-3, sometimes HFST) for text disambiguation. These codebases can be used as modules in the Apertium workflow, but they are not part of the Apertium project and are maintained elsewhere. The grammar information they use is also sometimes maintained elsewhere. The modules would usually be placed after Morphological Analysis but before Lexical Transfer. Apertium pairs can be developed by inserting Constraint Grammars, but this would be unusual, as most of the same effects can be achieved using the Lexical Selection and/or Chunker modules. The Apertium modules are not as powerful as a Constraint Grammar (or need a lot of work to be that powerful), but offer much faster processing and are easy to read and maintain.
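
The module positions described in these notes can be written down as an ordered list. The sketch below is not an Apertium configuration format. The stage names are plain labels rather than program names, and the helper <code>with_constraint_grammar</code> is hypothetical.

```python
# Stage names are illustrative labels, not Apertium program names.
CORE_STAGES = [
    "morphological analysis",
    "lexical transfer",
    "lexical selection",  # now placed after lexical transfer
]


def with_constraint_grammar(stages):
    """Return a copy of stages with a constraint grammar step inserted
    directly after morphological analysis, i.e. before lexical transfer."""
    position = stages.index("morphological analysis") + 1
    return stages[:position] + ["constraint grammar"] + stages[position:]


pipeline = with_constraint_grammar(CORE_STAGES)
for number, stage in enumerate(pipeline, start=1):
    print(number, stage)
```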


Also mentioned in the Wiki is a 'POS Tagger' step. Like a constraint grammar, this module was/is placed after Morphological Analysis but before Lexical Transfer, and is used for word disambiguation. However, a constraint grammar works by constructing rules which decide which word should be chosen. The POS Tagger works/worked by adding special tags to the incoming text, after which the module builds its data by being 'trained', i.e. by statistical analysis. Development in Apertium has revealed that the POS Tagger module, though powerful, offers little improvement in translation quality. The module has not been used in new pairs for some time.
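
The difference between choosing by rule and choosing by trained statistics can be shown with a toy example. This is a deliberately simple frequency count, not the algorithm used by the Apertium tagger or by a constraint grammar. The training data, the tag names, and all function names are invented for illustration.

```python
from collections import Counter

# Hypothetical hand-tagged training data: (word, tag) pairs.
TRAINING = [("book", "noun"), ("book", "noun"), ("book", "verb")]


def train(tagged_words):
    """Count how often each word was seen with each tag."""
    counts = {}
    for word, tag in tagged_words:
        counts.setdefault(word, Counter())[tag] += 1
    return counts


def pick_by_statistics(word, candidates, counts):
    """Choose the candidate tag seen most often with this word in training."""
    seen = counts.get(word, Counter())
    return max(candidates, key=lambda tag: seen[tag])


def pick_by_rule(previous_word, candidates):
    """A written rule: after 'to', prefer a verb reading if there is one."""
    if previous_word == "to" and "verb" in candidates:
        return "verb"
    return candidates[0]
```

The statistical chooser needs tagged training text before it can decide anything, while the rule can be read and edited directly.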


The Post Generator module is sometimes referred to as an 'orthographical' analyser/text-modification module. The module was originally provided to convert Spanish-like 'de el' into 'del'. The word 'orthographical' suggests the Post Generator module can handle word compressions/apostrophes such as the common English form "John's house". However, when used in this way, the Post Generator module has several unexpected behaviours. These are not bugs, but the module does not perform in a flexible way. The module continues to do useful work converting 'de el' into 'del' or, in English, placing the correct form of the 'a'/'an' determiner ('a house', but 'an apple'). However, the module is not suitable for general orthography. Apostrophes, for example, are often handled in a bidix.
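
The two jobs mentioned above, contracting 'de el' and choosing 'a'/'an', amount to small rewrites of the generated text. The sketch below imitates them with regular expressions. It is an assumption-laden toy, not how the Post Generator is implemented or configured, and both function names are hypothetical.

```python
import re


def contract_de_el(text: str) -> str:
    """Replace the word sequence 'de el' with 'del'."""
    return re.sub(r"\bde el\b", "del", text)


def choose_a_an(text: str) -> str:
    """Pick 'a' or 'an' from the first letter of the following word.

    Spelling-based only, so words such as 'hour' or 'university'
    come out wrong.
    """
    def fix(match):
        next_word = match.group(2)
        article = "an" if next_word[0].lower() in "aeiou" else "a"
        return f"{article} {next_word}"

    return re.sub(r"\b(a|an) (\w+)", fix, text)


print(contract_de_el("la casa de el hombre"))
print(choose_a_an("a apple and an house"))
```

Even this toy shows why such a module suits a few fixed patterns better than general orthography: each case needs its own narrowly written rule.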


== For those who want to install Apertium locally, and developers ==
Several modules can do the work of other modules. For example, the first chunker module, Chunker (sometimes called, confusingly, 'IntraChunk'), is a very powerful module that can perform the work of the Lexical Selector. Indeed, in several language pairs Chunker rules do lexical selection. However, Chunker code is clumsy and difficult to read. The current Lexical Selection module is clean, fast, and offers what computer programmers call 'separation of concerns'. That is, if the Lexical Selector can do the work, the code is better placed there, to help developers and future readers. The same is true of the analysing Monodix (which could do disambiguation work) and InterChunk (which could do Post Generator work).
How to install Apertium core<ref>Apertium is a big system. There are many plugins, scripts, and extension projects. The core, the code which translates, is a multi-step set of tools joined by a stream format and, nowadays, invoked by scripts called 'modes'. You may also see the names 'lt-toolbox'/'lt-tools', 'apertium-lex-tools', and the simple title 'apertium'. These refer to groupings of the tools.

Packaged or compiled, these tools can be installed as one unit. From here on, we call them 'Apertium core'.
</ref> and language data on your system (developers may also want to consider their operating environment<ref>
Apertium is written to be platform-independent. However, it can be difficult to maintain platform-independence over a project this wide. If you intend to do something deep with Apertium, you will gain more help from the tools if you use the [http://ubuntu.com Ubuntu], or a similar Debian-based, operating system.

In no way does this mean that the Apertium project favours this platform.
</ref>).


===Installing: a summary===
Most people will need to:

====Install Apertium Core by packaging/virtual environment====
* Linux systems: [[Install Apertium core using packaging]]
* Windows and Apple systems: [[Apertium VirtualBox]]

==== For translators: Install language data/dictionaries/pairs from repositories ====
[[Install language data using packaging]], including hints about the Apertium package repository.

==== For language developers: Install language data/dictionaries/pairs by compiling ====
* Start a new language pair: [[How to bootstrap a new pair]]
* Work on an existing language pair: [[Install language data by compiling]]



===Alternatives===

====Installing Apertium core by compiling====

Apertium maintains a package repository that is up-to-date and reliable. If you do not want to work on the core, or develop languages, please use either packaging or a virtual environment. A compile will waste your time.

However, if you are planning to work on Apertium core, or have an operating system not covered above, go right ahead, [[Install Apertium core by compiling]]<ref name="about installing">Most people know the word 'install'. It means 'put code in my operating system'. When developing, it is not usual to fully 'install'. You get the code working enough to get results.

This is relevant to Apertium, which needs a rapid cycle for re-compiles. If you follow instructions to compile code, you will be discouraged from 'installing' builds. When we use the word 'install', we mean 'get code working on my computer'.</ref>

== Notes ==
<references/>

== Installation Videos ==

Most of these videos have been produced by Google Code-In students.

* Using Apertium Virtualbox on Windows: https://youtu.be/XCUWMCJkRDo
* Installing Apertium on Ubuntu (Romanian, English): https://www.youtube.com/watch?v=vy7rWy2u_m0
* Ubuntu'ya Apertium Kurulumu / Apertium installation on Ubuntu (Turkish, English subtitles): https://www.youtube.com/watch?v=I__-BiQe7zg
* Apertium on Slitaz (English): https://youtu.be/fCluA03oIXY
* How to Install Apertium On Macintosh: https://www.youtube.com/watch?v=oSuovCCsa68

[[Category:Installation]]
[[Category:Documentation in English]]


= Minimal installation from SVN =
This page is deprecated, and the information split across other pages.

It used to contain instructions on how to compile Apertium core. For this, please see [[Install Apertium core by compiling]].

For how to create language builds from new and existing repository information, please see [[Install language data by compiling]].

For details about the HFST and CG modules, please see [[Installation of grammar libraries]].

Or start from the information root at [[Installation]].

Latest revision as of 12:55, 24 April 2017


