Apertium scalable service

Introduction

This is the wiki page of ApertiumScalableServer, a scalable architecture to provide translation web services based on Apertium. This project has been moved to ScaleMT, a new architecture that supports different translation engines, document translation (odt and rtf at the moment), a new XML-RPC API, etc. Please check ScaleMT if you want more feature-rich and better-tested software.

User manual

System architecture

There are two main applications that make the web service work:

  • ApertiumServerRouter: Runs on a JavaEE web container (like Apache Tomcat) and processes the HTTP translation requests, distributing them among the translation servers (which have Apertium installed). It also manages the Apertium daemons running on the translation servers and, under certain circumstances, can start and stop translation servers.
  • ApertiumServerSlave: A simple Java application that runs on the translation servers, which must have Apertium installed. It receives translation requests from ApertiumServerRouter and sends them to the running Apertium instances. Note that the system is designed to run many ApertiumServerSlave instances (one per server) and only one ApertiumServerRouter instance.

Getting it

At the moment, the only way to get the applications is to download their source code and compile it. You'll need to download the source code of three projects from the Apertium svn repository. Before executing the following command, be sure you have Subversion installed.

svn co http://apertium.svn.sourceforge.net/svnroot/apertium/trunk/apertium-scalable-service

To compile the source code you'll need:

  • A Java Development Kit compatible with Java version 6. It can be Sun's implementation or any other implementation that follows the specification (see [1]).
  • Maven. If you don't have Maven installed, simply download it, unzip it, and be sure that the bin directory is in your PATH.

Once you are sure you have Java JDK and Maven, you can build the applications.

  • Build ApertiumServerRMIInterfaces. This project contains the common classes of ApertiumServerSlave and ApertiumServerRouter:
cd ApertiumServerRMIInterfaces
mvn install
  • Build ApertiumServerSlave:
cd ApertiumServerSlave
mvn package

The compiled project can be found in target/ApertiumServerSlave-1.0-assembled.zip

  • Build ApertiumServerRouter
cd ApertiumServerRouter
mvn package

The compiled project can be found in target/ApertiumServerRouter.war

  • If you need the javadoc of any of the projects, from its root directory execute:
mvn javadoc:javadoc

And the javadoc website will be generated in target/site/apidocs

Installing

ApertiumServerSlave

Unzip ApertiumServerSlave-1.0-assembled.zip to the directory where you want to install it. Be sure that the machine has an Internet connection, because the installation script will download Apertium from its SVN repository.

Then run the script installApertiumAndPairs.sh with:

./installApertiumAndPairs.sh

or

bash installApertiumAndPairs.sh

By default it will download Apertium and all the stable pairs, and install them under /home/youruser/local. You can change these options with the following parameters:

  • -p Installation_prefix : Changes the installation prefix. If you run the script with the options -p /foo/bar it will install executables under /foo/bar/bin, libraries under /foo/bar/lib, etc.
  • -l pair1,pair2,pair3... : Installs only the specified language pairs. The list of pairs must be a subset of the list of stable pairs found on the Apertium wiki main page. Note that the language order must be the same as on the main page, although translators in both directions will be installed, e.g. -l en-es will install translators from Spanish to English and from English to Spanish, but -l es-en won't install any translator. Some pairs only install a translator in one direction; see the arrows on the main page.

When installation is complete, you can safely remove the apertium directory. ApertiumServerSlave can't work with an existing Apertium installation, because it modifies Apertium's modes files to make it run as a daemon.

ApertiumServerRouter

As this application is packaged as a ready-to-deploy war file, no installation is needed. To run it, simply follow the instructions of your Java web container. But before running it, you'll probably need to configure it.

Configuring

ApertiumServerSlave

Application options can be changed by editing INSTALLATION_DIRECTORY/conf/configuration.properties. These are the options that can be changed and their meaning:

  • requestrouter_host: Name of the host where ApertiumServerRouter is running. When this application starts, it contacts ApertiumServerRouter to tell that the server is ready to perform translations. This is the only property you'll need to change to make the system work.
  • requestrouter_port: Port of requestrouter_host on which rmiregistry is listening. Default value is 1098.
  • requestrouter_objectname: Name of the RMI object exported by ApertiumServerRouter. If you haven't modified it in ApertiumServerRouter's configuration, the default value is OK.
  • memoryrate_64bit: Programs generally need more memory on 64-bit operating systems than on 32-bit ones. If the application is running on a 64-bit operating system, its free memory is multiplied by the value of this property. The default value is 0.6087. Changing it is not recommended. See the calibration section to learn how to recalculate this value.
  • daemon_frozen_time: If an Apertium instance that has received input doesn't emit any output during this time (in milliseconds), we assume it is frozen. The default value, 20 seconds, should be OK. Change it only if the system reports false frozen daemons.
  • daemon_check_status_period: Daemon status checking period, in milliseconds. A very low period can cause system overload, so there is no need to change this value.
  • apertium_timeout: Maximum time, in milliseconds, Apertium can take to perform a translation. If this time is exceeded, an error is returned to ApertiumServerRouter. Its default value is very high, so timeouts are only reported when there are unexpected errors.
  • apertium_max_deformat: Maximum number of simultaneously running Apertium deformatters. To translate a text, it is first deformatted by launching an instance of the corresponding Apertium deformatter (text deformatter or HTML deformatter), then it is sent to the right daemon, and finally the daemon's result is reformatted by launching an instance of the corresponding Apertium reformatter. The system's bottleneck is in the daemons, so the default value for this property is 1.
  • apertium_max_reformat: Maximum number of simultaneously running Apertium reformatters. The default value is 1.
  • apertium_null_mode_suffix: Suffix that all the modes that allow Apertium running as a daemon share. Don't change it.
  • apertium_supported_pairs: Comma-separated list of language pairs the system can translate with (because they can work as daemons). In this case the first code is the source language and the second code, the target language. So, we'll have both en-es and es-en. Don't modify this property. Its value is set by the installation script described above.
  • apertium_path: Prefix of the directories where Apertium is installed. Don't modify this property. Its value is set by the installation script described above. If you change this value to point to an existing Apertium installation, it won't work, because the Apertium installation needs to be made with the provided installation script, which creates new modes files.

ApertiumServerRouter

Editing ApertiumServerRouter properties is a bit more difficult. You'll need to unzip ApertiumServerRouter.war, change the desired configuration properties and zip its content again. Main configuration options are located in file WEB-INF/classes/configuration.properties. These are the options present in this file:

  • requestrouter_rmi_host: Name of the host where ApertiumServerRouter will run. This is the only property you'll need to change to make the system work.
  • rmi_registry_port: Port on which rmiregistry is listening. Default value is 1098, so you'll need to manually start rmiregistry on port 1098. Remember that rmiregistry must run on the machine where ApertiumServerRouter runs, as well as on machines running ApertiumServerSlave. The difference is that ApertiumServerSlave starts RMIRegistry automatically, but ApertiumServerRouter doesn't, because of the restrictions of running in a Java web container.
  • requestrouter_rmi_name: Name of the RMI remote object exported by ApertiumServerRouter. The default value is OK if you don't modify the requestrouter_objectname property of ApertiumServerSlave.
  • requestrouter_rmi_port: Port on which RMI remote object exported by ApertiumServerRouter will listen. There is no need to modify it, unless you get an exception saying "port not available".
  • admissioncontrol_interval: Period, in milliseconds, of Admission control updating. Admission control is the subsystem that decides whether a request should be accepted or not, depending on system's load. Don't change this value unless you really know what you are doing.
  • admissioncontrol_treshold: If system "calculated load" is over this threshold, requests won't be accepted. The default value has been tested and should work OK, but if requests are rejected while the system is not overloaded, try to increase this value.
  • admissioncontrol_k: We get the "calculated load" by combining the real load and the "calculated load" of the previous instant: calculated_load = real_load*k+previous_load*(1-k). The default value has been tested and changing it is not recommended.
  • placement_controller_execution_period: Period, in milliseconds, of Placement controller execution. The Placement controller decides which language pairs run on each translation server. This is a critical value. Changing it could make the system crash, so it is better to leave the default value.
  • server_status_updater_execution_period: Period, in milliseconds, of server status checking. It is recommended to leave the default value.
  • scheduler_maxcharacters_in_daemon_queue: If the number of characters of a language pair being translated by a server is lower than this value, a translation request for that language pair is sent to the server. It is recommended to leave the default value.
  • scheduler_maxelements_in_daemon_queue: If the number of requests of a language pair being translated by a server is lower than this value, a translation request for that language pair is sent to the server. It is recommended to leave the default value.
  • scheduler_not_registered_priority_increment: The higher, the less priority unregistered users have.
  • scheduler_timeout: Maximum time, in milliseconds, a server can take to perform a translation. If this time is exceeded, an error is returned. Its default value is very high, so timeouts are only reported when there are unexpected errors.
  • load_prediction_alpha: Very similar to admissioncontrol_k. The predicted load of each language pair is calculated by combining the number (and size) of requests received during a period of time with the load predicted before that period, so predicted_load = measured_load*alpha+previous_predicted_load*(1-alpha). The default value has been tested and changing it is not recommended.
  • request_k: Constant CPU cost of processing a request. The CPU cost of a translation request is calculated by adding this value to the number of characters of the request. Don't change it.
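
The two smoothing formulas above (admissioncontrol_k and load_prediction_alpha) have the same shape, so they can be illustrated together. The following Python lines are only a sketch of that arithmetic, not the router's actual Java code; the function names, the weight 0.5, the threshold 0.9 and the comparison used in admit are assumptions chosen for the example.

```python
def smooth(measured: float, previous: float, weight: float) -> float:
    """Blend a new measurement with the previous smoothed value.

    Same shape as calculated_load = real_load*k + previous_load*(1-k).
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    return measured * weight + previous * (1.0 - weight)


def admit(calculated_load: float, threshold: float) -> bool:
    """Accept a request only while the calculated load is not over the threshold."""
    return calculated_load <= threshold


# Hypothetical values, not the shipped defaults.
K = 0.5
THRESHOLD = 0.9

load = 0.0
history = []
for real_load in [1.0, 1.0, 1.0, 1.0]:
    load = smooth(real_load, load, K)
    history.append((load, admit(load, THRESHOLD)))

for value, accepted in history:
    print(f"calculated load {value:.4f} -> {'accept' if accepted else 'reject'}")
```

With a sustained real load of 1.0 the calculated load climbs gradually, so a short spike does not trigger rejections immediately; a smaller weight makes the reaction slower.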

To keep track of registered users and give them higher priority, their data are stored in a MySQL database. Database connection properties are configured in the JPA configuration file: WEB-INF/classes/META-INF/persistence.xml. By default, it connects to a database called ApertiumWSUsers on localhost, with username apertium and password apertium.

Port summary

Be sure these ports are reachable, since they are needed by the system to work.

Machine running ApertiumServerRouter:

  • RMIRegistry port. By default, it is 1098. If you want to use another port for running RMIRegistry, change the property rmi_registry_port.
  • RMI remote object port. The port where the object that communicates with ApertiumServerSlave instances listens. By default it is 1432, but can be changed by editing the property requestrouter_rmi_port.
  • HTTP port. The port which the web server listens to.

Machine running ApertiumServerSlave:

  • RMIRegistry port: 1099.
  • RMI remote object port. The port where the object that communicates with ApertiumServerRouter listens. By default it is 1331, but it can be changed with the option -RMIPort <port-number> when running ApertiumServerSlave.

Logging

Both applications use Apache log4j to manage application messages. ApertiumServerRouter's log messages are stored in /tmp/ApertiumServerRouter.log and ApertiumServerSlave's in /tmp/ApertiumServerSlave.log. The names of these files, along with many other logging properties, can be changed by editing the configuration file log4j.properties.

Running

Firstly, run
rmiregistry 1098
on the machine where you are going to run ApertiumServerRouter to start rmiregistry. Then run ApertiumServerRouter by deploying your re-zipped ApertiumServerRouter.war in your Java web server. For example, in Apache Tomcat, put that file in the directory called webapps.


Then, run ApertiumServerSlave on each of the servers you want to use to perform translations. Use the script run-apertium-server.sh and add a parameter with the name of the host where ApertiumServerSlave runs:

bash run-apertium-server.sh hostname

It will calculate the server's capacity by performing a series of translations and store it in conf/capacity.properties. If you have already run ApertiumServerSlave before and you don't want to wait for the capacity calculation, add the argument -capacityFromConfigFile. With this argument, capacity is read from conf/capacity.properties and the startup time decreases.

bash run-apertium-server.sh hostname -capacityFromConfigFile

After reading or calculating capacity, it contacts ApertiumServerRouter and starts to receive translation requests. You can tune RMI ports and remote object name by editing run-apertium-server.sh. See javadoc of class com.gsoc.apertium.translationengines.main.Main for more information.

Of course, servers can be stopped (with Ctrl+C) or started at any time.

Dynamic server management: local networks

If you don't want to manually start and stop translation servers, ApertiumServerRouter can do it for you. It will decide to start or stop servers depending on the translation capacity needed by the incoming requests. You'll only have to change some configuration properties, and ApertiumServerRouter will connect via SSH to the computers of your network where ApertiumServerSlave is installed, and run or stop it when needed. This is called On Demand Server Management mode.

To make ApertiumServerRouter work in On Demand Server Management mode, you'll have to follow a couple of additional configuration steps. After unzipping ApertiumServerRouter.war and editing WEB-INF/classes/configuration.properties, and before zipping it again, edit the following files located at WEB-INF/classes/:

  • OnDemandServerInterface.properties: Contains general options about dynamic server management:
    • class: Class that contacts servers to start and stop ApertiumServerSlave instances. Use the default value: com.gsoc.apertium.translationengines.router.ondemandservers.LocalNetworkOnDemandServer.
    • maxServers: Maximum number of servers started by ApertiumServerRouter. Must be less than or equal to the number of elements in the list of servers in LocalNetworkOnDemandServer.properties.
    • maxInactivityTime: Maximum time, in milliseconds, a server can run without receiving any load. After this time, the server is stopped.
    • startUpTimeout: Maximum time, in milliseconds, the system waits for a newly started server to contact the request router. The default value should be fine.
  • LocalNetworkOnDemandServer.properties: Contains options about how to contact new servers when running in On Demand Server Management mode using class com.gsoc.apertium.translationengines.router.ondemandservers.LocalNetworkOnDemandServer.
    • hosts: Comma-separated list of servers with ApertiumServerSlave installed. Each element of the list follows this format: username:password@hostname:path. Username and password must belong to an existing user on the remote machine. Hostname is the host name of the remote machine, and path is the path where ApertiumServerSlave is installed. username, password and path are optional. If they are not specified, their values are read from the properties defaultUser, defaultPassword and defaultPath respectively.
    • defaultUser: Default user name.
    • defaultPassword: Default password.
    • defaultPath: Default ApertiumServerSlave installation path.
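
To make the hosts format more concrete, here is a minimal Python sketch of how an entry such as username:password@hostname:path could be split and completed with the default values. It is an illustration only, not the parser used by LocalNetworkOnDemandServer; the names HostEntry, parse_host and parse_hosts are hypothetical, and the sketch assumes that passwords contain neither ':' nor '@'.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostEntry:
    username: str
    password: str
    hostname: str
    path: str


def parse_host(entry: str, default_user: str, default_password: str,
               default_path: str) -> HostEntry:
    """Split one 'username:password@hostname:path' element.

    username, password and path are optional; missing parts fall back
    to the defaults, mirroring defaultUser, defaultPassword, defaultPath.
    """
    credentials, _, location = entry.strip().rpartition("@")
    username, _, password = credentials.partition(":")
    hostname, _, path = location.partition(":")
    if not hostname:
        raise ValueError(f"missing hostname in {entry!r}")
    return HostEntry(
        username or default_user,
        password or default_password,
        hostname,
        path or default_path,
    )


def parse_hosts(value: str, default_user: str, default_password: str,
                default_path: str) -> list:
    """Parse the comma-separated value of the hosts property."""
    return [
        parse_host(item, default_user, default_password, default_path)
        for item in value.split(",")
        if item.strip()
    ]


# Hypothetical property values.
hosts = parse_hosts(
    "alice:secret@node1:/opt/slave, node2",
    default_user="apertium",
    default_password="pw",
    default_path="/home/apertium/slave",
)
for host in hosts:
    print(host)
```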

Dynamic server management: Amazon EC2

If you plan to deploy this Apertium web service implementation with dynamic server management on Amazon EC2, it is recommended to change the configuration explained in the previous section. With this new configuration, new server instances will be started and stopped, instead of starting and stopping the application on existing servers.

After unzipping ApertiumServerRouter.war and editing WEB-INF/classes/configuration.properties, and before zipping it again, edit the following files located at WEB-INF/classes/:

  • OnDemandServerInterface.properties: Contains general options about dynamic server management:
    • class: Class that contacts servers to start and stop ApertiumServerSlave instances. Use: com.gsoc.apertium.translationengines.router.ondemandservers.AmazonOnDemandServer.
    • maxServers: Maximum number of EC2 server instances started by ApertiumServerRouter.
    • maxInactivityTime: Maximum time, in milliseconds, a server can run without receiving any load. After this time, the server is stopped.
    • startUpTimeout: Maximum time, in milliseconds, the system waits for a newly started server to contact the request router. The default value should be fine.
  • AmazonOnDemandServer.properties: Contains options about how to start new servers when running in On Demand Server Management mode using class com.gsoc.apertium.translationengines.router.ondemandservers.AmazonOnDemandServer.
    • amazon_id: Your Amazon Web Service Access Key ID. Compulsory property.
    • amazon_key: Your Amazon Web Service Secret Access Key. Compulsory property.
    • amazon_image_id: ID of an AMI that must run ApertiumServerSlave when started. Compulsory property.
    • amazon_security_groups: comma-separated list of security groups associated with the server instances that will be started. These groups must allow connections to the following ports:
      • Port on which RMI Registry runs: 1099.
      • Port on which the ApertiumServerSlave RMI remote object is exported: 1331.
    • amazon_key_name: Key pair associated with the server instances that will be started. Necessary if you want to manually connect via SSH to the instances and check that everything works as expected.
    • amazon_region_url: URL of the region where the new instances will be launched and the AMI will be looked for. If this property is not present, region EU-West is used.
    • amazon_avzone: availability zone where the new instances will be launched. It is a good idea to launch the request router and the apertium instances in the same availability zone. If you include the scripts explained below in your AMIs, you won't need to edit this property.

Building AMIs for Amazon EC2

Bootstrapping

To avoid creating new AMIs when a new version of Apertium Web Service is released, it is recommended to use a mechanism called bootstrapping. When the AMI starts, it downloads a package from Amazon S3, unzips it, and executes the script inside the package. The package also contains the latest version of ApertiumServerSlave or ApertiumServerRouter, so the script installs it and changes the necessary configuration properties.

ApertiumServerRouter AMI

We followed these steps to create an AMI that runs ApertiumServerRouter:

  • Start a clean installation of Ubuntu 9.04 Base.
  • Install JRE 6.
  • Install Apache Tomcat. Download it from here and unzip to /root/apache-tomcat.
  • Install MySQL. Create the database and user specified in JPA configuration file. Give the user the right permissions.
  • Install s3cmd:
    apt-get install s3cmd
    As root user, configure it with your Amazon WS ID and secret key:
    s3cmd --configure
  • Install unzip:
    apt-get install unzip
  • Prepare bootstrap:
    • Put the file bootstrap that can be found in ApertiumServerRouter source code root/misc/ec2 in /etc/init.d, and a symbolic link from /etc/rc2.d/S99bootstrap to /etc/init.d/bootstrap:
      ln -s /etc/init.d/bootstrap /etc/rc2.d/S99bootstrap
    • Put a file called bootstrap.tar.gz in a S3 bucket called org.apertium.server.router.bootstrap. This file must contain a folder called bootstrap containing a version of ApertiumServerRouter.war configured to run on Amazon EC2 and the script bootstrap.sh that can be found in ApertiumServerRouter source code root/misc/ec2.
  • Create AMI using EC2 commands.

ApertiumServerSlave AMI

We followed these steps to create an AMI that runs ApertiumServerSlave:

  • Start a clean installation of Ubuntu 9.04 Base.
  • Install JRE 6.
  • Install s3cmd:
    apt-get install s3cmd
    As root user, configure it with your Amazon WS ID and secret key:
    s3cmd --configure
  • Install libraries needed to run Apertium:
    sudo apt-get install subversion build-essential g++ pkg-config libxml2 libxml2-dev libxml2-utils xsltproc flex automake autoconf libtool libpcre3-dev 
  • Install ICU library:
    apt-get install libicu-dev
  • Install Apertium. To do so, compile ApertiumServerSlave and copy ApertiumServerSlave-1.0-assembled.zip to the running AMI. Unzip it and run installApertiumAndPairs.sh. Remember the values of the properties apertium_path and apertium_supported_pairs from conf/configuration.properties, after installing Apertium.
  • Run run-apertium-server.sh. Wait for the server capacity to be calculated and then kill it. Keep the file capacity.properties that has been created in the conf directory.
  • Prepare bootstrap:
    • Put the file bootstrap that can be found in ApertiumServerSlave source code root/misc/ec2 in /etc/init.d, and a symbolic link from /etc/rc2.d/S99bootstrap to /etc/init.d/bootstrap:
      ln -s /etc/init.d/bootstrap /etc/rc2.d/S99bootstrap
    • Put a file called bootstrap.tar.gz in an S3 bucket called org.apertium.server.slave.bootstrap. This file must contain a folder called bootstrap containing ApertiumServerSlave-1.0-assembled.zip, the script bootstrap.sh that can be found in ApertiumServerSlave source code root/misc/ec2 and the file capacity.properties created in the previous step. Note that this file should only be used on an EC2 instance of the same size as the instance where the file was created. Before packing bootstrap.tar.gz, edit bootstrap.sh and write the values of the properties mentioned above (apertium_path and apertium_supported_pairs) at the beginning of the script.
  • You can remove the directory ApertiumServerSlave-1.0 to decrease the size of the AMI.
  • Create AMI using EC2 commands.

Advanced configuration: calibration

There are some advanced configuration properties we didn't explain in previous sections. This Apertium Web Service implementation estimates the CPU and memory capacity of each server, and the amount of load (for each language pair) the system will have to process (based on previous requests). It then starts and stops daemons on the different servers to meet the load requirements. The CPU capacity is measured as the number of characters of a Spanish plain text that the server can translate into Catalan in one second.

Load Converter

The load predicted for each language pair is based on the number of requests received for that pair during a past period of time and the number of characters in each request. Because the CPU capacity needed to translate a fixed number of characters depends on the language pair, the number of characters in each request must be converted to the equivalent number of Spanish-Catalan characters, i.e., the number of characters that needs the same CPU capacity to be translated from Spanish to Catalan as the original text needs with its own language pair.

Something similar happens with the format: the CPU capacity needed to translate a fixed number of characters depends on it. The same number of characters usually needs less CPU capacity in HTML format, because HTML tags are not translated. So the number of characters of each HTML request is converted to the equivalent number of plain text characters, i.e., the number that needs the same CPU capacity to be translated. After these two conversions, the number of characters of a request can be compared with the server's capacity.

To convert load between language pairs and formats, the conversion rates stored in LoadConverter.properties are used. This file is located in the root of ApertiumServerRouter's classpath. There are different types of properties in this file:

  • source_language_code-target_language_code: Contains the rate to convert from an amount of characters of the pair named by this property key to the equivalent Spanish-Catalan amount of characters.
  • format_html: Rate to convert from an amount of characters with HTML format to the equivalent plain text amount of characters.

These values have already been calculated, but they will become less accurate over time, because the rules of the different language pairs will change and so will their speed. If you want to calculate them again, execute this command from an existing installation of ApertiumServerSlave:
java -jar ApertiumServerSlave-1.0.jar -pairsInformation -comparationPair es-ca -speedFile LoadConverter.properties -memoryFile MemoryRequirements.properties

A new version of LoadConverter.properties will appear in ApertiumServerSlave installation directory.
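
As an illustration of the two conversions described in this section, the following Python sketch converts the size of a request into equivalent Spanish-Catalan plain text characters. It is not the router's implementation: the function name is hypothetical, the rates are invented for the example, and it assumes each rate is a factor by which the number of characters is multiplied.

```python
# Hypothetical conversion rates in the style of LoadConverter.properties.
# Real values are produced by the -pairsInformation command shown above.
RATES = {
    "es-ca": 1.0,        # reference pair
    "en-es": 2.0,        # assumed: en-es costs twice as much per character
    "format_html": 0.5,  # assumed: HTML costs half as much as plain text
}


def equivalent_characters(characters: int, pair: str, fmt: str,
                          rates: dict) -> float:
    """Convert a request size to equivalent es-ca plain text characters."""
    if pair not in rates:
        raise KeyError(f"no conversion rate for pair {pair!r}")
    equivalent = characters * rates[pair]
    if fmt == "html":
        equivalent *= rates["format_html"]
    return equivalent


plain = equivalent_characters(1000, "en-es", "text", RATES)
html = equivalent_characters(1000, "en-es", "html", RATES)
print(plain, html)
```

The resulting figures are in the same unit as the server capacity (Spanish-Catalan plain text characters), which is what makes the comparison between load and capacity possible.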

Memory requirements

To place each daemon on the right server the system needs to know how much memory is needed by each language pair. This information is stored in MemoryRequirements.properties. This file is located in the root of ApertiumServerRouter's classpath. For each property, the key is a language pair and the value the amount of megabytes of memory it requires.

These values have already been calculated, but they will become less accurate over time, because the rules of the different language pairs will change and so will their memory requirements. If you want to calculate them again, execute this command from an existing installation of ApertiumServerSlave:
java -jar ApertiumServerSlave-1.0.jar -pairsInformation -comparationPair es-ca -speedFile LoadConverter.properties -memoryFile MemoryRequirements.properties

A new version of MemoryRequirements.properties will appear in ApertiumServerSlave installation directory.

64-bit operating systems

Programs generally need more memory on 64-bit operating systems than on 32-bit ones. If ApertiumServerSlave is running on a 64-bit operating system, its free memory is multiplied by the value of the configuration property memoryrate_64bit. The value of this property is calculated as the average, over all language pairs, of the memory needed on 32-bit operating systems divided by the memory needed on 64-bit operating systems.

Given one version of MemoryRequirements.properties created on a 32-bit operating system and another created on a 64-bit one, the value of memoryrate_64bit can be calculated by running the following command from the ApertiumServerSlave installation directory:

java -jar ApertiumServerSlave-1.0.jar -compareMemory -file32 absolute_path_of_32_bit_memory_requirements_file -file64 absolute_path_of_64_bit_memory_requirements_file
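
The calculation behind memoryrate_64bit can be sketched in a few lines of Python. This is an illustration of the average described above, not the code run by -compareMemory; the function name and the memory figures (in megabytes) are hypothetical, and the sketch assumes that only pairs present in both files are compared.

```python
from statistics import mean


def memory_rate_64bit(mem32: dict, mem64: dict) -> float:
    """Average, over the pairs in both files, of memory32 / memory64."""
    common = sorted(set(mem32) & set(mem64))
    if not common:
        raise ValueError("the two files share no language pair")
    return mean(mem32[pair] / mem64[pair] for pair in common)


# Hypothetical contents of the two MemoryRequirements.properties files.
MEM32 = {"es-ca": 60, "en-es": 90}
MEM64 = {"es-ca": 100, "en-es": 150}

rate = memory_rate_64bit(MEM32, MEM64)
print(f"memoryrate_64bit = {rate:.4f}")
```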

Constant CPU cost of a request

The CPU cost of a translation request is calculated by adding the value of the property request_k to the number of characters of the request (previously converted to the equivalent amount of Spanish-Catalan characters). request_k represents the computational cost of all the operations needed to process a translation request on ApertiumServerSlave and not performed by an Apertium daemon, like unmarshalling the RMI request, invoking deformatter and reformatter, managing the queue, etc. Taking into account this CPU cost is very important to manage daemons accurately, because this cost can be higher than the CPU cost of translating the text using an Apertium daemon.

To estimate the value of request_k we need to measure some parameters in the system while it is loaded enough to consume all the CPU capacity of the server. It is recommended to use only one server and one language pair, Spanish-Catalan, because the arithmetic will be much simpler. During a time period t we measure the number of characters processed by the system, nc, and the number of served requests, np. If the server's capacity is C, theoretically the system can process C*t characters. If we subtract nc from C*t, we get the number of characters equivalent to the computational cost of the constant part of all the requests. If we divide this value by np, we have the constant CPU cost of each request.

To summarize:

k = (C*t - nc) / np

C = server's capacity

t = test time

nc = number of characters processed during the test time

np = number of requests processed during test time

If you want to estimate this value on your own, the server's capacity is shown when it starts, and the test time and the number of characters and requests processed during that time are written to ApertiumServerRouter's log file. Look for a line starting with "LoadPredictor -Requests received during".
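
The formula above can be checked with a short Python sketch. The function names and all the figures are hypothetical; the sketch assumes t is expressed in seconds, since capacity is measured in characters per second, so a period reported in another unit would have to be converted first.

```python
def estimate_request_k(capacity: float, test_time: float,
                       characters: int, requests: int) -> float:
    """k = (C*t - nc) / np, with t in seconds."""
    if requests <= 0:
        raise ValueError("at least one request is needed")
    return (capacity * test_time - characters) / requests


def request_cost(characters: float, request_k: float) -> float:
    """CPU cost of one request: its equivalent characters plus request_k."""
    return characters + request_k


# Hypothetical measurements: a server able to translate 10000 characters
# per second processed 450000 characters in 1000 requests during 60 seconds.
k = estimate_request_k(capacity=10000, test_time=60,
                       characters=450000, requests=1000)
print(k)
print(request_cost(450, k))
```

In this made-up case the constant part (150) is a third of the cost of an average 450-character request, which shows why ignoring it would noticeably underestimate the load.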


API Specification

Introduction

This API is very similar to the Google AJAX Language API, to make switching to the Apertium JSON API as easy as possible. For more information about the Google AJAX Language API, see http://code.google.com/intl/en/apis/ajaxlanguage/documentation/reference.html#_intro_fonje .

There are two resources:

http://ApertiumServerInstallationHost/ApertiumServerRouter/resources/translate

and

http://ApertiumServerInstallationHost/ApertiumServerRouter/resources/listPairs

The first one translates pieces of plain text or html code, and the second one lists the available language pairs.

Both resources accept the GET and POST HTTP methods. The values of the arguments must be properly escaped (e.g., via the functional equivalent of Javascript's encodeURIComponent() method).

Common arguments and response format

These arguments are all optional and common to both resources:

  • key : User's personal API key. Requests from registered users have higher priority.
  • callback : Alters the response format, adding a call to a Javascript function. See description below.
  • context : If callback parameter is supplied too, adds additional arguments to the function call. See description below.

If neither the callback nor the context argument is supplied, this is the JSON object returned by both resources:

{ "responseData" : JSON Object with the requested data , "responseStatus" : Response numeric code , "responseDetails" : Error description }

If callback argument is supplied, a call to a function named by the callback value is returned. For instance, if callback argument's value is foo, this is the JavaScript code returned:

foo({ "responseData" : JSON Object with the requested data , "responseStatus" : Response numeric code , "responseDetails" : Error description })

If both the callback and context arguments are supplied, the returned function call has more arguments. If callback's value is foo and context's value is bar:

foo('bar',JSON Object with the requested data , Response numeric code , Error description )
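
The three response shapes above can be unwrapped on the client side along these lines. This is only a sketch: parse_response is a hypothetical helper, and it assumes the context value is echoed verbatim between single quotes, as in the examples on this page. How the server escapes quotes inside a context value is not documented here.

```python
import json

def _fields(obj):
    """Extract the three standard fields from a response object."""
    return obj["responseData"], obj["responseStatus"], obj["responseDetails"]

def parse_response(body, callback=None, context=None):
    """Return (data, status, details) from any of the three response shapes."""
    body = body.strip()
    if callback is None:
        # Plain JSON object.
        return _fields(json.loads(body))
    prefix = callback + "("
    if not (body.startswith(prefix) and body.endswith(")")):
        raise ValueError("response is not a call to " + callback)
    inner = body[len(prefix):-1]
    if context is None:
        # callback({...}): the single argument is the usual JSON object.
        return _fields(json.loads(inner))
    # callback('context', data, status, details)
    # Assumption: the context is echoed unchanged between single quotes.
    quoted = "'" + context + "',"
    if not inner.startswith(quoted):
        raise ValueError("context argument not found")
    data, status, details = json.loads("[" + inner[len(quoted):] + "]")
    return data, status, details
```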

listPairs resource

This resource only accepts the common arguments.

The response data returned is an array of language pairs, following this format:

[{"sourceLanguage": source language code ,"targetLanguage": target language code }, ... ]

responseStatus is always 200, which means the request was processed successfully, and responseDetails is null.

So if we call this resource with no arguments:

curl 'http://ApertiumServerInstallationHost/ApertiumServerRouter/resources/listPairs'

we get, for example:

{"responseData":[{"sourceLanguage":"ca","targetLanguage":"oc"},{"sourceLanguage":"en","targetLanguage":"es"}],"responseDetails":null,"responseStatus":200}
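
A client might turn such a response into a list of (source, target) tuples, for example to check whether a pair is available before requesting a translation. The sketch below parses the sample response shown above; supported_pairs is a hypothetical helper name.

```python
import json

def supported_pairs(body):
    """Return the language pairs in a listPairs response as (source, target) tuples."""
    response = json.loads(body)
    return [
        (pair["sourceLanguage"], pair["targetLanguage"])
        for pair in response["responseData"]
    ]

# Sample response copied from the example above.
sample = (
    '{"responseData":[{"sourceLanguage":"ca","targetLanguage":"oc"},'
    '{"sourceLanguage":"en","targetLanguage":"es"}],'
    '"responseDetails":null,"responseStatus":200}'
)

pairs = supported_pairs(sample)
print(("en", "es") in pairs)
```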

translate resource

This resource accepts the common arguments mentioned above, plus the following specific arguments:

  • q : Source text or HTML code to be translated. Compulsory argument.
  • langpair : Source language code and target language code, separated by the '|' character, which is escaped as '%7C'. Compulsory argument.
  • format : Source format: text for plain text or html for HTML code. Optional argument; if it is missing, the source is assumed to be plain text.

The response data is a JSON object following this format:

{ "translatedText" : translated text }

Several response status codes can be returned. This is the complete list of codes and their meanings:

  • 200 : Text has been translated successfully, responseDetails field is null.
  • 400 : Bad parameters. A compulsory argument is missing, or there is an argument with wrong format. A more accurate description can be found in responseDetails field.
  • 451 : Not supported pair. Apertium can't translate with the requested language pair.
  • 452 : Not supported format. The translation engine doesn't recognize the requested format.
  • 500 : Unexpected error. An unexpected error happened. Depending on the error, a more accurate description can be found in responseDetails field.
  • 552 : Overloaded system. The system is overloaded and can't process the request.
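
One way a client could act on these codes is sketched below. The table mirrors the list above; the names TranslationError and translated_text are hypothetical. What responseData contains when an error occurs is not stated on this page, so the sketch does not read it in that case.

```python
# Status codes and their meanings, as listed above.
STATUS_MEANINGS = {
    200: "OK",
    400: "Bad parameters",
    451: "Not supported pair",
    452: "Not supported format",
    500: "Unexpected error",
    552: "Overloaded system",
}

class TranslationError(Exception):
    """Raised when responseStatus is anything other than 200."""

    def __init__(self, status, meaning, details):
        super().__init__("%d %s: %s" % (status, meaning, details))
        self.status = status
        self.meaning = meaning
        self.details = details

def translated_text(response):
    """Return the translation from a parsed response object, or raise TranslationError."""
    status = response["responseStatus"]
    if status == 200:
        return response["responseData"]["translatedText"]
    meaning = STATUS_MEANINGS.get(status, "Unknown status")
    raise TranslationError(status, meaning, response.get("responseDetails"))
```

A caller could, for instance, retry later on status 552 and report any other failure to the user.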

Here is a simple example. Requesting a translation with:

curl 'http://ApertiumServerInstallationHost/ApertiumServerRouter/resources/translate?q=hello%20world&langpair=en%7Ces&callback=foo'

the result is:

foo({"responseData":{"translatedText":"hola Mundo"},"responseDetails":null,"responseStatus":200})

And if we add the context parameter:

curl 'http://ApertiumServerInstallationHost/ApertiumServerRouter/resources/translate?q=hello%20world&langpair=en%7Ces&callback=foo&context=a'

we get:

foo('a',{"translatedText":"hola Mundo"},200,null)

Batch interface

More than one translation can be performed in the same request by supplying more than one q argument or more than one langpair argument. If there is only one q argument and more than one langpair argument, the same input string is translated with each language pair. If there is only one langpair argument and more than one q argument, the different input strings are translated with the same language pair. If both arguments are supplied more than once, and the same number of times, the first q is translated with the first langpair, the second q with the second langpair, and so on.

The returned JSON changes slightly when using the batch interface. The responseData field now contains an array of JSON objects, each with the usual fields: responseData, responseStatus and responseDetails. Note that each translation has its own values of responseStatus and responseDetails, and there are global values too. If all the translations succeed, these values match; if any translation fails, the global fields take the values of the failed translation. If more than one translation fails, the global fields take the values of one of the failed translations.
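
The pairing rules above can be written down as a small function. This is a sketch of the described behaviour, not the server's code: pair_requests is a hypothetical name, and since the text does not say what happens when the two arguments are repeated a different number of times, the sketch simply raises an error in that case.

```python
def pair_requests(qs, langpairs):
    """Pair each input string with a language pair, following the batch rules.

    qs        -- list of q values, in request order
    langpairs -- list of langpair values, in request order
    """
    if not qs or not langpairs:
        raise ValueError("q and langpair are compulsory")
    if len(qs) == 1:
        # One string, one or more pairs: same string with every pair.
        return [(qs[0], langpair) for langpair in langpairs]
    if len(langpairs) == 1:
        # Several strings, one pair: every string with the same pair.
        return [(q, langpairs[0]) for q in qs]
    if len(qs) == len(langpairs):
        # Same number of each: first with first, second with second, etc.
        return list(zip(qs, langpairs))
    # Assumption: mismatched counts are not covered by the description above.
    raise ValueError("q and langpair are repeated a different number of times")
```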

These examples show the described behaviour:

curl  'http://ApertiumServerInstallationHost/ApertiumServerRouter/resources/translate?q=hello%20world&q=bye&langpair=en%7Ces'
{"responseData":[{"responseData":{"translatedText":"hola Mundo"},"responseDetails":null,"responseStatus":200},
{"responseData":{"translatedText":"adiós"},"responseDetails":null,"responseStatus":200}],"responseDetails":null,"responseStatus":200}


curl  'http://ApertiumServerInstallationHost/ApertiumServerRouter/resources/translate?q=hello%20world&langpair=en%7Ces&langpair=en%7Cca&callback=foo'
foo({"responseData":[{"responseData":{"translatedText":"hola Mundo"},"responseDetails":null,"responseStatus":200},
{"responseData":{"translatedText":"Món d'hola"},"responseDetails":null,"responseStatus":200}],"responseDetails":null,"responseStatus":200})


curl  'http://ApertiumServerInstallationHost/ApertiumServerRouter/resources/translate?q=hello%20world&q=goodbye&langpair=en%7Ces&langpair=en%7Cca&callback=foo&context=bar'
foo('bar',[{"responseData":{"translatedText":"hola Mundo"},"responseDetails":null,"responseStatus":200},
{"responseData":{"translatedText":"adéu"},"responseDetails":null,"responseStatus":200}],200,null)
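
A batch response without callback can be read on the client side roughly as follows. This is a sketch: batch_results is a hypothetical helper, and it only reads translatedText from items whose own responseStatus is 200, because the content of responseData for a failed item is not documented on this page.

```python
import json

def batch_results(body):
    """Return (global_status, results) from a batch response.

    results is a list of (status, translated_text, details) tuples,
    one per translation, in the order returned by the server.
    """
    response = json.loads(body)
    results = []
    for item in response["responseData"]:
        status = item["responseStatus"]
        # Only successful items are assumed to carry translatedText.
        text = item["responseData"]["translatedText"] if status == 200 else None
        results.append((status, text, item["responseDetails"]))
    return response["responseStatus"], results
```

Checking the global status first is enough to know whether every translation succeeded; the per-item status is only needed to find out which one failed.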