Difference between revisions of "User:Pranavaswaroop/Application"

Revision as of 03:02, 3 April 2009

Implementation of n-Stage Transfer

A proposal for Google Summer of Code 2009

Pranava Swaroop, March 2009

Introduction

Partial parsing aims to recover syntactic information reliably and efficiently without attempting a deep, complete analysis. The technique has been tested for its robustness and speed. It is typically driven by a cascade of finite-state automata, that is, a pipeline of recognizers. Several implementations of partial parsing exist, including Cass, a fast, robust partial parser; Mirine 2.2, a Korean grammar checker; and APOLN, a partial parser for unrestricted natural-language sentences. These implementations (especially the last) have shown that partial parsers can help with the treatment of less related languages, so an implementation would be beneficial for Apertium. This document describes a new implementation of a partial parser that would improve the treatment of less related languages and allow for more complex verb movement. It proposes that the project be funded through the Google Summer of Code 2009 program as part of Apertium.

Background

Partial Parser
A partial parser uses 'semi-deterministic' robust parsing algorithms that permit the analysis of unrestricted text. It works with simple grammars, which are usually defined as regular patterns. The output of the parser is a complete analysis tree. Partial parsers recognize phrase boundaries mainly on the basis of cues provided by the local context. Regardless of whether abstractions such as phrases occur in the model, most of the relevant information is contained directly in the sequence of words and part-of-speech tags to be processed.
A partial parser is mostly controlled by a cascaded set of finite-state automata and is therefore described in terms of a number of levels. In such a finite-state cascade, genuine recursion is not possible. The whole strategy is based on "simple-first parsing": as Steven Abney puts it, we make the easy calls first, whittling away at the harder decisions in the process.
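
The level-by-level behaviour described above can be illustrated with a minimal sketch. It is not code from Cass or Apertium: the toy grammar, the tag names (D, A, N, V, P) and the helper names `apply_level`, `cascade` and `labels` are all assumptions made for illustration. Each level scans the output of the previous level once, left to right, groups the longest matching run of symbols into a new node, and passes unmatched symbols through unchanged, so no level ever calls itself recursively.

```python
import re
from typing import List, Tuple

# A node is (label, payload): a leaf carries a word, a phrase carries children.
Node = Tuple[str, object]

# Hypothetical toy grammar: one list of (label, regex) rules per level.
# Each regex is written over symbols that are each followed by one space.
LEVELS = [
    [("NP", re.compile(r"(D )?(A )*(N )+"))],
    [("PP", re.compile(r"P NP "))],
    [("VP", re.compile(r"V (NP |PP )*"))],
    [("S", re.compile(r"NP VP "))],
]


def labels(nodes: List[Node]) -> List[str]:
    """Return only the category labels of a node sequence."""
    return [label for label, _ in nodes]


def apply_level(nodes: List[Node], rules) -> List[Node]:
    """Run one level of the cascade: left to right, longest match wins."""
    out: List[Node] = []
    i = 0
    while i < len(nodes):
        # Encode the remaining symbols as e.g. "D N V " so that a regex
        # can only consume whole symbols.
        text = "".join(label + " " for label in labels(nodes[i:]))
        best = None
        for label, pattern in rules:
            m = pattern.match(text)
            if m and m.end() > 0:
                n = m.group(0).count(" ")  # number of symbols consumed
                if best is None or n > best[1]:
                    best = (label, n)
        if best is None:
            out.append(nodes[i])  # no rule applies: pass the symbol through
            i += 1
        else:
            label, n = best
            out.append((label, nodes[i:i + n]))
            i += n
    return out


def cascade(nodes: List[Node], levels) -> List[Node]:
    """Apply every level exactly once, in order."""
    for rules in levels:
        nodes = apply_level(nodes, rules)
    return nodes


SENTENCE = [("D", "the"), ("N", "cat"), ("V", "sat"),
            ("P", "on"), ("D", "the"), ("N", "mat")]
print(labels(cascade(SENTENCE, LEVELS)))  # ['S']
```

Stopping the cascade after the first two levels leaves the sequence NP V PP, which shows how the easy decisions (noun and prepositional phrases) are made before the harder ones.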

Present Status of Apertium
Apertium structural transfer uses finite-state pattern matching to detect fixed-length patterns of lexical forms, in the usual left-to-right, longest-match way, and performs the corresponding transformations on them. A shallow-transfer rule consists of a sequence of lexical forms to detect and the transformations that have to be applied to them.
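
As a rough illustration of left-to-right, longest-match detection of fixed-length patterns, the following sketch may help. It does not use Apertium's actual rule format or API: the `(lemma, category)` representation, the category names and the `transfer` function are hypothetical simplifications. At each position the longest applicable rule is chosen, its transformation is applied to the matched lexical forms, and forms that no rule covers are copied through unchanged.

```python
from typing import Callable, List, Sequence, Tuple

# Simplified lexical form: (lemma, category). Real lexical forms carry
# more morphological information; this is an assumption of the sketch.
LexicalForm = Tuple[str, str]
Action = Callable[[List[LexicalForm]], List[LexicalForm]]
Rule = Tuple[Tuple[str, ...], Action]

# Hypothetical rules: move an adjective after the noun it modifies.
RULES: List[Rule] = [
    (("adj", "n"), lambda m: [m[1], m[0]]),
    (("det", "adj", "n"), lambda m: [m[0], m[2], m[1]]),
]


def transfer(forms: Sequence[LexicalForm],
             rules: Sequence[Rule]) -> List[LexicalForm]:
    """Apply fixed-length rules left to right, preferring the longest match."""
    out: List[LexicalForm] = []
    i = 0
    while i < len(forms):
        best = None
        for pattern, action in rules:
            window = forms[i:i + len(pattern)]
            matches = (len(window) == len(pattern) and
                       all(f[1] == cat for f, cat in zip(window, pattern)))
            if matches and (best is None or len(pattern) > len(best[0])):
                best = (pattern, action)
        if best is None:
            out.append(forms[i])  # no rule starts here: copy the form
            i += 1
        else:
            pattern, action = best
            out.extend(action(list(forms[i:i + len(pattern)])))
            i += len(pattern)  # resume after the matched pattern
    return out


PHRASE = [("the", "det"), ("red", "adj"), ("car", "n"), ("run", "vb")]
print(transfer(PHRASE, RULES))
```

Because every pattern has a fixed length, a rule of this kind cannot cover a phrase of arbitrary size, which is the limitation the multi-level approach is meant to address.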

Project plan

Project schedule

Deliverables

About the author

Mentors

External Links