User:Stan88

Here are some problems that may appear when translating from Polish to English and vice versa.

Main problems when working with the Polish-English pair

Noun group inflections

In English, the form of an adjective does not depend on the noun it describes. In Polish, adjectives and determiners must agree in case, number and gender with the noun they modify.

For example: I want a blue cat (Ja chcę niebieskiego kota).

"kota" (cat) is in the accusative case, so "niebieski" (blue) must be in the accusative as well. That is why it is not "Ja chcę niebieski kot" but "Ja chcę niebieskiego kota".

The inflection also depends on the gender of the noun.

There are three main genders: masculine, feminine and neuter.

Cat ("kot") is masculine.

But parrot ("papuga") is feminine.

So: I want a blue parrot (Ja chcę niebieską papugę), not "niebieskiego".
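
The agreement in the two examples above can be sketched as a small lookup. This is only an illustration: the lexicon contains just the two nouns from the text, and the names NOUNS, BLUE_ACC and blue_object are hypothetical, not part of Apertium.

```python
# Toy lexicon built only from the two examples above ("m" = masculine, "f" = feminine).
NOUNS = {
    "kot": {"gender": "m", "acc": "kota"},
    "papuga": {"gender": "f", "acc": "papugę"},
}

# Accusative singular forms of "niebieski" (blue), keyed by the gender of the noun.
BLUE_ACC = {"m": "niebieskiego", "f": "niebieską"}


def blue_object(noun):
    """Return the accusative phrase 'blue <noun>' with the adjective agreeing in gender."""
    entry = NOUNS[noun]
    return f"{BLUE_ACC[entry['gender']]} {entry['acc']}"


print("Ja chcę", blue_object("kot"))
print("Ja chcę", blue_object("papuga"))
```

The adjective form is chosen from the gender stored with the noun, never from the adjective alone, which is the point of the agreement rule.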

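As a rule of thumb, feminine nouns end in "-a".

Neuter nouns typically end in "-o".

A noun that ends in neither "-a" nor "-o" is most likely masculine.

There are exceptions to these rules (for example, "mężczyzna" (man) is masculine although it ends in "-a"), but there are relatively few of them.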
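
The ending-based rule above can be written as a short heuristic. This is a minimal sketch that ignores the exceptions; the function name guess_gender and the labels "f", "n", "m" are assumptions made for the example.

```python
def guess_gender(noun):
    """Guess the gender of a Polish noun from its nominative singular ending.

    Heuristic only: it follows the rule of thumb and knows no exceptions.
    """
    if noun.endswith("a"):
        return "f"  # feminine
    if noun.endswith("o"):
        return "n"  # neuter
    return "m"      # everything else is assumed masculine


for word in ("kot", "papuga", "krzesło"):
    print(word, guess_gender(word))
```

A real dictionary would store the gender explicitly for each noun and use a guess like this only for unknown words.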


Word order

In English the word order is almost fixed, but in Polish it is not.

The basic word order in Polish is SVO.

However, words can be moved around in the sentence (not every order is correct, but quite a few are).

For example: My favourite dish is spaghetti carbonara (Moim ulubionym daniem jest spaghetti carbonara).

It can also be "Moim daniem ulubionym spaghetti carbonara jest", which translated literally would be "My dish favourite spaghetti carbonara is", and that makes no sense in English.

Another example: I love writing articles on Wiki (Ja kocham pisać artykuły na Wiki). It can also be "Ja na Wiki pisać artykuły kocham" or "Ja pisać na Wiki kocham artykuły".

So this might be a problem when translating from Polish to English, because the order of the words in the sentence has to be changed (e.g. the position of adjectives is fixed in English but not in Polish).
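
One small part of this reordering, moving an adjective in front of its noun, can be sketched as follows. The token format (word, tag) and the tags "n", "adj" and "det" are assumptions made for the example; this is not how Apertium transfer rules are actually written.

```python
def adjective_first(tokens):
    """Swap every 'noun adjective' pair into the English 'adjective noun' order.

    tokens is a list of (word, tag) pairs; the input list is not modified.
    """
    out = list(tokens)
    i = 0
    while i < len(out) - 1:
        if out[i][1] == "n" and out[i + 1][1] == "adj":
            out[i], out[i + 1] = out[i + 1], out[i]
            i += 2  # skip the pair that was just swapped
        else:
            i += 1
    return out


# Word-for-word gloss of "moim daniem ulubionym"
gloss = [("my", "det"), ("dish", "n"), ("favourite", "adj")]
print(" ".join(word for word, _ in adjective_first(gloss)))
```

Moving the verb back into the SVO position needs syntactic information that a local swap like this does not have.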

Numbering problem

Another problem is that in Polish (unlike in English) the form of the noun changes with the quantity being counted:

  • one chair - jedno krzesło
  • two chairs - dwa krzesła
  • three chairs - trzy krzesła
  • four chairs - cztery krzesła
  • five chairs - pięć krzeseł
  • six chairs - sześć krzeseł

etc.

In English it is always chair(s), so the form depends only on whether the noun is singular or plural, but in Polish it depends on the number itself.

The rules, according to the wiki:

  • The numeral jeden (1) behaves as an ordinary adjective, and no special rules apply.
  • After the numerals dwa (dwie), trzy, cztery (2, 3, 4) and compound numbers ending with them (22, 23, 24, etc.), the noun is plural and takes the same case as the numeral, and the resulting noun phrase is plural.
  • With other numbers (5, 6, etc., 20, 21, 25, etc.), if the numeral is nominative or accusative, the noun takes the genitive plural form, and the resulting noun phrase is neuter singular.
  • With the masculine personal plural forms of numbers, the rule given above (if the numeral is nominative or accusative, the noun is genitive plural and the resulting phrase is neuter singular) applies to all numbers other than 1.
  • If the numeral is in the genitive, dative, instrumental or locative, the noun takes the same case as the numeral.
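
For a numeral in the nominative, the rules above reduce to a choice between three forms of the noun. The sketch below shows this for "krzesło" (chair). The function name chair_form is hypothetical, and the exclusion of 12 to 14 is not spelled out in the rules above: it is an assumption taken from the usual Polish plural rule (dwanaście krzeseł).

```python
def chair_form(n):
    """Return the form of 'krzesło' used after the number n (numeral in the nominative)."""
    if n == 1:
        return "krzesło"   # nominative singular
    if n % 10 in (2, 3, 4) and n % 100 not in (12, 13, 14):
        return "krzesła"   # nominative plural: 2, 3, 4, 22, 23, 24, ...
    return "krzeseł"       # genitive plural: 5, 6, ..., 20, 21, 25, ...


for n in (1, 2, 5, 12, 21, 22):
    print(n, chair_form(n))
```

The masculine personal forms and the oblique cases of the numeral are not covered by this sketch.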

Subject dropping

Polish is a pro-drop language, while English is not.

Subject pronouns are frequently dropped.

For example, "ma kota" (literally "has a cat") may mean "he/she/it has a cat".

It is also possible to drop the object, or sometimes even the verb, if they are obvious from context.

For example, "ma" ("has") or "nie ma" ("has not") may be used as an affirmative or negative answer to a question of the form "does ... have ...?".

This is a very important part of the language, not an old-fashioned way of expressing yourself.
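
When translating into English, the dropped subject has to be restored from the person and number of the verb. A minimal sketch, assuming the analysis of the Polish verb already provides person and number; the names PRONOUNS and add_subject are hypothetical.

```python
# English subject pronouns keyed by (person, number) of the Polish verb form.
PRONOUNS = {
    (1, "sg"): "I",
    (2, "sg"): "you",
    (3, "sg"): "he/she/it",
    (1, "pl"): "we",
    (2, "pl"): "you",
    (3, "pl"): "they",
}


def add_subject(english_verb_phrase, person, number):
    """Prefix a subject pronoun to a verb phrase translated from a subjectless Polish clause."""
    return f"{PRONOUNS[(person, number)]} {english_verb_phrase}"


# "ma kota": third person singular, no explicit subject
print(add_subject("has a cat", 3, "sg"))
```

Choosing between "he", "she" and "it" needs context from earlier sentences, which is why the sketch keeps all three.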

Defective verbs

In Polish there are sentences without any subject, which cannot be translated literally into English.

Examples are "można" ("it is possible"), "wolno" ("it is permitted") or "pada" ("it is raining").

In English such sentences need the dummy subject "it", which cannot be used in Polish.

Preposition inflecting

In English there are almost no cases, but in Polish there are 7.

Each Polish preposition requires a particular case for the noun group it governs (not strictly a single case, because some prepositions take different cases with different meanings).

For example: bez pięknego kota (without a pretty cat), z pięknym kotem (with a pretty cat), dla pięknego kota (for a pretty cat), przeciwko pięknemu kotu (against a pretty cat).

In all these examples English always has preposition + "a pretty cat", but in Polish the form of the noun group changes.
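
The four examples above can be captured in a small table from preposition to required case. This is a sketch limited to those examples: the names PREP_CASE, PRETTY_CAT and with_preposition are hypothetical, and "z" is listed only in its "with" sense.

```python
# Case required by each preposition in the examples above.
PREP_CASE = {
    "bez": "gen",        # without
    "z": "ins",          # with (this sense only)
    "dla": "gen",        # for
    "przeciwko": "dat",  # against
}

# Forms of "piękny kot" (pretty cat) taken from the same examples.
PRETTY_CAT = {
    "gen": "pięknego kota",
    "ins": "pięknym kotem",
    "dat": "pięknemu kotu",
}


def with_preposition(prep):
    """Return the preposition followed by 'pretty cat' in the case it requires."""
    return f"{prep} {PRETTY_CAT[PREP_CASE[prep]]}"


for prep in PREP_CASE:
    print(with_preposition(prep))
```

A preposition that takes different cases with different meanings would need a list of (meaning, case) entries instead of a single value.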

Mathematics

Some notational differences in Polish:

A decimal point may be written as a comma, e.g. 3,2 (means 3.2). (So when translating from Polish to English, a comma cannot always be interpreted as a separator between the items of a sequence.)

A division sign may be written as a colon, e.g. 2:3 (means 2/3).

A multiplication sign may be written as a raised dot, e.g. 3 ∙ 2 (means 3 x 2).
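
These three conventions can be normalised with regular expressions when translating from Polish to English. A rough sketch: the function name normalize_math is hypothetical, and the patterns only fire between digits, so they would also rewrite clock times such as 12:30 or a list written without spaces such as 1,2,3.

```python
import re


def normalize_math(text):
    """Rewrite Polish numeric notation into the English notation used above."""
    # Decimal comma directly between two digits: 3,2 -> 3.2
    text = re.sub(r"(?<=\d),(?=\d)", ".", text)
    # Colon as division sign between digits: 2:3 -> 2/3
    text = re.sub(r"(\d)\s*:\s*(?=\d)", r"\1/", text)
    # Raised dot as multiplication sign: 3 ∙ 2 -> 3 x 2
    text = re.sub(r"(\d)\s*∙\s*(?=\d)", r"\1 x ", text)
    return text


for sample in ("3,2", "2:3", "3 ∙ 2", "1, 2, 3"):
    print(sample, "->", normalize_math(sample))
```

A comma followed by a space is left alone, so an ordinary list like "1, 2, 3" is not touched.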

How to enable multiple Kazakh language variants on a MediaWiki instance

This extension makes it possible to automatically change the language variant (Arabic, Cyrillic, Latin) of a page. A variant is essentially the same language written in a different script.

http://kk.wikipedia.org/wiki/%D0%91%D0%B0%D1%81%D1%82%D1%8B_%D0%B1%D0%B5%D1%82

The page linked above is an example of how it should look.

[Screenshot: kk.wikipedia.org, 2015-01-09]

Language Converter

This plugin is actually the class KkConverter[1], which extends LanguageConverter[2].

You can obtain the source of KkConverter from [3] (file LanguageKk.php).

KkConverter class

KkConverter::translate

Its main function is KkConverter::translate($text, $toVariant).

It translates the given text to the given variant.
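
To give an idea of what converting text to a variant means, here is a toy character-by-character conversion. It is only an illustration and not the KkConverter implementation: the table covers four letters, the Python function merely mirrors the shape of translate($text, $toVariant), and the use of "kk-latn" as the Latin variant code is an assumption.

```python
# Illustrative table only: four Cyrillic letters and their Latin counterparts.
TOY_CYRILLIC_TO_LATIN = {"б": "b", "а": "a", "с": "s", "т": "t"}


def translate(text, to_variant):
    """Convert text to the requested variant; only the toy Latin table is supported."""
    if to_variant != "kk-latn":
        return text  # unknown variant: leave the text unchanged
    # Characters missing from the table are passed through unchanged.
    return "".join(TOY_CYRILLIC_TO_LATIN.get(ch, ch) for ch in text)


print(translate("бас", "kk-latn"))
```

A real converter needs the full table for each script and extra rules where one letter does not map to exactly one letter.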

KkConverter::findVariantLink

KkConverter::findVariantLink(&$link, &$nt, $ignoreOtherCond = false)

A wrapper function that:

-> leaves the link names as they were if no variant is selected

-> does not try to find variants for usernames

Enabling the script

KkConverter is included in MediaWiki by default, so you do not have to install it; you just have to turn it on.

To use the LanguageConverter, change one line in apps/mediawiki/htdocs/LocalSettings.php from $wgLanguageCode = 'xy' to $wgLanguageCode = 'kk' (for me this is line 113).

After that, the script should be enabled on your MediaWiki instance.

Other documentation

There is practically no documentation about it.

The only sites that help in understanding how it works are in Chinese (but they are readable once translated into English) [4]. (They describe automatic conversion for Chinese, but it works in almost the same way.)

There is also one page in English [5], but it is mostly outdated.