Difference between revisions of "User:Stan88"

From Apertium

Latest revision as of 22:25, 9 January 2015

Here are some problems that might appear when translating from Polish to English and vice versa.

Main problems when working with the Polish-English pair

Noun group inflections

In English, the form of an adjective does not depend on the noun it describes. In Polish, adjectives and determiners have to agree in case, number and gender with the noun they modify.

For example: I want a blue cat (Ja chcę niebieskiego kota).

"kota" (cat) is in the accusative case, so "niebieski" (blue) must be in the same case. That is why it is not "Ja chcę niebieski kot" but "Ja chcę niebieskiego kota".

The inflection of a noun also depends on its gender.

There are three main genders: masculine, feminine and neuter.

Cat ("kot") is masculine.

But parrot ("papuga") is feminine.

So: I want a blue parrot (Ja chcę niebieską papugę), not "niebieskiego".

As a rule of thumb, a feminine noun ends in "a".

A neuter noun ends in "o".

If a noun does not end in "o" or "a", it is most likely masculine.

There are exceptions to this rule (for example "mężczyzna", man, ends in "a" but is masculine), but they are relatively few.
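The rule of thumb above can be sketched in a few lines of Python. This is only an illustration of the heuristic as stated here, not a real morphological analyser; the function names and the small adjective table are made up for this example.

```python
def guess_gender(noun: str) -> str:
    """Guess grammatical gender from the nominative singular ending.

    Implements only the rule of thumb from the text: "a" -> feminine,
    "o" -> neuter, anything else -> masculine. Exceptions are not handled.
    """
    if noun.endswith("a"):
        return "feminine"
    if noun.endswith("o"):
        return "neuter"
    return "masculine"


# Accusative singular forms of "niebieski" (blue). The masculine form listed
# here is the one used with animate nouns such as "kot".
NIEBIESKI_ACCUSATIVE = {
    "masculine": "niebieskiego",
    "feminine": "niebieską",
    "neuter": "niebieskie",
}


def blue_accusative(noun_nominative: str) -> str:
    """Pick the accusative form of "blue" that agrees with the noun."""
    return NIEBIESKI_ACCUSATIVE[guess_gender(noun_nominative)]
```

With this sketch, blue_accusative("kot") gives "niebieskiego" and blue_accusative("papuga") gives "niebieską". An exception such as "mężczyzna" would be guessed wrongly as feminine, which is why a real translator needs a dictionary and not just a suffix rule.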


Word order

In English the word order is almost fixed, but in Polish it isn't.

The basic word order in Polish is SVO.

But it is possible to move words around in the sentence (not all orders are correct, but quite many of them are).

For example: My favorite dish is spaghetti carbonara (Moim ulubionym daniem jest spaghetti carbonara).

It can also be "Moim daniem ulubionym spaghetti carbonara jest", which translated literally would be "My dish favorite spaghetti carbonara is", and this makes no sense in English.

Another example: I love writing articles on Wiki (Ja kocham pisać artykuły na Wiki). It can also be "Ja na Wiki pisać artykuły kocham" or "Ja pisać na Wiki kocham artykuły".

So this might be a problem when translating from Polish to English, because the order of the words in the sentence has to be changed (e.g. adjective order in English is fixed, but in Polish it isn't).

Numbering problem

Another problem is that in Polish (and not in English) the form of a noun changes with the number that counts it:

  • one chair - jedno krzesło
  • two chairs - dwa krzesła
  • three chairs - trzy krzesła
  • four chairs - cztery krzesła
  • five chairs - pięć krzeseł
  • six chairs - sześć krzeseł

etc.

In English it is always chair(s): the form depends only on whether the noun is singular or plural. In Polish it also depends on the number itself.

The rules, according to the Polish grammar article on Wikipedia [http://en.wikipedia.org/wiki/Polish_grammar]:

* The numeral jeden (1) behaves as an ordinary adjective, and no special rules apply.
* After the numerals dwa (dwie), trzy, cztery (2, 3, 4), and compound numbers ending with them (22, 23, 24, etc.), the noun is plural and takes the same case as the numeral, and the resulting noun phrase is plural.
* With other numbers (5, 6, etc., 20, 21, 25, etc.), if the numeral is nominative or accusative, the noun takes the genitive plural form, and the resulting noun phrase is neuter singular.
* With the masculine personal plural forms of numbers, the rule given above – that if the numeral is nominative or accusative the noun is genitive plural, and the resulting phrase is neuter singular – applies to all numbers other than 1.
* If the numeral is in the genitive, dative, instrumental or locative, the noun takes the same case as the numeral.
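A minimal Python sketch of the first three rules, for a numeral in the nominative and a noun that is not masculine personal. The function name and its parameters are hypothetical. Note one detail that the list only implies: 12, 13 and 14 are not compounds ending in dwa, trzy or cztery, so they take the genitive plural (dwanaście krzeseł).

```python
def noun_form_for_count(n: int, nom_sg: str, nom_pl: str, gen_pl: str) -> str:
    """Choose the noun form that follows the numeral n (nominative context).

    Covers only non-masculine-personal nouns; the caller supplies the three
    inflected forms, since this sketch does no inflection itself.
    """
    if n == 1:
        # "jeden" behaves like an ordinary adjective: nominative singular.
        return nom_sg
    last_digit = n % 10
    last_two_digits = n % 100
    # 2, 3, 4 and compounds ending in them (22, 23, 24, ...), but not 12-14.
    if last_digit in (2, 3, 4) and last_two_digits not in (12, 13, 14):
        return nom_pl
    # Everything else (0, 5, 6, ..., 20, 21, 25, ...) takes the genitive plural.
    return gen_pl
```

For the chair example, noun_form_for_count(n, "krzesło", "krzesła", "krzeseł") reproduces the table above: "krzesło" for 1, "krzesła" for 2 to 4 and "krzeseł" for 5 and 6.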

Subject dropping

Polish is a pro-drop language, which English isn't:

Subject pronouns are frequently dropped.

For example: ma kota (literally "has a cat") may mean "he/she/it has a cat".

It is also possible to drop the object, or sometimes even the verb, if they are obvious from context.

For example, ma ("has") or nie ma ("has not") may be used as an affirmative or negative answer to a question "does... have...?".

This is a very important part of the language, and it is not an old-fashioned way to express yourself.
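When translating into English, the dropped subject has to be restored from the verb ending. A small Python sketch for the present tense of "mieć" (to have) shows the idea; the table and function name are made up for this example, and a real system would get person and number from a morphological analyser rather than from a hand-written table.

```python
# Present-tense forms of "mieć" (to have) mapped to an English subject and verb.
MIEC_PRESENT = {
    "mam": ("I", "have"),
    "masz": ("you", "have"),
    "ma": ("he/she/it", "has"),
    "mamy": ("we", "have"),
    "macie": ("you", "have"),
    "mają": ("they", "have"),
}


def restore_subject(polish_verb: str) -> str:
    """Return an English subject + verb for a subjectless form of "mieć"."""
    pronoun, english_verb = MIEC_PRESENT[polish_verb]
    return f"{pronoun} {english_verb}"
```

So "ma kota" can be rendered as restore_subject("ma") + " a cat", which gives "he/she/it has a cat". The third person stays ambiguous, because the verb form alone does not tell which pronoun was dropped.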

Defective verbs

In Polish there are sentences without any subject, which can't be translated literally into English.

Examples are można ("it is possible"), wolno ("it is permitted") or pada ("it is raining").

In English you must use a dummy "it" to express this, and Polish uses no such word in these sentences.

Preposition inflecting

English has almost no cases, but Polish has 7 of them.

A preposition in Polish requires a fixed case in the noun group it governs (not exactly fixed, because some prepositions take different cases for different meanings).

For example: bez pięknego kota (without pretty cat), z pięknym kotem (with pretty cat), dla pięknego kota (for pretty cat), przeciwko pięknemu kotu (against pretty cat).

In all these examples English always has preposition + "pretty cat", but in Polish the form of the noun group changes.
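The examples above amount to a lookup from preposition to case, followed by a lookup from case to the inflected noun group. A Python sketch with hypothetical table names, covering only the four prepositions and the one noun group from the examples:

```python
# Case required by each preposition in the meaning used above.
# "z" is listed with the instrumental ("with"); with the genitive it has
# another meaning ("from"), which this sketch does not cover.
PREPOSITION_CASE = {
    "bez": "genitive",
    "z": "instrumental",
    "dla": "genitive",
    "przeciwko": "dative",
}

# Inflected forms of "piękny kot" (pretty cat), taken from the examples.
PIEKNY_KOT = {
    "genitive": "pięknego kota",
    "instrumental": "pięknym kotem",
    "dative": "pięknemu kotu",
}


def prepositional_phrase(preposition: str) -> str:
    """Build preposition + "pretty cat" with the noun group in the right case."""
    case = PREPOSITION_CASE[preposition]
    return f"{preposition} {PIEKNY_KOT[case]}"
```

For instance, prepositional_phrase("przeciwko") returns "przeciwko pięknemu kotu". Going from English to Polish, the translator has to make this choice for every preposition, while in the other direction the case information is simply dropped.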

Mathematics

Some differences in Polish notation:

The decimal separator may be written as a comma, e.g. 3,2 (means 3.2). So when translating from Polish to English, a comma can't always be interpreted as a character that separates items in a sequence.

A division sign may be written as a colon (:), e.g. 2:3 (means 2/3).

Also, a multiplication sign may be written as a raised dot, which looks like a decimal point, e.g. 3 ∙ 2 (means 3 x 2).
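These three conventions could be normalised with regular expressions before translation. The Python sketch below is a rough illustration with a made-up function name. It only rewrites a sign that stands between digits, and it is still ambiguous in practice: "1,2,3" may be a list and "12:30" may be a time.

```python
import re


def normalize_polish_math(text: str) -> str:
    """Rewrite Polish numeric notation into the English convention."""
    # Decimal comma between two digits: 3,2 -> 3.2
    text = re.sub(r"(?<=\d),(?=\d)", ".", text)
    # Colon used as a division sign between two numbers: 2:3 -> 2/3
    text = re.sub(r"(?<=\d)\s*:\s*(?=\d)", "/", text)
    # Raised dot used as a multiplication sign: 3 ∙ 2 -> 3 x 2
    text = re.sub(r"(?<=\d)\s*[∙·]\s*(?=\d)", " x ", text)
    return text
```

A comma followed by a space, as in "a, b" or "3, 4", is left alone, so ordinary enumerations written with spaces are not affected.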

How to enable multiple Kazakh language variants on a MediaWiki instance

This feature automatically changes the language variant (Arabic, Cyrillic, Latin) of a page. A variant is mostly the same language in a different script.

http://kk.wikipedia.org/wiki/%D0%91%D0%B0%D1%81%D1%82%D1%8B_%D0%B1%D0%B5%D1%82

The page above is an example of how it should look.

[Screenshot: Screenshot-kk_wikipedia_org_2015-01-09_18-01-50.png]

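Language Converter

This feature is implemented by the class KkConverter [https://doc.wikimedia.org/mediawiki-core/master/php/html/classKkConverter.html#details], which extends LanguageConverter [http://www.mediawiki.org/wiki/Writing_systems#LanguageConverter].

You can obtain the source of KkConverter from [https://doc.wikimedia.org/mediawiki-core/master/php/html/LanguageKk_8php_source.html] (file LanguageKk.php).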

KkConverter class

KkConverter::translate

Its main function is KkConverter::translate($text, $toVariant).

It translates the given text to the given variant.
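To give a feeling for what converting text to a variant means, here is a toy Python sketch of a character-by-character script conversion. It is not the KkConverter implementation: the function, the variant code used as a key and the five-letter table are assumptions made for this example, and the real conversion tables in LanguageKk.php are much larger and also use context-dependent rules.

```python
# Toy conversion tables keyed by a variant code (assumed naming).
# Only a handful of letters are listed; this is NOT the real Kazakh mapping.
TABLES = {
    "kk-latn": str.maketrans({"а": "a", "б": "b", "е": "e", "с": "s", "т": "t"}),
}


def translate(text: str, to_variant: str) -> str:
    """Convert text to the given variant; unknown variants leave it unchanged."""
    table = TABLES.get(to_variant)
    if table is None:
        return text
    return text.translate(table)
```

With this table, translate("бет", "kk-latn") returns "bet", while a variant with no table, such as "kk-cyrl" here, returns the text unchanged.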

KkConverter::findVariantLink

KkConverter::findVariantLink(&$link, &$nt, $ignoreOtherCond = false)

A wrapper function with two extra conditions:

-> if there is no selected variant, leave the link names as they were

-> do not try to find variants for usernames

Enabling the script

KkConverter is already included in MediaWiki by default, so you don't have to install it; you just have to turn it on.

To use the LanguageConverter, change one line in apps/mediawiki/htdocs/LocalSettings.php from $wgLanguageCode = 'xy'; to $wgLanguageCode = 'kk'; (here 'xy' stands for whatever language code is currently set; for me this is line 113).

After that, the script should be enabled on your MediaWiki instance.

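Other documentation

There is practically no documentation about it.

The only pages that help in understanding how it works are in Chinese (but they are readable in English after machine translation) [https://translate.google.com/translate?hl=pl&sl=zh-CN&tl=en&u=http%3A%2F%2Fzh.wikipedia.org%2Fwiki%2FHelp%3A%25E4%25B8%25AD%25E6%2596%2587%25E7%25BB%25B4%25E5%259F%25BA%25E7%2599%25BE%25E7%25A7%2591%25E7%259A%2584%25E7%25B9%2581%25E7%25AE%2580%25E3%2580%2581%25E5%259C%25B0%25E5%258C%25BA%25E8%25AF%258D%25E5%25A4%2584%25E7%2590%2586]. They are about Chinese automatic conversion, but it works in almost the same way.

There is also one page in English [http://meta.wikimedia.org/wiki/Automatic_conversion_between_simplified_and_traditional_Chinese], but it is mostly outdated.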