Difference between revisions of "User:Jimregan"

From Apertium
m (Enriched Corpus of the Frequency Dictionary/A Grammar of the Polish Language)
m (anaphora)
 
(27 intermediate revisions by 4 users not shown)
Line 1: Line 1:
{{TOCD}}
[[User:Jimregan/apertium-en-pl.pl.dix|Polish monodix]]


[[IRC]] nick: jimregan
[[User:Jimregan/apertium-en-pl.en-pl.dix|Polish-English monodix]]


Melange link_id: jimregan


== One-liners ==
==Polish-English texts under free licences==
<pre>
{{see-also|Corpora}}
filtered-expand () { if [ "$1" = "rl" ]; then dir=":<:"; else dir=":>:"; fi; lt-expand "$2" | grep -v __REGEXP__ | grep -v '^:<:' | grep -v ':>:$' | grep -v "$dir" ; }
*[http://www.oreilly.com/openbook/freedom/index.html Free As In Freedom] - [http://stallman.helion.pl/ W obronie wolności] ("In the Defense of Freedom")
select-expand () { if [ "$1" = "lr" ]; then dir=":<:"; else dir=":>:"; fi; lt-expand "$2" | grep -v __REGEXP__ | grep -v '^:<:' | grep -v ':>:$' | grep -v "$dir" ; }
*[http://www.gutenberg.org/dirs/etext04/lchch10.txt Chess and Checkers: the Way to Mastership] - [http://www.gutenberg.org/files/15201/15201-8.txt Szachy i Warcaby: Droga do mistrzostwa]
*[http://en.wikisource.org/wiki/The_Tragedy_of_Romeo_and_Juliet The Tragedy of Romeo and Juliet] - [http://pl.wikisource.org/wiki/Romeo_i_Julia Romeo i Julia]
*[http://en.wikisource.org/wiki/Robinson_Crusoe Robinson Crusoe] - [http://pl.wikisource.org/wiki/Robinson_Cruzoe Przypadki Robinsona Cruzoe]
**Wikisource has a mechanism where they ''try'' to present automatic bilingual editions of any works they have: see [http://pl.wikisource.org/wiki/Robinson_Cruzoe?match=en Robinson Crusoe] for example. Unfortunately, it doesn't work, as different choices have been made in the laying out of different language editions. But it looks interesting.


list-multiple () { select-expand "$1" "$2" | awk -F':|:<:|:>:' -v dir="$1" '{ if (dir == "lr") print "^" $1 "$"; else print "^" $2 "$" }' | lt-proc -b "$3" | awk -F/ '(NF > 2) { print $0 }' ; }
==Polish texts under free licences==
</pre>
*[http://www.mimuw.edu.pl/polszczyzna/ Enriched Corpus of the Frequency Dictionary] - Monolingual corpus of Polish. Manually tagged.


== Anaphora resolution ==
==Polish grammar==

*[http://free.of.pl/g/grzegorj/gram/gram00.html A Grammar of the Polish Language] by Grzegorz Jagodziński
<pre>
[00:17] <jimregan> anaphora is one of those polarising things about MT
[00:18] <jimregan> RBMT is like a 1920s man's man: a man's a man, even if he's a woman
[00:18] <jimregan> SMT is like a Thai prostitute: sometimes it's a man, sometimes it's a woman, sometimes it's both
</pre>

== Random IRC ==

<pre>
<jimregan> pl->cs adjectives up 17%
<jimregan> like they friggin' shares or somthing
<jimregan> *they're
<jimregan> *something
<jimregan> damn
<spectie> haha
<Kanmuri> "In market news today, PLCS ADJ was up 17%, while JR TYPNG was down 25%" ;D
</pre>

== On spectie and questions... ==
<pre>
<jimregan2> still though
<jimregan2> if the question ever comes about how to shoot your own leg off, you'd happily discuss aiming techniques
<spectie> haha
<spectie> ...or chop of your own legs with an axe while sitting in a wheelchair...
<jimregan2> Fuck! You've /thought/ about it!
<spectie> http://jayg123.googlepages.com/bestexitinterviewever
<spectie> LOL
</pre>

== Polish dictionaries ==
*[http://www.mimuw.edu.pl/~jsbien/BW/SSSP/SSSP.tex Polish-Swahili]

==Untranslatable==

- Wczoraj, bandyta napadł mię (Yesterday, a bandit attacked me)
- Co się stało? (What happened?)
- Mówił pieniądze albo śmierć (He said "money or death")
- A co zrobiłeś? (What did you do?)
- Ale śmierdziałem! (Oh, but I stank)

(Voiced consonants in Polish become devoiced at the end of words, so "śmierdź" and "śmierć" sound the same.)


[[Category:Users|Jimregan]]

Latest revision as of 23:02, 28 March 2012

IRC nick: jimregan

Melange link_id: jimregan

One-liners

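# Expand dictionary $2, dropping __REGEXP__ entries and lines that start with :<: or end with :>:; "rl" also drops :<: entries, anything else drops :>: entries.
filtered-expand () { if [ "$1" = "rl" ]; then dir=":<:"; else dir=":>:"; fi; lt-expand "$2" | grep -v __REGEXP__ | grep -v '^:<:' | grep -v ':>:$' | grep -v "$dir" ; }
# Same pipeline with the direction test reversed: "lr" drops :<: entries, anything else drops :>: entries.
select-expand () { if [ "$1" = "lr" ]; then dir=":<:"; else dir=":>:"; fi; lt-expand "$2" | grep -v __REGEXP__ | grep -v '^:<:' | grep -v ':>:$' | grep -v "$dir" ; }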

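# Print the entries of dictionary $2 (direction $1) for which lt-proc -b with $3 returns more than one result.
list-multiple () { select-expand "$1" "$2" | awk -F':|:<:|:>:' -v dir="$1" '{ if (dir == "lr") print "^" $1 "$"; else print "^" $2 "$" }' | lt-proc -b "$3" | awk -F/ '(NF > 2) { print $0 }' ; }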
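
As a rough illustration of what the direction filter in select-expand does, the sketch below runs the same grep chain over a few hand-written lines in the style of lt-expand output, so it can be tried without a dictionary. The sample entries, the simplified tags, and the names sample_entries and demo_select are made up for this example; real lt-expand output may differ in detail.

```shell
# Hypothetical sample in the style of lt-expand output (not from a real dictionary):
#   left:right      entry used in both directions
#   left:>:right    left-to-right only
#   left:<:right    right-to-left only
sample_entries () {
  printf '%s\n' \
    'kot:kot<n>' \
    'koty:>:kot<n>' \
    'kota:<:kot<n>' \
    '__REGEXP__[0-9]+:__REGEXP__[0-9]+<num>'
}

# Same filter chain as select-expand, but reading stdin instead of calling lt-expand.
demo_select () {
  if [ "$1" = "lr" ]; then dir=":<:"; else dir=":>:"; fi
  grep -v __REGEXP__ | grep -v '^:<:' | grep -v ':>:$' | grep -v "$dir"
}

# Keep the entries usable left-to-right: the unmarked one and the :>: one.
sample_entries | demo_select lr
```

Passing rl instead of lr keeps the unmarked entry and the :<: entry.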

Anaphora resolution

[00:17]  <jimregan> anaphora is one of those polarising things about MT
[00:18]  <jimregan> RBMT is like a 1920s man's man: a man's a man, even if he's a woman
[00:18]  <jimregan> SMT is like a Thai prostitute: sometimes it's a man, sometimes it's a woman, sometimes it's both

Random IRC

<jimregan>      pl->cs adjectives up 17%
<jimregan>      like they friggin' shares or somthing
<jimregan>      *they're
<jimregan>      *something
<jimregan>      damn
<spectie>       haha
<Kanmuri>       "In market news today, PLCS ADJ was up 17%, while JR TYPNG was down 25%" ;D

On spectie and questions...

<jimregan2> still though
<jimregan2> if the question ever comes about how to shoot your own leg off, you'd happily discuss aiming techniques
<spectie> haha
<spectie> ...or chop of your own legs with an axe while sitting in a wheelchair...
<jimregan2> Fuck! You've /thought/ about it!
<spectie> http://jayg123.googlepages.com/bestexitinterviewever
<spectie> LOL

Polish dictionaries

Untranslatable

- Wczoraj, bandyta napadł mię (Yesterday, a bandit attacked me)
- Co się stało? (What happened?)
- Mówił pieniądze albo śmierć (He said "money or death")
- A co zrobiłeś? (What did you do?)
- Ale śmierdziałem! (Oh, but I stank)

(Voiced consonants in Polish become devoiced at the end of words, so "śmierdź" and "śmierć" sound the same.)