Difference between revisions of "User:Jimregan"

m (anaphora)
 
(18 intermediate revisions by 4 users not shown)
{{TOCD}}
[[User:Jimregan/apertium-en-pl.en-pl.dix|Polish-English bidix]]
[[User:Jimregan/polish-verbs.yaml|polish-verbs.yaml]]

==Polish-English texts under free licences==
{{see-also|Corpora}}
*[http://www.oreilly.com/openbook/freedom/index.html Free As In Freedom] - [http://stallman.helion.pl/ W obronie wolności] ("In the Defense of Freedom")
*[http://www.gutenberg.org/dirs/etext04/lchch10.txt Chess and Checkers: the Way to Mastership] - [http://www.gutenberg.org/files/15201/15201-8.txt Szachy i Warcaby: Droga do mistrzostwa]
*[http://en.wikisource.org/wiki/The_Tragedy_of_Romeo_and_Juliet The Tragedy of Romeo and Juliet] - [http://pl.wikisource.org/wiki/Romeo_i_Julia Romeo i Julia]
*[http://en.wikisource.org/wiki/Robinson_Crusoe Robinson Crusoe] - [http://pl.wikisource.org/wiki/Robinson_Cruzoe Przypadki Robinsona Cruzoe]
**Wikisource has a mechanism where they ''try'' to present automatic bilingual editions of any works they have: see [http://pl.wikisource.org/wiki/Robinson_Cruzoe?match=en Robinson Crusoe] for example. Unfortunately, it doesn't work, as different choices have been made in the laying out of different language editions. But it looks interesting.
*[http://languagetool.cvs.sourceforge.net/*checkout*/languagetool/JLanguageTool/src/rules/false-friends.xml False friends dictionary] Also contains French and German.

==Polish texts under free licences==
*[http://www.mimuw.edu.pl/polszczyzna/ Enriched Corpus of the Frequency Dictionary] - Monolingual corpus of Polish. Manually tagged. A [http://korpus.pl/download/frek.bin.tar.bz2 compiled version] for [http://poliqarp.sourceforge.net/ Poliqarp] is also available.

==Polish grammar==
*[http://free.of.pl/g/grzegorj/gram/gram00.html A Grammar of the Polish Language] by Grzegorz Jagodziński

== Polish dictionaries ==

Latest revision as of 23:02, 28 March 2012

IRC nick: jimregan

Melange link_id: jimregan

One-liners

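# Expand dictionary $2; drop regex entries and lines containing ":<:" (when $1 is "rl") or ":>:" (otherwise).
filtered-expand () { if [ "$1" = "rl" ]; then dir=":<:"; else dir=":>:"; fi; lt-expand "$2" | grep -v __REGEXP__ | grep -v '^:<:' | grep -v ':>:$' | grep -v "$dir"; }
# Same pipeline, but ":<:" lines are dropped when $1 is "lr" and ":>:" lines otherwise.
select-expand () { if [ "$1" = "lr" ]; then dir=":<:"; else dir=":>:"; fi; lt-expand "$2" | grep -v __REGEXP__ | grep -v '^:<:' | grep -v ':>:$' | grep -v "$dir"; }

# Feed one side of each entry of $2 (direction $1) to lt-proc -b with $3; print only results with more than two "/"-separated fields.
list-multiple () { select-expand "$1" "$2" | awk -F':|:<:|:>:' -v dir="$1" '{ if (dir == "lr") print "^" $1 "$"; else print "^" $2 "$" }' | lt-proc -b "$3" | awk -F/ '(NF > 2) { print $0 }'; }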
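
The grep stages of select-expand can be tried without a dictionary at hand by feeding them a few hand-written lines. This is only a sketch: `filter_dir` is a hypothetical helper that reads stdin instead of calling lt-expand, and the sample lines are invented to resemble lt-expand output rather than taken from a real dictionary.

```shell
# Hypothetical helper: the same grep stages as select-expand, but reading
# stdin instead of running lt-expand on a dictionary file.
filter_dir () {
  if [ "$1" = "lr" ]; then dir=":<:"; else dir=":>:"; fi
  grep -v __REGEXP__ | grep -v '^:<:' | grep -v ':>:$' | grep -v "$dir"
}

# Invented sample lines (assumption: they only imitate the shape of lt-expand output).
sample='dog:dog<n><sg>
dogs:dog<n><pl>
doggie:>:dog<n><sg>
hound:<:dog<n><sg>
__REGEXP__[0-9]+:__REGEXP__[0-9]+<num>'

# With "lr", the line marked ":<:" and the regex line are dropped.
printf '%s\n' "$sample" | filter_dir lr
```

Calling `filter_dir rl` on the same input drops the ":>:" line instead and keeps the ":<:" one.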

Anaphora resolution

[00:17]  <jimregan> anaphora is one of those polarising things about MT
[00:18]  <jimregan> RBMT is like a 1920s man's man: a man's a man, even if he's a woman
[00:18]  <jimregan> SMT is like a Thai prostitute: sometimes it's a man, sometimes it's a woman, sometimes it's both

Random IRC

<jimregan>      pl->cs adjectives up 17%
<jimregan>      like they friggin' shares or somthing
<jimregan>      *they're
<jimregan>      *something
<jimregan>      damn
<spectie>       haha
<Kanmuri>       "In market news today, PLCS ADJ was up 17%, while JR TYPNG was down 25%" ;D

On spectie and questions...

<jimregan2> still though
<jimregan2> if the question ever comes about how to shoot your own leg off, you'd happily discuss aiming techniques
<spectie> haha
<spectie> ...or chop of your own legs with an axe while sitting in a wheelchair...
<jimregan2> Fuck! You've /thought/ about it!
<spectie> http://jayg123.googlepages.com/bestexitinterviewever
<spectie> LOL

Polish dictionaries

Untranslatable

- Wczoraj, bandyta napadł mię (Yesterday, a bandit attacked me)
- Co się stało? (What happened?)
- Mówił pieniądze albo śmierć (He said "money or death")
- A co zrobiłeś? (What did you do?)
- Ale śmierdziałem! (Oh, but I stank)

(Polish devoices voiced consonants at the end of a word, so "śmierdź" ("stink!", the imperative of "śmierdzieć") sounds the same as "śmierć" ("death"): the victim heard "money or stink".)