Difference between revisions of "User:Unhammer"

Revision as of 17:15, 24 September 2022

I am Kevin Brubeck Unhammer.

In Apertium, I work on

I've studied computational linguistics / NLP at the University of Bergen, developed grammar checkers and Norwegian WordNets for Kaldera språkteknologi AS, and worked on Saami grammar checking, machine translation and corpus crawling for the University of Tromsø.

Me on the web:

I have an Apertium /wishlist.

♪ Unhaaamer Unhaaamer, He beat the Hun by luck.

Unhaaamer Unhaaamer, he's smarter than a duck ♪

Quotes

They've a temper, some of them—particularly verbs: they're the proudest
—adjectives you can do anything with, but not verbs—however, I can
manage the whole lot of them! Impenetrability! That's what I say!
— Humpty Dumpty

<Unhammer> Every time I start working on a new Apertium lang. pair, I get water damage in my apartment.
<spectie> Unhammer, are you sure you want to start working on ht-en
<spectie> what with your precarious plumbing situation ?
...
<Unhammer> back to sme-nob, hopefully averting more water damage
<Claude_Royet-Journoud> A list of infinitives prolongs the accident.
  <miri> now the internet is back
  <miri> but there's no water in my building
  <miri> I hope there is no correlation ;)


<Unhammer>  [-#Ipmil-] {+Ipmil+}
<Unhammer> blasphemy


the warm soft short pants of the quick-scribbler: the vocative lapse from which it begins and the accusative hole in which it ends itself

– JJ


There are a number of languages spoken by human beings in this world.

– Harald Tveit Alvestrand, in RFC 1766, "Tags for the Identification of Languages"

IRC looks much better with some rage (http://www.vidarholen.net/contents/rage/).

If one is nihilistic enough, one could also take the most innocent of all words, the infinitive marker, and abuse and defile it in this way: It began «to» rain. Can one look at the world with less enthusiasm?

–Bjørneboe


Compounding is fun

   lemurtvillingene: lem|urt|villingene
   nyrestaurert: nyre|staur|ert
   angrepsoppstillinger: angrep|sopp|stillinger
   snusleverandør: snus|leve|rand|ør
   $ echo bildreportagen |apertium -d . swe-dan
   billede #rids mugmide tagene
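
Splits like the ones listed above are what you get from any decompounder that accepts every sequence of dictionary words covering the input. The following is a minimal sketch of that idea, not Apertium's implementation: the function `splits` and the five-word `TOY_LEXICON` are hypothetical, made up for illustration.

```python
def splits(word, lexicon):
    """Return every way to write `word` as a sequence of lexicon entries."""
    if not word:
        return [[]]  # one way to split the empty string: no parts
    results = []
    for i in range(1, len(word) + 1):
        head = word[:i]
        if head in lexicon:
            # Keep this prefix and try to cover the remainder recursively.
            for rest in splits(word[i:], lexicon):
                results.append([head] + rest)
    return results


# Hypothetical toy lexicon, not taken from any Apertium dictionary.
TOY_LEXICON = {"ny", "nyre", "restaurert", "staur", "ert"}

for parts in splits("nyrestaurert", TOY_LEXICON):
    print("|".join(parts))
```

With this toy lexicon the intended split ny|restaurert and the unintended nyre|staur|ert both come out, since nothing in the search prefers one over the other.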


   ^einannan/ein<n><m><sg><ind><cmp>+nan<n><m><sg><ind><cmp>+nan<n><m><sg><ind>$
   …
   $ echo nannannannannan|apertium -d . nno-nob-morph
   ^nannannannannan/nan<n><m><sg><ind><cmp>+nan<n><m><sg><ind><cmp>+nan<n><m><sg><ind><cmp>+nan<n><m><sg><ind><cmp>+nan<n><m><sg><ind>$^./.<sent><clb>$


   $ echo fornybart|apertium -d . nob-dan
   fodernyoverskæg

Let's try compounding on verb+noun:

   $ echo regionspresident | apertium -d . nob-nno_e
   region spreie seie dent

noun+verb:

   $ echo forringelse | apertium -d . nob-nno_e
   for ringel sjå

noun+verb+adj:

   $ echo autoritært|lt-proc -we nob-dan.automorf.bin
   ^autoritært/auto<n><mf><sp><cmp>+ri<vblex><inf><cmp>+tære<adj><pp><pl>/auto<n><mf><sp><cmp>+ri<vblex><inf><cmp>+tære<adj><pp><nt><sg><ind>/auto<n><mf><sp><cmp>+ri<vblex><inf><cmp>+tære<adj><pp><mf><sg><ind>$
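
The analysis above is in the Apertium stream format: `^surface/analysis/analysis$`, with `+` joining the parts of a compound and tags in angle brackets. A rough sketch of pulling such a unit apart might look like this. The function name `parse_lexical_unit` is hypothetical, and the sketch assumes no escaped characters and no `/` or `+` inside lemmas.

```python
import re


def parse_lexical_unit(unit):
    """Split one '^surface/analysis/...$' unit into (surface, readings).

    Each reading is a list of (lemma, tags) pairs, one pair per
    '+'-joined compound part. Escapes are NOT handled (assumption).
    """
    if not (unit.startswith("^") and unit.endswith("$")):
        raise ValueError("not a lexical unit: %r" % unit)
    surface, *analyses = unit[1:-1].split("/")
    readings = []
    for analysis in analyses:
        parts = []
        for part in analysis.split("+"):
            lemma = part.split("<", 1)[0]            # text before the first tag
            tags = re.findall(r"<([^<>]+)>", part)   # tag names without brackets
            parts.append((lemma, tags))
        readings.append(parts)
    return surface, readings


surface, readings = parse_lexical_unit(
    "^autoritært/auto<n><mf><sp><cmp>+ri<vblex><inf><cmp>+tære<adj><pp><pl>$"
)
print(surface)
for reading in readings:
    print(" + ".join(lemma + ":" + ".".join(tags) for lemma, tags in reading))
```

Parsed this way, a reading's part-of-speech sequence can be read from the first tag of each part.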


What if we allow a double consonant to become single before the compound border? Then we can analyse compounds where a word ending in a double consonant is followed by a word starting with the same consonant:

   $ echo topprøve|apertium -d . nob-nno_e-morph|cg-conv
   "<topprøve>"
       "røve" adj pp pl
               "topp" n m sg ind cmp
       "prøve" n m sg ind
               "topp" n m sg ind cmp detriple

But of course the analyser's decompounding doesn't know that the second word actually has to start with that same consonant:

   $ echo HurtigrutenLive|apertium -d . nob-nno_e
   hurr TigruteinLive
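
The missing check can be sketched in a toy decompounder. This is only an illustration under stated assumptions, not how the analyser is built: the function `decompound`, the `check_next` flag, the `<detriple>` marker on the output strings and the toy lexicon (including the made-up entry "tigruten") are all hypothetical.

```python
VOWELS = set("aeiouyæøå")


def decompound(word, lexicon, check_next=True):
    """Return all splits of `word` into lexicon entries.

    Before a compound border, a lexicon entry ending in a double
    consonant may appear with a single one ("top" for "topp").
    With check_next=True this is only allowed when the next part
    starts with that same consonant.
    """
    if not word:
        return [[]]
    results = []
    for i in range(1, len(word) + 1):
        head, rest = word[:i], word[i:]
        # Plain match: the prefix is itself a lexicon entry.
        if head in lexicon:
            for tail in decompound(rest, lexicon, check_next):
                results.append([head] + tail)
        # Detriple match: restore the dropped consonant and look that up.
        restored = head + head[-1]
        if rest and head[-1] not in VOWELS and restored in lexicon:
            if not check_next or rest[0] == head[-1]:
                for tail in decompound(rest, lexicon, check_next):
                    results.append([restored + "<detriple>"] + tail)
    return results


# Hypothetical toy lexicon; "tigruten" is a made-up stand-in entry.
TOY_LEXICON = {"topp", "prøve", "røve", "hurr", "tigruten"}

print(decompound("topprøve", TOY_LEXICON))
print(decompound("hurtigruten", TOY_LEXICON, check_next=False))
print(decompound("hurtigruten", TOY_LEXICON, check_next=True))
```

With the check switched off, "hurtigruten" splits as hurr + tigruten, the same shape of error as in the output above. With the check on, that split is rejected, while "topprøve" still gets both topp + røve and the detripled topp + prøve.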