Difference between revisions of "One-liners"

From Apertium

Latest revision as of 06:24, 27 June 2010

Useful (mostly) bash one-liners

  • Perl regular expression for removing all tags after the first one:
perl -pe 's/(\^[^<]+<[^>]+>)(<\w+>)*\$/\1\$/g;'

^Lemma<V><Pres><Sg>$ -> ^Lemma<V>$
  • Get unknown words (marked with @) from chunked text and sort by frequency:
sed 's/\$\W*\^/$\n^/g' | grep '@' | sed 's/><.*/>$/g' | sort -f | uniq -ci | sort -gr

Alternatively, for space-separated text:
tr " " "\n" | grep "@" | tr -d "[:punct:]" | sort | uniq -c | sort -nr
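The following sketch runs the sed-based pipeline on a small made-up sample to show what each stage contributes. It assumes GNU sed (for \W and for \n in the replacement); the function name unknown_by_freq and the sample lexical units are hypothetical.

```shell
#!/bin/sh
# unknown_by_freq is a hypothetical wrapper name around the sed-based pipeline.
unknown_by_freq() {
  sed 's/\$\W*\^/$\n^/g' |   # put each lexical unit on its own line (GNU sed)
    grep '@' |               # keep only units marked with @
    sed 's/><.*/>$/g' |      # drop every tag after the first
    sort -f |                # case-insensitive sort so duplicates are adjacent
    uniq -ci |               # count duplicates, ignoring case
    sort -gr                 # highest count first
}

# Made-up sample input: three @-marked units, two of them the same word.
printf '%s\n' '^@foo<n><sg>$ ^bar<n><sg>$ ^@foo<n><pl>$ ^@Baz<vblex><inf>$' |
  unknown_by_freq |
  awk '{print $1, $2}'       # normalise the padding that uniq -c adds
# prints:
# 2 ^@foo<n>$
# 1 ^@Baz<vblex>$
```

The two forms of @foo are counted together because the tags after the first are removed before counting.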
  • Strip newlines:
sed ':a;N;$!ba;s/\n//g'

Alternatively: tr -d '\n' (or tr '\n' ' ' to replace each newline with a space instead of deleting it)
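The variants behave slightly differently, which the sketch below tries to make visible on a three-line sample. The function names are hypothetical, and the sed form assumes GNU sed, which accepts a label followed by a semicolon.

```shell
#!/bin/sh
# Hypothetical wrapper names, one per variant.
join_sed()      { sed ':a;N;$!ba;s/\n//g'; }  # GNU sed: slurp all lines, delete inner newlines
join_tr()       { tr -d '\n'; }               # delete every newline, including the last one
join_tr_space() { tr '\n' ' '; }              # turn every newline into a space

printf 'one\ntwo\nthree\n' | join_sed         # prints: onetwothree (with a final newline)
printf 'one\ntwo\nthree\n' | join_tr; echo    # prints: onetwothree (echo adds the newline)
printf 'one\ntwo\nthree\n' | join_tr_space; echo  # prints: one two three, then a trailing space
```

Note that sed keeps the final newline of its input, while both tr forms consume it.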