Talk:Corpus test
Creation of a corpus
These two lines are not very clear to a non-native English speaker:
- Grep out all lines with # and @ - this will help you find problems in bidix (@) and target language monodix (#).
- Pipe through nl -s '. ' to get the right line numbers.
An example would be better. And on my computer, nl -s does not work, but the -n option of grep (fgrep, egrep) does.
Why not something like:
fgrep -n "#" monodix
fgrep -n "@" bidix
- I'm pretty sure that's not the intention; I think "nl" is here used to find the corpus line numbers, not dix line numbers --unhammer 11:21, 5 January 2012 (UTC)
- I agree with you completely. Let's see what I put when I translated the page into French: Test_de_corpus#Cr.C3.A9ation_d.27un_corpus. Maybe we should ask Francis what he means. I don't do that every time I find something difficult in an English text; I usually just put a (?), and Francis's texts are generally easier to follow than other English texts. nl -s does not work on my computers either. Bech 11:42, 5 January 2012 (UTC)
It seems that nl numbers lines in a file. The command is in the Debian (and Ubuntu?) package coreutils.
$ man nl
NAME
nl - number lines of files
- Francis Tyers 00:18, 15 January 2012 (UTC)
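Since an example was asked for above, here is a minimal sketch of the two approaches discussed in this thread. The three-line "corpus" is made up for illustration, and the -b a option (number blank lines too, so that the numbers match the corpus line numbers) is an assumption that is not part of the original instructions.

```shell
# Hypothetical three-line translated corpus; '#' marks a target-language
# monodix problem and '@' marks a bidix problem.
corpus=$(printf '%s\n' 'a good line' 'a #bad line' 'another @bad line')

# Number every line first, then keep only the lines containing # or @.
# -s '. ' puts ". " between the number and the text.
# -b a (an assumption, see above) also numbers empty lines.
numbered=$(printf '%s\n' "$corpus" | nl -b a -s '. ' | grep -e '#' -e '@')
printf '%s\n' "$numbered"

# Alternative if nl is not available: grep -n adds the line number itself,
# in the form "N:text".
with_grep=$(printf '%s\n' "$corpus" | grep -n -e '#' -e '@')
printf '%s\n' "$with_grep"
```

In both cases the numbers refer to lines of the corpus, not to lines of the dictionary files, which seems to be the intention described by unhammer above.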