Difference between revisions of "Talk:Corpus test"
(asking for a more clear formulation)
Revision as of 11:21, 5 January 2012
Creation of a corpus
These two lines are not very clear to a non-native English speaker:
- Grep out all lines with # and @ - this will help you find problems in bidix (@) and target language monodix (#).
- Pipe through nl -s '. ' to get the right line numbers.
An example would help. Also, on my computer nl -s does not work, but the -n option of grep (fgrep, egrep) does.
Why not something like:
- fgrep -n "#" monodix
- fgrep -n "@" bidix
- I'm pretty sure that's not the intention; I think "nl" is here used to find the corpus line numbers, not dix line numbers --unhammer 11:21, 5 January 2012 (UTC)
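To illustrate the distinction between corpus line numbers and dix line numbers, here is a minimal sketch. It assumes the translated corpus has one sentence per line; the sample sentences and the variable names (corpus_out, numbered, grep_numbered) are made up for the example and do not come from the article. The -b a option is an addition so that nl also counts empty lines, which it skips by default.

```shell
# Hypothetical sample of translated corpus output, one sentence per line.
# Following the article text, '#' points at the target language monodix
# and '@' at the bidix.
corpus_out=$(mktemp)
cat > "$corpus_out" <<'EOF'
the house is big
the #cat sleeps
a @perro barks
EOF

# Number every corpus line first, then filter, so that the numbers refer
# to positions in the corpus and not to positions in the grep output.
numbered=$(nl -b a -s '. ' "$corpus_out" | grep -E '[#@]')

# Alternative if nl -s misbehaves: let grep print the line numbers itself.
grep_numbered=$(grep -n -E '[#@]' "$corpus_out")

printf '%s\n' "$numbered" "$grep_numbered"
rm -f "$corpus_out"
```

Both variants report numbers of corpus lines. Running fgrep -n directly on the monodix or bidix file would instead report line numbers inside the dictionary, which is a different question.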