Task ideas for Google Code-in/Comment XML


Given an XML file on standard input like:

    <e><p><l>foo</l><r>bar</r></p></e>
    <e><p><l>baz</l><r>baa</r></p></e>
    <e><p><l>foa</l><r>yuo</r></p></e>
    <e><p><l>gae</l><r>yaa</r></p></e>

And a list of patterns in a file (whose name is given as the first argument to the script) like:

    foo:
    :yaa

Print the following XML file to standard output:

    <!-- <e><p><l>foo</l><r>bar</r></p></e> -->
    <e><p><l>baz</l><r>baa</r></p></e>
    <e><p><l>foa</l><r>yuo</r></p></e>
    <!-- <e><p><l>gae</l><r>yaa</r></p></e> -->
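
Judging from the example list, each line has the form left:right, where either side may be empty. A minimal sketch of reading such a list in Python follows; the function name `parse_patterns` is hypothetical, and it assumes that a lemma never contains a colon:

```python
def parse_patterns(lines):
    """Split 'left:right' lines into (left, right) tuples.

    Either side may be empty. Assumes a lemma never contains ':'.
    """
    patterns = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if ':' not in line:
            raise ValueError(f'expected left:right, got {line!r}')
        left, right = line.split(':', 1)
        patterns.append((left, right))
    return patterns
```

In a script this could be called on the file named by `sys.argv[1]`, for example `parse_patterns(open(sys.argv[1], encoding='utf-8'))`.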

The script should also handle the following notations in the list file (each shown as pattern = equivalent XML):

tags:

    foo<n>: = foo<s n="n"/>

spaces:

    foo bar<n>: = foo<b/>bar<s n="n"/>

grouped elements:

    foo# bar<n>: = foo<g><b/>bar</g><s n="n"/>
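
The three notations above can be combined in one conversion step. The following is only a sketch: the name `side_to_xml` is hypothetical, it assumes that all tags come after the lemma and that a group opened by `#` runs up to the first tag, and it does no XML escaping.

```python
import re

def side_to_xml(side):
    """Convert one side of a pattern, e.g. 'foo# bar<n>', to its XML form."""
    # Assumption: a side is a lemma followed by zero or more <tag> items.
    m = re.fullmatch(r'([^<>]*)((?:<[^<>]+>)*)', side)
    if m is None:
        raise ValueError(f'cannot parse pattern side {side!r}')
    lemma, tags = m.group(1), m.group(2)
    # Assumption: '#' opens a group that extends to the end of the lemma.
    head, hash_found, tail = lemma.partition('#')
    xml = head.replace(' ', '<b/>')
    if hash_found:
        xml += '<g>' + tail.replace(' ', '<b/>') + '</g>'
    # Each <tag> becomes <s n="tag"/>.
    xml += re.sub(r'<([^<>]+)>', r'<s n="\1"/>', tags)
    return xml
```

An empty side converts to an empty string, which a matcher can treat as "match anything".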

You can use any programming language you want, but Python is recommended.

Example

A more complete test example, with the script invoked as follows:

    $ <input.dix python3 script.py to-be-commented.txt >output.dix

input.dix:

    <e><p><l>foo<s n="n"/></l><r>bar<s n="n"/></r></p></e>
    <e><p><l>noo<s n="n"/></l><r>foo<s n="n"/></r></p></e>
    <e><p><l>foo<s n="vblex"/></l><r>yuo<s n="vblex"/></r></p></e>
    <e><p><l>yaa<s n="n"/></l><r>yum<s n="n"/></r></p></e>
    <e><p><l>gae<s n="n"/></l><r>yaa<s n="n"/></r></p></e>
    <e><p><l>bah<s n="vblex"/></l><r>rah<s n="vblex"/></r></p></e>
    <e><p><l>bah<s n="n"/></l><r>rah<s n="n"/></r></p></e>
    <e><p><l>bah<s n="n"/></l><r>meh<s n="n"/></r></p></e>
    <e><p><l>dum<g><b/>bum</g><s n="n"/></l><r>rum<s n="n"/></r></p></e>

to-be-commented.txt:

    foo<n>:
    :yaa<n>
    bah:rah
    dum# bum<n>:

output.dix:

    <!-- <e><p><l>foo<s n="n"/></l><r>bar<s n="n"/></r></p></e> -->
    <e><p><l>noo<s n="n"/></l><r>foo<s n="n"/></r></p></e>
    <e><p><l>foo<s n="vblex"/></l><r>yuo<s n="vblex"/></r></p></e>
    <e><p><l>yaa<s n="n"/></l><r>yum<s n="n"/></r></p></e>
    <!-- <e><p><l>gae<s n="n"/></l><r>yaa<s n="n"/></r></p></e> -->
    <!-- <e><p><l>bah<s n="vblex"/></l><r>rah<s n="vblex"/></r></p></e> -->
    <!-- <e><p><l>bah<s n="n"/></l><r>rah<s n="n"/></r></p></e> -->
    <e><p><l>bah<s n="n"/></l><r>meh<s n="n"/></r></p></e>
    <!-- <e><p><l>dum<g><b/>bum</g><s n="n"/></l><r>rum<s n="n"/></r></p></e> -->

Note how "bah:rah" comments out both the vblex and the n entries (the pattern contains no tags, so it matches any tags), but not the line that has "meh" in its <r> element. Also note how ":yaa<n>" does not comment out the line that has "yaa" in its <l> element.
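
The matching behaviour described in this note could be sketched in Python as below. This is not a reference solution. The names `side_matches` and `comment_matching` are hypothetical. The sketch assumes one `<e>` entry per line with `<l>` before `<r>`, and it assumes each pattern side has already been converted to its XML form (for example `foo<n>` to `foo<s n="n"/>`). It also assumes that a pattern matches when it is followed only by further `<s .../>` tags, which the task text does not state explicitly.

```python
import re

# Assumption: one entry per line, <l> before <r>, as in the examples.
ENTRY_RE = re.compile(r'<l>(.*?)</l>\s*<r>(.*?)</r>')

def side_matches(pattern, content):
    """True if a pattern side (already in XML form) matches an entry side."""
    if not pattern:
        return True  # an empty side matches anything
    if not content.startswith(pattern):
        return False
    rest = content[len(pattern):]
    # Only further tags may follow, so 'foo' does not match 'foobar'.
    return rest == '' or rest.startswith('<s ')

def comment_matching(lines, patterns):
    """Return the lines, with every matching entry wrapped in <!-- -->."""
    out = []
    for line in lines:
        text = line.rstrip('\n')
        m = ENTRY_RE.search(text)
        if m and any(side_matches(left, m.group(1)) and
                     side_matches(right, m.group(2))
                     for left, right in patterns):
            indent = text[:len(text) - len(text.lstrip())]
            out.append(f'{indent}<!-- {text.strip()} -->')
        else:
            out.append(text)
    return out
```

A full script would pass `sys.stdin` as `lines` and print each returned line to standard output.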