Talk:Constraint-based lexical selection module
Revision as of 11:17, 1 December 2011 by Francis Tyers (talk | contribs)
rule formats
Note: Felipe doesn't like "skip".
- can't say I do either … sounds like a command rather than a constraint --unhammer 11:42, 18 November 2011 (UTC)
- also, in the <or>, should we read them as independent of each other? that's a bit confusing since otherwise they're all required and have a certain order --unhammer 11:42, 18 November 2011 (UTC)
The regular expression for the OR below is:
(nazi<adj>[0-9A-Za-z <>]*|totalitari<adj>[0-9A-Za-z <>]*|feixista<adj>[0-9A-Za-z <>]*|franquista<adj>[0-9A-Za-z <>]*|militar<adj>[0-9A-Za-z <>]*|fiscal<adj>[0-9A-Za-z <>]*)
- Francis Tyers 11:53, 18 November 2011 (UTC)
- 1
<rule> <remove lemma="règim" tags="n.*"> <acception lemma="diet" tags="n.*"/> </remove> <or> <skip lemma="nazi" tags="adj.*"/> <skip lemma="totalitari" tags="adj.*"/> <skip lemma="feixista" tags="adj.*"/> <skip lemma="franquista" tags="adj.*"/> <skip lemma="militar" tags="adj.*"/> <skip lemma="fiscal" tags="adj.*"/> </or> </rule>
- 2
<rule> <remove lemma="règim" tags="n.*"> <acception lemma="diet" tags="n.*"/> </remove> <or> <pattern lemma="nazi" tags="adj.*"/> <pattern lemma="totalitari" tags="adj.*"/> <pattern lemma="feixista" tags="adj.*"/> <pattern lemma="franquista" tags="adj.*"/> <pattern lemma="militar" tags="adj.*"/> <pattern lemma="fiscal" tags="adj.*"/> </or> </rule>
- 3
<rule> <remove-from lemma="règim" tags="n.*"> <translation lemma="diet" tags="n.*"/> </remove-from> <pattern> <or> <pattern-item lemma="nazi" tags="adj.*"/> <pattern-item lemma="totalitari" tags="adj.*"/> <pattern-item lemma="feixista" tags="adj.*"/> <pattern-item lemma="franquista" tags="adj.*"/> <pattern-item lemma="militar" tags="adj.*"/> <pattern-item lemma="fiscal" tags="adj.*"/> </or> </pattern> </rule> <rule c="la dona dels seus somnis"> <select-for lemma="dona" tags="n.*"> <translation lemma="wife" tags="n.*"/> </select> <pattern> <pattern-item lemma="de" tags="pr.*"/> <pattern-item lemma="*" tags="det.pos.*"/> <pattern-item lemma="somni" tags="n.*"/> </pattern> </rule>
- 4
<rule> <target lemma="règim" tags="n.*"> <remove lemma="diet" tags="n.*"/> </target> … </rule> <rule c="la dona dels seus somnis"> <target lemma="dona" tags="n.*"> <select lemma="wife" tags="n.*"/> </target> … </rule>
Old application strategy
The following is an inefficient implementation of the rule application process:
# s ("prova" n) ("event" n) (-3 "guanyador") (-2 "de") # # tipus = "select"; # centre = "^prova<n>.*" # tl_patro = ["^event<n>.*"] # sl_patro = {-3: "^guanyador<", -2: "^de<"} CLASS Rule: tipus = enum('select', 'remove') centre = ''; tl_patro = []; sl_patro = {}; rule_table = {}; # e.g. rule_table["estació"] = [rule1, rule2, rule3]; i = 0 DEFINE ApplyRule(rule, lu): FOREACH target IN lu.tl: SWITCH rule.tipus: 'select': IF target NOT IN rule.tl_patro: DELETE target 'remove': IF target IN rule.tl_patro: DELETE target FOREACH pair(sl, tl) IN sentence: FOREACH centre IN rule_table: IF centre IN sl: FOREACH rule IN rule_table[centre]: matched = False FOREACH context_item IN rule_table[centre][rule]: IF context_item in sentence: matched = True ELSE: matched = False # If all of the context items have matched, and none of them have not matched # if a rule matches break and continue to the pair. IF matched == True: sentence[i] = ApplyRule(rule_table[centre][rule], sentence[i]) break i = i + 1