Ideas for Google Summer of Code/Morphology with HFST
		Revision as of 11:33, 2 February 2010 by Francis Tyers (talk | contribs)
This page collects useful information for adapting the HFST lookup tools to use lttoolbox-style tokenise-as-you-analyse.
$ https://hfst.svn.sourceforge.net/svnroot/hfst
hfst/trunk/hfst/hfst-tools/src
Files:
hfst-lookup.cc 
hfst-optimized-lookup.cc
Lines:
1184                KeyVector* kv = HFST::line_to_keyvector(&line, key_table,
                                                         &markup, &unknown);
1191                 kvs  = HFST::lookup_unique(kv, cascade[0],
                                                    key_table, &infinite);
707     lookups = lookup_all(t, kv, &flag_diacritic_set);
591 KeyVector* 
592 line_to_keyvector(char** s, KeyTable* kt, char** markup, bool* outside_sigma)
692 KeyVectorSet*
693 lookup_unique(KeyVector* kv, TransducerHandle t,
694               KeyTable* kt, bool* infinity)
hfst2/src
hfst2/sfst/hsfst.C:
247 KeyVectorVector * lookup_all(TransducerHandle t,
248                              KeyVector * input_string,
249                              KeySet * skip_symbols) 
258        return find_all_output_strings( pT,input_string, &ks);
288  KeyVectorVector * find_all_output_strings( Transducer * t,
                                             KeyVector * input,
                                             KeySet * skip_symbols) 
295      find_all_continuations(start, input_position,
                             last_input_position,
                             skip_symbols);
149   find_all_continuations(Node * n,
                         KeyVector::iterator input_position,
                         KeyVector::iterator input_end_position,
                         KeySet * skip_symbols,
                         bool preserve_epsilons=false) 
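
For orientation, here is a minimal sketch of what tokenise-as-you-analyse means: instead of splitting the input into tokens first and then looking each one up (as line_to_keyvector followed by lookup_unique / lookup_all does), the analyser walks the input left to right and emits the longest match it can find, multiword units included. Everything in this sketch is an assumption made for illustration: TrieNode, Token, add_entry and tokenise_and_analyse are hypothetical names, the lexicon is a toy trie rather than an HFST transducer, and the word-boundary checks that lttoolbox performs are left out.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical toy lexicon: a character trie standing in for a transducer.
// This is NOT an HFST or lttoolbox data structure.
struct TrieNode {
    std::map<char, std::unique_ptr<TrieNode>> next;
    std::vector<std::string> analyses;  // non-empty means "final state"
};

// Add one surface form with one analysis string to the toy lexicon.
void add_entry(TrieNode& root, const std::string& surface,
               const std::string& analysis) {
    TrieNode* n = &root;
    for (char c : surface) {
        std::unique_ptr<TrieNode>& child = n->next[c];
        if (!child) child = std::make_unique<TrieNode>();
        n = child.get();
    }
    n->analyses.push_back(analysis);
}

// One output unit: the matched surface string and its analyses.
// An empty analyses vector marks an unknown character.
struct Token {
    std::string surface;
    std::vector<std::string> analyses;
};

// Walk the input left to right. At each position follow the trie as far
// as possible, remember the last final state seen (longest match), emit
// it, and restart right after it. If nothing matches, emit one character
// as unknown and move on.
std::vector<Token> tokenise_and_analyse(const TrieNode& root,
                                        const std::string& input) {
    std::vector<Token> out;
    std::size_t pos = 0;
    while (pos < input.size()) {
        const TrieNode* n = &root;
        const TrieNode* best = nullptr;
        std::size_t best_len = 0;
        for (std::size_t i = pos; i < input.size(); ++i) {
            auto it = n->next.find(input[i]);
            if (it == n->next.end()) break;
            n = it->second.get();
            if (!n->analyses.empty()) {
                best = n;
                best_len = i - pos + 1;
            }
        }
        if (best != nullptr) {
            out.push_back({input.substr(pos, best_len), best->analyses});
            pos += best_len;
        } else {
            out.push_back({input.substr(pos, 1), {}});
            pos += 1;
        }
    }
    return out;
}
```

With entries for "a", "a priori" and "cat", the input "a priori cat" gives three units: "a priori" (the longest match wins over "a"), an unknown single space, and "cat". In the HFST code the equivalent change would presumably sit around find_all_continuations, which already walks the transducer along the input; that placement is a guess, not something checked against the source.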

