Tokenisation for spaceless orthographies
See Task ideas for Google Code-in/Tokenisation for spaceless orthographies. Being worked on by User:Eiji in 2023.
Paper findings, GSoC 2023:
https://docs.google.com/document/d/1aTTGoLLCpr2gncq2FJIWG0InUH3tJ6epHxioKEDhNPs/edit?usp=sharing
https://github.com/yypy22/gsoc_try/
https://github.com/yypy22/apertium-jpn
See also
https://arxiv.org/pdf/2010.06858.pdf
https://docs.google.com/document/d/1p2qFp1g9OufeL_Obgg8vpljfgAwKz4D2briQvqepsMw/edit