Tokenisation for spaceless orthographies
See Task ideas for Google Code-in/Tokenisation for spaceless orthographies. Being worked on by User:Eiji in 2023.
Paper findings (GSoC 2023):
https://docs.google.com/document/d/1aTTGoLLCpr2gncq2FJIWG0InUH3tJ6epHxioKEDhNPs/edit?usp=sharing
https://github.com/yypy22/gsoc_try/