Difference between revisions of "User:Ruthenian8/GSOC 2021 progress report"
		
		
		
		
		
		
Revision as of 06:17, 13 July 2021
- Title: Morphological analyzer for Bagvalal
- Proposal: proposal
- Abstract: Bagvalal is an endangered, typologically rare Caucasian language of the Nakh-Daghestanian family. Its conservation and study are constrained by the lack of NLP tools that can process field data. My proposal is to develop an FST-powered morphological analyzer for Bagvalal using all the available grammatical and lexical information. In the future, this project could allow Apertium to support morphological analysis for multiple Nakh-Daghestanian languages and to develop the corresponding language pairs.
- GitHub repo: bagvalal
 
| Week | Intended changes | Status |
|---|---|---|
| Week 1 | Testing and refining the existing rules for the closed word classes (e.g. numerals, clitics and pronouns). | Complete |
| Week 2 | Writing documentation and tests. | Complete |
| Week 3 & 4 | Testing and refining the existing rules for the open word classes (e.g. verbs, nouns and adjectives). Writing documentation and tests. | Complete |
| Week 5 | Adding the missing adjectives and adverbs from the available dictionaries (see the Resources section above). Testing the analysis results and the model performance. | In progress |
| Week 6 | Adding the missing nouns. Testing the analysis results and the model performance. | In progress |
| Week 7 & 8 | Adding the missing verbs, participles, converbs and masdars. Testing the analysis results and the model performance. | In progress |
| Week 9 & 10 | Tokenizing the corpora. Converting the existing annotations to an appropriate format. Creating word-analysis pairs. Writing documentation. | In progress |
| Week 11 | Removing false analyses from the model. Testing and debugging. Finishing the documentation. | In progress |
| Week 12 | Running all the tests and debugging. | In progress |
Development log: Updates coming up
Status: Naive coverage 82%, type coverage 76%.
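
The two figures in the status line measure different things, and a small sketch may help show why they differ. It assumes that naive coverage is the share of corpus tokens that receive at least one analysis, and that type coverage is the same share computed over unique word forms. The `coverage` function, the `is_known` callback and the toy data are hypothetical and are not taken from the project repository; a real run would query the analyzer instead of a Python set.

```python
from typing import Callable, Iterable, Tuple


def coverage(tokens: Iterable[str], is_known: Callable[[str], bool]) -> Tuple[float, float]:
    """Return (naive coverage, type coverage) as fractions between 0 and 1.

    Assumption: a token counts as covered if the analyzer returns at least
    one analysis for it, whether or not that analysis is correct.
    """
    tokens = list(tokens)
    if not tokens:
        return 0.0, 0.0
    # Naive coverage: every occurrence in the corpus counts.
    naive = sum(1 for t in tokens if is_known(t)) / len(tokens)
    # Type coverage: every distinct word form counts once.
    types = set(tokens)
    type_cov = sum(1 for t in types if is_known(t)) / len(types)
    return naive, type_cov


# Toy stand-in for the analyzer: placeholder strings, not real Bagvalal forms.
known_forms = {"a", "b"}
tokens = ["a", "a", "a", "b", "c"]

naive, type_cov = coverage(tokens, lambda t: t in known_forms)
print(f"naive coverage: {naive:.0%}, type coverage: {type_cov:.0%}")
```

Frequent word forms that the analyzer recognizes raise naive coverage more than type coverage, which is one plausible reason the first figure is the higher of the two.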