Difference between revisions of "User:Ruthenian8/GSOC 2021 progress report"
		
		
		
		
		
		
		
				
		
		
		
		
		
		
		
	
Revision as of 06:15, 13 July 2021
- Title: Morphological analyzer for Bagvalal
- Proposal: proposal
- Abstract: Bagvalal is an endangered, typologically rare Caucasian language of the Nakh-Daghestanian family. Its conservation and study are constrained by the lack of NLP tools that can be used to process field data. My proposal is to develop an FST-powered morphological analyzer for Bagvalal using all the available grammatical and lexical information. In the future, this project could allow Apertium to support morphological analysis for multiple Nakh-Daghestanian languages and to develop the corresponding language pairs.
- GitHub repo: bagvalal
 
| Week | Intended changes | Status |
|---|---|---|
| Week 1 | Testing and refining the existing rules for the closed word classes (e.g. numerals, clitics and pronouns). | Complete |
| Week 2 | Writing documentation and tests. | Complete |
| Week 3 & 4 | Testing and refining the existing rules for the open word classes (e.g. verbs, nouns and adjectives). Writing documentation and tests. | Complete |
| Week 5 | Adding the missing adjectives and adverbs from the available dictionaries (see the Resources section above). Testing the analysis results and the model performance. | In progress |
| Week 6 | Adding the missing nouns. Testing the analysis results and the model performance. | In progress |
| Week 7 & 8 | Adding the missing verbs, participles, converbs and masdars. Testing the analysis results and the model performance. | In progress |
| Week 9 & 10 | Tokenizing the corpora. Converting the existing annotations to an appropriate format. Creating word-analysis pairs. Writing documentation. | In progress |
| Week 11 | Removing false analyses from the model. Testing and debugging. Finishing the documentation. | In progress |
| Week 12 | Running all the tests and debugging. | In progress |
Development log: Updates coming up