Difference between revisions of "User:Ruthenian8/GSOC 2021 progress report"
Ruthenian8 (Created page with "* '''Title''': Morphological analyzer for Bagvalal * '''Proposal''': [https://drive.google.com/file/d/1Y05eQtFP7ioz50z2GlUdvB2Edh4lel6G/view?usp=sharing proposal] * '''Abstrac...")
Ruthenian8 (Add table)
Revision as of 10:55, 18 June 2021
- Title: Morphological analyzer for Bagvalal
- Proposal: [proposal](https://drive.google.com/file/d/1Y05eQtFP7ioz50z2GlUdvB2Edh4lel6G/view?usp=sharing)
- Abstract: Bagvalal is an endangered, typologically rare Caucasian language of the Nakh-Daghestanian family. Its conservation and study are constrained by the lack of NLP tools that can process field data. My proposal is to develop an FST-based morphological analyzer for Bagvalal using all the available grammatical and lexical information. In the future, this project could allow Apertium to support morphological analysis for multiple Nakh-Daghestanian languages and to develop the corresponding language pairs.
- GitHub repo: [bagvalal](https://github.com/ruthenian8/bagvalal)
- Progress: updates coming up
 
| Week | Intended changes | Status | 
|---|---|---|
| Week 1 | Testing and refining the existing rules for the closed word classes (e.g. numerals, clitics and pronouns). | Complete |
| Week 2 | Writing documentation and tests. | In progress |
| Week 3 & 4 | Testing and refining the existing rules for the open word classes (e.g. verbs, nouns and adjectives). Writing documentation and tests. | In progress |
| Week 5 | Adding the missing adjectives and adverbs from the available dictionaries (see the Resources section above). Testing the analysis results and the model performance. | In progress |
| Week 6 | Adding the missing nouns. Testing the analysis results and the model performance. | In progress |
| Week 7 & 8 | Adding the missing verbs, participles, converbs and masdars. Testing the analysis results and the model performance. | In progress |
| Week 9 & 10 | Tokenizing the corpora. Converting the existing annotations to an appropriate format. Creating word-analysis pairs. Writing documentation. | In progress |
| Week 11 | Removing false analyses from the model. Testing and debugging. Finishing the documentation. | In progress |
| Week 12 | Running all the tests and debugging. | In progress |

- Development log: Updates coming up
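
The testing steps in the schedule (creating word-analysis pairs, checking the analysis results and the model performance) could be approached roughly as in the sketch below. It assumes the analyzer output is available in the Apertium stream format (`^surface/analysis$`, with `*` marking an unknown word); that is an assumption, since the report does not state it. The function names `parse_stream` and `naive_coverage` are hypothetical, and the sample forms and tags are placeholders, not real Bagvalal data.

```python
import re

# One lexical unit in Apertium stream format: ^surface/analysis1/analysis2$
# Simplification: escaped characters inside a unit are not handled.
LEXICAL_UNIT = re.compile(r"\^([^/^$]+)/([^$]*)\$")


def parse_stream(text):
    """Return (surface, analyses) pairs; unknown words get an empty list."""
    pairs = []
    for surface, rest in LEXICAL_UNIT.findall(text):
        # An analysis starting with '*' means the word was not recognised.
        analyses = [a for a in rest.split("/") if not a.startswith("*")]
        pairs.append((surface, analyses))
    return pairs


def naive_coverage(pairs):
    """Share of tokens that received at least one analysis."""
    if not pairs:
        return 0.0
    known = sum(1 for _, analyses in pairs if analyses)
    return known / len(pairs)


# Hypothetical analyzer output: placeholder forms and tags only.
sample = (
    "^form1/lemma1<n><sg><abs>$ "
    "^form2/*form2$ "
    "^form3/lemma3<v><pst>/lemma3<v><ptcp>$"
)
pairs = parse_stream(sample)
coverage = naive_coverage(pairs)
```

Naive coverage only counts whether a token got any analysis, so it says nothing about whether the analyses are correct. Finding false analyses (Week 11) would still require comparison against the annotated corpora.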