<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://wiki.apertium.org/w/index.php?action=history&amp;feed=atom&amp;title=Chebrolutejasvi%2FGSOC_2020_proposal%3A_Hindi-Telugu</id>
	<title>Chebrolutejasvi/GSOC 2020 proposal: Hindi-Telugu - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://wiki.apertium.org/w/index.php?action=history&amp;feed=atom&amp;title=Chebrolutejasvi%2FGSOC_2020_proposal%3A_Hindi-Telugu"/>
	<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=Chebrolutejasvi/GSOC_2020_proposal:_Hindi-Telugu&amp;action=history"/>
	<updated>2026-04-18T13:53:33Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.34.1</generator>
	<entry>
		<id>https://wiki.apertium.org/w/index.php?title=Chebrolutejasvi/GSOC_2020_proposal:_Hindi-Telugu&amp;diff=71956&amp;oldid=prev</id>
		<title>Chebrolutejasvi at 16:41, 31 March 2020</title>
		<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=Chebrolutejasvi/GSOC_2020_proposal:_Hindi-Telugu&amp;diff=71956&amp;oldid=prev"/>
		<updated>2020-03-31T16:41:54Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;a href=&quot;//wiki.apertium.org/w/index.php?title=Chebrolutejasvi/GSOC_2020_proposal:_Hindi-Telugu&amp;amp;diff=71956&amp;amp;oldid=71951&quot;&gt;Show changes&lt;/a&gt;</summary>
		<author><name>Chebrolutejasvi</name></author>
		
	</entry>
	<entry>
		<id>https://wiki.apertium.org/w/index.php?title=Chebrolutejasvi/GSOC_2020_proposal:_Hindi-Telugu&amp;diff=71951&amp;oldid=prev</id>
		<title>Chebrolutejasvi at 16:36, 31 March 2020</title>
		<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=Chebrolutejasvi/GSOC_2020_proposal:_Hindi-Telugu&amp;diff=71951&amp;oldid=prev"/>
		<updated>2020-03-31T16:36:45Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 16:36, 31 March 2020&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 1:&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 1:&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-deleted&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-addedline diff-side-added&quot;&gt;&lt;div&gt;[[Category:GSoC_2020_student_proposals]]&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-deletedline diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td colspan=&quot;2&quot; class=&quot;diff-empty diff-side-added&quot;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;div&gt;== &#039;&#039;&#039;Contact Information&#039;&#039;&#039; ==&lt;/div&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;div&gt;== &#039;&#039;&#039;Contact Information&#039;&#039;&#039; ==&lt;/div&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-deleted&quot;&gt;&lt;br /&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;
  &lt;td class=&quot;diff-context diff-side-added&quot;&gt;&lt;br /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Chebrolutejasvi</name></author>
		
	</entry>
	<entry>
		<id>https://wiki.apertium.org/w/index.php?title=Chebrolutejasvi/GSOC_2020_proposal:_Hindi-Telugu&amp;diff=71950&amp;oldid=prev</id>
		<title>Chebrolutejasvi: Created page with &quot; == &#039;&#039;&#039;Contact Information&#039;&#039;&#039; ==   Name: Tejasvi Chebrolu (chebrolutejasvi on WIki)  Location: Hyderabad, India  University: International Institute of Information Technology ...&quot;</title>
		<link rel="alternate" type="text/html" href="https://wiki.apertium.org/w/index.php?title=Chebrolutejasvi/GSOC_2020_proposal:_Hindi-Telugu&amp;diff=71950&amp;oldid=prev"/>
		<updated>2020-03-31T16:31:32Z</updated>

		<summary type="html">&lt;p&gt;Created page with &amp;quot; == &amp;#039;&amp;#039;&amp;#039;Contact Information&amp;#039;&amp;#039;&amp;#039; ==   Name: Tejasvi Chebrolu (chebrolutejasvi on WIki)  Location: Hyderabad, India  University: International Institute of Information Technology ...&amp;quot;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;&lt;br /&gt;
== &amp;#039;&amp;#039;&amp;#039;Contact Information&amp;#039;&amp;#039;&amp;#039; ==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Name: Tejasvi Chebrolu (chebrolutejasvi on the wiki)&lt;br /&gt;
&lt;br /&gt;
Location: Hyderabad, India&lt;br /&gt;
&lt;br /&gt;
University: International Institute of Information Technology&lt;br /&gt;
&lt;br /&gt;
E-Mail: tejasvi.chebrolu@research.iiit.ac.in&lt;br /&gt;
&lt;br /&gt;
IRC: chebrolutejasvi&lt;br /&gt;
&lt;br /&gt;
Timezone: IST (UTC+5:30)&lt;br /&gt;
&lt;br /&gt;
GitHub: https://github.com/tejasvicsr1&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== &amp;#039;&amp;#039;&amp;#039;Why is it that I am interested in Apertium?&amp;#039;&amp;#039;&amp;#039; ==&lt;br /&gt;
Apertium is an open-source organisation dedicated to machine translation. As a child, I grew up in different places and was exposed to different languages. This left me fascinated with translation, and I want to help make communication easier for everyone through machine translation.&lt;br /&gt;
&lt;br /&gt;
Apertium focuses on low-resource languages. India has 22 officially recognised languages and many more unrecognised ones, yet while I was growing up there, good-quality machine translation services were lacking. There are hardly any resources for most Indian languages, and the work Apertium does helps to address this.&lt;br /&gt;
&lt;br /&gt;
Apertium is a rule-based system. As a student of Computational Linguistics, I take several linguistics courses. As a student and an undergraduate researcher, I am interested in rule-based systems, and Apertium provides an excellent platform to pursue that interest.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== &amp;#039;&amp;#039;&amp;#039;Which of the published tasks am I interested in? What do I plan to do?&amp;#039;&amp;#039;&amp;#039; ==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
I am going to work on “Adopt an unreleased language pair: Hindi - Telugu”. I want to get the pair released in both directions. I expect the &amp;#039;&amp;#039;&amp;#039;WER&amp;#039;&amp;#039;&amp;#039; (word error rate) to be around 25%. This would mean updating both monolingual dictionaries along with the bilingual dictionary. At the same time, I would write the transfer rules needed to release the pair.&lt;br /&gt;
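&lt;br /&gt;
For reference, WER is the word-level edit distance between the system output and a reference translation, divided by the number of words in the reference. The following is a minimal sketch in Python, assuming whitespace tokenisation; the function name wer is hypothetical and this is not the evaluation tool used by Apertium.&lt;br /&gt;
&lt;br /&gt;
```python
def wer(reference, hypothesis):
    # Word error rate: word-level edit distance divided by reference length.
    # Assumption: tokens are separated by whitespace.
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```
&lt;br /&gt;
Under this definition, the 25% target corresponds to about one word-level edit for every four words of the reference translation.&lt;br /&gt;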
&lt;br /&gt;
&lt;br /&gt;
== &amp;#039;&amp;#039;&amp;#039;Why should Google and Apertium sponsor it?&amp;#039;&amp;#039;&amp;#039; ==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
As of 2019, Hindi has 341 million speakers while Telugu has 82 million speakers. In spite of these huge numbers, there are very few resources that can translate effectively between these languages.&lt;br /&gt;
Creating some basic rules for the transfer between Hindi (an Indo-Aryan language) and Telugu (a Dravidian language) would further the development of translation systems between these two language families.&lt;br /&gt;
&lt;br /&gt;
Regions such as Telangana, where Dakhini (a language considered to be a mixture of Hindi and Telugu) is spoken, are densely populated. A good-quality translator would further research on languages like Dakhini (which has very few speakers), because Apertium would make conversion between Hindi and Telugu easy.&lt;br /&gt;
&lt;br /&gt;
Apertium has very few pairs in which both languages are Indian: only one such pair in trunk, none in staging, none in the nursery, and six in the incubator. Creating a pair that consists of a Dravidian language and an Indo-Aryan language would also help other languages, because of the rules that would be created.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== &amp;#039;&amp;#039;&amp;#039;Who will benefit from this?&amp;#039;&amp;#039;&amp;#039; ==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Creating a good translator for Hindi - Telugu would have a huge impact on society. It would help in documenting official documents better (Hindi is an official language of the Indian Union, while Telugu is official only at the state level). India has a huge population, and this would make communication easier. It would help in creating a good online bilingual dictionary. It would also help translation between Dravidian and Indo-Aryan languages, which is currently infrequent and inaccurate.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== &amp;#039;&amp;#039;&amp;#039;Work Plan&amp;#039;&amp;#039;&amp;#039; ==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Current Status of the Pair&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
Apertium currently has no Hindi-Telugu pair.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Hindi Monolingual Dictionary:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
1) The monolingual dictionary already contains a decent number of words, along with paradigms.&lt;br /&gt;
&lt;br /&gt;
2) A constraint grammar exists.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Telugu Monolingual Dictionary:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
1) There are hardly any words in the monolingual dictionary. Only the letters of the alphabet have been added.&lt;br /&gt;
&lt;br /&gt;
2) There is no constraint grammar.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Resources to enhance dictionaries&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
Hindi - Telugu Dictionary (~30,000 words) [https://tdil-dc.in/index.php?option=com_download&amp;amp;task=showresourceDetails&amp;amp;toolid=1563&amp;amp;lang=en]&lt;br /&gt;
&lt;br /&gt;
Hindi Monolingual Corpus (~36,000 sentences) [https://tdil-dc.in/index.php?option=com_download&amp;amp;task=showresourceDetails&amp;amp;toolid=1894&amp;amp;lang=en]&lt;br /&gt;
&lt;br /&gt;
Telugu Monolingual Corpus (~32,000 sentences) [https://www.tdil-dc.in/index.php?option=com_download&amp;amp;task=showresourceDetails&amp;amp;toolid=1892&amp;amp;lang=en]&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Detailed Plan&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! &amp;#039;&amp;#039;&amp;#039;PHASE&amp;#039;&amp;#039;&amp;#039; !! &amp;#039;&amp;#039;&amp;#039;DURATION&amp;#039;&amp;#039;&amp;#039; !! &amp;#039;&amp;#039;&amp;#039;TASKS&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|-&lt;br /&gt;
| Post-Application Period      || April 1st - May 3rd      || &lt;br /&gt;
*Bootstrap the hin-tel pair.&lt;br /&gt;
*Add basic words to the Telugu monolingual dictionary.&lt;br /&gt;
*Complete the rest of the coding challenge.&lt;br /&gt;
*Get familiar with the Apertium tools.&lt;br /&gt;
*Find more resources.&lt;br /&gt;
*Read about HFST.&lt;br /&gt;
&lt;br /&gt;
|-&lt;br /&gt;
| Community Bonding Period     || May 4th - May 31st     || &lt;br /&gt;
*Read the Apertium documentation in full.&lt;br /&gt;
*Discuss the broad plan with the mentors and iron out the exact details.&lt;br /&gt;
*Start creating transfer rules.&lt;br /&gt;
*Make frequency lists.&lt;br /&gt;
&lt;br /&gt;
|-&lt;br /&gt;
| Week One      || June 1st - June 7th      || &lt;br /&gt;
*Add nouns and verbs to the Telugu monolingual dictionary.&lt;br /&gt;
*Start working on the constraint grammar.&lt;br /&gt;
*Define paradigms for Telugu.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
|-&lt;br /&gt;
| Week Two      || June 8th - June 14th      || &lt;br /&gt;
*Add pronouns and adjectives to the dictionary.&lt;br /&gt;
*Add conjunctions, prepositions, adverbs, etc.&lt;br /&gt;
*Create transfer rules.&lt;br /&gt;
&lt;br /&gt;
|-&lt;br /&gt;
| Week Three     || June 15th - June 21st      || &lt;br /&gt;
*Add to the bilingual dictionary. &lt;br /&gt;
*Start creating disambiguation rules.&lt;br /&gt;
*Add to the Telugu monolingual dictionary.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
|-&lt;br /&gt;
| Week Four      || June 22nd - June 28th      || &lt;br /&gt;
*Fix any errors in the Hindi monolingual dictionary.&lt;br /&gt;
*Add words to the Hindi dictionary.&lt;br /&gt;
*Add to the bilingual dictionary.&lt;br /&gt;
&lt;br /&gt;
|-&lt;br /&gt;
| &amp;#039;&amp;#039;&amp;#039;DELIVERABLE 1:&amp;#039;&amp;#039;&amp;#039; || ||&lt;br /&gt;
*Reach 3500 words in the bilingual dictionary. &lt;br /&gt;
*Reach 4000 words in the Telugu monolingual dictionary.&lt;br /&gt;
    &lt;br /&gt;
&lt;br /&gt;
|-&lt;br /&gt;
| Week Five      || June 29th - July 5th      || &lt;br /&gt;
*Add to the bilingual dictionary. &lt;br /&gt;
*Create transfer rules.&lt;br /&gt;
*Add disambiguation rules.&lt;br /&gt;
&lt;br /&gt;
|-&lt;br /&gt;
| Week Six      || July 6th - July 12th      || &lt;br /&gt;
*Add compound words.&lt;br /&gt;
*Add disambiguation rules.&lt;br /&gt;
*Add to the constraint grammar.&lt;br /&gt;
&lt;br /&gt;
|-&lt;br /&gt;
| Week Seven      || July 13th - July 19th      || &lt;br /&gt;
*Add multi-words to the bilingual dictionary.&lt;br /&gt;
*Add more transfer rules.&lt;br /&gt;
&lt;br /&gt;
|-&lt;br /&gt;
| Week Eight      || July 20th - July 26th      || &lt;br /&gt;
*Test on text taken from books.&lt;br /&gt;
*Add transfer rules.&lt;br /&gt;
*Add disambiguation rules.&lt;br /&gt;
&lt;br /&gt;
|-&lt;br /&gt;
| &amp;#039;&amp;#039;&amp;#039;DELIVERABLE 2:&amp;#039;&amp;#039;&amp;#039; || ||&lt;br /&gt;
*Reach 7500 words in the bilingual dictionary. &lt;br /&gt;
*Complete 90% of transfer rules.&lt;br /&gt;
*Reach a WER of ~40% so that translations between the languages are understandable.&lt;br /&gt;
      &lt;br /&gt;
|-&lt;br /&gt;
| Week Nine      || July 27th - August 2nd     || &lt;br /&gt;
*Expand the bilingual dictionary.&lt;br /&gt;
*Create more disambiguation rules.&lt;br /&gt;
&lt;br /&gt;
|-&lt;br /&gt;
| Week Ten      || August 3rd - August 9th      || &lt;br /&gt;
*Add more transfer rules.&lt;br /&gt;
*Finish the constraint grammar.&lt;br /&gt;
&lt;br /&gt;
|-&lt;br /&gt;
| Week Eleven      || August 10th - August 16th      || &lt;br /&gt;
*Test the system with natural language examples.&lt;br /&gt;
*Update the rules based on the results. &lt;br /&gt;
&lt;br /&gt;
|-&lt;br /&gt;
| Week Twelve      || August 17th - August 23rd     || &lt;br /&gt;
*Testvoc the hin-tel pair.&lt;br /&gt;
*Add more rules, if needed.&lt;br /&gt;
&lt;br /&gt;
|-&lt;br /&gt;
| Week Thirteen      || August 24th - August 30th      || &lt;br /&gt;
*Add documentation.&lt;br /&gt;
*Evaluate the results.&lt;br /&gt;
*Fix any bugs, if found. &lt;br /&gt;
&lt;br /&gt;
|-&lt;br /&gt;
| &amp;#039;&amp;#039;&amp;#039;FINAL EVALUATION OBJECTIVES:&amp;#039;&amp;#039;&amp;#039; || ||&lt;br /&gt;
*Achieve a WER of around 25%.&lt;br /&gt;
*Reach at least 10,000 words in the bilingual dictionary. &lt;br /&gt;
*If there is time left over, convert the bilingual dictionary into IPA notation for easy use in the future. &lt;br /&gt;
|}&lt;br /&gt;
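&lt;br /&gt;
The frequency lists planned for the Community Bonding period could be built with a few lines of Python. This is only a minimal sketch, assuming a plain-text corpus with whitespace-separated tokens; the function name frequency_list is hypothetical, and real Hindi or Telugu text would need proper tokenisation and punctuation handling.&lt;br /&gt;
&lt;br /&gt;
```python
from collections import Counter


def frequency_list(lines):
    # Count every whitespace-separated token across all corpus lines.
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    # most_common() returns (token, count) pairs, most frequent first.
    return counts.most_common()
```
&lt;br /&gt;
The most frequent tokens that are missing from a dictionary would be natural candidates to add first.&lt;br /&gt;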
&lt;br /&gt;
&lt;br /&gt;
== &amp;#039;&amp;#039;&amp;#039;Coding Challenge&amp;#039;&amp;#039;&amp;#039; ==&lt;br /&gt;
&lt;br /&gt;
Install Apertium: link to screenshot [https://photos.app.goo.gl/s8yXX66dWUd1pKn67]&lt;br /&gt;
&lt;br /&gt;
Completed the HOWTO.&lt;br /&gt;
&lt;br /&gt;
Completed the MT course.&lt;br /&gt;
&lt;br /&gt;
Since the Telugu monolingual dictionary currently contains no words at all, no work could be done on the story yet. It will be completed in the post-application period.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== &amp;#039;&amp;#039;&amp;#039;Skills&amp;#039;&amp;#039;&amp;#039; ==&lt;br /&gt;
&lt;br /&gt;
I am a first-year undergraduate student at the International Institute of Information Technology, Hyderabad, where I am studying Computational Linguistics. The course requires a strong understanding of Computer Science along with Linguistics. I have done courses in Linguistics, Semantics, Data Structures and Algorithms, and Software Systems.&lt;br /&gt;
&lt;br /&gt;
I am proficient in several programming and markup languages, including C++, Python, Bash scripting, XML and HTML. I have built websites, web apps and simple games for my courses. As part of the Linguistics courses, I had to create transfer rules for an English-Hindi pair. I have also built a Brill POS tagger for languages like Hindi and Telugu. Currently, I am working on a system to help solve arithmetic word problems in Hindi.&lt;br /&gt;
&lt;br /&gt;
I am fluent in several languages (English, Hindi, Telugu, Odiya, Gujarati). I also have a decent understanding of French.&lt;br /&gt;
 &lt;br /&gt;
Since most of my projects were part of the course curriculum, they are not available on my GitHub profile, but I can send the files if needed.&lt;br /&gt;
&lt;br /&gt;
== &amp;#039;&amp;#039;&amp;#039;Non-Summer-Of-Code plans for the Summer&amp;#039;&amp;#039;&amp;#039; ==&lt;br /&gt;
&lt;br /&gt;
My college summer vacation falls within the GSoC period, so I have no other commitments and can spend around 40 hours a week on the project.&lt;br /&gt;
Because of the COVID-19 lockdown, our end-semester examinations could be postponed, so I have planned a reduced workload for the first two weeks. At this point a postponement seems unlikely because classes are being held online. As a precaution, however, I have kept the workload heavier before and after that period so that the project is not disrupted.&lt;br /&gt;
&lt;/div&gt;</summary>
		<author><name>Chebrolutejasvi</name></author>
		
	</entry>
</feed>