Retrospective: Google Code-In 2017

Google Code-In 2017 was certainly an overall success for Apertium. Students completed upwards of 140 tasks, which is probably the highest count yet. There were a lot of new contributors, some of whom seem likely to continue contributing, to varying extents, to many different Apertium projects.

However, some aspects of Apertium's involvement in GCI can be improved so that mentors and students have an even better experience in the future.


Selecting Winners/Finalists

What went right:

  • Deserving students were selected as Winners/Finalists.

What went wrong:

  • Very few mentors were involved in the selection of Winners/Finalists.
  • The other mentors were not told when, where, or how the selection discussion would take place, and neither these details nor the selected students were known to them even after the discussion was over.
  • There was no transparency among mentors about which criteria were used to select Winners/Finalists.

How we can improve it:

  • These problems apparently did not occur in previous years of GCI and only arose this year because of the increased workload. The next couple of sections discuss managing workload, so it is not covered further here.
  • In previous years, there was a spreadsheet containing the top ten students where mentors could rank the students, and the students with the highest overall rankings would be selected as Winners/Finalists. This is a good system, but it has drawbacks:
    • Not all mentors interact with all students, so each mentor's ranking can only be partially meaningful.
    • The evaluation criteria are not specifically discussed or standardized.
    • Any ranked voting system suffers from the deficiencies described by Arrow's impossibility theorem.
  • A possible solution is to have a paragraph written for each student by the mentor(s) who worked most closely with them, describing the quality of their work and interaction, with appropriate evidence. All the mentors can then read these descriptions and follow a cardinal voting process, i.e. each mentor assigns a numerical 'grade' to each student and the students are finally ranked by average grade (a minimal sketch of this tally follows this list). The benefits are:
    • Even if a mentor did not interact with a student, they can still judge the work.
    • The 'grade' can be split into different categories, like "code quality", "style/frequency of communication", "willingness to help others", etc. Mentors would have a transparent, standardized system for evaluating students, and this system could also be shared with students so they know what is valued in the community.
    • Arrow's impossibility theorem does not apply to cardinal systems.
    • The results are actually more accurate (see the "psychological studies" references on the Wikipedia page for cardinal voting).
    • No special process is required beyond a shared online spreadsheet with a sum and average value function.
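
As a rough illustration of the proposed tally, the sketch below (in Python, with invented mentor, student, and category names) computes each student's average grade across all mentors and categories; in practice a shared spreadsheet with a sum and average function does the same job.

  from statistics import mean

  # grades[mentor][student] maps category -> numeric grade (0-10 here).
  # All mentor, student, and category names are purely illustrative.
  grades = {
      "mentor_a": {
          "student_1": {"code quality": 8, "communication": 9, "helpfulness": 7},
          "student_2": {"code quality": 6, "communication": 7, "helpfulness": 9},
      },
      "mentor_b": {
          "student_1": {"code quality": 9, "communication": 8, "helpfulness": 8},
          "student_2": {"code quality": 7, "communication": 6, "helpfulness": 8},
      },
  }

  def average_grade(student):
      """Average one student's grades over all mentors and all categories."""
      scores = []
      for per_student in grades.values():
          scores.extend(per_student.get(student, {}).values())
      return mean(scores) if scores else 0.0

  # Rank students by overall average grade, highest first.
  ranking = sorted(
      {s for per_student in grades.values() for s in per_student},
      key=average_grade,
      reverse=True,
  )
  for student in ranking:
      print(f"{student}: {average_grade(student):.2f}")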


Organizing and managing tasks

What went right:

  • There were a large number of tasks covering a wide range of Apertium systems and a wide range of skills.
  • Each task typically had more than one mentor.
  • Each task typically had a decent description of the task and links for further information.
  • As the contest progressed, creating new tasks was hassle-free.

What went wrong:

  • Some tasks were very poorly described (sometimes just one sentence, no links).
  • Around the time the contest started, there was a lot of confusion about tasks not being uploaded properly, missing mentors/tags, etc.
  • Many tasks—especially ones with only a single instance, meant to fix a single issue—were claimed by students who clearly needed a lot more experience with the relevant software before they could make any progress. There were thus too many instances of timing out, submitting completely irrelevant work, "yo bro er whattami suppost to do", and so on.

How we can improve it:

  • In the task planning phase, mentors should only add themselves to a task when they have verified that the description is complete and makes sense. (Mentors should also discuss what constitutes a "complete description".) Tasks should only be published when they have been reviewed in this manner.
  • The uploading issues will probably not be present next year because we solved them this year.
  • Quoted from an email on apertium-stuff:

Each task ... would require the student to have completed work equivalent to the previous tasks. The first one or two tasks in the chain would be beginner tasks, easier than our current beginner tasks but not as easy as ["get on IRC"].

Chain 1:

  • "Download and compile one Apertium translation pair, and send a screenshot of trial translations"
  • "Add 200 words to the bilingual dictionary" or "Add 1 lexical transfer rule"
  • "Add 500 words to the bilingual dictionary" or "Add 10 lexical transfer rules" or "Write a contrastive grammar" or ...

Chain 2:

  • "Install a few translation pairs from your distribution software repository (or download and compile if you want to). Fork APy and run it locally on your computer. Send a screenshot of trial queries." or similarly for html-tools (a sample trial query is sketched after this list)
  • all the issue-fixing or feature-proposal tasks for APy or similarly for html-tools
  • tasks which involve modifying or testing with components of both ("Fix html-tools behavior when APy is down" etc.)

Similar 'chains' could be made for begiak and the lttoolbox tasks.

  • Tasks should also clearly define what it means to be "completed" so that mentors do not need to waste time commenting on irrelevant/very poor submissions.
  • These structured tasks will not only solve the problems mentioned above, but also make the learning curve much less steep, encouraging more students to work with Apertium. (Apertium probably has one of the steepest learning curves of all GCI organizations.) More tasks will be completed (always encouraging!), especially the initial tasks in the chains above, and the complex tasks will receive more relevant attention.
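
For the first task in Chain 2 above, a trial query against a locally running APy instance could look something like the Python sketch below. This is a minimal sketch only: it assumes APy's default port (2737), its /translate endpoint with langpair and q parameters, a ScaleMT-style JSON response, and an installed eng-spa pair; the details should be checked against the APy documentation.

  import json
  import urllib.parse
  import urllib.request

  # Assumption: APy is running locally on its default port with an
  # English-Spanish pair installed; adjust the URL and pair as needed.
  APY_URL = "http://localhost:2737/translate"

  def translate(text, langpair="eng|spa"):
      """Send one trial query to the local APy server and return the result."""
      params = urllib.parse.urlencode({"langpair": langpair, "q": text})
      with urllib.request.urlopen(f"{APY_URL}?{params}") as response:
          data = json.load(response)
      # Assumption: the response follows the ScaleMT-style JSON layout.
      return data["responseData"]["translatedText"]

  if __name__ == "__main__":
      print(translate("This is a trial query."))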


Organizing mentors

What went right:

  • Mentors who were active were highly responsive, both on IRC and on task pages.
  • No students complained about late responses or ineffective mentoring.

What went wrong:

  • Some mentors were positively flooded with task review, with (at one point) more than 48-hour backlogs.
  • Students had little idea of when and which mentors would be available at any given time. This was especially important for tasks where only one mentor was active or involved enough to help.

How we can improve it:

  • For the benefit of students and other mentors, mentors should publicly declare:
    • what timezone they're in
    • what rough time slots they will be available to respond to IRC, task pages, pull requests, etc.
    • what rough time slots they may be available
  • Among themselves, mentors should commit to putting in a certain number of hours every week; this sort of declaration would help mentors keep their own commitments and avoid feeling guilty about not doing work they were never supposed to do in the first place.
  • Mentors should also discuss among themselves when they will be on vacation, or have exams, classes, etc., so that they don't simply drop off the face of the Earth for no apparent reason in the middle of the contest.
  • Based on the availability and time declared above, an appropriate number of tasks and task instances can be assigned to each mentor, to avoid demanding more work than is possible.
  • All of this would be greatly helped by a mailing list for mentors.

The wiki

The eternal git/svn issue