PMC proposals/Move apertium to github

Summary

Git offers a number of advantages over Subversion, including a very good branching mechanism, offline access to the full commit history, a bisection tool for locating broken commits, and excellent merge/rebase capabilities. The built-in documentation is very good, and (unlike svn) the git command-line client comes in glorious ANSI colour :) Since you have the full history on your computer, you can grep through commits or check out earlier versions _quickly_, without even being online. Repositories also tend to take less disk space.
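
As a rough illustration of the offline history search and bisection mentioned above, the following sketch builds a throwaway repository and lets git find the commit that introduced a problem. Everything in it is invented for the demo: the file name dict.txt, the BROKEN marker, and the five dummy commits are assumptions, not part of any apertium module.

```shell
#!/bin/sh
# Hypothetical demo: a scratch repository with five commits, one of which
# adds the marker word BROKEN. No network access is needed at any point.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Demo User"
git config user.email "demo@example.org"

# Commits 1-5; commit 3 is the one that "breaks" the file.
for i in 1 2 3 4 5; do
    if [ "$i" -eq 3 ]; then
        echo "BROKEN" >> dict.txt
    else
        echo "entry $i" >> dict.txt
    fi
    git add dict.txt
    git commit -q -m "commit $i"
done

# Search the local history for the commit that added the marker.
git log --oneline -S BROKEN -- dict.txt

# Bisect between the first commit (good) and HEAD (bad). The test command
# exits non-zero whenever the marker is present in the checked-out file.
first=$(git rev-list --max-parents=0 HEAD)
git bisect start HEAD "$first"
git bisect run sh -c '! grep -q BROKEN dict.txt'
culprit=$(git rev-parse refs/bisect/bad)
git bisect reset
git log -1 --format='first bad commit: %s' "$culprit"
```

In a real module the test command passed to git bisect run would more likely be a build or a regression test than a grep.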

Making use of a service such as GitHub would also allow each apertium module to live in a separate repository, with the possibility of creating central repositories (such as incubator) which link to all of the included modules. GitHub also provides an issue tracker and a system for making commits in a personal fork of the upstream repository, then requesting that your changes be pulled into upstream. Note that apertium can retain its current method of allowing people to commit directly, while offering pull requests as an option for those who don't plan to contribute regularly. SourceForge could be retained for mailing lists and similar services.

Migration of the repositories from subversion to git should be relatively simple. Tools exist for creating git repositories from subversion while retaining all commit history. The migration should begin with smaller apertium modules, such as the contents of nursery and incubator. The more central modules, such as lttoolbox and apertium itself, can be moved last. Documentation will need to be updated, but a simple guide similar to https://wiki.gnome.org/TranslationProject/GitHowTo should be sufficient. Much of the information contained therein is probably not necessary for the apertium workflow, making for a simpler, easier-to-write document. For more complex requirements, the existing git documentation is excellent and there are many resources for a variety of git recipes. I will create a draft version of a document covering general apertium use prior to the beginning of the move.
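
One possible shape for the migration order described above, using git svn (one of the existing tools that converts a subversion tree while keeping its history). This is a sketch under stated assumptions: the SVN_ROOT URL, the module paths, and the authors.txt file are placeholders, not the real apertium layout. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: SVN_ROOT and the module paths below are hypothetical.
set -e
SVN_ROOT="https://svn.example.org/apertium"   # placeholder URL
DRY_RUN="${DRY_RUN:-1}"                       # set DRY_RUN=0 to really clone

# Smaller modules first, central modules last, as proposed above.
MODULES="nursery/apertium-fie-bar incubator/apertium-foo-baz trunk/lttoolbox trunk/apertium"

migrate() {
    path=$1
    name=$(basename "$path")
    # git svn clone replays each SVN revision as a git commit;
    # --authors-file maps SVN usernames to "Name <email>" identities
    # (authors.txt is an assumed file that must exist for a real run).
    cmd="git svn clone --authors-file=authors.txt $SVN_ROOT/$path $name"
    if [ "$DRY_RUN" = 1 ]; then
        echo "$cmd"
    else
        $cmd
    fi
}

for m in $MODULES; do
    migrate "$m"
done
```

Each resulting directory is an ordinary git repository that could then be pushed to a separate hosted repository per module.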

Proposed by: User:Leftmostcat

Related reading

GUIs:

In detail

Caveats

  • The svn repo contains several larger binaries and their history. The total sum of those would need to be cloned for every person who intends to seriously work with the subproject. A shallow clone (equivalent to svn checkout) can only be used for basic patchwork (cannot clone, fetch, push into, or push from shallow clones). See https://git.wiki.kernel.org/index.php/GitFaq#How_do_I_do_a_quick_clone_without_history_revisions.3F and following point. Tino Didriksen 16:56, 6 September 2013 (UTC)
    Because of the ability to separate repositories, the impact of this would be minimized. To work on a language pair, it would only be necessary to clone the pair itself. —Leftmostcat 17:16, 6 September 2013 (UTC)
    I once (about 2 years ago?) tried to checkout all of apertium svn into one big git repo (ie. with full history). It took less space than the SVN checkout (git is quite good at compressing history because it has to, SVN keeps only local copy of everything and thus doesn't bother?) --unhammer 07:59, 8 September 2013 (UTC)
s/local copy/local copies/ -- Jimregan 15:09, 9 September 2013 (UTC)
  • I would not recommend shallow clones, since 1) most apertiumers will be new to git, and it just adds more complexity 2) you typically don't save much drive space: http://blogs.gnome.org/simos/2009/04/18/git-clones-vs-shallow-git-clones/ 3) people will be checking out a repo at a time, not everything that was in SVN, and 4) maybe it's not such a bad thing that repos with many versions of big binaries stand out like a sore thumb ;-) --unhammer 07:59, 8 September 2013 (UTC)
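
To make the full versus shallow clone comparison in the caveats above concrete, here is a small local experiment. It is only a sketch: the upstream repository, its four commits, and the file data.txt are made up for the demo, and real disk savings depend entirely on the repository in question.

```shell
#!/bin/sh
# Hypothetical demo: compare a full clone with a --depth 1 clone of a
# scratch repository. Nothing here touches the real apertium repositories.
set -e
work=$(mktemp -d)
cd "$work"
git init -q upstream
cd upstream
git config user.name "Demo User"
git config user.email "demo@example.org"
for i in 1 2 3 4; do
    echo "version $i" > data.txt
    git add data.txt
    git commit -q -m "version $i"
done
cd "$work"

# A file:// URL is used so that --depth is honoured for a local source.
git clone -q "file://$work/upstream" full
git clone -q --depth 1 "file://$work/upstream" shallow

full_count=$(git -C full rev-list --count HEAD)
shallow_count=$(git -C shallow rev-list --count HEAD)
echo "full clone: $full_count commits"
echo "shallow clone: $shallow_count commit(s)"

# Size of each clone's .git directory in kilobytes.
du -sk full/.git shallow/.git
```

Running the same comparison against an actual language pair would show whether the space saved is worth the extra complexity for that module.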

Comments

  • "Note that apertium can retain its current method of allowing people to commit directly". Yuck. Github makes pulling easy enough that this should never, ever be considered. Also, the biggest benefit of Github is "drive by contributions" -- there's no need to be registered with a project to contribute, and forking is simplified to the point that you can just click 'edit' on a file and it does it transparently (and edit in the browser, without needing to download anything). I think that's a better selling point to most people than things like bisect :) -- Jimregan 15:28, 9 September 2013 (UTC)
    +1 – if we use git just like svn, we might as well use svn unhammer (talk) 10:28, 17 November 2014 (CET)
  • Another (more git-like?) way of doing this is to simply say "if a language pair/sub-project wants to move off svn, let the committers to that module decide". E.g. apertium-en-hi is already on github. --unhammer 18:47, 9 September 2013 (UTC)
    To explain a bit more: The "migration strategy" would be that we are open to letting people use git if they want to for their projects. So if $newcontributor wants to make apertium-fie-bar and host it on github, that's fine; they get to use their favourite VC, while people who feel that learning a new VC is a waste of time get to stay with SVN. And they don't have to learn a new VC just because the cool kids say it's what everybody's doing these days (which is a bad reason and very unmotivating). That is, they don't have to learn git until they find an interest in fie-bar and have an incentive to learn git, in which case motivation is there. (If you're into apertium to do actual work, then cool tricks like being able to do 'git merge --abort --dammit' are not the right motivation.)
    As User:Sushain mentions, it might be nice to have just the "tools" things in git (html-tools, apy, scrapers), since these are typically worked on by people who prefer git, and you don't need to check them out in order to work on language pairs. --unhammer (talk) 09:11, 5 December 2014 (CET)

Voting

For

Against

Abstain

See also