Difference between revisions of "Talk:PMC proposals/Move Apertium to Github"

From Apertium

Revision as of 18:58, 28 October 2017

Reasons to Switch

  • GitHub’s excellent issue tracker
  • People outside Apertium are far more familiar with Git than with SVN (especially younger folks, see GCI/GSoC)
  • More people outside Apertium have GitHub accounts, so getting started is easier for a new user
  • GitHub’s interface is far superior to SourceForge’s interface
  • Avoids SourceForge’s downtime (not so bad lately)
  • SourceForge gives an awful impression
  • More visibility as a FOSS project (people browse GitHub)

Prevailing Approaches

Common pros/cons are excluded for the sake of brevity.

Approach 1

Variant A

A monorepo with all the linguistic data, pairs and language modules. Other folders in SVN like the core engine and peripheral tools (e.g. APy) would live in their own repos.

Pros

  • Large-scale editing of e.g. 15 pairs is easy.
  • There are no meta-repos or submodules to deal with.
  • GitHub’s interface can be used directly.
  • Less need for extremely complicated git commands
  • Possible to do partial checkouts using SVN

Cons

  • The monorepo would be massive (> 3 GB).
    • Most devs (aside from the couple of core devs) would have to use GitHub’s SVN bridge to work on a pair.
      • This is highly contradictory to working on GitHub
      • People new to Apertium will have to learn SVN, negating some reasons to switch
    • Diluted usefulness of branches, PRs and hooks
    • GitHub doesn’t necessarily allow repos larger than 1 GB (unclear whether this limit refers to the bare repo). If GitHub decides to stop us at some point after we switch, that’s really bad.
  • Everyone will disable email notifications (“watching” a repo) since there will be too much spam
  • Massive number of issue labels to curate and apply (non-members cannot tag an issue when submitting), reducing the effectiveness of the issue tracker
  • Commit access will continue to give write access to everything
  • Contradictory to the Git/GitHub philosophy (bad impression)


Variant B

Several monorepos, one for each of:

  • incubator
  • pairs
  • languages
  • tools

Pros

  • Large-scale editing of e.g. 15 pairs is easy.
  • There are no meta-repos or submodules to deal with.
  • GitHub’s interface can be used directly.
  • Less need for extremely complicated git commands
  • Possible to do partial checkouts using SVN

Cons

Approach 2

Individual repos for each pair, language module and tool. A couple of “meta-repos” would contain submodules pointing to collections of repos: e.g. apertium-staging would contain ~8 submodules, one for each of the pairs in SVN’s /staging, and apertium-all would have submodules for apertium-staging, apertium-incubator, apertium-languages, etc. This hierarchy would be maintained via GitHub’s repo tags (a.k.a. “topics”), i.e. apertium-xxx-yyy could be marked with the incubator tag to end up in apertium-incubator.
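
As a rough illustration of the tag-driven hierarchy, the sketch below groups repos into meta-repos by topic. The META_REPOS mapping, the topic names and the sample repo names are assumptions made up for this example; in practice the topic data would be fetched from GitHub's API rather than passed in by hand.

```python
from collections import defaultdict

# Hypothetical mapping from a GitHub topic to the meta-repo that collects it.
META_REPOS = {
    "staging": "apertium-staging",
    "incubator": "apertium-incubator",
    "languages": "apertium-languages",
}


def group_by_topic(repo_topics):
    """Return {meta_repo: sorted repo names} given {repo name: [topics]}."""
    groups = defaultdict(list)
    for repo, topics in repo_topics.items():
        for topic in topics:
            meta = META_REPOS.get(topic)
            if meta is not None:  # topics without a meta-repo are ignored
                groups[meta].append(repo)
    return {meta: sorted(repos) for meta, repos in groups.items()}
```

Calling group_by_topic({"apertium-xxx-yyy": ["incubator"]}) would place apertium-xxx-yyy under apertium-incubator, matching the example in the paragraph above.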

Pros

  • Usable issue tracker for each repo
  • Fits into the Git/GitHub philosophy
  • People who wish to use Git can contribute using that (while it's still possible to use the SVN bridge for those who want that)
  • Familiar branching, PR and hooks that work as expected
  • Email notifications and watching repos are useful
  • An analogous change to 15 pairs will result in 15 different commits, each repo has its own history (both pro and con).
  • Granular permissions (not everyone has access to literally everything, especially useful for GCI/GSoC)
    • RESPONSE: Could be considered more bureaucratic

Cons

  • Harder for people who make changes to lots of pairs at the same time (i.e. the couple of core devs)
    • Commands are more gnarly (git submodule can be pretty unintuitive)
      • RESPONSE: Possible to mitigate with aliases and cheat sheets
    • An analogous change to 15 pairs will result in 15 different commits, each repo has its own history (both pro and con).
  • Somewhat harder for people who use the meta-repos
    • RESPONSE: It’s really not that difficult to check out a meta-repo (git submodule update --recursive --init) and pull updates to it (git pull --recurse-submodules), and with aliases it can be even shorter.
  • Requires tooling to keep meta-repos up-to-date
    • RESPONSE: These are super simple scripts based on GitHub’s reliable API. Sushain is willing to write them and Tino is willing to host (and perhaps code review).
  • GitHub doesn’t provide a nice interface to view repos in a tree format
    • RESPONSE: Sushain will though! See this page that can be trivially finished to cover all our repos and is a very simple single HTML file (and pretty IMO). This page is automatically generated from the repo tags.
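
For a sense of how small the meta-repo tooling mentioned above could be, here is a minimal sketch of its core step: comparing a meta-repo's current submodules with the repos that should be in it and emitting the git commands to reconcile them. This is not the actual script; the function name, the default base URL and the choice to print commands rather than run them are all assumptions for illustration.

```python
def plan_submodule_sync(current, desired, base_url="https://github.com/apertium"):
    """Return git commands that bring a meta-repo's submodules in line with `desired`.

    `current` and `desired` are iterables of repo names; `base_url` is an
    assumed organisation URL, not something fixed by the proposal.
    """
    current, desired = set(current), set(desired)
    commands = []
    # Repos that carry the tag but are not yet submodules get added.
    for name in sorted(desired - current):
        commands.append(f"git submodule add {base_url}/{name}.git {name}")
    # Submodules whose repo no longer carries the tag get removed.
    for name in sorted(current - desired):
        commands.append(f"git rm {name}")
    return commands
```

A scheduled job could run the resulting commands inside the meta-repo and commit the result, so that moving a repo between collections only requires changing its tag.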