Difference between revisions of "Talk:PMC proposals/Move Apertium to Github"
Latest revision as of 17:55, 1 February 2018
Reasons to Switch
- GitHub’s excellent issue tracker
- People outside Apertium are far more familiar with Git than with SVN (especially younger folks, see GCI/GSoC)
- More people outside Apertium have GitHub accounts, so it is easier for a new user to get started
- GitHub’s interface is far superior to SourceForge’s interface
- Avoids SourceForge’s downtime (not so bad lately)
- SourceForge gives an awful impression
- More visibility as a FOSS project
  - GitHub has become the de facto host for open source: people search for "github apertium" to find Apertium's code

Prevailing Approaches
Common pros/cons are excluded for the sake of brevity.
Approach 1
Variant A
A monorepo with all the linguistic data, pairs and language modules. Other folders in SVN, like the core engine and peripheral tools (e.g. APy), would live in their own repos.
Pros
- Large-scale editing of e.g. 15 pairs is easy.
- There are no meta-repos or submodules to deal with.
- GitHub’s interface can be used directly.
- Less need for extremely complicated git commands
- Possible to do partial checkouts using SVN
  - This completely removes possible pro #1

Cons
- The monorepo would be massive (> 3 GB).
  - Most devs (aside from the couple of core devs) would have to use GitHub’s SVN bridge to work on a pair.
    - This is highly contradictory to working on GitHub
    - People new to Apertium will have to learn SVN, negating some reasons to switch
    - ALTERNATIVE VIEWPOINT: there is no "learning SVN", it's three commands.
  - Diluted usefulness of branches, PRs and hooks
  - GitHub doesn’t necessarily allow repos larger than 1 GB (unclear whether this limit refers to the bare repo). If GitHub decides to stop us at some point after we switch, that’s really bad.
- Everyone will disable email notifications (“watching” a repo) since there will be too much spam
- Massive number of issue labels to curate and apply (non-members cannot tag an issue when submitting), reducing the effectiveness of the issue tracker
- Commit access will continue to give write access to everything
- Contradictory to the Git/GitHub philosophy (bad impression)
- Given that the usual recovery/fix for repo inconsistencies is to wipe and re-clone, having to re-clone a huge monorepo would greatly exacerbate those kinds of issues

Variant B
Several monorepos, one for each of:
- incubator
- pairs
- languages
- tools

Pros
- Large-scale editing of e.g. 15 pairs is easy.
- There are no meta-repos or submodules to deal with.
- GitHub’s interface can be used directly.
- Less need for extremely complicated git commands
- Possible to do partial checkouts using SVN

Cons
- All the cons in Variant A, minus:
  - Repos will be smaller than the massive monorepo
- Moving a package between release states and preserving history is complicated (can be scripted)

Variant C
Several repos:
- One for each of the modules in languages/
- One for all the released pairs
- One for incubator
- One for each of the core tools

Approach 2
Individual repos for each pair, language module and tool, plus a couple of “meta-repos” that contain submodules pointing to collections of repos. For example, apertium-staging would contain ~8 submodules pointing to each of the pairs in SVN’s /staging, and apertium-all would have submodules for apertium-staging, apertium-incubator, apertium-languages, etc. This hierarchy would be maintained via GitHub’s repo tags (a.k.a. “topics”), i.e. apertium-xxx-yyy could be marked with the incubator tag to end up in apertium-incubator.
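
The tag-driven hierarchy described above could be sketched roughly as follows. This is only an illustration, not an agreed design: the topic-to-meta-repo mapping, the `apertium` org name in the URL, and the helper names `group_by_topic` and `render_gitmodules` are all assumptions. The sketch groups repo names by topic and renders the `.gitmodules` text a meta-repo would carry.

```python
from collections import defaultdict

# Hypothetical mapping from GitHub topic to meta-repo name (assumed, not decided).
TOPIC_TO_META = {
    "staging": "apertium-staging",
    "incubator": "apertium-incubator",
    "languages": "apertium-languages",
}


def group_by_topic(repo_topics, topic_to_meta=TOPIC_TO_META):
    """Map each meta-repo name to the sorted list of repos tagged for it.

    repo_topics: dict of repo name -> iterable of topic strings.
    Topics with no entry in topic_to_meta are ignored.
    """
    grouped = defaultdict(list)
    for repo, topics in repo_topics.items():
        for topic in topics:
            meta = topic_to_meta.get(topic)
            if meta is not None:
                grouped[meta].append(repo)
    return {meta: sorted(repos) for meta, repos in grouped.items()}


def render_gitmodules(repos, org="apertium"):
    """Render .gitmodules text with one submodule entry per repo.

    The org name and URL pattern are assumptions for illustration.
    """
    lines = []
    for repo in repos:
        lines.append(f'[submodule "{repo}"]')
        lines.append(f"\tpath = {repo}")
        lines.append(f"\turl = https://github.com/{org}/{repo}.git")
    return ("\n".join(lines) + "\n") if lines else ""
```

Calling `group_by_topic({"apertium-xxx-yyy": ["incubator"]})` would place apertium-xxx-yyy under apertium-incubator, matching the example in the paragraph above; how the topics are actually fetched is left out here.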
Pros
- Usable issue tracker for each repo
- Fits into the Git/GitHub philosophy
- People who wish to use Git can contribute with it (while the SVN bridge remains available for those who prefer it)
- Familiar branching, PRs and hooks that work as expected
- Email notifications and watching repos are useful
- An analogous change to 15 pairs will result in 15 different commits; each repo has its own history (both pro and con).
- Granular permissions (not everyone has access to literally everything, especially useful for GCI/GSoC)
  - RESPONSE: Could be considered more bureaucratic
    - Re-response: not really. Granular permissions are a (good) option, but they are not mandatory. We could use "org" permissions instead of "repo" permissions.
- Empowerment for package maintainers:
  - They could enforce workflows (code reviews, etc.) for specific packages, and easily accept patches from other people (via pull requests) before those people request commit access.

Cons
- Harder for people who make changes to lots of pairs at the same time (i.e. the couple of core devs)
  - Commands are more gnarly (git submodule can be pretty unintuitive)
    - RESPONSE: Possible to mitigate with aliases and cheat sheets
  - An analogous change to 15 pairs will result in 15 different commits; each repo has its own history (both pro and con).
    - RESPONSE: Already happening for most people. Almost no one has the whole SVN repo checked out, just multiple SVN subfolders.
- Somewhat harder for people who use the meta-repos
  - RESPONSE: It’s really not that difficult to check out (git submodule update --recursive --init) and pull updates to a meta-repo (git pull --recurse-submodules), and with aliases it can be even shorter.
- Requires tooling to keep meta-repos up-to-date
  - RESPONSE: These are super simple scripts based on GitHub’s reliable API. Sushain is willing to write them and Tino is willing to host (and perhaps code review).
- GitHub doesn’t provide a nice interface to view repos in a tree format
  - RESPONSE: Sushain will though! See this page (https://rawgit.com/sushain97/apertium-on-github/master/source-browser.html), which can be trivially finished to cover all our repos and is a very simple single HTML file (and pretty IMO). This page is automatically generated from the repo tags.
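
The "tooling to keep meta-repos up-to-date" point above might boil down to comparing what a meta-repo currently contains with what the tags say it should contain. A minimal sketch, assuming both lists have already been fetched (e.g. from `.gitmodules` and from GitHub's API); `plan_sync` and `sync_commands` are hypothetical names, and the clone URL pattern with the `apertium` org is an assumption rather than anything decided in this discussion:

```python
def plan_sync(current, desired):
    """Return (to_add, to_remove) as sorted lists of repo names.

    current: repo names already present as submodules in the meta-repo.
    desired: repo names that carry the meta-repo's tag.
    """
    current_set, desired_set = set(current), set(desired)
    to_add = sorted(desired_set - current_set)
    to_remove = sorted(current_set - desired_set)
    return to_add, to_remove


def sync_commands(to_add, to_remove, org="apertium"):
    """Build git argv lists to run inside the meta-repo; nothing is executed here."""
    commands = []
    for repo in to_add:
        # Assumed URL pattern; the real org/repo layout may differ.
        url = f"https://github.com/{org}/{repo}.git"
        commands.append(["git", "submodule", "add", url, repo])
    for repo in to_remove:
        # `git rm` on a submodule path drops the gitlink and its .gitmodules entry.
        commands.append(["git", "rm", repo])
    return commands
```

Returning argv lists instead of running them keeps the sketch easy to review and dry-run; a real script would still need to commit and push the result, which is not shown.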
 
 
Related Concerns
- Mailing lists: should probably stay on SourceForge for now, until/unless we choose to switch to another solution or self-host them.
- Existing issues: Sushain volunteers to migrate our existing issues manually (or to find an automated solution); there are only a small number of them.