Difference between revisions of "Talk:PMC proposals/Move Apertium to Github"

Revision as of 02:58, 28 October 2017

Apertium on GitHub

Reasons to Switch

  1. GitHub’s excellent issue tracker
  2. Far more people are familiar with Git than with SVN (especially younger folks, see GCI/GSoC)
  3. More people have GitHub accounts, so it is easier for a new user to get started
  4. GitHub’s interface is far superior to SourceForge’s interface
  5. Avoids SourceForge’s downtime (not so bad lately)
  6. SourceForge gives an awful impression
  7. More visibility as a FOSS project (people browse GitHub)

Prevailing Approaches

Common pros/cons are excluded for the sake of brevity.

Approach 1

A monorepo with all the linguistic data, pairs and language modules. Other folders in SVN like the core engine and peripheral tools (e.g. APy) would live in their own repos.

Pros

  • Large-scale editing of e.g. 15 pairs is easy.
  • There are no meta-repos or submodules to deal with.
  • GitHub’s interface can be used directly.

Cons

  • The monorepo would be massive (> 3 GB).
    • Most devs (aside from the couple of core devs) would have to use GitHub’s SVN bridge to work on a pair.
      • This is highly contradictory to working on GitHub
      • People will have to learn SVN, negating some reasons to switch
    • Diluted usefulness of branches, PRs and hooks
    • GitHub doesn’t necessarily allow repos larger than 1 GB (it is unclear whether this limit refers to the bare repo). If GitHub decides to stop us at some point after we switch, that’s really bad.
  • Everyone will disable email notifications (“watching” a repo) since there will be too much spam
  • Massive number of issue labels to curate and apply (non-members cannot tag an issue when submitting), reducing the effectiveness of the issue tracker
  • Commit access will continue to give write access to everything
  • Contradictory to the Git/GitHub philosophy (bad impression)

Approach 2

Individual repos for each pair, language module and tool. A couple of “meta-repos” that contain submodules pointing to collections of repos, e.g. apertium-staging would contain ~8 submodules pointing to each of the pairs in SVN’s /staging and apertium-all would have submodules to apertium-staging, apertium-incubator, apertium-languages, etc. This hierarchy would be maintained via GitHub’s repo tags (a.k.a. “topics”), e.g. apertium-xxx-yyy could be marked with the incubator tag to end up in apertium-incubator.
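
As a rough illustration of the topic-to-meta-repo mapping described above, the following Python sketch groups repos by topic and renders the .gitmodules text a meta-repo would carry. It is only a sketch under assumptions: the organisation name apertium, the URL layout, the function names, and the sample repo list (apart from apertium-xxx-yyy) are hypothetical, and the input is assumed to be a list of dicts with "name" and "topics" keys, roughly as a trimmed GitHub API repository listing might look.

```python
from collections import defaultdict

# Assumptions for illustration only: organisation name and clone URL layout.
ORG = "apertium"
URL_TEMPLATE = "https://github.com/{org}/{name}.git"


def group_by_topic(repos, topics):
    """Map each meta-repo name (apertium-<topic>) to the sorted repos tagged with that topic."""
    groups = defaultdict(list)
    for repo in repos:
        for topic in repo["topics"]:
            if topic in topics:
                groups["apertium-" + topic].append(repo["name"])
    return {meta: sorted(names) for meta, names in groups.items()}


def render_gitmodules(names):
    """Render .gitmodules text with one submodule entry per repo name."""
    entries = []
    for name in names:
        url = URL_TEMPLATE.format(org=ORG, name=name)
        entries.append(f'[submodule "{name}"]\n\tpath = {name}\n\turl = {url}\n')
    return "".join(entries)


# Made-up sample data; only apertium-xxx-yyy appears in the proposal text.
repos = [
    {"name": "apertium-xxx-yyy", "topics": ["incubator"]},
    {"name": "apertium-aaa-bbb", "topics": ["staging"]},
    {"name": "apertium-ccc", "topics": ["languages"]},
]
groups = group_by_topic(repos, {"incubator", "staging", "languages"})
print(render_gitmodules(groups["apertium-incubator"]))
```

Keeping the grouping separate from the rendering means the same topic data could also drive apertium-all, whose entries would point at the other meta-repos rather than at individual pairs.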

Pros

  • Usable issue tracker for each repo
  • Fits into the Git/GitHub philosophy
  • Everyone can contribute using Git (no need for SVN bridge)
  • Familiar branching, PR and hooks that work as expected
  • Email notifications and watching repos are useful
  • Granular permissions (not everyone has access to literally everything, especially useful for GCI/GSoC)
    • RESPONSE: Could be considered more bureaucratic

Cons

  • Harder for people who make changes to lots of pairs at the same time (i.e. a couple of core devs)
    • Commands are more gnarly (git submodule can be pretty unintuitive)
      • RESPONSE: Possible to mitigate with aliases and cheat sheets
    • An analogous change to 15 pairs will result in 15 different commits, since each repo has its own history.
  • Somewhat harder for people who use the meta-repos
    • RESPONSE: It’s really not that difficult to check out (git submodule update --recursive --init) and pull updates to a meta-repo (git pull --recurse-submodules), and with aliases it can be even shorter.
  • Requires tooling to keep meta-repos up-to-date
    • RESPONSE: These are super simple scripts based on GitHub’s reliable API. Sushain is willing to write them and Tino is willing to host (and perhaps code review).
  • GitHub doesn’t provide a nice interface to view repos in a tree format
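
To give a sense of how small the meta-repo upkeep tooling might be, here is a minimal Python sketch of one step: comparing the submodules a meta-repo currently has with the repos that should be in it, and listing the git commands to reconcile them. The function name plan_sync, the organisation name, the URL layout and the sample repo names are all assumptions made for illustration; the actual scripts discussed above may look quite different, and fetching the desired set from GitHub is left out.

```python
def plan_sync(current, desired, org="apertium"):
    """Return the git commands that turn the current submodule set into the desired one."""
    commands = []
    # Repos that should be in the meta-repo but are not yet submodules.
    for name in sorted(desired - current):
        commands.append(f"git submodule add https://github.com/{org}/{name}.git {name}")
    # Submodules whose repo no longer belongs in this meta-repo.
    for name in sorted(current - desired):
        commands.append(f"git rm {name}")
    return commands


# Hypothetical state: one pair moved out of the meta-repo, one moved in.
current = {"apertium-aaa-bbb", "apertium-xxx-yyy"}
desired = {"apertium-aaa-bbb", "apertium-ddd-eee"}
for command in plan_sync(current, desired):
    print(command)
```

The sketch only prints the plan; a real script would presumably run the commands inside a checkout of the meta-repo and then commit and push the result.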