Apertium has a problem with anaphora resolution.
- If you have "el seu" in Catalan and are translating to French, it could be "son" (third-person singular possessor) or "leur" (third-person plural possessor). If you are translating to English or Russian, you also need to know the gender of the possessor (его, ее, их).
- If you are generating subject pronouns for a language, you often need to know the gender of the referent, e.g. "ha arribat" could be "He has arrived" or "She has arrived". In this case the usual fallback is to use the masculine pronoun, but that just relies on the fact that masculine pronouns are used more frequently (see below).
Usually this kind of thing is done over parse trees, but Apertium doesn't have parse trees, so we'd need to find another way to do it.
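To make the "el seu" ambiguity above concrete, here is a minimal sketch in Python. The table and the function names (`FORMS`, `candidates`, `translate_possessive`) are hypothetical illustrations, not part of Apertium, and the French entries are simplified to the form used before a masculine singular noun. It only shows that the target form cannot be chosen until the possessor's number and gender are known.

```python
# Illustrative table only (an assumption for this sketch, not Apertium data):
# target language -> (possessor number, possessor gender) -> possessive form.
FORMS = {
    "fr": {("sg", "m"): "son", ("sg", "f"): "son",
           ("pl", "m"): "leur", ("pl", "f"): "leur"},
    "en": {("sg", "m"): "his", ("sg", "f"): "her",
           ("pl", "m"): "their", ("pl", "f"): "their"},
    "ru": {("sg", "m"): "его", ("sg", "f"): "ее",
           ("pl", "m"): "их", ("pl", "f"): "их"},
}


def candidates(lang):
    """Return every form "el seu" could become when the possessor is unknown."""
    return sorted(set(FORMS[lang].values()))


def translate_possessive(lang, number, gender):
    """Pick the form once the possessor's number and gender have been resolved."""
    return FORMS[lang][(number, gender)]


if __name__ == "__main__":
    for lang in ("fr", "en", "ru"):
        print(lang, candidates(lang))
    print(translate_possessive("en", "sg", "f"))
```

French needs only the possessor's number, while English and Russian need both number and gender, which is exactly the information that has to come from resolving the anaphor.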
Masculine and feminine subject pronouns in English Wikipedia:
5682787 he
3469648 He
1508156 she
839442 She
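As a rough illustration of the "default to masculine" fallback, the counts above can be turned into a share and a default choice. This is only a sketch: the function names are made up for the example, and a most-frequent-form baseline like this is not anaphora resolution, since it ignores the actual referent.

```python
# Counts of subject pronouns in English Wikipedia, copied from the figures above.
COUNTS = {"he": 5682787, "He": 3469648, "she": 1508156, "She": 839442}


def masculine_share(counts):
    """Fraction of subject pronouns that are masculine, ignoring capitalisation."""
    masc = sum(n for form, n in counts.items() if form.lower() == "he")
    fem = sum(n for form, n in counts.items() if form.lower() == "she")
    return masc / (masc + fem)


def default_pronoun(counts):
    """Most-frequent-form baseline: always emit the more common pronoun."""
    return "he" if masculine_share(counts) >= 0.5 else "she"


if __name__ == "__main__":
    print(f"masculine share: {masculine_share(COUNTS):.3f}")
    print("default pronoun:", default_pronoun(COUNTS))
```

With these counts the masculine forms make up roughly four fifths of the total, so always choosing "he" would be wrong for about one in five subject pronouns in this kind of text.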