John and Jane are both looking for new jobs in the tech field. They have a similar level of education and similar job experience. However, when looking for positions on online job sites, John is shown far more high-paying job postings than Jane. This is due to processes called "machine learning" and "word embedding". You may ask yourself what this has to do with automatic translation, but the reality is that it plays a role in sexism in machine translation. What exactly is word embedding, and how does it contribute to sexism in machine translation? And more importantly, how do we fix it?
What is “word embedding” and how does it relate to machine translation?
As machine translation becomes more popular due to its efficiency, convenience, and price, it's important to consider what sacrifices need to be made in order to make it possible. Machine translation uses machine learning and algorithms to decide how words will be translated. This is done by programming computers to use certain algorithms so they can begin to teach themselves, in this case by scanning online texts and matching up segments of phrases and paragraphs. Unfortunately, this usually means easily accessible documents, such as Google News or Wikipedia. The algorithms then search for co-occurrences, words that are often found together, which in turn can mean perpetuating outdated stereotypes.
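As a rough illustration of what "searching for co-occurrences" means, here is a minimal Python sketch. The tiny corpus, the window size, and the function name `cooccurrences` are all invented for this example; real systems process billions of words and feed these counts into embedding models rather than using them directly.

```python
from collections import Counter

def cooccurrences(sentences, window=3):
    """Count how often two words appear within `window` words of each other."""
    counts = Counter()
    for sentence in sentences:
        words = sentence.lower().split()
        for i, word in enumerate(words):
            # Look only ahead; sorting the pair makes the count symmetric.
            for neighbor in words[i + 1 : i + 1 + window]:
                counts[tuple(sorted((word, neighbor)))] += 1
    return counts

# A toy "corpus" that already carries a gendered association.
corpus = [
    "the engineer presented his design",
    "the nurse finished her shift",
    "his design impressed the engineer",
]
counts = cooccurrences(corpus)
print(counts[("engineer", "his")])  # pair counts reflect whatever bias the corpus contains
```

Even in three sentences, "engineer" co-occurs with "his" and "nurse" with "her"; scaled up to the whole web, those skewed counts become the statistical signal the model learns from.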
Why is it sexist?
Since machine learning can only access surface-level texts, the information the computer receives is often biased and can result in the computer teaching itself to be sexist. For instance, if you use English-French (English as a source language and French as a target language) as a language pair with Google Translate, the translation can reflect sexism in the modern world. If you type "engineer" in English, it will be translated as "un ingénieur" (male engineer). Similarly, if you type in "nurse", Google Translate will use "une infirmière" (female nurse) as the French equivalent. If you want a female engineer translated to French, you must specify "female engineer" in the English copy, or "male nurse" in order to get the intended translation. Part of this issue stems from English being a gender-neutral language, while French requires its nouns to be assigned a gender. But why is it that "engineer" is male and "nurse" is female when both genders are equally capable of doing both jobs? It's due in part to co-occurrences. When computers scan large volumes of text, they most often find nouns such as "engineer" or "doctor" associated with male pronouns, like "his" or "him". The same happens with certain "feminine jobs", like "teacher" or "nurse". Through automatic translation sites, these stereotypes can be perpetuated and unintentionally integrated into new texts. For a more in-depth look at algorithms and their potential benefits and consequences, check out this video from Dataguele.
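To see how co-occurrence statistics turn into a measurable "gender lean", here is a toy sketch using cosine similarity, the standard way to compare word embeddings. The three-dimensional vectors below are invented for illustration; real models such as word2vec learn hundreds of dimensions from the co-occurrence counts described above, but exhibit the same effect.

```python
import math

def cosine(u, v):
    """Cosine similarity: 1.0 means identical direction, 0.0 means unrelated."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hand-made toy "embeddings" (invented for this example): the first
# dimension loosely tracks male contexts, the second female contexts.
vectors = {
    "he":       [1.0, 0.1, 0.2],
    "she":      [0.1, 1.0, 0.2],
    "engineer": [0.9, 0.2, 0.5],
    "nurse":    [0.2, 0.9, 0.5],
}

print(cosine(vectors["engineer"], vectors["he"]))   # higher than with "she"
print(cosine(vectors["engineer"], vectors["she"]))
```

When a translation system must pick a gendered French noun, it effectively asks which gendered context the English word sits closer to, and "engineer" sitting closer to "he" yields "un ingénieur".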
How do we fix it?
Sexism in machine translation reflects a sexist view of society as a whole. One practical and manageable solution would be to give machine learning software access to more reliable and less biased texts. This would enable it not only to "gain knowledge" but also to avoid sexist biases. Another option would be to code in more equality. For instance, Google Translate has begun to address the problem by providing both genders as options in translations. As mentioned in a Google blog post, these gender-equal options only apply to certain languages for the time being, but Google has announced plans to expand them. Creating a sort of coded equality between masculine and feminine words could help get rid of these gender biases. However, even in looking for solutions, more problems are found. There is no one universal fix, as it's not just gender bias that is at stake, but bias against many other groups as well. The ultimate solution would be to work on the many biases still found in today's society.
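The "provide both genders" idea can be sketched very simply. This toy lookup is not how Google Translate works internally; the word list, the `GENDERED_FR` table, and the output format are all invented here just to show the principle of surfacing both forms instead of silently picking one.

```python
# Invented toy table of English nouns with both French gendered forms.
GENDERED_FR = {
    "engineer": ("un ingénieur", "une ingénieure"),
    "nurse": ("un infirmier", "une infirmière"),
}

def translate(word):
    """Return both gendered options when the target language requires a choice."""
    if word in GENDERED_FR:
        masculine, feminine = GENDERED_FR[word]
        return f"{masculine} (masculine) / {feminine} (feminine)"
    return word  # fall through for words with no gendered alternatives

print(translate("nurse"))
print(translate("engineer"))
```

The design choice matters: instead of letting biased co-occurrence statistics resolve the ambiguity, the system hands the decision back to the human who knows the intended meaning.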
So where do John and Jane end up? Ideally, equality would be coded into more programs, including job-searching websites, and they would end up with similar jobs. Unfortunately, that's not terribly realistic right now, and this phenomenon is reflected across a myriad of platforms, including automatic translation. Perhaps the true solution lies much deeper, in creating more equal access and opportunity across the board for all genders.