The Gender Genie

Although it has existed since 2003, BookBlog's Gender Genie was news to me. Based on research by Moshe Koppel of Bar-Ilan University in Israel and Shlomo Argamon of the Illinois Institute of Technology, the Gender Genie implements an algorithm that (sometimes) predicts the sex of a text's author.

I must say, this premise seemed odd to me. I don't tend to assume that any text has some algorithmically determinable masculinity or femininity. But of course, the first thing I tried was pasting in a sample of my own writing (from a piece of fiction). And the results... correct! "The Gender Genie thinks the author of this passage is: male!" With a "Male Score" of 843 and a "Female Score" of 403, my 663-word passage was apparently way-male.

So what's the story here? Here's info from the author of the Gender Genie:

Most of the time, people drop their writing into [The Gender Genie] and, when they don't get the result they expect, declare it to be wrong, wrong, wrong. Yet, a lot of its users still find it and its analysis to be a fun time waster. Despite having written the program, I didn't come up with the algorithm and believe that the Genie works no better than the flip of a coin. However, I don't think it to be a complete time waster since there actually is some academic study that went into it.

In the most basic terms, the computational linguists behind the algorithm, Koppel and Argamon, took a bunch of fiction and looked for trends based on gender. Using complicated formulas, they determined that male writers tended to write more about specific things like an apple, a book, or the car. In contrast, female writers wrote about connections to things like my apple, your book, or our car. The nouns themselves (apple, book, car) didn't matter much but the preceding qualifier, whether an article (a, an, the) or possessive (my, your, our), did.
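To make that idea concrete, here is a minimal Python sketch of the article-versus-possessive comparison described above. It is not the Gender Genie's actual code: the word lists contain only the six examples quoted here, every word counts equally, and the function names `qualifier_counts` and `guess_author` are made up for illustration. The real tool's keyword list and weights are not given in this post.

```python
import re

# Assumption: only the qualifiers named in the quote, all weighted equally.
# The real Gender Genie's word list and weights are not shown in this post.
ARTICLES = {"a", "an", "the"}
POSSESSIVES = {"my", "your", "our"}


def qualifier_counts(text):
    """Return (article_count, possessive_count) for the given text."""
    # Lowercase the text and pull out runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    article_count = sum(1 for word in words if word in ARTICLES)
    possessive_count = sum(1 for word in words if word in POSSESSIVES)
    return article_count, possessive_count


def guess_author(text):
    """Guess 'male', 'female', or 'undetermined' from the two counts."""
    articles, possessives = qualifier_counts(text)
    if articles > possessives:
        return "male"
    if possessives > articles:
        return "female"
    return "undetermined"


if __name__ == "__main__":
    for sample in ("an apple, a book, or the car",
                   "my apple, your book, or our car"):
        print(sample, "->", qualifier_counts(sample), guess_author(sample))
```

A toy like this ignores the nouns entirely, which matches the point in the quote: only the qualifier in front of the noun moves the score.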

Read more about the Gender Genie or try the Gender Genie yourself.

June 24, 2007 - 11:05pm