Guessing gender from browser history
by Nat Torkington | @gnat | comments: 20
I just found a clever trick for guessing gender from browser history. I tried it and then realized that I'm a crappy test for the system: yes, likelihood of my being male is 99%. But if I read a hardcore geek tech blog, then that's probably the case anyway. I could emulate that behaviour with a simple return(G_MALE) in the code.
I pushed the link to a few women for some more strenuous testing. Penny Leach was told she's 52% likely to be female, and Laurel at O'Reilly was told she's 50% likely to be female. Perhaps on the internet, everyone surfs like a MALE with probability 50%. How'd the test work for you? Let me know in the comments ....
tags: biology, just fun, web 2.0
Comments: 20
I am male but I showed as female on the test due to an interest in social networking and books/movies:
Likelihood of you being FEMALE is 57%
Likelihood of you being MALE is 43%
Site Male-Female Ratio
amazon.com 0.9
blogger.com 1.06
flickr.com 1.15
netflix.com 0.79
typepad.com 0.94
linkedin.com 0.94
zoominfo.com 0.83
godaddy.com 1.17
I guess this is a serious thing which will become more and more important in online biz.
Interesting -- it guessed me as 98% female, on my home computer, which has a lot of non-work-related stuff (e.g. shopping for clothes and pharmacy items) on it. I'm wondering if my work computer would test more male?
I also note that it only uses a small proportion of my actual browser history, and doesn't note frequency of visit. It only listed about 25 sites, one of which is a friend's university which I clicked through to from her blog just the one time, but ignored e.g. radar.oreilly.com, which I visit pretty often, or twitter.com, or freebase.com (where I work). The ones listed by the tool seem to be mainstream consumer brands (amazon, youtube, walgreens.com, eddiebauer.com, priceline.com) but that doesn't form most of my web behaviour. Then again, maybe what mainstream consumer sites you visit is all it needs to guess your gender.
Likelihood of you being FEMALE is 82%
Likelihood of you being MALE is 18%
I am a male. Way off the mark.
>>I guess this is a serious thing which will become more and more important in online biz.
I can think of only a handful of products that are female specific (e.g. pads). How important is it really that you know the gender of a person on the Internet in order to successfully market to them? For example, does it really matter if you are selling sewing machines (or dresses or catalytic converters ...) specifically to a man or woman? I would think the more important thing is finding individuals (be they men or women) who are interested in what you're selling and convincing them to buy.
If this trick helps you do that, great, but I don't see how gender designations are helpful.
The first time I tried it (a few days ago) I got 66% female, 34% male.
Today it was a little closer... 51% female, 49% male.
(In case you can't guess from my name, I'm male.)
Perhaps this says more about what content there is online... Certain types of content have near ubiquity on the web - and that may in turn imply something about the gender of the publishers...
If certain sites/content are hard to avoid - or perhaps classified/weighted toward one gender or another more strongly than maybe they should be, then it will likely be hard to achieve a "female" score. Just a thought.
Of course, this trick only works if there is a balance of sites that have an odds ratio of less than 1 to plug into the formula. If the net were thoroughly male dominated, the range of scores for males might be in the [50,80] range, whereas the range for females might be in the [30,55] range. Thus a "55%" might actually be a very feminine score, even though it appears neutral.
Bayesian statistics can be neat, but do be careful of one's conclusions.
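For readers curious about what "the formula" might look like, here is a minimal sketch. It assumes the tool starts from even 50/50 odds and simply multiplies in the male-female ratio of every detected site (a naive Bayes style update); the post does not confirm that this is what the tool does, and the function name male_probability is made up for illustration. The ratios are the ones quoted in the first comment above.

```python
import math

def male_probability(ratios, prior_male=0.5):
    """Combine per-site male/female ratios into P(male).

    Assumption (not confirmed by the post): each visited site multiplies
    the running male:female odds by that site's ratio, starting from the prior.
    """
    log_odds = math.log(prior_male / (1.0 - prior_male))
    for ratio in ratios.values():
        log_odds += math.log(ratio)  # adding logs == multiplying ratios
    odds = math.exp(log_odds)
    return odds / (1.0 + odds)

# Ratios as listed by the first commenter.
ratios = {
    "amazon.com": 0.9,
    "blogger.com": 1.06,
    "flickr.com": 1.15,
    "netflix.com": 0.79,
    "typepad.com": 0.94,
    "linkedin.com": 0.94,
    "zoominfo.com": 0.83,
    "godaddy.com": 1.17,
}

p_male = male_probability(ratios)
print(f"Likelihood of you being FEMALE is {1 - p_male:.0%}")
print(f"Likelihood of you being MALE is {p_male:.0%}")
```

Under these assumptions the eight ratios multiply out to odds of roughly 0.74 to 1, which prints 57% female and 43% male, the same figures that commenter reported. That agreement is suggestive but not proof of how the tool works. It also illustrates the caveat above: the output depends entirely on which sites happen to be in the list and on the assumed 50/50 prior.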
It figured out I'm female:
Likelihood of you being FEMALE is 89%
Likelihood of you being MALE is 11%
I recently booked hotels and flights for our next vacation and it looks like the travel sites weigh heavily female.
I got a 'perfect' score...
Likelihood of you being FEMALE is 0%
Likelihood of you being MALE is 100%
The most lopsided sites were
thepiratebay.org 2.13
gizmodo.com 2.08
alleyinsider.com 1.94
Which is more a reflection of how male dominated the tech world is.
Likelihood of you being FEMALE is 0%
Likelihood of you being MALE is 100%
Ha! I knew there was no WOMAN in me!!
Don't let the frivolity of the example exploit distract you from the real problem. Javascript used to allow direct access to the browser history, but later versions stopped it to prevent malicious scripts from taking advantage of it. In this new form you still can't directly access the history, but you can take (educated) guesses about which sites are in it and determine whether each guess is correct.
This really just shows that a 10+ year old bug was never really fixed.
95% male for me (In reality, I am a guy). I've hit a lot of sports sites in the last week (because of the recent baseball trading deadline), which I'm guessing could have skewed things.
I wonder what would have happened if I had taken the test about 2 weeks ago, when I was busy scouring the web for gardening advice?
There's a 58% chance of me being female, because I visited lots of book sites (as I do almost every day) and a few department store sites.
One of the method's weaknesses is that it gives equal weight to a site where you spent 45 seconds and a site where you spent 3 hours. I searched a couple department store sites briefly for a particular item, and did my standard daily perusal of book sites. But, in that same time period, I spent my typical long daily hours following my Red Sox on mlb.com, and getting my daily political news fix at realclearpolitics.com.
In other words, the vast majority of my online time was spent at MLB.com and RealClearPolitics.com, yet these are given the same weight as the 75 seconds I spent at target.com.
So, as this stands, we expect some silly results from this, and the results will change substantially daily. But if the method could be time-weighted, it would indeed be interesting!
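One way the time-weighting idea could be sketched is below. This is only an illustration: it assumes the same multiply-the-ratios scoring as a baseline, and scales each site's contribution by its share of total browsing time. The function name, the ratios, and the durations are all invented for the example; none of them come from the tool or from my actual history.

```python
import math

def time_weighted_male_probability(visits, prior_male=0.5):
    """visits maps site -> (male_female_ratio, seconds_spent).

    Each site's log-ratio is scaled by n * seconds / total_seconds, so the
    weights average 1.0 and an even split of time gives the unweighted score.
    """
    total_seconds = sum(seconds for _, seconds in visits.values())
    if not visits or total_seconds <= 0:
        return prior_male  # nothing to go on, fall back to the prior
    n = len(visits)
    log_odds = math.log(prior_male / (1.0 - prior_male))
    for ratio, seconds in visits.values():
        weight = n * seconds / total_seconds
        log_odds += weight * math.log(ratio)
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical ratios and durations, made up purely for illustration.
visits = {
    "mlb.com": (1.6, 3 * 60 * 60),
    "realclearpolitics.com": (1.4, 2 * 60 * 60),
    "target.com": (0.7, 75),
    "example-bookstore.com": (0.8, 10 * 60),
}

print(f"time-weighted P(male) = {time_weighted_male_probability(visits):.2f}")
```

With weights like these, 75 seconds at a department store site barely moves the result, while hours at a sports site dominate it. Whether time spent is actually a better signal than a simple visited/not-visited flag is an open question; the sketch only shows that the change to the arithmetic is small.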
Last I checked I was female, but this test said:
Likelihood of you being FEMALE is 2%
Likelihood of you being MALE is 98%
The site I visited with the highest result was io9.com (score of 2.33).
Apparently, you must be a guy if you (like me) are interested in science fiction, tech, and politics.
I'm female; totally unsurprised by these results, though, as most of my browsing time is spent on "male-centric themes" like technology, economics, and on Wikipedia in general.
Likelihood of you being FEMALE is 35%
Likelihood of you being MALE is 65%
It gave me a 50-50 split. I'm a grandmother of 3 in real life. I have my browser set to empty history when it is closed, so the sample size is pretty small.
Likelihood of you being FEMALE is 11%
Likelihood of you being MALE is 89%
The likely culprits for my feminine side:
missingmoney.com
nature.org
burlingtoncoatfactory.com
coupons.com
myhotcomments.com
simplyhired.com
yellowpages.com
whitepages.com
I thought the low ratio for tinyurl.com was interesting. I guess anything involved with shortening anything would be anti-male.
Julia Soergel [08.01.08 06:49 AM]
Now I'm really starting to get worried. 80% likely to be male – maybe I should try to kick you off my feed reader and get another job? Guerrilla knitting might be a perfect choice.