Real estate search patterns and AOL users
Galen on August 7, 2006
Yesterday AOL proudly announced the release of 20 million web queries from 650,000 users (screenshot), with each user “anonymized,” but identified by a unique ID. This is appalling – it means that potentially thousands of social security numbers and email addresses are now free for spammers and thieves to harvest, along with a lot of other personally identifying information. Think about what you search for – email addresses, people’s addresses, business secrets and even social security numbers come to mind. AOL quickly realized their mistake and pulled the plug, but not before the dataset had taken on a life of its own.
So, spammers and thieves are having a field day, but now that it’s out, we might as well use it for educational purposes. It’s a big, unwieldy file, but I’ll try to post some real estate search patterns by tomorrow. If you’re hoping to do your own analysis on this dataset, I wager that there will be a nice web interface for you to use within a week (Consumerist thinks so too). I’ll let you know when it pops up.
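If you'd rather not wait for a web interface, here is a minimal sketch of the kind of scan I have in mind. It assumes the release is tab-separated text with a header row and the columns AnonID, Query, QueryTime, ItemRank and ClickURL; check that against your copy before trusting the numbers. The function name `count_phrase` and the sample rows are made up for illustration and are not taken from the real data.

```python
import csv
import io
from collections import Counter

def count_phrase(lines, phrase):
    """Count queries containing `phrase` and tally the URLs clicked for them.

    Assumes tab-separated rows: AnonID, Query, QueryTime, ItemRank, ClickURL,
    with a single header row (an assumption about the file layout).
    """
    phrase = phrase.lower()
    matches = 0
    clicks = Counter()
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    next(reader, None)  # skip the header row
    for row in reader:
        if len(row) < 2:
            continue  # skip blank or malformed lines
        if phrase in row[1].lower():
            matches += 1
            # ClickURL is empty when the user did not click a result
            if len(row) >= 5 and row[4]:
                clicks[row[4]] += 1
    return matches, clicks

# Hypothetical sample rows, not real AOL data.
SAMPLE = (
    "AnonID\tQuery\tQueryTime\tItemRank\tClickURL\n"
    "1\tseattle real estate\t2006-03-01 10:00:00\t1\thttp://example.com\n"
    "2\tweather\t2006-03-01 10:01:00\t\t\n"
    "1\tSeattle Real Estate agents\t2006-03-02 09:00:00\t3\thttp://example.org\n"
)

matches, clicks = count_phrase(io.StringIO(SAMPLE), "seattle real estate")
print(matches, clicks.most_common())
```

To run it on the real thing, pass an open file handle instead of the `io.StringIO` sample; the function reads one line at a time, so the size of the file should not be a problem.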
More on the ramifications of the release at TechCrunch. If you’re going to cancel your AOL account, good luck.
8 Responses to “Real estate search patterns and AOL users”
I saw that on Jim’s site early this morning and my jaw dropped. It is definitely shocking that they allowed this to be released. The data has the potential to be completely disruptive to thousands of AOL users. I’m sure glad I never took part in their service.
[...] This week it is the Property Grunt’s turn to host the Real Estate Carnival. Take a moment to check out some of the posts within their blog. I see that the guys from vopenhouse.com have started a blog and have been included in this week’s carnival. On a side note, it is interesting to see what AOL did this week with the release of some data that many people think should not have been released. Have a look at this post at TechCrunch.com – the comments have exploded there (195 so far). Now it seems that AOL is apologizing for it. Interested in cancelling…. saw this link on raincityguide.com. [...]
[...] 20 million reasons to cancel AOL August 7, 2006 As promised earlier, I did some scans through the massive privacy invasion from AOL for some real estate search insight. I’ll leave it to other sites to tell you about the disgusting things people search for. Not many AOL searchers are looking for “seattle real estate” in those words – in fact only 21 of the 20 million queries contained that text, and those users largely went to the big Google-optimized sites like Seattle Power Search (the number one result) or the Seattle Times (the number 3 result). [...]
Boy am I glad I dumped my AOL stock.
Perhaps AOL was attempting to emulate Microsoft? The MSN Search Team recently gave selected university researchers access to 15 million real-user queries (which were also filtered & anonymized prior to their access of this data).
The big difference is that the researchers are under strict license in using the data, and since MS is providing them with grant money the likelihood of a leak is very slim. AOL appears to have put the data in public domain, so Joe Consumer (or John Criminal Mind) can do what they want with it while MS kept the researchers on a very short leash.
Yes.. try out the AOL search database yourself.. It is just fun to look at some of the search data..
http://data.aolsearchlogs.com/log/random.cgi
Another site where you can search this data is here
http://www.datablunder.com/logitems/query/
Here’s a *quick* site where you can search the AOL data for yourself:
http://www.frogspy.com