I'm torn. I love open data, but I fully expect that someone will (partially) deanonymize this.


I share your concern.

Once data like this is deanonymized, it's out there forever -- there's no going back to fix it like you would a software bug. So you need perfect understanding and provable security at release time to guarantee safety into the indefinite future. That's not an easy constraint to satisfy.


The thing that terrifies me specifically is that there's been work done - I believe it was a branch of the US military studying network traffic patterns - showing that you can reconstruct profiles based on behavior patterns and link them back to the original user with high success rates.


Probably. Hopefully they have also learned from the fallout of the AOL search log case ( https://en.wikipedia.org/wiki/AOL_search_data_leak ). That case was certainly a big mess.


I'm not sure why you're being downvoted. The AOL search log case was a huge mistake on AOL's part, and I'm quite surprised that Yahoo! would take a risk like this. The real risk is never in this data alone but in combining it with other public datasets. I haven't looked in detail at what is in this particular dump, but if there is data that had to be anonymized (as they claim), then you can bet that people are already busy trying to reverse that.


Yep, and this is why stuff like this should have a formal opt-in process.



