Why not just rate limit responses? That ends up costing bot makers about the sam...

TeMPOraL · on Feb 24, 2015

> Imagine one day when we've automated the world and the only reason humans have to do any work is so that robots can prove they're human. This whole thing is ridiculous.

... did you just solve the whole problem of on-going automation? Capitalism is saved! We will all work as CAPTCHA breakers. ;).

chrismcb · on Feb 24, 2015

The article claims bots can solve the most distorted text with 99.8% accuracy. I'm not that accurate. Perhaps someone can write a captcha-breaking Chrome extension, so I don't have to bother.

RubyPinch · on Feb 24, 2015

https://chrome.google.com/webstore/detail/rumola-bypass-capt...

the power of paying other people cents to fill in captchas

dredmorbius · on Feb 24, 2015

An issue that I've run into is that 1) Google registers traffic from Tor proxies as suspicious (with some reason), 2) it puts CAPTCHAs in front of you (which are getting quite difficult to solve), and 3) if you're rate-limiting Tor proxies (around 6,000 - 7,000 worldwide when I checked earlier), you're going to block a lot of legitimate Tor traffic.

Similarly for VPNs and other tools, whose use is fairly likely to increase as people start seeking ways to avoid ubiquitous surveillance.

There are other options, including a few tools that look at how to provide a fair and anonymous reputation system for Tor clients:

http://arxiv.org/pdf/1412.4707v1.pdf

https://gnunet.org/node/1704

logn · on Feb 24, 2015

I think you can rate limit without tying it to IP address. If each page returns a session key only valid for the next page request, then you force bots to wait as long as you want and/or spend money on extra memory for parallel sessions. One problem with this is, e.g., if people come to your site from an indexed link and have no possible session yet. In that case you probably would want to add a delay after some number of requests per IP, so you'd slow Tor users down but only on the first request to your site. If your page is JS or browser dependent in some way, then bots would probably need about 100 MB per thread. All of this is in the ballpark of paying people to solve captchas.

This was a problem before Tor anyhow. You can run a proxy for a few cents a day.

I just don't see how captchas are some awesome solution. In any anti-bot technology, the cost to circumvent it is pennies. It strikes me more as something like DRM which just makes content producers feel good, but really only punishes average people.

edit: sorry I hadn't read your links. Good points and hopefully someone like CloudFlare would make this easy for people to add to their sites.
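
A minimal sketch of the "session key only valid for the next page request" idea, assuming the server keeps the live keys in memory. The class name `ChainedSessionLimiter`, the `min_delay_s` parameter, and the injectable `clock` are all hypothetical names chosen for illustration, not something from a real library:

```python
import secrets
import time


class ChainedSessionLimiter:
    """Hypothetical sketch: each page hands out a single-use key for the next request."""

    def __init__(self, min_delay_s=2.0, clock=time.monotonic):
        self.min_delay_s = min_delay_s  # assumed minimum wait between pages
        self.clock = clock              # injectable so the delay can be tested
        self._live = {}                 # key -> earliest time it may be redeemed

    def issue(self):
        """Create a fresh key that becomes redeemable after min_delay_s."""
        key = secrets.token_urlsafe(16)
        self._live[key] = self.clock() + self.min_delay_s
        return key

    def redeem(self, key):
        """Return the next key if `key` is valid and old enough, else None."""
        ready_at = self._live.get(key)
        if ready_at is None:
            return None  # unknown key, or one that was already used
        if self.clock() < ready_at:
            return None  # too early; the key stays valid for a later retry
        del self._live[key]  # single use: a replayed key is rejected
        return self.issue()
```

Each parallel crawl needs its own chain of keys, which is where the extra per-session cost for bots would come from. A first visit from an indexed link has no key to redeem, so that case would still need the per-IP fallback described above.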

dredmorbius · on Feb 24, 2015

You'd have to limit novel sessions to very low activity rates. That would require some sort of persistence token (not necessarily a cookie), and if provided on an anonymised basis, one that's verifiable but not predictable or traceable to prior cookies. That is what much of the material I linked covers.

Sorting a mechanism for allocating those tokens' seed values is difficult. FAUST requires an unblinded token request initially.

CAPTCHAs had been useful, though always problematic. The goal isn't perfection but costs. Problem is that costs keep falling.

wdr1 · on Feb 24, 2015

Rate limiting alone would still leave open many categories of abuse.

Benferhat · on Feb 24, 2015

I might even incorporate the request rate into a bot detection algo, maybe have it trigger temporary hellbans.
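
A rough sketch of feeding request rate into a temporary ban, assuming a sliding window per client. `RateFlagger`, `client_id`, and the thresholds are hypothetical; how a client is identified (IP, session token, something else) is left open:

```python
import time
from collections import defaultdict, deque


class RateFlagger:
    """Hypothetical sketch: flag clients whose request rate exceeds a threshold."""

    def __init__(self, max_requests=20, window_s=10.0, ban_s=300.0,
                 clock=time.monotonic):
        self.max_requests = max_requests  # assumed limit per window
        self.window_s = window_s          # sliding window length in seconds
        self.ban_s = ban_s                # how long a temporary ban lasts
        self.clock = clock                # injectable for testing
        self._hits = defaultdict(deque)   # client_id -> recent request times
        self._banned_until = {}           # client_id -> time the ban expires

    def allow(self, client_id):
        """Record one request; return False while the client is banned."""
        now = self.clock()
        expires = self._banned_until.get(client_id)
        if expires is not None:
            if now < expires:
                return False
            del self._banned_until[client_id]  # ban has run out
        hits = self._hits[client_id]
        hits.append(now)
        # Drop requests that have fallen out of the window.
        while hits and hits[0] <= now - self.window_s:
            hits.popleft()
        if len(hits) > self.max_requests:
            self._banned_until[client_id] = now + self.ban_s
            hits.clear()
            return False
        return True
```

The caller decides what a `False` means. For a hellban the server would keep responding as if nothing were wrong while quietly discarding or degrading the request.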

rdl · on Feb 24, 2015

Request rate is definitely one thing you can limit, but it's tricky when attackers potentially control large numbers of IP addresses.

There's an annoying triangle here: wanting to preserve privacy (== unlinkability), machine-independence, and "working well for good traffic with limited resources, as well as blocking attackers with substantially more resources". Ideally it is "choose zero", I'd be happy if the state of the art were even at "choose one".

rdl · on Feb 24, 2015

er, I meant choose two, and we're generally at zero or one.