Having written some analytics services myself, I'm curious what sort of performance you're seeing on the ingest side with Tornado/Python. What kind of throughput per server (or system-wide) are you seeing?
When we did the migration over to Storm & Cassandra, we did a one-time ingestion of old events. This was very fast, peaking around 50k events/sec. Additional synthetic testing showed we could easily beat that. It was actually the old system that bottlenecked, and prevented us from turning the dial way up :)
The ingest side isn't too crazy. We're doing between 50 and 100 insert-type requests per second per server (although we run multiple instances of Tornado per server, so it's actually fewer per Tornado process). The stats on the read side are more impressive. :)
We currently are not. We tried them out, but found that the counter value would spontaneously change from time to time. This could have been user error, but we weren't able to resolve it via our own wits or the user group.
We evolved our design to use simple integers and add to them in a way that didn't require atomicity. That has been rock solid.
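For anyone curious what "add without atomicity" can look like: one common way to sidestep Cassandra's counter columns is to write each increment as its own row keyed by a unique event id and sum at read time, so a retried write just overwrites the same row instead of double-counting. This is a rough sketch of that idea in plain Python (a dict standing in for the table; all names here are made up, not their actual schema):

```python
# Idempotent counter sketch: each increment is its own record keyed by a
# unique event id, and the counter value is the sum of those records.
# Replaying a write overwrites the same key, so no atomic
# read-modify-write (and no counter column) is needed.

table = {}  # (counter_name, event_id) -> delta; stands in for a wide row

def record(counter_name, event_id, delta=1):
    # Safe to retry: the same event id always lands on the same key.
    table[(counter_name, event_id)] = delta

def read(counter_name):
    # Sum all deltas for this counter at read time.
    return sum(d for (name, _), d in table.items() if name == counter_name)

record("page_views", "evt-1")
record("page_views", "evt-2")
record("page_views", "evt-1")  # duplicate delivery; no double count
print(read("page_views"))  # -> 2
```

The trade-off is extra work at read time (or a periodic roll-up job), which is easy to live with if, as above, reads are served from precomputed stats anyway.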