Having written some analytics services myself, I'm curious what sort of performance you're seeing on the ingest side with Tornado/Python. What kind of throughput per server (or system-wide) are you seeing?
When we did the migration over to Storm & Cassandra, we did a one-time ingestion of old events. This was very fast, peaking around 50k events/sec. Additional synthetic testing showed we could easily beat that. It was actually the old system that bottlenecked, and prevented us from turning the dial way up :)
The ingest side isn't too crazy. We're doing between 50 and 100 insert-type requests per second per server (although we run multiple instances of Tornado per server, so it's actually fewer per Tornado process). The stats on the read side are more impressive. :)
We currently are not. We tried them out, but found that the counter value would spontaneously change from time to time. This could have been user error, but we weren't able to resolve it via our own wits or the user group.
We evolved our design to use simple integers and add to them in a way that didn't require atomicity. That has been rock solid.
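For anyone curious what "add without atomicity" can look like: one common way to sidestep Cassandra's counter columns is to write each increment as its own row keyed by a unique event id and sum at read time, so a retried write just overwrites the same row instead of double-counting. This is a rough sketch of that idea in plain Python (a dict standing in for the table; all names here are made up, not their actual schema):

```python
# Idempotent counter sketch: each increment is its own record keyed by a
# unique event id, and the counter value is the sum of those records.
# Replaying a write overwrites the same key, so no atomic
# read-modify-write (and no counter column) is needed.

table = {}  # (counter_name, event_id) -> delta; stands in for a wide row

def record(counter_name, event_id, delta=1):
    # Safe to retry: the same event id always lands on the same key.
    table[(counter_name, event_id)] = delta

def read(counter_name):
    # Sum all deltas for this counter at read time.
    return sum(d for (name, _), d in table.items() if name == counter_name)

record("page_views", "evt-1")
record("page_views", "evt-2")
record("page_views", "evt-1")  # duplicate delivery; no double count
print(read("page_views"))  # -> 2
```

The trade-off is extra work at read time (or a periodic roll-up job), which is easy to live with if, as above, reads are served from precomputed stats anyway.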