This is very cool. But why didn't they just go with Redis? Their blog stated that Redis didn't meet the requirement to "be very fast even with millions of entries", but I have a very hard time believing that.
Others have mentioned that the article calls out ignoring traditional in-memory cache daemons because of the additional network time, but with a targeted p50 response time (of their HTTP service fronting all of this), and caches like Redis and memcached being able to respond in hundreds of microseconds... it does feel like they didn't actually run the numbers.
The other natural alternative would simply be to run Redis/memcached and colocate this HTTP service on the same box. Now your "network latency" component is almost entirely negligible, and you've deferred the work of managing memory to applications _designed_ around doing so.
There’s no network latency involved, as it’s an in-memory, embedded database.
Also, max throughput here is ultimately a function of how fast the memory allocator is and of memory bandwidth, which, even with a poor implementation, would be orders of magnitude more than what you can do with standard NICs and the Linux kernel networking stack.
Moreover, as a consequence, far less context switching is involved.
Per the article, Redis requires a network call to the Redis server. They wanted an in-memory cache. Not sure how you would keep the per-machine in-memory caches in sync without propagating changes to all the machines. This looks like a simple TTL cache.
1. Redis has a lot of functionality that a simple cache client doesn't need.
2. Redis's connection model can lead to complications.
3. Redis's to-disk checkpointing in practice uses a lot of memory.
4. Redis's poorly chosen default settings have cost the industry an uncounted but large sum of money.
5. Redis is written in C. That's a bad idea for a networked application.
6. Redis's creator is a person who doesn't deserve our support. He's constantly combative with experts who have given him good advice about how to improve Redis because he has a vision of "simplicity" which translates to "what I already understand."
Redis is a decent choice if you need all of its features. It's got a wide spectrum. But, if you don't need ALL of them, then pick a simpler and better designed system.
Actually, the main thing about simplicity is having an ego small enough that you don't need to show, in your code or design, everything you are capable of understanding when less is already enough. The fact that I understand strong consistency does not mean it is a good design choice for Redis. In fact, those who followed certain advice from experts have since closed the doors of their database businesses. You can't be a hostage to your users.
> Redis is written in C. That's a bad idea for a networked application.
That’s a hefty claim. It is certainly more complicated to write a network application in C vs a higher-level language like Go, but by no means a bad idea in terms of outcome.
A bigger issue may be the QPS that a cache generates, as you tend to check large quantities of values against it. So the network round trip to Redis isn’t ideal, and you may get into a situation where its single-threaded nature becomes a bottleneck if your values are large (though generally Redis is memory- or network-bound, not CPU-bound).
> That’s a hefty claim. It is certainly more complicated to write a network application in C vs a higher-level language like Go, but by no means a bad idea in terms of outcome.
It is dangerous to write network connected applications in C. This is not a hefty claim, it's well understood in the industry and most major tech firms avoid writing new software this way.
> A bigger issue may be the QPS that a cache generates, as you tend to check large quantities of values against it. So the network round trip to Redis isn’t ideal, and you may get into a situation where its single-threaded nature becomes a bottleneck if your values are large (though generally Redis is memory- or network-bound, not CPU-bound).
For most modern deployment models of Redis, I do not think that this is correct. Redis tends to be run locally with API responders, and as such the RTT will be lost in the noise that most web frameworks introduce. Certainly a big RTT is a problem, but that's not a Redis-specific problem (although the persistent TCP connections with potentially sparse usage that it prefers may add knock-on effects in this condition).
So Memcached, Redis, and Postgres are all dangerous to use because they are written in C?
Does your blanket statement apply to C++ and other C derivatives as well? If so, we lose MySQL, Oracle SQL, MS SQL. Basically every database written more than a decade ago.
From what I can tell, the general consensus seems to be that all of those systems are pretty stable and not dangerous because of their language choice.
> Memcached, Redis, and Postgres are all dangerous to use because they are written in C?
Yes, actually. In fact, both Redis and Memcached have been implicated in security issues, and misconfigurations of both have led to widespread DDoS attacks.
As for Postgres, I think most of the industry has finally stopped directly connecting postgres to the public internet. It took many years to get any of these projects as stable as they are, and all have had major security issues in that path.
You should expect those problems if you start a new project in C, with roughly the same lifecycle. Even if you play it in fast forward (say, 25% faster) you're still in for years of major security and stability issues.
This seems ill-considered in 2019.
> Does your blanket statement apply to C++ and other C derivatives as well? If so, we lose MySQL, Oracle SQL, MS SQL. Basically every database written more than a decade ago.
No, although you really need to stick to the libraries to make C++ safe. Bare pointer handling and falling back on C-like semantics is dangerous.
> From what I can tell general consensus seems to be that all of those systems are pretty stable and not dangerous because of their language choice.
I'm not sure how we'd directly measure it. Most folks I know trust Memcached slightly more than Redis because it's smaller, but consider both to be risky and best when not directly connected to the public internet, as is common with MySQL and Postgres.
On this KirinDave is absolutely correct. Those systems are stable in spite of the language choice; anyone seeking to do similar work today would be professionally negligent to choose C.
I don't love the way the previous comment is written, but it's substantive. This is just drama; it distills the worst possible reading from the parent comment and tries to fix that meaning for the rest of the thread, which is exactly the opposite of what the guidelines ask you to do.
Further: the Redis take on display in that comment is pretty mainstream – very much including the statement about Sanfilippo's obstinacy – among systems developers. Even if you're an advocate for Redis, it's good to at least see the brief its detractors bring against it.
I agree that Sanfilippo can be obstinate. "Redis' creator doesn't deserve our support" is a shitty and vindictive conclusion to come to from that obstinance.
If that's ^^ unsubstantive drama to you, and the thing it's responding to isn't, calibrate your sensors because they're off.
I don't really understand why this is so traumatic to say. There are lots of people doing amazing things. Maybe we can look at projects run by people who don't disregard good technical advice for years out of what they themselves have called pride.
I knew my post would be controversial, but I certainly wasn't expecting the main complaint to be that I'm supposed to ignore his and his community's prior transgressions because remembering them is "vindictive."
You disagree with it. That's fine. But people are allowed to say that things are or aren't worthy of support. Ironically, what they shouldn't do is call them "shitty and vindictive", which that comment didn't do.
Actually, the comment is not substantive. Not saying which configuration creates problems, not saying what's wrong with Redis's connection handling, not comparing Redis's fork-based persistence with what the alternative in-memory systems have, and so forth, is exactly the hand-waving the OP accuses others of. Very short, undetailed criticisms of systems are effectively FUD, because in complex systems the devil is in the details.
If it's the only thing you criticize about him, then I guess he is worthy of support. There is nothing wrong with this terminology unless you can prove otherwise.
Weird, I didn't mention it on the first post. I had 6 specific criticisms, 5 of which were technical.
If anything, this is an example of how the Redis community has actors that pop up trying to distract from direct criticism.
I'm being downvoted and flagged despite my post being substantive, on topic, and in direct response to a question. There are valid reasons not to use Redis.