Jonhoo · on Feb 24, 2026

Not at the moment, but it's a good idea for the next iteration of the class!

Jonhoo · on Feb 21, 2024

They're both based on the techniques outlined in the Noria paper (https://www.usenix.org/conference/osdi18/presentation/gjengs...) and my thesis (https://jon.thesquareplanet.com/papers/phd-thesis.pdf), so not terribly surprising they carry some resemblance :p

Jonhoo · on Feb 21, 2024

Just poking my head in to say that I technically never departed ReadySet — what happened was that I co-founded the company, but was so burnt-out when it came to databases after my PhD that I decided to leave the running of the company to others. Then, US visa regulations made it so that I couldn't really be involved _at all_ if I wasn't an actual employee, which meant I truly was "just a founder" with no real involvement in the company's execution if you will. Now that I'm back in Europe, that's changing a bit, and I have regular calls with the CEO and such!

AYBABTME · on Feb 21, 2024

Thanks so much for your work! Visa problems are a bane on the US. I hope that this situation will improve for you and get better in the future.

Jonhoo · on Oct 19, 2023

Oh hey, that's me!

A better link is https://www.youtube.com/watch?v=jf_ddGnum_4 which has chapter marks and has the power outage in the middle spliced away :p

faitswulff · on Oct 19, 2023

Hi Jon, thanks for recording! I'm excited to watch this. Do you happen to know why captions aren't enabled on your videos? Oftentimes the issue is that the video's primary language isn't set. Once this is done, YouTube will probably caption the rest, though I'm not sure if that's true of videos of all lengths.

Jonhoo · on Oct 19, 2023

Hope you enjoy it! I have captions "enabled" (and primary language set) on all my videos, but my experience has been that YouTube is very hit-or-miss with whether it adds auto-caption to longer videos (somewhere around 2h seems to be the limit). Sometimes it appears later, it just takes a while, other times it just never manifests. It's unfortunate, but as far as I can tell there's nothing I can do about it :'(

sarp · on Oct 19, 2023

Hi Jon, challenge author here! First time watching your content, it was fun to see a Rust expert go through the challenge live. Saw the first hour, I noticed that during Bencode parsing, trying to find the most elegant way to implement it slowed you down a bit. (I also have this tendency and I'm sure having so many viewers doesn't help :)) Great progress by the way in 4 hours, hope you get to finish the challenge soon!

Already__Taken · on Oct 19, 2023

Thanks Jon, really enjoyed the process. I always wondered why OCI registries don't BitTorrent the images. Now I understand why they might not have been fond of the approach.

pests · on Oct 20, 2023

Hey! Came across your videos randomly a few months ago and just wanted to say great content. Funny running into you here.

Jonhoo · on Nov 15, 2022

:wave: Author of the paper this work is based on here.

I'm so excited to see dynamic, partially-stateful data-flow for incremental materialized view maintenance becoming more wide-spread! I continue to think it's a _great_ idea, and the speed-ups (and complexity reduction) it can yield are pretty immense, so seeing more folks building on the idea makes me very happy.

The PlanetScale blog post references my original "Noria" OSDI paper (https://pdos.csail.mit.edu/papers/noria:osdi18.pdf), but I'd actually recommend my PhD thesis instead (https://jon.thesquareplanet.com/papers/phd-thesis.pdf), as it goes much deeper about some of the technical challenges and solutions involved. It also has a chapter (Appendix A) that covers how it all works by analogy, which the less-technical among the audience may appreciate :) A recording of my thesis defense on this, which may be more digestible than the thesis itself, is also online at https://www.youtube.com/watch?v=GctxvSPIfr8, as well as a shorter talk from a few years earlier at https://www.youtube.com/watch?v=s19G6n0UjsM. And the Noria research prototype (written in Rust) is on GitHub: https://github.com/mit-pdos/noria.

As others have already mentioned in the comments, I co-founded ReadySet (https://readyset.io/) shortly after graduating specifically to build off of Noria, and they're doing amazing work to provide these kinds of speed-ups for general-purpose relational databases. If you're using one of those, it's worth giving ReadySet a look to get these kinds of speedups there! It's also source-available @ https://github.com/readysettech/readyset if you're curious.

exabrial · on Nov 15, 2022

For ReadySet: is there a deb package available, or something lighter weight than Docker, Kubernetes, etc.? I'd just like to run it as a regular Unix process and start/stop it with systemd.

greg-m · on Nov 15, 2022

yes! shoot me an email - greg@readyset.io - we're in the process of building binaries for more platforms, lmk which you need.

exabrial · on Nov 15, 2022

I mean just the standard x86/Ubuntu 22.04 would be nice. It'd reduce a lot of friction for people trying to evaluate your product!

marzoevam · on Nov 16, 2022

Here's a link to our binaries! https://docs.readyset.io/releases/readyset-core/

exabrial · on Nov 16, 2022

Thanks! We're going to give this a whirl

vlovich123 · on Nov 16, 2022

Hi Greg. Fancy seeing you here :)

brancz · on Nov 15, 2022

I don’t really know either very well, but how does Noria compare to Naiad? Are they comparable at all?

I already had Naiad on my reading list, definitely adding Noria as well! Thank you very much for your work!

Jonhoo · on April 5, 2022

The trick is "partial view materialization" (https://jon.thesquareplanet.com/papers/phd-thesis.pdf). Basically, you only materialize results for commonly-accessed keys, and compute other keys on-demand.
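
To make the idea concrete, here is a minimal sketch, not Noria's or ReadySet's actual implementation. The `PartialView` class, the vote-count workload, and all method names are hypothetical. It keeps results only for keys that have been read, updates those incrementally on writes, and computes any other key on demand from the base data.

```python
from collections import defaultdict


class PartialView:
    """Toy view of per-story vote counts (hypothetical example workload)."""

    def __init__(self):
        self.base = defaultdict(list)  # base table: story_id -> list of votes
        self.cache = {}                # materialized subset: story_id -> count
        self.recomputes = 0            # how often a key was computed on demand

    def write(self, story_id, vote):
        self.base[story_id].append(vote)
        # Incrementally maintain only the keys that are already materialized.
        if story_id in self.cache:
            self.cache[story_id] += vote

    def read(self, story_id):
        if story_id not in self.cache:
            # Miss: compute the result from base data, then keep it around.
            self.recomputes += 1
            self.cache[story_id] = sum(self.base.get(story_id, []))
        return self.cache[story_id]

    def evict(self, story_id):
        # Dropping a key is always safe; the next read recomputes it.
        self.cache.pop(story_id, None)


view = PartialView()
view.write("a", 1)
view.write("a", 1)
view.write("b", 1)
first = view.read("a")    # computed on demand
view.write("a", 1)        # maintained incrementally, no recompute
second = view.read("a")
```

Only story `"a"` ever occupies space in `cache` here; `"b"` stays unmaterialized until someone reads it.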

dkhenry · on April 5, 2022

Is there a way to federate which keys are commonly accessed? Like if I commonly access the entire table, can I direct inbound traffic to different application servers and have them access different caches, so each cache can pull only a subset of the data into the cache and not worry about things like which keys are being written globally?

glittershark · on April 5, 2022

We've thought about that, actually! We have an experimental mode where multiple copies of the same query can be created (actually just multiple copies of the leaf node in the dataflow graph, so intermediate state is reused) with different subsets of keys materialized. The idea is then that these separate readers would be run in different regions, so e.g. the reader in the EU region gets keys for EU users, and the reader in the NA region gets keys for NA users.
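
As a rough illustration of that layout (a sketch under assumptions, not ReadySet's code): `SharedUpstream`, `RegionReader`, and `route` are hypothetical names. One shared upstream stands in for the reused intermediate state, and each region gets its own reader that materializes only the keys routed to it.

```python
class SharedUpstream:
    """Stands in for intermediate dataflow state reused by every reader."""

    def __init__(self, rows):
        self.rows = rows   # user_id -> (region, value); hypothetical schema
        self.lookups = 0   # counts how often a reader had to ask upstream

    def region_of(self, user_id):
        return self.rows[user_id][0]

    def lookup(self, user_id):
        self.lookups += 1
        return self.rows[user_id][1]


class RegionReader:
    """One copy of the leaf node, holding only the keys it has been asked for."""

    def __init__(self, region, upstream):
        self.region = region
        self.upstream = upstream
        self.materialized = {}

    def read(self, user_id):
        if user_id not in self.materialized:
            self.materialized[user_id] = self.upstream.lookup(user_id)
        return self.materialized[user_id]


def route(readers, upstream, user_id):
    # Send each read to the reader for that user's region.
    return readers[upstream.region_of(user_id)].read(user_id)


upstream = SharedUpstream({"alice": ("EU", 10), "bob": ("NA", 20)})
readers = {region: RegionReader(region, upstream) for region in ("EU", "NA")}
alice = route(readers, upstream, "alice")
bob = route(readers, upstream, "bob")
alice_again = route(readers, upstream, "alice")  # served from the EU reader
```

Each reader ends up holding only its own region's keys, while the upstream state exists once.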

Jonhoo · on April 5, 2022

Oh hey, that's my thesis! Happy to answer any questions you may have about it :) There's also the OSDI'18 paper here which may be of interest: https://jon.tsp.io/papers/osdi18-noria.pdf

ignoramous · on April 6, 2022

I haven't read much about Noria other than this readme [0], but would like to know if you are familiar enough to contrast Materialize [1] with it in terms of perf, overhead, approach, and fundamental (design) principles?

[0] https://github.com/mit-pdos/noria

[1] https://github.com/MaterializeInc/materialize

benesch · on April 6, 2022

Back when we announced Materialize we got the same question in reverse! You can read my response from a few years back here: https://hackertimes.com/item?id=22362301

Unfortunately I'm not privy to whatever improvements ReadySet has made in the past two years, so I can't comment on differences between ReadySet and Materialize. Perhaps Jon can, though!

ignoramous · on April 9, 2022

Insightful. Thanks!

adamgordonbell · on April 5, 2022

This is super exciting. Ever since I talked to you about Noria I've been telling people about this concept. I'm excited to see a production ready implementation of it.

BenoitP · on April 5, 2022

Big fan here!

I've been following the space for a while, and I must say it's exciting. To me this is the future of apps where the Truth lives server-side, and everything reacts from there, with partial state evaluation lowering resource consumption to a minimum.

Kafka Streams and Apache Flink seem to be focused on real-time analytics, and I wish they'd get there to stimulate the space.

Are you affiliated with ReadySet?

Jonhoo · on April 5, 2022

I'm pretty excited about it too! I remember when I initially started the research I was amazed that this didn't already exist.

Some context: https://twitter.com/jonhoo/status/1511401461669720068

Basically, I co-founded the company around the time I graduated, but had had my fill of database research after six years of PhD. So I joined AWS to work on Rust while Alana (the CEO) took on leading ReadySet.

educaysean · on April 5, 2022

According to the linked article, Jon appears to be one of the co-founders

Jonhoo · on March 27, 2020

Nope, Dashmap is all xacrimon, and came on the scene long before my port. We've been collaborating on writing a shared benchmarking suite over at https://github.com/jonhoo/bustle/ though. For the time being, it looks like Dashmap outperforms the port of ConcurrentHashMap (called "flurry"), often by a significant amount. It seems to be mainly due to the garbage collection scheme flurry uses, but we're still digging into it (maybe you want to come help?).

In any case, I'm glad you enjoy the videos!

phibz · on March 27, 2020

Ha "straight from the horse's mouth."

I'd love to help out if I can.

Jonhoo · on March 28, 2020

Awesome! Some good places to read up on and join the discussion are https://github.com/jonhoo/bustle/issues/2, https://github.com/jonhoo/flurry/issues/50, and https://github.com/jonhoo/flurry/issues/80. Happy to guide you further there!

Jonhoo · on March 27, 2020

I can't speak to the implementation differences between the two, but I know the author of dashmap is relatively active in responding online, so they may show up shortly to explain. In terms of performance comparisons, we're actually working on building a shared benchmarking tool for all of Rust's concurrent maps that you may find interesting: https://github.com/jonhoo/bustle.

Jonhoo · on March 27, 2020

I actually gave a talk about exactly this a few weeks back that may be relevant: https://youtube.com/watch?v=QAz-maaH0KM