> First is that SQLite is simple. The database is a literal file on disk. The engine is a single executable.
No, it's not a single executable. As far as your app is concerned, there is no database executable and no engine. The "engine" is a library, meant to be embedded into other code; it may be accessed concurrently and from many executables if you build things that way or have different apps open the same database.
Think of SQLite as an intricate file format that your app accesses through an API (i.e., sqlite3.c).
Between rollback-journal mode and WAL, whichever one you pick, you have the following caveats:
- You aren't supposed to use a SQLite DB through a hard or soft link
- By default, in rollback journal mode, it will create a temporary `-journal` file during every write
- If you're doing atomic transactions on multiple DBs, it also creates a super-journal file
- You definitely can't `cp` a database file while it's in use; the copy may be corrupt
- It relies on POSIX locks to cooperate with other SQLite threads and processes; the docs warn that these locks don't work right in many NFS implementations
- In WAL mode it needs a `-wal` and a `-shm` file, and I believe the use of shared memory makes running it over NFS even more of a non-starter
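Those sidecar files are easy to observe for yourself; here's a quick sketch using Python's stdlib sqlite3 module (paths are illustrative):

```python
import os
import sqlite3
import tempfile

# Put the database in a temp directory so the sidecar files are easy to spot.
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")

conn = sqlite3.connect(db_path)
conn.execute("PRAGMA journal_mode=WAL")  # switch from rollback journal to WAL
conn.execute("CREATE TABLE t (x INTEGER)")
conn.execute("INSERT INTO t VALUES (1)")
conn.commit()

# While a connection is open, WAL mode keeps two sidecar files next to the DB.
wal_exists = os.path.exists(db_path + "-wal")
shm_exists = os.path.exists(db_path + "-shm")
print(wal_exists, shm_exists)  # True True
conn.close()
```

(When the last connection closes cleanly, SQLite checkpoints the WAL and removes both files.)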
SQLite is not simple; it's something like a million lines of code. It is, however, about as simple as a DB can be while still implementing SQL and ACID without any network protocol.
I am thinking of writing a toy DB for fun and frankly I am not sure if I want to start with SQLite's locking and pager or just make an on-demand server architecture like `sccache` uses. Then the only lock I'd have to worry about is an exclusive lock on the whole DB file.
Yes. If you write your server in, e.g., Go or Rust, you can have a "single executable deployment," so to speak. But the main limitation is no horizontal scalability. So if you want web scale, you have to use something faster, like /dev/null.
There are ODBC drivers for SQLite. It can make sense when you want to offer SQLite as an option alongside other, more traditional DBMS. If your queries and data are relatively simple, it will probably work just fine.
I've been using SQLite in production for years and am very happy with it. I believe it's sufficient for the vast majority of web use cases.
I have a bit of battle-tested code that gives me a nice key-value interface to SQLite files, perhaps others will find it useful: https://github.com/aaviator42/StorX
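For anyone curious what a key-value layer over SQLite looks like, here is a minimal sketch using Python's stdlib sqlite3 — an illustration only, not the linked library's actual API:

```python
import sqlite3

class KV:
    """A tiny key-value store on top of a SQLite file."""

    def __init__(self, path):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT)"
        )

    def set(self, key, value):
        # UPSERT: insert, or overwrite the value if the key already exists.
        with self.conn:
            self.conn.execute(
                "INSERT INTO kv (k, v) VALUES (?, ?) "
                "ON CONFLICT(k) DO UPDATE SET v = excluded.v",
                (key, value),
            )

    def get(self, key, default=None):
        row = self.conn.execute(
            "SELECT v FROM kv WHERE k = ?", (key,)
        ).fetchone()
        return row[0] if row is not None else default

store = KV(":memory:")
store.set("greeting", "hello")
store.set("greeting", "hello again")  # overwrites the previous value
print(store.get("greeting"))  # hello again
```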
I went through this presentation looking for "how do you do backups" and it glosses over it. But the author blogged about that separately [1].
It seems he uses Litestream with DigitalOcean Spaces for this. Looks like they start at $5 per month for 250 GB [2]. Would that be the best way for a hobbyist to get started?
Fwiw, I've been using hosted LiteFS Cloud[0] (written by the same person who wrote litestream) on fly.io. The pricing starts at $5/mo for the first 10GB (kind of steep) but it's been working out so far.
Maybe I'm crazy but I just have a cron job that does .backup during a time with little activity. My db is only 40GB or so and stored on an NVMe so it finishes quickly.
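For anyone who wants to script the same thing without shelling out to the sqlite3 CLI, Python's stdlib exposes the same online-backup API (paths are illustrative):

```python
import os
import sqlite3
import tempfile

# Source database with some data in it.
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE t (x INTEGER)")
src.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(100)])
src.commit()

# The online backup API copies pages safely even while the source is in use,
# unlike `cp`, which can produce a corrupt copy of a live database.
backup_path = os.path.join(tempfile.mkdtemp(), "backup.db")
dst = sqlite3.connect(backup_path)
with dst:
    src.backup(dst)

count = dst.execute("SELECT COUNT(*) FROM t").fetchone()[0]
print(count)  # 100
```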
We recently moved some of our cron jobs that needed a database backend to GitHub Actions + Cloudflare D1 which is SQLite in the Cloud and couldn’t be happier.
TIL D1 is SQLite. Why don't they market it more prominently that way? If their SQL is interoperable with SQLite, it's a huge selling point, both for reducing vendor lock-in and for compatibility with existing frameworks and such.
That article, along with the ability to have streaming backups with Litestream and not wanting to pay for a separate DB server, inspired me to use SQLite in my SaaS four years ago. I've been enjoying the operational simplicity a lot although there is not a lot of community documentation currently about tuning SQLite for web app loads. My app does about 120m hits a month (mostly cached and not hitting the DB) on a $14/month single-processor DigitalOcean droplet.
Really, at the point you're experiencing "database locked" errors in your production app that uses SQLite as a backend, I would strongly suggest moving to a database backend that was designed with concurrent writes in mind.
Nothing against SQLite in production, it's nice, as long as your workload fits its feature set.
What both articles are saying is that concurrent writes under a certain threshold, or heavy read-only activity, is a perfectly acceptable usage pattern for SQLite in production.
...the problem here is that historically Rails (and others) used deferred transactions, which causes SQLite to fail even under trivial load instead of simply waiting for the DB to free up for the next write, because the people who wrote the drivers didn't understand how to use SQLite.
If you use SQLite correctly, then under heavy load it's slow, not unreliable.
> as long as your workload meets its feature set.
Sure... but to be fair that probably covers a lot of microservices and probably a lot of apps too.
Obviously as you scale it won't, so sure, it's a limited use case... but, heck, I've seen dozens of microservices, each with their 'own database' (i.e. the same RDS instance hosting different databases), that all fail at once when that instance goes down. Whoops.
Better? Worse?
Hm. Sqlite is insanely reliable. I like isolated reliable services.
It's not for everything, but nothing is... I think it's suitable for more use cases than you're giving it credit for.
What's wrong with PRAGMA journal_mode=WAL? It enables concurrent writes, at the expense of disk. Which to my understanding is the same as any other database backend with write ahead logging to enable concurrent writers.
Anecdotally, I've seen the WAL file grow way too large even after all writes have finished and should shrink, but that's manageable.
WAL does not help with "database locked" situations. At some point you will see them even with WAL enabled, and your application frontend code has to deal with the timeout and retry or whatever.
You just need to set busy_timeout > 0, and you'll basically never see database-locked situations. It's just unfortunate that it's not activated by default: https://sqlite.org/forum/info/7e456bf5544ab128
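For example, in Python's stdlib sqlite3 (which also exposes this as the `timeout` argument to connect(), defaulting to 5 seconds):

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # ":memory:" is illustrative; use your DB file

# Wait up to 5000 ms for a lock to clear instead of failing immediately
# with "database is locked".
conn.execute("PRAGMA busy_timeout = 5000")

# The pragma also reads back the current value.
timeout_ms = conn.execute("PRAGMA busy_timeout").fetchone()[0]
print(timeout_ms)  # 5000
```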
Yeah, but that will slow down your app. It may help in the short run, but you're basically just shifting the problem, and it gets even worse as your workload grows.
You should only be waiting if someone else is writing. And yes, disallowing any concurrent writes, as SQLite does, is certainly going to be a bottleneck at some point, but unless you have a particularly write-heavy workload, it will probably get you pretty far.
I'm relatively new to the DB field, and have only really used SQLite; but it seems obvious to me that if you're doing a transaction, particularly a complicated one, then you should expect your transaction to fail occasionally due to concurrent changes, and be executing your transaction in a loop. This should be true for any database.
If you loop-retry all transactions which fail due to transitory effects, then you won't have a problem with "database locked" situations.
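A minimal sketch of that retry loop with Python's stdlib sqlite3 (the function name and backoff values are illustrative):

```python
import sqlite3
import time

def run_with_retry(conn, fn, retries=5, backoff=0.05):
    """Run fn(conn) in a transaction, retrying on transient lock errors."""
    for attempt in range(retries):
        try:
            with conn:  # commits on success, rolls back on exception
                return fn(conn)
        except sqlite3.OperationalError as e:
            # Re-raise anything that isn't a transient lock, or the last attempt.
            if "locked" not in str(e) or attempt == retries - 1:
                raise
            time.sleep(backoff * (attempt + 1))  # simple linear backoff

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")

run_with_retry(conn, lambda c: c.execute("INSERT INTO t VALUES (1)"))
inserted = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
print(inserted)  # 1
```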
Abysmally documented, but this is what I use for golang + sqlite:
In my experience most SQLite writes take less than 1ms.
Do your writes really need to be concurrent if they run that fast?
Hard to get upset about waiting for the current write to complete before you get your turn when we are talking delays measured in thousandths of a second.
If you have more than 1000 writes per second then maybe this is something to worry about. The solution there is probably to run a slightly more powerful server!
Back in the days when we hit this issue (mostly on Windows systems), I created a little stress tool; you'd be surprised how fast you reach the database-locked state.
You're misunderstanding. No-one mentioned transactions, that's not what we're talking about.
We're talking about concurrent writes. We're talking about batching inserts. Because SQLite can't handle a high throughput, so you batch a bunch of inserts together.
If the only way to get performance is to batch inserts, then you've got to write a whole load of manual queue code to queue up X number of inserts to insert them all at once.
Worse still, if your server crashes, bug, etc. you've just lost all those inserts. But you've already responded with 201s! So if you want any sort of guarantee, you've got to write even more code to cache them on disk or redis or something.
You're basically re-implementing features of Postgres, badly, to make up for SQLite's deficiencies.
It really doesn't matter HOW you do it, it's the fact you have to do it at all. It's not simpler, it's more complicated. Installing/using a fully fledged DB is trivial these days.
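For what it's worth, the batching itself is the easy part; the durability bookkeeping around it is where the complexity lives. Grouping queued inserts into one transaction can be sketched like this (Python stdlib sqlite3; names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, payload TEXT)")

# Queue of pending inserts accumulated by the app between flushes.
queue = [(i, f"event-{i}") for i in range(10_000)]

# One transaction for the whole batch: one commit/fsync instead of 10,000.
with conn:
    conn.executemany("INSERT INTO events VALUES (?, ?)", queue)

count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(count)  # 10000
```

Note that until the flush commits, the queued rows exist only in memory, which is exactly the crash-loss window the comment above describes.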
We're also defaulting to using SQLite first for our new Apps with Simplicity, Speed, Cost and flexibility being the motivating factors.
Not needing a managed database also allows us to move away from AWS onto much better value Hetzner servers, much love to Litestream [1] which makes replication to R2/S3 effortless. It's a silly 30-40x cheaper hosting SQLite Web Apps on Hetzner compared to the "recommended" managed DB configurations on AWS/Azure.
I routinely migrate my SQLite databases. I keep the schema in a text file and I write a separate SQL file to insert all data from the old database into a new one, making changes as needed.
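A sketch of that old-to-new copy using `ATTACH` (Python stdlib sqlite3; the schema and paths are illustrative, not the commenter's actual files):

```python
import os
import sqlite3
import tempfile

tmp = tempfile.mkdtemp()
old_path = os.path.join(tmp, "old.db")
new_path = os.path.join(tmp, "new.db")

# Old database with the existing schema and data.
old = sqlite3.connect(old_path)
old.execute("CREATE TABLE users (id INTEGER, name TEXT)")
old.execute("INSERT INTO users VALUES (1, 'ada')")
old.commit()
old.close()

# New database created from the (possibly changed) schema file.
new = sqlite3.connect(new_path)
new.execute("CREATE TABLE users (id INTEGER, name TEXT, active INTEGER DEFAULT 1)")

# ATTACH the old file and copy data across, adapting columns as needed.
new.execute("ATTACH DATABASE ? AS old", (old_path,))
with new:
    new.execute("INSERT INTO users (id, name) SELECT id, name FROM old.users")

row = new.execute("SELECT name, active FROM users").fetchone()
print(row)  # ('ada', 1)
```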
With the proper PRAGMAs set and modern hardware, you can do 400+ writes/sec with 4 uvicorn workers and ~100 clients connected. The Achilles heel is writing lots of large files. That will cause concurrent requests to wait around for the disk to finish writing to the WAL.
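The comment doesn't say which PRAGMAs; a commonly cited starting set looks like this (a sketch with starting-point values, not the commenter's exact configuration):

```python
import os
import sqlite3
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), "app.db")  # path is illustrative
conn = sqlite3.connect(db_path)

conn.execute("PRAGMA journal_mode=WAL")    # readers no longer block the writer
conn.execute("PRAGMA synchronous=NORMAL")  # fewer fsyncs; still safe in WAL mode
conn.execute("PRAGMA busy_timeout=5000")   # wait up to 5 s for locks
conn.execute("PRAGMA cache_size=-64000")   # negative value = KB, i.e. ~64 MB cache

mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
print(mode)  # wal
```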
It also seems to have the same concurrency issues as described in the article. At least from my experience the "database is locked" error appears quite often.
A rather late reply, but in case anyone is reading this... Django is basically designed to have SQLite deadlocks, and there's a trivial fix (which the dev team refused to include) that solves the problem and allows you to run up to moderate loads.
Seems as good a place as any to put this question: has anyone had any issues with Litestream?
I've got a more basic backup solution running currently and don't want to put another moving piece in the way of a production service unless it's extremely solid, but I do like the idea.
I've been running Litestream for a few years now on several machines without any problems[1]. But I also use Borg with the `.backup` command to take backups. No harm in adding depth to your defences.
[1] Had a problem with corruption back in 2021 but that was fixed quickly.
Running a single-node Postgres server isn't very complicated, so my first thought would be what benefits SQLite brings in terms of setup and maintainability that make it so much more attractive than Postgres. SQLite is great for a lot of use cases, but using it as your primary database in production just feels like contrarianism to me, or some kind of psychological urge to use "pure" solutions.
Another one of my rules of thumb is that unless there is massive evidence to the contrary, you should simply use what "most people use" for each specific use case. In my world, when it comes to web applications, that is Postgres.
Might want to proofread it. There are a few paragraphs that are repeating and out of place:
> Why? So, let’s explore that together. Who here is running or has run an application in production with SQLite? Who has experimented with SQLite for an app, but not shipped it to production? There are a couple hands up, but not many. So, let’s turn this question around.
This pretty clearly seems to be a transcript of a conference talk, and redundancy and repetition are quite useful for getting a point across in spoken language.
Maybe it could be more accessible when rewritten as a blog post, but if it’s a useful talk I’d much rather have an official transcript than only a video link and maybe Youtube’s auto-transcript.
So much good content is buried in video form for me (I really don’t like watching technical videos and prefer reading that content).
I think this is paraphrased or exact copy from a conference talk. Not that I love the rhetorical question mode of communication, but it’s more understandable.
Yep, that was an errant duplication. I did proofread, but how is it that one thing always seems to sneak through anyway? Thanks for the catch. Fixing now.
Very frustrating read. The author glosses over the "why not", and never brings up the topic of concurrent writes or how they deal with that in Production.
This presentation is focused on the use-case of vertically scaling a single server and driving everything through that app server, which is running SQLite embedded within your application process.
This is the sweet-spot for SQLite applications, but there have been explorations and advances to running SQLite across a network of app servers. LiteFS (https://fly.io/docs/litefs/), the sibling to Litestream for backups (https://litestream.io), is aimed at precisely this use-case. Similarly, Turso (https://turso.tech) is a new-ish managed database company for running SQLite in a more traditional client-server distribution.
But n can be so big that you never run into it. My company has a read-heavy SQLite-backed service that serves roughly 150,000 daily users with p99.9 at about 5 milliseconds. We did some rough projection and determined we could reach the total addressable market of our product in the US without even approaching having a problem.
How can your company guarantee p99.9 if there is only one instance? Is there any log shipping/duplication etc? Is consistency maintained on one server fault?
p99.9 referring to latency. However, we also do a weekly test of how quickly we recover from a catastrophic crash, which is roughly about 6 minutes (which is the amount of time it takes for the autoscaling group to spin up a new host, Litestream to restore the database from s3, and the server to start up again).
Honestly, 99.9% uptime is pretty generous - we can fit in quite a few catastrophes per year and still have 99.9% uptime. In the 2 years this service has been running, we've had 100% uptime via zero-downtime deployments, anyway.
In terms of monitoring, traces and error logs are shipped to our observability solution, yes.
> Why? Because once your app gets more than _n_ users you will run into scaling problems and then you end up with huge tech debt
Anecdote: the sqlite forum runs on a single SQLite db and has well over 1500 users. Similarly, the sqlite source site is heavily visited by thousands of folks and actively used in write mode by its developers. Its sibling project, the Fossil SCM, also runs entirely from a single SQLite db.