This is basically the exact use case SQLite was designed for; PostgreSQL is a marvel and, at the end of the day, a much more robust RDBMS, but it's never going to beat SQLite at the thing SQLite was designed for.
I am 99% sure SQLite is going to win unless you actually care about data durability across power loss. Even if you do, I feel I could defeat Postgres on equal terms if you permit me access to certain ring-buffer-style, micro-batching, inter-thread communication primitives.
SQLite is not great at dealing with a gigantic wall of concurrent requests out of the box, but a little bit of innovation in front of SQLite can solve this problem quite well. The key is to resolve the write contention outside the lock that is baked into the SQLite connection. Writing batches to SQLite on a single connection with WAL turned on and synchronous set to NORMAL is pretty much operating at the line speed of your I/O subsystem.
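A minimal sketch of that pattern, assuming Python's stdlib `sqlite3` and a made-up `events` table: worker threads enqueue rows, and a single writer thread owns the connection and turns whatever has accumulated on the queue into one transaction (and one WAL sync) per batch.

```python
import queue
import sqlite3
import threading

write_q: queue.Queue = queue.Queue()

def writer(db_path: str, batch_max: int = 256) -> None:
    # The only connection that ever writes; workers just enqueue rows.
    conn = sqlite3.connect(db_path)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("PRAGMA synchronous=NORMAL")
    conn.execute("CREATE TABLE IF NOT EXISTS events (payload TEXT)")
    done = False
    while not done:
        batch = [write_q.get()]            # block until at least one item
        while len(batch) < batch_max:      # then greedily drain the queue
            try:
                batch.append(write_q.get_nowait())
            except queue.Empty:
                break
        if None in batch:                  # None is the shutdown sentinel
            batch.remove(None)
            done = True
        if batch:
            with conn:                     # one transaction per batch
                conn.executemany("INSERT INTO events (payload) VALUES (?)", batch)
    conn.close()

t = threading.Thread(target=writer, args=("app.db",))
t.start()
for i in range(10_000):                    # any number of producer threads can do this
    write_q.put((f"event {i}",))
write_q.put(None)
t.join()
```

The write contention never reaches SQLite: the queue absorbs it, and the batch size self-tunes with load (light load means small batches and low latency, heavy load means big batches and high throughput).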
"An SQLite database is highly resistant to corruption. If an application crash, or an operating-system crash, or even a power failure occurs in the middle of a transaction, the partially written transaction should be automatically rolled back the next time the database file is accessed. The recovery process is fully automatic and does not require any action on the part of the user or the application."
"Crash testing seeks to demonstrate that an SQLite database will not go corrupt if the application or operating system crashes or if there is a power failure in the middle of a database update. A separate white-paper titled Atomic Commit in SQLite describes the defensive measure SQLite takes to prevent database corruption following a crash. Crash tests strive to verify that those defensive measures are working correctly.
It is impractical to do crash testing using real power failures, of course, and so crash testing is done in simulation. An alternative Virtual File System is inserted that allows the test harness to simulate the state of the database file following a crash."
In WAL mode with `PRAGMA synchronous = NORMAL`, a transaction committed just before a power failure can be rolled back on recovery (the database itself stays uncorrupted), but you can set `PRAGMA synchronous = FULL` to trade speed for durability.
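Concretely, the trade-off is a one-line setting (sketch; `app.db` is a placeholder):

```python
import sqlite3

conn = sqlite3.connect("app.db")
conn.execute("PRAGMA journal_mode=WAL")
# NORMAL: the WAL is synced only at checkpoints. Fast, but a commit made
# just before power loss may be rolled back on recovery (no corruption).
# FULL: the WAL is synced on every commit. Slower, but committed = durable.
conn.execute("PRAGMA synchronous=FULL")
```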
If you can have one "database" thread and 63 "worker" threads, send messages back and forth, and don't hold open transactions, this would probably work with SQLite. I.e., treat SQLite like Redis.
So in your example the database thread is the Redis thread and the worker threads are your HTTP server threads, I assume?
This is a good analogy, but there are still a lot of wire-heavy scenarios where a real database like PostgreSQL or MySQL will have better throughput than Redis.
Redis has the same limitation (only one transaction at a time) and is used a lot for webapps. It solves this by requiring full transactions up front. The ideal case for sqlite for performance is to have only a single process/thread directly interacting with the database and having other process/threads send messages to and from the database process.
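A sketch of that ideal case (made-up message shape, Python stdlib only): workers hand the database thread a complete statement up front, Redis-style, and get the result back on a private reply queue.

```python
import queue
import sqlite3
import threading

requests: queue.Queue = queue.Queue()

def db_thread(db_path: str) -> None:
    # The single thread that directly interacts with the database.
    conn = sqlite3.connect(db_path)
    conn.execute("PRAGMA journal_mode=WAL")
    while (msg := requests.get()) is not None:   # None = shutdown sentinel
        sql, params, reply = msg                 # the full statement arrives up front
        try:
            with conn:                           # each message runs as its own transaction
                reply.put(("ok", conn.execute(sql, params).fetchall()))
        except sqlite3.Error as exc:
            reply.put(("err", exc))
    conn.close()

threading.Thread(target=db_thread, args=("app.db",), daemon=True).start()

# A worker (e.g. an HTTP handler thread) round-trips through the queues:
reply: queue.Queue = queue.Queue()
requests.put(("SELECT 1", (), reply))
print(reply.get())  # ('ok', [(1,)])
```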
As long as you turn it into a throughput race instead of a latency race, PostgreSQL can definitely win. SQLite has a primitive query planner and a limited selection of query execution steps to choose from. For instance, all joins in SQLite are nested loop joins. It can't do hash or merge joins. It can't do GIN or columnstore indexes. If a query needs those things, PostgreSQL can provide them and can beat SQLite.
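You can watch the nested-loop-only planning yourself with `EXPLAIN QUERY PLAN` (toy tables, just for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, x TEXT);
    CREATE TABLE b (id INTEGER PRIMARY KEY, a_id INTEGER, y TEXT);
""")
# SQLite nests one table scan inside the other; at best it searches an
# index on the inner table. Postgres's planner could pick a hash or
# merge join for the same query if the statistics favored it.
for row in conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM a JOIN b ON b.a_id = a.id"
):
    print(row)  # e.g. SCAN b, then SEARCH a USING INTEGER PRIMARY KEY (id=?)
```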
Out of interest, what columnstore indexes are available for Postgres? I'd be happy to find out that I'm missing something.
I know Citus can provide columnar tables, but I can't find columnar indexes for regular row-based tables in their docs (the use case being keeping an OLTP table but wanting to speed up a tiny subset of queries).
Closest thing I could find was Swarm64 for columnar indexes but it doesn't seem to be available anymore.
> just from the fact that there are less layers of code to be worked through
This is not an invariant. I've seen it be true, and I've seen it be false. Sometimes that extra code is just cruft, yes. Other times, though, it is worth it to set up your data (or whatever) to take advantage of mechanical sympathies in hot paths, or to filter the data before the expensive processing step, etc.
I'm not talking about extra code, I'm talking about _layers_ of code. With PostgreSQL you're still sending data over TCP/IP or a UNIX socket, and are copying things around in memory. Compare that to SQLite, which runs in the memory space of the program, so there's no need for copying and socket traffic. There are just fewer middlemen (middlepersons?) with SQLite than are unavoidable with PostgreSQL. So fewer layers = less interpreting/serialization/deserialization/copying/... = higher performance. I'd even argue that even if the SQLite query engine is slightly less efficient than PostgreSQL's, you're still winning because there's less memory copying going on.
> less interpreting/serialization/deserialization/copying/... = higher performance
Unfortunately for many database workloads you are overestimating the relative cost of this factor.
> even if the SQLite query engine is slightly less efficient than PostgreSQL
And this is absurd - the PostgreSQL query engine isn't just "slightly" more efficient. It is tremendously more sophisticated. People using a SQL datastore as a glorified key-value store are not going to notice - and that seems to be a large percentage of the SQLite install base.
It's not really a fair comparison.
With SQLite, though, you could reasonably just skip doing fancy joins and do everything in tiny queries in tight loops, because SQLite is literally embedded in your app's code. For that reason you can be careless with SQLite in ways you cannot with a monolithic database server. I still agree there are use cases where a centralized database is better, but SQLite is a strange beast that needs a special diet to perform best.
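For example, the classic "N+1 queries" shape that is a performance bug against a networked server is often fine in-process (hypothetical `posts`/`users` schema):

```python
import sqlite3

conn = sqlite3.connect("app.db")
# Against PostgreSQL you'd write one JOIN to avoid N network round trips.
# In-process, each query is roughly a function call, so a per-row lookup
# in a tight loop can be a perfectly reasonable way to write this:
for post_id, title, author_id in conn.execute(
    "SELECT id, title, author_id FROM posts"
):
    author = conn.execute(
        "SELECT name FROM users WHERE id = ?", (author_id,)
    ).fetchone()
    print(title, author[0] if author else None)
```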
Sometimes. This is only universally true if your datasets are tiny and your access patterns are simple. Running a shitty, suboptimal O(n^2)-or-worse algorithm locally when something linear or better is possible is going to hurt no matter where the DB is.
> but SQLite is a strange beast that needs a special diet to perform best.
I don’t see what is strange about it - for large datasets it’s the same complexity issues as anywhere else.
Not sure specifically what your comment is trying to add, since I acknowledged the type of use case SQLite excels in - those where round-tripping is a dominating cost and "k-v" stores, i.e. simple queries. My entire point is that those are common but still a niche use case.
Then it depends on the usage. You'd likely need to run with synchronous mode on, and even in WAL mode, multiple separate write transactions are an issue. If you don't have many writes, or can buffer them into a small number of transactions, SQLite is the most performant.
I think the (unsatisfying) answer is "it depends". There's a huge amount of diversity in database workloads, even among the workloads served by SQLite as we mention in the paper.
For read-mostly to read-only OLTP workloads, read latency is the most important factor, so I predict SQLite would have an edge over PostgreSQL due to SQLite's lower complexity and lack of interprocess communication.
For write-heavy OLTP workloads, coordinating concurrent writes becomes important, so I predict PostgreSQL would provide higher throughput than SQLite because PostgreSQL allows more concurrency.
For OLAP workloads, it's less clear. As a client-server database system, PostgreSQL can afford to be more aggressive with memory usage and parallelism. In contrast, SQLite uses memory sparingly and provides minimal intra-query parallelism. If you pressed me to make a prediction, I'd probably say SQLite would generally win for smaller databases. PostgreSQL might be faster for some workloads on larger databases. However, these are just guesses and the only way to be sure is to actually run some benchmarks.
Is it the SQL dialect that's lacking there, or the built-in functions?
I agree that SQLite's default functionality is very thin compared to PostgreSQL - especially with respect to things like date manipulation - but you can extend it with more SQL functions (and table-valued functions) very easily.
I like SQLite (qualifying not for you, simonw, but for others). But I hate that I can't be lazy by using arrays in SQLite... because they don't exist. group_concat is a poor approximation.
Also, I genuinely dislike how loose SQLite is with allowed syntax. It's probably a matter of preference, but even interactively I prefer to know immediately that I messed up a query. SQLite is so forgiving that I've often wasted time trying to understand why my results are nonsense (because I typoed in the query and SQLite didn't fail it).
But I also strongly dislike Python for that reason and I know where you stand there. Maybe SQLite/PostgreSQL is similar to the dynamic/static language preference divide.
I'm increasingly finding myself using SQLite's JSON features for array stuff - they're surprisingly convenient once you get into the habit of using them.
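The basic pattern looks something like this (an illustrative sketch with a hypothetical `tags` column; requires a SQLite build with the JSON functions, which are compiled in by default since 3.38):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE docs (id INTEGER PRIMARY KEY, tags TEXT);  -- JSON array stored as text
    INSERT INTO docs (tags) VALUES ('["sqlite", "json"]'), ('["postgres"]');
""")
# json_each() is a table-valued function that explodes the array into rows,
# so "does the array contain X" becomes an ordinary join:
rows = conn.execute(
    "SELECT docs.id FROM docs, json_each(docs.tags) WHERE json_each.value = ?",
    ("sqlite",),
).fetchall()
print(rows)  # [(1,)]
```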
Adding user-defined functions to SQLite is not difficult, and the mechanism is quite flexible. You can create extensions and load them when you create the SQLite connection to have the functions available in queries. I wrote a blog post explaining how to do that using Rust, and the example is precisely a `regex_extract` function [0].
If you need them, there's also a "stdlib" implemented for Go [1] and a pretty extensive collection of extensions [2].
Probably also worth noting: you don't need to build (many kinds of) extensions as C-compatible code and separate .so files that you load.
SQLite is an in-process database. You can give it a callback func to execute. So your regex-extract can literally just call a function in your code: https://sqlite.org/appfunc.html
edit: Python's stdlib documentation concisely shows how easy this can be: https://docs.python.org/3/library/sqlite3.html#sqlite3.Conne... Basically every SQLite library should have something similar. This extreme ease of extending is a big part of why SQLite has so little built-in.
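For example, a regexp helper like the `regex_extract` mentioned above is only a few lines with `create_function` (a sketch, not the blog post's exact API):

```python
import re
import sqlite3

def regex_extract(pattern: str, text: str):
    # Return the first match of `pattern` in `text`, or NULL (None).
    m = re.search(pattern, text)
    return m.group(0) if m else None

conn = sqlite3.connect(":memory:")
# deterministic=True lets SQLite reuse results within a statement.
conn.create_function("regex_extract", 2, regex_extract, deterministic=True)
print(conn.execute(
    "SELECT regex_extract(?, ?)", (r"\d+", "order 1234 shipped")
).fetchone())  # ('1234',)
```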