How bloom filters made SQLite 10x faster

URL: avi.im

Small aside based on some comments here -

People frequently bring up write concurrency issues in SQLite, with the implied idea that if two physical people are using your app at once they'll get errors. But of course the limit is concurrency at the transaction level, which is vastly different: if your writes take 1ms (in SQLite they might even be faster), you can support 1,000 writes per second. And if your users generate a write every 5 seconds (a very engaged user base; in most apps more users are readers), you can support 5,000 simultaneous physical people before you need to start worrying about scaling, replication, and distributed shenanigans.

If only 5% of those users are writers and the rest are readers / lurkers, you can support 100,000 users simultaneously.

I suspect close to 99% of distributed app architectures are premature optimizations for a scale the company will never reach.

The issue isn’t quite just that. If you are running your app and database on one single server and don’t need to move past that, SQLite is a great solution. But let’s say you have a web server and a worker server or more than one of each. Now you can’t access SQLite from multiple hosts as it doesn’t have a network access model. So your architecture is somewhat limited. You can still utilize it by either putting a network front end on it or setting up an API server in front of it that everything else uses. But at that point you have basically abstracted away SQLite enough that just using Postgres is easier. And with Postgres you get a lot more features that go hand in hand with multiple servers: replication, failover, etc.

I am a huge fan of SQLite but its best use case is a local database for a local application. You can use it for a client-server setup in some special circumstances but in my mind you’d need a compelling reason to do that rather than using the standard solution of something like Postgres.

Exactly. As Dr. Hipp writes on the web site, "SQLite does not compete with client/server databases. SQLite competes with fopen()."

Note that the measurements in the paper were made before they fixed a bug where they confused bits and bytes. So SQLite only used 1/8 of the reserved bloom filter space, thus increasing the false positive rate significantly:

https://sqlite.org/src/info/56d9bb7aa63043f5

I found and reported the bug because I wanted to know how the bloom filters work in SQLite for my uni seminar paper. I'm still wondering whether one could find those kinds of bugs with test cases.

On top of that, I don't think it's fair to say it's 10x faster when the B-tree was tested only on an integer primary key column. Benchmarks with such bold claims should include short string (1-16 chars maybe) and UUID indexes at least.

I do not know if it is still the case, but the last time I looked into the source code, SQLite hashed all strings to the exact same value.

So the bloom filter optimization does not work there.

It had to do with the different ways strings can be compared with collating functions, as strings may be equal even if they have different bytes: https://sqlite.org/forum/forumpost/0846211821

Why do you want a UUID index? Use an integer index and have the UUID in another column.

Because if you want to refer to things by a UUID, now you have two indexes

UUIDs are very wasteful [1]. For most use cases you can replace them with much shorter strings and still have very low chances of collisions [2].

[1] https://henvic.dev/posts/uuid/

[2] https://alex7kom.github.io/nano-nanoid-cc/

Call me crazy, but I'm simply splitting my UUID into its high and low 64 bits and indexing off those.

I.e.:

    CREATE TABLE foo(
        id_ms    UNSIGNED BIG INT NOT NULL,
        id_ls    UNSIGNED BIG INT NOT NULL,
        PRIMARY KEY (id_ms, id_ls)
    ) WITHOUT ROWID;
That works well with UUIDv7 and stores just 128 bits rather than a full string. In most languages it's pretty trivial to turn two longs into a UUID and vice versa.
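
For illustration, a minimal sketch of that round trip (using Python's uuid module; note that SQLite's INTEGER is signed 64-bit, so the unsigned halves may need reinterpreting before storing):

    import uuid

    MASK64 = (1 << 64) - 1

    def uuid_to_longs(u: uuid.UUID) -> tuple[int, int]:
        # Most-significant and least-significant 64 bits of the 128-bit UUID.
        return u.int >> 64, u.int & MASK64

    def longs_to_uuid(ms: int, ls: int) -> uuid.UUID:
        return uuid.UUID(int=((ms & MASK64) << 64) | (ls & MASK64))

    def to_signed64(x: int) -> int:
        # SQLite INTEGER is signed 64-bit; reinterpret values >= 2**63 before storing.
        return x - (1 << 64) if x >= (1 << 63) else x

    u = uuid.uuid4()
    ms, ls = uuid_to_longs(u)
    assert longs_to_uuid(ms, ls) == u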

Sure, at cost of increased complexity of access. Sometimes the waste is worth the simplicity.

It's also a problem in machine learning. Your data might be mangled due to a bug but the NN will still extract something useful out of it. Or, on the flip side, if you make a change to the data and things do break (learning stops converging), you never really know if it's the architecture or the data that's the issue.

How much of a slowdown did you estimate this bug caused?

SQLite only knows nested loop joins, and the bloom filter can just tell us "no need to do the inner lookup, there is definitely no matching entry".

If it has a false positive all the time (the worst case) then the performance is the same as before the bloom filter optimization was implemented (besides the small bloom filter overhead).

As the bloom filter size in SQLite directly depends on the table size, I estimated a false positive rate of 63.2% due to this bug, while it could have been just 11.75%.
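
For reference, the back-of-the-envelope math behind those two numbers (assuming, as the figures above imply, a single hash function and a filter intended to be roughly one byte per row, i.e. 8n bits when correct and only n bits with the bug):

    import math

    # Single-hash bloom filter false positive rate: p ~ 1 - exp(-n/m),
    # with n keys hashed into m bits.
    print(1 - math.exp(-1.0))      # n keys in n bits  (the bits/bytes bug) -> ~0.632
    print(1 - math.exp(-1.0 / 8))  # n keys in 8n bits (the intended size)  -> ~0.1175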

So it would still have performed faster than without the bloom filter optimization, but the false positive rate drastically increased.

I guess the person is asking how much of a slowdown the whole query received.

90%?

The description of nested loop join is confusing; it's mainly a single pass through the outer table with one B-tree probe per row to each inner table.

The linked paper is clearer:

"However, the inner loops in the join are typically accelerated with existing primary key indexes or temporary indexes built on the fly."

"Note that SQLite probes the part table index for every tuple in the lineorder table."

The Bloom filter does not reduce the cardinality of the join; it simply replaces each B-tree probe and filter with a Bloom probe against a pre-filtered key set.

This technique is well-known; the paper cites several examples, and Bloom filter pushdown is common to many commercial systems.

Thanks to its simplicity for development and hosting, SQLite has become our first choice for new apps. We use a number of different ways to work around its single concurrent writer limitation [1]. Whilst we're currently using Litestream for replication, we're evaluating switching to SQLite's native rsync utility [2].

[1] https://servicestack.net/posts/scalable-sqlite

[2] https://www.sqlite.org/rsync.html

Is switching to SQLite really making hosting your web apps less of a headache? Most hosting providers make spinning up your standard client-server RDBMSs (MySQL, Postgres) a breeze.

Dumb question, but how do you use SQLite on Kubernetes? Indeed SQLite doesn't work over NFS (and I guess other remote file shares), so how do you share access to it with other pods?

Without an additional layer you will have to be happy with a single, vertically scaled instance of your application. If you want to resort to horizontal scaling, you can look into something like LiteFS.

Litestream replicates continuously. sqlite3_rsync takes a snapshot. How do you plan to use the latter?

Bloom filters are great; I wish more people knew about them. The most important property of a bloom filter:

It will never have a false negative (and only sometimes a false positive).

We used this to vastly improve render times for comments pages on reddit. We used two tricks. The first was to store the time of your last vote as a first class property on your user object. If you loaded a comments page for a link that was submitted after your last vote, we knew that you couldn't have voted on any of those comments.

But if you had voted afterwards, we had to look up every single comment on the page to see if you had voted on it (we couldn't only do the comments made before your last vote because we didn't know the creation time until after we looked up the comment, and it was faster to just look up the vote).

But with a bloom filter, we could very quickly look up all the comments and get back a list of all the ones you voted on (with a couple of false positives in there). Then we could go to the cache and see if your actual vote was there (and whether it was an upvote or a downvote). Only after a cache miss did we have to actually go to the database.

But that bloom filter saved us from doing sometimes 1000s of cache lookups.
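
For anyone unfamiliar, a toy sketch of the structure in Python (hypothetical names, nothing like reddit's actual code):

    import hashlib

    class BloomFilter:
        # No false negatives; the false positive rate depends on num_bits vs. items added.
        def __init__(self, num_bits: int, num_hashes: int):
            self.num_bits = num_bits
            self.num_hashes = num_hashes
            self.bits = bytearray((num_bits + 7) // 8)

        def _positions(self, item: str):
            for i in range(self.num_hashes):
                digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
                yield int.from_bytes(digest[:8], "big") % self.num_bits

        def add(self, item: str) -> None:
            for p in self._positions(item):
                self.bits[p // 8] |= 1 << (p % 8)

        def might_contain(self, item: str) -> bool:
            return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))

    votes = BloomFilter(num_bits=80_000, num_hashes=4)
    votes.add("user123:comment456")
    if votes.might_contain("user123:comment789"):
        pass  # maybe voted: check the cache, then the database
    # False here means "definitely never voted", so no lookup is needed at all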

What happens if the table is one with a large number of deletes? The Bloom filter's false positive rate will keep increasing as time goes on.

One way to address this is to recalculate it every n deletes, but this sounds similar to the AUTOVACUUM issues in PostgreSQL and might result in unexpected drops in performance.

We do rotating filters for that. Items are added to both the current and the next bloom filter; periodically we stop serving from the current one, start serving from the next, delete the former, and start a fresh next filter.
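
Roughly like this (a sketch building on a BloomFilter like the one above; all names are made up). Deleted items simply stop being re-added, so they age out at the next rotation:

    class RotatingBloom:
        # Two generations: new items go into both, lookups only hit the current one.
        def __init__(self, make_filter, rotate_every: int):
            self.make_filter = make_filter
            self.rotate_every = rotate_every
            self.inserts = 0
            self.current = make_filter()
            self.upcoming = make_filter()

        def add(self, item: str) -> None:
            self.current.add(item)
            self.upcoming.add(item)
            self.inserts += 1
            if self.inserts >= self.rotate_every:
                # Drop the old generation, promote the next, start a fresh one.
                self.current, self.upcoming = self.upcoming, self.make_filter()
                self.inserts = 0

        def might_contain(self, item: str) -> bool:
            return self.current.might_contain(item)

    # votes = RotatingBloom(lambda: BloomFilter(80_000, 4), rotate_every=50_000)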

Another option is a cuckoo filter; it is like a bloom filter but allows deletion.

The article says "at the start of the join operation the bits will be set in the bloom filter".

So maybe this is built for every query? It would be needed anyway if there is a WHERE clause on the joined table.

You might reset the single bit position (which would invalidate all matching items), which is better than recalculating everything or mass invalidation.

My guess though is that many use cases are likely data stores that don't experience deletes at all.

SQLite is getting better and better. I am using it in production for a bunch of websites and have never had a problem.

I've been really impressed with SQLite + LiteFS on Fly.io. It's pretty easy to get distributed reads with single-master writes.

If you never got a problem then how is it getting better?

What if we didn't know we had a problem? And now it's "ooooh it's better now".

It packs in more features and improvements over time. One can read the release notes to see that it gets better.

Perhaps it's getting faster?

It should be fine for read-only data. If you want to write, be aware that only one process can write at a time, and if you forget to set busy_timeout at the start of the connection, it defaults to zero milliseconds and you'll get an error if another process has locked the database for writing while you try to read or write it. Client-server databases tend to handle concurrent writers better.
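
For example, a sketch of setting that from Python's sqlite3 (the same pragma works from any client):

    import sqlite3

    con = sqlite3.connect("app.db", timeout=5.0)  # sqlite3's own busy wait, in seconds
    con.execute("PRAGMA busy_timeout = 5000")     # or set it in SQLite itself, in milliseconds
    # With either in place, a locked database is retried for up to ~5s
    # instead of failing immediately with SQLITE_BUSY.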

I think people overstate this.

Yes, the SQLite concurrency model is a bad choice if you have a high degree of concurrent writes. However, for many applications that simply isn't true. When it comes to websites, I think people significantly overestimate the amount of concurrent writes.

This is a baseline benchmark I published of various p9x latencies for multiple readers/writers with a single SQLite database with different configurations set: https://github.com/mqudsi/sqlite-readers-writers

Even if you don’t use message passing to funnel all updates/writes through one thread, it still performs very solidly for many use cases.

No, it's indeed a very real problem. I ran into it with a very small service.

You want to configure it so it has a timeout. Take turns. That’s how locks work.

The only difference between SQLite and Postgres write locking is the granularity.

It's not a trivial difference like you suggest.

I agree it’s not a trivial difference in implementation, and the concurrent write performance is probably a lot worse.

But it’s talked about as if it were a categorical limitation, as though the app will fail if there is any concurrency. Really it’s just a question of how much time will be spent waiting on locks.

A website with a 16 process pool for handling requests will be fine.

SQLite also polls for lock availability.

Depending on the amount of write throughput you need and whether you care about latency, "concurrent writes" aren't necessarily a problem either. You can just shove them into a queue and then have a single thread pull stuff out of the queue and push it into SQLite. That still scales, up to a point.

FYI this is basically what Redis does (which does not support concurrent writes or reads afaik), and that still scales quite a lot.
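
A minimal sketch of that queue-plus-single-writer pattern (Python; the `events` table and all names are purely illustrative):

    import queue
    import sqlite3
    import threading

    write_queue: queue.Queue = queue.Queue()

    def writer_loop(db_path: str) -> None:
        # The only thread that ever writes; everything else just enqueues work.
        con = sqlite3.connect(db_path)
        con.execute("CREATE TABLE IF NOT EXISTS events (name TEXT)")
        while True:
            sql, params = write_queue.get()
            con.execute(sql, params)
            con.commit()
            write_queue.task_done()

    threading.Thread(target=writer_loop, args=("app.db",), daemon=True).start()

    # Request handlers enqueue writes instead of contending for the write lock:
    write_queue.put(("INSERT INTO events (name) VALUES (?)", ("signup",)))
    write_queue.join()  # optional: wait until the write has been applied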

No, because your readers are still blocked by writers and still randomly fail if you forgot to set busy_timeout, which is something at least one person didn't know they had to do until they read this comment.

> readers are still blocked by writers

We're all using WAL mode these days, right?

SQLite really isn't meant to be used exactly like a hosted solution. I don't know who is advocating for this.

If you are sharing your database between processes or machines, you are probably doing "the fancy new SQLite thing" wrong.

If you need to share information contained in a SQLite database with other processes or machines, you can write application-level code to handle this concern.

Sharing between processes? Absolutely. Machines? No.

For example, LAMP stack applications could swap the M for SQLite (and I think it would have been better historically if they had).

Sharing between processes isn't impossible, but this is where you get all of your performance caveats.

I think it is a bit unfair to argue against SQLite on performance grounds but then not explore ways to utilize instances of it within a single process scope where it is most effective.

Sharing a single SQLite connection across all threads within a process is where you get the best performance outcomes. Everything is serialized within the SQLite provider - assuming you are using a mainstream build. So, if your storage subsystem is fast (NVMe & friends), you will achieve very good outcomes. Any utilization model that requires a file open/close operation every time you want to touch the data is a complete non-starter by comparison. The "unit of work" paradigm that hosted providers recommend is catastrophic for performance with SQLite.

One option for the "I absolutely must share this SQLite instance with 30+ services" scenario is to simply wrap it with a REST API or similar.

I don't see why performance would be significantly different between multiple threads using the same SQLite db and multiple processes on the same machine. Can you explain more what you mean?

See section 5:

https://www.sqlite.org/lockingv3.html

In the single process access model, you can connect exactly once and remain in a reserved lock state the entire time.

> but this is where you get all of your performance caveats.

You mean the one where it locks on write? It’s totally fine; if you wrote any cross-process code yourself, it’s probably going to do similar locking.

[deleted]

So configure busy_timeout, that’s what it’s for.

I thought WAL mode solves this. Am I misunderstanding the docs, or this SQLite running without a write ahead log?

There are advantages and disadvantages to using WAL instead of a rollback journal. Advantages include:

    WAL is significantly faster in most scenarios.
    WAL provides more concurrency as readers do not block writers and a writer does not block readers. Reading and writing can proceed concurrently.
    Disk I/O operations tends to be more sequential using WAL.
    WAL uses many fewer fsync() operations and is thus less vulnerable to problems on systems where the fsync() system call is broken.*
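
Enabling it is a one-line, persistent pragma per database file; for example, from Python's sqlite3 (any client works the same way):

    import sqlite3

    con = sqlite3.connect("app.db")
    # journal_mode=WAL is sticky: once set, the file stays in WAL mode for all
    # future connections until it is explicitly changed back.
    print(con.execute("PRAGMA journal_mode=WAL").fetchone())  # -> ('wal',)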

Then you have to remember to enable WAL mode. And it still has its caveats, but different ones: it's possible that the WAL file can grow without bound under certain conditions.

[flagged]

I bet you're great to work with.

Next should be this -> https://x.com/lemire/status/1869752213402157131

What progress we have made with these. Amazing times.

Maybe not such a great fit for SQLite:

> One of the challenges with binary fuse filters, is that they are immutable once populated, so data cannot be added incrementally, and they consume a significant amount of memory during the populate process

Same restriction with cuckoo filters. Are there any filters better than bloom filters without this restriction?

I think you may be confusing the strict immutability of binary fuse filters with the "degraded" behavior (i.e. needing to resize the hash table once it's "full") of cuckoo filters (https://en.wikipedia.org/wiki/Cuckoo_filter) under high hash table load.

In any event, to address your question, most of the time people truncate to some round number of bytes (8, 16, 32, 64 bits, really) because this is very cheap, but it’s actually barely more expensive to truncate to any number of bits smaller than 64 with masks & shifts. Doing so with a back-to-back array of such truncated b-bit numbers (https://github.com/c-blake/adix/blob/master/adix/sequint.nim) structured as a regular hash table (such as Robin Hood linear probing (https://github.com/c-blake/adix/blob/master/adix/bltab.nim), where a hit or miss will likely only induce a nearly guaranteed single cache line miss up to ~90..95+% utilization) lets you make a filter with many fewer CPU cache misses (~10X fewer, but everything always depends on where you land in the parameter space) than a Bloom filter at a small 2..4X cost in more space.

There are some example numbers and a “calculus analysis” at the bottom of https://github.com/c-blake/adix/blob/master/tests/bl.nim, but it’s all only a few hundred lines of Nim and you could re-do a test in your favorite ProgLang. This table does eventually "fill up" like Cuckoo or any fixed malloc'd arena at which point people usually double/whatever to make more space resulting in a linear amortized cost growing up from zero. I would just call this a b-bit hash existence filter or maybe b-filter or bit-level filter if you want to get brief.

FWIW, I believe this was even understood by the aboriginal paper by Bloom in his 1970 CACM article (https://cacm.acm.org/research/space-time-trade-offs-in-hash-...) based upon his footnote 2, though I think he was referring less to a CPU cache and more to "loading a whole word of memory at a time" like the 36-bit-word IBM mainframes of the day, though these are similar mathematically (just think of a 64B cache line as a 512-bit aligned word). Somehow it got lost in the teaching of Bloom filters.

To speculate on that "somehow", once you have a new dimension (accuracy in space-time-accuracy here), it is easy/natural to only consider "projections", but such oversimplifications can lead one astray. E.g., most people I know optimize for space only to indirectly optimize for time, but most discussion on this topic is about space-accuracy.

You can convert static data structures like these into dynamic ones with about a logarithmic slowdown. So it might still be worthwhile.

(You could also combine a static filter with a dynamic bloom filter in front. A bit like generational garbage collection.)

It's also not clear that a better filter would result in a substantial improvement for SQLite. If the improvement is on the order of tens or hundreds of nanoseconds per query, that's nothing compared to hitting disk.

Edit: I guess you could get really efficient at detecting things not in the database, but as soon as you get a single false or true positive, the process comes to the equivalent of a screeching halt if you need to actually check whether it's there.

[deleted]

Just a thought: just because a general problem is NP-hard doesn't mean that we can't find specific solutions quickly or that a given input is hard to search. If the downstream effect is an order of magnitude less work, it makes sense; it's just a tradeoff.

The one I liked was the Postgres genetic query optimizer.

https://www.postgresql.org/docs/17/geqo-pg-intro.html

The theory being that an exhaustive search of all possible query plans is NP-hard and would take too long, so you do a limited, iterative, best-fit search.

My understanding is it never worked super great and would only be used if your query exceeded some high level of complexity. I distinctly remember it being removed at some point, but I see it mentioned in current docs, so I am probably wrong about that.

Anyway I wonder if, with some of the new advances in machine learning, it would be worth revisiting this approach to optimization.

Well yes, heuristics for query planning are a very well researched field.

I was thinking more about solving NP-hard problems. Modern CPUs are fast; if the benefit is worth it for the downstream task, just do it.

Most instances of most NP hard problems are fast and easy to solve in practice.

E.g. you have to go to quite a bit of effort to construct a knapsack problem that's hard to solve.

What complexity class would the problem of constructing only hard-to-solve knapsack (or other) instances be in?

I like to think it's more than a nit to pick, but does anyone else absolutely despise the way the SQL is written in the example? Join table1, table2, table3, table4... then the "on" logic in the WHERE clause, without explicitly defining which columns belong to which table? Completely unsupportable, and it wastes so much time years from now. Please don't write SQL like that, everyone.

"update ... from ..." still uses that syntax and it throws me for a loop every time. I have to stop and think about it while convincing myself it is doing the same thing as "join on".

https://www.postgresql.org/docs/16/sql-update.html

> SQLite does Nested Loop join

Only that? Never anything better? Really?

EDIT: Really.

Section titled Joins here https://sqlite.org/optoverview.html

states:

"SQLite implements joins as nested loops."

That's quite shocking.

When working with MySQL and Postgres, if a nested loop showed up in EXPLAIN, in almost all cases I knew I had botched my query and/or indexes.

If you mean MySQL's EXPLAIN output "Using join buffer (Block Nested Loop)": it's not slow because a nested loop algorithm is being used, it's slow because of the join buffer part, which is an optimization used when it's not possible to immediately get the right row of the inner table via an index.

As far as I know (I might be wrong, I'm not really familiar with MySQL internals), MySQL (like SQLite) generally uses nested loop joins all the time. EXPLAIN only says something in the join buffer case; when using a simple nested loop join, EXPLAIN does not mention the fact that it is using that algorithm.

Fair enough, most of my memories about building fast, complex queries come from my Postgres times.

Though I remember one instance where, while using MySQL in a web app, it turned out that N+1 queries were faster than doing a JOIN.

[flagged]

SQLite is self-described as not open contribution. So yes, by their own measure they've made it more difficult to mainline features (and intentionally so).

I submitted a bug report on SQLite a year or so back (a simple test case only, not a solution). The folks were super nice, and their patch went into the next release.

Me too, repeatedly. I've asked questions, reported bugs, asked for enhancements, made suggestions, submitted patches for consideration, and was always welcomed. Even when I'm asking for stuff that doesn't necessarily align with their goals.

OTOH, I've requested clarification (just some basic documentation really) on the “open contribution” fork of SQLite… and they never documented their own code.

And I'm sorry, I know sarcasm isn't the way here, and is impolite, but that was exactly the point.

Less than a week ago we had a whole thread where, again, we discussed the impossibility of improving SQLite from the outside because it's not “open contribution.”

Well, this is just a great example of a much larger feature that was developed in collaboration with them.

Even if true, it seems like they're doing a pretty good job on their own.

Open contribution isn’t a good in and of itself.