Once your website reaches a certain size, the JSON will be too big to load. Then you'll have to offload the search request to a server. Either self-hosted, or a service like Algolia.
You can probably push that "certain size" a long way down the road if you need to (at least for this specific client-side text search case, rather than the generic "how do I serve large json files" problem).
If you tweak the search box so it doesn't do anything until you've typed at least 2 or 3 letters, you could then serve pregenerated json files that only contain the matches with that prefix... No need for any of the json payload for words/phrases starting with aa..rt if someone's typed "ru" into the search box.
That means you'd have 676 distinct json files, but you'd only ever load the one you need...
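The sharding step above could be a small build script. A sketch in Python, assuming the full index is a list of records with a "phrase" field (the field names here are made up for illustration):

```python
import json
import os
from collections import defaultdict

def shard_by_prefix(entries):
    """Group index entries by the first two letters of their phrase."""
    shards = defaultdict(list)
    for entry in entries:
        key = entry["phrase"][:2].lower()
        if len(key) == 2 and key.isalpha():
            shards[key].append(entry)
    return dict(shards)

def write_shards(shards, out_dir):
    """Emit one static json file per prefix, e.g. out_dir/ru.json."""
    os.makedirs(out_dir, exist_ok=True)
    for prefix, items in shards.items():
        with open(os.path.join(out_dir, f"{prefix}.json"), "w") as f:
            json.dump(items, f)
```

Rerun it whenever the site content changes and upload the output next to the rest of the static files; the 676 figure is just 26 squared, the worst case if every two-letter prefix actually occurs in your corpus.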
But that requires a network request after typing to get results, which is about the same user experience as a search bar that calls some search API.
Seems to me that often, though not always, this network request would happen while the user is still typing (say, busy typing characters 3, 4 and 5), so the request won't be noticeable to the human.
And if they type more characters, or backspace-delete to fix a typo... no network request required.
Same scenario if they start typing a second word.
I'm guessing that in something like 90% of cases, it'll seem as if there was never a network request at all.
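The behaviour described above falls out naturally if the client caches shards by prefix. Here's a sketch of that logic in Python with the network call stubbed out (in a real site this would be a `fetch()` in the browser; `fetch_shard` is a hypothetical callback):

```python
class PrefixSearch:
    """Fetch a shard once per two-letter prefix, then filter locally."""

    def __init__(self, fetch_shard):
        self.fetch_shard = fetch_shard  # e.g. downloads /shards/<prefix>.json
        self.cache = {}
        self.fetches = 0  # counts actual network requests

    def search(self, query):
        query = query.lower()
        if len(query) < 2:
            return []  # do nothing until 2+ letters are typed
        prefix = query[:2]
        if prefix not in self.cache:
            self.cache[prefix] = self.fetch_shard(prefix)
            self.fetches += 1
        # everything after the first two letters is filtered client-side
        return [e for e in self.cache[prefix] if e["phrase"].startswith(query)]
```

Typing "ru", then "rus", backspacing to "ru", then "rub" all hit the same cached shard, so only the very first keystroke past the threshold costs a request.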
Push the corpus into SQLite; it has built-in FTS engines[^1]. Then serve it with anything. Unfortunately this needs server-side code, but it's like 30 lines of PHP.
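The comment says PHP, but the same idea in Python's stdlib `sqlite3` module is similarly short. A sketch using the FTS5 extension (assuming your SQLite build has FTS5 compiled in, which most modern ones do):

```python
import sqlite3

def build_index(pages):
    """Load (title, body) pairs into an in-memory FTS5 table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE VIRTUAL TABLE docs USING fts5(title, body)")
    db.executemany("INSERT INTO docs (title, body) VALUES (?, ?)", pages)
    return db

def search(db, query):
    """Return matching titles, best match first (FTS5's BM25 ranking)."""
    rows = db.execute(
        "SELECT title FROM docs WHERE docs MATCH ? ORDER BY rank", (query,)
    )
    return [title for (title,) in rows]
```

Wrap `search` in any tiny HTTP handler and you have the whole server side; stemming, prefix queries and ranking come for free from FTS5.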
You can run SQLite in the browser, but it'll have to download the entire DB file instead of only fetching the pages it needs (because the naive JS port can't convert page reads into HTTP range requests).
It should be possible to support loading only the required pages on the browser with SQLite compiled to WASM along with a custom VFS implementation. Here’s a project[1] which does something similar (selectively load the SQLite DB on demand), albeit with the SQLite file in a torrent.
Did anyone try to use IndexedDB with a split-up json approach, instead of one big json load?
I mean a bunch of static json files that are only fetched if they're not already in IndexedDB... like patches?
JSON can be a very bandwidth-inefficient format. A format that can be parsed from a stream could save RAM and bandwidth, especially on mobile, which is the most constrained.
However it compresses very well on the wire.
One of the simplest "streamed JSON" solutions is to put one json document per line of input and parse it incrementally, line by line.
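That line-per-record layout (often called NDJSON or JSON Lines) can be consumed one record at a time without ever holding the whole payload in memory. A minimal sketch:

```python
import json

def iter_ndjson(lines):
    """Yield one parsed record per non-empty line of input."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)
```

It works on any iterable of lines: an open file, or an HTTP response body streamed chunk by chunk, so the client can start showing results before the download finishes.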