SQLite FTS at blog scale: enough rope for real search

3 min read · by admin

## Outline


- Hook: Elasticsearch isn’t the default for every product.
- What FTS5 buys you on SQLite with zero new services.
- Query hygiene: normalization, minimum lengths, LIKE fallback.
- BM25 “good enough” behavior and when rankings feel wrong.
- Exit criteria: fuzzy search, facets, analyzers — then consider an external engine.

## Draft


If you write about Rails long enough, someone will imply your blog “needs Elasticsearch” before you’ve shipped your fifth post. Heavy search stacks can be great. They are also extra services to host, secrets to rotate, reindex jobs to babysit, and bills that arrive while traffic still looks like you, your relatives, and a handful of readers.

For an app already on **SQLite**, **FTS5** is a reasonable first swing: one database file, no separate search daemon, and **BM25** scoring that behaves in predictable ways when you combine titles and bodies. You trade cutting-edge typo tolerance for simplicity you can actually operate.
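As a rough illustration of that "one file, no daemon" setup, here is a minimal sketch using Python's built-in `sqlite3` module, assuming the SQLite build includes FTS5. The table name `posts_fts`, the sample rows, and the column weights are all hypothetical; in a Rails app the same SQL would run through the app's own database connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: an FTS5 table with a title and a body column.
conn.execute("CREATE VIRTUAL TABLE posts_fts USING fts5(title, body)")
conn.executemany(
    "INSERT INTO posts_fts (title, body) VALUES (?, ?)",
    [
        ("Deploying Rails with SQLite", "Notes on backups, WAL mode, and a single database file."),
        ("A week of gardening", "Tomatoes, compost, and one short aside about a SQLite backup script."),
        ("Reading list", "Books about typography and letterpress printing."),
        ("Conference notes", "Talks on accessibility and design systems."),
        ("Bike maintenance", "Chains, cables, and brake pads."),
    ],
)


def search(conn, match_query, limit=10):
    # bm25() returns lower (more negative) values for better matches,
    # so ascending order puts the best match first.
    # The weights 5.0 (title) and 1.0 (body) are illustrative, not tuned.
    return conn.execute(
        """
        SELECT title, bm25(posts_fts, 5.0, 1.0) AS score
        FROM posts_fts
        WHERE posts_fts MATCH ?
        ORDER BY score
        LIMIT ?
        """,
        (match_query, limit),
    ).fetchall()


results = search(conn, "sqlite")
for title, score in results:
    print(f"{score:8.3f}  {title}")
```

With these weights, a hit in the title counts for more than a hit in the body, so the post that is actually about SQLite should outrank the one that mentions it in passing.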

### What you get without spinning up infra


SQLite’s FTS layer gives tokenization, prefix queries, and ranked matches in the deployment shape you already understand. At hundreds or thousands of posts, slow or off-target search is rarely “the index is too weak.” It is usually messy input (no Unicode normalization, no minimum token length) or bad choices about **what text you index** (raw HTML noise, duplicated nav, boilerplate footers).
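One way to handle the input side is to normalize and filter the query before it ever reaches `MATCH`. The sketch below is an assumption about what such a step could look like, not a prescribed recipe: the function name `build_match_query` and both thresholds are made up for illustration.

```python
import re
import unicodedata

MIN_TOKEN_LENGTH = 3  # assumed threshold; tune for your content
MAX_TOKENS = 8        # assumed cap so a pasted paragraph stays cheap


def build_match_query(raw):
    """Turn raw user input into a conservative FTS5 MATCH string, or None."""
    # NFKC folds compatibility forms (full-width letters, ligatures) together.
    text = unicodedata.normalize("NFKC", raw).casefold()
    # Keep runs of letters and digits; everything else acts as a separator,
    # which also discards FTS5 syntax such as quotes, *, and parentheses.
    tokens = re.findall(r"[^\W_]+", text)
    tokens = [t for t in tokens if len(t) >= MIN_TOKEN_LENGTH][:MAX_TOKENS]
    if not tokens:
        return None
    # Quoting each token makes FTS5 read it as a plain string, so words like
    # "and", "or", and "not" cannot be interpreted as query operators.
    return " ".join(f'"{t}"' for t in tokens)
```

Returning `None` for input that has no usable tokens lets the caller decide between an empty state and a fallback, instead of sending a near-empty query to the index.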

A pipeline worth keeping strips cruft nobody searches for, rejects silly-short tokens so bots cannot hammer substring scans, and keeps the FTS mirror aligned when posts publish or update. When FTS cannot run or the query is too short for your rules, return an honest empty state or a bounded **`LIKE`** fallback—not a stack trace.
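The mirror and the fallback can be sketched together. This example assumes a hypothetical `posts` table, uses the external-content table and trigger pattern described in the SQLite FTS5 documentation, and falls back to a bounded `LIKE` on titles when the FTS query fails. The names `posts_ai`, `posts_ad`, `posts_au`, and `search_titles` are illustrative, as is the three-character minimum.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT NOT NULL, body TEXT NOT NULL);
-- External-content FTS5 table: the index mirrors posts without storing a second copy.
CREATE VIRTUAL TABLE posts_fts USING fts5(title, body, content='posts', content_rowid='id');

CREATE TRIGGER posts_ai AFTER INSERT ON posts BEGIN
  INSERT INTO posts_fts(rowid, title, body) VALUES (new.id, new.title, new.body);
END;
CREATE TRIGGER posts_ad AFTER DELETE ON posts BEGIN
  INSERT INTO posts_fts(posts_fts, rowid, title, body)
  VALUES ('delete', old.id, old.title, old.body);
END;
CREATE TRIGGER posts_au AFTER UPDATE ON posts BEGIN
  INSERT INTO posts_fts(posts_fts, rowid, title, body)
  VALUES ('delete', old.id, old.title, old.body);
  INSERT INTO posts_fts(rowid, title, body) VALUES (new.id, new.title, new.body);
END;
""")


def search_titles(conn, match_query, like_term, limit=20):
    """Try FTS first; if it fails, use a bounded LIKE on titles only."""
    if match_query:
        try:
            rows = conn.execute(
                "SELECT title FROM posts_fts WHERE posts_fts MATCH ? "
                "ORDER BY rank LIMIT ?",
                (match_query, limit),
            ).fetchall()
            return [row[0] for row in rows]
        except sqlite3.OperationalError:
            pass  # malformed query or FTS unavailable: use the fallback below
    if len(like_term) < 3:
        return []  # honest empty state instead of an expensive scan
    # Escape LIKE wildcards so user input is matched literally.
    escaped = like_term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    rows = conn.execute(
        "SELECT title FROM posts WHERE title LIKE ? ESCAPE '\\' "
        "ORDER BY id DESC LIMIT ?",
        (f"%{escaped}%", limit),
    ).fetchall()
    return [row[0] for row in rows]
```

Because the triggers fire on every insert, update, and delete, ordinary writes to `posts` keep the index current without a separate reindex job. The `LIMIT` on the fallback is what keeps it bounded.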

### Ranking is statistics, not taste


BM25 weights rare tokens more heavily, gives diminishing credit for repeating the same token, and adjusts for document length. Readers who search **`rails`** plus a distinctive token from an error message often see the right post early on, not because the engine “reads” articles, but because the combination is oddly specific.
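A small experiment makes the rare-token effect visible. The sketch below uses Python's built-in `sqlite3` module with FTS5 and a made-up four-post corpus in which every post mentions `rails` but only one contains the hypothetical token `BusyException`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE posts_fts USING fts5(title, body)")
conn.executemany(
    "INSERT INTO posts_fts (title, body) VALUES (?, ?)",
    [
        ("Upgrading Rails", "Rails upgrade notes for a small blog."),
        ("Rails caching", "Rails fragment caching in practice."),
        ("Fixing BusyException", "Rails raised BusyException when two writers collided."),
        ("Rails testing", "Rails system tests without flakiness."),
    ],
)

# OR keeps every post in the result set so the scores can be compared.
rows = conn.execute(
    "SELECT title, bm25(posts_fts) AS score FROM posts_fts "
    "WHERE posts_fts MATCH ? ORDER BY score",
    ("rails OR busyexception",),
).fetchall()

# Lower (more negative) scores are better matches.
for title, score in rows:
    print(f"{score:10.6f}  {title}")
```

Since `rails` appears in every row, it contributes almost nothing to the score, and the post containing the rare token comes out on top by a wide margin.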

When results look silly, check boring causes before you tune weights. Duplicate titles confuse humans and scores. Very short posts give the ranker almost nothing to work with. And plenty of sites see most traffic from listings and recency anyway—search polish matters less until people actually use the box.
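Two of those boring causes are easy to check with plain SQL before touching any weights. This is a sketch against a hypothetical `posts(id, title, body)` table; the helper names and the 200-character threshold are assumptions, not recommendations.

```python
import sqlite3

MIN_BODY_CHARS = 200  # assumed cutoff for "very short"; adjust to taste


def duplicate_titles(conn):
    """Titles that collide once case and surrounding whitespace are ignored."""
    return conn.execute(
        "SELECT lower(trim(title)) AS normalized, COUNT(*) AS copies "
        "FROM posts "
        "GROUP BY lower(trim(title)) "
        "HAVING COUNT(*) > 1 "
        "ORDER BY copies DESC, normalized"
    ).fetchall()


def short_posts(conn, min_chars=MIN_BODY_CHARS):
    """Posts whose body gives the ranker very little text to score."""
    return conn.execute(
        "SELECT id, title, length(body) AS chars "
        "FROM posts "
        "WHERE length(body) < ? "
        "ORDER BY chars",
        (min_chars,),
    ).fetchall()
```

Running both against a production copy gives a short list of posts to retitle or expand, which is usually cheaper than tuning column weights.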

### Knowing when to graduate


Stay on FTS until you truly need aggressive fuzzy matching, facets across huge catalogs, or per-locale analyzers. At that point, budgeting **Meilisearch**, **OpenSearch**, or hosted search stops being résumé garnish and turns into workload planning.

Until then, search rides along with your Rails stack, with fewer dashboards and fewer 3 a.m. cluster mysteries.
