Postgres Ate the Search Stack

Feb 7, 2026

4 min read

Install two extensions in Postgres and you can replace Elasticsearch and Pinecone in most applications. That sounds too good to be true. I spent time exploring BM25 and pgvector to understand when this actually works and when it doesn't.

The promise: consolidate your infrastructure. The reality: it depends on your scale and requirements.


The Old Stack

The traditional architecture looked like this:

Postgres stores your data. Elasticsearch indexes it for search. Pinecone holds vector embeddings for semantic similarity.

Between them runs a complex ETL pipeline. Change data capture streams updates from Postgres. Background workers sync to Elasticsearch. Separate jobs push embeddings to Pinecone.

This creates problems. Sync delays mean eventual consistency everywhere. Your search results lag behind your database. Your vector similarity doesn't see the latest data. Operational complexity multiplies with each additional service.

You run three databases to serve one application.


BM25 in Postgres

Postgres has built-in full-text search with ts_rank. It works for basic cases but has fundamental limitations.

The ranking algorithm is primitive. It counts term frequencies without considering inverse document frequency. This enables keyword stuffing. A document that repeats "postgres" fifty times ranks higher than one that uses it naturally.

Length bias affects results. Short documents with high term density outrank longer comprehensive documents. Boolean matching dominates over ranked retrieval.
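
For reference, this is the baseline those limitations apply to. A sketch of a built-in full-text query (the documents table and its columns are illustrative):

```sql
-- Built-in full-text search: tsvector matching ranked by ts_rank.
-- ts_rank weights raw term frequency, so repeated keywords inflate scores.
SELECT id, title,
       ts_rank(to_tsvector('english', body),
               plainto_tsquery('english', 'postgres search')) AS rank
FROM documents
WHERE to_tsvector('english', body) @@ plainto_tsquery('english', 'postgres search')
ORDER BY rank DESC
LIMIT 10;
```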

BM25 fixes this. It's a probabilistic ranking function that considers term frequency, inverse document frequency, and document length normalization.

Three extensions bring BM25 to Postgres:

  • pg_textsearch from Tiger Data
  • pg_search from ParadeDB
  • VectorChord-BM25

ParadeDB's pg_search claims 3x faster queries than Elasticsearch in their benchmarks. The architecture advantage is real-time updates with no ETL lag.
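
As a sketch of what this looks like with pg_search (syntax per ParadeDB's docs at the time of writing; the documents table is hypothetical):

```sql
-- Build a BM25 index over selected columns; key_field identifies rows.
CREATE INDEX documents_bm25_idx ON documents
USING bm25 (id, title, body)
WITH (key_field = 'id');

-- Query with the @@@ operator and rank by BM25 score.
SELECT id, title, paradedb.score(id) AS score
FROM documents
WHERE body @@@ 'postgres search'
ORDER BY score DESC
LIMIT 10;
```

Because the index lives inside Postgres, an INSERT or UPDATE is searchable as soon as the transaction commits.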

When you still need Elasticsearch:

  • Billions of documents requiring distributed search
  • Complex aggregations across massive datasets
  • Multi-region deployments with sophisticated replication

When Postgres BM25 works:

  • Most applications with millions to tens of millions of rows
  • Teams that want fewer moving parts
  • Cases where transactional consistency with relational data matters

pgvector: The Vector Database Story

pgvector adds vector similarity search to Postgres. You store embeddings in a column and query them with distance functions.

It supports multiple distance metrics: L2, cosine, inner product. Two indexing strategies optimize searches: HNSW for speed and IVFFlat for memory efficiency.

The killer feature: ACID compliance with your relational data. Run a transaction that updates both your rows and their vector embeddings. No sync delays. No eventual consistency.
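
A minimal pgvector setup, assuming 1536-dimensional embeddings and a hypothetical documents table (`$1`, `$2`, `$3` stand in for query parameters supplied by the application):

```sql
-- Enable the extension and store embeddings alongside relational data.
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE documents (
    id        bigserial PRIMARY KEY,
    body      text NOT NULL,
    embedding vector(1536)   -- dimension must match your embedding model
);

-- Top-5 nearest neighbors by cosine distance; $1 is the query embedding.
SELECT id, body
FROM documents
ORDER BY embedding <=> $1
LIMIT 5;

-- Update a row and its embedding atomically: no sync pipeline, no lag.
BEGIN;
UPDATE documents
SET body = $2, embedding = $3
WHERE id = $1;
COMMIT;
```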

pgvector vs Pinecone:

Supabase published benchmarks comparing pgvector to Pinecone. With proper tuning, pgvector matched Pinecone's query performance. The cost difference was significant: pgvector came in 75-79% cheaper in their setup.

But Pinecone abstracts complexity. You don't tune index parameters or worry about memory. pgvector requires manual optimization.

Production optimization:

The HNSW index needs to stay in memory. For 1 million rows with 1536-dimensional embeddings, expect 8GB+ of index data.

Tune m and ef_construction during index builds. Higher values improve recall but slow down indexing and increase memory.

Set hnsw.ef_search at query time. This controls the recall vs speed tradeoff.
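
Concretely, the knobs look like this (values are illustrative; pgvector's defaults are m = 16, ef_construction = 64, and hnsw.ef_search = 40):

```sql
-- Build-time parameters: higher m / ef_construction improve recall
-- at the cost of build time and index memory.
CREATE INDEX documents_embedding_idx ON documents
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 128);

-- Query-time knob: higher ef_search means better recall, slower queries.
SET hnsw.ef_search = 100;
```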

Monitor with pg_stat_statements to identify slow queries.
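
A simple starting point, assuming pg_stat_statements is in shared_preload_libraries and the extension is created:

```sql
-- Top queries by mean execution time.
SELECT query, calls, mean_exec_time
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;
```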


Hybrid Search

RAG applications in 2025 use hybrid search as the production standard. Not experimental. Standard.

Combine BM25 keyword matching with vector embeddings for semantic similarity. Use Reciprocal Rank Fusion to merge results.

BM25 catches exact terms. Vectors catch meaning. Together they work better than either alone.

In Postgres, this simplifies:

One database. One transaction. No sync delays between keyword and vector indexes.

Your search architecture becomes a few SQL queries instead of a distributed system.
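
One way that query can look, using built-in full-text search for the keyword leg and pgvector for the semantic leg (a sketch; `$1` is the query embedding, and k = 60 is the conventional RRF constant):

```sql
WITH keyword AS (
    -- Keyword leg: rank by ts_rank, keep the top 20.
    SELECT id,
           ROW_NUMBER() OVER (
               ORDER BY ts_rank(to_tsvector('english', body),
                                plainto_tsquery('english', 'postgres search')) DESC
           ) AS rank
    FROM documents
    WHERE to_tsvector('english', body) @@ plainto_tsquery('english', 'postgres search')
    ORDER BY rank
    LIMIT 20
),
semantic AS (
    -- Semantic leg: rank by cosine distance to the query embedding.
    SELECT id,
           ROW_NUMBER() OVER (ORDER BY embedding <=> $1) AS rank
    FROM documents
    ORDER BY embedding <=> $1
    LIMIT 20
)
-- Reciprocal Rank Fusion: 1/(k + rank) summed across both legs.
SELECT COALESCE(k.id, s.id) AS id,
       COALESCE(1.0 / (60 + k.rank), 0.0)
     + COALESCE(1.0 / (60 + s.rank), 0.0) AS rrf_score
FROM keyword k
FULL OUTER JOIN semantic s ON k.id = s.id
ORDER BY rrf_score DESC
LIMIT 10;
```

Swap the keyword CTE for a BM25-backed query once one of the extensions above is installed; the fusion logic stays the same.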


When to Choose What

Use Postgres extensions when:

  • You want fewer moving parts in production
  • Your search corpus fits in tens of millions of rows
  • Transactional consistency between data and search matters
  • You're building RAG applications
  • Team size or ops capacity is limited

Use specialized tools when:

Elasticsearch still wins for billions of documents, multi-region distributed search, and complex aggregations at massive scale.

Pinecone makes sense when you want zero ops overhead and extreme scale without tuning expertise.


What I Learned

Infrastructure consolidation isn't just about cost. It's about operational complexity.

Every additional service adds sync logic, monitoring, incident response, version upgrades, and security patches. The cognitive load multiplies.

The ETL tax is real. Sync delays compound. Eventual consistency creates edge cases. Debugging data flow across three systems takes longer than debugging one.

"Just use Postgres" is increasingly viable for more workloads. Extensions keep expanding what it can handle.

But specialized tools still have their place. Elasticsearch and Pinecone solve specific problems at specific scales better than Postgres ever will.

The trend is clear: Postgres keeps eating adjacent infrastructure. BM25 and pgvector are just the latest examples.


Gopal Khadka