My ERP projects need search — search across customers, invoices, products, and employees. My instinct was to reach for Elasticsearch. Then I ran benchmarks: for datasets under 5 million rows, PostgreSQL full-text search with a GIN index on tsvector matched Elasticsearch's latency for most query types and was dramatically simpler to operate. No additional infrastructure, no synchronization lag, no Elasticsearch cluster to manage. I've been shipping PostgreSQL FTS in production ERP systems for two years and it handles everything short of fuzzy search and complex relevance ranking.
PostgreSQL's FTS system converts text into a `tsvector` — a sorted list of lexemes (normalized word stems) with positional information. A query is converted to a `tsquery` — a boolean expression of lexemes. The `@@` operator matches a tsvector against a tsquery. PostgreSQL ships with dictionaries for 20+ languages that handle stemming (running → run), stop words (the, a, is), and accents. For Indonesian content in my ERP projects, I use the `english` dictionary as a fallback (Indonesian FTS dictionaries exist but require custom configuration).
The production pattern is to add a generated `tsvector` column to your table and build a GIN index on it. The generated column is maintained automatically by PostgreSQL — no triggers or external sync needed. GIN (Generalized Inverted Index) is the index type designed for tsvectors: it stores a mapping from each lexeme to the rows containing it, making full-text queries O(log N) rather than O(N). The index is rebuilt incrementally on inserts and updates, not from scratch. For a table with 1 million rows, a GIN index on tsvector enables sub-10ms full-text queries.
PostgreSQL's `ts_rank()` function scores each matching row by relevance — how many times the query terms appear, their positions in the document, and their proximity. `ts_rank_cd()` (cover density) additionally considers how close the matching terms are to each other. The limitation: ts_rank requires reading the tsvector of every matching row, which is expensive at scale. For 1 million matches on a broad query, ranking can take seconds. The optimization: add WHERE filters to reduce the match set before ranking, or limit to the first N matches and rank only those.
PostgreSQL FTS Pipeline:
─────────────────────────────────────────────────────────
Text input: "Running PostgreSQL queries efficiently"
│
▼ to_tsvector('english', text)
tsvector: 'effici':4 'postgresql':2 'queri':3 'run':1
(stemmed, stop words removed, positions recorded)
│
▼ stored in generated column + GIN index
GIN Index: lexeme → [row_ids]
'postgresql' → [1, 5, 23, 44]
'queri' → [1, 3, 5, 12]
'run' → [1, 8, 19]
│
Query: plainto_tsquery('english', 'postgresql queries')
→ 'postgresql' & 'queri' (AND logic)
│
▼ GIN index intersection
Matching rows: [1, 5] → O(log N), not O(N)
│
▼ ts_rank() for ordering
Ranked results with relevance score
Performance benchmark (1M rows, GIN index):
Simple phrase search: ~8ms
Boolean AND (2 terms): ~12ms
Autocomplete (prefix :*): ~5ms
Broad match (many results): can be slow → use LIMIT earlyFrom my experience with ERP search: combine full-text search with trigram similarity (pg_trgm extension) for typo-tolerance. tsvector handles stemmed exact word matches well, but fails on misspellings like 'recieve' instead of 'receive'. Add a `gin_trgm_ops` index on the raw text column alongside the tsvector GIN index and use `similarity()` or `word_similarity()` as a secondary ranking signal. A query like: `WHERE search_vector @@ plainto_tsquery('english', $1) OR similarity(name, $1) > 0.3 ORDER BY ts_rank(...) DESC` handles both exact and fuzzy matches.
Here's the complete production pattern I use in NestJS + PostgreSQL ERP systems. The pattern includes: a generated tsvector column combining multiple text fields (with different weights using setweight), a GIN index on the tsvector, a search function using plainto_tsquery (which handles multi-word queries without requiring users to know tsquery syntax), ts_rank for ordering, and ts_headline for generating search result snippets with highlighted matches.
In an ERP context, a customer name match should rank higher than a match in the notes field. PostgreSQL's `setweight()` function assigns weights A-D to tsvectors before concatenating them. Weight A is the highest — a match on an A-weighted field scores higher than a match on a B-weighted field. I use: A for name/title/code, B for description, C for tags/categories, D for notes/remarks. This mimics the basic TF-IDF weighting that Elasticsearch uses, without the complexity of a dedicated search cluster.
-- Production FTS setup for ERP (customers table)
-- 1. Add generated tsvector column with weighted fields
ALTER TABLE customers ADD COLUMN search_vector tsvector
GENERATED ALWAYS AS (
setweight(to_tsvector('english', coalesce(name, '')), 'A') ||
setweight(to_tsvector('english', coalesce(code, '')), 'A') ||
setweight(to_tsvector('english', coalesce(email, '')), 'B') ||
setweight(to_tsvector('english', coalesce(notes, '')), 'D')
) STORED;
-- 2. GIN index for fast searching
CREATE INDEX idx_customers_search ON customers USING GIN(search_vector);
-- 3. Trigram index for autocomplete + fuzzy matching
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX idx_customers_name_trgm ON customers USING GIN(name gin_trgm_ops);
-- 4. Search query (used in NestJS service)
SELECT
id, name, code, email,
ts_rank(search_vector, query) AS rank,
ts_headline('english', name, query, 'MaxWords=10, MinWords=5') AS snippet
FROM customers, plainto_tsquery('english', $1) AS query
WHERE
search_vector @@ query
OR similarity(name, $1) > 0.3 -- fuzzy fallback
ORDER BY rank DESC, similarity(name, $1) DESC
LIMIT 20;
-- 5. Autocomplete endpoint (prefix match)
SELECT id, name
FROM customers
WHERE to_tsvector('english', name) @@ to_tsquery('english', $1 || ':*')
LIMIT 10;In a NestJS service, full-text search with raw SQL gives more control than an ORM-abstracted query. I use the TypeORM DataSource or a raw pg client with parameterized queries. The key is to use `plainto_tsquery` rather than `to_tsquery` for user-facing search — `plainto_tsquery` converts a plain text query string (as a user types it) to a tsquery, handling AND logic automatically and ignoring syntax. `to_tsquery` requires users to know tsquery operators like & and |.
Benchmarks show PostgreSQL FTS is competitive with Elasticsearch on phrase search and boolean queries. But for queries that match millions of rows (e.g., a common word in a large dataset), PostgreSQL's ranking step reads every matching tsvector and can take 25+ seconds. The threshold where Elasticsearch outperforms PostgreSQL FTS is roughly 1 million+ rows with high-cardinality queries. For my ERP datasets (10K-500K rows), PostgreSQL FTS is fast enough. If your search needs to rank across millions of records with complex relevance, move to Elasticsearch or a PostgreSQL extension like ParadeDB.
Standard `plainto_tsquery` doesn't do prefix matching — 'inv' won't match 'invoice'. For autocomplete, use `to_tsquery('english', $1 || ':*')` with the `:*` operator, which matches any lexeme starting with the prefix. Combine this with LIMIT 10 and you have a fast autocomplete endpoint. The `:*` operator works on GIN indexes and is fast even on large tables. I've shipped this pattern for customer name autocomplete and product code lookup in ERP systems — the response time is under 20ms even with 100K+ rows.
PostgreSQL FTS covers 90% of ERP search requirements. The cases where I'd add Elasticsearch: dataset exceeds 5 million rows with broad-match queries, you need semantic/vector search (searching by meaning, not keywords), you need real-time cross-service search aggregation (search across multiple databases simultaneously), or your stakeholders require features like 'did you mean?' spelling correction using the Elasticsearch dictionary. For everything else, PostgreSQL FTS plus pg_trgm is simpler, cheaper, and sufficient.