Search · January 2025 · 8 min read

How Search Became an AI Problem

For most of its history, search was a text matching problem. The engineering challenge was speed: index a lot of documents, return matching ones fast. Relevance was a tiebreaker. Today, relevance is the product. And relevance, at any meaningful depth, is an AI problem.

The keyword era and its limitations

BM25 and its predecessors assumed that the presence of query terms in a document was evidence of relevance. This worked well enough when queries were short, vocabularies were controlled, and users were forgiving. None of those conditions hold today. Queries are conversational. Vocabulary between users and documents is mismatched. Users tolerate very few irrelevant results before leaving.
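To make the term-presence assumption concrete, here is a minimal sketch of BM25 scoring. It uses whitespace tokenization and one common IDF variant; the function name and the defaults for k1 and b are illustrative choices, not something the article prescribes. A document that shares no terms with the query scores exactly zero, which is the limitation in question.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with a common BM25 variant.

    Assumes a non-empty list of non-empty documents and naive
    whitespace tokenization (both are simplifications).
    """
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: the number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue  # a query term absent from the document adds nothing
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores
```

Every contribution to the score comes from a literal term match, so vocabulary mismatch between query and document is invisible to it.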

Semantic gap: the real problem

The semantic gap — the mismatch between how a user describes what they want and how a document describes what it contains — is the fundamental problem search has to solve. A user searching for 'comfortable shoes for standing all day' might find a perfect match in a product listed as 'professional ergonomic footwear with memory foam insole.' Zero shared terms. Perfect relevance. Classical retrieval returns nothing useful; semantic retrieval returns the right product.
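The shoe example can be checked directly. The sketch below computes the term overlap between the query and the listing, then defines the cosine similarity that dense retrieval would typically apply to embedding vectors instead. The helper names are hypothetical, and the embedding model that would produce the vectors is assumed and not shown.

```python
import math


def shared_terms(query, document):
    """Return the set of lowercase whitespace tokens the two texts share."""
    return set(query.lower().split()) & set(document.lower().split())


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


query = "comfortable shoes for standing all day"
listing = "professional ergonomic footwear with memory foam insole"

# An empty set: a purely lexical scorer has nothing to work with here.
overlap = shared_terms(query, listing)
```

A semantic retriever would embed both strings and compare them with cosine_similarity, so the match depends on meaning captured by the model rather than on shared tokens.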

Vector search isn't a silver bullet

Dense retrieval using vector embeddings narrows the semantic gap, but it introduces new failure modes. Exact term matches, such as brand names, SKUs, and specific model numbers, are often better served by lexical retrieval. The state of the art is hybrid: combine dense and sparse signals, then use a learned ranker to merge them. The product decision is which signal to trust when, and that decision requires understanding your query distribution.
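As a rough illustration of combining the two signals, the sketch below uses reciprocal rank fusion (RRF). This is a simpler, non-learned stand-in for the learned ranker described above, not a replacement for it. The function name is illustrative, and k=60 is a commonly used constant rather than a tuned value.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document ids into one list.

    Each document earns 1 / (k + rank) from every list it appears in,
    so items ranked well by both retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score, breaking ties by id so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
```

Calling it with one list from the lexical retriever and one from the dense retriever returns a single merged ordering. Because RRF weighs both lists equally, it cannot learn that SKU-style queries should lean on lexical results; that per-query trust decision is what a learned ranker adds.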

The ranking layer is where the business lives

Retrieval narrows the candidate set. Ranking decides what the user sees. The difference between ranking by relevance and ranking by expected value — where expected value includes price, margin, availability, and behavioral signals — is where search becomes a business problem. The best search products aren't optimizing for relevance alone; they're optimizing for user satisfaction and business outcome simultaneously, and the product work is designing the right objective.
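The gap between the two objectives is easy to see in a toy sketch. Everything below is hypothetical: the Candidate fields, the idea of a purchase probability estimated from behavioral signals, and the multiplicative objective itself are assumptions chosen for illustration, and a real objective would need to be designed and validated against your own data.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    doc_id: str
    relevance: float      # estimated relevance to the query, 0..1
    purchase_prob: float  # behavioral estimate of purchase likelihood, 0..1
    margin: float         # profit per sale, in currency units
    in_stock: bool


def expected_value(c):
    """One possible objective: relevance-weighted expected margin, zero if unavailable."""
    if not c.in_stock:
        return 0.0
    return c.relevance * c.purchase_prob * c.margin


def rank_by_relevance(candidates):
    return sorted(candidates, key=lambda c: c.relevance, reverse=True)


def rank_by_expected_value(candidates):
    return sorted(candidates, key=expected_value, reverse=True)
```

Keeping relevance inside the product is a deliberate choice in this sketch: it stops a high-margin but irrelevant item from outranking a good match, which is one way to hold user satisfaction and business outcome in the same objective.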

Evaluation is the gating constraint

You cannot improve what you cannot measure. Search evaluation is hard because relevance is contextual, subjective, and query-dependent. The teams winning at search invest heavily in offline evaluation (human judgments, graded relevance, NDCG) and online evaluation (A/B testing, interleaving). The product discipline required to build and maintain an evaluation system is often larger than the engineering discipline required to build the search system itself.
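For reference, NDCG can be computed from graded human judgments in a few lines. This sketch uses the exponential-gain form, 2^grade - 1, and builds the ideal ordering from the judged results of the list being scored. Both are common conventions but still assumptions here, and the function names are illustrative.

```python
import math


def dcg(grades):
    """Discounted cumulative gain for relevance grades listed in ranked order."""
    return sum((2 ** g - 1) / math.log2(i + 2) for i, g in enumerate(grades))


def ndcg_at_k(ranked_grades, k):
    """NDCG@k: DCG of the actual ranking divided by DCG of the ideal ranking.

    The ideal ranking is the same grades sorted from best to worst.
    Returns 0.0 when no result has a positive grade.
    """
    ideal = sorted(ranked_grades, reverse=True)
    ideal_dcg = dcg(ideal[:k])
    return dcg(ranked_grades[:k]) / ideal_dcg if ideal_dcg > 0 else 0.0
```

A ranking that already lists results from best to worst scores 1.0, and pushing a highly graded result down the list lowers the score. Averaging ndcg_at_k over a fixed set of judged queries gives an offline number that can be tracked from one ranking change to the next.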

Search became an AI problem the moment users started expecting it to understand them rather than just match them. The product implication is that investing in the retrieval layer, the ranking layer, and the evaluation layer isn't optional — it's the search product. Everything else is infrastructure.

Evgeniy Abzalov

AI Product Leader
