
Semantic search vs keyword search

Why "vibes-based" search returns things keyword search misses — and where it still loses.

Updated April 2026

Keyword search matches strings; semantic search matches meaning. Keyword search will return a video only if the exact word you typed appears in the transcript. Semantic search compares the meaning of your query to the meaning of every chunk in the library using vector embeddings, so a search for "how to stop procrastinating" can surface a talk titled "beating the resistance" even though the query and the title share no words.

How semantic search actually works

Every chunk of text in your library is run through an embedding model (e.g. one of OpenAI's text-embedding-3 models or Cohere's embed-v3 models) that produces a vector: a list of numbers, typically around one to three thousand depending on the model, that represents the chunk's meaning. Your query gets the same treatment. The retriever returns the chunks whose vectors are closest to the query's vector, usually measured by cosine similarity. Closeness in vector space approximates closeness in meaning, a relationship the model learns from very large amounts of training data.
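The retrieval step described above can be sketched in a few lines of Python. This is a minimal illustration, not BrainTube's implementation: the three-dimensional vectors are hand-made stand-ins for real embeddings, and the chunk names and the top_k helper are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, chunk_vecs, k=2):
    """Return the ids of the k chunks whose vectors are closest to the query."""
    scored = [(cosine_similarity(query_vec, vec), chunk_id)
              for chunk_id, vec in chunk_vecs.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [chunk_id for _, chunk_id in scored[:k]]

# Toy 3-dimensional "embeddings", invented for illustration only.
# A real embedding model would produce vectors with far more dimensions.
chunks = {
    "beating-the-resistance": [0.9, 0.1, 0.2],
    "sourdough-basics": [0.1, 0.9, 0.1],
    "deep-work-habits": [0.7, 0.2, 0.6],
}
query = [0.8, 0.1, 0.3]  # stands in for "how to stop procrastinating"

results = top_k(query, chunks)
print(results)  # ['beating-the-resistance', 'deep-work-habits']
```

At library scale a retriever would normally use an approximate nearest-neighbor index rather than scoring every chunk in a loop, but the ranking criterion is the same.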

Where keyword still wins

Exact identifiers: product names, error codes, people, URLs, code snippets, legal citations. If you're searching for "GPT-5" or "useEffect" or "Form 1099-MISC," you want exact matches, not paraphrases. Keyword engines (typically ranked with BM25) also support negation and Boolean operators ("foo AND NOT bar"), which semantic search struggles with because an embedding captures what a query is about rather than which terms must or must not appear.
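To make the exact-match and operator behavior concrete, here is a small sketch of Boolean keyword matching. It assumes a simple tokenizer that keeps hyphens and digits so identifiers such as "1099-MISC" stay whole; the documents, the tokenize helper, and the keyword_search function are all hypothetical, and a production engine would add an inverted index and BM25 ranking on top.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into tokens, keeping hyphens and
    digits so identifiers like '1099-misc' or 'gpt-5' survive intact."""
    return set(re.findall(r"[a-z0-9][a-z0-9\-]*", text.lower()))

def keyword_search(docs, must=(), must_not=()):
    """Return ids of docs containing every `must` term and no `must_not` term."""
    hits = []
    for doc_id, text in docs.items():
        tokens = tokenize(text)
        has_all = all(term.lower() in tokens for term in must)
        has_banned = any(term.lower() in tokens for term in must_not)
        if has_all and not has_banned:
            hits.append(doc_id)
    return hits

# Hypothetical transcript snippets, invented for illustration.
docs = {
    "hooks-intro": "Calling useEffect after every render",
    "tax-forms": "When to file Form 1099-MISC instead of 1099-NEC",
    "model-notes": "Comparing GPT-5 with earlier models",
    "hooks-state": "useState and useEffect together",
}

# Equivalent to the query: useEffect AND NOT useState
effect_only = keyword_search(docs, must=["useEffect"], must_not=["useState"])
form_hits = keyword_search(docs, must=["1099-MISC"])
print(effect_only)  # ['hooks-intro']
print(form_hits)    # ['tax-forms']
```

The matching here is all-or-nothing on exact tokens, which is precisely the property that makes keyword search reliable for identifiers and unreliable for paraphrases.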

The hybrid approach

Production retrievers blend both. BrainTube runs BM25 and a dense vector index in parallel, fuses the results with reciprocal rank fusion (RRF), then re-ranks the top ~50 with a cross-encoder model. The result: paraphrase queries surface conceptually related content, and exact-name queries still return the right hit at rank 1.
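The fusion step can be illustrated with a short sketch of reciprocal rank fusion. Each document scores 1 / (k + rank) in every list where it appears, and the scores are summed. The constant k = 60 is a commonly used default and an assumption here, since the value BrainTube uses is not stated; the chunk ids and both input rankings are invented, and the cross-encoder re-ranking stage is left out.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of ids into one list.

    Each id earns 1 / (k + rank) per list it appears in (rank starts at 1).
    Ties are broken alphabetically so the output is deterministic.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

# Hypothetical outputs of the two retrievers, best hit first.
bm25_ranking = ["chunk-a", "chunk-b", "chunk-c"]
dense_ranking = ["chunk-b", "chunk-d", "chunk-a"]

fused = reciprocal_rank_fusion([bm25_ranking, dense_ranking])
print(fused)  # ['chunk-b', 'chunk-a', 'chunk-d', 'chunk-c']
```

Because RRF uses only ranks, it needs no calibration between BM25 scores and cosine similarities, which sit on unrelated scales. Chunks that both retrievers rank highly rise to the top, and the fused list is what would then be passed to a re-ranker.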

Why retriever quality matters more than model quality

A frontier LLM with a bad retriever is prone to hallucinate because it is handed wrong or missing context. A modest LLM with a great retriever can cite accurately because its answers are grounded in the retrieved text. BrainTube spends compute on the retriever (chunking, embeddings, hybrid scoring, re-ranking) so any LLM you point at it gets sharper answers.

