Vector search is becoming increasingly prominent. At Trip we’re exploring its use and, in the spirit of transparency, we want to share what it is and how it differs from keyword/lexical search. To be clear, we’re at the start of this journey.
From Keywords to Concepts: How Vector Search Is Changing Information Retrieval
For decades, information retrieval has been built on keyword search — matching the words in a user’s query to the same words in documents. It’s the logic behind databases, search engines, and Boolean queries, and it has served information specialists well, particularly when controlled vocabularies like MeSH are used.
But language is slippery. Two people can describe the same idea in very different ways — “heart attack” vs. “myocardial infarction,” “blood sugar” vs. “glucose.” Keyword search struggles when users and authors use different terms for the same concept.
That’s where vector search comes in — a new approach that focuses on meaning rather than exact wording.
What Is Vector Search? (An Intuitive Explanation)
At its core, vector search represents meaning mathematically.
Instead of treating text as a bag of words, it converts language into numbers that capture relationships between concepts.
This transformation happens in three main steps.
1. Text to Vectors — Turning Language into Numbers
The starting point is a language model — a type of AI system trained on vast amounts of text (for example, research papers, books, and web content). During training, the model learns how words appear together and in what contexts. Over time, it builds a kind of map of language, where meanings cluster naturally.
Here’s how this works in practice:
- Words that often appear in similar contexts, such as doctor and physician, end up close together in this semantic map.
- Words that rarely co-occur or belong to very different contexts, like insulin and wheelchair, are far apart.
When text is processed by the model, each sentence or paragraph is represented as a vector — a list of numbers indicating its position in this high-dimensional space.
For instance:
- “High blood pressure” → [0.13, -0.45, 0.77, …]
- “Hypertension” → [0.12, -0.47, 0.75, …]
These numbers are coordinates on hundreds of “meaning axes” that the model has learned automatically. While humans can’t easily interpret each axis, together they capture how phrases relate semantically to everything else in the model’s training data.
You can think of these dimensions as encoding things like:
- Whether the phrase is medical or general
- Whether it describes a disease, treatment, or symptom
- Its relationships to concepts such as “cardiovascular” or “chronic condition”
If two texts have vectors that are close together, it means the model recognises that they have similar meanings.
So:
- “High blood pressure” and “hypertension” → almost identical
- “High blood pressure” and “low blood pressure” → related but opposites
- “High blood pressure” and “migraine” → far apart
This process — called embedding — is how modern AI systems move from words to concepts.
2. Measuring Similarity
When a user searches, their query is also converted into a vector. The system then compares that query vector to every document (or passage) vector in its database using a measure of closeness between vectors; the most common choice is cosine similarity.
The closer two vectors are, the more related their meanings. This allows vector search to identify results that discuss the same idea even when the words are completely different.
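To make this concrete, here is a minimal Python sketch of cosine similarity, applied to the illustrative three-number vectors from the section above (these are invented stand-ins, not output from any real embedding model):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Illustrative stand-ins for real (hundreds-of-dimensions) embeddings
high_bp = [0.13, -0.45, 0.77]       # "High blood pressure"
hypertension = [0.12, -0.47, 0.75]  # "Hypertension"
migraine = [0.80, 0.30, -0.20]      # an unrelated concept, invented for contrast

print(cosine_similarity(high_bp, hypertension))  # close to 1.0
print(cosine_similarity(high_bp, migraine))      # much lower
```

Real embeddings have hundreds of dimensions rather than three, but the comparison works the same way: near-synonyms score close to 1.0, unrelated concepts score far lower.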
For example, a query about “lowering blood pressure without medication” might retrieve:
- Trials on “lifestyle modification for hypertension”
- Reviews of “dietary sodium reduction”
- Cohort studies on “exercise and cardiovascular risk”
— even if the exact phrase “lowering blood pressure without medication” doesn’t appear in any of those documents.
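A retrieval step like this can be sketched as a brute-force ranking over a toy “database”. The titles echo the examples above, but the vectors are invented for illustration; a real system would get them from a trained model and, at scale, would use an approximate nearest-neighbour index rather than comparing against every document:

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

# Toy document database: titles paired with made-up embedding vectors.
documents = {
    "Lifestyle modification for hypertension": [0.14, -0.44, 0.74],
    "Dietary sodium reduction": [0.10, -0.40, 0.70],
    "Exercise and cardiovascular risk": [0.20, -0.35, 0.65],
    "Migraine prophylaxis in adults": [0.80, 0.30, -0.20],
}

# Invented vector for "lowering blood pressure without medication"
query_vector = [0.13, -0.45, 0.77]

# Rank every document by similarity to the query (brute force).
ranked = sorted(documents.items(),
                key=lambda item: cosine_similarity(query_vector, item[1]),
                reverse=True)

for title, vec in ranked:
    print(f"{cosine_similarity(query_vector, vec):.3f}  {title}")
```

None of the top-ranked titles contains the query’s words, yet all sit close to it in the toy vector space; only the migraine paper falls to the bottom.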
3. Returning Results
Instead of relying on literal matches, vector search retrieves the documents (or parts of documents) closest in meaning to the user’s query.
In contrast:
- Keyword search finds what you said.
- Vector search finds what you meant.
How It Differs from Keyword Search
| Feature | Keyword Search | Vector Search |
|---|---|---|
| Basis | Exact word matching | Conceptual similarity |
| Strengths | Transparent, precise, good for controlled vocabularies | Finds semantically related content, handles synonyms and context |
| Weaknesses | Misses relevant material with different wording | May surface loosely related material if not tuned carefully |
| Good for | Narrow, well-defined, reproducible queries | Exploratory or question-based searching |
Many systems now use hybrid search, combining keyword and vector methods. Keywords help with precision and reproducibility; vectors help with recall and conceptual understanding.
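One common way of combining the two result lists is reciprocal rank fusion (RRF), which rewards documents that rank well in either list. A minimal sketch, with invented document identifiers:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists: each document scores the sum of 1/(k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Invented example: keyword and vector search rank a shared pool differently.
keyword_ranking = ["doc_mesh_hypertension", "doc_bp_guideline", "doc_sodium_trial"]
vector_ranking = ["doc_lifestyle_review", "doc_bp_guideline", "doc_exercise_cohort"]

fused = reciprocal_rank_fusion([keyword_ranking, vector_ranking])
print(fused)  # "doc_bp_guideline" appears in both lists, so it rises to the top
```

The constant `k` (60 is a conventional default) damps the influence of any single list; documents found by both methods float upward, which is exactly the precision-plus-recall trade-off hybrid search aims for.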
Why It Matters for Information Specialists
For information professionals, vector search introduces both power and complexity.
It enables:
- Retrieval of semantically related evidence, even when vocabulary differs.
- More natural-language searching — closer to how users think and ask questions.
- The foundation for AI-driven Q&A tools, where the system retrieves and synthesises the most relevant evidence rather than just listing papers.
But it also brings new challenges:
- Relevance can be fuzzier and harder to explain.
- Transparency and reproducibility — essential in evidence-based work — need careful handling.
- Understanding how a system defines “similarity” becomes as crucial as knowing how it handles Boolean logic or MeSH terms.
The Bottom Line
Vector search doesn’t replace traditional methods — it expands them.
It’s a bridge between human language and machine understanding.
In short:
Keyword search finds the words. Vector search finds the meaning.
Together, they represent the next chapter in evidence discovery and retrieval — one that blends linguistic nuance, AI, and the information specialist’s craft.