When you can compute over large data sets online, a new world of possibilities opens up for applications and features. This page describes some of the most well-known problems people use Vespa to solve.
Vespa is a full-featured text search engine with full support for traditional information retrieval as well as modern embedding-based techniques. Since these approaches can be combined efficiently in the same query and ranking model, it is easy to use elements of both in the same application. No other current technology offers this combination, yet it is usually necessary to get good results in real applications. Search applications usually make use of these features of Vespa:
No matter which features you combine, you'll benefit from Vespa's linear scalability, automatic data management and online elasticity, and support for sustained, high-volume, fully realtime writes, which lets you both add new documents and cheaply update fields of existing documents while serving.
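The core idea of combining traditional and embedding-based retrieval in one ranking model can be sketched as follows. This is a minimal illustration, not Vespa's API: the tiny corpus, the toy embeddings, the term-overlap stand-in for BM25, and the 0.5/0.5 weights are all assumptions made for the example.

```python
import math

# Hybrid ranking sketch: combine a lexical score with an embedding-similarity
# score in a single ranking expression, as when mixing traditional and
# embedding-based retrieval in the same query.
docs = {
    "d1": {"text": "vespa text search engine", "embedding": [0.9, 0.1, 0.0]},
    "d2": {"text": "cooking pasta recipes",    "embedding": [0.0, 0.2, 0.9]},
    "d3": {"text": "search with embeddings",   "embedding": [0.7, 0.6, 0.1]},
}

def lexical_score(query, text):
    # Fraction of query terms present in the document (stand-in for BM25).
    terms = query.split()
    return sum(t in text.split() for t in terms) / len(terms)

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

def hybrid_rank(query, query_embedding, alpha=0.5):
    # Weighted sum of the two signals; alpha balances lexical vs. semantic.
    scored = []
    for doc_id, doc in docs.items():
        score = (alpha * lexical_score(query, doc["text"])
                 + (1 - alpha) * cosine(query_embedding, doc["embedding"]))
        scored.append((score, doc_id))
    return [doc_id for _, doc_id in sorted(scored, reverse=True)]

print(hybrid_rank("vespa search", [0.8, 0.3, 0.1]))  # d1 wins on both signals
```

In Vespa, the equivalent would be a rank profile whose first-phase expression combines a text-matching feature with a vector closeness feature.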
These example open source Vespa text search applications can be used as a starting point:
Recommendation, content personalization and ad targeting are all the same problem when it comes to implementation: for a given user or context, evaluate machine-learned content recommender models to find the best items and show them to the user. Usually it is also necessary to filter out unwanted items based on metadata, such as the language used or the remaining ad budget. In addition, it is often necessary to group the recommended items to make browsing easier, or to filter out those that are too similar.
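The filter-score-group flow just described can be sketched in a few lines. The item data, the "language" field, and the dot-product model standing in for a machine-learned recommender are illustrative assumptions, not Vespa constructs.

```python
# Recommendation sketch: filter candidates on metadata, evaluate a model per
# item, then group the winners for easier browsing.
items = [
    {"id": "a1", "category": "news",  "language": "en", "features": [0.9, 0.2]},
    {"id": "a2", "category": "news",  "language": "de", "features": [0.8, 0.3]},
    {"id": "a3", "category": "sport", "language": "en", "features": [0.4, 0.9]},
    {"id": "a4", "category": "sport", "language": "en", "features": [0.1, 0.1]},
]

def recommend(user_features, language, top_k=3):
    # 1. Filter out unwanted items on metadata (here: the user's language).
    candidates = [it for it in items if it["language"] == language]
    # 2. Evaluate the recommender model (a dot product stands in here).
    scored = sorted(
        candidates,
        key=lambda it: sum(u * f for u, f in zip(user_features, it["features"])),
        reverse=True,
    )[:top_k]
    # 3. Group the recommendations by category.
    groups = {}
    for it in scored:
        groups.setdefault(it["category"], []).append(it["id"])
    return groups

print(recommend([1.0, 0.5], "en"))
```

In Vespa all three steps run in one query: the filter in the query tree, the model in a rank profile, and the grouping in a grouping expression.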
Vespa makes it possible to do the whole process online, at the moment when the recommendation is needed, which ensures recommendations are up-to-date and makes it affordable to make them specifically for each user or situation. These features of Vespa are usually leveraged:
These example open source Vespa recommendation applications can be used as a starting point:
Question answering provides direct answers to users' questions. This is needed in chatbots, virtual assistants and the like, and is also becoming an expected feature of high-end search solutions, where a direct answer is provided for queries that appear to be questions.
A high-quality question answerer works as follows: text snippets are represented by vector embeddings, which are indexed for fast matching with approximate nearest neighbor (ANN) search. The best candidates found by ANN matching are then evaluated in a transformer-based language model, which outputs the score of each snippet as well as the beginning and end of the text answer.
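The two-stage flow above can be sketched as follows. The snippets, the toy 2-d embeddings, and the `reader` function (a word-overlap stand-in for the transformer, returning a score and an answer span) are all illustrative assumptions; a real system would use an ANN index and an actual language model.

```python
import math

# Two-stage question answering sketch: ANN candidate retrieval, then a more
# expensive "reader" that re-scores candidates and extracts an answer span.
snippets = [
    {"text": "Vespa was open-sourced in 2017", "embedding": [0.9, 0.1]},
    {"text": "Pasta is boiled in water",       "embedding": [0.1, 0.9]},
    {"text": "Vespa scales horizontally",      "embedding": [0.8, 0.3]},
]

def nearest(query_emb, k=2):
    # Stage 1: nearest-neighbor retrieval (exhaustive here; ANN in practice).
    return sorted(snippets, key=lambda s: math.dist(query_emb, s["embedding"]))[:k]

def reader(question, snippet):
    # Stage 2 stand-in for the transformer: a score plus the begin/end
    # positions of the answer span (here, trivially the whole snippet).
    words = snippet["text"].split()
    overlap = sum(w.lower() in question.lower() for w in words)
    return overlap, 0, len(words) - 1

def answer(question, query_emb):
    best = max(nearest(query_emb), key=lambda s: reader(question, s)[0])
    _, begin, end = reader(question, best)
    return " ".join(best["text"].split()[begin:end + 1])

print(answer("When was Vespa open-sourced?", [0.9, 0.15]))
```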
By using Vespa, the entire process can be implemented as an application on a single platform and made to execute with a latency of a few tens of milliseconds, scaling to any volume while delivering quality on par with the research state of the art.
See our blog post showing how to replicate the best question answering performance from the research community as a production ready Vespa application, and the followup post how we brought down the response time to tens of milliseconds. The complete source for this application is also available.
Applications that use semi-structured data - that is, a combination of database-like data and plain text - usually benefit from letting users navigate the data using both structured navigation and text search. The most common example of this is e-commerce (shopping) sites.
This makes use of traditional text search in conjunction with sorting, grouping and filtering by metadata. As any query can be grouped and filtered, users can switch seamlessly between drilling down by metadata and searching by text without losing context. Commonly, some of the metadata is supplied by parent documents (such as the merchant of a product). Some e-commerce applications also make use of embeddings to provide search, navigation or recommendation in an embedding space.
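A query combining text search, a metadata filter and grouping might be built like this. The `userQuery()` function and the `| all(group(...)...)` grouping syntax are standard Vespa YQL, but the document type (`product`), field names (`brand`), and rank profile name (`commerce`) are assumptions made for the example.

```python
# Sketch of an e-commerce query: free-text search plus a metadata filter,
# with hits grouped by brand, expressed as Vespa YQL.
def build_query(text, brand=None):
    where = "userQuery()"
    if brand:
        # Metadata filter alongside the text match.
        where += f' and brand contains "{brand}"'
    # Group the hits by brand and report the count in each group.
    grouping = "all(group(brand) each(output(count())))"
    return {
        "yql": f"select * from product where {where} | {grouping}",
        "query": text,
        "ranking": "commerce",  # assumed rank profile name
    }

request = build_query("running shoes", brand="acme")
print(request["yql"])
```

Dropping the `brand` argument yields a pure text query with the same grouping, which is what lets users move between text search and metadata drill-down without losing context.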
Personal search (not to be confused with personalization) means providing search in personal collections of data, where there is never a need to search across many collections in a single query. In such applications it is not cost-effective to maintain global reverse indexes; the best solution is to search by streaming through the raw data at query time. Latency can still be bounded for arbitrarily sized collections, as the collections are distributed over a number of nodes, which bounds the size of a given user's collection on any given node.
Vespa provides a streaming mode where the usual functionality of the engine is backed by streaming through the raw data stored in Vespa, with no indexes necessary. This makes it possible to implement powerful personal search applications easily and cheaply at any scale. Read more in our blog post on personal search.
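The essence of streaming search can be sketched as a scan over one user's raw documents at query time. The per-user data layout and the term-overlap ranking are illustrative assumptions; in Vespa, streaming mode applies the full query and ranking machinery to the scanned documents.

```python
# Streaming search sketch: no index - match and rank by scanning the raw
# documents of a single user's collection at query time.
collections = {
    "alice": [
        "trip to norway in june",
        "receipt for new bicycle",
        "notes from search conference",
    ],
    "bob": [
        "search engine reading list",
    ],
}

def streaming_search(user, query):
    terms = query.lower().split()
    hits = []
    # Stream through this user's documents only; no global index is consulted.
    for doc in collections[user]:
        score = sum(t in doc.split() for t in terms)
        if score > 0:
            hits.append((score, doc))
    return [doc for _, doc in sorted(hits, reverse=True)]

print(streaming_search("alice", "search notes"))
```

Because each query only ever touches one user's collection, the cost of a query is bounded by that collection's size on each node, not by the total corpus.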
Many applications that take textual input make use of typeahead suggestions, where a number of suggested completions are presented while the user is typing. This usually involves matching and ranking candidate completions at very low latency - a suitable job for Vespa. Vespa features usually involved in this are:
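The match-then-rank step of typeahead can be sketched with a sorted list of completions and a binary search for the typed prefix. The completion strings and popularity counts are illustrative assumptions; in Vespa this would be done with its own matching and ranking instead.

```python
import bisect

# Typeahead sketch: find completions matching the typed prefix via binary
# search over a sorted list, then rank the matches (here by popularity).
completions = sorted([
    ("vespa ai", 50),
    ("vespa cloud", 80),
    ("vespa scooter", 120),
    ("vesuvius", 10),
])

def suggest(prefix, k=3):
    keys = [c for c, _ in completions]
    # All strings with this prefix form a contiguous sorted range.
    lo = bisect.bisect_left(keys, prefix)
    hi = bisect.bisect_right(keys, prefix + "\uffff")
    matches = sorted(completions[lo:hi], key=lambda c: -c[1])
    return [c for c, _ in matches[:k]]

print(suggest("vespa"))
```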