Vespa Training

Build expertise in AI-powered search and retrieval with hands-on Vespa training.

Vespa is the AI Search Platform used by companies like Spotify, Yahoo, and Perplexity.ai to power large-scale applications in retrieval-augmented generation (RAG), recommendation, personalization, and e-commerce search.

To help practitioners get up to speed quickly, Vespa offers instructor-led training courses designed for search engineers, data scientists, and technical practitioners who want to move beyond theory and into production-ready systems.

Our courses combine foundational knowledge with practical, hands-on exercises so participants leave with skills they can immediately apply in their own projects. We currently offer two courses:

Vespa Fundamentals

1 Day – Core concepts for all practitioners

This introductory course provides a solid foundation in Vespa’s architecture, APIs, and query model. Participants will gain a working understanding of Vespa’s unique capabilities and learn how to apply them in real-world applications.

Who should attend: Search engineers, data scientists, and developers who are new to Vespa or want to refresh their fundamentals.

Vespa for E-Commerce Ranking

1 Day – Practical techniques for search and recommendations in online retail

This advanced course focuses on applying Vespa in e-commerce scenarios, where precision, personalization, and real-time performance are critical.

Who should attend: Teams building e-commerce search, recommendation engines, and personalization systems who want to move beyond keyword search and improve conversion with modern AI-driven ranking.

Vespa Fundamentals

Agenda

Introduction to Vespa

What Vespa is, when to use it, and when not to.

Overview of Vespa’s architecture, query model, and deployment model.

A first look at Vespa’s API and tools.

Lexical Search

How attributes and indexes differ and when to use each.

Designing rank profiles for different use cases.

Hands-on exploration with Vespa’s YQL query language.

Working with Tensors

Introduction to tensors as Vespa’s core data structure.

How to index and query tensors for similarity search and recommendations.

Understanding tensor operations for ranking and personalization.

Hybrid Search

Combining lexical and vector search for retrieval-augmented generation.

Using embedders to create embeddings on the fly.

Ranking strategies for hybrid use cases.
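
As an illustration of one possible ranking strategy for hybrid search, the sketch below fuses a lexical result list and a vector result list with reciprocal rank fusion. It is a minimal, plain-Python sketch rather than course material: the function name, the document ids, and the constant k=60 are assumptions chosen for the example.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into a single ranking.

    Each document receives 1 / (k + rank) from every list it appears in.
    k=60 is a commonly used default and is an assumption here.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document id.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a lexical and a vector retriever.
lexical = ["shoe-1", "shoe-2", "shoe-3"]
semantic = ["shoe-3", "shoe-1", "shoe-4"]

fused = reciprocal_rank_fusion([lexical, semantic])
print(fused)
```

Documents that rank well in both lists rise to the top, and no score normalization between the two retrievers is needed because only ranks are used.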

Grouping and Aggregation

Using grouping to explore, filter, and diversify results.

Aggregating metrics for analytics and monitoring.

Performance and Scaling 101

Scaling Vespa to billions of documents.

Designing for low-latency queries and high throughput.

Best practices for tuning performance.

Vespa for E-Commerce Ranking

Agenda

E-Commerce Search Fundamentals

Combining BM25, nativeRank, document fields, and user preferences in a two-phase rank profile.

Designing for relevance, business goals, and customer experience.
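
To make the two-phase idea concrete, here is a minimal plain-Python sketch: a cheap first-phase score orders all matches, and a more expensive second-phase score re-ranks only the top candidates. The field names (`bm25`, `popularity`, `brand_affinity`), the scoring formulas, and the function `two_phase_rank` are hypothetical and serve only as an illustration. In Vespa itself this logic would be expressed in a rank profile, not in application code.

```python
def two_phase_rank(hits, first_phase, second_phase, rerank_count):
    """Order hits by a cheap score, then re-rank only the top candidates.

    Simplification: hits that are not re-ranked keep their first-phase
    order and stay below the re-ranked head.
    """
    ordered = sorted(hits, key=first_phase, reverse=True)
    head, tail = ordered[:rerank_count], ordered[rerank_count:]
    head = sorted(head, key=second_phase, reverse=True)
    return head + tail


# Hypothetical hits with a text-match score and two preference signals.
hits = [
    {"id": "A", "bm25": 12.0, "popularity": 0.2, "brand_affinity": 0.1},
    {"id": "B", "bm25": 10.0, "popularity": 0.9, "brand_affinity": 0.8},
    {"id": "C", "bm25": 3.0, "popularity": 1.0, "brand_affinity": 1.0},
]


def cheap_score(hit):
    # First phase: text relevance only.
    return hit["bm25"]


def full_score(hit):
    # Second phase: text relevance boosted by document and user signals.
    return hit["bm25"] * (1.0 + hit["popularity"] + hit["brand_affinity"])


ranked_ids = [h["id"] for h in two_phase_rank(hits, cheap_score, full_score, 2)]
print(ranked_ids)
```

With `rerank_count=2`, only A and B are scored by the second phase, so B overtakes A while C is never re-scored.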

Vector Search in Practice

Choosing and fine-tuning embedding models for product catalogs.

Evaluating semantic search quality using golden sets with LLMs as judges.

Learning to Rank

Training and deploying gradient-boosted decision tree (GBDT) models.

Using multi-phase ranking to incorporate multiple signals.

Real-Time Recommendations

Implementing recommendations with sparse tensors, co-occurrence analysis, and basket similarity.

Scaling recommendation workloads with Vespa’s query pipeline.

Chunking for RAG

Leveraging Vespa’s built-in chunking for product descriptions, reviews, and rich content.

Combining document-level and chunk-level scoring for higher-quality retrieval.

Advanced Hybrid Search

Accelerating semantic search with Matryoshka embeddings and binarization.

Using float vectors as re-rankers for improved relevance.
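
The two ideas above can be sketched together in plain Python: truncate embeddings to their leading dimensions (meaningful only for models trained with Matryoshka representation learning), binarize them for a cheap Hamming-distance candidate search, then re-rank the candidates with the full float vectors. All vectors, ids, and function names below are made up for illustration, and the sketch says nothing about how Vespa implements these steps internally.

```python
def truncate(vec, dims):
    """Keep only the leading dimensions of an embedding."""
    return vec[:dims]


def binarize(vec):
    """Pack one bit per dimension into an int: 1 if the value is positive."""
    bits = 0
    for x in vec:
        bits = (bits << 1) | (1 if x > 0 else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two packed vectors."""
    return bin(a ^ b).count("1")


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def search(query, docs, dims=4, candidates=2):
    # Coarse phase: Hamming distance on truncated, binarized vectors.
    q_bits = binarize(truncate(query, dims))
    coarse = sorted(
        docs,
        key=lambda d: (hamming(q_bits, binarize(truncate(docs[d], dims))), d),
    )[:candidates]
    # Re-rank phase: full-precision dot product on the surviving candidates.
    return sorted(coarse, key=lambda d: (-dot(query, docs[d]), d))


# Hypothetical 8-dimensional embeddings.
query = [0.9, -0.2, 0.4, -0.7, 0.1, 0.3, -0.5, 0.2]
docs = {
    "p1": [0.7, -0.3, 0.1, -0.9, -0.6, -0.1, 0.8, -0.3],
    "p2": [0.8, -0.1, 0.5, -0.6, 0.2, 0.2, -0.4, 0.1],
    "p3": [-0.5, 0.6, -0.2, 0.3, 0.1, 0.1, 0.2, 0.0],
}

result = search(query, docs)
print(result)
```

Here p1 and p2 are indistinguishable in the coarse binary phase, and the float re-ranking step is what places p2 ahead of p1.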

Model Tuning

Why off-the-shelf encoders often underperform in retail search.

Techniques for fine-tuning encoders for product and user intent.

Late Interaction Models

Introducing ColPali for visual retrieval directly on product images, without OCR pipelines.

Cross-Encoders and Multi-Phase Ranking

Deploying ONNX cross-encoders as third-phase re-rankers.

Running GPU-accelerated inference in Vespa for maximum precision.

The RAG Blueprint

A best-practice guide to building production-ready retrieval-augmented generation (RAG) systems.

While Vespa’s training courses provide hands-on instruction, The RAG Blueprint serves as a self-guided framework for teams designing large-scale retrieval-augmented generation applications.

The RAG Blueprint covers every stage of system design and deployment, from chunking and hybrid retrieval to phased ranking, embedding strategies, and performance tuning. It draws on proven patterns used in production by leading AI-driven companies and is intended to help engineers, architects, and technical managers shorten the path from prototype to production.

Key benefits of The RAG Blueprint

Step-by-step guidance on architecting robust RAG systems.

Practical examples of hybrid search, late interaction models, and on-the-fly embeddings.

Insights on scaling to billions of documents while keeping latency low.

Recommendations for evaluation, monitoring, and continuous improvement.

Who it’s for: Teams exploring or scaling RAG, whether for chatbots, agentic AI workflows, or domain-specific discovery applications.

Explore The RAG Blueprint

Why Vespa Training?

Hands-on learning: All sessions include practical labs with real data and queries.
Production focus: Learn techniques proven in large-scale systems.
Expert instructors: Training is led by Vespa engineers and practitioners who work with enterprise deployments every day.

Explore More

Retrieval-Augmented Generation

Discover Vespa’s RAG features for hybrid search, which combine text-vector, token-vector, and machine-learned ranking and are designed to scale with query volume and data size without compromising on quality.

Vespa RAG Features

The RAG Blueprint Blog

Read more about The RAG Blueprint from the Vespa engineering blog.

Vespa RAG Manager’s Guide

This management guide outlines how businesses can deploy generative AI effectively, focusing on Retrieval-Augmented Generation (RAG) to integrate private data for tailored, context-rich responses.

Read Manager’s Guide

RAG Technical Guide

Learn how Vespa RAG gives language models access to up-to-date or domain-specific knowledge beyond their training data, improving performance in tasks such as question answering and dynamic content creation.

Read Technical Guide