Webinar Transcript

Unlock the Future of eCommerce

One Platform, Unlimited Possibilities

Session 2: Vinted Case Study

Ernestas Poskus, Search Engineering Manager

Session Summary

Ernestas Poskus of Vinted shared how the company transitioned from an overloaded, fragmented Elasticsearch setup to a unified Vespa-based search architecture. The move delivered dramatic gains in latency, scalability, cost efficiency, and developer velocity. With Vespa now powering real-time indexing, AI-driven personalization, and multilayered ranking, Vinted has future-proofed its platform and enabled fast innovation in search, recommendations, and experimentation.

Transcript

Hello and thank you for having me. I’m pleased to share the story of how we unified our search infrastructure at Vinted and unlocked new capabilities by adopting the Vespa search engine.

This is a story of moving beyond limitations—from a fragmented, operationally heavy setup to one with clarity, speed, and flexibility. I’ll walk you through our architectural evolution and how Vespa became the backbone of our search, powering fast results, personalized suggestions, and recommendations.

I’ve been at Vinted for over a decade, starting as a full-stack engineer, then moving through backend development, site reliability engineering, and eventually into product leadership. That journey taught me the value of scalability and operational resilience—experience that now shapes our search platform team.

Our team was created to migrate away from Elasticsearch and build our new foundation on Vespa. Vinted has changed so much during my time there that, in some ways, it felt like working for seven different companies under one Slack domain.

About Vinted

Vinted is one of the most popular online marketplaces for second-hand items, operating in 20+ countries. We support 15 spoken languages, making search a linguistic and technical challenge. Our infrastructure handles over 75 billion active items in real time and serves around 25,000 search queries per second, each returning up to 1,000 results.

Why We Moved Away from Elasticsearch

At one point, we were managing six large Elasticsearch clusters. They created significant operational overhead: coordinating updates and alias switches, running full reindexes, managing shards, and fighting performance degradation. Feature development slowed because the platform couldn’t support new capabilities without risk.
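For context, the "alias switches" mentioned above refer to Elasticsearch's standard pattern for zero-downtime reindexing: documents are written to a new physical index, and a read alias is then atomically repointed from the old index to the new one. A sketch of that request against the Elasticsearch `_aliases` API (the index and alias names here are hypothetical):

```json
POST /_aliases
{
  "actions": [
    { "remove": { "index": "items_v1", "alias": "items" } },
    { "add":    { "index": "items_v2", "alias": "items" } }
  ]
}
```

Because both actions execute atomically, queries against the `items` alias never observe a moment where no index is attached. Coordinating this across six clusters, for every reindex, was part of the overhead described above.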

Eventually, this setup became unsustainable—costly in time, compute, and team productivity.

Enter Vespa

We discovered Vespa thanks to a persistent data scientist who encouraged us to use it for homepage recommendations. Reluctantly, we tried it—and were immediately impressed. It was faster, easier to manage, and impactful on both performance and business outcomes.

Vespa wasn’t a popular or obvious choice back then, but its proven scalability (originating at Yahoo) and support for real-time indexing, lexical and vector search, partial document updates, and built-in ML features made it the right one.
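To illustrate the partial-update capability mentioned above: Vespa's Document V1 HTTP API lets you mutate individual fields of a live document without re-feeding the whole document. A minimal sketch, where the `vinted` namespace, `item` document type, `price` field, and document id are all hypothetical placeholders, not Vinted's actual schema:

```http
PUT /document/v1/vinted/item/docid/12345
{
    "fields": {
        "price": { "assign": 9.99 }
    }
}
```

Updates like this become visible to queries in near real time, which is what enables the sub-second data visibility discussed later.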

Our Migration Outcomes

  • Migrated from 6 Elasticsearch clusters to 1 Vespa deployment
  • Cut server costs by 50%
  • Improved query latency by over 2x
  • Increased indexing speed by over 3x
  • Achieved sub-second data visibility for real-time updates (versus roughly six minutes previously)

Architectural Simplicity

Previously, our architecture was a complex chain of loosely connected services across multiple teams. Now, we have a unified platform where all phases of search—including ML-based ranking and retrieval—happen in one place. This reduced system complexity, improved collaboration, and eliminated silos.

Advanced Ranking and ML Integration

Vespa allows us to:

  • Run multiple ML models natively
  • Deploy real-time recommendation engines using two-tower neural networks and ANN search
  • Tune search with fine-grained control over multi-phase ranking logic
  • Co-locate data, compute, and ranking for deterministic performance
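The multi-phase ranking mentioned above is expressed directly in a Vespa schema: a cheap first phase scores every matched document, and a more expensive second phase re-ranks only the top candidates on each content node. A minimal sketch, assuming an illustrative `item` schema and an imported XGBoost model file; none of these names or expressions reflect Vinted's actual configuration:

```
schema item {
    document item {
        field title type string {
            indexing: index | summary
        }
        field popularity type float {
            indexing: attribute
        }
    }

    rank-profile personalized inherits default {
        # Phase 1: cheap score computed for every matched document
        first-phase {
            expression: nativeRank(title) + attribute(popularity)
        }
        # Phase 2: ML model re-ranks only the top 200 hits per node
        second-phase {
            rerank-count: 200
            expression: firstPhase + xgboost("ranker.json")
        }
    }
}
```

Because ranking runs where the data lives, there is no cross-service round trip between retrieval and re-ranking, which is what makes the performance deterministic.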

We currently run six ML models in production, with plans to expand into GPU-backed inference and LLM-powered retrieval-augmented generation (RAG) across the organization.
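In a two-tower setup like the one described above, the item tower's embeddings are stored in Vespa and searched with approximate nearest neighbor (ANN) lookups against the user tower's query-time embedding. A minimal sketch of the field definition and query, assuming a hypothetical 128-dimensional `embedding` field and a `user_embedding` query tensor:

```
field embedding type tensor<float>(x[128]) {
    indexing: attribute | index
    attribute {
        distance-metric: angular
    }
    index {
        # HNSW graph enables fast approximate nearest-neighbor search
        hnsw {
            max-links-per-node: 16
            neighbors-to-explore-at-insert: 200
        }
    }
}
```

A retrieval query then uses the `nearestNeighbor` operator in YQL:

```
select * from item where {targetHits: 100}nearestNeighbor(embedding, user_embedding)
```

The same query can combine this ANN clause with lexical filters, which is how vector retrieval and traditional search coexist in one request.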

Real-Time Personalization at Scale

Vespa powers:

  • Instant homepage personalization (e.g., adapting to PlayStation game searches in seconds)
  • Counterfeit detection via image search
  • Stream processing with Apache Flink integrated into Vespa as a persistent, searchable index

What’s Next

We’re expanding use cases each quarter and integrating Vespa more deeply into our experimentation platforms and personalization engines. It’s a strategic enabler, letting us experiment faster and deliver real-time, AI-powered features across the business.

Closing Thoughts

Migrating to Vespa wasn’t just an engine swap—it was a transformation. From silos to synergy. From delays to real time. Vespa is now central to our infrastructure, enabling smarter, faster, and more unified search.

Thank you to the Vespa team and our engineering crew. It’s been an exciting journey—and we’re just getting started.