Vespa and Elasticsearch / Solr (Lucene)

With its focus on big data serving, Vespa is optimized for:

Low millisecond response
High write and query load
Machine Learning integration
Automated high availability operations

Vespa supports true realtime writes, true partial updates, and is also easy to operate at large scale. Vespa is the only open source platform optimized for such big data serving.

For Solr users: see How I learned Vespa by thinking in Solr.

Also see the Q&A and recording of the "The Great Search Engine Debate - Elasticsearch, Solr or Vespa?" meetup.

Analytics vs. Big Data Serving

To decide whether Elasticsearch or Vespa is the right choice for a use case, consider whether it needs to be optimized for analytics or for serving.

Analytics	Big data serving
Response time in low seconds	Response time in low milliseconds
Low query rate	High query rate
Time series, append only	Random writes
Downtime and data loss acceptable	High availability, no data loss, online redistribution
Massive data sets (trillions of docs) are cheap	Massive data sets are more expensive
Analytics GUI integration	Machine learning integration

Scaling

The fundamental unit of scale in Elasticsearch is the shard. Sharding allows scale-out by partitioning the data into smaller chunks that can be distributed across a cluster of nodes. The challenge is to figure out the right number of shards, because you only get to make the decision once per index. The choice affects performance, storage, and scale, since queries are sent to all shards. So how many shards is the right number?
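To see why a fixed shard count is hard to change later, consider a deliberately simplified model in which a document's shard is its hashed ID modulo the number of shards. This is only a sketch: the function names are hypothetical, MD5 stands in for whatever hash function a real engine uses, and the actual Elasticsearch routing formula has more detail than shown here.

```python
import hashlib


def shard_for(doc_id: str, num_shards: int) -> int:
    """Map a document ID to a shard using hash-modulo routing (simplified model)."""
    digest = hashlib.md5(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def fraction_moved(doc_ids, old_shards: int, new_shards: int) -> float:
    """Fraction of documents whose shard changes when the shard count changes."""
    moved = sum(
        1 for d in doc_ids
        if shard_for(d, old_shards) != shard_for(d, new_shards)
    )
    return moved / len(doc_ids)


# Hypothetical corpus of 10,000 document IDs.
doc_ids = [f"doc-{i}" for i in range(10_000)]

# Going from 5 to 6 shards remaps the large majority of documents.
moved = fraction_moved(doc_ids, 5, 6)
print(f"{moved:.0%} of documents map to a different shard")
```

In this model, changing the shard count relocates most documents, which is why the number of shards is effectively fixed when the index is created.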

In Vespa, you do not have to worry about the number of shards or about re-sharding; Vespa takes care of that. You have a cluster of nodes, and you can add or remove nodes without re-sharding, so there is no re-sharding downtime.

Vespa allows applications to grow (and shrink) their hardware while serving queries and accepting writes as normal. Data is automatically redistributed in the background using the minimal amount of data movement. No restarts or other operations are needed, just change the hardware listed in the configuration and redeploy the application.
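The idea of redistributing data with minimal movement can be illustrated with rendezvous (highest-random-weight) hashing. This is a stand-in technique chosen for brevity, not Vespa's actual distribution algorithm; the node names, function names, and the use of MD5 are all assumptions made for the sketch.

```python
import hashlib


def score(doc_id: str, node: str) -> int:
    """Pseudo-random weight for a (node, document) pair."""
    digest = hashlib.md5(f"{node}:{doc_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def node_for(doc_id: str, nodes) -> str:
    """Place the document on the node with the highest weight for it."""
    return max(nodes, key=lambda n: score(doc_id, n))


# Hypothetical cluster that grows from five nodes to six.
old_nodes = [f"node{i}" for i in range(5)]
new_nodes = old_nodes + ["node5"]
doc_ids = [f"doc-{i}" for i in range(10_000)]

# Documents whose placement changes after the node is added.
moved = [d for d in doc_ids if node_for(d, old_nodes) != node_for(d, new_nodes)]

# Every relocated document lands on the new node; nothing shuffles
# between the existing nodes.
assert all(node_for(d, new_nodes) == "node5" for d in moved)
print(f"{len(moved) / len(doc_ids):.0%} of documents moved")
```

With this scheme, roughly one sixth of the documents move when a sixth node joins, and they all move to the new node. That is the minimum needed to keep the cluster balanced, in contrast to schemes where a topology change reshuffles most of the data.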

For a detailed guide on how to set up a multi-node Vespa system, see Multi-Node Quick Start.

Copyright Vespa.ai