The Vitess Docs | What Is Vitess

What Is Vitess

Vitess is an open-source database clustering system for horizontal scaling of MySQL. It distributes data across multiple MySQL instances through sharding while presenting a unified database interface to your application. Queries work as if hitting a single MySQL server, but Vitess routes them to the appropriate shards automatically.

Vitess runs in public and private cloud environments, and on dedicated hardware. It supports MySQL and Percona Server for MySQL.

Why use Vitess #

Vitess addresses three challenges that arise when scaling MySQL:

Scaling beyond a single server - Vitess shards data across multiple MySQL instances without requiring you to embed sharding logic in your application.
Managing database clusters - Vitess handles failovers, backups, and topology changes across many MySQL instances.
Protecting your database - Vitess pools connections, rewrites problematic queries, and enforces limits to prevent any single query from degrading performance.

How it works #

Vitess sits between your application and MySQL. It consists of:

VTGate - A proxy that accepts MySQL protocol connections from your application and routes queries to the appropriate shards.
VTTablet - An agent running alongside each MySQL instance that manages queries, connections, and replication.
Topology Service - A consistent data store (etcd, ZooKeeper, or Consul) that maintains the cluster state.

Your application connects to VTGate using any MySQL-compatible driver. Vitess also includes native JDBC and Go drivers for optimized performance.

Adoption #

Vitess served all YouTube database traffic for over five years. Companies including Slack, Square, and JD.com run Vitess in production.

Features #

Performance
- Connection pooling - Multiplex front-end application queries onto a pool of MySQL connections to optimize performance.
- Query de-duping – Reuse results of an in-flight query for any identical requests received while the in-flight query was still executing.
- Transaction manager – Limit number of concurrent transactions and manage timeouts to optimize overall throughput.
Protection
- Query rewriting and sanitization – Add limits and avoid non-deterministic updates.
- Query blocking – Customize rules to prevent potentially problematic queries from hitting your database.
- Query killing – Terminate queries that take too long to return data.
- Table ACLs – Specify access control lists (ACLs) for tables based on the connected user.
Monitoring
- Performance analysis tools let you monitor, diagnose, and analyze your database performance.
Topology Management Tools
- Cluster management tools (handles planned and unplanned failovers)
- Web-based management GUI
- Designed to work in multiple data centers / regions
Sharding
- Virtually seamless dynamic re-sharding
- Vertical and Horizontal sharding support
- Multiple sharding schemes, with the ability to plug-in custom ones

Comparisons to other storage options #

The following sections compare Vitess to two common alternatives, a vanilla MySQL implementation and a NoSQL implementation.

Vitess vs. Vanilla MySQL #

Vitess improves a vanilla MySQL implementation in several ways:

Vanilla MySQL	Vitess
Every MySQL connection has a memory overhead that ranges between 256KB and almost 3MB, depending on which MySQL release you're using. As your user base grows, you need to add RAM to support additional connections, but the RAM does not contribute to faster queries. In addition, there is a significant CPU cost associated with obtaining the connections.	Vitess creates very lightweight connections. Vitess' connection pooling feature uses Go's concurrency support to map these lightweight connections to a small pool of MySQL connections. As such, Vitess can easily handle thousands of connections.
Poorly written queries, such as those that don't set a LIMIT, can negatively impact database performance for all users.	Vitess employs a SQL parser that uses a configurable set of rules to rewrite queries that might hurt database performance.
Sharding is a process of partitioning your data to improve scalability and performance. MySQL lacks native sharding support, requiring you to write sharding code and embed sharding logic in your application.	Vitess supports a variety of sharding schemes. It can also migrate tables into different databases and scale the number of shards up or down. These functions are performed non-intrusively, completing most data transitions with just a few seconds of read-only downtime.
A MySQL cluster using replication for availability has a primary database and a few replicas. If the primary fails, a replica should become the new primary. This requires you to manage the database lifecycle and communicate the current system state to your application.	Vitess helps to manage the lifecycle of your database servers. It supports and automatically handles various scenarios, including primary failure detection and recovery. It also has support for data backups and restores.
A MySQL cluster can have custom database configurations for different workloads, like a primary database for writes, fast read-only replicas for web clients, slower read-only replicas for batch jobs, and so forth. If the database has horizontal sharding, the setup is repeated for each shard, and the app needs baked-in logic to know how to find the right database.	Vitess uses a topology backed by a consistent data store, like etcd or ZooKeeper. This means the cluster view is always up-to-date and consistent for different clients. Vitess also provides a proxy that routes queries efficiently to the most appropriate MySQL instance.

Vitess vs. NoSQL #

If you're considering a NoSQL solution primarily because of concerns about the scalability of MySQL, Vitess might be a more appropriate choice for your application. While NoSQL provides great support for unstructured data, Vitess still offers several benefits not available in NoSQL datastores:

NoSQL	Vitess
NoSQL databases do not define relationships between database tables, and only support a subset of the SQL language.	Vitess is not a simple key-value store. It supports complex query semantics such as where clauses, JOINS, aggregation functions, and more.
NoSQL datastores do not usually support transactions.	Vitess supports transactions.
NoSQL solutions have custom APIs, leading to custom architectures, applications, and tools.	Vitess adds very little variance to MySQL, a database that most people are already accustomed to working with.
NoSQL solutions provide limited support for database indexes compared to MySQL.	Vitess allows you to use all of MySQL's indexing functionality to optimize query performance.

Architecture >>