MySQL provides the auto-increment feature to assign monotonically incrementing
IDs to a column in a table. However, when a table is sharded across multiple
instances, maintaining the same feature is a lot more tricky.
Vitess Sequences fill that gap:
Inspired from the usual SQL sequences (implemented in different ways by
Oracle, SQL Server and PostgreSQL).
Very high throughput for ID creation, using a configurable in-memory block allocation.
Transparent use, similar to MySQL auto-increment: when the field is omitted in
an insert statement, the next sequence value is used.
Using auto-increment can leak confidential information about a service. Let’s
take the example of a web site that store user information, and assign user IDs
to its users as they sign in. The user ID is then passed in a cookie for all
The client then knows their own user ID. It is now possible to:
Try other user IDs and expose potential system vulnerabilities.
Get an approximate number of users of the system (using the user ID).
Get an approximate number of sign-ins during a week (creating two accounts a
week apart, and diffing the two IDs).
Auto-incrementing IDs should be reserved for either internal applications, or
exposed to the clients only when safe.
Let’s start by looking at the MySQL auto-increment feature:
A row that has no value for the auto-increment value will be given the next ID.
The current value is stored in the table metadata.
Values may be ‘burned’ (by rolled back transactions).
Inserting a row with a given value that is higher than the current value will
set the current value.
The value used by the primary in a statement is sent in the replication stream,
so replicas will have the same value when re-playing the stream.
There is no strict guarantee about ordering: two concurrent statements may
have their commit time in one order, but their auto-incrementing ID in the
opposite order (as the value for the ID is reserved when the statement is
issued, not when the transaction is committed).
MySQL has multiple options for auto-increment, like only using every N number
(for active-active configurations), or performance related features (locking
that table’s current ID may have concurrency implications).
When inserting a row in a table with an auto-increment column, if the value
for the auto-increment row is not set, the value for the column is returned to
the client alongside the statement result.
An early design was to use a single unsharded database and a table with an
auto-increment value to generate new values. However, this has serious
limitations, in particular throughtput, and storing one entry for each value in
that table, for no reason.
So we decided instead to base sequences on a MySQL table, and use a single value
in that table to describe which values the sequence should have next. To
increase performance, we also support block allocation of IDs: each update to
the MySQL table is only done every N IDs (N being configurable), and in between
only memory structures in vttablet are updated, making the QPS only limited by
The sequence table then is an unsharded single row table that Vitess can use to generate monotonically increasing ids. The VSchema allows you to associate a column of a table with the sequence table. Once they are associated, an insert on that table will transparently fetch an id from the sequence table, fill in the value, and route the row to the appropriate shard.
Since sequences are unsharded tables, they will be stored in the database (in our tutorial example, this is the commerce database).
The final goal is to have Sequences supported with SQL statements, like:
/* DDL support */CREATE SEQUENCE my_sequence;
SELECTNEXT VALUE FROM my_sequence;
ALTER SEQUENCE my_sequence ...;
DROP SEQUENCE my_sequence;
SHOWCREATE SEQUENCE my_sequence;
In the current implementation, we support the query access to Sequences, but not
the administration commands yet.