The Vitess Docs | Managed, Online Schema Changes

Edit this page

Managed, Online Schema Changes

<< Unmanaged Schema Changes Online DDL strategies >>

Vitess offers managed, online schema migrations (aka Online DDL), transparently to the user. Vitess Online DDL offers:

Non-blocking migrations
Migrations are asyncronously auto-scheduled, queued and executed by tablets
Migration state is trackable
Migrations are cancellable
Migrations are retry-able
Lossless, revertible migrations
Support for declarative migrations
Support for postponed migrations
Support for failover agnostic migrations
Support for concurrent migrations
Support for INSTANT DDL
See also advanced usage

As general overview:

User chooses a strategy for online DDL (online DDL is opt in)
User submits one or more schema change queries, using the standard MySQL CREATE TABLE, ALTER TABLE, DROP TABLE, CREATE VIEW, ALTER VIEW, DROP VIEW syntax.
Vitess responds with a Job ID for each schema change query.
Vitess resolves affected shards.
A shard's primary tablet schedules the migration to run when possible.
Tablets will independently run schema migrations:
- ALTER TABLE statements run via a VReplication workflow. Read more on ddl strategies
- CREATE TABLE and CREATE VIEW statements run directly.
- DROP TABLE statements run safely and lazily.
- ALTER VIEW and DROP VIEW are internally modified to allow quick revert.
Vitess provides the user a mechanism to view migration status, launch (if required), complete (if required), cancel or retry migrations, based on the job ID.

Syntax and supported statements #

Online DDL applies to the following statements:

CREATE TABLE
ALTER TABLE
DROP TABLE
CREATE VIEW
ALTER VIEW
DROP VIEW

Other DDL statements, such as RENAME, TRUNCATE, OPTIMIZE, etc., are not managed by the Online DDL mechanism and are executed directly on the backend MySQL servers.

ALTER TABLE #

Use the standard MySQL ALTER TABLE syntax to run online DDL. Whether your schema migration runs synchronously (the default MySQL behavior) or asynchronously (aka online), is determined by ddl_strategy.

We assume we have a keyspace (schema) called commerce, with a table called demo, that has the following definition:

CREATE TABLE `demo` (
  `id` int NOT NULL,
  `status` varchar(32) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB

Consider the following schema migration statement:

ALTER TABLE demo MODIFY id bigint UNSIGNED;

This statement can be executed as:

a vitess (aka online), managed online migration
a synchronous, managed schema change
a synchronous, unmanaged schema change

See DDL Strategies for discussion around the different options.

CREATE TABLE #

Use the standard MySQL CREATE TABLE syntax. The query goes through the same migration flow as ALTER TABLE does. The tablets eventually run the query directly on the MySQL backends.

DROP TABLE #

Use the standard MySQL DROP TABLE syntax. The query goes through the same migration flow as ALTER TABLE does. Tables are not immediately dropped. Instead, they are renamed into special names. For a while, the operation is revertible, after which the table lifecycle mechanism recognizes that the table is eligible for destruction, to then slowly and safely transition through lifecycle states into finally getting dropped.

CREATE VIEW, ALTER VIEW, DROP VIEW #

Use the standard MySQL syntax for these statements. All queries go through the same migration flow as above, and are revertible. Like DROP TABLE, a DROP VIEW statements does not immediately drop a view, but instead renamed is for safe keeping.

Statement transformations #

Vitess may modify your queries to qualify for online DDL statement. Modifications include:

A multi-table DROP statement is replaced by multiple DROP statements, each operating on a single table (and each tracked by its own job ID).
A CREATE INDEX statement is replaced by the equivalent ALTER TABLE statement.

ddl_strategy #

You will either set vtgate --ddl_strategy command line flag value, or will override it with the @@ddl_strategy session variable, or use the vtctldclient --ddl-strategy flag to control your schema migration strategy, and specifically, to enable and configure online DDL. Details in DDL Strategies.

Running, tracking and controlling Online DDL #

Vitess provides two interfaces to interacting with Online DDL:

SQL commands, via VTGate
Command line interface, via vtctldclient

Supported interactions are:

Running migrations (submitting Online DDL requests)
Tracking migrations
Launching a migration or all migrations, if explicitly set to postpone launch.
Completing a migration or all migrations, if explicitly set to postpone completion.
Cancelling a migration
Cancelling all pending migrations
Retrying a migration
Reverting a migration

See Audit and Control for a detailed breakdown. As quick examples:

Executing an Online DDL via VTGate/SQL #

mysql> set @@ddl_strategy='vitess';

mysql> alter table corder add column ts timestamp not null default current_timestamp;
+--------------------------------------+
| uuid                                 |
+--------------------------------------+
| bf4598ab_8d55_11eb_815f_f875a4d24e90 |
+--------------------------------------+

mysql> drop table customer;
+--------------------------------------+
| uuid                                 |
+--------------------------------------+
| 6848c1a4_8d57_11eb_815f_f875a4d24e90 |
+--------------------------------------+

Executing an Online DDL via vtctldclient #

$ vtctldclient ApplySchema --ddl-strategy "vitess" --sql "ALTER TABLE demo MODIFY id bigint UNSIGNED" commerce
a2994c92_f1d4_11ea_afa3_f875a4d24e90

You my run multiple migrations withing the same ApplySchema command:

$ vtctldclient ApplySchema --ddl-strategy "vitess" --sql "ALTER TABLE demo MODIFY id bigint UNSIGNED; CREATE TABLE sample (id int PRIMARY KEY); DROP TABLE another;" commerce
3091ef2a_4b87_11ec_a827_0a43f95f28a3

ApplySchema accepts the following flags:

--ddl_strategy: by default migrations run directly via MySQL standard DDL. This flag must be applied to indicate an online strategy. See also DDL strategies and ddl_strategy flags.
--migration_context <unique-value>: all migrations in a ApplySchema command are logically grouped via a unique context. A unique value will be supplied automatically. The user may choose to supply their own value, and it's their responsibility to provide with a unique value. Any string format is accepted. The context can then be used to search for migrations, via SHOW VITESS_MIGRATIONS LIKE 'the-context'. It is visible in SHOW VITESS_MIGRATIONS ... output as the migration_context column.

Migration flow and states #

A migration can be in any one of these states:

queued: a migration is submitted
ready: a migration is picked from the queue to run
running: a migration was started. It is periodically tested to be making progress.
complete: a migration completed successfully
failed: a migration started running and failed due to whatever reason
cancelled: a pending migration was cancelled

A migration is said to be pending if we expect it to run and complete. Pending migrations are those in queued, ready and running states.

For more about internals of the scheduler and how migration states are controlled, see Online DDL Scheduler

Configuration #

--retain_online_ddl_tables: (vttablet) determines how long vttablet should keep an old migrated table before purging it. Type: duration. Default: 24 hours.
Example: vttablet -retain_online_ddl_tables 48h
--migration_check_interval: (vttablet) interval between checks for submitted migrations. Type: duration Default: 1 minute.
Example: vttablet --migration_check_interval 30s
--enable_online_ddl: (vtgate) whether Online DDL operations are at all possible through VTGate. Type: boolean. Default: true
Example: vtgate --enable_online_ddl=false to disable access to Online DDL via VTGate.

Auto resume after failure #

VReplication based migrations (ddl_strategy="vitess") are failover agnostic. They automatically resume after either planned promotion (PlannedReparentShard), emergency promotion (EmergencyReparentShard) or completely external reparenting.

Once the new primary is in place and turns active, it auto-resumes the VReplication stream. The online DDL scheduler assumes ownership of the stream and follows it to completion.

The new primary must be available within 10 minutes, or else the migration is considered to be stale and is aborted.

Auto retry after failure #

Vitess management can issue an automated retry (starting the migration afresh).

which vttablet initiated the migration
how many times a migration has been retried
whether a migration failed due to a vttablet failure (as is the case in a failover scenario)

Vitess will auto-retry a failed migration when:

The migration failed due to a vttablet failure, and
it has not been retried (this is a temporary restriction)

The migration will be transitioned into queued state, as if the user requested a retry operation. Note that this takes place on a per-shard basis.

The primary use case is a primary failure and failover. The newly promoted tablet will be able to retry the migration that broke during the previous primary failure. To clarify, the migration will start anew, as at this time there is no mechanism to resume a broken migration.

Throttling #

vitess utilizes the tablet throttler, which is a cooperative throttler service based on replication lag. The tablet throttler automatically detects topology REPLICA tablets and adapts to changes in the topology. See Tablet throttler.

NOTE that at this time (and subject to change) the tablet throttler is disabled by default. Enable it via vtctldclient UpdateThrottlerConfig --enable <keyspace>. If the tablet throttler is disabled, schema migrations will not throttle on replication lag.

Table cleanup #

All ALTER strategies leave artifacts behind. Whether successful or failed, either the original table or the ghost table is left still populated at the end of the migration. Vitess explicitly makes sure the tables are not dropped at the end of the migration. This is for two reasons:

Make the table/data still available for a while, and
In MySQL prior to 8.0.23, a DROP TABLE operation can be dangerous in production as it commonly locks the buffer pool for a substantial period.

The tables are kept for 24 hours after migration completion. Vitess automatically cleans up those tables as soon as a migration completes (either successful or failed). You will normally not need to do anything.

Artifact tables are identifiable via artifacts column in a SHOW VITESS_MIGRATION ... command. You should generally not touch these tables. It's possible to DROP those tables with direct DDL strategy. Note that dropping tables in production can be risky and lock down your database for a substantial period of time. Dropping artifact tables also makes the migrations impossible to revert.

Noblob #

The noblob binlog row image is supported by the MoveTables and Reshard VReplication workflows.

NOTE replication does not support blobs as part of the PRIMARY KEY in MoveTables operations, or as part of the migration key in Online DDL operations.

<< Unmanaged Schema Changes Online DDL strategies >>