Reshard

Reshard a keyspace to achieve horizontal scaling

This documentation is for a new (v2) set of vtctld commands introduced in Vitess 11.0. See RFC for more details.

Command #

Reshard <options> <action> <workflow identifier>

or

Reshard [-source_shards=<source_shards>] [-target_shards=<target_shards>] [-cells=<cells>] [-tablet_types=<source_tablet_types>] [-skip_schema_copy] [-auto_start] [-stop_after_copy] [-timeout=timeoutDuration] [-reverse_replication] [-keep_data] <action> <keyspace.workflow>

Description #

Reshard is used to create and manage workflows to horizontally shard an existing keyspace. The source keyspace can be unsharded or sharded.

Parameters #

action #

Reshard is an “umbrella” command. The action sub-command defines the operation on the workflow. Action must be one of the following: Create, Complete, Cancel, SwitchTraffic, ReverseTraffic, Show, or Progress.
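As an illustration of the synopsis above, the following minimal Python sketch assembles a Reshard argument list in the documented order (options, then action, then workflow identifier) and rejects unknown actions before anything is sent to vtctld. The helper name reshard_args and the identifier customer.cust2cust are hypothetical, and the exact-match check is an assumption of this sketch; the real command may be more lenient about letter case.

```python
# Actions listed in this document for the Reshard umbrella command.
VALID_ACTIONS = (
    "Create", "Complete", "Cancel", "SwitchTraffic",
    "ReverseTraffic", "Show", "Progress",
)


def reshard_args(action, workflow_id, *options):
    """Return the argument list in synopsis order: options, action, identifier.

    Hypothetical helper: it only builds the arguments and does not
    contact vtctld.
    """
    if action not in VALID_ACTIONS:
        raise ValueError(
            f"unknown Reshard action {action!r}; "
            f"expected one of: {', '.join(VALID_ACTIONS)}"
        )
    return ["Reshard", *options, action, workflow_id]


if __name__ == "__main__":
    # "customer.cust2cust" is a made-up <keyspace.workflow> identifier.
    print(" ".join(reshard_args("Show", "customer.cust2cust")))
```

Checking the action name locally catches misspellings such as a tripled letter before the command reaches the cluster.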

options #

Each action has additional options/parameters that can be used to modify its behavior.

Actions are common to both MoveTables and Reshard v2 workflows. Only the Create action has different parameters; all other actions have common options and similar semantics. These actions are documented separately.

source_shards #

mandatory

Comma-separated list of shard names to reshard from.

target_shards #

mandatory

Comma-separated list of shard names to reshard to.
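To make the shard-name lists concrete, here is a small Python sketch that generates evenly split target shard names. It assumes the common Vitess convention of naming shards by hexadecimal key range (for example -80 and 80-), which this page does not spell out. The function name even_shard_names is hypothetical, and the source shard name 0 in the usage lines is only an example of an unsharded keyspace.

```python
def even_shard_names(count):
    """Return key-range shard names that split the keyspace ID space evenly.

    Assumes one-byte hex boundaries, so count must be a power of two
    between 2 and 256.
    """
    if not 2 <= count <= 256 or count & (count - 1):
        raise ValueError("count must be a power of two between 2 and 256")
    step = 256 // count
    # Interior boundaries as two-digit lowercase hex: 80, or 40/80/c0, ...
    bounds = [format(i * step, "02x") for i in range(1, count)]
    # Empty strings stand for the open ends of the first and last ranges.
    edges = [""] + bounds + [""]
    return [f"{lo}-{hi}" for lo, hi in zip(edges, edges[1:])]


if __name__ == "__main__":
    # Example: split an unsharded keyspace (source shard "0") into two shards.
    print("-source_shards=0")
    print("-target_shards=" + ",".join(even_shard_names(2)))  # -target_shards=-80,80-
```

Uneven splits are also possible by listing explicit key ranges; this sketch covers only the evenly divided case.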

-cells #

optional

Comma-separated list of Cell(s) or CellAlias(es) to replicate from.

-tablet_types #

optional
default empty

Source Vitess tablet type, or comma-separated list of tablet types, used to choose the source tablet(s) for the reshard.

Note: If replicating from primary, you must explicitly use -tablet_types=primary. If not specified, it defaults to the tablet type(s) specified by the -vreplication_tablet_type VTTablet command line flag. -vreplication_tablet_type defaults to “PRIMARY,REPLICA”.

-skip_schema_copy #

optional
default false

If false (the default), the source schema is copied to the target shards. If true, the schema copy is skipped and you need to create the tables on the target shards before calling Reshard.

-auto_start #

optional
default true

Normally the workflow starts immediately after it is created. If this flag is set to false then the workflow is in a Stopped state until you explicitly start it.

Uses #
  • Allows updating the rows in _vt.vreplication after MoveTables has set up the streams. For example, you can add filters to specific tables or change the projection clause to modify the values on the target. This provides an easier way to create simple Materialize workflows: first use MoveTables with auto_start set to false, update the BinlogSource as required by your Materialize, and then start the workflow.
  • Allows changing the copy_state and/or pos values to restart a broken MoveTables workflow from a specific point in time.

-stop_after_copy #

optional
default false

If set, the workflow stops once the Copy phase has completed. That is, once all tables have been copied and VReplication decides that the lag is small enough to start replicating, the workflow state is set to Stopped.

Uses #
  • If you only want a consistent snapshot of all the tables, you can set this flag. The workflow will stop once the copy is done, and you can then mark the workflow as Completed.

-timeout #

optional
default 30s

For primary tablets, SwitchTraffic first stops writes on the source primary and then waits for replication to the target to catch up with the point where the writes were stopped. If the wait exceeds the timeout, the command returns an error. For setups with high write QPS, you may need to increase this value.

-reverse_replication #

optional
default true

By default, SwitchTraffic for primary tablet types starts a reverse replication stream with the current target as the source, replicating back to the original source. This enables a quick and simple rollback using ReverseTraffic. The reverse workflow's name is that of the original workflow with _reverse appended.

If set to false, these reverse replication streams will not be created and you will not be able to roll back once you have switched write traffic over to the target.

-keep_data #

optional
default false

Usually, the target data (tables or shards) are deleted by Cancel. If this flag is used with MoveTables, target tables will not be deleted and, with Reshard, target shards will not be dropped.

workflow identifier #

All workflows are identified by targetKeyspace.workflow, where targetKeyspace is the name of the target keyspace (for Reshard, the keyspace being resharded) and workflow is a name you assign to the Reshard workflow to identify it.
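The identifier format can be sketched in Python as follows. This is a hedged illustration, not the parser vtctld uses: the names WorkflowId, parse_workflow_id and reverse_workflow_name are hypothetical, and the rule that the identifier holds exactly one dot is an assumption of the sketch. The _reverse suffix follows the -reverse_replication description on this page.

```python
from typing import NamedTuple


class WorkflowId(NamedTuple):
    keyspace: str
    workflow: str


def parse_workflow_id(identifier):
    """Split a 'targetKeyspace.workflow' identifier into its two parts."""
    parts = identifier.split(".")
    # Assumption: exactly one dot, with a non-empty name on each side.
    if len(parts) != 2 or not all(parts):
        raise ValueError(
            f"expected <keyspace.workflow>, got {identifier!r}"
        )
    return WorkflowId(*parts)


def reverse_workflow_name(workflow):
    """Name of the reverse workflow that SwitchTraffic creates by default."""
    return workflow + "_reverse"


if __name__ == "__main__":
    wf = parse_workflow_id("customer.cust2cust")  # made-up identifier
    print(wf.keyspace, wf.workflow, reverse_workflow_name(wf.workflow))
```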

The most basic Reshard Workflow lifecycle #

  1. Initiate the migration using Create
    Reshard -source_shards=<source_shards> -target_shards=<target_shards> Create <keyspace.workflow>
  2. Monitor the workflow using Show or Progress
    Reshard Show <keyspace.workflow> or
    Reshard Progress <keyspace.workflow>
  3. Confirm that data has been copied over correctly using VDiff
  4. Cut over to the target shards with SwitchTraffic
    Reshard SwitchTraffic <keyspace.workflow>
  5. Clean up vreplication artifacts and source shards with Complete
    Reshard Complete <keyspace.workflow>
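
The lifecycle above can be written down as an ordered list of commands. The Python sketch below only prints them as a dry run, so it needs no running cluster. How you submit each line to vtctld depends on your client setup and is not shown. The function name lifecycle_commands, the identifier customer.cust2cust and the shard names are illustrative assumptions.

```python
import shlex


def lifecycle_commands(workflow_id, source_shards, target_shards):
    """Return the basic Reshard lifecycle as argument lists, in order.

    Step 3 (verifying the copy with VDiff) is left out because its
    syntax is documented separately.
    """
    return [
        # 1. Initiate the migration.
        ["Reshard",
         f"-source_shards={source_shards}",
         f"-target_shards={target_shards}",
         "Create", workflow_id],
        # 2. Monitor the workflow (Show works the same way).
        ["Reshard", "Progress", workflow_id],
        # 4. Cut over once the copy has been verified.
        ["Reshard", "SwitchTraffic", workflow_id],
        # 5. Clean up vreplication artifacts and source shards.
        ["Reshard", "Complete", workflow_id],
    ]


if __name__ == "__main__":
    # Dry run: print each command instead of executing it.
    for argv in lifecycle_commands("customer.cust2cust", "0", "-80,80-"):
        print(shlex.join(argv))
```

Keeping the steps as data makes it easy to add a pause for verification between Progress and SwitchTraffic in whatever automation wraps these commands.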