vtbackup

The Vitess Batch Command for Backup Maintenance

vtbackup is a batch command to perform a single pass of backup maintenance for a shard.

When run periodically for each shard, vtbackup can enforce the following configurable policies:

  • There is always a recent backup for the shard.
  • Old backups for the shard are removed.

Whatever system launches vtbackup is responsible for the following:

  • Running vtbackup with flags similar to those that would be used for vttablet and mysqlctld in the target shard to be backed up.
  • Provisioning as much disk space for vtbackup as would be given to vttablet. The data directory MUST be empty at startup. Do NOT reuse a persistent disk.
  • Running vtbackup periodically for each shard, for each backup storage location.
  • Ensuring that at most one instance runs at a time for a given pair of shard and backup storage location.
  • Retrying vtbackup if it fails.
  • Alerting human operators if the failure is persistent.
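
The launcher duties listed above (single instance, retries, alerting) could be sketched as a small shell wrapper. This is only an illustration under stated assumptions, not something Vitess ships: the function name run_with_retries, the lock file path, the attempt count and delay, and the alert-operators command are all hypothetical. flock(1) from util-linux only prevents concurrent runs on a single host, so a cluster-wide scheduler would need its own mutual-exclusion mechanism.

```shell
# Hypothetical launcher helper; none of these names come from Vitess.

# run_with_retries MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS attempts have failed.
# Returns 0 on the first success, 1 if every attempt failed.
run_with_retries() {
  local max_attempts delay attempt
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt ${attempt}/${max_attempts} failed: $*" >&2
    # Pause between attempts, but not after the final one.
    if [ "$attempt" -lt "$max_attempts" ]; then
      sleep "$delay"
    fi
    attempt=$((attempt + 1))
  done
  return 1
}

# Example use (commented out so that sourcing this file runs nothing):
#
#   (
#     # Skip this pass if another instance already holds the lock.
#     flock -n 9 || exit 0
#     run_with_retries 3 60 vtbackup $TOPOLOGY_FLAGS \
#         --init_keyspace=commerce --init_shard=0 \
#       || alert-operators "vtbackup failed for commerce/0"  # hypothetical command
#   ) 9>/tmp/vtbackup-commerce-0.lock
```

Because the data directory must be empty at startup, a real wrapper would also need to wipe or re-provision that directory before each retry; the sketch leaves that step out.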

Example Usage

On a running Vitess cluster, the following command will create a backup using vtbackup for keyspace commerce and shard 0.

export TOPOLOGY_FLAGS="--topo_implementation etcd2 --topo_global_server_address localhost:2379 --topo_global_root /vitess/global"
export VTROOT="/tmp"

mkdir -p $VTROOT/{backups,socket}

vtbackup \
 $TOPOLOGY_FLAGS \
 --backup_storage_implementation file \
 --file_backup_storage_root $VTROOT/backups/vitess-local/ \
 --logtostderr=true \
 --mysql_socket $VTROOT/socket/mysql.sock \
 --port 15500 \
 --mysql_port 33306 \
 --init_shard=0 \
 --init_keyspace=commerce \
 --db_dba_user=vt_dba

While it is running, vtbackup serves debugging info and metrics on port 15500, and starts a mysqld daemon serving on port 33306.
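
While a vtbackup run is in progress, its metrics can be inspected over HTTP. The sketch below assumes that vtbackup, like other Vitess server binaries, exposes Go expvar-style metrics at the /debug/vars path on the --port address. That path and the helper name debug_vars_url are assumptions and are not stated on this page.

```shell
# Hypothetical helper: print the URL at which vtbackup is assumed to
# serve expvar-style metrics. Defaults match the example above.
# Usage: debug_vars_url [HOST] [PORT]
debug_vars_url() {
  printf 'http://%s:%s/debug/vars\n' "${1:-localhost}" "${2:-15500}"
}

# Example (requires curl and a running vtbackup; commented out on purpose):
#   curl -s "$(debug_vars_url localhost 15500)"
```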

Options

Name | Type | Definition
--allow_first_backup | boolean | Allow this job to take the first backup of an existing shard.
--alsologtostderr | boolean | log to standard error as well as files
--azblob_backup_account_key_file | string | Path to a file containing the Azure Storage account key; if this flag is unset, the environment variable VT_AZBLOB_ACCOUNT_KEY will be used as the key itself (NOT a file path).
--azblob_backup_account_name | string | Azure Storage Account name for backups; if this flag is unset, the environment variable VT_AZBLOB_ACCOUNT_NAME will be used.
--azblob_backup_buffer_size | int | The memory buffer size to use in bytes, per file or stripe, when streaming to Azure Blob Service. (default 104857600)
--azblob_backup_container_name | string | Azure Blob Container Name.
--azblob_backup_parallelism | int | Azure Blob operation parallelism (requires extra memory when increased -- a multiple of azblob_backup_buffer_size). (default 1)
--azblob_backup_storage_root | string | Root prefix for all backup-related Azure Blobs; this should exclude both initial and trailing '/' (e.g. just 'a/b' not '/a/b/').
--backup_engine_implementation | string | Specifies which implementation to use for creating new backups (builtin or xtrabackup). Restores will always be done with whichever engine created a given backup. (default "builtin")
--backup_storage_block_size | int | if backup_storage_compress is true, backup_storage_block_size sets the byte size for each block while compressing. (default 250000)
--backup_storage_compress | boolean | if set, the backup files will be compressed. (default true)
--backup_storage_implementation | string | Which backup storage implementation to use for creating and restoring backups.
--backup_storage_number_blocks | int | if backup_storage_compress is true, backup_storage_number_blocks sets the number of blocks that can be processed, at once, before the writer blocks, during compression. It should be equal to the number of CPUs available for compression. (default 2)
--builtinbackup-file-read-buffer-size | uint | read files using an IO buffer of this many bytes. Golang defaults are used when set to 0.
--builtinbackup-file-write-buffer-size | uint | write files using an IO buffer of this many bytes. Golang defaults are used when set to 0. (default 2097152)
--builtinbackup_mysqld_timeout | duration | how long to wait for mysqld to shut down at the start of the backup. (default 10m0s)
--builtinbackup_progress | duration | how often to send progress updates when backing up large files. (default 5s)
--ceph_backup_storage_config | string | Path to JSON config file for ceph backup storage. (default "ceph_backup_config.json")
--compression-engine-name | string | compressor engine used for compression. (default "pargzip")
--compression-level | int | what level to pass to the compressor. (default 1)
--concurrency | int | (init restore parameter) how many concurrent files to restore at once (default 4)
--consul_auth_static_file | string | JSON File to read the topos/tokens from.
--db-credentials-file | string | db credentials file; send SIGHUP to reload this file
--db-credentials-server | string | db credentials server type ('file' - file implementation; 'vault' - HashiCorp Vault implementation) (default "file")
--db-credentials-vault-addr | string | URL to Vault server
--db-credentials-vault-path | string | Vault path to credentials JSON blob, e.g.: secret/data/prod/dbcreds
--db-credentials-vault-role-mountpoint | string | Vault AppRole mountpoint; can also be passed using VAULT_MOUNTPOINT environment variable (default "approle")
--db-credentials-vault-role-secretidfile | string | Path to file containing Vault AppRole secret_id; can also be passed using VAULT_SECRETID environment variable
--db-credentials-vault-roleid | string | Vault AppRole id; can also be passed using VAULT_ROLEID environment variable
--db-credentials-vault-timeout | duration | Timeout for vault API operations (default 10s)
--db-credentials-vault-tls-ca | string | Path to CA PEM for validating Vault server certificate
--db-credentials-vault-tokenfile | string | Path to file containing Vault auth token; token can also be passed using VAULT_TOKEN environment variable
--db-credentials-vault-ttl | duration | How long to cache DB credentials from the Vault server (default 30m0s)
--db_allprivs_password | string | db allprivs password
--db_allprivs_use_ssl | boolean | Set this flag to false to make the allprivs connection to not use ssl (default true)
--db_allprivs_user | string | db allprivs user userKey (default "vt_allprivs")
--db_app_password | string | db app password
--db_app_use_ssl | boolean | Set this flag to false to make the app connection to not use ssl (default true)
--db_app_user | string | db app user userKey (default "vt_app")
--db_appdebug_password | string | db appdebug password
--db_appdebug_use_ssl | boolean | Set this flag to false to make the appdebug connection to not use ssl (default true)
--db_appdebug_user | string | db appdebug user userKey (default "vt_appdebug")
--db_charset | string | Character set used for this tablet. (default "utf8mb4")
--db_conn_query_info | boolean | enable parsing and processing of QUERY_OK info fields
--db_connect_timeout_ms | int | connection timeout to mysqld in milliseconds (0 for no timeout)
--db_dba_password | string | db dba password
--db_dba_use_ssl | boolean | Set this flag to false to make the dba connection to not use ssl (default true)
--db_dba_user | string | db dba user userKey (default "vt_dba")
--db_erepl_password | string | db erepl password
--db_erepl_use_ssl | boolean | Set this flag to false to make the erepl connection to not use ssl (default true)
--db_erepl_user | string | db erepl user userKey (default "vt_erepl")
--db_filtered_password | string | db filtered password
--db_filtered_use_ssl | boolean | Set this flag to false to make the filtered connection to not use ssl (default true)
--db_filtered_user | string | db filtered user userKey (default "vt_filtered")
--db_flags | uint | Flag values as defined by MySQL.
--db_flavor | string | Flavor override. Valid value is FilePos.
--db_host | string | The host name for the tcp connection.
--db_port | int | tcp port
--db_repl_password | string | db repl password
--db_repl_use_ssl | boolean | Set this flag to false to make the repl connection to not use ssl (default true)
--db_repl_user | string | db repl user userKey (default "vt_repl")
--db_server_name | string | server name of the DB we are connecting to.
--db_socket | string | The unix socket to connect on. If this is specified, host and port will not be used.
--db_ssl_ca | string | connection ssl ca
--db_ssl_ca_path | string | connection ssl ca path
--db_ssl_cert | string | connection ssl certificate
--db_ssl_key | string | connection ssl key
--db_ssl_mode | SslMode | SSL mode to connect with. One of disabled, preferred, required, verify_ca & verify_identity.
--db_tls_min_version | string | Configures the minimal TLS version negotiated when SSL is enabled. Defaults to TLSv1.2. Options: TLSv1.0, TLSv1.1, TLSv1.2, TLSv1.3.
--detach | boolean | detached mode - run backups detached from the terminal
--disable-redo-log | boolean | Disable InnoDB redo log during replication-from-primary phase of backup.
--emit_stats | boolean | If set, emit stats to push-based monitoring and stats backends
--external-compressor | string | command with arguments to use when compressing a backup.
--external-compressor-extension | string | extension to use when using an external compressor.
--external-decompressor | string | command with arguments to use when decompressing a backup.
--file_backup_storage_root | string | Root directory for the file backup storage.
--gcs_backup_storage_bucket | string | Google Cloud Storage bucket to use for backups.
--gcs_backup_storage_root | string | Root prefix for all backup-related object names.
--grpc_auth_static_client_creds | string | When using grpc_static_auth in the server, this file provides the credentials to use to authenticate with server.
--grpc_compression | string | Which protocol to use for compressing gRPC. Default: nothing. Supported: snappy
--grpc_enable_tracing | boolean | Enable gRPC tracing.
--grpc_initial_conn_window_size | int | gRPC initial connection window size
--grpc_initial_window_size | int | gRPC initial window size
--grpc_keepalive_time | duration | After a duration of this time, if the client doesn't see any activity, it pings the server to see if the transport is still alive. (default 10s)
--grpc_keepalive_timeout | duration | After having pinged for keepalive check, the client waits for a duration of Timeout and if no activity is seen even after that the connection is closed. (default 10s)
--grpc_max_message_size | int | Maximum allowed RPC message size. Larger messages will be rejected by gRPC with the error 'exceeding the max size'. (default 16777216)
--grpc_prometheus | boolean | Enable gRPC monitoring with Prometheus.
--incremental_from_pos | string | Position of previous backup. Default: empty. If given, then this backup becomes an incremental backup from given position. If value is 'auto', backup taken from last successful backup position
--init_db_name_override | string | (init parameter) override the name of the db used by vttablet
--init_db_sql_file | string | path to .sql file to run after mysql_install_db
--init_keyspace | string | (init parameter) keyspace to use for this tablet
--init_shard | string | (init parameter) shard to use for this tablet
--initial_backup | boolean | Instead of restoring from backup, initialize an empty database with the provided init_db_sql_file and upload a backup of that for the shard, if the shard has no backups yet. This can be used to seed a brand new shard with an initial, empty backup. If any backups already exist for the shard, this will be considered a successful no-op. This can only be done before the shard exists in topology (i.e. before any tablets are deployed).
--keep-alive-timeout | duration | Wait until timeout elapses after a successful backup before shutting down.
--keep_logs | duration | keep logs for this long (using ctime) (zero to keep forever)
--keep_logs_by_mtime | duration | keep logs for this long (using mtime) (zero to keep forever)
--lock-timeout | duration | Maximum time for which a shard/keyspace lock can be acquired for (default 45s)
--log_backtrace_at | traceLocation | when logging hits line file:N, emit a stack trace (default :0)
--log_dir | string | If non-empty, write log files in this directory
--log_err_stacks | boolean | log stack traces for errors
--log_rotate_max_size | uint | size in bytes at which logs are rotated (glog.MaxSize) (default 1887436800)
--logtostderr | boolean | log to standard error instead of files
--manifest-external-decompressor | string | command with arguments to store in the backup manifest when compressing a backup with an external compression engine.
--min_backup_interval | duration | Only take a new backup if it's been at least this long since the most recent backup.
--min_retention_count | int | Always keep at least this many of the most recent backups in this backup storage location, even if some are older than the min_retention_time. This must be at least 1 since a backup must always exist to allow new backups to be made (default 1)
--min_retention_time | duration | Keep each old backup for at least this long before removing it. Set to 0 to disable pruning of old backups.
--mycnf-file | string | path to my.cnf, if reading all config params from there
--mycnf_bin_log_path | string | mysql binlog path
--mycnf_data_dir | string | data directory for mysql
--mycnf_error_log_path | string | mysql error log path
--mycnf_general_log_path | string | mysql general log path
--mycnf_innodb_data_home_dir | string | Innodb data home directory
--mycnf_innodb_log_group_home_dir | string | Innodb log group home directory
--mycnf_master_info_file | string | mysql master.info file
--mycnf_mysql_port | int | port mysql is listening on
--mycnf_pid_file | string | mysql pid file
--mycnf_relay_log_index_path | string | mysql relay log index path
--mycnf_relay_log_info_path | string | mysql relay log info path
--mycnf_relay_log_path | string | mysql relay log path
--mycnf_secure_file_priv | string | mysql path for loading secure files
--mycnf_server_id | int | mysql server id of the server (if specified, mycnf-file will be ignored)
--mycnf_slow_log_path | string | mysql slow query log path
--mycnf_socket_file | string | mysql socket file
--mycnf_tmp_dir | string | mysql tmp directory
--mysql_port | int | mysql port (default 3306)
--mysql_server_version | string | MySQL server version to advertise.
--mysql_socket | string | path to the mysql socket
--mysql_timeout | duration | how long to wait for mysqld startup (default 5m0s)
--port | int | port for the server
--pprof | strings | enable profiling
--purge_logs_interval | duration | how often to try to remove old logs (default 1h0m0s)
--remote_operation_timeout | duration | time to wait for a remote operation (default 15s)
--restart_before_backup | boolean | Perform a mysqld clean/full restart after applying binlogs, but before taking the backup. Only makes sense to work around xtrabackup bugs.
--s3_backup_aws_endpoint | string | endpoint of the S3 backend (region must be provided).
--s3_backup_aws_region | string | AWS region to use. (default "us-east-1")
--s3_backup_aws_retries | int | AWS request retries. (default -1)
--s3_backup_force_path_style |  | force the s3 path style.
--s3_backup_log_level | string | determine the S3 loglevel to use from LogOff, LogDebug, LogDebugWithSigning, LogDebugWithHTTPBody, LogDebugWithRequestRetries, LogDebugWithRequestErrors. (default "LogOff")
--s3_backup_server_side_encryption | string | server-side encryption algorithm (e.g., AES256, aws:kms, sse_c:/path/to/key/file).
--s3_backup_storage_bucket | string | S3 bucket to use for backups.
--s3_backup_storage_root | string | root prefix for all backup-related object names.
--s3_backup_tls_skip_verify_cert |  | skip the 'certificate is valid' check for SSL connections.
--security_policy | string | the name of a registered security policy to use for controlling access to URLs - empty means allow all for anyone (built-in policies: deny-all, read-only)
--sql-max-length-errors | int | truncate queries in error logs to the given length (default unlimited)
--sql-max-length-ui | int | truncate queries in debug UIs to the given length (default 512)
--stats_backend | string | The name of the registered push-based monitoring/stats backend to use
--stats_combine_dimensions | string | List of dimensions to be combined into a single "all" value in exported stats vars
--stats_common_tags | strings | Comma-separated list of common tags for the stats backend. It provides both label and values. Example: label1:value1,label2:value2
--stats_drop_variables | string | Variables to be dropped from the list of exported variables.
--stats_emit_period | duration | Interval between emitting stats to all registered backends (default 1m0s)
--stderrthreshold | severity | logs at or above this threshold go to stderr (default 1)
--tablet_manager_grpc_ca | string | the server ca to use to validate servers when connecting
--tablet_manager_grpc_cert | string | the cert to use to connect
--tablet_manager_grpc_concurrency | int | concurrency to use to talk to a vttablet server for performance-sensitive RPCs (like ExecuteFetchAs{Dba,AllPrivs,App}) (default 8)
--tablet_manager_grpc_connpool_size | int | number of tablets to keep tmclient connections open to (default 100)
--tablet_manager_grpc_crl | string | the server crl to use to validate server certificates when connecting
--tablet_manager_grpc_key | string | the key to use to connect
--tablet_manager_grpc_server_name | string | the server name to use to validate server certificate
--tablet_manager_protocol | string | Protocol to use to make tabletmanager RPCs to vttablets. (default "grpc")
--topo_consul_lock_delay | duration | LockDelay for consul session. (default 15s)
--topo_consul_lock_session_checks | string | List of checks for consul session. (default "serfHealth")
--topo_consul_lock_session_ttl | string | TTL for consul session.
--topo_consul_watch_poll_duration | duration | time of the long poll for watch queries. (default 30s)
--topo_etcd_lease_ttl | int | Lease TTL for locks and leader election. The client will use KeepAlive to keep the lease going. (default 30)
--topo_etcd_tls_ca | string | path to the ca to use to validate the server cert when connecting to the etcd topo server
--topo_etcd_tls_cert | string | path to the client cert to use to connect to the etcd topo server, requires topo_etcd_tls_key, enables TLS
--topo_etcd_tls_key | string | path to the client key to use to connect to the etcd topo server, enables TLS
--topo_global_root | string | the path of the global topology data in the global topology server
--topo_global_server_address | string | the address of the global topology server
--topo_implementation | string | the topology implementation to use
--topo_zk_auth_file | string | auth to use when connecting to the zk topo server, file contents should be <scheme>:<auth>, e.g., digest:user:pass
--topo_zk_base_timeout | duration | zk base timeout (see zk.Connect) (default 30s)
--topo_zk_max_concurrency | int | maximum number of pending requests to send to a Zookeeper server. (default 64)
--topo_zk_tls_ca | string | the server ca to use to validate servers when connecting to the zk topo server
--topo_zk_tls_cert | string | the cert to use to connect to the zk topo server, requires topo_zk_tls_key, enables TLS
--topo_zk_tls_key | string | the key to use to connect to the zk topo server, enables TLS
--v | Level | log level for V logs
-v, --version |  | print binary version
--vmodule | moduleSpec | comma-separated list of pattern=N settings for file-filtered logging
--xbstream_restore_flags | string | Flags to pass to xbstream command during restore. These should be space separated and will be added to the end of the command. These need to match the ones used for backup e.g. --compress / --decompress, --encrypt / --decrypt
--xtrabackup_backup_flags | string | Flags to pass to backup command. These should be space separated and will be added to the end of the command
--xtrabackup_prepare_flags | string | Flags to pass to prepare command. These should be space separated and will be added to the end of the command
--xtrabackup_root_path | string | Directory location of the xtrabackup and xbstream executables, e.g., /usr/bin
--xtrabackup_stream_mode | string | Which mode to use if streaming, valid values are tar and xbstream. Please note that tar is not supported in XtraBackup 8.0 (default "tar")
--xtrabackup_stripe_block_size | uint | Size in bytes of each block that gets sent to a given stripe before rotating to the next stripe (default 102400)
--xtrabackup_stripes | uint | If greater than 0, use data striping across this many destination files to parallelize data transfer and decompression
--xtrabackup_user | string | User that xtrabackup will use to connect to the database server. This user must have all necessary privileges. For details, please refer to xtrabackup documentation.
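
The two policies described at the top of this page (a recent backup always exists, and old backups are removed) map onto the --min_backup_interval, --min_retention_time, and --min_retention_count options. As a minimal sketch, the hypothetical helper below prints those three flags for a chosen policy and rejects a retention count below 1, which the option description above disallows. The helper name and the example values (24h, 168h, 3) are illustrative assumptions, not recommended settings.

```shell
# Hypothetical helper: print retention-policy flags for vtbackup.
# Usage: retention_flags MIN_BACKUP_INTERVAL MIN_RETENTION_TIME MIN_RETENTION_COUNT
# The two durations are passed through unchanged (e.g. 24h, 168h).
retention_flags() {
  # The count must be a whole number...
  case "$3" in
    ''|*[!0-9]*)
      echo "min_retention_count must be a positive integer" >&2
      return 1
      ;;
  esac
  # ...and at least 1, since one backup must always be kept.
  if [ "$3" -lt 1 ]; then
    echo "min_retention_count must be at least 1" >&2
    return 1
  fi
  printf '%s\n' \
    "--min_backup_interval=$1" \
    "--min_retention_time=$2" \
    "--min_retention_count=$3"
}

# Example (commented out so that sourcing this file runs nothing):
#   vtbackup $TOPOLOGY_FLAGS --init_keyspace=commerce --init_shard=0 \
#     $(retention_flags 24h 168h 3)
```

With these example values, a new backup would be taken only if the most recent one is at least 24 hours old, and backups older than 168 hours would become eligible for removal as long as the three most recent are kept.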