“Behind every successful application is a clean and well-maintained database.”
“In the realm of database management, Percona Toolkit is the Swiss Army knife for administrators, unlocking unparalleled efficiency.”
Clean-up activities such as data purging and data migration are costly and time-consuming when done with hand-written SQL queries. That is where Percona Toolkit comes in handy: it performs complex tasks on MySQL, MariaDB, MongoDB, and PostgreSQL servers and systems. The toolkit was originally designed for advanced database and system administration tasks that would be too tedious or risky to do manually. Percona Toolkit is free and open source; you can download packages from the Percona website. It is derived from Maatkit and Aspersa, two of the best-known toolkits for MySQL server administration.
Maintaining a production database at an organizational scale takes a lot of effort. Looking to simplify our data-migration and purging work, we came across Percona Toolkit. The tool we used extensively for all of our MySQL data purging and migration is pt-archiver, which either deletes rows from a table outright or moves them from one table to another.
The setup and installation of the toolkit is straightforward; the tricky part is running the scripts for data migration and purging safely. We initially struggled there because we were unfamiliar with the many options the tool provides to mitigate the risks of putting system-wide stress on the database and breaking the application.
The installation can be done with the following command:
apt-get install percona-toolkit
Why did we need the pt-archiver?
We needed the pt-archiver tool for the following purposes:
- We needed to purge certain tables that are no longer used by our application. Simple MySQL DELETE commands won't suffice: they are very slow and put the database machine under enough stress to break the application on the end user's side.
- We needed to back up major tables whose row counts run into the millions or even billions. Those backups are stored in Amazon Web Services (AWS) S3.
- We needed to create new partitioned versions of old unpartitioned tables and migrate the data from the old tables to the new ones.
- Finally, once the tables are partitioned, we won't need all of the data in the databases all the time. We would automate a process that takes chunked backups of the partitioned tables, stores them in S3, and purges that data from the table.
All four of these use cases were met very satisfactorily using pt-archiver. It made the data manipulation effective, easy, and efficient without causing system-wide downtime.
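The automated "chunked backup to S3, then purge" step in the last use case can be sketched roughly as follows. The table name, cutoff column, bucket, host, and credentials are all hypothetical placeholders, and the commands are printed rather than executed so the sketch is safe to inspect; a nightly cron job would run them for real (the upload step assumes the AWS CLI is installed).

```shell
#!/usr/bin/env bash
# Sketch of the nightly "archive old rows to a file, push to S3" job.
# All names (host, db, table, bucket, credentials) are placeholders.
set -euo pipefail

TABLE="orders"                            # hypothetical table name
CUTOFF="$(date -d '31 days ago' +%F)"     # keep roughly one month in the DB
DUMP="/tmp/${TABLE}_${CUTOFF}.txt"

# pt-archiver writes the archived rows to --file and removes them
# from the source table as it goes.
ARCHIVE_CMD="pt-archiver \
  --source h=db-master,D=app_db,t=${TABLE},u=archiver,p=secret \
  --file ${DUMP} \
  --where \"created_at < '${CUTOFF}'\" \
  --limit 1000 --commit-each --sleep 1 --no-check-charset"

# Upload the chunk to S3, then delete the local copy.
UPLOAD_CMD="aws s3 cp ${DUMP} s3://my-backup-bucket/${TABLE}/ && rm -f ${DUMP}"

# Printed here instead of executed; cron would run both in sequence.
echo "${ARCHIVE_CMD}"
echo "${UPLOAD_CMD}"
```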
How do we use the pt-archiver?
Purging table data using pt-archiver:
The command used for purging data from a table in the database was as follows:
pt-archiver --source h=<host_name>,D=<db_name>,t=<table_name>,u=<username>,p=<password> --purge --where '<condition selecting rows from the table based on the primary key index>' --bulk-delete --limit 100 --commit-each --sleep 2 --dry-run --no-check-charset
The above command deletes rows from the specified table in bulk, in batches of 100 at a time. Options such as --limit, --sleep, and --dry-run exist so that the command does not disrupt the normal functioning of the system.
--limit controls how many rows satisfying the --where condition are selected and processed per batch.
--sleep makes the execution pause for 2 seconds after every batch purge.
--dry-run prints the statements the script would execute for the purge without actually running them.
Make sure the column on which the condition is placed is the primary key or part of the primary key, and that the user has all the permissions necessary to run the script.
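As a concrete illustration of the command above with the placeholders filled in, a purge might look like this; the hostname, database, table, credentials, and id threshold are all hypothetical, and --dry-run keeps it from touching any data:

```shell
# Hypothetical purge of app_db.events rows below an id threshold; every
# name and the threshold are placeholders. --dry-run only prints the SQL
# pt-archiver would run, so this form is safe to try first.
PURGE_CMD="pt-archiver \
  --source h=db-master,D=app_db,t=events,u=archiver,p=secret \
  --purge \
  --where 'id < 5000000' \
  --bulk-delete --limit 100 --commit-each --sleep 2 \
  --dry-run --no-check-charset"

echo "${PURGE_CMD}"
# Once the printed statements look correct, drop --dry-run to purge for real.
```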
Moving data between a source and a destination:
Here we move data from a source table in one database to a destination table in another. This requires connection details for both the source and the destination databases. Such migrations can produce significant replica lag, and since our replicas serve multiple business scripts, we cannot allow a large lag between the master and its slave replicas. Because these are all write queries, the scripts must always be run against the master, never against a replica. The command used for moving the data is as follows:
pt-archiver --source h=<source_hostname>,D=<source_database_name>,t=<source_table_name>,u=<source_username>,p=<source_password> --dest h=<destination_hostname>,D=<destination_database_name>,t=<destination_table_name>,u=<destination_username>,p=<destination_password> --where '<where condition>' --bulk-insert --check-slave-lag h=<slave_hostname>,u=<slave_username>,p=<slave_password> --max-lag 2 --limit 10000 --commit-each --sleep-coef 1.0 --no-check-charset
Here we specify the source and destination details for the data to be moved. The original table was unpartitioned, so it was renamed to <table_name_old>, and a new partitioned table with the updated schema was created under the original <table_name>.
The rider_location table, which stored the riders' locations, had around 2 to 3 billion rows, and running a partitioning query directly in MySQL would have brought the system down. Instead, we created a new partitioned table that kept the past month's partitions and created a new partition every day, then moved the data from the old table to the new one. That saved us time, money, and resources.
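The daily partition creation mentioned above can be automated with a small script. This is only a sketch under the assumption of RANGE partitioning by day with a catch-all p_future partition; the partition naming scheme is hypothetical, not our exact schema:

```shell
# Daily partition-maintenance sketch for a RANGE-partitioned table.
# Assumes partitions are bounded by TO_DAYS() of a date column and a
# p_future catch-all partition exists; names are placeholders.
TOMORROW="$(date -d '+1 day' +%Y%m%d)"
DAY_AFTER="$(date -d '+2 day' +%F)"

SQL="ALTER TABLE rider_location
  REORGANIZE PARTITION p_future INTO (
    PARTITION p${TOMORROW} VALUES LESS THAN (TO_DAYS('${DAY_AFTER}')),
    PARTITION p_future VALUES LESS THAN MAXVALUE
  );"

# Printed here; a nightly cron job would pipe it to the mysql client instead:
#   echo "${SQL}" | mysql -h <host_name> -u <username> -p <db_name>
echo "${SQL}"
```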
In the above command, the --check-slave-lag option takes care of replication lag. Hitting the master with a resource-intensive query via pt-archiver had caused significant lag on the slave replicas, which affected the app used by the rider fleet; any such incident stresses the system and hurts the end user. In the script above, replication lag is capped at a maximum of 2 seconds (--max-lag 2). Throughout our monitoring, the lag peaked at 3 seconds but never crossed that threshold. For the slave replica's user and password to work, the user must have the REPLICATION CLIENT privilege.
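To watch the same lag signal that --check-slave-lag throttles against, one can poll the replica directly; this is a minimal sketch with the document's placeholder connection details, printed rather than executed:

```shell
# One-shot replica-lag check: Seconds_Behind_Master from SHOW SLAVE STATUS
# is the value pt-archiver's --check-slave-lag polls. Hostname and
# credentials are placeholders; the user needs REPLICATION CLIENT.
CHECK_CMD="mysql -h <slave_hostname> -u <slave_username> -p \
  -e 'SHOW SLAVE STATUS\G' | grep Seconds_Behind_Master"

# Printed here; wrap it in a loop (e.g. watch -n 5) to track lag during a run.
echo "${CHECK_CMD}"
```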
For more on pt-archiver and the other tools in Percona Toolkit, please refer to the official Percona Toolkit documentation.
“A clean database is a silent guardian, ensuring the integrity of your data.”
- With the above use cases in place, our database storage is now well managed.
- We no longer have stale, unused data lying around in our tables.
- All major tables are backed up, and the purging of old partitions and creation of new ones are fully automated.
- Alerts are in place, so any shortcoming is reported to the technology team as soon as it occurs.