Change Character set and collation in Tables and Columns in Mysql 8 

In MySQL 8, we can change the default character set and collation of a table without locking the entire table, allowing us to perform the operation without downtime. This can be achieved by using the ALGORITHM=INPLACE option. However, it's important to note that changing the character set of a column while allowing concurrent DML (Data Manipulation Language) operations is not supported with ALGORITHM=INPLACE. In such cases, we can use the ALGORITHM=COPY option, which reduces downtime. With ALGORITHM=COPY, MySQL creates a temporary copy of the table, performs the necessary DDL (Data Definition Language) operation on the copy, and then replaces the original table with the modified copy. During this process, the original table remains accessible for concurrent read and write operations, minimizing the impact on ongoing activities.


Let's consider an example using a table named "employees" with a default character set of "latin1" and collation of "latin1_german1_ci". For testing purposes, we have a stored procedure that inserts a bulk number of rows into the table. This allows us to observe if any locks occur during the character set and collation changes in the table and column. The test results show that there was no lock during the change of character set and collation at the table level. However, when changing the character set of a column, there was a brief lock for a few seconds.
 

select version();

+-----------+

| version() |

+-----------+

| 8.0.23 |

+-----------+

 

CREATE TABLE employees (

id INT AUTO_INCREMENT PRIMARY KEY,

name VARCHAR(100) CHARACTER SET latin1 COLLATE latin1_german1_ci,

age INT,

department VARCHAR(100)

) DEFAULT CHARACTER SET latin2 COLLATE latin2_bin;

 


-- Check the default character set and collation

SELECT TABLE_SCHEMA,

            TABLE_NAME,

            CCSA.CHARACTER_SET_NAME AS DEFAULT_CHAR_SET,

            COLUMN_NAME,

            COLUMN_TYPE,

            C.CHARACTER_SET_NAME, CCSA.COLLATION_NAME, ENGINE

    FROM information_schema.TABLES AS T

    JOIN information_schema.COLUMNS AS C USING (TABLE_SCHEMA, TABLE_NAME)

    JOIN information_schema.COLLATION_CHARACTER_SET_APPLICABILITY AS CCSA

         ON (T.TABLE_COLLATION = CCSA.COLLATION_NAME)

   WHERE TABLE_SCHEMA='testdb'

     AND C.DATA_TYPE IN ('enum', 'varchar', 'char', 'text', 'mediumtext', 'longtext')

   ORDER BY TABLE_SCHEMA,

               TABLE_NAME,

               COLUMN_NAME, CCSA.COLLATION_NAME, ENGINE;

 

TABLE_SCHEMA|TABLE_NAME|DEFAULT_CHAR_SET|COLUMN_NAME|COLUMN_TYPE |CHARACTER_SET_NAME|COLLATION_NAME|ENGINE|

------------+----------+----------------+-----------+------------+------------------+--------------+------+---+------------------+--------------+------+---+------------------+--------------+------+

testdb                |employees |latin2                            |department            |varchar(100)  |latin2                              |latin2_bin             |InnoDB|

testdb                |employees |latin2                            |name                     |varchar(100) |latin1                      |latin2_bin             |InnoDB|                                

                                               

 -- Create a procedure for bulk insert in employees table

CREATE PROCEDURE insert_employees(IN numRecords INT)

BEGIN

DECLARE counter INT DEFAULT 0;

DECLARE employeeName VARCHAR(100);

DECLARE employeeAge INT;

DECLARE employeeDepartment VARCHAR(100);

 

WHILE counter < numRecords DO

     SET employeeName = CONCAT('Employee', counter);

     SET employeeAge = FLOOR(RAND() * 50) + 20;

     SET employeeDepartment = CONCAT('Department', FLOOR(RAND() * 10) + 1);

 

     INSERT INTO employees (name, age, department)

     VALUES (employeeName, employeeAge, employeeDepartment);

 

     SET counter = counter + 1;

 

     SELECT CONCAT('Inserted record: ', employeeName, ', ', employeeAge, ', ', employeeDepartment) AS 'Status';

END WHILE;

END;

 

-- Call the stored procedure to insert records 

CALL insert_employees(100000);

 

-- Execute the Alter statement in another terminal while during the insert operation 

ALTER TABLE employees CHARACTER SET = utf8mb4 COLLATE utf8mb4_general_ci, ALGORITHM=INPLACE, LOCK=NONE;

Query OK, 0 rows affected (0.21 sec)

Records: 0  Duplicates: 0  Warnings: 0

 

-- Check the default character set and collation

SELECT TABLE_SCHEMA,

            TABLE_NAME,

            CCSA.CHARACTER_SET_NAME AS DEFAULT_CHAR_SET,

            COLUMN_NAME,

            COLUMN_TYPE,

            C.CHARACTER_SET_NAME, CCSA.COLLATION_NAME, ENGINE

    FROM information_schema.TABLES AS T

    JOIN information_schema.COLUMNS AS C USING (TABLE_SCHEMA, TABLE_NAME)

    JOIN information_schema.COLLATION_CHARACTER_SET_APPLICABILITY AS CCSA

         ON (T.TABLE_COLLATION = CCSA.COLLATION_NAME)

   WHERE TABLE_SCHEMA='testdb'

     AND C.DATA_TYPE IN ('enum', 'varchar', 'char', 'text', 'mediumtext', 'longtext')

   ORDER BY TABLE_SCHEMA,

               TABLE_NAME,

               COLUMN_NAME, CCSA.COLLATION_NAME, ENGINE;

                                               

TABLE_SCHEMA|TABLE_NAME|DEFAULT_CHAR_SET|COLUMN_NAME|COLUMN_TYPE |CHARACTER_SET_NAME|COLLATION_NAME    |ENGINE|

------------+----------+----------------+-----------+------------+------------------+------------------+------+--+------------------+------+--+------------------+------+--+------------------+------+

testdb                |employees  |utf8mb4                   |department       |varchar(100)    |latin2                                  |utf8mb4_general_ci|InnoDB|

testdb                |employees  |utf8mb4                    |name                 |varchar(100)   |latin1                                |utf8mb4_general_ci|InnoDB|                                       

The test shows no lock during the change of character set and collation in Table.                                       

 

 -- Call the stored procedure to insert records 

CALL insert_employees(100000);

 

-- Execute the Alter statement in another terminal while during the insert operation  

ALTER TABLE employees modify name VARCHAR(100) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci, ALGORITHM=COPY, LOCK=SHARED;

Query OK, 803155 rows affected (7.29 sec)

Records: 803155  Duplicates: 0  Warnings: 0

 
-- Check the default character set and collation

SELECT TABLE_SCHEMA,

            TABLE_NAME,

            CCSA.CHARACTER_SET_NAME AS DEFAULT_CHAR_SET,

            COLUMN_NAME,

            COLUMN_TYPE,

            C.CHARACTER_SET_NAME, CCSA.COLLATION_NAME, ENGINE

    FROM information_schema.TABLES AS T

    JOIN information_schema.COLUMNS AS C USING (TABLE_SCHEMA, TABLE_NAME)

    JOIN information_schema.COLLATION_CHARACTER_SET_APPLICABILITY AS CCSA

         ON (T.TABLE_COLLATION = CCSA.COLLATION_NAME)

   WHERE TABLE_SCHEMA='testdb'

     AND C.DATA_TYPE IN ('enum', 'varchar', 'char', 'text', 'mediumtext', 'longtext')

   ORDER BY TABLE_SCHEMA,

               TABLE_NAME,

               COLUMN_NAME, CCSA.COLLATION_NAME, ENGINE;

                                               

 

TABLE_SCHEMA|TABLE_NAME|DEFAULT_CHAR_SET|COLUMN_NAME|COLUMN_TYPE |CHARACTER_SET_NAME|COLLATION_NAME    |ENGINE|

------------+----------+----------------+-----------+------------+------------------+------------------+------+-------------+------------------+------+-------------+------------------+------+----+------+

testdb                 |employees  |utf8mb4                   |department       |varchar(100)    |latin2                                |utf8mb4_general_ci|InnoDB|

testdb                  |employees |utf8mb4                     |name                 |varchar(100)     |utf8mb4                      |utf8mb4_general_ci|InnoDB|

 

The test show that there was few seconds lock during the change of character set in the column


ALGORITHM=INPLACE Vs ALGORITHM=COPY:

In MySQL 8, the ALGORITHM=INPLACE and ALGORITHM=COPY options are used in the context of online DDL (Data Definition Language) operations. These options determine how the table is modified during certain operations while allowing concurrent read and write access to the table.

ALGORITHM=INPLACE: This option performs the DDL operation in-place, directly modifying the table's structure without creating a temporary copy. It aims to minimize the impact on concurrent operations. However, not all operations can be performed in-place, and the availability depends on the specific operation and the storage engine used for the table.

ALGORITHM=COPY: This option creates a temporary copy of the table, performs the DDL operation on the copy, and then replaces the original table with the modified copy. While the copy is being modified, the original table remains accessible for concurrent read and write operations. This option is more versatile and allows for a broader range of DDL operations to be performed online.

 

The choice between ALGORITHM=INPLACE and ALGORITHM=COPY depends on the specific operation you need to perform and the capabilities of your table's storage engine. Some operations can only be performed using ALGORITHM=COPY, such as renaming a column or modifying the data type of a column.


LOCK=NONE Vs LOCK=shared:

In MySQL 8, the LOCK option is used in conjunction with online DDL (Data Definition Language) operations to control table locking behaviour during the execution of the operation.

LOCK=NONE: This option indicates that no explicit table-level locks will be acquired during the DDL operation. It allows concurrent read and write access to the table while the operation is in progress. This option is useful when you want to minimize the impact on concurrent operations and ensure uninterrupted access to the table.

LOCK=shared: This option acquires a shared lock on the table during the DDL operation. It allows concurrent read access to the table but restricts write access. Other sessions can still read from the table while the operation is in progress but write operations will be blocked until the shared lock is released. This option is useful when you want to allow read access but prevent write modifications during the DDL operation.

 

When executing DDL statements, it's essential to consider the potential impact on the database's performance and availability, especially for large tables or critical systems. It's always recommended to test these operations in a controlled environment and have proper backups in place before making any changes to production databases.