Challenges on upgrading a Cloud SQL Instance from MySQL 5.3 to 8

Upgrading a Cloud SQL instance from MySQL 5.3 to MySQL 8 can be challenging, particularly when managing large-scale data pipelines. Here, I’ll walk through a real-world use case where tools like Airbyte and Airflow were used to orchestrate data workflows while resolving upgrade issues.
Scenario: Data Integration Pipeline
Let’s consider a scenario where an e-commerce platform uses the following pipeline setup:
Cloud SQL (MySQL 5.3) is a source database for transactional data.
Airbyte extracts data from multiple sources like APIs, flat files, and databases (e.g., MySQL 5.3).
Airflow manages the workflow for data transformation and loading into downstream systems like BigQuery and Snowflake.
Challenges:
The organization wanted to upgrade the Cloud SQL instance to MySQL 8 for better performance and compatibility with modern SQL features. However, during the upgrade, they encountered issues with specific SQL modes:
NO_ZERO_IN_DATE, NO_ZERO_DATE, ERROR_FOR_DIVISION_BY_ZERO, NO_AUTO_CREATE_USER, NO_ENGINE_SUBSTITUTION |
These are SQL modes in MySQL, which control how the server handles certain behaviors and enforces rules. Here's a quick explanation of each:
NO_ZERO_IN_DATE:
Prevents invalid dates where the month or day is 00, e.g., 2024-00-15.
Invalid dates will throw errors.
NO_ZERO_DATE:
Prevents the use of 0000-00-00 as a date.
In strict mode, using 0000-00-00 will cause an error; otherwise, it gives a warning.
ERROR_FOR_DIVISION_BY_ZERO:
- Generates an error (instead of a warning) when a division by zero occurs in a query.
NO_AUTO_CREATE_USER:
Disallows the automatic creation of users when assigning privileges with GRANT.
Users must be explicitly created with CREATE USER.
NO_ENGINE_SUBSTITUTION:
Prevents MySQL from automatically switching to a different storage engine if the specified one is unavailable.
If a storage engine is unavailable, the query will fail.
These modes are set in MySQL's sql_mode
These errors interrupted the pipeline workflows, making it critical to resolve them without affecting the data pipeline's integrity.
Approach followed to Resolve the Issue
Step 1: Analyze the Pipeline Impact
Before upgrading, we should assess how the SQL mode changes would affect the pipeline:
Airbyte connectors required compatibility with MySQL 8.
Airflow DAGs (Directed Acyclic Graphs) need adjustments in their SQL scripts to handle stricter SQL modes in MySQL 8.
Step 2: Disable older version-dependent SQL Modes
Using Cloud SQL Console:
Navigated to Cloud SQL → Instance → Flags.
Disable the flag (specific to older mysql version): NO_AUTO_CREATE_USER
Upgrading to Mysql version: 8.0
For MySQL Direct Configuration:
- Logged into the MySQL root user:
mysql -u root -p |
- Updated the SQL mode globally:
SET GLOBAL sql_mode = NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION'; |
- Verified the changes:
SELECT @@GLOBAL.sql_mode; |
Step 3: Test the Pipeline with Airbyte and Airflow
- Airbyte Testing:
Reconnected the MySQL source after upgrading to version 8.
Validated that data extraction jobs were successful without any SQL mode-related issues.
- Airflow DAG Testing:
Updated Airflow SQL scripts in the ETL pipeline to comply with MySQL 8 rules.
Executed DAGs to confirm smooth data transformation and loading into downstream systems.
Results
The Cloud SQL instance was successfully upgraded to MySQL 8
Airbyte and Airflow workflows resumed seamlessly with improved performance and reliability.
Key Takeaways
Pipeline Tools Integration: Always test the impact of database upgrades on pipeline tools like Airbyte and Airflow to avoid disruptions.
SQL Mode Adjustments: Disabling problematic SQL modes (e.g.,
NO_AUTO_CREATE_USER
) ensures a smooth upgrade process.Workflow Validation: Post-upgrade testing of extraction, transformation, and loading workflows ensures business continuity.
Subscribe to my newsletter
Read articles from Ranjith Subramani directly inside your inbox. Subscribe to the newsletter, and don't miss out.
Written by
