Nested JSON to CSV with python pandas
Nested JSON objects have one or more levels of additional objects or arrays. CSV, on the other hand, is a flat structure with rows and columns. The goal is to "flatten" the JSON structure, converting nested elements into a format that can be represented in columns. We can use pandas
and json
to flatten nested JSON and export it to a CSV file.
Example Nested JSON
[
{
"name": "John Doe",
"email": "johndoe@example.com",
"location": {
"city": "New York",
"country": "USA"
},
"skills": ["Python", "Data Analysis"]
},
{
"name": "Jane Doe",
"email": "janedoe@example.com",
"location": {
"city": "San Francisco",
"country": "USA"
},
"skills": ["JavaScript", "React"]
}
]
Install Required Libraries
Ensure you have pandas installed. If not, you can install it using pip:
pip install pandas
Load and flatten JSON
First, load your JSON data. You can load it from a file or directly as a string (for this example, we'll assume it's loaded into a variable).
import json
from pandas import json_normalize
with open('data.json') as file:
data = json.load(file)
data = json.loads(json_string)
# Use pandas to normalize the nested JSON structure
flat_data = json_normalize(data, sep='_')
print(flat_data)
The sep
parameter specifies the separator to use when flattening the JSON. This example uses an underscore to separate nested fields (e.g., "location_city").
Convert to CSV
Finally, convert the flattened DataFrame to a CSV file.
flat_data.to_csv('output.csv', index=False)
For deeply nested structures or arrays, you might need to preprocess the data before flattening it or use a more complex approach with custom functions to handle the specific structure of your JSON.
Subscribe to my newsletter
Read articles from commabot directly inside your inbox. Subscribe to the newsletter, and don't miss out.
Written by
commabot
commabot
Researching and writing articles about document processing.