How to Dump JSON to a File in Python

Mateen Kiani
4 min read


Working with JSON is a daily task for many developers. Whether you’re storing configuration, caching responses, or sharing data between services, writing JSON to a file is essential. Python’s built-in json module offers the json.dump() function to serialize objects directly into a file stream.

In this guide, we’ll explore how to use json.dump(), customize output, handle errors, and manage large datasets. By the end, you’ll have practical examples and best practices to ensure your JSON files are clean, efficient, and easy to work with.


Why Use json.dump?

The json module is part of Python’s standard library, so no extra packages are needed. The json.dump() function wraps serialization and file I/O in one call. Instead of doing this:

import json

data = {'name': 'Alice', 'age': 30}

json_str = json.dumps(data)
with open('user.json', 'w') as f:
    f.write(json_str)

You can write directly:

import json

data = {'name': 'Alice', 'age': 30}
with open('user.json', 'w') as f:
    json.dump(data, f)

Benefits:

  • Simplicity: One function call handles both serialization and writing.
  • Performance: Streams data to disk without building huge strings in memory.
  • Flexibility: Supports indentation, separators, custom encoders.

Tip: For quick experiments, you can also use json.dumps() to get a string and inspect it before writing.
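As a quick sketch of that tip, you can round-trip through json.dumps() first and only write once the string looks right:

```python
import json

data = {'name': 'Alice', 'age': 30}

# Serialize to a string first so you can inspect or log it
json_str = json.dumps(data, indent=2)
print(json_str)

# Once it looks right, write the same object with json.dump()
with open('user.json', 'w') as f:
    json.dump(data, f)
```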

Basic Usage of json.dump

Let’s start with the simplest case.

import json

sample = {
    "users": [
        {"id": 1, "name": "Alice"},
        {"id": 2, "name": "Bob"}
    ],
    "active": True
}

with open('data.json', 'w') as file:
    json.dump(sample, file)

This code will produce a file data.json containing:

{"users": [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}], "active": true}
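A quick way to verify the write is to read the file back with json.load(), the inverse of json.dump():

```python
import json

sample = {
    "users": [
        {"id": 1, "name": "Alice"},
        {"id": 2, "name": "Bob"}
    ],
    "active": True
}

with open('data.json', 'w') as file:
    json.dump(sample, file)

# Read the file back to confirm the round trip
with open('data.json') as file:
    loaded = json.load(file)

print(loaded == sample)  # True
```

Because every value here is a native JSON type, the round trip is lossless. Note that JSON object keys are always strings, so dicts with non-string keys won't round-trip identically.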

Adding Readable Formatting

To make the file human-readable, use the indent parameter:

with open('data_pretty.json', 'w') as f:
    json.dump(sample, f, indent=4)

This yields:

{
    "users": [
        {
            "id": 1,
            "name": "Alice"
        },
        {
            "id": 2,
            "name": "Bob"
        }
    ],
    "active": true
}

Customizing JSON Output

You may need to tweak the separators, sort keys, or handle non-serializable types.

Compact Output

Use custom separators to remove spaces:

with open('compact.json', 'w') as f:
    json.dump(sample, f, separators=(',', ':'), sort_keys=True)

Result:

{"active":true,"users":[{"id":1,"name":"Alice"},{"id":2,"name":"Bob"}]}

Sorting Keys

Sorted keys help with diffing files under version control:

with open('sorted.json', 'w') as f:
    json.dump(sample, f, indent=2, sort_keys=True)

Handling Custom Objects

If you have objects like datetime or custom classes, you need a converter:

import json
from datetime import datetime, timezone

class Event:
    def __init__(self, name):
        self.name = name
        self.time = datetime.now(timezone.utc)

    def to_dict(self):
        return {"name": self.name, "time": self.time.isoformat()}


def custom_encoder(obj):
    if hasattr(obj, 'to_dict'):
        return obj.to_dict()
    raise TypeError(f"Type {type(obj)} not serializable")

event = Event('Launch')
with open('event.json', 'w') as f:
    json.dump(event, f, default=custom_encoder, indent=2)
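If you'd rather keep the conversion logic in one place, the standard library also lets you subclass json.JSONEncoder and pass it via the cls parameter. This sketch reuses the same illustrative Event class; the file name event_cls.json is arbitrary:

```python
import json
from datetime import datetime, timezone

class Event:
    def __init__(self, name):
        self.name = name
        self.time = datetime.now(timezone.utc)

    def to_dict(self):
        return {"name": self.name, "time": self.time.isoformat()}

class EventEncoder(json.JSONEncoder):
    def default(self, obj):
        # Delegate to the object's own to_dict() when available
        if hasattr(obj, 'to_dict'):
            return obj.to_dict()
        # Fall back to the base class, which raises TypeError
        return super().default(obj)

with open('event_cls.json', 'w') as f:
    json.dump(Event('Launch'), f, cls=EventEncoder, indent=2)
```

Both approaches are equivalent; default= is handy for one-off dumps, while a cls= encoder is easier to reuse across many call sites.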

Best Practices and Error Handling

Always Use a Context Manager

with open('file.json', 'w') as f:
    json.dump(data, f)

This ensures files are closed properly, even if an error occurs.

Catch Serialization Errors

Wrap your dump call and catch TypeError:

try:
    with open('out.json', 'w') as f:
        json.dump(data, f)
except TypeError as e:
    print(f"Failed to serialize: {e}")

Validate Your Data

If you need to ensure valid JSON before writing, parse the dumped string:

import json

def is_valid_json(obj):
    try:
        json.loads(json.dumps(obj))
        return True
    except (TypeError, ValueError):
        return False

if is_valid_json(data):
    with open('valid.json', 'w') as f:
        json.dump(data, f)

Note: For advanced parsing and validation, check our JSON parser guide.

Working with Large Datasets

When writing very large lists or nested structures, streaming can help avoid high memory usage.

Using Iterators

Instead of building one huge list, write items one by one:

import json

def generate_items(n):
    for i in range(n):
        yield {"index": i}

with open('stream.json', 'w') as f:
    f.write('[\n')
    first = True
    for item in generate_items(1000000):
        if not first:
            f.write(',\n')
        json.dump(item, f)
        first = False
    f.write('\n]')

This pattern writes each object without holding the entire list in memory.
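Because each item is valid JSON and the brackets and commas are written by hand, the resulting file is itself a valid JSON array and loads normally. Here is the same pattern with a small count so the round trip is easy to check (stream_small.json is an arbitrary name):

```python
import json

def generate_items(n):
    for i in range(n):
        yield {"index": i}

with open('stream_small.json', 'w') as f:
    f.write('[\n')
    first = True
    for item in generate_items(5):
        if not first:
            f.write(',\n')
        json.dump(item, f)
        first = False
    f.write('\n]')

# The hand-assembled file parses as one JSON array
with open('stream_small.json') as f:
    items = json.load(f)

print(len(items))  # 5
```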

Compression

To save disk space, compress output on the fly:

import gzip
import json

data = {'big': list(range(1000000))}

with gzip.open('data.json.gz', 'wt', encoding='utf-8') as gz:
    json.dump(data, gz)
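Reading it back is symmetric: open the archive in text mode ('rt') and hand the file object to json.load(). A small sketch, using a reduced payload and an arbitrary file name:

```python
import gzip
import json

data = {'big': list(range(1000))}

# Write compressed JSON in text mode
with gzip.open('data_small.json.gz', 'wt', encoding='utf-8') as gz:
    json.dump(data, gz)

# Read it back the same way
with gzip.open('data_small.json.gz', 'rt', encoding='utf-8') as gz:
    restored = json.load(gz)

print(restored == data)  # True
```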

For an in-depth guide on writing JSON to files with Python, see our article on Python write JSON to file.


Conclusion

Writing JSON data to files in Python is straightforward with json.dump(). You can control formatting, handle custom types, and scale to large datasets. Remember to use context managers for file safety, catch serialization errors, and validate your data when needed.

By mastering these techniques, you’ll write clean, maintainable JSON files that integrate smoothly with other services and tools. Now, go ahead and elevate your data workflows—your future self will thank you!
