Reading CSV Files and Inserting Data into MongoDB with Python

Managing data efficiently is a crucial aspect of any data-driven application. Data often arrives in CSV format, and a common task is to read it and insert it into a database such as MongoDB. In this post, we will walk through how to do this in Python using Pandas and PyMongo.

Why Use MongoDB with Python?

  1. Scalability: MongoDB is designed to scale horizontally, making it ideal for handling large volumes of data.

  2. Flexibility: MongoDB does not enforce a fixed schema by default, so documents in the same collection can have different fields.

  3. Ease of Use: Python, with its rich ecosystem of libraries, provides an easy and efficient way to interact with MongoDB.

Prerequisites

Before we start, ensure you have the following installed on your system:

  1. Python: Download and install from python.org.

  2. MongoDB: Install MongoDB by following the instructions at mongodb.com.

  3. Pandas: Install Pandas for data manipulation:

     pip install pandas
    
  4. PyMongo: Install PyMongo to interact with MongoDB:

     pip install pymongo
    

Step-by-Step Guide

Step 1: Set Up MongoDB

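Ensure MongoDB is running on your local machine. If it was not installed as a background service, you can start the server process manually with the following command (it runs in the foreground and needs a writable data directory):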

mongod

Step 2: Create a CSV File

Create a sample CSV file named data.csv with the following content:

name,age,city
Alice,30,New York
Bob,25,Los Angeles
Charlie,35,Chicago

Step 3: Read CSV File with Pandas

Use Pandas to read the CSV file:

import pandas as pd

# Read the CSV file
df = pd.read_csv('data.csv')

# Display the DataFrame
print(df)
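
To try the read step without creating a file first, the same CSV content can be passed from memory. This is only a sketch: the `io.StringIO` buffer stands in for data.csv, and the explicit `dtype` mapping is an assumption about the types you want, not something Pandas requires.

```python
import io

import pandas as pd

# Same content as data.csv, kept in memory so the example is self-contained.
csv_text = """name,age,city
Alice,30,New York
Bob,25,Los Angeles
Charlie,35,Chicago
"""

# read_csv accepts a file path or any file-like object.
# Passing dtype makes the column types explicit instead of inferred.
df = pd.read_csv(
    io.StringIO(csv_text),
    dtype={"name": str, "age": int, "city": str},
)

print(df.dtypes)  # column types Pandas will use
print(df.shape)   # (rows, columns)
```

Checking the column types at this point is useful because they determine the value types stored in MongoDB later.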

Step 4: Connect to MongoDB with PyMongo

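Connect to the MongoDB server and get handles to a database and a collection. MongoDB creates both lazily, the first time a document is written to them:

from pymongo import MongoClient

# Create a client for the local server (the connection itself is opened lazily)
client = MongoClient('mongodb://localhost:27017/')

# Get a handle to the database (created on first write)
db = client['mydatabase']

# Get a handle to the collection (created on first write)
collection = db['mycollection']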

Step 5: Insert Data into MongoDB

Convert the DataFrame to a list of dictionaries, one per row, and insert them into the collection with insert_many. PyMongo adds an _id field to each dictionary that does not already have one:

# Convert DataFrame to list of dictionaries
data = df.to_dict(orient='records')

# Insert data into MongoDB
collection.insert_many(data)
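
Real CSV files often contain empty cells, which Pandas reads as NaN, and `insert_many` raises an error when given an empty list. The sketch below shows one way to handle both cases before inserting. The helper names `dataframe_to_documents` and `insert_documents` are hypothetical, and storing missing values as None (null in MongoDB) is an assumption about what you want.

```python
import io

import pandas as pd


def dataframe_to_documents(df):
    """Convert a DataFrame to a list of dicts, mapping missing values to None."""
    records = df.to_dict(orient="records")
    # pd.isna is True for NaN/None/NaT; those become None (null in MongoDB).
    return [
        {key: (None if pd.isna(value) else value) for key, value in row.items()}
        for row in records
    ]


def insert_documents(collection, documents):
    """Insert documents if there are any; return how many were inserted."""
    # insert_many raises on an empty list, so skip the call in that case.
    if not documents:
        return 0
    result = collection.insert_many(documents)
    return len(result.inserted_ids)


# A row with a missing age, to show the NaN -> None conversion.
sample = pd.read_csv(
    io.StringIO("name,age,city\nAlice,30,New York\nBob,,Los Angeles\n")
)
docs = dataframe_to_documents(sample)
print(docs)
```

With the collection from Step 4, the call would be `insert_documents(collection, docs)`. Note that a column containing a missing value is read as floating point, so Alice's age is stored as 30.0 here.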

Full Code Example

Here is the complete code to read a CSV file and insert its data into MongoDB:

import pandas as pd
from pymongo import MongoClient

# Read the CSV file
df = pd.read_csv('data.csv')

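# Create a client for the local server (the connection itself is opened lazily)
client = MongoClient('mongodb://localhost:27017/')

# Get a handle to the database (created on first write)
db = client['mydatabase']

# Get a handle to the collection (created on first write)
collection = db['mycollection']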

# Convert DataFrame to list of dictionaries
data = df.to_dict(orient='records')

# Insert data into MongoDB
collection.insert_many(data)

print("Data inserted successfully!")
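
The code above loads the whole file into memory before inserting. For larger files, one option is to read and insert in chunks using the `chunksize` parameter of `read_csv`. This is a minimal sketch, not a tuned loader: the function name `load_csv_in_chunks`, the default chunk size, and the `ListCollection` stand-in (used only so the demo runs without a server) are all assumptions.

```python
import io

import pandas as pd


def load_csv_in_chunks(source, collection, chunksize=1000):
    """Read a CSV in chunks and insert each chunk; return total rows inserted."""
    total = 0
    # With chunksize set, read_csv yields DataFrames of at most that many rows.
    for chunk in pd.read_csv(source, chunksize=chunksize):
        documents = chunk.to_dict(orient="records")
        if documents:
            collection.insert_many(documents)
            total += len(documents)
    return total


class ListCollection:
    """Hypothetical in-memory stand-in exposing only insert_many, for the demo."""

    def __init__(self):
        self.documents = []

    def insert_many(self, documents):
        self.documents.extend(documents)


demo = ListCollection()
csv_text = "name,age,city\nAlice,30,New York\nBob,25,Los Angeles\nCharlie,35,Chicago\n"
inserted = load_csv_in_chunks(io.StringIO(csv_text), demo, chunksize=2)
print(inserted)
```

With a real PyMongo collection, the call would be `load_csv_in_chunks('data.csv', collection)`. Only one chunk of rows is held in memory at a time.
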

Conclusion

Reading data from CSV files and inserting it into MongoDB using Python is a straightforward process. With the help of Pandas and PyMongo, you can efficiently manage and manipulate data, making your applications more powerful and flexible.

This guide covered the basics, but MongoDB and Python offer many more advanced features and capabilities. Explore further to take full advantage of these powerful tools.

Written by

ByteScrum Technologies