Reading CSV Files and Inserting Data into MongoDB with Python
Managing data efficiently is a crucial aspect of any data-driven application. Data often arrives in CSV format, and a common task is to read it and insert it into a database such as MongoDB. In this post, we read a CSV file with Pandas and insert its rows into MongoDB with PyMongo.
Why Use MongoDB with Python?
Scalability: MongoDB scales horizontally through sharding, which makes it well suited to large volumes of data.
Flexibility: MongoDB does not enforce a fixed schema by default, so documents in the same collection can have different fields.
Ease of Use: Python, with its rich ecosystem of libraries, provides an easy and efficient way to interact with MongoDB.
Prerequisites
Before we start, ensure you have the following installed on your system:
Python: Download and install from python.org.
MongoDB: Install MongoDB by following the instructions at mongodb.com.
Pandas: Install Pandas for data manipulation:
pip install pandas
PyMongo: Install PyMongo to interact with MongoDB:
pip install pymongo
Step-by-Step Guide
Step 1: Set Up MongoDB
Ensure MongoDB is running on your local machine. If it is not already running as a system service, you can start the server process directly with the following command:
mongod
Step 2: Create a CSV File
Create a sample CSV file named data.csv with the following content:
name,age,city
Alice,30,New York
Bob,25,Los Angeles
Charlie,35,Chicago
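If you would rather generate the sample file from a script than type it by hand, a minimal sketch using Python's standard csv module could look like the following. The helper name write_sample_csv is an illustrative assumption and not part of the original walkthrough; the rows match the sample content above.

```python
import csv

# Sample rows matching the CSV content shown above.
ROWS = [
    {"name": "Alice", "age": 30, "city": "New York"},
    {"name": "Bob", "age": 25, "city": "Los Angeles"},
    {"name": "Charlie", "age": 35, "city": "Chicago"},
]


def write_sample_csv(path="data.csv"):
    """Write the sample rows to `path` (hypothetical helper for this guide)."""
    # newline="" lets the csv module control line endings on every platform.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["name", "age", "city"])
        writer.writeheader()
        writer.writerows(ROWS)
    return path
```

Calling write_sample_csv() with no arguments creates data.csv in the current directory.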
Step 3: Read CSV File with Pandas
Use Pandas to read the CSV file:
import pandas as pd
# Read the CSV file
df = pd.read_csv('data.csv')
# Display the DataFrame
print(df)
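Pandas infers a type for each column, so it can help to check the result before inserting anything. The sketch below reads the same sample content from an in-memory string (an assumption made so the example runs without data.csv on disk) and inspects the column types and missing values. The dtype argument is optional and is shown only as one way to state the expected type explicitly.

```python
import io

import pandas as pd

# Same content as data.csv, held in memory so the example is self-contained.
CSV_TEXT = """name,age,city
Alice,30,New York
Bob,25,Los Angeles
Charlie,35,Chicago
"""

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(CSV_TEXT), dtype={"age": "int64"})

print(df.dtypes)        # type of each column
print(df.isna().sum())  # number of missing values per column
```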
Step 4: Connect to MongoDB with PyMongo
Connect to the MongoDB server and select a database and collection. MongoDB creates both lazily, the first time data is written to them:
from pymongo import MongoClient
# Connect to the MongoDB server
client = MongoClient('mongodb://localhost:27017/')
# Select the database (created on first write)
db = client['mydatabase']
# Select the collection (created on first write)
collection = db['mycollection']
Step 5: Insert Data into MongoDB
Convert the DataFrame to a list of dictionaries, one per row, and insert them into the MongoDB collection. Each inserted document receives a unique _id field if it does not already have one:
# Convert DataFrame to list of dictionaries
data = df.to_dict(orient='records')
# Insert data into MongoDB
collection.insert_many(data)
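Two practical details are worth handling before calling insert_many. Empty CSV cells come out of Pandas as NaN, which would be stored as a floating-point NaN and not as a missing value, and insert_many raises an error when given an empty list. The sketch below is one way to cover both, assuming you want NaN stored as null and large files sent in batches. The names clean_record and insert_in_batches and the batch size of 1000 are hypothetical choices; collection can be any object with an insert_many method, such as a PyMongo collection.

```python
import math


def clean_record(record):
    """Return a copy of `record` with float NaN values replaced by None."""
    return {
        key: None if isinstance(value, float) and math.isnan(value) else value
        for key, value in record.items()
    }


def insert_in_batches(collection, records, batch_size=1000):
    """Insert `records` in chunks and return the number of documents sent."""
    sent = 0
    for start in range(0, len(records), batch_size):
        batch = [clean_record(r) for r in records[start:start + batch_size]]
        # The loop body never runs for an empty list, so insert_many
        # is never called without documents.
        collection.insert_many(batch)
        sent += len(batch)
    return sent
```

With the DataFrame from Step 3, a call such as insert_in_batches(collection, df.to_dict(orient='records')) would replace the single insert_many call.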
Full Code Example
Here is the complete code to read a CSV file and insert its data into MongoDB:
import pandas as pd
from pymongo import MongoClient
# Read the CSV file
df = pd.read_csv('data.csv')
# Connect to the MongoDB server
client = MongoClient('mongodb://localhost:27017/')
# Select the database (created on first write)
db = client['mydatabase']
# Select the collection (created on first write)
collection = db['mycollection']
# Convert DataFrame to list of dictionaries
data = df.to_dict(orient='records')
# Insert data into MongoDB and report how many documents were written
result = collection.insert_many(data)
print(f"Inserted {len(result.inserted_ids)} documents.")
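Note that running the script a second time inserts the same three rows again, because insert_many always creates new documents. If repeat runs should update existing documents, one option is an upsert per row, sketched below. It assumes the name column uniquely identifies a row, which holds for the sample data but may not for yours. The helper upsert_records is hypothetical, and collection can be any object with an update_one method, such as a PyMongo collection.

```python
def upsert_records(collection, records, key="name"):
    """Insert or update each record, matching existing documents on `key`."""
    for record in records:
        collection.update_one(
            {key: record[key]},  # find an existing document by its key value
            {"$set": record},    # overwrite its fields with the CSV values
            upsert=True,         # insert a new document when nothing matches
        )
    return len(records)
```

This issues one request per row, so it suits small files; the design trades speed for the ability to re-run the import safely.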
Conclusion
This guide covered the basic workflow: reading a CSV file with Pandas, connecting to MongoDB with PyMongo, and inserting the rows with insert_many. MongoDB and Python offer many more advanced features, so explore further to take full advantage of these tools.
Written by
ByteScrum Technologies