A Comprehensive Guide to Plotting with Matplotlib and Pandas

Emeron Marcelle

Matplotlib is a Python plotting library for creating static, animated, and interactive visualizations. This guide covers the most common plot types, the syntax for each, and how Pandas builds on Matplotlib for plotting DataFrames.

Importing Matplotlib

To get started with plotting, you'll need to import the pyplot module from Matplotlib:

import matplotlib.pyplot as plt

Basic Plotting Syntax Using plt

1. plt.plot(x, y): Plotting Line Data

The plt.plot() function is used to create simple line plots. You pass the x and y data as arguments.

plt.plot(x, y)
plt.show()
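
As a minimal sketch of the call above with concrete values: the x and y lists are hypothetical sample data, and the Agg backend line is an assumption added only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend; drop this line to use plt.show()
import matplotlib.pyplot as plt

# Hypothetical sample data: five x positions and their squares
x = [1, 2, 3, 4, 5]
y = [value ** 2 for value in x]

# plt.plot returns a list containing one Line2D object per plotted series
lines = plt.plot(x, y)
```

In an interactive session, calling plt.show() after this would open the figure.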

2. plt.scatter(x, y): Creating Scatter Plots

Scatter plots are useful when you want to show the relationship between two variables.

plt.scatter(x, y)
plt.show()
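
A scatter plot of two related variables might look like the sketch below. The hours and scores values are made up for illustration, and the Agg backend is assumed so the code runs headless.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend; drop this line to use plt.show()
import matplotlib.pyplot as plt

# Hypothetical paired measurements: hours studied vs. exam score
hours = [1, 2, 3, 4, 5, 6]
scores = [52, 58, 65, 70, 78, 85]

# Each (hours, scores) pair becomes one unconnected point
points = plt.scatter(hours, scores)
```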

3. plt.plot(x, y, marker='o'): Creating Line Plots with Custom Markers

Matplotlib has no plt.line() function. To mark each data point on a line plot, pass the marker argument to plt.plot().

plt.plot(x, y, marker='o')
plt.show()
  • Marker Options: Single-character codes such as 'o' (circle), '^' (triangle), and '*' (star) set the symbol drawn at each data point.

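4. plt.hist(data): Creating Histograms

Histograms help in understanding the distribution of a dataset. Unlike the plots above, plt.hist() takes a single sequence of values (or a list of sequences), not x and y pairs.

plt.hist(data, bins=10, stacked=True)
plt.show()
  • bins=10: Sets the number of equal-width bins.

  • stacked=True: Stacks the bars of multiple datasets on top of each other. It has no visible effect when data is a single sequence.

  • by=ColumnName: Not a plt.hist() argument. It belongs to the Pandas method df.hist(), where it draws a separate histogram for each group of rows that share a value in that column.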
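
Stacking only becomes visible with more than one dataset. The sketch below assumes two hypothetical groups of randomly generated values (seeded so the run is repeatable) and the Agg backend for headless use.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend; drop this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data: two groups of 200 normally distributed values each
rng = np.random.default_rng(seed=0)
group_a = rng.normal(loc=50, scale=10, size=200)
group_b = rng.normal(loc=60, scale=10, size=200)

# Passing a list of datasets lets stacked=True place group_b's bars
# on top of group_a's in every bin
counts, bin_edges, _ = plt.hist(
    [group_a, group_b], bins=10, stacked=True, label=["Group A", "Group B"]
)
plt.legend()
```

With stacked=True the returned counts are cumulative across datasets, so the last row holds the total height of each stacked bar.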

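5. plt.bar(x, y): Creating Bar Graphs

Bar graphs are useful for comparing quantities.

plt.bar(x, y)
plt.show()
  • Stacking: plt.bar() has no stacked argument. To stack bars in Matplotlib, draw the second series with bottom= set to the heights of the first. In Pandas, df.plot(kind='bar', stacked=True) handles this for you.

  • plt.barh(y, width): Makes the bar graph horizontal. The first argument gives the bar positions and the second the bar lengths.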
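
Stacked bars in plain Matplotlib can be sketched with the bottom argument of plt.bar(). The category names and values here are hypothetical, and the Agg backend is assumed for headless use.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend; drop this line to use plt.show()
import matplotlib.pyplot as plt

# Hypothetical quarterly figures for two products
categories = ["Q1", "Q2", "Q3"]
product_a = [10, 15, 12]
product_b = [5, 8, 9]

# The first series sits on the x-axis
bottom_bars = plt.bar(categories, product_a, label="Product A")
# bottom= shifts the second series up by the heights of the first
top_bars = plt.bar(categories, product_b, bottom=product_a, label="Product B")
plt.legend()
```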

6. plt.pie(x): Creating Pie Charts

Pie charts are great for showing proportions of a whole.

plt.pie(x, autopct='%1.1f%%', labels=labels)
plt.show()
  • autopct='%1.1f%%': Displays percentage values for each slice.

  • labels=labels: Takes one string per slice. When the values come from a Pandas object, df.index is a common choice.
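
A small pie chart with percentage labels could be sketched as follows. The budget categories and amounts are invented for illustration, and the Agg backend is assumed for headless use.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend; drop this line to use plt.show()
import matplotlib.pyplot as plt

# Hypothetical monthly budget split
labels = ["Rent", "Food", "Savings"]
amounts = [50, 30, 20]

# With autopct set, plt.pie returns the wedges, the slice labels,
# and the percentage texts drawn inside the slices
wedges, texts, autotexts = plt.pie(amounts, labels=labels, autopct='%1.1f%%')
plt.axis("equal")  # equal aspect ratio keeps the pie circular
```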

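7. plt.fill_between(x, y): Creating Area Plots

Area plots show how a quantity, often a cumulative total, changes over a range. Matplotlib has no plt.area() function; plt.fill_between() shades the region under a single series.

plt.fill_between(x, y, alpha=0.5)
plt.show()
  • alpha=0.5: Makes the filled area semi-transparent.

  • stacked=False: A Pandas option, as in df.plot.area(stacked=False), which overlays the areas instead of stacking them. For stacked areas in Matplotlib, use plt.stackplot().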
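
For several series stacked into one area chart, plt.stackplot() is one option in plain Matplotlib. This is a sketch with hypothetical sales figures, assuming the Agg backend for headless use.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend; drop this line to use plt.show()
import matplotlib.pyplot as plt

# Hypothetical yearly sales by channel
years = [2019, 2020, 2021, 2022]
online = [10, 14, 18, 25]
in_store = [20, 18, 17, 15]

# Each series is drawn on top of the previous one, so the upper edge
# of the chart traces the combined total
layers = plt.stackplot(years, online, in_store,
                       labels=["Online", "In store"], alpha=0.5)
plt.legend(loc="upper left")
```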

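8. plt.subplots(nrows, ncols): Creating Multiple Plots in One Figure

This function creates a figure and a grid of axes, so several plots can share one figure. It returns the figure and an array of axes; draw on each one with calls such as axes[0, 0].plot(x, y). The three-number form belongs to plt.subplot(nrows, ncols, index), which adds one axes at a time.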

fig, axes = plt.subplots(nrows=2, ncols=2, figsize=(12,8))
plt.show()
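
The grid is empty until something is drawn on each axes. The sketch below fills all four panels with hypothetical data, assuming the Agg backend for headless use.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend; drop this line to use plt.show()
import matplotlib.pyplot as plt

# Hypothetical sample data shared by all four panels
x = [1, 2, 3, 4]
y = [1, 4, 9, 16]

fig, axes = plt.subplots(nrows=2, ncols=2, figsize=(12, 8))

# axes is a 2x2 array; index it by [row, column]
axes[0, 0].plot(x, y)
axes[0, 0].set_title("Line")
axes[0, 1].scatter(x, y)
axes[0, 1].set_title("Scatter")
axes[1, 0].bar(x, y)
axes[1, 0].set_title("Bar")
axes[1, 1].hist(y, bins=4)
axes[1, 1].set_title("Histogram")

fig.tight_layout()  # reduce overlap between titles and neighbouring panels
```

Note that on an axes object the title is set with set_title() rather than plt.title().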

Plotting Using a DataFrame in Pandas

Pandas also provides built-in plotting functionality, which integrates seamlessly with Matplotlib.

df.plot(x='ColumnName', y='ColumnName', kind='line')
plt.show()
  • kind='line': Specifies the type of graph. Accepted values include 'line', 'bar', 'barh', 'hist', 'box', 'area', 'pie', and 'scatter'.
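
With a concrete DataFrame, the call might look like this sketch. The Year and Sales columns are hypothetical, and the Agg backend is assumed for headless use.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend; drop this line to use plt.show()
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical yearly sales figures
df = pd.DataFrame({
    "Year": [2020, 2021, 2022, 2023],
    "Sales": [100, 120, 90, 150],
})

# df.plot returns the Matplotlib Axes it drew on, so the usual
# Matplotlib customizations can be applied afterwards
ax = df.plot(x="Year", y="Sales", kind="line")
ax.set_ylabel("Sales")
```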

Customizing Your Plots

1. .axis([xmin, xmax, ymin, ymax]): Setting Axis Limits

You can control the limits of both axes with the axis() function, passing a list in the order [xmin, xmax, ymin, ymax].

plt.axis([0, 10, 0, 100])

2. .title("Name"): Adding a Title

You can add a title to your plot for better context.

plt.title("Sample Plot")

3. .show(): Displaying the Plot

To display the plot, simply use the show() function.

plt.show()

4. .xlabel("Name") and .ylabel("Name"): Labeling Axes

These functions are used to add labels to the x and y axes.

plt.xlabel("X Axis")
plt.ylabel("Y Axis")

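5. .xticks(positions, labels): Customizing Tick Marks

You can choose where ticks appear on the x-axis and what text they show. The first argument lists the tick positions and the second the labels to display at those positions. plt.yticks() does the same for the y-axis.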

plt.xticks([0, 1, 2], ['A', 'B', 'C'])

6. .xscale('log'): Logarithmic Scaling

This is useful for datasets that span multiple orders of magnitude.

plt.xscale('log')
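
The effect is easiest to see with x values that grow by powers of ten. This sketch uses hypothetical data and assumes the Agg backend for headless use.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend; drop this line to use plt.show()
import matplotlib.pyplot as plt

# Hypothetical data: x spans four orders of magnitude
x = [1, 10, 100, 1000, 10000]
y = [1, 2, 3, 4, 5]

plt.plot(x, y, marker='o')
plt.xscale('log')  # the five points are now evenly spaced along the x-axis

ax = plt.gca()  # current axes, kept here for inspection
```

On a linear x-axis the first four points would be crowded against the left edge.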

7. .clf(): Clearing the Plot

Clears the current figure, removing its axes and everything drawn on them, so the same figure can be reused for a new plot.

plt.clf()

8. .legend(["string", "string"]): Adding a Legend

Legends help in identifying what each plot represents. The labels are assigned to the plotted series in the order the series were drawn.

plt.legend(["Series1", "Series2"])

9. .figure(figsize=(width, height)): Adjusting Figure Size

You can set the size of your figure, in inches, using this function. Call it before plotting, because it creates a new figure.

plt.figure(figsize=(10, 5))
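
Put together, the customization calls in this section could be combined as in the sketch below. The data, labels, and series names are hypothetical, and the Agg backend is assumed for headless use.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend; drop this line to use plt.show()
import matplotlib.pyplot as plt

# Create the figure first so the size applies to everything drawn next
fig = plt.figure(figsize=(10, 5))

# Hypothetical data: two series over three months
months = [0, 1, 2]
plt.plot(months, [10, 40, 90])
plt.plot(months, [20, 30, 60])

plt.axis([0, 2, 0, 100])                      # [xmin, xmax, ymin, ymax]
plt.title("Sample Plot")
plt.xlabel("Month")
plt.ylabel("Units")
plt.xticks([0, 1, 2], ['Jan', 'Feb', 'Mar'])  # positions, then labels
plt.legend(["Series1", "Series2"])            # same order as the plot calls

ax = plt.gca()  # current axes, kept here for inspection
```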

Types of Graphs

1. Line Plots: .plot(x, y)

Line plots are best for showing trends over time.

df.plot(x='Year', y='Sales', kind='line')

2. Scatter Plots: df.plot.scatter(x, y)

Scatter plots show relationships between two variables.

df.plot.scatter(x='Height', y='Weight', color='Red')

3. Histograms: df['Column'].plot.hist()

Histograms show the distribution of a dataset.

df['Age'].plot.hist(alpha=0.5, bins=10)

Example: Creating a Bar Graph Using Pandas

Suppose you want to create a bar graph to show the frequency of values in a column:

df['ColumnName'].value_counts().plot(kind='bar')
plt.show()
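
As a runnable sketch of the same idea, the snippet below counts the values in a hypothetical Fruit column and plots one bar per distinct value. The Agg backend is assumed for headless use.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend; drop this line to use plt.show()
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical categorical column
df = pd.DataFrame({
    "Fruit": ["apple", "banana", "apple", "cherry", "apple", "banana"],
})

# value_counts() returns counts sorted from most to least frequent
counts = df["Fruit"].value_counts()

# One bar per distinct value, in the same order as counts
ax = counts.plot(kind="bar")
ax.set_ylabel("Frequency")
```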

Conclusion

Matplotlib and Pandas are powerful tools for data visualization in Python. With the variety of plots available, you can effectively present your data and uncover hidden patterns or trends. Use this guide as a reference to create your own stunning visualizations.
