Weeks 1, 2 - GSoC 2023


This blog post describes the work done during the first two weeks of GSoC 2023 on the project "MPI Backend for Distributed Inverse Problems" with PyLops.

Work Done

  1. Added a DistributedArray class.

  2. Added a to_dist method which can convert any NumPy array to a DistributedArray.

  3. Added addition, subtraction and multiplication of arrays in a distributed fashion using MPI.

  4. Built a CI pipeline on GitHub Actions.

DistributedArray

I added a DistributedArray class, which provides multidimensional, NumPy-like distributed arrays and brings NumPy arrays to high-performance computing. The class is built on top of the mpi4py and NumPy libraries. I also added an enum class Partition (as suggested by my mentors), which gives the option to broadcast or scatter the NumPy array across ranks.

from pylops_mpi import DistributedArray, Partition

global_shape = (10, 5)

# Initialize a DistributedArray with partition set to Broadcast
dist_array_broadcast = DistributedArray(global_shape=global_shape,
                                        partition=Partition.BROADCAST)

# Initialize a DistributedArray with partition set to Scatter
dist_array_scatter = DistributedArray(global_shape=global_shape,
                                      partition=Partition.SCATTER)

Added to_dist method

I added a to_dist method which converts any NumPy array into a DistributedArray. This method makes sure that the portions of the array are distributed efficiently among the ranks.
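For illustration, here is a rough sketch of how the conversion might be used. The exact calling convention of to_dist (a class method taking the NumPy array and a partition) is my assumption for this sketch, not necessarily the final API:

import numpy as np
from pylops_mpi import DistributedArray, Partition

# Plain NumPy array, available on every rank
x = np.arange(50, dtype=np.float64).reshape(10, 5)

# Convert it into a DistributedArray; with Partition.SCATTER each rank
# ends up holding only its own portion of the global array
dist_x = DistributedArray.to_dist(x=x, partition=Partition.SCATTER)

As with any mpi4py code, a script like this would be launched with mpiexec/mpirun so that every rank executes it.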

Addition, Subtraction and Multiplication

The DistributedArray class handles basic mathematical operations (addition, subtraction and multiplication) in a distributed fashion using mpi4py. These operations reduce memory usage because each rank carries out the computation only on its local portion of the data.
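As a small example of how these operations could be used (reusing the to_dist call sketched above, with the same caveat about its exact signature, and assuming the class overloads the standard +, - and * operators):

import numpy as np
from pylops_mpi import DistributedArray, Partition

# Two NumPy arrays converted to DistributedArrays scattered across ranks
x = DistributedArray.to_dist(x=np.ones((10, 5)), partition=Partition.SCATTER)
y = DistributedArray.to_dist(x=np.full((10, 5), 2.0), partition=Partition.SCATTER)

# Each rank works only on its local portion, so the full result never
# has to live in a single process's memory
s = x + y   # distributed addition
d = x - y   # distributed subtraction
p = x * y   # distributed multiplication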

Built a CI pipeline on GitHub Actions

My mentors suggested adding a build.yml file to run jobs on every push/pull request to the main branch. Since the project requires MPI, the Setup-MPI action was very helpful.
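To give an idea of what such a workflow could look like, here is a minimal sketch of a build.yml; the job layout, Python version, MPI flavour and test command are my own assumptions, not the actual file:

name: Build

on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: "3.10"
      # Install an MPI implementation with the Setup-MPI action
      - uses: mpi4py/setup-mpi@v1
        with:
          mpi: openmpi
      - name: Install dependencies
        run: pip install numpy mpi4py pytest
      - name: Run tests
        run: mpiexec -n 2 python -m pytest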

Challenges I ran into

There were quite a few challenges. It took me some days to get the initial version of the DistributedArray class ready, as we had to make sure that the data is distributed equally and privately to each rank.

At first, I was quite confused about how to set up GitHub Actions with MPI, but my mentors guided me and pointed me to all the required resources.

To do in the coming week

In the coming week, I will be working on extending the DistributedArray class to distribute along any axis, and also on setting up Sphinx for documentation.

Thank you for reading my post!
