Week 5,6 - GSOC 2023

Rohan Babbar

This blog post briefly describes the progress on the project achieved during the 5th and 6th weeks of GSOC.

In my previous blog post, I discussed the inclusion of MPIBlockDiag as part of the project. Initially, MPIBlockDiag extended the pylops.LinearOperator class; however, since it relies on DistributedArray for its matrix-vector multiplications, we determined that it would be more appropriate to introduce an MPILinearOperator class and have MPIBlockDiag extend it.

Additionally, after the inclusion of MPIBlockDiag, I had a discussion with my mentors and we agreed to prioritize the implementation of the CGLS solver using DistributedArray, as it would be needed for the inversion component of the Post Stack Inversion example. Once this task was completed, we planned to proceed with the MPIVStack implementation.

Work Done

  1. Added MPILinearOperator, which makes use of DistributedArray instead of NumPy arrays.

  2. Added other internal classes: _AdjointLinearOperator, _TransposedLinearOperator, _ProductLinearOperator, _ScaledLinearOperator, _SumLinearOperator, _PowerLinearOperator, and _ConjLinearOperator.

  3. Added a Post Stack Inversion example (currently the modelling part only) using MPIBlockDiag.

  4. Added a CGLS (Conjugate Gradient Least Squares) solver using DistributedArray.

Added MPILinearOperator

This step was of great significance, as this class serves as the foundation of the entire library: every operator developed from here on will extend it. Similar to the pylops.LinearOperator class, it provides the machinery for matrix-vector product operations. At present, MPILinearOperator supports only matrix-vector operations, not matrix-matrix operations; this was a deliberate choice based on the initial requirements, as matrix-vector products cover all of our current use cases. The class uses DistributedArray to compute the matrix-vector product.

For more detailed information about the MPILinearOperator, visit here.
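
To make this concrete, below is a minimal sketch of a custom operator built on top of MPILinearOperator. It is illustrative only: the constructor arguments and the DistributedArray attributes used here (shape, dtype, global_shape, local_array, and so on) are assumptions modelled on the pylops.LinearOperator interface, so the actual pylops_mpi API may differ in the details.

```python
# Minimal sketch of a custom MPI-aware operator (illustrative only; the
# constructor arguments and DistributedArray attributes are assumptions).
import numpy as np
from mpi4py import MPI
from pylops_mpi import DistributedArray, MPILinearOperator


class ScaledIdentity(MPILinearOperator):
    """Hypothetical operator that scales a DistributedArray by a constant."""

    def __init__(self, scale, n, dtype=np.float64):
        self.scale = scale
        super().__init__(shape=(n, n), dtype=dtype)

    def _matvec(self, x: DistributedArray) -> DistributedArray:
        # Forward: y = scale * x, computed rank-locally on each local chunk
        y = DistributedArray(global_shape=x.global_shape, dtype=x.dtype)
        y[:] = self.scale * x.local_array
        return y

    def _rmatvec(self, x: DistributedArray) -> DistributedArray:
        # The adjoint of a real-valued scaling is the same scaling
        return self._matvec(x)


if __name__ == "__main__":
    n = 12
    x = DistributedArray(global_shape=n)
    x[:] = np.ones(x.local_shape)        # each rank fills its local portion
    Op = ScaledIdentity(scale=2.0, n=n)
    y = Op.matvec(x)                     # returns a DistributedArray
    if MPI.COMM_WORLD.Get_rank() == 0:
        print("rank-0 local result:", y.local_array)
```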

Added additional classes for internal use within the LinearOperator module

It was essential to implement these internal classes in a distributed manner, mirroring their counterparts in pylops. They handle operator composition within the distributed computing framework and ensure compatibility with the library's overall functionality, and they are implemented so that both the input and the output of every operation are instances of pylops_mpi.DistributedArray.
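
These wrappers are what operator composition returns. Since the behaviour is meant to mirror plain pylops, the sketch below uses pylops operators for illustration; in pylops_mpi the same compositions apply, with DistributedArray instances as inputs and outputs instead of NumPy arrays.

```python
# Operator composition in plain pylops; the pylops_mpi wrapper classes
# mirror this behaviour with DistributedArray inputs/outputs instead.
import numpy as np
import pylops

A = pylops.MatrixMult(np.random.normal(size=(5, 5)))
B = pylops.MatrixMult(np.random.normal(size=(5, 5)))

AdjOp = A.H         # adjoint    -> _AdjointLinearOperator
TrOp = A.T          # transpose  -> _TransposedLinearOperator
ProdOp = A * B      # product    -> _ProductLinearOperator
ScOp = 2.0 * A      # scaling    -> _ScaledLinearOperator
SumOp = A + B       # sum        -> _SumLinearOperator
PowOp = A ** 2      # power      -> _PowerLinearOperator
ConjOp = A.conj()   # conjugate  -> _ConjLinearOperator

x = np.ones(5)
print(SumOp @ x)    # each composite still acts as a single linear operator
```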

Added Post Stack Modelling Example

After completing the implementation of MPIBlockDiag, MPILinearOperator, and the other supporting classes, I proceeded with the 3D Post Stack Modelling example. pylops already provides a 2D Post Stack example in which the model has a shape of (nz, nx); for the 3D case, I split the model along the first axis, so that each rank holds a portion of shape (ny_i, nx, nz), where i represents the rank. These per-rank model portions were then incorporated into an MPIBlockDiag, which performs the matrix-vector products needed to model the data. I also added some matplotlib plots for ease of understanding; they give a better view of what I want to achieve. The inversion part will be done after the CGLS solver is successfully integrated; for now, only the Post Stack linear modelling is in place.
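
To give a flavour of the modelling step, the snippet below sketches what a single rank effectively computes on its slab of the model, using plain pylops; the shapes, wavelet, and axis ordering are illustrative assumptions rather than values taken from the example. In the distributed example, these per-rank operators are then stacked into an MPIBlockDiag and applied to a DistributedArray holding the full model.

```python
# Serial sketch of what one rank computes on its slab of the 3D model;
# in the distributed example the per-rank operators are combined with
# MPIBlockDiag and applied to a DistributedArray (shapes are illustrative).
import numpy as np
from pylops.utils.wavelets import ricker
from pylops.avo.poststack import PoststackLinearModelling

ny_i, nx, nz = 5, 20, 40                      # slab owned by one rank
dt = 0.004
wav = ricker(np.arange(41) * dt, f0=20)[0]    # Ricker wavelet

# Acoustic-impedance slab with the depth/time axis last: (ny_i, nx, nz)
m_slab = np.random.normal(size=(ny_i, nx, nz))

# PoststackLinearModelling expects the time axis first, so reorder to
# (nz, ny_i, nx) before flattening
m_flat = m_slab.transpose(2, 0, 1).ravel()

PPop = PoststackLinearModelling(wav, nt0=nz, spatdims=(ny_i, nx))
d_flat = PPop @ m_flat                        # modelled post-stack data
d_slab = d_flat.reshape(nz, ny_i, nx).transpose(1, 2, 0)
print(d_slab.shape)                           # (ny_i, nx, nz)
```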

The visualization below is produced when the example file is executed with 3 processes.

Added CGLS Solver in a Distributed Fashion

Following discussions with my mentors, we concluded that the CGLS (Conjugate Gradient Least Squares) solver would be the most suitable choice for our inversion tasks, given how widely it is used in pylops. Building on the existing DistributedArray class and its distributed operations, I replaced the NumPy-based computations of pylops' CGLS with their DistributedArray equivalents. For more information, you can look into CGLS and cgls.
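
For reference, the recursion at the heart of CGLS is summarised below in plain NumPy; this is an illustrative sketch rather than the merged code. The distributed solver follows the same steps, with the NumPy arrays replaced by DistributedArray objects whose dot products and norms perform the required reductions across MPI ranks.

```python
# Plain-NumPy sketch of the CGLS recursion for min ||y - A x||^2 (+ damping);
# the distributed solver follows the same steps with DistributedArray,
# whose dot products and norms reduce across MPI ranks.
import numpy as np


def cgls_sketch(A, y, x0, niter=10, damp=0.0):
    x = x0.copy()
    r = y - A @ x                 # data residual
    s = A.T @ r - damp * x        # gradient of the least-squares cost
    p = s.copy()                  # search direction
    gamma = s @ s
    for _ in range(niter):
        q = A @ p
        alpha = gamma / (q @ q + damp * (p @ p))
        x += alpha * p
        r -= alpha * q
        s = A.T @ r - damp * x
        gamma_new = s @ s
        p = s + (gamma_new / gamma) * p
        gamma = gamma_new
    return x


# Tiny usage example on a random overdetermined system
A = np.random.normal(size=(20, 10))
x_true = np.random.normal(size=10)
x_est = cgls_sketch(A, A @ x_true, np.zeros(10), niter=50)
print(np.allclose(x_est, x_true))
```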

Challenges I ran into

One significant obstacle was that Sphinx-Gallery cannot execute gallery examples under MPI. Despite looking into various resources, I could not find a solution. My mentor therefore opened an issue, through which we learned that running the Python example files with mpiexec is simply not possible within Sphinx-Gallery. To address this limitation, we decided to add mpi_examples.sh, a shell script that runs all the Python example files with mpiexec, and we highly recommend that users execute the examples through this script. For convenience, I also added a make run_examples command that automates the same task.

The second challenge involved the Post Stack example itself. Since I initially had limited knowledge of post-stack inversion, I faced several difficulties during the implementation. However, my mentors patiently explained the concept in detail and outlined their expectations, and with their support and guidance I completed the example.

To do in the Coming Weeks

Starting this week, I will be undergoing my Mid-term evaluation, which will continue until July 14th. Once the evaluation is complete, I will shift my focus to implementing MPIVStack. This implementation will be similar to MPIBlockDiag, further enhancing the functionality and capabilities of the library.

Thank You for reading my Post!
