GSoC 2023 Project — openSUSE Rancher ⚡
Google Summer of Code (GSoC) is an annual program by Google that offers contributors aged 18 and above worldwide the opportunity to work on open-source software projects. Participants gain real-world coding experience while collaborating with established open-source organizations. It's a platform for learning, contributing, and fostering innovation in the tech community.
I participated in GSoC 2023 under the organization openSUSE for the Rancher project.
Here’s a link to my project — Analytics Edge Ecosystem Workloads
In this project, I developed a microservice workload within a distributed edge-core-cloud infrastructure, built entirely on open-source principles. The aim is to cater to various market verticals; I specifically opted to focus on the healthcare sector for my continued contributions to the project.
The goal of the project is to create a Machine Learning application: an accessible, convenient translator from audio transcription to American Sign Language (ASL) that can be deployed on a Kubernetes cluster for users. Machine learning models for the vertical are trained, tested, and deployed as microservices in cloud-native formats.
All the deployments would be done on Rancher-managed Kubernetes clusters running K3s/RKE/RKE2.
Project Overview 🤯
We had 3 phases for our project — Community Bonding period, Coding Phase 1 and Coding Phase 2.
Here’s my GSoC’23 project link — https://summerofcode.withgoogle.com/programs/2023/projects/KwnJF6Su
Community Bonding period
I began exploring the sample Rancher and Kubernetes environment provided to me. I installed Longhorn on the Kubernetes clusters, created new multi-node clusters, connected them to Rancher for easy management, and tested them with sample workloads such as WordPress and BusyBox.
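A smoke test of this kind can be done with a minimal manifest; the one below is an illustrative sketch (names and image tag are my own choices, not from the project):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: busybox-smoke-test   # illustrative name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: busybox-smoke-test
  template:
    metadata:
      labels:
        app: busybox-smoke-test
    spec:
      containers:
      - name: busybox
        image: busybox:1.36
        # Print a message and idle so the pod stays Running for inspection
        command: ["sh", "-c", "echo cluster is reachable; sleep 3600"]
```

Applying it with `kubectl apply -f busybox.yaml` and checking that the pod reaches `Running` confirms the cluster, scheduler, and image pulls all work.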
Coding Phase — 1
This phase focused on creating the Machine Learning models for the chosen business vertical. For this phase I worked on the Environment vertical. I first tried out a sample Kaggle notebook in that vertical on microplastic detection in the ocean, based on parameters such as latitude, longitude, and size.
I then built a sample workload for microplastic detection using Python and libraries such as YOLOv5, OpenCV, PyTorch, and SciPy. I also used DeepSORT, which extends the SORT tracker with a CNN that extracts appearance features from each bounding box produced by the detector. I wrote a script, track.py, to track all the microplastics and collect the results.
I then trained and tested various PyTorch models on different videos of microplastics, printing the details to the console: number of microplastics, size, inference time, sorting time, and so on.
Here is the link to the GitHub repository: https://github.com/bishal7679/TrackYOLO_Microplastics
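The console summary described above can be sketched as a small post-processing step over per-frame tracker output. The data layout below is an assumed simplification of DeepSORT-style results, not the repository's actual format:

```python
# Sketch: summarize per-frame tracking output into the kind of console
# report described above (microplastic count, sizes, inference time).
# Assumed input layout: each frame maps a track ID to the (width, height)
# of its bounding box in pixels.

def summarize_tracks(frames, inference_times):
    """frames: list of dicts {track_id: (w, h)}; inference_times: seconds per frame."""
    unique_ids = set()
    areas = []
    for frame in frames:
        for track_id, (w, h) in frame.items():
            unique_ids.add(track_id)   # a track ID seen in any frame counts once
            areas.append(w * h)
    return {
        "microplastics": len(unique_ids),          # distinct tracked particles
        "mean_area_px": sum(areas) / len(areas) if areas else 0.0,
        "total_inference_s": round(sum(inference_times), 3),
    }

frames = [{1: (12, 8), 2: (5, 5)}, {1: (13, 8), 3: (6, 4)}]
print(summarize_tracks(frames, [0.021, 0.019]))
# {'microplastics': 3, 'mean_area_px': 62.25, 'total_inference_s': 0.04}
```

Counting unique track IDs rather than per-frame detections is what keeps one particle from being counted once per frame.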
Coding Phase — 2
In this phase, I experimented with sign language and implemented a workload: an easy-to-use program that transcribes text or audio files into a sign language animation.
What it does and how it can be used
Our program has three main steps:
Convert audio to text (skipped when converting text-to-sign)
Find what movement corresponds to each word
Animate the movement
It accepts two types of input: audio files and text. Given an audio file, it first produces a transcript of the spoken words and then continues to the remaining steps. Given text, the transcription step is bypassed and the program goes straight to producing a sign language animation of the text. While useful as a sign language equivalent of closed captions, it also extends to education: those looking to learn sign language can use SpeechToSign to teach themselves how to sign various phrases using both the speech-to-sign and text-to-sign functionalities.
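The three-step flow above can be sketched as a tiny orchestrator. The helper functions here are hypothetical stubs standing in for Whisper, the ASL dictionary, and the animation renderer; only the control flow mirrors the description:

```python
# Sketch of the three-step pipeline. transcribe/lookup_signs/animate are
# hypothetical stand-ins for Whisper, the ASL dictionary lookup, and the
# three.js renderer used in the real project.

def transcribe(audio_path):
    # Placeholder for speech-to-text (Whisper in the real project).
    return "hello world"

def lookup_signs(text, dictionary):
    # Step 2: map each word to its sign entry; tag unknown words for fallback handling.
    return [dictionary.get(w, f"<unknown:{w}>") for w in text.lower().split()]

def animate(signs):
    # Step 3: placeholder for animation; here we just join the sign names.
    return " -> ".join(signs)

def speech_to_sign(audio_path=None, text=None, dictionary=None):
    if text is None:                  # audio input: transcribe first (step 1)
        text = transcribe(audio_path)
    signs = lookup_signs(text, dictionary or {})   # text input lands here directly
    return animate(signs)

print(speech_to_sign(text="Hello World",
                     dictionary={"hello": "HELLO-sign", "world": "WORLD-sign"}))
# HELLO-sign -> WORLD-sign
```

The `text is None` branch is exactly the "skipped when converting text-to-sign" behavior: text input bypasses transcription entirely.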
Tech Stack used 🧪
OpenAI's Whisper API to convert speech to text
Python scripts to convert the .txt transcript into a list of unique strings
Flask for the frontend
Google's MediaPipe Hand Landmarker to retrieve the coordinates of each hand
The ASL Dictionary to map each word to an array of coordinates
three.js to animate the set of points
HTML, CSS, JS, and Git to create the website and repository
Rancher, K3s, K8s, and an AWS ECS cluster for deployment
GitHub Actions for the CI/CD pipeline
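The transcript-to-word-list step in the stack above can be sketched in a few lines. Order-preserving deduplication and the punctuation handling are my simplifying assumptions:

```python
# Sketch: turn a Whisper transcript (.txt contents) into a list of
# unique, lowercased words, preserving first-occurrence order so the
# dictionary lookups happen in spoken order.

import re

def unique_words(transcript: str) -> list[str]:
    # Keep letters and apostrophes; everything else is treated as a separator.
    words = re.findall(r"[a-z']+", transcript.lower())
    seen = set()
    ordered = []
    for w in words:
        if w not in seen:
            seen.add(w)
            ordered.append(w)
    return ordered

print(unique_words("Hello, hello world! The world says hello."))
# ['hello', 'world', 'the', 'says']
```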
Challenges ✨
Semantics: Not having the exact translation of every word in the ASL dictionary
Creating a model that uses both the right and left hands, especially when their animations overlap
Making User Interface design smooth, accommodating both text and audio file inputs
Technical Overview 🚀
Collect all the videos from the ASL dictionary for the words spoken in the audio file.
Load the Whisper model to convert the given audio into transcribed words, then modify those words so all of them appear in the dictionary, and save them to a list.
Make a POST request to submit the ASL words and convert them into a sign language animation.
Determine handedness based on hand landmark positions and store the coordinates and joint index to reference.json.
Display the video frame with landmarks through OpenCV.
Containerize the application.
Define a deployment mechanism (such as with yaml manifest files).
Deploy the application (Kubernetes).
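The handedness and coordinate-storage step can be sketched like this. The landmark format and the reference.json schema are my assumptions, loosely based on MediaPipe's 21-point hand output (index 0 is the wrist); note that a mirrored selfie camera would flip the left/right heuristic:

```python
# Sketch: classify handedness from the wrist's x-position relative to the
# image midline and store landmark coordinates keyed by joint index, in
# the spirit of the reference.json step above. Schema is an assumption.

import json

def store_hand(landmarks, image_width, path="reference.json"):
    """landmarks: list of (x, y) pixel coords; index 0 is the wrist (MediaPipe order)."""
    wrist_x = landmarks[0][0]
    # Simple heuristic: wrist left of the image midline -> left hand.
    handedness = "left" if wrist_x < image_width / 2 else "right"
    record = {
        "handedness": handedness,
        "joints": {str(i): {"x": x, "y": y} for i, (x, y) in enumerate(landmarks)},
    }
    with open(path, "w") as f:
        json.dump(record, f, indent=2)
    return record

rec = store_hand([(100, 240), (120, 220)], image_width=640)
print(rec["handedness"])
# left
```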
The code for this project is available in the GitHub repository linked below, which contains all the code developed through the GSoC 2023 final evaluation. For additional details and a comprehensive deployment guide, refer to the same repository:
Further Steps
I plan to write a SUSE technical reference documentation (TRD) covering my project — that will explain all the steps from creating the ML models to deploying them on Kubernetes clusters using Rancher in detail. I will update its link here when it gets published on the SUSE documentation site.
Acknowledgements 🙏
I would like to thank my mentors Bryan Gartner, Ann Davis and Terry Smith for their unwavering support and inspiration. Their invaluable guidance led me to solutions and instilled in me a culture of iterative improvement within the realm of Machine Learning. Engaging conversations throughout the GSoC tenure were truly captivating, fueling my desire for continuous learning.
I'm equally appreciative of the SUSE community for their backing.
Lastly, I'd like to acknowledge Google for orchestrating the Google Summer of Code, an initiative that has undeniably elevated the open-source community.
About Me 🚩
Name — Bishal Das
Location — West Bengal, India
Twitter — https://twitter.com/bishaltwt7679
LinkedIn — https://www.linkedin.com/in/bishal-das-1bba8b1b8/
GitHub — https://github.com/bishal7679
I appreciate your time in reviewing my GSoC 2023 project endeavors and the associated experience. I hope it resonated positively with you ❤️