From ATGC to Algorithms: How Machine Learning is Powering DNA Sequencing

fakhir hassan

When you walk into an environment buzzing with innovation, knowledge, and passion, you instantly feel inspired. That’s exactly how I felt stepping into the Applied Machine Learning Workshop at NUST SINES (School of Interdisciplinary Engineering and Sciences).

Coming from COMSATS University, I thought I had seen my fair share of seminars and sessions — but what I experienced on Day 1 at NUST was on another level. From the quality of speakers to the exposure to cutting-edge topics, it was an unforgettable start to this 3-day journey.


Kickoff: Setting the Stage for AI, ML, and DL

The day began with registration at 9:00 AM, followed by the opening session at 9:15 AM. This session was designed to get everyone on the same page.

The speakers gave us a bird’s-eye view of Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning (DL) — not just defining them, but highlighting their scope, real-world impact, and how they differ from computational sciences.

We also touched upon:

  • Types of Machine Learning (Supervised, Unsupervised, Reinforcement Learning).

  • Core algorithms and their applications.

  • Popular AI/ML libraries used in research and industry.

It was just an overview, but enough to spark curiosity and set the tone for the rest of the day.


Dr. Waseem Haider: AI in Life Sciences

At 10:10 AM, the atmosphere completely shifted as Dr. Waseem Haider took the stage.

Dr. Waseem is not only the CEO of Next Gen Solutions, but also an Associate Professor at COMSATS University Islamabad. His humble beginnings and inspiring academic journey made his presence even more relatable and motivating for us students.

Understanding DNA: The Blueprint of Life

He started by breaking down DNA at the most fundamental level:

  • Cells contain a nucleus, inside which chromosomes reside.

  • Chromosomes are made of DNA wound around proteins called histones. The DNA itself is a long chain of base pairs built from four bases: Adenine (A), Thymine (T), Guanine (G), and Cytosine (C), where A pairs with T and G pairs with C.

From here, he explained DNA sequencing: the chemistry behind decoding our genetic blueprint. He honored legendary scientists like Walter Gilbert and Allan Maxam, who pioneered an early chemical sequencing method, and Fred Sanger, a two-time Nobel laureate who revolutionized sequencing with his chain-termination method.

Next-Generation Sequencing (NGS)

Dr. Waseem then explained how sequencing evolved into Next-Generation Sequencing (NGS), a leap that brought much higher throughput through massively parallel sequencing. This innovation enabled researchers to decode genomes at a speed and scale previously unimaginable.

He also introduced us to mRNA, which carries the instructions for protein synthesis, and miRNA, which helps regulate that process. He then showed how proteins form the very structure of life, from our muscles to our facial features.

Pakistan’s Contribution

He proudly mentioned Dr. Atta-ur-Rahman, the first Pakistani to have his genome mapped, marking an important milestone for science in our country.

Complexity of Genes & Algorithms

While discussing genome assembly, Dr. Waseem highlighted the complexity of genes and how algorithms are essential in making sense of the vast data.
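To get a feel for why assembly is an algorithmic problem, here is a minimal sketch (a toy illustration, not something shown in the talk) that stitches short reads back together by greedily merging the pair with the longest overlap. The reads, the function names, and the `min_len` threshold are all made up for this example. Real assemblers also have to cope with sequencing errors, repeated regions, and reverse-complement strands, which this toy ignores.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for size in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0  # no overlap of at least min_len bases


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the two reads that share the longest overlap."""
    reads = list(reads)
    while len(reads) > 1:
        best_size, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # nothing left to merge
        merged = reads[best_i] + reads[best_j][best_size:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Hypothetical reads, deliberately given out of order
reads = ["TAAGTCA", "ATGGCCT", "GCCTAAG"]
print(greedy_assemble(reads))  # ['ATGGCCTAAGTCA']
```

Even this tiny version compares every read against every other read on each pass, which hints at why assembling millions of reads needs far smarter algorithms.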

He traced the origins of the word “algorithm” back to Abu Abdullah Muhammad ibn Musa Al-Khwarizmi, the Persian mathematician whose works laid the foundation for algebra and computation. This historical context made us realize how timeless the pursuit of knowledge is — connecting the past with the future.

A Deeper Reflection: Faith and Knowledge

One of the most memorable moments was when Dr. Waseem presented a Quranic Ayah:

“Allah taught Adam the names of everything, then presented them to the angels.”

He linked this to machine learning, showing how classification, naming, and knowledge are at the core of both divine wisdom and modern AI. It was a powerful reminder that learning and teaching are central to our existence.

ML in DNA Sequencing

Finally, he tied everything together by explaining how ML algorithms are applied in DNA sequencing. He used a simple analogy: just as we teach a child to distinguish between good and bad, ML models are trained on labeled data and then tested on data they have not seen before to evaluate their performance.
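The train-then-test idea can be sketched in a few lines. This is a minimal illustration under my own assumptions: the DNA fragments and their 0/1 labels are invented, and `train_test_split` here is a small hand-written helper, not a library function.

```python
import random


def train_test_split(samples, labels, test_fraction=0.25, seed=0):
    """Shuffle once, then hold out a fraction of the labeled examples for testing."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split repeatable
    n_test = max(1, int(len(indices) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    def pick(items, idx):
        return [items[i] for i in idx]

    return (pick(samples, train_idx), pick(labels, train_idx),
            pick(samples, test_idx), pick(labels, test_idx))


# Hypothetical labeled data: short DNA fragments tagged 1 or 0
samples = ["ATGC", "GGCA", "TTAG", "CCGT", "ATTA", "GCGC", "TATA", "CGAT"]
labels = [1, 0, 1, 0, 1, 0, 1, 0]

train_X, train_y, test_X, test_y = train_test_split(samples, labels)
print(len(train_X), len(test_X))  # 6 2
```

The model only ever sees the training portion. The held-out portion plays the role of the "new data" in the analogy.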

He introduced us to performance metrics — tools to evaluate how well a model performs — and emphasized how ML is driving breakthroughs in disease detection, genetic research, and life sciences.
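The talk did not go into which metrics to use, so as an assumption on my part, here is a small sketch of four common ones for a yes/no classifier: accuracy, precision, recall, and F1. The true and predicted labels below are invented purely to show the arithmetic.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0  # of predicted positives, how many were right
    recall = tp / (tp + fn) if tp + fn else 0.0     # of actual positives, how many were found
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels: 1 = disease present, 0 = absent
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
print(classification_metrics(y_true, y_pred))
# {'accuracy': 0.75, 'precision': 0.75, 'recall': 0.75, 'f1': 0.75}
```

In a disease-detection setting, recall often matters most, because a missed case (a false negative) is usually more costly than a false alarm.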

By the end of his talk, it was clear: DNA sequencing isn’t just biology anymore — it’s a computational challenge, and AI is the key to solving it.


Hands-On Coding: Bringing Theory to Life

At 12:30 PM, it was time to roll up our sleeves. Dr. Muhammad Shahzad and Miss Saadia led an interactive hands-on coding session using Google Colab.

They introduced us to the ML workflow:

  1. Data Extraction – collecting raw data.

  2. Data Preprocessing – cleaning, transforming, and scaling it.

  3. Modeling – training algorithms to make predictions.

We began by working on tumor datasets, exploring which DNA sequencing metrics could be used to predict tumor presence. After preprocessing the data, we applied standard scaling, which rescales each feature to zero mean and unit variance, to prepare it for modeling.
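Standard scaling replaces each value x with (x - mean) / standard deviation, computed per feature column. The sketch below writes that out by hand with the standard library. It is an illustration, not the notebook code from the session, and the two feature columns are invented.

```python
from statistics import fmean, pstdev


def fit_scaler(rows):
    """Learn the mean and (population) standard deviation of each column."""
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    stds = [pstdev(col) or 1.0 for col in columns]  # avoid dividing by zero
    return means, stds


def transform(rows, means, stds):
    """Apply z = (x - mean) / std to every value, column by column."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)]
            for row in rows]


# Hypothetical features on very different scales
train_rows = [[1.0, 200.0], [2.0, 400.0], [3.0, 600.0]]
means, stds = fit_scaler(train_rows)
scaled = transform(train_rows, means, stds)
print([[round(v, 3) for v in row] for row in scaled])
# [[-1.225, -1.225], [0.0, 0.0], [1.225, 1.225]]
```

The scaler is fitted on the training rows only and then reused for any test rows, so no information from the test set leaks into training. Scaling matters for distance-based and margin-based models, because otherwise the column with the largest numbers dominates.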

Next, we implemented two popular algorithms:

  • K-Nearest Neighbors (KNN)

  • Support Vector Machine (SVM)
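To show the idea behind each algorithm, here is a from-scratch sketch of both. It is not the notebook code from the session, and in practice one would normally reach for a tested library instead. KNN classifies a point by majority vote among its nearest neighbours. The SVM below is a simplified linear version trained with subgradient descent on the hinge loss. The data points, the -1/+1 label coding, and the hyperparameters (`k`, `lr`, `lam`, `epochs`) are all assumptions made for this example.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, point, k=3):
    """Majority vote among the k training points closest to `point`."""
    by_distance = sorted(zip(train_X, train_y),
                         key=lambda pair: math.dist(pair[0], point))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


def train_linear_svm(X, y, lr=0.1, lam=0.01, epochs=50):
    """Fit weights w and bias b on the regularised hinge loss.

    Labels must be coded as -1 or +1.
    """
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # inside the margin or misclassified
                w = [wj + lr * (yi * xj - lam * wj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:           # safely classified: only shrink w slightly
                w = [wj - lr * lam * wj for wj in w]
    return w, b


def svm_predict(w, b, point):
    """Return +1 or -1 depending on which side of the boundary the point lies."""
    score = sum(wj * xj for wj, xj in zip(w, point)) + b
    return 1 if score >= 0 else -1


# Hypothetical, already-scaled features; +1 = tumor, -1 = no tumor
X = [(-2.0, -1.0), (-1.0, -2.0), (-1.5, -1.5),
     (1.0, 2.0), (2.0, 1.0), (1.5, 1.5)]
y = [-1, -1, -1, 1, 1, 1]

w, b = train_linear_svm(X, y)
for point in [(1.2, 1.4), (-1.8, -1.1)]:
    print(point, knn_predict(X, y, point), svm_predict(w, b, point))
```

KNN keeps the whole training set and does its work at prediction time, while the SVM compresses the training set into a single decision boundary (`w` and `b`). That difference is worth keeping in mind when datasets get large.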

Once we had the basics down, the instructors challenged us further: we were given COVID-19 and Alzheimer’s datasets and asked to build predictive models.

It was an eye-opening experience — not only because we got to practice coding, but because we saw firsthand how machine learning translates into real-world healthcare solutions.


Reflections on Day 1

As the first day came to a close, I found myself reflecting on how much I had learned in just a few hours:

  • I gained a deeper appreciation of DNA sequencing and genomics.

  • I saw how machine learning bridges biology and computation.

  • I learned from pioneers who are shaping Pakistan’s contribution to AI and life sciences.

  • And most importantly, I realized how workshops like this are crucial for students to gain practical exposure beyond textbooks.

Day 1 wasn’t just about algorithms or biology. It was about connecting knowledge across disciplines, blending science with technology, and realizing the potential of AI to transform lives.

And the best part? This was only the beginning. With two more days ahead, I can’t wait to dive deeper into applied ML and see how far this journey goes.


Closing Thought

For any student passionate about AI, ML, or life sciences, workshops like this are golden opportunities. They don’t just teach you — they inspire you, challenge you, and open doors to new possibilities.

As I walked out of the session, one thought echoed in my mind:
The future isn’t just about coding or biology — it’s about the fusion of both. And we, as students, are standing right at the intersection.

