🐘 My PostgreSQL Journey – From Basics to Insights

MATHAVAN E

📌 Why I Decided to Learn PostgreSQL

In today's data-driven world, every industry (tech, healthcare, finance, education, e-commerce) generates massive volumes of information. To make sense of this data, SQL remains the universal language for storing, retrieving, and analyzing it.

By learning PostgreSQL, I gained hands-on skills to work with real-world datasets and answer analytical questions effectively.


💡 What is PostgreSQL?

PostgreSQL is a powerful, open-source, object-relational database management system (ORDBMS) renowned for its stability, reliability, and advanced features.

It supports both:

  • Relational data (SQL) – for structured, table-based storage.

  • Non-relational data (JSON) – for flexible, schema-less storage.

Its versatility makes it a go-to choice for analytics, app development, and complex data solutions. It's also highly extensible, allowing developers to create custom data types, functions, and even operators.
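For example, a single table can mix both storage styles. This is just a minimal sketch; the table and column names are my own illustration, not from any particular project:

```sql
-- One table holding structured relational columns alongside
-- a flexible, schema-less JSONB column.
CREATE TABLE patients (
    id          serial PRIMARY KEY,
    name        text NOT NULL,   -- structured, relational data
    lab_results jsonb            -- flexible JSON data
);

-- The ->> operator extracts a JSON field as text for filtering.
SELECT name
FROM patients
WHERE lab_results ->> 'blood_type' = 'O+';
```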


📚 Skills I've Gained in PostgreSQL

  1. SELECT statement

  2. SELECT DISTINCT

  3. COUNT()

  4. Query challenges using filtering & aggregation

  5. WHERE clause

  6. ORDER BY

  7. LIMIT

  8. BETWEEN

  9. IN

  10. LIKE and ILIKE

  11. Aggregates (SUM, AVG, etc.)

  12. GROUP BY for summarization
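Many of these building blocks come together in a single query. Here is a small sketch against a hypothetical patients table (the table and column names are assumptions for illustration):

```sql
SELECT work_type,
       COUNT(*) AS patients,    -- aggregation with COUNT()
       AVG(age) AS avg_age      -- aggregation with AVG()
FROM patients
WHERE age BETWEEN 18 AND 65     -- filtering with WHERE and BETWEEN
GROUP BY work_type              -- summarization with GROUP BY
ORDER BY patients DESC          -- sorting with ORDER BY
LIMIT 5;                        -- limiting the result set
```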


🎯 Why I Chose PostgreSQL Over Others

  1. Free & Open Source – No licensing fees, complete control.

  2. Advanced Features – Handles analytics, GIS, and complex queries.

  3. SQL Standards Compliance – Closely follows ANSI SQL.

  4. Extensibility – Build your own types, functions, and operators.

  5. Versatile Usage – Suitable for both transactional (OLTP) and analytical (OLAP) workloads.

  6. Cross-Platform Support – Works on all OS and cloud platforms.


📊 Real Dataset Practice – Stroke Prediction Analysis

I practiced SQL on a stroke prediction dataset, tackling questions like:

  1. Gender-based Stroke Incidence – % of patients who had a stroke, grouped by gender.

  2. Age Group Risk Analysis – Which age group has the highest stroke rate.

  3. Hypertension & Heart Disease Co-occurrence – Proportion of patients with both conditions, and stroke prevalence among them.

  4. Work Type vs. Stroke Risk – Identifying occupations with higher stroke rates.

  5. Urban vs. Rural Stroke Rates – Comparing stroke occurrence based on Residence_type.
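As an example, the first question can be answered with conditional aggregation. The table and column names below (stroke_data, gender, stroke coded as 0/1) are assumptions based on the common stroke prediction dataset:

```sql
SELECT gender,
       COUNT(*)                           AS total_patients,
       COUNT(*) FILTER (WHERE stroke = 1) AS stroke_patients,
       -- 100.0 forces numeric (not integer) division
       ROUND(COUNT(*) FILTER (WHERE stroke = 1) * 100.0
             / COUNT(*), 2)               AS stroke_pct
FROM stroke_data
GROUP BY gender;
```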

I couldn't find exact solutions online, so I used ChatGPT to generate practice tasks based on what I had learnt.


⚡ Key Errors & Lessons Learned

  1. Percentage Calculations in SQL – Learned to use window functions for accurate percentages:

     ROUND((COUNT(*) * 100.0 / SUM(COUNT(*)) OVER ()), 2) AS percentage
    
  2. CASE in CTEs – Used CASE expressions inside CTEs to create custom categories for cleaner, more readable queries.

  3. WHERE Clause Placement – Realized that WHERE filters rows before aggregation (while HAVING filters after), so placement is crucial to the result.

  4. FILTER (WHERE ...) in Aggregates – FILTER lets an aggregate count only the rows that match a condition:

     COUNT(*) FILTER (WHERE stroke = 1)  -- ✅ counts only stroke patients
    
  5. Grouping & Sorting – Learned to combine GROUP BY and ORDER BY for well-structured summaries.
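A single query can tie these lessons together: a CASE expression inside a CTE builds age groups, FILTER does conditional counting, and a window function computes each group's share of all patients. As above, the table and column names are assumptions for illustration:

```sql
WITH grouped AS (
    SELECT CASE                       -- CASE in a CTE: custom categories
               WHEN age < 40 THEN 'under 40'
               WHEN age < 60 THEN '40-59'
               ELSE '60+'
           END AS age_group,
           stroke
    FROM stroke_data
)
SELECT age_group,
       COUNT(*) FILTER (WHERE stroke = 1) AS strokes,
       -- window function: each group's share of the overall total
       ROUND(COUNT(*) * 100.0 / SUM(COUNT(*)) OVER (), 2) AS pct_of_patients
FROM grouped
GROUP BY age_group                    -- grouping and sorting combined
ORDER BY pct_of_patients DESC;
```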


🚀 What's Next?

My PostgreSQL learning journey is far from over. Next, I'll explore:

  • Advanced joins

  • Window functions for analytics

  • Performance tuning

  • PostgreSQL extensions

📬 Let's Connect: LinkedIn – Maddy Das

