SQL Joins

Tinotenda JoeTinotenda Joe
7 min read

So here we are again, deep in the trenches of development. It's fascinating how, as developers, we often think we know a lot, but in reality, sometimes what we know is just the tip of the iceberg. In fact, a good 90% of the time, what we think we know might be incorrect or incomplete ๐Ÿ˜‚. I recently had one of those moments while working with Drizzle, which I think is one ORM that is very close to SQL, and encountered a frustrating bug that took me a day to figure out.

The Bug

I was working on a project involving transactions where each transaction could optionally have a studentId. I had written an API endpoint to fetch transaction data, but noticed that transactions without a studentId were not being returned. Only those with a studentId were showing up. After a day of frustration and debugging, I finally realized that the issue lay in the type of join I was using in my SQL query.

Here's the API code that led me to this revelation:

javascriptCopy code.get(
  "/:id",
  zValidator(
    "param",
    z.object({
      id: z.string().optional(),
    })
  ),
  clerkMiddleware(),
  async (c) => {
    const auth = getAuth(c);
    const { id } = c.req.valid("param");
    if (!id) {
      return c.json({ error: "Missing Id" }, 401);
    }
    if (!auth?.userId) {
      return c.json({ error: "Unauthorized" }, 401);
    }

    const [data] = await db
      .select({
        id: transactions.id,
        date: transactions.date,
        categoryId: transactions.categoryId,
        payee: transactions.payee,
        studentId: registrationNumbers.number,
        amount: transactions.amount,
        notes: transactions.notes,
        accountId: transactions.accountId,
      })
      .from(transactions)
      .innerJoin(accounts, eq(transactions.accountId, accounts.id))
      .leftJoin(students, eq(transactions.studentId, students.id))
      .leftJoin(
        registrationNumbers,
        eq(students.registrationNumberId, registrationNumbers.id)
      )
      .where(and(eq(transactions.id, id), eq(accounts.userId, auth.userId)));

    if (!data) {
      return c.json({ error: "Not Found" }, 404);
    }
    return c.json({ data });
  }
)

The key change I made was switching from INNER JOIN to LEFT JOIN for the students and registrationNumbers tables. This adjustment ensured that transactions without a studentId were also included in the results. To understand why this worked, let's dive into the differences. Here is the research I did.

INNER JOIN

Definition:

An INNER JOIN returns only the rows that have matching values in both tables.

Usage:

It is often used to retrieve only the records that have corresponding matches in the joined tables.

Behavior:

If there is no match between the tables, the row is excluded from the result set.

Example: Suppose we have two tables, A and B:

A.idA.value
1A1
2A2
3A3
B.idB.value
1B1
2B2

An INNER JOIN on A.id = B.id would produce:

A.idA.valueB.idB.value
1A11B1
2A22B2

Rows with A.id = 3 are excluded because there is no matching B.id = 3.

LEFT JOIN (or LEFT OUTER JOIN)

Definition:

ALEFT JOIN returns all rows from the left table and the matched rows from the right table. If there is no match, the result is NULL on the side of the right table.

Usage:

It is often used to retrieve all records from the left table, regardless of whether they have matching rows in the right table.

Behavior:

If there is no match, the result will still include the row from the left table, but with NULL values for the columns from the right table.

Example: Using the same tables A and B:

A LEFT JOIN on A.id = B.id would produce:

A.idA.valueB.idB.value
1A11B1
2A22B2
3A3NULLNULL

Row with A.id = 3 is included with NULL values for columns from table B because there is no matching B.id = 3.

RIGHT JOIN (or RIGHT OUTER JOIN)

Definition:

A RIGHT JOIN returns all rows from the right table, and the matched rows from the left table. If there is no match, the result is NULL on the side of the left table.

Usage:

It is used often used to retrieve all records from the right table, regardless of whether they have matching rows in the left table.

Behavior:

If there is no match, the result will still include the row from the right table, but with NULL values for the columns from the left table.

Example: Using the same tables A and B:

A RIGHT JOIN on A.id = B.id would produce:

A.idA.valueB.idB.value
1A11B1
2A22B2
NULLNULL3B3

Row with B.id = 3 is included with NULL values for columns from table A because there is no matching A.id = 3.

FULL OUTER JOIN

Definition:

A FULL OUTER JOIN returns all rows when there is a match in either left or right table. This means it returns all rows from the left table and all rows from the right table. If there is no match, the result is NULL on the side that does not have a match.

Usage:

It isoften used to retrieve all records from both tables, with NULL in places where there is no match.

Behavior:

If there is no match, the result will include the row from both tables, but with NULL values for the columns from the table that does not have a match.

Example: Using the same tables A and B:

A FULL OUTER JOIN on A.id = B.id would produce:

A.idA.valueB.idB.value
1A11B1
2A22B2
3A3NULLNULL
NULLNULL3B3

Rows with A.id = 3 and B.id = 3 are included with NULL values for columns from the other table because there is no matching B.id = 3 and A.id = 3 respectively.

CROSS JOIN

Definition:

A CROSS JOIN returns the Cartesian product of the two tables. This means that every row from the first table is paired with every row from the second table.

Usage:

It is used when you want to combine all rows from the first table with all rows from the second table.

Behavior:

The result set includes all possible combinations of rows from the two tables.

Example: Using the same tables A and B:

A CROSS JOIN on A and B would produce:

A.idA.valueB.idB.value
1A11B1
1A12B2
2A21B1
2A22B2
3A31B1
3A32B2

Every row from A is combined with every row from B.

SELF JOIN

Definition:

A SELF JOIN is a regular join but the table is joined with itself.

Usage:

It is used when you want to compare rows within the same table or query hierarchical data.

Behavior:

The table is effectively duplicated and then joined on the specified condition.

Example: Using a single table A:

A.idA.valueA.parent_id
1A1NULL
2A21
3A31

A SELF JOIN on A.id = A.parent_id would produce:

A.idA.valueA.parent_idB.idB.value
2A211A1
3A311A1

Rows with A.parent_id are matched with rows from the same table where A.id equals A.parent_id.

0
Subscribe to my newsletter

Read articles from Tinotenda Joe directly inside your inbox. Subscribe to the newsletter, and don't miss out.

Written by

Tinotenda Joe
Tinotenda Joe

Avid developer