I have had a few questions recently asking what sub queries are and how they are used. While there are a number of articles out there on the subject they don’t explain it in a detailed manner as I would like. So here is my take on them.

Sub queries (or subqueries) can be extremely useful when creating large projects. Let’s take, for example, a project I have just completed where I needed to extract data from an IPB forum to use in a Codeigniter application. Without sub queries it would have been impossible to complete in a way that still allowed it to run in a fast & efficient manner.

What Are Sub Queries?

Well, sub queries are queries within a query. Think of them as bracketed equations in mathematics. Those sums inside brackets are completed first & then the results are used to work out the main equation. The same thing applies to sub queries. The query inside the brackets is completed and then the results are used to complete the main query.

How Are Sub Queries Used?

Let’s take a look at a simple sub query to help illustrate. I’m using a table of celebrities for this example. Why? It makes a change from boring old products…

SELECT *
FROM celebrities
WHERE age = (SELECT MAX(age) FROM celebrities)

This simple query will always return the oldest celebrity (not sure if they’d appreciate that) in the celebrities table. Some people call this a dynamic query, personally I just call it useful.

Let’s go through how it works. Let’s look at the sub query first. It just selects the highest age from the celebrities table, this result is then handed back to the main query which uses it as the value for it’s WHERE filter. You may ask “Why not just just add the other fields to the sub query?” well because it will throw back an error. In most circumstances you can’t use aggregate functions and normal field selects together.

You might be asking yourself what the benefit of the above query is. Well let’s think about how you would get the same result without using sub query. You couldn’t, not without using two separate queries joined with your server side scripting language.

Derived Tables

While all sub queries are all basically queries nested inside others there are different types. Derived tables are something I found useful when doing the IPB/Codeigniter project I mentioned at the start of this tutorial.

A derived table is a temporary table of information created by gathering results from other tables. Here is an example of a query that makes use of a derived table:

SELECT * FROM films AS f
JOIN (
    SELECT * FROM celebrities AS c
    JOIN int_cf AS i
    ON c.id = i.c_id
) AS ci
ON ci.f_id = f.id

This query is actually an example of a many to many relationship query, but that is a completely different tutorial. It is made to join three tables. Celebrities, Films & the intermediary table called int_cf.

Let’s start off with what the tables looked like, first the celebrities table:

id name
1 Brendan Fraser
2 Rachel Weisz
3 Jennifer Morrison
4 Yvonne Strahovski

Now the films table:

id name
1 The Mummy
2 The Mummy Returns
3 Star Trek
4 The Canyon

Finally the intermediary table int_cf:

c_id f_id
1 1
1 2
2 1
2 2
3 3
4 4

Let’s break down the query. First our sub query:

SELECT * FROM celebrities AS c
JOIN int_cf AS i
ON c.id = i.c_id

This query give us results that look like this:

id name c_id f_id
1 Brendan Fraser 1 1
1 Brendan Fraser 1 2
2 Rachel Weisz 2 1
2 Rachel Weisz 2 2
3 Jennifer Morrison 3 3
4 Yvonne Strahovski 4 4

This is our derived table, it is called that because it was derived from the results of two other tables. At the moment though it only gives us the ids of all the films. So we use the results of this derived table in our main query.

Let’s remind ourselves what it looks like:

SELECT * FROM films AS f
JOIN (
    SELECT * FROM celebrities AS c
    JOIN int_cf AS i
    ON c.id = i.c_id
) AS ci
ON ci.f_id = f.id

This gives us the following:

(f) id (f) name (ci) id (ci) name (ci) c_id (ci) f_id
1 The Mummy 1 Brendan Fraser 1 1
2 The Mummy Returns 1 Brendan Fraser 1 2
1 The Mummy 2 Rachel Weisz 2 1
2 The Mummy Returns 2 Rachel Weisz 2 2
3 Star Trek 3 Jennifer Morrison 3 3
4 The Canyon 4 Yvonne Strahovski 4 4

As you can see it’s joined everything together perfectly. Without the use of that derived table from before this would have taken two separate queries to the database which is generally detrimental to performance.

Hopefully this small look at sub queries has helped you out. Let me know if you have any queries (see what I did there?..) and I’ll try my best to answer them as quickly as I can.

I’d like to say a big thank you for your patience with the massive post gap over the Christmas period, and thank you to everyone who has supported Return True by purchasing advertisement spots (buy here), donating to the server fund or by posting links around the internet.