MySQL: What Are Sub Queries & How Are They Used?
Sub queries (or subqueries) can be extremely useful when creating large projects. Let’s take, for example, a project I have just completed where I needed to extract data from an IPB forum to use in a Codeigniter application. Without sub queries it would have been impossible to complete in a way that still allowed it to run in a fast & efficient manner.
What Are Sub Queries?
Well, sub queries are queries within a query. Think of them as bracketed equations in mathematics. Those sums inside brackets are completed first & then the results are used to work out the main equation. The same thing applies to sub queries. The query inside the brackets is completed and then the results are used to complete the main query.
How Are Sub Queries Used?
Let’s take a look at a simple sub query to help illustrate. I’m using a table of celebrities for this example. Why? It makes a change from boring old products…
1 2 3 |
SELECT * FROM celebrities WHERE age = (SELECT MAX(age) FROM celebrities) |
This simple query will always return the oldest celebrity (not sure if they’d appreciate that) in the celebrities table. Some people call this a dynamic query, personally I just call it useful.
Let’s go through how it works. Let’s look at the sub query first. It just selects the highest age from the celebrities table, this result is then handed back to the main query which uses it as the value for it’s WHERE filter. You may ask “Why not just just add the other fields to the sub query?” well because it will throw back an error. In most circumstances you can’t use aggregate functions and normal field selects together.
You might be asking yourself what the benefit of the above query is. Well let’s think about how you would get the same result without using sub query. You couldn’t, not without using two separate queries joined with your server side scripting language.
Derived Tables
While all sub queries are all basically queries nested inside others there are different types. Derived tables are something I found useful when doing the IPB/Codeigniter project I mentioned at the start of this tutorial.
A derived table is a temporary table of information created by gathering results from other tables. Here is an example of a query that makes use of a derived table:
1 2 3 4 5 6 7 |
SELECT * FROM films AS f JOIN ( SELECT * FROM celebrities AS c JOIN int_cf AS i ON c.id = i.c_id ) AS ci ON ci.f_id = f.id |
This query is actually an example of a many to many relationship query, but that is a completely different tutorial. It is made to join three tables. Celebrities, Films & the intermediary table called int_cf.
Let’s start off with what the tables looked like, first the celebrities table:
id | name |
---|---|
1 | Brendan Fraser |
2 | Rachel Weisz |
3 | Jennifer Morrison |
4 | Yvonne Strahovski |
Now the films table:
id | name |
---|---|
1 | The Mummy |
2 | The Mummy Returns |
3 | Star Trek |
4 | The Canyon |
Finally the intermediary table int_cf:
c_id | f_id |
---|---|
1 | 1 |
1 | 2 |
2 | 1 |
2 | 2 |
3 | 3 |
4 | 4 |
Let’s break down the query. First our sub query:
1 2 3 |
SELECT * FROM celebrities AS c JOIN int_cf AS i ON c.id = i.c_id |
This query give us results that look like this:
id | name | c_id | f_id |
---|---|---|---|
1 | Brendan Fraser | 1 | 1 |
1 | Brendan Fraser | 1 | 2 |
2 | Rachel Weisz | 2 | 1 |
2 | Rachel Weisz | 2 | 2 |
3 | Jennifer Morrison | 3 | 3 |
4 | Yvonne Strahovski | 4 | 4 |
This is our derived table, it is called that because it was derived from the results of two other tables. At the moment though it only gives us the ids of all the films. So we use the results of this derived table in our main query.
Let’s remind ourselves what it looks like:
1 2 3 4 5 6 7 |
SELECT * FROM films AS f JOIN ( SELECT * FROM celebrities AS c JOIN int_cf AS i ON c.id = i.c_id ) AS ci ON ci.f_id = f.id |
This gives us the following:
(f) id | (f) name | (ci) id | (ci) name | (ci) c_id | (ci) f_id |
---|---|---|---|---|---|
1 | The Mummy | 1 | Brendan Fraser | 1 | 1 |
2 | The Mummy Returns | 1 | Brendan Fraser | 1 | 2 |
1 | The Mummy | 2 | Rachel Weisz | 2 | 1 |
2 | The Mummy Returns | 2 | Rachel Weisz | 2 | 2 |
3 | Star Trek | 3 | Jennifer Morrison | 3 | 3 |
4 | The Canyon | 4 | Yvonne Strahovski | 4 | 4 |
As you can see it’s joined everything together perfectly. Without the use of that derived table from before this would have taken two separate queries to the database which is generally detrimental to performance.
Hopefully this small look at sub queries has helped you out. Let me know if you have any queries (see what I did there?..) and I’ll try my best to answer them as quickly as I can.
I’d like to say a big thank you for your patience with the massive post gap over the Christmas period, and thank you to everyone who has supported Return True by purchasing advertisement spots (buy here), donating to the server fund or by posting links around the internet.
6 Comments
kishore
Its nice and very easily understand full language…
Thanks a lot…. Keep it up…
Site visualization also very sweet and cool
Paul Robinson
Glad it was helpful, and thanks for stopping by. 🙂
leo
SELECT *
FROM celebrities
WHERE age = (SELECT MAX(age) FROM celebrities)
could be
SELECT * FROM celebrities
ORDER BY age DESC LIMIT 1
leo
SELECT * FROM films AS f
JOIN (
SELECT * FROM celebrities AS c
JOIN int_cf AS i
ON c.id = i.c_id
) AS ci
ON ci.f_id = f.id
could be
SELECT w.*,l.*,c.* FROM celebrities w
left join int_cf l
on w.id = l.wid
left join films c
on l.cid = c.id
Paul Robinson
You are correct on both counts, however the first query was used just to show how a sub query works in a simple to understand example. Whether it could be wrote in an alternate way, or if it is actually useful or not is not at debate.
The second query, again you are right. However I just ran both queries and the sub query was the faster of the two, so I think I’ll stick with that one. (0.0005 sec VS 0.0011 sec)