The problem of finding the amount of matches in another table is one that had me stumped for a while, but after googling a while, looking through concepts that were new to me, I figured it out. Since it may be a fairly common problem, I thought I’d share my solution here.
I used this technique to find the amount of posts a user has on a forum, in a situation where I have a users table, and a forum table. Other possibilities could be amount of sales for a certain item in a webshop, where the items are in a table, and the sales in another. I’ll use the users / posts example to show the query.
|143||3||42||Cras sodales molestie…|
|144||3||54||Phasellus condimentum euismod…|
|145||1||65||Aliquam placerat nibh ut…|
|146||3||54||Donec nec convallis…|
|147||1||54||Sed tincidunt fermentum…|
|148||1||60||Aenean id turpis…|
SELECT users.user_id, users.username, COALESCE(pc.postcount, 0) AS postcount FROM users LEFT JOIN ( SELECT forum.user_id, COUNT(forum.post_id) AS postcount FROM forum WHERE 1 GROUP BY forum.user_id ) pc ON pc.user_id=users.user_id WHERE 1
How it works
First, we select the information of the users, simply, by referring to them using
table.field. Then, we also include a postcount field, but not quite like we’re used to seeing it. Since the value
NULL would be retrieved for users who don’t have any posts, we have a workaround. The
COALESCE function returns the first non-NULL argument it is given. So if there is no match for a certain
user_id in the table
'forum', the resulting
NULL will not be used, we get a regular numeric 0 instead.
The simple user information is retrieved, as expected, from the users table. After that it gets a little more tricky. We perform a
LEFT JOIN with what is called a subquery. Within the parentheses after
LEFT JOIN, there is a whole new query; the resultset of this query will behave itself as another table we’re joining.
In our subquery we select the
postcount. Normally, using
COUNT() this way would simply return a single result, with the amount of rows in the whole forum table. However, because we’re using
GROUP BY forum.user_id here, it will count the
postcount for each unique value of
user_id encountered in
forum. This gives us a simple
postcount table as a result.
Now we’re outside the parentheses again. The
pc you see after the parentheses is the name we assign to the resultset of our subquery. We can refer to it as if it were any regular table, as we see in the
SELECT part of our main query, as well as the
ON part we’re at now. We want to join the two tables where the
user_id‘s match, again, like any regular join. In case there is no match for a
user_id from the table
users, there would be no corresponding
postcount value to join in the final resultset, thus, a value of
NULL is given, which is then ‘changed’ to 0 by the
Here you see what the outcome of above query would be on our given tables:
This was all tested and figured out on MySQL 5+. Thus, I don’t know if it would work on other SQL servers. As far as I know, the
IFNULL function is the one thing that’s usually different throughout database servers. The subquery system will mostly be the same-ish.
If you know more about this, or I’m misinforming on anything, please, by all means, let me know in the comments and I’ll change it up to be as accurate as possible.
Also, if you know a more efficient way to go for this specific task, let me know as well. Obviously, feel free to ask any questions if something is unclear.
April 23rd, 2011: Changed the
IFNULL to the standards-compliant
COALESCE function – thanks Omar.
April 30rd, 2011: Fixed the query:
COUNT(forum. – Thanks Ringish.
postID) AS postcount