Aim: To clarify the difference between filtering data in a WHERE clause or within the OUTER JOIN.
When using an OUTER JOIN, there will be a difference in the results depending on whether the data is filtered in a WHERE clause or within the OUTER JOIN itself.
The below queries have been run against a copy of the publicly-available StackOverflow database. For context, the initial query does not contain any filtering. As expected, it returns all the rows that appear in the VoteTypes table but not in Votes.
SELECT VT.Name AS VoteType
,V.Id AS VoteId
,V.UserId
FROM dbo.VoteTypes VT
LEFT OUTER JOIN dbo.Votes V
ON VT.Id = V.VoteTypeId;
To exclude the Votes rows where the UserId is NULL, there appear to be two options:
Option 1: Filtering data in a WHERE clause
SELECT VT.Name AS VoteType
,V.Id AS VoteId
,V.UserId
FROM dbo.VoteTypes VT
LEFT OUTER JOIN dbo.Votes V
ON VT.Id = V.VoteTypeId
WHERE V.UserId IS NOT NULL;
While this query excludes the Votes rows where UserId is NULL, it also excludes the VoteTypes rows which do not have a match in Votes. It effectively makes the LEFT OUTER JOIN behave like an INNER JOIN. This is not the desired behaviour.
Option2: Filtering data within the OUTER JOIN
SELECT VT.Name AS VoteType
,V.Id AS VoteId
,V.UserId
FROM dbo.VoteTypes VT
LEFT OUTER JOIN dbo.Votes V
ON VT.Id = V.VoteTypeId
AND V.UserId IS NOT NULL;
Like the initial query, this query returns all the rows that appear in the VoteTypes table but not in Votes. It also excludes the Votes rows where UserId is NULL, which is the desired behaviour.
Be First to Comment