orange and purple merging in water

UNION and INTERSECT in SQL

UNION and INTERSECT are two useful statements used in SQL for specific purposes. In this post, we will define, provide examples of each statement, and compare and contrast UNION, INTERSECT, with the JOIN command.

UNION

The UNION statement is used to append rows together from different select statements. Below is the code followed by the output and explanation.

(
SELECT street_address, city, state, postal_code
FROM customers
WHERE street_address IS NOT null
)
union
(
SELECT street_address, city, state, postal_code
FROM dealerships
WHERE street_address IS NOT null 
)

Notice how the select statements are both in their own set of parentheses with the union statement between them. In addition, you can see that both select statements used the same columns. Remember we are trying to combine data from different places that have the same columns. You can ignore this but the output will be hard to comprehend. For example, if you have different columns in each select statement you will get an output but it will be hard to interpret.

ad

Essentially, in the example above, we took the same columns from two different tables and created one table. UNION removes duplicates. If you want duplicates you must use UNION ALL

In contrast, joins are used to append columns together based on criteria. If we were to join two or more tables the joined table would increase in its number of columns and possibly its rows. A UNION will have a specified number of columns while growing in terms of the number of rows that are present in the output.

INTERSECT

INTERSECT finds common rows between select statements. It is highly similar to JOIN with the main difference being INTERSECT removes duplicates while JOIN will not. Below is an example.

SELECT state,postal_code FROM customers  
INTERSECT 
SELECT state,postal_code FROM dealerships;

The code is simple. You make your select statements and place INTERSECT in between them. The results above show us what data these two select statements share. When state and postal_code are the criteria for these two tables only these three rows are in common. If we did a JOIN we would get every instance of these three state and postal codes rather than just the unique ones.

Conclusion

UNION and INTERSECT have their place in the data analyst toolbelt. UNION is for appending rows together. INTERSECT is for finding commonalities between different select statements. Both UNION and INTERESECT remove duplicates which is not done when using a JOIN.

Leave a Reply