Aggregating Data in SQL

Aggregation is focused on summarizing data in order to develop some insights for taking action. This post will provide some simple ways to aggregate data in SQL. Specifically, we will learn how to use the following commands

  • COUNT
  • SUM
  • MIN
  • MAX
  • AVG

The data we are using is nba_salary which is available at the Kaggle website.

COUNT

The COUNT command allows you to count the number of rows that contain data in a table. In the code below we are going to count the number of data containing rows in the Seasons_Stats table.

SELECT COUNT(*)
FROM Seasons_Stats

As you can see the output is not much only one cell. In this first example we took into consideration all of the columns in the table. That is why there is an asterisk inside the COUNT function parentheses. However, you can also limit this to one or more columns as shown below.

SELECT COUNT(Player)
FROM Seasons_Stats

In the code above, we limited are query to only the “Player” column in the Table. You can see there is a differece in the count for this example and the previuous one. This indicates that some of the rows do not have data for the Player’s name.

SUM

The sum gives a total for numerical values. For our example, we are going to add up all the points scored by the players in our table.

SELECT SUM(PTS)
FROM Seasons_Stats

In this table, you can see the number of points scored. This command is rather straightforward in how to use it,

MIN and MAX

The MIN and MAX commands do what they say. To make this a little more interesting we are going to use these two commands together to find the range of the data. In the table we are going to use the Age column which shares the age of the player. We are going to do three things in the code below

  • calculate the minimum age
  • Calculate the maximum age
  • Find the difference between the MAX and the MIN to get the range of the age

The code and output are below

SELECT MIN(Age),
	   MAX(Age),
	   MAX(Age) - MIN(Age)
FROM Seasons_Stats

AVG

The AVG command calculates the average. this is also rather simple. Therefore, the code is below for the average age of a basketball player.

SELECT AVG(Age)
FROM Seasons_Stats

Conclusion

This post share some simple ways in which aggregation can happening when using SQL. There are naturally other ways that these tools can be used. However, for informative purposes this will provide a foundation for someone who is new to using SQL.

Leave a Reply