The data table, provided by the data.table package in R, is a convenient structure for manipulating your data to address various questions you may have. In this post, we will learn about filtering rows, matching text, and selecting rows by sets and ranges of numerical values.
Packages and Data Preparation
We will begin by loading the data.table package and converting the mtcars and iris datasets into data tables. Both mtcars and iris ship with R. Note that the conversion drops the row names, so the car names do not appear in the output that follows. Below is the code.
library(data.table)
mtcars<-data.table(mtcars)
iris<-data.table(iris)
Next, we will quickly examine both datasets using the head() function to understand what each one is about.
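As a minimal sketch of this inspection step, the code below prints the first rows and the dimensions of each dataset. The names cars_dt and iris_dt are hypothetical; they are used here only so that the built-in datasets are left untouched.

```r
library(data.table)

# Hypothetical names: copies of the built-in datasets as data tables
cars_dt <- as.data.table(mtcars)
iris_dt <- as.data.table(iris)

head(cars_dt)       # first six rows of the car data
head(iris_dt, 3)    # first three rows of the flower data

dim(cars_dt)        # 32 rows, 11 columns
dim(iris_dt)        # 150 rows, 5 columns
```

The mtcars data describes 32 cars with 11 numeric measurements each, while iris holds 150 flower measurements along with a Species column.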

We now move to filtering.
Filtering for Not
Our first exercise is the use of NOT logic in filtering. With NOT logic, you keep the rows that do not match a given value. In mtcars, the am column records the transmission type (0 = automatic, 1 = manual). In the code below, we are telling R to display all cars whose am value is not 0, that is, all cars with a manual transmission. The operator for NOT is != which means “does not equal”. Below is the code and output.
> # Filter all rows where am is not 0
> not_0_am <- mtcars[am !=0]
> not_0_am
mpg cyl disp hp drat wt qsec vs am gear carb
<num> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num>
1: 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
2: 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
3: 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
4: 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1
5: 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2
6: 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1
7: 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1
8: 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2
9: 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2
10: 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4
11: 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6
12: 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8
13: 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2
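The same filter can also be written by negating an equality test with the ! operator. The sketch below, which again uses the hypothetical name cars_dt, checks that both forms return the same 13 rows.

```r
library(data.table)

cars_dt <- as.data.table(mtcars)   # hypothetical name for the converted data

not_0_am <- cars_dt[am != 0]       # rows where am is not 0
negated  <- cars_dt[!(am == 0)]    # same filter, written as a negated equality

nrow(not_0_am)                     # 13
all.equal(not_0_am, negated)       # TRUE
```

Which form to use is a matter of taste; != is shorter, while !( ) is handy when a longer condition needs to be negated as a whole.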
Of course, you can have more than one condition within your filter, as we will see in the next example.
Multiple Conditions with Not
It is also possible to combine multiple conditions. In the example below, we are filtering for cars that have a manual transmission (am==1) but do not have 6 cylinders (cyl != 6). The & operator keeps only the rows where both conditions are true. The output matches the criteria that were set.
> # Filter all rows where am is 1 AND cyl is not 6
> am_cyl <- mtcars[am==1 & cyl != 6]
> am_cyl
mpg cyl disp hp drat wt qsec vs am gear carb
<num> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num>
1: 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
2: 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1
3: 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2
4: 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1
5: 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1
6: 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2
7: 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2
8: 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4
9: 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8
10: 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2
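To see how combining conditions changes the result, the sketch below counts rows with .N instead of printing them. It compares AND (&) with OR (|) and with the negation of the whole condition. The name cars_dt is hypothetical.

```r
library(data.table)

cars_dt <- as.data.table(mtcars)   # hypothetical name for the converted data

# Both conditions must hold
n_and <- cars_dt[am == 1 & cyl != 6, .N]      # 10

# At least one condition must hold
n_or  <- cars_dt[am == 1 | cyl != 6, .N]      # 28

# Every row that fails the AND filter
n_not <- cars_dt[!(am == 1 & cyl != 6), .N]   # 22
```

The AND count and the negated count add up to 32, the total number of cars in the dataset.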
Searching Text
It is also possible to filter on text and to combine text with numerical conditions. In the code below, we are searching the iris dataset for rows where the species is “setosa” and the petal length is less than 1.3.
> #with text
> iris[Species=="setosa" & Petal.Length<1.3]
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
<num> <num> <num> <num> <fctr>
1: 4.3 3.0 1.1 0.1 setosa
2: 5.8 4.0 1.2 0.2 setosa
3: 4.6 3.6 1.0 0.2 setosa
4: 5.0 3.2 1.2 0.2 setosa
We can also search for text when we only know part of what we are looking for. In the example below, we use the %like% operator to search the Species column for text containing the letter v. The match is case-sensitive, so we search for a lowercase “v”. Since the results are rather long, we use the head() function to see the first few rows.
> # Filter all rows where Species contains "v"
> any_v <- iris[Species %like% "v"]
> head(any_v)
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
<num> <num> <num> <num> <fctr>
1: 7.0 3.2 4.7 1.4 versicolor
2: 6.4 3.2 4.5 1.5 versicolor
3: 6.9 3.1 4.9 1.5 versicolor
4: 5.5 2.3 4.0 1.3 versicolor
5: 6.5 2.8 4.6 1.5 versicolor
6: 5.7 2.8 4.5 1.3 versicolor
Another way to search text is by looking for values that end with a particular string. In the example below, we are looking for values in the Species column that end with “color.” We indicate this to R by using the %like% operator again and the pattern “color” with a dollar sign at the end of it. The pattern is a regular expression, and the dollar sign tells R to match “color” only at the end of a value in the Species column.
> # Filter all rows where Species ends with "color"
> end_flowers <- iris[Species %like% "color$"]
> head(end_flowers)
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
<num> <num> <num> <num> <fctr>
1: 7.0 3.2 4.7 1.4 versicolor
2: 6.4 3.2 4.5 1.5 versicolor
3: 6.9 3.1 4.9 1.5 versicolor
4: 5.5 2.3 4.0 1.3 versicolor
5: 6.5 2.8 4.6 1.5 versicolor
6: 5.7 2.8 4.5 1.3 versicolor
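Because %like% takes a regular expression, other anchors work as well. The sketch below is an illustration under that assumption: it uses ^ to match the start of a value, shows that an uppercase pattern matches nothing, and compares %like% with base R's grepl(). The name iris_dt is hypothetical.

```r
library(data.table)

iris_dt <- as.data.table(iris)   # hypothetical name for the converted data

# ^ anchors the pattern at the start of the value
starts_v <- iris_dt[Species %like% "^v"]
as.character(unique(starts_v$Species))   # "versicolor" "virginica"

# The match is case-sensitive, so an uppercase "V" finds no rows
upper_v <- iris_dt[Species %like% "V"]
nrow(upper_v)                            # 0

# The same end-of-value filter written with base R's grepl()
ends_color <- iris_dt[Species %like% "color$"]
via_grepl  <- iris_dt[grepl("color$", Species)]
nrow(ends_color)                         # 50
nrow(via_grepl)                          # 50
```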
Multiple Numerical Values
It is also possible to filter on several numerical values at once. In the example shown below, we are looking for all cars in the mtcars dataset that have 4 or 6 cylinders. We achieve this by naming the column we are searching, cyl, followed by the %in% operator, and lastly we use the c() function with our values inside it. Below is the code and output.
> # Filter all rows where cyl is 4 or 6
> filter_cyl <- mtcars[cyl %in% c(4, 6)]
> filter_cyl
mpg cyl disp hp drat wt qsec vs am gear carb
<num> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num>
1: 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
2: 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
3: 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
4: 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1
5: 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1
6: 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2
7: 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2
8: 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4
9: 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4
10: 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1
11: 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2
12: 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1
13: 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1
14: 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1
15: 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2
16: 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2
17: 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6
18: 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2
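As a short sketch, the %in% filter can be compared with the longer OR form and with its negation. Row counts are used here in place of the full output, and the name cars_dt is hypothetical.

```r
library(data.table)

cars_dt <- as.data.table(mtcars)   # hypothetical name for the converted data

# Membership test against a set of values
n_in  <- cars_dt[cyl %in% c(4, 6), .N]       # 18

# The same filter spelled out with OR
n_or  <- cars_dt[cyl == 4 | cyl == 6, .N]    # 18

# Negating the membership test keeps everything else (the 8-cylinder cars)
n_out <- cars_dt[!(cyl %in% c(4, 6)), .N]    # 14
```

The %in% form stays readable as the list of values grows, whereas the OR form needs one comparison per value.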
In this last example, we learn to find data that falls within a range rather than matching specific values. In the code below, we are looking for cars that have an mpg between 20 and 22. The new operator in this example is %between%, which tells R to search for a range of values. The bounds are inclusive, so values equal to 20 or 22 would also be returned. Below is the code, followed by the output.
> # Filter all rows where mpg is between [20, 22]
> mpg_20_22 <- mtcars[mpg %between% c(20,22)]
> mpg_20_22
mpg cyl disp hp drat wt qsec vs am gear carb
<num> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num>
1: 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
2: 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
3: 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1
4: 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1
5: 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2
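The sketch below illustrates how the bounds behave. It compares %between% with the equivalent pair of comparisons, then uses the between() function with incbounds = FALSE to exclude the endpoints. The range 21 to 21.5 is chosen here only because some cars sit exactly on those bounds; the name cars_dt is hypothetical.

```r
library(data.table)

cars_dt <- as.data.table(mtcars)   # hypothetical name for the converted data

# %between% is inclusive, so it matches >= and <= written out by hand
n_between <- cars_dt[mpg %between% c(20, 22), .N]     # 5
n_manual  <- cars_dt[mpg >= 20 & mpg <= 22, .N]       # 5

# With bounds that some cars hit exactly, inclusiveness matters
n_incl <- cars_dt[between(mpg, 21, 21.5), .N]                      # 5
n_excl <- cars_dt[between(mpg, 21, 21.5, incbounds = FALSE), .N]   # 2
```

With the endpoints excluded, the cars at exactly 21.0 and 21.5 mpg drop out and only the two cars at 21.4 mpg remain.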
Conclusion
Data tables provide a concise way of pulling insights from data, using operators such as !=, %like%, %in%, and %between% to filter rows. The value of this approach becomes clearer when dealing with large datasets in which speed becomes important.