2 Basics
2.1 Prerequisites
This chapter introduces the basics of R programming. We review the basic operators and data types, and then introduce the basic R data structures, including tibbles and data tables.
Most of this chapter works with the basic operators and data types of R, which do not require any extra packages. We also introduce the tibble package, which is part of the tidyverse, in section 2.5, and the data.table package in section 2.6.
2.2 Basic operators
The basic arithmetic operators (+, -, *, /, ^, %/%, and %%) work like a
calculator.
3 + 2 # addition
## [1] 5
3 - 2 # subtraction
## [1] 1
3 * 2 # multiplication
## [1] 6
3 / 2 # division
## [1] 1.5
3^2 # exponent
## [1] 9
3 %/% 2 # integer division
## [1] 1
3 %% 2 # mod (remainder of a division)
## [1] 1
R uses the <- operator for assignment. You can read the following code as
assigning the outcomes of 2 + 3, 5, and 3 to the objects value_a, value_b, and
value_c respectively, which store them for later use.
value_a <- 2 + 3
value_b <- 5
value_c <- 3
You can print what is stored in an object to the console with print().
print(value_a)
## [1] 5
You can use relational operators to compare how one object relates to another.
2 + 3 == 5 # TRUE that 2 + 3 equals 5
## [1] TRUE
2 + 3 != 5 # FALSE that 2 + 3 is not equal to 5
## [1] FALSE
2 + 3 < 3 # FALSE that 2 + 3 is less than 3
## [1] FALSE
2 + 3 > 3 # TRUE that 2 + 3 is more than 3
## [1] TRUE
2 + 3 <= 5 # TRUE that 2 + 3 is less than or equal to 5
## [1] TRUE
2 + 3 >= 5 # TRUE that 2 + 3 is more than or equal to 5
## [1] TRUE
You can use logical operators to connect two or more expressions, for example to connect the results of comparisons made using relational operators.
(2 + 3 == 5) && (2 + 3 < 3) # logical AND operator
## [1] FALSE
(2 + 3 == 5) || (2 + 3 >= 3) # logical OR operator
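## [1] TRUE
Note that && and || are intended for single logical values. When given longer vectors, R versions before 4.3.0 examine only the first element of each vector (R 4.2 also raises the warnings shown below), and from R 4.3.0 onwards this is an error.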
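Because && and || expect single values, a longer logical vector is usually collapsed to one value first. The following is a minimal sketch using the base R functions all() and any(); the vector names in_range and is_positive are hypothetical and not part of the original examples.

```r
in_range    <- c(TRUE, TRUE, FALSE)  # hypothetical results of some comparison
is_positive <- c(TRUE, TRUE, TRUE)   # hypothetical results of another comparison

all(in_range)  # FALSE, because not every element is TRUE
any(in_range)  # TRUE, because at least one element is TRUE

# Each side is now a single logical value, so && and || can be used safely
any(in_range) && all(is_positive)  # TRUE
all(in_range) || all(is_positive)  # TRUE
```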
x <- c(TRUE, TRUE, FALSE)
y <- c(FALSE, TRUE, FALSE)
x && y
## Warning in x && y: 'length(x) = 3 > 1' in coercion to 'logical(1)'
## Warning in x && y: 'length(x) = 3 > 1' in coercion to 'logical(1)'
## [1] FALSE
x || y
## Warning in x || y: 'length(x) = 3 > 1' in coercion to 'logical(1)'
## [1] TRUE
To perform element-wise logical operations, use & and | instead. The ! operator negates each element.
x & y
## [1] FALSE TRUE FALSE
x | y
## [1] TRUE TRUE FALSE
!y
## [1] TRUE FALSE TRUE
2.3 Basic data types
R has six basic data types (also known as atomic data types), and you need to be familiar with them in order to use R.
| Data Type | Examples | Additional Information |
|---|---|---|
| Logical | TRUE, FALSE | Boolean values |
| Numeric | 1, 999.9 | Default data type for numbers |
| Integer | 1L, 999L | L is used to denote an integer |
| Character | "a", "R for BES" | Data type for one or more characters |
| Complex | 2 + 3i | Data type for numbers with a real and imaginary component |
| Raw | charToRaw("R for BES") | Not commonly used data type used to store raw bytes |
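To check which of these data types a value has, you can pass it to the base R function class(). The following minimal sketch uses one example value per row of the table; it is only an illustration and requires no extra packages.

```r
# One example value for each atomic data type
class(TRUE)                    # "logical"
class(999.9)                   # "numeric"
class(999L)                    # "integer"
class("R for BES")             # "character"
class(2 + 3i)                  # "complex"
class(charToRaw("R for BES"))  # "raw"

# Without the L suffix, a whole number is still stored as numeric
class(1)        # "numeric"
is.integer(1)   # FALSE
is.integer(1L)  # TRUE
```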
2.4 Basic data structures
The basic data structures in R include factors, atomic vectors, lists, matrices,
and data.frames.
Factors are used in R to represent categorical variables. Although they appear
similar to character vectors, they are actually stored as integers. You can use
the function levels() to list the categories (levels) of a factor and nlevels()
to check the number of categories.
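The integer storage behind a factor can be made visible with as.integer() and typeof(). The following is a minimal sketch in base R; the factor shirt_size is a hypothetical example that is separate from the eye colour example that follows.

```r
# Hypothetical example data
shirt_size <- factor(c("M", "S", "L", "M", "S"))

levels(shirt_size)      # "L" "M" "S" (levels are sorted alphabetically by default)
as.integer(shirt_size)  # 2 3 1 2 3 (position of each value within the levels)
typeof(shirt_size)      # "integer"
```

Each element is stored as the position of its category within levels(), which is why a factor looks like a character vector when printed but behaves like an integer vector internally.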
eye_color <- factor(c("brown", "black", "green", "brown", "black", "blue"))
nlevels(eye_color)
## [1] 4
levels(eye_color)
## [1] "black" "blue" "brown" "green"
Atomic vectors, more frequently referred to simply as vectors, are a data
structure used to store multiple values of the same data type (logical, numeric,
integer, character, complex, or raw). Vectors are one-indexed (i.e., the first
element is indexed using [1]), and you can get the number of elements in a
vector using the function length(). The function class() can be used to
reveal the class of any object in R.
vec_num <- c(1, 2, 3, 4)
class(vec_num)
## [1] "numeric"
vec_char <- c("R", "for", "BES")
class(vec_char)
## [1] "character"
# coercion if data types are mixed
vec_mix <- c("R", 4, "BES")
class(vec_mix)
## [1] "character"
# you can easily combine vectors using the c function
c(vec_char, vec_mix)
## [1] "R" "for" "BES" "R" "4" "BES"
# vector length
length(vec_num)
## [1] 4
# access first element of vector
vec_num[1]
## [1] 1
Lists are an ordered data structure used to store multiple R objects of
different types. The function list() is used to create a list, and the elements
of a list can be accessed using single brackets [] or double brackets [[]].
Using [] returns a list containing the selected element, while using [[]]
returns the selected element itself. Using the function length(), you can obtain
the number of objects in a list.
my_list <- list(
  c(1, 2, 3, 4),
  c("a", "b"),
  1L,
  matrix(1:9, ncol = 3)
)
my_list_a <- my_list[1]
class(my_list_a)
## [1] "list"
my_list_b <- my_list[[1]]
class(my_list_b)
## [1] "numeric"
length(my_list)
## [1] 4
If you have named the elements in your list, you can also access them by
specifying their names in the brackets or by using the $ operator.
named_list <- list(
  a = c(1, 2, 3, 4),
  b = c("a", "b"),
  c = 1L,
  d = matrix(1:9, ncol = 3)
)
class(named_list["a"])
## [1] "list"
class(named_list[["a"]])
## [1] "numeric"
class(named_list$a)
## [1] "numeric"
A matrix is a two-dimensional data structure used to store multiple values of
the same data type. You can use the function matrix() to create a matrix, using
the ncol and nrow arguments to specify the number of columns and rows
respectively, and the byrow argument to specify whether the data are filled in
row by row or column by column (the default).
matrix(1:12, ncol = 3, byrow = FALSE)
##      [,1] [,2] [,3]
## [1,]    1    5    9
## [2,]    2    6   10
## [3,]    3    7   11
## [4,]    4    8   12
matrix(1:12, nrow = 3, byrow = FALSE)
##      [,1] [,2] [,3] [,4]
## [1,]    1    4    7   10
## [2,]    2    5    8   11
## [3,]    3    6    9   12
matrix(1:12, ncol = 3, byrow = TRUE)
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6
## [3,]    7    8    9
## [4,]   10   11   12
Aside from numeric data types, a matrix can also be used to store other data types as long as they are homogeneous. To store heterogeneous data types, you should use a data frame, which is introduced later in this section.
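If you do mix data types when creating a matrix, R coerces every value to a single common type, just as it does for atomic vectors. The following is a minimal sketch in base R; the object name m_mixed is hypothetical.

```r
# Numbers, a string, and a logical value are all coerced to character
m_mixed <- matrix(c(1, "a", TRUE, 2), ncol = 2)
m_mixed

typeof(m_mixed)  # "character"
m_mixed[1, 1]    # "1" (a string, no longer a number)
```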
matrix(c("brown", "black", "green", "brown", "black", "blue"), ncol = 2)
##      [,1]    [,2]
## [1,] "brown" "brown"
## [2,] "black" "black"
## [3,] "green" "blue"
matrix(c("TRUE", "TRUE", "FALSE", "FALSE"), ncol = 2)
##      [,1]   [,2]
## [1,] "TRUE" "FALSE"
## [2,] "TRUE" "FALSE"
matrix(c(1L, 2L, 3L, 4L), ncol = 2)
##      [,1] [,2]
## [1,]    1    3
## [2,]    2    4
You can access the values in a matrix by providing the row and column index. For
example, [1, 2] would return the value stored in the first row, second column
of the matrix, and [3, 1] would return the value stored in the third row, first
column of the matrix. You can retrieve all the values in a column by leaving the
row index empty, and likewise retrieve all the values in a row by leaving the
column index empty. For example, [, 2] would return all the values in column 2
and [2, ] would return all the values in row 2.
m <- matrix(1:12, nrow = 3, byrow = FALSE)
m[1, 2]
## [1] 4
m[3, 1]
## [1] 3
m[, 2]
## [1] 4 5 6
m[2, ]
## [1] 2 5 8 11
A data.frame is a two-dimensional data structure used to store
heterogeneous data types in R. Because of its convenience, the data frame is
a commonly used data structure in R. You can use the function data.frame() to
create a data frame.
df <- data.frame(
  x = c(1, 2, 3),
  y = c("red", "green", "blue"),
  z = c(TRUE, FALSE, TRUE)
)
You can access the elements of a data.frame like those of a list, using [],
[[]], or $. Using [] returns a data.frame containing the selected column, while
using [[]] or $ reduces it to a vector.
df
## x y z
## 1 1 red TRUE
## 2 2 green FALSE
## 3 3 blue TRUE
df["y"]
## y
## 1 red
## 2 green
## 3 blue
df[["y"]]
## [1] "red" "green" "blue"
df$y
## [1] "red" "green" "blue"
You can also access a data.frame like a matrix.
df
## x y z
## 1 1 red TRUE
## 2 2 green FALSE
## 3 3 blue TRUE
df[1, 2]
## [1] "red"
df[, 2]
## [1] "red" "green" "blue"
df[2, ]
## x y z
## 2 2 green FALSE
2.5 Tibbles
Tibbles are essentially a modified version of the R data.frame. Therefore, you
access a tibble in much the same way as you would access a data.frame. You can
create a tibble using the function tibble(). Alternatively, you can coerce a
data frame into a tibble using as_tibble().
tibble(
  x = c(1, 2, 3),
  y = c("red", "green", "blue"),
  z = c(TRUE, FALSE, TRUE)
)
## # A tibble: 3 × 3
## x y z
## <dbl> <chr> <lgl>
## 1 1 red TRUE
## 2 2 green FALSE
## 3 3 blue TRUE
as_tibble(iris)
## # A tibble: 150 × 5
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## <dbl> <dbl> <dbl> <dbl> <fct>
## 1 5.1 3.5 1.4 0.2 setosa
## 2 4.9 3 1.4 0.2 setosa
## 3 4.7 3.2 1.3 0.2 setosa
## 4 4.6 3.1 1.5 0.2 setosa
## 5 5 3.6 1.4 0.2 setosa
## 6 5.4 3.9 1.7 0.4 setosa
## 7 4.6 3.4 1.4 0.3 setosa
....
A key difference lies in how tibbles are printed. Printing a tibble displays only the first ten rows, along with an explicit report of each column’s data type.
tb <- as_tibble(iris)
print(tb)
## # A tibble: 150 × 5
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## <dbl> <dbl> <dbl> <dbl> <fct>
## 1 5.1 3.5 1.4 0.2 setosa
## 2 4.9 3 1.4 0.2 setosa
## 3 4.7 3.2 1.3 0.2 setosa
## 4 4.6 3.1 1.5 0.2 setosa
## 5 5 3.6 1.4 0.2 setosa
## 6 5.4 3.9 1.7 0.4 setosa
## 7 4.6 3.4 1.4 0.3 setosa
....
df <- as.data.frame(iris)
print(df)
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1 5.1 3.5 1.4 0.2 setosa
## 2 4.9 3.0 1.4 0.2 setosa
## 3 4.7 3.2 1.3 0.2 setosa
## 4 4.6 3.1 1.5 0.2 setosa
## 5 5.0 3.6 1.4 0.2 setosa
## 6 5.4 3.9 1.7 0.4 setosa
## 7 4.6 3.4 1.4 0.3 setosa
## 8 5.0 3.4 1.5 0.2 setosa
## 9 4.4 2.9 1.4 0.2 setosa
....
Unlike a data.frame, a tibble is consistent about the data structure that it
returns. When indexing a tibble, [ always returns another tibble, while [[
and $ always return a vector. In contrast, a single column selected from a data
frame with [, j] is converted into an atomic vector unless drop = FALSE is
specified.
class(tb[, 1])
## [1] "tbl_df" "tbl" "data.frame"
class(tb[[1]])
## [1] "numeric"
class(tb$Sepal.Length)
## [1] "numeric"
class(df[, 1])
## [1] "numeric"
class(df[, 1, drop = FALSE])
## [1] "data.frame"
Additionally, tibbles do not do partial matching: $ raises a warning and returns NULL unless the variable specified is an exact match.
tb$Sepal.Lengt
## Warning: Unknown or uninitialised column: `Sepal.Lengt`.
## NULL
df$Sepal.Lengt
## [1] 5.1 4.9 4.7 4.6 5.0 5.4 4.6 5.0 4.4 4.9 5.4 4.8 4.8 4.3 5.8 5.7 5.4 5.1
## [19] 5.7 5.1 5.4 5.1 4.6 5.1 4.8 5.0 5.0 5.2 5.2 4.7 4.8 5.4 5.2 5.5 4.9 5.0
## [37] 5.5 4.9 4.4 5.1 5.0 4.5 4.4 5.0 5.1 4.8 5.1 4.6 5.3 5.0 7.0 6.4 6.9 5.5
## [55] 6.5 5.7 6.3 4.9 6.6 5.2 5.0 5.9 6.0 6.1 5.6 6.7 5.6 5.8 6.2 5.6 5.9 6.1
## [73] 6.3 6.1 6.4 6.6 6.8 6.7 6.0 5.7 5.5 5.5 5.8 6.0 5.4 6.0 6.7 6.3 5.6 5.5
## [91] 5.5 6.1 5.8 5.0 5.6 5.7 5.7 6.2 5.1 5.7 6.3 5.8 7.1 6.3 6.5 7.6 4.9 7.3
## [109] 6.7 7.2 6.5 6.4 6.8 5.7 5.8 6.4 6.5 7.7 7.7 6.0 6.9 5.6 7.7 6.3 6.7 7.2
## [127] 6.2 6.1 6.4 7.2 7.4 7.9 6.4 6.3 6.1 7.7 6.3 6.4 6.0 6.9 6.7 6.9 5.8 6.8
## [145] 6.7 6.7 6.3 6.5 6.2 5.9
You can read more about tibbles by typing vignette("tibble") in your console.
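The tibble examples in this section assume that the tibble package has been attached, a step the code above does not show. The following is a minimal sketch, assuming the package is installed (for example as part of the tidyverse); the object names tb_small and df_back are hypothetical.

```r
library(tibble)  # assumed to be installed, e.g. via install.packages("tibble")

tb_small <- as_tibble(iris)
is_tibble(tb_small)     # TRUE
print(tb_small, n = 3)  # show only the first three rows

# A tibble can be converted back into a plain data.frame
df_back <- as.data.frame(tb_small)
class(df_back)          # "data.frame"
```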
2.6 data.table
Like tibbles, data.tables are an enhanced version of data.frames. You can
create a data.table using the function data.table(). You can also coerce
existing R objects into a data.table with setDT() for data.frames and
lists, and with as.data.table() for other data structures. Note that
as.data.table() also works with data.frames and lists. However, setDT() is
more memory efficient because it does not create a copy of the original data
frame or list but instead converts it into a data.table by reference.
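The difference between the two coercion functions can be seen by checking the class of the original object afterwards. The following is a minimal sketch, assuming the data.table package is installed; df_a, df_b, and dt_b are hypothetical names.

```r
library(data.table)  # assumed to be installed

df_a <- data.frame(x = c(1, 2, 3))
df_b <- data.frame(x = c(1, 2, 3))

# as.data.table() returns a new object and leaves the original untouched
dt_b <- as.data.table(df_b)
class(df_b)  # "data.frame"
class(dt_b)  # "data.table" "data.frame"

# setDT() converts the original object in place, so no assignment is needed
setDT(df_a)
class(df_a)  # "data.table" "data.frame"
```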
dt <- data.table(
  x = c(1, 2, 3),
  y = c("red", "green", "blue"),
  z = c(TRUE, FALSE, TRUE)
)
class(
  setDT(
    data.frame(c(1, 2, 3))
  )
)
## [1] "data.table" "data.frame"
data.tables provide additional functionality through the way they are queried.
The general form for working with a data table is [i, j, by], which can be
read as: subset rows using i, operate on j, and group by by.
Let’s see how this works using the iris example dataset.
dt <- as.data.table(iris)
print(dt)
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1: 5.1 3.5 1.4 0.2 setosa
## 2: 4.9 3.0 1.4 0.2 setosa
## 3: 4.7 3.2 1.3 0.2 setosa
## 4: 4.6 3.1 1.5 0.2 setosa
## 5: 5.0 3.6 1.4 0.2 setosa
## ---
## 146: 6.7 3.0 5.2 2.3 virginica
## 147: 6.3 2.5 5.0 1.9 virginica
## 148: 6.5 3.0 5.2 2.0 virginica
....
If you want an explicit report of each column’s data type, as the
tibble does by default, you can set the option datatable.print.class to TRUE.
options(datatable.print.class = TRUE)
print(dt)
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## <num> <num> <num> <num> <fctr>
## 1: 5.1 3.5 1.4 0.2 setosa
## 2: 4.9 3.0 1.4 0.2 setosa
## 3: 4.7 3.2 1.3 0.2 setosa
## 4: 4.6 3.1 1.5 0.2 setosa
## 5: 5.0 3.6 1.4 0.2 setosa
## ---
## 146: 6.7 3.0 5.2 2.3 virginica
## 147: 6.3 2.5 5.0 1.9 virginica
....
You can filter the rows to only those where Species == "virginica".
dt[Species == "virginica"]
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## <num> <num> <num> <num> <fctr>
## 1: 6.3 3.3 6.0 2.5 virginica
## 2: 5.8 2.7 5.1 1.9 virginica
## 3: 7.1 3.0 5.9 2.1 virginica
## 4: 6.3 2.9 5.6 1.8 virginica
## 5: 6.5 3.0 5.8 2.2 virginica
## 6: 7.6 3.0 6.6 2.1 virginica
## 7: 4.9 2.5 4.5 1.7 virginica
## 8: 7.3 2.9 6.3 1.8 virginica
....
You can select columns using the j expression. Notice that wrapping the
variables within list() or .() ensures that a data.table is returned. In
contrast, an atomic vector is returned when list() or .() is not used. .()
is an alias for list(), so the two are equivalent.
dt[, Sepal.Length]
## [1] 5.1 4.9 4.7 4.6 5.0 5.4 4.6 5.0 4.4 4.9 5.4 4.8 4.8 4.3 5.8 5.7 5.4 5.1
## [19] 5.7 5.1 5.4 5.1 4.6 5.1 4.8 5.0 5.0 5.2 5.2 4.7 4.8 5.4 5.2 5.5 4.9 5.0
## [37] 5.5 4.9 4.4 5.1 5.0 4.5 4.4 5.0 5.1 4.8 5.1 4.6 5.3 5.0 7.0 6.4 6.9 5.5
## [55] 6.5 5.7 6.3 4.9 6.6 5.2 5.0 5.9 6.0 6.1 5.6 6.7 5.6 5.8 6.2 5.6 5.9 6.1
## [73] 6.3 6.1 6.4 6.6 6.8 6.7 6.0 5.7 5.5 5.5 5.8 6.0 5.4 6.0 6.7 6.3 5.6 5.5
## [91] 5.5 6.1 5.8 5.0 5.6 5.7 5.7 6.2 5.1 5.7 6.3 5.8 7.1 6.3 6.5 7.6 4.9 7.3
## [109] 6.7 7.2 6.5 6.4 6.8 5.7 5.8 6.4 6.5 7.7 7.7 6.0 6.9 5.6 7.7 6.3 6.7 7.2
## [127] 6.2 6.1 6.4 7.2 7.4 7.9 6.4 6.3 6.1 7.7 6.3 6.4 6.0 6.9 6.7 6.9 5.8 6.8
## [145] 6.7 6.7 6.3 6.5 6.2 5.9
class(dt[, Sepal.Length])
## [1] "numeric"
class(dt[, list(Sepal.Length)])
## [1] "data.table" "data.frame"
class(dt[, .(Sepal.Length)])
## [1] "data.table" "data.frame"
You can select multiple columns with list() or .().
dt[, .(Sepal.Length, Species)]
## Sepal.Length Species
## <num> <fctr>
## 1: 5.1 setosa
## 2: 4.9 setosa
## 3: 4.7 setosa
## 4: 4.6 setosa
## 5: 5.0 setosa
## ---
## 146: 6.7 virginica
## 147: 6.3 virginica
....
You can also save the target column names in a variable and use it to select
columns with the .. prefix.
cols <- c("Sepal.Length", "Species")
dt[, ..cols]
## Sepal.Length Species
## <num> <fctr>
## 1: 5.1 setosa
## 2: 4.9 setosa
## 3: 4.7 setosa
## 4: 4.6 setosa
## 5: 5.0 setosa
## ---
## 146: 6.7 virginica
## 147: 6.3 virginica
....
Aside from selecting columns using j, you can carry out computations in j
involving one or more columns, optionally on a subset of rows selected using i.
dt[, mean(Sepal.Length)]
## [1] 5.843333
dt[, .(
  Sepal.Length.Mean = mean(Sepal.Length),
  Sepal.Width.Mean = mean(Sepal.Width)
)]
## Sepal.Length.Mean Sepal.Width.Mean
## <num> <num>
## 1: 5.843333 3.057333
dt[
  Species == "virginica" & Sepal.Length < 6,
  .(
    Sepal.Length.Mean = mean(Sepal.Length),
    Sepal.Width.Mean = mean(Sepal.Width)
  )
]
## Sepal.Length.Mean Sepal.Width.Mean
## <num> <num>
## 1: 5.642857 2.714286
You can then use the by expression in data tables to perform computations by
group.
dt[, .(
  Sepal.Length.Mean = mean(Sepal.Length),
  Sepal.Width.Mean = mean(Sepal.Width)
),
by = Species
]
## Species Sepal.Length.Mean Sepal.Width.Mean
## <fctr> <num> <num>
## 1: setosa 5.006 3.428
## 2: versicolor 5.936 2.770
## 3: virginica 6.588 2.974
The .N variable, which counts the number of rows in each group, is particularly
useful when combined with by.
dt[, .N, by = Species]
## Species N
## <fctr> <int>
## 1: setosa 50
## 2: versicolor 50
## 3: virginica 50
You can also group by more than one variable using the list() or .() notation.
You can read the code below as calculating the mean of Sepal.Length and the
number of instances (given by .N) grouped by Species and by whether
Sepal.Length < 6.
dt[, .(Sepal.Length.Mean = mean(Sepal.Length), .N),
  by = .(Species, Sepal.Length < 6)
]
## Species Sepal.Length < 6 Sepal.Length.Mean N
## <fctr> <lgcl> <num> <int>
## 1: setosa TRUE 5.006000 50
## 2: versicolor FALSE 6.375000 24
## 3: versicolor TRUE 5.530769 26
## 4: virginica FALSE 6.741860 43
## 5: virginica TRUE 5.642857 7
data.tables add, update, and delete columns by reference, which avoids redundant
copies and improves performance. You can use the := operator to add,
update, and delete columns in j by reference. There are two forms for using
:=, namely [, LHS := RHS] and [, `:=`(LHS = RHS)].
dt <- as.data.table(iris)
dt[, Sepal.Sum := .(Sepal.Length + Sepal.Width)]
head(dt)
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species Sepal.Sum
## <num> <num> <num> <num> <fctr> <num>
## 1: 5.1 3.5 1.4 0.2 setosa 8.6
## 2: 4.9 3.0 1.4 0.2 setosa 7.9
## 3: 4.7 3.2 1.3 0.2 setosa 7.9
## 4: 4.6 3.1 1.5 0.2 setosa 7.7
## 5: 5.0 3.6 1.4 0.2 setosa 8.6
## 6: 5.4 3.9 1.7 0.4 setosa 9.3
dt[, `:=`(Petal.Sum = Petal.Length + Petal.Width)]
head(dt)
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species Sepal.Sum
## <num> <num> <num> <num> <fctr> <num>
## 1: 5.1 3.5 1.4 0.2 setosa 8.6
## 2: 4.9 3.0 1.4 0.2 setosa 7.9
## 3: 4.7 3.2 1.3 0.2 setosa 7.9
## 4: 4.6 3.1 1.5 0.2 setosa 7.7
## 5: 5.0 3.6 1.4 0.2 setosa 8.6
## 6: 5.4 3.9 1.7 0.4 setosa 9.3
## Petal.Sum
## <num>
....
Note that in the above code, we do not need to assign the result back to a
variable because the modification is done by reference, that is, in place. In
other words, we are modifying dt itself and not a copy of dt. Therefore, if you
run the entire code chunk above, dt will contain both columns
Sepal.Sum and Petal.Sum.
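The same by-reference behaviour applies to deleting a column, which is done by assigning NULL with :=. The following is a minimal sketch on a small hypothetical table (dt_demo and its columns are not part of the examples above), so that dt itself is left unchanged. It assumes the data.table package is installed.

```r
library(data.table)  # assumed to be installed

dt_demo <- data.table(a = c(1, 2, 3), b = c(10, 20, 30))  # hypothetical data

# The functional form `:=`() can add several columns in one call
dt_demo[, `:=`(total = a + b, ratio = b / a)]
names(dt_demo)  # "a" "b" "total" "ratio"

# Assigning NULL deletes a column by reference
dt_demo[, ratio := NULL]
names(dt_demo)  # "a" "b" "total"
```

In both calls the result is not assigned back to dt_demo, yet the table changes, because := works in place.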
Since := is used in j, it can be combined with i and by as we have seen
in the earlier parts of this sub-section.
dt[Species == "versicolor" | Species == "virginica", Sepal.Length := 0]
head(dt)
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species Sepal.Sum
## <num> <num> <num> <num> <fctr> <num>
## 1: 5.1 3.5 1.4 0.2 setosa 8.6
## 2: 4.9 3.0 1.4 0.2 setosa 7.9
## 3: 4.7 3.2 1.3 0.2 setosa 7.9
## 4: 4.6 3.1 1.5 0.2 setosa 7.7
## 5: 5.0 3.6 1.4 0.2 setosa 8.6
## 6: 5.4 3.9 1.7 0.4 setosa 9.3
## Petal.Sum
## <num>
....
dt[, Sepal.Length.Mean := mean(Sepal.Length),
by = .(Species)
]
head(dt)
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species Sepal.Sum
## <num> <num> <num> <num> <fctr> <num>
## 1: 5.1 3.5 1.4 0.2 setosa 8.6
## 2: 4.9 3.0 1.4 0.2 setosa 7.9
## 3: 4.7 3.2 1.3 0.2 setosa 7.9
## 4: 4.6 3.1 1.5 0.2 setosa 7.7
## 5: 5.0 3.6 1.4 0.2 setosa 8.6
## 6: 5.4 3.9 1.7 0.4 setosa 9.3
## Petal.Sum Sepal.Length.Mean
## <num> <num>
....
You can find out more about data.tables by typing
vignette(package = "data.table") into the console.