The main data structures used in R are:
Vectors
Matrices
Arrays
Factors
Data Frames
Lists
Let's take a quick tour of these data structures. Don't worry if not everything makes sense right away; it takes time to sink in. The idea is to get an overview now and refer back to it whenever needed. As we progress in our R journey, these will become crystal clear:
Contact at TJT@TechnicalJockey.com , if you are looking for an Instructor Based Online Training !
1. Vectors
A vector is the most basic data structure in R. It is a collection of elements of the same type, such as character, logical, integer or numeric.
a <- c(1,2,3,4)
a
Output :
[1] 1 2 3 4
We can check whether the object "a" is a vector:
is.vector(a)
Output:
[1] TRUE
Similarly, we can create a character vector "b" as:
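A minimal sketch of such a character vector. The values are illustrative assumptions, not taken from the original example:

```r
# Hypothetical values, chosen only for illustration
b <- c("red", "green", "blue")
b
is.vector(b)     # TRUE: b is a vector
is.character(b)  # TRUE: every element is a character string
class(b)         # "character"
```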
2. Matrices
A matrix is a collection of data elements arranged in a two-dimensional rectangular layout.
We can look up the documentation of the matrix function with:
?matrix
This opens the description and syntax of the matrix function in the Help window.
The syntax of matrix is:
matrix(data = NA, nrow = 1, ncol = 1, byrow = FALSE, dimnames = NULL)
data = the data values used to fill the matrix
nrow = number of rows
ncol = number of columns
byrow = if TRUE, the matrix is filled by row; if FALSE (the default), it is filled by column
dimnames = a list of row names and column names
Number of elements = nrow * ncol
In the example below we supply 6 elements and set nrow = 3, so R works out ncol = 2.
The default value of byrow is FALSE and of dimnames is NULL.
mat <- matrix(c(1,0,4,2,1,5), nrow = 3)
mat
Here byrow = FALSE, so the elements are filled column by column:
     [,1] [,2]   # column indices
[1,]    1    2   # row indices
[2,]    0    1
[3,]    4    5
We refer to an element of mat as mat[r,c], where r is the row number and c is the column number.
mat[1,2] # element in the first row and second column
Output :
[1] 2
mat[2,2]
Output :
[1] 1
We can show a whole row as mat[r,] and a whole column as mat[,c].
mat[1,] # all elements of the first row
Output:
[1] 1 2
mat[,2] # all elements of the second column
Output:
[1] 2 1 5
We can also fill the elements by row, by setting byrow = TRUE.
The elements are then placed in the matrix row by row.
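A short sketch of filling by row, using the values 1, 0, 4, 2, 1, 5 and assuming byrow = TRUE (the name mat_by_row is only illustrative):

```r
# Six values placed row by row instead of column by column
mat_by_row <- matrix(c(1, 0, 4, 2, 1, 5), nrow = 3, byrow = TRUE)
mat_by_row
#      [,1] [,2]
# [1,]    1    0
# [2,]    4    2
# [3,]    1    5
```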
We can check the class of the mat object with:
class(mat)
Output:
[1] "matrix" "array"
(In R versions before 4.0.0 the output is just "matrix".)
We can name the rows and columns using dimnames:
mat <- matrix(c(1,0,4,2,1,5), nrow = 3, dimnames = list(c("a","b","c"), c("x","y")))
mat
Recycling :
Recycling means reusing values. When the data supplied is shorter than the matrix to be filled, R reuses the values from the start.
Here we create a matrix from 5 elements and set the number of rows to 2, so the number of columns is 3.
x <- matrix(c(1,2,3,4,5), 2)
R gives a warning message saying that the data length is not a multiple of the number of rows (2). It fills the one remaining cell by recycling, starting again from the first element.
In the same way, when we create a matrix of 10 elements from fewer values, the values are repeated to fill the remaining cells.
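Both recycling cases can be sketched as follows. The second call, which builds a 2 x 5 matrix from the same 5 values, is an assumed example of the 10-element case:

```r
# 5 values, 2 rows: R needs 3 columns (6 cells), so the first value is reused.
# suppressWarnings() hides the recycling warning; remove it to see the message.
x <- suppressWarnings(matrix(c(1, 2, 3, 4, 5), nrow = 2))
x
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    1

# 5 values, 10 cells: the values are used twice to fill the matrix.
y <- matrix(c(1, 2, 3, 4, 5), nrow = 2, ncol = 5)
y
```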
We can also create a matrix from an existing data object:
a <- c(5,3,8,7,11,9)
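One way to finish this step, assuming we want two rows (the nrow value and the name m are illustrative choices):

```r
a <- c(5, 3, 8, 7, 11, 9)
m <- matrix(a, nrow = 2)  # ncol is inferred as 6 / 2 = 3
m
#      [,1] [,2] [,3]
# [1,]    5    8   11
# [2,]    3    7    9
```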
We can also create a matrix using dim(). Assigning a dimension to the object "a" with dim() turns it into a matrix.
a <- 1:20
We give the dimension as rows x columns.
dim(a) <- c(4,5) # number of rows = 4, number of columns = 5
a
Output :
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    5    9   13   17
[2,]    2    6   10   14   18
[3,]    3    7   11   15   19
[4,]    4    8   12   16   20
We can transpose a matrix using t().
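A small sketch of transposition on an assumed 2 x 3 matrix m:

```r
m <- matrix(1:6, nrow = 2)  # 2 rows, 3 columns
m
t(m)        # rows become columns: 3 rows, 2 columns
dim(m)      # 2 3
dim(t(m))   # 3 2
```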
%*% Operator
This operator performs matrix multiplication. For example, we can use it to multiply a matrix by its transpose.
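As an illustration, here is a sketch that multiplies an assumed 3 x 2 matrix by its transpose. Note that `*` would multiply element by element instead, and requires both operands to have the same dimensions:

```r
m <- matrix(c(1, 0, 4, 2, 1, 5), nrow = 3)  # 3 x 2
p <- m %*% t(m)   # (3 x 2) %*% (2 x 3) gives a 3 x 3 matrix
p
q <- t(m) %*% m   # (2 x 3) %*% (3 x 2) gives a 2 x 2 matrix
q
```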
We can bind matrices by row or by column.
cbind()
We can bind two matrices column-wise. When we bind columns, the matrices must have the same number of rows.
X <- c(1,2,5,7,8)
Y <- c(11,24,85,98,12)
cbind(X,Y)
Output:
     X  Y
[1,] 1 11
[2,] 2 24
[3,] 5 85
[4,] 7 98
[5,] 8 12
When the two matrices do not have the same number of rows, they cannot be joined, and R raises an error while binding them.
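The error case can be sketched like this, using two assumed matrices m1 and m2 with different row counts. tryCatch() is used only so that the script keeps running and the message can be inspected:

```r
m1 <- matrix(1:6, nrow = 3)  # 3 rows
m2 <- matrix(1:4, nrow = 2)  # 2 rows

# cbind() fails because the row counts differ; we capture the error message
outcome <- tryCatch(
  cbind(m1, m2),
  error = function(e) conditionMessage(e)
)
outcome
```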
We can also bind vectors with the following code:
v <- c(1,2,4,5,9)
h <- c(2,8,9,4,7)
cbind(v,h)
rbind()
We can also bind matrices row-wise. When we bind rows, the two matrices must have the same number of columns.
We can also bind vectors row-wise:
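A minimal sketch of both cases. The vectors reuse the values of v and h from the cbind() example, while the matrices m1 and m2 are assumptions made for illustration:

```r
v <- c(1, 2, 4, 5, 9)
h <- c(2, 8, 9, 4, 7)
r <- rbind(v, h)   # each vector becomes one row
r
#   [,1] [,2] [,3] [,4] [,5]
# v    1    2    4    5    9
# h    2    8    9    4    7

# Binding two matrices by row: both have 3 columns
m1 <- matrix(1:6, nrow = 2)    # 2 x 3
m2 <- matrix(7:12, nrow = 2)   # 2 x 3
stacked <- rbind(m1, m2)       # 4 x 3
stacked
```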
3. Arrays
Arrays can store data in more than two dimensions. If we create an array of dimension (2,3,4), it contains 4 rectangular matrices, each with 2 rows and 3 columns.
An array is created using array(). We use the dim argument to set the dimensions of the array.
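The (2,3,4) case can be sketched as follows, assuming the values 1 to 24 as filler data and arr as an illustrative name:

```r
arr <- array(1:24, dim = c(2, 3, 4))
dim(arr)        # 2 3 4
arr[, , 1]      # first 2 x 3 matrix, holding the values 1 to 6
arr[2, 3, 4]    # last element of the last matrix: 24
```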
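Arrays are recycled in the same way as matrices. We create two vectors and pass them to array() to fill its elements. Together they hold 10 values for 18 cells, so the values are reused from the start.
vector1 <- c(5,9,3)
vector2 <- c(10,11,12,13,14,15,16)
result <- array(c(vector1,vector2), dim = c(3,3,2))
result
Output:
, , 1

     [,1] [,2] [,3]
[1,]    5   10   13
[2,]    9   11   14
[3,]    3   12   15

, , 2

     [,1] [,2] [,3]
[1,]   16    3   12
[2,]    5   10   13
[3,]    9   11   14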
We can name the rows, columns and matrices of the array using the dimnames parameter.
vector1 <- c(5,9,3)
vector2 <- c(10,11,12,13,14,15)
column.names <- c("first","second","third")
row.names <- c("first","second","third")
matrix.names <- c("Matrix1","Matrix2")
result <- array(c(vector1,vector2), dim = c(3,3,2),
                dimnames = list(row.names, column.names, matrix.names))
We can show the third row of the second matrix of the array.
result[3,,2]
We can show the element in the 1st row and 2nd column of the 1st matrix.
result[1,2,1]
[1] 10
We can look at the whole second matrix.
result[,,2]
We can create matrices from the array.
mat1 <- result[,,1]
mat2 <- result[,,2]
We can also add the two matrices:
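A self-contained sketch of the addition, rebuilding the array from the same vectors (dimnames are omitted for brevity, and the name total is an illustrative choice):

```r
vector1 <- c(5, 9, 3)
vector2 <- c(10, 11, 12, 13, 14, 15)
result <- array(c(vector1, vector2), dim = c(3, 3, 2))

mat1 <- result[, , 1]
mat2 <- result[, , 2]

# "+" adds matrices element by element; both must have the same dimensions
total <- mat1 + mat2
total
```

Because the 9 supplied values are recycled to fill 18 cells, the two matrices are identical here, so each entry of total is twice the matching entry of mat1.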
4. Factors
Factors are data objects used to categorize data and store it as levels. They can be built from both strings and integers.
Factors are created using the factor() function.
l <- factor(c("male","female"))
Levels are all the possible values of the given object. We can check the levels of an object with:
levels(l)
[1] "female" "male"
We create another factor variable, Name:
Name <- c(1,2,1,1,2,1,2,1,2,1,2,1,2,1)
Name <- factor(Name)
levels(Name)
[1] "1" "2"
class(Name)
[1] "factor"
To convert the default levels of the factor Name to Roman numerals, we use the assignment form of the levels() function:
levels(Name) = c('I','II')
Table
The table() function builds a contingency table of the counts of each combination of factor levels.
mons = c("March","April","January","November","January",
"September","October","September","November","August",
"January","November","November","February","May","August",
"July","December","August","August","September","November",
"February","April")
mons = factor(mons)
table(mons)
mons
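The counts for the month data above can also be checked programmatically. This is a sketch; the name counts is an illustrative choice:

```r
mons <- factor(c("March", "April", "January", "November", "January",
                 "September", "October", "September", "November", "August",
                 "January", "November", "November", "February", "May", "August",
                 "July", "December", "August", "August", "September", "November",
                 "February", "April"))
counts <- table(mons)

counts[["November"]]       # 5 occurrences
counts[["August"]]         # 4 occurrences
sum(counts)                # 24 values in total
names(which.max(counts))   # "November", the most frequent month
```

The levels are sorted alphabetically by default, so the table lists April first rather than January.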
5. Data Frames
A data frame is a two-dimensional data structure. The characteristics of a data frame are:
The column names should be non-empty; every column is assigned a name.
The row names should be unique.
The data can be of numeric, character or factor type.
Each column should contain the same number of elements.
We create a data frame named "myFirstDataFrame" as:
myFirstDataFrame <- data.frame(name = c("Bob", "Fred", "Barb", "Sue","Jeff"),
age = c(21,18,18,24,20),
hgt= c(70,67,64,66,72),
wgt= c(180,156,128,118,202),
race= c("Cauc", "Af.Am","Af.Am", "Cauc", "Asian"),
year= c("Jr","Fr","Fr","Sr","So"),
SAT= c(1080,1210,840,1340,880))
myFirstDataFrame
We can view the data frame with:
View(myFirstDataFrame)
We can find the number of rows and columns using nrow() and ncol().
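A minimal sketch on a smaller, hypothetical data frame df, so that the example runs on its own:

```r
# A cut-down data frame used only for illustration
df <- data.frame(
  name = c("Bob", "Fred", "Barb"),
  age  = c(21, 18, 18)
)

nrow(df)  # 3 rows
ncol(df)  # 2 columns
dim(df)   # both at once: 3 2
```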
6. Lists
A list is a generic vector. It can combine objects of different types.
We can create a list as:
lst <- list(1:4, "abc", TRUE)
lst
Output :
[[1]] # first object in the list
[1] 1 2 3 4 # elements of the first object
[[2]] # second object in the list
[1] "abc" # elements of the second object
[[3]] # third object in the list
[1] TRUE # elements of the third object
We can create a list by combining different objects:
a <- c(1,5,4,7,8)
b <- c("Alec", "Dan", "Rob", "Rich")
c <- c(TRUE, TRUE, FALSE, FALSE)
list1 <- list(a,b,c)
list1
We create a list containing an integer vector, a matrix and a character string:
x <- 1:10
y <- matrix(1:12, nrow=3)
z <- "Hello"
mylist <- list(x,y,z)
mylist
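We create a list that contains a character vector, a matrix and another list:
list_data <- list(c("January","February","March"), matrix(c(4,8,6,9,5,3), nrow = 2),
list("yellow",15.4))
We can name the list elements using names():
names(list_data) <- c("1st Quarter", "A_Matrix", "A Inner list")
list_data
Output:
$`1st Quarter`
[1] "January"  "February" "March"

$A_Matrix
     [,1] [,2] [,3]
[1,]    4    6    5
[2,]    8    9    3

$`A Inner list`
$`A Inner list`[[1]]
[1] "yellow"

$`A Inner list`[[2]]
[1] 15.4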

