Chapter 2 Getting acquainted with R
These lectures concentrated on basic R syntax, different data types, some built-in functions and methods for reading data from external files. Thanks to the RStudio Team (2020) for providing a intuitive interface to work with R.
R is an object orientated programming language focus around statistics programming and data analysis, the way it work is that it store each instance in memory so it can refer it later. R comes with some predefined variables like LETTERS
and letters
, which holds a list with the capitalized and non capitalized letters of the alphabet, constants such pi and i
and j
for imaginary numbers. Some interesting little behavior that happen (like in many other programming languages), is that do to limited precision on floating points sometimes results that doesn’t look zero actually mean 0.
The ? symbol and help can be used to access the documentation page of built-in functions. Some other functions great for getting information about functions are: the args()
and attributes()
which gives the arguments and attributes of an object respectably.
? dimnames()
help("mean")
ls() # list of variables
args(read.table) # get arguments from a function
my_vector <-c(1,2)
attributes(my_vector) # get the attributes of an object
class(my_vector)
There are two methods to print the value of a variable both print
and cat
, print is a generic function that can be implemented across user-defined objects. While cat display a character string of the parameter pass to it.
## [1] 71
## 71
When working with R, you have an assigned working directory which usually is initialize at the directory where your R project is located. There are functions that let you interact with the file system and individual files.
getwd() # get working directory
path.expand("~") # shows the location of the root directory
# setwd(~) # set working directory to root directory
list.files()
file.info('index.Rmd')
Some easy key shortcut in order to keep a clean console and workspace always come in handy. control + L clears the console and
remove(list=ls())
erases all the variables stored in the environment __ This is a potentially dangerous__ because it can erase all your specially if you have been using scripting in the console instead of code written on files but can be useful to give new breathing air when you work is getting to clutter.
2.1 Looking at Data
The following functions allows you to get a look at different aspects of an object, they become very useful when dealing with datasets and doing exploratory analysis.
object.size()
allows you to know how much space a certain object is occupying in memory mostly used with datasets.str()
display the structure of an objectsummary()
is very useful to get a sense of the data it give properties such as the mean, the median, the number of NA values, etc.table()
display information in a table formatdim()
show the dimension of a dataframe
names()
show the names of the columns of a datafram
2.2 Sessions and environment
Sessions store information about the working directory, this configuration can be seen in the .Rdata file.
environments are the container where objects are stored. Objects created by the user get stored in the global environment, core functions are store in the base environment and every library loaded has its own environment. When R wants to find the value/definition of an object it searches trough this environment in a particular order starting by the global one. When evaluating a function R creates its own environment, that is why variables created inside a function have no effect in the main environment.
environments are a collection of symbol/value pairs
References
RStudio Team. 2020. RStudio: Integrated Development Environment for R. Boston, MA: RStudio, Inc. http://www.rstudio.com/.