A Simple Analysis using R

This is part of a series of articles in which I demonstrate that R is quite easy to learn: it has tips, tricks, shortcuts and graphical user interfaces that can cut down your learning time, while adding great value to your resume and your analytical capabilities.

Let's assume you are given a dataset and you just want to do a simple analysis on it.

Well, here is the R code for it.

Let the dataset be named Ajay (note that R is case sensitive, unlike the SAS language!). So ajay is not the same as Ajay.
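A quick way to see the case sensitivity for yourself is the small sketch below. The vector stored in Ajay is only a placeholder for illustration, and the second check assumes you have not created a lowercase object called ajay in your session.

```r
# Placeholder object named with a capital A (made up for illustration)
Ajay <- c(1, 2, 3)

exists("Ajay")   # TRUE: this exact name was defined above
exists("ajay")   # FALSE, assuming no lowercase "ajay" exists in your session
```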

 

INPUT

read.csv

Ajay <- read.table("C:/Users/KUSHU/Desktop/A.csv", header=TRUE, sep=",",
                   na.strings="NA", dec=".", strip.white=TRUE)

Note that the path of the input data here is C:/Users/KUSHU/Desktop/A.csv

We are assuming header=TRUE, which means the variable names are in the first row

sep="," refers to the separator between two consecutive data elements (a comma here, since we are reading data from a comma-separated values file)

na.strings="NA" means fields containing NA are read as missing values

dec="." means we use "." as the decimal point

strip.white=TRUE strips leading and trailing blank spaces from unquoted character fields
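To see what these options do without needing the file above, here is a minimal sketch that writes a tiny CSV to a temporary file and reads it back. The file contents, the column names and the object name Demo are all made up for illustration.

```r
# Write a small CSV to a temporary file so the example runs anywhere
# (contents and column names are invented for this sketch).
tmp <- tempfile(fileext = ".csv")
writeLines(c("Names,Age,Score",
             "  Asha ,31,72.5",
             "Ravi,NA,64.0"), tmp)

Demo <- read.table(tmp, header = TRUE, sep = ",",
                   na.strings = "NA", dec = ".", strip.white = TRUE)

Demo$Names        # blanks around "Asha" have been stripped
is.na(Demo$Age)   # the NA field was read as a missing value
Demo$Score        # numbers parsed using "." as the decimal point
```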

This can look intimidating to a new R user. Instead, you can just use the graphical user interface R Commander, like this

library(Rcmdr)

and then you can simply click your way through the menus. The code is generated automatically, which helps you learn

 

DESCRIBE DATA STRUCTURE

Now that we have read in the data, we need to check the data quality

We get just the variable names using the command names

names(Ajay)

and we get the data structure using the command str

str(Ajay)

The first five observations in a dataset can be shown with

head(Ajay,5)

The last five observations in a dataset can be shown with

tail(Ajay,5)
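If you do not have a dataset at hand, the same four commands can be tried on one of the datasets that ships with R. The sketch below uses the built-in mtcars data purely as a stand-in for Ajay.

```r
# mtcars is a built-in dataset, used here only as a stand-in
data(mtcars)

names(mtcars)     # variable names
str(mtcars)       # type and first few values of each variable
head(mtcars, 5)   # first five rows
tail(mtcars, 5)   # last five rows
```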

DESCRIBE VARIABLE STRUCTURE

We can get summary statistics on the entire dataset using

summary(Ajay)

But if we want to look at only one variable, say Names, and save time, we refer to it as

summary(Ajay$Names)
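What summary returns depends on the type of the variable. Here is a small sketch on a made-up data frame (the name Demo and its values are assumptions for illustration only):

```r
# A made-up data frame with one character and one numeric variable
Demo <- data.frame(Names = c("Asha", "Ravi", "Meena"),
                   Score = c(72.5, 64.0, 88.0),
                   stringsAsFactors = FALSE)

summary(Demo$Score)   # numeric: min, quartiles, median, mean, max
summary(Demo$Names)   # character: length, class and mode
```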

Similarly, we can plot the dataset using the simple command plot

plot(Ajay)

I personally like the describe command, but for that I need to load a new package

library(Hmisc) loads the package.

library(Hmisc)

describe(Ajay)

Suppose I want to add comments to my code so it looks clean and legible. I just use the # sign, and anything after # on that line is treated as a comment
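A tiny sketch of commenting, using a made-up vector x:

```r
# A whole line starting with # is ignored by R
x <- c(4, 8, 15)   # a comment can also follow code on the same line
mean(x)            # only the part before the # is run
```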

 

OUTPUT

Finally, I want to save all my results. Well, I can export them using the menus in the GUI, or using the menu within R, or use write.table, the counterpart of read.table, to save the dataset.
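As a rough sketch of the write.table route, the code below saves a made-up data frame to a temporary file and reads it back to check the result. The object names Demo, out and Check, and the temporary path, are assumptions for illustration; in practice you would pass your own dataset and file path.

```r
# A made-up data frame standing in for the dataset
Demo <- data.frame(Names = c("Asha", "Ravi"),
                   Score = c(72.5, 64.0),
                   stringsAsFactors = FALSE)

# Hypothetical output location: a temporary file
out <- tempfile(fileext = ".csv")

# Mirror the read.table options; row.names = FALSE skips the row numbers
write.table(Demo, file = out, sep = ",", dec = ".",
            row.names = FALSE, col.names = TRUE)

# Read the file back to confirm it was saved as expected
Check <- read.table(out, header = TRUE, sep = ",", dec = ".")
```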

R is easier than you think, and doing simple analysis in R is quick thanks to the efficiency of its syntax

Enjoy your R coding.

Image courtesy of cooldesign at FreeDigitalPhotos.net