A Simple Analysis using R

This is part of a series of articles in which I demonstrate that R is quite easy to learn: it has tips, tricks, shortcuts and graphical user interfaces that can cut down the time you need to learn it, while adding great value to your resume and your analytical capabilities.

Let's assume you are given a dataset and you just want to run a simple analysis on it.

Well, here is the R code for it.

Let the dataset be named Ajay (note that R is case sensitive, unlike the SAS language). So ajay is not the same as Ajay.
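
A quick way to see the case sensitivity for yourself is the sketch below. The two toy objects are made up purely for illustration and have nothing to do with the real dataset.

```r
# Hypothetical toy objects, only to illustrate case sensitivity
Ajay <- c(1, 2, 3)
ajay <- c("a", "b")

# R treats the two names as two separate objects
is.numeric(Ajay)        # Ajay holds numbers
is.character(ajay)      # ajay holds text
identical(Ajay, ajay)   # FALSE, they are different objects
```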

INPUT

read.csv

Ajay <- read.table("C:/Users/KUSHU/Desktop/A.csv", header=TRUE, sep=",",
                   na.strings="NA", dec=".", strip.white=TRUE)

Note that the path of the input data is C:/Users/KUSHU/Desktop/A.csv

We are assuming header=TRUE, which means the variable names are in the first row.

sep="," refers to the separator between two consecutive data elements (a comma here, since we are reading data from a comma separated values file).

na.strings="NA" means that the text NA in the file is read as a missing value.

dec="." means we use "." as the decimal point.

strip.white=TRUE strips leading and trailing blank spaces from unquoted text fields.
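
If you want to try these options without having A.csv on your machine, here is a minimal sketch. It writes a tiny made-up file (the Names and Score columns are my assumptions, not the real data) to a temporary location, then reads it back with the long read.table call and with the shorter read.csv, whose defaults already include header=TRUE, sep="," and dec=".".

```r
# Write a tiny, made-up CSV to a temporary file so the example is self-contained
tmp <- tempfile(fileext = ".csv")
writeLines(c("Names,Score",
             "Anna, 3.5",
             "Ben,NA",
             "Cara, 4.25"), tmp)

# The long form, with every option spelled out
a1 <- read.table(tmp, header = TRUE, sep = ",",
                 na.strings = "NA", dec = ".", strip.white = TRUE)

# read.csv() is a wrapper around read.table() with CSV-friendly defaults
a2 <- read.csv(tmp, na.strings = "NA", strip.white = TRUE)

a1                  # Score for Ben is read as a missing value
all.equal(a1, a2)   # compares the two results
```

For a plain comma separated file like this one, both calls should give you the same data frame, so read.csv just saves some typing.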

This can look intimidating to a new R user. Instead, you can just use the graphical user interface R Commander, like this

library(Rcmdr)

and then you can simply click your way through the menus. The code is generated automatically, which helps you learn.

DESCRIBE DATA STRUCTURE

Now that we have read in the data, we need to check the data quality.

We get just the variable names using the command names

names(Ajay)

and we get the data structure using the command str

str(Ajay)

The first five observations in a dataset can be shown with

head(Ajay,5)

The last five observations in a dataset can be shown with

tail(Ajay,5)
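
The commands above can be tried on a small made-up data frame, as in the sketch below. The Names and Score columns are assumptions that stand in for whatever your own file contains. The last line is an extra check of my own that counts missing values per column.

```r
# Made-up data frame standing in for Ajay
Ajay <- data.frame(Names = c("Anna", "Ben", "Cara", "Dev", "Eli", "Fay", "Gus"),
                   Score = c(3.5, NA, 4.25, 2, 5, 1.5, 3))

names(Ajay)            # column names
str(Ajay)              # type and first values of each column
dim(Ajay)              # number of rows and columns
head(Ajay, 5)          # first five rows
tail(Ajay, 5)          # last five rows
colSums(is.na(Ajay))   # missing values per column
```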

DESCRIBE VARIABLE STRUCTURE

We can get summary statistics on the entire data using

summary(Ajay)

But if we want to refer to only one variable, say Names, and save time, we refer to it as

summary(Ajay$Names)
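
Here is a small sketch of the difference between summarising the whole dataset and a single variable. The data frame is made up, and Score is a hypothetical numeric column added so there is something to compute on.

```r
# Made-up data frame standing in for Ajay
Ajay <- data.frame(Names = c("Anna", "Ben", "Cara", "Dev", "Eli"),
                   Score = c(3.5, NA, 4.25, 2, 5))

summary(Ajay)             # every column at once
s <- summary(Ajay$Score)  # one numeric column: quartiles, mean and NA count
s
summary(Ajay$Names)       # one text column: a much shorter summary
```

The $ sign picks out one column by name, so the same pattern works with any other command that accepts a single variable.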

Similarly, we can plot the dataset using the simple command plot

plot(Ajay)
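
As a rough sketch of what plot can do, the example below draws a few pictures from a made-up data frame and saves them to a temporary PDF file. The Height, Weight and Age columns are assumptions for illustration only.

```r
# Made-up numeric data standing in for Ajay
Ajay <- data.frame(Height = c(150, 160, 170, 180, 190),
                   Weight = c(52, 58, 69, 80, 88),
                   Age    = c(21, 35, 28, 44, 39))

out <- tempfile(fileext = ".pdf")   # temporary file for the pictures
pdf(out)                            # open a PDF graphics device
plot(Ajay)                          # scatterplot matrix of the three columns
plot(Ajay$Height, Ajay$Weight,      # one variable against another
     xlab = "Height", ylab = "Weight")
hist(Ajay$Weight)                   # distribution of a single variable
invisible(dev.off())                # close the device so the file is written
```

If you work interactively you can drop the pdf and dev.off lines, and the plots appear in a window instead.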

I personally like the describe command, but for that I need to load a new package.

library(Hmisc) loads the package (install it once with install.packages("Hmisc") if you do not have it).

library(Hmisc)

describe(Ajay)
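
Since Hmisc may not be installed on every machine, a cautious version of the same step could look like the sketch below. The data frame is made up, and the check with requireNamespace is my own addition so the script does not stop with an error.

```r
# Made-up data frame standing in for Ajay
Ajay <- data.frame(Names = c("Anna", "Ben", "Cara", "Dev", "Eli"),
                   Score = c(3.5, NA, 4.25, 2, 5))

# Only call describe() when the Hmisc package is available
if (requireNamespace("Hmisc", quietly = TRUE)) {
  print(Hmisc::describe(Ajay))
} else {
  message("Hmisc is not installed; run install.packages(\"Hmisc\") first")
}
```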

Suppose I want to add comments to my code so it looks clean and legible. I just use the # sign, and anything after the # on that line is treated as a comment.
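
A tiny illustration of comments, using a made-up vector x:

```r
# A whole-line comment: R ignores everything on this line
x <- c(2, 4, 6)   # a trailing comment after some code
mean(x)           # gives 4

# mean(x) * 100   # a commented-out command is never run
```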

OUTPUT

Finally I want to save all my results. Well, I can export them using the menus in the GUI, or using the menus within R, or swap the read.table statement for write.table, which saves the dataset to a file (write.table takes the data frame first and the file path second).

R is easier than you think, and doing a simple analysis in R is fast because the syntax is so compact.
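
Here is a minimal sketch of saving with write.table, using a made-up data frame and a temporary file in place of a real path. Reading the file back in is only there to check that nothing was lost on the way.

```r
# Made-up data frame standing in for Ajay
Ajay <- data.frame(Names = c("Anna", "Ben", "Cara"),
                   Score = c(3.5, NA, 4.25))

out <- tempfile(fileext = ".csv")   # replace with your own path

# Mirror image of the read.table call: data first, then the file
write.table(Ajay, file = out, sep = ",", dec = ".", na = "NA",
            row.names = FALSE, col.names = TRUE)

# Read the file back to confirm the round trip
back <- read.table(out, header = TRUE, sep = ",",
                   na.strings = "NA", dec = ".", strip.white = TRUE)
all.equal(Ajay, back)
```

Setting row.names = FALSE keeps R from writing an extra column of row numbers, which is usually what you want when the file is going to be opened in another tool.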

Enjoy your R coding.


About Ajay Ohri

Ajay Ohri is an independent writer and blogger in the field of analytics since 2007. His Decisionstats.com website reaches more than 15,000 people on a monthly basis and has carried more than 90 interviews with noted leaders in the analytics space including authors, analysts, founders, and senior management of practically the entire analytics industry. Ajay has been a proponent of R since its early days and is currently working on a book on the same topic.

2 Responses to A Simple Analysis using R

  1. Ramma says:

    About 3 years ago, I did a side job with a TV station that wanted to catalog information on daycare facilities around their city, which was spread over 3 counties. 2 counties provided CSV files, one provided PDF. I got hired to extract and tabulate 900 PDF reports, which all had pretty much the same layout, by converting the PDF into a text file and then using a mess of regular expressions to extract the data into a tabular form. It was a fun exercise for me that got me some coffee money (I don't do this regularly, I was a friend of someone there who knew I had done something similar before for my own personal stuff) and some more experience in how to handle problems like this that shouldn't exist but do.

  2. K.Sandeep Kumar says:

    Hi,
    I am learning the R language on my own from the web. I am currently working on linear discriminant analysis. When I use the lda() function, I get only the group means, the coefficients of linear discriminants and the proportion of trace, and a table with the predict command gives me the classification table. But I am unable to get Fisher's discriminant function coefficients or the classification function coefficients. Please tell me how to get Fisher's discriminant function coefficients in R …

    Thanks in advance.
    Sandeep
