5 Reasons to Learn Python if You Already Know R Programming
While browsing data scientist job descriptions on portals like Glassdoor, I keep noticing how often Python appears alongside R in the list of required skills. I have used both languages and I love working with both. I started as an R user and picked up Python along the way.
I found many similarities between R and Python when it comes to wrangling data, and these similarities helped me pick up the language quite quickly. One of the most commonly used Python libraries is pandas, and it is this library that makes working in Python feel so much like working in R.
Python & R: The Similarities
- Use of data frames: Both R and Python (via pandas) rely heavily on data frames for crunching data. A data frame is one of the most convenient data structures for handling tabular data, and anyone who has used R will inevitably have worked with one. There are striking syntactic similarities between R’s data.frame and the pandas DataFrame. Sample the code below to see the similarities between the two. (I am using the famous iris dataset.)
The slight difference in syntax arises because, in Python, the indices of any sequence start from 0 instead of 1.
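The original listing is not reproduced here, so the following is only a minimal sketch of the kind of comparison described. It assumes pandas is installed and uses a tiny hand-typed sample with iris-style column names (the column names and values are illustrative, not the full dataset). The R equivalents appear as comments.

```python
import pandas as pd

# A tiny hand-typed sample with iris-style columns (illustrative values only).
iris = pd.DataFrame(
    {
        "sepal_length": [5.1, 4.9, 4.7, 7.0, 6.3],
        "sepal_width": [3.5, 3.0, 3.2, 3.2, 3.3],
        "species": ["setosa", "setosa", "setosa", "versicolor", "virginica"],
    }
)

# R: head(iris, 3)
first_rows = iris.head(3)

# R: iris$species
species = iris["species"]

# R: iris[1, 1]   (first row, first column; R counts from 1)
first_cell = iris.iloc[0, 0]  # pandas counts from 0

# R: iris[iris$sepal_length > 5, ]
long_sepals = iris[iris["sepal_length"] > 5]

print(first_rows)
print(first_cell)
print(long_sepals)
```

Note how the first cell is `[1, 1]` in R but `iloc[0, 0]` in pandas, which is the zero-based indexing difference mentioned above.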
- Visualization using ggplot: Most R users become accustomed to using ggplot for visualizing data. If you transition to Python, you need not learn a completely new visualization library: ggplot-style packages for Python (plotnine is one example) let you keep creating the stunning visualizations you are accustomed to creating in R!
Both sets of code produce the same plot.
- Data I/O: The process of ingesting common flat files such as .csv and .tsv is very similar in both languages. Sample the code below.
As can be seen, the process of data ingestion is almost identical: both R and Python use a “read” function, and both provide head() for a quick look at the first few rows of the data.
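As a hedged illustration of the point above, here is a small sketch assuming pandas. To keep it self-contained it reads from an in-memory string instead of a file on disk; the file name `scores.csv` in the comments and the column names are hypothetical.

```python
import io
import pandas as pd

# Stand-in for a file on disk, so the example runs without any external data.
csv_text = "name,score\nasha,82\nben,75\nchen,91\n"

# R: df <- read.csv("scores.csv")
df = pd.read_csv(io.StringIO(csv_text))

# Tab-separated input. R: read.delim("scores.tsv")
tsv_text = csv_text.replace(",", "\t")
tsv = pd.read_csv(io.StringIO(tsv_text), sep="\t")

# R: head(df, 2)
print(df.head(2))
```

With a real file, the `io.StringIO(...)` argument would simply be replaced by the file path.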
- Doing SQL-style table joins: One often needs to join data frames that hold different sets of information. The way these joins are done in R and Python is very similar. See the code below illustrating an inner join in each:
One can’t fail to notice how similar the two sets of code are; even the function names are similar.
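A minimal sketch of such an inner join, assuming pandas; the two tables and their column names are made up for illustration. R’s `merge()` and the pandas `merge()` function share a name and take the join key in much the same way.

```python
import pandas as pd

# Two hypothetical tables that share a customer_id key.
customers = pd.DataFrame(
    {"customer_id": [1, 2, 3], "name": ["Asha", "Ben", "Chen"]}
)
orders = pd.DataFrame(
    {"customer_id": [2, 3, 4], "amount": [250, 120, 90]}
)

# R: inner <- merge(customers, orders, by = "customer_id")
inner = pd.merge(customers, orders, on="customer_id", how="inner")

# Only customer_id values present in both tables survive the join.
print(inner)
```

Changing `how` to `"left"`, `"right"` or `"outer"` gives the other SQL-style joins, much as `all.x`, `all.y` and `all` do in R.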
- Creating predictive models: The API used to build linear models is very similar in both languages. Check the code below, which creates a linear regression model.
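The original listing is not available, so this is only one possible sketch. It assumes the statsmodels package, whose formula interface accepts R-style formulas; the article may have used a different library. The data values are invented for illustration.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Small made-up dataset with a roughly linear relationship.
data = pd.DataFrame(
    {
        "x": [1.0, 2.0, 3.0, 4.0, 5.0],
        "y": [3.1, 4.9, 7.2, 8.8, 11.0],
    }
)

# R: model <- lm(y ~ x, data = data)
model = smf.ols("y ~ x", data=data).fit()

# R: coef(model)
print(model.params)

# R: summary(model)
print(model.summary())
```

The formula string `"y ~ x"` is written exactly as it would be in R, which is what makes the two workflows feel so alike.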
Where Python Comes in Handy
As an extensive user of both languages, I believe any beginner-to-intermediate R user can easily transition to Python. Beyond that, there are many additional benefits a Python user can reap. Here are a few tasks that one can do far more easily in Python than in R:
1. Text processing: Python is very good at processing text data, and many good text processing modules are available for it. Python is an object-oriented language with a very clean syntax, which helps when working with text data.
2. Scraping data from websites: Python modules such as Beautiful Soup and Scrapy can be used to scrape data from webpages relatively easily.
3. Image processing: Projects like OpenCV and PIL make processing image data relatively easy. A good data scientist should be able to make sense of data from diverse sources. A quick look at newly launched Kaggle competitions reveals that in many of them the data is nothing but a collection of images, which shows how strongly the notion of “data” is changing. The ability to work with image data will put any analyst at the top of the skill-set ladder in the industry today.
4. Using Big Data frameworks such as Spark: Python is becoming the de facto language for working with Spark, through PySpark, and one can accomplish a lot with it. More good news: if you have used pandas, you can easily pick up the PySpark syntax.
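To give a flavour of the first item in the list above, here is a minimal text processing sketch that uses only Python’s standard library. The sample sentence is made up, and a real project would likely reach for a dedicated text processing module.

```python
import re
from collections import Counter

# Made-up sample text.
text = "Python is great for text. Text processing in Python is concise."

# Lowercase the text and pull out the words with a regular expression.
words = re.findall(r"[a-z']+", text.lower())

# Count how often each word occurs.
counts = Counter(words)

# Show the three most frequent words with their counts.
print(counts.most_common(3))
```

Tokenizing and counting take one line each, which is the kind of conciseness that makes Python pleasant for text work.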
Learning to use Python can yield great dividends. If you are a fresher who already knows R, picking up Python won’t be difficult at all. If you have some experience in the industry and have been stuck with the same type of projects for the past couple of years, learning Python will give you the opportunity to work on exciting new projects in text mining, image analysis and Big Data.