Diary of a Freshly Minted Data Scientist

Sweeping the Floor with Baby Steps


It’s 4 pm on a Thursday. In a bustling corner of Bangalore’s Indiranagar, lined with budget eateries, start-ups and bundles of positive energy, I am sweeping the floor of a small technology company under many watchful eyes.

Feeling a strain in my lower back after a couple of minutes, I stretch and straighten up. But before I can feel a hint of relief creep in, three voices shout “Don’t stop!” in unison. Shaken and surprised, but with little choice in the matter, I get back to the job.

There has been some deep learning as I start my life as a data scientist, only not the sort I had been visualising as I set out on this journey.

But how did it all get to this?

It had been around four months since I completed the Post Graduate Program in Machine Learning and Data Science offered by Jigsaw Academy and the University of Chicago Graham School.

Most people who enrol for a course like this do so with the intention of landing one of the 2.3 million jobs across the world that will require AI skills by 2020. However, I am more excited by the fact that one in five companies is expected to be using AI to make decisions by the end of 2018. This and other similar trends indicate that there are plenty of opportunities for entrepreneurs to use the powers of AI and machine learning to create something valuable and transformational.

And so, armed with new-found knowledge and inspired by the amazing innovations happening all around me, I have started cooking up a project, a new product really, with the help of some friends in a secret kitchen somewhere.

Hopefully, some time down the line, this project will have taken over my life and there will be stories to share from it. But for now, I am here to share some notes and anecdotes from my journey as a newborn data scientist.

To begin with, let me explain what this is about. 

Very early in this new life as a data scientist, I came to the following conclusions:

  1. Becoming a data scientist does not automatically result in millions being added to my bank account
  2. No one else thinks I am a data scientist just because I think I am
  3. To make people believe that I am one, I will need to show them some real work I have done
  4. If I don’t do real data science work regularly, I will forget a lot of what I have learnt anyway
  5. If I forget a lot of what I have learnt, even I will stop thinking that I am a data scientist
  6. Hence to keep reminding myself that I am a data scientist, to make others think of me as one and to see some crazy action in my bank account, the only choice I have is to actually be one!

When this realisation dawned, it occurred to me that the same may be true for other people out there as well. After all, there may be others who do not move into a specialist role right after completing a data science or machine learning program. They may intend to do so, but it may take them some time before they find the right opportunity or break.

So, I thought it might be a good idea to share my own experiences of how I attempt to stay in good form when it comes to my machine learning skills. If you have other thoughts, please let me know.

A dinner and three resolutions

Wise and burdened with my realisation about the law of expiring data science skills, my very first thought was to call up my clients and ask them for their data so that I could transform them into Fortune 500 superstars. And I did try! But I knew I was up against it. Refer to point number two in the list of conclusions I had already drawn.

So, it was obvious that I needed a different approach. I was going to have to create work for myself and I would have to manage it along with everything else that goes on in life.

Deliberating over a dinner of dimsums with some dear friends, I laid out my state of despair. I was looking for simple solutions that would help me not just find time to apply skills but learn new ones as well. And I wanted to do it in my spare time with minimal disturbance to my ongoing commitments and projects.

My friends first told me to also wish for the inheritance of a fortune and a private jet! Then they asked me if I believed it was very important for me to be a practising data scientist. I said it was. Then they told me to get realistic and be ready to commit the time that it needed. Then they helped me frame the following resolutions:

  1. Learn something new every week – Budget a minimum of four hours for this (maybe an hour in the night, four times a week?).
  2. Start building a portfolio – Spend a minimum of six hours a week on a self-assigned data project (There goes Saturday afternoon!).
  3. Gain experience using internships – Details need figuring out but do it!

So, with a little help from my friends, I at least had a plan. After all, that’s what friends are for – to show you the light. And to pay for the dinner, of course!

My job now was to put the plan in action.

So, I started with sweeping some floors.

Over the next couple of months, I will be sharing with you my updates on how I am keeping up with my resolutions. Early data suggests I have made a good start. Have a look.

Resolution number one: Learn something new every week

Thanks to a client that specialises in technologies that help its own clients detect identity fraud, I got interested in the very exciting field of detecting tampering in digital images. I learnt about different types of image tampering (retouching, cloning and splicing) and read numerous papers on how these can be detected using machine learning algorithms. I learnt that there are more than half a dozen different ways in which image forgery leaves a trail and can be caught!

Apart from understanding many technical details behind this whole process, I also took away a couple of great actionable insights from last week’s effort. The first – that one of my data science projects should be building a model that detects tampering of images. The second – if I ever plan a crime, I must remember not to use tampered images in the modus operandi. I would be likely to get caught!

Resolution number two: Start building a portfolio

With the football World Cup still fresh in everyone’s minds and with a football revolution supposedly brewing in India as well, I have decided to use data from India’s football league and create a tool for assessing player performance.

I spent a little more than the allotted six hours in a week on this and realised that this is going to be a lot of fun – for you guys, I guess! My first challenge is to export the league data into neat tables, and at the moment the best solution I have is to copy and paste the data into an Excel sheet. Mind you, if I wanted to capture all the data from one season of the league using that method, I would probably end up spending all my Saturdays for the next two years just filling up Excel sheets cell by cell.

Will keep you posted on how this one progresses.
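One possible way out of the copy-and-paste trap, sketched here rather than actually tried on the league's site: if the statistics are published as ordinary HTML tables, a few lines of Python using only the standard library can turn a table into CSV rows. Everything specific in this sketch is an assumption made for illustration. The names TableExtractor and table_to_csv are hypothetical helpers, and the sample table with "Player A" and "Player B" is made up; it is not real league data, and the real pages may well be structured differently.

```python
import csv
import io
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collects the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row being built, or None when outside a <tr>
        self._cell = None   # text chunks of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
            self._cell = None
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # skip rows that had no cells
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def table_to_csv(html):
    """Return the table cells found in `html` as CSV text."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


# Hypothetical sample, made up purely for illustration.
SAMPLE_HTML = """
<table>
  <tr><th>Player</th><th>Matches</th><th>Goals</th></tr>
  <tr><td>Player A</td><td>18</td><td>7</td></tr>
  <tr><td>Player B</td><td>16</td><td>3</td></tr>
</table>
"""

if __name__ == "__main__":
    print(table_to_csv(SAMPLE_HTML))
```

The CSV text can be saved to a file and opened directly in Excel. Pages that build their tables with JavaScript, or that nest tables inside tables, would need a different approach.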

Resolution number three: Gain experience using internships

So I rang up a friend who builds the technology for some really cool products for clients across the globe. I told him that I was looking for an internship where I could dedicate half a day every week helping out a team working with machine learning technologies. He told me he couldn’t promise me a long ongoing project but he had just the right opportunity for that week at least.

That’s what friends are for – to give you internships when you need them. And to make you sweep floors – because that’s what my friend made me do on Thursday when I showed up!

The project they are involved with is a healthcare app that uses smart watches to detect certain actions people are performing. My sweeping was part of the data collection exercise that would help them create the programs that will eventually automatically detect the actions of end customers.

For the record, apart from sweeping the floor, my evening was spent brushing my teeth, making some coffee, lying down on a couch, tying my shoelaces and performing many other such fun activities. All the while, it was strangely satisfying to see the words “Sweeping”, “Brushing”, “Combing” etc. pop up between lines of matrix-type code on the giant screen acting as the log window. And it was rewarding to sit with the team afterwards and understand a bit about their approach to modelling. Some fascinating stuff happening indeed!

Some thoughts before signing off!

That’s my status report of life as a newly minted data scientist. I may not be changing the world yet but I am glad that I have started. I can see myself needing a lot of help along the way and hopefully will get to hear from others about how they are going about it as well.

So, until next time, let me know if you are up to something similar by commenting below. After all – that’s what friends are for. And for sharing this post!
