Some tips on becoming a data scientist.

In 2017 I started an MSc program in applied data science. Since then, many people have asked me for my thoughts on how to pivot into a career in this area. I decided to condense my thoughts and advice into this article.


1. Data science is a relatively new profession, and the hype surrounding it has exploded. I enjoy my work and I'm glad I discovered it as a career. However, I wouldn’t nominate it as “the sexiest job of the 21st century”. I’d rather suggest that data science does not exist as a career (see article here).


2. “Data is the new oil”. I’ve seen many companies hire for data positions because they want to ‘unlock the potential’ of their data. However, without the right support for those hires, this can lead to frustration on both sides. Working with data in a company requires many layers, like a cake. You need to start with a correctly designed data warehouse, basic KPIs, and dashboards; only then can you move on to more advanced predictive analysis. When those lower layers are missing, a data career can become very frustrating.

3. Companies often incorrectly advertise data science positions. Many other career routes are ‘rebranded’ as data science.

  • Data engineers: Developers who design and manage data warehousing and pipeline solutions. Data engineers are concerned with the production readiness of the data and all that comes with it: formats, scaling, resilience, security, and more (see article here).
  • Data analysts: Work with various teams in a company to derive answers from data. They ask questions such as: what stories do the numbers tell? What business decisions can be made based on these insights? They may also create visual representations, such as charts and graphs, to better showcase what the data reveals. There are several subdisciplines and specializations, such as product analyst, market analyst, and business intelligence analyst. They probably don’t write production-level code, but they have a good statistical background.
  • Machine learning engineers: Where companies design and deploy machine learning models, these are the people who develop the code and deploy it to production. They are specialist engineers trained in machine learning theory. Specializations include NLP and computer vision.
  • Research scientist: A generic term that is often a variation of machine learning engineer, with a focus on research rather than implementation.
  • Data scientist: Could cover any of these roles. In some cases the role sits between data analyst and machine learning engineer.

4. If you are considering a switch into a data career, you may not know yet which of the above job titles you’ll one day hold. As time goes on, I expect companies to become more data literate and better able to hire the correct people with the correct job titles.


5. There are other career routes out there. Many professions work with data.


6. It often feels like there is a constant waterfall of new technologies, software, models, etc. In my experience, academia is light years ahead of industry. LSTMs, reinforcement learning, and Q-learning still see very limited production use. Aside from the giants of the industry (Google, Facebook), most mid-level companies still struggle just to generate KPIs.


7. Things change fast, and there is a lot of noise in this profession. Data scientists seem to love writing blogs and publishing their achievements, which can quickly generate a fear that you are falling behind and missing out. I don’t have much advice on how to prevent this feeling.


8. You should enjoy working with data. If you want a career in this field, you should enjoy the process of understanding and gaining insights from numerical and quantitative data. Before you commit to a degree program, there are many online courses you can take to explore the field and see what you enjoy.