Why Create a Data Science Portfolio?

It’s fair to say that data science has become quite popular over the past decade. For every 100 people in the UK who googled “data science” in 2011, roughly 1,600 searched for the term in 2019 (Google, 2019), and the trend extends to several other popular related terms:

Term              2011  2013  2015  2017  2019
Data Science       100   175   450   975  1600
Big Data           150   900  1200  1325  1200
Machine Learning   150   275   600  1775  1950
Data Analytics     125   200   450   725   975
Deep Learning       50    25   175   575   550
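As a rough sketch, the growth of each term between 2011 and 2019 can be worked out from the figures in the table. The names `interest` and `growth_multiple` are illustrative choices of mine and not part of any library, and the numbers are copied directly from the table rather than fetched from Google Trends.

```python
# 2011 and 2019 search-interest figures, copied from the table above.
interest = {
    "Data Science": (100, 1600),
    "Big Data": (150, 1200),
    "Machine Learning": (150, 1950),
    "Data Analytics": (125, 975),
    "Deep Learning": (50, 550),
}


def growth_multiple(start, end):
    """Return how many times larger `end` is than `start`."""
    return end / start


# Growth multiple per term, e.g. 16.0 means 16 times the 2011 figure.
multiples = {
    term: growth_multiple(start, end)
    for term, (start, end) in interest.items()
}

# Print the terms from fastest-growing to slowest-growing.
for term, multiple in sorted(multiples.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{term}: {multiple:.1f}x")
```

By this measure, “Data Science” grew the most over the period (16 times its 2011 figure), with “Machine Learning” next at 13 times.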

Whether data science is, or isn’t, going through a hype cycle is outside the scope of this article. However, the misinterpreted payoff of big data and black box tools as magic pills, together with the increasing volume of, and access to, sector-relevant data, has positioned data science as a necessary field for the foreseeable future. This holds whether you want to work with startups/SMEs, which may not have an established data culture, or large national/multinational companies, which presumably do.

“Most companies don’t do a good job with the information they already have. They don’t know how to manage it, analyze it in ways that enhance their understanding, and then make changes in response to new insights. Companies don’t magically develop those competencies just because they’ve invested in high-end analytics tools.” - (Beath, Quaadgras and Ross, 2013).

So, now that I’ve wholeheartedly convinced “2019 you” that data science isn’t a bubble about to burst, let’s say you’re interested in data science as a career. How do you showcase your ability to improve a company’s existing data culture without compounding the magic pill problems of the past? Enter the data science portfolio:

Pros
  • Allows you to target main stakeholders in the employment pipeline (headhunters, decision-makers, data scientists) with your content.
  • Shows your eagerness to learn, develop your skillset and display your results, without the worry of company NDAs you may (or may not) be under.
  • Creates a place for your content that you are in control of (see the IndieWeb movement).
  • Provides other data scientists with the ability to get in contact with you to discuss constructive feedback, potential collaborative projects, etc.

Cons
  • Can’t bullshit your skillset to yourself anymore.
  • People of the internet can contact you.

I understand I’ve made this all sound relatively easy: website + a few targeted projects/articles = perfect job, but data science is hard. Personally, putting any content out there has been one of the most intimidating things for me, given that, in a cross-discipline field, there will always be a specialist who can give you the feeling of impostor syndrome - intentional or not. But that is the state of play of the internet for many a person: YouTube creator, newspaper journalist and data scientist alike.

I find that exploring my current limits in the field in a public space helps expose areas I can focus on for my next post or project, which I can then set myself deadlines for. Once the anxiety of starting to produce content has passed, there is also a certain level of simplicity my content should have, both for people to enjoy consuming it and for me to know what I’m actually talking about.

“Whatever cannot be said clearly is probably not being thought clearly either.” - (Singer, 2016).

If you think this is something that interests you, then I implore you to create a portfolio, as I’ve started to. And if it’s not something that you feel comfortable doing, creating a private version (e.g. with private repositories on GitLab) is a good way to get started.

Bibliography

Beath, C., Quaadgras, A. and Ross, J. (2013). You May Not Need Big Data After All. [online] Harvard Business Review. Available at: https://hbr.org/2013/12/you-may-not-need-big-data-after-all [Accessed 4 Sep. 2019].

Google. (2019). Google Trends. [online] Available at: https://trends.google.com/trends/explore?date=2011-09-01%202019-09-01&geo=GB&q=Data%20Science,Big%20Data,Machine%20Learning,Data%20Analytics,Deep%20Learning [Accessed 1 Sep. 2019].

Singer, P. (2016). Ethics in the Real World. 1st ed. Princeton: Princeton University Press, p.5.