by Eduardo Ariño de la Rubia on May 26th, 2017
In this post we summarize some of our most recent and favorite answers on Quora to questions from the community about hiring junior data scientists, sharing work with the public, and collaborating.
As data science matures from the playground to the boardroom, practitioners and managers are finding a new set of challenges on their hands: challenges around people, processes, and careers that require soft skills, not models.
This is apparent from the questions starting to appear on Quora. The following three questions and answers highlight some of the soft skills required to grow as a data scientist within an organization:
What do you look for when hiring an entry-level data scientist?
There is no lack of breathless reports warning anyone who will listen about the coming catastrophe that is the data science skills gap. IBM predicts that by 2020 the demand for data scientists will grow by 28%, and yet Quora is filled with dozens if not hundreds of posts from people who are transitioning from Ph.D. programs, engineering, or many other disciplines.
"What do you look for when hiring an entry-level data scientist? Would a master’s in Data Science or a bootcamp be beneficial?"
This question contains many facets we've seen before, packed into a single question. Our answer included the following...
The three traits to look for in a junior data scientist:
- They have the drive and determination to be a self-directed learner.
- They understand the fundamentals of “enough” programming.
- They understand how to analyze data when the goals and metrics are not explicit or time-boxed.
The real conclusion was that as a hiring manager, I want signals that let me know you will be productive in the things they don’t teach you in school. I want to know you understand how to be independent, how to write code, and how to drive to insights when everyone is busy and no one has time to help mentor you.
A master's degree and a bootcamp certification are both signals that I will take into account, but neither is make-or-break. It’s everything else around your CV that motivates me to take the conversation further.
What’s the best way for data scientists to share their work?
Few people enjoy standing in front of a crowd and being in the spotlight. A study showed that nearly 27 million Americans have an explicit “fear” of public speaking.
Considering the deeply technical nature of the work, and the many ways in which an analysis can go awry, it can feel like an especially daunting task to share one’s work as a data scientist. However, communicating your insights is one of the most critical skills for a successful career in data science. A recent article by Emma Walker, Data Scientist at Qriously, even called communication the “critical skill” many data scientists are missing.
Telling data scientists they just have to get better at something is not particularly helpful, so instead we broke it down into this:
Five ways that data scientists at different comfort levels with public speaking can share their work with the public:
- Create a really nice web portfolio or GitHub page.
- Find your voice on social media.
- Get involved with the local Meetup community.
- Talk at conferences—from local to regional and even national.
- Mentor someone who is earlier along in their evolution.
No one can expect that just publishing work on the internet will get their work in front of people. The internet is a firehose full of people who want to get noticed. There is no substitute for being your own passionate advocate, standing in front of people, and excitedly telling them about the work that you’ve done.
What are best practices for collaboration between data scientists?
Once a data scientist has had their work noticed, and once they’ve been hired at an organization, the truism that “data science is a team sport” will become a daily reality. However, there is a broad range of collaboration in team sports, from the near-perfect synchronicity of a rowing team to the chaos of Battle of the Nations. Teamwork can mean many things to many people. This is not helped by the fact that collaboration is a vague, often misappropriated term.
Genuine, practical, productive collaboration is the combination of four principles:
- Shared context creates an environment in which the penalty for communication or collaboration is minimized because everyone has the necessary information “paged in.” They’re able to operate on it without paying a significant cost of context switching.
- Discussion and communication, when they happen fluidly and are recorded in a system of record, allow layered, compounded knowledge to be built. Learning from others' work and experiments is significantly cheaper than having to discover everything from first principles.
- Discoverability acknowledges that while context and discussion are powerful, they’re much less so if locked away behind impenetrable navigation. Providing search, taxonomy, hierarchy, and ontology that make navigation between topics and insights easy, and that allow for serendipity, can shortcut learning curves and encourage future collaboration.
- Reuse is often the most desirable outcome of collaboration, but it has to be a goal as well as a principle. Making your work trivially reusable allows others to leverage it and “stand on the shoulders of giants.” Taking lessons learned and then generalizing them into a reusable template can increase the pace of innovation in an organization.
Collaboration isn’t something that can be solved with a silver bullet. It requires people devoted to a collaborative environment, tools which support, enhance, and leverage collaborative strategies, and processes that ensure adherence to collaborative ideals.
It's exciting to discuss the latest new approach or algorithm, but there are many interesting questions beginning to come out surrounding the people, processes, and careers of data scientists.
Although some might argue that Aristotle was the first data scientist, and that the field is therefore 2,500 years old, questions around how this work is done, by whom, and in what environment are still relevant today. A rich, vibrant conversation about all of the aspects of data science, from technology to people, helps to strengthen the field and to define the kind of community and profession we hope to be.