by Domino on July 20th, 2017
This blog post provides concise tips for leaders at startups and early-stage companies to consider when getting a data science team off the ground. Tips covered include timing, hiring, processes, and technology.
Why Data Science?
An increasing number of startups and early-stage companies are realizing they need to tap into data science to grow or stay competitive. They’ve recognized that collecting and storing data is fruitless unless it can drive insights that propel the business forward. If you’re a manager charged with building a data science team from scratch, or expanding a fledgling one, it can be hard to get the timing right or know what to prioritize. This blog post will help you ensure you’re investing in the right people, processes and technology to make your young team flourish.
Don’t launch a data science team just because everyone seems to be doing it. You’re jumping in too early if you haven’t been collecting any data or haven’t gleaned basic business intelligence from the information you have. You need to understand what’s happening today before building tools to predict what’s next.
Hiring the Best People
Building a strong team
There are no unicorns. You probably won’t find someone who excels at dissecting high-level problems, strategizing about moving the business forward, pitching the team’s work to other stakeholders, doing the heavy lifting technically and integrating tools into existing production systems. Instead, think of the team as a balanced organism in which people with different strengths come together symbiotically.
Reflect deeply on your existing skill set and what you need to round out the team. If you have savvy leadership that knows exactly what problems they want a data scientist to tackle, hire a technical-leaning person first. If you’re having trouble integrating data science into the business, prioritize a hire who can cultivate those relationships and spark cultural change. Avoid bringing on a junior data scientist without a strong mentor. People who are scrappy, business-curious and low-ego will be an asset in a young team. And don’t underestimate the importance of diversity in fostering better outcomes, even in a small team.
Recruiting good candidates
Data scientists are in high demand, so you’ll probably have to do some wooing. Pitch potential hires on the opportunity to work on problems that impact the business, and assure them that they'll have access to interesting data to do so. Strong candidates want to know that data science is a priority at the company, not a buzzword. Give them concrete examples of projects they could work on to drive revenue and value. Invite them to spend a day getting to know the team. Communicate that the company will see them not as a cog in the machine but as a thought worker with a seat at the table.
While interviewing candidates, don't just ask about their technical skills. How well can they get to the heart of problems they could be tackling? How effectively can they communicate the importance of their work? How well can they prioritize? After all, solving the right problems and selling data science to business stakeholders are key aspects of the job.
Try to give new hires access to data and business stakeholders on Day One. Get them involved in meetings that will help them understand pressing business concerns, and give them examples of how data science insights have impacted the company in the past. Acquaint them with the standard suite of tools your team uses, but leave room for flexibility and experimentation. New hires will be happier and more productive if they can leverage tools they already know. Make sure newbies understand how they will be evaluated.
Running Your Team
Picking a focus
With a small team, you won’t be able to do everything. Identify one or two priorities that have buy-in from executive stakeholders. Ideally, these priorities will also have clear business value and relatively short timelines, and they won’t require vast changes in how other teams operate. Before you start a project, take time to ensure you’re solving the right problem. Be clear on who you’re building for and how it’s going to help the business. If you take six months to build a “perfect” model, it could be obsolete by the time you finish. Being late to the game will also hurt your credibility, because your work will no longer be relevant and the business will have moved on.
Iterating quickly and encouraging collaboration
Deploy products early and often. Data science is about experimentation, and most experiments fail. Identify the flops quickly so you can course-correct. Find ways to shorten the feedback loop. For example, consider a first iteration that is pure business logic, rather than an algorithm, or provide opportunities for soft launches. And to encourage bold experimentation, make sure you celebrate experiments even when they lead to no results.
Collaboration is also critical to working efficiently and making an impact. Working together allows your team to tackle bigger problems, leverage individual strengths and avoid depending too much on one person. Collaborative data science involves ensuring repeatability (the same process produces the same outputs), reproducibility (team members can easily recreate statistical tests, empirical experiments and computational functions), and replicability (the same experiment can be run again, collecting and analyzing data in the same way, and arrive at the same results).
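As a small illustration of repeatability, here is a minimal Python sketch. The run_experiment function and its seed parameter are hypothetical names made up for this example, and the "experiment" is just a simulated average. The point is only that pinning the source of randomness lets a teammate rerun the same process and get the same output.

```python
import random
import statistics


def run_experiment(seed: int, n: int = 1000) -> float:
    """Hypothetical experiment: average of n simulated measurements."""
    # A local generator keeps this run isolated from global random state,
    # so the result depends only on the seed and n.
    rng = random.Random(seed)
    samples = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return statistics.fmean(samples)


# Same process, same seed: the outputs match exactly.
first = run_experiment(seed=42)
second = run_experiment(seed=42)

# A different seed gives a different (but equally valid) draw.
different = run_experiment(seed=7)

print(first == second)
print(first == different)
```

Recording the seed alongside the result is a cheap habit that makes it much easier for a colleague to check or build on an analysis later.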
Provide opportunities, such as “lunch and learns,” that encourage team members to share their work in a supportive environment. Establish frequent communication with the business side to get buy-in and feedback (don’t forget to use language they understand and tie everything to business objectives). The strongest data science teams are proactive partners in business discussions rather than request-fillers.
Establishing best practices early
Don’t make the mistake of thinking your team is too small to adopt best practices for documentation, code review and feedback. It will be harder to change behavior down the line, and insights you generate could end up being a cornerstone of the business. Failing to install good systems early could mean losing institutional knowledge or going down pointless rabbit holes.
Deciding where your team will live
Having a centralized team is important for facilitating collaboration, building on existing work, improving quality through code reviews and providing career development pathways. Meanwhile, embedding data scientists in various divisions steeps them in that aspect of the business and generates better innovations. A happy medium is the hybrid “hub and spoke” model: Team members spend two or three days a week embedded with a business team, and the rest of the week sitting with their data science colleagues.
How Technology Can Help
Selecting the best technology for your team will have a massive impact on its success. The right data science platform will help integrate new hires, make your team more efficient and enable collaboration — so you can quickly make a visible impact.
Easing in new hires
Some companies can take up to six weeks to onboard new team members, wasting precious time. If past data sources, experiments and discussions are stored on individual desktops or lost in email chains, they can be difficult or impossible to trace. The solution is a data science platform that automatically stores existing data, tools, connections, packages, libraries and code. That means new team members are up and running in days. Since they can easily reproduce previous experiments and view past results, they're in a position to start contributing immediately.
Data scientists also want to use familiar tools. Pick a platform that allows new hires to stick with their preferred programming languages and software while filling gaps in their knowledge, such as working with Git or AWS.
Enabling innovation and collaboration
A data science platform can help iteration happen quickly by allowing your team to efficiently run and track multiple experiments at the same time. It can also allow you to easily share and discuss results with colleagues and business stakeholders. Choose a platform that can communicate results to non-technical users without overwhelming them with the complex model behind the scenes. Sharing your work through easy-to-digest visualizations, interactive dashboards and web apps will do wonders to get buy-in across the company. Getting feedback in real time will also keep your team from going off course and will speed up development cycles.
Protecting institutional knowledge
With a small team, you don't want aspects of your work to become dependent on a single member who can leave at any time. A data science platform that automatically maintains a system of record ensures everyone's contributions remain an asset of the company. The archive also prevents the team from reinventing the wheel or losing the knowledge underlying a key revenue-driver. Choose a platform that tracks code, data, environments and other factors in one place, without anyone having to input them manually.
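To make the idea of a system of record concrete, here is a rough Python sketch of the kind of information such a record might capture for a single run. It is not how any particular platform works; build_run_record and its field names are assumptions invented for this illustration, and a real platform would collect these details automatically rather than through a function call.

```python
import hashlib
import json
import platform
from datetime import datetime, timezone


def fingerprint(content: bytes) -> str:
    """Return a SHA-256 hex digest that identifies the exact content."""
    return hashlib.sha256(content).hexdigest()


def build_run_record(code: str, data: bytes, params: dict) -> dict:
    """Hypothetical record tying a result to its code, data and environment."""
    return {
        "recorded_at": datetime.now(timezone.utc).isoformat(),
        "code_sha256": fingerprint(code.encode("utf-8")),
        "data_sha256": fingerprint(data),
        "params": params,
        "python_version": platform.python_version(),
        "platform": platform.platform(),
    }


# Example inputs are placeholders, not real project data.
record = build_run_record(
    code="score = model.fit(X, y)",
    data=b"customer_id,churned\n1,0\n2,1\n",
    params={"learning_rate": 0.1},
)
print(json.dumps(record, indent=2))
```

Because the record is plain JSON, it can be stored next to the results and read by anyone on the team, even after the original author has moved on.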
A growing number of companies are recognizing the value of investing in data science. But launching or building a new team can be daunting. Creating an effective data science team requires getting your timing right, attracting the best people and integrating them seamlessly. It also means focusing on key problems and establishing practices that enable your team to collaborate and quickly make an impact. Adopting the right data science platform can help your budding team thrive.
For more in-depth guidance on identifying existing gaps and directing future data science investment, download Domino's whitepaper, Data Science Maturity Model.