Why Data Science fails: 10 dangers you must never underestimate

Dr. Carsten Bange (Founder and Managing Director, BARC) and Marcel Kling (Senior Director Data Driven Customer Journey, Lufthansa Group) with Regina Schmidt (Editor, BARC)

Tips and tricks, success factors or even the perfect recipe: Numerous presentations and publications deal with the successful approach to data science projects. However, a lot of the advice is superficial and applies to all kinds of projects (“Have a vision”, “Think Big, start small”, “80% of the work is in data integration”).

In this article, we warn you of specific dangers that in our experience lurk in data science projects. We combine BARC’s broad market view from interaction and work with many data science teams in various companies with the concrete experience of implementing data science initiatives in a large corporation.

Danger 1: Use cases without impact on the P&L

Data science can play a role in strategic projects, for example those that alter a business model. Such projects should go ahead without elaborate business-case calculations.

All other use cases (probably 80-90% of the work of data science teams in companies) should be clearly prioritized. You should be aware from the outset what effect a use case has on the company’s P&L, i.e. whether it can increase sales (top line focus) or reduce costs (bottom line focus). The exact size of the effect often cannot be determined because of the exploratory character of data science, but a focus on “positive impact on the P&L” helps to align the work of data science teams.

Danger 2: Striving for perfection

Do not concentrate on implementing use cases perfectly. It is better to work fast: dare to make the breakthrough early and test your ideas step by step. Connecting quickly to live data in the source systems is important to let the data flow, and involving users is crucial to get feedback early. This makes it easier to determine whether a use case will fail or is likely to succeed.

Danger 3: Project-oriented thinking

Many data scientists concentrate on executing projects. However, a product mindset is more sustainable and successful. The often unsolved challenge in projects is to operationalize results in a way that creates long-term added value for day-to-day business. Therefore, it is better to think in terms of data products and their increments. This allows you to concentrate on how you provide functions, develop roadmaps and training, and provide continuous support for employees. A project comes to an end; a product remains and scales permanently (or at least has a defined end of life).

Danger 4: Laboratory and factory out of balance

Roll up your sleeves, tinker and dare to make the breakthrough. The model is developed and tested in the “laboratory”. But you also have to scale your model in a “factory” to create a data product. Often, the focus is on only one of the two, and the project fails. Laboratory and factory have to be in balance.

Danger 5: No ethical guidelines

Data ethics is fundamental to data science projects. First of all, external rules and laws must be observed. In the last twelve months some things have changed in this area, for example with GDPR. In addition, Data Scientists need internal guidelines and principles that translate basic ethical norms into concrete directives and thus provide certainty. If Data Scientists act unlawfully or unethically, a model could, for example, unintentionally evaluate people with regard to their gender, religion or the like. Without guidelines, the sense of purpose, efficiency and motivation in data science suffer and, in the worst case, fines or negative publicity threaten.

Danger 6: An army of Data Scientists without a battle plan

Recruiting Data Scientists alone is not enough. Companies have to organize themselves in a clear structure with fixed roles and responsibilities, while taking into account the sensitive interaction between architecture, data engineering and data science. Fixed does not mean static: the organization should also allow flexibility and be able to adapt. For example, rotating roles and tasks promotes new, creative ideas.

Danger 7: Missing translation 

You can show beautiful results and present a prototype that increases your company’s turnover by one million euros, and still your outcome is not accepted. Often Data Scientists do not clearly understand the problems of their colleagues. In order to integrate models and signals into operational processes and day-to-day work, you have to do translation work, and even the best Data Scientist is not always the best person to do it. It should be clear and communicated from the outset what changes an implementation will require. Otherwise, use cases will be created that are beautiful but not accepted.

Danger 8: Starting with broad data integration

When Data Scientists or Data Engineers start with a broad integration of data, it is often an indication that their initiative may fail. Data Scientists need a variety of data from different sources. However, once large-scale data integration is started, the amount of work involved increases enormously. Learn to swim in the data pond before trying to “boil the ocean”, and think of data architectures that grow over time and on demand.

Danger 9: Data science far away from business processes

Where should advanced analytics be placed in the company? In IT or in the business department? In the laboratory or as part of other teams? Ultimately, these questions can only be decided on an enterprise-specific basis, but for the success of any data science initiative, close integration with business processes is more important than the supposed “proximity to data”. This way, Data Scientists develop a better understanding of how to improve the business, and the translation and operationalization of results take into account the skills of other employees and the corporate culture.

Danger 10: No deep trust in advanced analytics

The relationship between management and data scientists can sometimes be difficult. Incalculable costs and unpredictable project durations can put managers’ patience to the test. Data Scientists can gain the confidence of their management by recognizing the dangers described above and by constantly emphasizing and communicating the benefits of their work. Then not only attention but true management love can develop for really advanced data work.