From Data Strategy to AI

Simon Görtzen
October 25, 2021
4 min read

AI creates value from data. The prerequisite is that the data contains enough information to answer the question the AI is meant to address. Data is therefore the foundation for value creation with AI. But what does this mean for your data strategy? Should you start collecting enterprise data early, or wait until you have decided on concrete AI use cases? Which data should be stored? Is a central data repository needed? This article answers these questions.

Which data should be stored?

The primary goal of any data strategy is, of course, to store the data that is valuable to your company or could become valuable. To identify valuable data, it helps to start with a list of all data sources in your organization. Potentially valuable data includes ERP data, production data, production planning data, product data, and quality control data. Then assess the value of each data source, for example on a scale of 1 (low relevance) to 5 (highly valuable).

In our view, two approaches work particularly well for assessing the value of your data. First, you can examine how the data contributes to your company's business processes and what business value results from this. Second, you can ask yourself how closely the data relates to questions that are important to you but for which you do not have reliable answers.

If an assessment does not lead to a clear conclusion, you should plan to store the data. In the long run, the cost of storing it is lower than the cost of data gaps that hurt later.
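The assessment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class `DataSource`, the function `should_store`, the threshold of 3, and the example entry "Cafeteria menu archive" are hypothetical and not part of any prescribed method. The two score fields correspond to the two assessment approaches (contribution to business processes, relevance to open questions), and a missing score stands for an assessment without a clear conclusion.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataSource:
    name: str
    # Scores on the scale from 1 (low relevance) to 5 (highly valuable).
    # None means the assessment did not reach a clear conclusion.
    process_value: Optional[int] = None       # contribution to business processes
    question_relevance: Optional[int] = None  # relevance to open business questions


def should_store(source: DataSource, threshold: int = 3) -> bool:
    """Decide whether a data source should be stored.

    The threshold of 3 is an illustrative assumption.
    """
    scores = [source.process_value, source.question_relevance]
    if any(score is None for score in scores):
        # Unclear assessment: store the data to avoid gaps later.
        return True
    return max(scores) >= threshold


# Hypothetical inventory; names and scores are made up for illustration.
inventory = [
    DataSource("ERP data", process_value=5, question_relevance=3),
    DataSource("Quality control data", process_value=2, question_relevance=None),
    DataSource("Cafeteria menu archive", process_value=1, question_relevance=1),
]

to_store = [source.name for source in inventory if should_store(source)]
print(to_store)
```

Taking the maximum of the two scores means a source is kept as soon as either approach rates it as valuable; a stricter policy could use the average instead.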


What knowledge, what relationships, what insights are contained in your data?

When is the right time to start collecting data?

In principle, the answer is: you cannot start collecting valuable data too early. The more historical data is available, the more reliably AI models can be trained on it. Systematic data collection should therefore begin well before an AI project starts. Data is valuable and storage space is cheaper than ever. Data that was not stored in the past cannot be recovered. So why wait to store data?

Data storage is an investment in future value creation.

Is a central data repository needed?

For the collected data to contribute optimally to your future value creation, it should be stored centrally. From an AI perspective, connecting to data sources is generally a time-intensive process. The effort grows with the number of decentralized data sources that have to be connected.

A central data repository can significantly reduce the initial effort. All valuable enterprise data is stored in a central database or data lake, and important properties are documented, such as column names, descriptions, data source, retention period, unit, and collection cycle. Additionally, processes can be established to ensure that new data is continuously added and meets enterprise data quality standards.
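As a rough sketch, the documented properties could be captured as one record per column, together with a simple completeness check. The names `ColumnDoc` and `missing_fields`, the example columns, and the choice to express the retention period in days are assumptions made for illustration; a real repository would more likely rely on the catalog features of the chosen database or data lake.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ColumnDoc:
    """Documentation record for one column in the central repository."""
    column_name: str
    description: str
    data_source: str
    retention_period_days: int  # assumption: retention expressed in days
    unit: str                   # use e.g. "n/a" for identifiers without a unit
    collection_cycle: str


TEXT_FIELDS = ("column_name", "description", "data_source", "unit", "collection_cycle")


def missing_fields(doc: ColumnDoc) -> List[str]:
    """Return the names of properties that are empty or invalid."""
    missing = [name for name in TEXT_FIELDS if not getattr(doc, name).strip()]
    if doc.retention_period_days <= 0:
        missing.append("retention_period_days")
    return missing


# Hypothetical entries for illustration only.
docs = [
    ColumnDoc("spindle_temp", "Spindle temperature of the milling machine",
              "production", 3650, "degC", "1 s"),
    ColumnDoc("batch_id", "", "ERP", 3650, "n/a", "per order"),
]

report = {doc.column_name: missing_fields(doc) for doc in docs}
print(report)
```

A check like this could run whenever new data is registered, so that undocumented columns are flagged before applications start to depend on them.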

All analytics, AI, and other software applications can now build on this central data repository. This enables both your internal development and external service providers to develop productive applications for you as quickly as possible. To accomplish this, we recommend establishing a standardized process or interfaces through which new developers can quickly gain access to your data.

The central organization of enterprise data is the best preparation for value creation with advanced analytics and AI applications.

A data audit brings clarity

For all companies that want to use data to increase their value creation, a smart data strategy is essential groundwork. But what is the best way to start? We recommend conducting a data audit as the first step. The goal of the data audit is to create a central overview of all data sources in your organization, for example in the form of a data landscape.

The data landscape contains all data sources, data sinks, and data processing elements as well as their connections to each other. Additional technical aspects such as storage locations, file formats, data volume, security requirements, or the duration of data retention can complement the data landscape. In addition, legal and contractual data retention requirements typically need to be considered.
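One possible way to make such a data landscape machine-readable is a small directed graph: nodes are sources, processing elements, and sinks, and edges are the connections between them. The sketch below is an assumption about how this could look, not a description of a specific audit tool; all node names and the helper functions `reachable` and `unconnected_sources` are hypothetical. It lists sources whose data never reaches a sink, which are natural candidates for follow-up in an audit.

```python
from collections import defaultdict

# Hypothetical landscape: node name -> kind ("source", "processing", "sink").
nodes = {
    "erp_system": "source",
    "machine_sensors": "source",
    "paper_inspection_logs": "source",
    "nightly_etl": "processing",
    "data_lake": "sink",
}

# Directed connections (from, to) between the elements.
flows = [
    ("erp_system", "nightly_etl"),
    ("machine_sensors", "nightly_etl"),
    ("nightly_etl", "data_lake"),
]


def reachable(start, edges):
    """Return all nodes reachable from `start` by following the edges."""
    graph = defaultdict(list)
    for src, dst in edges:
        graph[src].append(dst)
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def unconnected_sources(nodes, edges):
    """Return sources whose data never arrives at any sink."""
    sinks = {name for name, kind in nodes.items() if kind == "sink"}
    return sorted(
        name
        for name, kind in nodes.items()
        if kind == "source" and not (reachable(name, edges) & sinks)
    )


orphans = unconnected_sources(nodes, flows)
print(orphans)
```

The technical aspects mentioned above, such as storage location, file format, or retention duration, could be attached to each node as additional attributes.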

With a thorough data audit, you create a very solid foundation for building your central data repository in the next step. If you would like to bring in external support for implementing your data audit, please reach out to us about the aiXbrain Data Audit. We look forward to hearing from you.
