Written by Keith Oelkers and Mike Frayler, Practice Directors for Data Science and Analytics, DecisivEdge™
Many businesses keep their data in silos and employ analytics at a process or departmental level. Centralizing these data silos to create a data mart, warehouse, or lake gives organizations more visibility over operations and results in more accurate analytics that take a wider range of internal and external factors into consideration.
Among other things, centralizing data helps organizations to:
- Gain a global view of their customers, which enables them to understand how those customers interact with different areas of the company, including sales, marketing and customer service.
- Empower analytics to achieve a 360-degree customer view with in-depth customer insights across all channels to better understand each customer’s value, their product and service interests, and how they prefer to interact with and be served by the organization.
- Determine customer or account level profitability. We have seen this metric used to help organizations make key decisions like call center routing, marketing and solicitation, and risk mitigation.
For instance, DecisivEdge recently helped a financial services company combine customer service data sets. The data collected from inbound calls and online customer service platforms was centralized, and our client was able to use the data to develop new insights to reduce the number of inbound calls, determine which channel needed to be promoted based on seasonal needs, and improve the online customer service platform.
Three Popular Approaches to Centralizing Data
- Joining data. This approach combines two data sets by merging one into the other. The data sets need to share a common key in order to be joined properly.
- Blending data. This is a more flexible approach, but more work goes into preparing the data sets. A new database, the data warehouse, extracts data from different sources and aggregates it into a new data set. This is an optimal approach for long-term insights and for data sets with disparities in structure and/or keys. A common challenge with this approach is that the warehouse is only periodically updated and typically won’t deliver real-time analytics.
- Integration. This practice is about extracting a data set from one platform and importing it into another. This is not an ideal way to centralize data, but it can be relevant if a data silo would benefit another process.
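As a rough illustration of the joining approach described above, the following Python sketch merges two small data sets on a shared key. The record contents, the field names (`customer_id`, `inbound_calls`, `portal_logins`) and the helper `join_on_key` are hypothetical and chosen only for this example. It also assumes the key is unique in the second data set.

```python
# Hypothetical records from two silos; field names are illustrative only.
call_center = [
    {"customer_id": 101, "inbound_calls": 4},
    {"customer_id": 102, "inbound_calls": 1},
    {"customer_id": 104, "inbound_calls": 7},
]
online_portal = [
    {"customer_id": 101, "portal_logins": 12},
    {"customer_id": 103, "portal_logins": 3},
    {"customer_id": 104, "portal_logins": 0},
]


def join_on_key(left, right, key):
    """Inner-join two lists of dicts on a shared key column.

    Assumes the key is unique within `right`.
    """
    # Index the right-hand rows by key so each lookup is a dict access.
    right_by_key = {row[key]: row for row in right}
    joined = []
    for row in left:
        match = right_by_key.get(row[key])
        if match is not None:
            # Merge the two rows; the key appears once in the result.
            joined.append({**row, **match})
    return joined


combined = join_on_key(call_center, online_portal, "customer_id")
for row in combined:
    print(row)
```

Customers that appear in only one source (102 and 103 here) drop out of the result, which is why the shared key matters. Data sets without a common key generally call for the blending approach instead.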
Common Data Challenges and How to Address Them:
- Redundant data is one of the main challenges you will face when centralizing data sources. You can reduce redundancy by preparing the data before importing it into the new system. One data set might have a greater degree of granularity than another. A data source could, for instance, record daily activities while another monitors monthly or quarterly numbers. These data sets will have to be cleansed before being joined or blended. Eliminate groups that contain identical data, look for common ground between data sets, and customize the data sets to create more manageable subsets.
Another potential issue is redundancy caused by data that is not a perfect match, either as a result of human error or of how data is captured. There are data preparation tools that can look for these close matches and either aggregate the data, choose the highest or lowest value, or have one data set or value take precedence over the other.
- Size: Another challenge you will encounter is the size of the combined data sources. You may need to prepare by upgrading your data management solution: switching to a solution with more storage space might be necessary, and processing speed is another consideration.
- Frequency: If you opt for the data warehouse model, you will need to figure out how often the warehouse should extract new data from data sources. You can customize this model and have the warehouse update some sources more frequently than others.
- Visual Tools (e.g., dashboards): Making sense of the new blended data set can be another challenge. Tools that give you access to visual insights in a consolidated format rather than numbers are usually a good solution. Visual representation is extremely useful when it comes to obtaining a global view of your operations in a single dashboard and providing additional capabilities to drill down to more granular levels of information.
- Analytics: Bringing different data sources together leads to more nuanced analytics and allows you to collect more insights from the data. You can combine data sets with external data sources, or use these strategies to leverage older data sets. The key to analytics is to understand what information is needed to solve a specific problem. Too many data warehouses and data lakes contain raw extracts from production systems without cleansing or organizing the data. Operational data structures are designed by IT to store data efficiently during real-time interactions. Significant work needs to be done to convert these production structures to a data mart or a data layer that can support analytics. This will enhance both the speed of analytics and the scope of the problems that can be solved.
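Two of the preparation steps discussed above, aligning data sets of different granularity and resolving records that are close but not perfect matches, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample records, the helper names and the 0.85 similarity threshold are all hypothetical, and keeping the highest value is just one of the resolution rules mentioned above. Dedicated data preparation tools use more sophisticated matching.

```python
from collections import defaultdict
from difflib import SequenceMatcher

# Hypothetical daily activity records (illustrative values only).
daily_calls = [
    {"date": "2024-01-03", "calls": 40},
    {"date": "2024-01-17", "calls": 55},
    {"date": "2024-02-05", "calls": 30},
]


def roll_up_to_month(rows, value_field):
    """Sum a daily value into monthly totals keyed by 'YYYY-MM'."""
    totals = defaultdict(int)
    for row in rows:
        month = row["date"][:7]  # 'YYYY-MM-DD' -> 'YYYY-MM'
        totals[month] += row[value_field]
    return dict(totals)


def is_close_match(a, b, threshold=0.85):
    """Treat two strings as the same entity if they are similar enough.

    The threshold is an assumed value and would need tuning on real data.
    """
    ratio = SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()
    return ratio >= threshold


def merge_near_duplicates(rows, name_field, value_field):
    """Collapse near-matching names, keeping the highest value per group."""
    merged = []
    for row in rows:
        for kept in merged:
            if is_close_match(kept[name_field], row[name_field]):
                kept[value_field] = max(kept[value_field], row[value_field])
                break
        else:
            # No close match found so far: keep this row as a new group.
            merged.append(dict(row))
    return merged


# Daily figures rolled up so they can be compared with a monthly source.
monthly_calls = roll_up_to_month(daily_calls, "calls")

accounts = [
    {"name": "Acme Holdings", "balance": 1200},
    {"name": "Acme Holdngs", "balance": 1250},  # typo introduced in one source
    {"name": "Birch Lane LLC", "balance": 300},
]
deduplicated = merge_near_duplicates(accounts, "name", "balance")
```

Rolling the finer-grained source up to the coarser level is usually the safer direction, since monthly figures cannot be split back into daily ones without additional assumptions.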
In conclusion, organizations looking to enhance their enterprise data warehouse to meet reporting and analytics goals need to centralize their data and will experience some of the common challenges outlined above. Working with proven experts to design and implement data warehouses will save money and time and typically deliver better results in the long run.
We welcome your thoughts on the above issues and an opportunity to engage in a conversation if your organization is grappling with these issues. You can reach us at – firstname.lastname@example.org; email@example.com.
About the Authors:
Keith Oelkers brings 20 years of experience in data science and statistical modeling. He understands the power of data science and big data and how it can be used to enhance the bottom line. Keith has spent his career building solutions that enable business leaders to make better decisions. He can break down complex statistical terminology and provide organizations with a clear vision of how statistical modeling and data science should be used.
Mike Frayler is a Managing Director at DecisivEdge. He brings 20 years of both technology and management experience at Fortune 100 companies. He is a proven innovator with the experience and ability to rapidly comprehend a customer’s business vision, plans and strategic decision-making processes, allowing him to create and deliver critical data and analytic solutions that bring immediate improved results in both sales and efficiency.
Prior to joining DecisivEdge in 2007, Mike was a Senior Vice President at Bank of America where he was responsible for delivering analytics and insights to profitably grow and manage the bank’s credit card and consumer portfolio. He led the team responsible for developing a multi-faceted data and analytic platform that allowed front line managers, in near real-time, to strategically lead the sales performance of over 2,000 account representatives.
Mike is a graduate of St. Joseph’s College where he received a BA in Mathematics.