by Mike Sargo
Data Management is about recognizing your data as an asset and one of your organization’s most valuable resources. You must thoughtfully collect, store, model, and govern your data in a way that optimizes the performance of your data-driven applications. These tasks will support the streamlining of your data lifecycle to enable a data-driven organization by moving the data closer to the point of action.
Data Collection is focused on the process of managing the data that you are collecting. At first glance, many customers will say that they simply want to collect all data and keep it forever. While this may be a noble goal, it is not the most effective approach, and it quickly turns into an attempt to boil the ocean. As a big believer in agile development practices that deliver value through numerous iterations, I always recommend that we start small by focusing on what data is essential to the business process and what the retention policies should be for this data.
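To make the idea of retention policies concrete, here is a minimal Python sketch. The category names, retention periods, and the `RetentionPolicy` and `is_expired` helpers are all hypothetical and chosen only for illustration; real values would come from your business and compliance owners.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class RetentionPolicy:
    """Hypothetical policy: how long one category of data is kept."""
    category: str
    retention_days: int
    essential: bool  # is this data essential to the business process?


# Illustrative values only, not recommendations.
POLICIES = {
    "orders": RetentionPolicy("orders", retention_days=7 * 365, essential=True),
    "web_clickstream": RetentionPolicy("web_clickstream", retention_days=90, essential=False),
}


def is_expired(category: str, collected_on: date, today: date) -> bool:
    """Return True when a record has outlived its category's retention period."""
    policy = POLICIES[category]
    return today - collected_on > timedelta(days=policy.retention_days)
```

Starting with two or three categories like this keeps the conversation focused on what is essential, and new categories can be added in later iterations.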
Data Storage is about how best to store data so that it supports the data lifecycle, any additional processing, and the business activities that depend on it. The right method depends on several variables, with business rules applied according to data classification. A few variables that affect the data storage strategy: Is the data structured or unstructured? At what point in the data lifecycle is the data being stored? What additional processing, if any, may need to occur? Even in a simplified data process, you may have multiple storage points supporting the data as it moves closer to the point of action. For example, you could initially store data in a data lake, apply data integration techniques and transformations to load it into a data warehouse, and then process it further into a business-process-specific data mart or analytics application that is closer to the point of action. As you start to think about where and how the data will be stored, it is extremely important to factor in how the data is used today along with how it could be leveraged in the future.
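The lake, warehouse, and mart stages described above can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions: the records, field names, and the `to_warehouse` and `to_sales_mart` functions are hypothetical, and in-memory lists stand in for real storage systems.

```python
# Stage 1: the "data lake" holds raw records as collected
# (everything is a string, casing is inconsistent, some values are unusable).
raw_lake = [
    {"order_id": "1", "region": "east", "amount": "120.50"},
    {"order_id": "2", "region": "West", "amount": "80.00"},
    {"order_id": "3", "region": "east", "amount": "not available"},
]


def to_warehouse(records):
    """Integration step: enforce types, standardize values, skip unusable rows."""
    cleaned = []
    for rec in records:
        try:
            amount = float(rec["amount"])
        except ValueError:
            continue  # a real pipeline would likely quarantine and log this row
        cleaned.append({
            "order_id": int(rec["order_id"]),
            "region": rec["region"].strip().lower(),
            "amount": amount,
        })
    return cleaned


def to_sales_mart(warehouse_rows):
    """Mart step: aggregate to the shape one business process needs."""
    totals = {}
    for row in warehouse_rows:
        totals[row["region"]] = totals.get(row["region"], 0.0) + row["amount"]
    return totals


warehouse = to_warehouse(raw_lake)
sales_by_region = to_sales_mart(warehouse)
```

Each stage moves the data closer to the point of action: the lake keeps everything as it arrived, the warehouse holds typed and standardized rows, and the mart exposes only the regional totals a sales process would consume.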
Data Modeling produces a structured representation of the data geared towards the context in which the data will be used. The data must be structured in a way that is easy to use and understand while meeting the needs of business users and supporting the associated business processes. An effective data model must be designed with a focus on the process it supports. This is another area where agile development practices play an important role. Again, I would recommend starting small and progressively enhancing the model to make it more complete.
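As a rough sketch of starting small and enhancing progressively, consider the Python model below. The `Customer` and `Order` entities, their fields, and the `total_spend` helper are hypothetical examples, not a prescribed design. The optional `segment` field represents an attribute added in a later iteration without invalidating records created earlier.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Customer:
    customer_id: int
    name: str
    # Added in a later iteration; optional so earlier records remain valid.
    segment: Optional[str] = None


@dataclass
class Order:
    order_id: int
    customer_id: int  # links each order back to a customer
    amount: float


def total_spend(customer: Customer, orders: List[Order]) -> float:
    """Answer one business question: how much has this customer spent?"""
    return sum(o.amount for o in orders if o.customer_id == customer.customer_id)
```

The first iteration only needs enough structure to answer the business question at hand; further attributes and entities can follow once the model is in use.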
In summary, data management plays a critical role in supporting the data lifecycle and building the foundations that make streamlining the data lifecycle possible. Organizations should carefully plan and consider their data management strategy and how it can support the larger needs of the organization.
Written by Mike Sargo
Mike Sargo is Chief Data and Analytics Officer and Co-Founder of Data Ideology with over 18 years of experience leading, architecting, implementing, and delivering enterprise analytics, business intelligence, and enterprise data management solutions.