Data Lineage 101: The Basics
This activity was sponsored by Anthony Fazio Marshall on behalf of Manta.IO.
This webinar discusses the basics of data lineage and the importance of accurate and efficient data pipelines. Historically, data lineage has been documented manually, which costs organizations time and money and leaves a lot of room for error. Automating data lineage leads to a more accurate and actionable picture of the flow of data from beginning to end. Mr. Fazio covers key terms, such as source system, data assets, metadata, transformations, and more. He also discusses some of the associated challenges, such as simplifying the data lineage process so that it can be understood by people from a variety of business units, not just a technical audience. Another challenge is addressing the changes in data that come with company evolution (new suppliers, new products, etc.) and integrating the new data sources into the existing pipelines so that the data flows smoothly.
Some initiatives that are directly related to data lineage are data democratization, root cause analysis, data modernization and migration, and a proactive approach to data risk management that identifies issues within the data pipeline. Reliable and thorough data lineage is essential for data governance, DataOps, cloud migration, and more. This conversation has given me a lot of insight into how data is managed from both a business and a technical perspective.