Data Science in Practice
Data can be generated, captured, and stored in a dizzying variety of structures, but when it comes to analysis, not all data formats are created equal. Data preparation is the process of cleaning dirty data, restructuring ill-formed data, and combining multiple data sets for analysis. It involves reshaping the structure of the data, such as its rows and columns, and cleaning up details such as data types and values. The speed and efficiency of your data prep process directly affect how long it takes to discover insights. Understanding the scope of the data you’re analyzing, and seeing the effect of each change as you make it, can accelerate the entire process.
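To make the three activities above concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the sample order rows, the `customer_regions` lookup, and the helper names `clean_order`, `prepare`, and `totals_by_customer` are hypothetical and do not come from any particular data prep tool.

```python
from datetime import date

# Hypothetical raw input: every value is a string, names are inconsistent,
# and one amount is missing.
raw_orders = [
    {"order_id": "1001", "customer": " alice ", "amount": "19.99", "date": "2023-01-05"},
    {"order_id": "1002", "customer": "BOB", "amount": "n/a", "date": "2023-01-06"},
    {"order_id": "1003", "customer": "Alice", "amount": "5.00", "date": "2023-01-07"},
]

# Hypothetical second data set to combine with the orders.
customer_regions = {"alice": "West", "bob": "East"}


def clean_order(row):
    """Clean one row: normalize the name and convert strings to real types."""
    try:
        amount = float(row["amount"])
    except ValueError:
        amount = None  # keep the row, but mark the unusable amount as missing
    return {
        "order_id": int(row["order_id"]),
        "customer": row["customer"].strip().lower(),
        "amount": amount,
        "date": date.fromisoformat(row["date"]),
    }


def prepare(orders, regions):
    """Clean every row, then combine it with the region lookup."""
    cleaned = [clean_order(row) for row in orders]
    for row in cleaned:
        row["region"] = regions.get(row["customer"])  # None if no match
    return cleaned


def totals_by_customer(rows):
    """Restructure: collapse one-row-per-order into one total per customer."""
    totals = {}
    for row in rows:
        if row["amount"] is None:
            continue  # skip orders whose amount could not be parsed
        totals[row["customer"]] = totals.get(row["customer"], 0.0) + row["amount"]
    return totals


prepared = prepare(raw_orders, customer_regions)
totals = totals_by_customer(prepared)
```

Keeping the row with the unparseable amount, rather than dropping it, is one possible design choice: the order still counts toward row totals and region joins, and the missing value stays visible to whoever analyzes the data next.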
Before you begin, think about how people will use the data that you’re preparing. Understanding this context will help you determine which data set to use, how much data to bring into your data prep tool, and how to structure and shape the data. To get started, you’ll need to answer some basic questions: