Today’s finance and accounting professionals face ever-growing challenges when it comes to data.
Even the simplest analyses can be complicated by the wrong data or bad data, and too much data can itself be a problem. Effectively sourcing, managing, and using data are key skills.
Data Collection Basics
When it comes to collecting data, keeping three V's in mind goes a long way toward ensuring you collect high-quality data: volume, variety, and velocity.
How much data you collect should be a function of the relative value you can wring out of that data. Collecting every piece of data you can get your hands on might seem like the way to go at first glance, but many constraints make it wise to consider the potential value of data before you collect it. Time and cost are two big considerations when determining what data to collect. It’s not just the time and cost to collect the data you need to consider, but also the time and cost of managing it.
Variety is an attribute that helps an analyst by adding context, so whoever is analyzing the data can better understand how it fits into the big picture. For example, data about which platforms your customers use can add context to usage data. It might reveal that mobile users in certain locations don't spend as much time on your website because of poor cellular coverage in those areas, preventing someone from jumping to false conclusions about website statistics.
From batch to real-time, there is a wide spectrum of velocity with which data is processed. More and more data is processed in real-time than ever before and there are even some users, like high-frequency traders, who have gone so far as to shorten the distance between themselves and their data source. Author Michael Lewis writes about this in his book Flash Boys.
Using data to enhance decision making is smart, but the quality of the data you use will impact the quality of your decisions. What exactly makes good data?
In his book Data Driven: Profiting from Your Most Important Business Asset, Dr. Thomas C. Redman describes high-quality data as data suited to its intended use. Specific criteria used to measure data quality include accuracy, completeness, consistency, timeliness, and uniqueness.
Much has been written about each of these data characteristics, and your criteria for each aspect of data quality may differ from someone else's. For example, if department A's criterion for validity is accuracy to two decimal places, but department B's criterion is accuracy to three decimal places, you may have data that is valid for one department but not the other. Carefully consider how you establish your data quality criteria and then be consistent in applying them.
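The decimal-places scenario above can be expressed as a simple check. This sketch is only an illustration (the function name, the example figure, and the two departments' thresholds are all hypothetical), but it shows how the same value can pass one department's criterion and fail another's:

```python
from decimal import Decimal

def meets_precision(value: str, max_decimals: int) -> bool:
    """Check whether a numeric string is stated to at most max_decimals places."""
    exponent = Decimal(value).as_tuple().exponent
    return isinstance(exponent, int) and -exponent <= max_decimals

figure = "1234.567"  # a hypothetical reported amount
print(meets_precision(figure, 3))  # valid for a three-decimal-place criterion: True
print(meets_precision(figure, 2))  # invalid for a two-decimal-place criterion: False
```

The point is not the code itself but the discipline: once a criterion is written down precisely, it can be applied consistently everywhere the data is used.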
Should We Go Self-Service?
Are you considering using some type of self-service analysis or reporting tool? Effective use of self-service tools requires some basic knowledge of data access procedures from a variety of data sources and file types. It also requires an understanding of how to manipulate data filters, drill-down menus, and basic user interfaces. Knowledge of basic security concepts used in self-service tools is also important.
Extracting, Transforming, Loading (ETL): The Basics
Another skillset you should have if you're going to use self-service data analysis tools is a basic understanding of the extracting, transforming, and loading (ETL) process.
Extracting data is the process of acquiring data from its source, which requires basic knowledge of how to access the source system or dataset and pull the data out of it. What is required to accomplish the extraction will vary depending on the data source and other factors, so some flexibility and resourcefulness on your part may be required.
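As a rough sketch of the extraction step, here is a minimal example that reads records out of a CSV export, the kind of file many source systems can produce. The file contents and column names are hypothetical:

```python
import csv
import io

# A small in-memory stand-in for an exported sales file (hypothetical columns).
raw_export = """invoice_id,invoice_date,amount
1001,2024-01-15,250.00
1002,2024-02-03,417.50
"""

def extract_rows(source) -> list:
    """Pull each record out of a CSV export as a plain dictionary."""
    return list(csv.DictReader(source))

rows = extract_rows(io.StringIO(raw_export))
print(len(rows))          # 2
print(rows[0]["amount"])  # '250.00' -- note everything is still text at this stage
```

Notice that every value arrives as text; making the data genuinely usable is the job of the next phase.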
Now you have the data in the file format you need, but it may not be organized the way you want. For example, your data may include dates stored as text, but you need to perform calculations on those dates, so the data must be transformed first. This will involve some data manipulation, which may be done in the data source or in the destination application. This is the transformation phase of the process.
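The dates-as-text example above can be sketched in a few lines. This assumes a particular text format (`YYYY-MM-DD`); real exports vary, and the function name here is just for illustration:

```python
from datetime import date, datetime

def parse_invoice_date(text: str) -> date:
    """Convert a date stored as text (e.g. '2024-01-15') into a real date value."""
    return datetime.strptime(text, "%Y-%m-%d").date()

d1 = parse_invoice_date("2024-01-15")
d2 = parse_invoice_date("2024-02-03")
print((d2 - d1).days)  # date arithmetic now works: 19
```

Before the transformation, subtracting the two values would be meaningless string manipulation; afterward, ordinary date arithmetic works.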
Once data transformation is complete, you can load your data into its destination. You may be moving data from Excel into a data visualization or reporting tool, for example. Loading may involve several steps of its own, but the process is getting easier as tools are designed to be more and more user-friendly.
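To round out the picture, the loading step might look like the following when the destination is simply another file. The record structure and column names are hypothetical, and a real reporting tool would typically have its own import mechanism:

```python
import csv
import io

# Transformed records ready to load (hypothetical structure).
records = [
    {"invoice_id": "1001", "amount": 250.00},
    {"invoice_id": "1002", "amount": 417.50},
]

def load_to_csv(rows, destination) -> None:
    """Write the transformed rows to their destination file."""
    writer = csv.DictWriter(destination, fieldnames=["invoice_id", "amount"])
    writer.writeheader()
    writer.writerows(rows)

buffer = io.StringIO()  # stands in for the destination file
load_to_csv(records, buffer)
print(buffer.getvalue().splitlines()[0])  # header row: invoice_id,amount
```

In practice the "destination" is whatever your reporting or visualization tool expects; the essential idea is that loading writes the cleaned, transformed data where the analysis will happen.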
With the right foundation, moving to self-service tools can be a great move; just make sure you approach the idea with care.