A couple of weeks ago I joined the meeting of the Amsterdam FP&A Board where we...
In this series of blogs, I’m looking at several areas that FP&A departments must address to add value to the organisations they serve in this technology-driven age. In this blog, I look at the changing nature of databases.
The predominant analytic tools for FP&A professionals are spreadsheets, probably due to their prevalence, versatility, low cost, and reporting capabilities.
Where are spreadsheets weak? They are single-user, cannot present multiple views without duplicating data, are difficult to set up and maintain for large, complex models, and lack workflow capabilities. More sophisticated analytic applications have plugged these gaps, and most, if not all, of them depend on a database that provides easy access, scalability and integrity.
The Rise of the SQL Database
A database is simply a way of storing data so that it can be searched and information retrieved later. To this end, a database has two parts: how the stored data is structured, and the tools that provide access to the content.
It can be argued that an electronic folder is a database: data is stored as individual documents, and the file manager can be used to search those documents for a specific text string. Similarly, a spreadsheet is a database with facilities for searching and producing analyses (e.g. summaries) from the stored content.
Analytic applications, such as those used for planning, also have a database that typically holds both the data and the defined relationships between the data.
For example, for each product and location combination a relationship exists between sales, volume and price (sales = volume × price).
This application ‘intelligence’, together with a description of what each data item represents, is known as ‘metadata’.
The underlying database technology in these applications is typically a ‘relational’ database, where data and metadata are held in tables made up of records and fields. These components can be considered equivalent to a spreadsheet’s sheets, rows and columns.
Some vendors use a proprietary SQL database, while others use a hybrid of their own, but these still tend to be made up of tables, records and fields.
(For this blog, I am treating multidimensional databases as a specialised form of relational database.) To analyse the data, the vendor will have created various functions using the database’s access language, and these are sometimes made available to customers.
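To make the relational idea concrete, here is a minimal sketch using Python’s built-in sqlite3 module. The table and column names (sales_facts, product, location, volume, price) are purely illustrative and not taken from any particular planning product.

```python
import sqlite3

# One relational table holding volume and price for each product/location
# combination, queried with SQL (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_facts (product TEXT, location TEXT, volume REAL, price REAL)"
)
conn.executemany(
    "INSERT INTO sales_facts VALUES (?, ?, ?, ?)",
    [
        ("Widget", "Amsterdam", 120, 9.50),
        ("Widget", "London", 80, 10.00),
        ("Gadget", "Amsterdam", 45, 24.00),
    ],
)

# Sales is derived from the relationship sales = volume * price,
# aggregated here by product.
for product, sales in conn.execute(
    "SELECT product, SUM(volume * price) FROM sales_facts GROUP BY product"
):
    print(product, sales)
```

The point to note is that every row must fit the predefined fields, and all queries are expressed against those fields.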
However, in today’s information-rich environment, where vast quantities of data are generated in many different forms, the relational model falls short in terms of performance and the types of analyses that can be produced. This is mainly due to the structure used: data has to fit into predefined tables, records and fields, and each row has a unique primary key that is used as the basis for performing analyses.
NoSQL Databases
In recent years, new database forms have been developed to handle specific types of analyses on large data sets. These come under the name ‘NoSQL’, indicating that they store data differently from relational databases and that there is no standard query language for them. Their performance can be staggering, and they can produce analyses that were previously impossible. Just think for a minute how fast Google responds to a search query: it can do this because the data is held in a way that optimises that type of query, and it is not a relational database.
It is generally agreed that there are four main types of NoSQL database, which come under the following names:
- Key value
These are simple databases where data is stored and retrieved by a key, much like a dictionary or map. The value held against each key can have a different structure. An example is storing the interactions of individual customers logging into a website (a small Python sketch of each of these four types follows after this list).
This type is ideal for simple look-ups but cannot handle complex analyses.
- Document
These databases can handle complex queries on vast amounts of data that can be stored in a structured, semi-structured or unstructured manner. This is because each document embeds a description of what it contains (typically using XML or JSON) instead of storing values within predefined fields. Each document is effectively a record, and documents may share the same structure or differ from one another. Document databases also support data hierarchies, where documents are organised into groups called collections (analogous to the tables in a relational database). For example, a document database might include collections for customers, orders and products, where each collection has its own standard fields (see the document sketch below).
- Column
This type of database is similar to a relational database except that the data is stored column by column rather than row by row, so all the values of a given field are kept together.
This vastly improves performance on large data sets involving complex queries, making column databases ideal for advanced analytics (see the column sketch below).
- Graph
These store data as relationships, using ‘graph theory’, the branch of mathematics concerned with structures that model relations between objects. A graph database can then be used to identify more distant relationships, allowing analyses such as linking the web pages customers are browsing with the products they are buying to help create upsell and cross-sell recommendations (see the graph sketch below).
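To give a feel for how these four types differ in practice, here are four small Python sketches, one per type. They are toy, in-memory illustrations of the access patterns, not the APIs of any real product, and every name in them is made up. First, the key-value idea: everything is stored and fetched by a key, and the values need not share a structure (real databases of this type include Redis and DynamoDB).

```python
# A toy key-value store: each key maps to a value whose structure can differ.
store = {}

def put(key, value):
    store[key] = value

def get(key):
    return store.get(key)

# Interactions of individual customers logging into a website; note that the
# values do not share a fixed schema.
put("customer:1001", {"last_login": "2021-05-01T09:12:00", "device": "mobile"})
put("customer:1002", {"logins_this_month": 7})

print(get("customer:1001"))
```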
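Next, the document idea: each record carries its own field names (here as Python dictionaries standing in for JSON documents), documents live in collections, and two documents in the same collection need not have identical structures. MongoDB is a well-known database of this type.

```python
import json

# A toy "orders" collection: each document describes itself, and documents
# in the same collection can have different shapes.
orders = [
    {"order_id": 1, "customer": "ACME",
     "lines": [{"product": "Widget", "qty": 10}]},
    {"order_id": 2, "customer": "Globex",
     "lines": [{"product": "Gadget", "qty": 2}],
     "notes": "rush delivery"},  # extra field, no schema change needed
]

# A simple query over the collection: total quantity ordered per customer.
totals = {}
for doc in orders:
    qty = sum(line["qty"] for line in doc["lines"])
    totals[doc["customer"]] = totals.get(doc["customer"], 0) + qty

print(json.dumps(totals, indent=2))
```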
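Third, the column idea: the same kind of records, but stored field by field rather than record by record, so an aggregate over one field only has to read that column. Column-oriented analytic engines work on this principle, albeit with compression and distribution that this toy version ignores.

```python
# Row-oriented storage keeps whole records together.
rows = [
    ("Widget", "Amsterdam", 120),
    ("Widget", "London", 80),
    ("Gadget", "Amsterdam", 45),
]

# Column-oriented storage keeps each field's values together.
columns = {
    "product":  [r[0] for r in rows],
    "location": [r[1] for r in rows],
    "volume":   [r[2] for r in rows],
}

# Summing volumes now touches a single column rather than every record.
print(sum(columns["volume"]))
```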
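Finally, the graph idea: customers, web pages and products are nodes, and ‘viewed’ and ‘bought’ are relationships between them, which can be traversed to suggest upsell and cross-sell candidates. Dedicated graph databases such as Neo4j do this at scale with their own query languages; the sketch below only illustrates the principle.

```python
from collections import defaultdict

# A toy graph: 'viewed' and 'bought' edges linking customers, pages and products.
viewed = defaultdict(set)   # customer -> pages viewed
bought = defaultdict(set)   # customer -> products bought

viewed["alice"].update({"/pricing", "/widgets"})
viewed["bob"].add("/widgets")
bought["alice"].add("Widget Pro")

def recommend(customer):
    """Suggest products bought by other customers who viewed the same pages."""
    my_pages = viewed[customer]
    suggestions = set()
    for other, pages in viewed.items():
        if other != customer and pages & my_pages:
            suggestions |= bought[other] - bought[customer]
    return suggestions

print(recommend("bob"))  # -> {'Widget Pro'}
```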
Conclusion
In this blog, we can only touch on summaries of these new database types. It’s worth mentioning that each type typically requires users to develop their own code and to have a good understanding of the underlying technology architecture. There are also many competing products in each category. Consequently, using these databases directly is probably way beyond the capability of most accountants. But that doesn’t mean they should be ignored. As previously mentioned, FP&A departments must take on skills that can track developments, assess potential uses and determine when and where these technologies should be used.
In my next blog, I will look at the kinds of applications that can be developed using these database technologies.
The article was first published at Unit4Prevero.