In this post, @rufuspollock mentioned the need to compare measures of expenditure both within and between datasets:
Comparing different versions of expenditure is a common and extremely
important functionality. In the case of budgeted (planned) and actuals
(or different versions of a planned budget) it is comparing the same
budget line between planned and actual. We also want to compare between
countries: for example spending on defence in one country vs another (or
spending on the same cofog codes). This is a really important feature
and one that OpenSpending will support.
Exactly how OpenSpending will do that needs discussion. Has anyone had experience with generating spending data comparisons? What is important to think about, especially, say, when generating visualizations of such comparisons? Do two datasets that classify spending by COFOG need some agreed way to confirm that they are referring to the same classification scheme? For those that don’t share a classification scheme, is it worth supporting comparing those datasets? How? What else is there to think about?
I’ve done comparative analysis of budgetary data. (This is the app, but I’ve just checked and the demo is no longer online :( )
In this app, we relied on the fact that each “entity” (in this case, a municipality in Israel) declares budgets (read: both projected and actuals) according to a pre-defined budget tree (pretty much a functional classification system for both expenditure and revenues).
Anyway, from our experiences there, these are the main points for comparative analysis:
The main types of comparison that a range of users want to make are:
Across entities (how much does Jerusalem spend on pre-schools compared to Tel Aviv?)
Over time (how much did Tel Aviv spend on pre-schools in 2005, and how much in 2015?)
A combination of both of the above (over the last 10 years, how much did the 3 biggest cities in Israel spend on pre-schools?)
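To make those three comparison types concrete, here is a minimal sketch. It assumes every entity reports against the same budget tree, so a budget line name means the same thing everywhere. The record layout, the `compare` helper and all the figures are invented for illustration; none of this is how our app or OpenSpending actually stores data.

```python
from collections import defaultdict

# Hypothetical records: (entity, year, budget line, amount).
# All figures are made up for illustration.
RECORDS = [
    ("Jerusalem", 2005, "pre-schools", 100.0),
    ("Jerusalem", 2015, "pre-schools", 150.0),
    ("Tel Aviv", 2005, "pre-schools", 80.0),
    ("Tel Aviv", 2015, "pre-schools", 140.0),
    ("Tel Aviv", 2015, "roads", 300.0),
]


def compare(records, line, entities=None, years=None):
    """Sum spending on one budget line, keyed by (entity, year).

    Passing None for entities or years means "do not filter on this".
    """
    totals = defaultdict(float)
    for entity, year, rec_line, amount in records:
        if rec_line != line:
            continue
        if entities is not None and entity not in entities:
            continue
        if years is not None and year not in years:
            continue
        totals[(entity, year)] += amount
    return dict(totals)


# Across entities: one year, all entities.
across_entities = compare(RECORDS, "pre-schools", years={2015})
# Over time: one entity, all years.
over_time = compare(RECORDS, "pre-schools", entities={"Tel Aviv"})
# Combination: several entities over several years.
combined = compare(RECORDS, "pre-schools",
                   entities={"Jerusalem", "Tel Aviv"},
                   years={2005, 2015})
```

All three comparisons are the same query with different filters, which is only possible because the budget line is a shared key across entities and years.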
All comparative analysis needs a context in which to make comparisons. The depth of the contextual data used to frame spending obviously enriches the comparison itself. At a minimum, I’d say these are the base data points required for any comparative analysis:
Population per entity (population should be tied to period too, e.g. population in 2005 and population in 2015)
Population should at least be broken down by gender, and some age bands
Inflation
GDP
At least some type of socioeconomic indicator, e.g. the Gini index
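As a rough sketch of why these data points matter, the snippet below turns a raw amount into a per-capita figure in constant prices, and into a share of GDP. The function names, the price index values and all the amounts are assumptions made up for the example; real figures would come from a statistics bureau.

```python
def real_per_capita(amount, population, price_index, base_price_index=100.0):
    """Spending per person, expressed in base-period prices.

    price_index is the price level in the period the amount refers to,
    on the same scale as base_price_index (hypothetical values here).
    """
    if population <= 0 or price_index <= 0:
        raise ValueError("population and price_index must be positive")
    return amount / population * (base_price_index / price_index)


def share_of_gdp(amount, gdp):
    """Spending as a fraction of GDP for the same period."""
    if gdp <= 0:
        raise ValueError("gdp must be positive")
    return amount / gdp


# Invented example: nominal spending rose by 50% between the two periods,
# but so did the combination of population and prices.
spend_2005 = real_per_capita(1_000_000, population=10_000, price_index=100.0)
spend_2015 = real_per_capita(1_500_000, population=12_000, price_index=125.0)
```

With these invented numbers, the nominal increase disappears once population growth and inflation are taken into account, which is the kind of context a raw side-by-side comparison hides.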