W2UP6 - Assessing the suitability of data sources for input into European energy demand forecast models
23rd October 2015 Robert Sharp

Ed Sharp, UCL


Energy demand forecast models were first created in response to the 1973 – 1974 oil crisis. Since then they have been used primarily to ensure that supply meets demand by enabling effective planning of the associated infrastructure. The range of users has expanded to include policy makers, grid operators and energy suppliers. To improve accuracy, and so inform these users more effectively, the models have been developed using a range of different methods. Despite this, the consensus among researchers and practitioners is that the precision of model output is still dictated by the availability of data to use as inputs.

Practitioners such as EDF, the industrial partner to this study, have found that this data is neither abundant nor easy to locate. There are also many competing organisations providing data without a common methodology for collection, storage or delivery, resulting in differing values for the same variables. EDF’s region of interest is the tertiary sector of a group of European countries that encompass varying levels of economic development. Previous research has found data to be less abundant and lower in quality for both the non-domestic sector and less developed countries. This study has therefore attempted to identify data sources and to quantify the divergence caused by the competing organisations’ lack of a common methodology.

Common variables have been identified from examples of forecast models found in the literature, including the two tertiary sector models created by EDF. Data on each of these variables has then been located from a range of sources, including data collation agencies, national statistics authorities and academic research. This data has then been compared on a per-country basis.

This comparison has shown that the largest ranges between datasets, expressed as a percentage of the mean value, are 370% for employee numbers in the sector, 350% for gross domestic product, 300% for gross value added, 170% for energy consumed in the sector, 170% for floor space and 20% for population. Data availability has been shown to be inconsistent. Long time series are available for economic and demographic variables. Crucially, however, the key driver of energy demand in the sector, floor space, has little associated data.
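As an illustration of the metric quoted above, the range between datasets as a percentage of the mean can be computed for one variable in one country as follows. This is only a minimal sketch: the function name, the source labels and the figures are hypothetical and are not taken from the study, and the study's own processing steps are not documented here.

```python
from statistics import mean


def range_as_percent_of_mean(values):
    """Return (max - min) / mean * 100 for values reported by different sources."""
    if len(values) < 2:
        raise ValueError("at least two sources are needed for a comparison")
    average = mean(values)
    if average == 0:
        raise ValueError("mean is zero, so the percentage is undefined")
    return (max(values) - min(values)) / average * 100


# Hypothetical floor space figures for one country from three sources
# (illustrative numbers only, not data from the study).
floor_space_by_source = {
    "source_a": 100.0,
    "source_b": 150.0,
    "source_c": 50.0,
}

divergence = range_as_percent_of_mean(list(floor_space_by_source.values()))
print(f"Range as a percentage of the mean: {divergence:.0f}%")
```

Because the range is divided by the mean, a value above 100% means that the spread between the highest and lowest source is larger than the average reported value itself.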

The research has demonstrated that a lack of transparent methodologies makes it impossible to pinpoint the exact reasons for divergence. However, there are several key differences between the datasets which point towards possible reasons. These include the use of several different schemes to classify the sector and the resulting difficulties in harmonising data. Some of the divergence can be attributed to variables being calculated in different ways. Divergence has also been shown to be caused by the use of non-standard units and the processing then needed to harmonise data sources. The study concludes that there are existing structures that could improve the situation by encouraging the use of standards for the classification and collection of data. There is also a need to collect more data on certain variables, in particular floor space.

Project Team

Ed Sharp
Robert Lowe


Conference poster

Poster presented at the annual Energy Institute Colloquium

A brief description of the MRes Dissertation


Assessing the availability and quality of data for key input variables into energy demand forecast models of the UK tertiary sector

A paper written as a synopsis of the MRes dissertation research for prospective submission to BRI. The paper summarises the wider project in the context of the UK rather than Europe as a whole.