The Thames River, London, Canada. Image by Nicole Osborne.

Forecasting Long-Term Daily Municipal Water Demand

Interpretable, accurate, quick forecasting with Prophet

Blake VanBerlo
Towards Data Science
13 min read · Apr 19, 2021


Blake VanBerlo is a data scientist who supports the Municipal AI Applications Lab at the City of London, Canada. The lab operates out of the Information Technology Services Division. Blake posts about our open source AI projects. See “Contact” section below for other key City stakeholder contacts.

Summary

The Municipal government of the City of London, Canada completed an applied machine learning project focused on obtaining daily forecasts for long-term citywide water demand (see the source code). This article is a narration of the project’s lifecycle, from research to deployment. The final system fits a Prophet model to daily estimates of citywide water consumption to produce interpretable forecasts four years into the future.

Introduction

Forecasting is a commonplace task in municipal government. One such task is forecasting aggregate water demand over the long term (i.e. years into the future). Reasonably confident forecasts help cities produce water revenue projections, enabling faithful budgeting. They could also be used to identify and plan major investments in water infrastructure capacity. Additionally, analysis of such forecasts could yield rich insights into consumption patterns (e.g. seasonal variation). Comparing forecasts for the various customer rate classes (e.g. residential, commercial, industrial, etc.) may also facilitate a better understanding of the evolving needs of different sectors of the population and enable even more accurate revenue predictions.

The Municipal Artificial Intelligence Applications Lab and the Water Demand team at the City of London set out to investigate whether a more accurate forecasting model could be developed for the task of long-term water demand forecasting.

This article will describe (1) the forecasting problem, (2) the dataset used, (3) the process of searching for an effective forecasting model, (4) how the model was deployed for future reuse, (5) possible next steps, and (6) how municipalities can easily replicate this approach.

Location in London, Canada. Image by Nicole Osborne.

Problem

In the literature, water demand forecasting typically falls into one of two categories: short-term and long-term. The former usually takes place on the scale of hours to days. Several efforts for short-term forecasting have been documented, often using features such as historical water consumption and weather data (see [1-5] for examples). Long-term forecasting refers to predictions that identify the approximate consumption over a period of years (even decades). These approaches may use any of a variety of features, such as historical consumption, climate patterns, domestic patterns and economic factors (see [1, 6–8]).

The Water Demand management team was most interested in producing forecasts at least 4 years into the future, which corresponds to the City's 4-year multi-year budget planning cycle. Specifically, the problem was defined as:

Predict daily water consumption (in cubic metres) for any date within at least the next 4 years.

We approached this problem with great flexibility and narrowed down a solution as we got deeper into the investigation.

Dataset

We were able to obtain billing data for all customers between mid-2009 and mid-2020. In total, this left us with just over 10 years of billing data. Unfortunately, we were missing data for 2 date ranges: March 1, 2014 to September 30, 2014, and March 25, 2017 to May 31, 2017. Later we will discuss how the missing data turned out to be a factor in our model selection.

The data was provided to us as a series of CSV files, roughly one for each quarter of a year. Each row consisted of a customer's total water consumption over a billing period, along with other related attributes. That is, there was 1 record per billing period per address. Each billing period had a start date and an end date. Billing periods were approximately one month long, but varied considerably in the dataset. Further complicating the situation, billing periods were offset from one customer to another. This variance stems from how the data is collected: digital meter reads are gathered passively by vehicles physically driving around the City, so the reading for any one customer won't necessarily happen at exact intervals of one month. We were left with a logistical problem to solve: how can we accurately represent citywide water consumption for a particular time period?

Recall that our goal was to produce a prediction of citywide water consumption for a particular date. That is, our goal was to transform the consumption data from the provided format into a series of daily estimates of citywide water consumption. We chose our time step to be 1 day to maximize the granularity of the dataset and to increase the amount of training data by a factor of ~30. To estimate the daily consumption for a particular historical date d, we devised the following method (sketched in code after the list):

  1. Find the set of billing periods in the dataset that contained date d.
  2. Divide the consumption for each billing period in the set by its length in days.
  3. Sum the result from step 2.
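
For concreteness, here is a minimal sketch of this estimation procedure in pandas. It assumes a bills DataFrame with hypothetical columns start_date, end_date and consumption (one row per billing period per address); the column names and date range are illustrative, not the project's actual schema.

    import pandas as pd

    def estimate_daily_consumption(bills: pd.DataFrame, date: pd.Timestamp) -> float:
        """Estimate citywide consumption (cubic metres) on a single date d."""
        # 1. Billing periods that contain date d
        active = bills[(bills["start_date"] <= date) & (bills["end_date"] >= date)]

        # 2. Spread each period's consumption evenly over its length in days
        period_days = (active["end_date"] - active["start_date"]).dt.days + 1
        daily_rates = active["consumption"] / period_days

        # 3. Sum the per-customer daily rates to get the citywide estimate
        return daily_rates.sum()

    # Build the daily series over the available historical range (approximate dates)
    dates = pd.date_range("2009-07-01", "2020-06-30", freq="D")
    daily = pd.Series({d: estimate_daily_consumption(bills, d) for d in dates})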

The end result was a time series dataset of daily estimated consumption over the available historical range. Given that the business outcome was to provide long-term forecasts of system-wide consumption, we felt confident using the above procedure to estimate daily consumption. Inherent in the procedure is the assumption that each customer's daily consumption is constant across a billing period. Note that such an assumption would eliminate any model's ability to capture variance on short time scales (e.g. weekday and holiday effects).

Citywide daily water consumption estimates (in cubic metres) computed from water billing data. Image by author.

We also transformed all the non-consumption features (e.g. water meter type, land area, building classification, etc.) into aggregate measures of each feature value for each date. For numerical features, we created mean and standard deviation features over all the customers who had billing periods spanning date d. For categorical features, we created counts of the occurrences of each value in the dataset on date d (sketched below).
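
Continuing the preprocessing sketch above, the same period-containment logic can be reused for these aggregates. The column names land_area (numerical) and meter_type (categorical) are hypothetical stand-ins for the actual billing attributes.

    import pandas as pd

    def aggregate_features(bills: pd.DataFrame, date: pd.Timestamp) -> dict:
        """Aggregate non-consumption features over customers billed on date d."""
        active = bills[(bills["start_date"] <= date) & (bills["end_date"] >= date)]
        feats = {
            "land_area_mean": active["land_area"].mean(),
            "land_area_std": active["land_area"].std(),
        }
        # Counts of each categorical value observed on this date
        counts = active["meter_type"].value_counts()
        feats.update(counts.add_prefix("meter_type_").to_dict())
        return feats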

Candidate Models

We considered four alternative modelling techniques for our forecasting task. Prior to this work, the Water Demand team had been using a mix of extrapolations based on use by category, causal models, and most recently linear regression to produce water consumption forecasts; therefore, linear regression was included as a baseline. The techniques are listed below:

  • Ordinary least squares linear regression (abbreviated OLS)
  • A recurrent neural network with an LSTM layer followed by 2 fully connected layers (abbreviated LSTM-RNN)
  • A convolutional neural network with a 1-dimensional convolutional layer followed by 2 fully connected layers (abbreviated 1D-CNN)
  • Prophet, with yearly seasonality

We refer any readers who are unfamiliar with Prophet to its creators' paper [9]. In a nutshell, Prophet is a composition of a piecewise linear trend function, periodic functions, and a holiday-specific function.
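
To make this concrete, fitting such a model to the daily estimates takes only a few lines with the prophet package. A minimal sketch follows; the hyperparameter values shown are illustrative, not the tuned values used in the project.

    from prophet import Prophet  # pip install prophet (formerly fbprophet)

    # `daily` is the estimated daily citywide consumption series from the
    # preprocessing step; Prophet expects columns 'ds' (date) and 'y' (value).
    df = daily.rename_axis("ds").reset_index(name="y")

    model = Prophet(
        yearly_seasonality=True,        # summer peak / winter dip
        changepoint_prior_scale=0.05,   # flexibility of the piecewise linear trend (illustrative)
    )
    model.add_country_holidays(country_name="CA")  # holiday-specific terms
    model.fit(df)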

Initial experimentation indicated that the non-consumption features had little to no effect on the forecasts produced by the model. Our theory is that this was because they were fairly static over long durations in the dataset. Consider that the mean and standard deviation of the aggregate meter type ratio or land area would not change significantly from day to day. Further, the Water Demand team was most interested in a univariate consumption model. As a result, we abandoned the non-consumption features and focused on univariate forecasting with the consumption data.

Evaluation Approach

To arrive at a fair representation of model performance, all training experiments were conducted using a rolling origin cross validation strategy similar to that proposed by Tashman [10]. Since we had missing data in 2017 and neural networks do not handle missing data gracefully, we evaluated the four models on a subset of continuous data from June 1, 2017 onwards. During this stage, rolling origin cross validation was conducted for the 4 most recent sextiles of the dataset.

The train/validation/test split for fold k. Image by author.
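
Prophet ships with diagnostics utilities that implement this kind of rolling-origin evaluation. The sketch below illustrates the general idea, continuing the earlier code; the window sizes are assumptions, and the project's own fold scheme (the 4 most recent sextiles) differs in its details.

    from prophet.diagnostics import cross_validation, performance_metrics

    # Roll the forecast origin forward through the series, refitting each time
    cv_results = cross_validation(
        model,
        initial="730 days",   # first training window (assumed)
        period="180 days",    # shift the origin ~6 months per fold (assumed)
        horizon="365 days",   # forecast horizon scored in each fold (assumed)
    )
    metrics = performance_metrics(cv_results, metrics=["mae", "mape", "mse", "rmse"])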

Bayesian hyperparameter optimization was conducted for each candidate model to give each the best possible chance in the final evaluation. Each combination of hyperparameters investigated was evaluated by running cross validation and using the mean absolute percentage error (MAPE) on the test set, averaged across the folds, as the objective. For each model, we examined a partial dependence plot for each hyperparameter against the objective and selected optimal hyperparameters accordingly. See the example below for Prophet.

Partial dependence plots (PDP) for Prophet, created from the results of a Bayesian hyperparameter optimization. Brighter regions correspond to lower values for the objective. Black dots correspond to values of hyperparameters for each iteration of the optimization, and the red star indicates the best iteration. The graphs on the diagonal are one-dimensional PDPs. The red line indicates the best iteration. Image by author.
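
A search of this kind can be reproduced with an off-the-shelf Bayesian optimization library such as scikit-optimize. The sketch below is hedged: the search space is illustrative, and mean_cv_mape is a hypothetical, user-supplied routine that fits Prophet with the given hyperparameters, runs the rolling-origin cross validation, and returns the test MAPE averaged over folds.

    from skopt import gp_minimize
    from skopt.space import Real
    from skopt.plots import plot_objective

    # Illustrative search space over Prophet's prior-scale hyperparameters
    space = [
        Real(1e-3, 0.5, prior="log-uniform", name="changepoint_prior_scale"),
        Real(1e-2, 10.0, prior="log-uniform", name="seasonality_prior_scale"),
        Real(1e-2, 10.0, prior="log-uniform", name="holidays_prior_scale"),
    ]

    def objective(params):
        cps, sps, hps = params
        # mean_cv_mape is hypothetical: run rolling-origin CV for these
        # hyperparameters and return the average test-set MAPE.
        return mean_cv_mape(changepoint_prior_scale=cps,
                            seasonality_prior_scale=sps,
                            holidays_prior_scale=hps)

    result = gp_minimize(objective, space, n_calls=50, random_state=42)
    plot_objective(result)  # partial dependence plots like the figure above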

Performance

We then ran cross validation for each model, using its optimal hyperparameters. The performance metrics were mean absolute error (MAE), MAPE, mean squared error (MSE), and root mean squared error (RMSE). The table below summarizes the results for each candidate model.

Test set performance metrics for each candidate model, averaged across all cross validation trials. Bold indicates the best value for each metric. Image by author.

On the whole, Prophet had the best performance metrics. However, LSTM-RNN was a close second. Prophet minimized the metric that mattered most to the stakeholders, which was MAPE. Prophet also had the lowest MSE, potentially indicating that it deals better with high variance.

Interpretability

Since the forecasts are to be used to aid decision-making with the potential of impacting municipal budgets, policy making and rate-class definitions, it was desirable for the water manager to understand why a model makes the predictions that it does. Of the four candidate models, half were not inherently interpretable. LSTM-RNN and 1D-CNN are both neural network models, which are considered to be black box models. If one of these models were selected, it would likely be necessary to use an explainability algorithm to produce explanations for each prediction, which would require extra effort and increase compute costs in production.

OLS and Prophet, on the other hand, are inherently interpretable. Linear regression models are simple to understand because their coefficients correspond directly to the relative impact of each feature on the final prediction. As mentioned before, a fitted Prophet model is a composition of simple functions. In this project, the fitted Prophet model was the sum of a piecewise linear trend, a yearly Fourier series function, a weekly Fourier series function, and a parameterized mapping function for holidays. The parameters for each function can be obtained and saved. These four components can be separated and plotted as below:

The components of the final Prophet model trained on all available data. Trend and holiday components, given for all dates in the training set (labelled “ds” on the x-axis), are presented in the top two graphs. The bottom two graphs display one period of the fitted yearly and weekly periodic functions. Image by author.
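
For reference, a decomposition like the one above can be reproduced directly from a fitted model with Prophet's built-in plotting; a minimal sketch, continuing the earlier code:

    # Predict over the training dates and split the prediction into its
    # additive components (trend, holidays, weekly and yearly seasonality)
    components_forecast = model.predict(df[["ds"]])
    fig = model.plot_components(components_forecast)
    fig.savefig("prophet_components.png")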

The above figure also reveals some interesting insights about water consumption patterns in London. The overall trend in water demand has been steadily increasing, which corresponds to the ever-increasing residential population in the area. Notice the spike in usage in the warm summer months and the dip in the cold winter months. We are wary of the validity of the apparent weekly variation in consumption. First, notice that the scale of the graph is extremely small. Second, significant weekly variation in consumption likely does exist; it is not surprising that the model didn't pick it up, because the daily estimates were averaged over billing periods as discussed above, which is likely to eliminate any significant weekday variation. A similar effect was noticed for holidays, as the averaging across billing periods in preprocessing would eliminate most holiday effects.

Final Model Selection

On top of their marginally inferior performance, the lack of interpretability of the neural network models served as a major deterrent to their selection as a final model. An additional benefit of Prophet is the speed at which it trains (it took minutes for our dataset). Moreover, the whole dataset could be used when fitting Prophet, regardless of the intervals of missing consumption data. Prophet’s superior performance metrics, its low compute cost and its inherent interpretability positioned it as the natural choice.

The below figure demonstrates Prophet’s ability to perform well on a test set consisting of the most recent data.

Top left: Prophet’s predictions on the training set are contrasted with the ground truth daily consumption (i.e. “gt”). Top right: Prophet’s forecast for the 6-month test set compared with the ground truth consumption from the test set. Bottom left: residuals (training set MAE) and test set error (MAE). Bottom right: the distributions of residuals and test error. Consumption is expressed in cubic metres. Image by author.

Since this investigation was conducted in late 2020, the test set happened to fall entirely on the first few months of the COVID-19 pandemic. During this time, water consumption patterns were abnormal (see the pronounced spike in 2020).

Prophet Forecasts

To make forecasts, Prophet extrapolates the latest linear trend into the future, adding the yearly periodic function, the weekly periodic function, and the holiday function. Uncertainty intervals are created based on the chance that the trend will change in the future. Since the fitted model included very few trend changepoints over the training set, the uncertainty intervals are fairly narrow.
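
Continuing the earlier sketch, a 4-year forecast with its uncertainty intervals can be obtained as follows (the horizon and column selection are illustrative):

    # Extend the series 4 years beyond the last training date
    future = model.make_future_dataframe(periods=4 * 365, freq="D")
    forecast = model.predict(future)

    # 'yhat' is the point forecast; the lower/upper columns bound the
    # uncertainty interval, which widens with the chance of trend changes
    print(forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]].tail())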

We trained Prophet on all available data (using no test set) and produced the following 4-year forecast:

The forecast indicates that London's water demand will continue to increase in the coming years. The peaks and troughs in the forecast correspond to the increased and decreased demand in the summer and winter months respectively. In principle, this forecast can be extended indefinitely, but it would be unwise to rely on forecasts too far into the future, since it is unknown whether significant future events will induce major changepoints in the trend.

Deployment and Support

The goal of the Water Demand management team was to continue producing new forecasts whenever new raw data becomes available. Billing data is downloaded to the database on a quarterly basis. Therefore, a new 4-year forecast will be generated by fitting a Prophet model to all available data, including the data from the new quarter.

Cloud Architecture

The model was deployed using Microsoft Azure cloud computing services. The deployment scenario was split into 2 tasks. First, a model is trained on the first 90% of all available data and tested on the remaining 10%, saving model performance records for business approval. Second, a model is trained on all available data and a 4-year forecast is produced, along with figures for each functional component of the model (i.e. trend, yearly, weekly, holidays). An Azure Machine Learning pipeline was developed to carry out these tasks. All data was stored in Azure Blob Storage.

The pipeline consisted of the following steps (a code sketch follows the list):

  1. Preprocess data from the latest quarter and append it to the daily consumption estimates for all historical dates.
  2. Train a Prophet model on the first 90% of the univariate consumption data and save performance metrics on the test set. Then train a Prophet model on all consumption data, save the functional components of the model, and create a 4-year forecast using the model.
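
A minimal sketch of what such a two-step pipeline can look like with the Azure ML SDK is given below; the script names, compute target and experiment name are placeholders rather than the project's actual configuration.

    from azureml.core import Workspace, Experiment
    from azureml.pipeline.core import Pipeline
    from azureml.pipeline.steps import PythonScriptStep

    ws = Workspace.from_config()

    preprocess_step = PythonScriptStep(
        name="preprocess",
        script_name="preprocess.py",   # append the new quarter to the daily estimates
        compute_target="cpu-cluster",
        source_directory="src",
    )
    train_step = PythonScriptStep(
        name="train_and_forecast",
        script_name="train.py",        # evaluate on a 90/10 split, refit on all data, save the 4-year forecast
        compute_target="cpu-cluster",
        source_directory="src",
    )
    train_step.run_after(preprocess_step)

    pipeline = Pipeline(workspace=ws, steps=[preprocess_step, train_step])
    Experiment(ws, "water-demand-forecast").submit(pipeline)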

A Power BI dashboard consumes the aggregate raw data and the forecasts made by the model. The pipeline is automatically triggered by an Azure Logic App whenever the latest quarterly consumption data is uploaded to the Azure blob container for raw data.

Skaters on the flooded ice rink at Storybook Gardens, London, Canada. Image by Nicole Osborne.

Next Steps

One potential task for the future is analysis of forecasts produced for subsets of the customer base stratified by rate class. The city charges customers for water usage at different rates depending on the rate class that they fall under. Rate classes currently include: residential, commercial, industrial, institutional, and fire line. Comparing Prophet forecasts for each rate class would be an interesting research question and may help the city in rate class policy-making. Such forecasts would also provide a more detailed long-term estimate of municipal water revenue.

As mentioned earlier, climate is often used as a feature for obtaining long-term water demand forecasts. A future study could include climate features and characterize the effect of climate change on municipal water demand and its implications for water demand in the far future.

How Does My City Use This?

This approach can be adopted by your municipality if you have a historical record of municipal water consumption. Here we provide some suggestions for getting started.

First, your municipality would require a team consisting of the following professionals:

  • IT manager: to manage and support all staff involved in the project; ideally possessing familiarity with data science fundamentals
  • Data scientist: to lead research and development efforts; should possess data science expertise and experience; ideally experienced in cloud services if that would be your final deployment path for the production model
  • Water Demand manager/team: to identify the requirements of the project, serve as an application domain expert, and evaluate performance/interpretability artifacts produced by the data science member(s)
  • Data engineer: to navigate access to the consumption data. In our case our Water Demand manager and the team at London Hydro performed this function.

To obtain our code, you can clone our public GitHub repository. The front page of the repository includes detailed code documentation. Please be advised that our raw consumption data likely exists in a different format from yours, so you may need to devote effort to transforming your raw data into a time series consumption dataset.

We have submitted a paper detailing our methods and are waiting for a publication decision. If the paper is accepted, we will update this article to provide a link to the paper.

Contact

Matt Ross
Manager, Artificial Intelligence
Information Technology Services
City of London
maross@london.ca

Blake VanBerlo
Data Scientist
City of London Municipal Artificial Intelligence Applications Lab
vanberloblake@gmail.com

Daniel Hsia
Water Demand Manager
Water Engineering
City of London
dhsia@london.ca

Image by Renee Gaudet from Pixabay

References

[1] H. Liu, Municipal Water Demand Forecasting in the Short and Long Term with ANN and SD Models (2020), University of Alberta

[2] J. Adamowski, H. Fung Chan, S. O. Prasher, B. Ozga-Zielinski and A. Sliusarieva, Comparison of multiple linear and nonlinear regression, autoregressive integrated moving average, artificial neural network, and wavelet artificial neural network methods for urban water demand forecasting in Montreal, Canada (2012), Water Resources Research

[3] S. Shah, M. Hosseini, Z. B. Miled, R. Shafer and S. Berube, A water demand prediction model for central Indiana (2018), 32nd AAAI Conference on Artificial Intelligence

[4] L. Shvartser, U. Shamir and M. Feldman, Forecasting Hourly Water Demands by Pattern Recognition Approach (1993), Journal of water resources planning and management

[5] H. Du, Z. Zhao and H. Xue, ARIMA-M: A new model for daily water consumption prediction based on the autoregressive integrated moving average model and the markov chain error correction (2020), Water (Switzerland)

[6] C. Qi and N. Chang, System dynamics modeling for municipal water demand estimation in an urban region under uncertain economic impacts (2011), Journal of Environmental Management.

[7] K. Wang and E. G. R. Davies, Municipal water planning and management with an end-use based simulation model (2018) Environmental Modelling & Software

[8] M. Shrestha, S. Manandhar and S. Shrestha, Forecasting water demand under climate change using artificial neural network: A case study of Kathmandu Valley, Nepal (2020), Water Science and Technology: Water Supply

[9] S. Taylor and B. Letham, Forecasting at Scale (2018), The American Statistician

[10] L. J. Tashman, Out-of-sample tests of forecasting accuracy: an analysis and review (2000), International Journal of Forecasting.
