Ask A Modeler - December 2021

“What data sources can you use to do modeling at the urban scale and beyond?”

-Bigger and Better

Dear Bigger and Better,

When modeling individual buildings, obtaining pertinent data about the building that impacts its energy use is straightforward. You can observe physical traits of the building such as footprint, height, and number of floors, and you can obtain characteristics related to the building’s energy performance such as HVAC type, schedule, and water use with an energy audit. These building properties can be aggregated and used as inputs to a physical building simulation engine (such as EnergyPlus) to develop a building energy model that, with some tuning, may be representative of the building.

While it is possible to collect this much data about an individual building, collecting it or completing energy audits for thousands or millions of buildings becomes far more challenging in urban-, utility-, or national-scale analyses. For this reason, large-scale building energy modeling analyses often reduce the required input data to four properties: building footprint, height, type, and age. All other building properties can be derived or assumed from these four, though more specific data about a building may be used if available. For example, the building type informs the floor-to-floor height, which can be combined with the building height to estimate the number of floors (or the number of floors can be used directly if that data is available). The building type also informs the window-to-wall ratio, HVAC type, schedules, etc., while the age of the building informs characteristics such as insulation and infiltration. The properties assigned to an individual building may not match it perfectly, since not all buildings of a type are the same, but the aggregation of all buildings of that type and age should be representative.
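The floor-count derivation described above can be sketched in a few lines of Python. This is a minimal illustration, not a published method: the function names and the floor-to-floor heights in the lookup table are hypothetical placeholders, and a real analysis would take those values from its own prototype definitions.

```python
# Hypothetical floor-to-floor heights in meters, keyed by building type.
# These are illustrative placeholders, not values from any prototype set.
FLOOR_TO_FLOOR_M = {
    "office": 4.0,
    "retail": 5.0,
    "apartment": 3.0,
}


def estimate_floors(height_m, building_type, known_floors=None):
    """Estimate the number of floors from building height and type.

    If a measured floor count is available, it is used directly.
    """
    if known_floors is not None:
        return known_floors
    floor_height_m = FLOOR_TO_FLOOR_M[building_type]
    # Round to the nearest whole floor, but never report fewer than one.
    return max(1, round(height_m / floor_height_m))


def estimate_floor_area(footprint_m2, height_m, building_type):
    """Approximate total floor area as footprint area times floor count."""
    return footprint_m2 * estimate_floors(height_m, building_type)
```

For instance, under these assumed values a 12-meter office would be assigned three floors, and a 500-square-meter apartment footprint that is 9 meters tall would be assigned 1,500 square meters of floor area.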

The difficulty of obtaining this building data at urban, utility, and national scales varies by variable. The building footprint and height provide the physical structure of the building but not its performance. Building footprints have become easier to obtain at a large scale in recent years due to the proliferation of publicly available data in sources such as OpenStreetMap, which provides more than 129 million building footprints for the United States. This footprint data is created by applying convolutional neural networks to overhead images, with segmentation and polygonization applied to produce simplified footprints. Building height data has proven to be a more difficult problem. Certain counties and regions have publicly available height data, typically obtained using LiDAR or other localized methods. Expanding high-resolution coverage to larger scales creates challenges and uncertainty, but a 30-meter resolution dataset called “AW3D30” was released in 2016 and can provide a rough estimate of building height with proper manipulations. Resolution is an important factor for height data because, at low resolutions, tall objects around the building may influence the height estimate. The United States Geological Survey (USGS) currently has a categorical mapping of US building heights, and while this cannot be used directly, the agency has been expanding its 3D Elevation Program and may soon make higher-resolution data available.
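As a rough illustration of how a height might be pulled from gridded elevation data, the sketch below subtracts an assumed ground elevation from the surface elevation inside a footprint. Everything here is an assumption made for illustration: the function name, the grid layout, and the choice of the median inside the footprint and the minimum outside it are not the specific manipulations used with AW3D30 or any other dataset.

```python
from statistics import median


def estimate_height_m(surface_m, footprint_cells):
    """Estimate building height from a small surface-elevation grid.

    surface_m: 2D list of surface elevations in meters (rows of cells).
    footprint_cells: set of (row, col) cells covered by the building.
    """
    inside = [surface_m[r][c] for r, c in footprint_cells]
    outside = [
        surface_m[r][c]
        for r in range(len(surface_m))
        for c in range(len(surface_m[0]))
        if (r, c) not in footprint_cells
    ]
    if not inside or not outside:
        raise ValueError("need cells both inside and outside the footprint")
    # Assumption: the lowest surrounding cell is bare ground. Using the
    # minimum keeps tall neighbors (trees, towers) out of the ground level.
    ground_m = min(outside)
    # Assumption: the median inside the footprint approximates the roof.
    return median(inside) - ground_m
```

With a 3-by-3 grid whose center cell is at 112 meters, whose corner holds a 130-meter neighbor, and whose remaining cells are at 100 meters, this returns 12 meters. At coarser resolutions a single cell may blend the building with its surroundings, which is the uncertainty noted above.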

The building type and age fill out the physical skeleton developed by the footprint and height. These factors have a huge influence on the energy use of each building: a misclassified building can result in a difference in energy use of more than 10x. At a small scale (city or county), building type can often be gathered from tax-assessor or parcel data. Aggregating parcel data at a multi-county scale can be difficult because each individual database is structured differently. If working with a utility, or if hourly measured energy use data is available, these building energy signatures can be compared to models of prototypes for each category of buildings to assess which category an individual building's signature best fits. This method provides the building type that is most representative of the energy use, but not necessarily of the function, as not all buildings of the same type use energy in the same way. At a national level, there are currently no aggregated building type datasets. For analyses at this scale, heuristics based on building physical attributes and the distribution of building types in the region can be used to classify buildings. While this approach is unlikely to classify every individual building correctly, the hope is that the aggregated building type distribution is representative. Like building type, the year built may be available at a city or county level from tax-assessor or parcel data. Recent urbanization research has attempted to determine when a building was constructed by mapping global artificial impervious areas (GAIA) from 1985 to 2018, which can be used to estimate the age of tens of millions of buildings.
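The signature comparison described above can be sketched as a nearest-prototype match. This is a minimal sketch under stated assumptions: the function names are hypothetical, profiles are scaled to sum to one so that only their shape is compared, and squared error is just one possible distance measure.

```python
def normalize(profile):
    """Scale a load profile so its values sum to 1 (shape, not magnitude)."""
    total = sum(profile)
    if total <= 0:
        raise ValueError("profile must have positive total energy")
    return [value / total for value in profile]


def classify_by_signature(measured, prototypes):
    """Return the name of the prototype whose load shape is closest.

    measured: list of energy use values, one per time interval.
    prototypes: dict mapping a building type name to a modeled profile
        covering the same intervals.
    """
    measured_shape = normalize(measured)
    best_name, best_error = None, float("inf")
    for name, profile in prototypes.items():
        if len(profile) != len(measured):
            raise ValueError("profiles must cover the same intervals")
        prototype_shape = normalize(profile)
        # Sum of squared differences between the two normalized shapes.
        error = sum(
            (m - p) ** 2 for m, p in zip(measured_shape, prototype_shape)
        )
        if error < best_error:
            best_name, best_error = name, error
    return best_name
```

Normalizing first means a large and a small building with the same daily pattern match the same prototype. In practice the profiles would be hourly values over a year rather than the handful of intervals a toy example would use.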

As these data sources continue to expand and improve, modeling capability and quality will follow, allowing researchers to better represent large quantities of buildings and identify how to reduce their carbon impact on the environment. Once representative models are developed, this reduction can be pursued in a variety of ways. Savings estimates from modeling building technologies and retrofits can be used to secure financing to implement those technologies and realize real savings. Models can be aggregated and used by a utility to estimate peak demand, allowing it to optimize its grid and use its oldest, most costly, highest-emission generation capability as sparingly as possible. These are just two examples of the numerous ways building energy models can be used to lessen the negative carbon impact buildings have on the environment.
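As a small illustration of the peak demand example, the sketch below sums per-building hourly demand profiles and reports the coincident peak. The function name and the input layout are assumptions for illustration; a utility-scale workflow would operate on full-year simulation output.

```python
def aggregate_peak(profiles):
    """Sum per-building demand by hour and find the coincident peak.

    profiles: list of per-building demand lists (kW), all the same length.
    Returns (peak_kw, hour_index) for the summed profile.
    """
    if not profiles:
        raise ValueError("at least one building profile is required")
    n_hours = len(profiles[0])
    if any(len(p) != n_hours for p in profiles):
        raise ValueError("all profiles must cover the same hours")
    # zip(*profiles) groups the demand of every building for each hour.
    total_kw = [sum(hour_values) for hour_values in zip(*profiles)]
    peak_kw = max(total_kw)
    return peak_kw, total_kw.index(peak_kw)
```

For three buildings with demands of [1, 5, 2], [2, 1, 6], and [3, 3, 3] kW, the summed profile is [6, 9, 11], so the coincident peak is 11 kW in the third hour. That is lower than the 14 kW obtained by adding each building's individual peak, which is why aggregating time-resolved models matters for this use.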

Brett Bass, PhD

R&D Associate Staff Member, Oak Ridge National Laboratory

bassbc@ornl.gov
