How a Machine Learning Algorithm Helped Make Hurricane Damage Assessments Safer, Cheaper, and More Effective


Example of "no damage" images:

Example of errors where test images labeled as ‘damage’ were misidentified by the model as ‘no_damage.’

Example of errors where test images labeled as ‘no_damage’ were misidentified by the model as ‘damage.’

About the Authors

Eleanor Hoyt is a sustainability professional excited by data and the intersection of technology and climate solutions. She manages projects for sustainability and resilience strategy in the built environment, including certification under LEED and other rating systems. Hoyt holds an M.S. in Sustainable Building Systems from Northeastern University.

Hobson Lane is the co-founder and CTO of Tangible AI, a company that helps social sector organizations scale impact and enhance their operations using artificial intelligence. His specialty is developing conversational AI and machine intelligence. Lane is also a mentor to Springboard students enrolled in the Data Science Career Track.

Hoyt’s project was completed and published through Springboard’s School of Data (SoDA), which offers mentor-led training in data science, data analytics, data engineering, and machine learning engineering.

Looking ahead

Armed with her newfound knowledge of data science, Hoyt is excited to explore the intersection of technology and climate solutions in her career. She says she is eager to improve her image classification model and knows that if the use of such technologies were to become mainstream, it could vastly improve disaster relief efforts by making damage assessment safer, cheaper, faster, and more accessible. 

“If I had a lot more time and resources to keep working on this, I’d love to look at other data images from other locations to continue to test the model, run the convolutional neural network for longer, and potentially add new convolutional layers,” she said. “I stopped at a certain point because I thought the accuracy was pretty good for the work that I’d put into it, but I think it definitely could be improved upon.”

The National Oceanic and Atmospheric Administration already collects aerial images capturing damage to coastal areas caused by storms, but properly labeled training data is important to develop an accurate machine learning model, and some of this labeling must be done—or at least supervised—by humans. Using data mining techniques on satellite imagery could help unearth many more insights about the extent of the damage by enabling environmental assessors to categorize the nature of the damage. 

“Image-driven data mining provides information about the scope—or order of magnitude—of the damage, the nature of the damage, such as water, wind, sand deposits, and so on,” said Dr. Christopher F. Barnes, an associate professor at the Georgia Institute of Technology, who has developed image-driven data mining methods for hurricane damage assessments. 

“It would be useful in government response in planning staging areas and determining the size of the assessment team.”

“You can tell that these images came from three distinct areas within Houston, and you could see the natural spatial variation of the hurricane damage,” said Hoyt, referencing the diagram above. “Some areas were harder hit, where all the buildings were damaged, and all the buildings [at the top] were totally fine and the ones in the middle there was a little bit of both.”

Building a convolutional neural network to classify images

After identifying pixel statistics and compressed file size as statistically significant proxies for building damage, Hoyt was ready to build a convolutional neural network that would classify the images, using those variables as features the model could learn to recognize.

A convolutional neural network (CNN) is a specialized type of neural network designed for working with 2D image data, although it can also be applied to one-dimensional and three-dimensional data. The CNN is directly inspired by the visual cortex in the human brain, the region responsible for object recognition, where different neurons fire in response to an object depending on its orientation and location.

Central to the CNN is the convolution layer, which slides a filter over an array of image pixels to create a feature map. The feature map summarizes the presence of detected features in the input and their spatial relationship to one another. During training, a CNN learns the specific filters that work best for its particular prediction problem. These steps amount to feature extraction, whereby the network builds a picture of the image data according to its own mathematical rules.
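
A bare-bones version of that sliding-filter operation can be sketched in NumPy (the tiny image and vertical-edge filter below are illustrative only, not part of Hoyt's project):

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide a filter over the image and record its response at each position."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    feature_map = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Element-wise multiply the window by the filter and sum.
            feature_map[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return feature_map

# A tiny image whose right half is bright, and a vertical-edge filter.
image = np.array([[0, 0, 9, 9],
                  [0, 0, 9, 9],
                  [0, 0, 9, 9]], dtype=float)
edge_filter = np.array([[-1, 1],
                        [-1, 1]], dtype=float)

fmap = convolve2d(image, edge_filter)
# The feature map responds strongly only where intensity jumps from dark to bright.
```

The resulting feature map is flat (zero) over uniform regions and peaks along the dark-to-bright boundary, which is exactly the "presence and location of a detected feature" the paragraph above describes.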

Following Lane’s advice, Hoyt started with a very simple architecture: a single convolutional layer with only four filters. She then tuned the model’s hyperparameters, adding and subtracting filters and convolutional layers (the linear operations that apply filters to an input) until she found the best model with the right combination of layers and filters. The winning model had an accuracy of 0.94, which exceeded the accuracy of the logistic regression classifier Hoyt had experimented with earlier. As for the images mislabeled by the model, some were obvious mistakes, while others were images where damage was difficult to determine with the human eye, indicating that the model’s mistakes were “reasonable ones.”
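
A minimal sketch of that starting point, written with tf.keras (the input size, pooling, and other settings here are assumptions for illustration, not Hoyt's actual code):

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

def build_cnn(n_layers=1, n_filters=4, input_shape=(128, 128, 3)):
    """Build a small CNN; n_layers and n_filters are the knobs being tuned."""
    model = models.Sequential([tf.keras.Input(shape=input_shape)])
    for _ in range(n_layers):
        model.add(layers.Conv2D(n_filters, 3, activation="relu"))
        model.add(layers.MaxPooling2D())
    model.add(layers.Flatten())
    model.add(layers.Dense(1, activation="sigmoid"))  # P(image shows damage)
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

# Start simple (one layer, four filters), then add layers/filters and
# compare validation accuracy across variants.
model = build_cnn(n_layers=1, n_filters=4)
```

Hyperparameter tuning then amounts to calling `build_cnn` with different `n_layers` and `n_filters` values, training each variant, and keeping the one with the best validation accuracy.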

Pixel-based image classification uses pixel statistics such as mean, variance, and standard deviation to characterize the contents of an image. In Hoyt’s case, she found that compressed file size also served as a proxy for classifying images, where a larger compressed file size indicates a more complex image, and images with more complexity tended to contain more building damage than those with less complexity and a smaller compressed file size. 
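
A rough sketch of those features in Python (not Hoyt's actual code): here `zlib` stands in for the image compression step, and two synthetic "images" show why a more complex image yields a larger compressed size:

```python
import zlib
import numpy as np

def pixel_features(img: np.ndarray) -> dict:
    """Per-image features: basic pixel statistics plus compressed byte size."""
    compressed = zlib.compress(img.tobytes())
    return {
        "mean": float(img.mean()),
        "std": float(img.std()),
        "compressed_size": len(compressed),
    }

rng = np.random.default_rng(0)
flat = np.full((64, 64), 128, dtype=np.uint8)           # uniform, "simple" image
noisy = rng.integers(0, 256, (64, 64), dtype=np.uint8)  # high-variance, "complex" image

f_flat = pixel_features(flat)
f_noisy = pixel_features(noisy)
# The complex image compresses poorly, so its compressed size is much larger.
```

Real damage imagery would be read from files (e.g. with Pillow), but the principle is the same: visual clutter from debris and flooding raises both pixel variance and compressed file size.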

“One of the surprising things I found in that process was that using that compressed file size feature [as an independent variable] actually allowed the logistic regression to have an accuracy of 0.82,” said Hoyt. The accuracy rate refers to the proportion of images the model classifies correctly when run on the training dataset. 

The first machine learning algorithm Hoyt decided to apply was logistic regression, a machine learning technique borrowed from statistics and a go-to method for binary classification problems (i.e., problems with two class values). The model passes its output through the logistic function, an S-shaped curve that maps any real number to a value between 0 and 1. This number represents the probability that an input belongs to the default class. So if the default class is “damage” and the model returns a value of, say, 0.81, there is a high probability that the input image contains building damage.
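
In code, that S-shaped logistic function is a one-liner; the score `z` below stands for the model's weighted sum of an image's features (the 1.45 is a hypothetical value chosen to land near the 0.81 example above):

```python
import math

def sigmoid(z: float) -> float:
    """Logistic function: maps any real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Large negative scores map near 0, large positive scores near 1.
p_damage = sigmoid(1.45)  # roughly 0.81: likely a "damage" image
label = "damage" if p_damage >= 0.5 else "no_damage"
```

Training a logistic regression model means finding the feature weights that produce well-calibrated scores `z` across the labeled training images.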

Hoyt also plotted latitude and longitude by damage class to show the spatial relationship of the images in the dataset. The resulting map shows the distribution of damaged and undamaged structures in the area most affected by Hurricane Harvey, with damage class indicated by color (blue marking damaged structures).
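
A plot like that can be sketched with matplotlib; the coordinates below are synthetic stand-ins, and the color choices follow the description above:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen (no display needed)
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in coordinates roughly spanning the Houston area.
rng = np.random.default_rng(1)
n = 200
lon = rng.uniform(-95.8, -95.0, n)
lat = rng.uniform(29.5, 30.1, n)
damaged = rng.integers(0, 2, n).astype(bool)  # stand-in damage labels

fig, ax = plt.subplots()
ax.scatter(lon[damaged], lat[damaged], s=8, c="blue", label="damage")
ax.scatter(lon[~damaged], lat[~damaged], s=8, c="orange", label="no_damage")
ax.set_xlabel("Longitude")
ax.set_ylabel("Latitude")
ax.legend()
```

With real data, `lon`, `lat`, and `damaged` would come from the dataset's image metadata rather than a random generator.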

Training a machine-learning algorithm to recognize building damage 

Image classification involves analyzing an input image and outputting a class. Since image classification is a decision-making process, proper labeling of images is absolutely essential. In order to frame her damage detection problem using classification, Hoyt had two choices: 

  • object-based classification

  • pixel-based classification

Object-based classification refers to the more mainstream applications of image classification technology, such as Facebook’s photo-tagging function and Google’s image search feature, in which an algorithm learns to recognize specific shapes such as a bird or a chair or someone’s face. While this method is effective for high-resolution images, the types of satellite images used for activities like crop yield predictions and natural disaster assessments are of a relatively low resolution. 

What’s more, the features of hurricane damage can’t be categorized according to specific object shapes. In this case, the machine learning algorithm uses pixel-based classification to understand—and classify—the characteristics of an image dataset. 

When a computer ‘sees’ an image, it perceives an array of pixel values whose dimensions depend on the size and resolution of the image. Each of these numbers is a value between 0 and 255 describing the pixel intensity at that point, which corresponds to a color. An image classification algorithm analyzes an image by detecting changes in pixel intensity.
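
As a toy illustration of that idea (the pixel values below are made up), a sharp jump in intensity between neighboring pixels is exactly the kind of change a classifier keys on:

```python
import numpy as np

# A toy one-dimensional "image": six pixel intensities in [0, 255].
row = np.array([10, 12, 11, 200, 205, 198], dtype=np.uint8)

# Intensity change between neighboring pixels; a large jump marks an edge.
diff = np.abs(np.diff(row.astype(int)))
edge_at = int(np.argmax(diff))  # index of the sharpest change
```

Here the dark-to-bright jump between the third and fourth pixels dominates; in a 2D image the same differencing happens along both axes.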

Eleanor Hoyt spent five years working as a sustainability consultant helping organizations and communities reach their environmental sustainability goals. Last year, she enrolled in a data science program so she could apply a more data-driven approach to her work. “A lot of my work is thinking about the effects of climate change,” Hoyt said. “How it will impact structures and buildings, and how we can design our buildings to be more resilient to these impacts in the future?”

The latest climate research shows that a warming planet leads to hurricanes with stronger wind speeds, more rain, and a more severe storm surge, all of which add up to more potential destruction. A 2020 study found that the odds of major hurricanes (Category 3, 4, and 5 storms) are increasing due to human-caused global warming.

Many aspects of environmental resilience work are still done using paper-and-pen record-keeping and other manual processes. Take hurricane damage assessment, for example. Typically, environmental assessors will survey a disaster site on foot or in a vehicle, recording the extent of building damage in a devastated area using little more than the naked eye or a camera and their own judgment. Damage assessments help local governments understand what kinds of structures failed and why, as well as serving as the basis for rescue missions, economic aid for survivors, and government funding to rebuild the community.

When it came time for Hoyt to pick her capstone project, she decided to build a machine-learning algorithm to assess building damage after a hurricane has occurred by using publicly available satellite imagery. 

Satellite images are one of the most important tools used by meteorologists—akin to having “eyes in the sky”—and are used to forecast weather and predict natural disasters like tsunamis, tornadoes, and hurricanes. Hoyt was drawn to an image dataset of satellite photographs taken in the aftermath of Hurricane Harvey, which struck Houston, TX, in 2017. The hurricane caused an estimated $125 billion in economic damage, making it the second-costliest natural disaster in U.S. history, according to the National Oceanic and Atmospheric Administration. Hoyt was interested in Hurricane Harvey in particular because it was a landmark event not only for the community it devastated, but also for those in the building sustainability industry who design solutions for worst-case scenarios. “It became a driving factor in a lot of the strategies and the ideas that we then developed with our clients,” she said. “Like most severe hurricane events around here, it becomes a marker we can point towards and say ‘this just happened, it can happen again.’” 

Automating the process of hurricane damage assessment

Hoyt developed an image classification machine learning model that could classify satellite images of post-hurricane carnage as either “damaged” or “not damaged.” Typically, images are labeled by humans, a process prone to error: low-resolution images are particularly blurry and noisy, far from the resolution of common object detection datasets. There are two main benefits to an automated approach over human labeling.

First, unlike humans, algorithms are consistent. So if the algorithm repeatedly mislabels certain images, it can be retrained using the correct labels until it can recognize “damage” versus “no damage” features at a higher accuracy.

Second, automating this process and using satellite imagery for data collection saves time and resources spent manually canvassing an area after a disaster has occurred while also improving safety conditions for the workers who perform the inspections. 

“There are a lot of hazards involved because you’re basically in a warzone at that point,” said Hobson Lane, co-founder and CTO of machine learning startup Tangible AI and Hoyt’s mentor during her Springboard data science bootcamp. “So there’s a safety impact from being able to do the inspection autonomously from aerial photographs.”


Hoyt sourced her images from the IEEE DataPort, a repository for open data. The aerial photographs show flooded areas and buildings in various states of disrepair. The training set on which she trained her model contained 5,000 images labeled as “damaged” and 5,000 images labeled “no damage.”

Having a balanced dataset—that is, an equal ratio of observations for each category—is important for training a model with high accuracy for machine learning classification. Hoyt also set aside a validation dataset of 1,000 “damage” and “no damage” images respectively. A validation set (or holdout set) is a sample of data that is omitted from the training set so that it can be used as an unbiased estimate of model accuracy and the basis for tuning model hyperparameters. 
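
A split like the one described can be sketched in a few lines of Python; the function name and file names below are hypothetical (real paths would come from the IEEE DataPort download):

```python
import random

def balanced_split(paths_by_class, n_train, n_val, seed=42):
    """Return (train, val) lists of (path, label) with equal counts per class."""
    rng = random.Random(seed)
    train, val = [], []
    for label, paths in paths_by_class.items():
        shuffled = paths[:]          # copy so the input lists stay untouched
        rng.shuffle(shuffled)
        train += [(p, label) for p in shuffled[:n_train]]
        val += [(p, label) for p in shuffled[n_train:n_train + n_val]]
    return train, val

# Hypothetical file lists: 6,000 images per class, as in the article's counts.
data = {
    "damage": [f"damage_{i}.jpg" for i in range(6000)],
    "no_damage": [f"no_damage_{i}.jpg" for i in range(6000)],
}
train_set, val_set = balanced_split(data, n_train=5000, n_val=1000)
```

Sampling the same number of images per class keeps both the training and validation sets balanced, and shuffling before the split avoids any ordering bias in the source lists.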

Testing the model against a validation dataset helps avoid overfitting, a modeling error that occurs when a model fits its training data too closely and fails to generalize to new data, resulting in biased classifications.

Examples of “damage” images: