A Crowdsourced Hail Dataset: Potential, Biases, and Inaccuracies

Date
2013-12-01
Author
Pehoski, Joseph Robert
Department
Mathematics
Advisor(s)
Kyle Swanson
Abstract
Hail is a substantial severe weather hazard in the USA, with significant damage to property and crops occurring annually. Traditional methods of forecasting hail size have limited accuracy, and despite improvements in remote sensing of precipitation, the fall characteristics of hail make its quantification imprecise. Research into hail is ongoing, but traditional hail datasets have known biases and low spatiotemporal resolution. The increased usage of smartphones creates the opportunity to use a crowdsourced dataset provided by the Precipitation Identification Near the Ground (PING) program, developed by the National Severe Storms Laboratory. PING data are compared to approximate ground truth in the form of preliminary Storm Prediction Center (SPC) hail reports and severe warning polygons issued by the National Weather Service (NWS). Biases and inaccuracies in the dataset are also explored through exploratory data analysis. While PING reports did not suffer from biases based on time of day or day of week, the locations of PING reports were found to be heavily biased towards areas of high population density compared to SPC reports. Skill scores of PING reports, compared to SPC reports, were low, with a remarkably high False Alarm Rate (FAR), indicating that false reports are a problem in the PING dataset. Comparing PING reports to severe warning polygons did not substantially improve the skill scores. The low number of severe PING reports prevented any meaningful analysis of size accuracy. While the number of SPC reports was mostly correlated with the number of warning polygons issued by each Weather Forecast Office, the number of PING reports was not well correlated, with an anomalously high number of reports in the Oklahoma City region. The inaccuracy of PING reports and their strong population bias suggest that the PING hail database may not have high utility and should only be used in conjunction with other databases in order to ensure quality.
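The abstract does not state which skill scores were computed or how PING reports were matched to SPC reports. As a minimal sketch only, the following assumes the standard 2x2 contingency-table scores used in forecast verification: probability of detection (POD), false alarm ratio (taken here as the meaning of FAR) and critical success index (CSI). The function name and the counts are hypothetical and are not taken from the thesis.

```python
def skill_scores(hits, misses, false_alarms):
    """Contingency-table skill scores (assumed definitions, not the thesis's own).

    hits         -- crowdsourced reports matched by a reference report
    misses       -- reference reports with no matching crowdsourced report
    false_alarms -- crowdsourced reports with no matching reference report
    """
    def ratio(numerator, denominator):
        # Return NaN rather than raising when a score is undefined.
        return numerator / denominator if denominator else float("nan")

    return {
        "POD": ratio(hits, hits + misses),
        "FAR": ratio(false_alarms, hits + false_alarms),
        "CSI": ratio(hits, hits + misses + false_alarms),
    }


# Illustrative counts only; these numbers are made up.
scores = skill_scores(hits=20, misses=30, false_alarms=80)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Under these definitions, a FAR close to 1 means that most crowdsourced reports have no matching reference report, which is the pattern the abstract describes for PING.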
Subject
Bias
Crowdsourced
Hail
mPING
Non-Meteorological
PING
Permanent Link
http://digital.library.wisc.edu/1793/93015
Type
thesis
