Severe weather effects - U.S.


Case Study


Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.

This project explores the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database to answer two basic questions about severe weather events:

  1. Which types of severe weather events are most harmful with respect to population health in the U.S.?

  2. Which types of severe weather events have the greatest economic impact?

Data Processing


Packages

library(tidyverse)
library(tidyr)
library(dplyr)
library(gt)

Read data

Since this is a .csv.bz2 file, we can read it directly instead of decompressing it first.

storm_data <- read.csv("D:/Education/R/Data/JH_C5_week2/repdata_data_StormData.csv.bz2", header = TRUE)
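
As a quick check that read.csv() handles bzip2 compression on its own, the sketch below writes a tiny compressed CSV to a temporary file and reads it back. The toy columns and the temporary path are illustrative assumptions, not part of the storm data.

```r
# Toy data standing in for the storm file (illustrative values only)
toy <- data.frame(EVTYPE = c("TORNADO", "FLOOD"),
                  FATALITIES = c(3, 1))

# Write it as a bzip2-compressed CSV in a temporary location
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "wt")
write.csv(toy, con, row.names = FALSE)
close(con)

# read.csv() opens the path with file(), which detects the compression
toy_back <- read.csv(tmp, header = TRUE)
print(toy_back)
```

The same call works on the real file because the compression is detected from the file contents when it is opened for reading.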

Data types

Let’s check that the columns we need are stored as types we can do arithmetic on.

str(storm_data)
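
str() prints every column. A narrower check on just the columns used later might look like the sketch below. It runs on a small made-up data frame, so the values are assumptions, but the column names match those in the storm data.

```r
# Invented rows; only the column names mirror the storm data
toy <- data.frame(
        EVTYPE = c("TORNADO", "FLOOD"),
        FATALITIES = c(3, 0),
        INJURIES = c(12, 1),
        PROPDMG = c(2.5, 10),
        PROPDMGEXP = c("K", "M"),
        stringsAsFactors = FALSE
)

# Columns we intend to sum must be numeric
count_cols <- c("FATALITIES", "INJURIES", "PROPDMG")
is_num <- vapply(toy[count_cols], is.numeric, logical(1))
print(is_num)

# Stop early if any of them is not
stopifnot(all(is_num))
```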

Impact on health


The first goal of our study is to discover the storm events that are most harmful to population health, so we’ll need to

  • Group the data by storm events and sum the effects caused to the population in fatalities and injuries due to each one

  • Rank the events from most to least impactful in order to answer our question

Aggregate

  • aggregate() will allow us to group by EVTYPE and sum

  • Separate fatalities and injuries into separate data frames

  • Order the totals in descending order and choose the 10 most costly events to show in the results section

impact_fatal <- storm_data |>
        aggregate(FATALITIES~EVTYPE, sum) 
impact_fatal <- impact_fatal[order(impact_fatal$FATALITIES,decreasing = TRUE),]

impact_injured <- storm_data |>
        aggregate(INJURIES~EVTYPE,sum)
impact_injured <- impact_injured[order(impact_injured$INJURIES,decreasing = TRUE),]
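
To see what the aggregate-then-order step does, here is a minimal sketch on a made-up data frame (the rows are invented for illustration). It uses the formula interface with an explicit data argument, which should behave the same way as the piped form.

```r
# Invented rows: two event types appear more than once
toy <- data.frame(
        EVTYPE = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD"),
        FATALITIES = c(5, 2, 7, 4, 1)
)

# Sum fatalities within each event type
toy_fatal <- aggregate(FATALITIES ~ EVTYPE, data = toy, FUN = sum)

# Sort so the deadliest event type comes first
toy_fatal <- toy_fatal[order(toy_fatal$FATALITIES, decreasing = TRUE), ]
print(toy_fatal)
```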

Tables

This section presents the data as tables showing only the 10 most impactful events, that is, the events with the greatest effect on human health.

Fatalities

table_fatal <- impact_fatal[1:10,] |>
        gt() |> 
        tab_header( title = md("**Number of Fatalities per Event**"),
                    subtitle = "10 most impactful events") |> 
        tab_options(table.align = "left", table.width = pct(50))

Injuries

table_injured <- impact_injured[1:10,] |>
        gt() |> 
        tab_header( title = md("**Number of Injuries per Event**"),
                    subtitle = "10 most impactful events") |> 
        tab_options(table.align = "left", table.width = pct(50))

Combine the two tables

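# Combine the top-10 data frames (not the gt objects) side by side
tables <- data.frame(fatal = impact_fatal[1:10,], injured = impact_injured[1:10,], row.names = NULL)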
tables |>
        gt() |>
        cols_label(
                fatal.EVTYPE = md("**Event**"),
                fatal.FATALITIES = md("**Fatalities**"),
                injured.EVTYPE = md("**Event**"),
                injured.INJURIES = md("**Injuries**")
                 ) |> 
        tab_header(title= md("**Event Type and Effect on Population of the U.S.**"),
                   subtitle = "10 most impactful events")
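
The column names passed to cols_label() come from how data.frame() prefixes the columns of each named argument. A minimal sketch with invented rows shows the resulting names:

```r
# Two small data frames with invented values
toy_fatal <- data.frame(EVTYPE = c("TORNADO", "HEAT"),
                        FATALITIES = c(12, 4))
toy_injured <- data.frame(EVTYPE = c("TORNADO", "FLOOD"),
                          INJURIES = c(90, 30))

# Naming the arguments prefixes every column with "<name>."
toy_tables <- data.frame(fatal = toy_fatal,
                         injured = toy_injured,
                         row.names = NULL)
print(names(toy_tables))
```

The rows are paired by position only, so row 1 holds the top fatality event next to the top injury event even when the two events differ.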

Economic Consequences


As with fatalities and injuries, let’s calculate the damage storms cause to property and crops. Before we aggregate, note that the relevant data is spread across 5 columns:

colnames(storm_data[c(8,25:28)])
[1] "EVTYPE"     "PROPDMG"    "PROPDMGEXP" "CROPDMG"    "CROPDMGEXP"

Multipliers

  • The documentation implies that the “..EXP” columns contain the abbreviations “K”, “M”, and “B”

  • Verify that with

unique(storm_data$PROPDMGEXP)
 [1] "K" "M" ""  "B" "m" "+" "0" "5" "6" "?" "4" "2" "3" "h" "7" "H" "-" "1" "8"

  • The “..EXP” columns are multipliers for the values in the “..DMG” columns

  • K stands for thousands, M for millions, and B for billions; we’ll ignore the other codes

  • We need to convert the “..DMG” columns to dollar amounts before we aggregate them
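
Before discarding the other codes, it can help to see how often each one occurs. The sketch below uses an invented vector of codes; the real counts would come from storm_data$PROPDMGEXP.

```r
# Invented multiplier codes, including some that are not K, M or B
toy_exp <- c("K", "K", "M", "", "B", "m", "+", "K", "5", "")

# Frequency of each code, most common first
code_counts <- sort(table(toy_exp), decreasing = TRUE)
print(code_counts)

# Share of rows whose code would be ignored
recognised <- toy_exp %in% c("K", "M", "B")
share_ignored <- mean(!recognised)
print(share_ignored)
```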

Convert damage costs columns

We multiply each “…DMG” value by the multiplier its “…EXP” code stands for, which gives dollar amounts we can sum.

storm_data <- storm_data |> 
        mutate(
                # Scale each damage figure by its multiplier code; codes other
                # than K, M and B are ignored by assigning a cost of 0
                PROPDMG_COST = case_when(
                        PROPDMGEXP == "K" ~ PROPDMG * 1e3,
                        PROPDMGEXP == "M" ~ PROPDMG * 1e6,
                        PROPDMGEXP == "B" ~ PROPDMG * 1e9,
                        TRUE ~ 0),
                CROPDMG_COST = case_when(
                        CROPDMGEXP == "K" ~ CROPDMG * 1e3,
                        CROPDMGEXP == "M" ~ CROPDMG * 1e6,
                        CROPDMGEXP == "B" ~ CROPDMG * 1e9,
                        TRUE ~ 0)
        )

Rather than ignoring the remaining codes, they could be mapped to powers of ten, along these lines (not run here):

#storm.selected$PROPDMGEXP[(storm.selected$PROPDMGEXP == "2") | (storm.selected$PROPDMGEXP == "h") | (storm.selected$PROPDMGEXP == "H")] <- 10^2
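
Another way to express the K/M/B mapping is a named lookup vector. This is a sketch on invented values, not the code used for the results. Codes missing from the lookup get a multiplier of 0, which is one reading of "ignore the rest".

```r
# Multiplier for each recognised code
multipliers <- c(K = 1e3, M = 1e6, B = 1e9)

# Invented rows, including two codes that are not in the lookup
toy <- data.frame(
        PROPDMG = c(25, 2.5, 1, 10, 3),
        PROPDMGEXP = c("K", "M", "B", "", "+"),
        stringsAsFactors = FALSE
)

# Look up each code by name; unknown codes come back as NA
mult <- unname(multipliers[toy$PROPDMGEXP])
mult[is.na(mult)] <- 0

toy$PROPDMG_COST <- toy$PROPDMG * mult
print(toy)
```

Extending the mapping (for example to lowercase codes) would then only mean adding entries to the hypothetical multipliers vector.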

Aggregate

So let’s group and sum the costs to property and crops for each event:

  • aggregate() will group by EVTYPE and sum

  • Order the totals in descending order and choose the 10 most costly events to show in the results section

impact_prop <- storm_data |>
        aggregate(PROPDMG_COST~EVTYPE, sum)
impact_prop <- impact_prop[order(impact_prop$PROPDMG_COST,decreasing = TRUE),]


impact_crop <- storm_data |>
        aggregate(CROPDMG_COST~EVTYPE,sum)
impact_crop <- impact_crop[order(impact_crop$CROPDMG_COST,decreasing = TRUE),]
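
If a single ranking across both kinds of damage is wanted, the two cost columns could be summed per event and then added together. The report itself ranks property and crop damage separately. The sketch below uses invented numbers, and TOTAL_COST is a hypothetical column name.

```r
# Invented costs in dollars
toy <- data.frame(
        EVTYPE = c("FLOOD", "DROUGHT", "FLOOD", "HAIL"),
        PROPDMG_COST = c(5e6, 1e5, 2e6, 3e6),
        CROPDMG_COST = c(1e6, 9e6, 0, 5e5)
)

# cbind() on the left-hand side sums both columns in one call
toy_cost <- aggregate(cbind(PROPDMG_COST, CROPDMG_COST) ~ EVTYPE,
                      data = toy, FUN = sum)

# Combined cost per event type, largest first
toy_cost$TOTAL_COST <- toy_cost$PROPDMG_COST + toy_cost$CROPDMG_COST
toy_cost <- toy_cost[order(toy_cost$TOTAL_COST, decreasing = TRUE), ]
print(toy_cost)
```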

Results


Question 1

  1. Across the United States, which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health?

Answer

Tornadoes appear to be the most harmful with respect to population health, with 5633 fatalities and 91346 injuries.

Plot

figure1 <- ggplot(impact_fatal[1:5, ],
                 aes(x= reorder(EVTYPE, - FATALITIES),
                     y= FATALITIES,
                     fill = EVTYPE))
figure1 + geom_col(show.legend = FALSE, width = 0.8, color="black") +
        coord_flip() +
        labs(x= "Weather Event Type",
             y= "Number of Fatalities") +
        theme_bw()

Tables

Fatalities and Injuries per Event

Event Type and Effect on Population of the U.S.

10 most impactful events

Event           Fatalities   Event               Injuries
TORNADO               5633   TORNADO                91346
EXCESSIVE HEAT        1903   TSTM WIND               6957
FLASH FLOOD            978   FLOOD                   6789
HEAT                   937   EXCESSIVE HEAT          6525
LIGHTNING              816   LIGHTNING               5230
TSTM WIND              504   HEAT                    2100
FLOOD                  470   ICE STORM               1975
RIP CURRENT            368   FLASH FLOOD             1777
HIGH WIND              248   THUNDERSTORM WIND       1488
AVALANCHE              224   HAIL                    1361

Question 2

  2. Across the United States, which types of events have the greatest economic consequences?

Answer

Floods appear to be the most costly in property damage, at about $144.66 B, while drought causes the most crop damage, at about $13.97 B.

Plot

figure2 <- ggplot(impact_prop[1:5, ],
                 aes(x= reorder(EVTYPE, - PROPDMG_COST),
                     y= PROPDMG_COST,
                     fill = EVTYPE))
figure2 + geom_col(show.legend = FALSE, width = 0.8, color="black") +
        coord_flip() +
        labs(x= "Weather Event Type",
             y= "Property Damage (USD)") +
        theme_bw()

Tables

Damage to Property

Property Damage Cost per Event

10 most impactful events

Event               Property damage (USD)
FLOOD               144.66B
HURRICANE/TYPHOON    69.31B
TORNADO              56.93B
STORM SURGE          43.32B
FLASH FLOOD          16.14B
HAIL                 15.73B
HURRICANE            11.87B
TROPICAL STORM        7.70B
WINTER STORM          6.69B
HIGH WIND             5.27B

Damage to Crops

Crop Damage Cost per Event

10 most impactful events

Event               Crop damage (USD)
DROUGHT              13.97B
FLOOD                 5.66B
RIVER FLOOD           5.03B
ICE STORM             5.02B
HAIL                  3.03B
HURRICANE             2.74B
HURRICANE/TYPHOON     2.61B
FLASH FLOOD           1.42B
EXTREME COLD          1.29B
FROST/FREEZE          1.09B
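
The tables show costs in billions with a "B" suffix. The code that produced that formatting is not shown in this report. One possible way to get it is the hypothetical helper below, shown on made-up dollar amounts.

```r
# Hypothetical helper: dollars -> billions with two decimals and a "B" suffix
fmt_billions <- function(x) sprintf("%.2fB", x / 1e9)

# Made-up amounts in dollars
toy_costs <- c(2.5e9, 1.234e10)
toy_labels <- fmt_billions(toy_costs)
print(toy_labels)
```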