Anna Jeffries

MS Data Science

Visualizing Census Data


[Picture]
Western North Carolina counties (Buncombe, Haywood, Henderson, Madison, McDowell, Rutherford, Yancey). Made in R using ACS data pulled through the Census API, plus R mapping libraries.
In an exercise, I was tasked with using Census data to explore a social science question. I wanted to look at SNAP/food stamp households, and I narrowed my scope to a collection of counties in Western North Carolina. I chose this area because I have in-depth domain knowledge of it, having lived there for the majority of my life.

The cost of living isn't cheap in Western North Carolina and, as we all know, the technical definition of "poor" is crap: there are plenty of people above that line who still need assistance (many of whom are excluded or disqualified from assistance programs for one reason or another). It's a systemic problem, to say the least, but more recently there's been a push to do more for the harder-to-reach rural areas around Asheville, particularly in terms of healthcare access and services. Buncombe County is by far the wealthiest of the counties in its immediate vicinity, and most adjacent counties are quite rural compared with Buncombe (in which the only city of note is Asheville).

Observing the change over time, there are some noticeable differences and interesting tract behaviour that would require further investigation to pin down the precise ZIP codes the tracts cover and the political goings-on. Suffice it to say, the region overall had the fewest of what may be crudely termed "poverty+" (at or above the poverty level) households using food assistance around 2018, preceded by the worst scenario in 2014. This tracks, as Asheville was booming throughout the late 2010s, mainly in the food/beer/service industry, and a lot of people moved in to work at breweries, restaurants, fancy hotels and the like. Tracts further away from the centre (where Asheville is located) were slower to catch up to this trend. Then, not surprisingly, there's a notable *increase* in poverty+ households depending on food assistance when Covid-19 hits. And while Asheville remained open and actually had a lot of cool modified stuff going on to accommodate pandemic mandates, it was a definitive blow to a city whose main industry is tourism. While not everything in WNC revolves around Asheville, a lot does. The cost of living is extraordinarily high because of Asheville's status as a destination city, and those costs didn't change when Covid-19 effectively shut down the area's main economic sector.

This has been a fairly brief and high-level exercise, but there are many ways that one could dive deeper into the data to get much more precise figures and conclusions.

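# packages used below; a Census API key is assumed to be set already (e.g. via census_api_key())
library(tidycensus)  # get_acs()
library(sf)          # rbind() and plotting support for the tract geometries
library(ggplot2)     # ggplot(), geom_sf()

# B22001_001 is the total estimate for table B22001, i.e. all households in the tract
# (the count of households that received SNAP or food stamps is B22001_002)
# B22003_004 is the estimate for households *at or above the poverty level* receiving SNAP or food stamps
# defining a function I can use repeatedly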
get_acs_data <- function(year, counties) {
  data <- get_acs(
    geography = 'tract',
    county = counties,
    state = '37',
    variables = c(snap_total = 'B22001_001', geq_poverty = 'B22003_004'),
    year = year,
    geometry = TRUE,
    progress_bar = FALSE,
    output = 'wide'
  )
  
  return(data)
}

###########################

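years <- 2022
# county FIPS codes: Buncombe (021), Madison (115), Rutherford (161), Yancey (199),
# Haywood (087), McDowell (111), Henderson (089)
counties <- c('021', '115', '161', '199', '087', '111', '089')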

# iterate and compile
bunc.SNAP_geq_Pov_list <- lapply(years, function(year) get_acs_data(year, counties))
bunc.SNAP_geq_Pov <- do.call(rbind, bunc.SNAP_geq_Pov_list)

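# county names: the 4th space-separated word of NAME
# (e.g. "Census Tract 1.01; Buncombe County; North Carolina" gives "Buncombe");
# this only works because every county used here has a one-word name
county_names <- vapply(
  strsplit(bunc.SNAP_geq_Pov$NAME, " "),
  function(words) words[4],
  character(1)
)

# final df
bunc.SNAP_geq_Pov$county <- county_names
# share (0 to 1) of geq_poverty households relative to the B22001_001 total
bunc.SNAP_geq_Pov$percent_geq_pov <- bunc.SNAP_geq_Pov$geq_povertyE / bunc.SNAP_geq_Pov$snap_totalE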

# creating the final plot (pass all_2022 to ggsave() to write it to disk)
all_2022 <- ggplot(bunc.SNAP_geq_Pov, aes(fill = percent_geq_pov)) +
  geom_sf(color = "black", linewidth = 0.5) +  # linewidth replaces size for lines in ggplot2 >= 3.4.0
  theme_minimal() +
  scale_fill_viridis_c(option = "magma", direction = -1) +
  labs(fill = 'Percentage') +
  ggtitle("2022") +
  theme(
    plot.title = element_text(size = 8, face = "bold"),
    panel.grid.major.x = element_blank(),
    panel.grid.minor.x = element_blank(),
    panel.grid.major.y = element_line(),
    panel.grid.minor.y = element_blank(),
    axis.text = element_text(angle = 30, vjust = 1, hjust = 0.8, size = 5),
    legend.position = "right",
    legend.direction = "vertical",
    legend.title = element_text(size = 8),
    legend.text = element_text(size = 8)
  )