Air quality indicators and lung disease across the United States

Final Project: Report
Data Science 1 with R (STAT 301-1)

Author

Cassie Lee

Published

November 29, 2023

Introduction

Air pollution is the presence of sufficient quantities of contaminants in the atmosphere for a duration that is long enough to cause harm to human health (World Health Organization, n.d.). Air pollution mainly enters the body through the lungs, and while it mostly impacts the heart, lungs, and brain, it has the potential to affect other organs as well by traveling through the bloodstream (World Health Organization, n.d.).

In this exploratory data analysis, I explore the relationships between air quality indicators and certain lung diseases in the United States (excluding territories), and how sociodemographic vulnerabilities affect those relationships. The goal of this analysis is to build a deeper understanding of how various air pollutants impact health across the United States.

I queried and downloaded the data at the county level from the CDC National Environmental Public Health Tracking Network interactive data explorer (Centers for Disease Control and Prevention, n.d.). This network was built to centralize environmental health data on the national, state, and county level across the United States (Centers for Disease Control and Prevention, n.d.).

The three central questions I explored in this analysis were:

  1. How are various air and environmental quality indicators related to each other?

  2. What are the most important indicators for determining the prevalence of asthma, cancer, and chronic obstructive pulmonary disease?

  3. How do sociodemographics like age, gender, race, and the social vulnerability index affect the relationships between lung disease and air quality?

Data overview & quality

The air quality indicators I selected include the days over the ozone standard, the days over the PM 2.5 standard, and benzene, formaldehyde, acetaldehyde, carbon tetrachloride, and 1,3-butadiene pollution. The environmental quality indicators I selected include the percent of people living near highways, the percent of public schools near highways, access to parks, and methods of transportation to work (walking, biking, driving alone, carpooling, public transportation, and none).

The indicators I selected for prevalence of lung diseases include the crude prevalence of adult and child asthma, the crude and age adjusted rates of emergency department visits for asthma, age adjusted rates of lung and bronchus cancer, and the crude and age adjusted rates of chronic obstructive pulmonary disease.

The indicators of sociodemographic data I selected include age, gender, race, and the social vulnerability index (Agency for Toxic Substances and Disease Registry, 2023b). I compared sociodemographic characteristics of each county to other counties in the United States and used this to identify counties with comparatively higher percentages of vulnerable groups (Appendix A). I also identified the majority race demographic in each county.

If available, I downloaded all data at the county level for 2018. Crude rates of child asthma and age adjusted rates of lung and bronchus cancer were only available at the state level. I downloaded data for the usual method of transportation to work for the time period of 2017 to 2021.

The final merged dataset includes 42 variables and 3144 observations matching each county and Washington DC. There are 4 identifying variables, 5 factor variables, 32 numeric variables, and one simple features geometry variable for mapping.

With the exception of Rio Arriba County, New Mexico, all counties had complete sociodemographic information; Rio Arriba County was missing data for the social vulnerability index. The prevalence of adult asthma was complete across all observations, but 1402 observations were missing for the prevalence of child asthma and 1789 observations were missing for crude and age adjusted emergency department visits for asthma. The majority of counties did not have information about the days over the ozone and PM 2.5 air quality standards. In addition, 1567 observations were missing data for the percent of public schools near highways. All other variables were either complete or missing at most 1 observation.
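The missing-data summary above amounts to counting empty values per variable. The following is a minimal Python sketch of that tally (the report itself was produced in R and its code is not shown); the table, column names, and values are hypothetical stand-ins for the merged county dataset.

```python
# Hypothetical miniature of the merged county table; the column names
# and values are invented for illustration and are not the real data.
counties = [
    {"county": "A", "adult_asthma": 9.1, "child_asthma": None, "ozone_days": 3},
    {"county": "B", "adult_asthma": 8.4, "child_asthma": 7.2, "ozone_days": None},
    {"county": "C", "adult_asthma": 10.0, "child_asthma": None, "ozone_days": None},
]


def count_missing(rows):
    """Return {column: number of rows whose value is None}."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            # A bool adds as 0 or 1, so this increments only when missing.
            counts[column] = counts.get(column, 0) + (value is None)
    return counts


missing = count_missing(counties)

# List the variables with the most missing observations first.
for column, n in sorted(missing.items(), key=lambda item: -item[1]):
    print(f"{column}: {n} of {len(counties)} missing")
```

Sorting by the count puts the sparsest variables at the top, which is where decisions about excluding or caveating an indicator are most needed.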

Explorations

Indicators of air and environmental quality

To address the first central question, I used univariate analysis to understand the distribution of the indicators of air and environmental quality. I also used bivariate analysis to understand if and how certain indicators were related to each other.

Air quality: ozone and PM 2.5

As seen in Figure 1, the distributions of days over the ozone and particulate matter 2.5 microns or smaller (PM 2.5) standards were both skewed right by counties with many more days over the standard. For both ozone and PM 2.5, most of the counties experienced no more than 20 days over the standard. However, the right skew was considerably stronger for days over the ozone standard than for days over the PM 2.5 standard.

Figure 1: Distribution of days over air quality standards.
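The degree of skew described here can also be put into a number. The sketch below, in Python rather than the R used for the report, computes the moment-based (Fisher-Pearson) skewness coefficient; the report does not say that this statistic was calculated, and the day counts are invented.

```python
def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (population moments).

    Positive values indicate a right (positive) skew, negative values a
    left skew, and 0 a symmetric sample.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical days-over-standard counts: most counties are near zero and
# one county has many exceedance days, which pulls the right tail out.
ozone_days = [0, 0, 1, 2, 2, 3, 5, 8, 40]

print(f"skewness of ozone days: {skewness(ozone_days):.2f}")
```

Comparing this coefficient for the ozone and PM 2.5 columns would give a simple numeric check on which distribution is more strongly skewed.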

Given that the distributions were so heavily skewed right, I was interested to see how the days over the ozone and PM 2.5 standards were distributed across the US. Although most of the data is missing, Figure 2 shows that Southern California had high levels of ozone pollution and Central California had high levels of PM 2.5 pollution. The counties with many days over the PM 2.5 standard align with the California wildfire incident map for 2018 (Cal Fire, n.d.).

Figure 2: Distribution of days over the ozone and PM 2.5 standard across the United States, excluding Hawaii and Alaska.

Figure 3 shows the relationship between the days over the ozone and PM 2.5 standard. There is a positive relationship between the two air quality indicators, indicating that counties with worse ozone pollution generally also had worse PM 2.5 pollution. This is consistent with research about the sources of ozone and PM 2.5 pollution, as they can both originate from nitrogen oxides from power plants, industrial pollution, and automobiles (US EPA, 2015; US EPA, 2016a).

However, a large number of counties reported 0 days over the PM 2.5 standard while having several days over the ozone standard, and several counties reported 0 days over the ozone standard while having several days over the PM 2.5 standard. This is consistent with research showing that each pollutant also has sources that do not produce the other. For example, construction sites, unpaved roads, fields, smokestacks, and fires produce PM 2.5 pollution, but not ozone pollution (US EPA, 2016a).

Figure 3: Relationship between days over the PM 2.5 standard and days over the ozone standard excluding outliers (over 50 days over one or both standards).

Air quality: benzene, formaldehyde, acetaldehyde, carbon tetrachloride, and 1,3-butadiene

The other group of air quality indicators I was interested in were benzene, formaldehyde, acetaldehyde, carbon tetrachloride, and 1,3-butadiene concentrations. Figure 4 shows the distribution of these air pollutants. Benzene, formaldehyde, acetaldehyde, and 1,3-butadiene were skewed right, indicating that there are several counties that have unusually high levels of these pollutants. This is expected, as counties with unusually high industrial pollution would cause this distribution. However, carbon tetrachloride was skewed left, indicating that there were several counties that had unusually low levels of this pollutant. One reason that this distribution may differ from the others is that carbon tetrachloride is not naturally occurring, while the other pollutants are. Thus counties with unusually low levels of carbon tetrachloride may be counties that have never had high levels of carbon tetrachloride exposure, and thus were capable of having extremely low values (Agency for Toxic Substances and Disease Registry, 2019; Agency for Toxic Substances and Disease Registry, 2023a; Agency for Toxic Substances and Disease Registry, n.d.-a; Agency for Toxic Substances and Disease Registry, n.d.-b; National Institute of Environmental Health Sciences, n.d.).

Figure 4: Distribution of five air pollutants.

Then, I explored how these five air pollutants were correlated with each other. Figure 5 shows a correlation matrix of these air pollutants. Formaldehyde and acetaldehyde were highly correlated with each other, while the rest of the pollutants were somewhat or barely correlated with each other. Carbon tetrachloride and benzene were somewhat correlated, and 1,3-butadiene was somewhat correlated with both formaldehyde and acetaldehyde. Benzene was barely correlated with formaldehyde and acetaldehyde. Correlation between two pollutants suggests that they share sources that emit both at the same time.

Figure 5: Correlation matrix of five air pollutants.
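A correlation matrix like the one in Figure 5 is a table of pairwise correlation coefficients. The following Python sketch (the report was written in R) builds one from scratch, assuming Pearson correlation, which the report does not state explicitly; the concentration values are invented.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Return a nested dict with matrix[a][b] = correlation of a and b."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical county-level concentrations (invented, not the CDC data).
pollutants = {
    "formaldehyde": [1.0, 1.5, 2.0, 2.5, 3.0],
    "acetaldehyde": [0.9, 1.4, 2.1, 2.4, 3.1],
    "benzene": [0.5, 0.2, 0.6, 0.3, 0.4],
}

matrix = correlation_matrix(pollutants)
for name, row in matrix.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

With real data, counties missing either value of a pair would have to be dropped before computing each coefficient; this sketch assumes complete columns.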

Air quality: combined

After exploring how these five pollutants were correlated with each other, I wanted to see if they were correlated with ozone and PM 2.5 pollution. Figure 6 shows that ozone and PM 2.5 pollution were not particularly correlated with the other 5 pollutants. This is likely because the five chemicals are nearly strictly from industrial pollution, while ozone and PM 2.5 pollution can have significant non-industrial sources, such as from automobiles.

Figure 6: Correlation matrix of all air quality indicators.

Environmental quality

After bivariate analysis of the air quality and environmental quality indicators, I decided not to move forward with the environmental indicators because they were not particularly predictive of air quality. For example, I had originally suspected that the percent of the population living near a highway would be predictive of ozone levels; however, Figure 7 shows that it was not. This lack of relationship between environmental quality and air quality suggested that continuing to explore indicators of environmental quality would not help me answer my three main questions.

Figure 7: The relationship between the days over the ozone standard and the percent of people living within 150 m of a highway is an example of how environmental quality indicators were not particularly predictive of air quality indicators.

Lung disease and air quality indicators

Once I had an understanding of how the air quality indicators were related to each other, I was interested in seeing how air pollution and various lung diseases were related.

Asthma

Figure 8 shows that although ozone and PM 2.5 pollution are known to aggravate lung diseases such as asthma, there was no particularly clear relationship between emergency department visits for asthma and ozone or PM 2.5 pollution (US EPA, 2015; US EPA, 2016b). It is possible that given better predictions and access to air quality information online, individuals with asthma were better able to avoid long exposures to high ozone and PM 2.5 levels, allowing them to avoid aggravating their asthma.

Figure 8: The crude rate of emergency department visits for asthma per 10 K population as a function of days over the ozone and PM 2.5 standard.

However, Figure 9 shows a clear positive relationship between the prevalence of asthma and exposure to the pollutants formaldehyde and acetaldehyde for both adults and children. Since the relationship between asthma and these two pollutants holds in both childhood and adulthood, I suspect that exposure to these pollutants in childhood can be associated with the development of asthma, which is then carried into adulthood. The relationship between childhood asthma and formaldehyde exposure has been supported by various studies (McGwin et al., 2010). However, there is limited and conflicting evidence about the long term effects of acetaldehyde exposure, so the positive relationship between childhood asthma and acetaldehyde exposure in these graphs may just be a result of the extremely strong correlation between formaldehyde and acetaldehyde.

Figure 9: The prevalence of adult and child asthma (percent of population) as a function of formaldehyde and acetaldehyde concentrations.

Lung and bronchus cancer

To explore how lung and bronchus cancer was associated with the air quality indicators, I used a correlation matrix to identify potentially interesting relationships to explore. Figure 10 shows that the days over the ozone and PM 2.5 standards were not positively correlated with cancer, but it is important to note that there was a lot of missing data for these two indicators. On the other hand, there are relatively strong positive correlations between lung and bronchus cancer and the pollutants formaldehyde, acetaldehyde, and carbon tetrachloride.

Figure 10: Correlation matrix of lung and bronchus cancer and air quality indicators.

Figure 11 visualizes the relationship between the pollutants formaldehyde, acetaldehyde, and carbon tetrachloride and the prevalence of lung and bronchus cancer. As expected, formaldehyde and acetaldehyde had very similar relationships with the prevalence of lung and bronchus cancer. However, it is surprising that the relationship between lung and bronchus cancer and carbon tetrachloride had such a high correlation and a relatively high slope, because this pollutant primarily affects the liver, kidneys, and central nervous system (US EPA, n.d.). Its main carcinogenic properties affect the liver, not the lungs (US EPA, n.d.). Figure 10 shows that carbon tetrachloride was most strongly correlated with benzene; however, benzene is not strongly correlated with cancer risk. Thus, carbon tetrachloride was likely correlated with a different carcinogenic air pollutant, one that affects the respiratory system and was not explored here.

Figure 11: The prevalence of lung and bronchus cancer per 100 K population as a function of the pollutants formaldehyde, acetaldehyde, and carbon tetrachloride.

Chronic obstructive pulmonary disease

Finally, to explore the relationship between chronic obstructive pulmonary disease (COPD) and air quality, I created another correlation matrix to identify potentially interesting relationships. Figure 12 and Figure 13 show that like lung and bronchus cancer, formaldehyde, acetaldehyde, and carbon tetrachloride pollution were strongly correlated with COPD. However, unlike the relationships for lung and bronchus cancer, formaldehyde and acetaldehyde had higher slopes. This is consistent with studies showing that formaldehyde exposure through inhalation increases the risk of COPD (Bentayeb et al., 2015; Malaka & Kodama, 1990). However, the relationship between COPD and acetaldehyde is likely just a result of the strong correlation between formaldehyde and acetaldehyde because acetaldehyde is not known to have chronic health effects.

Figure 12: Correlation matrix of COPD and air quality indicators.

Figure 13: Age adjusted percentage of COPD as a function of the pollutants formaldehyde, acetaldehyde, and carbon tetrachloride.

Sociodemographic effects on lung disease and air quality

To answer the final central question, I incorporated sociodemographic vulnerabilities as a third variable in analyzing the relationships between lung disease and air quality.

Age vulnerability

Given that poor air quality generally affects the young and the old, I explored how the distribution of age affected the relationship between lung disease and air quality. I identified counties with a relatively high proportion of young people or a relatively high proportion of older people as vulnerable.

Although Figure 8 did not show a clear relationship between emergency department visits and days over the ozone and PM 2.5 standards, Figure 14 highlights how counties with a high population of young or old individuals did in fact see an increase in emergency department visits for asthma given poor air quality. For counties that were not vulnerable by age demographics, emergency department visits still did not have a clear relationship with ozone. However, for PM 2.5 pollution, emergency department visits decreased with increasing number of days over the PM 2.5 standard. This may still reflect the tendency of individuals to use air quality forecasts to limit exposure.

Figure 14: The crude rate of emergency department visits for asthma per 10 K population as a function of days over the ozone and PM 2.5 standard, disaggregated by demographic age vulnerability.
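Disaggregating a relationship by vulnerability means fitting the trend separately within each group and comparing the slopes. The Python sketch below (the report was written in R) assumes the trend lines are ordinary least-squares fits, which the report does not state; the field names and values are hypothetical.

```python
def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def slopes_by_group(rows, x_key, y_key, group_key):
    """Fit one slope per group, skipping rows with a missing x or y."""
    groups = {}
    for row in rows:
        if row[x_key] is None or row[y_key] is None:
            continue
        xs, ys = groups.setdefault(row[group_key], ([], []))
        xs.append(row[x_key])
        ys.append(row[y_key])
    return {group: ols_slope(xs, ys) for group, (xs, ys) in groups.items()}


# Hypothetical counties: days over the PM 2.5 standard against asthma
# emergency department visits per 10 K, labelled by age vulnerability.
rows = [
    {"pm25_days": 0, "ed_visits": 40.0, "age_group": "vulnerable"},
    {"pm25_days": 5, "ed_visits": 50.0, "age_group": "vulnerable"},
    {"pm25_days": 10, "ed_visits": 60.0, "age_group": "vulnerable"},
    {"pm25_days": 0, "ed_visits": 50.0, "age_group": "not vulnerable"},
    {"pm25_days": 5, "ed_visits": 45.0, "age_group": "not vulnerable"},
    {"pm25_days": 10, "ed_visits": 40.0, "age_group": "not vulnerable"},
    {"pm25_days": None, "ed_visits": 42.0, "age_group": "not vulnerable"},
]

slopes = slopes_by_group(rows, "pm25_days", "ed_visits", "age_group")
print(slopes)
```

A pooled fit over all six complete rows would have a slope of 0.5, which hides the fact that the two groups trend in opposite directions; this is the pattern the disaggregated figure is meant to expose.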

I was also interested in exploring how vulnerability by age demographics would affect the relationship between the prevalence of cancer and air pollutants. Figure 15 shows how the relationship between lung and bronchus cancer and air pollutants was the same across age vulnerabilities. However, counties that were vulnerable by age demographics generally had a lower prevalence of cancer. This is likely because children typically do not have enough time to develop lung and bronchus cancer at a young age, and people who have had lung and bronchus cancer may not live until older ages, so they would not be included in the population statistics.

Figure 15: The prevalence of lung and bronchus cancer per 100 K population as a function of the pollutants formaldehyde, acetaldehyde, and carbon tetrachloride, disaggregated by demographic age vulnerability.

Figure 16 shows that demographic age vulnerability had a similar effect on the relationship between COPD and air pollutants. However, it seems that for counties that did not have a high population of young and old individuals, the effect of formaldehyde and acetaldehyde pollution on COPD prevalence was greater. This is also likely another effect of how chronic diseases develop and affect age distributions.

Figure 16: Age adjusted percentage of COPD as a function of the pollutants formaldehyde, acetaldehyde, and carbon tetrachloride, disaggregated by demographic age vulnerability.

Gender vulnerability

Given that there are often differences in exposure to environmental hazards between men and women (OECD, 2020), I was interested to see if counties with a relatively higher proportion of women had a different relationship with asthma and air quality than counties with a relatively lower proportion of women. Figure 17 shows that in general, the effect of poor air quality on emergency department visits for asthma was larger for counties with a relatively higher proportion of women. The exception to this was PM 2.5, and this could be due to gendered differences in risk perception for poor air quality (Brown et al., 2021).

Figure 17: The crude rate of emergency department visits for asthma per 10 K population as a function of days over the ozone and PM 2.5 standard and the pollutants formaldehyde and acetaldehyde, disaggregated by gender vulnerability.

For both lung and bronchus cancer and COPD, the effect of gender demographics on the relationship between air quality and disease was generally the opposite of that for asthma. Figure 18 shows that increased formaldehyde and acetaldehyde concentrations had a larger effect on lung and bronchus cancer and COPD for counties with a relatively high proportion of men (lower proportion of women). With the exception of carbon tetrachloride, the gendered differences in the development of chronic lung diseases in response to air pollution may reflect gendered differences in exposure duration, possibly through occupation (OECD, 2020).

Figure 18: The prevalence of lung and bronchus cancer per 100 K population and the age adjusted percentage of COPD as a function of the pollutants formaldehyde, acetaldehyde, and carbon tetrachloride, disaggregated by gender vulnerability.

Race

I was interested in how the predominant race in each county affected the relationship between lung disease and air quality. However, there were so few counties that were predominantly American Indian/Alaskan Native or Asian/Pacific Islander that comparisons could only be made between predominantly Black and white counties. Even then, limited variability within each predominant race category greatly limited the ability to draw conclusions about the effect of predominant race on the relationship between lung disease and air quality.

Figure 19 shows that predominantly Black counties had higher levels of both formaldehyde and acetaldehyde pollution and a higher prevalence of asthma; however, the effect of increased air pollution on the prevalence of asthma was either equal to or less than in predominantly white counties. This may be a limitation of lower variation in pollution levels in counties that were predominantly Black.

Figure 19: The prevalence of adult and child asthma (percent of population) as a function of formaldehyde and acetaldehyde concentrations, disaggregated by predominant race.

Unexpectedly, Figure 20 shows that in predominantly Black counties, as the level of formaldehyde and acetaldehyde pollution increased, the prevalence of lung and bronchus cancer decreased. Even though the confidence interval for predominantly Black counties is fairly large, it still clearly shows a different relationship than for predominantly white counties. For carbon tetrachloride, however, all predominant races had very similar effects on the relationship between cancer and pollution, just with varying levels of cancer prevalence overall.

Figure 20: The prevalence of lung and bronchus cancer per 100 K population as a function of the pollutants formaldehyde, acetaldehyde, and carbon tetrachloride, disaggregated by predominant race.

Figure 21 shows that between predominantly Black and white counties, the predominant race did not particularly affect the relationship between COPD and the pollutants formaldehyde and acetaldehyde. The relationship between COPD and carbon tetrachloride in predominantly Black counties was nearly opposite of the relationship for predominantly white counties, but I suspect this is due to the limited variability in the level of carbon tetrachloride pollution measured in predominantly Black counties.

Figure 21: Age adjusted percentage of COPD as a function of the pollutants formaldehyde, acetaldehyde, and carbon tetrachloride, disaggregated by predominant race.

Social Vulnerability Index

Initially, I was interested in exploring how the social vulnerability index (SVI) of each county affected the relationship between lung disease and air quality. However, upon exploration, there was little difference in the relationship between lung disease and air quality for counties with a high SVI score compared to a low SVI score. Figure 22 shows that although counties with high social vulnerability had a slight negative slope for the prevalence of asthma and an overall higher prevalence, the relationship between increased air pollution and the prevalence of asthma was not much different than for counties with low social vulnerability. Similarly, although the prevalence of COPD was higher for counties with high social vulnerability, the slope was not much different from counties with low social vulnerability. For lung and bronchus cancer, social vulnerability had a negligible effect on the relationship between the prevalence of cancer and formaldehyde pollution.

I suspect that part of the reason SVI does not greatly affect the relationship between lung disease and air pollution is that SVI was designed to capture the social vulnerability in responding to emergency events such as hurricanes, disease outbreaks, or exposure to dangerous chemicals, whereas my explorations focus more on the chronic effects of long term air pollution (Agency for Toxic Substances and Disease Registry, 2023b).

Figure 22: The prevalence of adult asthma (percent of population), lung and bronchus cancer per 100 K population, and COPD (age adjusted percentage of population) as a function of formaldehyde pollution, disaggregated by SVI.

Conclusions

In this exploratory data analysis, I visualized the distributions of air quality indicators and their relationships with lung disease across the United States.

The distributions of the air quality indicators were all skewed right, with the exception of carbon tetrachloride. I suspect that the extremely low values for carbon tetrachloride concentrations could reflect areas that have not had carbon tetrachloride pollution before, as this chemical does not occur naturally. Additionally, analysis showed that the two categories of air quality indicators (days over the ozone and PM 2.5 standards, and the five industrial pollutants benzene, formaldehyde, acetaldehyde, carbon tetrachloride, and 1,3-butadiene) were not highly correlated with each other. However, correlations between air quality indicators within each category were relatively high.

Explorations of the relationships between lung disease and air quality showed that formaldehyde, acetaldehyde, and carbon tetrachloride were generally most predictive of lung disease. However, since formaldehyde and acetaldehyde were very highly correlated, and exposure to acetaldehyde is not known to have chronic health effects, formaldehyde is likely confounding the relationship between acetaldehyde and lung disease. The prevalence of both adult and child asthma was associated with formaldehyde and acetaldehyde, and emergency department visits for asthma were also associated with the days over the ozone and PM 2.5 standards for counties that had a high population of younger and older people. Lung and bronchus cancer was associated with formaldehyde, acetaldehyde, and carbon tetrachloride, and counties that had a relatively higher proportion of men experienced a greater effect of pollution on the prevalence of cancer. Chronic obstructive pulmonary disease was also most strongly associated with formaldehyde, acetaldehyde, and carbon tetrachloride, and similar to lung and bronchus cancer, counties that had a relatively higher proportion of men experienced a greater effect of pollution on the prevalence of COPD.

Finally, although I disaggregated the relationship between air pollution and lung disease by the predominant race in each county, it is difficult to draw conclusions because there were very few counties in each category, with the exception of predominantly Black and white counties. Even then, there was often limited variation in the exposure to pollution in predominantly Black counties because they tended to cluster around the higher end of pollution exposure.

One further question this exploratory data analysis raises is which other air pollutant is highly correlated with carbon tetrachloride. Although the relationship between lung and bronchus cancer and carbon tetrachloride is strong and held even when I disaggregated by various sociodemographic vulnerabilities, carbon tetrachloride is not known to cause cancer in the respiratory system, but rather liver cancer. Since carbon tetrachloride was not particularly associated with another air pollutant explored here, there may be another air pollutant, correlated with carbon tetrachloride, that more directly contributes to the increased prevalence of lung and bronchus cancer.

Additionally, future work could place a greater emphasis on mapping the exposures and health outcomes across the United States. Mapping would allow for a greater understanding of the spatial distributions of air pollution and health outcomes. For example, mapping allows us to identify places with extreme industrial pollution, like Louisiana’s Cancer Alley (Singer, 2011). Mapping also makes it possible to put results into greater historical and political context to understand how and why air pollution and health outcomes concentrate in certain counties, allowing for a better application of the data exploration to efforts in environmental justice.

Sources

Agency for Toxic Substances and Disease Registry. (n.d.-a). Public Health Statement for 1,3-Butadiene. Retrieved November 28, 2023, from https://wwwn.cdc.gov/TSP/PHS/PHS.aspx?phsid=457&toxid=81#bookmark02

Agency for Toxic Substances and Disease Registry. (n.d.-b). Public Health Statement for Formaldehyde. Retrieved November 28, 2023, from https://wwwn.cdc.gov/TSP/PHS/PHS.aspx?phsid=218&toxid=39

Agency for Toxic Substances and Disease Registry. (2019, January 3). Benzene. https://www.atsdr.cdc.gov/sites/toxzine/benzene_toxzine.html

Agency for Toxic Substances and Disease Registry. (2023a, May 25). Carbon Tetrachloride Toxicity: Where Is Carbon Tetrachloride Found? https://www.atsdr.cdc.gov/csem/carbon-tetrachloride/where_found.html

Agency for Toxic Substances and Disease Registry. (2023b, July 12). CDC/ATSDR Social Vulnerability Index (SVI). https://www.atsdr.cdc.gov/placeandhealth/svi/index.html

Bentayeb, M., Norback, D., Bednarek, M., Bernard, A., Cai, G., Cerrai, S., Eleftheriou, K. K., Gratziou, C., Holst, G. J., Lavaud, F., Nasilowski, J., Sestini, P., Sarno, G., Sigsgaard, T., Wieslander, G., Zielinski, J., Viegi, G., & Annesi-Maesano, I. (2015). Indoor air quality, ventilation and respiratory health in elderly residents living in nursing homes in Europe. European Respiratory Journal, 45(5), 1228–1238. https://doi.org/10.1183/09031936.00082414

Brown, G. D., Largey, A., & McMullan, C. (2021). The impact of gender on risk perception: Implications for EU member states’ national risk assessment processes. International Journal of Disaster Risk Reduction, 63, 102452. https://doi.org/10.1016/j.ijdrr.2021.102452

Cal Fire. (n.d.). 2018 Fire Season Incident Archive. Retrieved November 28, 2023, from https://www.fire.ca.gov/incidents/2018

Centers for Disease Control and Prevention. (n.d.). National Environmental Public Health Tracking Network. Retrieved November 28, 2023, from https://ephtracking.cdc.gov

Malaka, T., & Kodama, A. M. (1990). Respiratory Health of Plywood Workers Occupationally Exposed to Formaldehyde. Archives of Environmental Health: An International Journal, 45(5), 288–294. https://doi.org/10.1080/00039896.1990.10118748

McGwin, G., Lienert, J., & Kennedy, J. I. (2010). Formaldehyde Exposure and Asthma in Children: A Systematic Review. Environmental Health Perspectives, 118(3), 313–317. https://doi.org/10.1289/ehp.0901143

National Institute of Environmental Health Sciences. (n.d.). Acetaldehyde (Report on Carcinogens, Fifteenth Edition). Retrieved November 28, 2023, from https://ntp.niehs.nih.gov/sites/default/files/ntp/roc/content/profiles/acetaldehyde.pdf

OECD. (2020). Gender and environmental statistics: Exploring available data and developing new evidence. https://www.oecd.org/environment/brochure-gender-and-environmental-statistics.pdf

Singer, M. (2011). Down Cancer Alley: The Lived Experience of Health and Environmental Suffering in Louisiana’s Chemical Corridor. Medical Anthropology Quarterly, 25(2), 141–163. https://doi.org/10.1111/j.1548-1387.2011.01154.x

US EPA. (n.d.). Carbon tetrachloride. Retrieved November 28, 2023, from https://www.epa.gov/sites/default/files/2016-09/documents/carbon-tetrachloride.pdf

US EPA. (2015, May 29). Ground-level Ozone Basics [Overviews and Factsheets]. https://www.epa.gov/ground-level-ozone-pollution/ground-level-ozone-basics

US EPA. (2016a, April 19). Particulate Matter (PM) Basics [Overviews and Factsheets]. https://www.epa.gov/pm-pollution/particulate-matter-pm-basics

US EPA. (2016b, April 26). Health and Environmental Effects of Particulate Matter (PM) [Overviews and Factsheets]. https://www.epa.gov/pm-pollution/health-and-environmental-effects-particulate-matter-pm

World Health Organization. (n.d.). Air quality and health. Retrieved November 28, 2023, from https://www.who.int/teams/environment-climate-change-and-health/air-quality-and-health/health-impacts

Appendix

A. Identifying sociodemographic vulnerabilities

The sociodemographic data downloaded from the National Environmental Public Health Tracking Network contained demographic data by percentages, rather than specific sociodemographic vulnerabilities such as a high proportion of young or old individuals.

I identified counties with a particularly high percentage of young or old people by creating histograms of the percentage of each county's population that is under 19 years old or over 65 years old. Since the distributions looked fairly normal, I identified counties where the percentage in a given age group was more than one standard deviation above the mean. I created a category to identify whether or not the counties had a high population of vulnerable people. Figure 23 visualizes how counties were identified as vulnerable by age demographics. When joining this data with the rest of the data, I collapsed the vulnerability variable to consider a county vulnerable if it had a high percentage of young or old people.

Figure 23: Distribution of age demographics across counties with an emphasis on demographic age vulnerability.
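The age rule described above can be sketched as follows, in Python rather than the R used for the report. The percentages are invented, the strict "greater than" comparison is an assumption, and the sample standard deviation is used because that is what R's `sd()` computes.

```python
from statistics import mean, stdev


def high_flags(values):
    """Flag values more than one sample standard deviation above the mean."""
    cutoff = mean(values) + stdev(values)
    return [v > cutoff for v in values]  # strict ">" is an assumption


# Hypothetical percentages for six counties (invented, not the CDC data).
pct_under_19 = [22.0, 24.0, 23.0, 25.0, 35.0, 23.5]
pct_over_65 = [15.0, 30.0, 16.0, 17.0, 14.0, 16.5]

young = high_flags(pct_under_19)
old = high_flags(pct_over_65)

# Collapse the two flags: a county counts as vulnerable by age if it has a
# high share of young people OR a high share of older people.
age_vulnerable = [y or o for y, o in zip(young, old)]
print(age_vulnerable)
```

Because the cutoff is relative to the national distribution, the flag marks counties that are unusual compared with other counties, not counties above any fixed percentage.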

Figure 24 shows that the distribution of the percentage of women in each county is fairly narrow; however, there are several counties that have a particularly low or high percentage of women. To identify counties with a low or high percentage of women, I identified counties where the percentage of women was more than one standard deviation above or below the mean.

Figure 24: Distribution of the percentage of women across counties with an emphasis on gender vulnerability.

Figure 25 shows the distribution of the predominant race in each county by percentage.

Figure 25: Distribution of the predominant race in each county.
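One simple way to label the predominant race is to take the group with the largest percentage in each county. The Python sketch below (the report was written in R) assumes that "predominant" means the largest share rather than a share above 50%, which the report does not specify; the column names and values are hypothetical.

```python
# Hypothetical column names for the race percentages in each county.
RACE_COLUMNS = (
    "white",
    "black",
    "asian_pacific_islander",
    "american_indian_alaskan_native",
)

# Invented percentages for three counties (not the CDC data).
counties = [
    {"county": "A", "white": 71.0, "black": 20.0,
     "asian_pacific_islander": 6.0, "american_indian_alaskan_native": 3.0},
    {"county": "B", "white": 30.0, "black": 62.0,
     "asian_pacific_islander": 5.0, "american_indian_alaskan_native": 3.0},
    {"county": "C", "white": 25.0, "black": 5.0,
     "asian_pacific_islander": 10.0, "american_indian_alaskan_native": 60.0},
]


def predominant_race(row):
    """Return the race column with the largest percentage in this county."""
    shares = {race: row[race] for race in RACE_COLUMNS}
    return max(shares, key=shares.get)


predominant = {row["county"]: predominant_race(row) for row in counties}
print(predominant)
```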

The social vulnerability index identifies the relative vulnerability of each county in the United States when it comes to preparing for and responding to hazardous events. The index ranges from 0 to 1, with 1 being the most vulnerable. Figure 26 shows that the index is approximately uniformly distributed, so I identified the top 25% of counties as vulnerable.

Figure 26: Distribution of social vulnerability index with an emphasis on vulnerability.
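Flagging the top 25% of counties can be done with a third-quartile cutoff. The Python sketch below (the report was written in R) assumes a cutoff computed from the data; the report does not say whether it used a quantile or a fixed threshold such as 0.75, and the index values are invented.

```python
from statistics import quantiles

# Hypothetical SVI scores for eight counties (invented values in [0, 1]).
svi = [0.3, 0.8, 0.1, 0.6, 0.5, 0.2, 0.7, 0.4]

# quantiles(..., n=4) returns the three quartile cut points; the last one
# is the 75th percentile. method="inclusive" interpolates the same way as
# the default of R's quantile() function.
third_quartile = quantiles(svi, n=4, method="inclusive")[2]

# Counties strictly above the cutoff form the top quarter.
svi_vulnerable = [score > third_quartile for score in svi]

print(f"cutoff: {third_quartile:.3f}")
print(svi_vulnerable)
```

Because the index is roughly uniform on 0 to 1, a data-driven cutoff and a fixed cutoff of 0.75 should select nearly the same counties on the full dataset.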