Policy Brief

Gaps in COVID-19 Race/Ethnicity Data in California

December 20, 2020

The spread of SARS-CoV-2 has been accompanied by troubling disparities in infections, hospitalizations, and deaths. This issue brief draws on lessons from the HIV/AIDS epidemic to underscore the need for comprehensive data in designing and carrying out appropriate mitigation measures. It compares the two epidemics, identifies gaps in race/ethnicity data in routine COVID-19 case reporting, and explores risk factors for exposure that may be driving disparities in COVID-19 outcomes.


Public health efforts have concentrated on the spread of SARS-CoV-2, the virus that causes COVID-19, since the World Health Organization declared a global public health emergency in January 2020. The widespread health and economic impact of this novel coronavirus further burdens health and public health systems already responding to the challenges of HIV and related syndemics. Recent data demonstrate that providers of HIV, hepatitis C, and sexually transmitted infection services throughout California have experienced a diversion of resources and a reduction in services (California HIV/AIDS Policy Research Centers, 2020).

Despite effective treatment to stop forward transmission and methods to prevent acquisition of HIV, including new biomedical interventions (e.g., pre-exposure prophylaxis [PrEP]), HIV remains a pressing public health issue in the United States. New infections have largely held steady at 40,000 each year since 2013 (Office of Infectious Disease and HIV/AIDS Policy, 2020). With the emergence of COVID-19, the decades-long effort to prevent, treat, and eradicate HIV/AIDS is further complicated by conditions exacerbated by this new global pandemic. This current context includes broader economic and political instability and civil unrest (Premkumar & Tebandeke, 2011), higher rates of unemployment (Maruthappu et al., 2017), housing instability (Aidala et al., 2015; Griffin et al., 2020), intimate partner violence (Joint United Nations Programme on HIV/AIDS, 2020; Willie et al., 2018), and substance use and behavioral health problems (Campbell et al., 2018; Remien et al., 2019). Many of these conditions are understood to be drivers of the HIV epidemic.

We draw similarities between the two pandemics in California, both in the structural and other drivers of disparities in infections and outcomes and in the importance of comprehensive sociodemographic data to mitigation efforts. The urgency of the COVID-19 pandemic calls for rapidly learning lessons from other epidemics.

  • HIV and COVID-19 Racial & Ethnic Disparities

    Race/ethnicity data are often collected using two separate questions, one on race and another on ethnicity, with the second asking whether a person identifies as being of Hispanic or Latino origin. Hispanic/Latino persons may be of any race. Sources vary in how they report these data. Where a data source uses race/ethnicity terms that differ from those presented above, this is indicated in the text or in footnotes.

Persistent racial and ethnic disparities in incidence and treatment of HIV nationally and at the state level are well documented (Centers for Disease Control and Prevention [CDC], 2020f; CDPH, 2020c). Among people living with HIV (PLWH), such disparities can be tracked along the continuum of care with disproportionately poorer outcomes related to linkage to and retention in care. Increasing use of PrEP is a key strategy, but PrEP uptake differs across racial/ethnic lines. Among the state’s low-income Medi-Cal enrollees receiving PrEP, uptake rates among non-Hispanic Black/African-American and Hispanic/Latino (hereafter referred to as “Black” and “Latino”) enrollees are considerably lower than among their White counterparts (Harawa et al., 2018). Disparities related to COVID-19 have become apparent, but their documentation is often lacking, in contrast to HIV/AIDS, because of inadequate reporting of race and ethnicity data.

At this same moment, we are witnessing racial and ethnic disparities in access to care and health outcomes related to the spread of COVID-19 across the country (Nelson, 2020). Using data from the first wave of COVID-19 in the spring, Millett et al. (2020) noted that the roughly one-fifth of U.S. counties that are disproportionately Black accounted for 52% of COVID-19 diagnoses and 58% of COVID-19 deaths in the country. County-level data from Rodriguez et al. (2020) likewise showed that COVID-19 diagnoses and fatalities rose with the proportion of Latinos residing within a county. According to the CDC (2020b), COVID-related, age-adjusted hospitalization rates for Black and Latino persons are almost five times those of non-Hispanic Whites. The unadjusted case rate for American Indian/Alaska Native (AI/AN) persons is the same as that of Latinos and 2.8 times the rate for White persons. The AI/AN hospitalization rate is more than 5 times that of non-Hispanic Whites (CDC, 2020b). The impact on Native Hawaiians/Pacific Islanders (NHPI) is obscured by some of their data being grouped with those of Asian or multi-racial persons. As of 11/15/20, however, in the top nine states ranked by case rate, the NHPI rate per 100,000 exceeded the rates of all other racial/ethnic groups (UCLA Center for Health Policy Research, 2020).

COVID-19 Cases and Mortality

In California, the COVID-19 surveillance statistics as of 11/23/20 are missing race/ethnicity data for 28%, or roughly 316,000, of 1.13 million cases (see Table 1). By contrast, nearly all (99%) records of COVID-19 deaths, covering 18,817 individuals, include race/ethnicity information (CDPH, 2020b). Latinos and Blacks in California make up a higher proportion of COVID-19 deaths relative to their share of the state’s general population (CDPH, 2020b).


Table 1. California COVID-19 Mortality by Race/Ethnicity

(data as of
Race/Ethnicity | Number of Cases | Percent of Cases | Number of Deaths | Percent of Deaths | Percent of CA Population
American Indian/Alaska Native | 2,403 | 0.3 | 64 | 0.3 | 0.5
Native Hawaiian/Pacific Islander | 4,348 | 0.5 | 89 | 0.5 | 0.3
Total with data | 809,610
Source: COVID-19 Race and Ethnicity Data (CDPH, 2020b)
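
The comparison between a group's share of deaths and its share of the population can be expressed as a simple ratio. The sketch below is illustrative only. It uses the two rows of Table 1 that appear above, whose percentages are rounded to one decimal place, so the resulting ratios are rough. The function name `share_ratio` is a label chosen for this example and is not a CDPH measure.

```python
# Rough ratio of a group's share of COVID-19 deaths to its share of the
# state population, using the rounded percentages shown in Table 1.
# A ratio above 1 means the group is over-represented among deaths.

def share_ratio(pct_of_deaths: float, pct_of_population: float) -> float:
    """Return share of deaths divided by share of population."""
    if pct_of_population <= 0:
        raise ValueError("population share must be positive")
    return pct_of_deaths / pct_of_population

# (percent of deaths, percent of CA population), copied from Table 1
table_1 = {
    "American Indian/Alaska Native": (0.3, 0.5),
    "Native Hawaiian/Pacific Islander": (0.5, 0.3),
}

ratios = {group: share_ratio(d, p) for group, (d, p) in table_1.items()}

for group, ratio in ratios.items():
    print(f"{group}: {ratio:.2f}")
```

Because the inputs are rounded and not adjusted for age, these ratios should be read as indicative rather than precise.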


California is not among the 15 states whose public health departments report COVID-19 hospitalization numbers by race/ethnicity (University of Minnesota Tracking Project, 2020). The CDC provides laboratory-confirmed, coronavirus-related hospitalization data through COVID-NET, “a population-based surveillance system that collects data … through a network of over 250 acute-care hospitals in 14 states” (CDC, 2020a, para 1). COVID-NET data for the three participating California counties, which represent 9% of the state population (CDC, 2020a), indicate cumulative COVID-19 hospitalization rates for Black and Latino people that are, respectively, three and over four times the rate for White people (see Figure 1).


COVID-19 Testing

A month into the pandemic, Illinois and Kansas were the only two states reporting race/ethnicity data for COVID-19 tests (Daniels & Morial, 2020). As of mid-November, only five others had begun doing so (Johns Hopkins University & Medicine, 2020). By late July, one-third of positive COVID-19 tests in California lacked information on race/ethnicity (Goldberg, 2020). On July 28, 2020, the CDPH announced emergency regulations that took immediate effect and required COVID-19 testing laboratories and health care providers to collect and report data on the race/ethnicity, gender identity, and sexual orientation of patients (CDPH, 2020d). To date, California has not made testing data by race/ethnicity publicly available. Comprehensive collection of state-level COVID-19 race/ethnicity data has been inconsistent, with the exception of Indiana, Kansas, and Utah.

Sexual Orientation and Gender Identity

Because of the need to improve data on the COVID-19 pandemic, California altered its confidential morbidity report form, CDPH 110a, to include three separate items (sex assigned at birth, current gender identity, and sexual orientation), whereas an earlier version had a lone entry for gender. In early May 2020, State Senator Scott Wiener introduced SB 932, a bill mandating the inclusion in COVID-19 case reports of sexual orientation and gender identity (SOGI) data, to be provided voluntarily by individuals (Equality California, 2020). It passed both the California Assembly and Senate with no opposing votes (California Legislative Information, 2020). On September 26, Governor Gavin Newsom signed an amended bill that requires SOGI data not just for COVID-19 but for all reportable communicable diseases (Levesque, 2020; Office of Governor Gavin Newsom, 2020). This legislation cements the July CDPH emergency regulation to enhance infectious disease reporting with SOGI data collection. Nevertheless, California’s publicly available COVID-19 data on cases and deaths continue to be reported in terms of binary gender and to exclude data on sexual orientation.

Age-Adjusted Data

Age-adjusted data account for differences in the age distribution of populations and are helpful when comparing populations at a point in time or a single population at more than one point in time (Klein & Schoenborn, 2001). They can also reveal sharper racial disparities than are observed in crude or unadjusted data. In California, unadjusted figures indicate that the proportion of COVID-19 fatalities among Black and Latino persons is 30% and 20% higher than their share of the state population, respectively (see Table 2). With age-adjusted distributions, the proportion of COVID-19 deaths among Blacks is 61% and, for Latinos, 68% higher than their respective shares of the state population. Of the eight California counties prioritized under the federal Ending the HIV Epidemic: A Plan for America, Los Angeles County is the sole county making age-adjusted case and mortality rates by race/ethnicity publicly available. As of 11/22/20, in Los Angeles County, COVID-19 crude mortality rates among persons who are Black, Latino, and NHPI are, respectively, 1.4, 1.2, and 1.7 times the rate among White people. Age-adjusted rates reveal much sharper differences: mortality rates among Black, Latino, and NHPI people are, respectively, 1.9, 2.8, and 2.9 times the rate among Whites (County of Los Angeles Public Health, 2020).
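
To illustrate why age adjustment can sharpen an observed disparity, the following sketch applies direct standardization, one common age-adjustment method. The brief does not state which method the cited sources used, so this is an assumption. All numbers are hypothetical and are not California data: two made-up groups, three made-up age bands, and a made-up standard population. The function names are likewise chosen for this example.

```python
# Direct age standardization with hypothetical numbers (not real data).
# Each list has one entry per age band: youngest, middle, oldest.

def crude_rate(deaths, population, per=100_000):
    """Total deaths divided by total population, scaled to `per` people."""
    return sum(deaths) / sum(population) * per

def age_adjusted_rate(deaths, population, standard_population, per=100_000):
    """Weight each age-specific rate by that band's share of a standard population."""
    total_standard = sum(standard_population)
    rate = 0.0
    for d, p, s in zip(deaths, population, standard_population):
        rate += (d / p) * (s / total_standard)
    return rate * per

# Hypothetical group A: younger population, higher death rate in every age band
deaths_a = [10, 60, 400]
pop_a = [500_000, 300_000, 200_000]

# Hypothetical group B: older population, lower death rate in every age band
deaths_b = [3, 30, 400]
pop_b = [300_000, 300_000, 400_000]

# Hypothetical standard population shared by both groups
standard = [400_000, 300_000, 300_000]

crude_ratio = crude_rate(deaths_a, pop_a) / crude_rate(deaths_b, pop_b)
adjusted_ratio = (
    age_adjusted_rate(deaths_a, pop_a, standard)
    / age_adjusted_rate(deaths_b, pop_b, standard)
)

print(f"Crude rate ratio (A vs. B): {crude_ratio:.2f}")
print(f"Age-adjusted rate ratio (A vs. B): {adjusted_ratio:.2f}")
```

In this made-up example, group A's death rate is twice group B's in every age band, yet the crude rate ratio is only about 1.09 because group A is younger. Applying the same standard population to both groups recovers the twofold difference, which mirrors the pattern of larger age-adjusted disparities described above.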


Table 2: Crude and Age-Adjusted Distribution of California Population and COVID-19 Deaths by Race/Ethnicity

(data as of 11/18/20)
Race/Ethnicity | Crude: Unweighted distribution of population (%) | Crude: Distribution of COVID-19 Deaths (%) | Age-adjusted: Unweighted distribution of population (%) | Age-adjusted: Distribution of COVID-19 Deaths (%)
American Indian/Alaska Native
Native Hawaiian/Pacific Islander
Source: Health Disparities: Race and Hispanic Origin, Figure 1 (CDC, 2020e)
For the CDC data in this table, the “Latino” category includes persons of any race. The categories “AI/AN”, “Asian”, “Black”, and “White” refer to non-Hispanic, single-race persons.

Co-Morbidities and People Living with HIV

The CDC has documented evidence of other conditions linked to severe COVID-19, which it defines as “hospitalization, admission to the ICU, intubation or mechanical ventilation, or death” (CDC, 2020d, para 1). The conditions for which strong and consistent evidence of higher risk of severe COVID-19 exists are serious heart disease, cancer, chronic kidney disease, chronic obstructive pulmonary disease (COPD), obesity, sickle cell disease, Type 2 diabetes, smoking, and solid organ transplantation (CDC, 2020d). Because people living with HIV (PLWH) tend to suffer the same co-morbidities as individuals with aggravated COVID-19 (Patel & Pella, 2020), there was concern early on that they too would be at increased risk for severe COVID-19. According to the CDC (2020d), however, evidence of higher risk for severe COVID-19 among those with HIV remains limited. Others have found that PLWH do not differ much from persons without HIV in COVID-19 incidence or in its clinical presentation and outcomes (Saag, 2020). The worst clinical outcomes of COVID-19 have taken a particular toll among racial/ethnic minorities. For Black, Latino, NHPI, and AI/AN persons, one explanation is higher prevalence of such conditions as hypertension, heart disease, obesity, and diabetes, in the context of lower access to quality medical care for managing these conditions (CDC, 2013, 2018; Graham, 2015; Lavizzo-Mourey & Williams, 2016; Moy et al., 2010; Poteat et al., 2020). This is compounded by mistrust or fear of the health care system due to histories of discrimination and concerns about immigration enforcement.

Social and Structural Risk Factors

Labor-related Risk Factors

The likelihood of exposure to COVID-19 increases with certain social risk factors. In California, the Central Valley’s predominantly Latino farm, food processing, and warehouse workers and their communities have been hit hard. Increased risk of HIV is also evident among the Latino people who predominate in the Central Valley. Of 16 Central Valley counties, ten had HIV incidence rates in 2018 above the average HIV incidence across California counties of 7.2 per 100,000 people (CDPH, 2020a). Risk of coronavirus infection appears higher for people who are in low-wage work (Flores & Padilla, 2020), who are employed in essential jobs or industries not amenable to remote work (Angelucci et al., 2020), or who work at sites where physical distancing is difficult (Dyal et al., 2020). According to the Bureau of Labor Statistics, in 2017 to 2018 Black workers were 31% less likely to work from home than were White workers, and Latino workers of any race were 52% less likely to work from home than were non-Latino workers. Asian workers were the most likely to have worked from home (Woods, 2020).

Structural Risk Factors

Housing and the places and spaces where people live affect vulnerability to COVID-19. Though outbreaks in metropolitan areas around the world implicate population density, others argue that residential or household density, or crowded housing, is more relevant (Citizens Housing & Planning Council, 2020). In California, the average household size is 2.9 persons and the average family size is 3.5 persons. The state ranks second highest among US states in its proportion of multigenerational households. Asian people and Pacific Islanders, Latinos of any race, and non-Hispanic Black people are all more likely than non-Hispanic White people to live in multigenerational households (Cohn & Passel, 2018; Hoff, 2020).

Uncontrolled transmission can also occur in institutional and congregate settings, such as nursing and care facilities, prisons, shelters, detention facilities, and churches. The Prison Policy Initiative (2014) calculated California incarceration rates of 757 per 100,000 for Latino people and 3,036 per 100,000 for Black people, compared to 453 per 100,000 for non-Hispanic Whites. As of 11/22/20, the California Department of Corrections and Rehabilitation reported a cumulative COVID-19 rate of 196 per 1,000 people, or seven times the California rate of 28 per 1,000 people. Of the roughly 123,000 prisoners under the state’s jurisdiction in 2019, 44% were Latino, 29% were non-Hispanic Black, and 21% were non-Hispanic White (Carson, 2020). Differences in incarceration rates are likely another structural factor behind the disparate racial impact of COVID-19; however, the data provided on COVID-19 in prisons, jails, and detention facilities do not include racial/ethnic distributions of cases or deaths.

Because of the threat COVID-19 poses to people who are experiencing homelessness and living outdoors or in shelters, California mobilized at the state and local levels and undertook Project Roomkey to provide this vulnerable group with non-congregate shelter (California Health Care Foundation, 2020). The benefits of measures such as these, however, go only to the few who are selected for assistance. Widespread additional efforts are needed to address the increased needs created by closures of other vital social services programs under COVID-related safety protocols (Barry-Jester & Hart, 2020).


Public health officials and policymakers responding to COVID-19 are having to learn what those working to address the HIV epidemic have been doing for decades: designing and implementing mitigation measures based on comprehensive data. Thus, COVID-19 data collection must facilitate more robust comparisons by race/ethnicity at the intersection of other factors such as health status, sexual and gender minority status, and social and structural risks. These data are critical for uncovering differences and redressing disparities in COVID-19 outcomes.

While far from comprehensive, existing data on COVID-19 in California point to troubling disparities in COVID-19 infections, hospitalizations, and deaths by race/ethnicity. Despite public attention to this issue and legislation to increase the completeness of race/ethnicity data in state-level COVID-19 case reporting, this information is still routinely missing from reported cases. Other key reported information on COVID-19, such as data on outbreaks and in prisons, is not stratified by race/ethnicity. The lack of age adjustment in routine COVID-19 reporting leads to an underestimate of racial/ethnic disparities in COVID-19 outcomes. Health departments have ready access to information for calculating age-adjusted estimates through well-established methods using data from the American Community Survey as the standard. Not doing so contributes to an incomplete picture of the impact of the epidemic on various groups. The reporting of data on sexual orientation and gender identity would further enhance our understanding of COVID-19 and the ability of public health efforts to tailor messaging, testing, contact tracing, and other efforts to mitigate the pandemic.