Data Sources

What San Diego County provides


The county provides two main dashboards related to Covid-19.

The COVID-19 Dashboard (mobile version) displays an overview of various statistics on Covid-19 in the county.

The Triggers Dashboard (mobile version) highlights the metrics that are tracked to decide when to modify the public health officer’s orders.

(The county has added two more dashboards, for case rate and vaccinations. Both are available on the county’s website, which is described in the next section.)

The state provides a comprehensive dashboard with Covid-19 data for each county.

Other data visualizations and statistics

San Diego County also reports Covid-19 statistics on its Coronavirus in San Diego County website.

Their site provides a summary table of cumulative cases, broken out by age groups and gender, as well as statistics on hospitalizations, ICU, and deaths.

[Image: MainPage.png]

Below that table, the county links to numerous PDF files containing charts and tables with further statistics on cases, hospitalizations, and deaths, broken down by city, zip code, and other demographics.

These PDFs are updated every day around 5pm. Their tables show only the current cumulative totals, similar to the table shown above, so there is no way to see how those numbers have trended over time for the various demographic groups.

Raw data

The county makes Covid-19 related datasets used by the dashboards available here:

However, the county does not publish datasets for hospitalizations and deaths broken down by age group and race/ethnicity, even though that data is reported daily in the PDFs.

State data


In addition to the county-provided data, San Diego County data can be found with the CA Covid-19 datasets on the CA Open Data Portal.

The following data is available from the state broken down by county:

  • Cases
  • Hospital and ICU utilization and availability
  • PPE logistics
  • Medical surge facilities
  • Homeless impact

At the state level, they provide data on:

  • Testing numbers
  • Sex demographics
  • Ethnicity demographics
  • Age demographics

Death data

I use death data from the state of California to compare death rates in 2020 to past years. I pull death data from the California Vital Data Query Tool, which requires setting up a free account to query data by various demographics.

For more detailed information about deaths in San Diego County, I have turned to the Medical Examiner dataset from the San Diego County data catalog.

Additionally, I use the San Diego County COVID-19 deaths data published by the San Diego Union-Tribune to get a detailed breakdown of COVID-19 deaths in San Diego County by date of death. The Union-Tribune gets its data from the county, but it includes data that is not available from the open data portal, including the specific ages and race/ethnicity of those who have succumbed to COVID-19. (The COVID-19 death data publicly available from the county is broken down by race/ethnicity OR age group, but not both, and only in tables in PDF files.)

My data

I download the relevant PDFs every day, and then use Tabula to extract the data in the tables into CSV files. I then use Python to read in all the CSV files and create a master CSV file with all the data.
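As a rough illustration of that last step, here is a minimal sketch of merging the daily CSV files into one master file. It is not my actual script. The function name `combine_csvs`, the date-based file names (such as `2020-07-01.csv`), and the added `report_date` column are all assumptions made for the example.

```python
import csv
from pathlib import Path


def combine_csvs(input_dir, output_path):
    """Merge every CSV in input_dir into one master CSV.

    Assumption (hypothetical layout): one file per day, named by date,
    e.g. 2020-07-01.csv, each with a header row. The file stem is stored
    in an added "report_date" column so each row can be traced back to
    the day it was reported.
    """
    rows = []
    fieldnames = ["report_date"]
    for path in sorted(Path(input_dir).glob("*.csv")):
        with path.open(newline="", encoding="utf-8") as f:
            for row in csv.DictReader(f):
                row["report_date"] = path.stem
                rows.append(row)
                # Keep the union of all columns, in first-seen order,
                # since the tables may gain columns over time.
                for name in row:
                    if name not in fieldnames:
                        fieldnames.append(name)

    # Write the master file outside input_dir so that a later run
    # does not read it back in as if it were a daily file.
    with open(output_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Columns that are missing from a given day's file are left blank in the master file rather than dropped, which keeps older and newer table layouts side by side.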

However, I started this project in late June/early July and I am missing some of the older PDF files. I retrieved as many as I could from the Wayback Machine, but there are still a number of dates missing for hospitalizations and deaths.
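One way to see exactly which days are missing is to compare the dates that are present against the full calendar range. The sketch below is only an illustration. The function name `missing_dates` and the dates in the usage example are made up, not taken from my files.

```python
from datetime import date, timedelta


def missing_dates(available, start, end):
    """Return every date from start to end (inclusive) not in available."""
    have = set(available)
    gaps = []
    day = start
    while day <= end:
        if day not in have:
            gaps.append(day)
        day += timedelta(days=1)
    return gaps


# Hypothetical example: reports exist for July 1, 2, and 5 only.
available = [date(2020, 7, 1), date(2020, 7, 2), date(2020, 7, 5)]
gaps = missing_dates(available, date(2020, 7, 1), date(2020, 7, 5))
print([d.isoformat() for d in gaps])
```

Because the tables are cumulative, a gap does not lose any totals; it only means the change over the missing days cannot be split by day.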

You can find the latest composite csv files I have generated for this data here:

Historical county data

Since San Diego County does not provide past PDFs, and some of the data in them is not present in the datasets available from the county’s open data portal, I have been downloading and saving these files. Originally I made them available on my website, but they began to use up too much storage space. You can use the Wayback Machine to obtain older versions of the files.