COVID-19 Cases and Deaths Summarized by Geography | Last Updated 17 Aug 2022

Note: On September 12, 2021, a new case definition of COVID-19 was introduced that includes criteria for enumerating new infections after previous probable or confirmed infections (also known as reinfections). A reinfection is defined as a confirmed positive PCR lab test more than 90 days after a positive PCR or antigen test. The first reinfection case was identified on December 7, 2021. Some fluctuation in historic data may result when this change is implemented on July 15, 2022.

Note: On February 23, 2022, the New Cases Map dashboard began pulling from this dataset. To access Cases by Geography and Date, please refer to this dataset:

Note: On January 22, 2022, system updates to improve the timeliness and accuracy of San Francisco COVID-19 cases and deaths data were implemented. You might see some fluctuations in historic data as a result of this change. Because of these changes, starting on January 22, 2022, the number of new cases reported daily will be higher than under the old system, as cases that would have taken longer to process will be reported earlier.

<i><b>Note: As of April 16, 2021, this dataset will update daily with a five-day data lag.</b></i>

<strong>A. SUMMARY</strong>
This dataset contains medical provider confirmed COVID-19 cases and confirmed COVID-19 related deaths in San Francisco, CA, aggregated by several geographic areas and normalized by 2019 American Community Survey (ACS) 5-year population estimates to calculate a rate per 10,000 residents.

Cases and deaths are both mapped to the residence of the individual, not to where they were infected or died. For example, a person who was infected at work in San Francisco but lives in the East Bay is not counted as an SF case, and a person who dies at Zuckerberg San Francisco General but is a resident of another county is not counted in this dataset.

The dataset is cumulative and covers cases going back to March 2nd, 2020, when testing began.

Geographic areas summarized are:
1. <a href="">Analysis Neighborhoods</a>
2. <a href="">Census Tracts</a>
3. <a href="">Census Zip Code Tabulation Areas</a>

<strong>B. HOW THE DATASET IS CREATED</strong>
Addresses from medical data are geocoded by the San Francisco Department of Public Health (SFDPH). Those addresses are spatially joined to the geographic areas. Counts are generated based on the number of address points that match each geographic area. The 2019 ACS population estimates provided by the Census are used to create a rate equal to ([count] / [acs_population]) * 10000, representing the number of cases per 10,000 residents.

<strong>C. UPDATE PROCESS</strong>
Geographic analysis is scripted by SFDPH staff and synced to this dataset daily at 7:30 Pacific Time.

<strong>D. HOW TO USE THIS DATASET</strong>
<em>Privacy rules in effect</em>
To protect privacy, certain rules are in effect:
1. Case counts greater than 0 and less than 10 are dropped - these will be null (blank) values
2. Death counts greater than 0 and less than 10 are dropped - these will be null (blank) values
3. Cases and deaths are dropped altogether for areas where acs_population < 1000

<em>Rate suppression in effect where counts lower than 20</em>
Rates are not calculated unless the case count is greater than or equal to 20. Rates are generally unstable at small numbers, so we avoid calculating them directly. We advise you to apply the same approach, as this is best practice in epidemiology.

<em>A note on Census ZIP Code Tabulation Areas (ZCTAs)</em>
ZIP Code Tabulation Areas are special boundaries created by the U.S. Census based on ZIP Codes developed by the USPS. They are not, however, the same thing. ZCTAs are areal representations of routes. <a href="">Read how the Census develops ZCTAs on their website</a>.

<em>Row included for Citywide case counts, incidence rate, and deaths</em>
A single row is included that has the Citywide case counts and incidence rate. This can be used for comparisons. Citywide will capture all cases regardless of address quality. While some cases cannot be mapped to sub-areas like Census Tracts, ongoing data quality efforts result in improved mapping on a rolling basis.
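
The privacy and rate suppression rules described above can be sketched in a few lines of Python. This is only an illustration of the stated rules, not the script SFDPH runs; the function names (suppress_small_count, rate_per_10k, summarize_area) and the input numbers in the usage note are hypothetical.

```python
from typing import Optional


def suppress_small_count(count: int) -> Optional[int]:
    """Return None (null) for counts greater than 0 and less than 10."""
    return None if 0 < count < 10 else count


def rate_per_10k(count: Optional[int], acs_population: int) -> Optional[float]:
    """Rate per 10,000 residents, calculated only when the count is at least 20."""
    if count is None or count < 20 or acs_population <= 0:
        return None
    return (count / acs_population) * 10000


def summarize_area(cases: int, deaths: int, acs_population: int) -> dict:
    """Apply the documented rules to one geographic area (hypothetical helper)."""
    # Areas with fewer than 1,000 residents have cases and deaths dropped altogether.
    if acs_population < 1000:
        return {"count": None, "deaths": None, "rate": None}
    count = suppress_small_count(cases)
    return {
        "count": count,
        "deaths": suppress_small_count(deaths),
        "rate": rate_per_10k(count, acs_population),
    }
```

For example, summarize_area(250, 12, 25000) keeps both counts and calculates a rate, while summarize_area(15, 3, 25000) keeps the case count, nulls the death count, and leaves the rate null because the case count is below 20.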

This dataset has the following 11 columns:

Column Name | API Column Name | Data Type | Description | Sample Values
area_type | area_type | text | Type of geographic area, one of: Citywide, Census Tract, Analysis Neighborhood, or ZCTA (ZIP Code Tabulation Area) |
id | id | text | The identifier for the area type |
count | count | number | The count of cases in the area, null when not zero and less than 10 |
count_last_60_days | count_last_60_days | number | The count of cases in the area between max_specimen_collection_date and 60 days prior, null when total count is not zero and less than 10 |
rate | rate | number | The rate of cases in the area, calculated as (count/acs_population) * 10000, which is a rate per 10,000 residents |
deaths | deaths | number | Number of deaths, null when not zero and less than 10 |
acs_population | acs_population | number | The population from the latest 5-year estimates from the American Community Survey (2015-2019) |
max_specimen_collection_date | max_specimen_collection_date | calendar_date | The most recent date through which data is populated for the dataset. Will be 5 days before current date (due to the implemented five-day lag), barring any unforeseen data issues |
last_updated_at | last_updated_at | calendar_date | When the dataset was last compiled by scripts, representing how current the data is |
data_loaded_at | data_loaded_at | calendar_date | Timestamp when the record (row) was most recently updated here in the Open Data Portal |
multipolygon | multipolygon | multipolygon | The geometry in multipolygon format stored in EPSG:4326 coordinate system |
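
As a rough sketch of how the columns above might be used together, the following Python compares each area's rate against the Citywide row while skipping suppressed (null) rates. It assumes the rows have already been loaded as a list of dictionaries keyed by API column name, with numeric values possibly arriving as strings; the helper names and the sample rows are hypothetical and are not real values from the dataset.

```python
from typing import List, Optional, Tuple


def to_number(value) -> Optional[float]:
    """Convert a field to float, treating missing or blank values as null."""
    if value is None or value == "":
        return None
    return float(value)


def areas_above_citywide(rows: List[dict], area_type: str) -> List[Tuple[str, float]]:
    """List (id, rate) for areas of one type whose rate exceeds the Citywide rate."""
    citywide = next((r for r in rows if r.get("area_type") == "Citywide"), None)
    citywide_rate = to_number(citywide.get("rate")) if citywide else None
    if citywide_rate is None:
        return []
    above = []
    for row in rows:
        if row.get("area_type") != area_type:
            continue
        rate = to_number(row.get("rate"))
        if rate is None:
            continue  # rate suppressed because the case count is below 20
        if rate > citywide_rate:
            above.append((row["id"], rate))
    return sorted(above, key=lambda pair: pair[1], reverse=True)


# Made-up rows for illustration only.
sample_rows = [
    {"area_type": "Citywide", "id": "Citywide", "rate": "1500.0"},
    {"area_type": "Analysis Neighborhood", "id": "Hypothetical Area A", "rate": "1800.5"},
    {"area_type": "Analysis Neighborhood", "id": "Hypothetical Area B", "rate": None},
    {"area_type": "Analysis Neighborhood", "id": "Hypothetical Area C", "rate": "1200"},
    {"area_type": "ZCTA", "id": "00000", "rate": "2000"},
]
high_rate_neighborhoods = areas_above_citywide(sample_rows, "Analysis Neighborhood")
```

Null rates are skipped rather than treated as zero, since a null here means the value was suppressed, not that the area had no cases.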