Estimates and techniques
The following parameters are estimated per country (or geographical area).
- Cases infected: Prevalence of COVID-19, i.e., the population (or the fraction thereof) that has been infected with COVID-19.
- Cases daily: The population (or the fraction thereof) newly infected on a particular day, to the best of the available knowledge.
- Cases contagious: The population (or the fraction thereof) that can transmit COVID-19 on a given day.
- Cases active: The population (or the fraction thereof) that is infected and whose case is still active on a given day. It includes symptomatic and asymptomatic cases.
We use the following techniques to estimate these parameters. See our publications for a more detailed description of the methods to compute them.
- Confirmed: These are the values derived from the official data of confirmed cases obtained from Our World in Data. Cases daily are the new cases confirmed on a given day. To estimate cases contagious and cases active, the distributions of the number of days a case is infectious and active are used, respectively.
- cCFR: This technique uses the official data of confirmed cases and fatalities obtained from Our World in Data, as described here. It uses a known Case-Fatality Rate and the number of fatalities to correct the number of confirmed cases, taking into account the time from infection to death.
- cCFR-Fatalities: This is similar to cCFR, but it uses only the official number of fatalities. It assumes a time from onset to death of 15 days (source: Centers for Disease Control and Prevention, CDC).
- UMD CLI: The direct symptom responses from the University of Maryland COVID-19 World Symptoms Survey Microdata (part of the CMU/UMD COVID-19 Symptom Survey initiative, CTIS) are used to estimate the ratio of cases with COVID-like illness (CLI). COVID-like illness is defined as “fever, along with cough, shortness of breath, or difficulty breathing.” (CTIS participants have to be at least 18 years old.)
- UMD CLI local: The indirect responses from the CTIS are used to estimate the ratio of cases with COVID-like illness (CLI). These are responses to the survey question “How many people do you know with these [CLI] symptoms?”
- Random Forest: Machine-learning-based estimate of active cases from the responses to the CTIS survey, using the Random Forest algorithm. Details here. (CTIS participants have to be at least 18 years old.)
- Symptomatic Random Forest: Same as Random Forest, but with the classifier trained only with positive cases that have symptoms.
- XGBoost: Same as Random Forest, but using the XGBoost algorithm for classification.
- Symptomatic XGBoost: Same as Symptomatic Random Forest, but with the XGBoost classifier.
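As a rough illustration of two of the ideas above, the following Python sketch shows (1) how active cases could be derived from daily confirmed cases and a distribution of how long a case stays active, as in the Confirmed technique, and (2) a simple fixed-delay back-projection from fatalities to cases. The second function is a deliberate simplification and not the project's cCFR method; it only shows how a Case-Fatality Rate and a delay to death can turn fatalities into a case estimate. The function names, the `still_active` probabilities, and the example CFR are assumptions made for illustration and do not come from the project code.

```python
from typing import List, Optional, Sequence


def active_from_daily(daily_cases: Sequence[float],
                      still_active: Sequence[float]) -> List[float]:
    """Estimate the number of active cases on each day.

    still_active[k] is the (assumed) probability that a case is still
    active k days after it was confirmed; still_active[0] is normally 1.0.
    """
    active = []
    for day in range(len(daily_cases)):
        total = 0.0
        for age, prob in enumerate(still_active):
            if age > day:
                break  # no data before the first day of the series
            total += daily_cases[day - age] * prob
        active.append(total)
    return active


def cases_from_fatalities(daily_deaths: Sequence[float],
                          cfr: float,
                          delay_days: int = 15) -> List[Optional[float]]:
    """Back-project daily fatalities into estimated daily cases.

    Simplified assumption: every death occurs exactly `delay_days` after
    the case started, so cases on day t are deaths on day t + delay_days
    divided by the Case-Fatality Rate. Days whose matching deaths are not
    yet observed get None.
    """
    if not 0 < cfr <= 1:
        raise ValueError("cfr must be in (0, 1]")
    estimates: List[Optional[float]] = []
    for day in range(len(daily_deaths)):
        future = day + delay_days
        if future < len(daily_deaths):
            estimates.append(daily_deaths[future] / cfr)
        else:
            estimates.append(None)
    return estimates


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    print(active_from_daily([10, 20, 30], [1.0, 0.5]))
    print(cases_from_fatalities([0, 0, 1, 2], cfr=0.25, delay_days=2))
```

Dividing either result by the population of the country gives the corresponding fraction. The same weighted sum also applies to contagious cases, with a distribution of the number of days a case is infectious in place of `still_active`.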
Estimates of active cases per country
World risk map with current estimates of cases
More Data and Estimates
All the collected data and other estimates can be found in the project GitHub repository at https://github.com/GCGImdea/coronasurveys/tree/master/data.
Data Sources and Computation
Source of data on confirmed cases and deaths: Our World in Data.