The Methods is structured as follows. We describe the datasets that we used in the ‘Datasets’ section and the mobility network that we derived from these datasets in the ‘Mobility network’ section. In the ‘Model dynamics’ section, we discuss the SEIR model that we overlaid on the mobility network; in the ‘Model calibration’ section, we describe how we calibrated this model and quantified uncertainty in its predictions. Finally, in the ‘Analysis details’ section, we provide details on the experimental procedures used for our analyses of mobility reduction, reopening plans and demographic disparities.

### Datasets

#### SafeGraph

We use data provided by SafeGraph, a company that aggregates anonymized location data from numerous mobile applications. SafeGraph data captures the movement of people between CBGs, which are geographical units that typically contain a population of between 600 and 3,000 people, and POIs such as restaurants, grocery stores or religious establishments. Specifically, we use the following SafeGraph datasets.

First, we used the Places Patterns^{39} and Weekly Patterns (v1)^{40} datasets. These datasets contain, for each POI, hourly counts of the number of visitors, estimates of median visit duration in minutes (the ‘dwell time’) and aggregated weekly and monthly estimates of the home CBGs of visitors. We use visitor home CBG data from the Places Patterns dataset: for privacy reasons, SafeGraph excludes a home CBG from this dataset if fewer than five devices were recorded at the POI from that CBG over the course of the month. For each POI, SafeGraph also provides their North American industry classification system category, as well as estimates of its physical area in square feet. The area is computed using the footprint polygon SafeGraph that assigns to the POI^{41,42}. We analyse Places Patterns data from 1 January 2019 to 29 February 2020 and Weekly Patterns data from 1 March 2020 to 2 May 2020.

Second, we used the Social Distancing Metrics dataset^{43}, which contains daily estimates of the proportion of people staying home in each CBG. We analyse Social Distancing Metrics data from 1 March 2020 to 2 May 2020.

We focus on 10 of the largest metro areas in the United States (Extended Data Table 1). We chose these metro areas by taking a random subset of the SafeGraph Patterns data and selecting the 10 metro areas with the most POIs in the data. The application of the methods described in this paper to the other metro areas in the original SafeGraph data should be straightforward. For each metro area, we include all POIs that meet all of the following requirements: (1) the POI is located in the metro area ; (2) SafeGraph has visit data for this POI for every hour that we model, from 00:00 on 1 March 2020 to 23:00 on 2 May 2020; (3) SafeGraph has recorded the home CBGs of visitors to this POI for at least one month from January 2019 to February 2020; (4) the POI is not a ‘parent’ POI. Parent POIs comprise a small fraction of POIs in the dataset that overlap and include the visits from their ‘child’ POIs: for example, many malls in the dataset are parent POIs, which include the visits from stores that are their child POIs. To avoid double-counting visits, we remove all parent POIs from the dataset. After applying these POI filters, we include all CBGs that have at least one recorded visit to at least ten of the remaining POIs; this means that CBGs from outside the metro area may be included if they visit this metro area frequently enough. Summary statistics of the post-processed data are shown in Extended Data Table 1. Overall, we analyse 56,945 CBGs from the 10 metro areas, and more than 310 million visits from these CBGs to 552,758 POIs.

SafeGraph data have been used to study consumer preferences^{44} and political polarization^{45}. More recently, it has been used as one of the primary sources of mobility data in the USA for tracking the effects of the COVID-19 pandemic^{26,28,46,47,48}. In Supplementary Methods section 1, we show that aggregate trends in SafeGraph mobility data match the aggregate trends in Google mobility data in the USA^{49}, before and after the imposition of stay-at-home measures. Previous analyses of SafeGraph data have shown that it is geographically representative—for example, it does not systematically overrepresent individuals from CBGs in different counties or with different racial compositions, income levels or educational levels^{50,51}.

#### US census

Our data on the demographics of the CBGs comes from the American Community Survey (ACS) of the US Census Bureau^{52}. We use the 5-year ACS data (2013–2017) to extract the median household income, the proportion of white residents and the proportion of Black residents of each CBG. For the total population of each CBG, we use the most-recent one-year estimates (2018); one-year estimates are noisier but we wanted to minimize systematic downward bias in our total population counts (due to population growth) by making them as recent as possible.

####
*The New York Times* dataset

We calibrated our models using the COVID-19 dataset published by the *The New York Times*^{32}. Their dataset consists of cumulative counts of cases and deaths in the USA over time, at the state and county level. For each metro area that we modelled, we sum over the county-level counts to produce overall counts for the entire metro area. We convert the cumulative case and death counts to daily counts for the purposes of model calibration, as described in the ‘Model calibration’ section.

#### Data ethics

The dataset from *The New York Times* consists of aggregated COVID-19-confirmed case and death counts collected by journalists from public news conferences and public data releases. For the mobility data, consent was obtained by the third-party sources that collected the data. SafeGraph aggregates data from mobile applications that obtain opt-in consent from their users to collect anonymous location data. Google’s mobility data consists of aggregated, anonymized sets of data from users who have chosen to turn on the location history setting. Additionally, we obtained IRB exemption for SafeGraph data from the Northwestern University IRB office.

### Mobility network

#### Definition

We consider a complete undirected bipartite graph (mathscrG=(mathscrV, mathcal E )) with time-varying edges. The vertices (mathscrV) are partitioned into two disjoint sets (mathscrC=c_1,ldots ,c_m), representing *m* CBGs, and (mathscrP=p_1,ldots ,p_n), representing *n* POIs. From US census data, each CBG *c*_{i} is labelled with its population (N_c_i), income distribution, and racial and age demographics. From SafeGraph data, each POI *p*_{j} is similarly labelled with its category (for example, restaurant, grocery store or religious organization), its physical size in square feet (a_p_j), and the median dwell time (d_p_j) of visitors to *p*_{j}. The weight (w_ij^(t)) on an edge (*c*_{i}, *p*_{j}) at time *t* represents our estimate of the number of individuals from CBG *c*_{i} visiting POI *p*_{j} at the *t*th hour of simulation. We record the number of edges (with non-zero weights) in each metro area and for all hours from 1 March 2020 to 2 May 2020 in Extended Data Table 1. Across all 10 metro areas, we study 5.4 billion edges between 56,945 CBGs and 552,758 POIs.

#### Overview of the network estimation

The central technical challenge in constructing this network is estimating the network weights (W^(t),=,w_ij^(t)) from SafeGraph data, as this visit matrix is not directly available from the data. Our general methodology for network estimation is as follows.

First, from SafeGraph data, we derived a time-independent estimate (barW) of the visit matrix that captures the aggregate distribution of visits from CBGs to POIs from January 2019 to February 2020.

Second, because visit patterns differ substantially from hour to hour (for example, day versus night) and day to day (for example, before versus after lockdown), we used current SafeGraph data to capture these hourly variations and to estimate the CBG marginals *U*^{(t)}, that is, the number of people in each CBG who are out visiting POIs at hour *t*, as well as the POI marginals *V*^{(t)}, that is, the total number of visitors present at each POI *p*_{j} at hour *t*.

Finally, we applied the iterative proportional fitting procedure (IPFP) to estimate an hourly visit matrix *W*^{(t)} that is consistent with the hourly marginals *U*^{(t)} and *V*^{(t)} but otherwise ‘as similar as possible’ to the distribution of visits in the aggregate visit matrix (barW), in terms of Kullback–Leibler divergence.

IPFP is a classic statistical method^{31} for adjusting joint distributions to match prespecified marginal distributions, and it is also known in the literature as biproportional fitting, the RAS algorithm or raking^{53}. In the social sciences, it has been widely used to infer the characteristics of local subpopulations (for example, within each CBG) from aggregate data^{54,55,56}. IPFP estimates the joint distribution of visits from CBGs to POIs by alternating between scaling each row to match the hourly row (CBG) marginals *U*^{(t)} and scaling each column to match the hourly column (POI) marginals *V*^{(t)}. Further details about the estimation procedure are provided in Supplementary Methods section 3.

### Model dynamics

To model the spread of SARS-CoV-2, we overlay a metapopulation disease transmission model on the mobility network defined in the ‘Mobility Network’ section. The transmission model structure follows previous work^{15,20} on epidemiological models of SARS-CoV-2 but incorporates a fine-grained mobility network into the calculations of the transmission rate. We construct separate mobility networks and models for each metropolitan statistical area.

We use a SEIR model with susceptible (*S*), exposed (*E*), infectious (*I*) and removed (*R*) compartments. Susceptible individuals have never been infected, but can acquire the virus through contact with infectious individuals, which may happen at POIs or in their home CBG. They then enter the exposed state, during which they have been infected but are not infectious yet. Individuals transition from exposed to infectious at a rate inversely proportional to the mean latency period. Finally, they transition into the removed state at a rate inversely proportional to the mean infectious period. The removed state represents individuals who can no longer be infected or infect others, for example, because they have recovered, self-isolated or died.

Each CBG *c*_{i} maintains its own SEIR instantiation, with (S_c_i^(t)), (E_c_i^(t)), (I_c_i^(t)) and (R_c_i^(t)) representing how many individuals in CBG *c*_{i} are in each disease state at hour *t*, and (N_c_i=S_c_i^(t)+E_c_i^(t)+I_c_i^(t)+R_c_i^(t)). At each hour *t*, we sample the transitions between states as follows:

$$N_S_c_ito E_c_i^(t) sim rmPoisleft(fracS_c_i^(t)N_c_imathopsum limits_j=1^nlambda _p_j^(t)w_ij^(t)right)+rmBinom(S_c_i^(t),lambda _c_i^(t))$$

(1)

$$N_E_c_ito I_c_i^(t) sim rmBinom(E_c_i^(t),1/delta _E)$$

(2)

$$N_I_c_ito R_c_i^(t) sim rmBinom(I_c_i^(t),1/delta _I),$$

(3)

where (lambda _p_j^(t)) is the rate of infection at POI *p*_{j} at time *t*; (w_ij^(t)), the *ij*th entry of the visit matrix from the mobility network (see ‘Mobility network’), is the number of visitors from CBG *c*_{i} to POI *p*_{j} at time *t*; (lambda _c_i^(t)) is the base rate of infection that is independent of visiting POIs; *δ*_{E} is the mean latency period; and *δ*_{I} is the mean infectious period.

We then update each state to reflect these transitions. Let (Delta S_c_i^(t)=S_c_i^(t+1)-S_c_i^(t)) and likewise for (Delta E_c_i^(t)), (Delta I_c_i^(t)) and (Delta R_c_i^(t)). Then, we make the following updates:

$$Delta S_c_i^(t)=-N_S_c_ito E_c_i^(t)$$

(4)

$$Delta E_c_i^(t)=N_S_c_ito E_c_i^(t)-N_E_c_ito I_c_i^(t)$$

(5)

$$Delta I_c_i^(t)=N_E_c_ito I_c_i^(t)-N_I_c_ito R_c_i^(t)$$

(6)

$$Delta R_c_i^(t)=N_I_c_ito R_c_i^(t).$$

(7)

#### The number of new exposures

We separate the number of new exposures (N_S_c_ito E_c_i^(t)) in CBG *c*_{i} at time *t* into two parts: cases from visiting POIs, which are sampled from (rmPois((S_c_i^(t)/N_c_i)sum _j=1^nlambda _p_j^(t)w_ij^(t))), and other cases not captured by visiting POIs, which are sampled from (rmBrmirmnrmormm(S_c_i^(t),lambda _c_i^(t))).

First, we calculate the number of new exposures from visiting POIs. We assume that any susceptible visitor to POI *p*_{j} at time *t* has the same independent probability (lambda _p_j^(t)) of being infected and transitioning from the susceptible (*S*) to the exposed (*E*) state. As there are (w_ij^(t)) visitors from CBG *c*_{i} to POI *p*_{j} at time *t*, and we assume that a (S_c_i^(t)/N_c_i) fraction of them are susceptible, the number of new exposures among these visitors is distributed as (rmbrmirmnrmormm(w_ij^(t)S_c_i^(t)/N_c_i,lambda _p_j^(t))approx rmPrmormirms(lambda _p_j^(t)w_ij^(t)S_c_i^(t)/N_c_i)). The number of new exposures among all outgoing visitors from CBG *c*_{i} is therefore distributed as the sum of the above expression over all POIs, (rmPois((S_c_i^(t)/N_c_i)sum _j=1^nlambda _p_j^(t)w_ij^(t))).

We model the infection rate at POI *p*_{j} at time *t*, (lambda _p_j^(t)=beta _p_j^(t)(I_p_j^(t)/V_p_j^(t))), as the product of its transmission rate (beta _p_j^(t)) and proportion of infectious individuals (I_p_j^(t)/V_p_j^(t)), where (V_p_j^(t)=sum _i=1^mw_ij^(t)) is the total number of visitors to *p*_{j} at time *t*. We model the transmission rate at POI *p*_{j} at time *t* as

$$beta _p_j^(t)=psi d_p_j^2fracV_p_j^(t)a_p_j,$$

(8)

where (a_p_j) is the physical area of *p*_{j}, and *ψ* is a transmission constant (shared across all POIs) that we fit to data. The inverse scaling of transmission rate with area (a_p_j) is a standard simplifying assumption^{57}. The dwell time fraction (d_p_jin [0,1]) is what fraction of an hour an average visitor to *p*_{j} at any hour will spend there (Supplementary Methods section 3); it has a quadratic effect on the POI transmission rate (beta _p_j^(t)) because it reduces both the time that a susceptible visitor spends at *p*_{j} and the density of visitors at *p*_{j}. With this expression for the transmission rate (beta _p_j^(t)), we can calculate the infection rate at POI *p*_{j} at time *t* as

$$lambda _p_j^(t)=beta _p_j^(t)fracI_p_j^(t)V_p_j^(t)=psi d_p_j^2fracI_p_j^(t)a_p_j.$$

(9)

For sufficiently large values of *ψ* and a sufficiently large proportion of infected individuals, the expression above can sometimes exceed 1. To address this, we simply clip the infection rate to 1. However, this occurs very rarely for the parameter settings and simulation duration that we use.

Finally, to compute the number of infectious individuals at *p*_{j} at time *t*, (I_p_j^(t)), we assume that the proportion of infectious individuals among the (w_kj^(t)) visitors to *p*_{j} from a CBG *c*_{k} mirrors the overall density of infections (I_c_k^(t)/N_c_k) in that CBG, although we note that the scaling factor *ψ* can account for differences in the ratio of infectious individuals who visit POIs. This gives

$$I_p_j^(t)=mathopsum limits_k=1^mfracI_c_k^(t)N_c_kw_kj^(t).$$

(10)

In addition to the new exposures from infections at POIs, we model a CBG-specific base rate of new exposures that is independent of POI visit activity. This captures other sources of infections, for example, household infections or infections at POIs that are absent from the SafeGraph data. We assume that at each hour, every susceptible individual in CBG *c*_{i} has a base probability (lambda _c_i^(t)) of becoming infected and transitioning to the exposed state, where

$$lambda _c_i^(t)=beta _rmbasefracI_c_i^(t)N_c_i$$

(11)

is the product of the base transmission rate *β*_{base} and the proportion of infectious individuals in CBG *c*_{i}. *β*_{base} is a constant (shared across all CBGs) that we fit to data.

By adding all of the above together, the expression for the distribution of the overall number of new exposures in CBG *c*_{i} at time *t* becomes

$$beginarraycN_S_c_ito E_c_i^(t) sim rmPoisleft(fracS_c_i^(t)N_c_imathopsum limits_j=1^nlambda _p_j^(t)w_ij^(t)right)+rmBinom(S_c_i^(t),lambda _c_i^(t)) =mathop{underbracermPoisleft(psi fracS_c_i^(t)N_c_imathopsum limits_j=1^nfracd_p_j^2a_p_jleft(mathopsum limits_k=1^mfracI_c_k^(t)N_c_kw_kj^(t)right)w_ij^(t)right)}limits_rmNew,rminfections,rmfrom,rmvisiting,rmPOIs+mathop{underbracermBinomleft(S_c_i^(t),beta _rmbasefracI_c_i^(t)N_c_iright)}limits_rmBase,rmrate,rmof,rmnew,rmCBG,rminfections.endarray$$

(12)

#### The number of new infectious and removed cases

We model exposed individuals as becoming infectious at a rate that is inversely proportional to the mean latency period *δ*_{E}. At each time step *t*, we assume that each exposed individual has a constant, time-independent probability of becoming infectious, with

$$N_E_c_ito I_c_i^(t) sim rmBinom(E_c_i^(t),1/delta _E).$$

(13)

Similarly, we model infectious individuals as transitioning to the removed state at a rate that is inversely proportional to the mean infectious period *δ*_{I}, with

$$N_I_c_ito R_c_i^(t) sim rmBinom(I_c_i^(t),1/delta _I),$$

(14)

We estimate *δ*_{E} = 96 h (refs. ^{20,58}) and *δ*_{I} = 84 h (ref. ^{20}) based on previous studies.

#### Model initialization

In our experiments, *t* = 0 is the first hour of 1 March 2020. We approximate the infectious *I* and removed *R* compartments at *t* = 0 as initially empty, with all infected individuals in the exposed *E* compartment. We further assume that the same expected initial prevalence *p*_{0} occurs in every CBG *c*_{i}. At *t* = 0, every individual in the metro area has the same independent probability *p*_{0} of being exposed *E* instead of susceptible *S*. We thus initialize the model state by setting

$$S_c_i^(0)=N_c_i-E_c_i^(0)$$

(15)

$$E_c_i^(0) sim rmBinom(N_c_i,p_0)$$

(16)

$$I_c_i^(0)=0$$

(17)

$$R_c_i^(0)=0.$$

(18)

#### Aggregate mobility and no-mobility baseline models

To test whether the detailed mobility network is necessary, or whether our model is simply making use of aggregate mobility patterns, we tested an alternative SEIR model that uses the aggregate number of visits made to any POI in the metro area in each hour, but not the breakdown of visits between specific CBGs to specific POIs. Like our model, the aggregate mobility model also captures new cases from visiting POIs and a base rate of infection that is independent of POI visit activity; thus, the two models have the same three free parameters (*ψ*, scaling transmission rates at POIs; *β*_{base}, the base transmission rate ; and *p*_{0}, the initial fraction of infected individuals). However, instead of having POI-specific rates of infection, the aggregate mobility model captures only a single probability that a susceptible person from any CBG will become infected due to a visit to any POI at time *t*; we make this simplification because the aggregate mobility model no longer has access to the breakdown of visits between CBGs and POIs. This probability (lambda _rmPOI^(t)) is defined as

$$lambda _rmPOI^(t)=psi mathopunderbracefracsum _i=1^msum _j=1^nw_ij^(t)nmlimits_rmAverage,rmmobility,rmat,rmtime,tfracI^(t)N,$$

(19)

where *m* is the number of CBGs, *n* is the number of POIs, *I*^{(t)} is the total number of infectious individuals at time *t*, and *N* is the total population size of the metro area. For the base rate of infections in CBGs, we assume the same process as in our network model: the probability (lambda _c_i^(t)) that a susceptible person in CBG *c*_{i} will become infected in their CBG at time *t* is equal to *β*_{base} times the current infectious fraction of *c*_{i} (equation (11)). Putting it together, the aggregate mobility model defines the number of new exposures in CBG *c*_{i} at time *t* as

$$N_S_c_ito E_c_i^(t) sim mathop{underbracermBinom(S_c_i^(t),lambda _rmPOI^(t))}limits_rmNew,rminfections,rmfrom,rmvisiting,rmPOIs+mathopunderbracermBinom(S_c_i^(t),lambda _c_i^(t))limits_rmBase,rmrateof,rmnew,rmCBG,rminfections.$$

(20)

All other dynamics remain the same between the aggregate mobility model and our network model, and we calibrated the models in the same way, which we describe in the ‘Model calibration’ section. We found that our network model substantially outperformed the aggregate mobility model in predictions of out-of-sample cases: on average across metro areas, the out-of-sample r.m.s.e. of our best-fit network model was only 58% that of the best-fit aggregate mobility model (Extended Data Fig. 1). This demonstrates that it is not only general mobility patterns, but specifically the mobility network that allows our model to accurately fit observed cases.

Next, to determine the extent to which mobility data could aid in modelling the case trajectory, we compared our model to a baseline SEIR model that does not use mobility data and simply assumes that all individuals within an metro area mix uniformly. In this no-mobility baseline, an individual’s risk of being infected and transitioning to the exposed state at time *t* is

$$lambda ^(t)=beta _rmbasefracI^(t)N,$$

(21)

where *I*^{(t)} is the total number of infectious individuals at time *t*, and *N* is the total population size of the metro area. As above, the other model dynamics are identical, and for model calibration we performed a similar grid search over *β*_{base} and *p*_{0}. As expected, we found both the network and aggregate mobility models outperformed the no-mobility model on out-of-sample case predictions (Extended Data Fig. 1).

### Model calibration and validation

Most of our model parameters can either be estimated from SafeGraph and US census data, or taken from previous studies (see Extended Data Table 2 for a summary). This leaves three model parameters that do not have direct analogues in the literature, and that we therefore need to calibrate with data: (1) the transmission constant in POIs, *ψ* (equation (9)); (2) the base transmission rate, *β*_{base} (equation (11)); and (3) the initial proportion of exposed individuals at time *t* = 0, *p*_{0} (equation (16)).

In this section, we describe how we fitted these parameters to published numbers of confirmed cases, as reported by *The New York Times*. We fitted models for each metro area separately.

#### Selecting parameter ranges for *ψ*, *β*

_{base} and *p*

_{0}

We select parameter ranges for the transmission rate factors *ψ* and *β*_{base} by checking whether the model outputs match plausible ranges of the basic reproduction number *R*_{0} before lockdown, as *R*_{0} has been the study of substantial previous work on SARS-CoV-2^{59}. Under our model, we can decompose *R*_{0} = *R*_{base} + *R*_{POI}, where *R*_{POI} describes transmission due to POIs and *R*_{base} describes the remaining transmission (as in equation (12)). We first establish plausible ranges for *R*_{base} and *R*_{POI} before translating these into plausible ranges for *β*_{base} and *ψ*.

We assume that *R*_{base} ranges from 0.1 to 2. *R*_{base} models transmission that is not correlated with activity at POIs in the SafeGraph dataset, including within-household transmission and transmission at POI categories that are not well-captured in the SafeGraph dataset. We chose the lower limit of 0.1 because beyond that point, base transmission would only contribute minimally to overall *R*, whereas previous studies have suggested that within-household transmission is a substantial contributor to overall transmission^{60,61,62}. Household transmission alone is not estimated to be sufficient to tip the overall *R*_{0} above 1; for example, a single infected individual has been estimated to cause an average of 0.32 (0.22–0.42) secondary within-household infections^{60}. However, because *R*_{base} may also capture transmission at POIs that are not captured in the SafeGraph dataset, to be conservative, we chose an upper limit of *R*_{base} = 2; as we describe below, the best-fit models for all 10 metro areas have *R*_{base} < 2, and 9 out of 10 have *R*_{base} < 1. We allow *R*_{POI} to range from 1 to 3, which corresponds to allowing *R*_{0} = *R*_{POI} + *R*_{base} to range from 1.1 to 5. This is a conservatively wide range, as a previous study^{59} estimated a pre-lockdown *R*_{0} of 2–3.

To determine the values of *R*_{base} and *R*_{POI} that a given pair of *β*_{base} and *ψ* imply, we seeded a fraction of index cases and then ran the model on looped mobility data from the first week of March to capture pre-lockdown conditions. We initialized the model by setting *p*_{0}, the initial proportion of exposed individuals at time *t* = 0, to *p*_{0} = 10^{−4}, and then sampling in accordance with equation (16). Let *N*_{0} be the number of initial exposed individuals sampled. We computed the number of individuals that these *N*_{0} index cases went on to infect through base transmission, *N*_{base}, and POI transmission, *N*_{POI}, which gives

$$R_rmPOI=frac{N_{rmPOI}}N_0$$

(22)

$$R_rmbase=frac{N_{rmbase}}N_0.$$

(23)

We averaged these quantities over stochastic realizations for each metro area. Supplementary Figure 6 shows that, as expected, *R*_{base} is linear in *β*_{base} and *R*_{POI} is linear in *ψ*. *R*_{base} lies in the plausible range when *β*_{base} ranges from 0.0012 to 0.024, and *R*_{POI} lies in the plausible range (for at least one metro area) when *ψ* ranges from 515 to 4,886; we therefore consider these parameter ranges when fitting the model.

The extent to which SARS-CoV-2 infections had spread in the USA by the start of our simulation (1 March 2020) is currently unclear^{63}. To account for this uncertainty, we allow *p*_{0} to vary across a large range between 10^{−5} and 10^{−2}. As described in the next section, we verified that case count data for all metro areas can be fit using parameter settings for *β*_{base}, *ψ* and *p*_{0} within this range.

#### Fitting to the number of confirmed cases

Using the parameter ranges described above, we grid-searched over *ψ*, *β*_{base} and *p*_{0} to find the models that best fit the number of confirmed cases reported by *The New York Times*^{32}. For each metro area, we tested 1,500 different combinations of *ψ*, *β*_{base} and *p*_{0} in the parameter ranges specified above, with parameters linearly spaced for *ψ* and *β*_{base} and logarithmically spread for *p*_{0}.

In the ‘Model dynamics’ section, we directly model the number of infections but not the number of confirmed cases. To estimate the number of confirmed cases, we assume that an *r*_{c} = 0.1 proportion^{20,58,64,65,66} of infections will be confirmed, and moreover that they will confirmed exactly *δ*_{c} = 168 h (7 days)^{20,66} after becoming infectious. From these assumptions, we can calculate the predicted number of newly confirmed cases across all CBGs in the metro area on day *d*,

$$N_rmcases^(rmday,d)=r_cmathopsum limits_i=1^mmathopsum limits_tau =24(d-1)+1-delta _c^24d-delta _cN_E_c_ito I_c_i^(tau ),$$

(24)

where *m* indicates the total number of CBGs in the metro area and for convenience we define (N_E_c_ito I_c_i^(tau )), the number of newly infectious people at hour *τ*, to be 0 when *τ* < 1.

From the dataset of *The New York Times*, we have the reported number of new cases (hatN_rmcases^(rmday,d)) for each day *d*, summed over each county in the metro area. We compare the reported number of cases and the number of cases that our model predicts by computing the r.m.s.e. between each of the *D* = ⌊*T*/24⌋ days of our simulations,

$$rmrrm.mrm.srm.e.=sqrt{frac1Dmathopsum limits_d=1^D{(N_rmcases^(rmdayd)-hatN_{rmcases}^(rmdayd))}^2}.$$

(25)

For each combination of model parameters and for each metro area, we quantify the model fit with the data from *The New York Times* by running 30 stochastic realizations and averaging their r.m.s.e. Note that we measure model fit based on the daily number of new reported cases (as opposed to the cumulative number of reported cases)^{67}.

Our simulation spans 1 March to 2 May 2020, and we use mobility data from that period. However, because we assume that cases will be confirmed *δ*_{c} = 7 days after individuals become infectious, we predict the number of cases with a 7-day offset, from 8 March to 9 May 2020.

#### Parameter selection and uncertainty quantification

Throughout this paper, we report aggregate predictions from different parameter sets of *ψ*, *β*_{base} and *p*_{0}, and multiple stochastic realizations. For each metro area, we: (1) find the best-fit parameter set, that is, with the lowest average r.m.s.e. on daily incident cases over stochastic realizations; (2) select all parameter sets that achieve an r.m.s.e. (averaged over stochastic realizations) within 20% of the r.m.s.e. of the best-fit parameter set; and (3) pool together all predictions across those parameter sets and all of their stochastic realizations, and report their mean and 2.5–97.5th percentiles.

On average, each metro area has 9.7 parameter sets that achieve an r.m.s.e. within 20% of the best-fitting parameter set (Supplementary Table 6). For each parameter set, we have results for 30 stochastic realizations.

This procedure corresponds to rejection sampling in an approximate Bayesian computation framework^{15}, for which we assume an error model that is Gaussian with constant variance; we pick an acceptance threshold based on what the best-fit model achieves; and we use a uniform parameter grid instead of sampling from a uniform prior. It quantifies uncertainty from two sources. First, the multiple realizations capture stochastic variability between model runs with the same parameters. Second, simulating with all parameter sets that are within 20% of the r.m.s.e. of the best fit captures uncertainty in the model parameters *ψ*, *β*_{base} and *p*_{0}. This procedure is equivalent to assuming that the posterior probability over the true parameters is uniformly spread among all parameter sets within the 20% threshold.

#### Model validation on out-of-sample cases

We validate our models by showing that they predict the number of confirmed cases on out-of-sample data when we have access to corresponding mobility data. For each metro area, we split the available dataset from the *The New York Times* into a training set (spanning from 8 March 2020 to 14 April 2020) and a test set (spanning from 15 April 2020 to 9 May 2020). We fit the model parameters *ψ*, *β*_{base} and *p*_{0}, as described in the ‘Mobility network’ section, but using only the training set. We then evaluate the predictive accuracy of the resulting model on the test set. When running our models on the test set, we still use mobility data from the test period. Thus, this is an evaluation of whether the models can accurately predict the number of cases, given mobility data, in a time period that was not used for model calibration. Extended Data Figure 1 shows that our network model fits the out-of-sample case data fairly well, and that our model substantially outperforms alternative models that use aggregated mobility data (without a network) or do not use mobility data at all (see ‘Aggregate mobility and no-mobility baseline models’). Note that we only use this train/test split to evaluate out-of-sample model accuracy. All other results are generated using parameter sets that best fit the entire dataset, as described above.

### Analysis details

In this section, we include additional details about the experiments that underlie the figures in the paper. We do not include explanations for figures that are completely described in the main text.

#### Counterfactuals of mobility reduction

Associated with Fig. 2a and Supplementary Tables 4, 5. To simulate what would have happened if we changed the magnitude or timing of mobility reduction, we modified the real mobility networks from 1 March to 2 May 2020, and then ran our models on the hypothetical data. In Fig. 2a, we report the total number of people per 100,000 of the population ever infected (that is, in the exposed, infectious and removed states) by the end of the simulation.

To simulate a smaller magnitude of mobility reduction, we interpolate between the mobility network from the first week of simulation (1–7 March 2020), which we use to represent typical mobility levels, and the actual observed mobility network for each week. Let *W*^{(t)} represent the observed visit matrix at the *t*th hour of simulation, and let *f*(*t*) = *t* mod 168 map *t* to its corresponding hour in the first week of simulation, since there are 168 h in a week. To represent the scenario in which people had committed to *α* ∈ [0, 1] times the actual observed reduction in mobility, we construct a visit matrix (tildeW_alpha ^(t)) that is an *α*-convex combination of *W*^{(t)} and *W*^{f(t)},

$$tildeW_alpha ^(t)=alpha W^(t)+(1-alpha )W^f(t).$$

(26)

If *α* is 1, then (tildeW_alpha ^(t)=W^(t)), and we use the actual observed mobility network for the simulation. On the other hand, if *α* = 0, then (tildeW_alpha ^(t)=W^f(t)), and we assume that people did not reduce their mobility levels at all by looping the visit matrix for the first week of March throughout the simulation. Any other *α* ∈ [0, 1] interpolates between these two extremes.

To simulate changing the timing of mobility reduction, we shift the mobility network by *d* ∈ [−7, 7] days. Let *T* represent the last hour in our simulation (2 May 2020, 23:00), let *f*(*t*) = *t* mod 168 map *t* to its corresponding hour in the first week of simulation as above, and similarly let *g*(*t*) map *t* to its corresponding hour in the last week of simulation (27 April–2 May 2020). We construct the time-shifted visit matrix (tildeW_d^(t))

$$tildeW_d^(t)={beginarrayllW^(t-24d) & rmif,0le t-24dle T, W^f(t-24d) & rmif,t-24d < 0, W^g(t-24d) & rmotherwise.endarray$$

(27)

If *d* is positive, this corresponds to starting mobility reduction *d* days later; if we imagine time on a horizontal line, this shifts the time series to the right by 24*d* hours. However, doing so leaves the first 24*d* hours without visit data, so we fill it in by reusing visit data from the first week of simulation. Likewise, if *d* is negative, this corresponds to starting mobility reduction *d* days earlier, and we fill in the last 24*d* hours with visit data from the last week of simulation.

#### Distribution of predicted infections across POIs

Associated with Fig. 2b, Extended Data Fig. 2 and Supplementary Fig. 10. We run our models on the observed mobility data from 1 March–2 May 2020 and record the number of predicted infections that occur at each POI. Specifically, for each hour *t*, we compute the number of expected infections that occur at each POI *p*_{j} by taking the number of susceptible people who visit *p*_{j} in that hour multiplied by the POI infection rate (lambda _p_j^(t)) (equation (9)). In Fig. 2b and Supplementary Fig. 10, we sort the POIs by their total predicted number of infections (summed over hours) and plot the cumulative distribution of infections over this ordering of POIs. In Extended Data Fig. 2, we select the POI categories that contribute the most to predicted infections and plot the daily proportion of POI infections each category accounted for (summed over POIs within the category) over time.

#### Reducing mobility by capping maximum occupancy

Associated with Figs. 2c and Extended Data Fig. 3. We implemented two partial reopening strategies: one that uniformly reduced visits at POIs to a fraction of full activity, and the other that ‘capped’ the number of hourly visits at each POI to a fraction of the maximum occupancy of that the POI. For each reopening strategy, we started the simulation on 1 March 2020 and ran it until 30 May 2020, using the observed mobility network from 1 March to 30 April 2020, and then using a hypothetical post-reopening mobility network from 1 May to 30 May 2020, corresponding to the projected impact of that reopening strategy. Because we only have observed mobility data from 1 March to 2 May 2020, we impute the missing mobility data up to 30 May 2020 by looping mobility data from the first week of March, as in the above analysis on the effect of past reductions in mobility. Let *T* represent the last hour for which we have observed mobility data (2 May 2020, 23:00). To simplify the notation, we define

$$h(t)={beginarrayllt & rmif,t < T, f(t) & rmotherwise,endarray$$

(28)

where, as above, *f*(*t*) = *t* mod 168. This function leaves *t* unchanged if there is observed mobility data at time *t*, and otherwise maps *t* to the corresponding hour in the first week of our simulation.

To simulate a reopening strategy that uniformly reduced visits to an *γ* fraction of their original level, where *γ* ∈ [0, 1], we constructed the visit matrix

$$tildeW_gamma ^(t)={beginarrayllW^h(t) & rmif,t < tau , gamma W^h(t) & rmotherwise,endarray$$

(29)

where *τ* represents the first hour of reopening (1 May 2020, 00:00). In other words, we use the actual observed mobility network up until hour *τ*, and then subsequently simulate an *γ* fraction of full mobility levels.

To simulate the reduced occupancy strategy, we first estimated the maximum occupancy *M*_{pj} of each POI *p*_{j} as the maximum number of visits that it ever had in one hour, across all of 1 March–2 May 2020. As in previous sections, let (w_ij^(t)) represent the *ij*th entry in the observed visit matrix *W*^{(t)}, that is, the number of people from CBG *c*_{i} who visited *p*_{j} in hour *t*, and let (V_p_j^(t)) represent the total number of visitors to *p*_{j} in that hour, that is, (sum _iw_ij^(t)). We simulated capping at a *β* fraction of maximum occupancy, where *β* ∈ [0, 1], by constructing the visit matrix (tildeW_beta ^(t)) for which the *ij*th entry is

$$tildew_ijbeta ^(t)={beginarrayllw_ij^h(t) & rmif,t < tau ,rmor,V_p_j^(t)le beta M_p_j, fracbeta M_p_jV_p_j^(t)w_ij^h(t) & rmotherwise.endarray$$

(30)

This corresponds to the following procedure: for each POI *p*_{j} and time *t*, we first check whether *t* < *τ* (reopening has not started) or whether (V_p_j^(t)le beta M_p_j) (the total number of visits to *p*_{j} at time *t* is below the allowed maximum (beta M_p_j)). If so, we leave (w_ij^h(t)) unchanged. Otherwise, we compute the scaling factor (fracbeta M_p_jV_p_j^(t)) that would reduce the total visits to *p*_{j} at time *t* down to the allowed maximum (beta M_p_j), and then scale down all visits from each CBG *c*_{i} to *p*_{j} proportionately. For both reopening strategies, we calculate the predicted increase in cumulative incidence at the end of the reopening period (30 May 2020) compared to the start of the reopening period (1 May 2020).

#### Relative risk of reopening different categories of POIs

Associated with Fig. 2d, Extended Data Fig. 5 and Supplementary Figs. 11, 15–24. We study separately the reopening of the 20 POI categories with the most visits in SafeGraph data. In this analysis, we follow previous studies^{28} and do not study four categories: ‘child day-care services’ and ‘elementary and secondary schools’ (because children under 13 are not well-tracked by SafeGraph); ‘drinking places (alcoholic beverages)’ (because SafeGraph seems to undercount these locations^{28}) and ‘nature parks and other similar institutions’ (because boundaries and therefore areas are not well-defined by SafeGraph). We also exclude ‘general medical and surgical hospitals’ and ‘other airport operations’ (because hospitals and air travel both involve many additional risk factors that our model is not designed to capture). We do not filter out these POIs during model fitting (that is, we assume that people visit these POIs, and that transmissions occur there) because including them still increases the proportion of overall mobility that our dataset captures; we simply do not analyse these categories, because we wish to be conservative and only focus on categories for which we are most confident that we are capturing transmission faithfully.

This reopening analysis is similar to the previous experiments on reducing maximum occupancy versus uniform reopening (see ‘Reducing mobility by capping maximum occupancy’). As above, we set the reopening time *τ* to 1 May 2020, 00:00. To simulate reopening a POI category, we take the set of POIs in that category, (mathscrV), and set their activity levels after reopening to that of the first week of March. For POIs not in the category (mathscrV), we keep their activity levels after reopening the same, that is, we simply repeat the activity levels of the last week of our data (27 April–2 May 2020): This gives us the visit matrix (tildeW^(t)) with entries

$$tildew_ij^(t)={beginarrayllw_ij^(t) & rmif,t < tau , w_ij^f(t) & rmif,tge tau ,p_jin mathscrV w_ij^g(t) & rmif,tge tau ,p_jnotin mathscrV.endarray$$

(31)

As in the above reopening analysis, *f*(*t*) maps *t* to the corresponding hour in the first week of March, and *g*(*t*) maps *t* to the corresponding hour in the last week of our data. For each category, we calculate the predicted difference between (1) the cumulative fraction of people who have been infected by the end of the reopening period (30 May 2020) and (2) the cumulative fraction of people infected by 30 May 2020 had we not reopened the POI category (that is, if we simply repeated the activity levels of the last week of our data). This seeks to model the increase in cumulative incidence by the end of May from reopening the POI category. In Extended Data Fig. 5 and Supplementary Figs. 15–24, the bottom right panel shows the predicted increase for the category as a whole, and the bottom left panel shows the predicted increase per POI (that is, the total increase divided by the number of POIs in the category).

#### Per-capita mobility

Associated with Fig. 3d, Extended Data Fig. 6 and Supplementary Fig. 3. Each group of CBGs (for example, the bottom income decile) comprises a set (mathscrU) of CBGs that fit the corresponding criteria. In Fig. 3d and Extended Data Fig. 6, we show the daily per-capita mobilities of different pairs of groups (broken down by income and by race). To measure the per-capita mobility of a group on day *d*, we take the total number of visits made from those CBGs to any POI, (sum _c_iin mathscrUsum _p_jin mathscrPsum _t=24d^24d+23w_ij^(t)), and divide it by the total population of the CBGs in the group, (sum _c_iin mathscrUN_c_i). In Supplementary Fig. 3, we show the total number of visits made by each group to each POI category, accumulated over the entire data period (1 March–2 May 2020) and then divided by the total population of the group.

#### Average predicted transmission rate of a POI category

Associated with Fig. 3e and Extended Data Tables 3, 4. We compute the predicted average hourly transmission rate experienced by a group of CBGs (mathscrU) at a POI category (mathscrV) as

$$barbeta _mathscrUmathscrV=fracsum _c_iin mathscrUsum _p_jin mathscrVsum _t=1^Tw_ij^(t)beta _p_j^(t)sum _c_iin mathscrUsum _p_jin mathscrVsum _t=1^Tw_ij^(t),$$

(32)

where, as above, (beta _p_j^(t)) is the transmission rate at POI *p*_{j} in hour *t* (equation (8)), (w_ij^(t)) is the number of visitors from CBG *c*_{i} at POI *p*_{j} in hour *t*, and *T* is the last hour in our simulation. This represents the expected transmission rate encountered during a visit by someone from a CBG in group (mathscrU) to a POI in category (mathscrV).

### Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this paper.