
Introduction

The science of networks has revolutionised research into the dynamics of interacting elements. The associated techniques have had a huge impact in a range of fields, from computer science to neurology, from social science to statistical physics. However, it could be argued that epidemiology has embraced the potential of network theory more than any other discipline. There is an extremely close relationship between epidemiology and network theory that dates back to the mid-1980s [1, 2]. This is because the connections between individuals (or groups of individuals) that allow an infectious disease to propagate naturally define a network, while the network that is generated provides insights into the epidemiological dynamics. In particular, an understanding of the structure of the transmission network allows us to improve predictions of the likely distribution of infection and the early growth of infection (following invasion), as well as allowing the simulation of the full dynamics. However, the interplay between networks and epidemiology goes further; because the network defines potential transmission routes, knowledge of its structure can be used as part of disease control. For example, contact tracing aims to identify likely transmission network connections from known infected cases and hence treat or contain their contacts, thereby reducing the spread of infection. Contact tracing is a highly effective public health measure as it uses the underlying transmission dynamics to target control efforts and does not rely on a detailed understanding of the etiology of the infection. It is clear, therefore, that the study of networks and how they relate to the propagation of infectious diseases is a vital tool for understanding disease spread and, therefore, informing disease control.
Here, we review the growing body of research concerning the spread of infectious diseases on networks, focusing on the interplay between network theory and epidemiology. The paper is split into four main sections which examine the types of network relevant to epidemiology, the multitude of ways these networks can be characterised, the statistical methods that can be applied to either infer the likely network structure or the epidemiological parameters on a realised network, and finally simulation and analytical methods to determine epidemic dynamics on a given network. Given the breadth of areas covered and the ever-expanding number of publications (over seven thousand papers have been published concerning infectious diseases and networks), a comprehensive review of all work is impossible. Instead, we provide a personalised overview of the areas of network epidemiology that have seen the greatest progress in recent years or have the greatest potential to provide novel insights. As such, considerable importance is placed on analytical approaches and statistical methods, which are both rapidly expanding fields. We note that a range of other network-based processes (such as the spread of ideas or panic) can be modelled in a similar manner to the spread of infection; however, in these contexts, the transmission process is far less clear; therefore, throughout this paper, we restrict our attention to epidemiological issues.

The Ideal Network.

We start our examination of network forms by considering the ideal network that would allow us to completely describe the spread of any infectious pathogen. Such a network would be derived from an omniscient knowledge of individual behaviour. We define $G_{i,j}(t)$ to be a time-varying, real, and high-dimensional variable that informs about the strength of all potential transmission routes from individual $i$ to individual $j$ at time $t$. Any particular infectious disease can then be represented as a function ($f_{\text{pathogen}}$) translating this high-dimensional variable into an instantaneous probabilistic transmission rate (a single real variable). In this ideal, $G$ subsumes all possible transmission networks, from sexual relations to close physical contact, face-to-face conversations, or brief encounters, and quantifies the time-varying strength of this contact. The disease function then picks out (and combines) those elements of $G$ that are relevant for transmission of this pathogen, delivering a new (single-valued) time-varying infection-specific matrix ($T_{i,j}(t) = f_{\text{pathogen}}(G_{i,j}(t))$). This infection-specific matrix then allows us to define the stochastic dynamics of the infection process for a given pathogen. (For even greater generality, we may want to let the pathogen-specific function $f$ also depend on the time since an individual was infected, such that time-varying infectivity or even time-varying transmission routes can be accommodated.)
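To make the mapping from the ideal network to an infection-specific matrix concrete, the following minimal sketch evaluates T = f(G) at a single instant for three individuals. The route names, the contact strengths, and the choice of a weighted sum for the pathogen function are all illustrative assumptions; the text only requires that the function turn the high-dimensional contact variable into one real-valued rate.

```python
from typing import Callable, Dict, List

# Hypothetical contact routes; the names are illustrative assumptions.
ROUTES = ["sexual", "physical", "conversation"]

# G[i][j] holds one contact strength per route, from individual i to j,
# at a single instant t (a full model would index by time as well).
ContactTensor = List[List[List[float]]]


def make_pathogen(weights: Dict[str, float]) -> Callable[[List[float]], float]:
    """Build a pathogen function as a weighted sum over routes (an assumption)."""
    w = [weights.get(route, 0.0) for route in ROUTES]

    def f_pathogen(g_ij: List[float]) -> float:
        return sum(wk * gk for wk, gk in zip(w, g_ij))

    return f_pathogen


def transmission_matrix(
    G: ContactTensor, f: Callable[[List[float]], float]
) -> List[List[float]]:
    """Apply f to every ordered pair: T[i][j] = f(G[i][j]), with no self-infection."""
    n = len(G)
    return [[0.0 if i == j else f(G[i][j]) for j in range(n)] for i in range(n)]


# Made-up contact strengths for three individuals.
G_t = [
    [[0.0, 0.0, 0.0], [1.0, 2.0, 5.0], [0.0, 0.0, 3.0]],
    [[1.0, 2.0, 5.0], [0.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    [[0.0, 0.0, 3.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]],
]

sti = make_pathogen({"sexual": 0.1})  # uses only the sexual route
flu = make_pathogen({"physical": 0.02, "conversation": 0.01})

T_sti = transmission_matrix(G_t, sti)
T_flu = transmission_matrix(G_t, flu)
```

The same underlying contact data yield different transmission matrices for the two hypothetical pathogens, which is the sense in which the pathogen function "picks out" the relevant elements of G.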
Obviously, the reality of transmission networks is far from this ideal. Information on the potential transmission routes within a population tends to be limited in a number of aspects. Firstly, it is rare to have information on the entire population; most networks rely on obtaining personal information on participants, and therefore participation is often limited. Secondly, information is generally only recorded on a single transmission route (e.g., face-to-face conversation or sexual partnership) and often this is merely recorded as the presence or absence of a contact rather than attempting to quantify the strength or frequency of the interaction. Finally, data on contact networks are rarely dynamic; what is generally recorded is whether a contact was present during a particular period with little consideration given to how this pattern may change over time. In the light of these departures from the ideal, it is important to consider the specifics of different networks that have been recorded or generated and understand their structure, uses, and limitations.

Realised Encounter Networks.

One of the few examples where many of the potential transmission routes within a population have been documented comes from the spread of sexually transmitted infections (STIs). In contrast with airborne infections, STIs have very obvious transmission routes, namely sex acts (or sharing needles during intravenous drug use), and as such these potential transmission routes should be easily remembered (Figure 1(a)). Generally, the methodology replicates that adopted during contact tracing: an individual is asked to name all their sexual partners over a given period, these partners are then traced and asked for their partners, and the process is repeated; this is known as snowball sampling [4] (Figure 1(b)). A related methodology is respondent-driven sampling, where individuals are paid both for their participation and the participation of their contacts while protecting each individual's anonymity [5]. This approach, while suitable for hidden and hard-to-reach populations, has a number of limitations, both practical and theoretical: recruiting people into the study, getting them to disclose such highly personal information, imperfect recall from participants, the inability to find all partners, and the clustering of contacts. In addition, there is the theoretical issue that this algorithm will only find a single connected component within the population, and it is quite likely that multiple disjoint networks exist [6].
tracing was generally only performed for a single iteration although many initial participants in high-risk groups were enrolled, while in the Manitoba study, tracing was performed as part of the routine information gathered by public health nurses. Therefore, while both provide a vast amount of information on sexual contacts, it is not clear if the results are truly a comprehensive picture of the network, and sampling biases may corrupt the resulting network [9]. In addition, compared to the ideal network, these sexual contact networks lack any form of temporal information; instead, they provide an integration of the network over a fixed time period and generally lack information on the potential strength of a contact between individuals. Despite these difficulties, they continue to provide an invaluable source of information on human sexual networks and the potential transmission routes of STIs. In particular, they point to the extreme levels of heterogeneity in the number of sexual contacts over a given period, and the variance in the number of contacts has been shown to play a significant role in early transmission dynamics [10].
One of the few early examples of the simulation of disease transmission on an observed network comes from a study of a small network of 22 injection drug users and their sexual partners [3] (Figure 1(a) ). In this work, the risk of transmission between two individuals in the network was imputed based on the frequency and types of risk behaviour connecting those two individuals. HIV transmission was modelled using a monthly time step and single index case, and simulations were run for varying lengths of (simulated) time. This enabled a node's position in the network (as characterised by a variety of measures) to be compared with how frequently it was infected during simulations, and how many other nodes it was typically responsible for infecting.
A different approach to gathering social network and behavioural data was initiated by the Human Dynamics group at MIT and illustrates how modern technology can assist in the process of determining transmission networks. One of the first approaches was to take advantage of the fact that most people carry mobile phones [11]. In 2004, 100 Nokia 6600 smart-phones preinstalled with software were given to MIT students to use over the course of the 2004-2005 academic year. Amongst other things, data were collected using Bluetooth to sense other mobile phones in the vicinity. These data gave a highly detailed account of individuals' behaviour and contact patterns. However, a limitation of this work was that Bluetooth has a range of up to 25 meters, and as such networks inferred from these data may not be epidemiologically meaningful.

Inferred Encounter Networks.

Egocentric data generally consist of information on a number of individuals (the egos) and their contacts (the alters). As such, the information gathered is very similar to that collected in the sexual contact network studies in Manitoba and Colorado Springs, but with only the initial step of the snowball sampling performed; the difference is that for the majority of egocentric data the identity of partners (alters) is unknown and therefore connections between egos cannot be inferred (Figure 1(c)). The data therefore exist as multiple independent "stars" linking the egos to the alters, which in itself provides valuable information on heterogeneities within the network. Two major studies have attempted to gather such egocentric information: the NATSAL studies of sexual contacts in the UK [13-16], and the POLYMOD study of social interactions within 8 European countries [17]. The key to generating a network from such data is to probabilistically assign each alter a set of contacts drawn from the information available from egos; in essence, using the ego data to perform the next step in the snowball sampling algorithm. The simplest way to do this is to generate multiple copies of all the egos and to consider the contacts from each ego to be "half-links"; the half-links within the network can then be connected at random, generating a configuration network [18-20]; if more information is available on the status (age, gender, etc.) of the egos and alters, then this can also be included and will reduce the set of half-links that can be joined together. However, in the vast majority of modelling studies, the egocentric data have simply been used to construct WAIFW (who-acquires-infection-from-whom) matrices [15, 17, 21] that inform about the relative levels of transmission between different groups (e.g., based on sexual activity or age) but neglect the implicit network properties.
This matrix-based approach is often reliable: for STIs it is the extreme heterogeneity in the number of contacts (which are close to being power-law or scale-free distributed; see Section 3.2) that drives the infection dynamics [22], although larger-scale structure does play a role [23]; for social interactions, it is the assortativity between (age-) groups that controls the behaviour, with the number of contacts being distributed as a negative binomial [17]. The POLYMOD matrices have therefore been extensively used in the study of the H1N1 pandemic in 2009, providing important information about the cost-effective vaccination of different age-classes [21, 24]. The general configuration model approach of randomly linking together "half-links" from each ego [18, 19] has been adopted and modified to consider the spread of STIs. In particular, simulations have been used to consider the importance of concurrency in sexual networks [25, 26], where concurrency is defined as being in two active sexual partnerships at the same time. A dynamic sexual network was simulated, with partnerships being broken and reformed such that the network density remained constant over time. The likelihood of two nodes forming a partnership depended on their degree, but this relationship could be tuned to make concurrency more or less common and to make the mixing assortative or disassortative based on the degrees of the two nodes. Transmission of an STI (such as gonorrhoea and chlamydia [25] or HIV [26]) was then simulated upon this dynamic network, showing that increasing concurrency substantially increased the growth rate during the early phase of an epidemic (and, therefore, its size after a given period of time). This greater growth rate was related to the increase in giant component size (see Section 3.1) that was caused by increased concurrency.
The alternative approach of simulating the behaviour of individuals is obviously highly complex and fraught with a great deal of uncertainty. Despite these problems, three groups have attempted just such an approach: Longini's group at Emory [28-31], Ferguson's group at Imperial [32, 33], and Eubank's group at Los Alamos/Virginia Tech [34, 35]. The models of both Longini and Ferguson are primarily agent-based models, where individuals are assigned a home and work location within which they have frequent infection-relevant contacts together with more random transmission in their local neighbourhood. The Longini models separate the entire population into subunits of 2000 individuals (for the USA) or 13000 individuals (for South-East Asia) who constitute the local population where random transmission can operate; in contrast, the Ferguson models assign each individual a spatial location and random transmission occurs via a spatial kernel. In principle, both of these models could be used to generate an explicit network model of possible contacts. The Eubank model is also agent-based, aiming to capture the movements of 1.5 million people in Portland, Oregon, USA; but these movements are then used to define a network based on whether two individuals occur in the same place (there are 180 thousand places represented in the model) at the same time. It is this network that is then used to simulate the spread of infection. While in principle this Eubank model could be used to define a temporally varying and real-valued network (where the strength of connection would be related to the type of mixing in a location and the number of people in the location), in the epidemiological publications [35], the network is considered as a static contact network in which extreme heterogeneity in numbers of contacts is again predicted, and the network has "small world" like properties (see below).
A similar approach of generating artificial networks of individuals for stochastic simulations of respiratory disease has been recently applied to influenza at the scale of the United States, and the software made generally available [36]. This software took a more realistic dynamic network approach and incorporated flight data within the United States, but was sufficiently resource-intensive to require specialist computing facilities (a single simulation taking around 192 hours of CPU time). All three models have been used to consider optimal control strategies, determining the best deployment of resources in terms of limiting transmission associated with different routes. The predicted success of various control strategies, therefore, critically depends on the strength of contacts within the home, at work, within social groups, and those occurring at random.
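The half-link construction described above can be sketched in a few lines. This is a minimal illustration rather than the procedure of any cited study: the reported degrees are made up, and self-loops and repeated links are simply discarded (one common simplification), so realised degrees can fall slightly below the reported ones.

```python
import random
from typing import List, Tuple


def configuration_network(degrees: List[int], seed: int = 0) -> List[Tuple[int, int]]:
    """Join half-links uniformly at random; drop self-loops and repeated links."""
    if sum(degrees) % 2 != 0:
        raise ValueError("total number of half-links must be even")
    rng = random.Random(seed)
    # One entry per half-link, labelled by the node that owns it.
    stubs = [node for node, k in enumerate(degrees) for _ in range(k)]
    rng.shuffle(stubs)
    edges = set()
    # Consecutive entries of the shuffled list are paired into links.
    for a, b in zip(stubs[0::2], stubs[1::2]):
        if a != b:
            edges.add((min(a, b), max(a, b)))
    return sorted(edges)


# Hypothetical numbers of contacts reported by six egos.
reported_degrees = [1, 1, 2, 2, 3, 5]
# "Multiple copies of all the egos": replicate the sample to build a population.
copies = 50
degrees = reported_degrees * copies
edges = configuration_network(degrees)
```

Restricting which half-links may be joined (for example, by age group or gender of ego and alter) would replace the single shuffled list with one list per admissible pairing class.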

Movement

The movement of passenger aircraft as collated by the International Air Transport Association (IATA) provides very useful information about the long-distance movement of individuals and hence how rapidly infection is likely to travel around the globe [37, 45, 46] . Unlike many other network models which are stochastic individual-level simulations, the work of Hufnagel et al. [37] and Colizza et al. [45] was based on stochastic Langevin equations (effectively differential equations with noise included). The early work by Hufnagel et al. [37] focused on the spread of SARS and showed a remarkable degree of similarity between predictions and the global spread of this disease. This work also showed that extreme sensitivity to initial conditions arises from the structure of the network, with outbreaks starting in different locations generating very different spatial distributions of infection. The work of Colizza was more focused towards the spread of H5N1 pandemic influenza arising in South-East Asia and its potential containment using antiviral drugs. However, it was H1N1 influenza from Mexico that initiated the 2009 pandemic, but again, the IATA flight data provided a useful prediction of the early spread [47, 48] . While such global movement networks are obviously highly important in understanding the early spread of pathogens, they unfortunately neglect more localised movements [49] and individual-level transmission networks. However, recent work has aimed to overcome this first issue by including other forms of local movement between populations [40, 50] . This work has again focused on the spread of influenza, mixing long-distance air travel with shorter range commuter movements and with the model predictions by Viboud et al. [40] showing good agreement with the observed patterns of seasonal influenza. 
An alternative form of movement network has been inferred from the "Where's George" study of the circulation of dollar bills in the USA [38] ; this provided far more information about short-range movements, but again did not really inform about the interaction of individuals.
A wide variety (and in practice the vast majority) of movements are not made by aircraft but are regular commuter movements to and from work. The network of such movements has also been studied in some detail for both the UK and USA [39, 40, 51]. The approaches adopted parallel the work done using the network of passenger aircraft, but operate at a much smaller scale, and again, influenza and smallpox have been the pathogens considered. As with the aircraft network, certain locations act as major hubs attracting lots of commuters every day; however, unlike the aircraft network, there is a tendency for the network to have a strong daily signature, with commuters moving to work during the day but travelling home again in the evening [52]. As such, the commuter network can be thought of as heterogeneous, locally clustered, temporal, and with each contact having different strengths (according to the number of commuters making each journey); however, to provide a complete description of population movement, and hence disease transmission, other causes of movement must be included [51] and strong assumptions must be made about individual-level interactions. The key question that can be readily addressed from these commuter-movement models is whether a localised outbreak can be contained within a region or whether it is likely to spread to other nodes on the network [39].
The use of static networks to model the very dynamic movement of livestock is questionable. Expanding on earlier work, Green et al. [53] simulated the early spread of FMD through movement of cattle, sheep, and pigs. Here, the livestock network was treated dynamically, with infection only able to propagate along edges on the day when that edge occurred; additional to this network spread, local transmission could also occur. These simulations enabled regional patterns of risk to a new FMD incursion to be assessed, as well as identifying markets as suitable targets for enhanced surveillance. Vernon and Keeling [55] considered the relationship between epidemics predicted from dynamic cattle networks and their static counterparts in more detail. They compared different network representations of cattle movement in the UK in 2004, simulating epidemics across a range of infectivity and infectious period parameters on the different network representations. They concluded that network representations other than the fully dynamic one (where the movement network changes every day) fail to reproduce the dynamics of simulated epidemics on the fully dynamic network.
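The difference between a static and a fully dynamic representation can be illustrated with a deliberately simplified sketch. It assumes that every movement from an infected premises transmits infection, that premises stay infectious indefinitely, and that infection arriving on a given day cannot be passed on until the following day; the farm names and dates are hypothetical. Under these assumptions, the question reduces to which premises are reachable when the order of movements is respected versus ignored.

```python
from typing import Dict, List, Set, Tuple

Movement = Tuple[int, str, str]  # (day, from_premises, to_premises)


def reachable_dynamic(movements: List[Movement], seed: str) -> Set[str]:
    """Premises reachable when a link only transmits on the day it occurs."""
    by_day: Dict[int, List[Tuple[str, str]]] = {}
    for day, src, dst in movements:
        by_day.setdefault(day, []).append((src, dst))
    infected = {seed}
    for day in sorted(by_day):
        # Only premises infected before today can pass infection on today.
        newly = {dst for src, dst in by_day[day] if src in infected}
        infected |= newly
    return infected


def reachable_static(movements: List[Movement], seed: str) -> Set[str]:
    """Premises reachable when all movements are collapsed into one static network."""
    neighbours: Dict[str, Set[str]] = {}
    for _, src, dst in movements:
        neighbours.setdefault(src, set()).add(dst)
    infected, frontier = {seed}, [seed]
    while frontier:
        node = frontier.pop()
        for nxt in neighbours.get(node, ()):
            if nxt not in infected:
                infected.add(nxt)
                frontier.append(nxt)
    return infected


# Hypothetical movements: B ships to C before B itself receives animals from A.
movements = [(1, "B", "C"), (2, "A", "B"), (3, "C", "D")]
dynamic_set = reachable_dynamic(movements, "A")
static_set = reachable_static(movements, "A")
```

In this toy example the static network suggests that infection seeded at A could reach all four premises, whereas the time-ordered movements allow it to reach only B, because the onward movement from B happened before B was infected.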

Contact Tracing Networks.

Contact tracing, and hence the networks generated by this method, can take two distinct forms. The first is when contact tracing is used to initiate proactive control. This is often the case for STIs, where identified cases are asked about their recent sexual partners, and these individuals are traced and tested; if found to be infected, then contact tracing is repeated for these secondary cases. Such a process is related to the snowball sampling that was discussed earlier, with the notable exception that tracing is only performed from known cases. Similar contact tracing may operate for the early stages of an airborne epidemic (as was seen for the 2009 H1N1 pandemic), but here, the tracing is not generally iterative as contacts are generally traced and treated so rapidly that they are unlikely to have generated secondary cases. An alternative form of contact tracing is when a transmission pathway is sought between all identified cases [1, 60, 61]. This form of contact tracing is likely to become of ever-increasing importance in the future when improved molecular techniques and statistical inference allow infection trees to be determined from genetic differences between samples of the infecting pathogen [62].
These forms of network have two main advantages but one major disadvantage. The network is often accompanied by test results for the individuals within the network; as such, we not only have information on the contact process but also on the resultant transmission of infection. In addition, when contact tracing is only performed to define an infection tree, there is the added advantage that the infection process itself defines the network of contacts, and hence there is no need for human interpretation of which forms of contact may be relevant. Unfortunately, the reliance on the infection process to drive the tracing means that the network only reflects one realisation of the epidemic process and, therefore, may ignore contacts that are of potential importance and would be needed if the epidemic was to be simulated; therefore, while such networks can inform about past outbreaks, they have little predictive power.

Surrogate Networks.

Obtaining large-scale and reliable information on who contacts whom is obviously very difficult; therefore, there is a temptation to rely on alternative data sets, where network information can be extracted far more easily, and where the data are already collected. As such, the movement networks and contact tracing networks discussed above are examples of surrogate networks, although their connection to the physical processes of infection transmission is far clearer. Other examples of networks abound [22, 63-65]; while these are not directly relevant for the spread of infection, they do provide insights into how networks form and grow, and structures that are commonly seen in surrogate networks are likely to arise in the types of network associated with disease transmission. One source of network information that would be fantastically rich and also highly informative (if not immediately relevant) is the network of friendships and contacts on social networking sites (such as Facebook); some sites have made data on their social networks available, and these data have been used to examine a range of sociological questions about online interactions [66].

Theoretical Constructs.

Given the huge complexity involved in obtaining large-scale and reliable data on real transmission networks, many researchers have instead relied on theoretically constructed networks. These networks are usually highly simplified but aim to capture some of the known (or postulated) features of real transmission networks; often the simplifications are so extreme that some analytical traction can be gained. Here, we briefly outline some of the commonly used theoretical networks and identify which features they capture; some of the results of how infection spreads on such networks are discussed more fully in Section 4.2.

Configuration Networks.

An alternative formulation that offers a compromise between tractability and realism occurs when individuals that exist in fully interconnected cliques have randomly assigned links within the entire population [69, 70] ( Figure 1(d) ). As such, these networks mimic the strong interactions within families and the weaker contacts between them. While such models offer a significant improvement over configuration networks and capture the known importance of the household in transmission, they make no allowance for clustering between households due to spatial proximity. Hierarchical metapopulation models [71] allow for this form of additional structure, where households (or other groupings) are themselves grouped in an ascending hierarchy of clustering.

Lattices and Small

Small world networks improve upon the rigid structure of the lattice by allowing a low number of random contacts across the entire space (Figure 1(e)). Such long-range contacts allow infection to spread rapidly through the population and vastly reduce the shortest path length between individuals [74]; this is popularly known as six degrees of separation, from the concept that any two individuals on the planet are linked through at most six friends or contacts [75]. Therefore, small world networks offer a step towards reality, capturing the local nature of transmission and the potential for long-range contacts [76, 77]; however, they suffer from neglecting heterogeneity in the number of contacts and the tight clustering of contacts within households or social settings.
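The effect of a handful of long-range contacts on path length can be checked directly. The sketch below is an illustration under stated assumptions: it uses a one-dimensional ring as the lattice, adds shortcuts rather than rewiring existing links, and the network size and number of shortcuts are arbitrary choices.

```python
import random
from collections import deque
from typing import Dict, Set

Graph = Dict[int, Set[int]]


def ring_lattice(n: int, k: int) -> Graph:
    """Link each node to its k nearest neighbours on either side of a ring."""
    adj: Graph = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj


def add_shortcuts(adj: Graph, m: int, seed: int = 0) -> Graph:
    """Return a copy with m extra links between random, previously unlinked nodes."""
    rng = random.Random(seed)
    new = {i: set(nb) for i, nb in adj.items()}
    nodes = sorted(new)
    added = 0
    while added < m:
        a, b = rng.sample(nodes, 2)  # two distinct nodes
        if b not in new[a]:
            new[a].add(b)
            new[b].add(a)
            added += 1
    return new


def mean_path_length(adj: Graph) -> float:
    """Average shortest path length over reachable ordered pairs, via breadth-first search."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


lattice = ring_lattice(100, 1)
small_world = add_shortcuts(lattice, 5)
lattice_length = mean_path_length(lattice)
small_world_length = mean_path_length(small_world)
```

Each node keeps its local neighbours, so the local character of transmission is preserved while the typical distance between individuals falls.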

Expected Network Properties.

Here, we have shown that a wide variety of network structures have been measured or synthesised to understand the spread of infectious diseases. Clearly, with such a range of networks, no clear consensus can be drawn on the types of underlying network structures that are generally present; in part, this is because different studies have focused on different infectious diseases and different diseases require different transmission routes.
Three fundamental problems still exist in the study of networks. Firstly, are there relatively low-dimensional ways of capturing key aspects of a network's structure? What constitutes a key aspect will vary with the problem being studied, but for epidemiological applications, it should be hoped that a universal set of network characteristics may emerge. There is then the task of assessing reasonable and realistic ranges for these key variables based on values computed for known transmission networks; unfortunately, very few transmission networks have been recorded in any degree of detail, although modern electronic devices may simplify the process in the future. Secondly, there is the related statistical problem of inferring plausible complete networks from the partial information collected by methods such as contact tracing. This is equivalent to seeking an underlying model for the network connections that is consistent with the known partial information, and hence, has strong resonance with the more mechanistically motivated models in Section 2.3. Even when the network is fully realised (and an epidemic observed), there is considerable statistical difficulty in attributing risk to particular contact types. Finally, there are the key questions of predicting the dynamics of infection on any given network; while for many complex networks direct simulation is the only approach, for other simplified networks some analytical traction can be achieved, which helps to provide more generic insights into which elements of network structure are most important. These three key areas are discussed below.

Network Properties

Real networks can exhibit staggering levels of complexity. The challenge faced by researchers is to try and make sense of these structures and reduce the complexity in a meaningful way. In order to make any sense of the complexities present, researchers over several decades have defined a large variety of measurable properties that can be used to characterise certain key aspects [63, 65, 86] . Here, we describe the definitions of the most important characterisations of complex networks (in our view), and outline their impact on disease transmission models.

Degrees, Distributions, and Correlations.

For the extreme case of P(k) following an unbounded power law and assuming equal transmission across all edges, Pastor-Satorras and Vespignani [87] showed that the classic result of the epidemic threshold from mean field theory [10] breaks down. In real transmission networks, the distribution of degree is often heavily skewed, and occasionally follows a power law [22], but is always bounded, leading to the recovery of an epidemic threshold, but one which is much lower than expected in evenly mixed populations [88].
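A short calculation illustrates how bounding the degree distribution restores a threshold. The sketch assumes the standard heterogeneous mean-field result for uncorrelated networks, in which the critical transmissibility scales as ⟨k⟩/⟨k²⟩; this formula is not stated in the text above and is used here only as an assumption. The exponent 2.5 and the cut-offs are arbitrary illustrative values.

```python
def moment(gamma: float, kmax: int, order: int) -> float:
    """Moment of a truncated power law P(k) proportional to k**-gamma, k = 1..kmax."""
    degrees = range(1, kmax + 1)
    weights = [k ** -gamma for k in degrees]
    norm = sum(weights)
    return sum((k ** order) * w for k, w in zip(degrees, weights)) / norm


def threshold_ratio(gamma: float, kmax: int) -> float:
    """Mean degree divided by mean squared degree (assumed threshold scaling)."""
    return moment(gamma, kmax, 1) / moment(gamma, kmax, 2)


gamma = 2.5
for kmax in (10, 100, 1000, 10000):
    mean_degree = moment(gamma, kmax, 1)
    # Second value: the same quantity for a network where every node has the mean degree.
    print(kmax, threshold_ratio(gamma, kmax), 1.0 / mean_degree)
```

For any finite cut-off the ratio stays positive, but it keeps shrinking as the cut-off grows and always lies below the value for a homogeneous network with the same mean degree, consistent with a threshold that exists but is much lower than in evenly mixed populations.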

3.4.

The clustering property of networks is essential to the understanding of transmission processes. In clustered networks, rapid local depletion of susceptible individuals plays a hugely important role in the dynamics of spread [95, 96] ; for a more analytic treatment of this, see Section 4.2 below.

Subgraphs.

Considering higher order structures can be very informative but is more involved. Milo and coworkers began by looking for specific patterns of connections between nodes in small subgraphs, dubbed motifs. Given a connected subgraph of size 3, for example, there are 13 possible motifs. Statistically, some of these appear more often and are found to be overrepresented in certain real networks compared to random networks [97] . Understanding the motif composition of a complex network has been shown to improve the predictive power of deterministic models of transmission when motifs are explicitly modelled (see Section 4.2 and House et al. [98] ).
Although the impact of communities in transmission processes has not been fully explored, a few studies have shown it can have a profound impact on disease dynamics [100, 101]. An alternative measure of how "well-knit" a graph is, named conductance [102] and most widely used in the computer science literature, has also been found to be important in a range of networks [103].
In the sections that follow, we discuss how these network properties can be inferred statistically and the improvements in our understanding of the transmission of infection in networks that have come as a result.

Analytic Methods.

In this section, we use the word "analytic" broadly, to imply models that are directly numerically integrable, without the use of Monte Carlo simulation methods, rather than systems for which all results can be written in terms of fundamental functions, of which there are very few in epidemiology. Analytic approaches to transmission of infection on networks fall into three broad categories. Firstly, there are approaches that calculate exact invasion thresholds and final sizes for special networks. Secondly, there are approaches for calculating exact transient dynamics, including epidemic peak heights and times, but again, these only hold in special networks. Finally, there are approaches based on moment closure that are give approximately correct dynamics for a wide class of networks.
Before considering these approaches on networks, it is worth considering what is meant by nonnetwork mixing and showing explicitly how it gives rise to the standard transmission terms of familiar differential equation models. Nonnetwork mixing can be taken to have one of two meanings, depending on context: either that every individual in the population is weakly connected to every other (the mean-field assumption), or that an Erdős-Rényi random graph defines the transmission network. To see how this determines the epidemic dynamics, we consider a population of N individuals, with a homogeneous independent probability q that any pair of individuals is linked on the network, which gives each individual a mean number of edges n = q(N − 1). We then assume that the transmission rate for infection across an edge is τ and that the proportion of the population infectious at a time t is I(t); then, the force of infection experienced by an average susceptible in the population is nτI(t) ≡ βI(t). The quantity β, therefore, defines a population-level transmission rate that can be interpreted in one of two ways as N → ∞. In the case where the population is assumed to be fully connected, q is held at unity, and so τ is reduced as N is increased to hold q(N − 1)τ constant. In the case where the population is connected on a random graph, q is reduced as N is increased to hold n constant.
In either case, having defined an appropriate population-level transmission rate, a stochastic susceptible-infectious model of transmission is defined through a Markov chain, in which a population with X susceptible individuals and Y infectious individuals transitions stochastically to a population with X − 1 susceptible individuals and Y + 1 infectious individuals at rate βXY/(N − 1). The exact mean behaviour of such a system in the limit N → ∞ then has its transmission behaviour captured by

$$\dot{S} = -\beta S I, \qquad \dot{I} = \beta S I,$$

where S and I are the proportions of the population that are susceptible and infectious, respectively.
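The correspondence between the Markov chain and its large-population limit can be checked numerically. The following is a minimal sketch, assuming illustrative parameter values (N, n, τ, the initial number of infectious individuals and the time horizon are all hypothetical choices, as are the function names): it simulates the susceptible-infectious chain with transition rate βXY/(N − 1) using the Gillespie algorithm and compares the mean infectious proportion with the logistic solution of the deterministic susceptible-infectious model.

```python
import math
import random


def simulate_si(N, beta, y0, t_end, rng):
    """Gillespie simulation of the SI Markov chain.

    The only event is (X, Y) -> (X - 1, Y + 1), at rate beta * X * Y / (N - 1).
    Returns the infectious proportion Y / N at time t_end.
    """
    x, y, t = N - y0, y0, 0.0
    while x > 0:
        rate = beta * x * y / (N - 1)
        t += rng.expovariate(rate)  # exponential waiting time to next infection
        if t > t_end:
            break
        x, y = x - 1, y + 1
    return y / N


def logistic_si(i0, beta, t):
    """Closed-form solution of dI/dt = beta * I * (1 - I) with I(0) = i0."""
    e = math.exp(beta * t)
    return i0 * e / (1.0 - i0 + i0 * e)


rng = random.Random(1)
N, n, tau = 5000, 10, 0.1   # hypothetical population size, mean degree, edge rate
beta = n * tau              # population-level transmission rate, beta = n * tau
runs = [simulate_si(N, beta, y0=250, t_end=3.0, rng=rng) for _ in range(20)]
mean_sim = sum(runs) / len(runs)
mean_ode = logistic_si(250 / N, beta, 3.0)
print(f"stochastic mean: {mean_sim:.3f}, deterministic: {mean_ode:.3f}")
```

For large N and a non-negligible initial number of infectious individuals, the two values should agree closely; for small N or a single initial case, stochastic effects would be expected to dominate.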
The estimation of this quantity for complex disease histories, from data likely to be available, is considered by Wallinga and Lipsitch [116]. We, therefore, focus on the transmission process, since this is most affected by network structure, and other elements of biological realism typically act at the individual level. An important caveat to this, however, is when an infected individual's level of transmissibility varies over the course of their infectious period, which sets up correlations between the processes of transmission and recovery that pose a particular challenge for analytic work, especially in structured populations, as noted by, for example, Ball et al. [117].

Exact Invasion.

where K_km defines the number of cases in individuals with k contacts from an individual with m contacts during the early stages of the epidemic. Here and elsewhere in this section we use square brackets to represent the numbers of different types on the network; hence, [m] is the number of individuals with m edges in the network and [km] is the number of edges between individuals with k and m contacts, respectively. In addition, p is the probability of infection eventually passing across the edge between a susceptible-infectious pair (for Markovian recovery rate γ and transmission rate τ this is given by p = τ/(τ + γ)). The basic reproductive ratio is given by the dominant eigenvalue of the next-generation matrix.
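As a minimal sketch of this eigenvalue calculation, the code below builds a next-generation matrix and extracts its dominant eigenvalue with NumPy. Because the defining expression for K_km is not reproduced in the text above, the sketch assumes the form K_km = p (m − 1) [km] / (m [m]), in which an early case with m contacts has m − 1 edges left to transmit along and a fraction [km]/(m[m]) of its edges lead to individuals with k contacts; [km] is taken to count edge ends, so that summing over k gives m[m]. The degree classes, counts and rates are hypothetical.

```python
import numpy as np


def next_generation_matrix(edge_counts, node_counts, degrees, p):
    """Assumed form: K[k, m] = p * (m - 1) * [km] / (m * [m])."""
    degrees = np.asarray(degrees, dtype=float)
    node_counts = np.asarray(node_counts, dtype=float)
    edge_counts = np.asarray(edge_counts, dtype=float)
    # One scaling factor per infector class m (i.e. per column).
    col_scale = p * (degrees - 1.0) / (degrees * node_counts)
    return edge_counts * col_scale[np.newaxis, :]


def basic_reproductive_ratio(K):
    """Dominant eigenvalue (spectral radius) of the next-generation matrix."""
    return float(np.max(np.abs(np.linalg.eigvals(K))))


degrees = np.array([1, 2, 5])            # hypothetical degree classes
node_counts = np.array([500, 300, 200])  # hypothetical [m] for each class
tau, gamma = 0.3, 1.0
p = tau / (tau + gamma)

# [km] under proportionate (configuration-model) mixing of edge ends.
ends = degrees * node_counts
edge_counts = np.outer(ends, ends) / ends.sum()

K = next_generation_matrix(edge_counts, node_counts, degrees, p)
R0 = basic_reproductive_ratio(K)

# Cross-check against p * <k(k-1)> / <k> for the same degree distribution.
mean_k = ends.sum() / node_counts.sum()
R0_formula = p * (degrees * (degrees - 1) * node_counts).sum() / node_counts.sum() / mean_k
print(R0, R0_formula)
```

With proportionate mixing the matrix has rank one, so the dominant eigenvalue coincides with the familiar configuration-model expression; for assortative or otherwise structured [km] the eigenvalue calculation is unchanged but the cross-check no longer applies.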
By considering the number of secondary cliques infected by a clique with one initial infected individual, a threshold called R_* can be defined. (For the configuration model of households where each household is of the same size and each individual has the same number of random connections outside the household, the threshold R_* is given later as (20); however, the methodology is far more general.) The calculation of the invasion threshold for the recently defined triangular configuration model [118, 119] involves calculating both the expected number of secondary infectious individuals and triangles rather than just working at the individual level. Trapman [120] deals with how these sorts of results can be related to more general networks through bounding. A general feature of clustered networks for which exact thresholds have been derived so far is that there is a local-global distinction in transmission routes, with a general theory of this given by Ball and Neal [121], where an "overlapping groups" and a "great circle" model are also analysed. Nevertheless, care still has to be taken in choosing threshold parameters that are mathematically well behaved and easily calculated (e.g., [122]).

Exact Final Size.

The most sophisticated and general way to obtain exact results for the expected final size of a major outbreak on a network is called the susceptibility set argument, and the most general version is currently given by Ball et al. [117]. We give an example of this kind of argument from Diekmann et al. [123], who consider the simpler case of a network in which each individual has n contacts. Where there is a probability p of infection passing across a given network link (so for transmission and recovery at rates τ and γ, respectively, p = τ/(τ + γ)), the probability that an individual avoids infection is given by
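The expression itself is not reproduced in the text above, so the following sketch rests on an assumption: it uses the standard self-consistency argument for a large, locally tree-like network in which every individual has n contacts. Under that assumption, the probability s̃ that a neighbour reached along an edge escapes infection through its other n − 1 edges satisfies s̃ = (1 − p + p s̃)^(n − 1), and a randomly chosen individual avoids infection with probability (1 − p + p s̃)^n. The function name and the tolerance are illustrative.

```python
def final_size_regular(n, p, tol=1e-12, max_iter=100000):
    """Expected final size of a major outbreak on an n-regular, tree-like network.

    Assumed equations (standard for this setting, not taken from the text):
        s_tilde = (1 - p + p * s_tilde) ** (n - 1)
        P(avoid infection) = (1 - p + p * s_tilde) ** n
    Iterating from s_tilde = 0 converges to the smallest root in [0, 1],
    which is the root relevant to a major outbreak.
    """
    s = 0.0
    for _ in range(max_iter):
        s_new = (1.0 - p + p * s) ** (n - 1)
        if abs(s_new - s) < tol:
            s = s_new
            break
        s = s_new
    escape = (1.0 - p + p * s) ** n
    return 1.0 - escape


tau, gamma = 9.0, 1.0
p = tau / (tau + gamma)  # p = 0.9
print(final_size_regular(3, p))    # large outbreak: p * (n - 1) > 1
print(final_size_regular(3, 0.3))  # below threshold: p * (n - 1) < 1
```

Below the invasion threshold p(n − 1) = 1 the only root in [0, 1] is s̃ = 1, so the computed final size is zero to within the tolerance.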

Exact Dynamics.

Some of the earliest work on infectious diseases involved the exact solution of master equations (where the probability of the population being in each possible configuration is calculated) on small, fully connected graphs as summarised in Bailey [126]. The rate at which the complexity of the system of master equations grows means that these equations quickly become too complex to integrate for the most general network. The presence of symmetries in the network, however, does mean that automorphism-driven lumping is one way to manipulate the master equations (whilst preserving the full stochastic information about the system) for solution [127]. At present, this technique has only been applied to relatively simple networks; however, there are no other highly general methods of deriving exact lower-dimensional systems of equations from the master equations. Nevertheless, other specific routes do exist that allow exact systems of equations of lower dimensionality to be derived for special networks. For static networks constructed using the configuration model (where individuals have heterogeneous degree but connections are made at random such that the presence of short loops can be ignored in a large network, see Figure 1(c)), an exact system of equations for SIR dynamics in the limit of large network size was provided by Ball and Neal [69]. This construction involves attributing to each node an "effective degree", which starts the epidemic at its actual degree, and measures connections still available as routes of infection and is, therefore, reduced by transmission and recovery. Using notation consistent with elsewhere in this paper (and ignoring the global infection terms that were included by Ball and coworkers) this yields the relatively parsimonious set of equations

$$\dot{S}_k = -\rho\left[(\tau + \gamma)\,k S_k - \gamma (k+1) S_{k+1}\right],$$
While R_0 can be derived using expressions like (11), calculation of the asymptotic early growth rate r requires systems of ODEs like (14). If we assume that transmission and recovery are Markovian processes with rates τ and γ, respectively, two measures of early behaviour are

$$R_0 = \frac{\tau}{\tau + \gamma}\,\frac{\langle k(k-1) \rangle}{\langle k \rangle}, \qquad r = \tau\,\frac{\langle k(k-2) \rangle}{\langle k \rangle} - \gamma, \tag{15}$$

where ⟨·⟩ denotes the average over the degree distribution. These quantities tell us that the susceptibility to invasion of a network increases with both the mean and the variance of the degree distribution. This closely echoes the results for risk-structured models [10] but with an extra term of −1 due to the network, representing the fact that the route through which an individual acquired infection is closed off for future transmission events. For more structured networks with a local-global distinction, there are two limits in which exact dynamics can also be derived. If the network is composed of m communities of size n_1, ..., n_m, with the between-community (global) mixing determined by a Poisson process with rate n_G and the within-community (local) mixing determined by a Poisson process with rate n_L, then in the limit as the communities become large, n_i → ∞, the epidemic dynamics on the system are Ṡ
where Z_n^∞(τ, γ) is the expected final size of an epidemic in a household of size n with one initial infected. Of course, the within- and between-community mixing for real networks is likely to be much more complex than may be captured by a Poisson process, but these two extremes can provide useful insights. These models show that network structure in the form of communities reduces the potential for an infectious disease to spread, and hence, greater transmission rates are required for the disease to exceed the invasion threshold.
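Returning to the configuration-model measures of early behaviour discussed earlier in this subsection, the dependence on the mean and variance of the degree distribution can be illustrated with a short calculation. The sketch below assumes the standard configuration-model expressions R_0 = [τ/(τ + γ)] ⟨k(k − 1)⟩/⟨k⟩ and r = τ⟨k(k − 2)⟩/⟨k⟩ − γ for Markovian transmission and recovery; the two degree distributions and the rates are hypothetical and chosen to share the same mean degree.

```python
def early_growth_measures(degree_counts, tau, gamma):
    """Return (R0, r) for a configuration-model network.

    degree_counts maps each degree k to the number of nodes with that degree.
    Assumed formulas:
        R0 = tau / (tau + gamma) * (<k^2> - <k>) / <k>
        r  = tau * (<k^2> - 2 <k>) / <k> - gamma
    """
    total = sum(degree_counts.values())
    mean_k = sum(k * c for k, c in degree_counts.items()) / total
    mean_k2 = sum(k * k * c for k, c in degree_counts.items()) / total
    r0 = tau / (tau + gamma) * (mean_k2 - mean_k) / mean_k
    r = tau * (mean_k2 - 2.0 * mean_k) / mean_k - gamma
    return r0, r


homogeneous = {4: 1000}            # every node has degree 4
heterogeneous = {1: 500, 7: 500}   # same mean degree (4), larger variance

r0_hom, r_hom = early_growth_measures(homogeneous, tau=0.5, gamma=1.0)
r0_het, r_het = early_growth_measures(heterogeneous, tau=0.5, gamma=1.0)
print(r0_hom, r_hom)
print(r0_het, r_het)
```

In this example the homogeneous network sits at the invasion threshold (R_0 at one and r at zero), while the heterogeneous network with the same mean degree is well above it, which is the variance effect described in the text.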

Approximate

where O represents empty sites within the network that are not currently occupied by individuals, and the parameter ε = 0.8093 accounts for the clustering within lattice-based networks. House and Keeling [133] considered a model of infection transmission and contact tracing on a network, where the closure scheme for [ABC] triples was asymmetric in A and C; this allowed the natural conservation of quantities in a highly clustered system. The work on dynamical PGF models [129] can be seen as an elegant simplification of this pairwise approach that is valid for SIR-type infection dynamics on configuration model networks. The equations can be reformulated as S = g(θ),
where g is the probability generating function for the degree distribution, p_S and p_I correspond to the number of contacts of a susceptible that are susceptible or infected, respectively, and θ is defined as the probability that a link randomly selected from the entire network has not been associated with the transmission of infection. Here, the closure assumption is implicit in the definition of S; that is, an individual only remains susceptible if none of its links has seen the transmission of infection, and the probability is independent for each link, which is comparable to the assumptions underlying the formulation by Ball and Neal [69], equation (14). The precise link between this PGF formulation and the pairwise approach is discussed more fully in House and Keeling [134]. There are many other extensions of this general methodology that are possible. Writing ODEs for the time evolution of triples and closing at a higher order allows the consideration of the epidemiological consequences of varying motif structure [98]. Sharkey et al. [135] considered closure at triple level on directed networks, which involved a more sophisticated treatment of third-order clustering due to the larger repertoire of three-motifs in directed (as compared to undirected) networks. It is also possible to combine stochastic and network moment closure [136]. Time-varying, dynamical networks, particularly applied to sexually transmitted infections where partnerships vary over the course of an epidemic, were considered using approximate ODE-based models by Eames and Keeling [137] and Volz and Meyers [138]. Sharkey [139] considered models appropriate for local networks with large shortest path lengths, where the generic indices μ, λ in (21) stand for node numbers i, j rather than node degrees k, l.
Another approach is to approximate the transmission dynamics in the standard (mean-field) differential equation models. Essentially, this is a form of moment closure at the level of pairs rather than triples. For example, in Roy and Pascual [140] the transmission rate takes the polynomial form

$$\text{Transmission rate} \propto S^{p} I^{q},$$

where the exponents, p and q, are typically fitted to simulated data but are thought to capture the spatial arrangement of susceptible and infected nodes. Also, Kiss et al. [141] suggest

$$\text{Transmission rate to } S_k \text{ from } I_l \propto k(l-1)\,S_k I_l \tag{26}$$

as a way of accounting for each infected "losing" an edge to its infectious parent. Finally, a very recent work [142] presents a dynamical system to capture epidemic dynamics on triangular configuration model networks; the relationship between this and other ODE approaches is likely to be an active topic for future work.

Inference on Networks

The presence of contact network data for populations provides a unique opportunity to estimate the importance of various modes of disease transmission from disease incidence or contact tracing data. For example, given knowledge of the rate of contact between two individuals, it is possible to infer the probability that a contact results in an infection. If data on mere connectivity (i.e., a 1 if the individuals are connected and 0 otherwise) is available, then it is still possible to infer a rate of infection between connected individuals. Thus, the detail of the inference is determined to a large extent by the available detail in the network data [145] .

Availability of Data.

Epidemic models are defined in terms of times of transitions between infection states, for example a progression from susceptible, to infected, to removed (i.e., recovered with lifelong immunity or dead) in the so-called "SIR" model. Statistical inference requires firstly that observations of the disease process are made: at the very least, this comprises the times of case detections, remembering that infection times are always censored (you only ever know you have a cold a few days after you caught it). In addition, covariate data on the individuals provides structure to the population and begins to enable the statistician to make statements about the importance of individuals' relationships to one another in terms of disease transmission. Therefore, any covariate data, however slight, effectively implies a network structure upon which disease transmission can be superimposed.

Inference on Homogeneous Models.

For homogeneous models the basic reproduction number, or R_0, has several equivalent definitions and can be defined in terms of the transmission rate β and removal rate γ. For nonhomogeneous models, the definitions are not equivalent; see, for example, [122].
where w(·) is the probability density function for the generation interval t_j − t_i, that is, the time between infector i's infection time and infectee j's infection time. Of course, infection times are never observed in practice so symptom onset times are used as a proxy, with the assumption that the distribution of infection time to symptom onset time is the same for every individual. Bayesian methods are used to infer "late-onset" cases from known "early-onset" cases, but large uncertainty of course remains when inferring the reproductive ratio close to the current time, as there exists large uncertainty about the number of cases detected in the near future. Additionally, a model for w(·) must be chosen (see [157]). The tradeoff in the simplicity of estimating R_0 in these ways, however, is that although a population-wide R_0 gives a measure of whether an epidemic is under control on a wide scale, it gives no indication as to regional-level, or even individual-level, risk. Moreover, the two examples quoted above do not even attempt to include population heterogeneity in their models, though the requirement for its inclusion is difficult to ascertain in the absence of model diagnostics results. It is postulated, therefore, that a simple measure of R_0, although simple to obtain, is not sufficient to make tactical control-policy decisions. In these situations, knowledge of both the transmission rate and removal rate is required.
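To make the role of w(·) concrete, the sketch below implements one common estimator of this type. It is an assumption that this matches the expression intended above, which is not reproduced in the text: the relative likelihood that case i infected case j is taken to be w(t_j − t_i) normalised over all candidate infectors of j, and the case reproduction number of i is the sum of these probabilities over j. The onset times and the gamma-distributed generation interval (shape 2, scale 1.5 days) are hypothetical.

```python
import math


def gamma_pdf(x, shape, scale):
    """Gamma probability density; zero for non-positive intervals."""
    if x <= 0:
        return 0.0
    return x ** (shape - 1) * math.exp(-x / scale) / (math.gamma(shape) * scale ** shape)


def case_reproduction_numbers(onset_times, w):
    """Expected number of secondary cases attributed to each case.

    p_ij = w(t_j - t_i) / sum_k w(t_j - t_k) is the assumed relative
    likelihood that case i infected case j; R_i = sum_j p_ij.
    """
    n = len(onset_times)
    R = [0.0] * n
    for j in range(n):
        weights = [
            w(onset_times[j] - onset_times[i]) if i != j else 0.0
            for i in range(n)
        ]
        total = sum(weights)
        if total == 0.0:
            continue  # no plausible infector, e.g. the index case
        for i in range(n):
            R[i] += weights[i] / total
    return R


onsets = [0.0, 2.0, 3.0, 3.5, 6.0, 7.0]  # hypothetical symptom onset days


def w(dt):
    return gamma_pdf(dt, shape=2.0, scale=1.5)


R = case_reproduction_numbers(onsets, w)
print([round(r, 2) for r in R])
```

Each case other than the index case contributes a total weight of one, so the estimates sum to the number of cases with at least one plausible infector. The most recent case necessarily receives an estimate of zero, which illustrates the right-censoring problem mentioned above for estimates close to the current time.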

Inference on Household Models.

The ability to relax assumptions further has been predominantly due to the use of Markov chain Monte Carlo (MCMC) methods, as first considered by O'Neill et al. [162] for household models following earlier studies of Gibson and Renshaw [164] and O'Neill and Roberts [165], who focused on single, large outbreaks. This methodology has been used in combination with simulation and data augmentation approaches to tailor inference methods for specific data sets of interest; for example, Neal and Roberts [166] consider a model with a spatial component of distance between households and data containing details of dates of symptoms and appearance of rash. It has also resulted in a growing number of novel methods for inference; for example, Clancy and O'Neill [167] consider a rejection sampling procedure and Cauchemez et al. [168] introduce a constrained simulation approach. Even greater realism can be captured within household models by considering the different compositions of households and, therefore, the weighted nature of contacts within households. For example, Cauchemez et al. [169] considered household data from the Epigrippe study of influenza in France 1999-2000 and showed that children play a key role in the transmission of influenza and the risk of bringing infection into the household.
Whilst new developments are appearing at an increasing rate, the significant majority of methods are based upon final size data and are developed for SIR disease models, perhaps due in part to the simplification of arguments for deriving final size distributions. One key, but still unanswered question from these analyses of household epidemics is how the transmission rate between any two individuals in the household scales with the total number of individuals in the household (compare Longini and Koopman [160] and Cauchemez et al. [169] ). Intuition would suggest that in larger households the mixing between any two individuals is decreased, but the precise form of this scaling is still unclear, and much more data on large household sizes is required to provide a definitive answer.

Inference on Fully Heterogeneous Populations.

Perhaps the holy grail of statistical inference on epidemics is to make use of an individual-level model to describe heterogeneous populations at the limit of granularity. In this respect, Bayesian inference on stochastic mechanistic models using MCMC has perhaps shown the most promise, allowing inference to be made on transmission parameters and using data augmentation to estimate the infectious period.
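The mechanics of MCMC-based inference can be illustrated on a deliberately simple toy problem. The sketch below is not the algorithm of any of the studies cited here and omits data augmentation entirely: it assumes that each of a number of exposed contacts is infected independently with unknown probability p, places a uniform prior on p, and samples the posterior with a random-walk Metropolis algorithm. The data (30 infections among 100 exposed contacts), the proposal step size and the burn-in length are hypothetical.

```python
import math
import random


def log_posterior(p, infected, exposed):
    """Binomial log-likelihood plus a uniform prior on p in (0, 1)."""
    if not 0.0 < p < 1.0:
        return -math.inf
    return infected * math.log(p) + (exposed - infected) * math.log(1.0 - p)


def metropolis(infected, exposed, n_iter, step, rng):
    """Random-walk Metropolis sampler for the transmission probability p."""
    p = 0.5
    lp = log_posterior(p, infected, exposed)
    samples = []
    for _ in range(n_iter):
        proposal = p + rng.gauss(0.0, step)
        lp_new = log_posterior(proposal, infected, exposed)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, lp_new - lp)):
            p, lp = proposal, lp_new
        samples.append(p)
    return samples


rng = random.Random(42)
samples = metropolis(infected=30, exposed=100, n_iter=20000, step=0.1, rng=rng)
kept = samples[2000:]  # discard burn-in
posterior_mean = sum(kept) / len(kept)
print(posterior_mean)
```

For this toy model the posterior is available in closed form as a Beta(31, 71) distribution, which provides a check on the sampler; in realistic epidemic models no such closed form exists, and unobserved infection times would typically be treated as additional unknowns to be sampled.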
An analysis of the 1861 outbreak of measles in Hagelloch by Neal and Roberts [166] demonstrates the use of a reversible jump MCMC algorithm to infer disease transmission parameters and infectious period, whilst additionally allowing formal comparisons to be made between several nested models. With the uncertainty surrounding model choice, such methodology is vital to enable accurate understanding and prediction. This approach has since been combined with the algorithm of O'Neill and Roberts [165] and used to analyse disease outbreaks such as avian influenza and foot and mouth disease in livestock populations [146, 147, 170] and MRSA outbreaks in hospital wards [171] .

Discussion

The use of networks is clearly a rapidly growing field in epidemiology. By assessing (and quantifying) the potential transmission routes between individuals in a population, researchers are able both to better understand the observed distribution of infection and to create better predictive models of future prevalence. We have shown how many of the structural features in commonly used contact networks can be quantified and how there is an increasing understanding of how such features influence the propagation of infection. However, a variety of challenges remain.

Open Questions.

Several open problems remain if networks are to continue to influence predictive epidemiology. The majority of these stem from the difficulty in obtaining realistic transmission networks for a range of pathogens. Although some work has been done to elucidate the interconnected structure of sexual encounters (and hence the sexual transmission network), these studies are still relatively small-scale compared to the population size and suffer from a range of potential biases. Determining comparable networks for airborne infections is a far greater challenge due to the less precise definition of a potential contact.
To date the vast majority of the work into disease transmission on networks has focused on static networks where all links are of equal strength and, therefore, associated with the same basic rate of transmission. However, it is clear that contact networks change over time (both on the short-time scale of who we meet each day, and on the longer time-scale of who our main work and social contacts are), and that links have different weights (such that some contacts are much more likely to lead to the transmission of infection than others). While the simulation of infection on such weighted time-varying networks is feasible, it is unclear how the existing sets of network properties or the existing literature of analytical approaches can be extended to such higher-dimensional networks.

Abstract

Results: An influenza transmission risk contour map was developed for T versus RH. Empirical equations were created for estimating: 1. risk relative to temperature and RH, and 2. time parameterized influenza transmission risk. Using the transmission risk contours and equations, transmission risk for each country's locations was compared with influenza reports from the countries. Higher risk enclosed locations in the tropics included new automobile transport, luxury buses, luxury hotels, and bank branches. Most temperate locations were high risk.
Environmental control is recommended for public health mitigation focused on higher risk enclosed locations. Public health can make use of the methods developed to track potential vulnerability to aerosol influenza. The methods presented can also be used in influenza modeling. Accounting for differential aerosol transmission using T and RH can potentially explain anomalies of influenza epidemiology in addition to seasonality in temperate climates.

Background

The contrasting epidemiology of influenza in the tropics versus temperate regions has been discussed for many years, and it has been accepted for decades that jet aircraft are a major vector for global spread of influenza [1] . This study is an attempt to better understand aerosol influenza transmission for indoor locations by examining temperature and humidity indoors where jet travelers are likely to interact with locals and comparing humid tropical locations with temperate winter ones. In recent years, much attention has been given to the spread of influenza around the world, especially with the continuing H5N1 outbreaks since 2003 and the H1N1 pandemic in 2009. Extensive research has been conducted to understand the mechanism of transmission of influenza virus, including environmental conditions that favor transmission. Various aerosol studies have shown that micron range droplet particles from breathing, talking, coughing and sneezing bear influenza viruses, and that the aerosol route is an important contributor to infection [2, 3] . The particles making up aerosol in normal exhalation are less than 1 micron in size; aerosol particles range from 0.1 micron to 5 micron [2, 4] , and these smallest particles are primary vectors of contagion [5, 6] .
Questions have been raised as to whether or not aerosol transmission of influenza occurs or is a significant contributor to its epidemiology, and whether vitamin D is a determining factor [7-10]. We believe that our study sheds helpful light on these matters by defining a framework that starts to formalize the effect of temperature and RH conditions on such transmission. We treat this more extensively in the discussion section.
We intend this study to be primarily targeted at public health planners and epidemic model developers. Interventions to successfully interrupt spread of influenza that have been studied in depth are quarantine, isolation, different types of masks, gloves, hygiene, and combinations of these [11] . Public health planners can use our results to consider making climate control adjustment recommendations, which can help control aerosol transmission. As well, modeling of epidemics in software depends on assumptions about where contagion is likely to occur. Some types of modeling today may take into account generalized types of mixing locations which are enclosed [12, 13] , as it is believed that most transmission (including aerosol) occurs indoors, with much attention put on social network [14, 15] . We believe that such models can be improved by modeling of interior temperature and RH.
The authors developed a contour map of T versus RH based on literature from Lowen et al. [16] [17] [18] and others. In the studies of Lowen et al. guinea pigs were exposed to aerosol infection from another guinea pig for 7 days in an environmental cabinet maintaining temperature and relative humidity at varying levels. Thus, where we refer to a 25% risk of transmission, or a 25% contour, we mean that the risk of aerosol infection of one guinea pig over 7 days of continuous aerosol exposure to an infected guinea pig is roughly 25% (25% G7 ). We use this animal model as a baseline for estimation of differential risk to human populations. It is understood that temperature and humidity are not the only factors in aerosol transmission; however, we believe that they are primary factors along with dilution by air exchange and distribution by air currents [2, 19] . In modern building systems, recirculation of indoor air for energy efficiency is also a likely factor. We collected data in 8 countries in the tropics and 2 Australian cities during winter (June-September 2009). Relative humidity and temperature readings were taken in public areas frequented by travelers (e.g. hotels, banks, malls, shops, taxis, buses, etc.) as well as during flights between nations. Observations were also recorded of behaviors that could augment the spread of influenza significantly. Interviews were conducted in major cities in the tropics and Australia to improve understanding of influenza transmission conditions.

Contagion estimation contour development

After review, aerosol transmission estimations were primarily drawn from Lowen et al. to interpolate contours of transmission of influenza in the humidity versus temperature phase space. A set of contours was generated that is believed to be mildly conservative (Figure 1 and Additional file 1 -Contagion contour estimation details).

Results

The aerosol transmission contours are presented in Figure 1. 6B data points produced the cumulative histogram of temperature distances ogive/density charts of Figure 4. This shows the relationship of readings to the 25% G7 risk contour for the entire dataset. Figure 5 shows correlation of average distance from the 25% G7 contour to cases of influenza. We will now examine in more detail different types of locations surveyed.

Figure 4. Temperature distance from 25% G7 transmission risk contour. The 4A ogive (cumulative distribution curve for the histogram) for each nation was plotted with a vertical line at the zero point. For Figure 4A bins of 2 degrees C width are used to categorize temperature distances from the 25% G7 transmission risk contour. Figure 4B presents the same temperature distance data in a linear density plot for each nation, with a tick mark for each temperature distance, and the arithmetic mean. The temperature distances were calculated for each point shown in Figure 3.

[Figure axes: Average Distance from 25% Contour Estimate; Mortality per million]

Figure 6. Closed, Air Conditioned Tropical Ground Transport. The scatter plot (6A) and linear density plot (6B) show all data (multiple time course readings per trip) for motor vehicles categorized by old and new. The vehicle data shown above is for vehicles with closed windows and air conditioning. All old taxi, tour and hotel car readings show low risk (above the 25% G7 transmission risk line). Most readings for new cars in this low risk region are due to higher temperature and low humidity, presumably because automotive engineers are using evaporative cooling from the skin of passengers in dehumidified air to lower perceived temperature.
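The ogive construction described for Figure 4A can be sketched as follows. This is an illustration only: the temperature distances are hypothetical values rather than survey data, the function name is our own, and we assume a sign convention in which negative distances lie below the 25% G7 contour (the region of concern) and positive distances lie above it.

```python
import math


def ogive(distances, bin_width=2.0):
    """Cumulative relative frequency of distances in bins of fixed width.

    Returns the upper edge of each bin and the cumulative proportion of
    readings falling at or below that edge.
    """
    lo = math.floor(min(distances) / bin_width) * bin_width
    hi = math.ceil(max(distances) / bin_width) * bin_width
    n_bins = max(1, int(round((hi - lo) / bin_width)))
    counts = [0] * n_bins
    for d in distances:
        idx = min(int((d - lo) // bin_width), n_bins - 1)
        counts[idx] += 1
    edges = [lo + (i + 1) * bin_width for i in range(n_bins)]
    cumulative, running = [], 0
    for c in counts:
        running += c
        cumulative.append(running / len(distances))
    return edges, cumulative


# Hypothetical temperature distances (degrees C) from the 25% G7 contour.
readings = [-5.1, -3.2, -0.4, 0.8, 1.5, 2.2, 4.9, 6.3]
edges, cumulative = ogive(readings)
share_below_contour = sum(d < 0 for d in readings) / len(readings)
for edge, cum in zip(edges, cumulative):
    print(f"<= {edge:+.0f} C: {cum:.3f}")
```

Reading the ogive at the zero point gives the proportion of readings on the higher-risk side of the contour, which is the quantity the vertical line in Figure 4A is intended to highlight.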

Buses and taxis

Luxury buses are used for long-distance trips of 4-12 hours, have air conditioning (AC) and windows operable only in emergency. A not uncommon complaint is that these buses are too cold. They are commonly used by traveling visitors, and also by middle class inhabitants of the region desiring more comfort in their travel. Luxury bus trips (N = 16) fell into the region of concern below the 25% G7 transmission risk contour (Figure 6 ), and passengers shedding virus are a probable source of aerosol. Additionally, vendors often appear at stops selling merchandise. Some vendors get on at one stop and travel with the bus for 45 minutes or so, selling products and entertaining the passengers. Others get on the bus at stops and spend shorter periods of time on the order of 10 minutes, getting off one stop later. Some get on and off at the same stop in 2-4 minutes. Total estimated vendor time inside buses is on the order of 3-5 hours per day over as many as 20 buses, each bus containing potential influenza aerosol, raising their chances of both infection and transmission. Wake effects such as those proposed to explain SARS transmission in a modern jet aircraft [24] are generated by them, and they lean in close to many passengers. Additionally, they have physical contact through money and product exchange, which is usually food. Inexpensive buses used by locals have no AC, using open windows for ventilation. Readings confirmed that their temperatures and RH are equal to or higher than ambient temperatures (data not shown). Vendors serving these inexpensive buses were fewer, and were only observed to board briefly at stops, departing the bus within 2-3 minutes, or to make sales directly through windows.
Trips in high end taxicabs and new minivan shuttles (N = 21) also showed good transmission conditions ( Figure 6 ). Trips in old taxis (N = 11), even if closed and air-conditioned, and those not using recirculation, stayed out of the high risk region. The ability of a new automobile to rapidly lower humidity with the windows closed on recirculation setting to between 45% and 25% RH within 5 minutes or less is remarkable (Figure 7 ). New high end taxis and hotel cars were commonly occupied by travelers for 20-45 minutes (to/from airports, to/from holiday or business meetings).

Non-residential buildings

Of high end tropical hotels studied (N = 22) approximately 50% had good conditions for aerosol transmission in common areas. However, all tropical hotels having good conditions were in the high RH region near 65% (data not shown).
Tropical locations found to have good conditions for aerosol influenza transmission to the general public were bank branches, dining facilities, retail shops and offices (Figure 8). The overall impression was that business locations in the tropics needing to appear high-status set their AC systems to generate low humidity and temperature.
In the temperate Australian winter, 98% of buildings (N = 100) showed good aerosol transmission conditions ( Figure 3B ). Thus, no breakdown by type of facility is presented.

Dwellings

Dwellings were not surveyed in all nations and the sample was low; however, where surveyed, estimates were made of how typical RH and temperature were for a dwelling class. The majority of tropical apartment buildings (N = 10, data not shown) had open-air common areas and showed poor aerosol conditions. Surprisingly, temperature and RH conditions in dwellings surveyed were not optimum for transmission in Australia, although the sample was insufficiently large.

Airplane flights

Using our transmission contours, conditions for influenza transmission exist during deplaning (Figure 9) for intervals of 7 minutes or less from the time passengers stood up until aircraft cleared (mean 3 min 55 sec, N = 12). During this time, ventilation was often turned to a low setting or off. The authors believe risk is probably low on a per passenger basis, because the period is limited to the short time in which passengers file out of the airplane and other factors may override T and RH. Aircraft data (Figure 10) is otherwise presented without risk interpretation (see discussion).

Figure 7. It can be seen that within 5 minutes, relative humidity is lowered to between 45% and 25%. Time 0:00 is street ambient temperature and the first interior reading is at or after 1 minute. The humidity in a new automobile was significantly lower than the street humidity at the time of the first reading after the door was closed. That these vehicles recirculated their air would be expected to contribute to contagion. N = 15.

Development of contagion contours

Literature shows opposing conditions for transmission of viruses in general: low relative humidity (RH) and high RH [5, 25, 26], with temperature a secondary factor. Theory predicts osmotic forces should affect enveloped viruses such as influenza, while icosahedral viruses (e.g. polio, norovirus) would not be so sensitive for structural reasons. Enveloped viruses generally have highest infective stability at RH somewhat below 50% [5], and non-enveloped icosahedral viruses usually show greatest infective stability in aerosol in high humidity conditions [25, 26]. Data of Lowen et al. [16, 17] at 20°C show optimal transmission of influenza by aerosol at a first RH range from 20-40%, and at a second from 60-70%. Lower temperatures improve transmission, with temperatures above 30°C reducing transmission to zero. These data correlate with other in-vitro studies [5, 25-27].

Equation 3:

A study by Harper [25] examined, in-vitro, survival of 4 cultured viruses in aerosol, at temperatures ranging from 21°C to 24°C. To the degree his results differ from Lowen in the 50% + humidity range, they might be explained by his higher temperature. If so, that would change the contours of influenza transmission risk (Figure 1) somewhat, although the current transmission risk contours would remain conservative. Alternatively, this difference may be from the droplet fluid carrying virus used by Harper, as mentioned above.
There is an argument that influenza strains might vary in stability from mutations sufficiently to affect the contours of transmission as taken from Lowen. However, evolutionary argument supports virus stability in aerosol as strongly conserved, since in humans, viruses with lesser aerosol stability will not propagate as well as those with greater stability (unless aerosol stability is compensated for by some other propagation enhancement), and viruses with optimum stability will be selected for during host to host transmission [30, 31] . Thus, literature results from human influenza virus strains would be expected to be from virus near the practical limit of aerosol stability. Further, osmotic pressure generates tensile force on the envelope, which will exhibit resistance to osmotic pressure not exceeding the weakest envelope bilayer hydrogen bonds.
Based on the considerations above, contours were generated based on linear interpolation of Lowen et al. [16, 17] cross-validated with others [5, 25, 27, 32] . These contours apply to RH conditions from 20% to 80%, although it is likely that contours above 80% RH have lower transmission risk than at 80%. Both the region from 0% to 20% RH and that above 80% RH are less clear and need investigation. The justification for using these risk contours in larger scale environments is based on data from studies that show long term persistence (hours) of viable aerosol virus [25, 27] .
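The linear interpolation step can be illustrated with a minimal one-dimensional sketch along the RH axis at a single fixed temperature. The data points below are placeholders chosen only to mimic the qualitative shape described in the text (high transmission at low RH, a trough near 50%, a secondary rise near 65%); they have not been verified against Lowen et al., and the function name is hypothetical. The published contours interpolate over both temperature and RH, which this sketch does not attempt.

```python
from bisect import bisect_right

# PLACEHOLDER data at one fixed temperature: RH (%) and transmission fraction.
# Illustrative values only; consult the original studies for real measurements.
RH_POINTS = [20.0, 35.0, 50.0, 65.0, 80.0]
TRANSMISSION = [1.0, 1.0, 0.25, 0.75, 0.0]


def interpolate_transmission(rh: float) -> float:
    """Linearly interpolate the transmission fraction between measured RH points."""
    if not RH_POINTS[0] <= rh <= RH_POINTS[-1]:
        # The text states the contours apply only from 20% to 80% RH.
        raise ValueError("RH outside the 20-80% range covered by the contours")
    if rh == RH_POINTS[-1]:
        return TRANSMISSION[-1]
    i = bisect_right(RH_POINTS, rh) - 1  # index of the left-hand data point
    x0, x1 = RH_POINTS[i], RH_POINTS[i + 1]
    y0, y1 = TRANSMISSION[i], TRANSMISSION[i + 1]
    return y0 + (y1 - y0) * (rh - x0) / (x1 - x0)


for rh in (20.0, 42.5, 57.5, 80.0):
    print(f"RH {rh:.1f}%: interpolated transmission {interpolate_transmission(rh):.3f}")
```

Refusing to extrapolate outside 20-80% RH mirrors the statement that the regions below 20% and above 80% RH still need investigation.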

Statistical validity of the contour graph

As presented by Lowen et al. [16, 17] in studies of aerosol transmission of influenza over 7 days, there are three temperature groups, 5°C, 20°C and 30°C at varying RH. For the 5°C temperature there are four RH categories, 35%, 50%, 65% and 80%. At 20°C and 30°C there is an additional fifth at 20% RH. At 30°C there is no transmission. At 5°C transmission varies from 100% to 50% and at 20°C from 100% to 0%. Thus, where statistical power is in question is between 5°C and 20°C. As discussed [17] , the difference in transmissibility between 5°C and 20°C at 50% and 80% humidity is significant (p < 0.05). This leaves the 65% relative humidity results at 20°C to be examined.
To further evaluate the Lowen data, we considered it in the context of Harper [25] and Schaffer [27] data on time course viability of influenza virions at differing temperature and humidity, because it is axiomatic that the longer virions can remain viable in aerosol, the more likely they are to cause infection by this route. Harper shows support for the transmission decline of Lowen, as viability declines when RH increases toward 50%. Schaffer data for one hour survival at 21°C (see figure two of Schaffer et al.) also shows a viability trough at 50% RH rising at humidity above 50% followed by a decline [27] . These features of Harper and Schaffer further support the Lowen 20°C data for 50% RH, which was already of sufficient statistical significance. Additionally, Schaffer supports the 65% RH increase in transmission called out as statistically of insufficient power by Lowen et al. A further argument in favor of the 65% RH increase in transmission is care to present conservative contagion contours where there is a question; thus we retained the feature showing a rise in contagion at 65% RH.

25% G7 transmission risk contour selection

For visual inspection purposes the 25% G7 transmission estimate contour is emphasized and became the reference using the following rationale.
The guinea pig experiments of Lowen were performed for 7 days. In most public places such as banks and hotel lobbies with good conditions for transmission of influenza, the time people spend is on the order of 10 minutes. This corresponds by integration of equation 2 to a crude risk of 1/1959 for any one entry and exit at the 25% G7 contour. Thus, assuming 300 patrons per day yields approximately 1 case every 6 days for a usual branch, assuming an infected individual is shedding virus continuously. Risk on a bus ride of 8 hours at the 25% G7 contour yields a crude risk of 1/41, which roughly corresponds to 1 new infection per 8 hour bus ride assuming continuous virus shedding. Assuming an influenza case of 7 days virus shedding duration, we thought it improbable most locals would take more than two 8 hour luxury bus rides with 50 passengers per bus in that time span (vendors excluded) for a total of 2 new infections. (Table 1) These crude risks represent the rationale of our risk cutoff in enclosed spaces. However, we do not think one contour cutoff is always appropriate.
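The arithmetic behind these crude risks can be retraced with a short sketch. It takes the per-exposure risks quoted above (1/1959 and 1/41) as given and does not reproduce equation 2; the function name is hypothetical, and treating all 50 bus passengers as exposed with equal, independent risk is a simplifying assumption.

```python
# Crude per-exposure risks quoted in the text for the 25% G7 contour.
RISK_PER_BANK_VISIT = 1 / 1959  # one entry and exit of about 10 minutes
RISK_PER_BUS_RIDE = 1 / 41      # one 8 hour bus ride


def expected_infections(exposed_people: int, risk_per_person: float) -> float:
    """Expected new infections when every exposed person carries the same risk."""
    return exposed_people * risk_per_person


# Bank branch: 300 patrons per day, index case assumed to shed continuously.
bank_cases_per_day = expected_infections(300, RISK_PER_BANK_VISIT)
days_per_bank_case = 1 / bank_cases_per_day

# Luxury bus: 50 passengers on one 8 hour ride (assumption: all 50 exposed).
bus_cases_per_ride = expected_infections(50, RISK_PER_BUS_RIDE)

print(f"bank: {bank_cases_per_day:.3f} cases/day, one case every {days_per_bank_case:.1f} days")
print(f"bus: {bus_cases_per_ride:.2f} cases per 8 hour ride")
```

Under these assumptions the bank figure works out to one case roughly every 6.5 days and the bus figure to about 1.2 infections per ride, in line with the rounded values in the text.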

Aircraft data interpretation

A considerable amount of data was collected for aircraft; however, transmission risk on aircraft is complex. First, the influenza contagion space relative to temperature and RH on aircraft is mostly unknown since studies have not been done below 20% RH, and large portions of flights can occur with RH in the 3% to 15% range (Figure 10). How such extremely low RH affects transmission is unknown. Second, although influenza was communicated well to aircraft passengers circa 1979 during a ground delay (38 of 54 passengers in 4.5 hours, 1 index case) given lack of air circulation [34], HEPA filtration of recirculated cabin air on most aircraft today mitigates this hazard, together with outside air exchange in flight. Literature raises questions about efficacy of HEPA filters on aircraft [35]; however, the careful epidemiology of SARS on an aircraft [24] suggests HEPA filters and air exchange were fairly effective on that aircraft because of the apparent wake pattern of infection. That correlates with modeling of wake particles carried behind persons moving along the aisle [36]. SARS, like influenza, is an enveloped RNA virus of the same size, which likely has similar filtration characteristics. Third, on some aircraft, ozone is negligible due to catalytic units [37]. Ozone ...
For quick, rough estimates in a public health setting, estimates such as these can be made by use of the table. These kinds of estimates were used to decide which was the reasonable aerosol transmission risk contour to use to demarcate the low risk boundary of the figure 1 graph. (See Discussion, 25% G7 transmission risk contour selection.) It should be noted that these numbers are useful as guidance rather than being completely predictive since other factors such as ventilation diluting virus aerosol, rate of exhalation of aerosol virus, direction of airflow, and wake effects will have major effects on actual infection numbers.

Alternative views on aerosol transmission of influenza

Examining literature questioning whether influenza is transmitted by aerosol, it appears a major question is the distance at which aerosol transmission of influenza occurs, and we contend that such transmission varies greatly with conditions. There is also a proposal that vitamin D hormone is a regulator of seasonal influenza incidence in the context of questions raised about influenza's fundamental epidemiology [10] .

Aerosol question - Han et al. example

In the example of Han et al. [9] the researchers state that no aerosol transmission occurred because interview data indicated that the 9 infected parties out of 31 tour group members all either talked to or were coughed on directly by the index case over a 3 day period. (Aircraft infections in Han are left aside for the reasons discussed above.) Noting that in a warm, humid environment contact transmission would be the expected primary mode, a guess can be hazarded in light of the current study. The temperature, relative humidity and ventilation characteristics of the indoor environments of this tour group are not known with precision. A best guess of aerosol expectation based on bus data for comparable outdoor conditions in the current study derived a low of 1 infection (7 hours, 40% G7 risk) and a high of roughly 5-6 infections (7 hours, 100% G7 risk, poor ventilation roughly doubling infections). (Better estimates would depend on actual T and RH, ventilation and other factors.) Leaving aside the reliability of interviews, the droplet (non-aerosol per Brankston et al. [8]) doses administered by talking and coughing (presumably at close range) will drop off quickly with distance due to conditions and dilution. Depending on RH, those droplets may also evaporate into aerosol.

Aerosol question - Brankston et al. and Lemieux et al.

The concerns of Lemieux et al. [7] and Brankston et al. [8] as to whether aerosol influenza infection is significant hinge on a number of matters. Primary among these is what aerosol transmission means, as defined in the differentiation of droplets ≥5 μm and aerosol ≤5 μm. We find Tellier's response to Lemieux et al. compelling [40]. To it, we add several notes.
We believe that influenza is transmitted both by contact and by aerosol. The conversion of droplets ≥5 μm to smaller size varies based on humidity and we reiterate Tellier's point that studies have shown viable virus after hours suspended in air [25, 27]. Tropical temperature and humidity do not block transmission by contact, but do block transmission by aerosol [17]. Since it has long been observed that influenza epidemics are muted in the tropics, and they die out in temperate summers, this argues that aerosol transmission is necessary to sustain a rise in R0 above 1 in large populations.

Provide inexpensive tools to monitor environment

Digital psychrometers cost on the order of $100 to $160 in single quantities (larger purchases may be possible at lower cost). Based on visualizing the data collected, steps can be taken to attempt to either move higher risk environments in the direction of lower aerosol transmission risk, or else direct the use of other measures in those environments. A chart (Figure 1 ) should be inexpensive to distribute.

Educate luxury bus, taxicab, and hotel car operators

Luxury buses deserve special attention, as they cross borders and travelers spend upwards of 4 hours in the environment. The close quarters of these vehicles' recirculating air are a good opportunity for aerosol transmission of influenza (and other respiratory diseases).

Influenza transmission on aircraft is probably fairly low

Based on our examination, we think that influenza transmission on aircraft is probably not a serious risk most of the time, as discussed above, although the passenger numbers are quite large. Most of the risk appears to be off the aircraft, although wake effects can be troublesome, and T and RH may regulate whether wake effects can occur. Lacking viability data for the humidity range common on aircraft, how that works is clearly an open question. However, since the period from when passengers stand up after landing until the aircraft is empty does fall within our parameters and is quite short, it may be an insignificant cost for airlines to flush cabin air from the end of the runway after landing until passengers leave the aircraft to further lower transmission risk on aircraft.

Public health relative to other disease and temperature vs. RH

Many viruses and bacteria will display viability conditions opposite to influenza. Endemic disease threat such as M. tuberculosis should be weighed since TB is correlated with tropical climates [47], suggesting its aerosol transmission is optimum in high RH and warm temperature. TB is a hardy organism that forms culturable aerosol from coughing [48] but the aerobiology of transmission is not well explored [47]. Guinea pig model TB transmission studies in parallel with influenza exploring variations of temperature and RH relative to HEPA filtration and ultraviolet light as recommended by Nardell and Piessens [47] would be desirable. The TB concern indicates that in TB endemic regions humidity lower than 60% should be targeted on the transmission contour map (Figure 1). There are also commonalities between other measures that can minimize influenza aerosol contagion and measures against TB and other microbial aerosol (such as UV irradiation of upper air [49]).

Summary

Climate control for enclosed spaces should be added to public health to control influenza epidemics. The range between 20% and 80% RH covers most human habitation outside of aircraft, and the region above 80% RH appears to be a low transmission risk, although both these regions should be explored. In the tropics, getting an indoor facility out of the region of highest risk should be simple and low or no cost. In temperate regions, controlling AC to stay out of the optimum transmission region may be more challenging. At a minimum, the low or no cost step of changing climate control parameters should not raise the R0 (reproductive number) of an influenza epidemic and will lower it considerably if seasonal influenza transmission is any guide. The authors hope for further refinement; however, this is an inexpensive starting point with highly probable benefits, which should be a net savings for nations. For those who perform epidemiological studies, analyzing data in light of temperature and relative humidity will help our understanding of influenza epidemiology.

List of abbreviations

25% G7: Contour corresponding to a 25% risk of transmission of influenza from one guinea pig to another over 7 days; 25-hD: 25-hydroxyvitamin D; AC: Air conditioning; C: Centigrade; HEPA: High efficiency particle absorbance; HVAC: Heating ventilating and air conditioning; RH: Relative humidity; SARS: Severe acute respiratory syndrome; T: Temperature; URTI: Upper respiratory tract infection; UV: Ultraviolet light
27 section matches

Abstract

We review behavioural change models (BCMs) for infectious disease transmission in humans. Following the Cochrane collaboration guidelines and the PRISMA statement, our systematic search and selection yielded 178 papers covering the period 2010-2015. We observe an increasing trend in published BCMs, frequently coupled to (re)emergence events, and propose a categorization by distinguishing how information translates into preventive actions. Behaviour is usually captured by introducing information as a dynamic parameter (76/178) or by introducing an economic objective function, either with (26/178) or without (37/178) imitation. Approaches using information thresholds (29/178) and exogenous behaviour formation (16/178) are also popular. We further classify according to disease, prevention measure, transmission model (with 81/178 population, 6/178 metapopulation and 91/178 individual-level models) and the way prevention impacts transmission. We highlight the minority (15%) of studies that use any real-life data for parametrization or validation and note that BCMs increasingly use social media data and generally incorporate multiple sources of information (16/178), multiple types of information (17/178) or both (9/178). We conclude that individual-level models are increasingly used and useful to model behaviour changes. Despite recent advancements, we remain concerned that most models are purely theoretical and lack representative data and a validation process.

Introduction

The main objective of infectious disease transmission models is to inform and guide policy-makers to prepare for and respond to (re)emerging infectious diseases, particularly when sufficient information from controlled experiments is lacking. However, the impact of infectious disease transmission and policy interventions are subject to hosts' behaviour. Therefore, there is an interest to incorporate behaviour change in response to disease-related information into models for infectious disease transmission.
We refer to models incorporating behavioural immunity as 'behavioural change models' (BCMs), which typically complement models for disease transmission in an attempt to mimic real life dynamics. In essence, a BCM is a model in which individuals are responsive to external information about the disease and as a result take one or more preventive measures to reduce the chance of contracting the disease. The external information individuals respond to can be global (equally available and relevant to all individuals) or local (individual availability and relevance determined by physical or social proximity to the information source). Furthermore, this information can be specified in terms of actual risks ('prevalence-based') or of perceptions of these risks ('belief-based'), as well as a mixture of all the above [7]. Vaccination is a common prevention measure with varying uptake, given historical fluctuations in the trade-off between the perceived risks of vaccine-related side effects (VRSEs) and of vaccine-preventable disease. Other common prevention measures include social distancing and condom use.
A widely used theoretical foundation for the formation and dynamic nature of individuals' behaviour comes from game theory. Game theory has a rich history in social sciences with the Prisoner's Dilemma being a frequently used illustration (see [8] for a comprehensive introduction). Game theory assumes individuals take rational decisions based on a tradeoff that embodies the anticipated rational decisions of all other individuals in society. Even though these assumptions are often not observed in real life [9] , a multitude of BCMs in the setting of infectious disease transmission still use a game-theoretical foundation that caused the development of, for instance, 'vaccination games' [10] and 'epidemic games with social distancing' [11] .
Although there is increased recognition for the need to incorporate behavioural changes in infectious disease transmission models, a consensus on the proper methodology to do so is lacking. It appears much research is not supported by empirical information but departs from a theoretical foundation with arbitrarily chosen parameter values and no validation process. As a result, there is large heterogeneity in the triggers for behavioural change and the impact on disease transmission, as well as the conclusions of such studies. There is a need for empirical data from, for instance, surveys or discrete choice experiments to support the validity of these models and to guide further research [7, 15] .
The main goal of this paper is to systematically review and document how and to which extent behavioural immunity has been explored in infectious disease transmission models over the past 5 years. In brief, we aim to investigate to which extent: (i) technological advancements and increased data availability have enriched BCMs, (ii) the literature has coped with the fact that behavioural immunity is often contingent on the disease and not coupled to disease dynamics, (iii) modelling efforts are validated with quantifiable observations and parametrized, (iv) the current models have assessed the importance of social networks in individual decisions, (v) the process of transferring information to behaviour is managed and (vi) irrational behaviour is demonstrated.
In the following sections, we systematically identify and analyse BCMs applied to infectious disease transmission, starting from where a previous review in 2010 left off [7] . These models are categorized in order to distinguish their assumptions, methods, disease and transmission-specific applications and implications. Furthermore, a critical point of view is taken when evaluating these models in terms of their real-life applicability. Current pitfalls and opportunities are identified to support the development of more advanced BCMs in the near future.

Search

We searched PubMed and Web of Science (WoS) for records published between January 2010 and December 2015. After discussing and defining the inclusion and exclusion criteria, we obtained our final search query which we used in PubMed on 12 January 2016 and in WoS on 13 January 2016: '(behavio* OR decision*) AND (change* OR influence* OR dynamic* OR adapta* OR adapt OR adaptive OR strategic*) AND (infect* OR epidemic OR epidemics OR epidemiology OR epidemiological OR epidemiologic OR pandem* OR outbreak*) AND (disease* OR vaccin*) AND (model OR models OR modelling OR modeling OR simulat* OR transmission*)'.

Selection

Infectious diseases. Only records that concern infectious diseases are included in the selection. Infectious diseases are defined using the WHO definition: infectious diseases are caused by pathogenic microorganisms, such as bacteria, viruses, parasites or fungi; the diseases can be spread, directly or indirectly, from one person to another [18].
Model. Records should consist of a mathematical model for behavioural change, for infectious disease transmission or a coupled model combining these two.
Individual behaviour. Behaviour is considered the consequence of personal and voluntary choices made by an individual, i.e. we exclude studies tackling forced interventions such as school closure or mandatory vaccination, but include government interventions creating awareness, education in prevention, etc.
External trigger(s). At least one trigger for modelled individuals to change their behaviour is external and has to be related to infectious disease. We exclude models with exclusively intrinsic triggers from the selection (e.g. an individual's own human immunodeficiency virus (HIV) status).
Preventive measure. A preventive measure is central to the analysis (e.g. vaccination or social distancing). The behaviour of the individual is defined by the decision (not) to take preventive measures.
Humans. We are interested in diseases in humans and behaviour of humans regarding these diseases, and therefore exclude research on plants, animals, the behaviour of the model itself or the behaviour of governments or institutions.
Original research. We exclude review articles, letters, editorials and comments.
English language. We exclude articles written in other languages.

Data extraction

Using a common data extraction protocol for each eligible article, F.V. and L.W. independently retrieved from the full text: (i) infectious disease; (ii) disease category (sexually transmitted infection (STI), influenza-like illness (ILI), childhood disease, vector-borne disease (VBD) or other); (iii) prevention measure (vaccination, social distancing etc.); (iv) source of information (global, local or multiple); (v) type of information (prevalence-based, belief-based or multiple); (vi) effect on the model (disease state, model parameters, contact structure or multiple); (vii) disease transmission model description; (viii) BCM description; (ix) whether there was interaction between the behaviour and disease transmission model; (x) whether the analysis incorporated real-life data; and (xi) movement of individuals in the model. When applicable, other interesting information was extracted using free form fields. Again, discrepancies in interpretation were resolved through discussion.

Model structure categories

In table 1, we categorized the studies according to disease, prevention measure (topic) and whether the model is implemented at the population level or at the individual level (i.e. using an IBM or contact network) to simulate infectious disease transmission. Metapopulation models for disease transmission were also identified and are labelled in bold. Furthermore, the columns indicate at which level the impact of prevention measures is modelled, distinguishing whether behavioural change is implemented through a switch in infectious disease state (e.g. vaccination immunizes previously susceptible persons, and this can be modelled by moving them from the susceptible to the immune state), a change in model parameters (e.g. hygiene measures may be assumed to reduce the effectiveness of transmission) or in social contact structure (e.g. social distancing may be mimicked by a link-breaking or rewiring process between susceptible and infectious individuals in contact networks). Studies can appear in multiple categories, as some have multiple prevention strategies or multiple effects on the disease transmission model. For the transmission model category, we interpreted to which extent heterogeneity is introduced in the model. All references are categorized and represented in a spreadsheet that can be found as electronic supplementary material. The model type is often disease-dependent. For instance, all retrieved models for measles and/or pertussis are population models with vaccination as a preventive measure that affects the disease state in the transmission model. Moreover, the models are often prevention-dependent. We observe that most of the models that use vaccination as a prevention strategy will impact the model through a switch in disease state. For instance, in many compartmental susceptible-infectious-recovered (SIR) disease models, vaccinated individuals move to the R compartment. General models with social distancing as a prevention strategy usually impact the model in terms of a modified contact structure, contingent on the disease transmission model, whereas for influenza applications this applies to only one out of seven references.
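The 'switch in disease state' mechanism can be illustrated with a minimal sketch of a compartmental SIR model in which vaccination moves susceptible individuals directly into the R compartment. This is not taken from any of the reviewed models: the function name, the constant per-capita vaccination rate, the forward-Euler integration and all parameter values are assumptions made for illustration.

```python
def simulate_sir_with_vaccination(beta, gamma, vacc_rate, s0, i0, rec0, days, dt=0.1):
    """Forward-Euler SIR model where vaccination moves S directly to R.

    beta: transmission rate, gamma: recovery rate,
    vacc_rate: per-capita vaccination rate of susceptibles (hypothetical constant).
    Returns a list of (time, S, I, R) tuples.
    """
    s, i, r = float(s0), float(i0), float(rec0)
    n = s + i + r
    trajectory = [(0.0, s, i, r)]
    steps = int(round(days / dt))
    for k in range(1, steps + 1):
        new_infections = beta * s * i / n * dt   # mass-action transmission
        new_recoveries = gamma * i * dt
        new_vaccinations = vacc_rate * s * dt    # switch in disease state: S -> R
        s -= new_infections + new_vaccinations
        i += new_infections - new_recoveries
        r += new_recoveries + new_vaccinations
        trajectory.append((k * dt, s, i, r))
    return trajectory


# Illustrative parameter values only.
without = simulate_sir_with_vaccination(0.3, 0.1, 0.0, 990, 10, 0, days=100)
with_vacc = simulate_sir_with_vaccination(0.3, 0.1, 0.02, 990, 10, 0, days=100)
print("peak infectious without vaccination:", round(max(i for _, _, i, _ in without), 1))
print("peak infectious with vaccination:   ", round(max(i for _, _, i, _ in with_vacc), 1))
```

In a BCM the vaccination rate would itself depend on information about the disease rather than being held constant as it is here.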

Diseases

In table 1, we classified the records based on four specific disease categories, one category for general models (not specifying a disease) and one category for other diseases. Most models retrieved were on influenza or influenza-like illness (ILI) and HIV. Other frequent diseases studied with BCMs are 'measles & pertussis' and 'syphilis & gonorrhoea'. Historically, perceptions of high risks, associated with measles and pertussis vaccination, have adversely affected the uptake of these vaccines. As a result, these are topical applications for transmission models incorporating behavioural changes, as discussed in [19]. The literature on measles is becoming more diverse as VRSE perceptions evolve; Bhattacharyya & Bauch [89] describe a model in which parents delay vaccinating their children as a result of an exogenous vaccine scare, whereas the same authors use social networks of imitation behaviour for VRSE perception spread in response to a vaccine scare [91], and d'Onofrio et al. [92] introduce public interventions in their model to increase vaccine uptake. Diseases in the 'other' category are: SARS, smallpox-like disease, malaria, hepatitis B, Ebola, pneumococcus, pneumonic plague, toxoplasmosis and cholera. General models do not explicitly specify a disease, often assuming general applicability. As noted earlier, models tend to be disease-specific. In the case of influenza or influenza-like illness, some models look at seasonal changes in behaviour with backward looking individuals evaluating the success of their (vaccination or social distancing) strategy during previous season(s) [20, 27-31, 69]. HIV BCMs are often coupled with a public health information/education campaign aimed at evaluating public health measures to control epidemic spread or to study the cost-effectiveness of these control measures [71, 72, 77, 81, 83]. An example of a more advanced, game-theoretic model is the model by Tully et al. [75]. They use an agent-based model (ABM) for the spread of risk perception, sexual behaviour and HIV transmission in the context of individual sexual encounters evaluating the behaviour of (potential) partners.

Emergence-driven research

Between 2010 and 2015, much research has been emergence-driven. That is, the research field responds by focusing on diseases that are of major interest because of a change in the threat they present to public health. The influenza A/H1N1 pandemic of 2009 has largely influenced the development of BCMs for influenza. For example, Poletti et al. [54] use the influenza A/H1N1 pandemic of 2009/2010 to parametrize an influenza transmission model with behavioural changes focusing on the spread of risk perceptions in the population. In addition, a model on Ebola virus disease (EVD) was published in 2015 in response to the epidemic outbreak in Liberia [100]. The authors use WHO and CDC data to parametrize the model suggested in an attempt to mimic disease transmission and to identify behavioural changes as drivers of the disease dynamics. Note that, in the current review, we relate 'emergence' not only to disease emergence, but also the emergence of a vaccine scare (such as observed with measles-mumps-rubella (MMR) vaccination and pertussis whole-cell vaccination [91]) or the emergence of new therapies for endemic diseases (such as the development of a multi-season influenza vaccine [26]).
rsif.royalsocietypublishing.org J. R. Soc. Interface 13: 20160820
Table 1. Model structure categories. References in bold represent metapopulation models. References with author names in italics represent references that use empirical data for parametrization and/or validation. PrEP, pre-exposure prophylaxis. References in category 'Other' specify a disease other than the above. Hygiene measures include face-mask use, increased hand washing, etc.

Disease transmission models

We identify three major categories of models: population-level models, metapopulation models and individual-level models. Population-level models traditionally formulate compartments according to health state (e.g. susceptible, infectious and recovered) and simulate transitions between the compartments over time using population averages. These models are often based on the mass-action principle to designate the transmission probability. Each individual has an equal probability of contracting disease given the disease state levels in the population. Metapopulation models split the population into different subpopulations with their own (spatial) general characteristics and disease-related parameters. The individual-level category consists of network models and IBMs. Network models represent disease transmission on a network where nodes (individuals) are connected to each other using links. This allows modelling of individuals with different degrees, where the degree is the number of links a node has (i.e. number of neighbours/direct contacts). IBMs or ABMs typically incorporate more heterogeneity and stochasticity in individuals' characteristics such as spatial location, age, gender, sexual orientation, etc. The model selection depends on disease characteristics, data availability, modelling purpose (i.e. which outcome figures are of interest), computational resources, etc.
Individual-level models are gaining interest in the BCM literature since they can introduce heterogeneity in behaviours, tackle clustering of vaccine sentiments and look at stochastic and local outbreaks of infectious diseases with a high vaccination coverage (e.g. measles). Moreover, given an underlying contact structure, individual-level models are well suited to model social distancing behaviour in terms of reduced contacts as a prevention strategy. Remarkably, for measles and pertussis we found deterministic models only, despite the widely acknowledged stochastic nature of outbreaks in highly vaccinated populations. Note that, in table 1, we also made a distinction between individual-level and population-level models in the category 'disease transmission model'. Metapopulation models are displayed in bold.
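As a minimal sketch of the network representation described above, a contact network can be stored as an adjacency mapping, with a node's degree given by its number of neighbours and social distancing mimicked by breaking links between susceptible and infectious nodes. The node labels, the status codes and both function names are hypothetical and are not drawn from any reviewed model.

```python
def degree(network, node):
    """Number of links (direct contacts) of a node."""
    return len(network[node])


def break_risky_links(network, status):
    """Return a copy of the network with every susceptible-infectious link removed."""
    distanced = {node: set(neighbours) for node, neighbours in network.items()}
    for node, neighbours in network.items():
        for other in neighbours:
            if {status[node], status[other]} == {"S", "I"}:
                # Remove the link in both directions to keep the network undirected.
                distanced[node].discard(other)
                distanced[other].discard(node)
    return distanced


# Toy undirected contact network: each node maps to the set of its neighbours.
network = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b"},
    "d": {"b"},
}
status = {"a": "S", "b": "I", "c": "S", "d": "R"}

distanced = break_risky_links(network, status)
for node in sorted(network):
    print(node, "degree before:", degree(network, node), "after:", degree(distanced, node))
```

A rewiring variant would reconnect each broken link to another non-infectious node instead of simply deleting it, which preserves the total number of contacts.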

Information gathering

We observe that most BCMs are using information that is globally available and prevalence-based. These models are frequently game-theoretic (or pay-off maximizing) behavioural change frameworks coupled with disease transmission models at the population level. Studies that met our eligibility criteria, but are unclear about the information individuals use [14, 56, 85, 103, 111, 187] were excluded from figure 2. Given the increasing individual heterogeneity in disease transmission models, it is becoming more interesting to incorporate local information in BCMs. In network models and IBMs, one could for instance model the local spread of information through direct contacts with crucial implications in terms of clustering of both disease prevalence and opinions [186] .

How is the transfer from information to behaviour managed?

Based on full-text analysis, we extracted how individuals were modelled to translate the information they receive into behavioural change. Traditionally, behaviour formation models were composed of a game-theoretic framework in which individuals have perfect information on disease-related data and prevention effectiveness. Individuals are then assumed to use this information in a utility-maximizing game by comparing the expected costs of infection with the expected costs of the prevention measure. However, more advanced and different BCMs have been developed since. We identified five distinct categories for characterizing the decision-making process of individuals, listed in § §3.8. rsif.royalsocietypublishing.org J. R. Soc. Interface 13: 20160820 function moves susceptible individuals into lower susceptibility classes with lower transmission rates, independent of disease dynamics. These models are relatively rare and most often focus on policy implementations and short-term effects of behaviour on disease transmission.

Information threshold (29/178)

We retrieved 29 BCMs in which behaviour change is modelled conditional on exceeding a predefined information threshold [12, 42, 57, 58, 61-63, 70, 78, 81, 88, 114, 127, 132, 133, 135, 136, 138-144, 162, 163, 180-182]. The information the individual assesses can be obtained in a direct way (e.g. through prevalence in neighbours) or in an indirect way (e.g. through rumours or opinions). These models do not elaborate on how behaviour is rationally determined or influenced by relevant factors. Instead, behaviour formation is a result of a predefined threshold function. Examples include switching to social distancing when the number of infectives exceeds a threshold [114], social distancing by rewiring once a non-infected node connects to an infected node [132], and-as in Wu [184] [185] [186] 188, 189] .

In this category, instead of a threshold, the information is a continuous input in the decision-making process of individuals. At the population level, we can characterize these BCMs as information driving the flow into and out of the prevention-taking compartment. Two subcategories can be distinguished: models with a direct relation between infectious disease parameters and behaviour formation (i.e. behaviour changes vis-à-vis disease dynamics), and models with an indirect relation, through an information spread medium. For the former subcategory, the behaviour or decision-making process is predefined as a functional relation depending on disease transmission parameters. The functional form does not need to be linear. Some examples are vaccination coverage as a positive decreasing function of perceived risk of VRSE [148], a percentage of the susceptible population engaging in avoidance actions that increases as the disease becomes more prevalent [48], and an effective contact rate that reduces with the number of infectives [119].
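The difference between a threshold rule and a continuous functional relation can be made concrete with a small sketch. Both functions below are hypothetical illustrations, not the forms used in the cited studies: the step rule switches to a reduced contact rate once prevalence exceeds a threshold, whereas the continuous rule (an assumed saturating form, `baseline / (1 + sensitivity * infectives)`) lowers the contact rate smoothly as prevalence rises.

```python
def threshold_contact_rate(infectives, threshold, baseline, reduced):
    """Step response: contacts drop to `reduced` once prevalence exceeds `threshold`."""
    return reduced if infectives > threshold else baseline


def continuous_contact_rate(infectives, baseline, sensitivity):
    """Smooth response: contacts decline continuously with prevalence.

    The saturating form used here is an assumption made for illustration only.
    """
    return baseline / (1.0 + sensitivity * infectives)


if __name__ == "__main__":
    for prevalence in (0.0, 0.01, 0.05, 0.20):
        step = threshold_contact_rate(prevalence, threshold=0.05, baseline=10.0, reduced=4.0)
        smooth = continuous_contact_rate(prevalence, baseline=10.0, sensitivity=20.0)
        print(f"prevalence={prevalence:.2f}  threshold rule={step:.1f}  continuous rule={smooth:.2f}")
```

Either function can be substituted for a constant contact rate in a transmission model, which is how the behavioural component becomes coupled to the disease dynamics.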
The latter subcategory requires a third party to spread the information that individuals receive, for instance mass media, neighbours or the formation of opinions in the population. Many of these models introduce an 'aware' compartment in which aware and unaware individuals are assigned distinct disease transmission parameters, such that aware individuals have a lower susceptibility to infection. See for example Funk et al. [104], in which a rate introduces people into an 'aware' class, after which the awareness spreads through the population, coupling disease transmission with a BCM. Interestingly, some models introduce information spread models with characteristics from disease transmission models, where individuals are, for example, susceptible to or infected with disease-related information. Misra et al. [105] use a model with media coverage creating awareness in the population, also introducing an 'aware' compartment in a population model. Social impact is introduced in a model by Ni et al. [186], where they use a variety of complex networks for the spread of opinions driving the individual probability of prevention behaviour. Networks are convenient for modelling these dynamics because they allow clustering of, for instance, vaccine-related sentiments in the population. Most often these models assign additional characteristics to nodes (which represent individuals), apart from disease state. For example, a node may be assigned a disease state and an opinion that is either pro-vaccination or contra-vaccination. When nodes interact during simulation of the disease and behaviour dynamics in this network, transmission of both disease and opinions can occur. If a pro-vaccine node is surrounded by many vaccine sceptics, it might shift its opinion towards the opinions of its neighbours, and this in turn influences the individual's probability of taking vaccination as a prevention measure.

An economic objective function (37/178)

This 'economic' class of BCMs is also quite common with 37 articles being retrieved [10, 11, 13, 19, 21-24, 26, 32, 35, 41, 52, 59, 75, 76, 79, 87, 90, 101, 110, 112, 118, 128, 145-147, 151, 155, 157, 158, 160, 167, 169, 179, 183, 190]. This approach assumes individuals take their prevention decision based on an objective function, which they attempt to optimize (i.e. by maximizing benefits and/or minimizing costs). Game theory grounded models form an integral part of this category. By way of example, one can assume that individuals have knowledge about both the disease and their options for prevention and make rational decisions based on this knowledge. People accordingly possess a (perceived) cost of infection (c_i) and a (perceived) cost of the prevention measure (c_p), which can, for instance, be assumed to be 100% effective. Another important input in people's decision-making, their probability of infection (λ), can be assumed to be dependent on disease prevalence, which evolves over time. For instance, one can define this using an SIR model under the mass action principle as the force of infection, i.e. λ = βI, where β is the per-contact transmission rate, and I is the fraction of infectives in the population. This way the behavioural change framework can be coupled to the disease dynamics. The individual makes the following trade-off, with P, the choice of taking the prevention measure with an objective function with imitation. It is recognized that some social or peer influence should be incorporated in the decision-making process of the individuals (see also models with information as a dynamic parameter). As a response to this concern, the (rational) 'game-theoretic' model has been adapted to include social influence or imitation behaviour. In these models, it is assumed that people compare their own prevention-related behaviour with that of other individuals in society.
Through comparison, individuals learn whether their own behaviour is optimal and to what extent they should adapt it. Typically, a sampling rate is assumed for individuals sampling other individuals from the population. After sampling an individual from the population, the trade-off is compared and people switch strategies with a probability that is a function of the pay-off difference. Often, a Fermi-like function is used, guiding the adoption of the better strategy depending on the magnitude of the pay-off difference. Other switching functions or strategies are used, but naturally, the larger the beneficial pay-off difference, the higher the probability of switching behaviour. An example of a Fermi function, taken from [31], is given in this section. If we represent the pay-off of the strategies of individual i (with strategy s_i) and individual j (with strategy s_j) as ε_i and ε_j respectively, and define the pay-off difference as Δε_ij = ε_i − ε_j, then the probability of individual i switching to the strategy of individual j is given by a Fermi function of Δε_ij.
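The trade-off and the switching equation are not reproduced in this text, so the sketch below uses commonly used forms rather than the authors' exact expressions. It assumes that the pay-off of taking prevention is the negative cost of prevention, that the pay-off of not taking it is the negative expected cost of infection (force of infection times cost of infection, with prevention 100% effective), and that imitation follows the standard Fermi rule 1 / (1 + exp(difference / kappa)). The noise parameter `kappa` and all numerical values are hypothetical.

```python
import math


def force_of_infection(beta, infectious_fraction):
    """Mass-action force of infection: transmission rate times infectious fraction."""
    return beta * infectious_fraction


def payoff(takes_prevention, cost_prevention, cost_infection, lam):
    """Expected pay-off of a strategy; prevention is assumed 100% effective."""
    return -cost_prevention if takes_prevention else -lam * cost_infection


def fermi_switch_probability(payoff_i, payoff_j, kappa):
    """Probability that individual i adopts the strategy of individual j.

    Standard Fermi form with pay-off difference payoff_i - payoff_j;
    kappa (> 0) is an assumed noise parameter. Written in a numerically
    stable way so large differences do not overflow math.exp.
    """
    x = (payoff_i - payoff_j) / kappa
    if x >= 0:
        e = math.exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))


if __name__ == "__main__":
    lam = force_of_infection(beta=0.5, infectious_fraction=0.1)
    non_preventer = payoff(False, cost_prevention=0.02, cost_infection=1.0, lam=lam)
    preventer = payoff(True, cost_prevention=0.02, cost_infection=1.0, lam=lam)
    p = fermi_switch_probability(non_preventer, preventer, kappa=0.1)
    print(f"probability that a non-preventer imitates a preventer: {p:.3f}")
```

With equal pay-offs the switching probability is 0.5, and a small `kappa` makes individuals respond almost deterministically to pay-off differences, whereas a large `kappa` makes switching close to random.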

Model parametrization and validation

One may question how well BCMs approach reality, as there is a paucity of empirical data on behavioural responses to disease-related information informing these models. We examined whether and how data were used to parametrize BCMs, and to what extent these data support the underlying theoretical model. Moreover, we critically assessed model parametrization, distinguishing data-driven from assumption-driven parametrization, for the disease model, the BCM and the complete integrated model. A first, striking observation is that most models are solely theoretical because they are constructed independently from empirical observations. Often a stability analysis is performed, and equilibria are obtained in order to grasp the dynamics of the model in the absence of parameter values. Others perform numerical simulations, either making assumptions on parameters or referring to other studies to support their choice of parameters. Fewer than 20% of the studies have (partially) fitted or validated their model to behavioural and/or disease transmission data. Retrospective studies on disease emergence are particularly useful when real-time data on behavioural change and disease transmission during an outbreak are available over a sufficiently long time. Social media data and other electronic sources of information are also increasingly used, thus creating opportunities for 'big data' collection on disease transmission, behaviour formation and spatial location [25, 60, 66]. Next, we briefly describe studies constructing their models using observational data, i.e. studies not exclusively making assumptions or taking parameters from literature.
To underpin BCMs, participatory experiments have been performed to capture social distancing. Maharaj et al. [146] and Chen et al. [183] collected data through a game in which participants trade off social contacts against their risk of infection. Such data can be used to parametrize game-theoretic models of social distancing and adaptive networks with link deletion. In addition, survey data have been used to assess behavioural change. Zhong et al. [48] used data from the Public Risk Communication Survey (2009) to parametrize their BCM. Robinson et al. [14] surveyed sexual attitudes and lifestyle to build a sexual contact network. The IBM in Gray et al. [85] for syphilis transmission was also informed with survey data on sexual behaviour. Additionally, disease transmission parameters were calibrated from syphilis diagnosis among gay men in Victoria, Australia. A survey on altruism and self-interest was conducted by Shim et al. [23] to calibrate the behavioural change parameters regarding influenza vaccination. In Schumm et al. [127], the BCM is represented by a dynamic social contact network with social distancing, constructed from a survey and census data. Cohen et al. [24] surveyed a convenience sample of students about their risk perceptions for influenza A/H1N1 to estimate the utility values of different behaviours. The study by Fierro & Liccardo [70] used data on awareness and concern about the risk of contagion to populate their model on A/H1N1 influenza transmission with behavioural parameters. Moreover, they also validated their output through comparisons with Italian influenza surveillance data from 2009. The health belief model (HBM) [191] is frequently used to retrieve prevention behaviour and parametrize BCMs. The parameters in the HBM in Durham & Casman [3] were calibrated using survey data on perceived severity and susceptibility during the 2003 SARS outbreak in Hong Kong.
Karimi et al. also use the HBM for their ABM on influenza in 2015 [45]. For validation, the authors compare their model output with similar influenza ABMs in the literature. Another model tackling the influenza A/H1N1 pandemic in 2009 is the model by Bayham et al. [60], who used data from the American time-use survey and the National Health and Activity Patterns Survey (NHAPS). Moreover, Google Trends data are used as a proxy for subjective risk perception and weather data are used to control for the effects of extreme weather phenomena. Xia et al. [25] constructed a social network using data from an online Facebook-like community to build a BCM for disease and vaccine awareness during the 2009 influenza A/H1N1 pandemic in Hong Kong. The same pandemic inspired Springborn et al. to use home television viewing as a proxy for social distancing [56]. Pawelek et al. [66] used self-reported Twitter data for awareness spread and ILI surveillance data (UK Health Protection Agency) from the 2009 A/H1N1 influenza pandemic for disease transmission. In addition, Collinson et al. [68] constructed a model on influenza A/H1N1, incorporating mass media report data from the Global Public Health Intelligence Network.
Incidence and outbreak data have been useful to inform the disease dynamics parallel with BCMs. For the 2009 influenza pandemic, Zhong et al. [48] parametrized their transmission model with outbreak data from Arizona and Xiao et al. [65] estimated parameters using outbreak data (laboratory-confirmed cases) from Shaanxi province in China. Schumm et al. [127] focused on observational census and survey data from rural areas. Andrews & Bauch [41] calibrated both disease and behaviour parameters to vaccine coverage and disease incidence data. Althouse & Hébert-Dufresne [88] used surveillance-based incidence rates for syphilis and gonorrhoea from 1941 to 2002. Gray et al. [85] calibrated disease transmission parameters from data on syphilis diagnosis among men who have sex with men in Victoria, Australia. An HIV transmission model including adaptive condom use and sexual partnerships in South Africa is fitted to HIV prevalence data in Nyabadza et al. [71] . The publication makes projections for disease dynamics when scaling up condom use and reducing the number of sexual partners stepwise with 10%. Behavioural change parameters are not calibrated in this publication. The HIV model of Viljoen et al. [80] is fitted to prevalence data in South Africa and Botswana to look at the effect of awareness on disease spread.

What are current behavioural change models capturing?

In general, there is a need for empirical research to underpin the development of valid models approximating real-life behaviour and disease transmission. Some attempts in recent BCMs illustrate the difficulty of finding suitable observational data. For instance, Springborn et al. [56] used television viewing habits (average viewing time) as a proxy for social distancing, although this proxy is far removed from a direct estimation of social distancing in an outbreak situation. More promising sources of information include survey data using, for instance, the HBM framework (also see [191, 195, 196]) or time-use surveys [3, 14, 23, 24, 45, 48, 60, 72, 85, 127, 183], or digital sources such as social media [25, 60, 66, 146, 197]. Real-life data collection during the influenza A/H1N1 pandemic in 2009 has been a milestone for the parametrization of BCMs, with increased collection of both behaviour and disease-related information. For instance, Van Kerckhove et al. [198] studied social contact patterns of symptomatic ILI cases during the pandemic. We encourage the collection of such real-time data in future outbreaks to guide policy-makers in the establishment of an optimal response strategy. For some models, data are simply not available, and one needs to resort to assumptions to model behavioural change. Note also that excluding behavioural change from infectious disease models equates to assuming behaviour is unaffected by risk perceptions and disease incidence, and vice versa. Ignoring behavioural responses in the face of substantial changes in risk perceptions is probably worse than making assumptions within a theoretical model in the first place. This review was also limited by a lack of clarity about assumptions and methods in many publications, even though transparency is an essential part of publishing credible and replicable research.

Disease-dependent model specification

We observed that the specification of BCMs largely depends on the disease being investigated and the prevention measures considered. Clearly, the transmission characteristics (e.g. air and saliva borne versus STIs), the potential prevention measures (e.g. social distancing versus condom use) and the epidemic stage (e.g. emergence versus endemic equilibrium versus elimination) are interdependent, and determine both the utility and specification of a BCM. For instance, many influenza models use vaccination as a prevention measure with individuals evaluating their previous influenza vaccination decisions to determine the current season's strategy. It would seem unrealistic to require more data to parameterize both behavioural change and disease transmission models with the aim to develop more general models that suit any infectious disease, albeit that behavioural change in response to one disease's risk perceptions could change the risk perceptions of another. At the current stage of BCM development and parametrization, generalized BCMs accommodating multiple pathogens and different transmission routes seem unrealistic. However, it would be easier to combine multiple diseases with the same transmission and prevention properties. For instance, BCMs assessing the combined effects of vaccination scares on MMR and diphtheria, tetanus, pertussis (DTP) disease seem intuitively possible and relevant, though technically challenging and high on data demands.

Social networks and individual-based modelling

Some models use a single social network for both the disease transmission process and the formation of behaviour. Nonetheless, depending on the background, separate networks may be needed to model the spread of risks and the spread of information influencing behaviour. Take for instance anti-vaccine sentiments. These are often spread through blogs, Facebook groups and other social media [197] . Unlike these sentiments, infections are not spread through the Internet, and as a result require an additional network of physical contacts (see also Grim et al. [202] , who make the case for modelling multiple networks). Additionally, the timescale of disease transmission can differ substantially from that of information spread leading to behaviour change. The models by Fukuda et al. [31] , Helbing et al. [167] and Maharaj & Kleczkowski [134] are useful examples to guide further development of BCMs with separate parallel and sometimes interacting networks.

Conclusion

We have systematically reviewed the literature on BCMs published from 2010 until 2015. We analysed and classified 178 references after full-text processing. We proposed a classification of the BCMs based on the decision-making process of the individual. We can summarize our findings in line with the six aims we listed in the introduction. Regarding the technological advancements and increased data availability (i), we find that social media and big data are useful to parametrize BCMs and present an as yet insufficiently explored source of information. Social media can, however, introduce a bias in individuals' prevention- or disease-related perceptions. In addition to the health recommendations they make, policy-makers can optimize their influence by enabling the collection and accessibility of government-owned data (such as surveillance) and by establishing a quality label for disease-related websites. Further, we can confirm that behavioural immunity is often contingent on the disease (ii): BCMs are disease and situation-dependent, which we strongly support. Regarding model validation and parametrization with quantifiable observations (iii), we can state that additional data sources are needed to specify relevant BCMs. Although the 2009 influenza pandemic presented an opportunity for parametrization and validation of both disease transmission and BCMs for flu-like illnesses, there is still much room for improvement in other disease areas. Current models have, without a doubt, assessed the importance of social networks in individual decisions (iv). Individual-level models such as IBMs are extremely useful to tackle behaviour changes and to mimic disease transmission better. More specifically, (v) the diversity observed in BCMs has increased the feasibility of introducing social influences and irrational behaviour (vi).
In terms of policy recommendations, it is highly important to think about the total effect of an intervention, with possible implications on all prevention strategies.
The expansion of BCMs has been remarkably valuable. We encourage researchers to incorporate behaviour changes in future disease transmission models and to be transparent about the assumptions they make if data sources for parametrization or validation are sparse.

Abstract

In 2016 the World Health Organization identified 21 countries that could eliminate malaria by 2020. Monitoring progress towards this goal requires tracking ongoing transmission. Here we develop methods that estimate individual reproduction numbers and their variation through time and space. Individual reproduction numbers, R c , describe the state of transmission at a point in time and differ from mean reproduction numbers, which are averages of the number of people infected by a typical case. We assess elimination progress in El Salvador using data for confirmed cases of malaria from 2010 to 2016. Our results demonstrate that whilst the average number of secondary malaria cases was below one (0.61, 95% CI 0.55-0.65), individual reproduction numbers often exceeded one. We estimate a decline in R c between 2010 and 2016. However we also show that if importation is maintained at the same rate, the country may not achieve malaria elimination by 2020.

Great strides have been made since 2000 in reducing the burden and mortality of malaria. The World Health Organisation (WHO) estimates that 57 out of the 106 countries with endemic malaria transmission in 2000 reduced their incidence of malaria by >75% between 2000 and 2015 1. As a result, malaria elimination at the national level, defined as the absence of local transmission within a country 2, is now one of the targets in the WHO Global Strategy for Malaria 2016-2030 3. In 2016 the WHO identified 21 countries for which it would be realistic to eliminate malaria within the next five years 4.
As countries attempt to move towards malaria elimination, tracking progress through quantifying changes in transmission over space and time is key. This information is necessary to effectively target resources to remaining 'hotspots' and 'hotpops' 5 where transmission remains, decide if and when it is appropriate to scale back interventions, and to evaluate the success of existing interventions. However, as countries approach zero cases, increasing focality in transmission and the impact of imported cases pose challenges to both reaching elimination 6 and measuring progress towards that goal. Increased spatial and temporal heterogeneity in malaria cases 7-9 in low transmission settings reduces the usefulness of national or regional level trends in incidence or prevalence, which can mask small areas of high transmission intensity. Furthermore, end-game surveillance and control measures are increasingly expensive per case. Therefore, while interventions must be targeted efficiently to be cost-effective 7, 8, 10, the identity of areas driving remaining transmission and their stability over time are poorly understood.
A wide variety of contextually varying factors have been hypothesised to drive transmission in low transmission settings, including increased risk in concentrated populations due to factors such as occupation (e.g., agricultural workers) 6 , asymptomatic individuals acting as reservoirs of infection 11, 12 , changes in vector behaviour 13 and resistance to antimalarial 14 and insecticidal interventions 15 . Importation of malaria cases from neighbouring countries poses an additional challenge in many elimination settings. If many cases of malaria are imported, control measures may appear less effective due to small numbers of locally acquired cases arising from imported cases 16, 17 . If there is sufficient importation, local cases can continue to occur even when the reproduction number of malaria under control, R c , is below 1. Conversely areas with a high underlying R c but little importation may see sudden outbreaks of cases following a rare importation event due to their receptivity to malaria infection 18 . Challenges arise in measuring the sustainability of elimination 6, 17 , both in terms of quantifying the impact of control measures on transmission in the lead up to elimination, and in determining the risk of resurgence once elimination is achieved [19] [20] [21] . This information is also important when deciding if, when, and how to scale back intervention and surveillance methods 19 .
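The observation that local cases continue to occur when R c is below 1 can be illustrated with a short worked calculation. It is a simplified sketch, not the authors' method: it assumes that every case, imported or local, generates on average the same number R_c of secondary cases, so that the chain started by one imported case produces R_c first-generation cases, R_c^2 second-generation cases, and so on.

```latex
\[
\mathbb{E}[\text{local cases per imported case}]
  = \sum_{g=1}^{\infty} R_c^{\,g}
  = \frac{R_c}{1 - R_c},
  \qquad 0 \le R_c < 1 .
\]
% Illustration with the average reported in the abstract, R_c = 0.61:
\[
\frac{0.61}{1 - 0.61} = \frac{0.61}{0.39} \approx 1.56 .
\]
```

Under this assumption each imported case would be followed, on average, by roughly 1.6 locally acquired cases, so local incidence scales with the importation rate even though no chain is self-sustaining.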
Meeting these challenges requires measuring changes in transmission, often at fine spatial scales. However, existing methods used to quantify malaria transmission are poorly suited to elimination settings 9 . Parasite prevalence rates (PR) are not accurate below a PR of 1-5% 22,23 due to the large sample sizes necessary for precise estimates at low prevalence. The entomological inoculation rate (EIR), often seen as the 'gold standard' in measures of transmission intensity, is not reliable when transmission is highly focal and potentially unstable since EIR is very sensitive to heterogeneities in vector populations 24, 25 . Use of serological data, while promising [26] [27] [28] , is not currently feasible for use in many near-elimination contexts, as suitable cross-sectional survey data and/or appropriate markers to determine changes in malaria transmission are not available in all contexts.
A possible alternative, or complementary, measure of malaria transmission is the incidence of malaria cases, obtained through routine surveillance by Ministries of Health. Surveillance data are widely collected and sensitive to short term changes in transmission. While utilising these data can pose challenges, particularly in low-resource settings due to limitations in surveillance infrastructure and difficulty in establishing completeness of reporting, they can provide a wealth of information when such challenges are overcome. Individual level incidence data can be used to reconstruct the most likely pathways of transmission and estimate individual reproduction numbers, providing fine-scale insights into spatiotemporal transmission characteristics. While individual level surveillance data is often used in outbreak analysis of epidemic infections 29, 30 , robust methods are rarely applied to vector-borne diseases such as malaria, with a few notable exceptions 17, 31, 32 .
Here we aim to estimate individual reproduction numbers over time and space by adapting methods from the study of information diffusion processes. These methods address the general problem of reconstructing information transmission using known or inferred times of infection by a 'contagion' [33] [34] [35] [36] . They provide an adaptable framework to integrate multiple data types 37 , identify likely unobserved cases/external infection sources, and have been evaluated using real and simulated transmission processes at multiple scales and network structures 36 .
El Salvador provides a promising context to explore this approach. In 1980, El Salvador had the highest incidence of malaria amongst all Mesoamerican countries-with 95,835 cases and a 38% share of all cases in Mesoamerica. However, by 1995, the country contributed just 2%, maintaining low incidence until the present day. The country is now in the elimination phase and saw seven malaria cases in 2015 (0.1% of cases in Mesoamerica) 38 . Epidemiologists in El Salvador have kept records at a high spatial and temporal resolution throughout their malaria control and elimination efforts. In addition there has been a long history of reactive and active case detection, testing and treating all patients with fever with antimalarials and an extensive network of community malaria workers has been in place since the 1950s 38 , evidence suggesting that case detection and treatment is strong. A full understanding of elimination in El Salvador could therefore provide useful insights for other countries as they aim to achieve and sustain elimination. Using the epidemiological line-list maintained by the Ministry of Health, we applied our methods to these data to estimate how transmission varied over space and time in El Salvador between 2010 and 2016.
Our results suggest a decline in R c between 2010 and 2016, with seasonal peaks during the wet season and during holiday periods. However, based on the observed distribution of R c over time, with individual reproduction numbers often exceeding one (R c > 1), we cannot say with 95% confidence that the country will achieve malaria elimination by 2020, assuming that importation is maintained at the same rate. Our results illustrate the role of importation in driving transmission dynamics in this country and provide independent estimates of the likelihood that El Salvador can eliminate malaria by 2020.

Results

Network reconstruction and estimated R c values. Between 2010 and the first two months of 2016, a total of 91 cases of malaria were confirmed by microscopy in El Salvador, of which 30 were classified as imported. There were a total of six cases of P. falciparum, all of which were imported. Our estimated transmission network is shown in Fig. 1 . Overall, the temporal dimension, informed by the prior distribution for the serial interval (Fig. 1a) , dominates the identification of infector-infectee pairs (Fig. 1b) . We identified two locally acquired cases which could not be plausibly linked to other cases within the dataset (Fig. 1c) . These were estimated in periods in which a clear gap in the data was apparent, and may therefore represent unidentified importations, relapse cases or unreported locally acquired sources of infection.
Spatial distribution of cases and R c . Data were highly focal, with 70% of cases originating from two adjacent administrative departments neighbouring Guatemala, and 32% of cases originating from just two municipalities within these regions (Jujutla and Acajutla) (Fig. 3a, b) . This pattern was also reflected in the spatial distribution of R c . While most areas of the country are predicted to have a low risk of R c reaching above one over the time observed, several regions have a much higher predicted risk of R c > 1 (Fig. 3c ). In these regions, the majority of cases imported into the region could be expected to result in at least one onward transmission event. However it is important to note that uncertainty in these predictions is high in areas where we have not seen cases. The area where we have the least uncertainty in our estimate, around the borders of Guatemala, suggest that most cases occurring there did not contribute to onward transmission.
Impact of imported cases on transmission. The mean marginal gain to the likelihood of including infections from imported cases in the constructed transmission networks was much higher than that of including locally acquired cases (0.081 compared to 3.44e−7), suggesting that imported cases are a major driver of transmission. Visual inspection of the most likely chains of transmission ( Fig. 1c ) is also suggestive of this: the index case in a cluster of linked cases was almost always an imported case.
Endgame predictions based on R c and stochasticity. To investigate potential timelines to elimination (i.e., the absence of local transmission) we characterised heterogeneity in the reproduction number using a Gamma distribution which, when fitted to the data, suggests a threshold mean R c of 0.22, below which there would be a <5% chance of any individual reproduction number exceeding one. Using our fitted trend in the mean R c , we expect this level to be reached by 2023, assuming no change in the rate of importation (Fig. 2c ).
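The threshold calculation described above can be illustrated with a minimal sketch. It assumes individual reproduction numbers follow a Gamma distribution with a fixed shape and a varying mean, and searches for the mean at which the probability of exceeding one falls to 5%. The shape values used here are hypothetical (the fitted shape is not stated in this section), so the output is not expected to reproduce the reported threshold of 0.22; the function names are illustrative only.

```python
import math


def gamma_cdf(x, shape, scale):
    """Gamma CDF via the power series of the regularised lower incomplete gamma.

    Intended for modest shape values (well below 700); for very large
    x / scale the CDF is returned as 1.0 to avoid floating-point overflow.
    """
    if x <= 0:
        return 0.0
    z = x / scale
    if z > 700:
        return 1.0
    term = 1.0 / shape
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= z / (shape + n)
        total += term
        n += 1
    return total * math.exp(-z + shape * math.log(z) - math.lgamma(shape))


def prob_exceeds_one(mean_r, shape):
    """P(R > 1) when R ~ Gamma(shape, scale = mean_r / shape)."""
    return 1.0 - gamma_cdf(1.0, shape, mean_r / shape)


def threshold_mean(shape, tail=0.05, lo=1e-3, hi=1.0, iters=60):
    """Bisection for the mean R at which P(R > 1) equals `tail`."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if prob_exceeds_one(mid, shape) > tail:
            hi = mid  # too much mass above one: the mean must be lower
        else:
            lo = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for k in (0.5, 1.0, 2.0):  # hypothetical shape values
        print(k, round(threshold_mean(k), 4))
```

For a fixed mean, a smaller shape (more heterogeneity between cases) puts more mass in the upper tail, so the threshold mean is lower; this is why the degree of heterogeneity matters for the elimination timeline.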

Discussion

Understanding how transmission varies over time and space is critical to efforts to achieve and maintain elimination of infectious diseases such as malaria. Reconstructing transmission chains and estimating individual reproduction numbers has been used widely within epidemiological analysis 30, 39, 40 , but rarely used to study vector-borne or endemic diseases, albeit with a few notable exceptions 31, 32 . Separately, similar problems have been approached within human social network analysis, through a family of approaches known as independent cascade models [33] [34] [35] [36] .
Here we have adapted these methods to routine data from an eliminating Central American context, El Salvador, in order to inform progress towards their malaria elimination goals. Our results suggest that the time-averaged R c has been below 1 in El Salvador since 2010, suggesting that sustained endemic transmission at the country level has already been interrupted. However, we estimated individual reproduction numbers exceeding one, resulting in ongoing outbreaks of transmission. Assuming the downward trend observed in R c between 2010 and 2016 continues, we expect the probability of such outbreaks to be <5% by 2023 if current levels of malaria importation and control continue. However, because we found imported cases to have higher reproduction numbers and their inclusion in the transmission tree increased the overall likelihood of the tree much more than locally acquired cases, it is important to note that the rate of importation is likely to affect the distribution of R c . With increased importation this timeline to elimination could lengthen. Conversely, if importation was reduced, the timeline would be shortened. Thus the levels of malaria importation from neighbouring countries would likely need to be decreased in order to achieve elimination by 2020, following current WHO certification policy of three years of zero locally acquired cases.
The Elimination of Malaria in Mesoamerica and Hispaniola (EMMIE) initiative aims to eliminate local malaria transmission from the entire Mesoamerican region by 2020 41 . Our results support the need for a regional approach to elimination. The clear impact of importation in driving residual transmission in El Salvador highlights the need for cross-border collaboration. In order to drive transmission down, areas of the highest 'receptivity' to intervention and 'vulnerability' to importation of cases must be identified. Approaches such as ours, which map transmission risk, could be combined with information about human movement to identify foci for increased surveillance, vector control and other interventions. Our approach using El Salvador as a case study could be adapted and used in other Central American countries or other contexts aiming for elimination.
We identified two cases with no clear source. When raising the threshold likelihood for linking observed cases as part of our sensitivity analysis, and thereby reducing the number of possible edges in the network, we find 7 missing cases. There is evidence in some low transmission contexts, especially where rapid declines of malaria have been seen recently, of significant asymptomatic and/or submicroscopic reservoirs of infection which may contribute to onward transmission 42 . These could be sources of the missing infections identified in our study. However, El Salvador is unlikely to have a large number of asymptomatic cases due to a long history of low numbers of cases. If our missing sources of infection were mainly indigenous asymptomatic infections, it would signify that there is an asymptomatic reservoir contributing to onward transmission that must be controlled to reach elimination. This could be achieved through PCR-based screening and treatment or increased vector control in focal areas. An alternative explanation is that there may be a small number of symptomatic or relapse cases which were not reported or detected, which could be indigenous or imported. If due to importation, this would further support the need for strong regional cooperation via initiatives such as EMMIE to reduce burden in neighbouring countries, and to maintain vigilance over extended periods in a near-elimination stage.
It is important to consider whether the methods presented here can be used in low-resource settings that are earlier in the elimination process. In these contexts the number of cases is likely to be higher, and there may be less complete reporting data and potentially a larger reservoir of asymptomatic infection. In order to address these challenges, several adaptations to the methods presented here may be required. First, there may be a need to incorporate more sources of information, e.g., demographic, spatial and possibly genetic data 30, 37 . Second, Bayesian data augmentation techniques 43 may be required to explore the implications of large amounts of missing infection and potential reporting biases. In the case of more asymptomatic or untreated malaria there may be more uncertainty in the serial interval of malaria; however, our current approach can propagate this uncertainty through the model. Generalisations to full likelihood-based or Bayesian hierarchical inference 36 can be beneficial by providing flexibility through parametric forms that allow for the incorporation of additional factors (e.g., genetic distance) specific to the disease and context. This work provides a novel framework for making use of routine surveillance data, and allows quantification of malaria transmission and its variation over space and time in contexts where traditional methods such as parasite prevalence are unsuitable. This is key in designing optimal strategies to accelerate, achieve and maintain elimination. To apply the approach to other contexts, several adaptations and extensions may be required. First, in this dataset there were no confirmed relapse cases; however, in many contexts we may see P. vivax relapse, in which case the algorithm could be adapted to allow for a likelihood for 'reinfection' or a looped network edge. 
Second, in settings where transmission links are less clearly identifiable or different data sources are available, this approach can be adapted to incorporate additional features such as spatial or genetic distance weightings into the likelihood function 37 , following on from work based on Wallinga and Teunis approaches 30, 43, 44 . Finally, asymptomatic reservoirs and causes of missing cases, as well as their impact on transmission dynamics could be explored in more detail to consider surveillance system design and evaluation of its strength.
In conclusion, this work adapts concepts from network theory to build and apply novel methods to map transmission over space and time in a near-elimination setting, using only routine malaria surveillance data. Such approaches offer opportunities to better understand transmission dynamics and their heterogeneities in near elimination settings to better target interventions for elimination. We estimated timescales for reaching elimination and clarified the effect of importation on the speed and feasibility of achieving and maintaining zero cases. In the context of El Salvador, our results highlight the impact of importation on sustained transmission and highlight the need for cross-border collaboration. Our approach could be useful in a wide range of contexts where good quality routine surveillance data are collected, such as outbreaks and endemic diseases nearing elimination.

Methods

We defined a prior range of possible serial interval distributions for malaria. The serial interval distribution of treated, symptomatic P. falciparum malaria, previously characterised using empirical and model based evidence 47 was adapted to inform the prior distribution for the relationship between time and likelihood of transmission between cases in El Salvador. Two cases imported from West Africa were P. falciparum, however the remainder of cases were P. vivax. As a result the prior distribution was altered to better reflect the biology of P. vivax and the dominant vector species in El Salvador, Anopheles albimanus, but was uninformative enough to allow for possible variation in transmission dynamics, for example due to imported infections with P. falciparum. In addition, there is a possibility of a small number of asymptomatic or undetected and therefore untreated infections contributing to ongoing transmission, which will take on a longer serial interval. By defining a prior distribution for the serial interval distribution we can account for some of this uncertainty.
Determining the transmission likelihood. We assume cases were classified correctly from case investigation as imported or locally-acquired based on recent travel history. Following this assumption, locally acquired cases could have both infected others and been infected themselves. However imported cases could only infect others, as we assume their infection was acquired outside of the country. There were no confirmed relapse cases in the dataset, and all cases were treated with primaquine and chloroquine (radical cure) after being detected. Treatment is initiated before cases are confirmed by microscopy (see Supplementary Fig. 1 ). Active case detection is initiated locally following a confirmed case and in active foci in which local surveillance is believed to be weak. In these scenarios blood slides are examined within 24 h of being taken 49 . Given this, we assume that an individual can only be infected once by a case that has shown symptoms earlier in time.
We consider n infections with symptom onset times t = {t_1, ..., t_n}, time ordered such that t_1 ≤ ... ≤ t_n, which together form a transmission chain, T. The goal of our model is to infer the most probable network structure, G, connecting these n infections. We can view cases as nodes in a network G, and possible transmission events as the edges linking nodes. We infer G solely from the symptom onset times t, a serial interval distribution, and prior probability distributions for the serial interval distribution parameters.
G contains all possible spanning transmission chains over which an infection could spread given the observed times. G therefore includes the most likely transmission tree, but also includes other possible trees supported by the data. We can therefore view a particular transmission tree T as a realisation of a stochastic diffusion process generated over an underlying network G. Crucially, G accounts for competing edges and is sparse (it only includes plausible edges).
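As an illustration of how the set of plausible edges might be assembled from onset times alone, the following sketch scores every ordered pair of cases with a serial interval density and keeps only pairs that respect the stated assumptions (infectors show symptoms earlier, and imported cases cannot be infectees). The Gamma serial interval parameters, the cut-off `min_weight` and the function names are hypothetical placeholders, not values from the study.

```python
import math


def serial_interval_density(dt, shape=4.0, scale=10.0):
    """Gamma density for a serial interval of dt days (hypothetical parameters)."""
    if dt <= 0:
        return 0.0  # the infector must show symptoms strictly earlier
    return (dt ** (shape - 1) * math.exp(-dt / scale)
            / (math.gamma(shape) * scale ** shape))


def candidate_edges(onset_days, imported, min_weight=1e-6):
    """Return {(infector, infectee): pairwise likelihood} for plausible pairs."""
    edges = {}
    n = len(onset_days)
    for j in range(n):          # potential infector
        for i in range(n):      # potential infectee
            if i == j or imported[i]:
                continue        # imported cases were infected abroad
            w = serial_interval_density(onset_days[i] - onset_days[j])
            if w >= min_weight:
                edges[(j, i)] = w
    return edges


if __name__ == "__main__":
    onset = [0, 30, 45, 400]               # days since the first case
    is_imported = [True, False, False, False]
    for edge, w in candidate_edges(onset, is_imported).items():
        print(edge, round(w, 4))
```

In this toy example the case at day 400 receives no candidate infector, which is the kind of gap that would point to an unobserved source.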
For a given transmission tree T describing infection events linking cases and assuming the independent cascade model 33 , the (upper triangular) likelihood of observing our times of symptom onset is simply the product of all permissible pairwise transmission likelihoods in the tree 35 . Our exposition until this point is the same as that introduced by Wallinga and Teunis 29 and extended to a wide variety of contexts by others [30] [31] [32] 43, 44, 50 . However, in contrast to previous methods based on Wallinga and Teunis, we maximise the likelihood f(t|G) conditional on an underlying G, a problem that is NP-hard 51 . Previous approaches have either allowed all possible connections in G 29 , sampled from the likelihood 52 or explored a limited number of pathways 53 . Here, by following the approach introduced by Rodriguez and Schölkopf 35 , we find the most likely underlying transmission network given the timing of symptom onset for a set of k transmission events linking cases. The computational hardness of maximising f(t|G) means that an optimal solution could only be found by exploring every possible transmission tree on G. However, due to the submodularity of the independent cascade model 33 , a near-optimal solution can be found using a greedy algorithm. Briefly, the greedy algorithm starts with an empty graph and then adds edges sequentially such that the marginal gain in the likelihood of the transmission tree at each iteration is maximised. The marginal gain measures the importance of each edge of the network through the gain that the edge provides to the total solution over competing edges, and therefore applies shrinkage to the raw pairwise likelihood using the likelihood of competing edges. We stop when we have reached k edges (see Supplementary Fig. 2 ). Stopping at k edges ensures that the resulting network is sparse, which not only ensures parsimony but also removes unnecessary edges that could influence R c calculations. 
An appropriate value of k is defined by adding edges until the marginal gain in likelihood of adding additional edges is below a given threshold (0.0005). We carried out a sensitivity analysis and found our results to be robust to changes in this threshold between 0.001 and 1e−10 (Supplementary Note 2, Supplementary Fig. 3).
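The greedy procedure can be sketched in simplified form as follows. This is an illustrative reduction for a single outbreak, not the implementation used in the study: each infectee starts with a small baseline likelihood `eps` of infection from outside the data, the marginal gain of an edge is taken as the improvement in that infectee's log-likelihood, and edges are added until the best gain falls below `min_gain` or `max_edges` is reached. The gain scale, the default values and the function name are assumptions made for the example.

```python
import math


def greedy_network(edge_weights, eps=0.0005, min_gain=1e-9, max_edges=None):
    """Greedily add the edge with the largest marginal gain in log-likelihood.

    edge_weights: {(infector, infectee): pairwise likelihood}
    Returns a list of (infector, infectee, gain) in the order edges were added.
    """
    best = {}                     # best incoming likelihood per infectee so far
    chosen = []
    remaining = dict(edge_weights)

    def gain(edge):
        current = best.get(edge[1], eps)
        # An edge only helps if it beats the current explanation of the infectee.
        return math.log(max(remaining[edge], current)) - math.log(current)

    while remaining and (max_edges is None or len(chosen) < max_edges):
        edge = max(remaining, key=gain)
        g = gain(edge)
        if g < min_gain:
            break                 # no remaining edge improves the likelihood
        best[edge[1]] = remaining.pop(edge)
        chosen.append((edge[0], edge[1], g))
    return chosen


if __name__ == "__main__":
    weights = {(0, 1): 0.02, (0, 2): 0.017, (1, 2): 0.0125, (2, 3): 1e-6}
    for infector, infectee, g in greedy_network(weights):
        print(infector, "->", infectee, round(g, 3))
```

With a single outbreak this reduces to keeping, for each case, the most likely infector that beats the external baseline; competing edges such as (1, 2) above receive zero marginal gain once a better edge into the same case has been added, and case 3 remains unlinked.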
Accounting for missing cases. Assuming all cases reaching community health workers or health facilities are recorded, missing cases may be generated by two processes. Symptomatic cases may be missed because they do not seek care and are not found through active case detection, and/or cases may be asymptomatic and therefore unlikely to seek care or be detected. The latter may have densities of parasites in their blood which are too low to be detectable by microscopy if active case detection occurs. These processes apply to both imported and locally acquired cases. We assume the pool of asymptomatic cases in the country is small and makes a small contribution to ongoing transmission. To estimate the proportion of cases which may be going undetected within our independent cascade framework, we consider outside sources of infection, π, that represent unobserved individuals who can infect any observed individual, i, in a transmission chain. Every observed individual i can get infected by unobserved individuals, π, with an arbitrarily small probability ε. These so-called ε-edges are connected to every node in our network and do not, by design, participate in the diffusion propagation. The ε-edges prevent breaks in the network diffusion cascade where the likelihood of transmission between observed cases is sufficiently low; instead, the case is linked to an external source. The specific value of ε was set at 0.0005, aiming to find a balance between false positives and false negatives when linking cases by infection events. The higher the value of ε, the larger the number of nodes that are assumed to be infected by an external source.
In contrast to traditional methods based on Wallinga and Teunis 29 , using the marginal gain in this way encapsulates not only the pairwise likelihood of transmission between two cases, but also conditions this likelihood on the impact of competing edges in the inferred network. Given the provable near-optimal solution of the greedy algorithm and the use of marginal gains in calculating R c , our estimates of R c provide more rigorous estimates of reproduction numbers than standard Wallinga and Teunis 29 approaches alone, which do not consider the overall transmission tree in optimisation and do not account for missing cases (see Supplementary Note 3 for full derivation of methods).
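For comparison, the standard Wallinga and Teunis estimator referred to above can be sketched as below. It normalises the pairwise likelihoods over all candidate infectors of each case and sums the resulting probabilities for each infector. This is the baseline approach, not the marginal-gain method of this study, and the edge weights in the example are hypothetical.

```python
def wallinga_teunis(edge_weights, n_cases):
    """Individual reproduction numbers from normalised pairwise likelihoods.

    edge_weights: {(infector, infectee): pairwise likelihood}
    Returns a list r where r[j] is the expected number of cases infected by j.
    """
    incoming = [0.0] * n_cases
    for (_, i), w in edge_weights.items():
        incoming[i] += w          # total likelihood over all candidate infectors
    r = [0.0] * n_cases
    for (j, i), w in edge_weights.items():
        if incoming[i] > 0:
            r[j] += w / incoming[i]   # probability that j infected i
    return r


if __name__ == "__main__":
    weights = {(0, 1): 0.02, (0, 2): 0.017, (1, 2): 0.0125}
    print([round(x, 3) for x in wallinga_teunis(weights, 3)])
```

Here every case with at least one candidate infector is fully attributed to observed cases, so the reproduction numbers sum to the number of such cases; there is no external source that could absorb weakly supported links.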
Mapping R c . To map estimates of transmission risk, individual reproduction numbers were divided into those above and below one. The latitude and longitude of the reproduction numbers were included in a geospatial hurdle model implemented in rINLA 54 , where demographic and environmental covariates were used to estimate the likelihood of a case having a reproduction number above 1 if imported into the area in 2010 (Supplementary Note 4, Supplementary Table 2). This is a measure of malaria 'receptivity' or underlying transmission potential rather than overall malaria risk, as importation likelihood is not quantified in this analysis. AUC scores from leave-one-out cross-validation were used to assess model fit (Supplementary Fig. 4).
2 section matches

Abstract

Due to increased human mobility worldwide, air-transportation data and mathematical models have been widely used to measure the risks of global dispersal of pathogens. However, the seasonal and interannual risks of pathogen importation and onward transmission from endemic countries have rarely been quantified and validated. We constructed a modelling framework, integrating air travel, epidemiological, demographical, entomological and meteorological data, to measure the seasonal probability of dengue introduction from endemic countries. This framework has been applied retrospectively to elucidate spatiotemporal patterns and the increasing seasonal risk of dengue importation from South-East Asia into China via air travel in multiple populations, Chinese travelers and local residents, over the decade 2005-15. We found that the volume of airline travelers from South-East Asia into China quadrupled from 2005 to 2015, with the number of Chinese travelers increasing rapidly. Following the growth of air traffic, the probability of dengue importation from South-East Asia into China increased dramatically from 2005 to 2015. This study also revealed seasonal asymmetries of PLOS Neglected Tropical Diseases | https://doi.

Discussion

The number of imported cases reported in surveillance systems could be predicted by the estimated time series with a one-month lag, which might be due to the longer period of travel and to delayed identification and reporting of imported infections by routine surveillance. The gaps between the estimates and reported numbers found in this study also highlight the need to improve the capacity of surveillance systems and to formulate strategies to mitigate these imported contagions. Public health authorities and partners in areas with a large volume of imported infections and a high risk of autochthonous transmission should consider implementing appropriate actions at an early stage of potential seasonal transmission. These could include health education for Chinese travelers, early identification of infections at entry points, and improved capacity for surveillance, vector control, laboratory diagnosis, and clinical management.
2 section matches

Abstract

Methodology/Principal Findings: We develop an extension of standard epidemiological models, appropriate for emerging infectious diseases, that describes the probabilistic progression of case numbers due to the concurrent effects of (incipient) human transmission and multiple introductions from a reservoir. The model is cast in terms of surveillance observables and immediately suggests a simple graphical estimation procedure for the effective reproductive number R (mean number of cases generated by an infectious individual) of standard epidemics. For emerging infectious diseases, which typically show large relative case number fluctuations over time, we develop a Bayesian scheme for real time estimation of the probability distribution of the effective reproduction number and show how to use such inferences to formulate significance tests on future epidemiological observations. Conclusions/Significance: Violations of these significance tests define statistical anomalies that may signal changes in the epidemiology of emerging diseases and should trigger further field investigation. We apply the methodology to case data from World Health Organization reports to place bounds on the current transmissibility of H5N1 influenza in humans and establish a statistical basis for monitoring its evolution in real time.

Introduction

The quantity that measures the epidemic potential of a pathogen is the basic reproduction number R 0 [11, 12] . R 0 is defined as the average number of new infections created by an infectious individual in an entirely susceptible population. For established human pathogens, leading to standard epidemics, R 0 > 1, as is the case of seasonal or pandemic influenza [13] [14] [15] [16] [17] [18] [19] . In practice, epidemiological data typically permit only the estimation of the effective reproduction number R, which may differ from R 0 due to acquired immunity and other factors. For an emerging infectious disease, when transmission is only incipient [20] and the pathogen is adapting to the population, it becomes crucial to monitor quantitative changes of the effective reproduction number over time. Thus, the detection and tracking of an emerging disease can be formalized in terms of monitoring R, as it evolves and approaches the critical threshold R → 1. This is likely the current state of H5N1 avian influenza in humans, where complete absence of human to human transmission would imply R = 0, but likely R is very small, as a few cases of possible human contagion suggest [21] [22] [23] .
19 section matches
The Ebola outbreak of 2014 has demonstrated very clearly the vast harm that infectious diseases can cause when combined with other exacerbating factors. Poor health systems, a slow international response, low public trust in government and medicine in the outbreak area, the introduction of the virus to urban settings, and social practices such as burial rites that increase the risk of contagion have created a 'perfect storm' for Ebola transmission (1) . The gravity of the Ebola outbreak is demonstrated by thousands of deaths, 1 and by the potential for further spread. The interconnected and cross-border nature of the threat has facilitated Ebola transmission in West Africa, and it has also led to cases appearing in both the United States and continental Europe.
As the global community mobilises in response to Ebola, it is clear that there is a need to better acknowledge that the spread of infectious diseases is related to the circumstances in which human society is organised. This has long been the case; human interactions with animal hosts have led to infectious disease outbreaks dating at least as far back as the Justinian Plague (541–542 AD), while global trade and travel have facilitated disease transmission, from the plague in the 14th century, smallpox in the 16th century, to SARS or a novel influenza virus in the 21st century (2–4). Indeed, it is often noted that in today's interconnected world, a disease can spread nearly anywhere on Earth within 24 hours. But infectious disease risks are not simply a by-product of globalisation. As Ebola demonstrates, a great many factors, some not restricted to the disease control or public health arena, can influence the spread of infectious disease. Assessing disease-related health risks thus necessitates understanding where diseases might arise and how successful they might be in transmitting. The largest risks tend to occur when novel diseases appear, when familiar diseases appear in novel geographic locations, or when preventative disease control measures break down, whether due to socioeconomic inequalities, lack of resources, accident, or conflict.
Despite this, the interconnections and interdependencies that create (and are created by) emerging infectious diseases tend to be downplayed, to the detriment of public health. In order to address the topic, this paper provides disparate examples of infectious disease outbreaks, with the common theme being the many factors that combined to worsen the situation. 'Infectious disease drivers and interconnected risk' section will discuss the types of factors that can influence the spread of infectious disease. 'Examples of interconnections and interdependencies in recent infectious disease events' section will provide some examples of recent infectious disease incidents that have been created by the interconnectedness of factors introduced in the 'Infectious disease drivers and interconnected risk' section. These examples are primarily focused on the European Union (EU), the area in which the authors are based, but the global linkages have been emphasised. 'Analysing interconnected and interdependent infectious disease risks' section will discuss some of the integrated risk modelling approaches that have been developed in an attempt to monitor and even predict infectious disease transmission, and the 'Conclusion' section will summarise this paper by reiterating the need for and importance of recognising the interconnected and interdependent nature of infectious disease risks.

Infectious disease drivers and interconnected risk

Globalisation and environmental change

A wide range of infectious disease drivers can be grouped under this category, including climate change, land-use patterns, global trade and travel, migration, and so on. Climate change involves mean temperature increases in many parts of the world, as well as increased likelihood of adverse or even extreme weather events (11–13). Many infectious diseases are temperature sensitive as many vectors and pathogens are dependent upon permissive ambient conditions. There is thus a substantial body of research that collectively demonstrates that warming will increase the transmission of vector-borne diseases in the geographic ranges of their distribution (14–18). Changing temperature and precipitation patterns can affect the habitats and population growth of cold-blooded disease vectors, such as mosquitoes and ticks, as well as the replication rates of infectious diseases within their hosts, and even the rates at which disease-carrying vectors bite humans (18–20).
Among the best substantiated indicators of the observed effects of climate change on infectious disease is evidence of an altitudinal increase of malaria in the highlands of Colombia and Ethiopia (21) and of the northerly expansion of the disease-transmitting tick species, Ixodes ricinus, in Sweden (22) . Many modelling studies project significant shifts in the transmission of vector-borne diseases such as malaria (23, 24) , dengue (25) , and Chikungunya (26) under climate change scenarios, but it is important to note that the extent of observed changes will depend on the presence or absence of mitigating measures, such as vector control or socioeconomic development (27, 28) . Other examples of infectious diseases in Europe anticipated to be affected by climate change include West Nile virus (29) , salmonella (30) , campylobacter, and cryptosporidium (31, 32) .
Intensified global trade and travel, not to mention migration, render political borders irrelevant and create further possibilities for global disease transmission (36–38). There are numerous examples of the arrival, establishment, and spread of 'exotic' pathogens to new geographic locations, including malaria, dengue, Chikungunya, West Nile, and bluetongue in recent years, aided by shipping or other trade routes (36) . This process is facilitated when the environmental conditions in different parts of the world share common characteristics (36) . Meanwhile, numerous vaccine-preventable diseases, such as polio, meningitis or measles, can also be introduced or reintroduced to susceptible populations as a consequence of international travel (39) .

Social and demographic factors

Social and demographic contexts can significantly influence the transmission of infectious disease, while also creating increased vulnerabilities for some population subgroups. The elderly are at greater risk of many infectious diseases, and the ageing trend in many high-income countries could increase the challenges related to nosocomial (hospital-acquired) and nursing-home acquired infections. An additional challenge related to population ageing is that the share of employed workers in a country decreases. The combination of more people to care for and fewer tax-related revenues may challenge publicly financed public health and disease control programmes (7) .
It is widely established that socially and economically disadvantaged groups suffer disproportionally from disease (42) . This is applicable to infectious disease burdens in both high- and low-income settings (43, 44) . Income inequalities are generally widening globally, and this appears to have been exacerbated in many countries due to the global economic crisis (45) . Rising unemployment and the prospect of public health budget cuts can increase the risk of infectious disease transmission (44, 46) , with a prominent example being an outbreak of HIV among people who inject drugs (PWID) in Greece (see 'Measles among Roma in Bulgaria and HIV among PWID in Greece: the impact of socioeconomic contexts' section) (47, 48) . In a similar fashion, it has been speculated that tuberculosis rates could rise in some countries in Central and Eastern Europe (49) .
Social trends and behaviours can also play a significant role in infectious disease transmission. The most notable example would be vaccine hesitancy, the phenomenon through which vaccination coverage rates remain suboptimal due to the varying and complicated reasons that individuals may have for not getting vaccinated (50, 51) . In some cases, this might be related to misconceptions about the safety or efficacy of vaccines (50, 52) , whereas in others this may be related to religious or cultural beliefs (53) .

Health and food production

The financing, provision, and quality of healthcare systems; the availability of vaccines, antivirals, and antibiotic medicines; and appropriate compliance with treatment protocols are all important determinants of infectious disease transmission. Although the correlation between healthcare system financing and efficacy is not perfect, recent budget cuts to healthcare are an important consideration when anticipating infectious disease risk. In part related to the global economic crisis, it has been reported that many high-income governments have introduced policies to lower spending through cutting the prices of medical products and, for example, through budget restrictions and wage cuts in hospitals (54) . There are many indirect and direct pathways through which budget cuts could affect disease transmission; to provide just one example, it has been estimated that 20–30% of healthcare-associated infections are preventable with intensive hygiene and control programmes 2 – should investments in this area diminish, then healthcare-acquired infections could become an even more problematic issue. There are currently roughly 4.1 million healthcare-associated infections each year in the EU alone. 3 A broader issue related to healthcare provision is population mobility for both healthcare professionals and patients who might increasingly seek work or healthcare in other countries – the provision of cross-border healthcare and the mitigation of cross-border health threats will necessitate collaboration across borders (55, 56) and solutions for the brain-drain of medical personnel from resource-poor countries (57) . Also related to healthcare provision and practice is the over-prescription or overuse of antibiotics. 
In combination with a lag in pharmaceutical innovation, rapid transmission, and poor infection control measures, this has driven resistance of organisms such as methicillin-resistant Staphylococcus aureus, or extended-spectrum beta-lactamases, and carbapenemase-producing gram-negatives such as Klebsiella pneumoniae carbapenemase (KPC) (58) . Antimicrobial resistance is currently one of the major health risks facing society (59) .

Dengue and Chikungunya in Europe: links with globalisation and environmental change

The sustained transmission of a vector-borne disease requires the presence of a pathogen, a vector capable of transmitting that pathogen, and a susceptible human population (36) . The interconnections between multiple disease drivers and interdependencies created by a globalised world have enabled the global expansion of vector-borne diseases, of which recent examples include the arboviruses dengue (70) and Chikungunya (71, 72) .
In Europe, the climatic conditions have been permissive enough to enable Ae. albopictus to gradually expand (often via transportation networks) from its introduction in Italy, where it arrived in 1990 (76) . Today, Ae. albopictus is established in many regions of the Mediterranean Basin, including in Spain, France, Italy, Croatia, and Greece. In addition, the mosquito has been introduced to regions as far north as Germany, the Czech Republic, and Slovakia. 4 Models based on the known climatic determinants of Ae. albopictus suggest that many more areas of Europe could be suitable habitats for the mosquito (77) as well as for Chikungunya transmission, with some regions currently also amenable to dengue transmission (74) . Under climate change scenarios, additional areas in Central and Western Europe, but fewer areas in southern Europe, could be climatically suitable (26, 78) .
As a result of the continuing expansion of Ae. albopictus in Europe, aided by trade and travel networks, climatic conditions, and genetic evolution, two diseases previously 'exotic' to Europe now pose a persistent infectious disease risk. This sort of risk is clearly not restricted to Europe; in late 2013, Chikungunya was introduced and then initially locally transmitted in a few Caribbean islands, creating a high risk of further disease spread across the region. 5 By early September 2014, Chikungunya transmission had been reported in 31 countries and territories in the Caribbean and the Americas, with over 700,000 suspected cases and over 8,600 confirmed autochthonous cases. 6

Measles among Roma in Bulgaria and HIV among PWID in Greece: the impact of socioeconomic contexts

Socioeconomic contexts affect the spread of disease. When financial circumstances deteriorate, the most vulnerable members of society are at even greater risk of infectious disease. Two examples in this section demonstrate the links between public health provision, income disparities, and infectious disease.
Another group highly vulnerable to infectious diseases in Europe is people who inject drugs (PWID). In the wake of the global economic crisis, many health professionals across Europe anticipated adverse effects in relation to infectious disease incidence and control (92). An observed rapid increase in reported HIV cases among PWID in both Greece and Romania was thought to be linked to the economic crisis (48). In Greece, for example, HIV incidence among PWID increased by 1600% in 2011 (47). From an epidemiologic standpoint, it is difficult to causally link the economic crisis with an upsurge in HIV in a particular subpopulation. Nonetheless, there are potential causal factors worth noting. Yearly change in the Greek GDP has been found to be inversely associated with HIV case reports, homelessness, unemployment among PWID, and HIV prevalence among drug injectors seeking drug treatment in Athens (47). Meanwhile, causal pathways have been hypothesised to include the following factors: the economic recession increased income disparities, leading to increased homelessness; this contributed to increased injecting network sizes among PWID, which were then exposed to new introductions of HIV from migrant communities, themselves subject to difficult socioeconomic circumstances in their home countries (47). Transmission risk was intensified by low levels of available injecting equipment and other prevention services (47). Although the causal linkages are difficult to demonstrate, this scenario suggests that the economic crisis, which originated in the financial sector, may have contributed to a large upswing of HIV in a vulnerable population, making conditions among PWID even more tenuous and, simultaneously, leading to fewer resources dedicated to harm prevention.
5 http://ecdc.europa.eu/en/press/news/_layouts/forms/News_DispForm. aspx?List08db7286c-fe2d-476c-9133-18ff4cb1b568&ID0940&Root Folder0/en/press/news, accessed January 12, 2014.
Birds are a natural reservoir for influenza viruses, and influenza A virus subtypes H5, H7 and H9 have all led to outbreaks in human populations (94). In recent years, the most significant outbreaks have been related to H5N1 and H7N9. Although only a limited number of human cases of influenza A(H5N1) have been reported, the high case fatality and the virus's potential ability to adapt to human hosts have raised concern at the global level (95). More recently, in the spring of 2013, 145 people in China were infected by the avian influenza strain A(H7N9), leading to 45 deaths (96). The virus was detected in poultry but also in the environment. The closure of live poultry markets in April 2013 did lead to a dramatic drop in the number of cases (97), although sporadic cases have been reported through the end of 2013. 7 A broad combination of factors can trigger and sometimes amplify avian influenza outbreaks (98, 99). Ecological and environmental factors play a key role. The population density of both humans and animals, as well as the proximity between them, are known risk factors for avian influenza infections in humans. Live bird markets and human consumption patterns of poultry and other avian species are also known to contribute to the risk of both influenza emergence and infection (100). Seasonality is another influencing factor, although different hypotheses exist as to why influenza transmission is traditionally driven by winter seasons (101). Bird migratory patterns, particularly where migratory birds might interact with livestock poultry, create potential pathways for introduction of the virus into new regions. Air travel can quickly lead to the rapid global spread of influenza (3). Meanwhile, the level of available public health measures, from molecular surveillance to rapid vaccine production, is an important determinant of the impact that any given influenza outbreak might have (102).
Significantly, different avian influenza strains have different characteristics, further challenging the public health response. For example, influenza A(H5N1) is highly pathogenic in birds, leading to natural sentinel surveillance systems, while influenza A(H7N) may circulate among healthy birds, thereby remaining undetected (103) .
The combination of conflict and circulating poliovirus is potentially highly problematic, as there are currently more than two million Syrian refugees globally, with an increase of 700,000 since July 2013 (111) . This vast number, with high numbers of children, many of whom may be unvaccinated, creates the potential for further regional spread of poliovirus. In a globalised world, this risk is not localised. In the EU, 10% of new asylum applications between January 1 and August 2013 were from Syria, while the number of undocumented migrants from Syria has also increased (111) . Such circumstances increase the risk of poliovirus reintroduction and transmission.

Analysing interconnected and interdependent infectious disease risks

There is a growing awareness within the health sector that the wide range of factors that can combine to affect the transmission of infectious disease, many of them risks generated from other sectors, need to be better and more holistically monitored, assessed, and acted upon. This is reflected by calls to approach the topic from a broader systems perspective (7, 33, 116–119), which tends to emphasise the need for integrating insights from multiple sectors and disciplines. Similarly, the One Health approach, which recognises the intimate relationship between environmental conditions, animal health, and human health, has been promoted at the global level 15; ways of operationalising One Health concepts have also gained traction (120). Growing attention to the social determinants of health, meanwhile, is another crucial development for assessing and monitoring infectious-disease-related risks.
This approach has been used to predict the environmental suitability of malaria transmission in Greece (124). Malaria was eradicated from Greece in 1974, but in 2009 (and subsequent years), locally acquired cases were identified. Remotely sensed data were used to describe the environmental and climatic conditions where future transmission could happen in Greece. Sea-level altitude and the mean and annual variation of land-surface temperature, both for daytime and night-time, were predictors in this model. Defining the areas of high risk helped guide public health responses and integrate preparedness and response activities, including targeted epidemiological and entomological surveillance, vector control activities, and awareness raising among the general population and health workers, in the areas environmentally suitable for transmission.
Linking data from other relevant fields is currently at a very early stage of development. Incorporating human behaviour into infectious disease modelling has thus far remained an understudied area (127). There are, however, recent promising approaches that involve tapping into mainstream and social media so as to measure and monitor issues such as vaccine hesitancy (128). The emerging field of digital epidemiology seeks to leverage the vast amount of digital information that exists and combine data relevant to both the transmission of disease and health behaviours (129). This field offers enormous promise but also needs to make progress in resolving questions surrounding methodologies, data quality and availability, and the privacy of online data.
8 section matches

Abstract

Current scientific evidence suggests that dromedary camels are a major reservoir host for MERS-CoV and an animal source of transmission to humans [4] . Camel husbandry is a deeply rooted aspect of Qatari culture. Camels are deemed precious animals, a source of business and a symbol of socioeconomic status.

Actions of the ERC Network of Decision Makers, Professional Partners, and Interested Parties

On 4 October 2012, the media reported that the first Qatari case (Q1) had been cured and that the patient was recovering. The media also reported on a press conference held by the Medical Committee of the Qatar Hajj Commission stating that "all clinical and preventive preparations for the Hajj season were in place and that there was no concern of the emergence of an outbreak as no scientific evidence was available on human-to-human transmission up to that point in time". Despite the concern, no cases of the novel coronavirus were reported among the 3.2 million pilgrims, the citizens of the KSA, or the citizens of Qatar until after the end of October 2012.

On 24 November 2012, the SCH issued a press release reporting that a second case of the novel virus had been confirmed. The press release was issued four days after the case was confirmed. It stated the following: (1) the patient was admitted to the hospital by the end of October and was diagnosed with the novel virus on 20 November; (2) the patient was recovering but had been transferred abroad upon the request of his family; (3) all of the patient's suspected contacts were screened and tested negative, as confirmed by a qualified external laboratory; (4) WHO had been officially notified of the case (Q2), which was identified as the sixth case worldwide; (5) intensive consultations were held via conference call with several scientific entities (such as WHO) on the day following the confirmation of the case (21 November). The media reacted by circulating a WHO report on the disease remarking that MERS-CoV belongs to the SARS family and that an alert was issued to reinforce surveillance globally. However, it acknowledged that more information was needed to understand MERS-CoV's virology (Table 2).

November 2012

Between November 2012 (the end of the Second Phase reported above) and August 2013, the Kingdom of Saudi Arabia (KSA) continued to report cases and deaths caused by MERS-CoV. The hypothesis that MERS-CoV infection was related to camels resurfaced when the local media covered a British study suggesting that camels may have a role in MERS-CoV transmission to humans. This finding was considered very controversial because of the high regard in which camels are held in Qatari society (Table 3).

August 2013

More facts were being shared to help establish a realistic understanding of the virus. The press release explained that human-to-human transmission was possible, but it also affirmed that the disease could be treated (locally).
There were now concerns as to why, after 9 months of zero cases, people were hearing about 5 new cases reported in less than 2 months. This created more pressure on the SCH professionals to provide reasonable explanations. This epidemiological breakthrough helped prove that the SCH was not waiting passively for the big technical institutions to reveal the characteristics of the virus but was actively engaged with them in the efforts to understand the characteristics and risk of the disease. Yet this discovery also implied that the investigation efforts would not be easy. It would not be easy to convince the camel owners (the majority of whom were involved in the camel racing business) to accept the notion that their camels might play a role in disease transmission.

2-6 December 2013

On 2 December 2013, SCH and MOE organized a joint press conference to address the concerns of the public. The organizations hoped to disseminate two messages: firstly, that the MERS-CoV epidemic in Qatar was not spreading, and, secondly, that the recent detection of the source of the virus would contribute to a greater understanding of the nature of the disease and its mode of transmission.
A national plan to screen camels was announced; nonetheless, the SCH and MOE explained that there was no intention to restrict camel movement, due to the fact that there was insufficient evidence of the role camels played in the disease transmission pattern.

Discussion

However, the response was not without error. The inherent uncertainty and lack of sufficient information caused the government spokespersons and the media to provide conflicting messages, which might have contributed to raising public concern. For example, at the very first press conference, held on 24 September 2012, the spokesperson called on the public "not to panic", despite the unknown path the "novel virus" could have taken. As a consequence, the media persistently focused on the resemblance of MERS-CoV to SARS to anchor the lack of information on previous knowledge and experience [3]. Similarly, another public official announced on 6 December 2013 (almost one year after the first case) that Qatar is "safe", which seemed to contradict the fact that MERS-CoV's mode of transmission remained unclear and neighboring countries continued to report new cases and deaths.
1 section matches

III. EPIDEMIOLOGICAL MODELING WITH QUARANTINE AND ISOLATION

On a very basic level, an outbreak such as the one in Hubei is captured by SIR dynamics [17]. The population is divided into three compartments that differentiate the state of individuals with respect to the contagion process: (I)nfected, (S)usceptible to infection, and (R)emoved (i.e. not taking part in the transmission process). The corresponding variables S, I, and R quantify the respective compartments' fraction of the total population such that S + I + R = 1. The temporal evolution of the number of cases is governed by two processes: infection, which describes the transmission from an infectious to a susceptible individual with basic reproduction number R_0, and recovery of an infected individual after an infectious period of average length T_I. The basic reproduction number R_0 captures the average number of secondary infections an infected individual will cause before he or she recovers or is effectively removed from the population.
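
The SIR dynamics described above can be illustrated with a minimal numerical sketch. It assumes the textbook form of the model, dS/dt = -alpha*S*I, dI/dt = alpha*S*I - beta*I, dR/dt = beta*I, with recovery rate beta = 1/T_I and transmission rate alpha = R_0/T_I; the excerpt does not state the equations explicitly, so this parametrisation is an assumption. The function name `simulate_sir`, the forward-Euler scheme, and all numerical values are hypothetical choices made for illustration only and are not taken from the source.

```python
def simulate_sir(r0, t_infectious, i0=1e-4, days=200.0, dt=0.01):
    """Integrate the textbook SIR equations with forward Euler.

    r0           -- basic reproduction number R_0 (illustrative value)
    t_infectious -- mean infectious period T_I in days (illustrative value)
    i0           -- initial infected fraction (assumption)
    Returns a list of (time, S, I, R) tuples, with S + I + R = 1.
    """
    beta = 1.0 / t_infectious   # recovery rate, assumed equal to 1 / T_I
    alpha = r0 * beta           # transmission rate, so that R_0 = alpha / beta
    s, i, r = 1.0 - i0, i0, 0.0
    trajectory = [(0.0, s, i, r)]
    steps = int(round(days / dt))
    for k in range(1, steps + 1):
        new_infections = alpha * s * i * dt   # flow S -> I during one step
        new_removals = beta * i * dt          # flow I -> R during one step
        s -= new_infections
        i += new_infections - new_removals
        r += new_removals
        trajectory.append((k * dt, s, i, r))
    return trajectory


if __name__ == "__main__":
    # Hypothetical parameters chosen only to show the qualitative behaviour.
    run = simulate_sir(r0=2.5, t_infectious=8.0)
    peak_time, _, peak_infected, _ = max(run, key=lambda row: row[2])
    print(f"peak infected fraction {peak_infected:.3f} at day {peak_time:.1f}")
    print(f"final removed fraction {run[-1][3]:.3f}")
```

In this sketch the infected fraction grows only while R_0 * S > 1, which is why an outbreak with R_0 below 1 decays from the start. Quarantine and isolation measures of the kind named in the section heading would need additional terms that this basic model does not include.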
2 section matches

Abstract

The outbreak of the novel coronavirus disease, COVID-19, originating from Wuhan, China in early December, has infected more than 70,000 people in China and other countries and has caused more than 2,000 deaths. As the disease continues to spread, the biomedical community has urgently begun identifying effective approaches to prevent further outbreaks. Through rigorous epidemiological analysis, we characterized the fast transmission of COVID-19 with a basic reproductive number of 5.6 and proved that a sole zoonotic source originated in Wuhan. No changes in transmission have been noted across generations. By evaluating different control strategies through predictive modeling and Monte Carlo simulations, a comprehensive quarantine in hospitals and quarantine stations has been found to be the most effective approach. Government action to immediately enforce this quarantine is highly recommended.

DISCUSSION

[...] clinics and the contagion among family members. The limited medical resources only allowed patients with severe symptoms to be hospitalized. A low sensitivity of diagnosis further increased the waiting time for a confirmed case. Insufficient hospital beds resulted in a large number of home-isolated patients, often leading to family infection. Observing the tremendous epidemic, the Chinese government built square cabin hospitals with more than twenty thousand beds and quickly moved all patients into these hospitals. All people with suspected symptoms or with close patient contacts were isolated in the government-managed quarantine stations. This comprehensive quarantine method has successfully reduced the transmission rate by 81.5%, and also greatly shortened the infectious time interval. The analysis in this study showed that the epidemic could be controlled within a few weeks if the comprehensive quarantine were conducted on January 20th or earlier. Concurrently with the development of this manuscript, it was reported that South Korea had more than 1,000 home isolations in the city of Daegu for suspected SARS-CoV-2 infection. It is highly recommended that other countries immediately quarantine all suspected patients.