Complex Data and Deep Learning for Disease Outbreak Prediction

Disease outbreaks are difficult to predict because they occur only when multiple factors combine to trigger the rapid spread of disease. Key factors can often be identified, e.g., excess rainfall leading to outbreaks of Rift Valley fever virus (RVFV), but the complex circumstances that lead to outbreaks remain elusive for several reasons. First, gathering varied datasets (climatic, genetic, demographic, historical, and behavioral) is time-consuming and expensive. Second, the computing capabilities needed to mine and analyze such varied and complex datasets have not been available until recently. In this application, using RVFV as a case study, we are modeling the interplay between vectors, livestock, wildlife, climate, and humans. In collaboration with InsightAI, we are also applying machine learning to construct models for inference and prediction of RVFV outbreaks. We achieve broad applicability by separating data gathering from deep learning and execution. As such, once curation and conversion of a dataset have been completed, one can take advantage of deep learning without the help of a computer expert. Because deep learning makes few assumptions about the data, this approach is transferable to other outbreak scenarios and diseases.


RVFV is a Category A arbovirus pathogen that causes human hemorrhagic fever, retinitis, and encephalitis in Africa and the Middle East. The virus has high potential for deliberate release and bioterrorism, since it may be aerosolized and transmitted by inhalation or contact with mucosal tissue as well as by blood-feeding mosquitoes. Given the lack of a small animal model that faithfully recapitulates the diverse outcomes of human infection, and the fact that RVF epidemics occur in remote areas, little is known about how innate immune events contribute to disease pathogenesis and protective adaptive immunity, aside from limited observations that delayed type I interferon (IFN) responses are associated with severe illness and mortality.


Funded by the Center for Innovation in Global Health (2017)