The growth rate of Covid-19 has flattened quite a bit here in Washington state, and we are now discussing how to reopen society. If we are not careful, and move too aggressively, it may be only a few short weeks before we are back in the position we were in a few weeks ago. As a sailor and AI practitioner, I’ve been asking myself what role AI can play in making this return seamless.
Why do I mention sailing? Sailing is a relatively safe sport, due in large part to a number of high-tech tools; perhaps the most significant is regular access to accurate weather forecasts.
The National Weather Service operates a massive sensor array, including sensors located on weather balloons, in airplanes, on land, in sea buoys, and on satellites.
The sensors measure things like atmospheric pressure, temperature, humidity, and wind velocity [1]. All of this data is continuously gathered, stored, and made readily available to downstream forecasting models (both public and private) that ultimately produce various spatiotemporal forecasts. The forecasts are made at various forecast horizons (e.g. different hours or days into the future) and for various locations or regions. Weather forecasts include, but are not limited to, temperature, wave height, and wind direction and magnitude.
So why am I going through all of this? Because none of it exists for infectious disease on anything like the scale we have for weather.
We’ve seen how incredibly instrumental weather forecasts are for not just preventing death and mayhem on a wide scale, but also for ensuring a stress free sail.
Back to Covid-19 and reopening the economy: how do we do it? There’s been much talk of contact tracing, which is certainly required and relevant. But even before someone officially tests positive (so you can contact trace) in a region (e.g. neighborhood, city, county or state), other leading variables likely indicate the probability of an uptick in confirmed positives in that region.
Just like a confluence of variables can indicate a hurricane is on the way, we believe an infectious disease forecasting service powered by AI can predict the likelihood of a new wave of C19 and other infectious diseases.
While we likely cannot detect the arrival of a single case which can trigger a new round of infections, our hypothesis is that a new round of infections will be preceded by signals detectable by a health sensor network. We also believe there exists a significant latency between this time period and a formal confirmed positive molecular testing based diagnosis.
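To make the latency hypothesis concrete, here is a minimal sketch of how one might estimate the lead time of a sensor signal over confirmed cases: shift one series against the other and keep the shift with the highest correlation. The series below are synthetic and the function names are our own illustrative choices; a real analysis would need far more data and care.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def best_lead_days(signal, confirmed, max_lag=14):
    """Return (lag, r) where signal[t] correlates best with confirmed[t + lag]."""
    best = (0, pearson(signal, confirmed))
    for lag in range(1, max_lag + 1):
        r = pearson(signal[:-lag], confirmed[lag:])
        if r > best[1]:
            best = (lag, r)
    return best

# Synthetic illustration only: confirmed cases are the sensor signal delayed by 5 days.
fever_signal = [3, 4, 3, 5, 8, 13, 21, 30, 38, 41, 39, 33, 25,
                18, 12, 9, 6, 5, 4, 4, 3, 3, 4, 3, 3]
confirmed_cases = [3] * 5 + fever_signal[:-5]

lag, r = best_lead_days(fever_signal, confirmed_cases)
print(f"estimated lead: {lag} days (r = {r:.3f})")  # lead is 5 days for this synthetic data
```

A high correlation at some lag does not by itself prove that the signal is predictive; it only suggests which lead times are worth testing in a proper forecasting model.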
So, how do we build a system that can accurately forecast essential infectious disease information that can help everyone from individuals to companies, to governments determine what actions to take in the future?
Similar to weather sensors, we first require a wide array of geographically dispersed, privacy-respecting, health-related sensors that can be used in AI-backed regional forecasting models, like those we discussed in our article titled Using AI to Predict Influenza, C19 and other Infectious Disease Rates.
First, let us consider smart body-based sensors like the Kinsa smart thermometer, which are likely to provide a time lead over a confirmed diagnosis. This thermometer has national coverage, and uploads temperature data shortly after a person’s temperature is taken. Most importantly, well before a patient schedules an appointment to see a doctor, informs the physician of symptoms, gets a Covid-19 test ordered, and some time later gets results, this or similar thermometer data can be made available to an infectious disease forecasting service. In addition to an increase in body temperature, C19 has some other telltale signatures like a dry cough.
Automated acoustic analysis of cough sounds has been available for many years [2]; identifying a class of cough with distinct sounds has been done [3, 4] and more advanced work to directly identify C19 presence likelihood in a given patient is also under way with promising results [5, 6].
In addition to temperature and cough-based sensors, lung (or pulmonary) sounds may also play a role in providing leading predictive data for forecasting. We at Xyonix built an AI system to enable a digital stethoscope to automatically gather training data and classify heart and lung sounds. Pulmonary sounds have been studied extensively as an indicator of pneumonia [7].
In addition to leveraging acoustic analysis to assess lung condition, a smart pulse oximeter like that built by iHealthLabs [8] can be an effective way to detect and centrally upload oxygenation problems. By the time a patient feels shortness of breath, C19 caused pneumonia may already have progressed into the lungs. Early detection of oxygenation problems can help detect silent hypoxia, which serves as a leading indicator for the need for formal C19 treatment. [9]
Other potentially important body based sensors might include excreta sensors. Smart toilet technology is also rapidly making progress. Here at Xyonix we are helping a company automatically analyze in-bowl imagery. A team at Stanford has built a toilet that can detect multiple disease indicators automatically [10]. In a recent Stanford Medicine study, 32% of Covid-19 patients exhibited gastro-intestinal symptoms and 12% showed diarrhea symptoms [11]. Smart toilet detection of these symptoms could prove valuable in higher level AI forecasting models, for example if a large spike in a population appeared.
Widespread Self-Reported Symptom Assessment
Regular self-reporting of symptoms can be rolled out immediately and stands a reasonable chance of providing leading predictive information. For example, Facebook recently partnered with the Delphi Group at CMU to gather symptom data from a sample of Facebook’s massive audience; together they released county level maps of Covid-19 symptoms. [12, 13] The project’s goal is to work with multiple large tech companies and eventually get millions of weekly symptom updates for use in forecasting models and visualizations. While we know many patients remain asymptomatic for some time, we are increasingly discovering new symptoms of the virus that often precede a clinical diagnosis. For example, we are hearing reports of a loss of smell and taste [13.1]; in addition, new symptoms like rashes, sometimes discolored, on skin patches along toes and fingers are being reported. [13.2]
In addition to person based sensors, any infectious disease forecasting service will require environmental sensors and data.
Sensors can help provide environmental information about population density, social distancing adherence, body temperatures in public spaces, and weather information like temperature and sun exposure.
For example, Landing.AI recently released a tool that analyzes video from multiple camera vantage points in public spaces to measure subject spatial distancing. [14]
Multi-spectral cameras in public spaces can also measure body temperatures and inform infectious disease forecasting. Another potential environmental sensor is being researched at the University of Michigan; the team is assessing sewage for Covid-19. [15] We know viruses like influenza can propagate more readily depending on the weather. A recent study in Wuhan found that Covid-19 “seems to spread better in summery weather, with an optimum temperature of 19˚C, humidity of 75 per cent and less than 30 millimetres of monthly rain. Even more worryingly, the researchers found that cold air destroys the virus.” [16, 17]
In addition to person-based and environmental sensors, we need to rely on biological test data. One likely essential biological factor will be the presence of antibodies in a given community. A number of new serological tests are emerging that test for the presence of Covid-19 antibodies. Recently, for example, a study found that about 25% of New York City residents are estimated to have antibodies to the virus, whereas much of the rest of the state had an antibody presence rate of around 4% [18].
An infectious disease forecasting (IDF) service might benefit significantly from having this information as a higher presence of antibodies in a region will likely reduce the infectiousness and future confirmed case counts. And conversely, a lower antibody presence will likely have the opposite effect.
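One simple way to see why antibody prevalence matters to a forecasting model is the textbook relationship between the basic reproduction number and the fraction of the population that is immune. The sketch below rests on strong assumptions that we want to flag loudly: that antibodies confer full protection, that the population mixes homogeneously, and that the basic reproduction number is 2.5, a hypothetical value chosen only for illustration.

```python
def effective_reproduction_number(r0, immune_fraction):
    """R_eff = R0 * (1 - p), assuming homogeneous mixing and fully protective immunity."""
    if not 0.0 <= immune_fraction <= 1.0:
        raise ValueError("immune_fraction must be between 0 and 1")
    return r0 * (1.0 - immune_fraction)

ASSUMED_R0 = 2.5  # hypothetical value, for illustration only

# Antibody rates are the approximate figures quoted in the text above.
for region, antibody_rate in [("New York City", 0.25), ("rest of state", 0.04)]:
    r_eff = effective_reproduction_number(ASSUMED_R0, antibody_rate)
    print(f"{region}: R_eff is roughly {r_eff:.2f}")
```

Under these assumptions, the region with higher antibody prevalence starts from a noticeably lower effective reproduction number, which is the kind of regional difference a forecasting model could learn to exploit.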
We’ve reviewed an array of potential leading health sensor indicators, but in order to leverage AI to actually make Covid-19 forecasts, we need regular and accurate updates of the things we are trying to forecast. In machine learning or AI parlance, we call these target variables. The CDC has a pipeline for regularly obtaining data like confirmed Covid-19 positives and deaths, including case zip code data, from state health departments nationwide. There are, however, many areas for improvement in the pipeline. For example, the system is not fully automated and includes many points of manual data assembly; in addition, the lag times are neither consistent nor enforced. Nonetheless, some organizations can add value to the data and are taking matters into their own hands. The New York Times, for example, has a GitHub repository containing daily confirmed case counts for every county and state in the US.
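As a rough illustration of turning published counts into a target variable, the sketch below converts cumulative county-level case counts into daily new cases. The rows are synthetic, and the column layout (date, county, state, fips, cases, deaths) is our assumption about how such a county-level file is organized; check the actual data source before relying on it.

```python
import csv
import io
from collections import defaultdict

# Synthetic rows for illustration; the column layout is an assumption.
SAMPLE_CSV = """date,county,state,fips,cases,deaths
2020-04-01,King,Washington,53033,100,5
2020-04-02,King,Washington,53033,130,6
2020-04-03,King,Washington,53033,128,6
2020-04-04,King,Washington,53033,170,9
2020-04-01,Pierce,Washington,53053,20,0
2020-04-02,Pierce,Washington,53053,27,1
"""

def daily_new_cases(csv_text):
    """Convert cumulative case counts into daily new cases per county FIPS code."""
    rows = sorted(csv.DictReader(io.StringIO(csv_text)),
                  key=lambda r: (r["fips"], r["date"]))
    previous = {}
    targets = defaultdict(list)
    for row in rows:
        fips, total = row["fips"], int(row["cases"])
        if fips in previous:
            # Cumulative counts can be revised downward; clamp the difference at zero.
            targets[fips].append((row["date"], max(total - previous[fips], 0)))
        previous[fips] = total
    return dict(targets)

targets = daily_new_cases(SAMPLE_CSV)
print(targets["53033"])
```

Clamping negative differences at zero is one simple design choice among several; a production pipeline might instead redistribute revisions over earlier days.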
While we have data that can be used today for forecasting, a centralized effort could go a long way toward ensuring individual privacy while improving the consistency, latency, breadth and accessibility of relevant data. If done right, this would in turn enable improved accuracy in AI-based infectious disease forecasting as well as valuable, information-dense visual dashboards.
Sharing Effective Interventions and Remediations
One key ingredient to this system, which goes further than what we do with weather forecasting, is to actually include all intervention and remediation efforts taken by state, county and municipal governments. If our dataset also includes these efforts in a normalized fashion, such as the dates and extent of school and business closures, our system could also help automatically assess the impact of these changes on actual case counts. If government officials are armed with detailed knowledge on how remediation actions affected future case counts, deaths and hospital utilization, they can make more informed and granular decisions. For example, a high probability of infection in a particular neighborhood 2 weeks out might trigger sample based population testing, which might in turn lead to the limited shutdown of businesses and schools in that neighborhood, rather than relying on the very blunt and broad based often state-wide actions being taken and discussed today.
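To sketch what a normalized intervention record might look like, and how it could be paired with case counts, consider the following. The `Intervention` fields, the 0-to-1 `extent` scale, the 7-day window and the 5-day delay are all hypothetical choices of ours, and the case series is synthetic.

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass(frozen=True)
class Intervention:
    region: str    # e.g. a county FIPS code
    kind: str      # e.g. "school_closure" or "business_closure"
    start: date
    extent: float  # hypothetical scale: 0.0 (none) to 1.0 (full closure)

def mean_growth(daily_cases, first, last):
    """Average day-over-day growth ratio for dates in [first, last]."""
    ratios = []
    day = first
    while day <= last:
        prev = daily_cases.get(day - timedelta(days=1))
        cur = daily_cases.get(day)
        if prev and cur is not None:
            ratios.append(cur / prev)
        day += timedelta(days=1)
    return sum(ratios) / len(ratios) if ratios else None

def growth_before_after(daily_cases, intervention, window=7, delay=5):
    """Growth in the window before the start vs. a window beginning `delay` days after it."""
    start = intervention.start
    before = mean_growth(daily_cases, start - timedelta(days=window),
                         start - timedelta(days=1))
    after_start = start + timedelta(days=delay)
    after = mean_growth(daily_cases, after_start,
                        after_start + timedelta(days=window - 1))
    return before, after

# Synthetic series: 20% daily growth at first, then a 10% daily decline.
cases, value = {}, 10.0
for i in range(30):
    cases[date(2020, 3, 1) + timedelta(days=i)] = value
    value *= 1.2 if i < 13 else 0.9

closure = Intervention("53033", "school_closure", date(2020, 3, 10), 1.0)
before, after = growth_before_after(cases, closure)
print(f"growth before: {before:.2f}, after: {after:.2f}")
```

A naive before-and-after comparison like this cannot establish that the intervention caused the change, since other factors shift at the same time; it is only meant to show the kind of question a normalized dataset would let us ask consistently across regions.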
To summarize, we have not seen the last of Covid-19; in fact, we are now entering a phase where more granular forecasts and remediation impact tracking are essential. This is also not the last pandemic we will see; it is essential that we build up our defenses to minimize the impact. Artificial Intelligence, including deep learning based systems like those we discussed HERE, may prove instrumental. In order to implement such an exhaustive and accurate infectious disease forecasting system, however, we require coordinated action and significant investment in a wide health sensor network and a data aggregation and dissemination platform.