NLP for Public Health: Vaccine Hesitant Persona Mapper, A Case Study

Summary

Xyonix partnered with Columbia University to tackle vaccine hesitancy by gathering and analyzing millions of social media documents and building a detailed public vaccine hesitancy map. The map is available to enrich public health strategies and to power tailored interventions that boost vaccination rates.

The Challenge

Columbia University embarked on a pioneering research project aimed at understanding vaccine hesitancy by analyzing public discourse across various online platforms. The goal was to map the vast spectrum of beliefs driving vaccine hesitancy, recognizing that not all hesitancy stemmed from identical concerns. Some individuals were influenced by political beliefs, while others harbored community or religious biases. This nuanced understanding was crucial for public health departments to devise more targeted and effective messaging strategies.

1. Public health departments struggled with vaccine hesitancy, which COVID-19 exacerbated, and wanted more targeted messaging.

2. Columbia University researchers lacked consistently available, specialized machine learning expertise.

The Solution

Data Collection and Corpus Creation

We collected millions of documents from social media platforms over a 15-month period, capturing the public discourse around vaccines before and during the COVID-19 pandemic. This extensive corpus formed the foundation of our analysis.

Taxonomy Development and Annotation

We developed a comprehensive taxonomy of vaccine hesitancy reasons and applied it to the corpus. We manually annotated more than 6,000 examples, establishing ground truth for the machine learning models used to build the vaccine hesitancy map.
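To make the annotation step concrete, here is a minimal sketch of how a taxonomy can turn one annotated document into a training target for a multi-label model. The category names, the `TAXONOMY` list, and the `to_multi_hot` helper are assumptions for illustration only. They loosely echo reasons mentioned in this case study and are not the project's actual taxonomy or code.

```python
# Hypothetical taxonomy: illustrative labels only, not the project's real one.
TAXONOMY = ["political", "religious", "community", "conspiracy"]


def to_multi_hot(labels, taxonomy=TAXONOMY):
    """Encode a collection of reason labels as a 0/1 vector aligned with taxonomy."""
    unknown = set(labels) - set(taxonomy)
    if unknown:
        # Reject labels outside the taxonomy so annotation typos surface early.
        raise ValueError(f"labels not in taxonomy: {sorted(unknown)}")
    return [1 if name in labels else 0 for name in taxonomy]


# One annotated document can carry several reasons at once (multi-label).
annotation = {"text": "made-up example post", "labels": ["political", "conspiracy"]}
vector = to_multi_hot(annotation["labels"])
```

Keeping the vector aligned with a fixed taxonomy order means every annotated example, whatever mix of reasons it carries, yields a target of the same length.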

Machine Learning and Active Learning

Utilizing the manually annotated examples, we trained a multi-label classification model. Through an active learning process, this model was continuously refined, allowing us to extend our analysis across the entire corpus with increasing accuracy.
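One common way to run an active learning loop is uncertainty sampling: score the unlabeled documents with the current model, then send the ones it is least sure about to human annotators. The sketch below illustrates that general idea and is not a description of the project's actual selection strategy. The function names, document ids, and probabilities are all hypothetical.

```python
def uncertainty(probs):
    """Distance from 0.5 of the least confident label; smaller means more uncertain."""
    return min(abs(p - 0.5) for p in probs)


def select_for_annotation(predictions, batch_size):
    """Return the ids of the batch_size documents the model is least sure about."""
    ranked = sorted(predictions, key=lambda doc_id: uncertainty(predictions[doc_id]))
    return ranked[:batch_size]


# Made-up per-label probabilities from a multi-label classifier.
predictions = {
    "doc_a": [0.98, 0.02, 0.01],  # confident on every label
    "doc_b": [0.55, 0.10, 0.90],  # first label is close to a coin flip
    "doc_c": [0.80, 0.40, 0.05],  # second label is somewhat uncertain
}
queue = select_for_annotation(predictions, batch_size=2)
```

In this toy run `doc_b` and `doc_c` are queued for annotation while `doc_a` is skipped. After each annotation batch the model would be retrained and the remaining documents rescored.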

API Development and Hosting

To make the corpus and our findings accessible, we developed a hosted service with an API, enabling easy navigation and utilization of the annotated corpus and machine learning model outputs. We also made the corpus available in annotated and raw forms for other researchers to access and analyze further.

  • Collected and analyzed millions of social media documents to understand public sentiment on vaccines.

  • Created a multi-label classification model using active learning for precise analysis.

  • Developed a detailed taxonomy for vaccine hesitancy.

  • Launched a hosted service with an API for easy access to insights and data.
 

The Results

  • Richly Annotated Corpus: Our work produced a corpus of millions of documents, augmented with more than 6,000 manual annotations and millions of machine-generated ones. Together they form a hesitancy map that captures public attitudes towards vaccines during the COVID-19 pandemic and the surrounding period.

  • Granular Insights into Vaccine Hesitancy: Through our taxonomy and subsequent analysis, we uncovered the diverse causes of vaccine hesitancy, from political beliefs to conspiracy theories, offering nuanced insights critical for targeted public health messaging.

  • Empowered Public Health Departments: Equipped with our in-depth findings and our client’s subsequent published findings, public health departments were better positioned to develop precise interventions designed to boost vaccination uptake.

  • Sustained Technical Expertise: By partnering with Xyonix, Columbia University overcame the challenge of maintaining industry-grade technical expertise over a sustained period, independent of the academic calendar. This contributed significantly to the project's success and to its enduring influence on public health policy and strategy.



Learn more about the project in our podcast episode.


Have a project in mind? We might be able to help. Reach out to us.