Health and Social Media Group






This project concerns social and online media influences on HIV/STI transmission behaviors. To combat HIV effectively, we need to know where treatment resources should be directed and where new health risks are developing. One challenge is that communities’ and individuals’ risk factors for HIV change dynamically, and it usually takes several years for these changes to show up in the large surveys that government agencies publish. In that time, early prevention opportunities go unused. One candidate for a frequently updated data source that can predict HIV rates is Twitter. Twitter can be accessed in real time, and a subset of tweets can be mapped to state, county, and even zip-code regions of origin. We expect that sociodemographic variables (e.g., poverty) and Twitter content will predict HIV outcomes, and that these two factors interact. For instance, when sociodemographic factors make people more likely to seek out social information (e.g., because they are uncertain and lack resources; Albarracín et al., 2010), reading a local friend’s tweet that encourages risky sex may be more influential. We are using research-guided methods to identify HIV risk factors and associated language features (e.g., words referring to drug use). Additionally, we will use machine learning methods to explore new keywords and topics that the research literature has not yet captured. In the long term, we plan to extend these studies to predict future outbreaks of HIV.
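As a rough illustration of the interaction hypothesis above, the following Python sketch fits a linear model with a poverty-by-Twitter-language interaction term to made-up county data. It is not the project's actual model: the variable names, the synthetic numbers, and the choice of ordinary least squares are all assumptions made for illustration.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b] without mutating the inputs.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Swap in the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_interaction_model(poverty, risk_language, hiv_rate):
    """Least-squares fit of:
    hiv_rate = b0 + b1*poverty + b2*risk_language + b3*poverty*risk_language

    All three inputs are hypothetical county-level lists of equal length.
    Returns [b0, b1, b2, b3]; b3 is the interaction coefficient.
    """
    rows = [[1.0, p, t, p * t] for p, t in zip(poverty, risk_language)]
    k = 4
    # Normal equations: (X'X) beta = X'y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, hiv_rate)) for i in range(k)]
    return solve_linear_system(xtx, xty)


if __name__ == "__main__":
    # Synthetic counties on a 3 x 3 grid of poverty and risk-language values.
    poverty = [p for p in (0.1, 0.2, 0.3) for _ in range(3)]
    risk_language = [t for _ in range(3) for t in (1.0, 2.0, 3.0)]
    # Outcome generated from known coefficients, with no noise added.
    hiv_rate = [2 + 10 * p + 0.5 * t + 4 * p * t
                for p, t in zip(poverty, risk_language)]
    print([round(b, 3) for b in fit_interaction_model(poverty, risk_language, hiv_rate)])
```

A positive interaction coefficient in such a model would mean that risk-related Twitter language is more strongly associated with HIV outcomes in higher-poverty counties, which is the pattern the hypothesis describes.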

To carry out this work, we have a fantastic team of contributors from many different areas. Our team includes principal investigator Dolores Albarracín (U of I), research assistant professor Man-Pui Sally Chan (U of I), as well as professors Bo Li (Statistics, U of I), Hari Sundaram (U of I, Advertising and Computer Science), Chengxiang Zhai (U of I, Computer Science), Shaowen Wang (Geography, U of I), Liliane Windsor and Marta Durantini (Social Work), Travis Sanchez and Patrick Sullivan (Epidemiology, Emory University), and David Holtgrave (Public Health, Johns Hopkins University).

The team also includes outstanding graduate students Cici Liu (Psychology, U of I), Benjamin White (Psychology, U of I), Alex Morales (Computer Science, U of I), Ismini Lourentzou (Computer Science, U of I), and Carol Ann Lee (Social Work, U of I), as well as highly accomplished postdoctoral fellows Aashna Sunderrajan (a psychologist, U of I), Bita Fayaz (an economist, U of I), Annie Jung (a psychologist, U of I), and Thomas O'Brien (a psychologist, U of I). The project is greatly facilitated by Sehr Amer, who received a BS in Psychology.

Current Projects


Contact PI / Project Leader: ALBARRACIN, DOLORES



PROJECT SUMMARY With the advent of Big Data methods, social media are the proverbial low-hanging fruit for disseminating HIV prevention and testing messages on a large scale 1,2. These media already transmit messages on condom use, HIV testing, and Pre-Exposure Prophylaxis (PrEP) from government institutions, NGOs, private citizens, and community groups, but they do so in an informal way. This real-time repository of real-world health messages, along with our ability to mine and pinpoint counties that need to engage in a conversation about HIV prevention and testing, offers a unique opportunity to develop Big Data methods for geographically targeted message dissemination. Despite some interventions designed for online delivery 3–19, the overall potential of social media and their most promising contents (e.g., actionable messages with behavioral instructions) have surprisingly not been established to date. Our project will focus on the disease-burdened population of Men Who Have Sex With Men (MSM), and will develop a highly significant computing infrastructure (Aim 1) to automatically and continuously input social media postings from Twitter, Facebook, and Instagram, behavioral data from the American Men Internet Survey (AMIS), and HIV prevalence data from, and using that triangulation, to target counties that need social media messages about condom use, HIV testing, and/or PrEP for MSM. Using machine learning methods, the same platform will then select actionable and acceptable messages to fill county gaps. Once the platform has been refined with input from research participants who are employees of health departments, it will be used to send experimental messages (Aim 2), selected to match county needs and to be actionable and acceptable, to a group of health departments randomized to the experimental condition. 
The success of the experimental messages will be gauged by the hallmarks of social media (reposts, Likes, Dislikes, and comment favorability), compared with the success of a random selection of HIV-relevant messages sent to a different group of health departments randomized to the control condition. The project is innovative in several ways. First, the social media messages will involve diverse inputs (text, images, videos) never before brought together in this area. Second, we are not aware of any prior use of the proposed triangulation involving epidemiological, behavioral, and social media data. Third, a method of mining naturally accruing messages will be new and transformative, allowing for the generation of “live” campaigns whose selected messages remain current, sustainable, and community-based by design. Further, the use of an implementation-science experiment at a large, geographically distributed scale is highly novel. These research aims are facilitated by unique team expertise in communication and persuasion, Big Data methods, public health, and Bayesian spatio-temporal modeling, and by leading institutions in the areas of psychology, public health, and computer science.


Contact PI / Project Leader: ALBARRACIN, DOLORES


This project will develop, implement, and evaluate a transformative, high-impact virtual initiative for protecting rural drug-using populations at risk from HIV/HCV outbreaks in the midst of the opioid epidemic in the United States. Most counties at high risk for HIV/HCV outbreaks are clustered in Appalachia and the Midwest. Although the origins of the opioid crisis are complex, social determinants of health, misconceptions about opioids, and a culture of isolation and despair are among the most critical causes. However, Internet use is common among opioid users in these areas (e.g., Facebook, Google) and may support scalable programs to supplement medical and public health responses to the crisis in difficult-to-reach regions. This project will involve two innovative platform modules: Social Action (virtual meeting and networking, contact with health officials, community consultation, and strategic news releases); and Communication (messages debunking misconceptions; messages promoting HIV/HCV testing, syringe exchange, and HIV pre-exposure prophylaxis (PrEP); on-the-spot responses to questions posed via live chat). The digital platform will incorporate Big Data methods to identify efficacious messages, misconceptions, and debunking messages. The project will target rural millennials (ages 18-35), a population that is sufficiently young to be malleable, to prevent opioid addiction and negative health outcomes in middle age. Thus, this project aligns with the NIDA agenda of generating rapid, scalable, and socially supported responses to the opioid epidemic. To evaluate the initiative, the Principal Investigator (PI) and colleagues will conduct a cluster randomized controlled trial with opioid-using participants (n = 1,200) and non-opioid-using community members (n = 1,200). Assessments will occur every four months over a year. 
Trial outcomes will include HIV/HCV testing, syringe exchange participation, condom use, exchanging sex for money or drugs, initiation of PrEP, medication-assisted substance abuse treatment, and measures to avoid drug-related adverse events (e.g., carrying Naloxone kits for reversing overdoses), in addition to isolation and stigma measures. The Annenberg Public Policy Center has committed to hosting and maintaining the platform in parallel with, ensuring sustainability. Given the lack of a similar social action platform in any domain and the urgent need for innovative and cost-effective approaches for the largely rural, white, poor, and anti-syringe-exchange counties at risk for HIV/HCV outbreaks, this project exemplifies the vision of the Avant-Garde Program. The PI has a distinguished record of novel, complex, theoretically informed, and successful research on behavior change, HIV, and digital communication. The institutional support for this project is outstanding and includes, a coalition to reduce opioid vulnerability co-directed by the PI. The project brings together stellar and committed investigators at eleven elite universities and an impressive network of state health departments.


Contact PI / Project Leaders: ALBARRACIN, DOLORES

Other PI/Project Leaders: SALLY CHAN


PROJECT SUMMARY To achieve critical health milestones (e.g., National HIV/AIDS Strategy 1), the public health system needs methods to predict HIV epidemiology within a region. An unexpected surge of new diagnoses in Miami, FL or Austin, IN, may well be avoided if public health officials are able to forecast these changes and to intervene in anticipation. However, modeling approaches are underutilized as mainstream tools to aid public health decisions 2, owing to barriers including (a) unavailability of user-friendly methods that consider the spatiotemporal relations among predictors of HIV transmission dynamics, (b) lack of inclusion of powerful big social media data to gauge population norms and diffusion of information about HIV testing and prevention services, (c) lack of integration of dispersed yet relevant sources of data to predict HIV epidemiology, (d) lack of visualization tools for the results of that integration, and (e) lack of models to gauge the impact of new interventions (e.g., an HIV vaccine), or changes in current interventions. In this application, we propose methods that, if successful, will allow public health officials and the scientific community to make such refined predictions and thereby to plan for interventions such as PrEP (Pre-Exposure Prophylaxis). The project will rely on existing but dispersed sources of regional epidemiological, socio-structural, social media, and intervention data to produce models and CyberGIS-HIV, a tool that can be used by public health officials and researchers. The tool will analyze data and produce results in an integrated output identifying vulnerable regions, and predicting future pockets of vulnerability and the effects of changes in intervention policy. 
We will integrate epidemiological and biomedical service data recorded by health departments, data from the US Census, the American Community Survey, the American Men Internet Survey, transmission network datasets, social media data, and effect sizes from new interventions to derive predictions. We will also develop new methods for social media analyses and compare spatiotemporal modeling techniques. The system will offer recommendations about service allocation for a zip code, a county, and a region, set either to introduce services equally across areas or to target the areas that would give the most improvement for the state as a whole. The University of Illinois, Emory University, and the University at Albany offer the ideal social science, public health, and computing infrastructure for this project. The team (Illinois: Albarracin, Chan, Li, Sundaram, and Wang; Albany: Holtgrave) has developed cutting-edge big-data models to predict HIV and flu, as well as original spatiotemporal analysis and existing state-of-the-art CyberGIS tools. Dr. Do at Emory served in the division of HIV surveillance epidemiology at CDC for two decades and is now a faculty member. In addition, health department personnel will be involved in designing and in testing CyberGIS-HIV during the last year of the project, if the methods pass a pre-established set of Go/No-Go criteria.
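The two allocation settings mentioned above (introducing services equally across areas, or targeting the areas expected to improve most) can be sketched in Python as follows. This is a minimal illustration and not CyberGIS-HIV itself: the area names, the per-unit gain estimates, the capacity limits, and the greedy targeting rule are all hypothetical.

```python
def allocate_equally(total_units, areas):
    """Split service units evenly across areas.

    Any leftover units go, one each, to the first areas in sorted order.
    """
    names = sorted(areas)
    base, extra = divmod(total_units, len(names))
    return {name: base + (1 if i < extra else 0) for i, name in enumerate(names)}


def allocate_targeted(total_units, gain_per_unit, capacity):
    """Greedy targeting: serve areas in order of estimated gain per unit.

    gain_per_unit: hypothetical estimated improvement per service unit, by area.
    capacity: hypothetical maximum number of units each area can absorb.
    """
    allocation = {name: 0 for name in gain_per_unit}
    remaining = total_units
    # Highest gain first; ties are broken alphabetically for determinism.
    for name in sorted(gain_per_unit, key=lambda n: (-gain_per_unit[n], n)):
        units = min(remaining, capacity[name])
        allocation[name] = units
        remaining -= units
    return allocation


def total_gain(allocation, gain_per_unit):
    """Estimated overall improvement for a given allocation."""
    return sum(units * gain_per_unit[name] for name, units in allocation.items())


if __name__ == "__main__":
    # Made-up counties and estimates, for illustration only.
    gains = {"County A": 3.0, "County B": 1.0, "County C": 2.0}
    caps = {"County A": 4, "County B": 10, "County C": 4}
    equal = allocate_equally(9, gains)
    targeted = allocate_targeted(9, gains, caps)
    print(equal, total_gain(equal, gains))
    print(targeted, total_gain(targeted, gains))
```

Under these simplifying assumptions (constant gain per unit up to a capacity limit), the targeted setting never yields a lower estimated total than the equal setting, but it can leave low-gain areas with few or no services, which is the trade-off the two settings expose.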

Related Readings


Chan, M. S., Morales, A., Farhadloo, M., Palmer, R. P., & Albarracín, D. (2019). Harvesting and harnessing social media data for psychological research. In H. Blanton & G. D. Webster (Eds.), Social Psychological Research Methods: Social Psychological Measurement.


Chang, L., Huang, H. Y., Albarracín, D., & Bashir, M. (2019). Who shares what with whom? Information sharing preferences in the online and offline worlds. Advances in Human Factors in Cybersecurity. DOI:10.1007/978-3-319-94782-2-15.

Morales, A., Gandhi, N., Chan, M-P S., Lohmann, S., Sanchez, T., Ungar, L., Albarracín, D., & Zhai, C. (2018). Multi-attribute topic feature construction for social media-based prediction. Proceedings IEEE Big Data.

Lohmann, S., White, B. X., Zuo, Z., Chan, M.-p. S., Li, B., Zhai, C., & Albarracín, D. (2018). HIV-messaging on Twitter: An analysis of current practice and data-driven recommendations. AIDS, 2799-2805.

Chan, M. S., Hawkins, L., Winneg, K., Farhadloo, M., Jamieson, K. H., & Albarracín, D. (2018). Legacy and social media respectively influence risk perceptions and protective behaviors during emerging health threats: A multi-wave analysis of communications on Zika virus cases. Social Science & Medicine, 212, 50-59.

Lohmann, S., Lourentzou, I., Zhai, C., & Albarracín, D. (2018). Who is saying what on Twitter: An analysis of messages with references to HIV and HIV risk behavior. Actas de Investigación Psicológica/Psychological Research Records, 3, 1311-1321. 


Shand, L., Li, B., Park, T., & Albarracín, D. (2018). Spatially varying autoregressive models for prediction of new HIV diagnoses. Journal of the Royal Statistical Society.


Chan, M. S., Lohmann, S., Morales, A., Zhai, C., Ungar, L. H., Holtgrave, D. R., & Albarracín, D. (2018). An Online Risk Index for the cross-sectional prediction of new HIV, chlamydia, and gonorrhea diagnoses across U.S. counties and across years. AIDS and Behavior.

Chan, M-P S, …, & Albarracín, D. (2018). Sources affecting knowledge and behavior responses to the Zika virus in U.S. households with current pregnancy, intended pregnancy, and a high probability of unintended pregnancy. Journal of Public Health, 40. DOI: 10.1093/pubmed/fdy085

Bae, R. E., Maloney, E. K., Albarracín, D., & Cappella, J. N. (2018). Does interest in smoking affect youth selection of pro-smoking videos? A selective-exposure experiment. Nicotine & Tobacco Research.

Albarracín, D., Romer, D., Jones, C. R., Jamieson, K. H., & Jamieson, P. E. (2018). Misleading claims about tobacco products in YouTube videos: Effects of misinformation on unhealthy attitudes. Journal of Medical Internet Research, 20. doi:10.2196/jmir.9959

Coppock, D., Zambo, D., Moyo, D., Tanthuma, G., Chapman, J., Lo Re III, V., Graziani, A., Lowenthal, E., Hanrahan, N., Littman-Quinn, R., Kovarik, C., Albarracín, D., Holmes, J., & Gross, R. (2017). Development and usability of a smartphone application for tracking antiretroviral medication refill data for Human Immunodeficiency Virus. Methods of Information in Medicine. 56(5):351-359. doi: 10.3414/ME17-01-0045


Albarracín, D., Liao, V., Yi, J., & Zhai, C. (2016). Emerging communication systems to promote physical activity: Understanding exposure, attention, and behavior change from psychological and computational perspectives. In Zhu, W., & Owen, C. (Eds.), Sedentary behavior and health.


Ireland, M., Chen, Q., Schwartz, A., Ungar, L., & Albarracín, D. (2015). Future-oriented tweets predict lower county-level HIV prevalence in the United States. Health Psychology.


Ireland, M., Chen, Q., Schwartz, A., Ungar, L., & Albarracín, D. (2015). Action tweets linked to reduced county-level HIV prevalence in the United States: Online messages and structural determinants. AIDS and Behavior.


Cappella, J., Kim, H., & Albarracín, D. (2015). Selection and transmission processes for information in the emerging media environment: Psychological motives and message characteristics. Media Psychology.


Stephens, Z. D., Lee, S. Y., Faghri, F., Campbell, R. H., Zhai, C., Efron, M. J., ... & Robinson, G. E. (2015). Big Data: Astronomical or Genomical?. PLoS Biol, 13(7), e1002195.

Leginus, M., Zhai, C., & Dolog, P. (2015). Beomap: Ad Hoc Topic Maps for Enhanced Exploration of Social Media Data. In Engineering the Web in the Big Data Era (pp. 200-218). Springer International Publishing.

Leginus, M., Zhai, C., & Dolog, P. (2015). Personalized generation of word clouds from tweets. Journal of the Association for Information Science and Technology.


The Team

Psychology Department, University of Illinois at Urbana-Champaign

603 East Daniel Street, Champaign, Illinois 61820.  (217) 244-7019.