How ScHARe's Big Data Approach Can Yield Big Gains in SDOH Research

By Dr. Deborah Duran, Ph.D.
National Institute on Minority Health and Health Disparities (NIMHD)
Posted April 22, 2024

Social determinants of health (SDOH) are critical to advancing health equity and are part of this year’s focus for National Minority Health Month. Recent decades have brought increased recognition that SDOH drive health disparities, exerting profound and potentially larger impacts than medical care on risk for obesity, heart disease, cancer, maternal and infant mortality, and overall life expectancy.

These impacts are reflected in grim statistics. A baby born today in Colorado’s Summit County can expect to live 87 years, while a baby born in South Dakota’s Oglala Lakota County can expect to live only to 66—a life expectancy the United States as a whole surpassed by 1950. Black women in the United States have nearly 3 times higher risk of dying while pregnant than White women. The rates of maternal mortality among all racial and ethnic groups in the United States are much higher than in other comparable developed nations.

Determining how to address SDOH impacts is no simple task. But research on SDOH has reached a major inflection point: advances in technology, data science, and artificial intelligence (AI) have unlocked new resources that can help researchers identify, understand, and mitigate negative SDOH impacts, as well as identify protective ones. These tools have the potential to transform health disparities and health outcomes research.

The Science Collaborative for Health disparities and Artificial intelligence bias Reduction (ScHARe), which is training thousands of researchers, is designed to make these tools widely accessible. ScHARe’s new centralized cloud computing research platform improves access to population science data, including SDOH-related Big Data, and breaks down the data silos that impede efforts to understand and address health disparities.


SDOH Research Requires Big Data
The U.S. Centers for Disease Control and Prevention (CDC) defines SDOH as “the conditions in which people are born, grow, work, live, and age, and the wider set of forces and systems shaping the conditions of daily life.” SDOH include the ability to access safe housing, convenient transportation, nutritious foods, and quality health care. We know SDOH exert profound impacts on health, quality of life, and vulnerability to disease, but understanding how they do so is far from straightforward, considering:

  • SDOH effects can manifest over decades. Exposure to lead in childhood, for example, heightens the risk of developing dementia later in life. Exposure to violence in childhood can alter brain development and stress responses, with health effects that reverberate throughout a lifetime.
  • The burden of SDOH that negatively impacts health is not always predictable. People vary in their underlying susceptibility to specific stressors and diseases; for example, the same stressor may lead to depression or anxiety in some people and not others.
  • Complex interactions exist among SDOH. For example, children who experience trauma but have strong sources of love and support may fare better than those who lack such support. A lack of education can limit employment opportunities and income.

To understand how SDOH affect health—how they intersect, when they exert the largest impacts, and through what behavioral and biological mechanisms—researchers need massive amounts of data about individuals’ lives and health over time.

Yet much of the data exist in silos. Government benefits and income data might reside in one database and health outcomes data in another. Researchers need the ability to access and link these large, disparate datasets. They also need access to data that were collected and analyzed for one study and never revisited. Moreover, data must be broadly accessible; currently, data are often inaccessible to women and other groups historically underrepresented in data science. Addressing these needs is the driving force behind ScHARe.


Why ScHARe Makes Me Optimistic

ScHARe hosts a growing collection of more than 200 health disparities, health outcomes, and population science datasets that researchers can access and analyze. These include datasets from NIMHD-funded research, shared to satisfy NIH’s new Data Management and Sharing policy. ScHARe also provides easy-to-use, off-the-shelf data science and cloud computing tools that enable researchers to link data and amplify the power, complexity, and nuance of their analyses.

Most importantly, ScHARe is diversifying data science. Although AI is an essential analysis and policy tool, AI-based algorithms can be biased by poor design and insufficient or biased training data. Diverse perspectives are needed to detect and root out potential biases in these data. This means upskilling researchers and students traditionally underrepresented in data science through ScHARe’s monthly “Think-a-Thon” webinar series that provides hands-on training and opportunities for research.

As you recognize National Minority Health Month this year, consider joining our rich and growing community of scholars transforming the study of SDOH. To learn more, sign up for the ScHARe listserv today.


Dr. Deborah Duran, Ph.D., is senior advisor of data science, data analytics, and data systems to the NIMHD director. She coordinates NIMHD’s efforts to obtain better minority representation in data science and addresses biases in emerging technologies.
