AI Partnership Unlocks Clinical Secrets in Patient Health Records
- $1.8 billion: The global Real-World Evidence (RWE) market in 2024, projected to surge past $7 billion by 2033.
- 70%: The estimated portion of healthcare data that is unstructured, containing critical clinical narratives.
- 60 million: The number of healthcare records Datavant’s platform enables to move between organizations.
The companies say the partnership will turn unstructured clinical data into regulatory-grade evidence, accelerating drug development and deepening clinical understanding.
AI Partnership Aims to Unlock Clinical Secrets Buried in Health Records
SAN FRANCISCO & NEW YORK – May 18, 2026 – A new collaboration between AI-native clinical intelligence firm Lighten Platforms, Inc. and healthcare data giant Datavant is poised to revolutionize how life sciences organizations conduct research. The partnership will integrate Lighten’s sophisticated AI-powered data curation directly into Datavant’s sprawling analytics platform, promising to transform the messy, narrative text of patient records into regulatory-grade evidence that can accelerate drug development and deepen clinical understanding.
This strategic alliance aims to solve one of the most persistent challenges in medical research: accessing the wealth of information trapped within unstructured clinical notes. By creating a seamless pipeline from raw, complex data to validated, analysis-ready insights, the two companies are betting they can set a new standard for the speed and scientific rigor of Real-World Evidence (RWE).
The Untapped Goldmine of Unstructured Data
The market for RWE solutions is booming, driven by a growing demand from regulators and payers for evidence of a drug’s value and effectiveness in everyday clinical practice. Valued at approximately $1.8 billion in 2024, the global RWE market is projected to surge past $7 billion by 2033. This growth is fueled by the understanding that the most valuable clinical information often lies outside the neat checkboxes and structured fields of an Electronic Health Record (EHR).
While structured data—such as billing codes and lab values—is easily quantifiable, it provides only a skeletal outline of a patient's story. The vast majority, estimated at over 70% of all healthcare data, is unstructured. This includes the detailed narratives dictated by physicians, pathology reports, surgical notes, and discharge summaries. Within this text lies the critical context that researchers need: confirmed diagnoses, observations on disease severity, rationale for treatment changes, and nuanced descriptions of patient response. Historically, accessing this data at scale required either months of painstaking manual abstraction by clinical experts or the use of early-generation AI tools that often sacrificed quality and clinical accuracy for speed.
“The next era of real-world evidence will be defined by the ability to understand the full patient journey, not just fragments captured in structured fields,” said Xinkun Nie, PhD, Founder and CEO of Lighten. “Our collaboration with Datavant helps put deeply curated longitudinal patient journeys in the hands of researchers to study disease and treatment response with a level of clinical context that has historically only been possible through small-scale manual review.”
Building Trust with 'Regulatory-Grade' Evidence
A central claim of the partnership is its ability to produce “regulatory-grade” evidence. This term signifies that the data and resulting analyses are of sufficient quality, relevance, and reliability to be considered by regulatory bodies like the U.S. Food and Drug Administration (FDA) and the European Medicines Agency (EMA) in their decision-making processes. That standard has grown in importance since the 21st Century Cures Act of 2016 directed the FDA to evaluate the use of RWE to support approval of new indications for already-approved drugs and to satisfy post-approval study requirements.
Achieving this standard requires more than just powerful algorithms. It demands a transparent and validated process that can withstand scientific and regulatory scrutiny. Lighten asserts its platform is fundamentally different from other AI tools, describing it as an “AI-native technology platform” that embeds the operational discipline and clinical expertise of rigorous abstraction teams directly into its workflows. Rather than simply flagging keywords, the platform is designed to reconstruct a patient's journey by reasoning across different clinical notes, surfacing and resolving ambiguities with AI-powered workflows overseen by clinical experts. This focus on engineering quality from the start is designed to build the trust necessary for high-stakes applications.
“Real-world evidence has long been constrained by the gap between the clinical depth required for high-stakes decisions and what structured data alone can support,” stated Emily Rubinstein, VP and Group Head of Data Insights and Science at Datavant. “By integrating high-fidelity information from unstructured EHR data, curated by Lighten, directly into the analytical workflow, we’re streamlining the path from data to evidence while preserving the scientific integrity required for the most demanding use cases.”
A Strategic Synergy of Depth and Scale
The true power of the alliance lies in its strategic synergy. Lighten brings the specialized technology for deep, nuanced curation of complex clinical text. Datavant provides the unparalleled scale, connectivity, and security infrastructure to make that curated data actionable across the healthcare ecosystem.
Datavant’s platform is a central hub for healthcare data collaboration, enabling the movement of more than 60 million healthcare records between thousands of organizations. Its network includes over 80,000 hospitals and clinics and more than 350 real-world data partners. This vast reach is underpinned by a significant investment in security and privacy-preserving technologies, such as data tokenization, which allows patient records from different sources to be matched and linked without sharing personally identifiable information.
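The general idea behind tokenization can be illustrated with a minimal sketch. This is not Datavant's actual method, which the article does not describe; it assumes a simple keyed-hash approach in which each data holder derives the same opaque token from normalized identifiers and shares only that token. The key, the field names, the sample records, and the helper functions `normalize`, `make_token`, and `link_records` are all hypothetical.

```python
import hashlib
import hmac
import unicodedata

# Hypothetical shared secret for illustration only; real systems would
# manage keys in a secure service and never hard-code them.
SECRET_KEY = b"demo-key-not-for-production"


def normalize(value: str) -> str:
    """Lowercase, strip accents, and collapse whitespace so that formatting
    differences between sources do not change the resulting token."""
    decomposed = unicodedata.normalize("NFKD", value)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return " ".join(ascii_only.lower().split())


def make_token(first_name: str, last_name: str, birth_date: str,
               key: bytes = SECRET_KEY) -> str:
    """Derive an opaque token from identifying fields with HMAC-SHA256."""
    message = "|".join(
        [normalize(first_name), normalize(last_name), birth_date.strip()]
    )
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).hexdigest()


def link_records(source_a: list, source_b: list) -> list:
    """Pair up records from two sources that carry the same token."""
    by_token = {record["token"]: record for record in source_a}
    return [
        (by_token[record["token"]], record)
        for record in source_b
        if record["token"] in by_token
    ]


# Each source tokenizes locally and shares only the token plus clinical
# fields; names and birth dates never leave the source.
hospital_records = [
    {"token": make_token("José", "García", "1980-02-14"),
     "note": "discharge summary"},
]
claims_records = [
    {"token": make_token(" JOSE ", "garcia", "1980-02-14"),
     "claim": "pharmacy fill"},
    {"token": make_token("Ana", "Lopez", "1975-09-30"),
     "claim": "office visit"},
]

linked = link_records(hospital_records, claims_records)
```

In this sketch, `linked` pairs the hospital note with the pharmacy claim for the same patient even though the name was formatted differently in each source, and neither shared record contains a name or birth date. Production systems would add safeguards this example omits, such as key management, multiple token types, and handling of typos or changed names.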
This combination means that the rich, longitudinal patient datasets curated by Lighten can flow directly into Datavant's secure analytics environment. For life sciences companies, this integration promises to dramatically accelerate research timelines for a wide range of needs, including epidemiology studies, characterizations of natural disease history, comparative effectiveness analyses, and studies to support label expansions for existing drugs. By compressing work that traditionally takes months into days, the partnership aims to help researchers answer more complex questions faster than ever before.
By uniting deep clinical reasoning with massive data scale, the Lighten and Datavant collaboration represents a significant step toward realizing the full promise of real-world evidence. This new capability to systematically unlock and interpret the rich narratives within patient records has the potential to accelerate the development of new therapies, refine our understanding of disease, and ultimately enable a more informed and effective healthcare system.