AI in the ICU: Study Finds 99.3% Accuracy for Parent-Facing Chatbot
- 99.3% accuracy: BastionGPT answered parent questions in a pediatric ICU with 99.3% sentence-level accuracy.
- 1,225 sentences reviewed: Physicians reviewed 1,225 chatbot-generated sentences, identifying only eight minor errors.
- Net Promoter Score (NPS) of +57: Parents reported high satisfaction with the AI tool.
Experts conclude that specialized, HIPAA-compliant AI platforms like BastionGPT can achieve high accuracy and trust in clinical settings, augmenting human care without replacing it.
DALLAS, TX – March 26, 2026 – A groundbreaking study is providing a new benchmark for the role of artificial intelligence in sensitive medical settings. Bastion Intelligence's HIPAA-compliant AI platform, BastionGPT, demonstrated 99.3% sentence-level accuracy in a peer-reviewed feasibility study supporting communication with parents in a pediatric intensive care unit (PICU).
The study, published in the April 2026 issue of Critical Care Explorations, was led by researchers at Baylor College of Medicine and Texas Children's Hospital. It evaluated the AI's ability to answer parent questions using secure, patient-specific data from electronic health records (EHR), a critical capability that sets it apart from general-purpose AI tools.
For parents navigating the overwhelming stress of a child's critical illness, the AI assistant served as a potential source of clear, consistent information. The results suggest that purpose-built AI can not only meet stringent accuracy standards but also earn the trust of clinicians and families in one of healthcare's most demanding environments.
A New Standard for Trust in Clinical AI
The central finding of the study is the platform's remarkable accuracy. Over the course of the trial, physicians reviewed 1,225 chatbot-generated sentences and identified only eight errors, all of which were classified as minor. Crucially, no moderate or severe errors were detected. The high degree of agreement among the physician reviewers (Gwet's AC2 = 0.98) further strengthens confidence in the findings.
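The headline figure follows directly from the reported counts; a quick sanity check of the arithmetic, using only the totals given in the study:

```python
# Sanity check of the accuracy figure reported in the study.
total_sentences = 1225   # chatbot-generated sentences reviewed by physicians
errors = 8               # all classified as minor; no moderate or severe errors

accuracy = (total_sentences - errors) / total_sentences
print(f"Sentence-level accuracy: {accuracy:.1%}")  # → 99.3%
```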
This level of validated accuracy addresses a core challenge facing AI adoption in healthcare: trust. While consumer-facing AI tools like ChatGPT, Gemini, and Claude have showcased the power of large language models (LLMs), their use in clinical settings is fraught with risks related to patient privacy, data security, and the potential for inaccurate, or "hallucinated," information.
"While general-purpose AI tools like ChatGPT, Gemini, and Claude have expanded what's possible with large language models, clinical environments require a HIPAA-compliant AI solution designed specifically for the workflows and safety standards that physicians and care teams depend on," said Joshua Spencer, CEO of Bastion Intelligence, in the company's announcement.
BastionGPT is engineered to solve this problem. It functions as a secure, compliant platform that integrates leading AI models from providers like OpenAI, Google, and Anthropic. This architecture allows it to connect directly to a patient's EHR data, enabling it to provide personalized, context-aware answers while adhering to the strict privacy mandates of HIPAA. This ability to safely handle protected health information is a fundamental differentiator from off-the-shelf AI applications.
Beyond Theory: AI's Tangible Impact on Critical Care
The study's significance extends beyond technical accuracy to the real-world human experience. Both parents and healthcare providers reported positive interactions with the tool, suggesting a tangible benefit at the bedside.
Parents demonstrated high engagement, asking a median of six questions during their sessions, and reported high satisfaction, culminating in a Net Promoter Score (NPS) of +57—a strong indicator of loyalty and willingness to recommend the tool. For providers, the platform's response quality received a median rating of 5.0 out of 6.0. Physicians, in particular, expressed strong comfort with the routine use of the tool.
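For readers unfamiliar with the metric, NPS is derived from 0–10 "how likely are you to recommend" responses: the percentage of promoters (scores 9–10) minus the percentage of detractors (scores 0–6), with passives (7–8) ignored. The distribution below is hypothetical, chosen only to illustrate how a score like +57 arises; the study's raw response data is not reproduced here.

```python
def net_promoter_score(ratings):
    """NPS = %promoters (9-10) minus %detractors (0-6); passives (7-8) count
    toward the total but neither add nor subtract."""
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return round(100 * (promoters - detractors) / len(ratings))

# Hypothetical sample: 11 promoters and 3 detractors among 14 respondents
ratings = [10] * 11 + [4] * 3
print(net_promoter_score(ratings))  # → 57
```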
Interestingly, the research data revealed a slight variance in comfort levels, with physicians reporting a higher comfort level than nurses (5.0 vs. 4.0). This nuance highlights the importance of tailored implementation and training strategies for different members of the care team as such technologies are deployed.
"This study validates what we've been building toward: a medical GPT platform that clinicians can trust at the point of care," stated Dr. Sanjiv Mehta, a co-author of the study.
The research was designed as a prospective, single-arm feasibility study, meaning its primary goal was to assess the practicality and initial efficacy of the intervention. The promising results have now provided the foundation and statistical basis for designing a larger, more definitive randomized controlled trial, marking a clear step on the path from initial research to widespread clinical validation.
Empowering Families in the Face of Crisis
The PICU is an environment of immense stress and information overload for parents. Medical terminology is complex, a child's condition can change rapidly, and access to busy clinicians is not always immediate. This study explores how AI can help bridge that gap.
Research shows that parents of critically ill children are often active seekers of health information online and hold generally positive attitudes toward the potential of AI chatbots. The BastionGPT study provides a concrete example of how this can be realized safely. By providing a channel for parents to ask questions and receive accurate, EHR-informed answers in understandable language, the tool has the potential to reduce anxiety, improve health literacy, and foster a greater sense of involvement in their child's care.
The researchers are clear that the technology is designed to enhance, not replace, human connection. The study's authors envision the chatbot as "a supplement to rather than a substitute for bedside communication." The goal is not to put a screen between doctors and families, but to give families a resource to process information and formulate questions, leading to more productive and meaningful conversations with the care team.
Navigating the Broader Landscape of AI in Medicine
The BastionGPT study arrives as the healthcare industry grapples with how to best harness the power of generative AI. The competitive landscape includes major technology players like Microsoft, through its Nuance division, and Google Cloud Healthcare, which are also developing enterprise-grade, compliant AI solutions for the sector. Furthermore, leading hospital systems like Texas Children's are developing their own proprietary AI models for tasks ranging from diagnostics to streamlining patient communication.
However, experts consistently raise critical considerations. AI models are only as good as the data they are trained on, and biases in data can lead to inequitable outcomes. This is particularly salient in pediatrics, where a child's physiology changes dramatically with age and rare diseases present unique challenges. One study has already found socioeconomic disparities in AI chatbot use among parents, a trend that healthcare systems must actively work to mitigate.
Ultimately, the consensus among medical ethicists and AI specialists is that human oversight remains non-negotiable. The physician must always be the final decision-maker, using AI as a powerful tool for decision support and administrative efficiency, not as a replacement for clinical judgment. The success of the BastionGPT study, with its emphasis on physician review and its positioning as a supplementary tool, aligns perfectly with this guiding principle.
As this technology advances, the findings from the PICU study offer a compelling blueprint. By prioritizing security, validating accuracy through rigorous peer review, and focusing on augmenting human care, specialized AI platforms are demonstrating a clear and valuable role in the future of medicine.
