An AI pilot is only as good as the data pipeline underneath it: our response to the FDA on AI-enabled optimization of early-phase clinical trials

We would hold both sponsor and CRO accountability for AI deployment in live trials simultaneously. That is an uncommon position, and it changes what we think the pilot needs to measure.

Lindus recently submitted a public comment to the FDA on the AI-Enabled Optimization of Early-Phase Clinical Trials Pilot Program (Docket No. FDA-2026-N-4390). We support the pilot and believe it can be designed to answer harder questions than the request for information (RFI) currently sets up.

Lindus executes Phase Ib-IV clinical studies in partnership with sponsors and is developing a portfolio of therapeutic assets. We are one of a small number of organizations that hold both of those roles at once. Lindus functions as a service provider under ICH E6(R3) and as a developer, deployer, and operator within the same program under the NIST AI Risk Management Framework (RMF). We are not advising sponsors on what to do with the tools we built for them. Lindus is subject to the same regulatory obligations as any sponsor, and we are operationally responsible for the infrastructure on which those tools run.

The pilot's governance overlooks the role of CROs

The RFI names sponsors, technology vendors, academic partners, patient advocacy groups, and FDA as stakeholders. CROs are not on the list. In many early-phase trials, the CRO owns data collection, manages relationships with investigator sites, and operates the data acquisition tools. Many AI-assisted tools running on trial data are hosted on CRO-managed infrastructure.

This is a structural gap, not a failure of anyone's execution. The model the pilot is built on does not yet name the parties responsible for the data the AI depends on. Organizations that span developer, deployer, and operator roles are among the few that can show how AI-assisted tools perform across the full deployment lifecycle under regulatory accountability. 

Holistic governance - models and data pipelines

The most important thing the pilot can govern is the relationship between data collection and AI operation. A bolt-on tool receives data after it has passed through a separate collection system, inheriting that system’s latency, format inconsistency, and data quality gaps. An integrated architecture, in which data collection and AI operations share the same underlying infrastructure, means the model operates on data governed by the same standards that produced it. This removes an entire class of failure modes before the model runs. It is how we built Citrus™, our AI-assisted trial operating system: data collection and AI operations form a single environment governed by the same standards throughout, with clinical, medical, and regulatory oversight at every decision point. We asked the FDA to treat this relationship as a GOVERN activity under the NIST AI RMF, a precondition for participation rather than a variable to accommodate after enrollment. Sponsors using bolt-on tools should not be excluded, but their results should be analyzed as a separate cohort, since the infrastructure question is one of the most important the pilot can answer.

From that starting point, five recommendations follow:

  1. Require two-layer performance reporting, with integrated versus bolt-on architecture as a primary stratification variable. Real-time data to the FDA is only as good as the data being delivered. An AI-assisted safety monitoring tool that can flag a signal for clinical review within hours of data entry is still limited by the timing of data arrival. The commonly cited benchmark for investigator site data entry is 24 to 48 hours after a participant visit, but actual performance varies substantially and is rarely reported at the trial level. Therefore, metrics should be reported in two layers:
  • Data pipeline metrics cover the time from clinical event to data acquisition tool entry, completeness at scheduled intervals, and the frequency of reconciliation discrepancies across investigator sites. 
  • Model performance metrics cover time from data availability to alert, alert positive predictive value, and false negative rate on retrospectively identified events. 

    Reporting them separately is also a structural test: tools within integrated infrastructure should outperform bolt-on tools on pipeline metrics, regardless of model sophistication. Without separate reporting, that difference stays invisible.
  2. Name CROs and organizations that carry both sponsor and CRO accountability in the pilot’s governance structure. These are distinct roles with distinct obligations. A service provider executing on a sponsor’s behalf is not the same as an organization that holds sponsor and CRO accountability at once. The pilot needs to name both.
  3. Build the parallel-run evaluation mechanism from the start. To learn whether early AI-assisted insights have value, the tools run in parallel with full trial execution, generating and recording outputs at defined intervals, with those outputs not driving real-time decisions. Those recorded outputs are then compared against the trial's final results. Repeated across multiple trials, that comparison is what shows whether early insights predict outcomes that matter. This has to be designed in at the outset, not added retrospectively.
  4. Weight documented production deployment above claimed modeling capability when assessing AI maturity. A tool validated in a development environment with strong documentation is not the same as one that has operated against real trial data at investigator sites. The failure modes that matter most in clinical trials are deployment failures. Organizations without production deployment experience tend to market their modeling capability because that is what they have. We see this from both sides, as the organization that develops and deploys these tools. Maturity is multidimensional: a sponsor may be strong in modeling but nascent in data governance. Assess it across independent dimensions, including data readiness, model validation, organizational change management, and AI lifecycle management, and let the result set the level of support and monitoring a participant receives, rather than whether they are eligible.
  5. Require stakeholder-specific explainability disclosure as a condition of participation, defined before participant selection begins. Explainability needs differ by stakeholder in kind. An investigator deciding whether to act on a safety alert needs to know which data drove it and whether it aligns with their direct clinical observation. A data safety monitoring board needs the model's performance characteristics and known failure modes. A regulator reviewing AI-assisted analysis needs documentation of the training data, validation approach, and accountability chain from the output to the human decision. A single standard applied uniformly serves none of them well. The information to satisfy these requirements is available at the point of deployment; the gap is that vendors are not currently required to provide it. Defining minimum disclosure per stakeholder tier as an entry criterion protects the scientific integrity of the pilot and sets a standard the field can build toward.
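
To make the two reporting layers in the first recommendation concrete, here is a minimal Python sketch of how pipeline metrics and model metrics could be computed separately and stratified by architecture. It is an illustration under stated assumptions, not the reporting format proposed in our comment or anything Citrus™ implements: the record types, field names, and the "integrated" / "bolt-on" labels are all hypothetical, and only one pipeline metric (entry lag) and two model metrics (alert positive predictive value and false negative rate) are shown.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median


@dataclass
class EntryRecord:
    """Pipeline layer: one data point, from clinical event to tool entry."""
    architecture: str      # hypothetical label: "integrated" or "bolt-on"
    event_time: datetime   # when the clinical event occurred
    entry_time: datetime   # when it was entered in the data acquisition tool


@dataclass
class AlertOutcome:
    """Model layer: one alert or retrospectively identified event."""
    architecture: str
    alerted: bool          # the tool raised an alert
    true_event: bool       # retrospective review confirmed a real event


def median_entry_lag_hours(records, architecture):
    """Pipeline metric: median hours from clinical event to data entry."""
    lags = [
        (r.entry_time - r.event_time).total_seconds() / 3600
        for r in records
        if r.architecture == architecture
    ]
    return median(lags) if lags else None


def alert_ppv(outcomes, architecture):
    """Model metric: share of raised alerts that were real events."""
    alerts = [o for o in outcomes if o.architecture == architecture and o.alerted]
    if not alerts:
        return None
    return sum(o.true_event for o in alerts) / len(alerts)


def false_negative_rate(outcomes, architecture):
    """Model metric: share of real events the tool did not alert on."""
    events = [o for o in outcomes if o.architecture == architecture and o.true_event]
    if not events:
        return None
    return sum(not o.alerted for o in events) / len(events)


def two_layer_report(records, outcomes):
    """Report both layers side by side, one row per architecture cohort."""
    cohorts = sorted({r.architecture for r in records} | {o.architecture for o in outcomes})
    return {
        cohort: {
            "pipeline": {"median_entry_lag_hours": median_entry_lag_hours(records, cohort)},
            "model": {
                "alert_ppv": alert_ppv(outcomes, cohort),
                "false_negative_rate": false_negative_rate(outcomes, cohort),
            },
        }
        for cohort in cohorts
    }
```

Keeping the two layers in separate keys is the point of the sketch: a cohort with a strong model but slow data entry shows up as good `model` numbers next to a poor `pipeline` number, rather than as a single blended score.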

The earliest evidence will likely land in oncology and rare diseases

Oncology dose-escalation studies and rare disease studies are strong candidates for the earliest trials, because these are the areas where the FDA has the most data history and regulatory precedent. Dose-escalation studies benefit from AI-assisted integration of pharmacokinetic, pharmacodynamic, and safety data ahead of escalation committee meetings, with clinical oversight at every step. Novel modalities sharpen that case: with siRNAs, bispecific antibodies, and gene therapies, identifying the optimal biological dose quickly is commercially critical for smaller sponsors with narrow development windows. In rare disease studies, low enrollment rates typically drive longer trial durations, and AI-assisted eligibility screening against structured electronic health record data - with clinical confirmation of each match - has the potential to compress them.

AI-assisted tools do not reduce adverse events; they may improve the speed and accuracy of detection and the appropriateness of clinical response. Integrated infrastructure reduces a class of failure modes; it does not eliminate risk. The pilot's value is in generating evidence about the organizational and infrastructure conditions that determine whether these tools produce reliable output in live trials, a more durable contribution than performance data from well-resourced sponsors under favorable conditions.

Our full comment goes through each section of the RFI in detail and is publicly available.

About Lindus

Lindus is a next-generation CRO, engineered to give biotech and pharma sponsors confidence and control to generate clinical data on time and on budget. Citrus™, an AI-assisted trial operating system with clinical oversight, connects patient identification across 40 million EHRs to full-scope execution across the US and UK.

Lindus has delivered over 45 trials across cardiometabolic, respiratory, psychiatry, dermatology, women's health, and diagnostics. In trials where Lindus owned enrollment, 82% completed on or ahead of the timeline presented at proposal. Milestone-based payments align incentives around timelines and budgets. The company is backed by leading investors and advisors, including Peter Thiel and Prof. Robert S. Langer.

Contributors

Michael Young - Co-CEO

Michael was previously a Special Adviser to the UK Prime Minister, where he advised on a range of issues including life sciences. Before serving in government, he worked at McKinsey & Company and L.E.K. Consulting, predominantly on commercialization and M&A in the healthcare space. His time in government during the pandemic showed him how much trial delivery depended on fragmented, outdated infrastructure, and convinced him the industry needed a fundamentally different operating model. Without one, none of the groundbreaking advances in understanding human biology will ever benefit patients. He founded Lindus to fix these bottlenecks.

Gemma Harrison - Chief Product & Technology Officer

Gemma leads R&D of Citrus™, Lindus’ AI-native operating system. She is a technology executive who has spent her career building and scaling engineering teams. Before Lindus she spent three years as a Principal Software Engineering Manager at Microsoft, leading teams of up to 30 engineers on Telecoms products in Azure. Prior to that she spent 23 years at Metaswitch Networks, advancing from Software Engineer to Senior Software Engineering Manager. She holds a BA in Mathematics and a Diploma in Computer Science from the University of Cambridge, and is an active advocate for women in technology.

Shini Pattni - Head of Legal

Shini leads the legal function at Lindus, overseeing commercial contracting, data protection and privacy, and employment law, along with the legal side of fundraising and company strategy. Before Lindus she was Legal Counsel at Acast, where she negotiated the company's most significant commercial agreements and led its implementation of the Digital Services Act. She trained and qualified at Freshfields Bruckhaus Deringer in the intellectual property, commercial, and data protection team. She holds a Postgraduate Diploma in Intellectual Property Law and Practice from the University of Oxford and a Graduate Diploma in Law with distinction from BPP University.

Collin Anderson - Clinical Project Lead

Collin works at the intersection of clinical science and trial execution, leading full-scope trials from setup through close-out as the primary point of contact for sponsors. Trained as an epidemiologist, he brings a quantitative, evidence-driven perspective to protocol design and endpoint strategy, balanced by a focus on real-world operational constraints. Before Lindus he was Manager of Clinical Development at Vial, where he co-founded and led the clinical development function across early-phase programs. His experience spans Phase I through IV across therapeutic areas including cardiometabolic, immunology, oncology, and infectious disease, giving him a view of trials from the sponsor's chair, the operational seat, and the front line of patient enrollment alike.

