Multimodal Intelligence for Life Sciences on AWS

Life sciences teams don’t have a data problem. They have a context problem.

The insights that matter most require connecting clinical trial designs, MRI scans, patient narratives, and more, yet traditional tools struggle to bring this data together.

In this webinar, experts from Vespa.ai, Reveal HealthTech, and AWS will show how they solved this challenge, how it works, and what it unlocks.

Note: This transcript has been cleaned up and formatted for readability using AI.

Welcome & Introductions

Bonnie (Host): Welcome, everyone. Today we’re exploring how life sciences organizations can unlock insight across the full spectrum of their data — from structured clinical data and patient narratives to medical imaging and commercial intelligence. Joining us are experts from Reveal HealthTech, Vespa.ai, and AWS.

Gloria (Reveal HealthTech — Applied AI Lab): I lead the Applied AI Lab at Reveal HealthTech, a software solutions company serving life sciences and healthcare clients. My team explores the latest advances in generative AI, building everything from agentic AI foundations to RAG pipelines — and now BioCanvas, which we’ll be discussing today.

Ramesh (Reveal HealthTech — AI & Machine Learning): I lead the AI and machine learning team at Reveal, helping healthcare and life sciences clients solve key problems using AI. I’ve been working at this intersection for nearly two decades, starting with a PhD in machine learning applied to medical imaging.

Ariella (AWS): I’m a specialist in data and AI at AWS, with a background in computational biology and genomics. My focus is helping healthcare and life sciences customers integrate their data and work at scale to advance the science.

Harini (Vespa.ai): I lead industry go-to-market for Vespa.ai, focusing on healthcare and life sciences. My background spans bioinformatics, real-world evidence, and time with both AWS and Snowflake. I love exploring how leading-edge technology patterns from industries like e-commerce can apply to precision medicine, clinical, and commercial life sciences.


The Industry Challenge: More Data, Fewer Insights

Bonnie: Life sciences organizations often say they have more data than ever, yet actionable insights remain difficult to uncover. Why is this such a persistent challenge?

Harini: It’s a combination of factors. The first is scale — over 60% of life sciences data is unstructured, and most organizations have only been able to build chatbots that handle generic questions against structured or simple text data. The moment a business user asks something specific — enrollment rates, adverse events for a particular drug — the system breaks down. Precision is lost.

The second challenge is agentic complexity. As organizations build multi-agent AI systems, retrieval accuracy in the first agent directly affects everything downstream. The longer the reasoning chain, the more accurate your retrieval must be.

And then there’s the enterprise triangle: cost, performance, and quality — all three at once. Organizations need to handle complex PDFs, 3D imaging, and billions of rows of data at scale, within regulatory compliance, in a cost-optimal way.

The reframe I’d encourage is this: stop thinking of retrieval as just a chatbot or a RAG problem. Think of it the way companies like Perplexity approach web-scale search — interpreting the question, applying filters, ranking results intelligently, and returning a precise answer. That’s the capability life sciences needs to build.

Bonnie: Traditional search was built for structured content, but the majority of life sciences data isn’t structured — that’s the gap.


What Is Multimodal Intelligence?

Bonnie: For those less familiar with AI systems, what does multimodal retrieval actually mean in practice?

Ariella: Multimodal goes far beyond video and images. Think about the science itself: to understand what’s happening in a human being — especially in complex disease — you’re drawing on pathology at different time points, genetics, transcriptomics, proteomics, cell biology. Each of those is a different modality. The goal is to bring all of them together so scientists can synthesize and innovate.

One of the most powerful things we can do is give scientists visibility into data they didn’t know existed. If a scientist doesn’t know a dataset is available, they’ll never ask questions of it. Unlocking that awareness is enormously valuable — and ultimately helps patients.


Where Multimodal Intelligence Creates Value in Life Sciences

Gloria: We see multimodal data being valuable across the entire life sciences value chain.

Clinical trial recruitment is one of the most compelling examples. Over 80% of clinical trials fall short in recruiting truly eligible patients, partly because recruitment has traditionally relied on tabular clinical data and manual review of clinical notes. Incorporating medical imaging and other modalities alongside structured data — and making all of it searchable together — fundamentally changes what’s possible.

Clinical trial monitoring is another major use case. Trial data arrives in many formats beyond structured spreadsheets: adverse event reports, qualitative feedback, site-level documentation. Being able to parse, fuse, and surface all of that in real time gives trial teams a far clearer picture.

On the commercial side, we’ve built and deployed a solution for a global pharma client that surfaces trending topics, KOL sentiment, and social media signals to field teams — enabling them to ask better questions about how to manage drug perception in specific channels. Bringing the right data directly to end users changes the quality of decisions they can make.


Introducing BioCanvas

Bonnie: What is BioCanvas, and what problem were you solving when you built it?

Gloria: When we saw the opportunity to incorporate multiple data modalities — not just text and structured data, but images, video, audio, time series — we asked ourselves: can we build something applicable across different teams and organizations?

We quickly realized that no generic, plug-and-play solution would work. Disease areas, drugs, and trials are all too different. So we built BioCanvas as an accelerator — we’ve done the hard work of ingesting raw data, preprocessing it, fusing it, and making it ready for retrieval. The last mile is always customized with the client, tailored to their specific disease area or workflow.

We also built BioCanvas with strict regulatory, security, and compliance requirements in mind. It’s designed to be deployed within the client’s own cloud environment, so no data crosses the firewall. Sensitive data and PHI are handled appropriately from day one.

Architecturally, BioCanvas sits in the middle layer. On one side are all the data modalities — video, images, audio, time series. BioCanvas handles the heavy lifting: preprocessing, fusing, harmonizing. On the other side are the use cases: medical scientific engagement, patient recruitment with highly specific queries, commercial drug launch analysis, and competitive intelligence.

Concrete example — Clinical Trial Control Tower: The goal is to help users answer questions like “How are my clinical trials performing against target?” — incorporating not just structured data, but adverse event reports, qualitative feedback in different formats, and patients with similar scan results or demographics. BioCanvas pre-integrates DICOM data preprocessing, text-based preprocessing, multimodal embedding models, and robust tensor storage to enable one-shot hybrid search with re-ranking.
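
To make that ingestion path tangible, here is a minimal sketch of it in Python: reading a DICOM series with pydicom, producing an embedding, and feeding the result into Vespa's Document v1 API. The schema and field names, the endpoint, and the stand-in embed_volume() function are illustrative assumptions, not BioCanvas internals.

```python
# Minimal ingestion sketch (assumed names, not BioCanvas internals):
# DICOM series -> normalized volume -> embedding -> Vespa document.
import numpy as np
import pydicom  # pip install pydicom
import requests

def load_dicom_volume(paths):
    """Stack a DICOM series into a normalized 3D volume, ordered by slice position."""
    slices = sorted((pydicom.dcmread(p) for p in paths),
                    key=lambda s: float(s.ImagePositionPatient[2]))
    volume = np.stack([s.pixel_array.astype(np.float32) for s in slices])
    return (volume - volume.min()) / (volume.max() - volume.min() + 1e-8)

def embed_volume(volume, dim=512):
    """Stand-in for a real multimodal embedding model (in practice, a 3D
    vision encoder chosen per use case). Mean-pools slices and resizes
    to `dim` so the sketch runs end to end."""
    pooled = volume.mean(axis=0).ravel()
    vec = np.resize(pooled, dim)
    return vec / (np.linalg.norm(vec) + 1e-8)

volume = load_dicom_volume(["scan_001.dcm", "scan_002.dcm"])
doc = {
    "fields": {
        "patient_id": "P-1042",  # assumed schema fields
        "age": 34,
        "notes": "Baseline neuroimaging; no adverse events reported.",
        "image_embedding": {"values": embed_volume(volume).tolist()},
    }
}
# Vespa Document v1 API: one document per patient scan.
requests.post(
    "http://localhost:8080/document/v1/biocanvas/patient/docid/P-1042",
    json=doc,
).raise_for_status()
```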


Under the Hood: How Multimodal Search Works

Ramesh: Every modality requires its own ingestion, preprocessing, and embedding approach. A 2D X-ray is different from a 3D MRI or CT scan, which is different from a pathology image or next-generation sequencing data. Each has to be processed, fused, and harmonized with other data sources in its own distinct way.

When a user searches for “patients between age 25 and 45 with comparable neuroimaging to this sample image,” we’re executing a hybrid query: structured search over age fields, image similarity search over neuroimaging, and text search over clinical notes — all at once, ranked and prioritized appropriately.
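
As a rough illustration of that one-shot hybrid query, here is a hedged sketch against Vespa's HTTP query API, reusing the helpers from the ingestion sketch above; the field names and the "hybrid" rank profile are assumptions.

```python
# Hybrid query sketch: structured filter + ANN image similarity +
# free-text match, ranked together in one request. Names are assumed.
import requests

query_vec = embed_volume(load_dicom_volume(["sample.dcm"]))  # helpers from the ingestion sketch

body = {
    "yql": (
        "select * from patient where "
        "age >= 25 and age <= 45 and "  # structured search over age fields
        "({targetHits: 100}nearestNeighbor(image_embedding, q_image) "  # image similarity
        "or userQuery())"  # text search over clinical notes
    ),
    "query": "comparable neuroimaging findings",  # feeds userQuery()
    "input.query(q_image)": query_vec.tolist(),
    "ranking": "hybrid",  # rank profile defined in the schema (sketched later)
    "hits": 20,
}
response = requests.post("http://localhost:8080/search/", json=body).json()
for hit in response["root"].get("children", []):
    print(hit["relevance"], hit["fields"].get("patient_id"))
```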

This is where multimodal intelligence earns its name. It’s not just embedding images and running RAG. It’s building a system that can intelligently combine many different search strategies, prioritize each data source as needed, and surface the most relevant result. Every use case requires calibration — there’s no one-size-fits-all solution. We build BioCanvas with opinionated defaults, then work with the scientists who’ll use the platform to tune it to their specific needs.


Why Vespa.ai Powers BioCanvas

Harini: Vespa began inside Yahoo, where it was purpose-built for fast, web-scale retrieval and powered Yahoo’s retrieval systems, including Yahoo Finance. It was open-sourced in 2017, and Vespa.ai spun out as an independent company in 2023. Today, it powers search and retrieval for companies like Perplexity and Spotify.

For life sciences specifically, Vespa offers four critical differentiators:

Multimodality: Vespa’s native tensor backend can represent complex objects — like entire proteins or spatial transcriptomics data — as vectors and retrieve across them with sub-second latency. This is essential for high-dimensional search (see the schema sketch after this list).

Precision: Vespa supports relevance tuning, re-ranking, and contextual user signals — the same techniques that make e-commerce search highly personalized. In life sciences, this means you can search pathology slides by providing a visual image as your query and find similar disease progressions or cancer subtypes.

Enterprise compliance: Vespa runs within your VPC, across all major cloud providers. Data stays where it needs to stay. It integrates cleanly into existing tech stacks and regulated environments.

Cost optimization: 3D medical images are storage-intensive. Vespa is optimized for retrieval and ranking on the fly, avoiding the compute and memory costs of processing images sequentially or joining across modalities inefficiently.

Together, those four levers — modality support, precision, compliance, and cost — address the core challenges enterprises face when trying to realize ROI from AI investments in life sciences.
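
To ground the first two levers, here is a minimal pyvespa sketch of a schema pairing structured and text fields with an HNSW-indexed tensor field, plus the kind of hybrid rank profile the earlier query sketch assumed. All names and the 512-dimension embedding size are illustrative, not the BioCanvas schema.

```python
# Hedged schema sketch (pip install pyvespa); all names are assumptions.
from vespa.package import ApplicationPackage, Field, HNSW, RankProfile

package = ApplicationPackage(name="biocanvas")
package.schema.add_fields(
    Field(name="patient_id", type="string", indexing=["attribute", "summary"]),
    Field(name="age", type="int", indexing=["attribute", "summary"]),
    Field(name="notes", type="string", indexing=["index", "summary"]),
    # Native tensor backend: one dense embedding per patient image,
    # HNSW-indexed for fast approximate nearest-neighbor retrieval.
    Field(
        name="image_embedding",
        type="tensor<float>(x[512])",
        indexing=["attribute", "index"],
        ann=HNSW(distance_metric="angular"),
    ),
)
# "hybrid" profile used by the query sketch above: lexical relevance
# plus vector closeness in a single ranking expression.
package.schema.add_rank_profile(
    RankProfile(
        name="hybrid",
        inputs=[("query(q_image)", "tensor<float>(x[512])")],
        first_phase="bm25(notes) + closeness(field, image_embedding)",
    )
)
```

Deployed inside the customer's own VPC, a package along these lines supports the compliance posture described above.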


Demo Walkthrough

Clinical Trial Control Tower & Cohort Builder

The BioCanvas Clinical Trial Control Tower opens with a map showing trial performance across sites nationally, with KPIs and summary stats on the right. Principal investigators get a personalized view of which trials are at risk relative to protocol.

Drilling into a trial detail view, users see both structured tabular data from trial sites and parsed patient safety experience reports — originally in Excel or PDF format — presented in a consumable format. Users can then chat directly with ML model output, asking questions like: “Can you forecast a viable runway for this trial without material changes?”

When a trial shows low enrollment and increasing dropout rates, the platform transitions into a patient recruitment workflow powered by Vespa’s tensor store. Users can upload a medical image — for example, a breast cancer scan with a tumor bounding box — and layer on additional search criteria like age, treatment history, and family history. Results are returned not as a flat list but in intelligent groupings (exact matches, close matches, etc.), each showing high-level clinical data alongside image attachments and clinical notes for deeper verification.
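
Those tiered result sets map naturally onto Vespa's grouping language. A hedged sketch follows, bucketing hits by relevance so exact and close matches surface as separate groups; the bucket width and field names are assumptions.

```python
# Grouping sketch: tier results by relevance score instead of
# returning a flat list. Bucket width and names are illustrative.
import requests

body = {
    "yql": (
        "select * from patient where "
        "{targetHits: 100}nearestNeighbor(image_embedding, q_image) "
        # Fixed-width relevance buckets approximate "exact" vs "close"
        # matches, keeping the top 5 summaries per bucket.
        "| all(group(fixedwidth(relevance(), 0.25)) "
        "each(max(5) each(output(summary()))))"
    ),
    "input.query(q_image)": query_vec.tolist(),  # from the hybrid query sketch
    "ranking": "hybrid",
}
groups = requests.post("http://localhost:8080/search/", json=body).json()
```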

Commercial Intelligence Dashboard

The commercial use case opens with a regional and state-level view of how a specific drug is performing in the market — including trending topics and sentiment over time. Users can chat with the data to surface competitor signals, drawing not just from internal reports but also from social media and video content that has been preprocessed and fused into the platform. The result is a dramatically shortened path from data to insight to action.


What Separates Organizations That Are Succeeding

Ariella: The organizations moving fastest are the ones systematically unlocking their data — building connective tissue between their AI tools and their data assets, piece by piece.

You can’t bring all your modalities into a platform at once and expect everything to work. The key is to prioritize, start knocking things down, and keep going. Some organizations have 100 years of unstructured documents — you’re not going to ingest all of that. But if you can start with even 15 to 20 years of data, you can already make meaningful decisions based on what’s already been studied.

The most fundamental capability is findability. If the data isn’t findable, it can’t be accessible, interoperable, or reusable. Get data findable first. Then your scientists can do what they love — innovating on behalf of patients and asking the right questions.


Getting Started: The First Steps

Gloria: The most common first reaction when organizations see BioCanvas is: “My data isn’t ready.” And at the organizational level, that may be true — but data readiness varies significantly across teams.

The real first step is an organizational assessment: which teams already have two or three modalities of data accessible enough to start fusing? That team can go first. Others can start with just two modalities — structured plus text, or structured plus PDF.

The second step is change management. Generative AI changes operating models, not just tooling. Teams need to prepare mentally and organizationally for how workflows will shift when these capabilities are available.

Ramesh: On the technical side, the key is a willingness to start small. Even partial data, combined with domain experts willing to collaborate with technical teams, is enough to move forward. We’ve worked with organizations at every stage of their infrastructure journey — from those with mature cloud environments to those with just a basic AWS account. The readiness to try new things matters more than where you’re starting from.


Q&A

Q: What are the challenges of using other cloud-native vector databases, and why use a fusion model like Vespa?

Ramesh: In theory, you could implement similar capabilities with another vector database. What draws us to Vespa is that hybrid search — across text, structured data, and multiple image modalities — is a native capability, not something you have to build on top of. Vespa also natively supports multi-phase search, where an early phase narrows candidates quickly and a later, more expensive phase re-ranks them. And scalability is built in: for life sciences data volumes, being able to scale without significant backend ops work lets us focus on enabling the science rather than managing infrastructure.
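
For readers unfamiliar with phased ranking, a minimal pyvespa sketch of that pattern might look like the following; the expressions, names, and rerank depth are assumptions rather than BioCanvas settings.

```python
# Multi-phase ranking sketch: cheap first phase over all matches,
# expensive second phase over the top candidates only. Assumed names.
from vespa.package import RankProfile, SecondPhaseRanking

multiphase = RankProfile(
    name="multiphase",
    inputs=[("query(q_image)", "tensor<float>(x[512])")],
    # Phase 1: inexpensive lexical score, evaluated for every match.
    first_phase="bm25(notes)",
    # Phase 2: costlier tensor similarity, recomputed only for the
    # 100 best first-phase candidates on each content node.
    second_phase=SecondPhaseRanking(
        expression="closeness(field, image_embedding)",
        rerank_count=100,
    ),
)
# package.schema.add_rank_profile(multiphase)  # attach to the schema sketch above
```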


Q: How do you solve commercial use cases, especially with video embeddings?

Gloria: Commercial use cases involve a broader range of data formats than clinical — video (combining pixels and audio), raw audio, CSV, PDF, PowerPoint, and more. The preprocessing challenge is proportionally greater.

There’s also a meaningful difference in what you’re optimizing for. In commercial settings, users typically want high recall — surfacing as much relevant information as possible to inform strategy. In scientific discovery, precision matters more — getting to the single most relevant answer. Tuning the retrieval system differently for these two contexts is part of how we customize BioCanvas for each use case.
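
One hedged way to picture that tuning difference is two variants of the same Vespa query, one widened for recall and one tightened for precision; the numbers and profile names below are purely illustrative.

```python
# Same corpus, two retrieval postures. All values are illustrative.
recall_oriented = {  # commercial: surface as much relevant signal as possible
    "yql": ("select * from doc where "
            "{targetHits: 1000}nearestNeighbor(embedding, q) or userQuery()"),
    "hits": 100,
    "ranking": "broad_recall",   # permissive profile, light re-ranking
}
precision_oriented = {  # scientific discovery: the single best answer matters
    "yql": ("select * from doc where "
            "{targetHits: 50}nearestNeighbor(embedding, q) and userQuery()"),
    "hits": 5,
    "ranking": "strict_rerank",  # aggressive second-phase re-ranking
}
```

Note the operator switch: the recall-oriented query accepts hits matching either signal (or), while the precision-oriented query requires both (and).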


Q: How does personalization apply in life sciences? Can you give examples?

Harini: Personalization in life sciences works similarly to e-commerce — different users searching the same data corpus have very different information intent.

Take a commercial example: a brand manager searching for information about a drug product needs very different results than a medical science liaison from the same team searching with the same prompt. The brand manager wants competitive strategy signals. The liaison wants content suitable for a scientific conference or paper. Their roles, past query history, and departmental context all become signals that can guide retrieval.

Traditional RAG ignores this entirely — it translates a question into an embedding and retrieves the closest match. It doesn’t account for unstated intent. With personalization, you encode the user’s role and interaction history as context, reducing the burden of prompt rewriting and making retrieval meaningfully more accurate. That’s one of the key patterns we’re importing from leading e-commerce applications into life sciences.
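
A hedged sketch of what that can look like at query time: the user's role and interaction history are embedded offline into a profile vector, passed as a query tensor, and folded into scoring by an assumed "personalized" rank profile.

```python
# Personalization sketch: same question, different users, different
# ranking. Profile name, fields, and vectors are assumptions.
import requests

def personalized_search(question_vec, user_profile_vec, hits=10):
    body = {
        "yql": ("select * from doc where "
                "{targetHits: 200}nearestNeighbor(embedding, q)"),
        "input.query(q)": question_vec,
        # Role + query history, embedded offline per user; an assumed
        # rank profile adds e.g.
        #   sum(query(user_profile) * attribute(doc_profile))
        # to the closeness score.
        "input.query(user_profile)": user_profile_vec,
        "ranking": "personalized",
        "hits": hits,
    }
    return requests.post("http://localhost:8080/search/", json=body).json()

# The brand manager and the medical science liaison issue the same
# question vector but different profile vectors, and get different rankings.
```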


Looking Ahead: What Excites the Panelists Most

Harini: The pace of investment and announcements in healthcare and life sciences AI right now is remarkable — across hyperscalers, foundation model companies, and the industry itself. The promise isn’t just productivity. It’s fundamentally changing timelines. If drug discovery can move from 15 years to 5, that changes what’s possible for patients. We’re in a moment where ecosystems like this — a hyperscaler, a product company, and a services firm coming together without fine-tuning a single model — can solve real industry problems. Every day brings a new challenge we can actually solve.

Ariella: The technology is no longer what’s holding us back. With all the advances in coding assistance and AI, the question is how quickly we can think about the problems and push the industry forward. We’re moving toward personalized medicine faster than anyone expected, and the investment from LLM companies, foundation model teams, pharma, and health systems is converging in an unprecedented way. Every day there’s something new to read — it’s an extraordinary time to be in this space.

Ramesh: What excites me most is seeing the technology being used to genuinely accelerate science. AlphaFold is the most visible example, but we’re seeing rapid progress in vision language models, hybrid models, and retrieval systems. We’re just beginning to see how these can be deployed. The pace of improvement is remarkable.

Gloria: At the individual level, what I’m most looking forward to is the patient experience changing. Today, patients with rare diseases spend years cycling through specialists, repeating tests, hitting information barriers because their data exists in different formats across different systems. On the other side, physicians and life sciences teams are wading through the same fragmented data landscape to answer very specific clinical questions. The technology we’re building is aimed at closing that gap — so patients get answers faster, and the right treatment sooner. That’s the future I’m working toward.