Unified Clinical Data Integration and Orchestration with Datafusion
The Challenge
Healthcare technology providers often work in environments where multiple EHR systems such as Epic, Athenahealth, and Cerner are used across different clinics and hospitals. Each system comes with its own APIs, authentication methods, and data formats, which makes integration difficult to standardize.
In this case, a health technology company needed a reliable way to connect with all these systems while running a multi-tenant SaaS platform. Their applications depended on consistent access to clinical data, regular data extraction for reporting and compliance, and scheduling that worked correctly across different time zones.
There were also operational challenges. Different types of users needed different levels of access, from platform administrators to clinic-level users. At the same time, each clinic required full control over its own setup, including its credentials and the location where its clinical data would be stored. Many clinics preferred to store data in their own cloud environments, such as Amazon Web Services, Google Cloud, or Microsoft Azure.
On top of that, the system had to support a large volume of notifications for both patients and staff without slowing down APIs. The existing approach relied on separate integration services, cron jobs, and notification workers across different products, which made the system harder to maintain and scale.
The Solution
To address these challenges, the company implemented Datafusion as a centralized layer between their applications and the various EHR systems.
Instead of maintaining separate integrations for each vendor, Datafusion provides a single, consistent way to access clinical data. It uses a scope-driven approach in which the scope of each request determines how the data is fetched and from which system. For example, Epic integrations use secure OAuth 2.0 flows with signed tokens and FHIR APIs, while other systems are handled through an integration engine that manages vendor-specific differences.
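The scope-driven routing described above can be pictured with a minimal sketch. The scope format (`<vendor>:<resource>.<action>`), the type names, and the `planFetch` function are all assumptions made for illustration; the case study does not document Datafusion's actual scope syntax or internals.

```typescript
// Hypothetical sketch: the scope format and all names here are assumptions.
type Transport = "fhir-oauth2" | "integration-engine";

interface FetchPlan {
  vendor: string;
  transport: Transport;
  resource: string;
  action: string;
}

// Per the case study, Epic is reached through OAuth 2.0 and FHIR, while the
// other vendors go through the integration engine.
const TRANSPORT_BY_VENDOR = new Map<string, Transport>([
  ["epic", "fhir-oauth2"],
  ["athenahealth", "integration-engine"],
  ["cerner", "integration-engine"],
]);

// Parse a scope such as "epic:Patient.read" into a routing decision.
function planFetch(scope: string): FetchPlan {
  const match = /^([a-z]+):([A-Za-z]+)\.([a-z]+)$/.exec(scope);
  if (match === null) {
    throw new Error(`Malformed scope: ${scope}`);
  }
  const vendor = match[1] ?? "";
  const resource = match[2] ?? "";
  const action = match[3] ?? "";

  const transport = TRANSPORT_BY_VENDOR.get(vendor);
  if (transport === undefined) {
    throw new Error(`Unsupported vendor in scope: ${vendor}`);
  }
  return { vendor, transport, resource, action };
}
```

Keeping the vendor-to-transport mapping in data rather than in branching code is one way to make a new vendor a configuration change instead of a new code path.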
Beyond integration, Datafusion also brings together scheduling, access control, data storage, and notifications in one platform. This simplified the overall architecture and reduced the need to build and maintain multiple parallel systems.
Core Delivery Approach
Data was sourced from Epic, Athenahealth, and Cerner systems.
The scope included clinical APIs, bulk data extraction, scheduled data pulls, notifications, and access control.
A centralized platform team managed the system, while each clinic handled its own configurations.
This setup made it possible to keep things consistent across clinics while still giving each clinic the flexibility it needed.
Technologies Used
| LAYER | TECHNOLOGY |
|---|---|
| Frontend | React.js, Redux Toolkit, Bootstrap |
| Backend | Node.js |
| Database | PostgreSQL |
| Integration Engine | Mirth Connect (Java) |
| Notifications | Twilio, Voyage |
| File Storage | AWS S3, Google Cloud Storage, Azure Blob Storage |
Key Capabilities Delivered
Unified access to multiple EHR systems
Datafusion introduced a single endpoint for accessing clinical data. Requests are handled based on configuration, so the same interface works across different EHR vendors without requiring separate implementations.
Role-based access control
The system defines clear roles to manage access. Platform admins handle global setup and onboarding. Master users manage their clinic’s configuration, users, and integrations. Standard users have limited access for day-to-day operations. Some actions are shared between admin and master users where needed.
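As a rough illustration of the three roles, the sketch below models a permission check. The role identifiers, the action names, and the exact permission matrix are assumptions; only the split between platform admins, master users, and standard users comes from the description above.

```typescript
// Hypothetical sketch: action names and the permission matrix are assumptions.
type Role = "platform_admin" | "master_user" | "standard_user";
type Action =
  | "onboard_clinic"
  | "manage_clinic_config"
  | "manage_clinic_users"
  | "view_clinical_data";

interface User {
  id: string;
  role: Role;
  clinicId: string | null; // null for platform admins, who are not tied to one clinic
}

// Which roles may perform each action. Some actions are shared between
// platform admins and master users, as the text describes.
const ALLOWED_ROLES: Record<Action, readonly Role[]> = {
  onboard_clinic: ["platform_admin"],
  manage_clinic_config: ["platform_admin", "master_user"],
  manage_clinic_users: ["platform_admin", "master_user"],
  view_clinical_data: ["master_user", "standard_user"],
};

function canPerform(user: User, action: Action, targetClinicId: string): boolean {
  if (ALLOWED_ROLES[action].indexOf(user.role) === -1) {
    return false;
  }
  // Platform admins act globally; everyone else is confined to their own clinic.
  if (user.role === "platform_admin") {
    return true;
  }
  return user.clinicId === targetClinicId;
}
```

The check combines two questions: whether the role allows the action at all, and whether the user belongs to the clinic being acted on.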
Clinic-level scheduling
Scheduling is tied directly to clinic configurations rather than being managed globally. Each scheduled job is linked to a specific clinic setup, making it easier to run data pulls aligned with real clinical workflows. Schedules account for each clinic's time zone, and the same logic is used for both manual and automated requests.
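One way to reason about time-zone-aware scheduling is to convert the current instant into the clinic's local wall-clock time before comparing it with the schedule. The sketch below does this with the standard `Intl.DateTimeFormat` API; the `ClinicSchedule` shape and the `isDue` helper are hypothetical and not taken from Datafusion.

```typescript
// Hypothetical sketch: the schedule shape and helper names are assumptions.
interface ClinicSchedule {
  clinicConfigId: string;
  timeZone: string; // IANA zone name, e.g. "America/New_York"
  hour: number; // 0-23, in the clinic's local time
  minute: number; // 0-59
}

// Convert an instant to the wall-clock hour and minute in a given time zone.
function localHourMinute(instant: Date, timeZone: string): { hour: number; minute: number } {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone,
    hour: "2-digit",
    minute: "2-digit",
    hour12: false,
  }).formatToParts(instant);

  let hour = 0;
  let minute = 0;
  for (const part of parts) {
    // Some runtimes report midnight as "24" when hour12 is false, hence % 24.
    if (part.type === "hour") hour = Number(part.value) % 24;
    if (part.type === "minute") minute = Number(part.value);
  }
  return { hour, minute };
}

// A job is due when the clinic's local time matches its configured time.
function isDue(schedule: ClinicSchedule, now: Date): boolean {
  const local = localHourMinute(now, schedule.timeZone);
  return local.hour === schedule.hour && local.minute === schedule.minute;
}
```

Because the comparison happens in local time, the same schedule definition keeps firing at the intended local hour across daylight saving changes without storing UTC offsets in the configuration.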
Bulk data processing and tracking
For large data requests, the system supports bulk operations in which jobs can be started, monitored, and completed in the background. Because this processing runs separately from the real-time APIs, long-running tasks do not affect user-facing performance.
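The start, monitor, and complete lifecycle can be sketched as a small state tracker. This is an in-memory illustration under assumed names (`BulkJobTracker`, the status values, the job fields); a real deployment would presumably persist job state, for example in the PostgreSQL database listed in the stack.

```typescript
// Hypothetical sketch: class, field, and status names are assumptions.
type JobStatus = "queued" | "running" | "completed" | "failed";

interface BulkJob {
  id: string;
  clinicId: string;
  status: JobStatus;
  processed: number;
  total: number;
}

class BulkJobTracker {
  private jobs = new Map<string, BulkJob>();
  private nextId = 1;

  // Register a new job and return a snapshot the API can hand back at once.
  start(clinicId: string, total: number): BulkJob {
    const job: BulkJob = { id: `job-${this.nextId++}`, clinicId, status: "queued", processed: 0, total };
    this.jobs.set(job.id, job);
    return { ...job };
  }

  // Called by the background worker as records are processed.
  recordProgress(id: string, processed: number): BulkJob {
    const job = this.mustGet(id);
    if (job.status === "completed" || job.status === "failed") {
      throw new Error(`Job ${id} is already ${job.status}`);
    }
    job.processed = Math.min(processed, job.total);
    job.status = job.processed >= job.total ? "completed" : "running";
    return { ...job };
  }

  fail(id: string): BulkJob {
    const job = this.mustGet(id);
    job.status = "failed";
    return { ...job };
  }

  // Status lookup used for monitoring.
  get(id: string): BulkJob {
    return { ...this.mustGet(id) };
  }

  private mustGet(id: string): BulkJob {
    const job = this.jobs.get(id);
    if (job === undefined) {
      throw new Error(`Unknown job: ${id}`);
    }
    return job;
  }
}
```

Returning a job identifier immediately and letting callers look up status later is what keeps the long-running work off the request path.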
Clinic-controlled data storage
Datafusion allows clinical data to be written directly to the clinic’s cloud storage. A consistent structure is used to organize data, which supports tracking, auditing, and reprocessing when needed.
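The case study does not spell out the storage layout, so the following is only one plausible shape for a "consistent structure": an object key built from the clinic, the resource type, the extraction date, and the job that produced the file. Every path segment and name here is an assumption.

```typescript
// Hypothetical sketch: the key layout and field names are assumptions.
interface ExportRecord {
  clinicId: string;
  resourceType: string; // e.g. "Patient"
  jobId: string;
  extractedAt: Date;
}

function pad2(n: number): string {
  return n < 10 ? `0${n}` : String(n);
}

// Build a provider-neutral object key. The same key could be used with
// S3, Google Cloud Storage, or Azure Blob Storage.
function buildObjectKey(record: ExportRecord): string {
  const d = record.extractedAt;
  // UTC is used so the path does not depend on the server's local time zone.
  const datePath = `${d.getUTCFullYear()}/${pad2(d.getUTCMonth() + 1)}/${pad2(d.getUTCDate())}`;
  return `clinics/${record.clinicId}/${record.resourceType}/${datePath}/${record.jobId}.ndjson`;
}
```

A key that encodes the clinic, date, and job makes it straightforward to audit what was written by which job and to reprocess a single day or a single run.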
Scalable notifications and real-time updates
Notifications are handled asynchronously so that API performance is not affected. The platform supports email, SMS, and in-app messaging. Real-time updates are delivered through socket connections, removing the need for polling.
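The decoupling between API requests and notification delivery can be sketched with a simple queue: the request path only enqueues, and a separate worker drains the queue. The class and type names below are illustrative assumptions, and the `send` callback stands in for whatever provider integration (such as Twilio) does the actual delivery.

```typescript
// Hypothetical sketch: names are assumptions; `send` stands in for a real provider call.
type Channel = "email" | "sms" | "in_app";

interface OutboundNotification {
  channel: Channel;
  recipient: string;
  body: string;
}

type Sender = (notification: OutboundNotification) => Promise<void>;

class NotificationQueue {
  private pending: OutboundNotification[] = [];

  // Called on the API request path: no network I/O, returns immediately.
  enqueue(notification: OutboundNotification): number {
    this.pending.push(notification);
    return this.pending.length;
  }

  get size(): number {
    return this.pending.length;
  }

  // Called by a background worker: delivers everything currently queued.
  async drain(send: Sender): Promise<{ sent: number; failed: number }> {
    let sent = 0;
    let failed = 0;
    while (this.pending.length > 0) {
      const next = this.pending.shift();
      if (next === undefined) break;
      try {
        await send(next);
        sent++;
      } catch (err) {
        // A failed delivery must not block the rest of the queue.
        failed++;
      }
    }
    return { sent, failed };
  }
}
```

In production this role is usually played by a durable queue rather than an in-memory array, so that queued notifications survive a process restart.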
Multi-tenant architecture
Each clinic operates within its own boundary while sharing the same platform. Authentication is handled securely, and configurations are resolved dynamically based on the clinic making the request.
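Dynamic configuration resolution can be illustrated as a lookup keyed by the clinic identified during authentication. The `AuthClaims` and `ClinicConfig` shapes and the registry class are assumptions for this sketch; the case study does not describe how Datafusion stores or resolves tenant configuration.

```typescript
// Hypothetical sketch: all shapes and names are assumptions.
interface ClinicConfig {
  clinicId: string;
  ehrVendor: string;
  storageProvider: "aws" | "gcp" | "azure";
  timeZone: string;
}

// Claims assumed to come from an already verified authentication token.
interface AuthClaims {
  userId: string;
  clinicId: string;
}

class ClinicConfigRegistry {
  private configs = new Map<string, ClinicConfig>();

  register(config: ClinicConfig): void {
    this.configs.set(config.clinicId, config);
  }

  // The clinic is taken from the verified claims, never from the request
  // body, so a caller cannot select another tenant's configuration.
  resolve(claims: AuthClaims): ClinicConfig {
    const config = this.configs.get(claims.clinicId);
    if (config === undefined) {
      throw new Error(`No configuration for clinic ${claims.clinicId}`);
    }
    return config;
  }
}
```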
The Impact
After implementing Datafusion, the company saw clear improvements in development speed and system reliability.
New clinics could be onboarded much faster since most of the work became configuration rather than building new integrations. The engineering team no longer needed to maintain separate codebases for each EHR system. Data handling became more transparent, and clinics had better control over where their clinical data was stored.
Scheduling became more predictable, and the system handled high volumes of notifications without performance issues. Overall, the platform became easier to manage and scale.
Why It Worked
The biggest reason this approach worked was consolidation. Instead of solving the same problems in multiple places, everything was brought into one system.
The role-based access model ensured that users only had access to what they needed, improving governance and security. Linking scheduling to clinic configurations made the system more aligned with real-world workflows. Asynchronous processing helped maintain performance under heavy load.
Allowing clinics to store their data in their own cloud environments also built trust and supported compliance requirements.
Outcome
Datafusion turned a fragmented and complex integration setup into a single, well-structured platform.
Clinical data access is now consistent, scheduling is reliable, and system performance remains stable even at scale. The platform supports both operational needs and long-term analytics without requiring additional infrastructure.
What previously required multiple systems and significant maintenance effort is now handled through a single, unified solution that is easier to manage, extend, and trust.