In this article we will learn about Merge API, a unified API solution to help developers build native integrations with multiple 3rd party end systems in one go using the 1:many connectors provided by Merge.
We will also learn about what happens post integration - how data syncs happen on Merge API predominantly via a pull based model where Merge stores a copy of the end customer data in their servers, which could lead to end customer anxiety around security and compliance.
Finally, we will talk about how Knit API is solving for this by working on a push based model which does not require any customer data storage - helping you reduce friction with your customers and alleviate their concerns on how their data is being handled.
Let’s dive in.
What are unified APIs?
Essentially, a unified API is a 1:many API which helps developers go live with multiple integrations within a category of SaaS through a one-time effort of integrating with the unified API provided by the platform. For example, let's say an Employee Benefits platform wishes to integrate with multiple HRMS systems which its existing or potential customers use. In the absence of a unified API, the developers at the benefits platform will have to read the API documentation of each HRMS, understand its data model and build the connectors 1:1. This wastes massive dev effort on a repetitive task which essentially serves the same basic purpose of syncing employee data from the customer's HRMS to the benefits platform. Unified APIs save all of this effort by normalizing the data models of all the HRMS tools out there into one common data model, so the developers at the benefits platform have to work with just one connector rather than building a connector for each different HRMS tool in a 1:1 manner.
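To make the idea of normalization concrete, here is a minimal sketch in Python. The two source systems ("hrms_a" and "hrms_b"), their field names and the Employee model are all hypothetical and are not taken from any real HRMS or unified API provider; the point is only that two differently shaped records end up in one common shape.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Employee:
    """Common data model shared by every connector (hypothetical fields)."""
    id: str
    full_name: str
    work_email: str


def from_hrms_a(record: dict) -> Employee:
    # Hypothetical HRMS "A" returns flat snake_case fields.
    return Employee(
        id=str(record["employee_id"]),
        full_name=f'{record["first_name"]} {record["last_name"]}',
        work_email=record["email"].lower(),
    )


def from_hrms_b(record: dict) -> Employee:
    # Hypothetical HRMS "B" nests contact details and uses camelCase.
    return Employee(
        id=str(record["id"]),
        full_name=record["displayName"],
        work_email=record["contact"]["workEmail"].lower(),
    )


# One normalizer per source system; the consumer app only ever sees Employee.
NORMALIZERS = {"hrms_a": from_hrms_a, "hrms_b": from_hrms_b}


def normalize(source: str, record: dict) -> Employee:
    """Map a source-specific record onto the common model."""
    return NORMALIZERS[source](record)
```

With a unified API, the mapping functions live on the provider's side, and your product logic is written once against the common model.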
Other than building the integrations faster, unified APIs also help you maintain them in a low effort way by providing DIY integration management dashboards which your front line customer success teams can use to diagnose and fix any issues with live integrations so that your engineering team does not need to get involved every time there is a break or question around data syncs from your customers.
If you are wondering whether a unified API is the right solution for you or not, read this
Now, let us look at the components of a unified API solution.
Key components of a unified API solution
Any unified API solution has four basic components -
1. The auth component
Users of your app use the auth component, embedded in your app, to authenticate and to authorize your SaaS application to access their enterprise app.
2. 1:Many Connectors
1:many connectors are, simply put, a common data model which abstracts the data models of all the existing applications in a category of apps, so that your engineering team can work with just the one connector provided by the unified API platform rather than individually integrating with every application in that category. This saves massive time, as your dev team does not need to understand the nuances of each app and can instead build your product logic on the common data model of the unified API provider.
3. Integration Management
Often the most neglected piece when teams build integrations in-house, integration management dashboards are one of the key value propositions of Unified API platforms since they help your frontline teams diagnose and fix any integration or sync issues without having to involve the engineering team each time. Think about the massive amount of time it can save in maintaining the integrations which are already built and live.
4. Data Syncs
This is probably the core of the product - getting data in from the source app to your app and writing back from your app to the source is why we build integrations. Because this is important, we will talk about it in more detail below, using Merge's data sync model as the example.
Data syncs via unified APIs
To understand how data syncs happen via unified APIs, we first need to understand that there are two parts to the process -
1. Data syncs between the source app and the unified API provider
2. Data syncs between the unified API provider and your app
Data syncs between the source app and the unified API provider
The first part of the data sync is to read data from the source app and make it available to the unified API provider. Here again, there are two phases:
- The initial data sync
- Delta syncs thereafter
The initial data sync happens when your app’s user has authenticated and authorized the unified API platform to access data from the source app for the first time. This is when Merge API accesses and stores a copy of the data in its own database. This is the copy of the data that Merge API uses to serve your app, i.e., the consumer app.
Post the initial syncs, the delta syncs come into the picture. The purpose of the delta syncs is to inform the consumer app of any changes in the data, for example title, manager, or location changes for any employee if you are syncing with a source HRMS system.
Now here, depending on the source system, delta syncs could be handled via webhooks OR by periodic polling of the source app.
- When the source app supports Webhooks, like Rippling, it dispatches any delta events to Merge, which then changes its copy to reflect the new information available.
- When the source app does not support Webhooks, like SuccessFactors, Merge has to read the entire data set from the source app again and create a fresh copy of the data by polling the source system at pre-set frequencies. Depending on the plan you have, these sync frequencies could be every 24 hours or more frequent.
The thing to note is that in both scenarios, whether or not the source app supports Webhooks, Merge API serves the consumer app via its stored copy of the data.
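When a source app has to be polled, changes can only be discovered by comparing a fresh copy against the previous one. The sketch below shows one way this could be done. It is an illustration under assumptions, not Merge's actual implementation; the snapshot shape (a dict keyed by employee id) and the event fields are hypothetical.

```python
def diff_snapshots(previous: dict, current: dict) -> list:
    """Return change events between two snapshots keyed by employee id."""
    events = []
    for emp_id, record in current.items():
        if emp_id not in previous:
            events.append({"type": "created", "id": emp_id, "data": record})
        elif record != previous[emp_id]:
            old = previous[emp_id]
            # Compare over the union of keys so removed fields show up as None.
            changes = {
                key: record.get(key)
                for key in record.keys() | old.keys()
                if record.get(key) != old.get(key)
            }
            events.append({"type": "updated", "id": emp_id, "changes": changes})
    # Records present before but missing now are reported as deletions.
    for emp_id in sorted(previous.keys() - current.keys()):
        events.append({"type": "deleted", "id": emp_id})
    return events
```

Note that this approach needs the previous snapshot to be kept somewhere, which is why a polling-based sync and a stored copy of the data tend to go together.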
Concerns with a data storage first model for data syncs
A data storage based model brings with it multiple challenges. First and foremost, the end customers who are authorizing your app to access their data via Merge API might not be comfortable with a third party having a copy of their data stored somewhere. Even though Merge API is SOC2 compliant, as ex-users of Merge API's HRMS integrations for our HRTech venture, we had a segment of customers who had concerns about how the data was handled, since employee data is PII. There were also concerns about where the data was being stored (in some cases outside the customer's geography).
This added unnecessary friction between us and our customers, requiring additional infosec questions and paperwork which delayed deal closures or in some cases led to deals not closing at all.
Data syncs between the unified API provider and your app
Now that the unified API provider has the data, your app must consume the data for your business logic to work. There are two main philosophies or approaches here - pull vs push.
The pull model
In a pull model, your servers are busy making calls to the data providers like HRIS, Payroll systems etc. to get data. In order to do so, you will need to create a polling infra.
If you're doing so for 10-15 batch jobs, perhaps it is manageable. But imagine doing this for hundreds, even thousands, of jobs. The problem gets harder. Further, if there is no new data to report, you just wasted your compute resources on an empty call.
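A minimal sketch of one polling cycle makes the cost visible. The function names and the shape of the returned changes are hypothetical; fetch_changes stands in for whatever network call your polling infra makes per connected customer.

```python
from typing import Callable, Iterable


def run_polling_cycle(
    customer_ids: Iterable[str],
    fetch_changes: Callable[[str], list],
    process: Callable[[str, list], None],
) -> dict:
    """Poll every connected customer once and count the empty calls."""
    stats = {"calls": 0, "empty_calls": 0}
    for customer_id in customer_ids:
        changes = fetch_changes(customer_id)  # one call per customer, every cycle
        stats["calls"] += 1
        if not changes:
            stats["empty_calls"] += 1  # compute spent with nothing to report
            continue
        process(customer_id, changes)
    return stats
```

In a real system this function would be triggered by a scheduler for every cycle, and the number of calls grows with the number of connected customers whether or not anything changed.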
The push model
Now compare this with the push model. Typically, you will be required to subscribe to certain events by registering a webhook. When the event happens, the data providers would notify you with appropriate payload on the registered destination URL.
You don't have to manage any polling infra, nor would you waste compute resources on inconsequential calls. With event driven microservices architectures on the rise, the push model is definitely easier to work with and scale vs a pull model.
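On the receiving side, a push model boils down to one endpoint that dispatches each incoming event to the right handler. The sketch below keeps the HTTP layer out of the picture and assumes a hypothetical JSON payload with "type" and "data" fields; real providers define their own payload formats.

```python
import json
from typing import Callable, Dict, List

HANDLERS: Dict[str, Callable[[dict], None]] = {}


def on(event_type: str):
    """Register a handler for one subscribed event type."""
    def register(func):
        HANDLERS[event_type] = func
        return func
    return register


def handle_webhook(raw_body: bytes) -> int:
    """Dispatch one webhook delivery; return the HTTP status to send back."""
    try:
        event = json.loads(raw_body)
        event_type = event["type"]
    except (ValueError, KeyError, TypeError):
        return 400  # malformed payload
    handler = HANDLERS.get(event_type)
    if handler is None:
        return 204  # not subscribed to this event; acknowledge and ignore
    handler(event.get("data", {}))
    return 200


# Example subscription (hypothetical event name): record the changed employee ids.
updated_ids: List[str] = []


@on("employee.updated")
def record_update(data: dict) -> None:
    updated_ids.append(data["id"])
```

Your web framework's route for the registered destination URL would pass the request body to handle_webhook and respond with the returned status code. Work only happens when an event actually arrives.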
Which approach does Merge API use? Pull or Push?
Here, Merge API relies heavily on a pull-based approach (though it does provide delta webhooks, which we will talk about below). Let’s look at the three options Merge API provides to consumer apps:
1. Periodic Syncs
Here, your app is expected to periodically poll the Merge copy of the data, such as every 24 hours. What this means is that you will have to build and maintain a polling infrastructure at your end for each connected customer. This is ok if you have a small number of customers, but quickly gets difficult to maintain if you have lots of connected customers.
2. Ad-hoc Syncs
If your app wants to sync data frequently, Merge API provides an option for you to write sync functions which pull only the data that has changed since the last sync, using its modified_after timestamp. While this reduces the data load, it still requires the polling infrastructure we talked about in point 1.
3. Webhooks
Merge’s webhooks are again of two types - sync notifications and changed data webhooks.
Sync notification events are simply notifications to your app that something in the data has changed. Your app is expected to start the ad-hoc sync once it receives the notification - so essentially pull. On the other hand, while Merge does offer changed data webhooks, it does not guarantee scale and data delivery via these webhooks.
From Merge’s doc:
- Merge offers webhooks that deliver updated data when individual data instances are updated, but depending on the amount of user data, this will not scale well.
- Make sure to implement polling and don't rely entirely on notification webhooks. They can fail for a variety of reasons (such as downtime on your end or failed processing). Merge does attempt to redeliver multiple times using exponential backoff, but we still recommend calling your sync functions on a periodic cadence of around once every 24 hours.
So you see the problem? You will not be able to work around the need for building and maintaining a separate polling infrastructure.
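The sync function you end up maintaining typically looks something like the sketch below. It is an illustration only: fetch_modified_after is a hypothetical stand-in for a call to the provider that passes the modified_after timestamp, and the record fields ("id", "modified_at") are assumptions about the response shape.

```python
from datetime import datetime
from typing import Callable, Dict, List, Optional


def incremental_sync(
    fetch_modified_after: Callable[[Optional[datetime]], List[dict]],
    store: Dict[str, dict],
    cursor: Optional[datetime],
) -> Optional[datetime]:
    """Pull records changed since `cursor`, upsert them, return the new cursor."""
    records = fetch_modified_after(cursor)  # cursor=None means a full first pull
    for record in records:
        store[record["id"]] = record
    if not records:
        return cursor  # nothing changed; keep the old cursor
    # Advance the cursor to the newest change we have seen.
    return max(record["modified_at"] for record in records)
```

The same function has to run both when a sync notification arrives and on a periodic schedule as a safety net, and the cursor has to be persisted per connected customer. That is the polling infrastructure this section refers to.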
Knit’s approach to data syncs
While everyone talks about security, at Knit we have actually walked the talk by embedding security in our platform architecture.
We do NOT store a copy of the source data with us. And we have built a completely events driven architecture from the ground up, so we work only with Webhooks and deliver both the initial and delta syncs to your app via events.
So you have less compliance work and convincing to do with your customers, and you do not have to waste engineering resources on polling, while at the same time getting guaranteed scalability and delivery irrespective of the data load.
Another advantage of a true events driven architecture is that it supports real time use cases (where the source APP supports real time webhook pushes) which a polling based architecture does not.
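When both the initial and delta syncs arrive as events, your app builds its own copy of the data purely from what is pushed to it. The sketch below shows one way a consumer could apply such events. The event shape ("event_id", "kind", "id", "data") is hypothetical and is not Knit's actual payload format, and skipping repeated event ids is an assumption about how a consumer might guard against duplicate deliveries.

```python
from typing import Dict, Set


class LocalEmployeeStore:
    """Your app's own copy of the data, built only from pushed events."""

    def __init__(self) -> None:
        self.records: Dict[str, dict] = {}
        self._seen_event_ids: Set[str] = set()

    def apply(self, event: dict) -> bool:
        """Apply one event; return False if it was a duplicate delivery."""
        if event["event_id"] in self._seen_event_ids:
            return False
        self._seen_event_ids.add(event["event_id"])
        if event["kind"] == "deleted":
            self.records.pop(event["id"], None)
        else:
            # "initial" and "updated" events both upsert: new fields win.
            merged = {**self.records.get(event["id"], {}), **event["data"]}
            self.records[event["id"]] = merged
        return True
```

Because the initial sync and later changes go through the same apply method, there is no separate polling path to build or maintain.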
While we will soon be covering our architecture that guarantees security, scale and resilience for event driven stream processing in more detail in a follow up post, you could read more about the basics of how Knit API functions here: Knit API Documentation
Other advantages of Knit API over Merge API
1. Auth component
Knit’s auth component offers a lot of flexibility in terms of design and styling vs what Merge API offers.
It is a JavaScript SDK, which is far more customizable than an iframe, Merge’s choice for the frontend auth component.
So if you want to make sure that the auth component your customers interact with looks and feels similar to your own app, Knit API might just be the right solution for you.
2. Integration Management
Knit provides deep RCA and resolution, including the ability to identify which records were synced and the ability to rerun syncs. It also proactively identifies and fixes integration issues itself.
While Merge also offers customer success dashboards, they are not as deep, so your frontline folks will have to reach out to your engineering teams more frequently. And we all know how much engineering teams enjoy maintaining integrations and going through logs to check for data sync issues rather than building cool new core product features ;)
Final Thoughts
Knit is the only unified API solution which does not store customer data and offers a scalable and reliable push driven data sync model for large data loads.
This has several benefits:
- Better security: Since Knit does not store any customer data, you and your customers can be confident about data security and you can close deals much faster.
- Better developer experience: Since you do not need to maintain any polling infrastructure for Knit due to its events driven architecture, your developers have fewer issues to deal with in maintaining integrations.
- Better scalability & reliability: Even with a webhook first model, Knit guarantees data delivery and scalability irrespective of the amount of data being synced between the source and destination apps. We offer a 99.99% SLA and are aiming for 99.9999% soon.
Curious to learn more about Knit? Get started today