Hi all, I'm Erika Newland, Senior Software Engineer and Technical Lead at Orum.
I had the pleasure of speaking at Fintech Devcon this year, and I wanted to share my presentation on our blog for those who weren’t able to make it. The topic was event-based architecture and how Orum is leveraging it to solve some of the hardest and most important problems for payments platforms.
In all software systems, tradeoffs between speed, user experience, and scale pose challenges. In real-time, 24/7/365, multi-rail money movement, requirements for handling volatile and unpredictable volumes, managing operational risk, and gleaning data insights increase the dimensionality of the problem.
To meet these requirements, Orum’s payments platform, Momentum, deploys an event-based architecture stack composed of lightweight AWS services managed and deployed via reusable infrastructure-as-code components. Together, the components form a pub/sub event delivery model.
How this works at Orum
Momentum produces events via a transactional outbox pattern. Each Momentum microservice deploys an RDS outbox table along with its domain data tables. For each write to the domain model, services publish relevant events to the outbox table. This dual-write process lets us customize event schemas, decoupling them from domain schemas and normalization strategies. An event schema may contain, for example, the result of a join between a customer table and a transaction table, so the data tables stay normalized while each event encapsulates all relevant information.
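To make the pattern concrete, here is a minimal sketch in Python using SQLite as a stand-in for RDS. It is not Orum's code: the table names, column names, event type, and the `create_transaction` function are all hypothetical, and the sketch assumes that the domain write and the outbox write share one database transaction, as the pattern's name suggests.

```python
import json
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    """Create two hypothetical domain tables plus the outbox table."""
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE transactions (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            customer_id INTEGER NOT NULL REFERENCES customers(id),
            amount_cents INTEGER NOT NULL,
            status TEXT NOT NULL
        );
        CREATE TABLE outbox (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            event_type TEXT NOT NULL,
            payload TEXT NOT NULL,
            published INTEGER NOT NULL DEFAULT 0
        );
        """
    )


def create_transaction(conn: sqlite3.Connection, customer_id: int, amount_cents: int) -> int:
    """Write the domain row and its event in one database transaction."""
    # `with conn` commits if the block succeeds and rolls back if it raises,
    # so the domain row and the outbox row are stored together or not at all.
    with conn:
        cur = conn.execute(
            "INSERT INTO transactions (customer_id, amount_cents, status) "
            "VALUES (?, ?, 'created')",
            (customer_id, amount_cents),
        )
        txn_id = cur.lastrowid
        # The event schema is denormalized: it joins customer and transaction
        # data even though the domain tables themselves stay normalized.
        row = conn.execute(
            "SELECT t.id, t.amount_cents, t.status, c.id, c.name "
            "FROM transactions t JOIN customers c ON c.id = t.customer_id "
            "WHERE t.id = ?",
            (txn_id,),
        ).fetchone()
        payload = {
            "transaction_id": row[0],
            "amount_cents": row[1],
            "status": row[2],
            "customer": {"id": row[3], "name": row[4]},
        }
        conn.execute(
            "INSERT INTO outbox (event_type, payload) VALUES (?, ?)",
            ("transaction.created", json.dumps(payload)),
        )
    return txn_id
```

Because the event payload is assembled at write time, consumers never need to query the producing service's tables to get the customer details.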
The outbox poller module, a custom, reusable Terraform module deployed in each service, pushes events out of Momentum’s outbox tables. The module’s Terraform deploys a cron-driven AWS Lambda function, written in Node, that queries the table for new event records and writes the records to Orum’s event producer module. The outbox poller module is packaged directly with the Lambda code.
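The polling loop itself can be sketched in a few lines. The real poller is a Node Lambda reading from RDS; the Python below only illustrates the idea against SQLite. The `published` flag, the `poll_outbox` name, and the `publish` callable (a stand-in for the event producer module) are assumptions made for this example.

```python
import json
import sqlite3
from typing import Callable


def poll_outbox(
    conn: sqlite3.Connection,
    publish: Callable[[dict], None],
    batch_size: int = 100,
) -> int:
    """Publish unpublished outbox rows in insertion order; return how many were sent."""
    rows = conn.execute(
        "SELECT id, event_type, payload FROM outbox "
        "WHERE published = 0 ORDER BY id LIMIT ?",
        (batch_size,),
    ).fetchall()
    sent = 0
    for row_id, event_type, payload in rows:
        # Publish first, then mark the row. If publish raises, the row stays
        # unpublished and is picked up again on the next scheduled run.
        publish({"id": row_id, "type": event_type, "data": json.loads(payload)})
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
        sent += 1
    return sent
```

Marking a row only after a successful publish means this sketch delivers each event at least once, so consumers of a poller built this way would need to tolerate occasional duplicates.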
The outbox poller writes events to our custom event producer module. Each service deploys a copy of the reusable event producer module. The module lets us write events simultaneously to our data lake, via Kinesis Data Firehose, for event analytics, and to our subscribing microservices, via SNS. Without our fan-out Lambda, we wouldn’t be able to write events to both destinations at once using the optimal infrastructure for each use case. I like illustrations as much as my colleague Dave Cline does, so here’s one below:
Handling the complexity of payments
Payment platforms are complex, let alone 24/7/365, multi-rail, multi-speed, multi-step payment platforms. As a result, each of our microservices has its own domain-specific requirements for consuming events across the Momentum platform.
Two of our most interesting event consumption problems occur when:
- A single step in a payment lifecycle requires inputs from multiple events arriving in inconsistent order
- We need to put the payment lifecycle into a holding pattern for a period of time
To handle scenarios in which multiple events are required to advance a payment lifecycle, we use Event Coordinators. These are specialized event handlers. For a given process, the coordinator stores the incoming event data in a domain-specific schema. The coordinator then checks all relevant tables to determine if all required events have been logged. If all requirements are met, the coordinator emits an event to progress the payment lifecycle. If not, the handler returns nil and the coordinator waits for the next event to arrive.
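Here is a small Python sketch of the coordinator idea, again with SQLite standing in for the service database. The event names in `REQUIRED_EVENTS`, the single `received_events` table, and the emitted event type are all hypothetical; a real coordinator would store richer event data, possibly across several tables.

```python
import sqlite3
from typing import Optional

# Hypothetical set of events that must all arrive before the payment advances.
REQUIRED_EVENTS = {"debit.confirmed", "risk.approved"}


def create_schema(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS received_events ("
        "payment_id TEXT NOT NULL, event_type TEXT NOT NULL, "
        "PRIMARY KEY (payment_id, event_type))"
    )


def coordinate(conn: sqlite3.Connection, payment_id: str, event_type: str) -> Optional[dict]:
    """Record an incoming event; return a follow-up event once all inputs exist."""
    with conn:
        # INSERT OR IGNORE makes redelivery of the same event a no-op insert.
        conn.execute(
            "INSERT OR IGNORE INTO received_events (payment_id, event_type) VALUES (?, ?)",
            (payment_id, event_type),
        )
        seen = {
            row[0]
            for row in conn.execute(
                "SELECT event_type FROM received_events WHERE payment_id = ?",
                (payment_id,),
            )
        }
    if REQUIRED_EVENTS <= seen:
        # Every required input is logged, regardless of arrival order.
        return {"type": "payment.ready_to_advance", "payment_id": payment_id}
    # Not complete yet: return nothing and wait for the next event.
    return None
```

Because the check runs against stored state rather than arrival order, it does not matter which event shows up first. Note that this sketch would emit the follow-up event again if a required event were redelivered after completion, so a production version would need its own guard against that.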
Momentum handles multi-step payments that require waiting periods between payment legs. For example, to reduce operational risk in account-to-account payments, Momentum waits a predetermined period of time between sending payment instructions to the payment networks (NACHA, for example) for each account. To automate this time-based processing, Orum engineers developed cron-based event gating.
In cron-based event gating, the service:
- Calculates the time at which the next stage of the payment lifecycle should begin
- Writes the time to a “hold time” table
- Releases an event updating the payment status to indicate that the payment is being held
A cron-driven Lambda periodically reads the “hold time” table, finds payments whose “hold time” has passed, and produces events indicating that the payment is ready to resume the lifecycle. The specialized “hold time” table allows us to separate the orchestration data for holding and promoting payments from the payment domain schema. The cron-driven Lambda, meanwhile, performs periodic reads without tying up a service thread or relying on the server clock to keep time.
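The two halves of cron-based event gating, holding a payment and later releasing it, can be sketched as follows. This is an illustrative Python example with SQLite, not Orum's implementation: the `hold_times` table layout, the `released` flag, the function names, and the event types are assumptions. The current time is passed in as an argument so the logic stays easy to test.

```python
import sqlite3
from datetime import datetime, timedelta
from typing import List


def create_schema(conn: sqlite3.Connection) -> None:
    # Orchestration data lives in its own table, apart from the payment domain schema.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS hold_times ("
        "payment_id TEXT PRIMARY KEY, "
        "release_at REAL NOT NULL, "  # epoch seconds
        "released INTEGER NOT NULL DEFAULT 0)"
    )


def hold_payment(
    conn: sqlite3.Connection, payment_id: str, now: datetime, hold_for: timedelta
) -> dict:
    """Service side: compute the release time, store it, and return a 'held' event."""
    release_at = now + hold_for
    with conn:
        conn.execute(
            "INSERT INTO hold_times (payment_id, release_at) VALUES (?, ?)",
            (payment_id, release_at.timestamp()),
        )
    return {"type": "payment.held", "payment_id": payment_id}


def release_due_payments(conn: sqlite3.Connection, now: datetime) -> List[dict]:
    """Cron side: find holds that have expired and return 'ready to resume' events."""
    with conn:
        rows = conn.execute(
            "SELECT payment_id FROM hold_times "
            "WHERE released = 0 AND release_at <= ? ORDER BY release_at",
            (now.timestamp(),),
        ).fetchall()
        # Mark the rows so the next scheduled run does not release them again.
        conn.executemany("UPDATE hold_times SET released = 1 WHERE payment_id = ?", rows)
    return [{"type": "payment.ready_to_resume", "payment_id": pid} for (pid,) in rows]
```

In a setup like the one described above, the events returned by both functions would be written to the outbox rather than handed back to the caller; returning them here just keeps the example short.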
Putting it all together
The magic of Momentum comes together through a few key features. It relies on an event-based architecture to solve problems including resiliency, orchestration, and data insights. We use lightweight AWS services managed and deployed via reusable Terraform components to quickly and easily spin up event production in our microservices. Lastly, our domain-specific event consumption allows us to customize solutions for handling payment complexity for each microservice.