An 8-person platform team and a 22-person automation COE support a 14-million-customer bank. We keep it running by betting on orchestration, resilient automation, and case-based workflows where linear diagrams break down.
We have been serious about RPA since 2016 and serious about orchestration since 2024. Here is what that looks like today.
Our platform team rebuilt the bank's automation stack in 2023 around a single orchestration runtime. Different teams use different front doors into the same product.
The backbone. Business analysts model in a BPMN-style designer. Developers author the same runtime in a flow-based, code-near experience. Same engine. Same deployment pipeline. Same observability.
How we chose two front doors
For the workflows that cannot be drawn as diagrams. Commercial lending, claims, internal investigations. Stages hold structure, tasks adapt on the fly.
Commercial lending case study
Our 420 bots sit in the execution layer. Integrations prefer APIs and fall back to UI automation when upstream systems drift. Code-generating agents help our COE author and maintain bots at scale.
Why we fall back to UI
Printed, laminated, signed by everyone. In every Accrual platform team room. We update them about once every two years when someone has a better idea.
One orchestration runtime. Different authoring experiences for different audiences. No second product for developers. No second product for analysts.
Linear processes get BPMN. Everything else gets case management. Forcing chaos into a diagram is how you get a 47-step diagram that nobody maintains.
APIs are faster when they exist and are stable. UI automation is the safety net when vendors deprecate without warning. Every integration ships with both paths.
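To make the two-path idea concrete, here is a minimal sketch of an integration that tries its API path first and falls back to UI automation. It is an illustration only, not our production code: `Integration`, `ApiUnavailable`, and the two `fetch_balance_*` functions are hypothetical names, and the UI path is a stub standing in for a real bot.

```python
from dataclasses import dataclass
from typing import Callable, Generic, Tuple, TypeVar

T = TypeVar("T")


class ApiUnavailable(Exception):
    """Hypothetical error: the API path is missing, changed, or failing."""


@dataclass
class Integration(Generic[T]):
    name: str
    api_path: Callable[[], T]  # preferred path
    ui_path: Callable[[], T]   # safety net (UI automation)

    def run(self) -> Tuple[T, str]:
        """Return (result, path_used), trying the API before the UI."""
        try:
            return self.api_path(), "api"
        except ApiUnavailable:
            # Fall back only when the API itself is unusable. Any other
            # exception propagates so that real bugs are not hidden.
            return self.ui_path(), "ui"


def fetch_balance_via_api() -> int:
    # Simulates a vendor deprecating an endpoint without warning.
    raise ApiUnavailable("endpoint deprecated")


def fetch_balance_via_ui() -> int:
    # Stand-in for a value read from the screen by a UI bot.
    return 1250


balance_lookup = Integration("balance_lookup", fetch_balance_via_api, fetch_balance_via_ui)
result, path = balance_lookup.run()
print(result, path)  # prints: 1250 ui
```

Returning the path alongside the result is a deliberate choice in this sketch: it lets monitoring count how often the safety net is actually in use.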
Our code-generating agent for bot authoring is a pair programmer, not a magic button. Our engineers review every change. Together they ship faster than either alone.
Every bot ships with contract tests for its APIs and regression tests for its UI targets. The 420th test is as important as the 1st. We test Delegate-class changes end-to-end.
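A contract test can be as small as checking that an API response still has the fields and types a bot depends on. The sketch below assumes a hypothetical account payload; `ACCOUNT_CONTRACT`, the field names, and `contract_violations` are made up for illustration and do not describe any real upstream system.

```python
from typing import Any, Dict, List

# Hypothetical contract: field name -> expected Python type.
ACCOUNT_CONTRACT: Dict[str, type] = {
    "account_id": str,
    "balance_cents": int,
    "currency": str,
}


def contract_violations(payload: Dict[str, Any], contract: Dict[str, type]) -> List[str]:
    """List every way the payload breaks the contract (empty means it conforms)."""
    problems: List[str] = []
    for field_name, expected in contract.items():
        if field_name not in payload:
            problems.append(f"missing field: {field_name}")
        elif not isinstance(payload[field_name], expected):
            actual = type(payload[field_name]).__name__
            problems.append(
                f"wrong type for {field_name}: expected {expected.__name__}, got {actual}"
            )
    return problems


good = {"account_id": "A-1", "balance_cents": 1250, "currency": "USD"}
# Simulated drift: the upstream system renamed and retyped the balance field.
drifted = {"account_id": "A-1", "balance": "12.50", "currency": "USD"}

print(contract_violations(good, ACCOUNT_CONTRACT))     # prints: []
print(contract_violations(drifted, ACCOUNT_CONTRACT))  # prints: ['missing field: balance_cents']
```

Running a check like this in CI means upstream drift surfaces as a failing test before it surfaces as a failing bot.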
BPMN processes, flows, and bots all ship through the same CI. Same gates. Same rollback strategy. Same observability surface.
When work is inherently non-linear, give it stages and tasks. Commercial lending is a case. Claims is a case. Internal investigations are a case.
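One way to picture "stages and tasks" is a case whose stages are fixed and ordered while the tasks inside the active stage can change at any time. This is a toy model under our own assumptions, not the data model of any case management product; the stage and task names are invented.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Stage:
    name: str
    tasks: Dict[str, bool] = field(default_factory=dict)  # task name -> done?

    def is_complete(self) -> bool:
        return all(self.tasks.values())


@dataclass
class Case:
    stages: List[Stage]
    current: int = 0  # index of the active stage

    @property
    def stage_name(self) -> str:
        return self.stages[self.current].name

    def add_task(self, task: str) -> None:
        """Tasks adapt on the fly: add one to the active stage at any time."""
        self.stages[self.current].tasks[task] = False

    def complete_task(self, task: str) -> None:
        stage = self.stages[self.current]
        if task not in stage.tasks:
            raise KeyError(f"{task!r} is not a task in stage {stage.name!r}")
        stage.tasks[task] = True
        # Stages hold structure: advance only when every task in the stage
        # is done, and never past the final stage.
        if stage.is_complete() and self.current < len(self.stages) - 1:
            self.current += 1


loan = Case([
    Stage("intake", {"collect financials": False}),
    Stage("underwriting", {"credit analysis": False}),
    Stage("closing", {"sign documents": False}),
])

loan.add_task("request updated appraisal")  # ad hoc task, not in the template
loan.complete_task("collect financials")
print(loan.stage_name)  # prints: intake (the appraisal is still open)
loan.complete_task("request updated appraisal")
print(loan.stage_name)  # prints: underwriting
```

A diagram would need a branch for every possible ad hoc task. Here the stage order carries the structure and the task list absorbs the variation.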
If only one person can ship a change safely, that change should not ship. Our 8 platform engineers are interchangeable, on purpose.
The roadmap serves the business, not the other way around. When a business team needs something that is not on the roadmap and it is reasonable, we move the roadmap.
Reliability is the brief. Prestige is a nice-to-have. Our incidents are rare and our pages are boring. Both are on purpose. The moment a platform starts optimizing for interesting, it stops being trustworthy.
It did not happen overnight and we did not plan it this way. We just kept saying yes to the next thing and yes to writing down why.
Posts from our platform engineers and COE leads. No sponsored content. No thought leadership. Just working notes from people building this stuff.
How our bots adapt when upstream dependencies break, using a hybrid pattern that prefers APIs and falls back to UI automation.
Why we gave our developers a flow-based designer on the same orchestration runtime we use for BPMN. And how it changed our velocity.
Why we rebuilt commercial lending as a case rather than a diagram. Stages where stages help, flexibility where flexibility matters.
If you like working on small teams that ship things used by millions of people the next morning, get in touch.