Our platform team is eight engineers supporting a 14-million-customer bank. People ask how that is possible without everyone being on fire. The short answer is we say no well, we automate the boring things, and we refuse to be heroes. The longer answer is this post.
The first thing an eight-person team has to do is decide what it does not do. We maintain the orchestration runtime. We maintain the bot authoring platform. We maintain the observability surface. We maintain the deploy pipeline. We do not maintain individual bots, individual BPMN processes, or individual flows. Those are owned by the teams that build them.
This is harder than it sounds. Teams want us to own their thing because owning things is a chore. Our job is to give them the tools so that owning their thing is not a chore, and then decline to take it on when they ask.
The second thing is to push everything repeatable into tooling. If a platform engineer is manually doing something more than twice a quarter, we automate it. Every time.
Our deploy pipeline handles bot deploys, BPMN deploys, and flow deploys the same way. One command. Same rollback. Same gates. Platform engineers do not review deploys. The pipeline reviews deploys.
Our observability surface is the same across all three authoring experiences. Platform engineers do not teach teams how to find their logs. The teams find their logs.
Our onboarding tooling is thorough enough that a new bot author is productive within two weeks without sitting next to a platform engineer. This took us eighteen months to get right. It has paid back many times over.
The hero engineer is a bug, not a feature. If you need a specific person to ship a specific thing safely, that thing is not production-ready, regardless of how elegant the code is.
Any one of our eight platform engineers can be on call. Any one can ship a hotfix to any part of the platform. We do not have specialists. We have generalists who have picked up deep knowledge in two or three areas.
This is a deliberate choice and it costs us something. A specialist is faster at their specialty. A generalist is slower at any one thing but faster across the whole system. For a team of eight supporting a bank, we think generalists win. We might not make the same choice at team of thirty.
One-on-ones. Code review. Postmortems. Career conversations. Weekly platform reviews where we look at the last seven days of incidents together and discuss patterns. These are human rituals and we keep them human.
The reason is that these are the rituals that let us automate everything else. If we lose the postmortems, we lose the pattern recognition that tells us what to automate next. If we lose the weekly review, the team drifts. A team of eight is small enough that it can drift without anyone noticing, and it can drift in ways that take six months to unwind.
We hire for three things, in order: judgment, writing, and engineering. Judgment because eight people supporting a bank have to make calls constantly about what to prioritize, what to drop, and what to defer, and those calls compound. Writing because the team communicates asynchronously across three time zones and bad writing slows everyone down. Engineering because we are, in the end, an engineering team.
We do not hire for narrow specialty. We hire for people who can pick up a specialty in three months and teach it to the team in six.
We are also actively hiring. If that description sounds like you, see our open platform roles.
We have not grown headcount, but we have grown throughput. Our coding agent authors new bots, drafts integration fixes, and generates test scaffolding. It does not replace the judgment that eight platform engineers bring. It removes the boilerplate that slowed them down.
In practice this means: when an engineer picks up a ticket to fix a broken integration, the agent has already staged a pull request with the proposed change and test fixtures from the last successful run. The engineer reads it, adjusts what needs adjusting, and merges. We have measured the cycle time improvement at three to four times faster, with no degradation in fix quality. That is the equivalent of adding several person-days of capacity per week without adding headcount.
We formalized this in our eleventh platform principle, added in April 2026: "The agent goes first." Before a human investigates an incident or authors a bot change, the agent is already drafted. Humans review. The agent does not decide. But starting from a draft changes the economics entirely.
Growing fast. We have grown from six to eight engineers in three years. That is on purpose. If we doubled the team next year, we would spend twelve months figuring out how to work at that size before we got anything else done. Small teams compound slowly. We have made peace with that.
Long-term strategy documents. Our planning horizon is one quarter ahead. We do rough sketches of the next two quarters. Beyond that, we assume we will be wrong and we focus on making the current quarter solid.
Being prestigious. Our team is not famous. Our technology is not featured in industry awards. None of our engineers are conference keynote speakers. This is fine. The bank runs. Our incidents are rare. Our engineers are happy. Prestige is a nice-to-have for a platform team. Reliability is the brief.
The full platform overview, including the principles that govern everything this team builds.
How we gave our developers a flow-based authoring experience on the same runtime we use for BPMN. And why that distinction matters.
The hybrid API/UI pattern that kept HR onboarding running through a 03:17 deprecation. How the coding agent helps when it happens.