Post-Merger Tech Integration: Ten Systems, Nine Months, Zero Downtime

The deal closed on a slide. Two companies become one, the synergies are circled in green, and someone in finance has already counted the savings from running half as many systems. Then the slide lands on engineering, and you discover that “half as many systems” means picking, for each of ten overlapping pairs, which team’s life work gets switched off.

That is the job. Two large consumer platforms had merged, and the day after, we owned two of everything. Two payment stacks. Two identity systems. Two order pipelines, two notification services, two catalogs, two of every internal tool that two sets of competent engineers had spent years building because each company needed exactly that thing. None of it was broken. That was the problem.

The customer was supposed to feel nothing. Not a forced password reset, not a re-link of a payment method, not an order that vanished mid-flight. Zero downtime was not a stretch goal somebody added to look ambitious. It was the floor, because both platforms were live at the kind of scale where a bad hour shows up in the national press.

You do not migrate ten systems. You sequence them.

The first instinct is to rank the pairs by how much money each consolidation saves and start with the biggest. That instinct is wrong, and it took one tense planning week to talk the room out of it.

Systems do not stand alone. Identity sits underneath everything. Payments depends on identity. Orders depend on payments and identity. The notification service hangs off orders. If you collapse payments before you have collapsed identity, you are reconciling two user tables in the middle of the money path, which is exactly where you least want a surprise. So the sequence follows the dependency graph from the bottom up, not the savings column top down. Boring foundations first. The systems with the fattest synergy number went late, on purpose, because they sat highest in the stack and needed stable ground under them.

We picked identity to go first. Lowest glamour, highest leverage. Once both platforms read from one source of truth for who a user is, every consolidation after it got simpler, because “which account is this” stopped being a question you had to ask twice.

The strangler, because the alternative is a war room and a prayer

There is a version of this where you build the unified system, pick a Saturday at 2am, cut everything over, and keep a bridge line open with forty people on it. I have lived that version. It works until the one time it does not, and the one time it does not is the time you remember.

So we did not cut over. We strangled.

For each pair, we stood a facade in front of both old systems: one interface, the two legacy implementations hidden behind it. Every caller in the company moved to the facade while the facade still routed to whichever original system already owned that traffic. Nobody downstream knew anything had changed, because from where they sat, nothing had. Then, behind the facade, we built the one consolidated system, and we moved traffic onto it a slice at a time. A percentage of users. Then a region. Then everything. The old systems kept serving until they served nothing, and only then did we turn them off.

Two platform stacks, A and B, each with its own identity, payments, and orders services, converging through a facade layer into one unified stack, with old systems strangled slice by slice

Two stacks become one through a facade per pair. The facade routes to the old systems first, then bleeds traffic onto the unified one a slice at a time, until the legacy boxes serve nothing and get switched off.

The facade is what buys you zero downtime, because the riskiest moment, the actual switch, becomes a routing change you can make for one percent of traffic and undo in seconds if the dashboards turn red. No Saturday. No bridge line. Just a dial you turn slowly with your hand on the rollback.

Which one wins, and how you actually decide

Here is the part nobody tells you. The decision of which system survives is almost never made on technical merit, and pretending otherwise is how you lose the room.

We tried, early, to run it as a clean technical bake-off. Score each pair on scale headroom, code health, operational maturity, security posture, total cost to run. Tidy. Defensible. And it produced answers that were politically radioactive, because in three of the ten pairs the technically stronger system belonged to the smaller company, the one that had just been acquired, and choosing it meant telling the larger, older engineering org that their thing was the one getting killed.

So we kept the scorecard but added two columns the bake-off had ignored. Migration cost: how painful is it to move the losing system’s data and traffic onto this winner. And blast radius: if this winner has a bad day, how much of the merged company is down. A system that scored slightly lower on code health but sat on the data that was ten times harder to migrate often won anyway, because the cheapest path to one system mattered more than which codebase was prettier. We wrote the criteria down, weighted them in the open, and let the scores fall where they fell. Not because it removed the politics. Because when the answer is unpopular, “here is the rubric we all agreed to in March” is the only thing that holds.

In the end the split was roughly even. The acquirer’s systems won about half, the acquired company’s won the other half, and that even split did more for morale than any all-hands speech, because nobody could tell a story where one side had simply eaten the other.

Data reconciliation is where the bodies are buried

Picking the winner is the easy decision. Moving the loser’s data onto it is the months of work nobody puts on a slide.

The same user existed in both systems with different IDs, different normalization, and the occasional contradiction: a phone number that two systems disagreed on, an address one had updated and the other had not, a wallet balance that had to match to the cent or someone was getting robbed. You cannot just union two tables and hope. We built a reconciliation pass for every pair that ran in shadow for weeks before any cutover: pull both sources, apply the merge rules, and diff the result against what was already live, then surface every mismatch to a human who decided the rule for that class of conflict. The merge logic got smarter every week as the diff count fell. We did not turn a dial on real traffic until the shadow diff for that slice was boringly small and every remaining mismatch was understood.

For money, the bar was higher. Balances and ledgers got reconciled to the cent, both directions, and any system that could not prove its numbers matched did not get traffic. That is not paranoia. That is the difference between a migration and an incident report.

The near-miss

We almost lost the wallet.

Payments was one of the late, high-stakes consolidations, and we were bleeding a region onto the unified system, maybe a tenth of that region’s traffic. The reconciliation had been clean for weeks. Then a small but growing set of users started seeing a wallet balance that was wrong by a few cents, sometimes a few dollars. Not catastrophic on any one account. Catastrophic as a pattern, because a payment platform that cannot agree with itself on how much money you have is a payment platform nobody should use.

The cause was the kind of thing no bake-off scorecard would have caught. The two old systems rounded fractional currency at different steps in the calculation, one before applying a promotional credit and one after, and for years each had been internally consistent so nobody had ever noticed. The reconciliation had matched the stored balances perfectly, because the stored balances were correct. The drift only appeared on new transactions computed by the unified system using one rounding convention against ledger history written under the other.

The facade saved us. Because the cutover was a routing dial and not a Saturday cliff, we turned the dial back to zero for payments in that region inside a few minutes, and the affected users were served by their original system again before most of them noticed. The wrong balances stopped accruing. Then we spent two weeks doing the boring, correct thing: making the rounding convention explicit, replaying the affected transactions under it, reconciling every touched account to the cent, and adding the rounding case to the shadow diff so it could never sneak through again. Then we turned the dial back up, slowly.

If that had been a big-bang cutover, the same bug would have hit a whole region at once with no quick way back, and “a few cents wrong for a handful of users” would have been “the wallet is broken” on the front page.

What I would tell the next person doing this

Sequence by dependencies, not by savings. The system with the biggest synergy number is usually the one that needs the most stable ground beneath it, so it goes last.

Put the migration cost and the blast radius into the decision rubric, in writing, before anyone knows which system each column favors. The rubric is not there to find the right answer. It is there to survive the wrong-feeling one.

And remember that you are not switching off ten systems. You are telling ten teams, half of whom just got acquired and are already wondering if they still have a job, that the thing they built with their hands is the thing being turned off. The strangler pattern was the easy half. Doing it so that both sides could look at the final architecture and see their own work in it, that was the part that actually shipped the merger.