The Infrastructure Mistakes Almost Everyone Makes First Time Round

A practical look at the infrastructure mistakes most organisations make as they modernise, and why they’re a natural result of growth, not failure.

In my experience, infrastructure issues don’t really come from bad decisions. Oddly, they come from completely reasonable decisions. Just… made at speed, with the information that was available at the time.

FormusPro usually only get involved after things have started to feel harder than they should.

Not broken. Not failing exactly. Just heavier.

Changes take longer. Fixes carry more risk. Teams spend more time keeping things steady than moving forward.

By that point, people are often assuming something has gone badly wrong.

In reality, what’s usually happened is far more ordinary. Your infrastructure has grown, adapted, and absorbed pressure without ever being given the space to reset.

I’m not here to call anyone out for completely reasonable mistakes everyone makes.

What I’m going to try to do is show you the patterns I see again and again as organisations modernise their infrastructure across hybrid and cloud environments.

If some of these feel familiar, try not to worry: that’s not a red flag.

It’s a very normal place to be.

Infrastructure Rarely Breaks All at Once

Instead, it usually drifts.

Infrastructure rarely fails in one single dramatic moment (and when it does, you hear about it on the news).

There’s almost never a single decision you can point to and say, “that was it”. What happens instead is far more subtle and less identifiable.

Things tend to slowly move away from the state they were originally designed for.

A workaround becomes permanent. A temporary exception never gets revisited. A system that was meant to be simple starts carrying more responsibility than anyone expected.

None of this feels risky at the time. In fact, most of it feels sensible.

Sensible Decisions Made Under Time Pressure

When organisations modernise, speed matters. Projects have deadlines. Teams are small. The business needs progress now, not in an ideal future state.

So infrastructure evolves to meet immediate needs. Access gets added so work can continue. Environments are duplicated to reduce friction. Changes are made directly because there isn’t time to re-engineer things properly.

Individually, these decisions are rarely wrong.

They solve the problem in front of you. The issue is that they’re made in isolation, without much opportunity to step back and look at how the overall estate is changing.

Over time, that gap adds up.

The Slow Accumulation No One Plans For

The drift we’re talking about isn’t going to be obvious.

If you know what you’re looking for, you’ll start to see people relying on knowledge that lives in other people’s heads rather than in documentation. Certain systems become ‘sensitive’, so fewer people are willing to touch them. Changes that used to feel routine now need careful co-ordination.

From the outside, everything will still look fine.

Your services are running. Users aren’t complaining. But underneath?

Your infrastructure is becoming harder and harder to reason about, and harder still to change safely.

This is usually the point where your teams feel the pressure, even if they can’t quite name the cause. Nothing is broken, but nothing feels simple anymore either.

And that’s when organisations start to realise that infrastructure doesn’t just need to work. It needs to stay understandable as it grows.

Ownership Gets Blurry As Systems Multiply

As infrastructure grows, responsibility for it rarely keeps pace.

Early on, ownership is obvious. Systems are few. Changes are visible. The people who built things are often the same people running them. Questions get answered quickly because everyone knows where to look.

But as hybrid and cloud environments expand, that clarity starts to fade. Not because anyone lets go deliberately, but because the number of moving parts increases faster than roles and expectations evolve.

When Everything Works, No One’s Responsible

When things are running smoothly, that lack of clarity stays hidden.

Teams assume someone else is looking after a platform. Things work because they’ve always worked. Access remains open because it hasn’t caused a problem yet. Responsibilities blur across infrastructure, security, applications and operations, with just enough overlap to keep things moving.

This usually feels efficient. Fewer blockers. Less friction. More autonomy.

But the downside is subtle. Decisions start happening without a clear sense of who owns the long-term impact. Changes get made because they can be made, not because someone is explicitly accountable for how they affect the wider estate.

When Something Fails, Everyone Fails

The moment something does go wrong, that very same flexibility becomes the main problem.

Incidents turn into group conversations rather than clear escalation paths. People hesitate before making changes because they’re not sure who should approve them. Fixes take longer, not because they’re complex, but because responsibility has to be worked out first.

This is often when I hear some of my favourite quotes, such as… ‘It’s always worked like this,’ or ‘We’re not sure who owns that anymore.’

Not as excuses… as literal statements of fact.

Ownership hasn’t disappeared. It’s just been spread so thin that no one feels fully confident stepping in.

Why Hybrid Estates Make This Worse, Not Better

Hybrid infrastructure is especially good at hiding ownership gaps.

Different parts of the estate evolve at different speeds. Cloud platforms move quickly. Legacy components change more slowly. Third-party services sit somewhere in between. Each layer often comes with its own tooling, permissions, and assumptions about who’s responsible.

Over time, this creates invisible seams. Things still connect. Data still flows. But accountability doesn’t always follow the same path.

Without conscious effort, ownership ends up defined by history rather than intent. Who built it. Who last touched it. Who ‘usually’ gets called when something breaks.

That works for a while. Then one day it just doesn’t.

Visibility Fades Long Before Risk Appears

One of the trickiest things I’ve found about modern infrastructure is that risk rarely shows itself early.

Most teams have some level of monitoring in place. Systems are up. Dashboards are green. Alerts fire when something stops responding. On the surface, that looks like visibility.

But operational visibility and structural visibility aren’t the same thing.

You can know when something is down without really understanding how close it is to failing, or how exposed it’s becoming over time.

Monitoring Tells You What’s Broken, Not What’s Fragile

Monitoring is great at spotting symptoms.

CPU spikes. Storage fills up. Services stop responding. These are clear, measurable events, and most teams get better at detecting them as they modernise.

What’s much harder to see are the conditions that make those events more likely.

Configuration drift. Hidden dependencies. Identity sprawl. Assumptions baked into scripts or pipelines that no one actively checks anymore. None of these trigger alarms on their own, but they quietly reduce your margin for error.

By the time monitoring does alert you, the risk has already been there for a while.
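One way to surface that kind of quiet risk is to treat expected configuration as data and regularly compare it against what’s actually running. This is a minimal sketch of that idea; the setting names and values are hypothetical, and in practice the “live” state would come from whatever tooling your estate already exposes:

```python
# A minimal sketch of configuration drift detection: compare a recorded
# baseline of settings against the live values and report anything that
# has moved. The baseline/live dictionaries here are illustrative only.

def find_drift(baseline: dict, live: dict) -> dict:
    """Return {setting: (expected, actual)} for every value that differs."""
    drift = {}
    for key, expected in baseline.items():
        actual = live.get(key)
        if actual != expected:
            drift[key] = (expected, actual)
    # Settings present in live but never baselined count as drift too.
    for key in live.keys() - baseline.keys():
        drift[key] = (None, live[key])
    return drift

baseline = {"tls_min_version": "1.2", "backup_retention_days": 30}
live = {"tls_min_version": "1.2", "backup_retention_days": 14, "debug_mode": True}

for setting, (expected, actual) in find_drift(baseline, live).items():
    print(f"{setting}: expected {expected!r}, found {actual!r}")
```

The point isn’t the script itself, but the habit: once the expected state is written down somewhere checkable, drift stops being invisible.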

When Reports Describe Yesterday, Not Today

As estates grow, reporting often becomes a compromise.

Data is pulled from different tools. Snapshots are taken at different times. Some information is real-time, some is already out of date by the time it’s reviewed. Individually, those reports are useful. Together, they don’t always tell a coherent story.

I’ve found this is especially common in hybrid environments, where on-premises assumptions, cloud-native tooling, and third-party services all describe the world slightly differently.

The result is a picture that looks complete, but isn’t quite current.

Teams think they understand their infrastructure because they can see something. In reality, what they’re seeing is often a version of the estate that no longer fully exists.

 

The Risk Was Always There When It Finally Shows Up

That’s why incidents can feel sudden, even when nothing obvious has changed.

From the team’s point of view, the failure comes out of nowhere. From the infrastructure’s point of view, it’s been building quietly for months.

Small changes stack up. Assumptions age. Dependencies tighten. The system still works, but it’s less forgiving than it used to be.

That’s usually when visibility becomes a priority. Not because anyone ignored it, but because it’s hard to justify deeper insight when everything appears stable.

Until it isn’t.

Cloud Platforms Will Solve This For Me… Right?

One of the reasons cloud adoption accelerates so quickly is that it genuinely helps at the start.

Provisioning is faster. Scaling feels simpler. Teams can move without waiting for hardware, lengthy approvals, or long lead times. Compared to what came before, it often feels like friction has finally been removed.

And for a while, that’s true.

The challenge is that cloud doesn’t remove complexity. It changes where it lives.

 

Early Wins Hide Later Complexity

In the early stages, cloud platforms absorb a lot of decisions for you.

Defaults are sensible. Services work well together. You can get something live quickly without fully understanding everything under the surface. That’s a feature, not a flaw. It’s what allows teams to make progress.

But those early wins can hide the shape of what’s forming underneath.

As more services are added, patterns start to diverge. Naming conventions drift. Environments are set up slightly differently depending on who built them and when. Decisions that made sense for one workload quietly become precedent for others.

None of this feels problematic day to day. It only becomes visible when you try to change something that crosses those boundaries.
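Naming drift in particular is cheap to catch early if there’s one agreed pattern to check against. Here’s a hedged sketch of that idea; the convention (environment-team-service) and the resource names are made up for illustration:

```python
# A small sketch of catching naming-convention drift: check resource names
# against one agreed pattern. The convention and names are hypothetical.
import re

# Assumed convention: <env>-<team>-<service>, all lowercase.
CONVENTION = re.compile(r"^(dev|test|prod)-[a-z0-9]+-[a-z0-9]+$")

resources = ["prod-payments-api", "Test_Payments_DB", "dev-hr-portal", "payments2"]

nonconforming = [name for name in resources if not CONVENTION.match(name)]
print(nonconforming)  # ['Test_Payments_DB', 'payments2']
```

Run as part of provisioning or review, a check like this turns “decisions quietly becoming precedent” into something visible before it spreads.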

 

The Moment Cost, Security, And Reliability Collide

Eventually, different concerns start pulling in different directions.

Security teams want tighter controls. Finance wants clearer cost visibility. Engineering teams want speed and autonomy. All of those are reasonable. All of them matter.

The tension appears when the platform hasn’t been shaped to balance them.

Suddenly, a change that should be simple requires multiple conversations. A cost-saving measure introduces operational risk. A security improvement adds friction in places it wasn’t expected.



From the outside, it can feel like cloud has stopped delivering on its promise. In reality, the promise hasn’t changed. But the context within your organisation has.

Cloud platforms work best when their foundations are revisited as they grow.

When that doesn’t happen, the friction doesn’t disappear. It just re-emerges somewhere else.

Security Gaps Are Often Structural, Not Negligent

When security issues do surface, the assumption is often that something’s been missed.

A control wasn’t applied. A process wasn’t followed. Someone didn’t do what they were supposed to do.

In my experience, that’s rarely, if ever, what happened.

Most of the security gaps I see in my work at FormusPro aren’t caused by carelessness. They’re caused by infrastructure that’s grown in ways security models didn’t quite keep up with.

 

Identity Sprawl Isn’t A People Problem

As environments expand, access grows with them.

New services need permissions. New teams need autonomy. Temporary access becomes semi-permanent because removing it feels riskier than leaving it in place. Over time, identities multiply, and so do the ways they’re used.

No one is deliberately creating risk. They’re enabling work to continue.

The challenge is that identity often becomes the connective tissue across hybrid and cloud environments. When that sprawl isn’t actively shaped, it becomes difficult to answer basic questions with confidence. Who has access to what? Why do they have it? What would break if it changed?

Those aren’t failures of intent. They’re symptoms of growth.
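Those three questions only stay answerable if access is recorded somewhere queryable. This is a deliberately simple sketch of that kind of access review; the grant records, identities, and resources are all hypothetical stand-ins for whatever export your identity tooling can actually produce:

```python
# A hedged sketch of the access review that becomes hard at scale:
# given a (hypothetical) export of grants, answer "who has access to
# what?" and "which temporary grants were never removed?"
from datetime import date

grants = [  # illustrative inventory rows
    {"identity": "alice", "resource": "billing-db", "expires": date(2024, 1, 31)},
    {"identity": "ci-bot", "resource": "prod-deploy", "expires": None},  # permanent
    {"identity": "bob", "resource": "billing-db", "expires": date(2026, 12, 31)},
]

def who_can_access(grants, resource):
    """All identities holding a grant on the given resource."""
    return sorted(g["identity"] for g in grants if g["resource"] == resource)

def stale_grants(grants, today):
    """Temporary grants whose expiry date has already passed."""
    return [g for g in grants if g["expires"] and g["expires"] < today]

print(who_can_access(grants, "billing-db"))  # ['alice', 'bob']
print([g["identity"] for g in stale_grants(grants, date(2025, 6, 1))])  # ['alice']
```

None of this requires sophisticated tooling. What it requires is that grants carry an owner and an expiry from day one, so “removing it” stops feeling riskier than leaving it in place.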

 

Security Tools Don’t Fix Unclear Foundations

Modern security tooling is powerful. There’s no shortage of platforms promising visibility, control, and reassurance.

But tools sit on top of the infrastructure you already have.

If environments are inconsistent, ownership is blurred, or access models evolved organically, security tools end up reflecting that complexity rather than resolving it. Alerts increase. Policies grow more detailed. Exceptions multiply. No one wants that.

At this point, teams can feel like they’re doing more security work than ever, without feeling any more confident.

The issue isn’t that the tools are wrong. It’s that they’re being asked to compensate for foundations that were never designed to scale that way.

 

When Security Becomes Reactive Instead Of Supportive

This is usually the turning point, and the bit where FormusPro is asked to get involved.

Security starts showing up as friction. Changes slow down. Teams work around controls rather than with them. Conversations become defensive, even when everyone is trying to do the right thing.

These are all signs that security has been forced into a corrective role, instead of being something the infrastructure naturally supports.

Strong security doesn’t come from locking things down harder. It comes from clarity. Clear ownership. Clear identity models. Clear understanding of how the estate actually behaves day to day.

Without that, gaps don’t just appear. They hang around until dealt with.

Documentation Is Always ‘Coming Later’

I can’t tell you how often I’ve heard “the documentation is coming later”. And look… no teams set out with the explicit goal of neglecting their documentation.

At the very start, it always feels unnecessary. The systems are new and shiny. The people who built them are right there. Questions get answered quickly and informally. Writing things down feels like something that can wait.

And for a while, it can.

 

Why Documentation Never Quite Catches Up

As your infrastructure estate grows though, the pace of change is going to outstrip your habit of recording it.

Projects finish. Priorities shift. Teams move on to the next thing. Documentation becomes something that’s meant to happen after delivery, but ‘after’ never quite arrives.

Even when documentation does exist, it often reflects how things were at a point in time, not how they behave now. Small changes go undocumented. Exceptions aren’t captured. Context lives in conversations rather than systems.

And over time, the gap widens.

 

When Knowledge Becomes A Bottleneck

This is where the impact starts to show.

Certain people become the unofficial owners of specific parts of the estate. Not because they’re the only ones capable, but because they’re the only ones who know how things actually fit together.

That will work, right up until it doesn’t.

Do any of these feel familiar? Handovers become risky. Changes get delayed because the ‘right person’ isn’t available. Teams avoid touching areas they don’t fully understand, even when those areas are critical.

If you’ve reached this point, the issue isn’t missing documentation. It’s that understanding has become fragile.

 

Why This Usually Goes Unnoticed For So Long

From the outside, everything still looks fine.

Systems run. Incidents get resolved. Work continues. There’s no obvious signal that knowledge is concentrated or incomplete.

It’s only when organisations try to scale, modernise further, or change how teams operate that the absence of shared understanding becomes painful.

Documentation was never the goal in itself. Shared clarity was. And without it, infrastructure becomes harder to change with confidence.

Your Teams Know What Needs Fixing, They Just Don’t Have Time

By the time organisations reach this stage, very little of what I’ve talked about is a surprise.

Teams usually have a good sense of where the friction is. They know which systems feel fragile. They know which processes slow things down. They know which parts of the estate they’d rather not touch unless they absolutely have to.

The challenge isn’t awareness. It’s timing.

Business Momentum Versus Technical Reality

Infrastructure rarely gets space to pause.

There’s always another project waiting. Another delivery milestone. Another change that feels more urgent than revisiting foundations. From a business point of view, that makes complete sense. Stopping to tidy up infrastructure can feel like slowing progress.

So technical debt gets deferred, again and again, not because it’s ignored, but because it’s competing with visible outcomes.

The result is an estate that continues to function, but with less room for error each time something changes.

 

Why Knowing Isn’t The Same As Being Ready

Even when teams want to address the issues, readiness isn’t guaranteed.

Fixing structural problems often cuts across ownership boundaries. It requires time from people who are busy on something else. It carries short-term risk in exchange for longer-term stability. That’s a hard trade-off to make when everything is ‘working well enough’.

So knowledge sits there, unacted on.

People adapt instead. They work around fragile areas. They add process where clarity is missing. They rely on experience rather than fixing the underlying shape of the estate.

That’s not avoidance. It’s pragmatism.

 

When The Right Moment Finally Arrives

The good news is that what usually triggers change isn’t a single, catastrophic incident.

It’s a gradual sense that things are taking more effort than they should. That delivery is slowing. That risk is becoming harder to explain away. That growth is starting to feel constrained rather than enabled.

That’s often when teams decide it’s time to stop carrying the weight and deal with it properly.

Not because they didn’t know before.

But because now, the timing finally makes sense.

Final Thoughts

If there’s a common thread running through all of this, it’s that none of these situations come from poor intent.

They come from growth. From momentum. From teams doing their best to keep things moving whilst the ground underneath them keeps shifting. And that’s a really good thing. But…

Modern infrastructure is expected to be flexible, resilient, and secure, all at the same time.

It’s also expected to support change without slowing the business down. That’s a difficult balance to strike, especially when environments evolve faster than the structures around them.

What matters isn’t whether these patterns appear. For most organisations, they will. What matters is recognising them early enough to deal with them deliberately.

Because infrastructure doesn’t just need to work today. It needs to stay understandable as it grows.

And that’s a very different challenge.

Ready For More?

Spreadsheets Are Holding Back Manufacturing

Why Spreadsheets Are Still Holding Back UK Manufacturing

UK manufacturers still rely on spreadsheets, but the inefficiency, data silos and human error they cause are stunting growth. Discover how Microsoft Dynamics 365 and FormusPro help build smarter, connected operations.

Infrastructure As Code

What Is IaC (Infrastructure-as-Code)?

Infrastructure as Code (IaC) explains how modern organisations design, manage and rebuild infrastructure consistently, reducing risk and supporting secure, scalable operations.

HR Systems

Your Systems All Talk To Each Other. So Why Doesn’t HR?

Why HR is often left behind during digital transformation, and why leadership teams are rethinking how people data fits into joined-up systems.

Speak To An Expert

To find out how we create systems around the Microsoft D365 platform, or to ask us about the specific industry-focused digital management systems we create, get in touch. Tel: 01432 345191. A quick call might be all you need, but just in case it isn’t, we’re happy to go a step further by popping by to see you. We serve clients throughout the UK and beyond. Just ask.