OpenShift Platform Lead

FormusPro offers awesome career opportunities in software engineering, data science and much, much more.

United Kingdom

Hybrid

Full Time

Infrastructure

We are seeking an experienced OpenShift Platform Lead to own and manage our OpenShift-based virtualization platform that delivers enterprise VM hosting services. This role is responsible for the complete lifecycle management of the platform, including design, architecture, BAU operations, patching, upgrades, incident response, and driving platform stability.

You will lead the implementation, work closely with SRE and operations teams, and enable seamless VM migration from legacy infrastructure. This is a hands-on technical leadership role requiring deep OpenShift expertise and the ability to balance operational excellence with strategic platform evolution.

This is a full-time, hybrid role.

To be successful in the role, you will need to be a good team player with excellent communication skills, have the ability to manage your own workload, and work well on your own initiative and under direction.

Key Responsibilities

Platform Leadership & Strategy

Own the technical strategy and roadmap for the OpenShift Virtualization platform
Define platform architecture, design patterns, and technical standards
Lead platform lifecycle management including major/minor upgrades and Red Hat CoreOS updates
Drive platform stability improvements and performance optimization initiatives
Establish platform governance, compliance, and security policies
Build relationships with Red Hat support and leverage Technical Account Management (TAM)

Lifecycle & Operations Management

Manage complete platform lifecycle from installation through upgrades to decommissioning
Plan and execute OpenShift platform upgrades (4.x releases) with zero/minimal downtime
Coordinate quarterly/monthly Red Hat Enterprise Linux CoreOS (RHCOS) patching cycles
Oversee OpenShift Virtualization operator upgrades and feature enablement
Maintain platform health through proactive monitoring and capacity planning
Ensure platform meets defined SLAs and availability targets (99.9%+)

Incident & Event Management

Lead Major Incident response for platform-level issues (Sev 1/2)
Perform root cause analysis (RCA) and implement preventive measures
Collaborate with SRE team on incident postmortems and improvement plans
Manage platform-related events including maintenance windows
Coordinate emergency changes and rollback procedures
Participate in on-call rotation for critical platform escalations

Change Implementation & Release Management

Review and approve platform changes through Change Advisory Board (CAB)
Plan and execute complex platform changes with risk assessment
Implement infrastructure-as-code (IaC) practices using Ansible and Terraform
Drive GitOps adoption for platform configuration management
Coordinate release windows for platform updates with business stakeholders
Ensure change documentation and runbook accuracy

VM Migration & Workload Onboarding

Lead VM migration strategy from VMware/legacy platforms to OpenShift Virtualization
Design VM migration runbooks and automation workflows
Create and maintain VM templates, golden images, and standardized configurations
Enable application teams for self-service VM provisioning
Troubleshoot VM performance, networking, and storage issues
Optimize VM placement, resource allocation, and cluster balancing

Platform Stability & Performance

Define and monitor key performance indicators (KPIs) for platform health
Implement chaos engineering practices to validate platform resilience
Tune OpenShift control plane and worker node performance
Optimize storage performance (ODF/Ceph) for VM workloads
Configure network policies and OVN-Kubernetes for optimal VM networking
Drive continuous improvement initiatives based on operational metrics

Knowledge, Skills and Experience

Must-Have Skills & Experience

Experience Requirements:

8-12 years of overall IT infrastructure experience
5+ years of hands-on experience with Red Hat OpenShift Container Platform (4.x)
3+ years of experience with OpenShift Virtualization (KubeVirt) or similar VM hosting platforms
3+ years of experience in platform/infrastructure leadership roles
2+ years of experience with Red Hat Enterprise Linux (RHEL 7/8/9) and Red Hat Enterprise Linux CoreOS (RHCOS)

Technical Skills:

Expert-level OpenShift administration (oc CLI, Web Console, API)
Advanced OpenShift Virtualization knowledge (VMs, DataVolumes, CDI, live migration)
Advanced Red Hat CoreOS and Machine Config Operator (MCO) experience
Advanced Linux administration and troubleshooting (RHEL-based)
Advanced storage management (ODF/Ceph, Storage Classes, PV/PVC, CSI drivers)
Advanced networking (OVN-Kubernetes, Multus, Network Policies, SDN concepts)
Advanced automation skills (Ansible, Bash scripting, Python)
Intermediate Kubernetes concepts (Operators, Custom Resources, Pod lifecycle)
Intermediate Infrastructure-as-Code (Terraform, GitOps tools like ArgoCD/Flux)
Intermediate observability platforms (Prometheus, Grafana, Alertmanager)

Platform Operations:

Proven experience managing platform lifecycle (installation, upgrades, patching)
Strong incident management and major incident response experience
Experience with change management processes and release coordination
Demonstrated ability to perform root cause analysis and implement preventive measures
Experience with capacity planning and performance tuning
Track record of driving platform stability improvements

Certifications Required (one or more):

Red Hat Certified Engineer (RHCE)
Red Hat Certified Specialist in OpenShift Administration

OR equivalent demonstrable experience

Desirable Skills & Experience

Highly Desirable:

Red Hat Certified Architect (RHCA) certification
Red Hat Certified Specialist in OpenShift Virtualization
Experience with Red Hat Advanced Cluster Management (RHACM)
Experience with Red Hat Advanced Cluster Security (RHACS/StackRox)
GitOps expertise (ArgoCD, Flux, Tekton)
Chaos engineering experience (Litmus, Chaos Mesh)
Experience with OpenShift on multiple infrastructures (bare metal, VMware, AWS, Azure)

Nice to Have:

Certified Kubernetes Administrator (CKA) or Certified Kubernetes Security Specialist (CKS)
Experience with multi-tenancy and namespace isolation strategies
Knowledge of compliance frameworks (PCI DSS, HIPAA, SOC 2, ISO 27001)
Experience with backup solutions (Kasten K10, Veeam, Commvault)
Programming skills in Go, Python, or Java
Experience with hybrid/multi-cloud architectures
ITIL v4 Foundation certification

Key Success Metrics

Platform availability: 99.9%+ uptime
Successful upgrade completion rate: 100% with zero unplanned rollbacks
Incident MTTR: < 2 hours for Sev 1/2 incidents
VM migration velocity: meet the target number of VMs per month, with issues on <5% of migrations
Platform capacity utilization: 70-80% optimal range
Change success rate: >98% first-time success

Work Environment

Some evening/weekend work required for maintenance windows
Availability 24x7 during major incidents

Ready To Join The Team?

Our amazing team is growing rapidly and we are always looking for talented individuals.

Company Benefits

In addition to the competitive salary, you will also be entitled to the following benefits:

Time allocated for personal development and training
Pension contributions
Death in Service Cover
Income Protection
20 days holiday per annum, plus 5 lifestyle days and the option to buy additional days
Access to wellbeing and legal assistance 24 hours a day, 7 days a week
Access to a range of discounts, including supermarket savings, discounted days out, your daily coffee and summer holidays; there’s something to suit everyone’s lifestyle
Company social events throughout the year
Access to an electric vehicle scheme
Access to private medical insurance

Why join FormusPro

Empower. Impact. Change.

That’s our motto, and we mean it. Our focus is helping our clients solve problems with Microsoft technology that enables them to achieve their goals. We’re a technical bunch (and proud of it). That combination of how we partner with our clients and our technical DNA has created an environment that allows our team to thrive.