Recruit Myself

Create Digital Solutions

Senior SRE / DevOps Engineer (AI Platforms & Multi-Project Infrastructure)

London | Full-time | Senior | Competitive | May 7, 2026

Job Description

Hybrid remote, 1 day per week in Victoria, London

Role Overview

We are looking for a senior SRE / DevOps practitioner to design, standardise, and operate cloud platforms that support multiple AI-driven products and services.

This role focuses on building opinionated, reusable infrastructure patterns that enable teams to rapidly deliver AI workloads while maintaining high standards for reliability, security, and cost control.

You will develop platform architecture across multiple concurrent projects, ensuring consistency in how services are deployed, integrated, and operated. This includes shaping how AI/ML workloads are built, deployed, and monitored, as well as defining clear patterns for service communication, API exposure, and infrastructure provisioning.

This is a hands-on role for someone who is comfortable making strong architectural decisions, reducing variability across teams, and balancing flexibility with standardisation in a fast-moving environment.

Key Responsibilities

Platform Architecture & Standardisation

  • Define and implement opinionated architecture patterns for cloud-native and AI-enabled services on AWS

  • Establish reusable blueprints for these services

  • Drive consistency across multiple projects through shared modules, templates, and platform tooling

Infrastructure as Code & Automation

  • Build and maintain Terraform-based infrastructure, using modular and reusable design

  • Define CI/CD patterns for:

      • Infrastructure deployment

      • Application and model delivery

  • Enforce best practices through pipelines and automation rather than documentation

Reliability, Observability & Operations

  • Embed SRE principles across all services:

      • Monitoring, logging, tracing

      • SLIs/SLOs and alerting

  • Continuously improve reliability, performance, and cost efficiency

  • Operate API gateway/data plane technologies (e.g. Kong)

Required Skills & Experience

  • Strong experience operating AWS-based platforms in production

  • Proven experience with Terraform, including module design and CI/CD integration

  • Hands-on experience with container platforms (ECS preferred; EKS acceptable if adaptable)

  • Experience operating API gateways (Kong or equivalent)

  • Solid understanding of cloud networking and service discovery patterns

  • Experience supporting multiple teams or projects on a shared platform

  • Strong troubleshooting and production operations experience

AI / Data Platform Experience (Required)

  • Practical experience running or supporting AI/ML workloads in production, such as:

      • Model inference services

      • Batch processing pipelines

      • Integration with LLM APIs or hosted models

  • Understanding of:

      • Scaling characteristics of AI workloads

      • Cost considerations (compute-heavy workloads, GPU usage, etc.)

  • Familiarity with tooling such as:

      • Model serving frameworks

      • Data processing pipelines

      • Managed AI services on AWS

Desirable Skills

  • Experience with GPU workloads or specialised compute environments

  • Familiarity with feature stores, vector databases, or embedding pipelines

  • Knowledge of event-driven architectures

  • Experience with security best practices (IAM, secrets management, Zero Trust)

  • Exposure to platform engineering or internal developer platforms

Wider Skills

  • Ability to make and defend clear architectural decisions

  • Comfortable operating across multiple concurrent workstreams

  • Strong communication and stakeholder management skills

  • Detail-oriented with a bias toward automation and standardisation over ad hoc solutions

Pay: £60,000.00-£80,000.00 per year

Work Location: Hybrid remote in London W2 2UH

Verified Listing

This role has been verified for authenticity, market-rate compensation, and remote eligibility.

