Senior Platform Engineer

Cap*** · London, ENG, United Kingdom
London, ENG, United Kingdom · Exp: 4+ yrs · Hybrid
Remuneration: Not specified
Location: London, ENG, United Kingdom
Visa sponsorship: Sponsors visa

Job summary

Cap*** is building an AI Operating System for private capital funds, aiming to revolutionize how they find, research, analyze, monitor, and manage investments. The company has experienced significant growth, achieving product-market fit and expanding across the US, UK, and Europe. They are now scaling their team after a large Series A funding round.

Benefits

ESOP

Qualifications

  • Four or more years running production infrastructure at a venture-backed startup or top tech firm
  • Experience owning systems end-to-end
  • In-depth experience with Kubernetes in production, including designing, operating, and debugging clusters under load
  • Proficiency with GitOps using Argo CD, Helm, and Terraform
  • Security-minded design with IAM, secrets, and network boundaries
  • Curiosity about underlying system mechanics
  • Experience with self-hosting or homelab environments
  • Passion for platform engineering
  • Ability to thrive in fast-moving environments
  • Experience with GPU and LLM workloads on Kubernetes (strong plus)
  • Experience with Istio or another service mesh
  • Multi-cloud experience (Azure first, with AWS and GCP)
  • Familiarity with the LGTM observability stack
  • Experience with compliance work (SOC 2, ISO 27001)

Responsibilities

  • Own the infrastructure for an AI platform serving leading PE firms
  • Manage the Kubernetes estate, the multi-cloud footprint (Azure primary, plus AWS and GCP), the Istio service mesh, Terraform infrastructure, and the LGTM observability stack
  • Ensure security by default across all infrastructure layers
  • Shape engineer workflows, CI/CD pipelines, and guardrails for confident deployments
  • Own the platform end-to-end, including architecture, implementation, and operation
  • Operate and scale Kubernetes clusters, including GitOps deployments with Argo CD and Helm, networking with Istio, and demanding stateful/compute workloads
  • Implement AI infrastructure, including self-hosted model serving and GPU workloads on Kubernetes
  • Manage the multi-cloud Terraform estate for repeatability, auditability, and least privilege
  • Use the LGTM stack (Loki, Grafana, Tempo, Mimir) for platform observability
  • Implement security and compliance measures: identity and access, secrets management, network policy, hardening, and enterprise compliance
  • Take responsibility for built systems and their impact on customers and team
  • Manage on-call duties and incidents
  • Build guardrails to ensure the easy path is the safe path

Skills

Argo CD, AWS, Azure, GCP, Grafana, Helm, IAM, Istio, Kubernetes, Loki, Mimir, Tempo, Terraform

Relocation

No