Senior Platform Engineer
Cap*** · London, ENG, United Kingdom
London, ENG, United Kingdom · Exp: 4+ yrs · Hybrid
Remuneration
Not specified
Location
London, ENG, United Kingdom
Visa sponsorship
Sponsors visa
Job summary
Cap*** is building an AI Operating System for private capital funds, aiming to revolutionize how they find, research, analyze, monitor, and manage investments. The company has experienced significant growth, achieving product-market fit and expanding across the US, UK, and Europe. They are now scaling their team after a large Series A funding round.
Benefits
ESOP
Qualifications
- Four or more years running production infrastructure at a venture-backed startup or top tech firm
- Experience owning systems end-to-end
- In-depth experience with Kubernetes in production, including designing, operating, and debugging clusters under load
- Proficiency with GitOps using ArgoCD, Helm, and Terraform
- Security-minded design with IAM, secrets, and network boundaries
- Curiosity about underlying system mechanics
- Experience with self-hosting or homelab environments
- Passion for platform engineering
- Ability to thrive in fast-moving environments
- Experience with GPU and LLM workloads on Kubernetes (strong plus)
- Experience with Istio or other service mesh
- Multi-cloud experience (Azure first, with AWS and GCP)
- Familiarity with the LGTM observability stack (Loki, Grafana, Tempo, Mimir)
- Experience with compliance work (SOC 2, ISO 27001)
Responsibilities
- Own the infrastructure for an AI platform serving leading PE firms
- Manage Kubernetes estate, multi-cloud footprint (Azure primary, AWS, GCP), Istio service mesh, Terraform infrastructure, and LGTM observability stack
- Ensure security by default across all infrastructure layers
- Shape engineer workflows, CI/CD pipelines, and guardrails for confident deployments
- Own platform end-to-end, including architecture, implementation, and operation
- Operate and scale Kubernetes clusters, including GitOps deployments with ArgoCD and Helm, networking with Istio, and demanding stateful/compute workloads
- Implement AI infrastructure, including self-hosted model serving and GPU workloads on Kubernetes
- Manage multi-cloud Terraform estate for repeatability, auditability, and least privilege
- Use the LGTM stack (Loki, Grafana, Tempo, Mimir) for platform observability
- Implement security and compliance measures: identity and access, secrets management, network policy, hardening, and enterprise compliance
- Take responsibility for built systems and their impact on customers and team
- Manage on-call duties and incidents
- Build guardrails to ensure the easy path is the safe path
Skills
Argo CD, AWS, Azure, GCP, Grafana, Helm, IAM, Istio, Kubernetes, Loki, Mimir, Tempo, Terraform
Relocation
No