Site Reliability Engineer (m/f/d), Focus on Observability & Security

Münchener Verein Versicherungsgruppe · Germany

Germany · Onsite

Remuneration

Not specified

Location

Germany

Visa sponsorship

Not specified

Job summary

The Site Reliability Engineer (SRE) with a focus on Observability & Security ensures system transparency and security monitoring and establishes SRE metrics. The role involves implementing SRE-driven security and automation and acting as a DevSecOps consultant who supports other IT teams with secure architecture, deployment, and monitoring.

Qualifications

Experience in Site Reliability Engineering (SRE), DevOps, or system administration with a strong focus on automation, platform security, and system reliability.
Practical experience with monitoring and APM tools, ideally Elastic APM and Checkmk.
Fundamental understanding of container orchestration (OpenShift/Kubernetes) and Java-based runtime environments (Quarkus, Red Hat EAP).
Proficiency in version control (Git) and automation tools (Ansible), plus a basic understanding of CI/CD pipelines (GitLab CI/CD, Jenkins).
Good knowledge of at least one programming or scripting language (e.g., Python, Go, Java, or Bash) for automating repetitive tasks.
Willingness to take responsibility and to implement your own ideas together with the team.
Team player with experience and analytical skills to enrich the DevOps team.
Fluent in German.

Responsibilities

Utilize modern observability tools for performance, availability, and application health monitoring in live operations.
Proactively and automatically detect unauthorized access attempts, performance drops, and deviations in platform behavior.
Analyze complex dependencies and interactions between containerized applications, identify bottlenecks, and accelerate root cause analysis.
Support the team in defining and monitoring SLIs and SLOs to ensure system quality.
Ensure secure and encrypted communication between networked applications and automate secure credential provisioning.
Define and automate security guardrails at the platform level.
Design and implement fault-tolerance patterns (e.g., rate limiting, circuit breaking) and automated recovery processes.
Advise and support other IT teams on fault-tolerant architecture, secure deployment, and optimal monitoring of new IT systems.

Skills

Ansible, Bash, Elastic APM, Git, GitLab, GitLab CI, Go, Java, Jenkins, Kubernetes, OpenShift, Python, Checkov

Languages

German
