
Site Reliability Engineer (m/f/d), Focus on Observability & Security

Münchener Verein Versicherungsgruppe · Germany · Onsite
Remuneration: Not specified
Location: Germany
Visa sponsorship: Not specified

Job summary

The Site Reliability Engineer (SRE) with a focus on Observability & Security ensures system transparency and security monitoring and establishes SRE metrics. The role involves implementing SRE-driven security and automation, and acting as a DevSecOps consultant who supports other IT teams with secure architecture, deployment, and monitoring.

Qualifications

  • Experience in Site Reliability Engineering (SRE), DevOps, or system administration with a strong focus on automation, platform security, and system reliability.
  • Practical experience with monitoring and APM tools, ideally Elastic APM and Checkmk.
  • Fundamental understanding of container orchestration (OpenShift/Kubernetes) and Java-based runtime environments (Quarkus, Red Hat JBoss EAP).
  • Proficiency in version control (Git) and automation tools (Ansible), and a basic understanding of CI/CD pipelines (GitLab CI/CD, Jenkins).
  • Good knowledge of at least one programming or scripting language (e.g., Python, Go, Java, or Bash) for automating repetitive tasks.
  • Proactive approach to taking responsibility and implementing your own ideas together with the team.
  • Team player whose experience and analytical skills strengthen the DevOps team.
  • Fluent in German.

Responsibilities

  • Use modern observability tools to monitor performance, availability, and application health in live operations.
  • Proactively and automatically detect unauthorized access attempts, performance drops, and deviations in platform behavior.
  • Analyze complex dependencies and interactions between containerized applications, identify bottlenecks, and accelerate root cause analysis.
  • Support the team in defining and monitoring SLIs and SLOs to ensure system quality.
  • Ensure secure and encrypted communication between networked applications and automate secure credential provisioning.
  • Define and automate security guardrails at the platform level.
  • Design and implement fault-tolerance patterns (e.g., rate limiting, circuit breaking) and automated recovery processes.
  • Advise and support other IT teams on fault-tolerant architecture, secure deployment, and optimal monitoring of new IT systems.

Skills

Ansible, Bash, Elastic APM, Git, GitLab, GitLab CI, Go, Java, Jenkins, Kubernetes, OpenShift, Python, Checkov

Languages

German

Relocation

No