Phanindra B.
About
Designed, built, and maintained large-scale systems with a focus on high availability, scalability, and reliability
Collaborated with development teams to identify and prioritize reliability improvements
Developed and implemented monitoring, alerting, and incident response strategies
Performed capacity planning, forecasting, and optimization to maintain system performance
Automated tasks and workflows using scripting languages and automation tools
Strong problem-solving skills and attention to detail
Experience with monitoring and alerting tools (Prometheus, Grafana)
Participated in on-call rotations and responded to incidents and outages
Developed and maintained documentation for systems, processes, and procedures
Stayed up to date with industry trends, emerging technologies, and best practices in reliability engineering