Databricks
Scraped from Work, 2 days ago
DevOps, Senior

Sr Platform Monitoring Engineer

SRE, DevOps, AWS, Azure, GCP, Docker, Kubernetes, ELK Stack, Prometheus, Grafana, PagerDuty, Python
Work Type
Hybrid
Job Type
Full Time
Location
Amsterdam
Salary
Not specified

About the Position

The role involves empowering data teams and building an advanced data and AI infrastructure platform. As a Senior Platform Monitoring Engineer, you will investigate platform incidents and improve observability and the customer experience through alerting pipelines, observability workflows, and automation tooling.

Responsibilities

  • Lead platform incident investigation, coordinating cross-functional teams through detection, mitigation, and resolution.
  • Conduct thorough post-incident root cause analysis to identify systemic patterns.
  • Design and implement customer-focused alerting pipelines and observability workflows.
  • Build automation tools and resolve reliability gaps.

Requirements

  • Minimum of 5 years of experience as an SRE, DevOps Engineer, or similar role.
  • Production-level experience with at least one major cloud provider (AWS, Azure, or GCP).
  • Proficiency in Docker and Kubernetes.
  • Hands-on experience with ELK, Prometheus, Grafana, and PagerDuty.
  • Strong proficiency in Python or similar languages.
  • Experience owning critical phases of the incident lifecycle in production environments.
  • BS, Master's, or PhD in Computer Science or related field.