-
Build, scale, and maintain Observability-as-Code (OaC) libraries that improve monitoring and observability for engineering teams.
-
Provide support to engineers by answering questions, troubleshooting issues, and helping debug problems related to the observability tooling and abstractions we manage (e.g., Datadog, Sumo Logic, Sentry).
-
Work closely with Observability Champions and engineers across teams to understand challenges, propose roadmap improvements, and shape future training sessions.
-
Collaborate with platform teams to enhance production debugging, service health insights, and system reliability across our stack.
-
Partner with observability vendors to evaluate capabilities, influence roadmaps, and lead cost optimisation initiatives based on data usage and engineering needs.
-
Contribute technical guides, internal documentation, and training sessions that promote observability best practices, helping teams become more self-sufficient.
-
Take complete ownership of significant initiatives - from technical definition to rollout - proactively driving cross-team coordination, removing friction, abstracting complexity, and delivering impactful solutions that improve the production debugging experience for engineers.
-
Write clear, concise, elegant, and well-tested code in languages like Python, Ruby, and/or JavaScript.
-
Work as part of an agile, cross-functional team focused on continual improvement and knowledge sharing.
-
Deliver solutions that go beyond what is assigned, identifying opportunities for broader impact.
-
Help raise the bar for engineering quality and performance by mentoring others, identifying inefficiencies, and applying advanced design principles to complex challenges.
-
Contribute technical insights that improve software delivery both within and beyond your team, helping shape engineering practices and decision-making at scale.
-
5 years' experience as an application developer, or equivalent experience on a developer tools team.
-
Mastery of some combination of Python, Ruby, JavaScript, and/or other programming languages.
-
Experience with observability and alerting tools such as Datadog and PagerDuty.
-
Experience educating developers through written documentation.
-
Are knowledgeable about what makes a great developer experience and have the ability to improve it by configuring or creating tools and scripts.
-
Have experience in Terraform configuration.
-
Have experience in Kafka configuration for applications.
-
Understand container orchestration from an application developer's point of view.
-
Have experience across the entire ecosystem, from local development through to production.
-
Enjoy continually learning and using new technologies such as Kubernetes, Kafka, and AWS Lambda.
-
Are passionate and knowledgeable about engineering excellence and can educate others through written documentation, example code and presentations.
-
Occasional requirement to be on-call outside of standard hours.