Manage a small team of SREs supporting commercial SaaS platforms in the health and wellness space (100% remote)
A Bit About Us
Why join us?
- Flexible paid time off
- Affordable health, dental, and vision insurance options
- Monthly fitness reimbursement
- 401(k) matching
- Paid new-parent leave
- 1-month paid sabbatical every 5 years
- Up to 100% remote work, or hybrid work from one of our offices
Job Details
Responsibilities
- Lead a team of SREs to ensure the reliability, availability, and scalability of our systems and infrastructure
- Design, implement, and maintain our infrastructure and applications
- Develop and implement monitoring and alerting systems to ensure the health of our systems and infrastructure
- Collaborate with cross-functional teams to optimize our systems and infrastructure
- Manage incident response and resolution processes
- Develop and maintain disaster recovery plans
- Ensure compliance with security and regulatory requirements
- Continuously improve our processes and infrastructure to increase efficiency and reduce downtime
Qualifications
- Bachelor's degree in Computer Science, Engineering, or a related field
- 3+ years of experience in Site Reliability Engineering or a related field
- Strong background in Linux, VMware, AWS (including Lambda), Azure, Docker, Kubernetes, Rancher, Redis, RabbitMQ, Elasticsearch, Terraform, GitLab CI, Jenkins, Python, Bash, and monitoring tools
- Experience leading a team of SREs
- Strong problem-solving skills and ability to work in a fast-paced environment
- Excellent communication and collaboration skills
- Experience with agile methodologies and DevOps practices
- Knowledge of security and regulatory requirements and best practices
- Ability to manage incident response and resolution processes
- Experience developing and maintaining disaster recovery plans
- Strong commitment to continuous improvement and learning