Site Reliability Engineer

SS&C Technologies Canada Corp. • Remote (Canada) • 8h ago

Role Description

Be part of a global team that ensures the performance, scalability, and reliability of critical cloud-based applications. As part of the Global Investor and Distribution Solutions (GIDS) Platform Services team, you’ll play a key role in keeping our systems running smoothly and efficiently—while helping shape the future of our platform.

Collaborate with global teams as part of a follow-the-sun support model.
Respond to, troubleshoot, and resolve Level 2 application incidents.
Ensure critical applications are effectively monitored using tools like Prometheus and Grafana.
Create and maintain dashboards and alerts to enhance visibility into application health.
Define, implement, and track key SRE metrics (SLOs, SLIs, error budgets).
Partner with development teams to improve application reliability and resilience.
Analyze incident trends and recommend improvements to reduce recurrence.
Automate repetitive support tasks to improve efficiency.
Participate in post-incident reviews and drive reliability initiatives.
Perform infrastructure and application patching as part of regular maintenance cycles.
Support security vulnerability remediation efforts across both infrastructure and application layers.

Qualifications

Bachelor’s degree in Computer Science, Computer Engineering, IT, or related field.
5+ years of experience for senior roles; fresh graduates welcome for junior roles.
Proficiency in one or more programming languages, preferably Java, JavaScript, or Python.
Proven ability to troubleshoot complex systems.
Skilled in debugging, code optimization, and automation.
Experience with relational databases and data analysis.
Experience working in Site Reliability Engineer (SRE) roles or incident response environments.
Hands-on experience with cloud infrastructure, preferably AWS.
Familiarity with observability tools such as Grafana, ELK Stack, or similar.
Experience deploying and managing applications on Kubernetes platforms.
Strong skills in analyzing and troubleshooting issues in large-scale, distributed systems.
Familiarity with PostgreSQL and its performance tuning, monitoring, and troubleshooting.

Benefits

Flexibility: Hybrid Work Model & a Business Casual Dress Code, including jeans.
Your Future: RRSP Matching Program, Professional Development Reimbursement.
Work/Life Balance: Flexible Personal/Vacation Time Off, Sick Leave, Paid Holidays.
Your Wellbeing: Medical, Dental, Vision, Employee Assistance Program, Parental Leave.
Diversity & Inclusion: Committed to Welcoming, Celebrating and Thriving on Diversity.
Training: Hands-On, Team-Customized, including SS&C Learning Institute.
Extra Perks: Discounts on fitness clubs, travel and more!
Wide-Ranging Perspectives: Committed to Celebrating the Variety of Backgrounds, Talents and Experiences of Our Employees.