Is this role right for you? In this role you will:
- Incident and Problem Management: Participate in incident response, root cause analysis, and problem resolution processes. Drive proactive identification of potential issues and implement preventative measures.
- Automation and Tooling: Utilize scripting languages (preferably Python) and automation tools to streamline operational tasks, improve efficiency, and reduce manual intervention. Utilize modern toolsets, including AI, to solve recurring problems for the bank.
- Monitoring and Observability: Develop and maintain comprehensive monitoring and alerting solutions to proactively identify and address potential issues before they impact users. Leverage observability tools to set up advance warnings that improve system reliability.
- Resilience Architecture Governance: Participate in the review and governance of system architectures to ensure they meet defined reliability, availability, and scalability requirements. Contribute to the design and implementation of resilient solutions.
- Capacity Planning and Performance Optimization: Participate in capacity planning activities and contribute to the identification and implementation of performance optimization strategies.
- Change Management: Participate in the change management process, ensuring changes are implemented with minimal risk and disruption to services.
- Documentation and Knowledge Sharing: Create and maintain clear and concise documentation for processes, procedures, and troubleshooting guides. Actively share knowledge with the team.
- Collaboration and Communication: Effectively communicate technical issues and solutions to both technical and non-technical stakeholders. Collaborate effectively with cross-functional teams.
- Continuous Improvement: Identify opportunities for process and system improvements and actively contribute to their implementation.
- ITSM Governance and Compliance: Adhere to and contribute to the development and enforcement of IT Service Management (ITSM) policies and procedures, ensuring alignment with industry best practices (ITIL).
Skills
Do you have the skills that will enable you to succeed in this role? We'd love to work with you if you have:
- Strong communication and interpersonal skills. Experience driving incident resolution through bridge calls and leading root cause analysis with technology partners.
- Bachelor's degree in Computer Science, Engineering, or a related field.
- 2-5 years of experience in an SRE, IT operations, or engineering role.
- Strong knowledge of SRE practices and ITIL service delivery methodology.
- ITIL certification (ITIL Foundation or higher). Good understanding of IT Service Management (ITSM) principles and best practices.
- Proven experience in incident management, problem management, and root cause analysis.
- Familiarity with resilience engineering principles and practices.
- Understanding of infrastructure components (servers, networks, databases, cloud platforms).
- Proficiency in at least one scripting language (preferably Python) for automation and tooling.
- Experience with monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack, Splunk).
- Excellent analytical, problem-solving, and troubleshooting skills.
- Ability to work effectively both independently and as part of a team.
Nice To Have
- SRE certification.
- Knowledge of observability & related tools.
- Experience with cloud platforms (e.g., AWS, Azure, GCP).
- Experience with configuration management tools (e.g., Ansible, Chef, Puppet).
- Familiarity with DevOps practices and CI/CD pipelines.
- Experience in the financial services industry.
What's in it for you?
- Diversity, Equity, Inclusion & Allyship - We strive to create an inclusive culture where every employee is empowered to reach their fullest potential, respected for who they are, and embraced through bias-free practices and inclusive values across Scotiabank. We embrace diversity and provide opportunities for all employees to learn, grow & participate through our various Employee Resource Groups (ERGs) that span diverse gender identities, ethnicity, race, age, ability & veterans.
- Accessibility and Workplace Accommodations - We value the unique skills and experiences each individual brings to the Bank, and are committed to creating and maintaining an inclusive and accessible environment for everyone. Scotiabank continues to locate, remove and prevent barriers so that we can build a diverse and inclusive environment while meeting accessibility requirements.
- Upskilling through online courses, cross-functional development opportunities, and tuition assistance.
- Competitive Rewards program including bonus, flexible vacation, personal days, and sick days, with benefits starting on day one.
- Dynamic Ecosystem - Free tea & coffee, universal washrooms, and lots of space for team collaboration.
- Community Engagement - No matter where you choose to work from, we offer opportunities for community engagement & belonging through our various programs, such as hackathons.