Some careers shine brighter than others.
If you’re looking for a career that will help you stand out, join HSBC and fulfil your potential. Whether you want a career that could take you to the top, or simply take you in an exciting new direction, HSBC offers opportunities, support and rewards that will take you further.
HSBC is one of the largest banking and financial services organisations in the world, with operations in 64 countries and territories. We aim to be where the growth is, enabling businesses to thrive and economies to prosper, and, ultimately, helping people to fulfil their hopes and realise their ambitions.
We are currently seeking an experienced professional to join our team in the role of Sr. Associate Director, Software Engineering.
In this role, you will:
- Lead and manage global production support, with hands-on involvement in all production incident, problem and SRE-related activities, while leading and supporting the teams.
- Be accountable for leading and managing the shared lean Level 1 support team across multiple time zones, ensuring the team is sufficient to support our Important Business Services and Business Prioritized Services in all geographies where the corresponding businesses are based.
- Guide, support and coordinate with POD teams (L2 & L3 support) to resolve production incidents within the defined SLAs.
- Foster a culture of ownership, learning, and excellence.
- Oversee real-time monitoring, incident detection, and rapid issue resolution to minimize downtime and service disruption.
- Act as the escalation point for critical incidents, ensuring swift root cause analysis and resolution.
- Drive major incident management (MIM) processes, coordinating cross-functional teams, and leading war room discussions.
- Ensure problem management processes are adhered to, in order to identify recurring issues and drive permanent fixes.
- Implement Site Reliability Engineering (SRE) practices to improve system resilience and proactive incident prevention.
- Identify opportunities for automation, AI/ML-based automations, and self-healing capabilities to improve support efficiency.
- Represent IB and FEM as SRE Lead in various CIB meetings & Service Resilience working groups.
- Run regular resiliency calls for FEM & IB covering all incidents, track actions and share learnings across wider teams.
- Be accountable for service performance, agreeing SLOs, SLAs and error budgets, and ensuring high service resiliency and stability.
- Measure, track and improve service operational KPIs (e.g. MTTR, customer-impacting outages).
- Manage and deliver the resiliency book of work across FEM & IB.
- Be responsible for Run the Bank cost optimization.
- Optimize support processes, automation, and tooling to enhance operational efficiency.
- Work with infrastructure, development, and business teams to produce and review real-time monitoring and informational dashboards, and to improve system reliability and performance.
- Conduct ad hoc reviews of change management to ensure processes are being adhered to with regard to CR closure and RCA of failed changes, to avoid repeat incidents.
- Work closely with risk and audit teams to ensure that all risks across the applications are fully understood and managed appropriately.
- Participate in cross Global Business / Global Function (GB/GF) Communities of Practice (CoP) meetings to help influence key decisions made at a group level.
- Drive and track the resiliency OKRs set by the organization for the teams, ensuring continuous skill enhancement and process improvements.
- Report resiliency metrics and OKRs monthly.
- Work closely with business stakeholders to ensure the service provided to them is of high quality, and identify gaps for improvement.
- Collaborate with business, application owners, ITSO, TPEM and infrastructure teams to align support strategies with business priorities.
- Provide regular updates to senior leadership on system health, incident trends, and risk mitigation plans.
To be successful in this role, you should meet the following requirements:
- 12+ years of overall technology experience.
- Proven experience in Service Quality Management and in driving engineering initiatives horizontally across the service line.
- Proven experience in Incident Management, Problem Management and Observability processes, and the ability to handle crises.
- Ability to communicate to all levels of the organization in a timely manner when incidents occur.
- Ability to run a crisis call to ensure the swift resolution of incidents.
- Proven experience in an IT and business environment with in-depth specialization in problem solving, metric analysis and quality improvement.
- A strong technology background is required to deep-dive into and intervene in often challenging technology issues. Solid knowledge of application development and technology infrastructure is a must for this role.
- Ability to apply risk assessment and management principles and processes, and to find ways of solving or pre-empting issues.
- Innovation: proactive in developing ideas, continuously searching for improvements in techniques that add value to the business, and taking full responsibility for implementation.
You’ll achieve more when you join HSBC.