Some careers shine brighter than others.
If you’re looking for a career that will help you stand out, join HSBC and fulfil your potential. Whether you want a career that could take you to the top, or simply take you in an exciting new direction, HSBC offers opportunities, support and rewards that will take you further.
HSBC is one of the largest banking and financial services organisations in the world, with operations in 64 countries and territories. We aim to be where the growth is, enabling businesses to thrive and economies to prosper, and, ultimately, helping people to fulfil their hopes and realise their ambitions.
We are currently seeking an experienced professional to join our team in the role of Consultant Specialist.
In this role, you will:
- Ensure the availability and maintainability of our large-scale API and Microservices platform located across three points of presence in HK, UK, and the US.
- Continuously improve the reliability, capacity, and performance of our platforms by applying SRE principles and practices to drive scale, enhance observability, reduce toil, measure risk more accurately, and enable business-driven change more safely.
- Elevate our expertise and maturity in safely managing our core technology stack underpinned by AWS, Kubernetes, Kong API gateway, MuleSoft API, Istio Service Mesh, and a host of supporting services in a hybrid hosting environment (i.e., private/public cloud & on-prem).
- Develop best-in-class observability tools and techniques that provide monitoring and alerting capabilities which facilitate not only incident detection and response, but also capacity management, improved release safety, and greater resource efficiency.
- Investigate, triage, and resolve production incidents and use data to articulate impact with relentless attention to the technical signals and underlying root causes that enable remediation and future avoidance/mitigation.
- Contribute to the design and engineering of auto-healing and self-healing capabilities for known failure modes across our platforms.
- Contribute code to our platform repositories enabling not only our reliability agenda (e.g., monitoring-as-code), but also higher release speed and safety, simpler tenant onboarding, and improved controls.
- Author, contribute to, and maintain our evolving knowledge base, including support and operational runbooks, platform tenant guides, and onboarding and release documentation, with an underlying goal of driving as much best practice and self-service as possible.
- Participate in a regular SRE on-call rota supporting a 24/7/365 support model across our mission-critical platforms within a large banking ecosystem of front-end, middleware, and back-end fulfilment systems.
To be successful in this role, you should meet the following requirements:
- Be fluent in written and spoken English and be comfortable working in a multicultural and diverse organisation with team members across the globe.
- Value effective and continual communication, honesty, transparency, and accountability.
- Value failure as an opportunity and an investment in more reliable systems (a blameless post-mortem culture).
- Possess fundamentals-based and evidence-based problem-solving skills; drive decision-making by function, with a first-principles mindset.
- Demonstrate a bias to action and avoid analysis paralysis; maintain a sense of ownership as you drive actions to the finish line with high quality and on time.
- Be egoless when searching for the best ideas and contribute effectively outside of your specialty; think about solving problems from the standpoint of the best outcome for the team.
- Have strong fundamental knowledge of distributed systems and networking.
- Have hands-on experience with AWS, Docker, and Kubernetes.
- Possess programming experience in at least one of the following languages: Python, Java, Go, Ruby, or Bash scripting.
- Have the ability to debug and optimise code, while automating routine tasks (i.e., toil reduction).
- Have a strong background in the setup, use, and optimisation of a variety of observability tools including Splunk, Datadog, AppDynamics, and CloudWatch.
- Understand the concepts of quantifying failure and availability in a prescriptive manner using SLOs, SLIs, and error budgets.
- Hands-on experience with the proprietary HSBC platforms TP and SHP
- Production support across virtualised and/or containerised environments, particularly those employing Kubernetes for workload management
- Large-scale API development and management technologies/frameworks such as MuleSoft or Kong
- Infrastructure and application performance analysis and tuning
- Service mesh technology, particularly Istio and Envoy and their variations
- DevOps and Agile ways of working
- CI/CD pipeline development
- Infrastructure-as-code tools (e.g., Terraform)
- Cloud providers (e.g., AWS Solutions Architect Associate)
- Kubernetes (e.g., CKA)
You’ll achieve more when you join HSBC.
HSBC is committed to building a culture where all employees are valued and respected and where opinions count. We take pride in providing a workplace that fosters continuous professional development, flexible working and opportunities to grow within an inclusive and diverse environment. Personal data held by the Bank relating to employment applications will be used in accordance with our Privacy Statement, which is available on our website.
Issued by – HSBC Software Development India