Site Reliability Engineer

Job type: Contract

TRR have been asked to identify multiple Site Reliability Engineers for a leading aviation client. This will be an initial 12-month contract based on site in Brussels, with some hybrid working.

The Role
Collaborate with development teams to integrate SRE best practices into the software lifecycle.
Operate and maintain mission-critical applications as a member of a dynamic team.
Work with the latest state-of-the-art cloud-native technologies, both in the cloud and on-premises.
Detect, identify, and analyze faults as they arise, help to fix them, and work on solutions to prevent recurrence.
Monitor system performance and proactively identify and resolve issues.
Conduct root cause analysis for production errors and implement preventive measures.
Continuously improve service availability, scalability, performance, monitoring, and overall manageability.
Work jointly with security experts, architects, and developers to build and improve a sustainable technical landscape.
Continuously research and assess new approaches for potential use, and provide recommendations and subject matter expertise regarding trends, technology, tools, and services.
Contribute to all areas of Site Reliability Engineering as a team member.

Experience Required
Experience of SRE and Platform Engineering principles and frameworks.
Experience in automation, Infrastructure as Code, and CI/CD.
Experience in Kubernetes and Linux system administration.
Experience in operating and automating solutions with some of the following technologies and products:
- Observability (OpenTelemetry, Elastic stack, Prometheus, Jaeger)
- Cloud services (Azure)
- Document database (MongoDB)
- Relational Database Management Systems (PostgreSQL)
- Messaging system (Solace)
- Event Streaming (Kafka)
- IAM (Okta)
- Secret management (HashiCorp Vault)

Please share your CV for immediate consideration.
