Staff Site Reliability Engineer

Company: Procore Technologies
Job type: Full-time

Job Description
We’re looking for a Staff Site Reliability Engineer to join Procore’s Project Execution Group. In this role, you’ll lead, collaborate, partner, and develop solutions to maintain the health of the core platform. The goal is to ensure that the chosen design and architecture are highly available, performant, and reliable, because this team directly affects Procore’s internal customers and its decisions directly affect the external customer experience.
As a Staff Software Engineer on our Reliability Engineering team, you’ll help champion solutions to systemic issues affecting every team at Procore. Leveraging your software and systems architecture expertise, you’ll conduct consultative engagements with our service authors that improve our software’s reliability. If you have a passion for solving the complex problems unique to running large, highly scalable, resilient systems with modern technologies, we would love for you to join us!
This position reports to an Engineering Manager and will be based in our Austin office. We’re looking for someone to join us immediately.
What you’ll do:
Lead projects within a small team of Reliability Engineers to continually improve the reliability of Procore’s services through engineering and process improvement
Collaborate with your peers to envision, design, and develop solutions in your respective area with a bias toward reusability, toil reduction, and resiliency
Surface opportunities across the broader organization for solving systemic issues
Use a collaborative approach to make technical decisions that align with Procore’s architectural vision
Partner with internal customers, peers, and leadership in planning, prioritization, and roadmap development
Develop teammates by conducting code reviews, providing mentorship, pairing, and training opportunities
Serve as a subject matter expert on tools, processes, and procedures and help guide others to create and maintain a healthy codebase
Facilitate an “open source” mindset and culture both across teams internally and outside of Procore through active participation in and contributions to the greater community
What we’re looking for:
BS or MS degree in Computer Science or a related discipline, or comparable work experience. Technical certifications are a plus
8+ years of combined experience as a Software, Resiliency, or Reliability Engineer, with proficiency in one or more languages (Ruby, Node.js, Java preferred)
Experience architecting and designing services within distributed systems
Experience seeking and solving complex problems
Experience working with software, platforms, and infrastructure at scale (we run thousands of hosts and have millions of users) 
Experience as a technical leader on projects with the ability to course-correct as needed
Experience with the following is preferred:
Public cloud (AWS, GCP)
Container orchestration (Kubernetes)
Cloud automation tooling (e.g., CloudFormation, Terraform, Ansible)
Continuous Integration Tooling (e.g., CircleCI, Jenkins, Travis CI)
Continuous Deployment Tooling (e.g., ArgoCD, Spinnaker)
Service Mesh / Discovery Tooling (e.g., Consul, Envoy, Istio, Linkerd)
Contributions to open-source projects
