
Intermediate Site Reliability Engineer, Environment Automation - GitLab
GitLab is the intelligent orchestration platform for DevSecOps. We operate as a fully distributed, asynchronous team across multiple regions, embracing AI as a core productivity multiplier. GitLab fosters a high-performance culture driven by values and continuous knowledge exchange, where careers accelerate, innovation flourishes, and every voice is valued. Join us in building technology that transforms how the world develops software.
An Overview of the Role
Join the Dedicated team as an Intermediate Site Reliability Engineer focused on Environment Automation. Your work will power hundreds of isolated GitLab environments for our customers, ensuring they are reliable, scalable, secure, and consistent. You will treat everything as code, contributing to automation across the entire lifecycle, from initial provisioning to day-to-day operations. You'll collaborate with senior SREs to manage many tenant environments in parallel, each with unique constraints and integration points.
You will define, deploy, and maintain GitLab environments across cloud providers using infrastructure as code, deployment packages, and Kubernetes. You'll contribute to automation that reduces manual work, build tooling for orchestrating upgrades and configuration changes safely at scale, and support an observability stack to understand and improve environment health. Your work directly impacts customer experience with GitLab Dedicated and other managed offerings, allowing them to focus on building software while we ensure their GitLab environments are production-ready.
Examples of work you'll do:
- Contribute to the design and evolution of infrastructure automation using Terraform, Ansible, and Kubernetes to provision, upgrade, and operate many GitLab environments with minimal manual effort.
- Help debug and resolve production issues across Kubernetes clusters, GitLab components, and cloud services, then assist in building automation and safeguards to prevent similar issues from recurring.
- Assist in creating and maintaining deployment and orchestration tools, such as Helm charts, omnibus-gitlab configurations, and multi-tenant workflows, to manage GitLab environments at scale.
What You'll Do
- Contribute, under senior guidance, to automating operational tasks across many GitLab environments, from initial provisioning and configuration updates to upgrades and routine maintenance, reducing manual work and improving reliability at scale.
- Help build and refine the observability stack for multi-tenant GitLab environments, monitoring signals across Kubernetes, cloud services, and GitLab applications for early issue detection and basic capacity tracking.
- Assist in responding to platform alerts and incidents, collaborating with Environment Automation SREs and engineering teams to troubleshoot production issues across multiple tenants and document findings.
- Support planning and implementation of infrastructure changes, capacity expansions, and new service rollouts for Dedicated and other managed GitLab environments, contributing to efforts that improve resource efficiency and environment isolation.
- Develop and maintain scripts, automation tools, and infrastructure-as-code workflows that manage parts of the GitLab environment lifecycle, enabling more repeatable, self-service operations.
- Apply and help implement best practices for running GitLab on Kubernetes and cloud platforms, focusing on day-to-day reliability, performance, and security while learning to keep environments consistent.
- Participate in the on-call rotation for production GitLab environments with appropriate support, triaging and mitigating incidents across clusters and cloud providers and contributing to post-incident reviews.
- Document operational tasks, runbooks, and lessons learned to create clear, repeatable processes and identify candidates for future automation, improving shared knowledge and reducing manual toil.
What You'll Bring
- Experience working as an SRE or in a similar role operating production infrastructure, with an interest in automating the lifecycle of many environments or tenants in parallel.
- Hands-on experience with Go (required), including the ability to read, understand, and modify infrastructure tools written in Go.
- Hands-on experience running Kubernetes-based workloads in production, including basic understanding of deployments, rollouts, and debugging common issues.
- Familiarity with infrastructure automation and configuration management tools such as Terraform and Ansible, including experience working with modules, variables, and managing state safely for multiple environments.
- Solid understanding of Git-based workflows and infrastructure-as-code practices, with the ability to contribute to reusable modules, templates, and pipelines.
- Experience working in distributed systems or cloud-based production environments, ideally in SaaS or managed service settings, with comfort participating in incident response and on-call rotations under guidance.
- A proactive mindset focused on automation and documentation, seeking opportunities to remove manual steps, improve runbooks, and turn repetitive tasks into reliable, self-service tools.
- Comfort working asynchronously across distributed teams and a desire to contribute to GitLab's values of collaboration, transparency, and iteration.
About the Team
We are responsible for building, running, and evolving the entire lifecycle of GitLab environments that power the GitLab Dedicated platform. You'll be part of a team focused on owning the reliability, scalability, performance, and security of automated single-tenant GitLab instances and their supporting services. GitLab Dedicated provides fully managed, isolated environments for customers worldwide, meaning your work directly impacts how organizations run their mission-critical software delivery on GitLab. We operate in a fully distributed, asynchronous environment across multiple regions, collaborating on everything from infrastructure automation and environment lifecycle design to incident response and capacity planning. You'll solve novel challenges at scale, from orchestrating infrastructure-as-code workflows across hundreds of tenants to designing automation that keeps environments consistent, secure, and up to date. We continuously seek to reduce complexity and improve efficiency by leveraging cloud vendor managed products and services.
How GitLab Supports Full-Time Employees
- Benefits to support your health, finances, and well-being
- Flexible Paid Time Off
- Team Member Resource Groups
- Equity Compensation & Employee Stock Purchase Plan
- Growth and Development Fund
- Parental Leave
Please note that we welcome interest from candidates with varying levels of experience; many successful candidates do not meet every single requirement. Additionally, studies have shown that people from underrepresented groups are less likely to apply to a job unless they meet every single qualification. If you're excited about this role, please apply and allow our recruiters to assess your application.
Open to
India