Site Reliability Engineering Lead Role | Hiver | Bengaluru

Posted 9 July 2019 by Jhoom Choudhary (@jhoom_hiver)

Hiver (https://hasjob.co/hiverhq.com/div6x), Bengaluru · hiverhq.com · Full-time employment · IT/Systems Administration

Company Description

Hiver (http://hiverhq.com) turns Gmail into a simple, powerful team collaboration tool. We're a profitable, rapidly growing SaaS company with a highly rated product and customers all over the world. We're an agile, driven team deeply motivated by the idea of building a globally respected company from India. Our work culture is focused on transparency, ownership, and openness. We are ambitious and focused, yet humble, warm, and empathetic.

Opportunity

Site Reliability Engineers (SREs) are responsible for ensuring that our systems are healthy, monitored, automated, and designed to scale. As the manager of this team, you'll use your technical expertise to handle our growing infrastructure, make it reliable and scalable, and work closely with our development teams from the early stages of design all the way through to identifying and resolving production issues. You will also build and grow the SRE team that carries out these responsibilities.

Responsibilities

Build and lead a team of SREs, ensuring that production applications are stable and reliable.

Be directly responsible for uptime.

Manage on-call rotation across the SRE team.

Own end-to-end availability and performance of key services and build automation to prevent problem recurrence.

Automate the currently manual infrastructure management and alert-handling processes using Kubernetes, Terraform, CI/CD pipelines, etc.

Assist in the roll-out and deployment of new product features and installations.

Find scalability bottlenecks and areas for performance improvement.

Work closely with technical leads to ensure that platforms are designed with scale and operability in mind.

Help SREs in your team to grow and develop their careers through mentorship and performance management.

Requirements

5+ years of technical experience in Site Reliability Engineering.

3+ years of experience as a people manager in an Engineering or Operations capacity.

Strong Linux administration skills with an emphasis on shell scripting.

Expertise with the AWS and GCP platforms.

Expertise with Terraform, Docker, Kubernetes (or other orchestration tools), and Jenkins.

Experience with infrastructure monitoring platforms (Datadog, Prometheus) and Application Performance Management (APM) systems (New Relic).

Experience with configuration management tools (Puppet, Chef, Ansible).

Experience with CI/CD pipeline configuration, deployment, and support.

Experience making hiring decisions for SRE/DevOps teams.

Hands-on technical experience supporting multi-tenant applications is required.

Job Perks

- MacBook to work on

- Free and tasty food at the office

- Flexible working hours

- 24 days of leave a year, excluding sick/medical leave

- Saturdays and Sundays off

- Located in a prime area of Bengaluru, so the office is easy to get to

It is NOT OK for recruiters, HR consultants, and other intermediaries to contact this employer