Roles & Responsibilities:
Work with Product Development teams to establish joint operational priorities.
Oversee 24x7 emergency on-call rotation.
Deploy and maintain high-availability Linux systems, with a continuing focus on standardization of administration and configuration.
Develop automation tools for maintenance, troubleshooting, and administration.
Configure and manage performance monitoring tools and notification services.
Provide leadership and guidance in all aspects of deployments, monitoring, reports, and tools.
Provide technical and managerial leadership for internal IT.
Work closely with both internal and external teams and vendors to guarantee continuous uptime, deliver on Service Level Agreements, and maintain site performance and operations.
Accelerate the push to move all infrastructure from self-managed data centers to AWS.
Track operating costs for all non-production and production systems and react when costs are abnormal.
Key Skills Required:
Experience managing SysOps or DevOps teams.
Expertise and working knowledge of network/server infrastructure running in virtualized and cloud environments.
Track record of influencing a US-based non-technical management team.
Experience working in and leading a 24x7x365 operations environment.
Experience working with distributed systems and managing multiple functional teams.
Experience with networking technologies, including network security.
Experience with Linux in a production environment.
Experience troubleshooting and resolving application and/or system-related issues.
Strong written and spoken English.
Track, measure and maintain uptime, incident response, and issue resolution times in accordance with service level agreements 24x7x365.
Own incident management and problem management processes for the team to drive continuous improvement for end user experience and SLA guarantees.
Experience with and solid understanding of web and app servers, Apache, PHP, and MySQL. Scripting in Puppet, shell, etc.