Senior DevOps Engineer for Machine Learning Job [ES-011121]
Our partner is the world leader in enterprise applications in terms of software and software-related service revenue. We are looking for a Senior DevOps Engineer for their organization.
- Place of work:
- Encourage and drive innovative thinking to solve complex problems, perform troubleshooting of multiple application and platform layers,
- identify problem areas and work cross-functionally to deliver solutions that improve the availability, scalability, performance and efficiency of the company's Machine Learning Services,
- push requirements to aid in recovery automation and avoid problem recurrence,
- work with Development and Support to capture and drive root cause remediation data and requirements,
- be an advocate for continuous improvement of service and design,
- mentor developing talents and engineers,
- deployment of the company’s Machine Learning Services to various cloud providers,
- automation of development and deployment processes,
- management of distributed cloud landscapes (Infrastructure as Code),
- continuous delivery of our software,
- additionally, you will contribute to our Site Reliability Engineering processes.
- Bachelor's or Master's Degree in Software Engineering, Computer Science or a related technical field,
- 8+ years of experience in relevant roles (including 2-3 years in Ops or DevOps roles),
- proven experience in mentorship and technical leadership within an Ops or DevOps culture,
- experience with at least one of the following cloud-related technologies & concepts: Cloud Foundry, Kubernetes, Docker, AWS, Azure, OpenStack or other IaaS/PaaS environments,
- strong knowledge of operating Linux/Unix-based systems, including an understanding of the kernel, shell, scripting, etc.,
- strong familiarity with Enterprise class Fault Monitoring and Performance Management tools,
- experience in designing and implementing advanced automation pipelines with Jenkins,
- good working knowledge of widespread software development tools such as Git, GitHub, Nexus, Maven, etc.,
- excellent communication and interpersonal skills,
- excellent English language skills.
- Must be prepared to rapidly obtain strong familiarity with the company’s Machine Learning Services and technology set,
- an interest in hardware acceleration (GPU, TPU, FPGA),
- hands-on experience in load and security testing, understanding of IT security principles and disaster recovery,
- experience with enterprise database technology,
- understanding of infrastructure and networking methodologies and practices, including but not limited to TCP/IP, UDP and ICMP,
- solid understanding of Enterprise / Service Provider Data Center Architecture (high-density servers, backbone routers/switches, load balancers).