CloudLinux is looking for a brilliant Senior Site Reliability Engineer (SRE) to join the Release Engineering Department, a team that plays a critical role in maintaining both external and internal infrastructure related to package repositories, with a strong focus on delivering and managing repository distribution to users.
This role offers a unique opportunity to collaborate with multiple development teams, accelerate progress, and provide enterprise-level solutions globally. Responsibilities include Linux OS administration, designing system solutions at an architectural level, advancing cloud technologies, system programming, Python/Linux scripting, and working with virtualization. This is a remote position best suited for professionals located in Europe and the CIS, as the team primarily operates within European time zones.
As our Senior Site Reliability Engineer, you will:
- As a first assignment, design, implement, and manage scalable, resilient, and secure company-wide repository infrastructure for CloudLinux products.
- Automate software operations for re-usability and consistency across private and public clouds, taking into consideration the complexities of distributed systems.
- Monitor system performance and troubleshoot issues proactively to ensure optimal uptime and reliability.
- Automate deployment processes using Infrastructure as Code (IaC) principles.
- Share your experience, know-how, and best practices with other team members in design sessions, system architecture discussions, mentorship, and "doing work together".
Requirements
To be successful, you should have:
- Strong background in development: the ideal candidate started their career as a developer and then moved on to large-scale infrastructure projects.
- Proven experience as a leading SRE or in a similar role, with a strong focus on Linux environments.
- Proficiency in modern agile SDLC practices and principles, orchestration, and CI/CD tooling, e.g. Python, Java, Terraform, Ansible, CloudFormation, Puppet, Chef, or similar.
- Knowledge of the Grafana ecosystem or similar, including building dashboards, alert rules, and PromQL queries, as well as frontend observability.
- Excellent technical knowledge of IT Infrastructure, including network and application load balancers, switches, routers, and IP addressing.
- Strong analytical and problem-solving skills with a focus on root cause analysis and mitigation.
- Excellent communication and teamwork skills with the ability to collaborate effectively across engineering teams.
- English: at least Intermediate level required.
Benefits
What's in it for you?
- A focus on professional development.
- Interesting and challenging projects.
- Fully remote work with flexible working hours, which allows you to schedule your own day and work from any location worldwide.
- 24 days of paid vacation per year, 10 days of national holidays, and unlimited sick leave.
- Compensation for private medical insurance.
- Co-working and gym/sports reimbursement.
- Budget for education.
- The opportunity to receive a reward for the most innovative idea that the company can patent.
By applying for this position, you consent to the processing of your personal data as described in our Privacy Policy (https://cloudlinux.com/candidate-privacy-notice), which provides detailed information on how we maintain and handle your data.