Site Reliability Engineering: DevOps & Automation
Aera Technology
4+ weeks ago
- Pune, Pune Division, Maharashtra
- Not Disclosed
- Full-time
- Permanent
Job Summary
Site Reliability Engineering at Aera is creating next-generation, hybrid cloud infrastructure that enables our SaaS platform to process billions of Machine Learning transactions on petabytes of data every day.
As our customer base rapidly grows, we are looking for experienced Site Reliability Engineers to join our global Software Engineering team and help us deliver our vision.
If you share our passion for building the next generation of enterprise software, and implementing it for the most sophisticated customers in the world, you've met your match. Headquartered in Mountain View, California, we're growing fast, with teams in Mountain View and San Francisco (California), Bucharest and Cluj-Napoca (Romania), Paris (France), Munich (Germany), London (UK), Pune and Bangalore (India), Sydney (Australia) and Singapore. So join us, and let's build this!
In this role you will:
- Design, build, release, and maintain a fully automated, Infrastructure as Code ecosystem that ensures 4+ nines availability of our SaaS platform.
- Continuously innovate your way out of existing and yet-to-be-discovered problems, with an eye on what's next as we anticipate and remain ahead of customer expectations.
- Obsess about, measure, and optimise system performance, continuously pushing your capabilities beyond current boundaries as our platform scales and customer base grows.
- Learn what a healthy platform ecosystem looks like, and build Observability into the platform to prevent outages from impacting service availability.
- Seek out and build relationships across teams that positively impact our culture of collaboration and innovation, with an understanding of how your work contributes to the bottom line of the business.
Your day will consist of:
- Participating in infrastructure design, platform management, and capacity planning discussions to ensure we are scaling to meet business needs.
- Writing code that automates activities that have historically been executed manually.
- Gathering and analyzing metrics from our platform using Observability methods to assist in performance tuning, debugging, and root cause analysis.
- Collaborating with development teams to improve our platform services through innovative new designs, rigorous testing and release methods.
- Ensuring we are meeting our Service Level Objectives (SLOs) by reviewing our Service Level Indicators (SLIs) and reporting deviations along with remediation and mitigation plans and schedules.
- Helping restore service availability, followed by debugging and root cause analysis for issues that occur in our Production environments.
- Helping provide 24/7/365 coverage in a Follow-the-Sun model for on-call support.
Your ideal qualifications are:
- A Bachelor's degree in Computer Science or a related technical and/or scientific discipline. A strong background in advanced Mathematics is a plus.
- Ability to write code (structured and OO) in one or more high-level languages, such as Python, Java, C/C++, or JavaScript.
- Ability to write code using multiple automation tools and languages, such as Terraform and Ansible.
- Working knowledge of Cloud-based technologies, providers, and tools such as Kubernetes, service meshes, AWS, Azure, GCP, etc.
- Experience with large-scale distributed systems that incorporate modern databases (Cassandra, SQL) and big data platforms (Exasol).
- Experience using various real-time and historical monitoring tools such as ELK, DataDog, Prometheus, Nagios, etc., to troubleshoot issues in our platform.
- A proactive approach to spotting problems, areas for improvement, and performance bottlenecks, as well as an unwavering commitment to identifying root causes of infrastructure issues and resolving them.
- 3+ years working as an SRE maintaining complex, distributed systems in real time.
Experience Required:
Fresher
Vacancy:
2 - 4 Hires