DevOps Engineer

We are seeking a Tech Lead – Site Reliability Engineering with expertise in DevOps, QA, and Cloud to lead reliability, automation, and performance engineering efforts across cloud-based systems. This role involves leading teams, establishing SRE best practices, and implementing scalable cloud architectures to ensure high availability, security, and efficiency.

Responsibilities

SRE & Cloud Reliability Engineering:

Design and implement highly available, scalable cloud architectures.

Ensure uptime and system reliability through proactive monitoring and incident management.

Automate infrastructure provisioning and scaling using Terraform, Ansible, Kubernetes, and Helm.

DevOps & Automation:

Develop and maintain CI/CD pipelines for automated build, test, and deployment.

Implement GitOps workflows to streamline deployment processes.

Optimize performance and cost-efficiency of cloud environments.

QA & Test Automation:

Lead automated testing strategies for API reliability, performance, and security.

Implement Test-Driven Development (TDD) and Continuous Testing methodologies.

Perform load testing, stress testing, and resilience testing to prevent failures.

Observability, Monitoring & Incident Response:

Set up monitoring and alerting dashboards using Prometheus, Grafana, Splunk, and Datadog.

Implement log aggregation and distributed tracing for deep observability.

Lead incident response, root cause analysis, and post-mortem analysis.

Security & Compliance:

Enforce cloud security best practices (IAM policies, Zero Trust, cloud encryption).

Ensure compliance with regulatory standards (SOC 2, ISO 27001, GDPR, HIPAA).

Implement threat detection and anomaly detection using AI-driven monitoring tools.

Leadership & SRE Strategy:

Lead and mentor SRE engineers, ensuring adoption of best practices.

Collaborate with DevOps, Security, and Cloud teams to implement scalable and secure cloud infrastructures.

Establish SRE operational strategies, playbooks, and incident management frameworks.

Qualifications

Bachelor’s/Master’s degree in Computer Science, IT, or a related field.

7+ years of experience in Site Reliability Engineering, DevOps, and Cloud infrastructure.

Deep expertise in SRE methodologies, cloud reliability, and distributed systems.

Proficiency in container orchestration, cloud automation, and infrastructure as code.

Strong knowledge of observability, monitoring, and AI-powered performance tuning.

Experience with security compliance, disaster recovery, and failure prevention.

Proficiency in scripting and automation (Python, Bash, Go, YAML).

Proven experience in leading SRE teams, defining strategies, and implementing best practices.

Must-Have Skills:

Expertise in Site Reliability Engineering (SRE), DevOps, and Cloud Automation.

Experience in cloud computing platforms (AWS, GCP, Azure, OpenStack).

Proficiency in Infrastructure as Code (Terraform, Ansible, CloudFormation).

Hands-on experience with Kubernetes, Docker, and OpenShift.

Deep understanding of observability, monitoring, and logging (Prometheus, Grafana, ELK, Splunk, Datadog).

Experience in CI/CD pipeline automation (Jenkins, GitHub Actions, ArgoCD, Tekton).

Expertise in performance tuning, cloud scaling, and system optimization.

Proficiency in QA automation, API reliability testing, and microservices validation.

Strong background in networking, traffic routing, and load balancing.

Experience in incident response, disaster recovery, and reliability planning.

Leadership experience in mentoring, managing teams, and driving SRE best practices.

Preferred Skills:

Experience with Chaos Engineering and fault injection frameworks (Gremlin, LitmusChaos).

Knowledge of cloud cost optimization strategies and FinOps.

Understanding of Zero Trust Security, IAM policies, and cloud-native security practices.

Proficiency in scripting and automation (Python, Bash, Go).

Familiarity with ML-based anomaly detection for reliability monitoring.
