Site Reliability Engineer (SRE)

San Jose, California, United States, 95112

SRE - AWS/Terraform

Description:

We are seeking a highly skilled and motivated Site Reliability Engineer (SRE) to join our dynamic team. In this role, you will play a crucial part in managing and optimizing our AWS infrastructure using Terraform and other standard SRE tools and practices. As an SRE, you will collaborate closely with cross-functional teams to ensure the reliability, scalability, and performance of our systems.

Responsibilities:

  • Design, implement, and manage our AWS infrastructure using Terraform and other relevant technologies to ensure scalability, reliability, and security.
  • Collaborate with development and operations teams to implement and improve deployment strategies, monitoring, and incident response.
  • Implement and enhance automated processes for infrastructure provisioning, configuration management, and application deployment.
  • Monitor system performance and reliability, proactively identifying and addressing potential issues to minimize downtime and improve overall system efficiency.
  • Conduct regular system reviews, making recommendations for optimizations and enhancements to ensure the best possible performance.
  • Implement and maintain infrastructure-as-code practices to enable version-controlled and reproducible infrastructure.
  • Collaborate with other engineers in architectural discussions, contributing insights and expertise to enhance the overall system design and reliability.
  • Participate in on-call rotations to respond to incidents and troubleshoot production issues promptly.
  • Stay current with the latest technologies and best practices in SRE and cloud infrastructure management, recommending improvements to enhance system resilience and efficiency.
  • Work in a collaborative, fast-paced environment, focusing on delivering high-quality, reliable systems that meet the needs of our users.

Requirements:

  • B.S. or higher degree in Computer Science/Engineering or a related field, or equivalent work experience.
  • 5-8 years of industry experience in Site Reliability Engineering or a related role.
  • Proficient in managing AWS infrastructure and deploying resources using Terraform.
  • Experience with container orchestration tools (e.g., Kubernetes) and microservices architecture.
  • Solid understanding of networking principles, security best practices, and system performance optimization.
  • Strong scripting and automation skills (e.g., Python, Bash) for infrastructure management.
  • Familiarity with monitoring tools and practices for ensuring system health and performance.
  • Previous experience with on-call rotations and incident response.

What We Offer:

  • Competitive salary and benefits package.
  • Flexible working hours.
  • A dynamic and collaborative work environment.
  • Opportunities for professional development and growth.
  • An innovative culture that encourages creative thinking and problem-solving.

About Us:

Kognitos is a cutting-edge automation platform that uses Generative AI and Natural Language Processing (NLP) to provide a conversational and intuitive experience for business users. As a rapidly growing company, we are committed to fostering a diverse work environment and are proud to be an equal opportunity employer. We highly value diversity in our employees and do not discriminate on the basis of any characteristic protected by law. If you are a passionate and autonomous engineer with expertise in SRE, we invite you to join our team and contribute to our innovative solutions.

Remote: No

Type: Full Time

Final note

You do not need to match all of the listed expectations to apply for this position. We are committed to building a team with a variety of backgrounds, experiences, and skills.

Equal opportunities provider

Kognitos is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.