Site Reliability Engineer Job at iCIMS, Holmdel, NJ

  • iCIMS
  • Holmdel, NJ

Job Description

Job Summary

We are seeking a skilled Site Reliability Engineer (SRE) to contribute to the reliability, scalability, and performance of our multi-cloud SaaS platform, which serves thousands of customers worldwide. The role involves hands-on technical work in incident response, system monitoring, automation, and continuous improvement of platform reliability. The successful candidate will work within a global SRE team to ensure optimal system performance and customer satisfaction.

Responsibilities

  • System Monitoring & Reliability:
    • Monitor multi-cloud infrastructure (AWS, Azure, GCP) using New Relic, Grafana, and Sumo Logic
    • Maintain reliability of AWS resources, Auth0/Okta authentication, databases, and legacy applications
    • Implement monitoring, alerting, and dashboards for assigned systems
  • Incident Management & Response:
    • Respond to alerts and incidents within SLA timeframes
    • Perform root cause analysis and document findings
    • Create and maintain runbooks and troubleshooting procedures
    • Participate in a 24/7 on-call rotation
  • Automation & Improvement:
    • Develop scripts to reduce manual operational overhead
    • Build monitoring and alerting solutions
    • Support infrastructure-as-code initiatives
    • Implement automated remediation where possible
  • Success Metrics:
    • Customer Impact: Reduced MTTR and improved customer satisfaction scores
    • Reliability: Achievement of 99.9%+ uptime SLAs across all products and regions
    • Proactive Prevention: Reduction in incident frequency through automated detection and prevention
    • Cross-functional Collaboration: Improved partnership metrics with Product, Engineering, and Customer Success teams
    • Automation Delivery: Completion of assigned automation projects that reduce manual tasks
    • Knowledge Sharing: Contributions to the team knowledge base and mentoring of junior engineers

Qualifications

  • 4+ years of experience in SRE, DevOps, or Infrastructure Engineering
  • Hands-on experience with AWS (required) and Azure (preferred)
  • Strong Linux system administration skills
  • Experience with monitoring tools (New Relic, Grafana, Prometheus)
  • Scripting skills in Python, Bash, or similar
  • Knowledge of databases (SQL Server, PostgreSQL, MongoDB)

Job Tags

Worldwide
