Site Reliability Engineer Job at iCIMS, Holmdel, NJ

  • iCIMS
  • Holmdel, NJ

Job Description

Job Summary

We are seeking a skilled Site Reliability Engineer (SRE) to contribute to the reliability, scalability, and performance of our multi-cloud SaaS platform serving thousands of customers worldwide. This role involves hands-on technical work in incident response, system monitoring, automation, and continuous improvement of platform reliability. The successful candidate will work within a global SRE team to ensure optimal system performance and customer satisfaction.

Responsibilities

  • System Monitoring & Reliability:
    • Monitor multi-cloud infrastructure (AWS, Azure, GCP) using New Relic, Grafana, and Sumo Logic
    • Maintain reliability of AWS resources, Auth0/Okta authentication, databases, and legacy applications
    • Implement monitoring, alerting, and dashboards for assigned systems
  • Incident Management & Response:
    • Respond to alerts and incidents within SLA timeframes
    • Perform root cause analysis and document findings
    • Create and maintain runbooks and troubleshooting procedures
    • Participate in 24/7 on-call rotation
  • Automation & Improvement:
    • Develop scripts to reduce manual operational overhead
    • Build monitoring and alerting solutions
    • Support infrastructure-as-code initiatives
    • Implement automated remediation where possible
  • Success Metrics:
    • Customer Impact: Reduced MTTR and improved customer satisfaction scores
    • Reliability: Achievement of 99.9%+ uptime SLAs across all products and regions
    • Proactive Prevention: Reduction in incident frequency through automated detection and prevention
    • Cross-functional Collaboration: Improved partnership metrics with Product, Engineering, and Customer Success teams
    • Automation Delivery: Complete assigned automation projects to reduce manual tasks
    • Knowledge Sharing: Contribute to team knowledge base and mentor junior engineers

Qualifications

  • 4+ years of experience in SRE, DevOps, or Infrastructure Engineering
  • Hands-on experience with AWS (required) and Azure (preferred)
  • Strong Linux system administration skills
  • Experience with monitoring tools (New Relic, Grafana, Prometheus)
  • Scripting skills in Python, Bash, or similar
  • Knowledge of databases (SQL Server, PostgreSQL, MongoDB)

Job Tags

Worldwide
