Job Description
Contract // Toronto, ON (Hybrid). Please send resumes to charandeep.singh@tekishub.us.
Job Title: Site Reliability Engineer
Location: Toronto, ON (Hybrid)
Job Type: Contract
Job Description:
• Monitor and maintain system reliability using tools like DataDog, VictorOps, ELK, Grafana, and Prometheus.
• Ensure uptime and performance by proactively identifying issues and responding to alerts.
• Troubleshoot, investigate, and resolve complex technical issues, collaborating with the engineering team when needed for timely resolution.
• Handle production incidents by analyzing root causes, prioritizing resolution, escalating as needed, and adhering to defined SLAs, SLIs, and SLOs.
• Develop and implement automation scripts (Python or other scripting languages) to streamline operational tasks, improve system efficiency, and reduce manual workload.
• Manage and maintain infrastructure across AWS environments.