Sr. Site Reliability Engineer
Title: Sr. Site Reliability Engineer
Location: Remote - must work East Coast hours
Duration: 18 months
Compensation: $79 per hour
Work Requirements: Authorized to Work in the US
TekPartners has some of the most sought-after Information Technology positions available. We are a reputable company in the IT staffing industry, and you can trust us to place you in the right position. We currently have an opportunity for a Sr. Site Reliability Engineer.
Skillset / Experience:
This position is for an eager Senior Site Reliability Engineer to play an integral role on our Parks Commercial Systems team to help elevate practices, promote and onboard new technologies, solve complex problems, and integrate next-generation digital platforms. Site Reliability Engineering (SRE) combines software and systems engineering disciplines to build and operationalize large-scale, massively distributed, fault-tolerant systems. SREs are talented engineers who improve the resiliency of production systems and reduce operational toil using a data-driven approach. The Senior SRE will help support business-critical systems for our guests and cast members within our client's segment. This includes consulting on, architecting, developing, and operationalizing infrastructure, applications, and automation; creating telemetry for monitoring; engineering for high reliability; and reinforcing operational best practices.
● Fluent in core scripting languages and advanced skills in programming languages (e.g. Python, Node, Java) with the ability to build test coverage for all software being developed.
● Systems administration expertise with Linux and Windows platforms, including OS performance monitoring, setup, configuration, tuning, and troubleshooting.
● Experience with a major Application Performance Monitoring (APM) tool (e.g. AppDynamics, New Relic)
● Networking skills, including core protocols (e.g. HTTP, TLS, SSH, DNS)
● Continuous Integration (CI) Pipeline knowledge (e.g. Jenkins, GitLab CI)
● Experience with Distributed Systems and Container Platforms (e.g. Kubernetes/GKE, ECS, Fargate)
● Experience with Source Control Management systems (e.g. Git)
● Expertise in public and private cloud hosting services (AWS, Google Cloud, Azure)
● Expert in web server technologies (e.g. Apache, Node.js, Nginx, Tomcat) including setup, configuration, performance monitoring, tuning, clustering, and debugging.
● Proficient with data technologies (e.g. NoSQL, MySQL, Redis, Elastic) including being able to perform basic setup, configuration, and troubleshooting.
● Able to implement existing base standards for new systems and/or applications for all of the following:
  o Site/Systems monitoring and instrumentation
  o Application monitoring and instrumentation
  o System monitoring and instrumentation
  o Resilience, performance & Telemetry data
● Able to diagnose simple to complex system and process problems.
● Demonstrate exceptional troubleshooting methodology, including the ability to author new methodologies and teach them to the SRE team.
● Independently resolve moderately to highly complex system and application incidents.
● Able to identify and propose system and application fixes for performance bottlenecks.
● Able to evaluate new application requirements for capacity and run-time best practices.
● Able to evaluate new systems and/or infrastructure solutions for technical feasibility against known requirements and standards.
Experience with Prometheus, AppDynamics, Splunk, SiteScope, Rundeck, or Jenkins is a plus.
Our benefits package includes:
- Comprehensive Medical Benefits
- Competitive Pay, 401K
- Retirement Plan
- And Much More
TekPartners is one of the fastest-growing private staffing firms in the United States. We are a premier provider of highly qualified IT talent, Workforce Solutions and Business Intelligence Solutions to many enterprise organizations across the nation. As experts in the industry, our team continues to match proven talent to the right job opportunity every day.
TekPartners is an Equal Opportunity Employer.