Site Reliability Engineer
A AAA gaming client is looking for a Site Reliability Engineer to join its Infrastructure team. This is a contract position expected to last 12 to 18 months.
The hired candidate will make sure all teams can install game builds and tools and access internal services.
The candidate must have experience with modern build pipelines.
The candidate must also be able to build scalable services, write reliable automation, and design workflows.
Responsibilities
Participate in the support rotation with other members of the Infrastructure team.
Handle day-to-day support operations on a large-scale computer farm.
Partner with the IT team to improve farm stability, performance, and maintainability.
Plan, migrate, and support the computer farm's transition to Microsoft Azure where applicable.
Support Infrastructure and partner teams on development initiatives.
Investigate, report, and resolve farm and Infrastructure team issues.
Create, adjust, and monitor Infrastructure team SLAs, SLIs, and SLOs, and work toward resolving any failing indicators.
Create telemetry and dashboards to visualize farm health.
Qualifications and Skills
3+ years of experience in software development
Bachelor's Degree in Computer Science, or comparable experience
Extensive experience debugging, troubleshooting, and fixing problems in a Windows environment
Proficiency in creating tools in PowerShell, Python, or C#
Knowledge of source control systems
Familiarity with cloud platforms and cloud provisioning tools
Experience with configuration management tools
Experience with Git and Perforce
Experience with Azure DevOps, Azure Monitor Workbooks, Kusto, and App Insights
Experience with configuration management and provisioning tools such as Puppet, Chef, Ansible, Terraform, and Packer
Experience with Kubernetes, Docker, and other container technologies
Experience with SQL / NoSQL databases