Employer: Cox Automotive
As a DevOps/Site Reliability Engineer, you will work with multiple software engineering teams focused on building, running, and monitoring fault-tolerant systems on-prem and in the cloud. You will collaborate with some very smart people, focusing on performance and reliability while increasing the resiliency of our applications. The ideal candidate is comfortable investigating how software performs, how network traffic flows, and how services interact. If you love figuring out how all the pieces fit together and interact in a complex environment, have a passion for troubleshooting, root cause analysis, and innovating with the newest technologies, and enjoy building solutions for resilient, “always available” applications, we want to talk to you.
What You Get To Do:
- Work with product managers and software engineers to increase the scalability, reliability, and performance of our systems
- Think at scale, with a focus on ensuring stability and maximizing the performance of services you own
- Take advantage of cloud computing capabilities using Amazon Web Services (AWS)
- Participate in service capacity planning and demand forecasting, software performance analysis and system tuning
- Roll up your sleeves to troubleshoot problems across the entire stack (hardware, network, datastores, and application) and build automation to prevent problems from recurring
- Identify underlying root causes and work with engineering teams to fix critical production issues or recommend long-term, permanent solutions
- Take ownership and strive to do work you’re proud of. You believe in spreading (and acquiring) knowledge through mentorship and collaboration
- Develop effective documentation, tooling, and alerts to both identify and address reliability risks.
- Participate in on-call rotation with other members of the Software Engineering team.
What You Need:
- Bachelor’s in Computer Science or related field, or equivalent experience.
- Experience with HTML5 and CSS3 web standards.
- Knowledge of web libraries and frameworks, such as Angular, React, TypeScript, SCSS, and Bootstrap.
- Solid Windows and Linux experience
- Experience using a distributed version control system (Git preferred) for check-ins, branching, merging, pull requests, code reviews, etc.
- Familiarity with configuration management and infrastructure as code (IaC) tools such as Ansible, Terraform or CloudFormation
- Knowledge of CI/CD best practices and tools such as AWS CodeBuild, Jenkins and TeamCity
- Experience designing and delivering secure, high-performance and highly available cloud services
- Experience working with partners to define and track SLIs, SLOs and SLAs using metrics and monitoring to ensure the objectives are met or exceeded
- Strong understanding of networking and DNS
- Experience working with container technologies such as Docker, Rancher, Kubernetes
We’d Love to See:
- Someone who sees technology as a hobby, not work; you enjoy taking things apart and putting them back together to improve them
- Experience with monitoring, analysis, and alerting tools like New Relic, Splunk and OverOps
- Experience building infrastructure and supporting applications in AWS using services such as Elastic Beanstalk, Lambda, EC2, ECS, S3, SNS, Aurora, RDS, DynamoDB
- Understand and practice cost containment and Game Day activities
- You thrive on technical challenges and take pride in solving them
- Deliver insightful recommendations in a concise and persuasive manner
- Strong interpersonal and communication skills with a focus on customer service
Location: ME, NH, MA, RI, CT, NY, NJ, DE, MD, VA, NC, SC, GA, FL