About the Role

Title: Sr. Site Reliability Engineer – Container

Location: Plano, TX, United States

Job Description:

Alkami is a leading cloud-based digital banking solutions provider for financial institutions in the United States that helps clients to transform through retail and business banking, digital account opening and loan origination, payment fraud prevention, and data analytics and engagement solutions. Alkami’s Mobile App Platform has been certified by J.D. Power for providing clients with “An Outstanding Mobile Banking Platform Experience.”

Founded in 2009, we continue to be recognized for our intentional culture and tremendous growth (Best Place to Work in Fintech; Best & Brightest to Work For Nationally; and Comparably’s Best Company Culture, Best Career Growth, Best Engineering Team, and Best Places to Work in Dallas, among others). Through our bold investments in technology and people, we empower our clients to grow confidently, adapt quickly, and build thriving digital banking communities through tailored experiences for over 19.5M users.

Alkami is a remote-first company, so this position can be based in Plano, TX, or be remote within the US.

Follow us on Glassdoor and LinkedIn!

Position Overview:

The Senior Site Reliability Engineer (SRE), specializing in Containers, is a senior-level position within our organization. This individual is responsible for ensuring the reliability, scalability, and performance of our infrastructure, with a primary focus on container orchestration and related technologies.

Key Responsibilities & Duties:

  • Participate in the architecture and implementation of scalable, long-term platform solutions that meet business requirements
  • Collaborate with software engineers, DevOps, Information Security, and other teams to integrate applications into the platform
  • Create and share guidance for other teams on the proper implementation of infrastructure, including technical guides with best practices
  • Monitor system health, performance, and reliability, and implement proactive measures to prevent downtime
  • Implement processes to ensure security vulnerabilities are remediated within SLA
  • Investigate and troubleshoot complex system and performance issues, providing root cause analysis and solutions
  • Identify opportunities for process and system improvements and contribute to ongoing performance and cost optimization efforts
  • Stay informed about industry trends and best practices to continually improve the platform
  • Participate in an on-call schedule
  • Create documentation for system infrastructure and processes

Qualifications:

  • Bachelor’s degree in engineering, CS, physics, math, statistics, or another related field, or equivalent work experience
  • 4+ years of experience in a DevOps / SRE / Platform Engineering role
  • Direct experience with cloud platforms (e.g., AWS, Azure, GCP)
  • Proficiency in scripting and automation using languages common among DevOps professionals, such as Python, Bash, or PowerShell, or experience with Java/.NET development
  • Familiarity with creating and modifying infrastructure-as-code (IaC)
  • Strong experience with modern CI/CD tooling focused around rapid container deployment
  • Strong understanding of networking, load balancing, and security principles
  • Experience using automation tools to build, provision, deploy, test, and monitor
  • Familiarity with creating physical and logical infrastructure diagrams
  • Ability to communicate effectively both verbally and in writing, adapting communication style to different audiences
  • Effective presentation skills
  • Ability to work cross-functionally
  • Ability to mentor team members
  • Ability to work independently, with work reviewed at critical points
  • Ability to act as a key stakeholder in projects of diverse scope, from design to completion
  • Ability to enhance relationships with internal and external partners
  • Ability to participate in on-call rotation as assigned

Desired Skills:

  • Master’s degree in computer science or related field
  • Experience with containerization and orchestration technologies, such as Docker and Kubernetes
  • Understanding of regulatory standards or experience working in a PCI environment
  • Previous experience with Git or a similar source code management system
  • Experience with monitoring and observability tools (e.g., Prometheus, Grafana, ELK stack)
  • Associate- or Professional-level certification in cloud, container, or IaC technologies
