Employer: Molecula
Molecula is an Operational AI company that closes the gap between data and decision, enabling organizations to unlock the power of real-time analytics and AI. Our core technology, FeatureBase, is a feature-oriented database platform that powers real-time analytics and machine learning applications by simultaneously executing low-latency, high-throughput, and highly concurrent workloads. We are a burgeoning startup with a passionate team of dedicated engineers, marketers, and business experts determined to make a positive impact.
Molecula’s Engineering team is a group of brilliant makers and doers passionate about building world-class products and solutions that make AI and ML possible for all. They take really challenging technical problems and turn them into elegantly simple yet incredibly complex solutions that delight our users. Most of all, they take pride in their craft and are a collaborative bunch that truly cares about the team, clients, company, and opportunity.
Molecula is looking for a Site Reliability Engineer (SRE) to join our Engineering team. With your expertise in software and systems engineering, you will help us build and operate large-scale, distributed, fault-tolerant systems that will allow our users to push the boundaries of how data is accessed today. You will provide technical leadership across cross-functional software, infrastructure, data, security, and product teams to ensure we deliver the most reliable and stable feature store ever.
Responsibilities include
- Participate in the design of major software components, systems, and features to improve the reliability, availability, scalability, latency, and efficiency of Molecula’s services.
- Improve our infrastructure capabilities by guiding the definition of service level objectives for Molecula services.
- Support deployments, monitoring, and observability in new customer environments.
- Lead production incidents and work with cross-functional teams to drive root-cause analysis, reproduction, and resolution.
- Evangelize a culture of reliability and help mentor and train other team members on designing automation in order to meet service level objectives.
- Collaborate cross-functionally to enhance support processes, documentation, and playbooks.
Job Qualifications
- Minimum Qualifications
- 5+ years of software development experience, with at least 2 years in a DevOps or Site Reliability Engineer (SRE) role
- Experience with Go, Rust, C, Python, or similar programming languages
- Proven experience in end-to-end technical resolutions and root-cause analysis.
- Preferred Qualifications
- Bachelor’s degree in Computer Science, similar technical field of study, or equivalent practical experience.
- Experience with observability and automation for SaaS products
- Experience designing, analyzing, automating, and troubleshooting large-scale distributed systems
- Experience in networking, security, hardware or OS performance tuning
- Experience using CI/CD tools such as GitLab, Azure DevOps, or CircleCI to create end-to-end pipelines for all environments (Dev, Test, UAT, Production) is a plus.