Who are we?
Cohere is the leading security-first enterprise AI company. We build cutting-edge foundation AI models and end-to-end products that are designed to solve real-world business problems.
We’re training and deploying frontier models for enterprises that are building AI systems. We believe that our work is instrumental to the widespread adoption of AI, and we are looking for people who want to be part of that.
We obsess over what we build. Each of us is responsible for helping increase the capabilities of our models and the value they drive for our customers. Cohere is a team of researchers, engineers, designers, and more, who are all passionate about their craft.
We are a global technology company co-headquartered in Toronto and San Francisco, with key offices in London, New York City, Montreal, Seoul, Germany and Paris. Join us!
ABOUT THE ROLE
We're seeking a Senior/Staff Engineer to build and maintain the automation infrastructure that powers the development cycles of our North platform. This engineer will design and implement robust automation systems that enable engineers to efficiently test and validate changes across diverse environments and configurations. This role sits at the intersection of infrastructure and standards. You'll build the systems, frameworks, and culture that allow the rest of engineering to own quality themselves. That means improving and extending our testing platform, creating the infrastructure that allows engineers to write and execute tests, and enabling every engineering team to ship with more confidence.
KEY RESPONSIBILITIES
- Design and implement automation pipelines that support comprehensive testing across multiple environments with varying feature flags and realistic customer data profiles
- Create intelligent testing agents that simulate real user behavior to validate different configuration combinations
- Develop and maintain GitHub workflows and actions to automate testing, deployment, and validation processes
- Manage and optimize Helm charts for deployment consistency across environments
- Implement and maintain ArgoCD workflows for continuous deployment and environment management
- Establish best practices for testing methodologies and ensure adoption across engineering teams
- Build scalable infrastructure that supports parallel test execution across diverse configurations
- Develop infrastructure-as-code templates and configurations for reproducible test environments
- Implement containerization strategies for test environments and dependencies
- Create benchmarking frameworks to measure performance and reliability across different configurations
- Monitor and improve test coverage and reliability metrics
- Collaborate with product and engineering teams to understand testing requirements and translate them into automated solutions
- Troubleshoot and resolve complex testing infrastructure issues
REQUIRED QUALIFICATIONS
- 5+ years of software engineering experience with a focus on automation and testing infrastructure
- Expert proficiency in Python and TypeScript
- Extensive experience with GitHub workflows and actions
- Deep understanding of testing methodologies and best practices
- Experience building and maintaining CI/CD pipelines
- Containerization experience (Docker, Kubernetes)
- Experience with benchmarking and performance testing methodologies
- Cloud platform experience (AWS, GCP, or Azure)
- Background in developer tools or platform engineering
- Ability to design and implement complex automation systems
- Strong problem-solving skills and attention to detail
PREFERRED QUALIFICATIONS
- Experience working with LLMs in production environments
- Familiarity with infrastructure-as-code principles
- Experience with container orchestration and management
- Knowledge of performance testing tools and frameworks
- Experience with monitoring and observability tools
- Background in test framework development