Staff Software Engineer

Cleric

Software Engineering
San Francisco, CA, USA
Posted on Jun 9, 2025

Join us at Cleric

We're building an autonomous AI SRE that helps software engineering teams reliably investigate production incidents. Our agent combines LLMs with tools to understand systems, reason through problems, and take corrective actions - even for issues it hasn't encountered before. Our mission is to let engineers focus on building products, not fighting fires.

We're a small team of AI and infrastructure veterans backed by leading AI investors. Cleric is already in production at high-scale companies, saving engineers hundreds of hours in investigations.

About the role

You'll help us scale our AI agent across multiple customer environments. You'll design and implement the infrastructure for agent deployment, execution, evaluation, and learning - creating systems that let us run our agents efficiently and reliably.

You'll architect systems to process agent telemetry data, manage our simulation environments, and train our AI agent. This includes building deployment pipelines, scaling mechanisms for handling increased load, and expanding our integrations (observability, cloud provider APIs, etc.) to suit diverse customer environments.

Beyond the technical implementation, you'll also set practical standards in code review, CI and observability to keep quality high while we scale. You'll mentor other engineers, provide technical direction, and ensure we're making pragmatic architectural decisions that balance immediate needs with long-term scalability.

You'll have technical autonomy in designing and implementing these systems, working closely with our founding team to expand our platform capabilities while maintaining high engineering standards.

What you'll do:

  • Build and scale our agent platform, evaluation systems, and the web applications where users manage their Cleric deployment

  • Design agent telemetry pipelines that power our simulation environments and evaluations

  • Implement monitoring and observability systems to track platform and agent performance

  • Create testing frameworks and development patterns to maintain engineering quality

  • Build APIs, tools, and libraries that help our team ship quickly and reliably

  • Establish engineering best practices and mentor other engineers as we grow

You have:

  • 6+ years of production software engineering experience

  • Strong software engineering fundamentals with a focus on simplicity and maintainability

  • Track record of working with data scientists to build high-scale production services

  • Deep experience with observability tools and practices (Datadog, OpenTelemetry)

  • Experience with cloud infrastructure (GCP, AWS) and containerization

  • Extensive experience in Python and at least one high-performance language like Java, Scala, Go, or Rust

  • You've worn the pager and fixed 2 AM outages

  • You challenge assumptions and propose pragmatic solutions

  • Curiosity and drive to learn new technologies

Nice to have:

  • Experience with LLM-based systems

  • Hands-on with ML / LLM platform tooling (feature stores, model serving, eval)

  • Background in building developer platforms

  • Previous startup experience

How we work:

  • Small teams, big impact: We believe that small teams can deliver great products

  • Culture matters: We like to give and receive direct feedback and keep the room inclusive

  • In-person collaboration: We believe in working closely to deliver the best results

  • AI-first approach: We use AI daily in the way we build and run the company

Interview process (you'll meet most of the team along the way)

  1. Intro Call

    • Discuss your experience, the company, product, and the role

    • Lightning tech screen (10 mins) to swap ideas on engineering practices

  2. Software Engineering Session (1 hour)

    • Collaboratively build an application

    • Focus on practical software engineering, not algorithm challenges

  3. System Design Session (90 mins)

    • Work through a system design problem relevant to your daily work

  4. Bar Raiser (60 mins)

    • Product thinking

    • Engineering practices