About Algebra
Algebra builds and operates AI-powered workflows for mid-market companies as a managed service. We identify high-value operational processes, design AI systems to run them, and own the outcome end to end. We are not consultants. We do not sell software licenses. We are the accountable operator.
The Role
This is a senior infrastructure and engineering operations role with no large team below you. You will own the cloud, deployment, security, and reliability foundation behind the AI workflows Algebra builds for clients.
You will design the deployment architecture, own cloud infrastructure, improve CI/CD, harden environments, set up monitoring, and make sure what we build can actually run in production.
This is a builder role. You will not inherit a mature platform with perfect documentation and fixed processes. You will create the standards, patterns, and systems that allow Algebra to scale from individual client deployments into a repeatable operating model.
What You'll Do
Cloud Infrastructure
- Own Algebra’s cloud infrastructure across internal products, client deployments, and AI workflow environments
- Design secure, scalable deployment architecture for production systems
- Build and maintain infrastructure-as-code, environment templates, and repeatable deployment patterns
- Manage staging, production, and client-specific environments
- Make practical decisions across cloud architecture, cost, scalability, security, and delivery speed
- Support deployment across cloud providers or client infrastructure where required
DevOps and Release Management
- Own CI/CD pipelines across backend services, workflow systems, and AI agent deployments
- Improve release processes so engineers can ship faster without breaking production
- Create deployment standards for new client implementations
- Automate manual operational processes wherever possible
- Manage secrets, environment variables, access controls, build pipelines, and release workflows
- Work closely with backend and AI engineers to make deployment a product strength, not a bottleneck
Reliability and Observability
- Build monitoring, logging, alerting, and incident visibility into production systems
- Set up health checks, uptime monitoring, error tracking, and performance dashboards
- Create operational runbooks for deployments, incidents, rollback, and client support
- Identify reliability risks before they become client problems
- Improve system resilience, recovery processes, and production readiness
- Help move the company from “it works” to “it is stable, observable, and supportable”
Engineering Systems
- Create reusable infrastructure patterns for future client deployments
- Build the technical operating model for how Algebra deploys, monitors, and supports AI workflows
- Improve developer experience across local development, staging, testing, and production release
- Identify technical debt in the infrastructure layer and fix it before it compounds
- Work across product, engineering, and delivery teams to make sure systems are not just built, but actually operable
What We're Looking For
- 10+ years of experience across cloud infrastructure, DevOps, platform engineering, or site reliability engineering
- You have owned production infrastructure for real systems, not just supported pipelines
- Strong experience with at least one major cloud provider: AWS, Azure, or GCP
- Strong experience with CI/CD, Docker, containers, infrastructure-as-code, deployment automation, and environment management
- You know how to build observability into systems: logging, monitoring, alerting, tracing, and incident response
- You can design infrastructure that balances speed, security, cost, reliability, and client requirements
- You can work directly with engineers, founders, and client-facing teams without needing everything translated into tickets
Bonus Points
- You have deployed AI, automation, workflow orchestration, or agent-based systems before
- You have experience with multi-tenant or client-specific deployment architecture
- You have worked with regulated or security-sensitive clients in finance, healthcare, government, or professional services
- You understand LLM infrastructure patterns, vector databases, model APIs, agent frameworks, or AI observability
- You have experience with SOC 2, ISO 27001, penetration testing, or enterprise security reviews
- You have worked inside an early-stage startup or services-led technology business
What This Role Is Not
- This is not a cloud maintenance role. You will not just keep servers running. You will design how Algebra deploys and operates real client systems.
- This is not a pure DevOps ticket role. You will be expected to form a view, make decisions, and create the infrastructure standards the company will scale on.
- This is not “move fast and ignore production.” Our systems run real business workflows for clients, so reliability matters.
- If your best work is inside a large organization where every decision has a committee, this is probably not the right fit.
If you want to build the technical backbone of an AI operations company from the ground up, we want to talk.