The core problem: building AI agents means measuring and iterating on subjective behavior (tone, decision-making, context usage). In our experience, the best setup is one where product managers own agent behavior while engineers build the workflows and infrastructure.
Restack's approach:
- Engineers build workflows in Python. Temporal and Kubernetes handle reliability and scalability.
- Product teams and domain experts A/B test and version-control prompts and context management, so iterating on behavior does not require engineering time.
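To make the A/B testing idea concrete, here is a minimal sketch of how a user could be assigned to a prompt version deterministically, so the same user always sees the same variant. This is not Restack's API: `PROMPT_VERSIONS`, `assign_variant`, and the experiment name are hypothetical names chosen for illustration.

```python
import hashlib

# Hypothetical prompt registry; version names and texts are illustrative only.
PROMPT_VERSIONS = {
    "v1": "You are a concise support agent.",
    "v2": "You are a friendly support agent. Ask a clarifying question when unsure.",
}


def assign_variant(user_id: str, experiment: str, weights: dict[str, float]) -> str:
    """Deterministically map a user to a prompt version, proportional to weights."""
    # Hash experiment + user so assignments differ between experiments.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).digest()
    # Turn the first 8 bytes into a number in [0, 1].
    point = int.from_bytes(digest[:8], "big") / 2**64
    total = sum(weights.values())
    cumulative = 0.0
    for version, weight in sorted(weights.items()):
        cumulative += weight / total
        if point < cumulative:
            return version
    # Guard against floating-point rounding at the upper edge.
    return sorted(weights)[-1]


version = assign_variant("user-42", "support-tone", {"v1": 0.5, "v2": 0.5})
system_prompt = PROMPT_VERSIONS[version]
```

Because the assignment depends only on the user ID and the experiment name, a product team could change the weights or add a version without any per-user state being stored.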
Technical stack (open source):
- React for frontend
- Temporal for retries and long-running workflows
- Kubernetes with the Horizontal Pod Autoscaler for agent scaling
- Context store built on ClickHouse
- Full observability and agent tracing
- MCP-compatible workflows
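For readers unfamiliar with how the Horizontal Pod Autoscaler decides on a replica count, the sketch below reproduces the scaling rule from the Kubernetes documentation: desired replicas = ceil(current replicas × current metric / target metric), with no change when the ratio is within a tolerance (0.1 by default). The function name and the min/max defaults are assumptions for illustration; the real controller also accounts for unready pods, missing metrics, and stabilization windows, and nothing here describes how Restack configures it.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,   # illustrative default, not a Restack setting
    max_replicas: int = 10,  # illustrative default, not a Restack setting
    tolerance: float = 0.1,  # Kubernetes' documented default tolerance
) -> int:
    """Approximate the HPA replica calculation for a single metric."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to target: keep the current count.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


# 4 agent pods at 90% average CPU with a 60% target scale up to 6.
print(desired_replicas(4, 90, 60))
```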