2026

Evolve your repo, not your agent

May 21, 2026 7 min read

Recently, we’ve seen clear collaboration gaps in today’s coding agents, which are designed for short-form, single-user chats. As human-agent collaborations grow to last days or weeks, complex collaboration patterns emerge, from managing multiple long-running agents to coordinating across human teams and agents. We propose the Self-evolving Repository (Sepo), a framework for long-horizon, multi-user agent collaboration. It implements a harness that runs multiple coding agents (Codex/Claude Code) natively on GitHub, turning any GitHub repository into a living artifact with native support for long-horizon collaboration between human teams and agents. On top of this, Sepo incorporates a continual learning framework that distills team preferences from past interactions into rubrics (stored on the agent/rubrics branch) and uses them to steer future coding sessions via rubric reviews and refinement. We’ve found Sepo practically useful for managing repo-level development tasks, and it can enable a range of novel human-agent coding interactions. Try it today by starting a new Sepo, or install it into an existing repository with one command.

2025

Designing and Evaluating LLM Agents Through the Lens of Collaborative Effort Scaling

October 30, 2025 17 min read

Current evaluations of agents remain centered around one-shot task completion, failing to account for the inherently iterative and collaborative nature of many real-world problems where human goals are often underspecified and evolve. We introduce collaborative effort scaling, a framework that captures how an agent’s utility grows with increasing user involvement and reveals critical gaps in current agents’ ability to sustain engagement and scaffold user understanding.

2023

Introducing Chapyter

July 13, 2023 7 min read

Chapyter is a JupyterLab extension that seamlessly connects GPT-4 to your coding environment. It features a code interpreter that can translate your natural language description into Python code and automatically execute it. Incorporating powerful code generation models like GPT-4 into the notebook coding environment opens up new modes of human-AI collaboration. By enabling “natural language programming” in your most familiar IDE, Chapyter can boost your productivity and empower you to explore many new ideas that you would not have tried otherwise.