2025

Designing and Evaluating LLM Agents Through the Lens of Collaborative Effort Scaling

October 30, 2025 · 17 min read

Current evaluations of agents remain centered around one-shot task completion, failing to account for the inherently iterative and collaborative nature of many real-world problems where human goals are often underspecified and evolve. We introduce collaborative effort scaling, a framework that captures how an agent’s utility grows with increasing user involvement and reveals critical gaps in current agents’ ability to sustain engagement and scaffold user understanding.

2023

Introducing Chapyter

July 13, 2023 · 7 min read

Chapyter is a JupyterLab extension that seamlessly connects GPT-4 to your coding environment. It features a code interpreter that can translate your natural language description into Python code and automatically execute it. Incorporating powerful code generation models like GPT-4 into the notebook coding environment opens up new modes of human-AI collaboration. By enabling “natural language programming” in your most familiar IDE, Chapyter can boost your productivity and empower you to explore many new ideas that you would not have tried otherwise.