OpenSpace: Self-Evolving AI Agent Skills with 4.7k+ GitHub Stars

OpenSpace is a groundbreaking self-evolving skill engine that transforms AI agents into continuously learning systems. With 4.7k+ GitHub stars and active development, it's reshaping how agents like OpenClaw, Claude Code, and Codex learn from experience and share knowledge across teams.

What is OpenSpace?

OpenSpace is an open-source framework developed by HKUDS that plugs into any AI agent as a Model Context Protocol (MCP) server. Unlike traditional agents that start from scratch on every task, OpenSpace enables agents to evolve reusable skills, dramatically reducing token consumption while improving output quality. The framework was open-sourced on March 25, 2026, and has rapidly gained traction in the AI agent community.

The core innovation: agents don't just execute tasks—they learn from every execution, automatically capturing successful patterns as reusable skills. When one agent evolves an improvement, every connected agent can access it through the cloud skill community, creating a network effect where collective intelligence compounds across the network.

Created by HKUDS, the Data Intelligence Lab at the University of Hong Kong, OpenSpace addresses three critical pain points in modern AI agents: massive token waste from repeated reasoning, costly failures that repeat across agents, and skills that silently degrade as tools and APIs evolve.

Core Features and Architecture

🧬 Self-Evolution Engine

The heart of OpenSpace is its autonomous skill evolution system. Skills aren't static files—they're living entities that automatically improve themselves through three independent mechanisms:

  • FIX Evolution — When a skill breaks or becomes outdated, OpenSpace repairs it in-place with minimal changes. The same skill directory, new version.
  • DERIVED Evolution — Creates enhanced or specialized versions from parent skills. New skill directory, coexists with parents, enabling skill families to branch and specialize.
  • CAPTURED Evolution — Extracts novel reusable patterns from successful task executions. Brand new skills born from real-world success.

Three independent triggers ensure continuous improvement: post-execution analysis after every task, tool degradation monitoring that detects when dependencies break, and periodic metric monitoring that identifies underperforming skills.
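OpenSpace's internals aren't reproduced in this article, but the three evolution modes can be sketched as a dispatch over an enum. All names and fields below are illustrative assumptions, not the framework's actual API:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class EvolutionMode(Enum):
    FIX = "fix"            # repair in place: same directory, new version
    DERIVED = "derived"    # specialized child skill, coexists with the parent
    CAPTURED = "captured"  # brand-new skill extracted from a successful run

@dataclass
class Skill:
    name: str
    version: int = 1
    parent: Optional[str] = None

def evolve(skill: Skill, mode: EvolutionMode) -> Skill:
    """Toy dispatch over the three evolution modes described above."""
    if mode is EvolutionMode.FIX:
        skill.version += 1  # same skill, bumped version
        return skill
    if mode is EvolutionMode.DERIVED:
        return Skill(name=f"{skill.name}-specialized", parent=skill.name)
    return Skill(name=f"captured-from-{skill.name}")  # CAPTURED: new skill
```

The key distinction the sketch captures: FIX mutates the existing skill, while DERIVED and CAPTURED create new skills, with DERIVED keeping a lineage pointer back to its parent.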

📊 Full-Stack Quality Monitoring

OpenSpace tracks quality across the entire execution stack—from high-level workflows down to individual tool calls. Multi-layer monitoring covers skills (applied rate, completion rate, effective rate, fallback rate), tool calls (success rate, latency, flagged issues), and code execution (status, error patterns). When any component degrades, evolution is automatically triggered for every skill that depends on it, maintaining system-wide coherence.
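As a rough illustration of how per-skill rates like these might feed an evolution trigger—the field names and thresholds here are assumptions, not OpenSpace's actual schema:

```python
from dataclasses import dataclass

@dataclass
class SkillMetrics:
    """Illustrative per-skill counters (names are assumptions)."""
    applied: int      # times the skill was selected for a task
    completed: int    # times it ran to completion
    fallbacks: int    # times the agent fell back to raw reasoning

    @property
    def completion_rate(self) -> float:
        return self.completed / self.applied if self.applied else 0.0

    @property
    def fallback_rate(self) -> float:
        return self.fallbacks / self.applied if self.applied else 0.0

def should_trigger_evolution(m: SkillMetrics,
                             min_completion: float = 0.8,
                             max_fallback: float = 0.2) -> bool:
    """Flag a skill as underperforming when either rate crosses a threshold."""
    return m.completion_rate < min_completion or m.fallback_rate > max_fallback
```

A periodic monitor sweeping all skills through a check like this would give the "metric monitoring" trigger described above.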

🌐 Collective Agent Intelligence

The cloud skill community transforms individual agent improvements into shared knowledge. Agents can upload evolved skills to open-space.cloud with flexible access control (public, private, or team-only). Smart search with BM25 + embedding hybrid ranking helps agents discover and auto-import relevant skills. Every evolution is lineage-tracked with full diffs, creating a transparent evolution history.
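The article doesn't document the exact ranking formula, but a common way to combine BM25 with embedding similarity is min-max normalization followed by a weighted sum. A minimal sketch, with the `alpha` weighting as an assumption:

```python
def hybrid_rank(bm25_scores: dict, embed_scores: dict, alpha: float = 0.5) -> list:
    """Blend normalized BM25 and embedding-similarity scores per document,
    returning document ids sorted by blended score, best first."""
    def norm(scores: dict) -> dict:
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0  # avoid divide-by-zero when all scores tie
        return {k: (v - lo) / span for k, v in scores.items()}

    b, e = norm(bm25_scores), norm(embed_scores)
    blended = {k: alpha * b.get(k, 0.0) + (1 - alpha) * e.get(k, 0.0)
               for k in b.keys() | e.keys()}
    return sorted(blended, key=blended.get, reverse=True)
```

Hybrid schemes like this let exact keyword matches (BM25) and semantic matches (embeddings) each rescue queries the other misses.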

💰 Token Efficiency at Scale

OpenSpace delivers measurable economic value. On the GDPVal benchmark (220 real-world professional tasks), OpenSpace agents earned 4.2× more money than baseline agents using the same backbone LLM (Qwen 3.5-Plus), while cutting 46% of costly tokens through skill evolution. Phase 2 (warm rerun with evolved skills) used only 45.9% of Phase 1 tokens—better results with dramatically lower costs.
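Combining those two headline figures gives a back-of-envelope value-per-token gain. This assumes both numbers were measured on the same workload, which the article implies but doesn't state outright:

```python
earnings_multiple = 4.2    # OpenSpace agents earned 4.2x the baseline on GDPVal
token_fraction = 1 - 0.46  # they used 54% of the baseline's costly tokens

# Value produced per token spent, relative to the baseline agent.
value_per_token = earnings_multiple / token_fraction
print(f"~{value_per_token:.1f}x value per token")  # roughly 7.8x
```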

🔧 Intelligent & Safe Evolution

Evolution is autonomous but not reckless. OpenSpace gathers real evidence before making changes, produces minimal targeted diffs rather than full rewrites, and includes built-in safeguards: confirmation gates reduce false-positive triggers, anti-loop guards prevent runaway cycles, safety checks flag dangerous patterns (prompt injection, credential exfiltration), and evolved skills are validated before replacing predecessors.
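An anti-loop guard can be as simple as a per-skill attempt budget inside a cooldown window. This toy version is an assumption about the mechanism, not OpenSpace's actual code:

```python
class EvolutionGuard:
    """Block runaway evolution: allow at most `max_attempts` evolutions
    per skill within a sliding window of `window_s` seconds."""

    def __init__(self, max_attempts: int = 3, window_s: float = 3600.0):
        self.max_attempts = max_attempts
        self.window_s = window_s
        self._attempts: dict[str, list[float]] = {}

    def allow(self, skill: str, now: float) -> bool:
        # Keep only attempts still inside the window, then check the budget.
        recent = [t for t in self._attempts.get(skill, [])
                  if now - t < self.window_s]
        if len(recent) >= self.max_attempts:
            return False  # budget exhausted: likely a runaway cycle
        recent.append(now)
        self._attempts[skill] = recent
        return True
```

In a real system the caller would pass a monotonic clock reading for `now`; taking it as a parameter keeps the guard deterministic and testable.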

🎯 Multi-Agent Integration

OpenSpace works with any agent supporting skills: Claude Code, Codex, OpenClaw, nanobot, Cursor, and more. Two integration paths: Path A plugs OpenSpace into your agent's MCP config, while Path B uses OpenSpace directly as a standalone AI co-worker. The framework includes a local dashboard for browsing skills, tracking lineage, and comparing diffs.


Getting Started

Installation

Clone the repository and install with pip:

git clone https://github.com/HKUDS/OpenSpace.git && cd OpenSpace
pip install -e .
openspace-mcp --help   # verify installation

For a lightweight clone that skips the ~50 MB assets folder:

git clone --filter=blob:none --sparse https://github.com/HKUDS/OpenSpace.git
cd OpenSpace
git sparse-checkout set --no-cone '/*' '!assets/'
pip install -e .

Path A: Integrate with Your Agent

Add OpenSpace to your agent's MCP config (e.g., Claude Code, OpenClaw):

{
  "mcpServers": {
    "openspace": {
      "command": "openspace-mcp",
      "toolTimeout": 600,
      "env": {
        "OPENSPACE_HOST_SKILL_DIRS": "/path/to/your/agent/skills",
        "OPENSPACE_WORKSPACE": "/path/to/OpenSpace",
        "OPENSPACE_API_KEY": "sk-xxx (optional, for cloud)"
      }
    }
  }
}

Copy the delegate-task and skill-discovery skills into your agent's skills directory. These teach your agent when and how to use OpenSpace—no additional prompting needed.

Path B: Use OpenSpace Directly

Create a .env file with your LLM API key, then run:

# Interactive mode
openspace

# Execute a task
openspace --model "anthropic/claude-sonnet-4-5" --query "Create a monitoring dashboard for my Docker containers"
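A minimal .env might look like the following. The exact variable names depend on your LLM provider and the repository's own example file, so treat these as placeholders:

```
# .env (placeholder names; check the repository's .env.example)
ANTHROPIC_API_KEY=sk-ant-...
```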

Or use the Python API:

import asyncio
from openspace import OpenSpace

async def main():
    async with OpenSpace() as cs:
        result = await cs.execute("Analyze GitHub trending repos and create a report")
        print(result["response"])
        for skill in result.get("evolved_skills", []):
            print(f"  Evolved: {skill['name']} ({skill['origin']})")

asyncio.run(main())

Real-World Use Cases

Professional Document Generation

OpenSpace excels at complex document workflows. On the GDPVal benchmark, compliance and form tasks improved 18.5% in quality while cutting 51% of tokens. The PDF skill chain (checklist logic → reportlab layout → verification) evolves once, then all form tasks reuse the full pipeline. Tax returns from 15 source documents, pharmacy compliance checklists, and clinical handoff templates all benefit from the same evolved skills.

Engineering Project Coordination

Multi-deliverable technical projects saw 8.7% quality improvement with 43% token reduction. Coordination skills transfer universally across diverse tasks: Web3 full-stack development (Solidity + React + tests), CNC workcell safety systems, and aerospace CFD reports all leverage the same evolved orchestration patterns.

Autonomous System Development

The "My Daily Monitor" showcase demonstrates end-to-end autonomous development. OpenSpace built a fully working live dashboard with 20+ panels streaming processes, servers, news, markets, email, and schedules—60+ skills evolved from scratch with zero human code written. The skill evolution graph shows how individual improvements compound into complete systems.

Media Production Automation

Audio and video workflows benefit from evolved ffmpeg expertise. Evolved skills encode working codec flags and fallbacks, eliminating sandbox trial-and-error. Bossa-nova instrumental generation from drum references, bass stem editing from multiple tracks, and CGI show reels from source videos all use the same evolved pipeline.
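For instance, an evolved media skill might bake in a codec preference order with fallbacks rather than rediscovering working flags each run. A minimal sketch; the codec list and flags are illustrative, not taken from OpenSpace:

```python
def encode_commands(src: str, dst: str,
                    codecs: tuple = ("libx264", "mpeg4")) -> list:
    """Build ffmpeg invocations in fallback order; a runner would try
    each command until one exits with status 0."""
    return [["ffmpeg", "-y", "-i", src, "-c:v", codec, dst]
            for codec in codecs]
```

An agent executing these in order with `subprocess.run` stops at the first command that succeeds, so a missing or broken codec no longer costs a round of sandbox trial and error.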

How It Compares

vs. LangGraph

LangGraph excels at defining explicit agent workflows with state machines and branching logic. OpenSpace complements this by automatically evolving the skills that LangGraph agents use. While LangGraph requires developers to manually design workflows, OpenSpace learns optimal patterns from execution history. Many teams use both: LangGraph for orchestration, OpenSpace for skill evolution.

vs. CrewAI

CrewAI focuses on multi-agent collaboration with role-based agents and task delegation. OpenSpace goes deeper: it evolves the skills that agents use to complete tasks. CrewAI agents benefit from OpenSpace integration—their evolved skills become shareable across the crew. CrewAI handles team dynamics; OpenSpace handles continuous improvement.

vs. Pydantic AI

Pydantic AI emphasizes type safety and structured outputs for Python developers. OpenSpace is language-agnostic and focuses on skill evolution and reuse. Pydantic AI agents can integrate OpenSpace via MCP to gain self-evolving capabilities. The frameworks address different concerns: Pydantic AI ensures correctness, OpenSpace ensures continuous improvement.

What's Next

OpenSpace's roadmap reveals ambitious plans for collective agent intelligence. Upcoming features include Kanban-style orchestration with skill-aware scheduling (scheduling itself evolves), collaboration pattern evolution (decomposition, handoff, prioritization strategies captured from completed tasks), role emergence (agents develop role profiles through practice, not configuration), and cross-group pattern transfer (coordination patterns discovered by one group available to others via cloud registry).

The framework is actively maintained with recent updates (April 7, 2026) adding SSE and streamable HTTP support for remote MCP connections, fixing runtime issues across grounding and skill evolution, and improving LLM credential resolution. The community is growing rapidly, with 548 forks and active contributions from 10+ developers.

OpenSpace represents a fundamental shift in how AI agents learn and improve. Rather than static systems that degrade over time, agents become self-improving entities that share knowledge across teams. As the framework matures and the skill community grows, we'll see agents that are not just more capable, but economically viable—delivering measurable ROI through token efficiency and quality improvements.
