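Running Codex CLI as an MCP server
You can run Codex as an MCP server and connect to it from other MCP clients, for example an agent client built with the MCP integration in the OpenAI Agents SDK.
To start Codex as an MCP server, run:

    codex mcp-server

You can also launch the Codex MCP server with the Model Context Protocol Inspector:

    npx @modelcontextprotocol/inspector codex mcp-server

Send a tools/list request to the server and you will see two tools: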
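codex
Starts a new Codex session. It accepts configuration parameters that correspond to the Codex Config struct:
| Property | Type | Description |
|---|---|---|
| prompt (required) | string | The initial user prompt that starts the Codex conversation. |
| approval-policy | string | Approval policy for shell commands generated by the model: untrusted, on-request, or never. |
| base-instructions | string | A set of instructions to use in place of the default ones. |
| config | object | Overrides individual settings from $CODEX_HOME/config.toml. |
| cwd | string | Working directory for the session. A relative path is resolved against the server process's current directory. |
| include-plan-tool | boolean | Whether to include the plan tool in the conversation. |
| model | string | Optional model name override, for example o3 or o4-mini. |
| profile | string | Configuration profile from config.toml to use as the default. |
| sandbox | string | Sandbox mode: read-only, workspace-write, or danger-full-access. |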
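As an illustration of how these properties fit into a request, the sketch below assembles a JSON-RPC 2.0 tools/call message for the codex tool. The helper name build_codex_call and its underscore-to-hyphen keyword mapping are hypothetical conveniences, not part of Codex; only the property names and allowed values come from the table above.

```python
import json

# Allowed values copied from the parameter table above.
APPROVAL_POLICIES = {"untrusted", "on-request", "never"}
SANDBOX_MODES = {"read-only", "workspace-write", "danger-full-access"}


def build_codex_call(prompt: str, request_id: int = 1, **options) -> dict:
    """Build a tools/call request for the codex tool (hypothetical helper).

    Keyword options use underscores and are mapped to the hyphenated
    property names, e.g. approval_policy -> approval-policy.
    """
    if not prompt:
        raise ValueError("prompt is required")
    arguments = {"prompt": prompt}
    for key, value in options.items():
        arguments[key.replace("_", "-")] = value

    policy = arguments.get("approval-policy")
    if policy is not None and policy not in APPROVAL_POLICIES:
        raise ValueError(f"unknown approval-policy: {policy!r}")
    sandbox = arguments.get("sandbox")
    if sandbox is not None and sandbox not in SANDBOX_MODES:
        raise ValueError(f"unknown sandbox mode: {sandbox!r}")

    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "codex", "arguments": arguments},
    }


if __name__ == "__main__":
    request = build_codex_call(
        "List the files in this repository.",
        approval_policy="never",
        sandbox="read-only",
    )
    print(json.dumps(request, indent=2))
```

Validating the two enumerated properties on the client side surfaces typos before a session is started; the remaining properties are passed through unchanged.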
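codex-reply
Continues an existing Codex session, given a thread id and a prompt.
| Property | Type | Description |
|---|---|---|
| prompt (required) | string | The next user prompt to send when continuing the conversation. |
| threadId (required) | string | The id of the thread to continue. |
| conversationId (deprecated) | string | Deprecated alias for threadId, kept only for compatibility with older clients. |

Use the threadId found in structuredContent.threadId of the tools/call response. Approval prompts for exec and patch operations also carry the threadId in their params payload.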
Example response:

    {
      "structuredContent": {
        "threadId": "019bbb20-bff6-7130-83aa-bf45ab33250e",
        "content": "`ls -lah` (or `ls -alh`) — long listing, includes dotfiles, human-readable sizes."
      },
      "content": [
        {
          "type": "text",
          "text": "`ls -lah` (or `ls -alh`) — long listing, includes dotfiles, human-readable sizes."
        }
      ]
    }

When a tool-call result contains "structuredContent", modern MCP clients usually report only that field. The Codex MCP server also returns "content", mainly for compatibility with older MCP clients.
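To make the follow-up step concrete, here is a minimal sketch of how a client might read a codex result and prepare the arguments for codex-reply. The function names are hypothetical and the sample result is an abbreviated version of the response shown above; the field names (structuredContent, threadId, content) are the ones documented here.

```python
def extract_thread_id(result: dict) -> str:
    """Return structuredContent.threadId from a tools/call result."""
    structured = result.get("structuredContent") or {}
    thread_id = structured.get("threadId")
    if not thread_id:
        raise KeyError("result has no structuredContent.threadId")
    return thread_id


def extract_text(result: dict) -> str:
    """Prefer structuredContent.content, fall back to legacy text items."""
    structured = result.get("structuredContent") or {}
    if "content" in structured:
        return structured["content"]
    return "".join(
        item.get("text", "")
        for item in result.get("content", [])
        if item.get("type") == "text"
    )


def build_reply_arguments(result: dict, prompt: str) -> dict:
    """Arguments for a codex-reply call that continues the same thread."""
    return {"threadId": extract_thread_id(result), "prompt": prompt}


if __name__ == "__main__":
    # Abbreviated sample modeled on the example response above.
    sample = {
        "structuredContent": {
            "threadId": "019bbb20-bff6-7130-83aa-bf45ab33250e",
            "content": "`ls -lah` (or `ls -alh`)",
        },
        "content": [{"type": "text", "text": "`ls -lah` (or `ls -alh`)"}],
    }
    print(extract_text(sample))
    print(build_reply_arguments(sample, "Now sort the listing by size."))
```

The fallback in extract_text mirrors the compatibility behavior described above: newer clients read structuredContent, while the content array remains available for older ones.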
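Building multi-agent workflows
Codex CLI can do far more than run ad hoc tasks. By exposing the CLI as a Model Context Protocol (MCP) server and orchestrating it with the OpenAI Agents SDK, you can build deterministic, reviewable workflows that scale from a single agent to a complete software delivery pipeline.
This guide follows the OpenAI Cookbook example. You will:
- Start Codex CLI as a long-running MCP server
- Build a focused single-agent workflow that produces a playable browser game
- Orchestrate a multi-agent team with handoffs, guardrails, and full traces you can review afterward
Before you begin, make sure you have: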
- Codex CLI installed, so that npx codex runs successfully
- Python 3.10+ and pip
- Node.js 18+ (for npx)
- An OpenAI API key stored locally. You can create or manage it in the OpenAI dashboard
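The prerequisites above can be checked with a short preflight script. This is only a sketch: the function names are made up for illustration, it looks for npx and pip on PATH, and it does not verify the Node.js version, which would require invoking node.

```python
import shutil
import sys


def python_ok(version=sys.version_info) -> bool:
    """True when the interpreter is Python 3.10 or newer."""
    return tuple(version[:2]) >= (3, 10)


def preflight() -> dict:
    """Report which of the listed prerequisites are visible from this shell."""
    return {
        "python>=3.10": python_ok(),
        "npx": shutil.which("npx") is not None,
        "pip": shutil.which("pip") is not None,
    }


if __name__ == "__main__":
    for name, ok in preflight().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```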
Create a working directory for this guide and write your API key to a .env file:

    mkdir codex-workflows
    cd codex-workflows
    printf "OPENAI_API_KEY=sk-..." > .env

Install dependencies
The Agents SDK coordinates Codex, handoffs, and tracing. Install the latest SDK dependencies:

    python -m venv .venv
    source .venv/bin/activate
    pip install --upgrade openai openai-agents python-dotenv

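Initialize Codex CLI as an MCP server
The first step is to turn Codex CLI into an MCP server that the Agents SDK can call. The server exposes two tools: codex() starts a conversation, and codex-reply() continues the same conversation, which keeps Codex alive across multiple agent turns.
Create a file named codex_mcp.py with the following contents: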

    import asyncio

    from agents import Agent, Runner
    from agents.mcp import MCPServerStdio


    async def main() -> None:
        # Launch Codex CLI as a stdio MCP server for the lifetime of this block.
        async with MCPServerStdio(
            name="Codex CLI",
            params={
                "command": "npx",
                "args": ["-y", "codex", "mcp-server"],
            },
            client_session_timeout_seconds=360000,
        ) as codex_mcp_server:
            print("Codex MCP server started.")
            # More logic coming in the next sections.
            return


    if __name__ == "__main__":
        asyncio.run(main())

Run it once to confirm that it starts successfully:

    python codex_mcp.py

The script prints Codex MCP server started. and then exits. The following examples reuse this MCP server in fuller workflows.
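Building a single-agent workflow
Start with a well-scoped example that uses Codex MCP to deliver a small browser game. The workflow relies on two agents:
- Game Designer: writes a short design brief for the game.
- Game Developer: implements the game by calling Codex MCP.
Update codex_mcp.py to the version below. It keeps the MCP server setup from before and adds the two agents.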

    import asyncio
    import os

    from dotenv import load_dotenv

    from agents import Agent, Runner, set_default_openai_key
    from agents.mcp import MCPServerStdio

    load_dotenv(override=True)
    # set_default_openai_key registers the API key with the SDK;
    # set_default_openai_api selects the API flavor and does not accept a key.
    set_default_openai_key(os.getenv("OPENAI_API_KEY"))


    async def main() -> None:
        async with MCPServerStdio(
            name="Codex CLI",
            params={
                "command": "npx",
                "args": ["-y", "codex", "mcp-server"],
            },
            client_session_timeout_seconds=360000,
        ) as codex_mcp_server:
            # The developer is the only agent that talks to Codex MCP.
            developer_agent = Agent(
                name="Game Developer",
                instructions=(
                    "You are an expert in building simple games using basic html + css + javascript with no dependencies. "
                    "Save your work in a file called index.html in the current directory. "
                    "Always call codex with \"approval-policy\": \"never\" and \"sandbox\": \"workspace-write\"."
                ),
                mcp_servers=[codex_mcp_server],
            )

            # The designer writes the brief and hands off to the developer.
            designer_agent = Agent(
                name="Game Designer",
                instructions=(
                    "You are an indie game connoisseur. Come up with an idea for a single page html + css + javascript game that a developer could build in about 50 lines of code. "
                    "Format your request as a 3 sentence design brief for a game developer and call the Game Developer coder with your idea."
                ),
                model="gpt-5",
                handoffs=[developer_agent],
            )

            await Runner.run(designer_agent, "Implement a fun new game!")


    if __name__ == "__main__":
        asyncio.run(main())

Run:

    python codex_mcp.py

Codex reads the design brief from the Designer, creates index.html, and writes the complete game to disk. Open the generated file in a browser to play the result. Each run produces a different design, with its own gameplay and level of polish.
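Expanding to a multi-agent workflow
Now extend the single-agent setup into an orchestrated, traceable workflow. The system adds:
- Project Manager: creates the shared requirements, coordinates handoffs, and enforces guardrails.
- Designer, Frontend Developer, Backend Developer, and Tester: each role has its own scoped instructions and output directory.
Create a new file, multi_agent_workflow.py: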

    import asyncio
    import os

    from dotenv import load_dotenv

    from agents import (
        Agent,
        ModelSettings,
        Runner,
        WebSearchTool,
        set_default_openai_key,
    )
    from agents.extensions.handoff_prompt import RECOMMENDED_PROMPT_PREFIX
    from agents.mcp import MCPServerStdio
    from openai.types.shared import Reasoning

    load_dotenv(override=True)
    # set_default_openai_key registers the API key with the SDK;
    # set_default_openai_api selects the API flavor and does not accept a key.
    set_default_openai_key(os.getenv("OPENAI_API_KEY"))

    # Note: the handoff tool names in the prompts below assume the SDK default,
    # transfer_to_<agent name, lowercased with underscores>.


    async def main() -> None:
        async with MCPServerStdio(
            name="Codex CLI",
            params={"command": "npx", "args": ["-y", "codex", "mcp-server"]},
            client_session_timeout_seconds=360000,
        ) as codex_mcp_server:
            designer_agent = Agent(
                name="Designer",
                instructions=(
                    f"""{RECOMMENDED_PROMPT_PREFIX}"""
                    "You are the Designer.\n"
                    "Your only source of truth is AGENT_TASKS.md and REQUIREMENTS.md from the Project Manager.\n"
                    "Do not assume anything that is not written there.\n\n"
                    "You may use the internet for additional guidance or research.\n"
                    "Deliverables (write to /design):\n"
                    "- design_spec.md – a single page describing the UI/UX layout, main screens, and key visual notes as requested in AGENT_TASKS.md.\n"
                    "- wireframe.md – a simple text or ASCII wireframe if specified.\n\n"
                    "Keep the output short and implementation-friendly.\n"
                    "When complete, handoff to the Project Manager with transfer_to_project_manager.\n"
                    "When creating files, call Codex MCP with {\"approval-policy\":\"never\",\"sandbox\":\"workspace-write\"}."
                ),
                model="gpt-5",
                tools=[WebSearchTool()],
                mcp_servers=[codex_mcp_server],
            )

            frontend_developer_agent = Agent(
                name="Frontend Developer",
                instructions=(
                    f"""{RECOMMENDED_PROMPT_PREFIX}"""
                    "You are the Frontend Developer.\n"
                    "Read AGENT_TASKS.md and design_spec.md. Implement exactly what is described there.\n\n"
                    "Deliverables (write to /frontend):\n"
                    "- index.html – main page structure\n"
                    "- styles.css or inline styles if specified\n"
                    "- main.js or game.js if specified\n\n"
                    "Follow the Designer’s DOM structure and any integration points given by the Project Manager.\n"
                    "Do not add features or branding beyond the provided documents.\n\n"
                    "When complete, handoff to the Project Manager with transfer_to_project_manager.\n"
                    "When creating files, call Codex MCP with {\"approval-policy\":\"never\",\"sandbox\":\"workspace-write\"}."
                ),
                model="gpt-5",
                mcp_servers=[codex_mcp_server],
            )

            backend_developer_agent = Agent(
                name="Backend Developer",
                instructions=(
                    f"""{RECOMMENDED_PROMPT_PREFIX}"""
                    "You are the Backend Developer.\n"
                    "Read AGENT_TASKS.md and REQUIREMENTS.md. Implement the backend endpoints described there.\n\n"
                    "Deliverables (write to /backend):\n"
                    "- package.json – include a start script if requested\n"
                    "- server.js – implement the API endpoints and logic exactly as specified\n\n"
                    "Keep the code as simple and readable as possible. No external database.\n\n"
                    "When complete, handoff to the Project Manager with transfer_to_project_manager.\n"
                    "When creating files, call Codex MCP with {\"approval-policy\":\"never\",\"sandbox\":\"workspace-write\"}."
                ),
                model="gpt-5",
                mcp_servers=[codex_mcp_server],
            )

            tester_agent = Agent(
                name="Tester",
                instructions=(
                    f"""{RECOMMENDED_PROMPT_PREFIX}"""
                    "You are the Tester.\n"
                    "Read AGENT_TASKS.md and TEST.md. Verify that the outputs of the other roles meet the acceptance criteria.\n\n"
                    "Deliverables (write to /tests):\n"
                    "- TEST_PLAN.md – bullet list of manual checks or automated steps as requested\n"
                    "- test.sh or a simple automated script if specified\n\n"
                    "Keep it minimal and easy to run.\n\n"
                    "When complete, handoff to the Project Manager with transfer_to_project_manager.\n"
                    "When creating files, call Codex MCP with {\"approval-policy\":\"never\",\"sandbox\":\"workspace-write\"}."
                ),
                model="gpt-5",
                mcp_servers=[codex_mcp_server],
            )

            project_manager_agent = Agent(
                name="Project Manager",
                instructions=(
                    f"""{RECOMMENDED_PROMPT_PREFIX}"""
                    """
    You are the Project Manager.

    Objective:
    Convert the input task list into three project-root files the team will execute against.

    Deliverables (write in project root):
    - REQUIREMENTS.md: concise summary of product goals, target users, key features, and constraints.
    - TEST.md: tasks with [Owner] tags (Designer, Frontend, Backend, Tester) and clear acceptance criteria.
    - AGENT_TASKS.md: one section per role containing:
      - Project name
      - Required deliverables (exact file names and purpose)
      - Key technical notes and constraints

    Process:
    - Resolve ambiguities with minimal, reasonable assumptions. Be specific so each role can act without guessing.
    - Create files using Codex MCP with {"approval-policy":"never","sandbox":"workspace-write"}.
    - Do not create folders. Only create REQUIREMENTS.md, TEST.md, AGENT_TASKS.md.

    Handoffs (gated by required files):
    1) After the three files above are created, hand off to the Designer with transfer_to_designer and include REQUIREMENTS.md and AGENT_TASKS.md.
    2) Wait for the Designer to produce /design/design_spec.md. Verify that file exists before proceeding.
    3) When design_spec.md exists, hand off in parallel to both:
      - Frontend Developer with transfer_to_frontend_developer (provide design_spec.md, REQUIREMENTS.md, AGENT_TASKS.md).
      - Backend Developer with transfer_to_backend_developer (provide REQUIREMENTS.md, AGENT_TASKS.md).
    4) Wait for Frontend to produce /frontend/index.html and Backend to produce /backend/server.js. Verify both files exist.
    5) When both exist, hand off to the Tester with transfer_to_tester and provide all prior artifacts and outputs.
    6) Do not advance to the next handoff until the required files for that step are present. If something is missing, request the owning agent to supply it and re-check.

    PM Responsibilities:
    - Coordinate all roles, track file completion, and enforce the above gating checks.
    - Do NOT respond with status updates. Just handoff to the next agent until the project is complete.
    """
                ),
                model="gpt-5",
                model_settings=ModelSettings(
                    reasoning=Reasoning(effort="medium"),
                ),
                handoffs=[designer_agent, frontend_developer_agent, backend_developer_agent, tester_agent],
                mcp_servers=[codex_mcp_server],
            )

            # Every specialist returns control to the Project Manager when done.
            designer_agent.handoffs = [project_manager_agent]
            frontend_developer_agent.handoffs = [project_manager_agent]
            backend_developer_agent.handoffs = [project_manager_agent]
            tester_agent.handoffs = [project_manager_agent]

            task_list = """
    Goal: Build a tiny browser game to showcase a multi-agent workflow.

    High-level requirements:
    - Single-screen game called "Bug Busters".
    - Player clicks a moving bug to earn points.
    - Game ends after 20 seconds and shows final score.
    - Optional: submit score to a simple backend and display a top-10 leaderboard.

    Roles:
    - Designer: create a one-page UI/UX spec and basic wireframe.
    - Frontend Developer: implement the page and game logic.
    - Backend Developer: implement a minimal API (GET /health, GET/POST /scores).
    - Tester: write a quick test plan and a simple script to verify core routes.

    Constraints:
    - No external database—memory storage is fine.
    - Keep everything readable for beginners; no frameworks required.
    - All outputs should be small files saved in clearly named folders.
    """

            result = await Runner.run(project_manager_agent, task_list, max_turns=30)
            print(result.final_output)


    if __name__ == "__main__":
        asyncio.run(main())

Run:
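
    python multi_agent_workflow.py
    ls -R

In this flow, the Project Manager first writes REQUIREMENTS.md, TEST.md, and AGENT_TASKS.md, then gates each handoff on whether the prerequisite files exist, driving the Designer, Frontend, Backend, and Tester agents in turn to produce their artifacts.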
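The file-based gating can also be checked outside the agents after a run. The sketch below is a hypothetical helper, not part of the workflow script: it assumes the /design, /frontend, and /backend paths named in the Project Manager prompt are relative to the project root, and reports which prerequisite files are still missing for each handoff step.

```python
import tempfile
from pathlib import Path

# Files the Project Manager prompt requires before each handoff step.
# The step names are made up for this sketch.
GATES = {
    "designer": ["REQUIREMENTS.md", "TEST.md", "AGENT_TASKS.md"],
    "developers": ["design/design_spec.md"],
    "tester": ["frontend/index.html", "backend/server.js"],
}


def missing_files(root: str, step: str) -> list[str]:
    """Return the prerequisite files for a step that do not exist yet."""
    base = Path(root)
    return [name for name in GATES[step] if not (base / name).is_file()]


if __name__ == "__main__":
    # Demo against a throwaway directory instead of a real project.
    with tempfile.TemporaryDirectory() as root:
        (Path(root) / "REQUIREMENTS.md").write_text("demo")
        for step in GATES:
            print(step, "missing:", missing_files(root, step))
```

Running missing_files against the codex-workflows directory after the script finishes gives a quick view of which gate, if any, the run stopped at.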
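Tracing the workflow
The Agents SDK records traces automatically, covering every prompt, tool call, and handoff in the workflow.
After a multi-agent run finishes, open the Traces dashboard to view the execution timeline.
The high-level trace helps you confirm that the Project Manager checked the prerequisite files at the right moments before starting the next handoff. Clicking into an individual step shows the exact prompt, the Codex MCP calls, the files written, and the duration of each step.
With this detail you can audit every handoff turn by turn and understand how the workflow progressed. It also makes it more straightforward to debug stalls, review agent behavior, and measure performance over time, without adding instrumentation of your own.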