The LangWatch MCP Server gives your AI coding assistant (Cursor, Claude Code, Codex, etc.) full access to all LangWatch and Scenario documentation and features via the Model Context Protocol.
  • Set up agent testing with Scenario to test agent behavior through user simulations and edge cases
  • Automatically instrument your code with LangWatch tracing for any framework (OpenAI, Agno, Mastra, DSPy, and more)
  • Create and manage prompts using LangWatch’s prompt management system
  • Set up evaluations to test and monitor your LLM outputs
  • Add labels, metadata, and custom tracking following LangWatch best practices
Instead of manually reading docs and writing boilerplate code, just ask your AI assistant to instrument your codebase with LangWatch, and it will do it for you.

Setup

1. Configure your MCP

In Cursor:
  1. Open Cursor Settings
  2. Navigate to the Tools and MCP section in the sidebar
  3. Add the LangWatch MCP server:
{
  "mcpServers": {
    "langwatch": {
      "command": "npx",
      "args": [
        "-y",
        "@langwatch/mcp-server",
      ]
    }
  }
}
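If you use Claude Code instead, its CLI can register the same server; assuming the standard claude mcp add command, that is:
claude mcp add langwatch -- npx -y @langwatch/mcp-server
Other editors can point their MCP configuration at the same npx -y @langwatch/mcp-server command.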
2. Start using it

Open your AI assistant chat (e.g., Cmd/Ctrl + I in Cursor, or Cmd/Ctrl + Shift + P > “Claude Code: Open Chat” in Claude Code) and ask it to help with LangWatch tasks.

Usage Examples

Write Agent Tests with Scenario

Simply ask your AI assistant to write scenario tests for your agents:
"Write a scenario test that checks the agent calls the summarization tool when requested"
The AI assistant will:
  1. Fetch the Scenario documentation and best practices
  2. Create test files with proper imports and setup
  3. Write scenario scripts that simulate user interactions
  4. Add verification logic to check agent behavior
  5. Include judge criteria to evaluate conversation quality
Example: a scenario test that checks for a tool call and evaluates the conversation against judge criteria:
import pytest
import scenario

@pytest.mark.agent_test
@pytest.mark.asyncio
async def test_conversation_summary_request(agent_adapter):
    """Explicit summary requests should call the conversation summary tool."""

    def verify_summary_call(state: scenario.ScenarioState) -> bool:
        args = _require_tool_call(state, "get_conversation_summary")
        assert "conversation_context" in args, "summary tool must include context reference"
        return True

    result = await scenario.run(
        name="conversation summary follow-up",
        description="Customer wants a recap of troubleshooting steps that were discussed.",
        agents=[
            agent_adapter,
            scenario.UserSimulatorAgent(),
            scenario.JudgeAgent(
                criteria=[
                    "Agent provides a clear recap",
                    "Agent confirms next steps and resources",
                ]
            ),
        ],
        script=[
            scenario.user("Thanks for explaining the dispute process earlier."),
            scenario.agent(),
            scenario.user(
                "Before we wrap, can you summarize everything we covered so I don't miss a step?"
            ),
            scenario.agent(),
            verify_summary_call,
            scenario.judge(),
        ],
    )

    assert result.success, result.reasoning
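The _require_tool_call helper is assumed to be defined elsewhere in your test module. A minimal sketch, assuming ScenarioState.messages holds OpenAI-style message dicts:
import json

def _require_tool_call(state: scenario.ScenarioState, tool_name: str) -> dict:
    """Find a tool call by name in the conversation and return its parsed arguments."""
    for message in state.messages:
        for tool_call in message.get("tool_calls") or []:
            if tool_call["function"]["name"] == tool_name:
                return json.loads(tool_call["function"]["arguments"])
    raise AssertionError(f"expected the agent to call {tool_name!r}")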
The LangWatch MCP automatically handles fetching the right documentation, understanding your agent’s framework, and generating tests that follow Scenario best practices.

Instrument Your Code with LangWatch

Simply ask your AI assistant to add LangWatch tracking to your existing code:
"Please instrument my code with LangWatch"
The AI assistant will:
  1. Fetch the relevant LangWatch documentation for your framework
  2. Add the necessary imports and setup code
  3. Wrap your functions with @langwatch.trace() decorators
  4. Configure automatic tracking for your LLM calls
  5. Add labels and metadata following best practices
Example transformation: Before:
from openai import OpenAI

client = OpenAI()

def chat(message: str):
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": message}]
    )
    return response.choices[0].message.content
After (automatically added by AI assistant):
from openai import OpenAI
import langwatch

client = OpenAI()
langwatch.setup()

@langwatch.trace()
def chat(message: str):
    langwatch.get_current_trace().autotrack_openai_calls(client)
    langwatch.get_current_trace().update(
        metadata={"labels": ["document_parsing"]}
    )

    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": message}]
    )
    return response.choices[0].message.content
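Note that langwatch.setup() needs an API key to send traces. Assuming standard SDK behavior, it reads the LANGWATCH_API_KEY environment variable, or you can pass the key explicitly:
import os

import langwatch

# Assumed SDK behavior: pass the key explicitly instead of relying on
# the LANGWATCH_API_KEY environment variable being set.
langwatch.setup(api_key=os.environ["LANGWATCH_API_KEY"])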

Create Prompts with Prompt Management

Ask your AI assistant to set up prompt management:
"Create a prompt for my agents to parse PDFs using the prompts CLI"
The AI assistant will guide you through creating, versioning, and using prompts from LangWatch’s Prompts CLI.
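Once a prompt exists, using it from code might look like the sketch below. The method names here are assumptions about the LangWatch Python SDK, not verified API; the MCP fetches the authoritative Prompts docs for your assistant:
import langwatch

pdf_text = "..."  # your extracted document text

# Assumed SDK shape: fetch the managed prompt by its handle and compile
# it with runtime variables before sending it to your LLM.
prompt = langwatch.prompts.get("pdf-parser")
compiled = prompt.compile(document_text=pdf_text)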

Set Up Evaluations

Ask your AI assistant to set up evaluation code for your LLM outputs:
"Create a notebook to evaluate the faithfulness of my RAG pipeline using LangWatch's Evaluating via Code guide"
The AI assistant will:
  1. Fetch the relevant LangWatch evaluation documentation
  2. Create evaluation notebooks or scripts with proper setup
  3. Add evaluation metrics and criteria for your use case
  4. Include code to run evaluations following Evaluating via Code
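As a rough sketch of what such a notebook cell might contain (the loop/log API shown is an assumption; your assistant will fetch the real Evaluating via Code guide through the MCP):
import langwatch
import pandas as pd

def rag_pipeline(question: str) -> str:
    return "Refunds are accepted within 30 days."  # placeholder for your pipeline

def judge_faithfulness(answer: str) -> float:
    return 1.0  # placeholder metric; a real judge would score the answer

df = pd.DataFrame({"question": ["What is our refund window?"]})

# Assumed evaluation API shape: iterate a dataset, run the pipeline,
# and log a metric score per row.
evaluation = langwatch.evaluation.init("rag-faithfulness")
for index, row in evaluation.loop(df.iterrows()):
    answer = rag_pipeline(row["question"])
    evaluation.log("faithfulness", index=index, score=judge_faithfulness(answer))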

Advanced: Self-Building AI Agents

The LangWatch MCP can even help AI agents instrument themselves while they are being built, enabling self-improving systems that track and debug their own behavior.

MCP Tools Reference

The MCP server provides the following tools that your AI assistant can use:

fetch_langwatch_docs

Fetches LangWatch documentation pages so the assistant can learn how to implement features.

Parameters:
  • url (optional): The full URL of a specific doc page. If not provided, fetches the docs index.

fetch_scenario_docs

Fetches Scenario documentation pages so the assistant can learn how to write agent tests.

Parameters:
  • url (optional): The full URL of a specific doc page. If not provided, fetches the docs index.
Your AI assistant will automatically choose the right tools based on your request. You don’t need to call these tools manually.
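For reference, when the assistant invokes one of these tools, the request follows the standard MCP tools/call shape (abbreviated JSON-RPC payload; omitting url fetches the docs index):
{
  "method": "tools/call",
  "params": {
    "name": "fetch_langwatch_docs",
    "arguments": {}
  }
}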