LangWatch Python Repo

The LangWatch library is the easiest way to integrate your Python application with LangWatch. Messages are synced in the background, so it doesn’t intercept or block your LLM calls.

Prerequisites

  • Obtain your LANGWATCH_API_KEY from the LangWatch dashboard.

Installation

pip install langwatch

Configuration

Ensure LANGWATCH_API_KEY is set:

export LANGWATCH_API_KEY='your_api_key_here'
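Alternatively, if you keep secrets in a .env file, you can load them into the environment at startup. A minimal sketch using python-dotenv (an assumption of this example, not a LangWatch requirement):

# pip install python-dotenv  (assumed helper for this sketch, not part of langwatch)
from dotenv import load_dotenv

load_dotenv()  # reads LANGWATCH_API_KEY from a local .env file into the environment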

Capturing Messages

  • Each message triggering your LLM pipeline as a whole is captured with a Trace.
  • A Trace contains multiple Spans, which are the steps inside your pipeline.
    • A span can be an LLM call, a database query for a RAG retrieval, or a simple function transformation.
    • Different types of Spans capture different parameters.
    • Spans can be nested to capture the pipeline structure.
  • Traces can be grouped together on the LangWatch Dashboard by setting the same thread_id in their metadata, so that individual messages become part of a conversation.
    • It is also recommended to provide the user_id metadata to track user analytics.

Create a Trace

To capture traces and spans, start by adding the @langwatch.trace() decorator to the function that starts your LLM pipeline. Here it is represented by the main() function, but it can be the endpoint handler or the class method that triggers the whole generation.

import langwatch

@langwatch.trace()
def main():
    ...

This is the main entry point for your trace; all spans created from here will be collected and sent to LangWatch automatically in the background.

In short-lived environments like Lambdas or Serverless Functions, be sure to call langwatch.get_current_trace().send_spans() before your trace function returns, so that all pending requests are sent before the runtime is destroyed.
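
For example, in an AWS Lambda handler (a minimal sketch; the handler signature is the standard Lambda one and the body is illustrative):

import langwatch

@langwatch.trace()
def handler(event, context):
    ...  # your LLM pipeline
    # flush pending spans before the serverless runtime is frozen or destroyed
    langwatch.get_current_trace().send_spans()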

Capturing LLM Spans

LangWatch provides some utilities to automatically capture spans for popular LLM frameworks.

For OpenAI, you can use the autotrack_openai_calls() function to automatically capture LLM spans for OpenAI calls for the current trace.

import langwatch
from openai import OpenAI

client = OpenAI()

@langwatch.trace()
def main():
    langwatch.get_current_trace().autotrack_openai_calls(client)
    ...
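
With autotracking enabled, any completion made through that client inside the trace is captured automatically. A minimal sketch (the model name and prompt are illustrative):

import langwatch
from openai import OpenAI

client = OpenAI()

@langwatch.trace()
def main():
    langwatch.get_current_trace().autotrack_openai_calls(client)
    # this call is recorded as an LLM span on the current trace
    completion = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[{"role": "user", "content": "What is the capital of France?"}],
    )
    return completion.choices[0].message.content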

That’s enough to have your OpenAI calls collected and visible on the LangWatch dashboard:

OpenAI Spans

Check out the examples folder on our GitHub repo for more Python integration examples.

Adding metadata

You can add metadata to track the user_id and the current conversation thread_id. This is highly recommended to unlock better conversation grouping and user analytics on LangWatch.

import langwatch

@langwatch.trace()
def main():
    langwatch.get_current_trace().update(metadata={"user_id": "user_id", "thread_id": "thread_id"})
    ...

You can also add custom labels to your trace to help you better filter and group your traces, or even trigger specific evaluations and alerts.

import langwatch

@langwatch.trace()
def main():
    langwatch.get_current_trace().update(metadata={"labels": ["production"]})
    ...

Check out the reference to see all the available trace properties.

Capturing a RAG span

RAG is a combination of a retrieval and a generation step. LangWatch provides a special span type for RAG that captures both steps separately, which allows you to capture the contexts being used by the LLM in your pipeline. By capturing the contexts, you unlock various uses of them on LangWatch, like RAG evaluators such as Faithfulness and Context Relevancy, and analytics on which documents are being used the most.

To capture a RAG span, you can use the @langwatch.span(type="rag") decorator, along with a call to .update() to add the contexts to the span:

import langwatch

@langwatch.span(type="rag")
def rag_retrieval():
    # the documents you retrieved from your vector database
    search_results = ["France is a country in Europe.", "Paris is the capital of France."]

    # capture them on the span contexts before returning
    langwatch.get_current_span().update(contexts=search_results)

    return search_results

If you have document or chunk IDs from the results, we recommend capturing them along with the content using RAGChunk, as this allows the chunks to be grouped together and generates document analytics on the LangWatch dashboard:

import langwatch
from langwatch.types import RAGChunk

@langwatch.span(type="rag")
def rag_retrieval():
    # the documents you retrieved from your vector database
    search_results = [
        {
            "id": "doc-1",
            "content": "France is a country in Europe.",
        },
        {
            "id": "doc-2",
            "content": "Paris is the capital of France.",
        },
    ]

    # capture them on the span contexts with RAGChunk before returning
    langwatch.get_current_span().update(
        contexts=[
            RAGChunk(
                document_id=document["id"],
                content=document["content"],
            )
            for document in search_results
        ]
    )

    return search_results

Then you’ll be able to see the captured contexts, which will also be used later for evaluations, on the LangWatch dashboard:

RAG Spans

Capturing other spans

To be able to inspect and debug each step of your pipeline along with the LLM calls, you can use the @langwatch.span() decorator. You can pass in different types to categorize your spans.

import langwatch

@langwatch.span()
def database_query():
    ...

@langwatch.span(type="tool")
def weather_forecast(city: str):
    ...

@langwatch.span(type="rag")
def rag_retrieval():
    ...

# You can manually track llm calls too if the automatic capture is not enough for your use case
@langwatch.span(type="llm")
def llm_call():
    ...

@langwatch.trace()
def main():
    ...
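
Calling the decorated functions from inside the trace nests their spans under it, mirroring your pipeline. A minimal sketch (function names, arguments, and bodies are illustrative):

import langwatch

@langwatch.span(type="rag")
def rag_retrieval(query: str):
    return ["Paris is the capital of France."]

@langwatch.span(type="llm")
def llm_call(contexts, query: str):
    return "Paris"

@langwatch.trace()
def main():
    query = "What is the capital of France?"
    contexts = rag_retrieval(query)  # captured as a child span of the trace
    return llm_call(contexts, query)  # spans nest to reflect the pipeline structure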

The input and output of the decorated function are automatically captured in the span. To disable that, you can set capture_input and capture_output to False:

@langwatch.span(capture_input=False, capture_output=False)
def database_query():
    ...

You can also modify the current span’s attributes, either on the decorator or by calling .update() on the current span:

@langwatch.span(type="llm", name="custom_name")
def llm_call():
    langwatch.get_current_span().update(model="my-custom-model")
    ...

Check out the reference to see all the available span properties.

Capturing custom evaluation results

LangWatch Evaluators can run automatically on your traces, but if you have an in-house custom evaluator, you can also capture its evaluation results on the current trace or span by using the .add_evaluation method:

import langwatch

@langwatch.span(type="evaluation")
def evaluation_step():
    ... # your custom evaluation logic

    langwatch.get_current_span().add_evaluation(
        name="custom evaluation", # required
        passed=True,
        score=0.5,
        label="category_detected",
        details="explanation of the evaluation results",
    )

The evaluation name is required and must be a string. The other fields are optional, but at least one of passed, score or label must be provided.
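
The same method is also available on the current trace, for evaluations that apply to the pipeline as a whole. A minimal sketch (the evaluation name and score are illustrative):

import langwatch

@langwatch.trace()
def main():
    ...
    # at least one of passed, score or label must be provided
    langwatch.get_current_trace().add_evaluation(
        name="overall response quality",  # illustrative evaluation name
        score=0.8,
    )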

Synchronizing your message IDs with LangWatch traces

If you also store the messages in a database on your side, you can set the trace_id of the current trace to the same ID as your message. This way your system stays in sync with LangWatch traces, making it easier to investigate issues later on.

@langwatch.trace()
def main():
    ...
    # message_id is the ID of the message stored on your side
    langwatch.get_current_trace().update(trace_id=message_id)
    ...