Python Integration Guide
The LangWatch library is the easiest way to integrate your Python application with LangWatch. Messages are synced in the background, so it doesn't intercept or block your LLM calls.
Prerequisites
- Obtain your LANGWATCH_API_KEY from the LangWatch dashboard.
Installation
pip install langwatch
Configuration
Ensure LANGWATCH_API_KEY is set:
export LANGWATCH_API_KEY='your_api_key_here'
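If you prefer, you can also verify from within your application that the key is available before starting your pipeline; a minimal sketch using only the standard library (the error message is just an example):
import os

if not os.environ.get("LANGWATCH_API_KEY"):
    raise RuntimeError("LANGWATCH_API_KEY is not set")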
Capturing Messages
- Each message triggering your LLM pipeline as a whole is captured with a Trace.
- A Trace contains multiple Spans, which are the steps inside your pipeline.
- Traces can be grouped together on the LangWatch Dashboard by having the same thread_id in their metadata, making the individual messages become part of a conversation.
- It is also recommended to provide the user_id metadata to track user analytics (see the sketch below).
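For orientation, here is a minimal sketch of how these pieces fit together; the thread_id and user_id values are just placeholders, and each call is explained in the sections below:
import langwatch

@langwatch.span()
def retrieve_documents(question: str):
    # a step inside the pipeline, captured as a Span
    ...

@langwatch.trace()
def main(question: str):
    # the whole pipeline run, captured as a Trace
    langwatch.get_current_trace().update(
        metadata={"thread_id": "conversation-123", "user_id": "user-456"}
    )
    retrieve_documents(question)
    ...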
Create a Trace
To capture traces and spans, start by adding the @langwatch.trace() decorator to the function that starts your LLM pipeline. Here it is represented by the main() function, but it can be your endpoint call or the class method that triggers the whole generation.
import langwatch
@langwatch.trace()
def main():
    ...
This is the main entry point for your trace, and all spans called from here will be collected automatically to LangWatch in the background.
On short-lived environments like Lambdas or Serverless Functions, be sure to call langwatch.get_current_trace().send_spans() before your trace function ends, to wait for all pending requests to be sent before the runtime is destroyed.
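For example, in an AWS Lambda handler this could look like the following sketch (the handler shape and return value are just illustrative):
import langwatch

@langwatch.trace()
def handler(event, context):
    ...  # your LLM pipeline

    # flush pending spans before the runtime is frozen or destroyed
    langwatch.get_current_trace().send_spans()
    return {"statusCode": 200}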
Capturing LLM Spans
LangWatch provides some utilities to automatically capture spans for popular LLM frameworks.
For OpenAI, you can use the autotrack_openai_calls()
function to automatically capture LLM spans for OpenAI calls for the current trace.
import langwatch
from openai import OpenAI
client = OpenAI()
@langwatch.trace()
def main():
    langwatch.get_current_trace().autotrack_openai_calls(client)
    ...
That's enough to have your OpenAI calls collected and visible on the LangWatch dashboard.
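Putting it together, a minimal end-to-end sketch might look like this (the model name and prompt are placeholder examples):
import langwatch
from openai import OpenAI

client = OpenAI()

@langwatch.trace()
def main():
    # capture all OpenAI calls made through this client for the current trace
    langwatch.get_current_trace().autotrack_openai_calls(client)
    completion = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "What is the capital of France?"}],
    )
    return completion.choices[0].message.content

main()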
Check out the examples folder on our GitHub repo for more Python integration examples.
Adding metadata
You can add metadata to track the user_id and the current conversation thread_id. This is highly recommended to unlock better conversation grouping and user analytics on LangWatch.
import langwatch
@langwatch.trace()
def main():
    langwatch.get_current_trace().update(metadata={"user_id": "user_id", "thread_id": "thread_id"})
    ...
You can also add custom labels to your trace to help you better filter and group your traces, or even trigger specific evaluations and alerts.
import langwatch
@langwatch.trace()
def main():
    langwatch.get_current_trace().update(metadata={"labels": ["production"]})
    ...
Check out the reference to see all the available trace properties.
Changing the Message Input and Output
By default, the main input and output of the trace displayed on LangWatch is captured from the arguments and return value of the top-level decorated function and heuristics try to extract the human-readable message from it automatically.
However, sometimes more complex structures are used and the messages might not end up very human-readable on LangWatch.
To make the messages really easy to read in the list and through the whole conversation, you can manually set what the input and output of the trace should be, by calling .update(input=...) and .update(output=...) on the current trace:
import langwatch
@langwatch.trace()
def main(inputs):
    # Update the input of the trace with the user message or any other human-readable text
    langwatch.get_current_trace().update(input=inputs.question)

    ...

    # Then, before returning, update the output of the trace with the final response
    langwatch.get_current_trace().update(output=response)
    return response
This will make the messages on LangWatch much easier to read.
Capturing a RAG span
RAG is a combination of a retrieval and a generation step, LangWatch provides a special span type for RAG that captures both steps separately which allows to capture the contexts
being used by the LLM on your pipeline.
By capturing the contexts
, you unlock various uses of it on LangWatch, like RAG evaluators such as Faitfhfulness and Context Relevancy, and analytics on which documents are being used the most.
To capture a RAG span, you can use the @langwatch.span(type="rag") decorator, along with a call to .update() to add the contexts to the span:
@langwatch.span(type="rag")
def rag_retrieval():
    # the documents you retrieved from your vector database
    search_results = ["France is a country in Europe.", "Paris is the capital of France."]

    # capture them on the span contexts before returning
    langwatch.get_current_span().update(contexts=search_results)

    return search_results
If you have document or chunk ids from the results, we recommend capturing them along with their ids using RAGChunk, as this allows them to be grouped together and generates document analytics on the LangWatch dashboard:
from langwatch.types import RAGChunk
@langwatch.span(type="rag")
def rag_retrieval():
    # the documents you retrieved from your vector database
    search_results = [
        {
            "id": "doc-1",
            "content": "France is a country in Europe.",
        },
        {
            "id": "doc-2",
            "content": "Paris is the capital of France.",
        },
    ]

    # capture them on the span contexts with RAGChunk before returning
    langwatch.get_current_span().update(
        contexts=[
            RAGChunk(
                document_id=document["id"],
                content=document["content"],
            )
            for document in search_results
        ]
    )

    return search_results
Then you'll be able to see the captured contexts on the LangWatch dashboard, and they will also be used later on for evaluations.
Capturing other spans
To be able to inspect and debug each step of your pipeline along with the LLM calls, you can use the @langwatch.span() decorator. You can pass in different types to categorize your spans.
import langwatch
@langwatch.span()
def database_query():
    ...

@langwatch.span(type="tool")
def weather_forecast(city: str):
    ...

@langwatch.span(type="rag")
def rag_retrieval():
    ...

# You can manually track llm calls too if the automatic capture is not enough for your use case
@langwatch.span(type="llm")
def llm_call():
    ...

@langwatch.trace()
def main():
    ...
The input and output of the decorated function are automatically captured in the span. To disable that, you can set capture_input and capture_output to False:
@langwatch.span(capture_input=False, capture_output=False)
def database_query():
    ...
You can also modify the current span's attributes, either on the decorator or by calling .update() on the current span:
@langwatch.span(type="llm", name="custom_name")
def llm_call():
    langwatch.get_current_span().update(model="my-custom-model")
    ...
Check out the reference to see all the available span properties.
Capturing custom evaluation results
LangWatch Evaluators can run automatically on your traces, but if you have an in-house custom evaluator, you can also capture the evaluation results of your custom evaluator on the current trace or span by using the .add_evaluation method:
import langwatch
@langwatch.span(type="evaluation")
def evaluation_step():
    ...  # your custom evaluation logic

    langwatch.get_current_span().add_evaluation(
        name="custom evaluation",  # required
        passed=True,
        score=0.5,
        label="category_detected",
        details="explanation of the evaluation results",
    )
The evaluation name is required and must be a string. The other fields are optional, but at least one of passed, score, or label must be provided.
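If the evaluation applies to the whole trace rather than a single step, the same method can be called on the current trace instead; a minimal sketch, with the evaluation name and details as placeholder examples:
import langwatch

@langwatch.trace()
def main():
    ...  # your pipeline

    langwatch.get_current_trace().add_evaluation(
        name="answer completeness",  # required
        passed=True,
        details="the answer addressed every part of the question",
    )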
Synchronizing your message IDs with LangWatch traces
If you store the messages in a database on your side as well, you can set the trace_id of the current trace to the same id as the message on your side. This way your system will be in sync with LangWatch traces, making it easier to investigate later on.
@langwatch.trace()
def main():
    ...
    langwatch.get_current_trace().update(trace_id=message_id)
    ...