Python SDK API Reference
LangWatch Python SDK API reference
Setup
langwatch.setup()
Initializes the LangWatch client, enabling data collection and tracing for your LLM application. This is typically the first function you’ll call when integrating LangWatch.
Your LangWatch API key. It's recommended to set this via an environment variable (e.g., `LANGWATCH_API_KEY`) and retrieve it using `os.getenv("LANGWATCH_API_KEY")`.
The URL of the LangWatch backend where traces will be sent. Defaults to the LangWatch Cloud service. For self-hosted instances, you'll need to provide this.
A `BaseAttributes` object allowing you to set default tags and properties that will be attached to all traces and spans. See the `BaseAttributes` type for more details.
An OpenTelemetry `TracerProvider` instance. If you have an existing OpenTelemetry setup, you can pass your `TracerProvider` here. LangWatch will add its exporter to this provider. If not provided, LangWatch will configure its own.
A sequence of OpenTelemetry instrumentor instances. LangWatch can automatically apply these instrumentors. (Note: specific instrumentor types might need to be defined or linked here.)
A list of `SpanProcessingExcludeRule` objects. These rules allow you to filter out specific spans from being exported to LangWatch, based on span name or attributes. See `SpanProcessingExcludeRule` for details.
If `True`, enables debug logging for the LangWatch client, providing more verbose output about its operations.
Returns
An instance of the LangWatch `Client`.
Tracing
`@langwatch.trace()` / `langwatch.trace()`
This is the primary way to define the boundaries of a request or a significant operation in your application. It can be used as a decorator around a function or as a context manager.
When used, it creates a new trace and a root span for that trace. Any `@langwatch.span()` or other instrumented calls made within the decorated function or context manager will be nested under this root span.
The name for the root span of this trace. If used as a decorator and not provided, it defaults to the name of the decorated function. For context manager usage, a name like "LangWatch Trace" might be used if not specified.
(Deprecated) A specific ID to assign to this trace. If not provided, a new UUID will be generated. It's generally recommended to let LangWatch auto-generate trace IDs. This will be mapped to `deprecated.trace_id` in metadata.
A dictionary of metadata to attach to the entire trace. This can include things like user IDs, session IDs, or any other contextual information relevant to the whole operation. `TraceMetadata` is typically `Dict[str, Any]`.
If you have a known expected output for this trace (e.g., for testing or evaluation), you can provide it here.
Overrides the global API key for this specific trace. Useful if you need to direct traces to different projects or accounts dynamically.
If `True`, this trace (and its spans) will be processed but not sent to the LangWatch backend. This can be useful for local debugging or conditional tracing.
The maximum length for string values captured in inputs, outputs, and metadata. Longer strings will be truncated.
If `True`, a root span will not be automatically created for this trace. This is an advanced option, typically used if you intend to manage the root span's lifecycle manually or if the trace is purely a logical grouping without its own primary operation.
Context Manager Return
When used as a context manager, `langwatch.trace()` returns a `LangWatchTrace` object.
The `LangWatchTrace` instance. You can use this object to call methods like `current_trace.add_evaluation()`.
`LangWatchTrace` Object Methods
When `langwatch.trace()` is used as a context manager, it yields a `LangWatchTrace` object. This object has several methods to interact with the current trace:
Updates attributes of the trace or its root span.
This method can take many of the same parameters as the `langwatch.trace()` decorator/context manager itself, such as `metadata`, `expected_output`, or any of the root span parameters like `name`, `input`, `output`, `metrics`, etc.
Adds an `Evaluation` object directly to the trace (or a specified span within it).
(Refer to the `Evaluation` type in Core Data Types and the `langwatch.evaluations` module for more details on parameters.)
Triggers a remote evaluation for this trace using a pre-configured evaluator slug on the LangWatch platform.
Parameters include `slug`, `name`, `input`, `output`, `expected_output`, `contexts`, `conversation`, `settings`, `as_guardrail`, and `data`.
Returns: the result of the evaluation call.
An asynchronous version of `evaluate`.
Instruments an OpenAI client instance (e.g., `openai.OpenAI()`) to automatically create spans for any OpenAI API calls made using that client within the current trace.
Takes the OpenAI client instance as an argument.
Enables automatic tracing for DSPy operations within the current trace. Requires DSPy to be installed and properly configured.
Returns a LangChain callback handler (`LangChainTracer`) associated with the current trace. This handler can be passed to LangChain runs to automatically trace LangChain operations.
Returns: a `LangChainTracer` instance.
(Potentially) Generates a shareable link or identifier for this trace. The exact behavior might depend on backend support and configuration. Returns: A string, possibly a URL or an ID.
(Potentially) Revokes sharing for this trace if it was previously shared.
`@langwatch.span()` / `langwatch.span()`
Use this to instrument specific operations or blocks of code within a trace. Spans can be nested to create a hierarchical view of your application’s execution flow.
It can be used as a decorator around a function or as a context manager.
The name for the span. If used as a decorator and not provided, it defaults to the name of the decorated function. For context manager usage, a default name like “LangWatch Span” might be used if not specified.
The semantic type of the span, which helps categorize the operation. Common types include `'llm'`, `'rag'`, `'agent'`, `'tool'`, `'embedding'`, or a generic `'span'`. `SpanType` is typically a string literal from `langwatch.domain.SpanTypes`.
(Deprecated) A specific ID to assign to this span. It's generally recommended to let LangWatch auto-generate span IDs. This will be mapped to `deprecated.span_id` in the span's attributes.
Explicitly sets the parent for this span. If not provided, the span will be nested under the currently active `LangWatchSpan` or OpenTelemetry span.
If `True` (and used as a decorator), automatically captures the arguments of the decorated function as the span's input.
If `True` (and used as a decorator), automatically captures the return value of the decorated function as the span's output.
Explicitly sets the input for this span. `SpanInputType` can be a dictionary, a string, or a list of `ChatMessage` objects. This overrides automatic input capture.
Explicitly sets the output for this span. `SpanInputType` has the same flexibility here as for `input`. This overrides automatic output capture.
Records an error for this span. If an exception occurs within a decorated function or context manager, it’s often automatically recorded.
A `SpanTimestamps` object to explicitly set the `start_time` and `end_time` for the span. Useful for instrumenting operations where the duration is known or externally managed.
Relevant contextual information for this span, especially for RAG operations. `ContextsType` can be a list of `RAGChunk` objects or a list of strings.
The name or identifier of the model used in this operation (e.g., `'gpt-4o-mini'`, `'text-embedding-ada-002'`).
A dictionary or `SpanParams` object containing parameters relevant to the operation (e.g., temperature for an LLM call, k for a vector search).
A `SpanMetrics` object or dictionary to record quantitative measurements for this span, such as token counts (`input_tokens`, `output_tokens`), cost, or other custom metrics.
A list of `Evaluation` objects to directly associate with this span.
If `True`, suppresses the warning that is normally emitted if a span is created without an active parent trace.
OpenTelemetry Parameters: These parameters are passed directly to the underlying OpenTelemetry span creation. Refer to OpenTelemetry documentation for more details.
The OpenTelemetry `SpanKind` (e.g., `INTERNAL`, `CLIENT`, `SERVER`, `PRODUCER`, `CONSUMER`).
An OpenTelemetry `Context` object to use for creating the span.
Additional custom attributes (key-value pairs) to attach to the span.
A list of OpenTelemetry `Link` objects to associate with this span.
An explicit start time for the span (in nanoseconds since epoch).
Whether OpenTelemetry should automatically record exceptions for this span.
Whether OpenTelemetry should automatically set the span status to error when an exception is recorded.
Context Manager Return
When used as a context manager, `langwatch.span()` returns a `LangWatchSpan` object.
The `LangWatchSpan` instance. You can use this object to call methods like `current_span.update()` or `current_span.add_evaluation()`.
`LangWatchSpan` Object Methods
When `langwatch.span()` is used as a context manager, it yields a `LangWatchSpan` object. This object provides methods to interact with the current span:
Updates attributes of the span. This is the primary method for adding or changing information on an active span.
It accepts most of the same parameters as the `@langwatch.span()` decorator itself, such as `name`, `type`, `input`, `output`, `error`, `timestamps`, `contexts`, `model`, `params`, `metrics`, and arbitrary key-value pairs for custom attributes.
Adds an `Evaluation` object directly to this span.
(Refer to the `Evaluation` type in Core Data Types and the `langwatch.evaluations` module for more details on parameters.)
Triggers a remote evaluation for this span using a pre-configured evaluator slug on the LangWatch platform.
Parameters include `slug`, `name`, `input`, `output`, `expected_output`, `contexts`, `conversation`, `settings`, `as_guardrail`, and `data`.
Returns: the result of the evaluation call.
An asynchronous version of `evaluate` for spans.
Explicitly ends the span. If you provide arguments (like `output`, `metrics`, etc.), it will call `update()` with those arguments before ending.
Usually not needed when using the span as a context manager, as `__exit__` handles this.
OpenTelemetry Span Methods:
The `LangWatchSpan` object also directly exposes standard OpenTelemetry `trace.Span` API methods for more advanced use cases or direct OTel interop. These include:
- `record_error(exception)`: Records an exception against the span.
- `add_event(name, attributes)`: Adds a timed event to the span.
- `set_status(status_code, description)`: Sets the OTel status of the span (e.g., `StatusCode.ERROR`).
- `set_attributes(attributes_dict)`: Sets multiple OTel attributes at once.
- `update_name(new_name)`: Changes the name of the span.
- `is_recording()`: Returns `True` if the span is currently recording information.
- `get_span_context()`: Returns the `SpanContext` of the underlying OTel span.
Refer to the OpenTelemetry Python documentation for details on these methods.
Context Accessors
These functions allow you to access the currently active LangWatch trace or span from anywhere in your code, provided that a trace/span has been started (e.g., via `@langwatch.trace` or `@langwatch.span`).
langwatch.get_current_trace()
Retrieves the currently active `LangWatchTrace` object.
This is useful if you need to interact with the trace object directly, for example, to add trace-level metadata or evaluations from a helper function called within an active trace.
If `True`, suppresses the warning that is normally emitted if this function is called when no LangWatch trace is currently in context.
The current `LangWatchTrace` object. If no trace is active and `suppress_warning` is `False`, a warning is issued and a new (detached) `LangWatchTrace` instance might be returned.
langwatch.get_current_span()
Retrieves the currently active `LangWatchSpan` object.
This allows you to get a reference to the current span to update its attributes, add events, or record information specific to that span from nested function calls. If no LangWatch-specific span is in context, it will attempt to wrap the current OpenTelemetry span.
The current `LangWatchSpan` object. This could be a span initiated by `@langwatch.span`, the root span of a `@langwatch.trace`, or a `LangWatchSpan` wrapping an existing OpenTelemetry span if no LangWatch span is directly in context.
Core Data Types
This section describes common data structures used throughout the LangWatch SDK, particularly as parameters to functions like `langwatch.setup()`, `@langwatch.trace()`, and `@langwatch.span()`, or as part of the data captured.
SpanProcessingExcludeRule
Defines a rule to filter out spans before they are exported to LangWatch. Used in the `span_exclude_rules` parameter of `langwatch.setup()`.
The field of the span to match against. Currently, only `"span_name"` is supported.
The value to match for the specified `field_name`.
The operation to use for matching (e.g., `"exact_match"`, `"starts_with"`).
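To illustrate the matching semantics (not the SDK's internal implementation), here is a plain-Python sketch; the field names `match_value` and `match_operation` are assumptions about the exact spelling:

```python
# Hypothetical illustration of how an exclude rule is evaluated.
# The real SDK applies these rules inside its span processor.
def matches(rule: dict, span_name: str) -> bool:
    value = rule["match_value"]
    if rule["match_operation"] == "exact_match":
        return span_name == value
    if rule["match_operation"] == "starts_with":
        return span_name.startswith(value)
    return False

rule = {"field_name": "span_name",
        "match_value": "health",
        "match_operation": "starts_with"}
matches(rule, "healthcheck")  # this span would be excluded
matches(rule, "llm-call")     # this span would be kept
```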
ChatMessage
Represents a single message in a chat conversation, typically used for the `input` or `output` of LLM spans.
The role of the message sender. `ChatRole` is a Literal: `"system"`, `"user"`, `"assistant"`, `"function"`, `"tool"`, `"guardrail"`, `"evaluation"`, `"unknown"`.
The textual content of the message.
For assistant messages that involve a function call (legacy OpenAI format).
For assistant messages that involve tool calls (current OpenAI format).
For messages that are responses from a tool, this is the ID of the tool call that this message is a response to.
The name of the function whose result is in the `content` (if the role is `"function"`), or the name of the tool/participant (if the role is `"tool"`).
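A sketch of `ChatMessage` values written as plain dicts. The shapes follow the fields described above; the nested `tool_calls` structure follows the OpenAI-style chat format and is an assumption:

```python
# A short conversation including a tool call and its response.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What's the weather in Paris?"},
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [{"id": "call_1", "type": "function",
                        "function": {"name": "get_weather",
                                     "arguments": '{"city": "Paris"}'}}],
    },
    {"role": "tool", "tool_call_id": "call_1", "name": "get_weather",
     "content": "18°C, partly cloudy"},
]
```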
RAGChunk
Represents a chunk of retrieved context, typically used with RAG (Retrieval Augmented Generation) operations in the `contexts` field of a span.
An identifier for the source document of this chunk.
An identifier for this specific chunk within the document.
The actual content of the RAG chunk. Can be a simple string or a more complex structured object.
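A `RAGChunk` as a plain dict. The field names follow the descriptions above and are assumptions about the exact spelling:

```python
# One retrieved chunk, identified by its source document and chunk IDs.
chunk = {
    "document_id": "doc-42",
    "chunk_id": "doc-42-003",
    "content": "LangWatch exports traces via OpenTelemetry.",
}
```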
SpanInputOutput
This is a flexible type used for the `input` and `output` fields of spans. It's a Union that can take several forms to represent different kinds of data. LangWatch will store it as a typed value.
Common forms include:
- `TypedValueText`: For simple string inputs/outputs. `{"type": "text", "value": "your string"}`
- `TypedValueChatMessages`: For conversational inputs/outputs. `{"type": "chat_messages", "value": [ChatMessage, ...]}`
- `TypedValueJson`: For arbitrary JSON-serializable data. `{"type": "json", "value": {"key": "value"}}`
- `TypedValueRaw`: For data that should be stored as a raw string, escaping any special interpretation. `{"type": "raw", "value": "<xml>data</xml>"}`
- `TypedValueList`: For a list of `SpanInputOutput` objects. `{"type": "list", "value": [SpanInputOutput, ...]}`
When providing `input` or `output` to `@langwatch.span()` or `span.update()`, you can often provide the raw Python object (e.g., a string, a list of `ChatMessage` dicts, a dictionary for JSON), and the SDK will attempt to serialize it correctly. For more control, you can construct the `TypedValue` dictionaries yourself.
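The `TypedValue` forms listed above, written out as plain dicts:

```python
# Each form pairs a "type" tag with a "value" payload.
text_value = {"type": "text", "value": "What is RAG?"}
chat_value = {"type": "chat_messages",
              "value": [{"role": "user", "content": "What is RAG?"}]}
json_value = {"type": "json", "value": {"query": "What is RAG?", "top_k": 3}}
raw_value = {"type": "raw", "value": "<xml>data</xml>"}
list_value = {"type": "list", "value": [text_value, json_value]}
```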
SpanTimestamps
A dictionary defining custom start and end times for a span, in nanoseconds since epoch.
Timestamp when the span started (nanoseconds since epoch).
For LLM operations, timestamp when the first token was received (nanoseconds since epoch).
Timestamp when the span finished (nanoseconds since epoch).
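A sketch of building such a dictionary with `time.time_ns()`. The key names follow the fields described above and are assumptions about the exact spelling:

```python
import time

# All values are nanoseconds since epoch, as the SDK expects.
started = time.time_ns()
# ... run the operation ...
timestamps = {
    "started_at": started,
    "first_token_at": started + 120_000_000,  # e.g. first token after 120 ms
    "finished_at": time.time_ns(),
}
```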
`SpanTypes` (Semantic Span Types)
A string literal defining the semantic type of a span. This helps categorize spans in the LangWatch UI and for analytics. Possible values include:
"span"
(generic span)"llm"
(Language Model operation)"chain"
(a sequence of operations, e.g., LangChain chain)"tool"
(execution of a tool or function call)"agent"
(an autonomous agent’s operation)"guardrail"
(a guardrail or safety check)"evaluation"
(an evaluation step)"rag"
(Retrieval Augmented Generation operation)"workflow"
(a broader workflow or business process)"component"
(a sub-component within a larger system)"module"
(a logical module of code)"server"
(server-side operation)"client"
(client-side operation)"producer"
(message producer)"consumer"
(message consumer)"task"
(a background task or job)"unknown"
SpanMetrics
A dictionary for quantitative measurements associated with a span.
Number of tokens in the input/prompt to an LLM.
Number of tokens in the output/completion from an LLM.
Estimated or actual monetary cost of the operation (e.g., LLM API call cost).
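A sketch of deriving such a metrics dict from hypothetical per-token prices (the rates below are examples, not real pricing):

```python
# Example USD-per-token rates; substitute your provider's actual pricing.
PRICE_IN = 0.15 / 1_000_000
PRICE_OUT = 0.60 / 1_000_000

def make_metrics(input_tokens: int, output_tokens: int) -> dict:
    return {
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "cost": input_tokens * PRICE_IN + output_tokens * PRICE_OUT,
    }

make_metrics(1000, 500)
```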
SpanParams
A dictionary for parameters related to an operation, especially LLM calls. Examples include:
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model’s likelihood to repeat the same line verbatim.
Modify the likelihood of specified tokens appearing in the completion. Accepts a dictionary mapping token IDs (or tokens, depending on the model) to a bias value from -100 to 100.
Whether to return log probabilities of the output tokens.
An integer between 0 and 5 specifying the number of most likely tokens to return at each token position, each with log probability. `logprobs` must be `True` if this is used.
The maximum number of tokens to generate in the completion.
How many completions to generate for each prompt.
Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model’s likelihood to talk about new topics.
If specified, the system will make a best effort to sample deterministically, such that repeated requests with the same `seed` and parameters should return the same result.
Up to 4 sequences where the API will stop generating further tokens.
If set, partial message deltas will be sent, like in ChatGPT. Tokens will be sent as data-only server-sent events as they become available.
What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with `top_p` probability mass.
A list of tools the model may call. Currently, only functions are supported as a tool.
Controls which (if any) tool is called by the model. `none` means the model will not call any tool. `auto` means the model can pick between generating a message or calling a tool.
A unique identifier representing your end-user, which can help OpenAI to monitor and detect abuse.
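A params dict using the fields described above (OpenAI-style parameter names):

```python
# Typical LLM-call parameters, using the fields documented above.
params = {
    "temperature": 0.2,
    "max_tokens": 256,
    "top_p": 0.9,
    "frequency_penalty": 0.0,
    "presence_penalty": 0.0,
    "seed": 42,
    "stop": ["\n\n"],
    "stream": False,
}
```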
TraceMetadata
A dictionary (typically `Dict[str, Any]`) of metadata attached to the entire trace, as described under `langwatch.trace()` above.