Setup

langwatch.setup()

Initializes the LangWatch client, enabling data collection and tracing for your LLM application. This is typically the first function you’ll call when integrating LangWatch.

import langwatch
import os

client = langwatch.setup(
    api_key=os.getenv("LANGWATCH_API_KEY"),
    endpoint_url=os.getenv("LANGWATCH_ENDPOINT_URL") # Optional, defaults to LangWatch Cloud
)
api_key
Optional[str]
default:"None"

Your LangWatch API key. It’s recommended to set this via an environment variable (e.g., LANGWATCH_API_KEY) and retrieve it using os.getenv("LANGWATCH_API_KEY").

endpoint_url
Optional[str]
default:"LangWatch Cloud URL"

The URL of the LangWatch backend where traces will be sent. Defaults to the LangWatch Cloud service. For self-hosted instances, you’ll need to provide this.

base_attributes
Optional[BaseAttributes]
default:"None"

A BaseAttributes object allowing you to set default tags and properties that will be attached to all traces and spans. See BaseAttributes type for more details.

tracer_provider
Optional[TracerProvider]
default:"None"

An OpenTelemetry TracerProvider instance. If you have an existing OpenTelemetry setup, you can pass your TracerProvider here. LangWatch will add its exporter to this provider. If not provided, LangWatch will configure its own.
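
If your application already configures OpenTelemetry, you can hand LangWatch your existing provider. A minimal sketch using the standard OpenTelemetry SDK:

import langwatch
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider

provider = TracerProvider()
trace.set_tracer_provider(provider)

# LangWatch attaches its exporter to the provider you pass in.
client = langwatch.setup(tracer_provider=provider)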

instrumentors
Optional[Sequence[Instrumentor]]
default:"None"

A sequence of OpenTelemetry instrumentor instances. LangWatch will automatically apply these instrumentors during setup, so the libraries they cover are traced without further configuration.

span_exclude_rules
Optional[List[SpanProcessingExcludeRule]]
default:"[]"

A list of SpanProcessingExcludeRule objects. These rules allow you to filter out specific spans from being exported to LangWatch, based on span name or attributes. See SpanProcessingExcludeRule for details.
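
For example, to drop health-check spans at setup time, using the SpanProcessingExcludeRule fields described in Core Data Types below:

import langwatch
from langwatch.domain import SpanProcessingExcludeRule

client = langwatch.setup(
    span_exclude_rules=[
        SpanProcessingExcludeRule(
            field_name="span_name",
            match_value="/health_check",
            match_operation="exact_match",
        ),
    ],
)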

debug
bool
default:"False"

If True, enables debug logging for the LangWatch client, providing more verbose output about its operations.

Returns

client
Client

An instance of the LangWatch Client.

Tracing

@langwatch.trace() / langwatch.trace()

This is the primary way to define the boundaries of a request or a significant operation in your application. It can be used as a decorator around a function or as a context manager.

When used, it creates a new trace and a root span for that trace. Any @langwatch.span() or other instrumented calls made within the decorated function or context manager will be nested under this root span.

import langwatch

langwatch.setup()

@langwatch.trace(name="MyRequestHandler", metadata={"user_id": "123"})
def handle_request(query: str):
    # ... your logic ...
    # langwatch.get_current_span().update(output=response)
    return f"Processed: {query}"

handle_request("Hello LangWatch!")
name
Optional[str]

The name for the root span of this trace. If used as a decorator and not provided, it defaults to the name of the decorated function. For context manager usage, a name like “LangWatch Trace” might be used if not specified.

trace_id
Optional[Union[str, UUID]]
default:"Auto-generated"

(Deprecated) A specific ID to assign to this trace. If not provided, a new UUID will be generated. It’s generally recommended to let LangWatch auto-generate trace IDs. This will be mapped to deprecated.trace_id in metadata.

metadata
Optional[TraceMetadata]
default:"None"

A dictionary of metadata to attach to the entire trace. This can include things like user IDs, session IDs, or any other contextual information relevant to the whole operation. TraceMetadata is typically Dict[str, Any].

expected_output
Optional[str]
default:"None"

If you have a known expected output for this trace (e.g., for testing or evaluation), you can provide it here.

api_key
Optional[str]
default:"Global or None"

Overrides the global API key for this specific trace. Useful if you need to direct traces to different projects or accounts dynamically.

disable_sending
bool
default:"False"

If True, this trace (and its spans) will be processed but not sent to the LangWatch backend. This can be useful for local debugging or conditional tracing.
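
For example, you can keep tracing logic active locally while sending nothing to the backend. A sketch where the is_production flag is illustrative:

import os
import langwatch

is_production = os.getenv("ENV") == "production"  # illustrative flag

@langwatch.trace(name="LocalDebugHandler", disable_sending=not is_production)
def handle_request(query: str):
    return f"Processed: {query}"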

max_string_length
Optional[int]
default:"5000"

The maximum length for string values captured in inputs, outputs, and metadata. Longer strings will be truncated.

skip_root_span
bool
default:"False"

If True, a root span will not be automatically created for this trace. This is an advanced option, typically used if you intend to manage the root span’s lifecycle manually or if the trace is purely a logical grouping without its own primary operation.

Context Manager Return

When used as a context manager, langwatch.trace() returns a LangWatchTrace object.

current_trace
LangWatchTrace

The LangWatchTrace instance. You can use this object to call methods like current_trace.add_evaluation().

LangWatchTrace Object Methods

When langwatch.trace() is used as a context manager, it yields a LangWatchTrace object. This object has several methods to interact with the current trace:

update
Callable

Updates attributes of the trace or its root span. This method can take many of the same parameters as the langwatch.trace() decorator/context manager itself, such as metadata, expected_output, or any of the root span parameters like name, input, output, metrics, etc.

with langwatch.trace(name="Initial Trace") as current_trace:
    # ... some operations ...
    current_trace.update(metadata={"step": "one_complete"})
    # ... more operations ...
    root_span_output = "Final output of root operation"
    current_trace.update(output=root_span_output, metrics={"total_items": 10})
add_evaluation
Callable

Adds an Evaluation object directly to the trace (or a specified span within it).

from langwatch.domain import Evaluation, EvaluationTimestamps

with langwatch.trace(name="MyEvaluatedProcess") as current_trace:
    # ... operation ...
    eval_result = Evaluation(
        name="Accuracy Check",
        status="processed",
        passed=True,
        score=0.95,
        details="All checks passed.",
        timestamps=EvaluationTimestamps(started_at=..., finished_at=...)
    )
    current_trace.add_evaluation(**eval_result) # Pass fields as keyword arguments

(Refer to Evaluation type in Core Data Types and langwatch.evaluations module for more details on parameters.)

evaluate
Callable

Triggers a remote evaluation for this trace using a pre-configured evaluator slug on the LangWatch platform.

with langwatch.trace(name="ProcessWithRemoteEval") as current_trace:
    output_to_evaluate = "Some generated text."
    current_trace.evaluate(
        slug="sentiment-analyzer", 
        output=output_to_evaluate,
        as_guardrail=False
    )

Parameters include slug, name, input, output, expected_output, contexts, conversation, settings, as_guardrail, data. Returns: Result of the evaluation call.

async_evaluate
Callable

An asynchronous version of evaluate.
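
A brief sketch of awaiting it inside an async function, assuming it accepts the same parameters as evaluate:

import langwatch

async def evaluated_process():
    with langwatch.trace(name="AsyncEvaluatedProcess") as current_trace:
        output_to_evaluate = "Some generated text."
        # Same parameters as evaluate(), awaited instead of blocking.
        await current_trace.async_evaluate(
            slug="sentiment-analyzer",
            output=output_to_evaluate,
        )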

autotrack_openai_calls
Callable

Instruments an OpenAI client instance (e.g., openai.OpenAI()) to automatically create spans for any OpenAI API calls made using that client within the current trace.

import langwatch
import openai

client = openai.OpenAI()  # or AzureOpenAI, etc.

with langwatch.trace(name="OpenAICallsDemo") as current_trace:
    current_trace.autotrack_openai_calls(client)
    # Calls made with `client` inside this trace are now captured as spans.
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello!"}],
    )

Takes the OpenAI client instance as an argument.

autotrack_dspy
Callable

Enables automatic tracing for DSPy operations within the current trace. Requires DSPy to be installed and properly configured.

with langwatch.trace(name="DSPyPipeline") as current_trace:
    # current_trace.autotrack_dspy()
    # ... your DSPy calls ...
    pass
get_langchain_callback
Callable

Returns a LangChain callback handler (LangChainTracer) associated with the current trace. This handler can be passed to LangChain runs to automatically trace LangChain operations.

from langchain_community.llms import FakeListLLM  # or any LangChain LLM
from langchain.chains import LLMChain
from langchain_core.prompts import PromptTemplate

with langwatch.trace(name="LangChainFlow") as current_trace:
    handler = current_trace.get_langchain_callback()
    llm = FakeListLLM(responses=["Hi there!"])
    chain = LLMChain(llm=llm, prompt=PromptTemplate.from_template("{query}"))
    chain.run("some input", callbacks=[handler])

Returns: LangChainTracer instance.

share
Callable

Generates a shareable link or identifier for this trace; the exact behavior depends on backend support and configuration. Returns: a string, typically a URL or an ID.

unshare
Callable

Revokes sharing for this trace if it was previously shared.

@langwatch.span() / langwatch.span()

Use this to instrument specific operations or blocks of code within a trace. Spans can be nested to create a hierarchical view of your application’s execution flow.

It can be used as a decorator around a function or as a context manager.

import langwatch

langwatch.setup()

@langwatch.trace(name="MainProcess")
def main_process():
    do_first_step("data for step 1")
    do_second_step("data for step 2")

@langwatch.span(name="StepOne", type="tool")
def do_first_step(data: str):
    # langwatch.get_current_span().set_attributes({"custom_info": "info1"})
    return f"Step 1 processed: {data}"

@langwatch.span() # Name will be 'do_second_step', type will be 'span'
def do_second_step(data: str):
    # langwatch.get_current_span().update(metrics={"items_processed": 10})
    return f"Step 2 processed: {data}"

main_process()
name
Optional[str]

The name for the span. If used as a decorator and not provided, it defaults to the name of the decorated function. For context manager usage, a default name like “LangWatch Span” might be used if not specified.

type
Optional[SpanType]
default:"'span'"

The semantic type of the span, which helps categorize the operation. Common types include 'llm', 'rag', 'agent', 'tool', 'embedding', or a generic 'span'. SpanType is typically a string literal from langwatch.domain.SpanTypes.

span_id
Optional[Union[str, UUID]]
default:"Auto-generated"

(Deprecated) A specific ID to assign to this span. It’s generally recommended to let LangWatch auto-generate span IDs. This will be mapped to deprecated.span_id in the span’s attributes.

parent
Optional[Union[OtelSpan, LangWatchSpan]]
default:"Current span in context"

Explicitly sets the parent for this span. If not provided, the span will be nested under the currently active LangWatchSpan or OpenTelemetry span.
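
For example, to attach a span to an explicitly chosen parent rather than whatever is currently in context:

with langwatch.span(name="ParentOperation") as parent_span:
    # Explicitly parent this span, regardless of the active context.
    with langwatch.span(name="ChildOperation", parent=parent_span):
        pass  # ... child logic ...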

capture_input
bool
default:"True"

If True (and used as a decorator), automatically captures the arguments of the decorated function as the span’s input.

capture_output
bool
default:"True"

If True (and used as a decorator), automatically captures the return value of the decorated function as the span’s output.

input
SpanInputType
default:"None"

Explicitly sets the input for this span. SpanInputType can be a dictionary, a string, or a list of ChatMessage objects. This overrides automatic input capture.

output
SpanInputType
default:"None"

Explicitly sets the output for this span. SpanInputType has the same flexibility as for input. This overrides automatic output capture.
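
For example, setting the input and output explicitly instead of relying on automatic capture:

from langwatch.domain import ChatMessage

with langwatch.span(
    name="ManualIO",
    type="llm",
    input=[ChatMessage(role="user", content="What is LangWatch?")],
) as current_span:
    # ... call the model ...
    current_span.update(output="LangWatch is an LLM observability platform.")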

error
Optional[Exception]
default:"None"

Records an error for this span. If an exception occurs within a decorated function or context manager, it’s often automatically recorded.

timestamps
Optional[SpanTimestamps]
default:"None"

A SpanTimestamps object to explicitly set the start_time and end_time for the span. Useful for instrumenting operations where the duration is known or externally managed.

contexts
ContextsType
default:"None"

Relevant contextual information for this span, especially for RAG operations. ContextsType can be a list of RAGChunk objects or a list of strings.
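
For example, attaching retrieved chunks to a RAG span (see RAGChunk in Core Data Types below):

from langwatch.domain import RAGChunk

with langwatch.span(
    name="Retrieval",
    type="rag",
    contexts=[
        RAGChunk(
            document_id="doc_123",
            content="LangWatch helps monitor LLM applications.",
        ),
    ],
):
    pass  # ... retrieval and generation logic ...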

model
Optional[str]
default:"None"

The name or identifier of the model used in this operation (e.g., 'gpt-4o-mini', 'text-embedding-ada-002').

params
Optional[Union[SpanParams, Dict[str, Any]]]
default:"None"

A dictionary or SpanParams object containing parameters relevant to the operation (e.g., temperature for an LLM call, k for a vector search).

metrics
Optional[SpanMetrics]
default:"None"

A SpanMetrics object or dictionary to record quantitative measurements for this span, such as token counts (prompt_tokens, completion_tokens), cost, or other custom metrics.

evaluations
Optional[List[Any]]
default:"None"

A list of Evaluation objects to directly associate with this span.

ignore_missing_trace_warning
bool
default:"False"

If True, suppresses the warning that is normally emitted if a span is created without an active parent trace.

OpenTelemetry Parameters: These parameters are passed directly to the underlying OpenTelemetry span creation. Refer to OpenTelemetry documentation for more details.

kind
SpanKind
default:"SpanKind.INTERNAL"

The OpenTelemetry SpanKind (e.g., INTERNAL, CLIENT, SERVER, PRODUCER, CONSUMER).

span_context
Optional[Context]
default:"None"

An OpenTelemetry Context object to use for creating the span.

attributes
Optional[Dict[str, Any]]
default:"None"

Additional custom attributes (key-value pairs) to attach to the span.

links
Optional[Sequence[Link]]
default:"None"

A list of OpenTelemetry Link objects to associate with this span.

start_time
Optional[int]
default:"None"

An explicit start time for the span (in nanoseconds since epoch).

record_exception
bool
default:"True"

Whether OpenTelemetry should automatically record exceptions for this span.

set_status_on_exception
bool
default:"True"

Whether OpenTelemetry should automatically set the span status to error when an exception is recorded.

Context Manager Return

When used as a context manager, langwatch.span() returns a LangWatchSpan object.

current_span
LangWatchSpan

The LangWatchSpan instance. You can use this object to call methods like current_span.update() or current_span.add_evaluation().

LangWatchSpan Object Methods

When langwatch.span() is used as a context manager, it yields a LangWatchSpan object. This object provides methods to interact with the current span:

update
Callable

Updates attributes of the span. This is the primary method for adding or changing information on an active span. It accepts most of the same parameters as the @langwatch.span() decorator itself, such as name, type, input, output, error, timestamps, contexts, model, params, metrics, and arbitrary key-value pairs for custom attributes.

with langwatch.span(name="MySpan", type="tool") as current_span:
    # ... some operation ...
    current_span.update(output={"result": "success"}, metrics={"items_processed": 5})
    # Add a custom attribute
    current_span.update(custom_tool_version="1.2.3")
add_evaluation
Callable

Adds an Evaluation object directly to this span.

from langwatch.domain import Evaluation

with langwatch.span(name="SubOperation") as current_span:
    # ... operation ...
    eval_data = {
        "name": "Correctness Check", 
        "status": "processed", 
        "passed": False, 
        "score": 0.2, 
        "label": "Incorrect"
    } # Example, more fields available
    current_span.add_evaluation(**eval_data)

(Refer to Evaluation type in Core Data Types and langwatch.evaluations module for more details on parameters.)

evaluate
Callable

Triggers a remote evaluation for this span using a pre-configured evaluator slug on the LangWatch platform.

with langwatch.span(name="LLM Generation", type="llm") as llm_span:
    llm_output = "Some text generated by an LLM."
    llm_span.update(output=llm_output)
    llm_span.evaluate(slug="toxicity-check", output=llm_output, as_guardrail=True)

Parameters include slug, name, input, output, expected_output, contexts, conversation, settings, as_guardrail, data. Returns: Result of the evaluation call.

async_evaluate
Callable

An asynchronous version of evaluate for spans.

end
Callable

Explicitly ends the span. If you provide arguments (like output, metrics, etc.), it will call update() with those arguments before ending. Usually not needed when using the span as a context manager, as __exit__ handles this.

# Manual span management, without a context manager:
current_span = langwatch.span(name="ManualSpan").__enter__()  # start the span manually
try:
    # ... operations ...
    result = "some result"
    current_span.end(output=result)  # update and end
except Exception as e:
    current_span.end(error=e)  # record error and end
    raise

OpenTelemetry Span Methods:

The LangWatchSpan object also directly exposes standard OpenTelemetry trace.Span API methods for more advanced use cases or direct OTel interop. These include:

  • record_error(exception): Records an exception against the span.
  • add_event(name, attributes): Adds a timed event to the span.
  • set_status(status_code, description): Sets the OTel status of the span (e.g., StatusCode.ERROR).
  • set_attributes(attributes_dict): Sets multiple OTel attributes at once.
  • update_name(new_name): Changes the name of the span.
  • is_recording(): Returns True if the span is currently recording information.
  • get_span_context(): Returns the SpanContext of the underlying OTel span.

Refer to the OpenTelemetry Python documentation for details on these methods.

from opentelemetry.trace import Status, StatusCode

with langwatch.span(name="MyDetailedSpan") as current_span:
    current_span.add_event("Checkpoint 1", {"detail": "Reached stage A"})
    # ... some work ...
    try:
        # risky_operation()
        pass
    except Exception as e:
        current_span.record_error(e)
        current_span.set_status(Status(StatusCode.ERROR, description="Risky op failed"))
    current_span.set_attributes({"otel_attribute": "value"})

Context Accessors

These functions allow you to access the currently active LangWatch trace or span from anywhere in your code, provided that a trace/span has been started (e.g., via @langwatch.trace or @langwatch.span).

langwatch.get_current_trace()

Retrieves the currently active LangWatchTrace object.

This is useful if you need to interact with the trace object directly, for example, to add trace-level metadata or evaluations from a helper function called within an active trace.

import langwatch

langwatch.setup()

def some_utility_function(detail: str):
    current_trace = langwatch.get_current_trace()
    if current_trace:
        current_trace.update(metadata={"utility_info": detail})

@langwatch.trace(name="MainOperation")
def main_operation():
    # ... some work ...
    some_utility_function("Processed step A")
    # ... more work ...

main_operation()
suppress_warning
bool
default:"False"

If True, suppresses the warning that is normally emitted if this function is called when no LangWatch trace is currently in context.

trace
LangWatchTrace

The current LangWatchTrace object. If no trace is active and suppress_warning is False, a warning is issued and a new (detached) LangWatchTrace instance might be returned.

langwatch.get_current_span()

Retrieves the currently active LangWatchSpan object.

This allows you to get a reference to the current span to update its attributes, add events, or record information specific to that span from nested function calls. If no LangWatch-specific span is in context, it will attempt to wrap the current OpenTelemetry span.

import langwatch
from typing import Any

langwatch.setup()

def enrich_span_data(key: str, value: Any):
    current_span = langwatch.get_current_span()
    if current_span:
        current_span.update(**{key: value})
        # Or use set_attributes for OpenTelemetry-compatible attributes:
        # current_span.set_attributes({key: value})

@langwatch.trace(name="UserFlow")
def user_flow():
    with langwatch.span(name="Step1") as span1:
        # ... step 1 logic ...
        enrich_span_data("step1_result", "success")
    
    with langwatch.span(name="Step2") as span2:
        # ... step 2 logic ...
        enrich_span_data("step2_details", {"info": "more data"})

user_flow()
span
LangWatchSpan

The current LangWatchSpan object. This could be a span initiated by @langwatch.span, the root span of a @langwatch.trace, or a LangWatchSpan wrapping an existing OpenTelemetry span if no LangWatch span is directly in context.

Core Data Types

This section describes common data structures used throughout the LangWatch SDK, particularly as parameters to functions like langwatch.setup(), @langwatch.trace(), and @langwatch.span(), or as part of the data captured.

SpanProcessingExcludeRule

Defines a rule to filter out spans before they are exported to LangWatch. Used in the span_exclude_rules parameter of langwatch.setup().

field_name
Literal['span_name']
required

The field of the span to match against. Currently, only "span_name" is supported.

match_value
str
required

The value to match for the specified field_name.

match_operation
Literal['includes', 'exact_match', 'starts_with', 'ends_with']
required

The operation to use for matching (e.g., "exact_match", "starts_with").

Example
from langwatch.domain import SpanProcessingExcludeRule

exclude_health_checks = SpanProcessingExcludeRule(
    field_name="span_name",
    match_value="/health_check",
    match_operation="exact_match"
)

exclude_internal_utils = SpanProcessingExcludeRule(
    field_name="span_name",
    match_value="utils.",
    match_operation="starts_with"
)

ChatMessage

Represents a single message in a chat conversation, typically used for input or output of LLM spans.

role
ChatRole
required

The role of the message sender. ChatRole is a Literal: "system", "user", "assistant", "function", "tool", "guardrail", "evaluation", "unknown".

content
Optional[str]

The textual content of the message.

function_call
Optional[FunctionCall]

For assistant messages that involve a function call (legacy OpenAI format).

tool_calls
Optional[List[ToolCall]]

For assistant messages that involve tool calls (current OpenAI format).

tool_call_id
Optional[str]

For messages that are responses from a tool, this is the ID of the tool call that this message is a response to.

name
Optional[str]

The name of the function whose result is in the content (if role is "function"), or the name of the tool/participant (if role is "tool").

Example
from langwatch.domain import ChatMessage

user_message = ChatMessage(role="user", content="Hello, world!")
assistant_response = ChatMessage(role="assistant", content="Hi there!")

RAGChunk

Represents a chunk of retrieved context, typically used with RAG (Retrieval Augmented Generation) operations in the contexts field of a span.

document_id
Optional[str]

An identifier for the source document of this chunk.

chunk_id
Optional[str]

An identifier for this specific chunk within the document.

content
Union[str, Dict[str, Any], List[Any]]
required

The actual content of the RAG chunk. Can be a simple string or a more complex structured object.

Example
from langwatch.domain import RAGChunk

retrieved_chunk = RAGChunk(
    document_id="doc_123",
    chunk_id="chunk_005",
    content="LangWatch helps monitor LLM applications."
)

SpanInputOutput

This is a flexible type used for the input and output fields of spans. It’s a Union that can take several forms to represent different kinds of data. LangWatch will store it as a typed value. Common forms include:

  • TypedValueText: For simple string inputs/outputs. {"type": "text", "value": "your string"}
  • TypedValueChatMessages: For conversational inputs/outputs. {"type": "chat_messages", "value": [ChatMessage, ...]}
  • TypedValueJson: For arbitrary JSON-serializable data. {"type": "json", "value": {"key": "value"}}
  • TypedValueRaw: For data that should be stored as a raw string, escaping any special interpretation. {"type": "raw", "value": "<xml>data</xml>"}
  • TypedValueList: For a list of SpanInputOutput objects. {"type": "list", "value": [SpanInputOutput, ...]}

When providing input or output to @langwatch.span() or span.update(), you can often provide the raw Python object (e.g., a string, a list of ChatMessage dicts, a dictionary for JSON), and the SDK will attempt to serialize it correctly. For more control, you can construct the TypedValue dictionaries yourself.
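
For example, both of the following are accepted; the first lets the SDK infer the typed value, while the second constructs it explicitly:

# Let the SDK serialize a raw Python value:
with langwatch.span(name="InferredIO", input="plain string input"):
    pass

# Or build the TypedValue dictionary yourself for full control:
with langwatch.span(
    name="ExplicitIO",
    input={"type": "text", "value": "plain string input"},
):
    pass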

SpanTimestamps

A dictionary defining custom start and end times for a span, in nanoseconds since epoch.

started_at
int
required

Timestamp when the span started (nanoseconds since epoch).

first_token_at
Optional[int]

For LLM operations, timestamp when the first token was received (nanoseconds since epoch).

finished_at
int
required

Timestamp when the span finished (nanoseconds since epoch).
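
A short sketch of supplying explicit timestamps with time.time_ns(), assuming SpanTimestamps is importable from langwatch.domain like the other domain types shown here:

import time

from langwatch.domain import SpanTimestamps

start_ns = time.time_ns()
# ... externally managed operation ...
end_ns = time.time_ns()

with langwatch.span(
    name="PreTimedOperation",
    timestamps=SpanTimestamps(started_at=start_ns, finished_at=end_ns),
):
    pass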

SpanTypes (Semantic Span Types)

A string literal defining the semantic type of a span. This helps categorize spans in the LangWatch UI and for analytics. Possible values include:

  • "span" (generic span)
  • "llm" (Language Model operation)
  • "chain" (a sequence of operations, e.g., LangChain chain)
  • "tool" (execution of a tool or function call)
  • "agent" (an autonomous agent’s operation)
  • "guardrail" (a guardrail or safety check)
  • "evaluation" (an evaluation step)
  • "rag" (Retrieval Augmented Generation operation)
  • "workflow" (a broader workflow or business process)
  • "component" (a sub-component within a larger system)
  • "module" (a logical module of code)
  • "server" (server-side operation)
  • "client" (client-side operation)
  • "producer" (message producer)
  • "consumer" (message consumer)
  • "task" (a background task or job)
  • "unknown"

SpanMetrics

A dictionary for quantitative measurements associated with a span.

prompt_tokens
Optional[int]

Number of tokens in the input/prompt to an LLM.

completion_tokens
Optional[int]

Number of tokens in the output/completion from an LLM.

cost
Optional[float]

Estimated or actual monetary cost of the operation (e.g., LLM API call cost).
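
For example, recording token usage and cost on an LLM span (the numbers are illustrative):

with langwatch.span(name="LLM Call", type="llm", model="gpt-4o-mini") as llm_span:
    # ... perform the call ...
    llm_span.update(
        metrics={
            "prompt_tokens": 120,
            "completion_tokens": 48,
            "cost": 0.00021,  # illustrative cost in USD
        }
    )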

SpanParams

A dictionary for parameters related to an operation, especially LLM calls. Examples include:

frequency_penalty
Optional[float]

Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model’s likelihood to repeat the same line verbatim.

logit_bias
Optional[Dict[str, float]]

Modify the likelihood of specified tokens appearing in the completion. Accepts a dictionary mapping token IDs (or tokens, depending on the model) to a bias value from -100 to 100.

logprobs
Optional[bool]

Whether to return log probabilities of the output tokens.

top_logprobs
Optional[int]

An integer between 0 and 5 specifying the number of most likely tokens to return at each token position, each with log probability. logprobs must be True if this is used.

max_tokens
Optional[int]

The maximum number of tokens to generate in the completion.

n
Optional[int]

How many completions to generate for each prompt.

presence_penalty
Optional[float]

Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model’s likelihood to talk about new topics.

seed
Optional[int]

If specified, the system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

stop
Optional[Union[str, List[str]]]

Up to 4 sequences where the API will stop generating further tokens.

stream
Optional[bool]

If set, partial message deltas will be sent, like in ChatGPT. Tokens will be sent as data-only server-sent events as they become available.

temperature
Optional[float]

What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.

top_p
Optional[float]

An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass.

tools
Optional[List[Dict[str, Any]]]

A list of tools the model may call. Currently, only functions are supported as a tool.

tool_choice
Optional[str]

Controls which (if any) tool is called by the model. none means the model will not call any tool. auto means the model can pick between generating a message or calling a tool.

user
Optional[str]

A unique identifier representing your end-user, which can help OpenAI to monitor and detect abuse.
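
For example, attaching LLM call parameters to a span:

with langwatch.span(
    name="Completion",
    type="llm",
    model="gpt-4o-mini",
    params={
        "temperature": 0.2,
        "max_tokens": 256,
        "stop": ["\n\n"],
    },
):
    pass  # ... perform the LLM call ...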

TraceMetadata

A dictionary (typically Dict[str, Any]) of key-value metadata attached to the entire trace, such as user IDs, session IDs, or any other contextual information relevant to the whole operation.