Instrumenting Custom Evaluator

If you have a custom evaluator built in-house that runs on your own code, either during the LLM pipeline or after it, you can still capture the evaluation results and connect them back to the trace to visualize them together with the other LangWatch evaluators.

You can capture the evaluation results of your custom evaluator on the current trace or span by using the .add_evaluation method:

import langwatch

@langwatch.span(type="evaluation")
def evaluation_step():
    ... # your custom evaluation logic

    langwatch.get_current_span().add_evaluation(
        name="custom evaluation", # required
        passed=True,
        score=0.5,
        label="category_detected",
        details="explanation of the evaluation results",
    )

The evaluation name is required and must be a string. The other fields are optional, but at least one of passed, score or label must be provided.
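
The rule above can be checked before the call is made. The following is a minimal sketch, not part of the LangWatch SDK: `build_evaluation_kwargs` is a hypothetical helper that enforces the stated constraints (a string name, plus at least one of passed, score or label) and drops unset fields, so the resulting dictionary can be unpacked into .add_evaluation.

```python
from typing import Any, Dict, Optional


def build_evaluation_kwargs(
    name: str,
    passed: Optional[bool] = None,
    score: Optional[float] = None,
    label: Optional[str] = None,
    details: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: validate and assemble arguments for add_evaluation."""
    # The evaluation name is required and must be a string.
    if not isinstance(name, str):
        raise ValueError("name is required and must be a string")
    # At least one of passed, score or label must be provided.
    if passed is None and score is None and label is None:
        raise ValueError("provide at least one of passed, score or label")
    optional = {"passed": passed, "score": score, "label": label, "details": details}
    kwargs: Dict[str, Any] = {"name": name}
    # Keep only the optional fields that were actually set (passed=False is kept).
    kwargs.update({key: value for key, value in optional.items() if value is not None})
    return kwargs


kwargs = build_evaluation_kwargs(name="custom evaluation", score=0.5)
# Inside a function decorated with @langwatch.span(type="evaluation"),
# the result could then be passed along as:
# langwatch.get_current_span().add_evaluation(**kwargs)
```

Validating up front keeps a malformed result from reaching the trace; whether the SDK itself raises on invalid input is not stated here, so the helper makes no assumption about it.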
