If you have a custom evaluator built in-house that runs on your own code, either during the LLM pipeline or after it, you can still capture the evaluation results
and connect them back to the trace to visualize them together with the other LangWatch evaluators.
In Python, you can capture the evaluation results of your custom evaluator on the current trace or span by using the .add_evaluation method:
import langwatch

@langwatch.span(type="evaluation")
def evaluation_step():
    ...  # your custom evaluation logic
    langwatch.get_current_span().add_evaluation(
        name="custom evaluation",  # required
        passed=True,
        score=0.5,
        label="category_detected",
        details="explanation of the evaluation results",
    )
The evaluation name is required and must be a string. The other fields are optional, but at least one of passed, score or label must be provided.
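As a rough illustration of how a custom evaluator's output can map onto these fields, here is a minimal sketch. The evaluator itself (`evaluate_no_sensitive_terms`), the `BANNED_TERMS` list, and the label values are hypothetical and only serve as an example; the only LangWatch call referenced is the `add_evaluation` method shown above, and it appears in a comment so the block runs without the SDK installed.

```python
from typing import TypedDict


class EvaluationResult(TypedDict, total=False):
    """Optional evaluation fields; at least one of passed, score or label is needed."""
    passed: bool
    score: float
    label: str
    details: str


# Hypothetical list of terms that the LLM output should not contain.
BANNED_TERMS = ("password", "ssn")


def evaluate_no_sensitive_terms(output: str) -> EvaluationResult:
    """Hypothetical in-house evaluator: flags outputs that contain banned terms."""
    lowered = output.lower()
    hits = [term for term in BANNED_TERMS if term in lowered]
    return {
        "passed": not hits,
        # Fraction of banned terms that were NOT found (1.0 means clean).
        "score": 1.0 - len(hits) / len(BANNED_TERMS),
        "label": "sensitive_terms_detected" if hits else "clean",
        "details": f"matched terms: {hits}" if hits else "no banned terms found",
    }


# Inside a function decorated with @langwatch.span(type="evaluation"),
# the result could then be attached to the current span:
#
#     result = evaluate_no_sensitive_terms(llm_output)
#     langwatch.get_current_span().add_evaluation(name="sensitive terms", **result)
```

Keeping the evaluation logic in a plain function that returns the fields makes it easy to unit test separately from the tracing code.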
In TypeScript, you can capture the evaluation results of your custom evaluator on the current trace or span by using the .addEvaluation method:
import { type LangWatchTrace } from "langwatch";

async function llmStep({ message, trace }: { message: string, trace: LangWatchTrace }): Promise<string> {
  const span = trace.startLLMSpan({ name: "llmStep" });

  // ... your existing code

  span.addEvaluation({
    name: "custom evaluation",
    passed: true,
    score: 0.5,
    label: "category_detected",
    details: "explanation of the evaluation results",
  });
}
The evaluation name is required and must be a string. The other fields are optional, but at least one of passed, score or label must be provided.
REST API Specification
Endpoint
POST /api/collector
Headers
X-Auth-Token: Your LangWatch API key.
Request Body
{
  "trace_id": "id of the message the evaluation was run on",
  "evaluations": [{
    "evaluation_id": "evaluation-id-123", // optional unique id for identifying the evaluation; if not provided, a random id will be generated
    "name": "custom evaluation", // required
    "passed": true, // optional
    "score": 0.5, // optional
    "label": "category_detected", // optional
    "details": "explanation of the evaluation results", // optional
    "error": { // optional, captures error details in case the evaluation had an error
      "message": "error message",
      "stacktrace": []
    },
    "timestamps": { // optional
      "created_at": "1723411698506", // unix timestamp in milliseconds
      "updated_at": "1723411698506" // unix timestamp in milliseconds
    }
  }]
}
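For evaluators that run outside the SDK, the request above could be sent with nothing but the Python standard library. The sketch below is illustrative rather than an official client: the host in `LANGWATCH_ENDPOINT`, the `Content-Type` header, and the helper names `build_evaluation_payload` and `send_evaluation` are assumptions. Adjust the host if you use a self-hosted instance.

```python
import json
import urllib.request
from typing import Optional

# Assumption: hosted LangWatch instance; replace with your own host if self-hosting.
LANGWATCH_ENDPOINT = "https://app.langwatch.ai/api/collector"


def build_evaluation_payload(
    trace_id: str,
    name: str,
    *,
    passed: Optional[bool] = None,
    score: Optional[float] = None,
    label: Optional[str] = None,
    details: Optional[str] = None,
) -> dict:
    """Build the request body, enforcing that passed, score or label is set."""
    if passed is None and score is None and label is None:
        raise ValueError("at least one of passed, score or label must be provided")
    evaluation: dict = {"name": name}
    optional_fields = {"passed": passed, "score": score, "label": label, "details": details}
    for key, value in optional_fields.items():
        if value is not None:  # leave out fields that were not provided
            evaluation[key] = value
    return {"trace_id": trace_id, "evaluations": [evaluation]}


def send_evaluation(payload: dict, api_key: str, endpoint: str = LANGWATCH_ENDPOINT) -> int:
    """POST the payload with the X-Auth-Token header and return the HTTP status code."""
    request = urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-Auth-Token": api_key},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

A call might look like `send_evaluation(build_evaluation_payload("trace-123", "custom evaluation", score=0.5), api_key)`, where `api_key` holds your LangWatch API key.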