Implement A/B testing for your prompts using LangWatch’s version control and analytics
LangWatch supports A/B testing through prompt versioning: you create different versions of a prompt, and your application randomly alternates between them at request time while LangWatch tracks performance metrics for each version.
Create different versions of your prompt for testing:
```typescript
import { LangWatch } from "langwatch";

const langwatch = new LangWatch({ apiKey: process.env.LANGWATCH_API_KEY });

// Create base prompt
const basePrompt = await langwatch.prompts.create({
  handle: "customer-support-bot",
  scope: "PROJECT",
  prompt: "You are a helpful customer support agent. Help with: {{input}}",
  inputs: [{ identifier: "input", type: "str" }],
  outputs: [{ identifier: "response", type: "str" }],
  model: "openai/gpt-4o-mini",
});

// Create variant A (friendly tone) - captures version number
const variantA = await langwatch.prompts.update("customer-support-bot", {
  prompt:
    "You are a friendly and empathetic customer support agent. Use a warm, helpful tone. Help with: {{input}}",
});

// Create variant B (professional tone) - captures version number
const variantB = await langwatch.prompts.update("customer-support-bot", {
  prompt:
    "You are a professional and efficient customer support agent. Be concise and solution-focused. Help with: {{input}}",
});

// Store version numbers for A/B testing
const versions = {
  base: basePrompt.version,
  friendly: variantA.version,
  professional: variantB.version,
};

console.log("Version numbers:", versions);
```
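With the version numbers in hand, your application can split traffic at request time. The sketch below is a minimal example of that routing; the `prompts.get` options object with a `version` field and the `compile()` template helper are assumptions about the SDK, so check the client reference for the exact signatures:

```typescript
import { LangWatch } from "langwatch";

const langwatch = new LangWatch({ apiKey: process.env.LANGWATCH_API_KEY });

// Version numbers captured in the previous snippet (placeholder values here)
const versions = { friendly: 2, professional: 3 };

// 50/50 random split between the two variants
function pickVariant(): keyof typeof versions {
  return Math.random() < 0.5 ? "friendly" : "professional";
}

async function respond(input: string) {
  const variant = pickVariant();

  // Assumption: prompts.get accepts an options object with a version number;
  // verify against your SDK version.
  const prompt = await langwatch.prompts.get("customer-support-bot", {
    version: versions[variant],
  });

  // Assumption: compile() substitutes template variables like {{input}}.
  const compiled = prompt.compile({ input });

  // Call your LLM with the compiled prompt here, and tag the resulting
  // trace with the variant name so LangWatch can segment metrics per version.
  return { variant, compiled };
}
```

Because each request records the version that served it, LangWatch can attribute its performance metrics to the right variant, which is what makes the comparison in the next step possible.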
Compare metrics between versions in the LangWatch UI to see which variant performs better. Use this data to make informed decisions about which prompt version to use in production.
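Once a winner emerges, one option is to promote it by publishing its text again with `prompts.update`, so the winning wording becomes the newest version. A hedged sketch, assuming that consumers fetching the prompt without pinning a version receive the latest one:

```typescript
import { LangWatch } from "langwatch";

const langwatch = new LangWatch({ apiKey: process.env.LANGWATCH_API_KEY });

// Suppose the "friendly" variant won the test: re-publish its text so it
// becomes the latest version of the prompt.
await langwatch.prompts.update("customer-support-bot", {
  prompt:
    "You are a friendly and empathetic customer support agent. Use a warm, helpful tone. Help with: {{input}}",
});
```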