Azure AI Inference SDK Instrumentation
Learn how to instrument the Azure AI Inference Python SDK with LangWatch.
The `azure-ai-inference` Python SDK provides a unified way to interact with various AI models deployed on Azure, including those on Azure OpenAI Service, GitHub Models, and Azure AI Foundry Serverless/Managed Compute endpoints. For more details on the SDK, refer to the official Azure AI Inference client library documentation.
LangWatch can capture traces generated by the `azure-ai-inference` SDK by leveraging its built-in OpenTelemetry support. This guide shows you how to set it up.
Prerequisites
- Install the LangWatch SDK.
- Install the Azure AI Inference SDK with OpenTelemetry support: the `azure-ai-inference` SDK can be installed with OpenTelemetry capabilities. You might also need the core Azure OpenTelemetry tracing package. Refer to the Azure SDK documentation for the most up-to-date installation instructions; a sketch of the install commands is shown after this list.
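As a rough guide, the install commands look like the following. The `opentelemetry` extra and the `azure-core-tracing-opentelemetry` package name are as documented by Azure at the time of writing, so verify them against the current Azure SDK docs:

```bash
pip install langwatch
# The opentelemetry extra pulls in the Azure tracing integration.
pip install "azure-ai-inference[opentelemetry]"
# Core Azure OpenTelemetry tracing package, if not already installed.
pip install azure-core-tracing-opentelemetry
```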
Instrumentation with `AIInferenceInstrumentor`
The `azure-ai-inference` SDK provides an `AIInferenceInstrumentor` that automatically captures traces for its operations when enabled. LangWatch, once set up, includes an OpenTelemetry exporter that collects these traces.
Here’s how to instrument your application:
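The following is a minimal, self-contained sketch. The endpoint and key, read here from hypothetical `AZURE_AI_ENDPOINT` and `AZURE_AI_API_KEY` environment variables, are placeholders for your own deployment's values:

```python
import os

import langwatch
from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import SystemMessage, UserMessage
from azure.ai.inference.tracing import AIInferenceInstrumentor
from azure.core.credentials import AzureKeyCredential

# Initialize LangWatch; this sets up an OpenTelemetry trace exporter.
langwatch.setup()

# Optional: record prompt/completion content in the spans (off by default).
os.environ.setdefault("AZURE_TRACING_GEN_AI_CONTENT_RECORDING_ENABLED", "true")

# Patch the Azure AI Inference clients so their calls emit OpenTelemetry spans.
AIInferenceInstrumentor().instrument()

# Placeholder endpoint and key -- replace with your own deployment's values.
client = ChatCompletionsClient(
    endpoint=os.environ["AZURE_AI_ENDPOINT"],
    credential=AzureKeyCredential(os.environ["AZURE_AI_API_KEY"]),
)

@langwatch.trace()
def get_ai_response(question: str) -> str:
    # The complete() call below is auto-instrumented; its span nests
    # under the parent trace created by the decorator above.
    response = client.complete(
        messages=[
            SystemMessage(content="You are a helpful assistant."),
            UserMessage(content=question),
        ],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(get_ai_response("What is the capital of France?"))
```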
The example uses the synchronous `ChatCompletionsClient` for simplicity in demonstrating instrumentation. The `azure-ai-inference` SDK also provides asynchronous clients under the `azure.ai.inference.aio` namespace (e.g., `azure.ai.inference.aio.ChatCompletionsClient`). If you are using `async`/`await` in your application, you should use these asynchronous clients, as shown in the sketch below. The `AIInferenceInstrumentor` works with both synchronous and asynchronous clients.
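As a rough illustration, again with the same placeholder endpoint and key, the asynchronous variant looks like this:

```python
import asyncio
import os

from azure.ai.inference.aio import ChatCompletionsClient
from azure.ai.inference.models import UserMessage
from azure.core.credentials import AzureKeyCredential

async def main() -> None:
    # Placeholder endpoint and key -- replace with your own values.
    async with ChatCompletionsClient(
        endpoint=os.environ["AZURE_AI_ENDPOINT"],
        credential=AzureKeyCredential(os.environ["AZURE_AI_API_KEY"]),
    ) as client:
        # The awaited complete() call is instrumented the same way
        # as its synchronous counterpart.
        response = await client.complete(
            messages=[UserMessage(content="Hello!")],
        )
        print(response.choices[0].message.content)

asyncio.run(main())
```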
How it Works
- `langwatch.setup()`: Initializes the LangWatch SDK, which includes setting up an OpenTelemetry trace exporter. This exporter is ready to receive spans from any OpenTelemetry-instrumented library in your application.
- `AIInferenceInstrumentor().instrument()`: This command, provided by the `azure-ai-inference` SDK, patches the relevant Azure AI clients (like `ChatCompletionsClient` or `EmbeddingsClient`) to automatically create OpenTelemetry spans for their operations (e.g., a `complete` or `embed` call).
- `@langwatch.trace()`: By decorating your own functions (like `get_ai_response` in the example), you create a parent trace in LangWatch. The spans automatically generated by the `AIInferenceInstrumentor` for calls made within the decorated function are then nested under this parent trace, providing a full end-to-end view of your operation.
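If you need to switch the automatic spans off again (for example, in tests), the instrumentor can be unhooked. A small sketch, assuming the `uninstrument()` and `is_instrumented()` methods documented for `AIInferenceInstrumentor`:

```python
from azure.ai.inference.tracing import AIInferenceInstrumentor

instrumentor = AIInferenceInstrumentor()
instrumentor.instrument()

# ... run instrumented code ...

# Remove the patches when automatic tracing is no longer wanted.
if instrumentor.is_instrumented():
    instrumentor.uninstrument()
```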
With this setup, calls made using the `azure-ai-inference` clients are automatically traced and sent to LangWatch, giving you visibility into the performance and behavior of your AI model interactions.