Learn how to capture Retrieval Augmented Generation (RAG) data with LangWatch.
By capturing the contexts (retrieved documents) used by the LLM, you unlock several additional benefits in LangWatch, such as RAG-specific evaluations and analytics on how your documents are retrieved and used.
To capture a RAG span with the Python SDK, decorate the function that performs the retrieval with `@langwatch.span(type="rag")`. Inside this function, you should perform the retrieval and then update the span with the retrieved contexts. The contexts should be a list of strings or `RAGChunk` objects. The `RAGChunk` object allows you to provide more metadata about each retrieved chunk, such as `document_id` and `source`.
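For instance, a single chunk might be built like this (a minimal sketch; the import path and field values are illustrative and may vary by SDK version):

```python
from langwatch.types import RAGChunk  # import path may vary by SDK version

chunk = RAGChunk(
    document_id="doc-123",                   # illustrative identifier
    source="https://example.com/kb/france",  # illustrative source URL
    content="The capital of France is Paris.",
)
```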
Here’s an example:
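The sketch below assumes a simulated retrieval step and a placeholder implementation of `generate_answer_from_context` standing in for your actual LLM call:

```python
import langwatch

# Assumes your LangWatch API key is configured (e.g., via the LANGWATCH_API_KEY env var).


@langwatch.span(type="llm")
def generate_answer_from_context(question: str, contexts: list[str]) -> str:
    # Placeholder for your actual LLM call (OpenAI, Anthropic, a local model, etc.).
    return f"Answer to '{question}' based on {len(contexts)} retrieved documents."


@langwatch.span(type="rag")
def perform_rag(question: str) -> str:
    # Simulate a retrieval step; in a real system this would query your vector store.
    retrieved_docs = [
        "France is a country in Western Europe.",
        "The capital of France is Paris.",
    ]

    # Explicitly log the retrieved documents on the current RAG span.
    # RAGChunk objects can be passed here instead of plain strings.
    langwatch.get_current_span().update(contexts=retrieved_docs)

    return generate_answer_from_context(question, retrieved_docs)


@langwatch.trace()
def main():
    print(perform_rag("What is the capital of France?"))


if __name__ == "__main__":
    main()
```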
In this example:

- `perform_rag` is decorated with `@langwatch.span(type="rag")`.
- Inside `perform_rag`, we simulate a retrieval step.
- `langwatch.get_current_span().update(contexts=retrieved_docs)` is called to explicitly log the retrieved documents.
- The answer-generation function (`generate_answer_from_context`) is then called, and can itself be another span (e.g., an LLM span).

If you are using LangChain, you can capture RAG data from a retriever with `langwatch.langchain.capture_rag_from_retriever`. This function takes your retriever and a lambda function to transform the retrieved `Document` objects into `RAGChunk` objects.
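A sketch of wiring this into a small retrieval chain could look as follows (the FAISS vector store, embeddings, prompt, and model choice are illustrative assumptions; adapt them to your stack):

```python
import langwatch
from langwatch.types import RAGChunk  # import path may vary by SDK version

from langchain_community.vectorstores import FAISS
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI, OpenAIEmbeddings


def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


@langwatch.trace()
def ask(question: str) -> str:
    # Illustrative in-memory index; in practice, load your own vector store.
    vectorstore = FAISS.from_texts(
        ["The capital of France is Paris.", "France is in Western Europe."],
        embedding=OpenAIEmbeddings(),
        metadatas=[{"id": "doc-1"}, {"id": "doc-2"}],
    )

    # Wrap the retriever so each retrieval is captured as a RAG span,
    # mapping every LangChain Document to a LangWatch RAGChunk.
    retriever = langwatch.langchain.capture_rag_from_retriever(
        vectorstore.as_retriever(),
        lambda document: RAGChunk(
            document_id=document.metadata.get("id", ""),
            content=document.page_content,
        ),
    )

    prompt = ChatPromptTemplate.from_template(
        "Answer the question using only this context:\n{context}\n\nQuestion: {question}"
    )
    chain = (
        {"context": retriever | format_docs, "question": RunnablePassthrough()}
        | prompt
        | ChatOpenAI(model="gpt-4o-mini")
        | StrOutputParser()
    )

    # Pass the LangWatch callback so all LangChain operations end up in the trace.
    return chain.invoke(
        question,
        config={"callbacks": [langwatch.get_current_trace().get_langchain_callback()]},
    )
```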
In this pattern:

- `langwatch.langchain.capture_rag_from_retriever(retriever, lambda document: ...)` wraps your existing retriever.
- The `lambda document: RAGChunk(...)` defines how to map fields from LangChain's `Document` to LangWatch's `RAGChunk`. This is crucial for providing detailed context information.
- Remember to pass `langwatch.get_current_trace().get_langchain_callback()` in your `RunnableConfig` when invoking the chain/agent, so that all LangChain operations are captured.
- If your retrieval happens inside a LangChain `BaseTool`, you can use `langwatch.langchain.capture_rag_from_tool` instead.
Note that the `capture_rag_from_tool` approach is generally less direct than capturing from a retriever, because you have to parse the tool's output (usually a string) to extract structured context information; `capture_rag_from_retriever` is preferred when you are dealing directly with LangChain retrievers.
By effectively capturing RAG spans, you gain much richer data in LangWatch, enabling more powerful analysis and evaluation of your RAG systems. Refer to the SDK examples for more detailed implementations.