As an example of a RAG application, we will use the sample app provided in the official documentation of the DSPy library; you can read more by following this link - RAG tutorial.
First, let's access the dataset of wiki abstracts that will be used for the example RAG optimization.
```python
import dspy

turbo = dspy.OpenAI(model='gpt-3.5-turbo')
colbertv2_wiki17_abstracts = dspy.ColBERTv2(url='http://20.102.90.50:2017/wiki17_abstracts')
dspy.settings.configure(lm=turbo, rm=colbertv2_wiki17_abstracts)

from dspy.datasets import HotPotQA

# Load the dataset.
dataset = HotPotQA(train_seed=1, train_size=20, eval_seed=2023, dev_size=50, test_size=0)

# Tell DSPy that the 'question' field is the input. Any other fields are labels and/or metadata.
trainset = [x.with_inputs('question') for x in dataset.train]
devset = [x.with_inputs('question') for x in dataset.dev]

len(trainset), len(devset)
```
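To make the input/label split concrete, here is a plain-Python sketch of what `with_inputs('question')` does conceptually. The `with_inputs` helper below is a hypothetical stand-in, not the actual `dspy.Example` API:

```python
# Plain-Python sketch of the input/label split (hypothetical stand-in for
# dspy.Example.with_inputs, which marks 'question' as the input field).
def with_inputs(example, *input_keys):
    inputs = {k: v for k, v in example.items() if k in input_keys}
    labels = {k: v for k, v in example.items() if k not in input_keys}
    return inputs, labels

example = {"question": "Who wrote Hamlet?", "answer": "William Shakespeare"}
inputs, labels = with_inputs(example, "question")
print(inputs)  # → {'question': 'Who wrote Hamlet?'}
print(labels)  # → {'answer': 'William Shakespeare'}
```

At optimization time, only the inputs are fed to the program; the remaining fields are used as labels by the metric.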
The next step is to define the RAG module itself. In the signature you can explain the task and what the expected outputs mean, so that an LLM can optimize these prompts later.
```python
class GenerateAnswer(dspy.Signature):
    """Answer questions with short factoid answers."""

    context = dspy.InputField(desc="may contain relevant facts")
    question = dspy.InputField()
    answer = dspy.OutputField(desc="often between 1 and 5 words")


class RAG(dspy.Module):
    def __init__(self, num_passages=3):
        super().__init__()
        self.retrieve = dspy.Retrieve(k=num_passages)
        self.generate_answer = dspy.ChainOfThought(GenerateAnswer)

    def forward(self, question):
        context = self.retrieve(question).passages
        prediction = self.generate_answer(context=context, question=question)
        return dspy.Prediction(context=context, answer=prediction.answer)
```
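The retrieve-then-generate flow of `forward` can be sketched in plain Python with stand-ins for the retriever and the language model. Everything here (`fake_retrieve`, `fake_generate_answer`, the canned corpus) is hypothetical and exists only to show the data flow:

```python
# Conceptual sketch of the RAG forward pass (stand-ins for dspy.Retrieve
# and dspy.ChainOfThought; all names here are hypothetical).

def fake_retrieve(question, k=3):
    # A real retriever would query the ColBERTv2 index; we return canned passages.
    corpus = {
        "Who wrote Hamlet?": [
            "Hamlet is a tragedy by William Shakespeare.",
            "Shakespeare wrote Hamlet around 1600.",
            "Hamlet is set in Denmark.",
        ],
    }
    return corpus.get(question, [])[:k]

def fake_generate_answer(context, question):
    # A real LM would reason over the context; we just scan for a keyword.
    return "William Shakespeare" if any("Shakespeare" in p for p in context) else "unknown"

def rag_forward(question, num_passages=3):
    context = fake_retrieve(question, k=num_passages)     # retrieve step
    answer = fake_generate_answer(context, question)      # generate step
    return {"context": context, "answer": answer}

print(rag_forward("Who wrote Hamlet?")["answer"])  # → William Shakespeare
```

The real module does the same two steps, except retrieval hits the wiki17 index and generation is a chain-of-thought LLM call.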
Finally, you can connect to LangWatch. After running the code snippet below, you will get a link that gives you access to an API key in the browser. Paste the API key into the prompt in your code editor and press Enter; you are now connected to LangWatch.
The last step is to actually run the prompt optimizer. In this example, BootstrapFewShot is used; it will bootstrap our prompt with the best demos from our dataset.
```python
import langwatch

from dspy.teleprompt import BootstrapFewShot
from dspy import evaluate
from dotenv import load_dotenv

load_dotenv()

# Validation logic: check that the predicted answer is correct.
# Also check that the retrieved context does actually contain that answer.
def validate_context_and_answer(example, pred, trace=None):
    answer_EM = evaluate.answer_exact_match(example, pred)
    answer_PM = evaluate.answer_passage_match(example, pred)
    return answer_EM and answer_PM

# Set up a basic teleprompter, which will compile our RAG program.
teleprompter = BootstrapFewShot(metric=validate_context_and_answer)
langwatch.dspy.init(experiment="rag-dspy-tutorial", optimizer=teleprompter)

# Compile!
compiled_rag = teleprompter.compile(RAG(), trainset=trainset)
```
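The bootstrapping idea behind the compile step can be sketched in a few lines of plain Python: run the unoptimized program on training examples and keep the traces that pass the metric as few-shot demos. This is a conceptual sketch with toy stand-ins, not the DSPy implementation:

```python
# Conceptual sketch of few-shot bootstrapping (not the DSPy implementation):
# keep only the (example, prediction) pairs that pass the metric as demos.

def bootstrap_demos(program, trainset, metric, max_demos=4):
    demos = []
    for example in trainset:
        pred = program(example)         # run the unoptimized program
        if metric(example, pred):       # keep only successful traces
            demos.append((example, pred))
        if len(demos) >= max_demos:
            break
    return demos

# Toy usage with a trivial "program" and an exact-match metric:
toy_train = [{"question": "2+2?", "answer": "4"}, {"question": "3+3?", "answer": "7"}]
toy_program = lambda ex: {"answer": "4" if ex["question"] == "2+2?" else "6"}
toy_metric = lambda ex, pred: ex["answer"] == pred["answer"]
print(len(bootstrap_demos(toy_program, toy_train, toy_metric)))  # → 1
```

The demos that survive this filter are what get inserted into the compiled prompt, which is why the metric you pass to BootstrapFewShot directly shapes the optimized program.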
The results of the optimization can be found on your LangWatch dashboard. On the graph you can see how many demos were bootstrapped during the first optimization step.
Additionally, you can see every LLM call made during the optimization, with the corresponding costs and token counts.