Datasets are at the core of the Optimization Studio's functionality. When working with non-deterministic systems like LLMs, running your tests across multiple examples is crucial for having confidence in your results: while you might get lucky with a single successful test, running your LLM against hundreds of examples provides much more reliable validation of your solution. The good news is that you don't need an enormous dataset to get started. As few as 20 examples can already provide meaningful results with the DSPy optimizers, thanks to their intelligent use of LLM capabilities.
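To give a sense of that scale (this is an illustration, not something the Studio requires you to write by hand), here is what a starting set of roughly 20 examples looks like when expressed as DSPy examples. The `question`/`answer` field names are placeholders; use whatever inputs and outputs your own workflow expects:

```python
import dspy

# Illustrative only: around 20 input/output pairs is a practical starting point.
# Field names are placeholders; match them to your workflow's inputs and outputs.
raw_pairs = [
    ("What is the capital of France?", "Paris"),
    ("Who wrote 'Pride and Prejudice'?", "Jane Austen"),
    # ... roughly 18 more pairs
]

trainset = [
    dspy.Example(question=q, answer=a).with_inputs("question")
    for q, a in raw_pairs
]
```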
If you already use LangWatch for monitoring, you can import the production data generated by your LLMs as a dataset. Otherwise, you can create or import a new dataset directly in the Optimization Studio.
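If you are building a dataset from scratch, a simple tabular file with one column per input and expected output is enough to import. A minimal sketch, with hypothetical column names (align them with the fields your workflow actually uses):

```python
import pandas as pd

# Hypothetical column names; rename them to match your workflow's fields.
rows = [
    {"input": "What is the capital of France?", "expected_output": "Paris"},
    {"input": "Who wrote 'Pride and Prejudice'?", "expected_output": "Jane Austen"},
]

# Save as CSV and import it through the Studio's dataset panel.
pd.DataFrame(rows).to_csv("my_dataset.csv", index=False)
```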
While detailed evaluation is covered in later tutorials, the basic workflow involves:
1. Clicking the Evaluate button
2. Documenting the changes made to your pipeline
3. Selecting which dataset partition to evaluate against (see the partitioning sketch after this list)
4. Adding the necessary LLM API keys
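The partitions themselves are managed for you in the Studio, but conceptually a partition is just a reproducible split of the dataset rows. A minimal sketch of what that means, assuming a hypothetical 80/20 split:

```python
import random

def split_dataset(rows, test_fraction=0.2, seed=42):
    """Shuffle rows deterministically and split them into train and held-out test partitions."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

train_rows, test_rows = split_dataset(list(range(100)))
print(len(train_rows), len(test_rows))  # 80 20
```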
The evaluation panel provides (the sketch after this list shows how these figures relate):
- Total entries processed
- Average cost per entry
- Total runtime
- Overall experiment costs
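These figures are simple aggregates over the per-entry results. A minimal sketch of how they relate, using hypothetical per-entry numbers and assuming entries run sequentially (so total runtime is the sum of per-entry runtimes):

```python
from dataclasses import dataclass

@dataclass
class EntryResult:
    cost_usd: float   # LLM spend for this single entry
    runtime_s: float  # wall-clock time for this single entry

# Hypothetical per-entry results, just to show how the panel's numbers relate.
results = [EntryResult(0.002, 1.4), EntryResult(0.003, 1.1), EntryResult(0.0025, 1.6)]

total_entries = len(results)
overall_cost = sum(r.cost_usd for r in results)
average_cost = overall_cost / total_entries
total_runtime = sum(r.runtime_s for r in results)  # assumes sequential execution

print(total_entries, round(average_cost, 4), round(total_runtime, 1), round(overall_cost, 4))
```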
This foundation in dataset management sets you up for evaluating output quality and running automated optimizations, both of which are covered in the subsequent tutorials.