# Inspect Evals Test Utilities

This directory contains test utilities for validating datasets and other components of the Inspect Evals framework.

## Hugging Face Dataset Validation Utilities

The `tests/utils/huggingface.py` module provides utilities for validating Hugging Face datasets used in Inspect Evals. These utilities help ensure that datasets meet the expected structure and format requirements.

### `assert_huggingface_dataset_is_valid`

This function verifies that a Hugging Face dataset exists and is valid according to the Hugging Face API. It is used by the master test in `tests/test_datasets_hf.py`, which automatically discovers and validates all HF datasets used across evaluations. **Individual eval test files do not need to call this function** — the master test handles it.

### `assert_huggingface_dataset_structure`

This function validates that a Hugging Face dataset has the expected structure, including configurations, splits, and features.

#### Usage

```python
from tests.utils.huggingface import (
    DatasetInfosDict,
    assert_huggingface_dataset_structure,
    get_dataset_infos_dict,
)

# Replace with your dataset path
DATASET_PATH = "tau/commonsense_qa"

def test_dataset_structure(dataset_infos: DatasetInfosDict):
    """Test that the dataset has the expected structure."""
    # Define the expected schema for your dataset
    schema = {
        "configs": {
            "default": {  # Config name
                "splits": ["train", "validation"],  # Required splits for this config
                "features": {  # Required features for this config
                    "id": str,  # Simple type check
                    "question": str,
                    "question_concept": str,
                    "choices": {  # Nested structure check
                        "type": "dict",
                        "fields": {
                            "label": list[str],
                            "text": list[str],
                        },
                    },
                    "answerKey": str,
                }
            }
        }
    }

    # Validate the dataset against the schema
    assert_huggingface_dataset_structure(dataset_infos, schema)
```

#### Parameters

- `dataset_infos_dict` (DatasetInfosDict): Dataset info dictionary from `get_dataset_infos_dict`
- `schema` (dict): Schema dictionary defining expected configs, splits, and features

#### Schema Format

The schema dictionary should have the following structure:

```python
{
    "dataset_name": "dataset-name",  # Optional dataset name
    "configs": {  # Required configurations with their splits and features
        "config1": {  # Configuration name
            "splits": ["train", "validation"],  # Required splits for this config
            "features": {  # Required features for this config
                "feature_name": str,  # Simple type check using Python types
                "choices": {  # Nested structure check
                    "type": "dict",
                    "fields": {
                        "label": list[str],
                        "text": list[str],
                    }
                },
                "sequence_feature": {  # Sequence type check
                    "type": "sequence",
                    "element_type": str
                },
                "list_feature": list[str]  # List type check using type annotations
            }
        },
        "config2": {  # Another configuration
            "splits": ["train"],
            "features": {
                "text": str,
                "label": int
            }
        }
    }
}
```

#### Behavior

- If no schema is provided, the function will fail and display the available dataset info
- Validates that all required configurations exist in the dataset
- Validates that all required splits exist for each configuration
- Validates that all required features exist with the expected types and structure

### Best Practices

1. Use a fixture to load dataset info once per test module:

    ```python
    @pytest.fixture(scope="module")
    def dataset_infos() -> DatasetInfosDict:
        """Fixture to provide dataset info dictionary for all tests in the module."""
        return get_dataset_infos_dict(DATASET_PATH)
    ```

2. Mark tests that interact with the Hugging Face API with `@pytest.mark.huggingface`.

3. Optionally, start with simple validation tests before testing detailed structure:

    ```python
    def test_dataset_contains_required_splits(dataset_infos: DatasetInfosDict):
        """Test that the dataset contains all required splits."""
        config_name = "default"
        assert config_name in dataset_infos
        
        dataset_info = dataset_infos[config_name]
        required_splits = ["train", "validation"]
        
        for split in required_splits:
            assert split in dataset_info.splits
    ```
