# Inspect Evals Test Utilities

This directory contains test utilities for validating datasets and other components of the Inspect Evals framework.

## Hugging Face Dataset Validation Utilities

The `tests/utils/huggingface.py` module provides utilities for validating Hugging Face datasets used in Inspect Evals. These utilities help ensure that datasets meet the expected structure and format requirements.

### `assert_huggingface_dataset_is_valid`

This function verifies that a Hugging Face dataset exists and is valid according to the Hugging Face API.

#### Usage

```python
from tests.utils.huggingface import assert_huggingface_dataset_is_valid

# Replace with your dataset path
DATASET_PATH = "tau/commonsense_qa"

def test_dataset_is_valid():
    """Test that the dataset URL is valid."""
    assert_huggingface_dataset_is_valid(DATASET_PATH)
```

#### Parameters

- `hf_dataset_path` (str): Path to the Hugging Face dataset (e.g., "tau/commonsense_qa")

  Your dataset must exist at "<https://huggingface.co/datasets/{hf_dataset_path}>"

#### Behavior

- Makes a request to the Hugging Face API to check if the dataset is valid
- Raises an `AssertionError` if the dataset is not valid or if the API returns an error

### `assert_huggingface_dataset_structure`

This function validates that a Hugging Face dataset has the expected structure, including configurations, splits, and features.

#### Usage

```python
from tests.utils.huggingface import (
    DatasetInfosDict,
    assert_huggingface_dataset_structure,
    get_dataset_infos_dict,
)

# Replace with your dataset path
DATASET_PATH = "tau/commonsense_qa"

def test_dataset_structure(dataset_infos: DatasetInfosDict):
    """Test that the dataset has the expected structure."""
    # Define the expected schema for your dataset
    schema = {
        "configs": {
            "default": {  # Config name
                "splits": ["train", "validation"],  # Required splits for this config
                "features": {  # Required features for this config
                    "id": str,  # Simple type check
                    "question": str,
                    "question_concept": str,
                    "choices": {  # Nested structure check
                        "type": "dict",
                        "fields": {
                            "label": list[str],
                            "text": list[str],
                        },
                    },
                    "answerKey": str,
                }
            }
        }
    }

    # Validate the dataset against the schema
    assert_huggingface_dataset_structure(dataset_infos, schema)
```

#### Parameters

- `dataset_infos_dict` (DatasetInfosDict): Dataset info dictionary from `get_dataset_infos_dict`
- `schema` (dict): Schema dictionary defining expected configs, splits, and features

#### Schema Format

The schema dictionary should have the following structure:

```python
{
    "dataset_name": "dataset-name",  # Optional dataset name
    "configs": {  # Required configurations with their splits and features
        "config1": {  # Configuration name
            "splits": ["train", "validation"],  # Required splits for this config
            "features": {  # Required features for this config
                "feature_name": str,  # Simple type check using Python types
                "choices": {  # Nested structure check
                    "type": "dict",
                    "fields": {
                        "label": list[str],
                        "text": list[str],
                    }
                },
                "sequence_feature": {  # Sequence type check
                    "type": "sequence",
                    "element_type": str
                },
                "list_feature": list[str]  # List type check using type annotations
            }
        },
        "config2": {  # Another configuration
            "splits": ["train"],
            "features": {
                "text": str,
                "label": int
            }
        }
    }
}
```

#### Behavior

- If no schema is provided, the function will fail and display the available dataset info
- Validates that all required configurations exist in the dataset
- Validates that all required splits exist for each configuration
- Validates that all required features exist with the expected types and structure

### Best Practices

1. Use a fixture to load dataset info once per test module:

    ```python
    @pytest.fixture(scope="module")
    def dataset_infos() -> DatasetInfosDict:
        """Fixture to provide dataset info dictionary for all tests in the module."""
        return get_dataset_infos_dict(DATASET_PATH)
    ```

2. Mark tests that interact with the Hugging Face API:

    ```python
    @pytest.mark.huggingface
    def test_dataset_is_valid():
        # Test implementation
    ```

3. Optionally, start with simple validation tests before testing detailed structure:

    ```python
    def test_dataset_contains_required_splits(dataset_infos: DatasetInfosDict):
        """Test that the dataset contains all required splits."""
        config_name = "default"
        assert config_name in dataset_infos
        
        dataset_info = dataset_infos[config_name]
        required_splits = ["train", "validation"]
        
        for split in required_splits:
            assert split in dataset_info.splits
    ```