Utils

Selectors

Selectors are used to control the action selection behavior of a bandit. You can provide a selector to each bandit. If you don't provide one, a default selector will be used.

Currently, we provide the following selectors:

ArgMaxSelector: Selects the action with the highest estimated reward.
EpsilonGreedySelector: Selects the action with the highest estimated reward with probability 1 - epsilon and a random action with probability epsilon.
RandomSelector: Selects k random actions from the available actions.
TopKSelector: Selects the k actions with the highest estimated reward.
EpsilonGreedyTopKSelector: Selects the top k arms with probability 1 - epsilon or k random arms with probability epsilon.

To implement your own selector, subclass the AbstractSelector class and implement its __call__ method, which makes instances of your class callable.
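As a minimal sketch of a custom selector, the example below defines a hypothetical `SoftmaxSelector` that samples one arm per row in proportion to the softmax of its scores. To keep it runnable without dependencies, `SelectorBase` is a local stand-in for the library's abstract selector, and nested Python lists stand in for the `(batch_size, n_arms)` tensors. The names `SelectorBase`, `SoftmaxSelector`, and `temperature` are assumptions made for illustration and are not part of the library.

```python
import math
import random
from abc import ABC, abstractmethod


class SelectorBase(ABC):
    """Local stand-in for the library's abstract selector (assumption)."""

    @abstractmethod
    def __call__(self, scores: list[list[float]]) -> list[list[int]]:
        """Return one-hot encoded actions with the same shape as `scores`."""


class SoftmaxSelector(SelectorBase):
    """Hypothetical selector: samples one arm per row from softmax(scores / temperature)."""

    def __init__(self, temperature: float = 1.0, seed: int = 0) -> None:
        assert temperature > 0, "temperature must be positive"
        self.temperature = temperature
        self.rng = random.Random(seed)

    def __call__(self, scores: list[list[float]]) -> list[list[int]]:
        chosen = []
        for row in scores:
            # Subtract the row maximum for numerical stability before exponentiating.
            top = max(row)
            weights = [math.exp((s - top) / self.temperature) for s in row]
            arm = self.rng.choices(range(len(row)), weights=weights, k=1)[0]
            chosen.append([1 if i == arm else 0 for i in range(len(row))])
        return chosen


selector = SoftmaxSelector(temperature=0.5, seed=0)
actions = selector([[0.1, 2.0, 0.3], [1.5, 0.2, 0.1]])
```

With the library itself, the same pattern would subclass the abstract selector documented below and return a tensor instead of nested lists.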

`AbstractSelector`

Bases: ABC

Defines the interface for all bandit action selectors.

Given a tensor of scores per action, the selector chooses an action (i.e. an arm) or a set of actions (i.e. a super arm in combinatorial bandits). The selector returns a one-hot encoded tensor of the chosen actions; for a super arm, every chosen arm is marked with a 1.

`__call__(scores)` `abstractmethod`

Selects a single action, or a set of actions in the case of combinatorial bandits.

Parameters:

Name	Type	Description	Default
`scores`	`Tensor`	Scores for each action. Shape: (batch_size, n_arms). This may contain a probability distribution per sample or simply a score per arm (e.g. for UCB). In case of combinatorial bandits, these are the scores per arm from which the oracle selects a super arm (e.g. simply top-k).	required

Returns:

Type	Description
`Tensor`	One hot encoded actions that were chosen. Shape: (batch_size, n_arms).

Source code in src/calvera/utils/selectors.py

@abstractmethod
def __call__(self, scores: torch.Tensor) -> torch.Tensor:
    """Selects a single action, or a set of actions in the case of combinatorial bandits.

    Args:
        scores: Scores for each action. Shape: (batch_size, n_arms).
            This may contain a probability distribution per sample or simply a score per
            arm (e.g. for UCB). In case of combinatorial bandits, these are the scores
            per arm from which the oracle selects a super arm (e.g. simply top-k).

    Returns:
        One hot encoded actions that were chosen. Shape: (batch_size, n_arms).
    """
    pass

`ArgMaxSelector`

Bases: AbstractSelector

Selects the action with the highest score from a batch of scores.
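The behavior can be illustrated with a small sketch that does not depend on the library. Nested lists stand in for the `(batch_size, n_arms)` score tensor, the helper name `argmax_one_hot` is hypothetical, and breaking ties toward the first maximal index is an assumption of this sketch rather than a documented property of `ArgMaxSelector`.

```python
def argmax_one_hot(scores: list[list[float]]) -> list[list[int]]:
    """One-hot encode the highest-scoring arm in each row (first index wins ties)."""
    result = []
    for row in scores:
        # max() with a key returns the first index that attains the maximum score.
        best = max(range(len(row)), key=row.__getitem__)
        result.append([int(i == best) for i in range(len(row))])
    return result


print(argmax_one_hot([[0.1, 0.9, 0.3], [0.7, 0.2, 0.7]]))
# → [[0, 1, 0], [1, 0, 0]]
```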

`EpsilonGreedySelector(epsilon=0.1, seed=None)`

Bases: AbstractSelector

Implements an epsilon-greedy action selection strategy.

Parameters:

Name	Type	Description	Default
`epsilon`	`float`	Exploration probability. Must be between 0 and 1.	`0.1`
`seed`	`int \| None`	Random seed for the generator. Defaults to None, in which case the generator is not explicitly seeded.	`None`

Source code in src/calvera/utils/selectors.py

def __init__(self, epsilon: float = 0.1, seed: int | None = None) -> None:
    """Initialize the epsilon-greedy selector.

    Args:
        epsilon: Exploration probability. Must be between 0 and 1.
        seed: Random seed for the generator. Defaults to None (no explicit seed is set).
    """
    assert 0 <= epsilon <= 1, "Epsilon must be between 0 and 1"
    self.epsilon = epsilon
    self.generator = torch.Generator()
    if seed is not None:
        self.generator.manual_seed(seed)
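The selection rule itself can be sketched without the library as follows. This is an illustration under stated assumptions: nested lists stand in for the score tensor, a seeded `random.Random` stands in for the `torch.Generator`, the helper name `epsilon_greedy` is hypothetical, and the explore/exploit decision is drawn independently for each row of the batch, which the documentation above does not specify.

```python
import random


def epsilon_greedy(
    scores: list[list[float]], epsilon: float = 0.1, seed: int = 0
) -> list[list[int]]:
    """Pick a random arm with probability epsilon, otherwise the best-scoring arm."""
    assert 0 <= epsilon <= 1, "Epsilon must be between 0 and 1"
    rng = random.Random(seed)
    result = []
    for row in scores:
        n_arms = len(row)
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: uniform over all arms
        else:
            arm = max(range(n_arms), key=row.__getitem__)  # exploit: greedy arm
        result.append([int(i == arm) for i in range(n_arms)])
    return result


# With epsilon=0.0 the selector is purely greedy.
greedy = epsilon_greedy([[0.1, 0.9, 0.3]], epsilon=0.0)
```

Setting `epsilon=1.0` in this sketch always explores, and `epsilon=0.0` reduces it to arg-max selection.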

`RandomSelector(k=1, seed=None)`

Bases: AbstractSelector

Selects k random actions from the available actions.

Parameters:

Name	Type	Description	Default
`k`	`int`	Number of actions to select. Must be positive.	`1`
`seed`	`int \| None`	Random seed for the generator. Defaults to None.	`None`

Source code in src/calvera/utils/selectors.py

def __init__(self, k: int = 1, seed: int | None = None):
    """Initialize the random selector.

    Args:
        k: Number of actions to select. Must be positive.
        seed: Random seed for the generator. Defaults to None.
    """
    self.k = k
    self.generator = torch.Generator()
    if seed is not None:
        self.generator.manual_seed(seed)

`TopKSelector(k)`

Bases: AbstractSelector

Selects the top k actions with the highest scores.

Parameters:

Name	Type	Description	Default
`k`	`int`	Number of actions to select. Must be positive.	required

Source code in src/calvera/utils/selectors.py

def __init__(self, k: int):
    """Initialize the top-k selector.

    Args:
        k: Number of actions to select. Must be positive.
    """
    assert k > 0, "k must be positive"
    self.k = k
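A dependency-free sketch of top-k selection is shown below. Nested lists stand in for the score tensor, and the helper name `top_k_multi_hot` is hypothetical. Each output row marks the k highest-scoring arms with a 1, so rows sum to k rather than to 1.

```python
def top_k_multi_hot(scores: list[list[float]], k: int) -> list[list[int]]:
    """Mark the k highest-scoring arms of each row with a 1."""
    assert k > 0, "k must be positive"
    result = []
    for row in scores:
        assert k <= len(row), "k cannot exceed the number of arms"
        # Rank arm indices by descending score and keep the first k.
        ranked = sorted(range(len(row)), key=row.__getitem__, reverse=True)
        chosen = set(ranked[:k])
        result.append([int(i in chosen) for i in range(len(row))])
    return result


print(top_k_multi_hot([[0.2, 0.9, 0.5, 0.1]], k=2))
# → [[0, 1, 1, 0]]
```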

`EpsilonGreedyTopKSelector(k, epsilon=0.1, seed=None)`

Bases: AbstractSelector

Implements an epsilon-greedy top-k action selection strategy.

With probability 1-epsilon, selects the top k arms with highest scores. With probability epsilon, selects k random arms.

Parameters:

Name	Type	Description	Default
`k`	`int`	Number of actions to select. Must be positive.	required
`epsilon`	`float`	Exploration probability. Must be between 0 and 1.	`0.1`
`seed`	`int \| None`	Random seed for the generator. Defaults to None, in which case the generator is not explicitly seeded.	`None`

Source code in src/calvera/utils/selectors.py

def __init__(self, k: int, epsilon: float = 0.1, seed: int | None = None) -> None:
    """Initialize the epsilon-greedy top-k selector.

    Args:
        k: Number of actions to select. Must be positive.
        epsilon: Exploration probability. Must be between 0 and 1.
        seed: Random seed for the generator. Defaults to None (no explicit seed is set).
    """
    assert k > 0, "k must be positive"
    assert 0 <= epsilon <= 1, "Epsilon must be between 0 and 1"
    self.k = k
    self.epsilon = epsilon
    self.generator = torch.Generator()
    if seed is not None:
        self.generator.manual_seed(seed)
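The combined rule described above can be sketched as follows. As before, this is an illustration and not the library's implementation: nested lists replace tensors, `random.Random` replaces the `torch.Generator`, the helper name `epsilon_greedy_top_k` is hypothetical, and the explore/exploit decision is assumed to be made once per row.

```python
import random


def epsilon_greedy_top_k(
    scores: list[list[float]], k: int, epsilon: float = 0.1, seed: int = 0
) -> list[list[int]]:
    """Select k random arms with probability epsilon, otherwise the top-k arms."""
    assert k > 0, "k must be positive"
    assert 0 <= epsilon <= 1, "Epsilon must be between 0 and 1"
    rng = random.Random(seed)
    result = []
    for row in scores:
        n_arms = len(row)
        if rng.random() < epsilon:
            # Explore: k distinct arms drawn uniformly without replacement.
            chosen = set(rng.sample(range(n_arms), k))
        else:
            # Exploit: the k arms with the highest scores.
            ranked = sorted(range(n_arms), key=row.__getitem__, reverse=True)
            chosen = set(ranked[:k])
        result.append([int(i in chosen) for i in range(n_arms)])
    return result


super_arm = epsilon_greedy_top_k([[0.2, 0.9, 0.5, 0.1]], k=2, epsilon=0.0)
```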

Data Samplers

To simulate a bandit in a scenario with non-i.i.d. contexts, we need to control the order in which samples are drawn from our benchmark datasets. For consistency, we provide the AbstractDataSampler base class together with concrete samplers that draw dataset indices in random or sorted order.

`AbstractDataSampler(data_source)`

Bases: Sampler[int], ABC

Base class for all custom samplers.

Implements the basic functionality required for sampling from a dataset. Subclasses need only implement the _get_iterator method to define their specific sampling strategy.

Parameters:

Name	Type	Description	Default
`data_source`	`Dataset[tuple[Tensor, Tensor]]`	Dataset to sample from	required

Source code in src/calvera/utils/data_sampler.py

def __init__(
    self,
    data_source: Dataset[tuple[torch.Tensor, torch.Tensor]],
) -> None:
    """Initializes the AbstractDataSampler.

    Args:
        data_source: Dataset to sample from
    """
    self.data_source = data_source

`__len__()`

Returns the number of elements in the data source.

Source code in src/calvera/utils/data_sampler.py

def __len__(self) -> int:
    """Returns the number of elements in the data source."""
    return len(self.data_source)  # type: ignore

`__iter__()`

Returns an iterator over the data_source indices in the order defined by the subclass's _get_iterator method.

Source code in src/calvera/utils/data_sampler.py

def __iter__(self) -> Iterator[int]:
    """Returns an iterator over the `data_source` indices in the order defined by `_get_iterator`."""
    return self._get_iterator()
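To show how a subclass only needs to supply `_get_iterator`, the sketch below rebuilds the documented base-class behavior in plain Python and adds one custom strategy. `DataSamplerBase` is a local stand-in that drops the `torch.utils.data.Sampler` parent so the example runs without PyTorch, and `EvenOddDataSampler` is a hypothetical sampler invented for illustration.

```python
from abc import ABC, abstractmethod
from collections.abc import Iterator, Sequence


class DataSamplerBase(ABC):
    """Local stand-in mirroring the documented AbstractDataSampler (assumption)."""

    def __init__(self, data_source: Sequence) -> None:
        self.data_source = data_source

    def __len__(self) -> int:
        return len(self.data_source)

    def __iter__(self) -> Iterator[int]:
        # Delegate the ordering entirely to the subclass.
        return self._get_iterator()

    @abstractmethod
    def _get_iterator(self) -> Iterator[int]:
        """Yield dataset indices in the order defined by the subclass."""


class EvenOddDataSampler(DataSamplerBase):
    """Hypothetical sampler: yields all even indices first, then all odd ones."""

    def _get_iterator(self) -> Iterator[int]:
        n = len(self.data_source)
        return iter(list(range(0, n, 2)) + list(range(1, n, 2)))


sampler = EvenOddDataSampler(["a", "b", "c", "d", "e"])
print(list(sampler))  # → [0, 2, 4, 1, 3]
```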

`RandomDataSampler(data_source, generator=None)`

Bases: AbstractDataSampler

Samples elements randomly without replacement.

Parameters:

Name	Type	Description	Default
`data_source`	`Dataset[tuple[Tensor, Tensor]]`	Dataset to sample from	required
`generator`	`Generator \| None`	Optional PyTorch Generator for reproducible randomness	`None`

Source code in src/calvera/utils/data_sampler.py

def __init__(
    self,
    data_source: Dataset[tuple[torch.Tensor, torch.Tensor]],
    generator: torch.Generator | None = None,
) -> None:
    """Initializes the RandomDataSampler.

    Args:
        data_source: Dataset to sample from
        generator: Optional PyTorch Generator for reproducible randomness
    """
    super().__init__(data_source)
    self.generator = generator

`SortedDataSampler(data_source, key_fn, reverse=False)`

Bases: AbstractDataSampler

Samples elements in sorted order based on a key function.

Parameters:

Name	Type	Description	Default
`data_source`	`Dataset[tuple[Tensor, Tensor]]`	Dataset to sample from	required
`key_fn`	`Callable[[int], Any]`	Function that returns the sorting key for each dataset index	required
`reverse`	`bool`	Whether to sort in descending order (default: False)	`False`

Source code in src/calvera/utils/data_sampler.py

def __init__(
    self,
    data_source: Dataset[tuple[torch.Tensor, torch.Tensor]],
    key_fn: Callable[[int], Any],
    reverse: bool = False,
) -> None:
    """Initializes the SortedDataSampler.

    Args:
        data_source: Dataset to sample from
        key_fn: Function that returns the sorting key for each dataset index
        reverse: Whether to sort in descending order (default: False)
    """
    super().__init__(data_source)
    self.key_fn = key_fn
    self.reverse = reverse
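One way to produce a non-i.i.d. stream with a sorted order is to use the label as the sorting key, so that all samples of one class arrive before the next class starts. The sketch below shows only the index ordering, assuming the sampler sorts dataset indices by `key_fn`. The toy `dataset` of `(context, label)` pairs and the helper name `sorted_indices` are hypothetical.

```python
from collections.abc import Callable, Sequence
from typing import Any


def sorted_indices(
    data_source: Sequence, key_fn: Callable[[int], Any], reverse: bool = False
) -> list[int]:
    """Return dataset indices ordered by the key that key_fn assigns to each index."""
    return sorted(range(len(data_source)), key=key_fn, reverse=reverse)


# Toy dataset of (context, label) pairs; key_fn looks up the label of an index.
dataset = [([0.1], 2), ([0.4], 0), ([0.3], 1), ([0.9], 0)]
order = sorted_indices(dataset, key_fn=lambda idx: dataset[idx][1])
print(order)  # → [1, 3, 2, 0]
```

Note that `key_fn` receives a dataset index, not the sample itself, which matches the `Callable[[int], Any]` signature documented above.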

`MultiClassContextualizer(n_arms)`

Applies disjoint model contextualization to the input feature vector.

Example

contextualizer = MultiClassContextualizer(n_arms=2)
feature_vector = torch.tensor([[1, 0]])
contextualizer(feature_vector)

tensor([[[1, 0, 0, 0],
         [0, 0, 1, 0]]])

Parameters:

Name	Type	Description	Default
`n_arms`	`int`	The number of arms in the bandit model.	required

Source code in src/calvera/utils/multiclass.py

def __init__(
    self,
    n_arms: int,
) -> None:
    """Initializes the MultiClassContextualizer.

    Args:
        n_arms: The number of arms in the bandit model.
    """
    super().__init__()
    self.n_arms = n_arms

`__call__(feature_vector)`

Performs the disjoint model contextualisation.

Parameters:

Name	Type	Description	Default
`feature_vector`	`Tensor`	Input feature vector of shape (batch_size, n_features)	required

Returns:

Type	Description
`Tensor`	contextualized actions of shape (batch_size, n_arms, n_features * n_arms)

Source code in src/calvera/utils/multiclass.py

def __call__(
    self,
    feature_vector: torch.Tensor,
) -> torch.Tensor:
    """Performs the disjoint model contextualisation.

    Args:
        feature_vector: Input feature vector of shape (batch_size, n_features)

    Returns:
        contextualized actions of shape (batch_size, n_arms, n_features * n_arms)
    """
    assert len(feature_vector.shape) == 2, "Feature vector must have shape (batch_size, n_features)"

    n_features = feature_vector.shape[1]
    contextualized_actions = torch.einsum(
        "ij,bk->bijk", torch.eye(self.n_arms, device=feature_vector.device), feature_vector
    )
    contextualized_actions = contextualized_actions.reshape(-1, self.n_arms, n_features * self.n_arms)

    return contextualized_actions
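The einsum-and-reshape step above is equivalent to writing the feature vector into the block that belongs to each arm and leaving every other block at zero. A plain-Python sketch of that block layout, with nested lists in place of tensors and a hypothetical helper name `disjoint_contextualize`, looks like this:

```python
def disjoint_contextualize(
    feature_vector: list[list[float]], n_arms: int
) -> list[list[list[float]]]:
    """Build one context per arm with the features placed in that arm's block."""
    contextualized = []
    for features in feature_vector:
        n_features = len(features)
        per_arm = []
        for arm in range(n_arms):
            # Start from zeros, then copy the features into block number `arm`.
            row = [0.0] * (n_features * n_arms)
            start = arm * n_features
            row[start:start + n_features] = features
            per_arm.append(row)
        contextualized.append(per_arm)
    return contextualized


print(disjoint_contextualize([[1.0, 0.0]], n_arms=2))
# → [[[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]]
```

The result has shape (batch_size, n_arms, n_features * n_arms), so a single linear model over the long vector behaves like a separate model per arm.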
