
Utils

Selectors

Selectors are used to control the action selection behavior of a bandit. You can provide a selector to each bandit. If you don't provide one, a default selector will be used.

Currently, we provide the following selectors:

  • ArgMaxSelector: Selects the action with the highest estimated reward.

  • EpsilonGreedySelector: Selects the action with the highest estimated reward with probability 1 - epsilon and a random action with probability epsilon.

  • RandomSelector: Selects k random actions.

  • TopKSelector: Selects the k actions with the highest estimated reward.

  • EpsilonGreedyTopKSelector: Selects the top k arms with probability 1 - epsilon, or k random arms with probability epsilon.

To implement your own selector, subclass AbstractSelector and implement its __call__ method, which makes instances of your class callable.
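A minimal sketch of a custom selector is shown here. To stay dependency-free, it defines a local stand-in for the abstract base class and uses nested lists instead of torch.Tensor. The names SelectorSketch and SoftmaxSamplingSelector, and the seed parameter, are hypothetical and not part of the library. In real code you would subclass AbstractSelector and return a one-hot tensor of shape (batch_size, n_arms).

```python
import math
import random
from abc import ABC, abstractmethod
from typing import Optional


class SelectorSketch(ABC):
    """Local stand-in for the library's AbstractSelector (assumption)."""

    @abstractmethod
    def __call__(self, scores: list[list[float]]) -> list[list[int]]:
        ...


class SoftmaxSamplingSelector(SelectorSketch):
    """Hypothetical selector: samples one arm per row with softmax probabilities."""

    def __init__(self, seed: Optional[int] = None) -> None:
        self.rng = random.Random(seed)

    def __call__(self, scores: list[list[float]]) -> list[list[int]]:
        chosen = []
        for row in scores:
            # Subtract the row maximum before exponentiating for numerical stability.
            row_max = max(row)
            weights = [math.exp(s - row_max) for s in row]
            arm = self.rng.choices(range(len(row)), weights=weights, k=1)[0]
            # One-hot encode the sampled arm.
            chosen.append([1 if i == arm else 0 for i in range(len(row))])
        return chosen


selector = SoftmaxSamplingSelector(seed=0)
one_hot = selector([[0.1, 2.0, 0.3], [1.5, 0.2, 0.1]])
```

Each row of one_hot contains exactly one 1, matching the output contract described for __call__.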


AbstractSelector

Bases: ABC

Defines the interface for all bandit action selectors.

Given a tensor of scores per action, the selector chooses an action (i.e. an arm) or a set of actions (i.e. a super arm in combinatorial bandits). The selector returns a one hot encoded tensor of the chosen actions.

__call__(scores) abstractmethod

Selects a single action, or a set of actions in the case of combinatorial bandits.

Parameters:

Name Type Description Default
scores Tensor

Scores for each action. Shape: (batch_size, n_arms). This may contain a probability distribution per sample or simply a score per arm (e.g. for UCB). In case of combinatorial bandits, these are the scores per arm from which the oracle selects a super arm (e.g. simply top-k).

required

Returns:

Type Description
Tensor

One hot encoded actions that were chosen. Shape: (batch_size, n_arms).

Source code in src/calvera/utils/selectors.py
@abstractmethod
def __call__(self, scores: torch.Tensor) -> torch.Tensor:
    """Selects a single action, or a set of actions in the case of combinatorial bandits.

    Args:
        scores: Scores for each action. Shape: (batch_size, n_arms).
            This may contain a probability distribution per sample or simply a score per
            arm (e.g. for UCB). In case of combinatorial bandits, these are the scores
            per arm from which the oracle selects a super arm (e.g. simply top-k).

    Returns:
        One hot encoded actions that were chosen. Shape: (batch_size, n_arms).
    """
    pass

ArgMaxSelector

Bases: AbstractSelector

Selects the action with the highest score from a batch of scores.
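The arg-max rule can be sketched in plain Python as follows. This is an illustration of the selection logic only, using nested lists instead of tensors; argmax_one_hot is a hypothetical helper, and the tie-breaking rule (first index wins) is a property of this sketch, not a documented guarantee of the library.

```python
def argmax_one_hot(scores: list[list[float]]) -> list[list[int]]:
    """Return one one-hot row per sample, marking the highest-scoring arm."""
    result = []
    for row in scores:
        # max() returns the first maximal index, so ties go to the lowest index.
        best = max(range(len(row)), key=row.__getitem__)
        result.append([1 if i == best else 0 for i in range(len(row))])
    return result


selection = argmax_one_hot([[0.2, 0.7, 0.1], [0.9, 0.05, 0.05]])
# selection == [[0, 1, 0], [1, 0, 0]]
```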

EpsilonGreedySelector(epsilon=0.1, seed=None)

Bases: AbstractSelector

Implements an epsilon-greedy action selection strategy.

Parameters:

Name Type Description Default
epsilon float

Exploration probability. Must be between 0 and 1.

0.1
seed int | None

Random seed for the generator. Defaults to None (no explicit seed is set).

None
Source code in src/calvera/utils/selectors.py
def __init__(self, epsilon: float = 0.1, seed: int | None = None) -> None:
    """Initialize the epsilon-greedy selector.

    Args:
        epsilon: Exploration probability. Must be between 0 and 1.
        seed: Random seed for the generator. Defaults to None (no explicit seed is set).
    """
    assert 0 <= epsilon <= 1, "Epsilon must be between 0 and 1"
    self.epsilon = epsilon
    self.generator = torch.Generator()
    if seed is not None:
        self.generator.manual_seed(seed)
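The epsilon-greedy rule can be sketched without any dependencies as follows. Two assumptions are made: random.Random stands in for the torch.Generator used above, and the explore-or-exploit decision is drawn independently for each sample in the batch (the listing above does not show how the library draws it). epsilon_greedy_one_hot is a hypothetical helper name.

```python
import random
from typing import Optional


def epsilon_greedy_one_hot(
    scores: list[list[float]], epsilon: float = 0.1, seed: Optional[int] = None
) -> list[list[int]]:
    """Pick the best arm with probability 1 - epsilon, else a uniformly random arm."""
    assert 0 <= epsilon <= 1, "Epsilon must be between 0 and 1"
    rng = random.Random(seed)
    result = []
    for row in scores:
        if rng.random() < epsilon:
            arm = rng.randrange(len(row))  # explore: uniform over all arms
        else:
            arm = max(range(len(row)), key=row.__getitem__)  # exploit: arg-max
        result.append([1 if i == arm else 0 for i in range(len(row))])
    return result


greedy = epsilon_greedy_one_hot([[0.2, 0.7, 0.1]], epsilon=0.0)
explore = epsilon_greedy_one_hot([[0.2, 0.7, 0.1]] * 4, epsilon=1.0, seed=0)
```

With epsilon=0.0 the sketch reduces to arg-max selection; with epsilon=1.0 every row is a uniformly random one-hot vector.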

RandomSelector(k=1, seed=None)

Bases: AbstractSelector

Selects k random actions from the available actions.

Parameters:

Name Type Description Default
k int

Number of actions to select. Must be positive.

1
seed int | None

Random seed for the generator. Defaults to None.

None
Source code in src/calvera/utils/selectors.py
def __init__(self, k: int = 1, seed: int | None = None):
    """Initialize the random selector.

    Args:
        k: Number of actions to select. Must be positive.
        seed: Random seed for the generator. Defaults to None.
    """
    self.k = k
    self.generator = torch.Generator()
    if seed is not None:
        self.generator.manual_seed(seed)

TopKSelector(k)

Bases: AbstractSelector

Selects the top k actions with the highest scores.

Parameters:

Name Type Description Default
k int

Number of actions to select. Must be positive.

required
Source code in src/calvera/utils/selectors.py
def __init__(self, k: int):
    """Initialize the top-k selector.

    Args:
        k: Number of actions to select. Must be positive.
    """
    assert k > 0, "k must be positive"
    self.k = k
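As an illustration of top-k selection, the following plain-Python sketch marks the k highest-scoring arms in each row. It assumes that the output for k > 1 is a row with k ones (a multi-hot encoding of the super arm); top_k_multi_hot is a hypothetical helper, and nested lists stand in for tensors.

```python
def top_k_multi_hot(scores: list[list[float]], k: int) -> list[list[int]]:
    """Return one row per sample with a 1 at each of the k highest-scoring arms."""
    assert k > 0, "k must be positive"
    result = []
    for row in scores:
        # Sort arm indices by descending score and keep the first k.
        top = set(sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k])
        result.append([1 if i in top else 0 for i in range(len(row))])
    return result


super_arm = top_k_multi_hot([[0.1, 0.9, 0.5, 0.3]], k=2)
# super_arm == [[0, 1, 1, 0]]
```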

EpsilonGreedyTopKSelector(k, epsilon=0.1, seed=None)

Bases: AbstractSelector

Implements an epsilon-greedy top-k action selection strategy.

With probability 1-epsilon, selects the top k arms with highest scores. With probability epsilon, selects k random arms.

Parameters:

Name Type Description Default
k int

Number of actions to select. Must be positive.

required
epsilon float

Exploration probability. Must be between 0 and 1.

0.1
seed int | None

Random seed for the generator. Defaults to None (no explicit seed is set).

None
Source code in src/calvera/utils/selectors.py
def __init__(self, k: int, epsilon: float = 0.1, seed: int | None = None) -> None:
    """Initialize the epsilon-greedy top-k selector.

    Args:
        k: Number of actions to select. Must be positive.
        epsilon: Exploration probability. Must be between 0 and 1.
        seed: Random seed for the generator. Defaults to None (no explicit seed is set).
    """
    assert k > 0, "k must be positive"
    assert 0 <= epsilon <= 1, "Epsilon must be between 0 and 1"
    self.k = k
    self.epsilon = epsilon
    self.generator = torch.Generator()
    if seed is not None:
        self.generator.manual_seed(seed)
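The combined rule can be sketched as follows. The sketch assumes a per-sample exploration draw and uses random.Random in place of torch.Generator; epsilon_greedy_top_k is a hypothetical helper operating on nested lists, not the library's implementation.

```python
import random
from typing import Optional


def epsilon_greedy_top_k(
    scores: list[list[float]], k: int, epsilon: float = 0.1, seed: Optional[int] = None
) -> list[list[int]]:
    """Select the top k arms with probability 1 - epsilon, else k random arms."""
    assert k > 0, "k must be positive"
    assert 0 <= epsilon <= 1, "Epsilon must be between 0 and 1"
    rng = random.Random(seed)
    result = []
    for row in scores:
        n_arms = len(row)
        if rng.random() < epsilon:
            # Explore: k distinct arms drawn uniformly without replacement.
            chosen = set(rng.sample(range(n_arms), k))
        else:
            # Exploit: the k arms with the highest scores.
            chosen = set(sorted(range(n_arms), key=lambda i: row[i], reverse=True)[:k])
        result.append([1 if i in chosen else 0 for i in range(n_arms)])
    return result


exploit = epsilon_greedy_top_k([[0.1, 0.9, 0.5, 0.3]], k=2, epsilon=0.0)
explore = epsilon_greedy_top_k([[0.1, 0.9, 0.5, 0.3]] * 3, k=2, epsilon=1.0, seed=1)
```

In both branches each row contains exactly k ones, so downstream code can treat the result the same way regardless of whether the step explored or exploited.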



Data Samplers

To simulate a bandit in a scenario with non-i.i.d. contexts, we need to control the order in which samples are drawn from our benchmark datasets. For consistency, we provide the AbstractDataSampler base class along with the RandomDataSampler and SortedDataSampler implementations.

AbstractDataSampler(data_source)

Bases: Sampler[int], ABC

Base class for all custom samplers.

Implements the basic functionality required for sampling from a dataset. Subclasses need only implement the _get_iterator method to define their specific sampling strategy.

Parameters:

Name Type Description Default
data_source Dataset[tuple[Tensor, Tensor]]

Dataset to sample from

required
Source code in src/calvera/utils/data_sampler.py
def __init__(
    self,
    data_source: Dataset[tuple[torch.Tensor, torch.Tensor]],
) -> None:
    """Initializes the AbstractDataSampler.

    Args:
        data_source: Dataset to sample from
    """
    self.data_source = data_source

__len__()

Returns the number of elements in the data source.

Source code in src/calvera/utils/data_sampler.py
def __len__(self) -> int:
    """Returns the number of elements in the data source."""
    return len(self.data_source)  # type: ignore

__iter__()

Returns an iterator over the data_source indices in the order defined by _get_iterator.

Source code in src/calvera/utils/data_sampler.py
def __iter__(self) -> Iterator[int]:
    """Returns an iterator over the `data_source` indices in the order defined by `_get_iterator`."""
    return self._get_iterator()
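The subclassing pattern described for AbstractDataSampler can be sketched as follows. To avoid a torch dependency, SamplerSketch is a local stand-in that mirrors the three methods shown above; EvenThenOddSampler is a hypothetical strategy invented for illustration. In real code you would subclass AbstractDataSampler and implement only _get_iterator.

```python
from abc import ABC, abstractmethod
from typing import Iterator, Sequence


class SamplerSketch(ABC):
    """Local stand-in mirroring AbstractDataSampler (assumption: no torch)."""

    def __init__(self, data_source: Sequence) -> None:
        self.data_source = data_source

    def __len__(self) -> int:
        return len(self.data_source)

    def __iter__(self) -> Iterator[int]:
        return self._get_iterator()

    @abstractmethod
    def _get_iterator(self) -> Iterator[int]:
        ...


class EvenThenOddSampler(SamplerSketch):
    """Hypothetical strategy: yields even indices first, then odd indices."""

    def _get_iterator(self) -> Iterator[int]:
        n = len(self.data_source)
        return iter(list(range(0, n, 2)) + list(range(1, n, 2)))


sampler = EvenThenOddSampler(["a", "b", "c", "d", "e"])
indices = list(sampler)
# indices == [0, 2, 4, 1, 3]
```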

RandomDataSampler(data_source, generator=None)

Bases: AbstractDataSampler

Samples elements randomly without replacement.

Parameters:

Name Type Description Default
data_source Dataset[tuple[Tensor, Tensor]]

Dataset to sample from

required
generator Generator | None

Optional PyTorch Generator for reproducible randomness

None
Source code in src/calvera/utils/data_sampler.py
def __init__(
    self,
    data_source: Dataset[tuple[torch.Tensor, torch.Tensor]],
    generator: torch.Generator | None = None,
) -> None:
    """Initializes the RandomDataSampler.

    Args:
        data_source: Dataset to sample from
        generator: Optional PyTorch Generator for reproducible randomness
    """
    super().__init__(data_source)
    self.generator = generator
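Sampling without replacement amounts to iterating over a random permutation of the dataset indices. The sketch below illustrates this with the standard library; random.Random and the seed argument stand in for the optional torch.Generator, and random_indices is a hypothetical helper rather than the library's _get_iterator.

```python
import random
from typing import Iterator, Optional


def random_indices(n_items: int, seed: Optional[int] = None) -> Iterator[int]:
    """Return an iterator over range(n_items) in shuffled order, each index once."""
    rng = random.Random(seed)
    return iter(rng.sample(range(n_items), n_items))


order = list(random_indices(5, seed=42))
same_order = list(random_indices(5, seed=42))
```

Passing the same seed reproduces the same permutation, which is the purpose of supplying a generator to the sampler.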

SortedDataSampler(data_source, key_fn, reverse=False)

Bases: AbstractDataSampler

Samples elements in sorted order based on a key function.

Parameters:

Name Type Description Default
data_source Dataset[tuple[Tensor, Tensor]]

Dataset to sample from

required
key_fn Callable[[int], Any]

Function that returns the sorting key for each dataset index

required
reverse bool

Whether to sort in descending order (default: False)

False
Source code in src/calvera/utils/data_sampler.py
def __init__(
    self,
    data_source: Dataset[tuple[torch.Tensor, torch.Tensor]],
    key_fn: Callable[[int], Any],
    reverse: bool = False,
) -> None:
    """Initializes the SortedDataSampler.

    Args:
        data_source: Dataset to sample from
        key_fn: Function that returns the sorting key for each dataset index
        reverse: Whether to sort in descending order (default: False)
    """
    super().__init__(data_source)
    self.key_fn = key_fn
    self.reverse = reverse
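The ordering produced by a key function can be sketched as follows. The example sorts dataset indices by label so that all samples of one class arrive before the next class, which is one way to obtain non-i.i.d. contexts. sorted_indices and the labels list are hypothetical; only the key_fn and reverse parameters come from the signature above.

```python
from typing import Any, Callable, Iterator


def sorted_indices(
    n_items: int, key_fn: Callable[[int], Any], reverse: bool = False
) -> Iterator[int]:
    """Return an iterator over dataset indices ordered by key_fn(index)."""
    return iter(sorted(range(n_items), key=key_fn, reverse=reverse))


labels = [2, 0, 1, 0, 2]  # hypothetical class label for each dataset index
order = list(sorted_indices(len(labels), key_fn=lambda idx: labels[idx]))
# order == [1, 3, 2, 0, 4]
```

Note that key_fn receives the dataset index, not the sample itself, so it typically looks up a label or feature for that index.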



MultiClassContextualizer(n_arms)

Applies disjoint model contextualization to the input feature vector.

Example
contextualizer = MultiClassContextualizer(n_arms=2)
feature_vector = torch.tensor([[1, 0]])
contextualizer(feature_vector)

tensor([[[1, 0, 0, 0],
     [0, 0, 1, 0]]])

Parameters:

Name Type Description Default
n_arms int

The number of arms in the bandit model.

required
Source code in src/calvera/utils/multiclass.py
def __init__(
    self,
    n_arms: int,
) -> None:
    """Initializes the MultiClassContextualizer.

    Args:
        n_arms: The number of arms in the bandit model.
    """
    super().__init__()
    self.n_arms = n_arms

__call__(feature_vector)

Performs the disjoint model contextualisation.

Parameters:

Name Type Description Default
feature_vector Tensor

Input feature vector of shape (batch_size, n_features)

required

Returns:

Type Description
Tensor

contextualized actions of shape (batch_size, n_arms, n_features * n_arms)

Source code in src/calvera/utils/multiclass.py
def __call__(
    self,
    feature_vector: torch.Tensor,
) -> torch.Tensor:
    """Performs the disjoint model contextualisation.

    Args:
        feature_vector: Input feature vector of shape (batch_size, n_features)

    Returns:
        contextualized actions of shape (batch_size, n_arms, n_features * n_arms)
    """
    assert len(feature_vector.shape) == 2, "Feature vector must have shape (batch_size, n_features)"

    n_features = feature_vector.shape[1]
    contextualized_actions = torch.einsum(
        "ij,bk->bijk", torch.eye(self.n_arms, device=feature_vector.device), feature_vector
    )
    contextualized_actions = contextualized_actions.reshape(-1, self.n_arms, n_features * self.n_arms)

    return contextualized_actions
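The einsum and reshape above place a copy of the feature vector in the block belonging to each arm and leave all other blocks at zero. The following dependency-free sketch reproduces that layout with nested lists; contextualize is a hypothetical helper written for illustration, not part of the library.

```python
def contextualize(feature_vector: list[list[float]], n_arms: int) -> list[list[list[float]]]:
    """Build, per sample, one row per arm with the features in that arm's block."""
    contextualized = []
    for features in feature_vector:
        n_features = len(features)
        per_arm = []
        for arm in range(n_arms):
            row = [0] * (n_features * n_arms)
            # Arm i owns positions [i * n_features, (i + 1) * n_features).
            row[arm * n_features:(arm + 1) * n_features] = features
            per_arm.append(row)
        contextualized.append(per_arm)
    return contextualized


result = contextualize([[1, 0]], n_arms=2)
# result == [[[1, 0, 0, 0], [0, 0, 1, 0]]]
```

The result has shape (batch_size, n_arms, n_features * n_arms) and matches the example output shown for MultiClassContextualizer(n_arms=2).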