Utils
Selectors
Selectors are used to control the action selection behavior of a bandit. You can provide a selector to each bandit. If you don't provide one, a default selector will be used.
Currently, we provide the following selectors:
- ArgMaxSelector: Selects the action with the highest estimated reward.
- EpsilonGreedySelector: Selects the action with the highest estimated reward with probability 1 - epsilon and a random action with probability epsilon.
- RandomSelector: Selects k random actions from the available actions.
- TopKSelector: Selects the top k actions with the highest estimated reward.
- EpsilonGreedyTopKSelector: Selects the top k arms with probability 1 - epsilon or k random arms with probability epsilon.
If you want to implement your own selector, you can subclass the AbstractSelector class and implement the __call__ method, which makes your class callable.
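The subclassing pattern described above can be sketched as follows. This is a minimal illustration, not the library's code: `SelectorBase` is a hypothetical stand-in for the abstract base class, `ThresholdSelector` is an invented example, and plain Python lists stand in for tensors of shape (batch_size, n_arms).

```python
from abc import ABC, abstractmethod


class SelectorBase(ABC):
    """Hypothetical stand-in for the library's abstract selector base class."""

    @abstractmethod
    def __call__(self, scores: list[list[float]]) -> list[list[int]]:
        """Map scores of shape (batch_size, n_arms) to 0/1 choices of the same shape."""


class ThresholdSelector(SelectorBase):
    """Illustrative custom selector: choose every arm whose score reaches a threshold."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold

    def __call__(self, scores: list[list[float]]) -> list[list[int]]:
        # One row per sample; 1 marks a chosen arm, 0 an unchosen one.
        return [[1 if s >= self.threshold else 0 for s in row] for row in scores]


selector = ThresholdSelector(threshold=0.5)
chosen = selector([[0.2, 0.7, 0.5], [0.9, 0.1, 0.3]])
print(chosen)  # [[0, 1, 1], [1, 0, 0]]
```

Because the base class marks `__call__` as abstract, a subclass that forgets to implement it cannot be instantiated, which catches the mistake early.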
AbstractSelector
Bases: ABC
Defines the interface for all bandit action selectors.
Given a tensor of scores per action, the selector chooses an action (i.e. an arm) or a set of actions (i.e. a super arm in combinatorial bandits). The selector returns a one hot encoded tensor of the chosen actions.
__call__(scores)
abstractmethod
Selects a single action, or a set of actions in the case of combinatorial bandits.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
scores | Tensor | Scores for each action. Shape: (batch_size, n_arms). This may contain a probability distribution per sample or simply a score per arm (e.g. for UCB). In case of combinatorial bandits, these are the scores per arm from which the oracle selects a super arm (e.g. simply top-k). | required |
Returns:
Type | Description |
---|---|
Tensor | One hot encoded actions that were chosen. Shape: (batch_size, n_arms). |
Source code in src/calvera/utils/selectors.py
ArgMaxSelector
EpsilonGreedySelector(epsilon=0.1, seed=None)
Bases: AbstractSelector
Implements an epsilon-greedy action selection strategy.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
epsilon | float | Exploration probability. Must be between 0 and 1. | 0.1 |
seed | int \| None | Random seed for the generator. Defaults to None (no explicit seed used). | None |
Source code in src/calvera/utils/selectors.py
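The epsilon-greedy rule can be illustrated with a short plain-Python sketch. It is not the library's implementation: `EpsilonGreedySketch` is a hypothetical name, lists stand in for tensors, and drawing the explore/exploit decision independently for each sample in the batch is an assumption.

```python
import random
from typing import Optional


class EpsilonGreedySketch:
    """Plain-Python illustration of epsilon-greedy selection (not the library class)."""

    def __init__(self, epsilon: float = 0.1, seed: Optional[int] = None) -> None:
        if not 0.0 <= epsilon <= 1.0:
            raise ValueError("epsilon must be between 0 and 1")
        self.epsilon = epsilon
        # A dedicated generator keeps runs reproducible when a seed is given.
        self.rng = random.Random(seed)

    def __call__(self, scores: list[list[float]]) -> list[list[int]]:
        chosen = []
        for row in scores:
            if self.rng.random() < self.epsilon:
                arm = self.rng.randrange(len(row))  # explore: uniform random arm
            else:
                arm = max(range(len(row)), key=row.__getitem__)  # exploit: best score
            chosen.append([1 if i == arm else 0 for i in range(len(row))])
        return chosen


greedy = EpsilonGreedySketch(epsilon=0.0)
print(greedy([[0.1, 0.8, 0.3]]))  # [[0, 1, 0]]
```

With epsilon set to 0 the sketch reduces to pure argmax selection, and with epsilon set to 1 every choice is uniformly random.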
RandomSelector(k=1, seed=None)
Bases: AbstractSelector
Selects k random actions from the available actions.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
k | int | Number of actions to select. Must be positive. | 1 |
seed | int \| None | Random seed for the generator. Defaults to None. | None |
Source code in src/calvera/utils/selectors.py
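As a rough illustration of random selection, the sketch below picks k distinct arms per sample and ignores the score values, using only their shape. The function name `select_k_random` is hypothetical, lists stand in for tensors, and sampling without repetition within a row is an assumption.

```python
import random
from typing import Optional


def select_k_random(
    scores: list[list[float]], k: int = 1, seed: Optional[int] = None
) -> list[list[int]]:
    """Pick k distinct arms uniformly at random for every row of scores."""
    rng = random.Random(seed)
    chosen = []
    for row in scores:
        if not 0 < k <= len(row):
            raise ValueError("k must be positive and at most the number of arms")
        picked = set(rng.sample(range(len(row)), k))  # k distinct arm indices
        chosen.append([1 if i in picked else 0 for i in range(len(row))])
    return chosen


selection = select_k_random([[0.0, 0.0, 0.0, 0.0]] * 3, k=2, seed=0)
print([sum(row) for row in selection])  # [2, 2, 2]
```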
TopKSelector(k)
Bases: AbstractSelector
Selects the top k actions with the highest scores.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
k | int | Number of actions to select. Must be positive. | required |
Source code in src/calvera/utils/selectors.py
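Top-k selection can be sketched in a few lines. This is an illustration under stated assumptions rather than the library's code: `select_top_k` is a hypothetical helper, lists stand in for tensors, and ties are broken in favor of the lower arm index.

```python
def select_top_k(scores: list[list[float]], k: int) -> list[list[int]]:
    """Mark the k highest-scoring arms in every row with 1 and the rest with 0."""
    chosen = []
    for row in scores:
        if not 0 < k <= len(row):
            raise ValueError("k must be positive and at most the number of arms")
        # Sort arm indices by descending score and keep the first k.
        best = set(sorted(range(len(row)), key=row.__getitem__, reverse=True)[:k])
        chosen.append([1 if i in best else 0 for i in range(len(row))])
    return chosen


top2 = select_top_k([[0.1, 0.9, 0.5, 0.3]], k=2)
print(top2)  # [[0, 1, 1, 0]]
```

In a combinatorial bandit, the marked arms together form the super arm that is played.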
EpsilonGreedyTopKSelector(k, epsilon=0.1, seed=None)
Bases: AbstractSelector
Implements an epsilon-greedy top-k action selection strategy.
With probability 1 - epsilon, selects the top k arms with highest scores.
With probability epsilon, selects k random arms.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
k | int | Number of actions to select. Must be positive. | required |
epsilon | float | Exploration probability. Must be between 0 and 1. | 0.1 |
seed | int \| None | Random seed for the generator. Defaults to None (no explicit seed used). | None |
Source code in src/calvera/utils/selectors.py
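The two branches described above can be combined in one small sketch. `EpsilonGreedyTopKSketch` is a hypothetical name, lists stand in for tensors, and two details are assumptions: the explore/exploit decision is drawn once per sample, and the random branch picks k distinct arms.

```python
import random
from typing import Optional


class EpsilonGreedyTopKSketch:
    """Plain-Python illustration of epsilon-greedy top-k selection (not the library class)."""

    def __init__(self, k: int, epsilon: float = 0.1, seed: Optional[int] = None) -> None:
        if k <= 0:
            raise ValueError("k must be positive")
        if not 0.0 <= epsilon <= 1.0:
            raise ValueError("epsilon must be between 0 and 1")
        self.k = k
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def __call__(self, scores: list[list[float]]) -> list[list[int]]:
        chosen = []
        for row in scores:
            arms = range(len(row))
            if self.rng.random() < self.epsilon:
                picked = set(self.rng.sample(arms, self.k))  # explore: k random arms
            else:
                # exploit: the k arms with the highest scores
                picked = set(sorted(arms, key=row.__getitem__, reverse=True)[: self.k])
            chosen.append([1 if i in picked else 0 for i in arms])
        return chosen


exploit_only = EpsilonGreedyTopKSketch(k=2, epsilon=0.0)
print(exploit_only([[0.1, 0.9, 0.5, 0.3]]))  # [[0, 1, 1, 0]]
```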
Data Samplers
To simulate a bandit in a scenario with non-i.i.d. contexts, we need to modify the data sampler of our benchmark datasets.
For consistency, we provide data sampler classes built on AbstractDataSampler that can be used to sample data from a dataset.
AbstractDataSampler(data_source)
Bases: Sampler[int], ABC
Base class for all custom samplers.
Implements the basic functionality required for sampling from a dataset.
Subclasses need only implement the _get_iterator method to define their specific sampling strategy.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data_source | Dataset[tuple[Tensor, Tensor]] | Dataset to sample from | required |
Source code in src/calvera/utils/data_sampler.py
__len__()
RandomDataSampler(data_source, generator=None)
Bases: AbstractDataSampler
Samples elements randomly without replacement.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data_source | Dataset[tuple[Tensor, Tensor]] | Dataset to sample from | required |
generator | Generator \| None | Optional PyTorch Generator for reproducible randomness | None |
Source code in src/calvera/utils/data_sampler.py
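Sampling without replacement amounts to iterating over a random permutation of the dataset indices. The sketch below shows the idea with the standard library only; `random_index_order` is a hypothetical helper, and an integer seed stands in for the PyTorch Generator mentioned in the table.

```python
import random
from typing import Iterator, Optional


def random_index_order(n_samples: int, seed: Optional[int] = None) -> Iterator[int]:
    """Yield every index in range(n_samples) exactly once, in random order."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # in-place permutation of the indices
    return iter(indices)


order = list(random_index_order(5, seed=0))
print(sorted(order))  # [0, 1, 2, 3, 4]
```

Each index appears exactly once per pass, so one full iteration visits the whole dataset.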
SortedDataSampler(data_source, key_fn, reverse=False)
Bases: AbstractDataSampler
Samples elements in sorted order based on a key function.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data_source | Dataset[tuple[Tensor, Tensor]] | Dataset to sample from | required |
key_fn | Callable[[int], Any] | Function that returns the sorting key for each dataset index | required |
reverse | bool | Whether to sort in descending order (default: False) | False |
Source code in src/calvera/utils/data_sampler.py
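A sorted sampling order is one way to produce non-i.i.d. contexts, for example by presenting all samples of one label before the next. The following sketch illustrates the idea with a toy dataset; `sorted_index_order` and the data are hypothetical, and a list of (context, label) tuples stands in for a PyTorch dataset.

```python
from typing import Any, Callable


def sorted_index_order(
    n_samples: int, key_fn: Callable[[int], Any], reverse: bool = False
) -> list[int]:
    """Return dataset indices ordered by the key computed for each index."""
    return sorted(range(n_samples), key=key_fn, reverse=reverse)


# Toy dataset of (context, label) pairs; the key is the label of each index.
data = [([0.3], 2.0), ([0.1], 0.0), ([0.7], 1.0)]
ascending = sorted_index_order(len(data), key_fn=lambda i: data[i][1])
descending = sorted_index_order(len(data), key_fn=lambda i: data[i][1], reverse=True)
print(ascending)   # [1, 2, 0]
print(descending)  # [0, 2, 1]
```

Note that the key function receives a dataset index, not the sample itself, which matches the Callable[[int], Any] type in the table.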
MultiClassContextualizer(n_arms)
Applies disjoint model contextualization to the input feature vector.
Example
Parameters:
Name | Type | Description | Default |
---|---|---|---|
n_arms | int | The number of arms in the bandit model. | required |
Source code in src/calvera/utils/multiclass.py
__call__(feature_vector)
Performs the disjoint model contextualization.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
feature_vector | Tensor | Input feature vector of shape (batch_size, n_features) | required |
Returns:
Type | Description |
---|---|
Tensor | Contextualized actions of shape (batch_size, n_arms, n_features * n_arms) |
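To make the output shape concrete, here is a minimal plain-Python sketch of disjoint contextualization. It assumes the usual disjoint-model layout, in which the context for arm a carries the feature vector in the a-th block of length n_features and zeros elsewhere; the exact block layout used by the library is not confirmed here. `disjoint_contextualize` is a hypothetical helper and nested lists stand in for tensors.

```python
def disjoint_contextualize(
    features: list[list[float]], n_arms: int
) -> list[list[list[float]]]:
    """Expand (batch_size, n_features) into (batch_size, n_arms, n_features * n_arms)."""
    contextualized = []
    for x in features:
        d = len(x)
        per_arm = []
        for arm in range(n_arms):
            row = [0.0] * (d * n_arms)
            row[arm * d : (arm + 1) * d] = x  # features occupy only this arm's block
            per_arm.append(row)
        contextualized.append(per_arm)
    return contextualized


result = disjoint_contextualize([[1.0, 2.0]], n_arms=3)
for row in result[0]:
    print(row)
# [1.0, 2.0, 0.0, 0.0, 0.0, 0.0]
# [0.0, 0.0, 1.0, 2.0, 0.0, 0.0]
# [0.0, 0.0, 0.0, 0.0, 1.0, 2.0]
```

Because the blocks do not overlap, a single linear model over the expanded vector behaves like a separate set of weights per arm.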