decorators module
Decorators for splitting.
cv_split function
cv_split(
*args,
parameterized_kwargs=None,
selection='max',
return_grid=False,
skip_errored=False,
template_context=None,
**split_kwargs
)
Decorator that combines split() and parameterized() for cross-validation.
Creates a new apply function that is decorated with split() and thus applied to each individual range using Splitter.apply(). Inside this apply function, a check determines whether the current range belongs to the first (training) set. If it does, the underlying function is parameterized and run on the entire grid of parameters, and the returned results are stored in a global list. The other (testing) sets in the same split then read these results. If selection is a template, it can evaluate the grid results (available as grid_results) and return the best parameter combination. That parameter combination is then executed by each set (including the training set).
Argument selection also accepts "min" for np.argmin and "max" for np.argmax.
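The training/testing flow and the "min"/"max" selection described above can be sketched without the library. This is a simplified illustration under stated assumptions, not vectorbtpro's implementation: the helper cross_validate, the function score, and the data are hypothetical, and templates, splitting, and merging are left out.

```python
import numpy as np

def cross_validate(func, sets, param_grid, selection="max"):
    """Hypothetical helper: grid-search on the first set, reuse the winner on all sets."""
    train = sets[0]
    # Run the function on the whole parameter grid using the training set only
    grid_results = np.array([func(train, p) for p in param_grid])
    # "max" maps to np.argmax and "min" to np.argmin
    pick = {"max": np.argmax, "min": np.argmin}[selection]
    best_param = param_grid[int(pick(grid_results))]
    # Execute the selected parameter on every set, including the training set
    return best_param, [func(s, best_param) for s in sets]

def score(data, power):
    # Hypothetical objective: sum of the data raised to a power
    return float(np.sum(data ** power))

train, test = np.array([0.5, 0.5]), np.array([2.0, 2.0])
best, results = cross_validate(score, [train, test], [1, 2, 3], selection="max")
print(best, results)
```

Note that the parameter is chosen from the training results alone; the testing set only ever sees the winning combination.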
Keyword arguments parameterized_kwargs will be passed to parameterized() and will have their templates substituted with a context that will also include the split-related context (including split_idx, set_idx, etc., see Splitter.apply()).
If return_grid is True or 'first', returns both the grid and the selection. If return_grid is 'all', executes the grid on each set and returns it along with the selection. Otherwise, returns only the selection.
If NoResultsException is raised, or if skip_errored is True and any exception is raised, the current iteration is skipped and removed from the final index.
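The skipping rule can be illustrated with a plain loop. This is a minimal sketch, assuming a local NoResults class as a stand-in for the library's NoResultsException; apply_per_split and mean_positive are hypothetical names, and the real decorator handles this inside its own iteration logic.

```python
import pandas as pd

class NoResults(Exception):
    """Local stand-in for the library's NoResultsException (hypothetical)."""

def apply_per_split(func, chunks, skip_errored=False):
    results = {}
    for split_idx, chunk in enumerate(chunks):
        try:
            results[split_idx] = func(chunk)
        except NoResults:
            continue  # always skipped, regardless of skip_errored
        except Exception:
            if not skip_errored:
                raise
            continue  # any other error is skipped only when requested
    # Skipped iterations never enter the dict, so they are absent from the index
    return pd.Series(results).rename_axis("split")

def mean_positive(chunk):
    if len(chunk) == 0:
        raise NoResults
    if min(chunk) < 0:
        raise ValueError("negative value")
    return sum(chunk) / len(chunk)

chunks = [[1, 3], [], [-1, 2], [4]]
out = apply_per_split(mean_positive, chunks, skip_errored=True)
print(out)
```

With skip_errored=False, the same call would still skip the empty chunk but re-raise the ValueError from the third chunk.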
Usage
- Permute a series and pick the first value. Make the seed parameterizable. Cross-validate based on the highest picked value:
>>> from vectorbtpro import *
>>> @vbt.cv_split(
...     splitter="from_n_rolling",
...     splitter_kwargs=dict(n=3, split=0.5),
...     takeable_args=["sr"],
...     merge_func="concat",
... )
... def f(sr, seed):
...     np.random.seed(seed)
...     return np.random.permutation(sr)[0]
>>> index = pd.date_range("2020-01-01", "2020-02-01")
>>> np.random.seed(0)
>>> sr = pd.Series(np.random.permutation(np.arange(len(index))), index=index)
>>> f(sr, vbt.Param([41, 42, 43]))
split  set    seed
0      set_0  41      22
       set_1  41      28
1      set_0  43       8
       set_1  43      31
2      set_0  43      19
       set_1  43       0
dtype: int64
- Extend the example above to also return the grid results of each set:
>>> f(sr, vbt.Param([41, 42, 43]), _return_grid="all")
(split  set    seed
 0      set_0  41      22
               42      22
               43       2
        set_1  41      28
               42      28
               43      20
 1      set_0  41       5
               42       5
               43       8
        set_1  41      23
               42      23
               43      31
 2      set_0  41      18
               42      18
               43      19
        set_1  41      27
               42      27
               43       0
 dtype: int64,
 split  set    seed
 0      set_0  41      22
        set_1  41      28
 1      set_0  43       8
        set_1  43      31
 2      set_0  43      19
        set_1  43       0
 dtype: int64)
split function
split(
*args,
splitter=None,
splitter_cls=vectorbtpro.generic.splitting.base.Splitter,
splitter_kwargs=None,
index=None,
index_from=None,
takeable_args=None,
template_context=None,
forward_kwargs_as=None,
return_splitter=False,
**apply_kwargs
)
Decorator that splits the inputs of a function.
Does the following:
- Resolves the splitter of the type Splitter using the argument splitter. It can be either an already provided splitter instance, the name of a splitter class method, or an arbitrary callable. In either of the latter two cases, index and **splitter_kwargs are passed to it. The index is resolved from an already provided index, by parsing the argument under a name/position provided in index_from, or by parsing the first argument from takeable_args (in this order).
- Wraps arguments in takeable_args with Takeable.
- Runs Splitter.apply() with the arguments passed to the function as args and kwargs, as well as **apply_kwargs (the ones passed to the decorator).
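The steps above can be approximated with a small stand-alone decorator. This is a sketch under stated assumptions rather than the library's code: rolling_split is a hypothetical name, the windows are assumed to be n equal, non-overlapping slices of the index, and arguments must be passed by keyword. Splitter, Takeable, and Splitter.apply() are not used.

```python
import functools
import pandas as pd

def rolling_split(n, takeable_args):
    """Hypothetical decorator: slice the named arguments into n equal windows."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(**kwargs):
            # Resolve the index from the first "takeable" argument
            index = kwargs[takeable_args[0]].index
            length = len(index) // n
            results = {}
            for split_idx in range(n):
                window = slice(split_idx * length, (split_idx + 1) * length)
                split_kwargs = dict(kwargs)
                # Only the takeable arguments are sliced; the rest pass through
                for name in takeable_args:
                    split_kwargs[name] = kwargs[name].iloc[window]
                results[split_idx] = func(**split_kwargs)
            # Collect one result per split, indexed by split number
            return pd.Series(results).rename_axis("split")
        return wrapper
    return decorator

@rolling_split(n=2, takeable_args=["sr"])
def f(sr):
    return sr.sum()

index = pd.date_range("2020-01-01", "2020-01-06")
sr = pd.Series(range(len(index)), index=index)
out = f(sr=sr)
print(out)
```

For this six-day series the two windows cover the first and last three days.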
Usage
- Split a Series and return its sum:
>>> from vectorbtpro import *
>>> @vbt.split(
...     splitter="from_n_rolling",
...     splitter_kwargs=dict(n=2),
...     takeable_args=["sr"]
... )
... def f(sr):
...     return sr.sum()
>>> index = pd.date_range("2020-01-01", "2020-01-06")
>>> sr = pd.Series(np.arange(len(index)), index=index)
>>> f(sr)
split
0 3
1 12
dtype: int64
- Perform a split manually:
>>> @vbt.split(
...     splitter="from_n_rolling",
...     splitter_kwargs=dict(n=2),
...     takeable_args=["index"]
... )
... def f(index, sr):
...     return sr[index].sum()
>>> f(index, sr)
split
0 3
1 12
dtype: int64
- Construct the splitter and mark arguments as "takeable" manually:
>>> splitter = vbt.Splitter.from_n_rolling(index, n=2)
>>> @vbt.split(splitter=splitter)
... def f(sr):
...     return sr.sum()
>>> f(vbt.Takeable(sr))
split
0 3
1 12
dtype: int64
- Split multiple timeframes using a custom index:
>>> @vbt.split(
...     splitter="from_n_rolling",
...     splitter_kwargs=dict(n=2),
...     index=index,
...     takeable_args=["h12_sr", "d2_sr"]
... )
... def f(h12_sr, d2_sr):
...     return h12_sr.sum() + d2_sr.sum()
>>> h12_index = pd.date_range("2020-01-01", "2020-01-06", freq="12H")
>>> d2_index = pd.date_range("2020-01-01", "2020-01-06", freq="2D")
>>> h12_sr = pd.Series(np.arange(len(h12_index)), index=h12_index)
>>> d2_sr = pd.Series(np.arange(len(d2_index)), index=d2_index)
>>> f(h12_sr, d2_sr)
split
0 15
1 42
dtype: int64