strategy-lab/74_VectorbtParamCV.ipynb at d718ed61bd41f290bcc40430414eb8867e2b579a

David Brazda e3da60c647 daily update

2024-10-21 20:57:56 +02:00

This code cross-validates a parameterized trading strategy using historical data. It defines a cross-validation schema that splits data into training and testing sets based on specified time ranges. The code then applies a simple trading strategy, an EMA crossover with an ATR trailing stop, to each split. It evaluates the strategy's performance using the Sharpe ratio and performs parameter optimization to test various combinations. Finally, it analyzes the correlation between training and testing results to assess the strategy's robustness.

In [1]:

import numpy as np
from pandas.tseries.frequencies import to_offset
import vectorbtpro as vbt

Set the theme for VectorBT plots to dark

In [2]:

vbt.settings.set_theme("dark")

Define parameters for the data pull, including symbol, start and end dates, and timeframe

In [3]:

SYMBOL = "AAPL"
START = "2010"
END = "now"
TIMEFRAME = "day"

Pull historical data for the specified symbol and timeframe

In [4]:

data = vbt.YFData.pull(
    SYMBOL,
    start=START,
    end=END,
    timeframe=TIMEFRAME
)

Inspect the pulled OHLCV data for the symbol

In [14]:

data.data["AAPL"]

Out[14]:

	Open	High	Low	Close	Volume	Dividends	Stock Splits
Date
2010-01-04 00:00:00-05:00	6.437013	6.469284	6.405345	6.454505	493729600	0.0	0.0
2010-01-05 00:00:00-05:00	6.472300	6.502158	6.431583	6.465664	601904800	0.0	0.0
2010-01-06 00:00:00-05:00	6.465664	6.491300	6.356183	6.362819	552160000	0.0	0.0
2010-01-07 00:00:00-05:00	6.386344	6.393884	6.304912	6.351056	477131200	0.0	0.0
2010-01-08 00:00:00-05:00	6.342613	6.393886	6.305216	6.393282	447610800	0.0	0.0
...	...	...	...	...	...	...	...
2024-10-02 00:00:00-04:00	225.889999	227.369995	223.020004	226.779999	32880600	0.0	0.0
2024-10-03 00:00:00-04:00	225.139999	226.809998	223.320007	225.669998	34044200	0.0	0.0
2024-10-04 00:00:00-04:00	227.899994	228.000000	224.130005	226.800003	37245100	0.0	0.0
2024-10-07 00:00:00-04:00	224.500000	225.690002	221.330002	221.690002	39505400	0.0	0.0
2024-10-08 00:00:00-04:00	224.300003	225.979996	223.250000	225.770004	31634500	0.0	0.0

3716 rows × 7 columns

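Set the cross-validation windows: 12 months of training, 12 months of testing, a new split every 3 months, aligned to month starts ("MS")

In [5]: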

TRAIN = 12
TEST = 12
EVERY = 3
OFFSET = "MS"

Create a splitter that rolls a 24-month window forward every 3 months and assigns the first 12 months of each window to training and the remainder to testing

In [6]:

splitter = vbt.Splitter.from_ranges(
    data.index, 
    every=f"{EVERY}{OFFSET}", 
    lookback_period=f"{TRAIN + TEST}{OFFSET}",
    split=(
        vbt.RepFunc(lambda index: index < index[0] + TRAIN * to_offset(OFFSET)),
        vbt.RepFunc(lambda index: index >= index[0] + TRAIN * to_offset(OFFSET)),
    ),
    set_labels=["train", "test"]
)
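To make the window arithmetic concrete, here is a minimal standard-library sketch of the same rolling scheme. It does not show how `vbt.Splitter` works internally: `add_months` and `rolling_splits` are hypothetical helpers, and every boundary is assumed to fall on a month start.

```python
from datetime import date


def add_months(d, months):
    """Shift a first-of-month date forward by a number of months."""
    total = d.year * 12 + (d.month - 1) + months
    return date(total // 12, total % 12 + 1, 1)


def rolling_splits(first, last, train=12, test=12, every=3):
    """Yield (train_start, test_start, test_end) month-start boundaries.

    Each window is train + test months long and begins `every` months
    after the previous one. Windows that would run past `last` are dropped.
    """
    start = first
    while add_months(start, train + test) <= last:
        test_start = add_months(start, train)
        yield start, test_start, add_months(start, train + test)
        start = add_months(start, every)


# Hypothetical date range, chosen only to keep the output short
splits = list(rolling_splits(date(2010, 1, 1), date(2013, 1, 1)))
for train_start, test_start, test_end in splits:
    print(train_start, test_start, test_end)
```

Consecutive windows overlap by 21 months, so neighbouring splits share most of their data and their results are not independent.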

Display the splitter plots to visualize the training and testing sets

In [7]:

splitter.plots().show()

/Users/davidbrazda/Documents/Development/python/strategy-lab1/.venv/lib/python3.10/site-packages/jupyter_client/session.py:721: UserWarning:

Message serialization failed with:
Out of range float values are not JSON compliant
Supporting this message is deprecated in jupyter-client 7, please make sure your message is JSON-compliant

Define an objective function to execute a trading strategy with specific parameters

In [8]:

def objective(data, fast_period=11, slow_period=20, atr_period=14, atr_mult=3):
    """Execute EMA crossover with ATR trailing stop
    
    Parameters
    ----------
    data : vbt.Data
        Historical price data
    fast_period : int, optional
        Period for fast EMA, by default 11
    slow_period : int, optional
        Period for slow EMA, by default 20
    atr_period : int, optional
        Period for ATR, by default 14
    atr_mult : int, optional
        Multiplier for ATR trailing stop, by default 3
    
    Returns
    -------
    float
        Sharpe ratio of the strategy
    """
    
    # Calculate fast and slow EMAs and ATR for the given periods
    fast_ema = data.run("talib:ema", fast_period, short_name="fast_ema", unpack=True)
    slow_ema = data.run("talib:ema", slow_period, short_name="slow_ema", unpack=True)
    atr = data.run("talib:atr", atr_period, unpack=True)
    
    # Define a portfolio using EMA crossover signals and ATR trailing stop
    pf = vbt.PF.from_signals(
        data, 
        entries=fast_ema.vbt.crossed_above(slow_ema), 
        exits=fast_ema.vbt.crossed_below(slow_ema), 
        tsl_stop=atr * atr_mult, 
        save_returns=True,
        freq=TIMEFRAME
    )
    
    # Return the Sharpe ratio of the portfolio
    return pf.sharpe_ratio
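The entry and exit signals used above can be illustrated without any library. The sketch below is an assumption-laden simplification: `ema` seeds the average with the first price (TA-Lib seeds differently), and `crossed_above` / `crossed_below` are hypothetical helpers that flag a bar where the fast series moves from at-or-below to above the slow one (or the reverse). The exact crossing rule in vectorbt may differ.

```python
def ema(values, period):
    """Exponential moving average seeded with the first value (simplified)."""
    alpha = 2.0 / (period + 1)
    out = [values[0]]
    for value in values[1:]:
        out.append(alpha * value + (1 - alpha) * out[-1])
    return out


def crossed_above(a, b):
    """True where a moves from <= b on the previous bar to > b on this bar."""
    return [False] + [
        a[i - 1] <= b[i - 1] and a[i] > b[i] for i in range(1, len(a))
    ]


def crossed_below(a, b):
    """True where a moves from >= b on the previous bar to < b on this bar."""
    return [False] + [
        a[i - 1] >= b[i - 1] and a[i] < b[i] for i in range(1, len(a))
    ]


# Hypothetical prices: a dip followed by a steady recovery
prices = [10, 9, 8, 7, 8, 9, 10, 11, 12]
fast = ema(prices, 2)
slow = ema(prices, 4)

entries = crossed_above(fast, slow)
exits = crossed_below(fast, slow)
print("entry bars:", [i for i, flag in enumerate(entries) if flag])
print("exit bars:", [i for i, flag in enumerate(exits) if flag])
```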

Print the Sharpe ratio for the objective function with default parameters

In [9]:

print(objective(data))

1.133668496227128
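For reference, a Sharpe ratio of this kind is the mean excess return divided by its standard deviation, scaled to a year. The sketch below assumes 252 periods per year, a zero risk-free rate, and the sample standard deviation. These are illustrative choices, and vectorbt's own defaults may differ, so it is not expected to reproduce the value printed above.

```python
import math
import statistics


def sharpe_ratio(returns, periods_per_year=252, risk_free=0.0):
    """Annualized Sharpe ratio of a sequence of per-period returns."""
    excess = [r - risk_free for r in returns]
    return (
        statistics.mean(excess)
        / statistics.stdev(excess)
        * math.sqrt(periods_per_year)
    )


# Hypothetical daily returns, for illustration only
daily_returns = [0.01, -0.005, 0.02, 0.0, 0.015]
value = sharpe_ratio(daily_returns)
print(round(value, 3))
```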

Decorate the objective function to enable it to accept lists of parameters and execute across combinations

In [10]:

param_objective = vbt.parameterized(
    objective,
    merge_func="concat",
    mono_n_chunks="auto",
    execute_kwargs=dict(engine="pathos")
)

Further decorate the function to run across date ranges specified by the splitter

In [11]:

cv_objective = vbt.split(
    param_objective,
    splitter=splitter, 
    takeable_args=["data"], 
    merge_func="concat", 
    execute_kwargs=dict(show_progress=True)
)
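Conceptually, the decorated function is called once per split and set on the matching slice of the data, and the results are concatenated under a (split, set) key. The sketch below mimics that idea with plain lists. `apply_over_splits` and `mean_value` are hypothetical stand-ins, not part of vectorbt, and the slices are positional index ranges rather than dates.

```python
def apply_over_splits(func, series, splits, **params):
    """Call func on every (split, set) slice and collect results in a dict."""
    results = {}
    for i, sets in enumerate(splits):
        for label, (start, stop) in sets.items():
            results[(i, label)] = func(series[start:stop], **params)
    return results


def mean_value(chunk, scale=1.0):
    """Stand-in for the objective: a scaled mean of the slice."""
    return scale * sum(chunk) / len(chunk)


series = list(range(10))
splits = [
    {"train": (0, 4), "test": (4, 6)},
    {"train": (2, 6), "test": (6, 8)},
]

results = apply_over_splits(mean_value, series, splits, scale=2.0)
for key, value in results.items():
    print(key, value)
```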

Generate Sharpe ratio results for various parameter combinations using cross-validation

In [15]:

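sharpe_ratio = cv_objective(
    data,
    # fast_period: keep only combinations at least 5 bars shorter than slow_period
    vbt.Param(np.arange(10, 50, 10), condition="slow_period - fast_period >= 5"),
    # slow_period
    vbt.Param(np.arange(10, 50, 10)),
    # atr_period: must lie between the two EMA periods
    vbt.Param(np.arange(10, 50, 10), condition="fast_period <= atr_period <= slow_period"),
    # atr_mult: 2, 3, 4
    vbt.Param(np.arange(2, 5))
)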
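The two conditions prune the full Cartesian product of parameter values. A quick way to check how many combinations survive is to enumerate them with `itertools.product`, as in this sketch. It reimplements the filtering by hand rather than calling vectorbt, and the split count of 51 is taken from the progress bar and result index in this notebook.

```python
from itertools import product

periods = range(10, 50, 10)   # 10, 20, 30, 40
atr_mults = range(2, 5)       # 2, 3, 4

# Keep only combinations that satisfy both conditions from the call above
grid = [
    (fast, slow, atr, mult)
    for fast, slow, atr, mult in product(periods, periods, periods, atr_mults)
    if slow - fast >= 5 and fast <= atr <= slow
]

n_combos = len(grid)
n_results = n_combos * 51 * 2  # 51 splits, 2 sets (train and test)
print(n_combos, n_results)
```

Of the 192 unfiltered combinations, 48 remain, and 48 combinations over 51 splits and 2 sets account for the 4896 rows in the result.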

  4%|3         | 2/51 [00:02<00:49,  1.02s/it, split=2]

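Display the resulting Sharpe ratios, indexed by split, set, and parameter combination. In the rows shown, the value does not change with atr_period or atr_mult, which suggests the ATR trailing stop is not affecting those runs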

In [16]:

sharpe_ratio

Out[16]:

split  set    fast_period  slow_period  atr_period  atr_mult
0      train  10           20           10          2           1.100707
                                                    3           1.100707
                                                    4           1.100707
                                        20          2           1.100707
                                                    3           1.100707
                                                                  ...   
50     test   30           40           30          3           1.604512
                                                    4           1.604512
                                        40          2           1.604512
                                                    3           1.604512
                                                    4           1.604512
Name: sharpe_ratio, Length: 4896, dtype: float64

Extract the Sharpe ratio for the training set

In [17]:

train_sharpe_ratio = sharpe_ratio.xs("train", level="set")
train_sharpe_ratio

Out[17]:

split  fast_period  slow_period  atr_period  atr_mult
0      10           20           10          2           1.100707
                                             3           1.100707
                                             4           1.100707
                                 20          2           1.100707
                                             3           1.100707
                                                           ...   
50     30           40           30          3           2.313270
                                             4           2.313270
                                 40          2           2.313270
                                             3           2.313270
                                             4           2.313270
Name: sharpe_ratio, Length: 2448, dtype: float64

Extract the Sharpe ratio for the testing set

In [20]:

test_sharpe_ratio = sharpe_ratio.xs("test", level="set")
test_sharpe_ratio

Out[20]:

split  fast_period  slow_period  atr_period  atr_mult
0      10           20           10          2          -0.080490
                                             3          -0.080490
                                             4          -0.080490
                                 20          2          -0.080490
                                             3          -0.080490
                                                           ...   
50     30           40           30          3           1.604512
                                             4           1.604512
                                 40          2           1.604512
                                             3           1.604512
                                             4           1.604512
Name: sharpe_ratio, Length: 2448, dtype: float64

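Compute the correlation between training and testing Sharpe ratios. A value near zero or below indicates that in-sample performance does not carry over out of sample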

In [21]:

train_sharpe_ratio.corr(test_sharpe_ratio)

Out[21]:

-0.17950267350172305
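`Series.corr` computes the Pearson coefficient by default, after aligning the two series on their shared index. The sketch below spells out that formula for two already-aligned lists. `pearson` is a hypothetical helper and the inputs are made-up numbers, not the notebook's results.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long, already aligned sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


train = [1.0, 2.0, 3.0, 4.0]
test_same = [2.0, 4.0, 6.0, 8.0]     # rises with train
test_flipped = [8.0, 6.0, 4.0, 2.0]  # falls as train rises

print(pearson(train, test_same))
print(pearson(train, test_flipped))
```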

Calculate the difference in Sharpe ratios between testing and training sets

In [29]:

sharpe_ratio_diff = test_sharpe_ratio - train_sharpe_ratio
sharpe_ratio_diff

Out[29]:

split  fast_period  slow_period  atr_period  atr_mult
0      10           20           10          2          -1.181197
                                             3          -1.181197
                                             4          -1.181197
                                 20          2          -1.181197
                                             3          -1.181197
                                                           ...   
50     30           40           30          3          -0.708758
                                             4          -0.708758
                                 40          2          -0.708758
                                             3          -0.708758
                                             4          -0.708758
Name: sharpe_ratio, Length: 2448, dtype: float64

Compute the median difference in Sharpe ratios grouped by fast and slow EMA periods

In [36]:

sharpe_ratio_diff_median = sharpe_ratio_diff.groupby(
    ["fast_period", "slow_period"]
).median()
sharpe_ratio_diff_median

Out[36]:

fast_period  slow_period
10           20             0.069470
             30            -0.002789
             40            -0.200267
20           30            -0.052227
             40            -0.446390
30           40             0.114575
Name: sharpe_ratio, dtype: float64
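The grouping step collapses every split and every ATR setting into one number per (fast_period, slow_period) pair. A standard-library sketch of the same aggregation follows. The values in `diffs` are hypothetical and keyed by (split, fast_period, slow_period) only, for brevity.

```python
from collections import defaultdict
from statistics import median

# Hypothetical test-minus-train differences, not the notebook's results
diffs = {
    (0, 10, 20): -1.2,
    (1, 10, 20): 0.1,
    (2, 10, 20): 0.3,
    (0, 20, 40): -0.5,
    (1, 20, 40): -0.4,
}

# Collect values per (fast_period, slow_period), ignoring the split
groups = defaultdict(list)
for (_, fast, slow), value in diffs.items():
    groups[(fast, slow)].append(value)

medians = {key: median(values) for key, values in groups.items()}
for key, value in medians.items():
    print(key, value)
```

The median is less sensitive than the mean to a single extreme split, such as the -1.2 in the first group.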

Display a heatmap of the median differences

In [38]:

sharpe_ratio_diff_median.vbt.heatmap(
    trace_kwargs=dict(colorscale="RdBu")
).show()

PyQuant News is where finance practitioners level up with Python for quant finance, algorithmic trading, and market data analysis. Looking to get started? Check out the fastest growing, top-selling course to get started with Python for quant finance. For educational purposes. Not investment advice. Use at your own risk.
