strategy-lab/to_explore/pyquantnews/74_VectorbtParamCV.ipynb
David Brazda e3da60c647 daily update
2024-10-21 20:57:56 +02:00

This code cross-validates a parameterized trading strategy using historical data. It defines a cross-validation schema that splits data into training and testing sets based on specified time ranges. The code then applies a simple trading strategy, an EMA crossover with an ATR trailing stop, to each split. It evaluates the strategy's performance using the Sharpe ratio and performs parameter optimization to test various combinations. Finally, it analyzes the correlation between training and testing results to assess the strategy's robustness.

In [1]:
import numpy as np
from pandas.tseries.frequencies import to_offset
import vectorbtpro as vbt

Set the theme for VectorBT plots to dark

In [2]:
vbt.settings.set_theme("dark")

Define parameters for the data pull, including symbol, start and end dates, and timeframe

In [3]:
SYMBOL = "AAPL"
START = "2010"
END = "now"
TIMEFRAME = "day"

Pull historical data for the specified symbol and timeframe

In [4]:
data = vbt.YFData.pull(
    SYMBOL,
    start=START,
    end=END,
    timeframe=TIMEFRAME
)

Inspect the downloaded OHLCV data for the symbol

In [14]:
data.data["AAPL"]
Out[14]:
Date                             Open        High         Low       Close     Volume  Dividends  Stock Splits
2010-01-04 00:00:00-05:00    6.437013    6.469284    6.405345    6.454505  493729600        0.0           0.0
2010-01-05 00:00:00-05:00    6.472300    6.502158    6.431583    6.465664  601904800        0.0           0.0
2010-01-06 00:00:00-05:00    6.465664    6.491300    6.356183    6.362819  552160000        0.0           0.0
2010-01-07 00:00:00-05:00    6.386344    6.393884    6.304912    6.351056  477131200        0.0           0.0
2010-01-08 00:00:00-05:00    6.342613    6.393886    6.305216    6.393282  447610800        0.0           0.0
...                               ...         ...         ...         ...        ...        ...           ...
2024-10-02 00:00:00-04:00  225.889999  227.369995  223.020004  226.779999   32880600        0.0           0.0
2024-10-03 00:00:00-04:00  225.139999  226.809998  223.320007  225.669998   34044200        0.0           0.0
2024-10-04 00:00:00-04:00  227.899994  228.000000  224.130005  226.800003   37245100        0.0           0.0
2024-10-07 00:00:00-04:00  224.500000  225.690002  221.330002  221.690002   39505400        0.0           0.0
2024-10-08 00:00:00-04:00  224.300003  225.979996  223.250000  225.770004   31634500        0.0           0.0

3716 rows × 7 columns

Define parameters for the cross-validation schema, including training and testing periods

In [5]:
TRAIN = 12
TEST = 12
EVERY = 3
OFFSET = "MS"

Create a splitter object that divides the date range into training and testing sets

In [6]:
splitter = vbt.Splitter.from_ranges(
    data.index, 
    every=f"{EVERY}{OFFSET}", 
    lookback_period=f"{TRAIN + TEST}{OFFSET}",
    split=(
        vbt.RepFunc(lambda index: index < index[0] + TRAIN * to_offset(OFFSET)),
        vbt.RepFunc(lambda index: index >= index[0] + TRAIN * to_offset(OFFSET)),
    ),
    set_labels=["train", "test"]
)
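
The rolling layout can be sketched without vectorbtpro. The block below is a minimal standard-library illustration, assuming month-start boundaries and the same settings (12 months of training, 12 months of testing, a new window every 3 months). The helpers `month_starts` and `rolling_splits` are hypothetical names introduced here, not part of the vbt API, and the exact boundaries produced by `Splitter.from_ranges` may differ.

```python
from datetime import date


def month_starts(start, end):
    """Return the first day of every month from start to end (inclusive)."""
    out = []
    year, month = start.year, start.month
    while date(year, month, 1) <= end:
        out.append(date(year, month, 1))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return out


def rolling_splits(months, train=12, test=12, every=3):
    """Return (train_start, test_start, test_end) for each rolling window."""
    window = train + test
    splits = []
    # Step forward `every` months at a time while a full window still fits
    for i in range(0, len(months) - window, every):
        splits.append((months[i], months[i + train], months[i + window]))
    return splits


# Three years of month starts are enough to show a handful of windows
months = month_starts(date(2010, 1, 1), date(2013, 1, 1))
splits = rolling_splits(months)
for train_start, test_start, test_end in splits:
    print(f"train [{train_start}, {test_start})  test [{test_start}, {test_end})")
```

Each window's test period starts exactly where its training period ends, so within a window the two sets never overlap, although consecutive windows do share data.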

Display the splitter plots to visualize the training and testing sets

In [7]:
splitter.plots().show()

Define an objective function to execute a trading strategy with specific parameters

In [8]:
def objective(data, fast_period=11, slow_period=20, atr_period=14, atr_mult=3):
    """Execute EMA crossover with ATR trailing stop
    
    Parameters
    ----------
    data : vbt.Data
        Historical price data
    fast_period : int, optional
        Period for fast EMA, by default 11
    slow_period : int, optional
        Period for slow EMA, by default 20
    atr_period : int, optional
        Period for ATR, by default 14
    atr_mult : int, optional
        Multiplier for ATR trailing stop, by default 3
    
    Returns
    -------
    float
        Sharpe ratio of the strategy
    """
    
    # Calculate fast and slow EMAs and ATR for the given periods
    fast_ema = data.run("talib:ema", fast_period, short_name="fast_ema", unpack=True)
    slow_ema = data.run("talib:ema", slow_period, short_name="slow_ema", unpack=True)
    atr = data.run("talib:atr", atr_period, unpack=True)
    
    # Define a portfolio using EMA crossover signals and ATR trailing stop
    pf = vbt.PF.from_signals(
        data, 
        entries=fast_ema.vbt.crossed_above(slow_ema), 
        exits=fast_ema.vbt.crossed_below(slow_ema), 
        tsl_stop=atr * atr_mult, 
        save_returns=True,
        freq=TIMEFRAME
    )
    
    # Return the Sharpe ratio of the portfolio
    return pf.sharpe_ratio
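
The crossover logic inside `objective` can be illustrated in plain Python. This is a rough sketch under stated assumptions: the `ema` helper here is seeded with the first price, whereas TA-Lib's EMA uses a different warm-up, and `crossed_above` / `crossed_below` are hypothetical stand-ins for the vbt accessors, whose exact tie-handling may differ. The prices are made up.

```python
def ema(values, period):
    """Exponential moving average seeded with the first value (an assumption)."""
    alpha = 2 / (period + 1)
    out = [values[0]]
    for v in values[1:]:
        prev = out[-1]
        out.append(prev + alpha * (v - prev))
    return out


def crossed_above(a, b):
    """True where a moves from at-or-below b to strictly above b."""
    return [False] + [
        a[i - 1] <= b[i - 1] and a[i] > b[i] for i in range(1, len(a))
    ]


def crossed_below(a, b):
    """True where a moves from at-or-above b to strictly below b."""
    return [False] + [
        a[i - 1] >= b[i - 1] and a[i] < b[i] for i in range(1, len(a))
    ]


# Made-up prices: flat, then a rally, then a decline
prices = [10, 10, 10, 11, 12, 13, 12, 11, 10, 9, 8]
fast = ema(prices, 2)
slow = ema(prices, 5)
entries = crossed_above(fast, slow)
exits = crossed_below(fast, slow)
print("entry bars:", [i for i, e in enumerate(entries) if e])
print("exit bars:", [i for i, e in enumerate(exits) if e])
```

The fast EMA reacts to the rally first and crosses above the slow EMA, then crosses back below once the decline is under way.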

Print the Sharpe ratio for the objective function with default parameters

In [9]:
print(objective(data))
1.133668496227128
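
For reference, an annualized Sharpe ratio can be computed from per-period returns as in the sketch below. It assumes 252 trading periods per year, a zero risk-free rate, and the sample standard deviation; vectorbtpro's own defaults (annualization factor, degrees of freedom) may differ, so this will not necessarily reproduce the figure above. The returns are made up.

```python
import math
import statistics


def sharpe_ratio(returns, periods_per_year=252, risk_free=0.0):
    """Annualized Sharpe ratio of per-period returns (illustrative helper)."""
    # Convert the annual risk-free rate to a per-period rate
    excess = [r - risk_free / periods_per_year for r in returns]
    mean = statistics.fmean(excess)
    std = statistics.stdev(excess)  # sample standard deviation (ddof=1)
    return mean / std * math.sqrt(periods_per_year)


# Hypothetical daily returns
daily_returns = [0.01, -0.005, 0.02, 0.0, 0.015]
print(sharpe_ratio(daily_returns))
```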

Decorate the objective function to enable it to accept lists of parameters and execute across combinations

In [10]:
param_objective = vbt.parameterized(
    objective,
    merge_func="concat",
    mono_n_chunks="auto",
    execute_kwargs=dict(engine="pathos")
)

Further decorate the function to run across date ranges specified by the splitter

In [11]:
cv_objective = vbt.split(
    param_objective,
    splitter=splitter, 
    takeable_args=["data"], 
    merge_func="concat", 
    execute_kwargs=dict(show_progress=True)
)

Generate Sharpe ratio results for various parameter combinations using cross-validation

In [15]:
sharpe_ratio = cv_objective(
    data,
    vbt.Param(np.arange(10, 50, 10), condition="slow_period - fast_period >= 5"),  # fast_period
    vbt.Param(np.arange(10, 50, 10)),  # slow_period
    vbt.Param(np.arange(10, 50, 10), condition="fast_period <= atr_period <= slow_period"),  # atr_period
    vbt.Param(np.arange(2, 5))  # atr_mult
)

Print the resulting Sharpe ratio for the parameter combinations

In [16]:
sharpe_ratio
Out[16]:
split  set    fast_period  slow_period  atr_period  atr_mult
0      train  10           20           10          2           1.100707
                                                    3           1.100707
                                                    4           1.100707
                                        20          2           1.100707
                                                    3           1.100707
                                                                  ...   
50     test   30           40           30          3           1.604512
                                                    4           1.604512
                                        40          2           1.604512
                                                    3           1.604512
                                                    4           1.604512
Name: sharpe_ratio, Length: 4896, dtype: float64
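
The length of 4896 can be sanity-checked by enumerating the grid by hand. The sketch below applies the same two conditions with `itertools.product`, then multiplies by the 51 splits (labelled 0 to 50 above) and the two sets. It only mirrors the filtering logic and is not how vectorbtpro builds the grid internally.

```python
from itertools import product

periods = range(10, 50, 10)  # 10, 20, 30, 40
mults = range(2, 5)          # 2, 3, 4

# Keep only combinations that satisfy both conditions from the vbt.Param calls
combos = [
    (fast, slow, atr, mult)
    for fast, slow, atr, mult in product(periods, periods, periods, mults)
    if slow - fast >= 5 and fast <= atr <= slow
]

n_splits, n_sets = 51, 2
print(len(combos))                      # 48
print(len(combos) * n_splits * n_sets)  # 4896
```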

Extract the Sharpe ratio for the training set

In [17]:
train_sharpe_ratio = sharpe_ratio.xs("train", level="set")
train_sharpe_ratio
Out[17]:
split  fast_period  slow_period  atr_period  atr_mult
0      10           20           10          2           1.100707
                                             3           1.100707
                                             4           1.100707
                                 20          2           1.100707
                                             3           1.100707
                                                           ...   
50     30           40           30          3           2.313270
                                             4           2.313270
                                 40          2           2.313270
                                             3           2.313270
                                             4           2.313270
Name: sharpe_ratio, Length: 2448, dtype: float64

Extract the Sharpe ratio for the testing set

In [20]:
test_sharpe_ratio = sharpe_ratio.xs("test", level="set")
test_sharpe_ratio
Out[20]:
split  fast_period  slow_period  atr_period  atr_mult
0      10           20           10          2          -0.080490
                                             3          -0.080490
                                             4          -0.080490
                                 20          2          -0.080490
                                             3          -0.080490
                                                           ...   
50     30           40           30          3           1.604512
                                             4           1.604512
                                 40          2           1.604512
                                             3           1.604512
                                             4           1.604512
Name: sharpe_ratio, Length: 2448, dtype: float64

Print the correlation between training and testing Sharpe ratios

In [21]:
train_sharpe_ratio.corr(test_sharpe_ratio)
Out[21]:
-0.17950267350172305
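
A value near zero (here slightly negative) suggests that a high in-sample Sharpe ratio says little about out-of-sample performance for this parameter grid. As a reminder of what `Series.corr` computes by default, here is a minimal Pearson correlation sketch in plain Python. The `pearson` helper and the sample values are illustrative only.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical train/test Sharpe ratios for four parameter combinations
train = [1.1, 0.4, 2.3, -0.2]
test = [-0.1, 0.9, 1.6, 0.3]
print(pearson(train, test))
```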

Calculate the difference in Sharpe ratios between testing and training sets

In [29]:
sharpe_ratio_diff = test_sharpe_ratio - train_sharpe_ratio
sharpe_ratio_diff
Out[29]:
split  fast_period  slow_period  atr_period  atr_mult
0      10           20           10          2          -1.181197
                                             3          -1.181197
                                             4          -1.181197
                                 20          2          -1.181197
                                             3          -1.181197
                                                           ...   
50     30           40           30          3          -0.708758
                                             4          -0.708758
                                 40          2          -0.708758
                                             3          -0.708758
                                             4          -0.708758
Name: sharpe_ratio, Length: 2448, dtype: float64

Compute the median difference in Sharpe ratios grouped by fast and slow EMA periods

In [36]:
sharpe_ratio_diff_median = sharpe_ratio_diff.groupby(
    ["fast_period", "slow_period"]
).median()
sharpe_ratio_diff_median
Out[36]:
fast_period  slow_period
10           20             0.069470
             30            -0.002789
             40            -0.200267
20           30            -0.052227
             40            -0.446390
30           40             0.114575
Name: sharpe_ratio, dtype: float64
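
Grouping by the two EMA periods pools every split, ATR period, and ATR multiplier into one bucket per (fast, slow) pair before taking the median. A rough standard-library equivalent of that group-then-median step, using made-up differences, might look like this:

```python
from collections import defaultdict
from statistics import median

# Hypothetical ((fast_period, slow_period), test-minus-train Sharpe) pairs
diffs = [
    ((10, 20), -1.2), ((10, 20), 0.1), ((10, 20), 0.4),
    ((10, 30), -0.3), ((10, 30), 0.2),
]

# Collect the differences for each (fast, slow) pair
groups = defaultdict(list)
for key, value in diffs:
    groups[key].append(value)

# Reduce each bucket to its median
medians = {key: median(values) for key, values in groups.items()}
print(medians)
```

The median is used rather than the mean so that a few extreme splits do not dominate the summary.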

Display a heatmap of the median differences

In [38]:
sharpe_ratio_diff_median.vbt.heatmap(
    trace_kwargs=dict(colorscale="RdBu")
).show()

PyQuant News is where finance practitioners level up with Python for quant finance, algorithmic trading, and market data analysis. For educational purposes. Not investment advice. Use at your own risk.