strategy-lab/to_explore/pyquantnews/42_VectorbtWalkforwardAnalysis.ipynb
David Brazda e3da60c647 daily update
2024-10-21 20:57:56 +02:00


This notebook performs a walk-forward analysis of a moving-average (MA) crossover strategy. It splits historical stock prices into rolling in-sample (training) and out-of-sample (testing) windows, searches each in-sample window for the pair of MA window lengths with the highest Sharpe ratio, and then applies that pair to the matching out-of-sample window. It also defines a helper that computes the Sharpe ratio of a buy-and-hold benchmark, runs a one-sided t-test of the out-of-sample Sharpe ratios against the best in-sample Sharpe ratios, and plots the results for further analysis.

In [ ]:
import numpy as np
import pandas as pd  # used below for pd.Series and pd.DataFrame
import vectorbt as vbt
from datetime import datetime, timedelta
import scipy.stats as stats

Create a date range index for the Series

In [ ]:
index = [datetime(2020, 1, 1) + timedelta(days=i) for i in range(10)]

Create a pandas Series with the date index and sequential integer values

In [ ]:
sr = pd.Series(np.arange(len(index)), index=index)

Perform a rolling split on the Series and plot the in-sample and out-sample periods

In [ ]:
sr.vbt.rolling_split(
    window_len=5, 
    set_lens=(1,), 
    left_to_right=False, 
    plot=True, 
    trace_names=['in_sample', 'out_sample']
)
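To make the idea behind the rolling split concrete, here is a minimal sketch in plain Python. The helper `rolling_windows` and its even-spacing rule are assumptions made for illustration; this is not vectorbt's implementation, and vectorbt may pick window positions differently.

```python
def rolling_windows(total_len, window_len, out_len, n=None):
    """Hypothetical helper: return (in_sample, out_sample) position lists per window."""
    # Every admissible window start position, sliding one step at a time.
    starts = list(range(total_len - window_len + 1))
    if n is not None and n < len(starts):
        # Keep n roughly evenly spaced windows (assumed spacing rule).
        last = len(starts) - 1
        picks = [round(i * last / (n - 1)) for i in range(n)] if n > 1 else [0]
        starts = [starts[p] for p in picks]
    splits = []
    for s in starts:
        window = list(range(s, s + window_len))
        # The last out_len positions are held out; the rest is in-sample.
        splits.append((window[:-out_len], window[-out_len:]))
    return splits


splits = rolling_windows(total_len=10, window_len=5, out_len=1)
for in_idx, out_idx in splits:
    print(in_idx, out_idx)
```

With ten observations, a window of five and one held-out position, the sketch yields six windows, each made of four in-sample positions followed by one out-of-sample position.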

Create a range of window sizes for moving averages to iterate through

In [ ]:
windows = np.arange(10, 50)

Download historical closing prices for Apple (AAPL) stock

In [ ]:
price = vbt.YFData.download('AAPL').get('Close')

Perform a rolling split on the price data for walk-forward analysis

In [ ]:
(in_price, in_indexes), (out_price, out_indexes) = price.vbt.rolling_split(
    n=30, 
    window_len=365 * 2,
    set_lens=(180,),
    left_to_right=False,
)
In [ ]:
def simulate_holding(price, **kwargs):
    """Returns Sharpe ratio for holding strategy
    
    Parameters
    ----------
    price : pd.Series
        Historical price data
    kwargs : dict
        Additional arguments for the Portfolio function
    
    Returns
    -------
    float
        Sharpe ratio of the holding strategy
    """
    
    # Run a backtest for holding the asset and return the Sharpe ratio
    pf = vbt.Portfolio.from_holding(price, **kwargs)
    return pf.sharpe_ratio()
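For readers unfamiliar with the metric, the sketch below computes an annualized Sharpe ratio by hand. The helper `annualized_sharpe`, the zero risk-free rate and the 365 periods per year are assumptions for illustration; vectorbt's own calculation may differ in details such as the annualization factor.

```python
import math
import statistics


def annualized_sharpe(prices, periods_per_year=365):
    """Hypothetical helper: annualized Sharpe ratio with a zero risk-free rate."""
    # Simple period-over-period returns.
    returns = [b / a - 1 for a, b in zip(prices, prices[1:])]
    mean = statistics.mean(returns)
    std = statistics.stdev(returns)  # sample standard deviation
    return mean / std * math.sqrt(periods_per_year)


toy_prices = [100.0, 101.0, 99.0, 102.0, 103.0]  # made-up prices
print(annualized_sharpe(toy_prices))
```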
In [ ]:
def simulate_all_params(price, windows, **kwargs):
    """Returns Sharpe ratio for all parameter combinations
    
    Parameters
    ----------
    price : pd.Series
        Historical price data
    windows : iterable
        Range of window sizes for moving averages
    kwargs : dict
        Additional arguments for the Portfolio function
    
    Returns
    -------
    pd.Series
        Sharpe ratios for all parameter combinations
    """
    
    # Run combinations of moving averages for all window sizes
    fast_ma, slow_ma = vbt.MA.run_combs(
        price, windows, r=2, short_names=["fast", "slow"]
    )
    
    # Generate entry signals when fast MA crosses above slow MA
    entries = fast_ma.ma_crossed_above(slow_ma)
    
    # Generate exit signals when fast MA crosses below slow MA
    exits = fast_ma.ma_crossed_below(slow_ma)
    
    # Run the backtest and return the Sharpe ratio
    pf = vbt.Portfolio.from_signals(price, entries, exits, **kwargs)
    return pf.sharpe_ratio()
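The crossover rule used above can be sketched for a single pair of windows in plain pandas. The helper `crossover_signals` is hypothetical and uses simple moving averages; vectorbt's `ma_crossed_above` and `ma_crossed_below` may treat missing values and ties differently.

```python
import pandas as pd


def crossover_signals(close, fast_window, slow_window):
    """Hypothetical helper: boolean entry/exit Series from an SMA crossover."""
    fast = close.rolling(fast_window).mean()
    slow = close.rolling(slow_window).mean()
    above = fast > slow
    # Only compare bars where the slow average exists on this bar and the previous one.
    valid = slow.notna() & slow.shift(1).notna()
    prev_above = above.shift(1, fill_value=False)
    entries = valid & above & ~prev_above   # fast moves from below to above slow
    exits = valid & ~above & prev_above     # fast moves from above to below slow
    return entries, exits


close = pd.Series([5.0, 4.0, 3.0, 4.0, 6.0, 7.0, 4.0, 2.0, 1.0])  # made-up prices
entries, exits = crossover_signals(close, fast_window=2, slow_window=3)
print(entries[entries].index.tolist(), exits[exits].index.tolist())
```

On this toy series the fast average rises above the slow one at position 4 and falls back below it at position 6.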
In [ ]:
def get_best_index(performance, higher_better=True):
    """Returns the best performing index
    
    Parameters
    ----------
    performance : pd.Series
        Performance metrics for each split
    higher_better : bool, optional
        Whether higher values are better, by default True
    
    Returns
    -------
    pd.Index
        Index of the best performing parameters
    """
    
    if higher_better:
        return performance[performance.groupby('split_idx').idxmax()].index
    return performance[performance.groupby('split_idx').idxmin()].index
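As an illustration of the per-split selection, the following sketch builds a small made-up performance Series with the same index level names and picks the best row of each split. The values and the names `toy_index` and `toy_performance` are invented for the example.

```python
import pandas as pd

toy_index = pd.MultiIndex.from_tuples(
    [(10, 20, 0), (10, 30, 0), (10, 20, 1), (10, 30, 1)],
    names=["fast_window", "slow_window", "split_idx"],
)
toy_performance = pd.Series([0.5, 1.2, 0.9, 0.3], index=toy_index)

# For each split, idxmax returns the full index label (a tuple) of the best row.
best_labels = toy_performance.groupby("split_idx").idxmax()
best_index = pd.MultiIndex.from_tuples(
    best_labels.tolist(), names=toy_performance.index.names
)

# Read the winning window lengths back out of the index levels.
best_fast = best_index.get_level_values("fast_window").to_numpy()
best_slow = best_index.get_level_values("slow_window").to_numpy()
print(best_fast, best_slow)
```

Split 0 selects the (10, 30) pair and split 1 selects the (10, 20) pair, so each split carries its own winning parameters forward.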
In [ ]:
def get_best_params(best_index, level_name):
    """Returns the best parameters
    
    Parameters
    ----------
    best_index : pd.Index
        Index of the best performing parameters
    level_name : str
        Name of the level to extract values from
    
    Returns
    -------
    np.ndarray
        Best parameter values
    """
    
    return best_index.get_level_values(level_name).to_numpy()
In [ ]:
def simulate_best_params(price, best_fast_windows, best_slow_windows, **kwargs):
    """Returns Sharpe ratio for best parameters
    
    Parameters
    ----------
    price : pd.Series
        Historical price data
    best_fast_windows : np.ndarray
        Best fast moving average windows
    best_slow_windows : np.ndarray
        Best slow moving average windows
    kwargs : dict
        Additional arguments for the Portfolio function
    
    Returns
    -------
    pd.Series
        Sharpe ratios for the best parameters
    """
    
    # Run the moving average indicators with the best parameters
    fast_ma = vbt.MA.run(price, window=best_fast_windows, per_column=True)
    slow_ma = vbt.MA.run(price, window=best_slow_windows, per_column=True)
    
    # Generate entry signals when fast MA crosses above slow MA
    entries = fast_ma.ma_crossed_above(slow_ma)
    
    # Generate exit signals when fast MA crosses below slow MA
    exits = fast_ma.ma_crossed_below(slow_ma)
    
    # Run the backtest and return the Sharpe ratio
    pf = vbt.Portfolio.from_signals(price, entries, exits, **kwargs)
    return pf.sharpe_ratio()
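The `per_column=True` argument matters here: each split column should get only its own best window, not every window. A rough pandas equivalent, with the hypothetical helper `rolling_mean_per_column`, might look like this:

```python
import pandas as pd


def rolling_mean_per_column(frame, windows):
    """Hypothetical helper: apply the i-th window to the i-th column only."""
    if len(windows) != frame.shape[1]:
        raise ValueError("need exactly one window per column")
    columns = {
        col: frame[col].rolling(int(w)).mean()
        for col, w in zip(frame.columns, windows)
    }
    return pd.DataFrame(columns)


# Two made-up "splits" stored as columns, each with its own window length.
frame = pd.DataFrame({0: [1.0, 2.0, 3.0, 4.0], 1: [10.0, 20.0, 30.0, 40.0]})
result = rolling_mean_per_column(frame, windows=[2, 3])
print(result)
```

Column 0 is averaged over two rows and column 1 over three rows, so the output has as many columns as the input instead of one column per window and split combination.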

Get the Sharpe ratio of the strategy across all MA windows for in-sample data

In [ ]:
in_sharpe = simulate_all_params(
    in_price, 
    windows, 
    direction="both", 
    freq="d"
)

Find the best performing parameter index for in-sample data

In [ ]:
in_best_index = get_best_index(in_sharpe)

Extract the best fast and slow moving average window values

In [ ]:
in_best_fast_windows = get_best_params(
    in_best_index,
    'fast_window'
)
In [ ]:
in_best_slow_windows = get_best_params(
    in_best_index,
    'slow_window'
)

Pair the best fast and slow moving average windows

In [ ]:
in_best_window_pairs = np.array(
    list(
        zip(
            in_best_fast_windows, 
            in_best_slow_windows
        )
    )
)

Use best parameters from in-sample ranges and simulate them for out-sample ranges

In [ ]:
out_test_sharpe = simulate_best_params(
    out_price, 
    in_best_fast_windows, 
    in_best_slow_windows, 
    direction="both", 
    freq="d"
)

Extract the best in-sample Sharpe ratios

In [ ]:
in_sample_best = in_sharpe[in_best_index].values

Extract the out-sample Sharpe ratios

In [ ]:
out_sample_test = out_test_sharpe.values

Perform a one-sided two-sample t-test of whether the out-of-sample Sharpe ratios are greater than the best in-sample Sharpe ratios

In [ ]:
t, p = stats.ttest_ind(
    a=out_sample_test,
    b=in_sample_best,
    alternative="greater"
)
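By default `stats.ttest_ind` assumes equal variances in both samples. The statistic it reports can be reproduced by hand, as in the sketch below; the helper `pooled_t_statistic` and the toy Sharpe ratios are made up for illustration.

```python
import math
import statistics


def pooled_t_statistic(a, b):
    """Hypothetical helper: Student's two-sample t-statistic (equal variances)."""
    na, nb = len(a), len(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    # Pool the two sample variances, weighted by their degrees of freedom.
    pooled_var = ((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2)
    standard_error = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


toy_out = [0.2, 0.5, 0.8]  # made-up out-of-sample Sharpe ratios
toy_in = [1.2, 1.5, 1.8]   # made-up best in-sample Sharpe ratios
t_stat = pooled_t_statistic(toy_out, toy_in)
print(t_stat)
```

With `alternative="greater"`, only a large positive statistic gives a small p-value. A negative statistic, as in this toy case where the out-of-sample values are lower, always gives a p-value above 0.5.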

Check whether the p-value exceeds 0.05. A result of True means the test does not reject the null hypothesis at the 5% significance level

In [ ]:
p > 0.05

Print the t-statistic and p-value

In [ ]:
t, p

Plot the out-of-sample performance

In [ ]:
# Plot the out-of-sample Sharpe ratio of each split's selected window pair
out_test_sharpe.plot()

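Compute the Sharpe ratio of every window pair on the out-of-sample data, then create a DataFrame to store the per-split median cross-validation results

In [ ]:
# Sharpe ratios of all window pairs on the out-of-sample ranges
# (assumed to mirror the in-sample call above)
out_sharpe = simulate_all_params(out_price, windows, direction="both", freq="d")

cv_results_df = pd.DataFrame({
    'in_sample_median': in_sharpe.groupby('split_idx').median().values,
    'out_sample_median': out_sharpe.groupby('split_idx').median().values,
})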

Plot the cross-validation results

In [ ]:
color_schema = vbt.settings['plotting']['color_schema']
In [ ]:
cv_results_df.vbt.plot(
    trace_kwargs=[
        # One style per column: in-sample median, then out-of-sample median
        dict(line_color=color_schema['blue'], line_dash='dash'),
        dict(line_color=color_schema['orange'], line_dash='dash'),
    ]
)

PyQuant News is where finance practitioners level up with Python for quant finance, algorithmic trading, and market data analysis. Looking to get started? Check out the fastest growing, top-selling course to get started with Python for quant finance. For educational purposes. Not investment advice. Use at your own risk.