import vectorbtpro as vbt
import numpy as np
import pandas as pd
from itertools import product
vbt.settings.set_theme('dark')
Task / Learning¶
In this exercise, we want to create and understand a 3-dimensional array in order to make a Volume plot in VBT.
Cf. https://stackoverflow.com/a/63748235
Cf. https://vectorbt.pro/api/generic/plotting/
x = np.zeros((2, 3, 4)) simply means: 2 sets, 3 rows per set, 4 columns per row.

Example:

Input

x = np.zeros((2, 3, 4))

Output

array([[[0., 0., 0., 0.],    <- Set 1, Row 1
        [0., 0., 0., 0.],    <- Set 1, Row 2
        [0., 0., 0., 0.]],   <- Set 1, Row 3

       [[0., 0., 0., 0.],    <- Set 2, Row 1
        [0., 0., 0., 0.],    <- Set 2, Row 2
        [0., 0., 0., 0.]]])  <- Set 2, Row 3
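To verify the shape convention hands-on, here is a minimal runnable sketch (independent of the backtest below):

x = np.zeros((2, 3, 4))
print(x.shape)      # (2, 3, 4) -> 2 sets, 3 rows, 4 columns
print(x[0].shape)   # (3, 4)    -> one set is a 3x4 matrix
print(x[0, 1])      # [0. 0. 0. 0.] -> row 2 of set 1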
Execution¶
Prepare sample backtest¶
Import some real data
start = '2021-01-01 UTC'  # crypto is in UTC
end = '2021-12-31 UTC'
timeframe = '1h'
cols = ['Open', 'High', 'Low', 'Close', 'Volume']
ohlcv = vbt.BinanceData.fetch('BTCUSDT', start=start, end=end, timeframe=timeframe, limit=100000).get(cols)
A basic function to test our parameters
def test_entry(close=ohlcv['Close'], exit_shift=1, fast_window=9, slow_window=50, wait=0):
    # moving averages for the crossover
    fast_ma = vbt.MA.run(close=close, window=fast_window)
    slow_ma = vbt.MA.run(close=close, window=slow_window)
    # entry: fast MA crosses above slow MA
    entries = fast_ma.ma_crossed_above(slow_ma, wait=wait)
    # exit: the entry signal shifted exit_shift bars forward
    exits = entries.shift(exit_shift).astype(bool)
    pf = vbt.Portfolio.from_signals(
        close=close,
        entries=entries,
        exits=exits,
        size=100,
        size_type='value',
        init_cash='auto')
    return pf.stats([
        'total_return',
        'win_rate',
        'profit_factor',
        'max_dd',
        'total_trades'
    ])
Define the ranges of our three variables (shift, slow ma, fast ma)
exit_shift = range(5, 20)  # in our 3d plot, this will equal the x axis (15 values)
slow_ma = range(40, 60)    # in our 3d plot, this will equal the y axis (20 values)
fast_ma = range(7, 27)     # in our 3d plot, this will equal the z axis (20 values)
.. BTW: see the test_entry function above regarding the exit_shift variable.
- All it does is "shift" the entry signals x timestamps forward to generate the signal for exiting the trade.
This technique is especially helpful for inspecting the reliability and robustness of your entry logic without mixing it up with exit indicators (want to know whether your profit comes from the entry or the exit?). In other words, it gives you an idea of whether, and how frequently, a move in the anticipated direction occurs after the entry signal has happened.
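Here is a minimal sketch of the idea on toy data (independent of the backtest; the extra fillna(False) just keeps the leading NaN bars from turning into exits):

toy_entries = pd.Series([False, True, False, False, False, True, False, False])
toy_exits = toy_entries.shift(2).fillna(False).astype(bool)  # exit 2 bars after each entry
print(pd.DataFrame({'entries': toy_entries, 'exits': toy_exits}))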
Let's run our backtest now!
th_combs = list(product(exit_shift, slow_ma, fast_ma))
comb_stats = [
    test_entry(exit_shift=exit_shift, slow_window=slow_ma, fast_window=fast_ma)
    for exit_shift, slow_ma, fast_ma in th_combs
]
Create a DataFrame from our results and label it accordingly
comb_stats_df = pd.DataFrame(comb_stats)
comb_stats_df.index = pd.MultiIndex.from_tuples(
    th_combs, names=['exit_shift', 'slow_ma', 'fast_ma'])
comb_stats_df
| exit_shift | slow_ma | fast_ma | Total Return [%] | Win Rate [%] | Profit Factor | Max Drawdown [%] | Total Trades |
|---|---|---|---|---|---|---|---|
| 5 | 40 | 7 | 28.620706 | 52.727273 | 1.289403 | 12.039517 | 165 |
| | | 8 | 12.865810 | 51.592357 | 1.128907 | 14.661135 | 157 |
| | | 9 | 8.272885 | 51.333333 | 1.082172 | 15.812769 | 150 |
| | | 10 | 13.276195 | 48.591549 | 1.124026 | 17.905736 | 142 |
| | | 11 | 9.227458 | 48.905109 | 1.099671 | 22.467860 | 138 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 19 | 59 | 22 | 17.943975 | 59.740260 | 1.195631 | 35.704668 | 77 |
| | | 23 | -2.045321 | 54.545455 | 0.977928 | 41.559362 | 77 |
| | | 24 | 16.698445 | 54.545455 | 1.177064 | 32.215622 | 77 |
| | | 25 | 26.545968 | 57.142857 | 1.280304 | 31.915274 | 77 |
| | | 26 | 41.837836 | 57.894737 | 1.456534 | 29.457537 | 76 |
6000 rows × 5 columns
Prepare DF for conversion into 3D-array¶
Since we can only display one metric, let's get rid of all columns except for 'Total Return [%]'
clean_df = comb_stats_df.drop(['Win Rate [%]', 'Profit Factor', 'Max Drawdown [%]', 'Total Trades'], axis=1)
... nearly the same could be achieved via clean_df = comb_stats_df[['Total Return [%]']] (note the double brackets: single brackets would return a Series instead of a DataFrame)
to get an idea of how to interact with the new DF, try for instance:
clean_df.index.names
clean_df.index.get_level_values(2)
Int64Index([ 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
...
17, 18, 19, 20, 21, 22, 23, 24, 25, 26],
dtype='int64', name='fast_ma', length=6000)
.. and extract the pure array data
array_data = clean_df['Total Return [%]']
That's it - we're ready for conversion.
3D Conversion¶
We can now create a 3D array by using .to_numpy().
However, we also need .reshape(). reshape() requires us to define how the array data should be allocated to x, y, and z. In our setup,
- x equals the shift
- y equals the slow ma
- z equals the fast ma
Let's do it:
three_d_array = clean_df['Total Return [%]'].to_numpy().reshape(15, 20, 20)
Look at the parameter ranges defined above to see where these numbers (15, 20, 20) come from.
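As a quick sanity check (a sketch, assuming the MultiIndex still follows the original product() order, which we never re-sorted): NumPy's default C-order reshape varies the last axis fastest, i.e. fast_ma, then slow_ma, then exit_shift - exactly the order in which product() generated the combinations.

i, j, k = 3, 5, 7  # arbitrary positions within the 15 x 20 x 20 grid
params = (list(exit_shift)[i], list(slow_ma)[j], list(fast_ma)[k])
assert three_d_array[i, j, k] == clean_df['Total Return [%]'].loc[params]
print(params, three_d_array[i, j, k])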
Volume Plotting¶
To assign the values to the respective axes, we need to define input lists.
exit_shift_list = list(exit_shift)
slow_ma_list = list(slow_ma)
fast_ma_list = list(fast_ma)
.. and we are ready to go:
volume = vbt.Volume(
    data=three_d_array,          # our 3-dim array, which we created above
    x_labels=exit_shift_list,    # the lists that we created from our parameter ranges to label the values
    y_labels=slow_ma_list,
    z_labels=fast_ma_list,
    width=800,
    height=800,
    trace_kwargs=dict(colorscale="icefire", cmid=0),
    ## Cf. https://plotly.com/python/builtin-colorscales/
    ## cmid=0 divides the color scale between positive and negative results
    scene=dict(
        xaxis=dict(
            title='Shift',
            showaxeslabels=True,
        ),
        yaxis=dict(
            title='Slow MA',
            showaxeslabels=True,
        ),
        zaxis=dict(
            title='Fast MA',
            # autorange=True,
            showaxeslabels=True,
        ),
    ),
)
volume.fig.show()
Example for further analysis¶
The lower region does not look too bad for our simple strategy. But we might want to consider other metrics as well. Let's sort the dataframe by max drawdown.
drawdown_df = comb_stats_df.sort_values(by=['Max Drawdown [%]'], ascending=True)
Maybe we want to use leverage and are thus interested in strategies that do not exceed a 15% DD?
rslt_df = drawdown_df.loc[drawdown_df['Max Drawdown [%]'] < 15.0]
.. and choose the best-performing ones? According to profit factor?
rslt_df_pf = rslt_df.sort_values(by=['Profit Factor'], ascending=False)
pd.set_option('display.max_rows', 20)  # using display.max_rows you can extend the dataframe display!
rslt_df_pf
| exit_shift | slow_ma | fast_ma | Total Return [%] | Win Rate [%] | Profit Factor | Max Drawdown [%] | Total Trades |
|---|---|---|---|---|---|---|---|
| 11 | 53 | 11 | 63.973228 | 55.000000 | 1.973216 | 9.656734 | 101 |
| 8 | 59 | 9 | 55.996476 | 60.396040 | 1.956220 | 10.207960 | 101 |
| | 53 | 11 | 51.522183 | 59.000000 | 1.949297 | 10.397894 | 101 |
| 9 | 54 | 11 | 55.580395 | 53.061224 | 1.943820 | 9.979441 | 99 |
| 8 | 58 | 10 | 54.401509 | 59.405941 | 1.938970 | 7.701408 | 101 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 5 | 51 | 13 | -3.241266 | 44.444444 | 0.944172 | 13.009315 | 100 |
| | 50 | 23 | -3.665415 | 46.938776 | 0.926288 | 14.002172 | 98 |
| | | 13 | -5.637548 | 47.000000 | 0.903764 | 13.811262 | 101 |
| | 51 | 22 | -6.530703 | 48.979592 | 0.868143 | 11.870502 | 98 |
| 6 | 51 | 22 | -6.985558 | 46.938776 | 0.866175 | 14.173559 | 98 |
909 rows × 5 columns
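As a side note, the same filter-and-sort steps could also be written as one chained expression (a sketch; the variable name rslt_df_pf_alt is just for illustration):

rslt_df_pf_alt = (
    comb_stats_df
    .loc[comb_stats_df['Max Drawdown [%]'] < 15.0]
    .sort_values('Profit Factor', ascending=False)
)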
Or better in terms of total return?
rslt_df_pf_2 = rslt_df_pf.sort_values(by=['Total Return [%]'], ascending=False)
rslt_df_pf_2
| exit_shift | slow_ma | fast_ma | Total Return [%] | Win Rate [%] | Profit Factor | Max Drawdown [%] | Total Trades |
|---|---|---|---|---|---|---|---|
| 12 | 51 | 7 | 75.931181 | 57.627119 | 1.817858 | 12.950735 | 119 |
| 11 | 57 | 8 | 69.868964 | 56.310680 | 1.917144 | 9.424025 | 103 |
| | 51 | 7 | 69.634861 | 53.781513 | 1.769494 | 11.517904 | 120 |
| | 57 | 7 | 67.116657 | 56.190476 | 1.860671 | 9.742177 | 105 |
| | 59 | 7 | 66.624758 | 55.339806 | 1.861782 | 7.539600 | 103 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 5 | 51 | 13 | -3.241266 | 44.444444 | 0.944172 | 13.009315 | 100 |
| | 50 | 23 | -3.665415 | 46.938776 | 0.926288 | 14.002172 | 98 |
| | | 13 | -5.637548 | 47.000000 | 0.903764 | 13.811262 | 101 |
| | 51 | 22 | -6.530703 | 48.979592 | 0.868143 | 11.870502 | 98 |
| 6 | 51 | 22 | -6.985558 | 46.938776 | 0.866175 | 14.173559 | 98 |
909 rows × 5 columns
One of the top results looks promising, both in terms of total return and max DD. This is
- exit_shift: 10
- slow_ma: 59
- fast_ma: 7
Example Cross-verification¶
But is this reliable? Or are we just overfitting? Let's find out!
Grab some data for another time range and add another symbol
start = '2016-01-01 UTC'  # crypto is in UTC
end = '2020-12-31 UTC'
timeframe = '1h'
cols = ['Open', 'High', 'Low', 'Close', 'Volume']
data = vbt.BinanceData.fetch(['BTCUSDT', 'ETHUSDT'], start=start, end=end, timeframe=timeframe, limit=1000000)
vectorbtpro/data/base.py:538: UserWarning: Symbols have mismatching index. Setting missing data points to NaN.
BTC_Close = data.data['BTCUSDT']['Close']
ETH_Close = data.data['ETHUSDT']['Close']
close = data.get('Close')  # this applies for both symbols!
Let's run our best combination!
fast_ma = vbt.MA.run(close=close, window=7)   # value from our best test
slow_ma = vbt.MA.run(close=close, window=59)  # value from our best test
entries = fast_ma.ma_crossed_above(slow_ma, wait=0)
exits = entries.shift(10).astype(bool)  # shift value from our best test
pf = vbt.Portfolio.from_signals(
    close=close,
    entries=entries,
    exits=exits,
    # slippage=0.00055,
    sl_stop=0.0075,
    # fees=0.0006,
    size=1000,
    size_type='value',
    init_cash='auto'
)
pf.stats()
vectorbtpro/generic/stats_builder.py:461: UserWarning: Metric 'sharpe_ratio' requires frequency to be set
vectorbtpro/generic/stats_builder.py:461: UserWarning: Metric 'calmar_ratio' requires frequency to be set
vectorbtpro/generic/stats_builder.py:461: UserWarning: Metric 'omega_ratio' requires frequency to be set
vectorbtpro/generic/stats_builder.py:461: UserWarning: Metric 'sortino_ratio' requires frequency to be set
vectorbtpro/base/wrapping.py:1285: UserWarning: Couldn't parse the frequency of index. Pass it as `freq` or define it globally under `settings.wrapping`.
UserWarning: Object has multiple columns. Aggregated some metrics using <function mean at 0x7f32e0069c10>. Pass column to select a single column/group.
Start 2017-08-17 04:00:00+00:00
End 2020-12-30 23:00:00+00:00
Period 29494
Start Value 1010.722202
Min Value 997.435799
...
Avg Losing Trade [%] -0.643434
Avg Winning Trade Duration 9.952336
Avg Losing Trade Duration 5.340928
Profit Factor 1.978824
Expectancy 3.704043
Name: agg_stats, Length: 25, dtype: object
btc_stats = pf.stats(column=0)
eth_stats = pf.stats(column=1)
vectorbtpro/generic/stats_builder.py:461: UserWarning: Metric 'sharpe_ratio' requires frequency to be set
vectorbtpro/generic/stats_builder.py:461: UserWarning: Metric 'calmar_ratio' requires frequency to be set
vectorbtpro/generic/stats_builder.py:461: UserWarning: Metric 'omega_ratio' requires frequency to be set
vectorbtpro/generic/stats_builder.py:461: UserWarning: Metric 'sortino_ratio' requires frequency to be set
btc_stats
Start 2017-08-17 04:00:00+00:00
End 2020-12-30 23:00:00+00:00
Period 29494
Start Value 1013.944405
Min Value 999.281598
...
Avg Losing Trade [%] -0.63482
Avg Winning Trade Duration 10.025478
Avg Losing Trade Duration 5.718894
Profit Factor 1.787723
Expectancy 2.893693
Name: 0, Length: 25, dtype: object
Doesn't look too bad! We don't beat the benchmark - however, we are only invested around 9.5 percent of the time and, more strikingly, we yielded a positive return on this data sample as well!
eth_stats
Start 2017-08-17 04:00:00+00:00
End 2020-12-30 23:00:00+00:00
Period 29494
Start Value 1007.5
Min Value 995.59
...
Avg Losing Trade [%] -0.652049
Avg Winning Trade Duration 9.879195
Avg Losing Trade Duration 4.962963
Profit Factor 2.169926
Expectancy 4.514393
Name: 1, Length: 25, dtype: object
.. and look at this! It works for ETH as well. In this case, we even outperformed the benchmark return - while being invested only 8.6 percent of the time.
.. and remember: this is just a simple crossover strategy with fixed exit after 10 bars.
Nice feature: Let's look at a heatmap to see how the returns are distributed and to spot correlations.
pf.returns_acc.resample("M").ts_heatmap()
(Output: monthly returns heatmap as a Plotly FigureWidget)