strategy-lab/to_explore/3d_array_and_volume_plot_v1.1-8BA47.ipynb
David Brazda e3da60c647 daily update
2024-10-21 20:57:56 +02:00

482 KiB

In [1]:
import vectorbtpro as vbt
import numpy as np
import pandas as pd
from itertools import product
In [2]:
vbt.settings.set_theme('dark')

Task / Learning

In this exercise, we create and inspect a 3-dimensional array and use it to make a volume plot in VBT.

Cf. https://stackoverflow.com/a/63748235
Cf. https://vectorbt.pro/api/generic/plotting/

x = np.zeros((2,3,4)) simply means: 2 sets, 3 rows per set, 4 columns.

Example:

Input

x = np.zeros((2,3,4))

Output

[[[ 0., 0., 0., 0.],     ---- Set 1, Row 1
  [ 0., 0., 0., 0.],     ---- Set 1, Row 2
  [ 0., 0., 0., 0.]],    ---- Set 1, Row 3

 [[ 0., 0., 0., 0.],     ---- Set 2, Row 1
  [ 0., 0., 0., 0.],     ---- Set 2, Row 2
  [ 0., 0., 0., 0.]]]    ---- Set 2, Row 3
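
To make the "sets, rows, columns" reading concrete, here is a minimal sketch using only NumPy. It creates the same array and addresses a single cell by its three zero-based indices.

```python
import numpy as np

x = np.zeros((2, 3, 4))   # 2 sets, 3 rows per set, 4 columns

print(x.shape)            # shape of the whole array
print(x[0].shape)         # one set is a 3 x 4 matrix

# Indices are zero-based: this writes to set 2, row 3, column 4.
x[1, 2, 3] = 7.0
print(x[1])
```

The first index selects the set, the second the row, and the third the column. This is the same convention the volume plot later uses for its x, y, and z axes.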

Execution

Prepare sample backtest

Import Some real data

In [3]:
start = '2021-01-01 UTC'  # crypto is in UTC
end = '2021-12-31 UTC'
timeframe = '1h'
cols = ['Open', 'High', 'Low', 'Close', 'Volume']

ohlcv = vbt.BinanceData.fetch('BTCUSDT', start=start, end=end, timeframe=timeframe, limit=100000).get(cols)

Basic Function to test our parameters

In [4]:
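def test_entry(close=ohlcv['Close'], exit_shift=1, fast_window=9, slow_window=50, wait=0):
    # Use the `close` argument so the indicators and the portfolio see the same series
    fast_ma = vbt.MA.run(close=close, window=fast_window)
    slow_ma = vbt.MA.run(close=close, window=slow_window)
    entries = fast_ma.ma_crossed_above(slow_ma, wait=wait)
    # Exit `exit_shift` bars after each entry; fill_value=False keeps the leading
    # rows False (a plain shift inserts NaN, which astype(bool) turns into True)
    exits = entries.shift(exit_shift, fill_value=False).astype(bool)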
    pf = vbt.Portfolio.from_signals(
        close=close, 
        entries=entries, 
        exits=exits,
        size=100,
        size_type='value',
        init_cash='auto')
    return pf.stats([
        'total_return', 
        'win_rate', 
        'profit_factor',
        'max_dd',
        'total_trades'
        ])

Define the ranges of our three variables (shift, slow ma, fast ma)

In [5]:
exit_shift = range(5,20)            # in our 3d plot, this will equal the x axis (15 values)
slow_ma = range(40,60)              # in our 3d plot, this will equal the y axis (20 values)
fast_ma = range(7,27)               # in our 3d plot, this will equal the z axis (20 values)

A note on the exit_shift variable from cell 5 above:

  • All it does is shift the entry signals forward by a given number of bars to generate the signal for exiting the trade.

This technique is especially helpful for inspecting the reliability and robustness of your entry logic without mixing it up with exit indicators (does your profit come from the entry or from the exit?). In other words, it gives you an idea of whether, and how frequently, a move in the anticipated direction occurs after the entry signal.
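
The shifting idea can be illustrated without VBT. The sketch below uses a small, made-up boolean entry series (the values and the hourly index are illustrative assumptions) and compares a plain shift with one that uses fill_value=False.

```python
import pandas as pd

# Hypothetical entry signals on an hourly index (illustrative values only).
idx = pd.date_range("2021-01-01", periods=8, freq=pd.Timedelta(hours=1))
entries = pd.Series([False, True, False, False, False, True, False, False], index=idx)

exit_shift = 2

# A plain shift pads the first rows with NaN, and NaN casts to True.
naive_exits = entries.shift(exit_shift).astype(bool)

# fill_value=False pads with False instead, so no spurious exits appear.
exits = entries.shift(exit_shift, fill_value=False)

print(pd.DataFrame({"entries": entries, "naive_exits": naive_exits, "exits": exits}))
```

Each exit lands exactly exit_shift bars after its entry. The only difference between the two variants is in the first exit_shift rows.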

Let's run our backtest now!

In [6]:
th_combs = list(product(exit_shift, slow_ma, fast_ma))

comb_stats = [
    test_entry(exit_shift=exit_shift, slow_window=slow_ma, fast_window=fast_ma)
    for exit_shift, slow_ma, fast_ma in th_combs
    ]  

Create a Dataframe from our results and label it accordingly

In [7]:
comb_stats_df = pd.DataFrame(comb_stats)

comb_stats_df.index = pd.MultiIndex.from_tuples(
    th_combs, 
    names=['exit_shift', 'slow_ma', 'fast_ma'])
In [8]:
comb_stats_df
Out[8]:
                            Total Return [%]  Win Rate [%]  Profit Factor  Max Drawdown [%]  Total Trades
exit_shift slow_ma fast_ma
5          40      7               28.620706     52.727273       1.289403         12.039517           165
                   8               12.865810     51.592357       1.128907         14.661135           157
                   9                8.272885     51.333333       1.082172         15.812769           150
                   10              13.276195     48.591549       1.124026         17.905736           142
                   11               9.227458     48.905109       1.099671         22.467860           138
...        ...     ...                   ...           ...            ...               ...           ...
19         59      22              17.943975     59.740260       1.195631         35.704668            77
                   23              -2.045321     54.545455       0.977928         41.559362            77
                   24              16.698445     54.545455       1.177064         32.215622            77
                   25              26.545968     57.142857       1.280304         31.915274            77
                   26              41.837836     57.894737       1.456534         29.457537            76

6000 rows × 5 columns

Prepare DF for conversion into 3D-array

Since we can only display one metric, let's get rid of all columns except for 'Total Return [%]'.

In [9]:
clean_df = comb_stats_df.drop(['Win Rate [%]', 'Profit Factor', 'Max Drawdown [%]', 'Total Trades'], axis=1)

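... the same could be achieved via clean_df = comb_stats_df[['Total Return [%]']]. Note the double brackets: they keep the result a DataFrame, whereas single brackets would return a Series.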
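
The difference between the two selection styles is easy to check on a toy frame. The following sketch uses a made-up two-column DataFrame (not the backtest results) to show which selection yields a DataFrame and which yields a Series.

```python
import pandas as pd

# Toy stand-in for comb_stats_df (illustrative values only).
df = pd.DataFrame({"Total Return [%]": [1.0, 2.0], "Win Rate [%]": [50.0, 60.0]})

as_frame = df[["Total Return [%]"]]          # list of labels -> DataFrame
as_series = df["Total Return [%]"]           # single label   -> Series
dropped = df.drop(["Win Rate [%]"], axis=1)  # the approach used in cell 9

print(type(as_frame).__name__, type(as_series).__name__)
print(as_frame.equals(dropped))
```

Either form works for the later conversion, as long as the following cells treat the object consistently as a DataFrame or as a Series.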

To get an idea of how to interact with the new DF, try for instance:

In [10]:
clean_df.index.names
clean_df.index.get_level_values(2)
Out[10]:
Int64Index([ 7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
            ...
            17, 18, 19, 20, 21, 22, 23, 24, 25, 26],
           dtype='int64', name='fast_ma', length=6000)

.. and extract the pure array data

In [11]:
array_data = clean_df['Total Return [%]']

That's it, we're ready for conversion.

3D Conversion

We can now create a 3D array by using .to_numpy().

The result is still one-dimensional, however, so we also need .reshape(). It requires us to define how the array_data should be distributed over x, y, and z. In our setup,

  • x equals the shift
  • y equals the slow ma
  • z equals the fast ma

Let's do it:

In [12]:
three_d_array = clean_df['Total Return [%]'].to_numpy().reshape(15, 20, 20)

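See cell 5 above for where these numbers (15, 20, 20) come from: they are the lengths of the exit_shift, slow_ma, and fast_ma ranges. The order matters: the rows were generated by product(exit_shift, slow_ma, fast_ma), so fast_ma varies fastest, which matches NumPy's default row-major reshape.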
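
To check that the reshape puts every value on the intended axis, here is a small sketch with shortened, hypothetical parameter ranges and a fake metric that encodes its own parameters. It derives the shape from the range lengths instead of hard-coding it.

```python
from itertools import product

import pandas as pd

# Shortened stand-ins for the three parameter ranges (hypothetical values).
exit_shift = range(5, 7)    # 2 values -> axis 0 (x)
slow_ma = range(40, 43)     # 3 values -> axis 1 (y)
fast_ma = range(7, 11)      # 4 values -> axis 2 (z)

combs = list(product(exit_shift, slow_ma, fast_ma))
index = pd.MultiIndex.from_tuples(combs, names=["exit_shift", "slow_ma", "fast_ma"])

# Fake metric: each value encodes the parameters that produced it.
metric = pd.Series([100.0 * e + 10.0 * s + f for e, s, f in combs], index=index)

# Derive the shape from the ranges rather than typing (2, 3, 4) by hand.
shape = (len(exit_shift), len(slow_ma), len(fast_ma))
cube = metric.to_numpy().reshape(shape)

# cube[i, j, k] belongs to (exit_shift[i], slow_ma[j], fast_ma[k]).
print(cube.shape)
print(cube[1, 2, 3], metric.loc[(6, 42, 10)])
```

This only works because the row order of the Series matches the product order. If the DataFrame were sorted by a metric first, the reshape would silently scramble the axes.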

Volume Plotting

To assign the values to the respective axes, we need to define input lists.

In [13]:
exit_shift_list = list(exit_shift)
slow_ma_list = list(slow_ma)
fast_ma_list = list(fast_ma)

.. and we are ready to go:

In [40]:
volume = vbt.Volume(
    data=three_d_array, # our 3dim-array, which we created above
    x_labels=exit_shift_list, # the lists that we created from our parameter ranges to label the values
    y_labels=slow_ma_list,
    z_labels=fast_ma_list,
    width=800, 
    height=800,
    trace_kwargs=dict(colorscale="icefire", cmid=0), 
    ## Cf. https://plotly.com/python/builtin-colorscales/
    ## cmid=0 centers the color scale at zero, dividing it between positive and negative results
    scene=dict(
      xaxis=dict(
        title='Shift',
        showaxeslabels=True,
      ),
      yaxis=dict(
        title='Slow MA',
        showaxeslabels=True,
      ),
      zaxis=dict(
        title='Fast MA',
        #autorange=True,
        showaxeslabels=True,
      ),
    ),
)
volume.fig.show()

Example for further analysis

The lower region does not look too bad for our simple strategy. But we might want to consider other metrics as well. Let's sort the dataframe by max drawdown.

In [15]:
drawdown_df = comb_stats_df.sort_values(by=['Max Drawdown [%]'], ascending=True)

Maybe we want to use leverage and are thus interested in strategies that do not exceed a 15% DD?

In [16]:
rslt_df = drawdown_df.loc[drawdown_df['Max Drawdown [%]'] < 15.0] 

.. and choose the best performing of these? Say, according to profit factor?

In [17]:
rslt_df_pf = rslt_df.sort_values(by=['Profit Factor'], ascending=False)
In [18]:
pd.set_option('display.max_rows', 20) ## using display.max_rows you can extend the dataframe display!
rslt_df_pf
Out[18]:
                            Total Return [%]  Win Rate [%]  Profit Factor  Max Drawdown [%]  Total Trades
exit_shift slow_ma fast_ma
11         53      11              63.973228     55.000000       1.973216          9.656734           101
8          59      9               55.996476     60.396040       1.956220         10.207960           101
           53      11              51.522183     59.000000       1.949297         10.397894           101
9          54      11              55.580395     53.061224       1.943820          9.979441            99
8          58      10              54.401509     59.405941       1.938970          7.701408           101
...        ...     ...                   ...           ...            ...               ...           ...
5          51      13              -3.241266     44.444444       0.944172         13.009315           100
           50      23              -3.665415     46.938776       0.926288         14.002172            98
                   13              -5.637548     47.000000       0.903764         13.811262           101
           51      22              -6.530703     48.979592       0.868143         11.870502            98
6          51      22              -6.985558     46.938776       0.866175         14.173559            98

909 rows × 5 columns
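
The filter-then-rank steps above can be sketched on a small synthetic table. The numbers below are made up for illustration and are not backtest results; only the column names and the 15% drawdown threshold come from the notebook.

```python
import pandas as pd

# Synthetic stats table with the same index layout (illustrative values only).
stats = pd.DataFrame(
    {
        "Total Return [%]": [12.0, 30.0, -4.0, 22.0],
        "Profit Factor": [1.10, 1.45, 0.95, 1.60],
        "Max Drawdown [%]": [9.0, 21.0, 12.0, 14.0],
    },
    index=pd.MultiIndex.from_tuples(
        [(5, 40, 7), (5, 40, 8), (6, 41, 7), (6, 41, 8)],
        names=["exit_shift", "slow_ma", "fast_ma"],
    ),
)

# Keep only rows below the drawdown limit, then rank by profit factor.
shortlist = (
    stats.loc[stats["Max Drawdown [%]"] < 15.0]
    .sort_values("Profit Factor", ascending=False)
)

best = shortlist.index[0]   # (exit_shift, slow_ma, fast_ma) of the top row
print(shortlist)
print(best)
```

Because the parameters live in the MultiIndex, the top row's index tuple can be unpacked directly into the three values needed for a follow-up run.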

Or better in terms of total return?

In [19]:
rslt_df_pf_2 = rslt_df_pf.sort_values(by=['Total Return [%]'], ascending=False)
In [20]:
rslt_df_pf_2
Out[20]:
                            Total Return [%]  Win Rate [%]  Profit Factor  Max Drawdown [%]  Total Trades
exit_shift slow_ma fast_ma
12         51      7               75.931181     57.627119       1.817858         12.950735           119
11         57      8               69.868964     56.310680       1.917144          9.424025           103
           51      7               69.634861     53.781513       1.769494         11.517904           120
           57      7               67.116657     56.190476       1.860671          9.742177           105
           59      7               66.624758     55.339806       1.861782          7.539600           103
...        ...     ...                   ...           ...            ...               ...           ...
5          51      13              -3.241266     44.444444       0.944172         13.009315           100
           50      23              -3.665415     46.938776       0.926288         14.002172            98
                   13              -5.637548     47.000000       0.903764         13.811262           101
           51      22              -6.530703     48.979592       0.868143         11.870502            98
6          51      22              -6.985558     46.938776       0.866175         14.173559            98

909 rows × 5 columns

The top results look promising, both in terms of total return and max DD. The combination we carry forward is

  • exit_shift = 10
  • slow_ma = 59
  • fast_ma = 7

Example Cross-verification

But is this reliable? Or are we just overfitting? Let's find out!

Grab some data for another time frame and add another symbol

In [21]:
start = '2016-01-01 UTC'  # crypto is in UTC
end = '2020-12-31 UTC'
timeframe = '1h'
cols = ['Open', 'High', 'Low', 'Close', 'Volume']

data = vbt.BinanceData.fetch(['BTCUSDT', 'ETHUSDT'], start=start, end=end, timeframe=timeframe, limit=1000000)
/home/pdr0906pwr0407/anaconda3/lib/python3.9/site-packages/vectorbtpro/data/base.py:538: UserWarning:

Symbols have mismatching index. Setting missing data points to NaN.

In [22]:
BTC_Close = data.data['BTCUSDT']['Close']
ETH_Close = data.data['ETHUSDT']['Close']
close = data.get('Close') # this applies for both symbols!

Let's run our best combination!

In [23]:
fast_ma = vbt.MA.run(close=close, window=7) # value from our best test
slow_ma = vbt.MA.run(close=close, window=59) # value from our best test

entries = fast_ma.ma_crossed_above(slow_ma, wait=0)

exits = entries.shift(10, fill_value=False).astype(bool) # shift value from our best test; fill_value=False avoids NaN -> True in the first rows

pf = vbt.Portfolio.from_signals(
    close=close, 
    entries=entries, 
    exits=exits,
    #slippage=0.00055,
    sl_stop=0.0075,
    #fees=0.0006,
    size=1000,
    size_type='value',
    init_cash='auto'
)
In [24]:
pf.stats()
/home/pdr0906pwr0407/anaconda3/lib/python3.9/site-packages/vectorbtpro/generic/stats_builder.py:461: UserWarning:

Metric 'sharpe_ratio' requires frequency to be set

/home/pdr0906pwr0407/anaconda3/lib/python3.9/site-packages/vectorbtpro/generic/stats_builder.py:461: UserWarning:

Metric 'calmar_ratio' requires frequency to be set

/home/pdr0906pwr0407/anaconda3/lib/python3.9/site-packages/vectorbtpro/generic/stats_builder.py:461: UserWarning:

Metric 'omega_ratio' requires frequency to be set

/home/pdr0906pwr0407/anaconda3/lib/python3.9/site-packages/vectorbtpro/generic/stats_builder.py:461: UserWarning:

Metric 'sortino_ratio' requires frequency to be set

/home/pdr0906pwr0407/anaconda3/lib/python3.9/site-packages/vectorbtpro/base/wrapping.py:1285: UserWarning:

Couldn't parse the frequency of index. Pass it as `freq` or define it globally under `settings.wrapping`.

/tmp/ipykernel_116506/3705677322.py:1: UserWarning:

Object has multiple columns. Aggregated some metrics using <function mean at 0x7f32e0069c10>. Pass column to select a single column/group.

Out[24]:
Start                         2017-08-17 04:00:00+00:00
End                           2020-12-30 23:00:00+00:00
Period                                            29494
Start Value                                 1010.722202
Min Value                                    997.435799
                                        ...            
Avg Losing Trade [%]                          -0.643434
Avg Winning Trade Duration                     9.952336
Avg Losing Trade Duration                      5.340928
Profit Factor                                  1.978824
Expectancy                                     3.704043
Name: agg_stats, Length: 25, dtype: object
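
The warnings above say the index frequency could not be parsed and suggest passing it as `freq`. One plausible cause, which is an assumption and not confirmed by the notebook, is gaps in the hourly data. The pandas-only sketch below shows how a few missing bars are enough to make frequency inference fail.

```python
import pandas as pd

step = pd.Timedelta(hours=1)

# A perfectly regular hourly index, and a copy with three bars removed.
regular = pd.date_range("2020-01-01", periods=48, freq=step)
gappy = regular.delete([10, 11, 12])

print(pd.infer_freq(regular))   # an hourly alias (spelling depends on the pandas version)
print(pd.infer_freq(gappy))     # no frequency can be inferred
```

When inference fails like this, stating the bar size explicitly, as the warning text suggests, is what lets frequency-dependent metrics such as the Sharpe ratio be computed.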
In [25]:
btc_stats = pf.stats(column=0)
eth_stats = pf.stats(column=1)
In [26]:
btc_stats
Out[26]:
Start                         2017-08-17 04:00:00+00:00
End                           2020-12-30 23:00:00+00:00
Period                                            29494
Start Value                                 1013.944405
Min Value                                    999.281598
                                        ...            
Avg Losing Trade [%]                           -0.63482
Avg Winning Trade Duration                    10.025478
Avg Losing Trade Duration                      5.718894
Profit Factor                                  1.787723
Expectancy                                     2.893693
Name: 0, Length: 25, dtype: object

Doesn't look too bad! We don't beat the benchmark; however, we are invested only around 9.5 percent of the time and, more strikingly, we obtained a positive return on this data sample as well!

In [27]:
eth_stats
Out[27]:
Start                         2017-08-17 04:00:00+00:00
End                           2020-12-30 23:00:00+00:00
Period                                            29494
Start Value                                      1007.5
Min Value                                        995.59
                                        ...            
Avg Losing Trade [%]                          -0.652049
Avg Winning Trade Duration                     9.879195
Avg Losing Trade Duration                      4.962963
Profit Factor                                  2.169926
Expectancy                                     4.514393
Name: 1, Length: 25, dtype: object

.. and look at this! It works for ETH as well. In this case, we even outperformed the benchmark return - while being invested only 8.6 percent of the time.

.. and remember: this is just a simple crossover strategy with fixed exit after 10 bars.

Nice feature: Let's look at a heatmap to see how the returns are distributed and to spot correlations.

In [42]:
pf.returns_acc.resample("M").ts_heatmap()
FigureWidget({
    'data': [{'colorscale': [[0.0, '#0d0887'], [0.1111111111111111, '#46039f'],
               …
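
Independent of VBT, one common way to turn hourly returns into the monthly figures that such a heatmap displays is to compound them per calendar month. The sketch below uses random, made-up hourly returns. Whether `returns_acc.resample("M")` aggregates in exactly this way is an assumption, not something the notebook shows.

```python
import numpy as np
import pandas as pd

# Made-up hourly returns over 90 days (illustrative only).
rng = np.random.default_rng(0)
idx = pd.date_range("2020-01-01", periods=24 * 90, freq=pd.Timedelta(hours=1))
hourly = pd.Series(rng.normal(0.0, 0.002, len(idx)), index=idx)

# Compound within each calendar month: prod(1 + r) - 1.
months = hourly.index.to_period("M")
monthly = (1.0 + hourly).groupby(months).prod() - 1.0

print(monthly)
```

Compounding, rather than summing, keeps the monthly figures consistent with the total return over the whole period.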