strategy-lab/3d_array_and_volume_plot_v1.1-8BA47.ipynb at d718ed61bd41f290bcc40430414eb8867e2b579a

David Brazda e3da60c647 daily update

2024-10-21 20:57:56 +02:00

482 KiB

In [1]:

import vectorbtpro as vbt
import numpy as np
import pandas as pd
from itertools import product

In [2]:

vbt.settings.set_theme('dark')

Task / Learning¶

In this exercise, we build and inspect a 3-dimensional array and use it to draw a volume plot in vectorbt PRO (VBT).

Cf. https://stackoverflow.com/a/63748235
Cf. https://vectorbt.pro/api/generic/plotting/

x = np.zeros((2,3,4)) simply means: 2 sets, 3 rows per set, 4 columns.

Example:

Input

x = np.zeros((2,3,4))

Output

[[[ 0., 0., 0., 0.],     ---- Set 1, Row 1
  [ 0., 0., 0., 0.],     ---- Set 1, Row 2
  [ 0., 0., 0., 0.]],    ---- Set 1, Row 3

 [[ 0., 0., 0., 0.],     ---- Set 2, Row 1
  [ 0., 0., 0., 0.],     ---- Set 2, Row 2
  [ 0., 0., 0., 0.]]]    ---- Set 2, Row 3
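To make the "sets, rows, columns" reading concrete, here is a small sketch that is not part of the original notebook. It fills the same shape with distinct numbers (via np.arange, an illustrative choice) so that each index step is visible.

```python
import numpy as np

# Same shape as above, but filled with 0..23 so every cell is distinguishable.
x = np.arange(24).reshape(2, 3, 4)

print(x.shape)     # (2, 3, 4): 2 sets, 3 rows per set, 4 columns
print(x[1])        # the whole second set (a 3x4 block)
print(x[1, 2])     # third row of the second set
print(x[1, 2, 3])  # last column of that row: 23
```

The first index picks the set, the second the row, the third the column.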

Execution¶

Prepare sample backtest¶

Import some real data

In [3]:

start = '2021-01-01 UTC'  # crypto is in UTC
end = '2021-12-31 UTC'
timeframe = '1h'
cols = ['Open', 'High', 'Low', 'Close', 'Volume']

ohlcv = vbt.BinanceData.fetch('BTCUSDT', start=start, end=end, timeframe=timeframe, limit=100000).get(cols)

Basic Function to test our parameters

In [4]:

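def test_entry(close=ohlcv['Close'], exit_shift=1, fast_window=9, slow_window=50, wait=0):
    # Use the `close` argument throughout so the function also works on other price series
    fast_ma = vbt.MA.run(close=close, window=fast_window)
    slow_ma = vbt.MA.run(close=close, window=slow_window)
    entries = fast_ma.ma_crossed_above(slow_ma, wait=wait)
    # fill_value=False keeps the first rows from becoming NaN (which astype(bool) would read as True)
    exits = entries.shift(exit_shift, fill_value=False)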
    pf = vbt.Portfolio.from_signals(
        close=close, 
        entries=entries, 
        exits=exits,
        size=100,
        size_type='value',
        init_cash='auto')
    return pf.stats([
        'total_return', 
        'win_rate', 
        'profit_factor',
        'max_dd',
        'total_trades'
        ])

Define the ranges of our three variables (shift, slow ma, fast ma)

In [5]:

exit_shift = range(5,20)            # in our 3d plot, this will equal the x axis (15 values)
slow_ma = range(40,60)              # in our 3d plot, this will equal the y axis (20 values)
fast_ma = range(7,27)               # in our 3d plot, this will equal the z axis (20 values)

A note on the exit_shift variable defined in cell 5 above:

All it does is shift the entry signals forward by a fixed number of bars to generate the signal for exiting the trade.

This technique is especially helpful for inspecting the reliability and robustness of your entry logic without mixing it up with exit indicators (do you want to know whether your profit comes from the entry or from the exit?). In other words, it gives you an idea of whether, and how frequently, a move in the anticipated direction occurs after the entry signal.
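The shifting idea can be sketched on a toy signal series using plain pandas. The data below is hypothetical, and fill_value=False is a choice made for this illustration rather than something taken from the notebook.

```python
import pandas as pd

# Toy entry signals on 8 bars (hypothetical data, not from the backtest).
entries = pd.Series([False, True, False, False, False, True, False, False])

# Exit a fixed number of bars after each entry.
exit_shift = 2
exits = entries.shift(exit_shift, fill_value=False)

print(exits.tolist())
# Entries at bars 1 and 5 produce exits at bars 3 and 7.

# Pitfall: without fill_value, the first `exit_shift` rows become NaN,
# and NaN is truthy, so astype(bool) turns them into spurious exit signals.
naive = entries.shift(exit_shift).astype(bool)
print(naive.tolist())
```

In a long-only backtest the spurious leading exits are usually harmless because no position is open yet, but filling with False keeps the signal array clean.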

Let's run our backtest now!

In [6]:

th_combs = list(product(exit_shift, slow_ma, fast_ma))

# Loop variables get their own names so they are not confused with the ranges defined above
comb_stats = [
    test_entry(exit_shift=shift, slow_window=slow, fast_window=fast)
    for shift, slow, fast in th_combs
    ]
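The order in which product() emits the combinations matters later for the reshape step, so it is worth a quick look. This sketch uses only the standard library and the same three ranges as cell 5.

```python
from itertools import product

exit_shift = range(5, 20)   # 15 values
slow_ma = range(40, 60)     # 20 values
fast_ma = range(7, 27)      # 20 values

combos = list(product(exit_shift, slow_ma, fast_ma))

print(len(combos))   # 15 * 20 * 20 = 6000
print(combos[:3])    # [(5, 40, 7), (5, 40, 8), (5, 40, 9)]
print(combos[20])    # (5, 41, 7): fast_ma wrapped around, slow_ma advanced
print(combos[-1])    # (19, 59, 26)
```

The last argument varies fastest and the first slowest, which matches the row order of the results table below.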

Create a Dataframe from our results and label it accordingly

In [7]:

comb_stats_df = pd.DataFrame(comb_stats)

comb_stats_df.index = pd.MultiIndex.from_tuples(
    th_combs, 
    names=['exit_shift', 'slow_ma', 'fast_ma'])

In [8]:

comb_stats_df

Out[8]:

			Total Return [%]	Win Rate [%]	Profit Factor	Max Drawdown [%]	Total Trades
exit_shift	slow_ma	fast_ma
5	40	7	28.620706	52.727273	1.289403	12.039517	165
		8	12.865810	51.592357	1.128907	14.661135	157
		9	8.272885	51.333333	1.082172	15.812769	150
		10	13.276195	48.591549	1.124026	17.905736	142
		11	9.227458	48.905109	1.099671	22.467860	138
...	...	...	...	...	...	...	...
19	59	22	17.943975	59.740260	1.195631	35.704668	77
		23	-2.045321	54.545455	0.977928	41.559362	77
		24	16.698445	54.545455	1.177064	32.215622	77
		25	26.545968	57.142857	1.280304	31.915274	77
		26	41.837836	57.894737	1.456534	29.457537	76

6000 rows × 5 columns

Prepare DF for conversion into 3D-array¶

Since we can only display one metric, let's get rid of all columns except for 'Total Return [%]'.

In [9]:

clean_df = comb_stats_df.drop(['Win Rate [%]', 'Profit Factor', 'Max Drawdown [%]', 'Total Trades'], axis=1)

... the same could be achieved via clean_df = comb_stats_df[['Total Return [%]']] (double brackets keep the result a DataFrame; single brackets would return a Series).

To get an idea of how to interact with the new DF, try for instance:

In [10]:

clean_df.index.names
clean_df.index.get_level_values(2)

Out[10]:

Int64Index([ 7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
            ...
            17, 18, 19, 20, 21, 22, 23, 24, 25, 26],
           dtype='int64', name='fast_ma', length=6000)
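A few more ways to query such a MultiIndex, sketched on a small hypothetical grid (the names toy and index are illustrative and the values are placeholders):

```python
import pandas as pd
from itertools import product

# Small hypothetical grid: 2 shifts x 2 slow windows x 3 fast windows.
index = pd.MultiIndex.from_tuples(
    list(product([5, 6], [40, 41], [7, 8, 9])),
    names=['exit_shift', 'slow_ma', 'fast_ma'],
)
toy = pd.DataFrame({'Total Return [%]': range(len(index))}, index=index)

# Distinct values of one level (instead of the repeated, full-length view).
print(toy.index.get_level_values('fast_ma').unique().tolist())  # [7, 8, 9]

# One exact parameter combination.
print(toy.loc[(6, 40, 8), 'Total Return [%]'])  # 7

# Cross-section: every row with slow_ma == 41, with that level dropped.
print(toy.xs(41, level='slow_ma'))
```

Levels can be addressed by name or by position, so get_level_values(2) and get_level_values('fast_ma') are equivalent here.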

... and extract the plain column of values:

In [11]:

array_data = clean_df['Total Return [%]']

That's it - we're ready for conversion.

3D Conversion¶

We can now create a 3d-array by using .to_numpy().

However, we also need .reshape(). reshape() requires us to define how the values should be allocated to x, y, and z. This works here because the rows are still in the order produced by itertools.product, with the last level (fast_ma) varying fastest. In our setup:

x equals the shift
y equals the slow ma
z equals the fast ma

Let's do it:

In [12]:

three_d_array = clean_df['Total Return [%]'].to_numpy().reshape(15, 20, 20)

See cell 5 above for where these numbers (15, 20, 20) come from: they are the lengths of the three parameter ranges.
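Whether a reshape has put every value in the right cell can be checked directly. The sketch below does so on a small hypothetical grid with random placeholder values; the names xs, ys, zs, series and cube are illustrative.

```python
import numpy as np
import pandas as pd
from itertools import product

# Hypothetical small grid; the values are random placeholders.
xs, ys, zs = [5, 6], [40, 41, 42], [7, 8, 9, 10]
index = pd.MultiIndex.from_tuples(
    list(product(xs, ys, zs)), names=['exit_shift', 'slow_ma', 'fast_ma']
)
rng = np.random.default_rng(0)
series = pd.Series(rng.normal(size=len(index)), index=index)

# Shape = number of values per level, in index-level order.
shape = tuple(len(level) for level in (xs, ys, zs))
cube = series.to_numpy().reshape(shape)

# Check: cube[i, j, k] must equal the value stored under (xs[i], ys[j], zs[k]).
for i, x in enumerate(xs):
    for j, y in enumerate(ys):
        for k, z in enumerate(zs):
            assert cube[i, j, k] == series.loc[(x, y, z)]

print(cube.shape)  # (2, 3, 4)
```

If the rows had been re-sorted by a metric beforehand, the same reshape would silently scramble the cube, so the reshape should be applied to the data in its original index order.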

Volume Plotting¶

To assign the values to the respective axes, we need to define input lists.

In [13]:

exit_shift_list = list(exit_shift)
slow_ma_list = list(slow_ma)
fast_ma_list = list(fast_ma)
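The same label lists also translate array positions back into parameter values. As a hedged illustration on a hypothetical cube, this is how the best cell could be located with NumPy:

```python
import numpy as np

# Hypothetical 2 x 3 x 4 cube of returns with one clear maximum.
cube = np.zeros((2, 3, 4))
cube[1, 0, 2] = 42.0

x_labels = [5, 6]            # exit_shift values
y_labels = [40, 41, 42]      # slow_ma values
z_labels = [7, 8, 9, 10]     # fast_ma values

# Position of the largest value, converted from a flat index to (i, j, k).
i, j, k = np.unravel_index(np.nanargmax(cube), cube.shape)

best = (x_labels[i], y_labels[j], z_labels[k])
print(best)  # (6, 40, 9)
```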

.. and we are ready to go:

In [40]:

volume = vbt.Volume(
    data=three_d_array, # our 3dim-array, which we created above
    x_labels=exit_shift_list, # the lists that we created from our parameter ranges to label the values
    y_labels=slow_ma_list,
    z_labels=fast_ma_list,
    width=800, 
    height=800,
    trace_kwargs=dict(colorscale="icefire", cmid=0), 
    ## Cf. https://plotly.com/python/builtin-colorscales/
    ## cmid=0 centers the color scale at zero, dividing it between positive and negative results
    scene=dict(
      xaxis=dict(
        title='Shift',
        showaxeslabels=True,
      ),
      yaxis=dict(
        title='Slow MA',
        showaxeslabels=True,
      ),
      zaxis=dict(
        title='Fast MA',
        #autorange=True,
        showaxeslabels=True,
      ),
    ),
)
volume.fig.show()

Example for further analysis¶

The lower region does not look too bad for our simple strategy. But we might want to consider other metrics as well. Let's sort the dataframe by max drawdown.

In [15]:

drawdown_df = comb_stats_df.sort_values(by=['Max Drawdown [%]'], ascending=True)

Maybe we want to use leverage and are thus interested in strategies that do not exceed a 15% DD?

In [16]:

rslt_df = drawdown_df.loc[drawdown_df['Max Drawdown [%]'] < 15.0]

... and choose the best performing among them? According to profit factor?

In [17]:

rslt_df_pf = rslt_df.sort_values(by=['Profit Factor'], ascending=False)
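The filter-then-rank steps can also be written as a single chain. A minimal sketch on hypothetical numbers (the stats frame and its row labels are made up for illustration):

```python
import pandas as pd

# Hypothetical results for four parameter sets.
stats = pd.DataFrame(
    {
        'Total Return [%]': [30.0, 55.0, 10.0, 70.0],
        'Profit Factor': [1.4, 1.9, 1.1, 1.6],
        'Max Drawdown [%]': [12.0, 9.0, 14.0, 25.0],
    },
    index=['a', 'b', 'c', 'd'],
)

# Keep rows under the drawdown cap, then rank by profit factor.
ranked = (
    stats[stats['Max Drawdown [%]'] < 15.0]
    .sort_values('Profit Factor', ascending=False)
)
print(ranked.index.tolist())  # ['b', 'a', 'c']
```

Only the last sort determines the final order, so sorting by drawdown first is not required for the filter to work.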

In [18]:

pd.set_option('display.max_rows', 20) ## using display.max_rows you can extend the dataframe display!
rslt_df_pf

Out[18]:

			Total Return [%]	Win Rate [%]	Profit Factor	Max Drawdown [%]	Total Trades
exit_shift	slow_ma	fast_ma
11	53	11	63.973228	55.000000	1.973216	9.656734	101
8	59	9	55.996476	60.396040	1.956220	10.207960	101
8	53	11	51.522183	59.000000	1.949297	10.397894	101
9	54	11	55.580395	53.061224	1.943820	9.979441	99
8	58	10	54.401509	59.405941	1.938970	7.701408	101
...	...	...	...	...	...	...	...
5	51	13	-3.241266	44.444444	0.944172	13.009315	100
	50	23	-3.665415	46.938776	0.926288	14.002172	98
	50	13	-5.637548	47.000000	0.903764	13.811262	101
	51	22	-6.530703	48.979592	0.868143	11.870502	98
6	51	22	-6.985558	46.938776	0.866175	14.173559	98

909 rows × 5 columns

Or better in terms of total return?

In [19]:

rslt_df_pf_2 = rslt_df_pf.sort_values(by=['Total Return [%]'], ascending=False)

In [20]:

rslt_df_pf_2

Out[20]:

			Total Return [%]	Win Rate [%]	Profit Factor	Max Drawdown [%]	Total Trades
exit_shift	slow_ma	fast_ma
12	51	7	75.931181	57.627119	1.817858	12.950735	119
11	57	8	69.868964	56.310680	1.917144	9.424025	103
	51	7	69.634861	53.781513	1.769494	11.517904	120
	57	7	67.116657	56.190476	1.860671	9.742177	105
	59	7	66.624758	55.339806	1.861782	7.539600	103
...	...	...	...	...	...	...	...
5	51	13	-3.241266	44.444444	0.944172	13.009315	100
	50	23	-3.665415	46.938776	0.926288	14.002172	98
	50	13	-5.637548	47.000000	0.903764	13.811262	101
	51	22	-6.530703	48.979592	0.868143	11.870502	98
6	51	22	-6.985558	46.938776	0.866175	14.173559	98

909 rows × 5 columns

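The top results look promising, both in terms of total return and max DD. For the cross-check below we use the following combination (note that the first row shown above is exit_shift 12 / slow_ma 51 / fast_ma 7, and that the slow_ma 59 / fast_ma 7 pair is listed there with exit_shift 11):

exit_shift = 10
slow_ma = 59
fast_ma = 7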

Example Cross-verification¶

But is this reliable? Or are we just overfitting? Let's find out!

Grab some data for another time period and add another symbol

In [21]:

start = '2016-01-01 UTC'  # crypto is in UTC
end = '2020-12-31 UTC'
timeframe = '1h'
cols = ['Open', 'High', 'Low', 'Close', 'Volume']

data = vbt.BinanceData.fetch(['BTCUSDT', 'ETHUSDT'], start=start, end=end, timeframe=timeframe, limit=1000000)

/home/pdr0906pwr0407/anaconda3/lib/python3.9/site-packages/vectorbtpro/data/base.py:538: UserWarning:

Symbols have mismatching index. Setting missing data points to NaN.

In [22]:

BTC_Close = data.data['BTCUSDT']['Close']
ETH_Close = data.data['ETHUSDT']['Close']
close = data.get('Close') # this applies for both symbols!

Let's run our best combination!

In [23]:

fast_ma = vbt.MA.run(close=close, window=7) # value from our best test
slow_ma = vbt.MA.run(close=close, window=59) # value from our best test

entries = fast_ma.ma_crossed_above(slow_ma, wait=0)

exits = entries.shift(10, fill_value=False) # shift value from our best test; fill_value avoids NaN -> True

pf = vbt.Portfolio.from_signals(
    close=close, 
    entries=entries, 
    exits=exits,
    #slippage=0.00055,
    sl_stop=0.0075,
    #fees=0.0006,
    size=1000,
    size_type='value',
    init_cash='auto'
)

In [24]:

pf.stats()

/home/pdr0906pwr0407/anaconda3/lib/python3.9/site-packages/vectorbtpro/generic/stats_builder.py:461: UserWarning:

Metric 'sharpe_ratio' requires frequency to be set

/home/pdr0906pwr0407/anaconda3/lib/python3.9/site-packages/vectorbtpro/generic/stats_builder.py:461: UserWarning:

Metric 'calmar_ratio' requires frequency to be set

/home/pdr0906pwr0407/anaconda3/lib/python3.9/site-packages/vectorbtpro/generic/stats_builder.py:461: UserWarning:

Metric 'omega_ratio' requires frequency to be set

/home/pdr0906pwr0407/anaconda3/lib/python3.9/site-packages/vectorbtpro/generic/stats_builder.py:461: UserWarning:

Metric 'sortino_ratio' requires frequency to be set

/home/pdr0906pwr0407/anaconda3/lib/python3.9/site-packages/vectorbtpro/base/wrapping.py:1285: UserWarning:

Couldn't parse the frequency of index. Pass it as `freq` or define it globally under `settings.wrapping`.

/tmp/ipykernel_116506/3705677322.py:1: UserWarning:

Object has multiple columns. Aggregated some metrics using <function mean at 0x7f32e0069c10>. Pass column to select a single column/group.

Out[24]:

Start                         2017-08-17 04:00:00+00:00
End                           2020-12-30 23:00:00+00:00
Period                                            29494
Start Value                                 1010.722202
Min Value                                    997.435799
                                        ...            
Avg Losing Trade [%]                          -0.643434
Avg Winning Trade Duration                     9.952336
Avg Losing Trade Duration                      5.340928
Profit Factor                                  1.978824
Expectancy                                     3.704043
Name: agg_stats, Length: 25, dtype: object
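The frequency warnings above say the index frequency could not be parsed. One plausible cause (an assumption, not verified against this dataset) is gaps in the hourly index, which the earlier "mismatching index" warning hints at. The effect can be reproduced with plain pandas:

```python
import pandas as pd

# A regular hourly index: the frequency can be inferred.
regular = pd.date_range(
    '2020-01-01', periods=6, freq=pd.Timedelta(hours=1), tz='UTC'
)
print(pd.infer_freq(regular))

# Drop one bar to imitate a gap in exchange data (hypothetical).
gappy = regular.delete(3)
print(pd.infer_freq(gappy))  # None
```

The warning text itself suggests the remedy: pass the frequency explicitly as `freq` or define it globally under `settings.wrapping`.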

In [25]:

btc_stats = pf.stats(column=0)
eth_stats = pf.stats(column=1)

In [26]:

btc_stats

Out[26]:

Start                         2017-08-17 04:00:00+00:00
End                           2020-12-30 23:00:00+00:00
Period                                            29494
Start Value                                 1013.944405
Min Value                                    999.281598
                                        ...            
Avg Losing Trade [%]                           -0.63482
Avg Winning Trade Duration                    10.025478
Avg Losing Trade Duration                      5.718894
Profit Factor                                  1.787723
Expectancy                                     2.893693
Name: 0, Length: 25, dtype: object

Doesn't look too bad! We don't beat the benchmark. However, we are invested only around 9.5 percent of the time and, more strikingly, we obtained a positive return on this data sample as well!

In [27]:

eth_stats

Out[27]:

Start                         2017-08-17 04:00:00+00:00
End                           2020-12-30 23:00:00+00:00
Period                                            29494
Start Value                                      1007.5
Min Value                                        995.59
                                        ...            
Avg Losing Trade [%]                          -0.652049
Avg Winning Trade Duration                     9.879195
Avg Losing Trade Duration                      4.962963
Profit Factor                                  2.169926
Expectancy                                     4.514393
Name: 1, Length: 25, dtype: object

... and look at this! It works for ETH as well. In this case, we even outperformed the benchmark return while being invested only 8.6 percent of the time.

... and remember: this is just a simple crossover strategy with a fixed exit after 10 bars.

Nice feature: let's look at a heatmap to see how the returns are distributed over time.

In [42]:

pf.returns_acc.resample("M").ts_heatmap()

[Figure: heatmap of pf.returns_acc resampled to monthly frequency]
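For readers without the plotting backend, the numbers behind a monthly heatmap can be approximated with plain pandas. This is a sketch on random placeholder returns, not the accessor's actual implementation; the compounding rule and the year-by-month layout are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical hourly returns for two years (random placeholders).
idx = pd.date_range('2019-01-01', '2020-12-31 23:00', freq=pd.Timedelta(hours=1))
rng = np.random.default_rng(1)
returns = pd.Series(rng.normal(0, 0.001, len(idx)), index=idx)

# Compound hourly returns within each (year, month) bucket.
monthly = (1 + returns).groupby([idx.year, idx.month]).prod() - 1
monthly.index.names = ['year', 'month']

# Rows = years, columns = months: the layout a monthly heatmap colors.
table = monthly.unstack('month')
print(table.shape)  # (2, 12)
```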
