Discretionary Signal Backtesting

Description

In this project we backtest historical signals from a Telegram Channel using VectorBT Pro.
Author(s): Oleg Polakow (VBT Pro Simulations), Dilip Rajkumar (Signal Extraction)

In [56]:
import numpy as np
import pandas as pd
from numba import njit
import vectorbtpro as vbt

vbt.settings.set_theme("dark")
vbt.settings['plotting']['layout']['width'] = 1280

Read Market Data and Extracted Telegram Signal Data

In [57]:
# Fetch data
def date_parser(timestamps):
    # The first column contains integer timestamps; parse them into a DatetimeIndex
    return pd.to_datetime(timestamps, utc=True, unit="ms")

## Read OHLCV for XAUUSD downloaded from Dukascopy
data = vbt.CSVData.fetch("/Users/dilip.rajkumar/Documents/TeleGram_Signal_Extraction/data/xauusd_202109_202303.csv", date_parser=date_parser)
data.get()
Out[57]:
open high low close volume
timestamp
2021-09-02 00:00:00+00:00 1813.2250 1813.732 1813.222 1813.5345 0.0417
2021-09-02 00:01:00+00:00 1813.5245 1813.742 1813.621 1813.4850 0.0374
2021-09-02 00:02:00+00:00 1813.5250 1813.802 1813.531 1813.4700 0.0313
2021-09-02 00:03:00+00:00 1813.5000 1813.742 1813.542 1813.3750 0.0301
2021-09-02 00:04:00+00:00 1813.3700 1813.772 1813.472 1813.5550 0.0449
... ... ... ... ... ...
2023-03-13 23:55:00+00:00 1911.9150 1912.305 1912.025 1912.1550 0.0079
2023-03-13 23:56:00+00:00 1912.1650 1912.315 1911.995 1911.8750 0.0048
2023-03-13 23:57:00+00:00 1911.8900 1912.385 1912.035 1912.1035 0.0145
2023-03-13 23:58:00+00:00 1912.1135 1912.715 1912.282 1912.5100 0.0194
2023-03-13 23:59:00+00:00 1912.5185 1912.695 1912.342 1912.1735 0.0272

119906 rows × 5 columns

In [58]:
# Fetch Telegram signals extracted from "Green Pips" ( https://t.me/forexbookspdf )
signal_data = vbt.CSVData.fetch("/Users/dilip.rajkumar/Documents/TeleGram_Signal_Extraction/data/TG_Extracted_Signals.csv", index_col=1)
print("Telegram Signal DF Shape:",signal_data.wrapper.shape)
signal_data.get()
Telegram Signal DF Shape: (543, 10)
Out[58]:
id message Symbol OrderType EntryPrice SL TP1 TP2 TP3 TP4
date
2021-09-14 05:57:17+00:00 840 Gbpchf sell now @1.27590 - 1.27890\n\nSL: 1.28... GBPCHF SELL 1.2789 1.28 1.27 NaN NaN NaN
2021-09-14 05:57:30+00:00 841 Gbpaud sell now @1.88550 - 1.88750\n\nSL: 1.89... GBPAUD SELL 1.8875 1.89 1.88 NaN NaN NaN
2021-09-14 06:28:04+00:00 843 Eurjpy buy now @ 129.980\n\nSL: 129.580\nTP1: ... EURJPY BUY 129.9800 129.58 130.13 130.28 130.48 NaN
2021-09-14 08:39:48+00:00 844 Gbpjpy Sell now @ 152.650\n\nTP1: 152.500 (15... GBPJPY SELL 152.6500 153.05 152.50 152.35 152.15 NaN
2021-09-14 12:43:51+00:00 846 XAUUSD sell now@ 1792.4\n\nTP1: 1790.9 (15 pip... XAUUSD SELL 1792.4000 1796.40 1790.90 1789.40 1787.40 NaN
... ... ... ... ... ... ... ... ... ... ...
2023-03-13 10:17:39+00:00 4408 XAUUSD SELL 1885.00\nSL 1895\nTP 1882\nTP 18... XAUUSD SELL 1885.0000 1895.00 1882.00 1877.00 1865.00 1800.0
2023-03-13 13:08:21+00:00 4411 XAUUSD SELL STOP 1896.00\nSL 1906\nTP 1893\n... XAUUSD SELL STOP 1896.0000 1906.00 1893.00 1888.00 1870.00 1840.0
2023-03-13 14:36:09+00:00 4413 XAUUSD SELL 1907.00\nSL 1918\nTP 1903\nTP 1... XAUUSD SELL 1907.0000 1918.00 1903.00 1900.00 1890.00 1860.0
2023-03-15 06:27:23+00:00 4425 XAUUSD SELL STOP 1899.00\nSL 1910\nTP 1896\... XAUUSD SELL STOP 1899.0000 1910.00 1896.00 1890.00 1880.00 1850.0
2023-03-16 13:03:03+00:00 4432 XAUUSD SELL 1929.00\nSL 1940\nTP 1926\nTP 1... XAUUSD SELL 1929.0000 1940.00 1926.00 1920.00 1915.00 1870.0

543 rows × 10 columns

In [59]:
df = signal_data.get()
df.OrderType = df.OrderType.apply(lambda x: x.rstrip()) ## Remove trailing whitespace from the string using rstrip()
df.Symbol = df.Symbol.apply(lambda x: x.rstrip())
df[df['EntryPrice'] == 0] ## Check for signals with zero entry prices
Out[59]:
id message Symbol OrderType EntryPrice SL TP1 TP2 TP3 TP4
date
In [60]:
df[df['EntryPrice'].isna()]
Out[60]:
id message Symbol OrderType EntryPrice SL TP1 TP2 TP3 TP4
date
In [61]:
## Distinct order types present in the Telegram signals for XAUUSD
print(df['OrderType'][(df['Symbol']=='XAUUSD')].unique())
['SELL' 'BUY' 'BUY STOP' 'SELL STOP']
In [62]:
## Check for BUY STOP Orders in XAUUSD Symbol
df[(df['Symbol']=='XAUUSD') & (df['OrderType'] == 'BUY STOP')]
Out[62]:
id message Symbol OrderType EntryPrice SL TP1 TP2 TP3 TP4
date
2022-07-15 13:01:54+00:00 2837 XAUUSD BUY STOP 1710.00\nSL 1697\nTP 1713\nTP ... XAUUSD BUY STOP 1710.0 1697.0 1713.0 1719.0 1770.0 NaN
2022-08-11 07:59:19+00:00 3002 XAUUSD BUY STOP 1788.00 \nSL 1777\nTP 1791\nT... XAUUSD BUY STOP 1788.0 1777.0 1791.0 1797.0 1850.0 NaN
2022-08-31 12:24:05+00:00 3110 XAUUSD BUY STOP 1712.00 \nSL 1700\nTP 1715\nTP... XAUUSD BUY STOP 1712.0 1700.0 1715.0 1721.0 1780.0 NaN

Creation of various NamedTuples

In [63]:
# Numba doesn't understand strings, thus create an enumerated type for order types
from collections import namedtuple

# Create a type first
OrderTypeT = namedtuple("OrderTypeT", ["BUY", "SELL", "BUYSTOP", "SELLSTOP"])

# Then create a tuple of type: OrderTypeT
OrderType = OrderTypeT(*range(len(OrderTypeT._fields)))

print(OrderType)
OrderTypeT(BUY=0, SELL=1, BUYSTOP=2, SELLSTOP=3)
In [64]:
## You could have also created the named tuple with `typing.NamedTuple`,
## but some versions of Numba don't like it, so we access the
## field names and typename like this:
print("NamedTuple Fields:", OrderTypeT._fields)
print("NamedTuple Name  :",   OrderTypeT.__name__)
NamedTuple Fields: ('BUY', 'SELL', 'BUYSTOP', 'SELLSTOP')
NamedTuple Name  : OrderTypeT
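Since the transform step below maps strings back to these integers, note that `OrderType._fields.index(name)` is the reverse lookup. A quick sanity check (illustrative, not from the original notebook):

assert OrderType._fields.index("SELL") == OrderType.SELL  # 1
assert OrderType._fields.index("BUYSTOP") == OrderType.BUYSTOP  # 2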

Mapping the OrderType column from string values to integers in the OrderType namedtuple

In [65]:
def transform_signal_data(df: pd.DataFrame, symbol : str = 'XAUUSD'):
    '''Transform OrderType Column to numerical type for numba'''
    # Select only one symbol, the one we fetched the data for
    print("DF All Columns:",df.columns.tolist())
    df = df[df["Symbol"] == symbol]
    
    # Select columns of interest
    df = df.iloc[:, [0, 3, 4, 5, 6, 7, 8, 9]]
    print("DF Sel Int. Columns:",df.columns.tolist())
    # Map order types using OrderType
    df["OrderType"] = df["OrderType"].map(lambda x: OrderType._fields.index(x.replace(" ", "")))
    
    # Filter out zero entry prices, just in case (the checks above found none)
    df = df[df["EntryPrice"] > 0]
    
    return df

signal_data = signal_data.transform(transform_signal_data)

print("Final Signal DF Shape:",signal_data.wrapper.shape)
DF All Columns: ['id', 'message', 'Symbol', 'OrderType', 'EntryPrice', 'SL', 'TP1', 'TP2', 'TP3', 'TP4']
DF Sel Int. Columns: ['id', 'OrderType', 'EntryPrice', 'SL', 'TP1', 'TP2', 'TP3', 'TP4']
Final Signal DF Shape: (232, 8)

We have 232 signals for XAUUSD

In [66]:
## OrderType column now remapped
signal_data.get()
Out[66]:
id OrderType EntryPrice SL TP1 TP2 TP3 TP4
date
2021-09-14 12:43:51+00:00 846 1 1792.4 1796.4 1790.9 1789.4 1787.4 NaN
2021-09-15 04:24:47+00:00 854 0 1800.0 1797.5 1805.0 1810.0 NaN NaN
2021-09-15 08:08:45+00:00 866 1 1802.5 1806.5 1801.0 1799.5 1797.5 NaN
2021-09-16 10:39:52+00:00 884 0 1780.0 NaN 1781.3 1783.3 NaN NaN
2021-09-16 12:48:36+00:00 887 0 1762.3 1758.3 1763.8 1765.3 1767.3 NaN
... ... ... ... ... ... ... ... ...
2023-03-13 10:17:39+00:00 4408 1 1885.0 1895.0 1882.0 1877.0 1865.0 1800.0
2023-03-13 13:08:21+00:00 4411 3 1896.0 1906.0 1893.0 1888.0 1870.0 1840.0
2023-03-13 14:36:09+00:00 4413 1 1907.0 1918.0 1903.0 1900.0 1890.0 1860.0
2023-03-15 06:27:23+00:00 4425 3 1899.0 1910.0 1896.0 1890.0 1880.0 1850.0
2023-03-16 13:03:03+00:00 4432 1 1929.0 1940.0 1926.0 1920.0 1915.0 1870.0

232 rows × 8 columns

In [67]:
# Create named tuples which will act as containers for various arrays

# SignalInfo will contain signal information in a vbt-friendly format
# Rows in each array correspond to signals
SignalInfo = namedtuple(typename = "SignalInfo", field_names =  [
    "timestamp",  # 1d array with timestamps in nanosecond format (int64)
    "order_type",  # 1d array with order types in integer format (int64, see order_type_map)
    "entry_price",  # 1d array with entry price (float64)
    "sl",  # 2d array where columns are SL levels (float64)
    "tp",  # 2d array where columns are TP levels (float64)
])

# TempInfo will contain temporary information that will be written during backtesting
# You can imagine it as a buffer that we write to and then access at a later time
# Rows in each array correspond to signals
TempInfo = namedtuple(typename = "TempInfo", field_names = [
    "ts_bar",           # 1d array with row indices of the timestamp where the signal was hit (int64)
    "entry_price_bar",  # 1d array with row indices where entry price was hit (int64)
    "sl_bar",  # 2d array with row indices where each SL level was hit, same shape as SignalInfo.sl (int64)
    "tp_bar",  # 2d array with row indices where each TP level was hit, same shape as SignalInfo.tp (int64)
])

In the TempInfo namedtuple above, the purpose of ts_bar is to mark the bar (i.e., the absolute row index) where we first saw the signal. We could have also created a boolean array with True for "signal seen", but for consistency we use the same namedtuple format, which stores richer information. ts_bar is just a one-dimensional array where each element corresponds to a column (signal). Before the signal is hit, the value is -1. After that, it becomes the row index and never changes. Having this array is important because we want to skip our logic if there is no signal yet.
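A minimal sketch of this pattern in plain NumPy (illustrative, not from the notebook):

import numpy as np

ts_bar = np.full(3, -1)  # one slot per signal; -1 means "signal not seen yet"
ts_bar[1] = 4711         # signal 1 was first seen at bar 4711
print(ts_bar != -1)      # [False  True False] -> only signal 1 is active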

order_func_nb - A Numba Compiled Order Function

The order_func_nb handles the execution of entry and exit orders for our signals, along with position management. Here's what the order_func_nb below does:

  • Represent each signal as a separate column with its own starting capital
  • The order function is executed at each bar and for each column (signal in our case). If the current bar contains a signal, execute the signal logic
  • Order functions can issue only one order per bar, thus if multiple stops were hit, we will aggregate them
  • We will go all in and then gradually reduce the position based on the number of stops
  • Sometimes two signals occur right next to each other with different numbers of stop levels. To keep the number of rows (i.e., levels) consistent, all stops across all signals are stored in a single array, so they must have the same number of rows (ladder steps/levels); columns with fewer stops are simply padded with NaN (see the sketch after this list). You could also run the simulation on each signal separately (a bit slower), in which case no padding would be needed.
  • Finally, we will run this order function order_func_nb using Portfolio.from_order_func()
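Here is a minimal sketch of the NaN padding mentioned above (plain NumPy, not part of the notebook):

import numpy as np

# Two signals with SL ladders of different lengths: pad the shorter ladder
# with NaN so both fit into one 2d array of shape (n_signals, n_levels)
sl = np.array([
    [12.35, 12.29],   # signal A defines two SL levels
    [17.53, np.nan],  # signal B defines only one level, padded with NaN
])
print(sl.shape)  # (2, 2)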
In [68]:
@njit
def has_data_nb(c):
    """
    Numba function to check whether the OHLC values are not NaN.
    If any of O, H, L, C is NaN, returns False; otherwise returns True.
    """
    if np.isnan(vbt.pf_nb.select_nb(c, c.open)):
        return False
    if np.isnan(vbt.pf_nb.select_nb(c, c.high)):
        return False
    if np.isnan(vbt.pf_nb.select_nb(c, c.low)):
        return False
    if np.isnan(vbt.pf_nb.select_nb(c, c.close)):
        return False
    return True

## Wrapper function to call the vbt function
@njit
def check_price_hit_nb(c, price, hit_below, can_use_ohlc):
    # Numba function to check whether a price level was hit during this bar
    # Use hit_below = True to check against low and hit_below = False to check against high
    # If can_use_ohlc is False, will check only against the close price
    
    order_price, hit_on_open, hit = vbt.pf_nb.check_price_hit_nb(
        open = vbt.pf_nb.select_nb(c, c.open),  # OHLC are flexible arrays, always use select_nb!
        high = vbt.pf_nb.select_nb(c, c.high),
        low  = vbt.pf_nb.select_nb(c, c.low),
        close= vbt.pf_nb.select_nb(c, c.close),
        price=price,
        hit_below=hit_below,
        can_use_ohlc=can_use_ohlc
    )
    # Order price here isn't necessarily the price that has been hit
    # For example, if the price was hit before open, order price is set to the open price
    return order_price, hit

@njit(boundscheck=True)
def order_func_nb(c, signal_info, temp_info):  # first argument is context object, other are our namedTuple containers
    if not has_data_nb(c):
        # If this bar contains no data, skip it
        return vbt.pf_nb.order_nothing_nb()
    
    # Each column corresponds to a signal
    signal = c.col
    
    # Each row corresponds to a bar
    bar = c.i
    
    # Define various flags for pure convenience
    buy_market = signal_info.order_type[signal] == OrderType.BUY
    sell_market = signal_info.order_type[signal] == OrderType.SELL
    buy_stop = signal_info.order_type[signal] == OrderType.BUYSTOP
    sell_stop = signal_info.order_type[signal] == OrderType.SELLSTOP
    buy = buy_market or buy_stop
    # We have only `buy = buy_market or buy_stop` because Selling means not buying, 
    # so we only need to check whether it's a buy operation, and if not, 
    # it's automatically a sell operation (i.e. `sell = not buy`)
    
    # First, we need to check whether the current bar contains a signal
    can_use_ohlc = True
    if temp_info.ts_bar[signal] == -1:
        # Check whether the signal has been discovered
        # -1 means hasn't been discovered yet    
        if c.index[bar] == signal_info.timestamp[signal]:
            # If so, store the current row index in a temporary array
            # such that later we know that we already discovered a signal
            temp_info.ts_bar[signal] = bar

            # The signal has the granularity of seconds, thus it belongs somewhere in the bar
            # We need to notify the functions below that they cannot use the full OHLC information, only the close
            # This is to avoid using prices that technically happened before the signal
            can_use_ohlc = False
        
    # Here comes the entry order
    if temp_info.ts_bar[signal] != -1:        
        # Then, check whether the entry order hasn't been executed
        if temp_info.entry_price_bar[signal] == -1:            
            # If so, execute the entry order
            if buy_market:
                # Buy market order (using closing price)
                # Store the current row index in a temporary array such that future bars know
                # that the order has already been executed
                temp_info.entry_price_bar[signal] = bar
                return vbt.pf_nb.order_nb(np.inf, np.inf)  # size, price (np.inf price means the close price)
            if sell_market:
                # Sell market order (using closing price)
                temp_info.entry_price_bar[signal] = bar
                return vbt.pf_nb.order_nb(-np.inf, np.inf)
            
            if buy_stop: # Buy stop order
                # A buy stop order is entered at a stop price above the current market price                
                # Since it's a pending order, we first need to check whether the entry price has been hit
                order_price, hit = check_price_hit_nb(c,
                                                      price=signal_info.entry_price[signal],
                                                      hit_below=False,
                                                      can_use_ohlc=can_use_ohlc)
                if hit: # If so, execute the order
                    temp_info.entry_price_bar[signal] = bar
                    return vbt.pf_nb.order_nb(np.inf, order_price)
                
            if sell_stop: # Sell stop order
                # A sell stop order is entered at a stop price below the current market price
                order_price, hit = check_price_hit_nb(c,
                                                      price=signal_info.entry_price[signal],
                                                      hit_below=True,
                                                      can_use_ohlc=can_use_ohlc)
                if hit:
                    temp_info.entry_price_bar[signal] = bar
                    return vbt.pf_nb.order_nb(-np.inf, order_price)
               
        # Here comes the stop order, i.e. the exit order
        # Check whether the entry order has been executed
        if temp_info.entry_price_bar[signal] != -1:
            # We also need to check whether we're still in a position
            # in case stops have already closed out the position
            if c.last_position[signal] != 0:
                
                # If so, start with checking for potential SL orders
                # (remember that SL pessimistically comes before TP)
                # First, we need to know the number of potential and already executed SL levels
                # since we want to gradually reduce the position proportionally to the number of levels
                # For example, one signal may define [12.35, 12.29] and another [17.53, nan]
                n_sl_levels = 0
                n_sl_hits = 0
                sl_levels = signal_info.sl[signal]  # select 1d array from 2d array
                sl_bar = temp_info.sl_bar[signal]  # same here
                for k in range(len(sl_levels)):
                    if not np.isnan(sl_levels[k]):
                        n_sl_levels += 1
                    if sl_bar[k] != -1:
                        n_sl_hits += 1
                
                # We can execute only one order at the current bar
                # Thus, if the price crossed multiple SL levels, we need to pack them into one order
                # Since SL levels are guaranteed to be sorted, we will check the most distant levels first
                # because if a distant stop has been hit, the closer stops are automatically hit too
                for k in range(n_sl_levels - 1, n_sl_hits - 1, -1):
                    if not np.isnan(sl_levels[k]) and sl_bar[k] == -1:
                        # Check against low for buy orders and against high for sell orders
                        order_price, hit = check_price_hit_nb(c,
                                                              price=sl_levels[k],
                                                              hit_below=buy,
                                                              can_use_ohlc=can_use_ohlc)
                        if hit:
                            sl_bar[k] = bar
                            # The further away the stop is, the more of the position needs to be closed
                            # We will specify a target percentage
                            # For example, for two stops it would be 0.5 (SL1) and 0.0 (SL2)
                            # while for three stops it would be 0.66 (SL1), 0.33 (SL2), and 0.0 (SL3)
                            # This works only if we went all in before (size=np.inf)!
                            size = 1 - (k + 1) / n_sl_levels
                            size_type = vbt.pf_enums.SizeType.TargetPercent
                            if buy:
                                return vbt.pf_nb.order_nb(size, order_price, size_type)
                            else:
                                # Size must be negative for short positions
                                return vbt.pf_nb.order_nb(-size, order_price, size_type)
                        
                # Same for potential TP orders
                n_tp_levels = 0
                n_tp_hits = 0
                tp_levels = signal_info.tp[signal]
                tp_bar = temp_info.tp_bar[signal]
                for k in range(len(tp_levels)):
                    if not np.isnan(tp_levels[k]):
                        n_tp_levels += 1
                    if tp_bar[k] != -1:
                        n_tp_hits += 1
                
                for k in range(n_tp_levels - 1, n_tp_hits - 1, -1):
                    if not np.isnan(tp_levels[k]) and tp_bar[k] == -1:
                        # Check against high for buy orders and against low for sell orders
                        order_price, hit = check_price_hit_nb(c,
                                                              price=tp_levels[k],
                                                              hit_below=not buy,
                                                              can_use_ohlc=can_use_ohlc)
                        if hit:
                            tp_bar[k] = bar
                            size = 1 - (k + 1) / n_tp_levels
                            size_type = vbt.pf_enums.SizeType.TargetPercent
                            if buy:
                                return vbt.pf_nb.order_nb(size, order_price, size_type)
                            else:
                                return vbt.pf_nb.order_nb(-size, order_price, size_type)
                    
    # If no order has been executed, order nothing
    return vbt.pf_nb.order_nothing_nb()

Partial Position Closure Illustration at multiple TP Levels

Case Study - 3 TP Levels

To make the code segment above clearer, let's run it on some sample data. First, we will derive the number of TP levels that are defined in this signal and the number of levels that have already been hit:
In [69]:
tp_levels = np.array([10, 12, 14])  # stop prices
tp_bar = np.array([-1, -1, -1])  # row indices where each stop price was hit

n_tp_levels = 0
n_tp_hits = 0
for k in range(len(tp_levels)):
    if not np.isnan(tp_levels[k]):
        n_tp_levels += 1
    if tp_bar[k] != -1:
        n_tp_hits += 1

print("Nr of TP Levels:",n_tp_levels)
print("Nr. of TP Hits:",n_tp_hits)
Nr of TP Levels: 3
Nr. of TP Hits: 0
We see that initially none of the TP stop prices have been hit.

Next we want to create a loop that iterates over the TP stop prices in reverse order. The order is reversed because if the last stop price has been hit, all the stop prices defined before it are automatically hit too. We want to iterate only over those prices that haven't been hit yet.

In [70]:
for k in range(n_tp_levels - 1, n_tp_hits - 1, -1):
    if not np.isnan(tp_levels[k]) and tp_bar[k] == -1:
        print("TP Level:",tp_levels[k])
TP Level: 14
TP Level: 12
TP Level: 10
Let's assume that the first stop price has been hit:
In [71]:
k = 0
size = 1 - (k + 1) / n_tp_levels
print("Nr. of TP Levels:", n_tp_levels)
print("Size:", size)
Nr. of TP Levels: 3
Size: 0.6666666666666667
The size represents a target percentage, that is, we want the column value after the order to be 66.6% of its current value. We basically remove 1/3 of the value by hitting the first stop price.

But what happens if our second stop price is hit instead?

In [72]:
k = 1
size = 1 - (k + 1) / n_tp_levels
print("Size:",size)
Size: 0.33333333333333337
We remove 2/3 of the value with this stop price. Finally, once the last stop price is hit, the position is fully closed; the above equation removes an equal chunk of value with each stop price.
In [73]:
k = 2
size = 1 - (k + 1) / n_tp_levels
print("Size:", size)
Size: 0.0

Case Study - Two TP Levels

(NaN in a TP level)

Now let's say we only have a ladder with 2 TP levels but there are three rows because some other column is using three levels (in such a case the array is padded with nan):
In [74]:
tp_levels = np.array([10, 12, np.nan])  # stop prices
tp_bar = np.array([-1, -1, -1])  # row indices where each stop price was hit

n_tp_levels = 0
n_tp_hits = 0
for k in range(len(tp_levels)):
    if not np.isnan(tp_levels[k]):
        n_tp_levels += 1
    if tp_bar[k] != -1:
        n_tp_hits += 1

print("Nr of TP Levels:",n_tp_levels)
print("Nr. of TP Hits:",n_tp_hits)
Nr of TP Levels: 2
Nr. of TP Hits: 0
This code has correctly determined the number of levels in the ladder of this signal.

Let's say the first stop price is hit:

In [75]:
k = 0
size = 1 - (k + 1) / n_tp_levels
print(size)
0.5
In [76]:
k = 1
size = 1 - (k + 1) / n_tp_levels
print(size)
0.0
Since we have only two levels in the ladder, we now remove 50% from the value instead of 33.3%.

Matching Timestamps - Seconds to Minutes

In [77]:
# Prepare timestamp for signal information
timestamp = signal_data.index.values.astype(np.int64)  # nanoseconds
print(timestamp[:5])
[1631623431000000000 1631679887000000000 1631693325000000000
 1631788792000000000 1631796516000000000]
In [78]:
# Since the signals are of the second granularity while the data is of the minute granularity,
# we need to round the timestamp of the signal to the nearest minute
# Timestamps represent the opening time, thus the 59th second in "19:28:59" still belongs to the minute "19:28:00"

timestamp = timestamp - timestamp % vbt.dt_nb.m_ns
print(timestamp[:5])
[1631623380000000000 1631679840000000000 1631693280000000000
 1631788740000000000 1631796480000000000]
Each value above is a date represented in nanosecond format since the Unix epoch (1970-01-01). Since they are just regular integers, we can apply operations such as modulo, as we did above, which translates to "remove the remainder from dividing the timestamp by a minute", effectively removing any seconds, milliseconds, microseconds, and nanoseconds from the minute we're currently in. Here, `vbt.dt_nb.m_ns` is the total number of nanoseconds in one minute.
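The same flooring can be reproduced with plain pandas as a sanity check (illustrative; 60 * 10**9 is the number of nanoseconds in a minute):

ts = pd.Timestamp("2021-09-14 12:43:51+00:00").value  # nanoseconds since the Unix epoch
m_ns = 60 * 10**9
print(pd.Timestamp(ts - ts % m_ns, unit="ns", tz="UTC"))  # 2021-09-14 12:43:00+00:00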

Actual Simulation 🏃🏻‍♂️🎬

In [79]:
signal_data.get().head()
Out[79]:
id OrderType EntryPrice SL TP1 TP2 TP3 TP4
date
2021-09-14 12:43:51+00:00 846 1 1792.4 1796.4 1790.9 1789.4 1787.4 NaN
2021-09-15 04:24:47+00:00 854 0 1800.0 1797.5 1805.0 1810.0 NaN NaN
2021-09-15 08:08:45+00:00 866 1 1802.5 1806.5 1801.0 1799.5 1797.5 NaN
2021-09-16 10:39:52+00:00 884 0 1780.0 NaN 1781.3 1783.3 NaN NaN
2021-09-16 12:48:36+00:00 887 0 1762.3 1758.3 1763.8 1765.3 1767.3 NaN
In [80]:
order_type = signal_data.get("OrderType").values
entry_price = signal_data.get("EntryPrice").values
sl = signal_data.get("SL").values
tp1 = signal_data.get("TP1").values
tp2 = signal_data.get("TP2").values
tp3 = signal_data.get("TP3").values
tp4 = signal_data.get("TP4").values

n_signals = len(timestamp)
print('Total nr. of Signals:',n_signals)
# Create a named tuple for signal information

## Feed the above created arrays into the namedtuple
signal_info = SignalInfo(
    timestamp=timestamp,
    order_type=order_type,
    entry_price=entry_price,
    sl=np.column_stack((sl,)),
    tp=np.column_stack((tp1, tp2, tp3, tp4))
)

n_sl_levels = signal_info.sl.shape[1]
print("Nr. of SL Levels:",n_sl_levels)

n_tp_levels = signal_info.tp.shape[1]
print("Nr. of TP Levels:",n_tp_levels)
Total nr. of Signals: 232
Nr. of SL Levels: 1
Nr. of TP Levels: 4
In [81]:
# Important: re-run this cell every time you're running the simulation!
# Create a named tuple for temporary information
# All arrays below hold row indices, thus the default value is -1

def build_temp_info(signal_info):
    return TempInfo(
        ts_bar=np.full(len(signal_info.timestamp), -1),
        entry_price_bar=np.full(len(signal_info.timestamp), -1),
        sl_bar=np.full(signal_info.sl.shape, -1),
        tp_bar=np.full(signal_info.tp.shape, -1)
    )

temp_info = build_temp_info(signal_info)

Why re-run build_temp_info?
Temporary information gets overwritten during the simulation (it acts as a memory where later calls of the order function access information written by earlier calls), and you don't want to use dirty arrays in the next simulation, so we have to re-run build_temp_info before every simulation.
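A quick freshness check (illustrative):

fresh = build_temp_info(signal_info)
assert (fresh.ts_bar == -1).all() and (fresh.entry_price_bar == -1).all()
assert (fresh.sl_bar == -1).all() and (fresh.tp_bar == -1).all()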

In [82]:
# By default, vectorBT initializes an empty order array of the same shape as data
# But since our data is highly granular, it would take a lot of RAM
# Let's limit the number of records to one entry order and the maximum number of SL and TP orders
# It will be applied per column

## The 1 below is for the entry order
max_orders = 1 + n_sl_levels + n_tp_levels

# It's the maximum number of orders per column (i.e per signal)
print("Nr. of SL Levels:", n_sl_levels)
print("Nr. of TP Levels:", n_tp_levels)
print("Maximum Orders:",max_orders)
Nr. of SL Levels: 1
Nr. of TP Levels: 4
Maximum Orders: 6
In [83]:
# Perform the actual simulation
# Since we don't broadcast data against any other array, vectorbt doesn't know anything about
# our signal arrays and will simulate only the one column in our data
# Thus, we need to tell it to expand the number of columns by the number of signals using tiling
# But don't worry: thanks to flexible indexing vectorbt won't actually tile the data - good for RAM!
# (it would tile the data if it had multiple columns though!)

pf = vbt.Portfolio.from_order_func(
    data,
    order_func_nb=order_func_nb,
    order_args=(signal_info, temp_info),
    broadcast_kwargs=dict(tile=n_signals),  # tiling here
    max_orders=max_orders,
    freq="minute"  # we have an irregular one-minute frequency
)
# (may take a minute...)
  • Tiling via the broadcast_kwargs argument in the vbt.Portfolio.from_order_func call above
    Since we don't broadcast data against any other array, vectorbt doesn't know anything about our signal arrays and would simulate only the one column that is in our data. For example, if data has shape (500, 1), then the simulation would only run on one column. Thus, we need to tell it to expand the number of columns to the number of signals, which is as simple as providing the tile argument to the broadcaster. Under the hood, it will replace our example shape of (500, 1) with (500, 232).
    Also note that you can pass an index instead of a number. For example, you can pass signal_info.timestamp or the Telegram message IDs as a pd.Index so they become the column names in the new portfolio (see the sketch below).
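For example, a sketch of that variant (the keyword usage mirrors the call above; using the Telegram message IDs as column names is our own, hypothetical choice):

msg_ids = pd.Index(signal_data.get("id").values, name="message_id")
pf_named = vbt.Portfolio.from_order_func(
    data,
    order_func_nb=order_func_nb,
    order_args=(signal_info, build_temp_info(signal_info)),  # fresh temp arrays!
    broadcast_kwargs=dict(tile=msg_ids),  # an index instead of a number
    max_orders=max_orders,
    freq="minute"
)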
In [84]:
# Let's print out the order records in a human-readable format
pf.orders.records_readable
Out[84]:
Order Id Column Index Size Price Fees Side
0 0 0 2021-09-14 12:43:00+00:00 0.055830 1791.160 0.0 Sell
1 1 0 2021-09-14 12:44:00+00:00 0.018610 1790.900 0.0 Buy
2 2 0 2021-09-14 12:50:00+00:00 0.018591 1789.400 0.0 Buy
3 3 0 2021-09-14 12:55:00+00:00 0.018629 1796.400 0.0 Buy
4 0 5 2021-09-17 11:00:00+00:00 0.056762 1761.727 0.0 Buy
... ... ... ... ... ... ... ...
150 3 226 2023-03-13 00:00:00+00:00 0.027003 1877.865 0.0 Buy
151 0 227 2023-03-13 10:17:00+00:00 0.053066 1884.455 0.0 Sell
152 1 227 2023-03-13 10:47:00+00:00 0.013174 1882.000 0.0 Buy
153 2 227 2023-03-13 12:29:00+00:00 0.039892 1895.000 0.0 Buy
154 0 229 2023-03-13 14:36:00+00:00 0.052468 1905.935 0.0 Sell

155 rows × 7 columns

  • Calculation of size in order_func_nb
    We treat our columns as independent backtests and assign each backtest $100 of capital. By default, without specifying the size, vbt will use the entire cash balance (that is, a size of infinity). Since each signal is executed at a different point in price history, the absolute size differs across signals, but they all use the same amount of cash, which makes them perfectly comparable during the simulation phase.
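We can verify this against the raw records: with the default initial capital of $100, the entry size is simply cash divided by the fill price (a quick check, assuming the default init_cash of 100):

print(100 / 1791.160)  # ~0.05582974, the size of the very first entry order above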

Creating the StopType Column

A position can be stopped out in one of two scenarios: either it hits a stop loss (SL) or it hits a take profit (TP). We collectively frame these two scenarios as a stop order type.

In [85]:
# We can notice above that there's no information whether an order is an SL or TP order
# What we can do is to create our own order records with custom fields, copy the old ones over,
# and tell the portfolio to use them instead of the default ones

# First, we need to create an enumerated field for stop types
# SL levels will come first, TP levels second, in an incremental fashion
StopTypeT = namedtuple("StopTypeT", [
    *[f"SL{i + 1}" for i in range(n_sl_levels)],
    *[f"TP{i + 1}" for i in range(n_tp_levels)]
])
StopType = StopTypeT(*range(len(StopTypeT._fields)))

print(StopType)
StopTypeT(SL1=0, TP1=1, TP2=2, TP3=3, TP4=4)
In [86]:
# To extend order records, we just need to append new fields and construct a new data type
custom_order_dt = np.dtype(vbt.pf_enums.order_fields + [("order_type", np.int_), ("stop_type", np.int_)])

def fix_order_records(order_records, signal_info, temp_info):
    # This is a function that will "fix" our default records and return the fixed ones
    # Create a new empty record array with the new data type
    # Empty here means that the array isn't initialized yet and contains junk data
    # Thus, make sure to override each single element
    custom_order_records = np.empty(order_records.shape, dtype=custom_order_dt)
    
    # Copy over the information from our default records
    for field, _ in vbt.pf_enums.order_fields:
        custom_order_records[field] = order_records[field]
        
    # Iterate over the new records and fill the stop type
    for i in range(len(custom_order_records)):
        record = custom_order_records[i]
        signal = record["col"]  # each column corresponds to a signal
        
        # Fill the order type
        record["order_type"] = signal_info.order_type[signal]
        
        # Concatenate SL and TP row indices of this signal into a new list
        # We must do it the same way as we did in StopTypeT
        bar = [
            *temp_info.sl_bar[signal],
            *temp_info.tp_bar[signal]
        ]
        
        # Check whether the row index of this order is in this list
        # (which means that this order is a stop order)
        if record["idx"] in bar:
            # If so, get the matching position in this list and use it as order type
            # It will correspond to a field in StopType
            record["stop_type"] = bar.index(record["idx"])
        else:
            record["stop_type"] = -1
    return custom_order_records
            
custom_order_records = fix_order_records(pf.order_records, signal_info, temp_info)
custom_order_records[:10]
Out[86]:
array([(0,  0, 4663, 0.05582974, 1791.16  , 0., 1, 1, -1),
       (1,  0, 4664, 0.01860971, 1790.9   , 0., 0, 1,  1),
       (2,  0, 4670, 0.01859054, 1789.4   , 0., 0, 1,  2),
       (3,  0, 4675, 0.01862949, 1796.4   , 0., 0, 1,  0),
       (0,  5, 5940, 0.05676248, 1761.727 , 0., 0, 0, -1),
       (1,  5, 6054, 0.05676248, 1755.    , 0., 1, 0,  0),
       (0, 21, 8746, 0.05683987, 1759.3285, 0., 1, 1, -1),
       (1, 21, 9180, 0.05683987, 1791.78  , 0., 0, 1,  0),
       (0, 27, 9663, 0.05568476, 1795.8235, 0., 1, 1, -1),
       (1, 27, 9782, 0.05568476, 1798.8   , 0., 0, 1,  0)],
      dtype=[('id', '<i8'), ('col', '<i8'), ('idx', '<i8'), ('size', '<f8'), ('price', '<f8'), ('fees', '<f8'), ('side', '<i8'), ('order_type', '<i8'), ('stop_type', '<i8')])
In [87]:
# Having raw order records is not enough as vbt.Orders doesn't know what to do with the new field
# (remember that vbt.Orders is used to analyze the records)
# Let's create our custom class that subclasses vbt.Orders
# and override the field config to also include the information on the new field

from vectorbtpro.records.decorators import attach_fields, override_field_config

@attach_fields(dict(
    order_type=dict(attach_filters=True),
    stop_type=dict(attach_filters=True)
))
@override_field_config(dict(
    dtype=custom_order_dt,  # specify the new data type
    settings=dict(
        order_type=dict(
            title="Order Type",  # specify a human-readable title for the field
            mapping=OrderType,  # specify the mapper for the field
        ),
        stop_type=dict(
            title="Stop Type",  # specify a human-readable title for the field
            mapping=StopType,  # specify the mapper for the field
        ),
    )
))
class CustomOrders(vbt.Orders):
    pass
An Orders class basically represents the raw order data in a more analysis-friendly fashion. For example, order records have a field "size", which can be analyzed by querying `pf.orders.size`. Since we've got new fields, we want to attach them in the same way. This is easily done using two decorators: "attach_fields" creates properties around new fields; for example, `pf.orders.side_buy` is an automatically created property that filters only the records with the side "Buy". The second decorator, "override_field_config", allows us to describe the fields and make them human-readable; for example, "title" is the name of the field whenever the user prints out `pf.orders.records_readable`.
In [88]:
# Finally, let's replace the order records and the class in the portfolio
pf = pf.replace(order_records=custom_order_records, orders_cls=CustomOrders)
In [89]:
# We can now effortlessly analyze the stop type
pf.orders.records_readable
Out[89]:
Order Id Column Index Size Price Fees Side Order Type Stop Type
0 0 0 2021-09-14 12:43:00+00:00 0.055830 1791.160 0.0 Sell SELL None
1 1 0 2021-09-14 12:44:00+00:00 0.018610 1790.900 0.0 Buy SELL TP1
2 2 0 2021-09-14 12:50:00+00:00 0.018591 1789.400 0.0 Buy SELL TP2
3 3 0 2021-09-14 12:55:00+00:00 0.018629 1796.400 0.0 Buy SELL SL1
4 0 5 2021-09-17 11:00:00+00:00 0.056762 1761.727 0.0 Buy BUY None
... ... ... ... ... ... ... ... ... ...
150 3 226 2023-03-13 00:00:00+00:00 0.027003 1877.865 0.0 Buy SELLSTOP SL1
151 0 227 2023-03-13 10:17:00+00:00 0.053066 1884.455 0.0 Sell SELL None
152 1 227 2023-03-13 10:47:00+00:00 0.013174 1882.000 0.0 Buy SELL TP1
153 2 227 2023-03-13 12:29:00+00:00 0.039892 1895.000 0.0 Buy SELL SL1
154 0 229 2023-03-13 14:36:00+00:00 0.052468 1905.935 0.0 Sell SELL None

155 rows × 9 columns

In [90]:
## Selecting only BUY Side orders
pf.orders.side_buy.records_readable
Out[90]:
Order Id Column Index Size Price Fees Side Order Type Stop Type
0 1 0 2021-09-14 12:44:00+00:00 0.018610 1790.900 0.0 Buy SELL TP1
1 2 0 2021-09-14 12:50:00+00:00 0.018591 1789.400 0.0 Buy SELL TP2
2 3 0 2021-09-14 12:55:00+00:00 0.018629 1796.400 0.0 Buy SELL SL1
3 0 5 2021-09-17 11:00:00+00:00 0.056762 1761.727 0.0 Buy BUY None
4 1 21 2021-10-14 00:00:00+00:00 0.056840 1791.780 0.0 Buy SELL SL1
... ... ... ... ... ... ... ... ... ...
89 1 226 2023-03-10 16:05:00+00:00 0.013292 1862.000 0.0 Buy SELLSTOP TP1
90 2 226 2023-03-10 16:19:00+00:00 0.013324 1857.000 0.0 Buy SELLSTOP TP2
91 3 226 2023-03-13 00:00:00+00:00 0.027003 1877.865 0.0 Buy SELLSTOP SL1
92 1 227 2023-03-13 10:47:00+00:00 0.013174 1882.000 0.0 Buy SELL TP1
93 2 227 2023-03-13 12:29:00+00:00 0.039892 1895.000 0.0 Buy SELL SL1

94 rows × 9 columns

In [91]:
# And here are the signals that correspond to these records for verification
signal_data.get()
Out[91]:
id OrderType EntryPrice SL TP1 TP2 TP3 TP4
date
2021-09-14 12:43:51+00:00 846 1 1792.4 1796.4 1790.9 1789.4 1787.4 NaN
2021-09-15 04:24:47+00:00 854 0 1800.0 1797.5 1805.0 1810.0 NaN NaN
2021-09-15 08:08:45+00:00 866 1 1802.5 1806.5 1801.0 1799.5 1797.5 NaN
2021-09-16 10:39:52+00:00 884 0 1780.0 NaN 1781.3 1783.3 NaN NaN
2021-09-16 12:48:36+00:00 887 0 1762.3 1758.3 1763.8 1765.3 1767.3 NaN
... ... ... ... ... ... ... ... ...
2023-03-13 10:17:39+00:00 4408 1 1885.0 1895.0 1882.0 1877.0 1865.0 1800.0
2023-03-13 13:08:21+00:00 4411 3 1896.0 1906.0 1893.0 1888.0 1870.0 1840.0
2023-03-13 14:36:09+00:00 4413 1 1907.0 1918.0 1903.0 1900.0 1890.0 1860.0
2023-03-15 06:27:23+00:00 4425 3 1899.0 1910.0 1896.0 1890.0 1880.0 1850.0
2023-03-16 13:03:03+00:00 4432 1 1929.0 1940.0 1926.0 1920.0 1915.0 1870.0

232 rows × 8 columns

In [92]:
pf.orders.count()
Out[92]:
0      4
1      0
2      0
3      0
4      0
      ..
227    3
228    0
229    1
230    0
231    0
Name: count, Length: 232, dtype: int64

Filtering Telegram Signals which got skipped

If we run np.flatnonzero(pf.orders.count() == 0), we get the row positions of the signals that were skipped. Since we have the ID column of the Telegram messages, we can use these indices to select the signal data, like this: signal_data.get().iloc[np.flatnonzero(pf.orders.count() == 0)].

In [93]:
signal_data.get().iloc[np.flatnonzero(pf.orders.count() == 0)]
Out[93]:
id OrderType EntryPrice SL TP1 TP2 TP3 TP4
date
2021-09-15 04:24:47+00:00 854 0 1800.0 1797.5 1805.0 1810.0 NaN NaN
2021-09-15 08:08:45+00:00 866 1 1802.5 1806.5 1801.0 1799.5 1797.5 NaN
2021-09-16 10:39:52+00:00 884 0 1780.0 NaN 1781.3 1783.3 NaN NaN
2021-09-16 12:48:36+00:00 887 0 1762.3 1758.3 1763.8 1765.3 1767.3 NaN
2021-09-21 13:42:40+00:00 942 1 1775.0 1776.0 1769.0 1766.0 NaN NaN
... ... ... ... ... ... ... ... ...
2023-03-08 15:15:01+00:00 4381 1 1820.0 1830.0 1817.0 1812.0 1805.0 1770.0
2023-03-08 15:38:05+00:00 4382 1 1823.0 1830.0 1820.0 1815.0 1805.0 1770.0
2023-03-13 13:08:21+00:00 4411 3 1896.0 1906.0 1893.0 1888.0 1870.0 1840.0
2023-03-15 06:27:23+00:00 4425 3 1899.0 1910.0 1896.0 1890.0 1880.0 1850.0
2023-03-16 13:03:03+00:00 4432 1 1929.0 1940.0 1926.0 1920.0 1915.0 1870.0

187 rows × 8 columns

In [94]:
print("Nr. Orders which got skipped:", (pf.orders.count() == 0).sum())
Nr. Orders which got skipped: 187
In [95]:
# We can see that some signals were skipped, let's remove them from the portfolio
pf = pf.loc[:, pf.orders.count() >= 1]
print(len(pf.wrapper.columns))
45

Analysis and Visualization of our Signals Backtesting

In [96]:
# There are various ways to analyze the data
# For example, we can count how many times each stop type was triggered
# Since we want to combine all trades in each statistic, we need to provide grouping

print(pf.orders.stop_type.stats(group_by=True))
Start                 2021-09-02 00:00:00+00:00
End                   2023-03-13 23:59:00+00:00
Period                         83 days 06:26:00
Count                                       155
Value Counts: None                           45
Value Counts: SL1                            34
Value Counts: TP1                            34
Value Counts: TP2                            30
Value Counts: TP3                             8
Value Counts: TP4                             4
Name: group, dtype: object

After removing the skipped signals, 45 columns remain, and the above stats show the distribution of stop types: None means the order was not a stop order (i.e., it was an entry order), the stop loss (SL1) was hit 34 times, while TP3 and TP4 were hit very sparingly, at 8 and 4 times respectively.

In [97]:
# We can also get the position stats for P&L information
pf.positions.stats(group_by=True)
Out[97]:
Start                         2021-09-02 00:00:00+00:00
End                           2023-03-13 23:59:00+00:00
Period                                 83 days 06:26:00
First Trade Start             2021-09-14 12:43:00+00:00
Last Trade End                2023-03-13 23:59:00+00:00
Coverage                               32 days 18:10:00
Overlap Coverage                        5 days 09:17:00
Total Records                                        45
Total Long Trades                                    11
Total Short Trades                                   34
Total Closed Trades                                  44
Total Open Trades                                     1
Open Trade PnL                                 -0.32732
Win Rate [%]                                  27.272727
Max Win Streak                                        1
Max Loss Streak                                       1
Best Trade [%]                                 2.876974
Worst Trade [%]                               -1.844539
Avg Winning Trade [%]                           0.94318
Avg Losing Trade [%]                           -0.36128
Avg Winning Trade Duration              1 days 09:37:30
Avg Losing Trade Duration        0 days 15:46:16.875000
Profit Factor                                  0.978998
Expectancy                                    -0.005518
SQN                                           -0.045627
Edge Ratio                                     1.497918
Name: group, dtype: object
In [98]:
pf.trades.records_readable
Out[98]:
Exit Trade Id Column Size Entry Order Id Entry Index Avg Entry Price Entry Fees Exit Order Id Exit Index Avg Exit Price Exit Fees PnL Return Direction Status Position Id
0 0 0 0.018610 0 2021-09-14 12:43:00+00:00 1791.1600 0.0 1 2021-09-14 12:44:00+00:00 1790.9000 0.0 0.004839 0.000145 Short Closed 0
1 1 0 0.018591 0 2021-09-14 12:43:00+00:00 1791.1600 0.0 2 2021-09-14 12:50:00+00:00 1789.4000 0.0 0.032719 0.000983 Short Closed 0
2 2 0 0.018629 0 2021-09-14 12:43:00+00:00 1791.1600 0.0 3 2021-09-14 12:55:00+00:00 1796.4000 0.0 -0.097619 -0.002925 Short Closed 0
3 0 5 0.056762 0 2021-09-17 11:00:00+00:00 1761.7270 0.0 1 2021-09-17 12:54:00+00:00 1755.0000 0.0 -0.381841 -0.003818 Long Closed 0
4 0 21 0.056840 0 2021-10-05 15:46:00+00:00 1759.3285 0.0 1 2021-10-14 00:00:00+00:00 1791.7800 0.0 -1.844539 -0.018445 Short Closed 0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
106 1 226 0.013324 0 2023-03-10 15:51:00+00:00 1865.0000 0.0 2 2023-03-10 16:19:00+00:00 1857.0000 0.0 0.106591 0.004290 Short Closed 0
107 2 226 0.027003 0 2023-03-10 15:51:00+00:00 1865.0000 0.0 3 2023-03-13 00:00:00+00:00 1877.8650 0.0 -0.347394 -0.006898 Short Closed 0
108 0 227 0.013174 0 2023-03-13 10:17:00+00:00 1884.4550 0.0 1 2023-03-13 10:47:00+00:00 1882.0000 0.0 0.032342 0.001303 Short Closed 0
109 1 227 0.039892 0 2023-03-13 10:17:00+00:00 1884.4550 0.0 2 2023-03-13 12:29:00+00:00 1895.0000 0.0 -0.420660 -0.005596 Short Closed 0
110 0 229 0.052468 0 2023-03-13 14:36:00+00:00 1905.9350 0.0 -1 2023-03-13 23:59:00+00:00 1912.1735 0.0 -0.327320 -0.003273 Short Open 0

111 rows × 16 columns

In [99]:
pf.trades.records_readable['Position Id'].unique()
Out[99]:
array([0])

Trades vs Orders:

pf.orders is our customized vectorBT representation of the various orders that result from simulating our signal data.

Trades in the vectorbtpro world are a bit different from what you would normally call trades. There are two types of trades: entry trades and exit trades. For example, a position may have several entry orders that increase the position and several exit orders that decrease it. The first are called entry trades, the second exit trades. pf.trades is the same as pf.exit_trades; for entry trades, you can query pf.entry_trades.
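A quick way to convince yourself of that equivalence (illustrative):

# pf.trades should expose the same records as pf.exit_trades
print(pf.trades.count().equals(pf.exit_trades.count()))  # True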

In [100]:
pf.entry_trades.records_readable
Out[100]:
Entry Trade Id Column Size Entry Order Id Entry Index Avg Entry Price Entry Fees Exit Order Id Exit Index Avg Exit Price Exit Fees PnL Return Direction Status Position Id
0 0 0 0.055830 0 2021-09-14 12:43:00+00:00 1791.1600 0.0 3 2021-09-14 12:55:00+00:00 1792.235783 0.0 -0.060061 -0.000601 Short Closed 0
1 0 5 0.056762 0 2021-09-17 11:00:00+00:00 1761.7270 0.0 1 2021-09-17 12:54:00+00:00 1755.000000 0.0 -0.381841 -0.003818 Long Closed 0
2 0 21 0.056840 0 2021-10-05 15:46:00+00:00 1759.3285 0.0 1 2021-10-14 00:00:00+00:00 1791.780000 0.0 -1.844539 -0.018445 Short Closed 0
3 0 27 0.055685 0 2021-10-14 08:03:00+00:00 1795.8235 0.0 1 2021-10-14 10:02:00+00:00 1798.800000 0.0 -0.165746 -0.001657 Short Closed 0
4 0 32 0.054156 0 2022-01-20 15:04:00+00:00 1846.5100 0.0 4 2022-01-26 19:03:00+00:00 1836.717318 0.0 0.530335 0.005303 Short Closed 0
5 0 35 0.053757 0 2022-02-14 14:37:00+00:00 1860.2400 0.0 1 2022-02-14 15:16:00+00:00 1868.000000 0.0 -0.417150 -0.004172 Short Closed 0
6 0 46 0.050660 0 2022-03-10 01:06:00+00:00 1973.9450 0.0 3 2022-03-10 11:09:00+00:00 1991.658877 0.0 0.897385 0.008974 Long Closed 0
7 0 47 0.050472 0 2022-03-10 07:05:00+00:00 1981.2930 0.0 3 2022-03-10 11:09:00+00:00 1994.663824 0.0 0.674853 0.006749 Long Closed 0
8 0 48 0.050042 0 2022-03-10 14:57:00+00:00 1998.3380 0.0 2 2022-03-14 00:00:00+00:00 1984.930000 0.0 -0.670958 -0.006710 Long Closed 0
9 0 50 0.050699 0 2022-03-14 01:18:00+00:00 1972.4250 0.0 2 2022-03-14 08:57:00+00:00 1970.666667 0.0 -0.089146 -0.000891 Long Closed 0
10 0 51 0.051462 0 2022-03-24 01:45:00+00:00 1943.1675 0.0 2 2022-03-24 11:40:00+00:00 1947.700070 0.0 -0.233257 -0.002333 Short Closed 0
11 0 52 0.051373 0 2022-03-24 11:32:00+00:00 1946.5545 0.0 2 2022-03-24 11:40:00+00:00 1947.275000 0.0 0.037014 0.000370 Long Closed 0
12 0 53 0.051217 0 2022-03-24 11:46:00+00:00 1952.4725 0.0 3 2022-03-30 00:00:00+00:00 1946.514917 0.0 -0.305130 -0.003051 Long Closed 0
13 0 56 0.052012 0 2022-03-30 01:25:00+00:00 1922.6325 0.0 2 2022-04-21 00:00:00+00:00 1942.427500 0.0 1.029578 0.010296 Long Closed 0
14 0 63 0.051380 0 2022-04-21 12:41:00+00:00 1946.2980 0.0 3 2022-05-09 00:00:00+00:00 1911.743445 0.0 1.775399 0.017754 Short Closed 0
15 0 70 0.054021 0 2022-05-11 09:06:00+00:00 1851.1345 0.0 3 2022-05-24 13:12:00+00:00 1851.202702 0.0 -0.003684 -0.000037 Short Closed 0
16 0 77 0.053794 0 2022-05-24 08:25:00+00:00 1858.9405 0.0 3 2022-07-19 00:00:00+00:00 1805.459265 0.0 2.876974 0.028770 Short Closed 0
17 0 111 0.056982 0 2022-08-19 07:01:00+00:00 1754.9380 0.0 3 2022-09-13 13:02:00+00:00 1733.215120 0.0 1.237815 0.012378 Short Closed 0
18 0 116 0.058411 0 2022-08-31 12:25:00+00:00 1712.0000 0.0 3 2022-09-13 13:02:00+00:00 1712.006528 0.0 0.000381 0.000004 Long Closed 0
19 0 123 0.058781 0 2022-09-13 13:25:00+00:00 1701.2235 0.0 3 2022-09-16 00:00:00+00:00 1691.510743 0.0 -0.570928 -0.005709 Long Closed 0
20 0 125 0.060044 0 2022-09-19 12:37:00+00:00 1665.4585 0.0 3 2022-09-21 18:35:00+00:00 1666.711569 0.0 -0.075239 -0.000752 Short Closed 0
21 0 126 0.059773 0 2022-09-21 13:44:00+00:00 1673.0000 0.0 3 2022-09-21 18:44:00+00:00 1673.018246 0.0 -0.001091 -0.000011 Short Closed 0
22 0 129 0.060758 0 2022-09-28 13:33:00+00:00 1645.8850 0.0 2 2022-09-30 06:54:00+00:00 1653.966033 0.0 -0.490984 -0.004910 Short Closed 0
23 0 130 0.060234 0 2022-09-28 17:11:00+00:00 1660.1975 0.0 3 2022-09-30 14:37:00+00:00 1662.044470 0.0 -0.111250 -0.001113 Short Closed 0
24 0 136 0.060110 0 2022-10-11 10:36:00+00:00 1663.6215 0.0 3 2022-10-13 12:45:00+00:00 1662.669895 0.0 -0.057201 -0.000572 Long Closed 0
25 0 139 0.060350 0 2022-10-13 12:39:00+00:00 1657.0000 0.0 3 2022-10-13 15:39:00+00:00 1657.026043 0.0 -0.001572 -0.000016 Short Closed 0
26 0 140 0.060096 0 2022-10-17 12:33:00+00:00 1664.0150 0.0 3 2022-11-06 23:00:00+00:00 1664.041533 0.0 -0.001594 -0.000016 Short Closed 0
27 0 155 0.056384 0 2022-11-16 03:34:00+00:00 1773.5600 0.0 2 2022-11-16 10:21:00+00:00 1780.358997 0.0 -0.383353 -0.003834 Short Closed 0
28 0 156 0.056114 0 2022-11-16 10:39:00+00:00 1782.0800 0.0 3 2022-12-05 00:00:00+00:00 1784.053532 0.0 -0.110743 -0.001107 Short Closed 0
29 0 159 0.057057 0 2022-11-25 08:20:00+00:00 1752.6295 0.0 2 2022-12-05 00:00:00+00:00 1782.766007 0.0 -1.719502 -0.017195 Short Closed 0
30 0 178 0.053879 0 2023-01-04 07:47:00+00:00 1856.0150 0.0 3 2023-01-06 19:41:00+00:00 1857.699209 0.0 -0.090743 -0.000907 Short Closed 0
31 0 179 0.054367 0 2023-01-06 02:24:00+00:00 1839.3350 0.0 3 2023-01-06 15:05:00+00:00 1843.785817 0.0 -0.241980 -0.002420 Short Closed 0
32 0 180 0.054268 0 2023-01-06 13:40:00+00:00 1842.7065 0.0 1 2023-01-06 13:47:00+00:00 1850.000000 0.0 -0.395804 -0.003958 Short Closed 0
33 0 181 0.054105 0 2023-01-06 14:04:00+00:00 1848.2500 0.0 3 2023-01-06 15:08:00+00:00 1850.355558 0.0 -0.113922 -0.001139 Short Closed 0
34 0 182 0.053310 0 2023-01-09 05:13:00+00:00 1875.8300 0.0 3 2023-01-15 23:00:00+00:00 1888.282490 0.0 -0.663839 -0.006638 Short Closed 0
35 0 187 0.052329 0 2023-01-17 05:30:00+00:00 1911.0000 0.0 3 2023-01-25 00:00:00+00:00 1916.413678 0.0 -0.283290 -0.002833 Short Closed 0
36 0 194 0.051787 0 2023-01-25 15:34:00+00:00 1931.0000 0.0 1 2023-01-25 17:33:00+00:00 1940.000000 0.0 -0.466080 -0.004661 Short Closed 0
37 0 195 0.051603 0 2023-01-25 17:58:00+00:00 1937.8700 0.0 3 2023-02-03 13:41:00+00:00 1916.134703 0.0 1.121608 0.011216 Short Closed 0
38 0 202 0.051975 0 2023-01-31 15:07:00+00:00 1924.0000 0.0 4 2023-02-03 15:10:00+00:00 1903.895890 0.0 1.044912 0.010449 Short Closed 0
39 0 221 0.053868 0 2023-03-06 07:32:00+00:00 1856.3735 0.0 4 2023-03-10 20:57:00+00:00 1854.667343 0.0 0.091908 0.000919 Short Closed 0
40 0 224 0.055002 0 2023-03-09 08:48:00+00:00 1818.1015 0.0 1 2023-03-09 14:20:00+00:00 1828.000000 0.0 -0.544442 -0.005444 Short Closed 0
41 0 225 0.054756 0 2023-03-09 14:26:00+00:00 1826.2950 0.0 1 2023-03-09 15:07:00+00:00 1835.000000 0.0 -0.476648 -0.004766 Short Closed 0
42 0 226 0.053619 0 2023-03-10 15:51:00+00:00 1865.0000 0.0 3 2023-03-13 00:00:00+00:00 1868.747273 0.0 -0.200926 -0.002009 Short Closed 0
43 0 227 0.053066 0 2023-03-13 10:17:00+00:00 1884.4550 0.0 2 2023-03-13 12:29:00+00:00 1891.772688 0.0 -0.388319 -0.003883 Short Closed 0
44 0 229 0.052468 0 2023-03-13 14:36:00+00:00 1905.9350 0.0 -1 2023-03-13 23:59:00+00:00 1912.173500 0.0 -0.327320 -0.003273 Short Open 0
In [101]:
pf.exit_trades.records_readable
Out[101]:
Exit Trade Id Column Size Entry Order Id Entry Index Avg Entry Price Entry Fees Exit Order Id Exit Index Avg Exit Price Exit Fees PnL Return Direction Status Position Id
0 0 0 0.018610 0 2021-09-14 12:43:00+00:00 1791.1600 0.0 1 2021-09-14 12:44:00+00:00 1790.9000 0.0 0.004839 0.000145 Short Closed 0
1 1 0 0.018591 0 2021-09-14 12:43:00+00:00 1791.1600 0.0 2 2021-09-14 12:50:00+00:00 1789.4000 0.0 0.032719 0.000983 Short Closed 0
2 2 0 0.018629 0 2021-09-14 12:43:00+00:00 1791.1600 0.0 3 2021-09-14 12:55:00+00:00 1796.4000 0.0 -0.097619 -0.002925 Short Closed 0
3 0 5 0.056762 0 2021-09-17 11:00:00+00:00 1761.7270 0.0 1 2021-09-17 12:54:00+00:00 1755.0000 0.0 -0.381841 -0.003818 Long Closed 0
4 0 21 0.056840 0 2021-10-05 15:46:00+00:00 1759.3285 0.0 1 2021-10-14 00:00:00+00:00 1791.7800 0.0 -1.844539 -0.018445 Short Closed 0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
106 1 226 0.013324 0 2023-03-10 15:51:00+00:00 1865.0000 0.0 2 2023-03-10 16:19:00+00:00 1857.0000 0.0 0.106591 0.004290 Short Closed 0
107 2 226 0.027003 0 2023-03-10 15:51:00+00:00 1865.0000 0.0 3 2023-03-13 00:00:00+00:00 1877.8650 0.0 -0.347394 -0.006898 Short Closed 0
108 0 227 0.013174 0 2023-03-13 10:17:00+00:00 1884.4550 0.0 1 2023-03-13 10:47:00+00:00 1882.0000 0.0 0.032342 0.001303 Short Closed 0
109 1 227 0.039892 0 2023-03-13 10:17:00+00:00 1884.4550 0.0 2 2023-03-13 12:29:00+00:00 1895.0000 0.0 -0.420660 -0.005596 Short Closed 0
110 0 229 0.052468 0 2023-03-13 14:36:00+00:00 1905.9350 0.0 -1 2023-03-13 23:59:00+00:00 1912.1735 0.0 -0.327320 -0.003273 Short Open 0

111 rows × 16 columns

Visualizing Individual Trades

Here, we're randomly selecting one column (i.e., one signal) and plotting its exit trades. Since the only orders that reduce our position are stop orders, each green/red box represents a stop order of this signal: a green box is a TP order, a red box is an SL order, while the blue box is the market order that initialized the position.

In [102]:
# Let's plot a random trade
# The only issue: we have too much data for that (thanks to Plotly)
# Thus, crop it before plotting to remove irrelevant data

signal = np.random.choice(len(pf.wrapper.columns))  ## e.g., column 6 hits all levels up to TP3
print("Signal (or) Column Nr:", signal) 
pf.trades.iloc[:, signal].crop().plot().show()
Signal (or) Column Nr: 6

This particular trade, with signal (column) number 6, hit all three take-profit levels. Yay! 😃

Order Entry Check within the Candle

The use case is clear: what if the Telegram group gives us signals at a price that is too optimistic to be executed at the current time? They could quote a much higher price for a SELL order to make the trade appear more profitable than it really is. Instead of manually checking whether the order price is within the expected range (OHLC), there's a property pf.orders.price_status that does the check for us and returns the status of each order; for instance, it returns "BelowLow" if the requested order price is lower than the low price of the bar where the order happened. With pf.orders.bar_high we get the high price of the bar where an order happened; we then call to_readable to make it a Pandas Series and add it to our DataFrame, and do the same for pf.orders.bar_low. By putting the high and low price columns alongside "Price", we can analyze whether each order price falls within its candle.

In [103]:
# Let's verify that the entry price stays within each candle
pd.concat((
    pf.orders.records_readable[["Column", "Order Type", "Stop Type", "Price"]],
    pf.orders.bar_high.to_readable(title="High", only_values=True),
    pf.orders.bar_low.to_readable(title="Low", only_values=True),
    pf.orders.price_status.to_readable(title="Price Status", only_values=True),
), axis=1)
Out[103]:
Column Order Type Stop Type Price High Low Price Status
0 0 SELL None 1791.160 1791.866 1790.982 OK
1 0 SELL TP1 1790.900 1791.436 1790.262 OK
2 0 SELL TP2 1789.400 1790.396 1789.156 OK
3 0 SELL SL1 1796.400 1797.156 1794.742 OK
4 5 BUY None 1761.727 1761.993 1761.736 BelowLow
... ... ... ... ... ... ... ...
150 226 SELLSTOP SL1 1877.865 1878.035 1875.445 OK
151 227 SELL None 1884.455 1884.715 1883.635 OK
152 227 SELL TP1 1882.000 1882.585 1881.685 OK
153 227 SELL SL1 1895.000 1898.335 1894.185 OK
154 229 SELL None 1905.935 1906.795 1905.795 OK

155 rows × 7 columns

In [104]:
pf.orders.price_status.stats(group_by=True)
Out[104]:
Start                     2021-09-02 00:00:00+00:00
End                       2023-03-13 23:59:00+00:00
Period                             83 days 06:26:00
Count                                           155
Value Counts: OK                                122
Value Counts: BelowLow                           33
Name: group, dtype: object
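To drill into the 33 flagged orders rather than just count them, a plain pandas filter over the same concatenated frame works. A minimal sketch (the variable names are ours, not from the notebook):

In [ ]:
# Rebuild the price-check frame and keep only the orders whose requested
# price falls outside the high-low range of their bar
price_check = pd.concat((
    pf.orders.records_readable[["Column", "Order Type", "Stop Type", "Price"]],
    pf.orders.bar_high.to_readable(title="High", only_values=True),
    pf.orders.bar_low.to_readable(title="Low", only_values=True),
    pf.orders.price_status.to_readable(title="Price Status", only_values=True),
), axis=1)
suspicious = price_check[price_check["Price Status"] != "OK"]
print(f"{len(suspicious)} of {len(price_check)} orders fall outside their candle")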
In [105]:
pf.orders.bar_high.to_readable()
Out[105]:
Id Column Index Value
0 0 0 2021-09-14 12:43:00+00:00 1791.866
1 1 0 2021-09-14 12:44:00+00:00 1791.436
2 2 0 2021-09-14 12:50:00+00:00 1790.396
3 3 0 2021-09-14 12:55:00+00:00 1797.156
4 0 5 2021-09-17 11:00:00+00:00 1761.993
... ... ... ... ...
150 3 226 2023-03-13 00:00:00+00:00 1878.035
151 0 227 2023-03-13 10:17:00+00:00 1884.715
152 1 227 2023-03-13 10:47:00+00:00 1882.585
153 2 227 2023-03-13 12:29:00+00:00 1898.335
154 0 229 2023-03-13 14:36:00+00:00 1906.795

155 rows × 4 columns

Merging Order Records for Portfolio Metrics

In [106]:
# Now, what if we're interested in portfolio metrics, such as the Sharpe ratio?
# The problem is that most metrics produce multiple (intermediate) time series of the full shape, which is
# disastrous for RAM since our data would have to be tiled by the number of columns.
# But here's a trick: merge the order records of all columns into one, as if we did the simulation on just one column!

def merge_order_records(order_records):
    merged_order_records = order_records.copy()
    
    # New records should have only one column
    merged_order_records["col"][:] = 0
    
    # Sort the records by the timestamp
    merged_order_records = merged_order_records[np.argsort(merged_order_records["idx"])]
    
    # Reset the order ids
    merged_order_records["id"][:] = np.arange(len(merged_order_records))
    return merged_order_records

merged_order_records = merge_order_records(custom_order_records)
pd.DataFrame(merged_order_records)
Out[106]:
id col idx size price fees side order_type stop_type
0 0 0 4663 0.055830 1791.160 0.0 1 1 -1
1 1 0 4664 0.018610 1790.900 0.0 0 1 1
2 2 0 4670 0.018591 1789.400 0.0 0 1 2
3 3 0 4675 0.018629 1796.400 0.0 0 1 0
4 4 0 5940 0.056762 1761.727 0.0 0 0 -1
... ... ... ... ... ... ... ... ... ...
150 150 0 118526 0.027003 1877.865 0.0 0 3 0
151 151 0 119143 0.053066 1884.455 0.0 1 1 -1
152 152 0 119173 0.013174 1882.000 0.0 0 1 1
153 153 0 119275 0.039892 1895.000 0.0 0 1 0
154 154 0 119402 0.052468 1905.935 0.0 1 1 -1

155 rows × 9 columns
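Before relying on the merged records, a quick sanity check doesn't hurt. A small sketch of our own, verifying the invariants that merge_order_records is supposed to establish:

In [ ]:
# After merging: a single column, monotonically non-decreasing bar indices,
# and order ids running from 0 to n-1
assert (merged_order_records["col"] == 0).all()
assert (np.diff(merged_order_records["idx"]) >= 0).all()
assert (merged_order_records["id"] == np.arange(len(merged_order_records))).all()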

In [107]:
# We also need to change the wrapper because it holds the information on our columns
merged_wrapper = pf.wrapper.replace(columns=[0], ndim=1)
In [108]:
# Is there any other array that requires merging?
# Let's introspect the portfolio instance and search for arrays of the full shape
print(pf)
# There are none, thus replace only the records and the wrapper
Portfolio(
    wrapper=ArrayWrapper(
        index=<pandas.core.indexes.datetimes.DatetimeIndex object at 0x1735a6a70 with shape (119906,)>,
        columns=<pandas.core.indexes.numeric.Int64Index object at 0x28cb121d0 with shape (45,)>,
        ndim=2,
        freq='minute',
        column_only_select=None,
        range_only_select=None,
        group_select=None,
        grouped_ndim=None,
        grouper=Grouper(
            index=<pandas.core.indexes.numeric.Int64Index object at 0x28cb121d0 with shape (45,)>,
            group_by=None,
            def_lvl_name='group',
            allow_enable=True,
            allow_disable=True,
            allow_modify=True
        )
    ),
    order_records=<numpy.ndarray object at 0x28d7fba50 with shape (155,)>,
    open=<numpy.ndarray object at 0x17365d5f0 with shape (119906, 1)>,
    high=<numpy.ndarray object at 0x17365e010 with shape (119906, 1)>,
    low=<numpy.ndarray object at 0x17365e1f0 with shape (119906, 1)>,
    close=<numpy.ndarray object at 0x17365da10 with shape (119906, 1)>,
    log_records=<numpy.ndarray object at 0x173686130 with shape (0,)>,
    cash_sharing=False,
    init_cash=<numpy.ndarray object at 0x28c9ee1f0 with shape (45,)>,
    init_position=<numpy.ndarray object at 0x28d20d8f0 with shape (45,)>,
    init_price=<numpy.ndarray object at 0x28d20d830 with shape (45,)>,
    cash_deposits=<numpy.ndarray object at 0x173662670 with shape (1, 1)>,
    cash_earnings=<numpy.ndarray object at 0x17365c870 with shape (1, 1)>,
    call_seq=None,
    in_outputs=None,
    use_in_outputs=None,
    bm_close=None,
    fillna_close=None,
    trades_type=None,
    orders_cls=<class '__main__.CustomOrders'>,
    logs_cls=None,
    trades_cls=None,
    entry_trades_cls=None,
    exit_trades_cls=None,
    positions_cls=None,
    drawdowns_cls=None
)
In [109]:
merged_pf = pf.replace(
    order_records=merged_order_records, 
    wrapper=merged_wrapper,
    init_cash="auto"
)

# Also, each of the previous individual portfolios used a starting capital of $100, which was used 100%,
# but since we merge the columns together, we may now require less starting capital.
# Thus, we will determine it automatically.
## init_cash = "auto" means the initial capital is chosen automatically, calculated from the max amount of cash we spent during all simulations.
## This is better than taking $100 and multiplying it by the number of signals, since then we would have inflated the returns.
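As a quick check of our own, the automatically resolved capital can be inspected via the init_cash property; it should correspond to the Start Value reported by the stats below:

In [ ]:
# "auto" resolves init_cash to the max amount of cash spent during the simulation
print("Auto-determined initial cash:", merged_pf.init_cash)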
In [110]:
# We can now get any portfolio statistic
print(merged_pf.stats())
Start                         2021-09-02 00:00:00+00:00
End                           2023-03-13 23:59:00+00:00
Period                                 83 days 06:26:00
Start Value                                   168.90342
Min Value                                    166.380452
Max Value                                    174.233989
End Value                                    168.333302
Total Return [%]                              -0.337541
Benchmark Return [%]                           5.439047
Total Time Exposure [%]                       39.339149
Max Gross Exposure [%]                            100.0
Max Drawdown [%]                               3.894613
Max Drawdown Duration                  38 days 07:03:00
Total Orders                                        155
Total Fees Paid                                     0.0
Total Trades                                        112
Win Rate [%]                                  61.261261
Best Trade [%]                                  8.00419
Worst Trade [%]                               -2.386677
Avg Winning Trade [%]                          0.642969
Avg Losing Trade [%]                          -0.712394
Avg Winning Trade Duration    0 days 14:31:48.529411764
Avg Losing Trade Duration     1 days 03:40:20.930232558
Profit Factor                                  0.983703
Expectancy                                    -0.002187
Sharpe Ratio                                  -0.161445
Calmar Ratio                                  -0.056725
Omega Ratio                                     0.99786
Sortino Ratio                                 -0.227766
dtype: object
In [111]:
# You may wonder why the win rate and other trade metrics are different here.
# There are two reasons:
# 1) Portfolio stats uses exit trades (previously we used positions), that is, each stop order is a trade
# 2) After merging, there's no more information on which order belongs to which trade, thus positions are built in sequential order

# But to verify that both portfolios match, we can compare the total profit to the previous trade P&L
print("Total Profit:",merged_pf.total_profit)
print("PnL Sum:",pf.trades.pnl.sum(group_by=True))
Total Profit: -0.5701185962467719
PnL Sum: -0.5701185962467079
In [112]:
# We can now plot the entire portfolio
merged_pf.resample("daily").plot().show()

Custom Order Simulator

Putting it all together from above

Need:
The main issue with using from_order_func is that we need to go over the entire data as many times as there are signals, because the order function is run on each element. A far more time-efficient approach is to process trades in sequential order. This is easily possible because our trades are perfectly sorted: we don't need to process a signal if the previous signal hasn't been processed yet.

Also, because the scope of this notebook assumes that signals are independent, we can simulate them independently and stop each signal's simulation once its position has been closed out. This is only possible by writing our own simulator (which isn't as scary as it sounds!).
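To illustrate the core idea before diving into Numba, here's a plain-Python sketch of our own showing the single-pass, two-pointer matching of sorted signal timestamps to bars, the same loop that signal_simulator_nb below performs:

In [ ]:
def match_signal_bars(index, signal_timestamps):
    """Walk bars and sorted signal timestamps together in a single pass."""
    bars = [-1] * len(signal_timestamps)  # -1 means "no matching bar found"
    signal, bar = 0, 0
    while signal < len(signal_timestamps) and bar < len(index):
        if index[bar] == signal_timestamps[signal]:
            bars[signal] = bar  # match: continue with the next signal and bar
            signal += 1
            bar += 1
        elif index[bar] > signal_timestamps[signal]:
            signal += 1  # already past this signal: it never gets a bar
        else:
            bar += 1  # haven't reached the signal yet: advance the bar
    return bars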

Let's build the simulator
Technically, it's just a regular Numba function that does whatever we want.
What's special about it is that it calls vectorbt's low-level API to place orders and updates the simulation state, such as cash balances and positions.

In [113]:
# To avoid duplicating our signal logic, we will re-use order_func_nb by passing our own limited context
# It will consist only of the fields that are required by our order_func_nb

from collections import namedtuple  # needed for the context below (may already be imported earlier in the notebook)

OrderContext = namedtuple("OrderContext", [
    "i",
    "col",
    "index",
    "open",  
    "high",
    "low",
    "close",
    "last_position"
])
In [114]:
## Nr. of TP Levels
signal_info.tp.shape[1]
Out[114]:
4
In [115]:
signal_data.get().shape[0]
Out[115]:
232
In [116]:
# We'll first determine the bars where the signals happen, and then run a smaller simulation per signal,
# starting with the first one. Once a signal's position has been closed out, we'll terminate its simulation
# and continue with the next signal, until all signals are processed.

@njit(boundscheck=True)
def signal_simulator_nb(index,
                        open,
                        high,
                        low,
                        close,
                        signal_info,
                        temp_info
                        ):
    # Determine the number of signals, levels, and potential orders
    n_signals = len(signal_info.timestamp)
    n_sl_levels = signal_info.sl.shape[1]
    n_tp_levels = signal_info.tp.shape[1]
    max_orders = 1 + n_sl_levels + n_tp_levels
    
    # TEMPORARY ARRAYS
    # This array will hold the bar where each signal happens
    signal_bars = np.full(n_signals, -1, dtype=np.int_)
    
    # This array will hold order records
    # Initially, order records are uninitialized (junk data) but we will fill them gradually
    # Notice how we use our own data type custom_order_dt - we can fill order type and stop type fields right during the simulation
    order_records = np.empty((max_orders, n_signals), dtype=custom_order_dt)
    
    # To be able to distinguish between uninitialized and initialized (filled) orders,
    # we'll create another array holding the number of filled orders for each signal
    # For example, if order_records has a maximum of 6 rows and only one record is filled,
    # order_counts will be 1 for this signal, so vectorbt can remove 5 unfilled orders later
    order_counts = np.full(n_signals, 0, dtype=np.int_)
    
    # order_func_nb requires last_position, which holds the position of each signal
    last_position = np.full(n_signals, 0.0, dtype=np.float_)
    
    # First, we need to determine the bars where the signals happen
    # Even though we know their timestamps, we need to translate them into absolute indices
    signal = 0
    bar = 0
    while signal < n_signals and bar < len(index):
        if index[bar] == signal_info.timestamp[signal]:
            # If there's a match, save the bar and continue with the next signal on the next bar
            signal_bars[signal] = bar
            signal += 1
            bar += 1
        elif index[bar] > signal_info.timestamp[signal]:
            # If we're past the signal, continue with the next signal on the same bar
            signal += 1
        else:
            # If we haven't hit the signal yet, continue on the next bar
            bar += 1

    # Once we know the bars, we can iterate over signals in a loop and simulate them independently
    for signal in range(n_signals):
        
        # If no bar was matched for this signal above, skip its simulation
        from_bar = signal_bars[signal]
        if from_bar == -1:
            continue
            
        # This is our initial execution state; most importantly, it holds the cash balance
        exec_state = vbt.pf_enums.ExecState(
            cash=100.0,         # We'll start with a starting capital of $100
            position=0.0,
            debt=0.0,
            locked_cash=0.0,
            free_cash=100.0,
            val_price=np.nan,
            value=np.nan
        )
            
        # Here comes the actual simulation that starts from the signal's bar and ends either once we processed all bars
        #  or once the position has been closed out (see below)
        for bar in range(from_bar, len(index)):
            
            # Create a named tuple holding the current context (this is "c" in order_func_nb)
            c = OrderContext(  
                i=bar,
                col=signal,
                index=index,
                open=open,
                high=high,
                low=low,
                close=close,
                last_position=last_position,
            )
            
            # If the first bar has no data, skip the simulation
            if bar == from_bar and not has_data_nb(c):
                break

            # Price area holds the OHLC of the current bar
            price_area = vbt.pf_enums.PriceArea(
                vbt.flex_select_nb(open, bar, signal), 
                vbt.flex_select_nb(high, bar, signal), 
                vbt.flex_select_nb(low, bar, signal), 
                vbt.flex_select_nb(close, bar, signal)
            )
            
            # Why do we need to redefine the execution state?
            # Because we need to manually update the valuation price and the value of the column
            # to be able to use complex size types such as target percentages
            # As in order_func_nb, we will use the opening price as the valuation price
            # Why doesn't vectorbt do it on its own? Because it doesn't know anything about other columns. 
            # For example, imagine having a grouped simulation with 100 columns sharing the same cash: 
            # Using the formula below wouldn't consider the positions of other 99 columns.
            exec_state = vbt.pf_enums.ExecState(
                cash=exec_state.cash,
                position=exec_state.position,
                debt=exec_state.debt,
                locked_cash=exec_state.locked_cash,
                free_cash=exec_state.free_cash,
                val_price=price_area.open,
                value=exec_state.cash + price_area.open * exec_state.position
            )
            
            # Let's run the order function, which returns an order
            # Remember when we used order_nothing_nb()? It also returns an order, but one filled with NaNs
            order = order_func_nb(c, signal_info, temp_info)
            
            # Here's the main function in this entire simulation, which 
            # 1) executes the order,
            # 2) updates the execution state, and 
            # 3) updates the order_records and order_counts
            order_result, exec_state = vbt.pf_nb.process_order_nb(
                signal, ## For Grouping
                signal, ## For column
                bar,
                exec_state=exec_state,
                order=order,
                price_area=price_area,
                order_records=order_records,
                order_counts=order_counts
            )
            
            # When there's no grouping, group = column; columns in our case are signals.
            # If the order was successful (i.e., it's now in order_records),
            # we need to manually set the order type and stop type
            if order_result.status == vbt.pf_enums.OrderStatus.Filled:
                
                # Get the last filled order of this signal
                filled_order = order_records[order_counts[signal] - 1, signal]
                
                # Fill the order type
                filled_order["order_type"] = signal_info.order_type[signal]
                
                # Fill the stop type by going through the SL and TP levels and checking whether 
                # the order bar matches the level bar
                order_is_stop = False
                for k in range(n_sl_levels):
                    if filled_order["idx"] == temp_info.sl_bar[signal, k]:
                        filled_order["stop_type"] = k
                        order_is_stop = True
                        break
                for k in range(n_tp_levels):
                    if filled_order["idx"] == temp_info.tp_bar[signal, k]:
                        filled_order["stop_type"] = n_sl_levels + k  # TP indices come after SL indices
                        order_is_stop = True
                        break
                
                # If order bar hasn't been matched, it's not a stop order
                if not order_is_stop:
                    filled_order["stop_type"] = -1
                    
            # If we've entered and are no longer in a position, terminate the simulation
            if temp_info.entry_price_bar[signal] != -1:
                if exec_state.position == 0:
                    break
                    
            # Don't forget to update the position array
            last_position[signal] = exec_state.position
        
    # Remove uninitialized order records and flatten 2d array into a 1d array
    return vbt.nb.repartition_nb(order_records, order_counts)
In [117]:
# Numba requires arrays in a NumPy format, and to avoid preparing them each time,
# let's create a function that only takes the data and signal information, and does everything else for us

def signal_simulator(data, signal_info):
    temp_info = build_temp_info(signal_info)
    
    custom_order_records = signal_simulator_nb(
        index = data.index.vbt.to_ns(),  # convert to nanoseconds
        open = vbt.to_2d_array(data.open),  # flexible indexing requires inputs to be 2d
        high = vbt.to_2d_array(data.high),
        low = vbt.to_2d_array(data.low),
        close = vbt.to_2d_array(data.close),
        signal_info = signal_info,
        temp_info = temp_info
    )
    
    # We have order records, what's left is wrapping them with a Portfolio
    # Required are three things: 
    # 1) array wrapper with index and columns,
    # 2) order records, and 
    # 3) prices
    # We also need to specify the starting capital that we used during the simulation
    return vbt.Portfolio(
        wrapper=vbt.ArrayWrapper(
            index=data.index, 
            columns=pd.Index(signal_data.get("id").values, name="id"),  # one column per signal, ids as column names
            freq="minute"),
        order_records=custom_order_records,
        open=data.open,
        high=data.high,
        low=data.low,
        close=data.close,
        init_cash=100.0, ## starting capital
        orders_cls=CustomOrders
    )
In [118]:
# That's it!
pf = signal_simulator(data, signal_info)
print('PnL:',pf.trades.pnl.sum(group_by=True))
PnL: -0.5701185962467079
In [119]:
pf.wrapper.columns
Out[119]:
Int64Index([ 846,  854,  866,  884,  887,  898,  942, 1037, 1044, 1063,
            ...
            4381, 4382, 4387, 4390, 4400, 4408, 4411, 4413, 4425, 4432],
           dtype='int64', name='id', length=232)

In the vbt.ArrayWrapper above, inside our signal_simulator function, we pass pd.Index(signal_data.get("id").values, name="id") to the columns argument so that additional information about each signal can be retrieved via its column id in the signal data. The message ids will now be displayed as column names and you can analyze them individually (for example, pf[id].stats() gives you the statistics of a single signal's equity curve). Just don't analyze them all at once, since that would use lots of RAM.

In [120]:
id = 846
pf[id].stats()
Out[120]:
Start                         2021-09-02 00:00:00+00:00
End                           2023-03-13 23:59:00+00:00
Period                                 83 days 06:26:00
Start Value                                       100.0
Min Value                                     99.939939
Max Value                                    100.077239
End Value                                     99.939939
Total Return [%]                              -0.060061
Benchmark Return [%]                           5.439047
Total Time Exposure [%]                        0.010008
Max Gross Exposure [%]                            100.0
Max Drawdown [%]                               0.137193
Max Drawdown Duration                  80 days 00:35:00
Total Orders                                          4
Total Fees Paid                                     0.0
Total Trades                                          3
Win Rate [%]                                  66.666667
Best Trade [%]                                  0.09826
Worst Trade [%]                               -0.292548
Avg Winning Trade [%]                          0.056388
Avg Losing Trade [%]                          -0.292548
Avg Winning Trade Duration              0 days 00:04:00
Avg Losing Trade Duration               0 days 00:12:00
Profit Factor                                  0.384741
Expectancy                                     -0.02002
Sharpe Ratio                                  -1.542342
Calmar Ratio                                  -0.286392
Omega Ratio                                    0.610956
Sortino Ratio                                 -1.929536
Name: 846, dtype: object
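Since analyzing all columns at once would blow up RAM, a RAM-friendlier pattern (a sketch of our own) is to loop over a handful of signal ids and compute one scalar metric at a time:

In [ ]:
# Compute a single metric per signal without materializing full-shape time series
for sid in pf.wrapper.columns[:5]:
    print(sid, pf[sid].total_return)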

Summary of signal_simulator function

  • Basically, we take from_order_func, remove the redundant parts from it, and introduce some optimizations. With from_order_func, each signal (column) required going through the entire data from start to end, but our custom simulator can skip those parts, which makes it much, much faster.
  • We also create the temporary arrays and write our own custom records in-place instead of fixing them after the simulation, which is very convenient.
  • The only part that isn't integrated is merging the records; however, having the records partitioned by signal provides more depth and is overall better for analysis, and the user can still merge them at any time after the simulation with the functions developed previously (see the recap sketch below).
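For completeness, a brief recap sketch reusing the functions defined earlier; we assume the raw records of the custom simulation are accessible via pf.order_records, as the Portfolio repr above suggests:

In [ ]:
# Merge the per-signal order records of the custom simulation and re-wrap them
merged_records = merge_order_records(pf.order_records)
merged_pf = pf.replace(
    order_records=merged_records,
    wrapper=pf.wrapper.replace(columns=[0], ndim=1),
    init_cash="auto"
)
print(merged_pf.stats())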