Discretionary Signal Backtesting

Description

In this project we backtest historical signals from a Telegram Channel using VectorBT Pro.
Author(s): Oleg Polakow (VBT Pro Simulations), Dilip Rajkumar (Signal Extraction)

In [56]:
import numpy as np
import pandas as pd
from numba import njit
import vectorbtpro as vbt

vbt.settings.set_theme("dark")
vbt.settings['plotting']['layout']['width'] = 1280

Read Market Data and Extracted Telegram Signal Data

In [57]:
# Fetch data
def date_parser(timestamps):
    # The first column contains integer timestamps; parse them into a DatetimeIndex
    return pd.to_datetime(timestamps, utc=True, unit="ms")

## Read OHLCV for XAUUSD downloaded from Dukascopy
data = vbt.CSVData.fetch("/Users/dilip.rajkumar/Documents/TeleGram_Signal_Extraction/data/xauusd_202109_202303.csv", date_parser=date_parser)
data.get()
Out[57]:
open high low close volume
timestamp
2021-09-02 00:00:00+00:00 1813.2250 1813.732 1813.222 1813.5345 0.0417
2021-09-02 00:01:00+00:00 1813.5245 1813.742 1813.621 1813.4850 0.0374
2021-09-02 00:02:00+00:00 1813.5250 1813.802 1813.531 1813.4700 0.0313
2021-09-02 00:03:00+00:00 1813.5000 1813.742 1813.542 1813.3750 0.0301
2021-09-02 00:04:00+00:00 1813.3700 1813.772 1813.472 1813.5550 0.0449
... ... ... ... ... ...
2023-03-13 23:55:00+00:00 1911.9150 1912.305 1912.025 1912.1550 0.0079
2023-03-13 23:56:00+00:00 1912.1650 1912.315 1911.995 1911.8750 0.0048
2023-03-13 23:57:00+00:00 1911.8900 1912.385 1912.035 1912.1035 0.0145
2023-03-13 23:58:00+00:00 1912.1135 1912.715 1912.282 1912.5100 0.0194
2023-03-13 23:59:00+00:00 1912.5185 1912.695 1912.342 1912.1735 0.0272

119906 rows × 5 columns

In [58]:
# Fetch Telegram signals extracted from "Green Pips" ( https://t.me/forexbookspdf )
signal_data = vbt.CSVData.fetch("/Users/dilip.rajkumar/Documents/TeleGram_Signal_Extraction/data/TG_Extracted_Signals.csv", index_col=1)
print("Telegram Signal DF Shape:",signal_data.wrapper.shape)
signal_data.get()
Telegram Signal DF Shape: (543, 10)
Out[58]:
id message Symbol OrderType EntryPrice SL TP1 TP2 TP3 TP4
date
2021-09-14 05:57:17+00:00 840 Gbpchf sell now @1.27590 - 1.27890\n\nSL: 1.28... GBPCHF SELL 1.2789 1.28 1.27 NaN NaN NaN
2021-09-14 05:57:30+00:00 841 Gbpaud sell now @1.88550 - 1.88750\n\nSL: 1.89... GBPAUD SELL 1.8875 1.89 1.88 NaN NaN NaN
2021-09-14 06:28:04+00:00 843 Eurjpy buy now @ 129.980\n\nSL: 129.580\nTP1: ... EURJPY BUY 129.9800 129.58 130.13 130.28 130.48 NaN
2021-09-14 08:39:48+00:00 844 Gbpjpy Sell now @ 152.650\n\nTP1: 152.500 (15... GBPJPY SELL 152.6500 153.05 152.50 152.35 152.15 NaN
2021-09-14 12:43:51+00:00 846 XAUUSD sell now@ 1792.4\n\nTP1: 1790.9 (15 pip... XAUUSD SELL 1792.4000 1796.40 1790.90 1789.40 1787.40 NaN
... ... ... ... ... ... ... ... ... ... ...
2023-03-13 10:17:39+00:00 4408 XAUUSD SELL 1885.00\nSL 1895\nTP 1882\nTP 18... XAUUSD SELL 1885.0000 1895.00 1882.00 1877.00 1865.00 1800.0
2023-03-13 13:08:21+00:00 4411 XAUUSD SELL STOP 1896.00\nSL 1906\nTP 1893\n... XAUUSD SELL STOP 1896.0000 1906.00 1893.00 1888.00 1870.00 1840.0
2023-03-13 14:36:09+00:00 4413 XAUUSD SELL 1907.00\nSL 1918\nTP 1903\nTP 1... XAUUSD SELL 1907.0000 1918.00 1903.00 1900.00 1890.00 1860.0
2023-03-15 06:27:23+00:00 4425 XAUUSD SELL STOP 1899.00\nSL 1910\nTP 1896\... XAUUSD SELL STOP 1899.0000 1910.00 1896.00 1890.00 1880.00 1850.0
2023-03-16 13:03:03+00:00 4432 XAUUSD SELL 1929.00\nSL 1940\nTP 1926\nTP 1... XAUUSD SELL 1929.0000 1940.00 1926.00 1920.00 1915.00 1870.0

543 rows × 10 columns

In [59]:
df = signal_data.get()
df.OrderType = df.OrderType.apply(lambda x: x.rstrip()) ## Remove trailing whitespace from the string using rstrip()
df.Symbol = df.Symbol.apply(lambda x: x.rstrip())
df[df['EntryPrice'] == 0] ## Check for signals with zero entry prices
Out[59]:
id message Symbol OrderType EntryPrice SL TP1 TP2 TP3 TP4
date
In [60]:
df[df['EntryPrice'].isna()]
Out[60]:
id message Symbol OrderType EntryPrice SL TP1 TP2 TP3 TP4
date
In [61]:
## Distinct order types present in the Telegram signals for XAUUSD
print(df['OrderType'][(df['Symbol']=='XAUUSD')].unique())
['SELL' 'BUY' 'BUY STOP' 'SELL STOP']
In [62]:
## Check for BUY STOP Orders in XAUUSD Symbol
df[(df['Symbol']=='XAUUSD') & (df['OrderType'] == 'BUY STOP')]
Out[62]:
id message Symbol OrderType EntryPrice SL TP1 TP2 TP3 TP4
date
2022-07-15 13:01:54+00:00 2837 XAUUSD BUY STOP 1710.00\nSL 1697\nTP 1713\nTP ... XAUUSD BUY STOP 1710.0 1697.0 1713.0 1719.0 1770.0 NaN
2022-08-11 07:59:19+00:00 3002 XAUUSD BUY STOP 1788.00 \nSL 1777\nTP 1791\nT... XAUUSD BUY STOP 1788.0 1777.0 1791.0 1797.0 1850.0 NaN
2022-08-31 12:24:05+00:00 3110 XAUUSD BUY STOP 1712.00 \nSL 1700\nTP 1715\nTP... XAUUSD BUY STOP 1712.0 1700.0 1715.0 1721.0 1780.0 NaN

Creation of various NamedTuples

In [63]:
# Numba doesn't understand strings, thus create an enumerated type for order types
from collections import namedtuple

# Create a type first
OrderTypeT = namedtuple("OrderTypeT", ["BUY", "SELL", "BUYSTOP", "SELLSTOP"])

# Then create a tuple of type: OrderTypeT
OrderType = OrderTypeT(*range(len(OrderTypeT._fields)))

print(OrderType)
OrderTypeT(BUY=0, SELL=1, BUYSTOP=2, SELLSTOP=3)
In [64]:
## You could have also created the named tuple with `typing.NamedTuple`,
## but some versions of Numba don't like it, so we access the
## field names and typename like this:
print("NamedTuple Fields:", OrderTypeT._fields)
print("NamedTuple Name  :",   OrderTypeT.__name__)
NamedTuple Fields: ('BUY', 'SELL', 'BUYSTOP', 'SELLSTOP')
NamedTuple Name  : OrderTypeT
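Since the transform step below maps strings back to these integers, note that `OrderType._fields.index(name)` is the reverse lookup. A quick sanity check (illustrative, not from the original notebook):

assert OrderType._fields.index("SELL") == OrderType.SELL  # 1
assert OrderType._fields.index("BUYSTOP") == OrderType.BUYSTOP  # 2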

Mapping the OrderType column from string values to integers in the OrderType namedtuple

In [65]:
def transform_signal_data(df: pd.DataFrame, symbol : str = 'XAUUSD'):
    '''Transform OrderType Column to numerical type for numba'''
    # Select only one symbol, the one we fetched the data for
    print("DF All Columns:",df.columns.tolist())
    df = df[df["Symbol"] == symbol]
    
    # Select columns of interest
    df = df.iloc[:, [0, 3, 4, 5, 6, 7, 8, 9]]
    print("DF Sel Int. Columns:",df.columns.tolist())
    # Map order types using OrderType
    df["OrderType"] = df["OrderType"].map(lambda x: OrderType._fields.index(x.replace(" ", "")))
    
    # Filter out zero entry prices, just in case (the checks above found none)
    df = df[df["EntryPrice"] > 0]
    
    return df

signal_data = signal_data.transform(transform_signal_data)

print("Final Signal DF Shape:",signal_data.wrapper.shape)
DF All Columns: ['id', 'message', 'Symbol', 'OrderType', 'EntryPrice', 'SL', 'TP1', 'TP2', 'TP3', 'TP4']
DF Sel Int. Columns: ['id', 'OrderType', 'EntryPrice', 'SL', 'TP1', 'TP2', 'TP3', 'TP4']
Final Signal DF Shape: (232, 8)

We have 232 signals for XAUUSD

In [66]:
## OrderType column now remapped
signal_data.get()
Out[66]:
id OrderType EntryPrice SL TP1 TP2 TP3 TP4
date
2021-09-14 12:43:51+00:00 846 1 1792.4 1796.4 1790.9 1789.4 1787.4 NaN
2021-09-15 04:24:47+00:00 854 0 1800.0 1797.5 1805.0 1810.0 NaN NaN
2021-09-15 08:08:45+00:00 866 1 1802.5 1806.5 1801.0 1799.5 1797.5 NaN
2021-09-16 10:39:52+00:00 884 0 1780.0 NaN 1781.3 1783.3 NaN NaN
2021-09-16 12:48:36+00:00 887 0 1762.3 1758.3 1763.8 1765.3 1767.3 NaN
... ... ... ... ... ... ... ... ...
2023-03-13 10:17:39+00:00 4408 1 1885.0 1895.0 1882.0 1877.0 1865.0 1800.0
2023-03-13 13:08:21+00:00 4411 3 1896.0 1906.0 1893.0 1888.0 1870.0 1840.0
2023-03-13 14:36:09+00:00 4413 1 1907.0 1918.0 1903.0 1900.0 1890.0 1860.0
2023-03-15 06:27:23+00:00 4425 3 1899.0 1910.0 1896.0 1890.0 1880.0 1850.0
2023-03-16 13:03:03+00:00 4432 1 1929.0 1940.0 1926.0 1920.0 1915.0 1870.0

232 rows × 8 columns

In [67]:
# Create named tuples which will act as containers for various arrays

# SignalInfo will contain signal information in a vbt-friendly format
# Rows in each array correspond to signals
SignalInfo = namedtuple(typename = "SignalInfo", field_names =  [
    "timestamp",  # 1d array with timestamps in nanosecond format (int64)
    "order_type",  # 1d array with order types in integer format (int64, see order_type_map)
    "entry_price",  # 1d array with entry price (float64)
    "sl",  # 2d array where columns are SL levels (float64)
    "tp",  # 2d array where columns are TP levels (float64)
])

# TempInfo will contain temporary information that will be written during backtesting
# You can imagine it as a buffer that we write to and then access at a later time
# Rows in each array correspond to signals
TempInfo = namedtuple(typename = "TempInfo", field_names = [
    "ts_bar",           # 1d array with row indices of the timestamp where the signal was hit (int64)
    "entry_price_bar",  # 1d array with row indices where entry price was hit (int64)
    "sl_bar",  # 2d array with row indices where each SL level was hit, same shape as SignalInfo.sl (int64)
    "tp_bar",  # 2d array with row indices where each TP level was hit, same shape as SignalInfo.tp (int64)
])

In the TempInfo namedtuple above, the purpose of ts_bar is to mark the bar (i.e., the absolute row index) where we first saw the signal. We could have also created a boolean array with True for "signal seen", but for consistency we use the same namedtuple format, which stores richer information. ts_bar is just a one-dimensional array where each element corresponds to a column (signal). Before the signal is hit, the value is -1. After that, it becomes the row index and never changes. Having this array is important because we want to skip our logic if there is no signal yet.
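A minimal sketch of this pattern in plain NumPy (illustrative, not from the notebook):

import numpy as np

ts_bar = np.full(3, -1)  # one slot per signal; -1 means "signal not seen yet"
ts_bar[1] = 4711         # signal 1 was first seen at bar 4711
print(ts_bar != -1)      # [False  True False] -> only signal 1 is active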

order_func_nb - A Numba Compiled Order Function

The order_func_nb handles the execution of entry and exit orders for our signals, along with position management. Here's what the order_func_nb below does:

  • Represent each signal as a separate column with its own starting capital
  • The order function is executed at each bar and for each column (signal in our case). If the current bar contains a signal, execute the signal logic
  • Order functions can issue only one order per bar, thus if multiple stops were hit, we will aggregate them
  • We will go all in and then gradually reduce the position based on the number of stops
  • Sometimes two signals occur right next to each other with different numbers of stop levels. To keep the number of rows (i.e., levels) consistent, all stops across all signals are stored in a single array, so they must have the same number of rows (ladder steps/levels); columns with fewer stops are simply padded with NaN (see the sketch after this list). You could also run the simulation on each signal separately (a bit slower), in which case no padding would be needed.
  • Finally, we will run this order function order_func_nb using Portfolio.from_order_func()
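Here is a minimal sketch of the NaN padding mentioned above (plain NumPy, not part of the notebook):

import numpy as np

# Two signals with SL ladders of different lengths: pad the shorter ladder
# with NaN so both fit into one 2d array of shape (n_signals, n_levels)
sl = np.array([
    [12.35, 12.29],   # signal A defines two SL levels
    [17.53, np.nan],  # signal B defines only one level, padded with NaN
])
print(sl.shape)  # (2, 2)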
In [68]:
@njit
def has_data_nb(c):
    """
    Numba function to check whether the OHLC values are not NaN.
    If any of O, H, L, C is NaN, returns False; otherwise returns True.
    """
    if np.isnan(vbt.pf_nb.select_nb(c, c.open)):
        return False
    if np.isnan(vbt.pf_nb.select_nb(c, c.high)):
        return False
    if np.isnan(vbt.pf_nb.select_nb(c, c.low)):
        return False
    if np.isnan(vbt.pf_nb.select_nb(c, c.close)):
        return False
    return True

## Wrapper function to call the vbt function
@njit
def check_price_hit_nb(c, price, hit_below, can_use_ohlc):
    # Numba function to check whether a price level was hit during this bar
    # Use hit_below = True to check against low and hit_below = False to check against high
    # If can_use_ohlc is False, will check only against the close price
    
    order_price, hit_on_open, hit = vbt.pf_nb.check_price_hit_nb(
        open = vbt.pf_nb.select_nb(c, c.open),  # OHLC are flexible arrays, always use select_nb!
        high = vbt.pf_nb.select_nb(c, c.high),
        low  = vbt.pf_nb.select_nb(c, c.low),
        close= vbt.pf_nb.select_nb(c, c.close),
        price=price,
        hit_below=hit_below,
        can_use_ohlc=can_use_ohlc
    )
    # Order price here isn't necessarily the price that has been hit
    # For example, if the price was hit before open, order price is set to the open price
    return order_price, hit

@njit(boundscheck=True)
def order_func_nb(c, signal_info, temp_info):  # first argument is context object, other are our namedTuple containers
    if not has_data_nb(c):
        # If this bar contains no data, skip it
        return vbt.pf_nb.order_nothing_nb()
    
    # Each column corresponds to a signal
    signal = c.col
    
    # Each row corresponds to a bar
    bar = c.i
    
    # Define various flags for pure convenience
    buy_market = signal_info.order_type[signal] == OrderType.BUY
    sell_market = signal_info.order_type[signal] == OrderType.SELL
    buy_stop = signal_info.order_type[signal] == OrderType.BUYSTOP
    sell_stop = signal_info.order_type[signal] == OrderType.SELLSTOP
    buy = buy_market or buy_stop
    # We have only `buy = buy_market or buy_stop` because Selling means not buying, 
    # so we only need to check whether it's a buy operation, and if not, 
    # it's automatically a sell operation (i.e. `sell = not buy`)
    
    # First, we need to check whether the current bar contains a signal
    can_use_ohlc = True
    if temp_info.ts_bar[signal] == -1:
        # Check whether the signal has been discovered
        # -1 means hasn't been discovered yet    
        if c.index[bar] == signal_info.timestamp[signal]:
            # If so, store the current row index in a temporary array
            # such that later we know that we already discovered a signal
            temp_info.ts_bar[signal] = bar

            # The signal has the granularity of seconds, thus it belongs somewhere in the bar
            # We need to notify the functions below that they cannot use the full OHLC information, only the close
            # This is to avoid using prices that technically happened before the signal
            can_use_ohlc = False
        
    # Here comes the entry order
    if temp_info.ts_bar[signal] != -1:        
        # Then, check whether the entry order hasn't been executed
        if temp_info.entry_price_bar[signal] == -1:            
            # If so, execute the entry order
            if buy_market:
                # Buy market order (using closing price)
                # Store the current row index in a temporary array such that future bars know
                # that the order has already been executed
                temp_info.entry_price_bar[signal] = bar
                return vbt.pf_nb.order_nb(np.inf, np.inf)  # size, price (np.inf price means the close price)
            if sell_market:
                # Sell market order (using closing price)
                temp_info.entry_price_bar[signal] = bar
                return vbt.pf_nb.order_nb(-np.inf, np.inf)
            
            if buy_stop: # Buy stop order
                # A buy stop order is entered at a stop price above the current market price                
                # Since it's a pending order, we first need to check whether the entry price has been hit
                order_price, hit = check_price_hit_nb(c,
                                                      price=signal_info.entry_price[signal],
                                                      hit_below=False,
                                                      can_use_ohlc=can_use_ohlc)
                if hit: # If so, execute the order
                    temp_info.entry_price_bar[signal] = bar
                    return vbt.pf_nb.order_nb(np.inf, order_price)
                
            if sell_stop: # Sell stop order
                # A sell stop order is entered at a stop price below the current market price
                order_price, hit = check_price_hit_nb(c,
                                                      price=signal_info.entry_price[signal],
                                                      hit_below=True,
                                                      can_use_ohlc=can_use_ohlc)
                if hit:
                    temp_info.entry_price_bar[signal] = bar
                    return vbt.pf_nb.order_nb(-np.inf, order_price)
               
        # Here comes the stop order, i.e. the exit order
        # Check whether the entry order has been executed
        if temp_info.entry_price_bar[signal] != -1:
            # We also need to check whether we're still in a position
            # in case stops have already closed out the position
            if c.last_position[signal] != 0:
                
                # If so, start with checking for potential SL orders
                # (remember that SL pessimistically comes before TP)
                # First, we need to know the number of potential and already executed SL levels
                # since we want to gradually reduce the position proportionally to the number of levels
                # For example, one signal may define [12.35, 12.29] and another [17.53, nan]
                n_sl_levels = 0
                n_sl_hits = 0
                sl_levels = signal_info.sl[signal]  # select 1d array from 2d array
                sl_bar = temp_info.sl_bar[signal]  # same here
                for k in range(len(sl_levels)):
                    if not np.isnan(sl_levels[k]):
                        n_sl_levels += 1
                    if sl_bar[k] != -1:
                        n_sl_hits += 1
                
                # We can execute only one order at the current bar
                # Thus, if the price crossed multiple SL levels, we need to pack them into one order
                # Since SL levels are guaranteed to be sorted, we will check the most distant levels first
                # because if a distant stop has been hit, the closer stops are automatically hit too
                for k in range(n_sl_levels - 1, n_sl_hits - 1, -1):
                    if not np.isnan(sl_levels[k]) and sl_bar[k] == -1:
                        # Check against low for buy orders and against high for sell orders
                        order_price, hit = check_price_hit_nb(c,
                                                              price=sl_levels[k],
                                                              hit_below=buy,
                                                              can_use_ohlc=can_use_ohlc)
                        if hit:
                            sl_bar[k] = bar
                            # The further away the stop is, the more of the position needs to be closed
                            # We will specify a target percentage
                            # For example, for two stops it would be 0.5 (SL1) and 0.0 (SL2)
                            # while for three stops it would be 0.66 (SL1), 0.33 (SL2), and 0.0 (SL3)
                            # This works only if we went all in before (size=np.inf)!
                            size = 1 - (k + 1) / n_sl_levels
                            size_type = vbt.pf_enums.SizeType.TargetPercent
                            if buy:
                                return vbt.pf_nb.order_nb(size, order_price, size_type)
                            else:
                                # Size must be negative for short positions
                                return vbt.pf_nb.order_nb(-size, order_price, size_type)
                        
                # Same for potential TP orders
                n_tp_levels = 0
                n_tp_hits = 0
                tp_levels = signal_info.tp[signal]
                tp_bar = temp_info.tp_bar[signal]
                for k in range(len(tp_levels)):
                    if not np.isnan(tp_levels[k]):
                        n_tp_levels += 1
                    if tp_bar[k] != -1:
                        n_tp_hits += 1
                
                for k in range(n_tp_levels - 1, n_tp_hits - 1, -1):
                    if not np.isnan(tp_levels[k]) and tp_bar[k] == -1:
                        # Check against high for buy orders and against low for sell orders
                        order_price, hit = check_price_hit_nb(c,
                                                              price=tp_levels[k],
                                                              hit_below=not buy,
                                                              can_use_ohlc=can_use_ohlc)
                        if hit:
                            tp_bar[k] = bar
                            size = 1 - (k + 1) / n_tp_levels
                            size_type = vbt.pf_enums.SizeType.TargetPercent
                            if buy:
                                return vbt.pf_nb.order_nb(size, order_price, size_type)
                            else:
                                return vbt.pf_nb.order_nb(-size, order_price, size_type)
                    
    # If no order has been executed, order nothing
    return vbt.pf_nb.order_nothing_nb()

Partial Position Closure Illustration at multiple TP Levels

Case Study - 3 TP Levels

To make the code segment above clearer, let's run it on some sample data. First, we will derive the number of TP levels that are defined in this signal and the number of levels that have already been hit:
In [69]:
tp_levels = np.array([10, 12, 14])  # stop prices
tp_bar = np.array([-1, -1, -1])  # row indices where each stop price was hit

n_tp_levels = 0
n_tp_hits = 0
for k in range(len(tp_levels)):
    if not np.isnan(tp_levels[k]):
        n_tp_levels += 1
    if tp_bar[k] != -1:
        n_tp_hits += 1

print("Nr of TP Levels:",n_tp_levels)
print("Nr. of TP Hits:",n_tp_hits)
Nr of TP Levels: 3
Nr. of TP Hits: 0
We see that initially none of the TP stop prices have been hit.

Next we want to create a loop that iterates over the TP stop prices in reverse order. The order is reversed because if the last stop price has been hit, all the stop prices defined before it are automatically hit too. We want to iterate only over those prices that haven't been hit yet.

In [70]:
for k in range(n_tp_levels - 1, n_tp_hits - 1, -1):
    if not np.isnan(tp_levels[k]) and tp_bar[k] == -1:
        print("TP Level:",tp_levels[k])
TP Level: 14
TP Level: 12
TP Level: 10
Let's assume that the first stop price has been hit:
In [71]:
k = 0
size = 1 - (k + 1) / n_tp_levels
print("Nr. of TP Levels:", n_tp_levels)
print("Size:", size)
Nr. of TP Levels: 3
Size: 0.6666666666666667
The size represents a target percentage, that is, we want the column value after the order to be 66.6% of its current value. We basically remove 1/3 of the value by hitting the first stop price.

But what happens if our second stop price is hit instead?

In [72]:
k = 1
size = 1 - (k + 1) / n_tp_levels
print("Size:",size)
Size: 0.33333333333333337
We remove 2/3 of the value with this stop price. Finally, once the last stop price is hit, the position is fully closed; the above equation removes an equal chunk of value with each stop price.
In [73]:
k = 2
size = 1 - (k + 1) / n_tp_levels
print("Size:", size)
Size: 0.0

Case Study - Two TP Levels

(NaN in a TP level)

Now let's say we only have a ladder with 2 TP levels but there are three rows because some other column is using three levels (in such a case the array is padded with nan):
In [74]:
tp_levels = np.array([10, 12, np.nan])  # stop prices
tp_bar = np.array([-1, -1, -1])  # row indices where each stop price was hit

n_tp_levels = 0
n_tp_hits = 0
for k in range(len(tp_levels)):
    if not np.isnan(tp_levels[k]):
        n_tp_levels += 1
    if tp_bar[k] != -1:
        n_tp_hits += 1

print("Nr of TP Levels:",n_tp_levels)
print("Nr. of TP Hits:",n_tp_hits)
Nr of TP Levels: 2
Nr. of TP Hits: 0
This code has correctly determined the number of levels in the ladder of this signal.

Let's say the first stop price is hit:

In [75]:
k = 0
size = 1 - (k + 1) / n_tp_levels
print(size)
0.5
In [76]:
k = 1
size = 1 - (k + 1) / n_tp_levels
print(size)
0.0
Since we have only two levels in the ladder, we now remove 50% from the value instead of 33.3%.

Matching Timestamps - Seconds to Minutes

In [77]:
# Prepare timestamp for signal information
timestamp = signal_data.index.values.astype(np.int64)  # nanoseconds
print(timestamp[:5])
[1631623431000000000 1631679887000000000 1631693325000000000
 1631788792000000000 1631796516000000000]
In [78]:
# Since the signals are of the second granularity while the data is of the minute granularity,
# we need to round the timestamp of the signal to the nearest minute
# Timestamps represent the opening time, thus the 59th second in "19:28:59" still belongs to the minute "19:28:00"

timestamp = timestamp - timestamp % vbt.dt_nb.m_ns
print(timestamp[:5])
[1631623380000000000 1631679840000000000 1631693280000000000
 1631788740000000000 1631796480000000000]
Each value above is a date represented in nanosecond format since the Unix epoch (1970-01-01). Since they are just regular integers, we can apply operations such as modulo, as we did above, which translates to "remove the remainder from dividing the timestamp by a minute", effectively removing any seconds, milliseconds, microseconds, and nanoseconds from the minute we're currently in. Here, `vbt.dt_nb.m_ns` is the total number of nanoseconds in one minute.
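The same flooring can be reproduced with plain pandas as a sanity check (illustrative; 60 * 10**9 is the number of nanoseconds in a minute):

ts = pd.Timestamp("2021-09-14 12:43:51+00:00").value  # nanoseconds since the Unix epoch
m_ns = 60 * 10**9
print(pd.Timestamp(ts - ts % m_ns, unit="ns", tz="UTC"))  # 2021-09-14 12:43:00+00:00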

Actual Simulation 🏃🏻‍♂️🎬

In [79]:
signal_data.get().head()
Out[79]:
id OrderType EntryPrice SL TP1 TP2 TP3 TP4
date
2021-09-14 12:43:51+00:00 846 1 1792.4 1796.4 1790.9 1789.4 1787.4 NaN
2021-09-15 04:24:47+00:00 854 0 1800.0 1797.5 1805.0 1810.0 NaN NaN
2021-09-15 08:08:45+00:00 866 1 1802.5 1806.5 1801.0 1799.5 1797.5 NaN
2021-09-16 10:39:52+00:00 884 0 1780.0 NaN 1781.3 1783.3 NaN NaN
2021-09-16 12:48:36+00:00 887 0 1762.3 1758.3 1763.8 1765.3 1767.3 NaN
In [80]:
order_type = signal_data.get("OrderType").values
entry_price = signal_data.get("EntryPrice").values
sl = signal_data.get("SL").values
tp1 = signal_data.get("TP1").values
tp2 = signal_data.get("TP2").values
tp3 = signal_data.get("TP3").values
tp4 = signal_data.get("TP4").values

n_signals = len(timestamp)
print('Total nr. of Signals:',n_signals)
# Create a named tuple for signal information

## Feed the above created arrays into the namedtuple
signal_info = SignalInfo(
    timestamp=timestamp,
    order_type=order_type,
    entry_price=entry_price,
    sl=np.column_stack((sl,)),
    tp=np.column_stack((tp1, tp2, tp3, tp4))
)

n_sl_levels = signal_info.sl.shape[1]
print("Nr. of SL Levels:",n_sl_levels)

n_tp_levels = signal_info.tp.shape[1]
print("Nr. of TP Levels:",n_tp_levels)
Total nr. of Signals: 232
Nr. of SL Levels: 1
Nr. of TP Levels: 4
In [81]:
# Important: re-run this cell every time you're running the simulation!
# Create a named tuple for temporary information
# All arrays below hold row indices, thus the default value is -1

def build_temp_info(signal_info):
    return TempInfo(
        ts_bar=np.full(len(signal_info.timestamp), -1),
        entry_price_bar=np.full(len(signal_info.timestamp), -1),
        sl_bar=np.full(signal_info.sl.shape, -1),
        tp_bar=np.full(signal_info.tp.shape, -1)
    )

temp_info = build_temp_info(signal_info)

Why re-run build_temp_info?
Temporary information gets overwritten during the simulation (it acts as a memory where later calls of the order function access information written by earlier calls), and you don't want to use dirty arrays in the next simulation, so we have to re-run build_temp_info before every simulation.
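A quick freshness check (illustrative):

fresh = build_temp_info(signal_info)
assert (fresh.ts_bar == -1).all() and (fresh.entry_price_bar == -1).all()
assert (fresh.sl_bar == -1).all() and (fresh.tp_bar == -1).all()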

In [82]:
# By default, vectorBT initializes an empty order array of the same shape as data
# But since our data is highly granular, it would take a lot of RAM
# Let's limit the number of records to one entry order and the maximum number of SL and TP orders
# It will be applied per column

## The 1 below is for the entry order
max_orders = 1 + n_sl_levels + n_tp_levels

# It's the maximum number of orders per column (i.e per signal)
print("Nr. of SL Levels:", n_sl_levels)
print("Nr. of TP Levels:", n_tp_levels)
print("Maximum Orders:",max_orders)
Nr. of SL Levels: 1
Nr. of TP Levels: 4
Maximum Orders: 6
In [83]:
# Perform the actual simulation
# Since we don't broadcast data against any other array, vectorbt doesn't know anything about
# our signal arrays and will simulate only the one column in our data
# Thus, we need to tell it to expand the number of columns by the number of signals using tiling
# But don't worry: thanks to flexible indexing vectorbt won't actually tile the data - good for RAM!
# (it would tile the data if it had multiple columns though!)

pf = vbt.Portfolio.from_order_func(
    data,
    order_func_nb=order_func_nb,
    order_args=(signal_info, temp_info),
    broadcast_kwargs=dict(tile=n_signals),  # tiling here
    max_orders=max_orders,
    freq="minute"  # we have an irregular one-minute frequency
)
# (may take a minute...)
  • Tiling via the broadcast_kwargs argument in the vbt.Portfolio.from_order_func call above
    Since we don't broadcast data against any other array, vectorbt doesn't know anything about our signal arrays and would simulate only the one column that is in our data. For example, if data has shape (500, 1), then the simulation would only run on one column. Thus, we need to tell it to expand the number of columns to the number of signals, which is as simple as providing the tile argument to the broadcaster. Under the hood, it will replace our example shape of (500, 1) with (500, 232).
    Also note that you can pass an index instead of a number. For example, you can pass signal_info.timestamp or the Telegram message IDs as a pd.Index so they become the column names in the new portfolio (see the sketch below).
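For example, a sketch of that variant (the keyword usage mirrors the call above; using the Telegram message IDs as column names is our own, hypothetical choice):

msg_ids = pd.Index(signal_data.get("id").values, name="message_id")
pf_named = vbt.Portfolio.from_order_func(
    data,
    order_func_nb=order_func_nb,
    order_args=(signal_info, build_temp_info(signal_info)),  # fresh temp arrays!
    broadcast_kwargs=dict(tile=msg_ids),  # an index instead of a number
    max_orders=max_orders,
    freq="minute"
)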
In [84]:
# Let's print out the order records in a human-readable format
pf.orders.records_readable
Out[84]:
Order Id Column Index Size Price Fees Side
0 0 0 2021-09-14 12:43:00+00:00 0.055830 1791.160 0.0 Sell
1 1 0 2021-09-14 12:44:00+00:00 0.018610 1790.900 0.0 Buy
2 2 0 2021-09-14 12:50:00+00:00 0.018591 1789.400 0.0 Buy
3 3 0 2021-09-14 12:55:00+00:00 0.018629 1796.400 0.0 Buy
4 0 5 2021-09-17 11:00:00+00:00 0.056762 1761.727 0.0 Buy
... ... ... ... ... ... ... ...
150 3 226 2023-03-13 00:00:00+00:00 0.027003 1877.865 0.0 Buy
151 0 227 2023-03-13 10:17:00+00:00 0.053066 1884.455 0.0 Sell
152 1 227 2023-03-13 10:47:00+00:00 0.013174 1882.000 0.0 Buy
153 2 227 2023-03-13 12:29:00+00:00 0.039892 1895.000 0.0 Buy
154 0 229 2023-03-13 14:36:00+00:00 0.052468 1905.935 0.0 Sell

155 rows × 7 columns

  • Calculation of size in order_func_nb
    We treat our columns as independent backtests and assign each backtest $100 of capital. By default, without specifying the size, vbt will use the entire cash balance (that is, a size of infinity). Since each signal is executed at a different point in price history, the absolute size differs across signals, but they all use the same amount of cash, which makes them perfectly comparable during the simulation phase.
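We can verify this against the raw records: with the default initial capital of $100, the entry size is simply cash divided by the fill price (a quick check, assuming the default init_cash of 100):

print(100 / 1791.160)  # ~0.05582974, the size of the very first entry order above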

Creating the StopType Column

A position can be stopped out in one of two scenarios: either it hits a stop loss (SL) or it hits a take profit (TP). We collectively frame these two scenarios as a stop order type.

In [85]:
# We can notice above that there's no information whether an order is an SL or TP order
# What we can do is to create our own order records with custom fields, copy the old ones over,
# and tell the portfolio to use them instead of the default ones

# First, we need to create an enumerated field for stop types
# SL levels will come first, TP levels second, in an incremental fashion
StopTypeT = namedtuple("StopTypeT", [
    *[f"SL{i + 1}" for i in range(n_sl_levels)],
    *[f"TP{i + 1}" for i in range(n_tp_levels)]
])
StopType = StopTypeT(*range(len(StopTypeT._fields)))

print(StopType)
StopTypeT(SL1=0, TP1=1, TP2=2, TP3=3, TP4=4)
In [86]:
# To extend order records, we just need to append new fields and construct a new data type
custom_order_dt = np.dtype(vbt.pf_enums.order_fields + [("order_type", np.int_), ("stop_type", np.int_)])

def fix_order_records(order_records, signal_info, temp_info):
    # This is a function that will "fix" our default records and return the fixed ones
    # Create a new empty record array with the new data type
    # Empty here means that the array isn't initialized yet and contains junk data
    # Thus, make sure to override each single element
    custom_order_records = np.empty(order_records.shape, dtype=custom_order_dt)
    
    # Copy over the information from our default records
    for field, _ in vbt.pf_enums.order_fields:
        custom_order_records[field] = order_records[field]
        
    # Iterate over the new records and fill the stop type
    for i in range(len(custom_order_records)):
        record = custom_order_records[i]
        signal = record["col"]  # each column corresponds to a signal
        
        # Fill the order type
        record["order_type"] = signal_info.order_type[signal]
        
        # Concatenate SL and TP row indices of this signal into a new list
        # We must do it the same way as we did in StopTypeT
        bar = [
            *temp_info.sl_bar[signal],
            *temp_info.tp_bar[signal]
        ]
        
        # Check whether the row index of this order is in this list
        # (which means that this order is a stop order)
        if record["idx"] in bar:
            # If so, get the matching position in this list and use it as order type
            # It will correspond to a field in StopType
            record["stop_type"] = bar.index(record["idx"])
        else:
            record["stop_type"] = -1
    return custom_order_records
            
custom_order_records = fix_order_records(pf.order_records, signal_info, temp_info)
custom_order_records[:10]
Out[86]:
array([(0,  0, 4663, 0.05582974, 1791.16  , 0., 1, 1, -1),
       (1,  0, 4664, 0.01860971, 1790.9   , 0., 0, 1,  1),
       (2,  0, 4670, 0.01859054, 1789.4   , 0., 0, 1,  2),
       (3,  0, 4675, 0.01862949, 1796.4   , 0., 0, 1,  0),
       (0,  5, 5940, 0.05676248, 1761.727 , 0., 0, 0, -1),
       (1,  5, 6054, 0.05676248, 1755.    , 0., 1, 0,  0),
       (0, 21, 8746, 0.05683987, 1759.3285, 0., 1, 1, -1),
       (1, 21, 9180, 0.05683987, 1791.78  , 0., 0, 1,  0),
       (0, 27, 9663, 0.05568476, 1795.8235, 0., 1, 1, -1),
       (1, 27, 9782, 0.05568476, 1798.8   , 0., 0, 1,  0)],
      dtype=[('id', '<i8'), ('col', '<i8'), ('idx', '<i8'), ('size', '<f8'), ('price', '<f8'), ('fees', '<f8'), ('side', '<i8'), ('order_type', '<i8'), ('stop_type', '<i8')])
In [87]:
# Having raw order records is not enough as vbt.Orders doesn't know what to do with the new field
# (remember that vbt.Orders is used to analyze the records)
# Let's create our custom class that subclasses vbt.Orders
# and override the field config to also include the information on the new field

from vectorbtpro.records.decorators import attach_fields, override_field_config

@attach_fields(dict(
    order_type=dict(attach_filters=True),
    stop_type=dict(attach_filters=True)
))
@override_field_config(dict(
    dtype=custom_order_dt,  # specify the new data type
    settings=dict(
        order_type=dict(
            title="Order Type",  # specify a human-readable title for the field
            mapping=OrderType,  # specify the mapper for the field
        ),
        stop_type=dict(
            title="Stop Type",  # specify a human-readable title for the field
            mapping=StopType,  # specify the mapper for the field
        ),
    )
))
class CustomOrders(vbt.Orders):
    pass
An Orders class basically represents the raw order data in a more analysis-friendly fashion. For example, order records have a field "size", which can be analyzed by querying `pf.orders.size`. Since we've got new fields, we want to attach them in the same way. This is easily done using two decorators: "attach_fields" creates properties around new fields; for example, `pf.orders.side_buy` is an automatically created property that filters only the records with the side "Buy". The second decorator, "override_field_config", allows us to describe the fields and make them human-readable; for example, "title" is the name of the field whenever the user prints out `pf.orders.records_readable`.
In [88]:
# Finally, let's replace the order records and the class in the portfolio
pf = pf.replace(order_records=custom_order_records, orders_cls=CustomOrders)
In [89]:
# We can now effortlessly analyze the stop type
pf.orders.records_readable
Out[89]:
Order Id Column Index Size Price Fees Side Order Type Stop Type
0 0 0 2021-09-14 12:43:00+00:00 0.055830 1791.160 0.0 Sell SELL None
1 1 0 2021-09-14 12:44:00+00:00 0.018610 1790.900 0.0 Buy SELL TP1
2 2 0 2021-09-14 12:50:00+00:00 0.018591 1789.400 0.0 Buy SELL TP2
3 3 0 2021-09-14 12:55:00+00:00 0.018629 1796.400 0.0 Buy SELL SL1
4 0 5 2021-09-17 11:00:00+00:00 0.056762 1761.727 0.0 Buy BUY None
... ... ... ... ... ... ... ... ... ...
150 3 226 2023-03-13 00:00:00+00:00 0.027003 1877.865 0.0 Buy SELLSTOP SL1
151 0 227 2023-03-13 10:17:00+00:00 0.053066 1884.455 0.0 Sell SELL None
152 1 227 2023-03-13 10:47:00+00:00 0.013174 1882.000 0.0 Buy SELL TP1
153 2 227 2023-03-13 12:29:00+00:00 0.039892 1895.000 0.0 Buy SELL SL1
154 0 229 2023-03-13 14:36:00+00:00 0.052468 1905.935 0.0 Sell SELL None

155 rows × 9 columns

In [90]:
## Selecting only BUY Side orders
pf.orders.side_buy.records_readable
Out[90]:
Order Id Column Index Size Price Fees Side Order Type Stop Type
0 1 0 2021-09-14 12:44:00+00:00 0.018610 1790.900 0.0 Buy SELL TP1
1 2 0 2021-09-14 12:50:00+00:00 0.018591 1789.400 0.0 Buy SELL TP2
2 3 0 2021-09-14 12:55:00+00:00 0.018629 1796.400 0.0 Buy SELL SL1
3 0 5 2021-09-17 11:00:00+00:00 0.056762 1761.727 0.0 Buy BUY None
4 1 21 2021-10-14 00:00:00+00:00 0.056840 1791.780 0.0 Buy SELL SL1
... ... ... ... ... ... ... ... ... ...
89 1 226 2023-03-10 16:05:00+00:00 0.013292 1862.000 0.0 Buy SELLSTOP TP1
90 2 226 2023-03-10 16:19:00+00:00 0.013324 1857.000 0.0 Buy SELLSTOP TP2
91 3 226 2023-03-13 00:00:00+00:00 0.027003 1877.865 0.0 Buy SELLSTOP SL1
92 1 227 2023-03-13 10:47:00+00:00 0.013174 1882.000 0.0 Buy SELL TP1
93 2 227 2023-03-13 12:29:00+00:00 0.039892 1895.000 0.0 Buy SELL SL1

94 rows × 9 columns

In [91]:
# And here are the signals that correspond to these records for verification
signal_data.get()
Out[91]:
id OrderType EntryPrice SL TP1 TP2 TP3 TP4
date
2021-09-14 12:43:51+00:00 846 1 1792.4 1796.4 1790.9 1789.4 1787.4 NaN
2021-09-15 04:24:47+00:00 854 0 1800.0 1797.5 1805.0 1810.0 NaN NaN
2021-09-15 08:08:45+00:00 866 1 1802.5 1806.5 1801.0 1799.5 1797.5 NaN
2021-09-16 10:39:52+00:00 884 0 1780.0 NaN 1781.3 1783.3 NaN NaN
2021-09-16 12:48:36+00:00 887 0 1762.3 1758.3 1763.8 1765.3 1767.3 NaN
... ... ... ... ... ... ... ... ...
2023-03-13 10:17:39+00:00 4408 1 1885.0 1895.0 1882.0 1877.0 1865.0 1800.0
2023-03-13 13:08:21+00:00 4411 3 1896.0 1906.0 1893.0 1888.0 1870.0 1840.0
2023-03-13 14:36:09+00:00 4413 1 1907.0 1918.0 1903.0 1900.0 1890.0 1860.0
2023-03-15 06:27:23+00:00 4425 3 1899.0 1910.0 1896.0 1890.0 1880.0 1850.0
2023-03-16 13:03:03+00:00 4432 1 1929.0 1940.0 1926.0 1920.0 1915.0 1870.0

232 rows × 8 columns

In [92]:
pf.orders.count()
Out[92]:
0      4
1      0
2      0
3      0
4      0
      ..
227    3
228    0
229    1
230    0
231    0
Name: count, Length: 232, dtype: int64

Filtering Telegram Signals which got skipped

If we run np.flatnonzero(pf.orders.count() == 0), we get the row positions of the signals that were skipped. Since we have the ID column of the Telegram messages, we can use these indices to select the signal data, like this: signal_data.get().iloc[np.flatnonzero(pf.orders.count() == 0)].

In [93]:
signal_data.get().iloc[np.flatnonzero(pf.orders.count() == 0)]
Out[93]:
id OrderType EntryPrice SL TP1 TP2 TP3 TP4
date
2021-09-15 04:24:47+00:00 854 0 1800.0 1797.5 1805.0 1810.0 NaN NaN
2021-09-15 08:08:45+00:00 866 1 1802.5 1806.5 1801.0 1799.5 1797.5 NaN
2021-09-16 10:39:52+00:00 884 0 1780.0 NaN 1781.3 1783.3 NaN NaN
2021-09-16 12:48:36+00:00 887 0 1762.3 1758.3 1763.8 1765.3 1767.3 NaN
2021-09-21 13:42:40+00:00 942 1 1775.0 1776.0 1769.0 1766.0 NaN NaN
... ... ... ... ... ... ... ... ...
2023-03-08 15:15:01+00:00 4381 1 1820.0 1830.0 1817.0 1812.0 1805.0 1770.0
2023-03-08 15:38:05+00:00 4382 1 1823.0 1830.0 1820.0 1815.0 1805.0 1770.0
2023-03-13 13:08:21+00:00 4411 3 1896.0 1906.0 1893.0 1888.0 1870.0 1840.0
2023-03-15 06:27:23+00:00 4425 3 1899.0 1910.0 1896.0 1890.0 1880.0 1850.0
2023-03-16 13:03:03+00:00 4432 1 1929.0 1940.0 1926.0 1920.0 1915.0 1870.0

187 rows × 8 columns

In [94]:
print("Nr. Orders which got skipped:", (pf.orders.count() == 0).sum())
Nr. Orders which got skipped: 187
In [95]:
# We can see that some signals were skipped, let's remove them from the portfolio
pf = pf.loc[:, pf.orders.count() >= 1]
print(len(pf.wrapper.columns))
45

Analysis and Visualization of our Signals Backtesting

In [96]:
# There are various ways to analyze the data
# For example, we can count how many times each stop type was triggered
# Since we want to combine all trades in each statistic, we need to provide grouping

print(pf.orders.stop_type.stats(group_by=True))
Start                 2021-09-02 00:00:00+00:00
End                   2023-03-13 23:59:00+00:00
Period                         83 days 06:26:00
Count                                       155
Value Counts: None                           45
Value Counts: SL1                            34
Value Counts: TP1                            34
Value Counts: TP2                            30
Value Counts: TP3                             8
Value Counts: TP4                             4
Name: group, dtype: object

After removing the skipped signals, 45 columns remain, and the above stats show the distribution of stop types: None means the order was not a stop order (i.e., it was an entry order), the stop loss (SL1) was hit 34 times, while TP3 and TP4 were hit very sparingly, at 8 and 4 times respectively.

In [97]:
# We can also get the position stats for P&L information
pf.positions.stats(group_by=True)
Out[97]:
Start                         2021-09-02 00:00:00+00:00
End                           2023-03-13 23:59:00+00:00
Period                                 83 days 06:26:00
First Trade Start             2021-09-14 12:43:00+00:00
Last Trade End                2023-03-13 23:59:00+00:00
Coverage                               32 days 18:10:00
Overlap Coverage                        5 days 09:17:00
Total Records                                        45
Total Long Trades                                    11
Total Short Trades                                   34
Total Closed Trades                                  44
Total Open Trades                                     1
Open Trade PnL                                 -0.32732
Win Rate [%]                                  27.272727
Max Win Streak                                        1
Max Loss Streak                                       1
Best Trade [%]                                 2.876974
Worst Trade [%]                               -1.844539
Avg Winning Trade [%]                           0.94318
Avg Losing Trade [%]                           -0.36128
Avg Winning Trade Duration              1 days 09:37:30
Avg Losing Trade Duration        0 days 15:46:16.875000
Profit Factor                                  0.978998
Expectancy                                    -0.005518
SQN                                           -0.045627
Edge Ratio                                     1.497918
Name: group, dtype: object
In [98]:
pf.trades.records_readable
Out[98]:
Exit Trade Id Column Size Entry Order Id Entry Index Avg Entry Price Entry Fees Exit Order Id Exit Index Avg Exit Price Exit Fees PnL Return Direction Status Position Id
0 0 0 0.018610 0 2021-09-14 12:43:00+00:00 1791.1600 0.0 1 2021-09-14 12:44:00+00:00 1790.9000 0.0 0.004839 0.000145 Short Closed 0
1 1 0 0.018591 0 2021-09-14 12:43:00+00:00 1791.1600 0.0 2 2021-09-14 12:50:00+00:00 1789.4000 0.0 0.032719 0.000983 Short Closed 0
2 2 0 0.018629 0 2021-09-14 12:43:00+00:00 1791.1600 0.0 3 2021-09-14 12:55:00+00:00 1796.4000 0.0 -0.097619 -0.002925 Short Closed 0
3 0 5 0.056762 0 2021-09-17 11:00:00+00:00 1761.7270 0.0 1 2021-09-17 12:54:00+00:00 1755.0000 0.0 -0.381841 -0.003818 Long Closed 0
4 0 21 0.056840 0 2021-10-05 15:46:00+00:00 1759.3285 0.0 1 2021-10-14 00:00:00+00:00 1791.7800 0.0 -1.844539 -0.018445 Short Closed 0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
106 1 226 0.013324 0 2023-03-10 15:51:00+00:00 1865.0000 0.0 2 2023-03-10 16:19:00+00:00 1857.0000 0.0 0.106591 0.004290 Short Closed 0
107 2 226 0.027003 0 2023-03-10 15:51:00+00:00 1865.0000 0.0 3 2023-03-13 00:00:00+00:00 1877.8650 0.0 -0.347394 -0.006898 Short Closed 0
108 0 227 0.013174 0 2023-03-13 10:17:00+00:00 1884.4550 0.0 1 2023-03-13 10:47:00+00:00 1882.0000 0.0 0.032342 0.001303 Short Closed 0
109 1 227 0.039892 0 2023-03-13 10:17:00+00:00 1884.4550 0.0 2 2023-03-13 12:29:00+00:00 1895.0000 0.0 -0.420660 -0.005596 Short Closed 0
110 0 229 0.052468 0 2023-03-13 14:36:00+00:00 1905.9350 0.0 -1 2023-03-13 23:59:00+00:00 1912.1735 0.0 -0.327320 -0.003273 Short Open 0

111 rows × 16 columns

In [99]:
pf.trades.records_readable['Position Id'].unique()
Out[99]:
array([0])

Trades vs Orders:

pf.orders is our customized vectorBT representation of the various orders that result from simulating our signal data.

Trades in the vectorbtpro world are a bit different from what you would normally call trades. There are two types of trades: entry trades and exit trades. For example, a position may have several entry orders that increase the position and several exit orders that decrease it. The first are called entry trades, the second exit trades. pf.trades is the same as pf.exit_trades; for entry trades, you can query pf.entry_trades.
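A quick way to convince yourself of that equivalence (illustrative):

# pf.trades should expose the same records as pf.exit_trades
print(pf.trades.count().equals(pf.exit_trades.count()))  # True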

In [100]:
pf.entry_trades.records_readable
Out[100]:
Entry Trade Id Column Size Entry Order Id Entry Index Avg Entry Price Entry Fees Exit Order Id Exit Index Avg Exit Price Exit Fees PnL Return Direction Status Position Id
0 0 0 0.055830 0 2021-09-14 12:43:00+00:00 1791.1600 0.0 3 2021-09-14 12:55:00+00:00 1792.235783 0.0 -0.060061 -0.000601 Short Closed 0
1 0 5 0.056762 0 2021-09-17 11:00:00+00:00 1761.7270 0.0 1 2021-09-17 12:54:00+00:00 1755.000000 0.0 -0.381841 -0.003818 Long Closed 0
2 0 21 0.056840 0 2021-10-05 15:46:00+00:00 1759.3285 0.0 1 2021-10-14 00:00:00+00:00 1791.780000 0.0 -1.844539 -0.018445 Short Closed 0
3 0 27 0.055685 0 2021-10-14 08:03:00+00:00 1795.8235 0.0 1 2021-10-14 10:02:00+00:00 1798.800000 0.0 -0.165746 -0.001657 Short Closed 0
4 0 32 0.054156 0 2022-01-20 15:04:00+00:00 1846.5100 0.0 4 2022-01-26 19:03:00+00:00 1836.717318 0.0 0.530335 0.005303 Short Closed 0
5 0 35 0.053757 0 2022-02-14 14:37:00+00:00 1860.2400 0.0 1 2022-02-14 15:16:00+00:00 1868.000000 0.0 -0.417150 -0.004172 Short Closed 0
6 0 46 0.050660 0 2022-03-10 01:06:00+00:00 1973.9450 0.0 3 2022-03-10 11:09:00+00:00 1991.658877 0.0 0.897385 0.008974 Long Closed 0
7 0 47 0.050472 0 2022-03-10 07:05:00+00:00 1981.2930 0.0 3 2022-03-10 11:09:00+00:00 1994.663824 0.0 0.674853 0.006749 Long Closed 0
8 0 48 0.050042 0 2022-03-10 14:57:00+00:00 1998.3380 0.0 2 2022-03-14 00:00:00+00:00 1984.930000 0.0 -0.670958 -0.006710 Long Closed 0
9 0 50 0.050699 0 2022-03-14 01:18:00+00:00 1972.4250 0.0 2 2022-03-14 08:57:00+00:00 1970.666667 0.0 -0.089146 -0.000891 Long Closed 0
10 0 51 0.051462 0 2022-03-24 01:45:00+00:00 1943.1675 0.0 2 2022-03-24 11:40:00+00:00 1947.700070 0.0 -0.233257 -0.002333 Short Closed 0
11 0 52 0.051373 0 2022-03-24 11:32:00+00:00 1946.5545 0.0 2 2022-03-24 11:40:00+00:00 1947.275000 0.0 0.037014 0.000370 Long Closed 0
12 0 53 0.051217 0 2022-03-24 11:46:00+00:00 1952.4725 0.0 3 2022-03-30 00:00:00+00:00 1946.514917 0.0 -0.305130 -0.003051 Long Closed 0
13 0 56 0.052012 0 2022-03-30 01:25:00+00:00 1922.6325 0.0 2 2022-04-21 00:00:00+00:00 1942.427500 0.0 1.029578 0.010296 Long Closed 0
14 0 63 0.051380 0 2022-04-21 12:41:00+00:00 1946.2980 0.0 3 2022-05-09 00:00:00+00:00 1911.743445 0.0 1.775399 0.017754 Short Closed 0
15 0 70 0.054021 0 2022-05-11 09:06:00+00:00 1851.1345 0.0 3 2022-05-24 13:12:00+00:00 1851.202702 0.0 -0.003684 -0.000037 Short Closed 0
16 0 77 0.053794 0 2022-05-24 08:25:00+00:00 1858.9405 0.0 3 2022-07-19 00:00:00+00:00 1805.459265 0.0 2.876974 0.028770 Short Closed 0
17 0 111 0.056982 0 2022-08-19 07:01:00+00:00 1754.9380 0.0 3 2022-09-13 13:02:00+00:00 1733.215120 0.0 1.237815 0.012378 Short Closed 0
18 0 116 0.058411 0 2022-08-31 12:25:00+00:00 1712.0000 0.0 3 2022-09-13 13:02:00+00:00 1712.006528 0.0 0.000381 0.000004 Long Closed 0
19 0 123 0.058781 0 2022-09-13 13:25:00+00:00 1701.2235 0.0 3 2022-09-16 00:00:00+00:00 1691.510743 0.0 -0.570928 -0.005709 Long Closed 0
20 0 125 0.060044 0 2022-09-19 12:37:00+00:00 1665.4585 0.0 3 2022-09-21 18:35:00+00:00 1666.711569 0.0 -0.075239 -0.000752 Short Closed 0
21 0 126 0.059773 0 2022-09-21 13:44:00+00:00 1673.0000 0.0 3 2022-09-21 18:44:00+00:00 1673.018246 0.0 -0.001091 -0.000011 Short Closed 0
22 0 129 0.060758 0 2022-09-28 13:33:00+00:00 1645.8850 0.0 2 2022-09-30 06:54:00+00:00 1653.966033 0.0 -0.490984 -0.004910 Short Closed 0
23 0 130 0.060234 0 2022-09-28 17:11:00+00:00 1660.1975 0.0 3 2022-09-30 14:37:00+00:00 1662.044470 0.0 -0.111250 -0.001113 Short Closed 0
24 0 136 0.060110 0 2022-10-11 10:36:00+00:00 1663.6215 0.0 3 2022-10-13 12:45:00+00:00 1662.669895 0.0 -0.057201 -0.000572 Long Closed 0
25 0 139 0.060350 0 2022-10-13 12:39:00+00:00 1657.0000 0.0 3 2022-10-13 15:39:00+00:00 1657.026043 0.0 -0.001572 -0.000016 Short Closed 0
26 0 140 0.060096 0 2022-10-17 12:33:00+00:00 1664.0150 0.0 3 2022-11-06 23:00:00+00:00 1664.041533 0.0 -0.001594 -0.000016 Short Closed 0
27 0 155 0.056384 0 2022-11-16 03:34:00+00:00 1773.5600 0.0 2 2022-11-16 10:21:00+00:00 1780.358997 0.0 -0.383353 -0.003834 Short Closed 0
28 0 156 0.056114 0 2022-11-16 10:39:00+00:00 1782.0800 0.0 3 2022-12-05 00:00:00+00:00 1784.053532 0.0 -0.110743 -0.001107 Short Closed 0
29 0 159 0.057057 0 2022-11-25 08:20:00+00:00 1752.6295 0.0 2 2022-12-05 00:00:00+00:00 1782.766007 0.0 -1.719502 -0.017195 Short Closed 0
30 0 178 0.053879 0 2023-01-04 07:47:00+00:00 1856.0150 0.0 3 2023-01-06 19:41:00+00:00 1857.699209 0.0 -0.090743 -0.000907 Short Closed 0
31 0 179 0.054367 0 2023-01-06 02:24:00+00:00 1839.3350 0.0 3 2023-01-06 15:05:00+00:00 1843.785817 0.0 -0.241980 -0.002420 Short Closed 0
32 0 180 0.054268 0 2023-01-06 13:40:00+00:00 1842.7065 0.0 1 2023-01-06 13:47:00+00:00 1850.000000 0.0 -0.395804 -0.003958 Short Closed 0
33 0 181 0.054105 0 2023-01-06 14:04:00+00:00 1848.2500 0.0 3 2023-01-06 15:08:00+00:00 1850.355558 0.0 -0.113922 -0.001139 Short Closed 0
34 0 182 0.053310 0 2023-01-09 05:13:00+00:00 1875.8300 0.0 3 2023-01-15 23:00:00+00:00 1888.282490 0.0 -0.663839 -0.006638 Short Closed 0
35 0 187 0.052329 0 2023-01-17 05:30:00+00:00 1911.0000 0.0 3 2023-01-25 00:00:00+00:00 1916.413678 0.0 -0.283290 -0.002833 Short Closed 0
36 0 194 0.051787 0 2023-01-25 15:34:00+00:00 1931.0000 0.0 1 2023-01-25 17:33:00+00:00 1940.000000 0.0 -0.466080 -0.004661 Short Closed 0
37 0 195 0.051603 0 2023-01-25 17:58:00+00:00 1937.8700 0.0 3 2023-02-03 13:41:00+00:00 1916.134703 0.0 1.121608 0.011216 Short Closed 0
38 0 202 0.051975 0 2023-01-31 15:07:00+00:00 1924.0000 0.0 4 2023-02-03 15:10:00+00:00 1903.895890 0.0 1.044912 0.010449 Short Closed 0
39 0 221 0.053868 0 2023-03-06 07:32:00+00:00 1856.3735 0.0 4 2023-03-10 20:57:00+00:00 1854.667343 0.0 0.091908 0.000919 Short Closed 0
40 0 224 0.055002 0 2023-03-09 08:48:00+00:00 1818.1015 0.0 1 2023-03-09 14:20:00+00:00 1828.000000 0.0 -0.544442 -0.005444 Short Closed 0
41 0 225 0.054756 0 2023-03-09 14:26:00+00:00 1826.2950 0.0 1 2023-03-09 15:07:00+00:00 1835.000000 0.0 -0.476648 -0.004766 Short Closed 0
42 0 226 0.053619 0 2023-03-10 15:51:00+00:00 1865.0000 0.0 3 2023-03-13 00:00:00+00:00 1868.747273 0.0 -0.200926 -0.002009 Short Closed 0
43 0 227 0.053066 0 2023-03-13 10:17:00+00:00 1884.4550 0.0 2 2023-03-13 12:29:00+00:00 1891.772688 0.0 -0.388319 -0.003883 Short Closed 0
44 0 229 0.052468 0 2023-03-13 14:36:00+00:00 1905.9350 0.0 -1 2023-03-13 23:59:00+00:00 1912.173500 0.0 -0.327320 -0.003273 Short Open 0
In [101]:
pf.exit_trades.records_readable
Out[101]:
Exit Trade Id Column Size Entry Order Id Entry Index Avg Entry Price Entry Fees Exit Order Id Exit Index Avg Exit Price Exit Fees PnL Return Direction Status Position Id
0 0 0 0.018610 0 2021-09-14 12:43:00+00:00 1791.1600 0.0 1 2021-09-14 12:44:00+00:00 1790.9000 0.0 0.004839 0.000145 Short Closed 0
1 1 0 0.018591 0 2021-09-14 12:43:00+00:00 1791.1600 0.0 2 2021-09-14 12:50:00+00:00 1789.4000 0.0 0.032719 0.000983 Short Closed 0
2 2 0 0.018629 0 2021-09-14 12:43:00+00:00 1791.1600 0.0 3 2021-09-14 12:55:00+00:00 1796.4000 0.0 -0.097619 -0.002925 Short Closed 0
3 0 5 0.056762 0 2021-09-17 11:00:00+00:00 1761.7270 0.0 1 2021-09-17 12:54:00+00:00 1755.0000 0.0 -0.381841 -0.003818 Long Closed 0
4 0 21 0.056840 0 2021-10-05 15:46:00+00:00 1759.3285 0.0 1 2021-10-14 00:00:00+00:00 1791.7800 0.0 -1.844539 -0.018445 Short Closed 0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
106 1 226 0.013324 0 2023-03-10 15:51:00+00:00 1865.0000 0.0 2 2023-03-10 16:19:00+00:00 1857.0000 0.0 0.106591 0.004290 Short Closed 0
107 2 226 0.027003 0 2023-03-10 15:51:00+00:00 1865.0000 0.0 3 2023-03-13 00:00:00+00:00 1877.8650 0.0 -0.347394 -0.006898 Short Closed 0
108 0 227 0.013174 0 2023-03-13 10:17:00+00:00 1884.4550 0.0 1 2023-03-13 10:47:00+00:00 1882.0000 0.0 0.032342 0.001303 Short Closed 0
109 1 227 0.039892 0 2023-03-13 10:17:00+00:00 1884.4550 0.0 2 2023-03-13 12:29:00+00:00 1895.0000 0.0 -0.420660 -0.005596 Short Closed 0
110 0 229 0.052468 0 2023-03-13 14:36:00+00:00 1905.9350 0.0 -1 2023-03-13 23:59:00+00:00 1912.1735 0.0 -0.327320 -0.003273 Short Open 0

111 rows × 16 columns

Visualizing Individual Trades

Here, we're randomly selecting one column (i.e., one signal) and plotting its exit trades. Since the only orders that reduce our position are stop orders, each green/red box represents a stop order of this signal: a green box is a TP order, a red box is an SL order, while the blue box is the market order that initialized the position.

In [102]:
# Let's plot a random trade
# The only issue: we have too much data for that (thanks to Plotly)
# Thus, crop it before plotting to remove irrelevant data

signal = np.random.choice(len(pf.wrapper.columns))  ## e.g., column 6 hits all levels up to TP3
print("Signal (or) Column Nr:", signal) 
pf.trades.iloc[:, signal].crop().plot().show()
Signal (or) Column Nr: 6

This particular trade, with signal (column) number 6, hit all three take-profit levels. Yay! 😃

Order Entry Check within the Candle

The use case is clear: what if the Telegram group gives us signals at a price that is too optimistic to be executed at the current time? They could quote a much higher price for a SELL order to make the trade appear more profitable than it really is. Instead of manually checking whether the order price is within the expected range (OHLC), there's a property pf.orders.price_status that does the check for us and returns the status of each order; for instance, it returns "BelowLow" if the requested order price is lower than the low price of the bar where the order happened. With pf.orders.bar_high we get the high price of the bar where an order happened; we then call to_readable to make it a Pandas Series and add it to our DataFrame, and do the same for pf.orders.bar_low. By putting the high and low price columns alongside "Price", we can analyze whether each order price falls within its candle.

In [103]:
# Let's verify that the entry price stays within each candle
pd.concat((
    pf.orders.records_readable[["Column", "Order Type", "Stop Type", "Price"]],
    pf.orders.bar_high.to_readable(title="High", only_values=True),
    pf.orders.bar_low.to_readable(title="Low", only_values=True),
    pf.orders.price_status.to_readable(title="Price Status", only_values=True),
), axis=1)
Out[103]:
Column Order Type Stop Type Price High Low Price Status
0 0 SELL None 1791.160 1791.866 1790.982 OK
1 0 SELL TP1 1790.900 1791.436 1790.262 OK
2 0 SELL TP2 1789.400 1790.396 1789.156 OK
3 0 SELL SL1 1796.400 1797.156 1794.742 OK
4 5 BUY None 1761.727 1761.993 1761.736 BelowLow
... ... ... ... ... ... ... ...
150 226 SELLSTOP SL1 1877.865 1878.035 1875.445 OK
151 227 SELL None 1884.455 1884.715 1883.635 OK
152 227 SELL TP1 1882.000 1882.585 1881.685 OK
153 227 SELL SL1 1895.000 1898.335 1894.185 OK
154 229 SELL None 1905.935 1906.795 1905.795 OK

155 rows × 7 columns

In [104]:
pf.orders.price_status.stats(group_by=True)
Out[104]:
Start                     2021-09-02 00:00:00+00:00
End                       2023-03-13 23:59:00+00:00
Period                             83 days 06:26:00
Count                                           155
Value Counts: OK                                122
Value Counts: BelowLow                           33
Name: group, dtype: object
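To drill into the 33 flagged orders rather than just count them, a plain pandas filter over the same concatenated frame works. A minimal sketch (the variable names are ours, not from the notebook):

In [ ]:
# Rebuild the price-check frame and keep only the orders whose requested
# price falls outside the high-low range of their bar
price_check = pd.concat((
    pf.orders.records_readable[["Column", "Order Type", "Stop Type", "Price"]],
    pf.orders.bar_high.to_readable(title="High", only_values=True),
    pf.orders.bar_low.to_readable(title="Low", only_values=True),
    pf.orders.price_status.to_readable(title="Price Status", only_values=True),
), axis=1)
suspicious = price_check[price_check["Price Status"] != "OK"]
print(f"{len(suspicious)} of {len(price_check)} orders fall outside their candle")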
In [105]:
pf.orders.bar_high.to_readable()
Out[105]:
Id Column Index Value
0 0 0 2021-09-14 12:43:00+00:00 1791.866
1 1 0 2021-09-14 12:44:00+00:00 1791.436
2 2 0 2021-09-14 12:50:00+00:00 1790.396
3 3 0 2021-09-14 12:55:00+00:00 1797.156
4 0 5 2021-09-17 11:00:00+00:00 1761.993
... ... ... ... ...
150 3 226 2023-03-13 00:00:00+00:00 1878.035
151 0 227 2023-03-13 10:17:00+00:00 1884.715
152 1 227 2023-03-13 10:47:00+00:00 1882.585
153 2 227 2023-03-13 12:29:00+00:00 1898.335
154 0 229 2023-03-13 14:36:00+00:00 1906.795

155 rows × 4 columns

Merging Order Records for Portfolio Metrics

In [106]:
# Now, what if we're interested in portfolio metrics, such as the Sharpe ratio?
# The problem is that most metrics produce multiple (intermediate) time series of the full shape, which is
# disastrous for RAM since our data would have to be tiled by the number of columns.
# But here's a trick: merge the order records of all columns into one, as if we did the simulation on just one column!

def merge_order_records(order_records):
    merged_order_records = order_records.copy()
    
    # New records should have only one column
    merged_order_records["col"][:] = 0
    
    # Sort the records by the timestamp
    merged_order_records = merged_order_records[np.argsort(merged_order_records["idx"])]
    
    # Reset the order ids
    merged_order_records["id"][:] = np.arange(len(merged_order_records))
    return merged_order_records

merged_order_records = merge_order_records(custom_order_records)
pd.DataFrame(merged_order_records)
Out[106]:
id col idx size price fees side order_type stop_type
0 0 0 4663 0.055830 1791.160 0.0 1 1 -1
1 1 0 4664 0.018610 1790.900 0.0 0 1 1
2 2 0 4670 0.018591 1789.400 0.0 0 1 2
3 3 0 4675 0.018629 1796.400 0.0 0 1 0
4 4 0 5940 0.056762 1761.727 0.0 0 0 -1
... ... ... ... ... ... ... ... ... ...
150 150 0 118526 0.027003 1877.865 0.0 0 3 0
151 151 0 119143 0.053066 1884.455 0.0 1 1 -1
152 152 0 119173 0.013174 1882.000 0.0 0 1 1
153 153 0 119275 0.039892 1895.000 0.0 0 1 0
154 154 0 119402 0.052468 1905.935 0.0 1 1 -1

155 rows × 9 columns
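Before relying on the merged records, a quick sanity check doesn't hurt. A small sketch of our own, verifying the invariants that merge_order_records is supposed to establish:

In [ ]:
# After merging: a single column, monotonically non-decreasing bar indices,
# and order ids running from 0 to n-1
assert (merged_order_records["col"] == 0).all()
assert (np.diff(merged_order_records["idx"]) >= 0).all()
assert (merged_order_records["id"] == np.arange(len(merged_order_records))).all()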

In [107]:
# We also need to change the wrapper because it holds the information on our columns
merged_wrapper = pf.wrapper.replace(columns=[0], ndim=1)
In [108]:
# Is there any other array that requires merging?
# Let's introspect the portfolio instance and search for arrays of the full shape
print(pf)
# There are none, thus replace only the records and the wrapper
Portfolio(
    wrapper=ArrayWrapper(
        index=<pandas.core.indexes.datetimes.DatetimeIndex object at 0x1735a6a70 with shape (119906,)>,
        columns=<pandas.core.indexes.numeric.Int64Index object at 0x28cb121d0 with shape (45,)>,
        ndim=2,
        freq='minute',
        column_only_select=None,
        range_only_select=None,
        group_select=None,
        grouped_ndim=None,
        grouper=Grouper(
            index=<pandas.core.indexes.numeric.Int64Index object at 0x28cb121d0 with shape (45,)>,
            group_by=None,
            def_lvl_name='group',
            allow_enable=True,
            allow_disable=True,
            allow_modify=True
        )
    ),
    order_records=<numpy.ndarray object at 0x28d7fba50 with shape (155,)>,
    open=<numpy.ndarray object at 0x17365d5f0 with shape (119906, 1)>,
    high=<numpy.ndarray object at 0x17365e010 with shape (119906, 1)>,
    low=<numpy.ndarray object at 0x17365e1f0 with shape (119906, 1)>,
    close=<numpy.ndarray object at 0x17365da10 with shape (119906, 1)>,
    log_records=<numpy.ndarray object at 0x173686130 with shape (0,)>,
    cash_sharing=False,
    init_cash=<numpy.ndarray object at 0x28c9ee1f0 with shape (45,)>,
    init_position=<numpy.ndarray object at 0x28d20d8f0 with shape (45,)>,
    init_price=<numpy.ndarray object at 0x28d20d830 with shape (45,)>,
    cash_deposits=<numpy.ndarray object at 0x173662670 with shape (1, 1)>,
    cash_earnings=<numpy.ndarray object at 0x17365c870 with shape (1, 1)>,
    call_seq=None,
    in_outputs=None,
    use_in_outputs=None,
    bm_close=None,
    fillna_close=None,
    trades_type=None,
    orders_cls=<class '__main__.CustomOrders'>,
    logs_cls=None,
    trades_cls=None,
    entry_trades_cls=None,
    exit_trades_cls=None,
    positions_cls=None,
    drawdowns_cls=None
)
In [109]:
merged_pf = pf.replace(
    order_records=merged_order_records, 
    wrapper=merged_wrapper,
    init_cash="auto"
)

# Also, each of the previous individual portfolios used a starting capital of $100, which was used 100%,
# but since we merge the columns together, we may now require less starting capital.
# Thus, we will determine it automatically.
## init_cash = "auto" means the initial capital is chosen automatically, calculated from the max amount of cash we spent during all simulations.
## This is better than taking $100 and multiplying it by the number of signals, since then we would have inflated the returns.
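As a quick check of our own, the automatically resolved capital can be inspected via the init_cash property; it should correspond to the Start Value reported by the stats below:

In [ ]:
# "auto" resolves init_cash to the max amount of cash spent during the simulation
print("Auto-determined initial cash:", merged_pf.init_cash)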
In [110]:
# We can now get any portfolio statistic
print(merged_pf.stats())
Start                         2021-09-02 00:00:00+00:00
End                           2023-03-13 23:59:00+00:00
Period                                 83 days 06:26:00
Start Value                                   168.90342
Min Value                                    166.380452
Max Value                                    174.233989
End Value                                    168.333302
Total Return [%]                              -0.337541
Benchmark Return [%]                           5.439047
Total Time Exposure [%]                       39.339149
Max Gross Exposure [%]                            100.0
Max Drawdown [%]                               3.894613
Max Drawdown Duration                  38 days 07:03:00
Total Orders                                        155
Total Fees Paid                                     0.0
Total Trades                                        112
Win Rate [%]                                  61.261261
Best Trade [%]                                  8.00419
Worst Trade [%]                               -2.386677
Avg Winning Trade [%]                          0.642969
Avg Losing Trade [%]                          -0.712394
Avg Winning Trade Duration    0 days 14:31:48.529411764
Avg Losing Trade Duration     1 days 03:40:20.930232558
Profit Factor                                  0.983703
Expectancy                                    -0.002187
Sharpe Ratio                                  -0.161445
Calmar Ratio                                  -0.056725
Omega Ratio                                     0.99786
Sortino Ratio                                 -0.227766
dtype: object
In [111]:
# You may wonder why the win rate and other trade metrics are different here.
# There are two reasons:
# 1) Portfolio stats uses exit trades (previously we used positions), that is, each stop order is a trade
# 2) After merging, there's no more information on which order belongs to which trade, thus positions are built in sequential order

# But to verify that both portfolios match, we can compare the total profit to the previous trade P&L
print("Total Profit:",merged_pf.total_profit)
print("PnL Sum:",pf.trades.pnl.sum(group_by=True))
Total Profit: -0.5701185962467719
PnL Sum: -0.5701185962467079
In [112]:
# We can now plot the entire portfolio
merged_pf.resample("daily").plot().show()

Custom Order Simulator

Putting it all together from above

Need:
The main issue with using from_order_func is that we need to go over the entire data as many times as there are signals, because the order function is run on each element. A far more time-efficient approach is to process trades in sequential order. This is easily possible because our trades are perfectly sorted: we don't need to process a signal if the previous signal hasn't been processed yet.

Also, because the scope of this notebook assumes that signals are independent, we can simulate them independently and stop each signal's simulation once its position has been closed out. This is only possible by writing our own simulator (which isn't as scary as it sounds!).
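To illustrate the core idea before diving into Numba, here's a plain-Python sketch of our own showing the single-pass, two-pointer matching of sorted signal timestamps to bars, the same loop that signal_simulator_nb below performs:

In [ ]:
def match_signal_bars(index, signal_timestamps):
    """Walk bars and sorted signal timestamps together in a single pass."""
    bars = [-1] * len(signal_timestamps)  # -1 means "no matching bar found"
    signal, bar = 0, 0
    while signal < len(signal_timestamps) and bar < len(index):
        if index[bar] == signal_timestamps[signal]:
            bars[signal] = bar  # match: continue with the next signal and bar
            signal += 1
            bar += 1
        elif index[bar] > signal_timestamps[signal]:
            signal += 1  # already past this signal: it never gets a bar
        else:
            bar += 1  # haven't reached the signal yet: advance the bar
    return bars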

Let's build the simulator
Technically, it's just a regular Numba function that does whatever we want.
What's special about it is that it calls vectorbt's low-level API to place orders and updates the simulation state, such as cash balances and positions.

In [113]:
# To avoid duplicating our signal logic, we will re-use order_func_nb by passing our own limited context
# It will consist only of the fields that are required by our order_func_nb

from collections import namedtuple  # needed for the context below (may already be imported earlier in the notebook)

OrderContext = namedtuple("OrderContext", [
    "i",
    "col",
    "index",
    "open",  
    "high",
    "low",
    "close",
    "last_position"
])
In [114]:
## Nr. of TP Levels
signal_info.tp.shape[1]
Out[114]:
4
In [115]:
signal_data.get().shape[0]
Out[115]:
232
In [116]:
# We'll first determine the bars where the signals happen, and then run a smaller simulation per signal,
# starting with the first one. Once a signal's position has been closed out, we'll terminate its simulation
# and continue with the next signal, until all signals are processed.

@njit(boundscheck=True)
def signal_simulator_nb(index,
                        open,
                        high,
                        low,
                        close,
                        signal_info,
                        temp_info
                        ):
    # Determine the number of signals, levels, and potential orders
    n_signals = len(signal_info.timestamp)
    n_sl_levels = signal_info.sl.shape[1]
    n_tp_levels = signal_info.tp.shape[1]
    max_orders = 1 + n_sl_levels + n_tp_levels
    
    # TEMPORARY ARRAYS
    # This array will hold the bar where each signal happens
    signal_bars = np.full(n_signals, -1, dtype=np.int_)
    
    # This array will hold order records
    # Initially, order records are uninitialized (junk data) but we will fill them gradually
    # Notice how we use our own data type custom_order_dt - we can fill order type and stop type fields right during the simulation
    order_records = np.empty((max_orders, n_signals), dtype=custom_order_dt)
    
    # To be able to distinguish between uninitialized and initialized (filled) orders,
    # we'll create another array holding the number of filled orders for each signal
    # For example, if order_records has a maximum of 6 rows and only one record is filled,
    # order_counts will be 1 for this signal, so vectorbt can remove 5 unfilled orders later
    order_counts = np.full(n_signals, 0, dtype=np.int_)
    
    # order_func_nb requires last_position, which holds the position of each signal
    last_position = np.full(n_signals, 0.0, dtype=np.float_)
    
    # First, we need to determine the bars where the signals happen
    # Even though we know their timestamps, we need to translate them into absolute indices
    signal = 0
    bar = 0
    while signal < n_signals and bar < len(index):
        if index[bar] == signal_info.timestamp[signal]:
            # If there's a match, save the bar and continue with the next signal on the next bar
            signal_bars[signal] = bar
            signal += 1
            bar += 1
        elif index[bar] > signal_info.timestamp[signal]:
            # If we're past the signal, continue with the next signal on the same bar
            signal += 1
        else:
            # If we haven't hit the signal yet, continue on the next bar
            bar += 1

    # Once we know the bars, we can iterate over signals in a loop and simulate them independently
    for signal in range(n_signals):
        
        # If no bar was matched for this signal above, skip its simulation
        from_bar = signal_bars[signal]
        if from_bar == -1:
            continue
            
        # This is our initial execution state; most importantly, it holds the cash balance
        exec_state = vbt.pf_enums.ExecState(
            cash=100.0,         # We'll start with a starting capital of $100
            position=0.0,
            debt=0.0,
            locked_cash=0.0,
            free_cash=100.0,
            val_price=np.nan,
            value=np.nan
        )
            
        # Here comes the actual simulation that starts from the signal's bar and ends either once we processed all bars
        #  or once the position has been closed out (see below)
        for bar in range(from_bar, len(index)):
            
            # Create a named tuple holding the current context (this is "c" in order_func_nb)
            c = OrderContext(  
                i=bar,
                col=signal,
                index=index,
                open=open,
                high=high,
                low=low,
                close=close,
                last_position=last_position,
            )
            
            # If the first bar has no data, skip the simulation
            if bar == from_bar and not has_data_nb(c):
                break

            # Price area holds the OHLC of the current bar
            price_area = vbt.pf_enums.PriceArea(
                vbt.flex_select_nb(open, bar, signal), 
                vbt.flex_select_nb(high, bar, signal), 
                vbt.flex_select_nb(low, bar, signal), 
                vbt.flex_select_nb(close, bar, signal)
            )
            
            # Why do we need to redefine the execution state?
            # Because we need to manually update the valuation price and the value of the column
            # to be able to use complex size types such as target percentages
            # As in order_func_nb, we will use the opening price as the valuation price
            # Why doesn't vectorbt do it on its own? Because it doesn't know anything about other columns. 
            # For example, imagine having a grouped simulation with 100 columns sharing the same cash: 
            # Using the formula below wouldn't consider the positions of other 99 columns.
            exec_state = vbt.pf_enums.ExecState(
                cash=exec_state.cash,
                position=exec_state.position,
                debt=exec_state.debt,
                locked_cash=exec_state.locked_cash,
                free_cash=exec_state.free_cash,
                val_price=price_area.open,
                value=exec_state.cash + price_area.open * exec_state.position
            )
            
            # Let's run the order function, which returns an order
            # Remember when we used order_nothing_nb()? It also returns an order, but one filled with NaNs
            order = order_func_nb(c, signal_info, temp_info)
            
            # Here's the main function in this entire simulation, which 
            # 1) executes the order,
            # 2) updates the execution state, and 
            # 3) updates the order_records and order_counts
            order_result, exec_state = vbt.pf_nb.process_order_nb(
                signal, ## For Grouping
                signal, ## For column
                bar,
                exec_state=exec_state,
                order=order,
                price_area=price_area,
                order_records=order_records,
                order_counts=order_counts
            )
            
            # When there's no grouping, group = column; columns in our case are signals.
            # If the order was successful (i.e., it's now in order_records),
            # we need to manually set the order type and stop type
            if order_result.status == vbt.pf_enums.OrderStatus.Filled:
                
                # Get the last filled order of this signal
                filled_order = order_records[order_counts[signal] - 1, signal]
                
                # Fill the order type
                filled_order["order_type"] = signal_info.order_type[signal]
                
                # Fill the stop type by going through the SL and TP levels and checking whether 
                # the order bar matches the level bar
                order_is_stop = False
                for k in range(n_sl_levels):
                    if filled_order["idx"] == temp_info.sl_bar[signal, k]:
                        filled_order["stop_type"] = k
                        order_is_stop = True
                        break
                for k in range(n_tp_levels):
                    if filled_order["idx"] == temp_info.tp_bar[signal, k]:
                        filled_order["stop_type"] = n_sl_levels + k  # TP indices come after SL indices
                        order_is_stop = True
                        break
                
                # If order bar hasn't been matched, it's not a stop order
                if not order_is_stop:
                    filled_order["stop_type"] = -1
                    
            # If we've entered and are no longer in a position, terminate the simulation
            if temp_info.entry_price_bar[signal] != -1:
                if exec_state.position == 0:
                    break
                    
            # Don't forget to update the position array
            last_position[signal] = exec_state.position
        
    # Remove uninitialized order records and flatten 2d array into a 1d array
    return vbt.nb.repartition_nb(order_records, order_counts)
In [117]:
# Numba requires arrays in a NumPy format, and to avoid preparing them each time,
# let's create a function that only takes the data and signal information, and does everything else for us

def signal_simulator(data, signal_info):
    temp_info = build_temp_info(signal_info)
    
    custom_order_records = signal_simulator_nb(
        index = data.index.vbt.to_ns(),  # convert to nanoseconds
        open = vbt.to_2d_array(data.open),  # flexible indexing requires inputs to be 2d
        high = vbt.to_2d_array(data.high),
        low = vbt.to_2d_array(data.low),
        close = vbt.to_2d_array(data.close),
        signal_info = signal_info,
        temp_info = temp_info
    )
    
    # We have order records, what's left is wrapping them with a Portfolio
    # Required are three things: 
    # 1) array wrapper with index and columns,
    # 2) order records, and 
    # 3) prices
    # We also need to specify the starting capital that we used during the simulation
    return vbt.Portfolio(
        wrapper=vbt.ArrayWrapper(
            index=data.index, 
            columns=pd.Index(signal_data.get("id").values, name="id"),  # one column per signal, ids as column names
            freq="minute"),
        order_records=custom_order_records,
        open=data.open,
        high=data.high,
        low=data.low,
        close=data.close,
        init_cash=100.0, ## starting capital
        orders_cls=CustomOrders
    )
In [118]:
# That's it!
pf = signal_simulator(data, signal_info)
print('PnL:',pf.trades.pnl.sum(group_by=True))
PnL: -0.5701185962467079
In [119]:
pf.wrapper.columns
Out[119]:
Int64Index([ 846,  854,  866,  884,  887,  898,  942, 1037, 1044, 1063,
            ...
            4381, 4382, 4387, 4390, 4400, 4408, 4411, 4413, 4425, 4432],
           dtype='int64', name='id', length=232)

In the vbt.ArrayWrapper above, inside our signal_simulator function, we pass pd.Index(signal_data.get("id").values, name="id") to the columns argument so that additional information about each signal can be retrieved via its column id in the signal data. The message ids will now be displayed as column names and you can analyze them individually (for example, pf[id].stats() gives you the statistics of a single signal's equity curve). Just don't analyze them all at once, since that would use lots of RAM.

In [120]:
id = 846
pf[id].stats()
Out[120]:
Start                         2021-09-02 00:00:00+00:00
End                           2023-03-13 23:59:00+00:00
Period                                 83 days 06:26:00
Start Value                                       100.0
Min Value                                     99.939939
Max Value                                    100.077239
End Value                                     99.939939
Total Return [%]                              -0.060061
Benchmark Return [%]                           5.439047
Total Time Exposure [%]                        0.010008
Max Gross Exposure [%]                            100.0
Max Drawdown [%]                               0.137193
Max Drawdown Duration                  80 days 00:35:00
Total Orders                                          4
Total Fees Paid                                     0.0
Total Trades                                          3
Win Rate [%]                                  66.666667
Best Trade [%]                                  0.09826
Worst Trade [%]                               -0.292548
Avg Winning Trade [%]                          0.056388
Avg Losing Trade [%]                          -0.292548
Avg Winning Trade Duration              0 days 00:04:00
Avg Losing Trade Duration               0 days 00:12:00
Profit Factor                                  0.384741
Expectancy                                     -0.02002
Sharpe Ratio                                  -1.542342
Calmar Ratio                                  -0.286392
Omega Ratio                                    0.610956
Sortino Ratio                                 -1.929536
Name: 846, dtype: object
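Since analyzing all columns at once would blow up RAM, a RAM-friendlier pattern (a sketch of our own) is to loop over a handful of signal ids and compute one scalar metric at a time:

In [ ]:
# Compute a single metric per signal without materializing full-shape time series
for sid in pf.wrapper.columns[:5]:
    print(sid, pf[sid].total_return)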

Summary of signal_simulator function

  • Basically, we take from_order_func, remove the redundant parts from it, and introduce some optimizations. With from_order_func, each signal (column) required going through the entire data from start to end, but our custom simulator can skip those parts, which makes it much, much faster.
  • We also create the temporary arrays and write our own custom records in-place instead of fixing them after the simulation, which is very convenient.
  • The only part that isn't integrated is merging the records; however, having the records partitioned by signal provides more depth and is overall better for analysis, and the user can still merge them at any time after the simulation with the functions developed previously (see the recap sketch below).
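For completeness, a brief recap sketch reusing the functions defined earlier; we assume the raw records of the custom simulation are accessible via pf.order_records, as the Portfolio repr above suggests:

In [ ]:
# Merge the per-signal order records of the custom simulation and re-wrap them
merged_records = merge_order_records(pf.order_records)
merged_pf = pf.replace(
    order_records=merged_records,
    wrapper=pf.wrapper.replace(columns=[0], ndim=1),
    init_cash="auto"
)
print(merged_pf.stats())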