Load data¶
Make sure you have a .env file in the ttools directory, or in any parent directory, containing your Alpaca keys:
ACCOUNT1_LIVE_API_KEY=api_key
ACCOUNT1_LIVE_SECRET_KEY=secret_key
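The "ttools or any parent dir" lookup can be pictured with a small standard-library sketch. This is only an illustration of the idea, not the loader ttools actually uses; `find_env_file` and `parse_env_file` are hypothetical helper names introduced here.

```python
from pathlib import Path
from typing import Dict, Optional


def find_env_file(start: Path, name: str = ".env") -> Optional[Path]:
    """Return the first `name` file found in `start` or any of its parents.

    Hypothetical helper: ttools may locate its .env file differently.
    """
    start = start.resolve()
    for directory in (start, *start.parents):
        candidate = directory / name
        if candidate.is_file():
            return candidate
    return None


def parse_env_file(path: Path) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw_line in path.read_text().splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values
```

Calling `find_env_file(Path.cwd())` from inside the project returns the nearest .env on the way up to the filesystem root, or `None` if there is none.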
Cache directories¶
- Daily trade files: DATADIR/tradecache
- Agg data cache: DATADIR/aggcache

DATADIR is the user_data_dir from the appdirs library (see config.py).
import pandas as pd
import numpy as np
from datetime import datetime
from ttools.utils import AggType, zoneNY
from ttools.aggregator_vectorized import generate_time_bars_nb, aggregate_trades
from ttools.loaders import load_data, prepare_trade_cache
import vectorbtpro as vbt
from lightweight_charts import PlotDFAccessor, PlotSRAccessor

vbt.settings.set_theme("dark")
vbt.settings['plotting']['layout']['width'] = 1280
vbt.settings.plotting.auto_rangebreaks = True

# Set the option to display with pagination
pd.set_option('display.notebook_repr_html', True)
pd.set_option('display.max_rows', 10)  # Number of rows per page
TTOOLS: Loaded env variables from file /Users/davidbrazda/Documents/Development/python/.env
Fetching aggregated data¶
Available aggregation types:
- time based bars - AggType.OHLCV, resolution = bar length in seconds
- volume based bars - AggType.OHLCV_VOL, resolution = volume threshold
- dollar based bars - AggType.OHLCV_DOL, resolution = dollar threshold
- renko bars - AggType.OHLCV_RENKO, resolution = brick size
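To make the "resolution = threshold" idea concrete, here is a minimal pure-Python sketch of threshold-based bars. It is not the ttools aggregator (which is vectorized); `threshold_bars` and its `measure` parameter are names assumed for this illustration, and a trade that crosses the threshold is kept whole rather than split across bars.

```python
from typing import Callable, Dict, List, Tuple

Trade = Tuple[float, float]  # (price, size)


def threshold_bars(
    trades: List[Trade],
    threshold: float,
    measure: Callable[[float, float], float],
) -> List[Dict[str, float]]:
    """Close a bar once the accumulated measure reaches `threshold`.

    measure = size          -> volume bars (like AggType.OHLCV_VOL)
    measure = price * size  -> dollar bars (like AggType.OHLCV_DOL)
    """
    bars: List[Dict[str, float]] = []
    current = None
    accumulated = 0.0
    for price, size in trades:
        if current is None:
            # First trade of a new bar sets the open.
            current = {"open": price, "high": price, "low": price,
                       "close": price, "volume": 0.0}
            accumulated = 0.0
        current["high"] = max(current["high"], price)
        current["low"] = min(current["low"], price)
        current["close"] = price
        current["volume"] += size
        accumulated += measure(price, size)
        if accumulated >= threshold:
            bars.append(current)
            current = None
    if current is not None:
        bars.append(current)  # trailing, incomplete bar
    return bars


trades = [(10.0, 60.0), (10.5, 50.0), (10.2, 30.0), (9.9, 80.0), (10.1, 20.0)]
vol_bars = threshold_bars(trades, 100, lambda price, size: size)
dol_bars = threshold_bars(trades, 1500, lambda price, size: price * size)
```

With these sample trades the volume threshold of 100 closes a bar after every second trade, while the dollar threshold of 1500 needs the first four trades to close one bar.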
# This is how to call load_data
symbol = ["SPY"]

# Datetimes in zoneNY
day_start = datetime(2024, 2, 15, 9, 30, 0)
day_stop = datetime(2024, 3, 18, 16, 0, 0)
day_start = zoneNY.localize(day_start)
day_stop = zoneNY.localize(day_stop)

# Requested aggregation
resolution = 1  # bar length in seconds for AggType.OHLCV (1 s bars here; 12 gives 12 s bars)
agg_type = AggType.OHLCV  # other types: AggType.OHLCV_VOL, AggType.OHLCV_DOL, AggType.OHLCV_RENKO
exclude_conditions = ['C','O','4','B','7','V','P','W','U','Z','F','9','M','6']  # None uses the defaults
minsize = 100  # minimum trade size to include
main_session_only = True
force_remote = False

data = load_data(
    symbol=symbol,
    agg_type=agg_type,
    resolution=resolution,
    start_date=day_start,
    end_date=day_stop,
    # exclude_conditions=None,
    minsize=minsize,
    main_session_only=main_session_only,
    force_remote=force_remote,
    return_vbt=True,  # returns a vbt object
    verbose=False,
)
data.ohlcv.data[symbol[0]]
# data.ohlcv.data[symbol[0]].lw.plot()
| time | open | high | low | close | volume |
|---|---|---|---|---|---|
| 2024-02-15 09:30:00-05:00 | 499.29 | 499.41 | 499.2900 | 499.3200 | 161900.0 |
| 2024-02-15 09:30:01-05:00 | 499.32 | 499.41 | 499.3000 | 499.4000 | 10900.0 |
| 2024-02-15 09:30:02-05:00 | 499.36 | 499.40 | 499.3550 | 499.3800 | 7040.0 |
| 2024-02-15 09:30:03-05:00 | 499.39 | 499.42 | 499.3800 | 499.4000 | 8717.0 |
| 2024-02-15 09:30:04-05:00 | 499.40 | 499.40 | 499.3500 | 499.3500 | 3265.0 |
| ... | ... | ... | ... | ... | ... |
| 2024-03-18 15:59:55-04:00 | 512.94 | 512.94 | 512.8600 | 512.8900 | 7345.0 |
| 2024-03-18 15:59:56-04:00 | 512.90 | 512.90 | 512.8700 | 512.8800 | 2551.0 |
| 2024-03-18 15:59:57-04:00 | 512.89 | 512.91 | 512.8500 | 512.8701 | 18063.0 |
| 2024-03-18 15:59:58-04:00 | 512.87 | 512.90 | 512.8496 | 512.9000 | 7734.0 |
| 2024-03-18 15:59:59-04:00 | 512.92 | 512.92 | 512.8200 | 512.8700 | 37159.0 |
417345 rows × 5 columns
data.ohlcv.data[symbol[0]]
| time | open | high | low | close | volume |
|---|---|---|---|---|---|
| 2024-10-14 09:45:00-04:00 | 41.9650 | 41.970 | 41.950 | 41.9500 | 17895.0 |
| 2024-10-14 09:45:12-04:00 | 41.9589 | 41.965 | 41.950 | 41.9650 | 6281.0 |
| 2024-10-14 09:45:24-04:00 | 41.9650 | 42.005 | 41.965 | 41.9975 | 3522.0 |
| 2024-10-14 09:45:36-04:00 | 41.9900 | 42.005 | 41.990 | 42.0000 | 5960.0 |
| 2024-10-14 09:45:48-04:00 | 42.0050 | 42.040 | 42.005 | 42.0300 | 9113.0 |
| ... | ... | ... | ... | ... | ... |
| 2024-10-16 15:00:00-04:00 | 42.9150 | 42.915 | 42.910 | 42.9100 | 12872.0 |
| 2024-10-16 15:00:12-04:00 | 42.9150 | 42.920 | 42.910 | 42.9200 | 7574.0 |
| 2024-10-16 15:00:24-04:00 | 42.9200 | 42.920 | 42.910 | 42.9200 | 1769.0 |
| 2024-10-16 15:00:36-04:00 | 42.9200 | 42.920 | 42.905 | 42.9050 | 26599.0 |
| 2024-10-16 15:00:48-04:00 | 42.9050 | 42.905 | 42.880 | 42.8800 | 9216.0 |
5480 rows × 5 columns
Prepare daily trade cache¶
This is how to prepare the trade cache for the given symbols and period. Daily trades that are not yet cached are fetched remotely.
symbols = ["BAC", "AAPL"]

# Datetimes in zoneNY
day_start = datetime(2024, 10, 1, 9, 45, 0)
day_stop = datetime(2024, 10, 27, 15, 1, 0)
day_start = zoneNY.localize(day_start)
day_stop = zoneNY.localize(day_stop)
force_remote = False

prepare_trade_cache(symbols, day_start, day_stop, force_remote, verbose=True)
Prepare daily trade cache - CLI script¶
The Python script prepares the trade cache for the specified symbols and date range.
One day usually takes about 35 s. Trades are stored in the /tradescache/ directory as daily files keyed by symbol.
To run this script in the background with specific arguments:
# Running without forcing remote fetch
python3 prepare_cache.py --symbols BAC AAPL --day_start 2024-10-14 --day_stop 2024-10-18 &

# Running with force_remote set to True
python3 prepare_cache.py --symbols BAC AAPL --day_start 2024-10-14 --day_stop 2024-10-18 --force_remote &
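For orientation, the flags used in the commands above could be parsed roughly as follows. This is a sketch under assumptions, not the contents of prepare_cache.py: only the flag names come from the commands, while `build_parser`, `parse_day`, and the YYYY-MM-DD date handling are hypothetical.

```python
import argparse
from datetime import datetime


def parse_day(value: str) -> datetime:
    """Parse a YYYY-MM-DD command-line argument into a naive datetime."""
    return datetime.strptime(value, "%Y-%m-%d")


def build_parser() -> argparse.ArgumentParser:
    # Flag names mirror the shell commands; everything else is assumed.
    parser = argparse.ArgumentParser(description="Prepare the daily trade cache.")
    parser.add_argument("--symbols", nargs="+", required=True,
                        help="One or more ticker symbols, e.g. BAC AAPL")
    parser.add_argument("--day_start", type=parse_day, required=True)
    parser.add_argument("--day_stop", type=parse_day, required=True)
    parser.add_argument("--force_remote", action="store_true",
                        help="Fetch from remote even if a cached file exists")
    return parser


args = build_parser().parse_args(
    ["--symbols", "BAC", "AAPL",
     "--day_start", "2024-10-14", "--day_stop", "2024-10-18"]
)
```

The parsed dates would still need to be localized with `zoneNY.localize(...)` before being passed to `prepare_trade_cache`, as in the notebook cell above.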
Aggregated data are stored per symbol, date range and conditions. If the requested dates fall within an existing stored file that has the same conditions but a wider date span, the data are loaded from that file.
This is the matching part:
from ttools.utils import list_matching_files, print_matching_files_info, zoneNY
from datetime import datetime
from ttools.config import AGG_CACHE

# Find all files covering January 15, 2024 9:30 to 16:00
files = list_matching_files(
    symbol='SPY',
    resolution="1",
    agg_type='AggType.OHLCV',
    start_date=datetime(2024, 1, 15, 9, 30),
    end_date=datetime(2024, 1, 15, 16, 0)
)
#print_matching_files_info(files)

# Example with all parameters specified
specific_files = list_matching_files(
    symbol="SPY",
    agg_type="AggType.OHLCV",
    resolution="12",
    start_date=zoneNY.localize(datetime(2024, 1, 15, 9, 30)),
    end_date=zoneNY.localize(datetime(2024, 1, 15, 16, 0)),
    excludes_str="4679BCFMOPUVWZ",
    minsize=100,
    main_session_only=True
)
print_matching_files_info(specific_files)
File: SPY-AggType.OHLCV-12-2024-01-15T09-30-00-2024-10-20T16-00-00-4679BCFMOPUVWZ-100-True.parquet Coverage: 2024-01-15 09:30:00 to 2024-10-20 16:00:00 Symbol: SPY Agg Type: AggType.OHLCV Resolution: 12 Excludes: 4679BCFMOPUVWZ Minsize: 100 Main Session Only: True --------------------------------------------------------------------------------
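The file name printed above appears to encode every parameter of the request, which is what makes the coverage check possible. The sketch below parses such a name and tests whether it covers a requested range. The pattern is inferred from this one example, so treat it as an assumption rather than the format ttools guarantees; `CachedAgg`, `parse_cache_name`, and `covers` are hypothetical names, and naive datetimes are used for brevity where ttools works with zoneNY-aware ones.

```python
import re
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

TS = r"\d{4}-\d{2}-\d{2}T\d{2}-\d{2}-\d{2}"
NAME_RE = re.compile(
    rf"^(?P<symbol>[^-]+)-(?P<agg_type>AggType\.\w+)-(?P<resolution>\d+)"
    rf"-(?P<start>{TS})-(?P<end>{TS})"
    rf"-(?P<excludes>[^-]*)-(?P<minsize>\d+)-(?P<main>True|False)\.parquet$"
)


@dataclass
class CachedAgg:
    symbol: str
    agg_type: str
    resolution: int
    start: datetime
    end: datetime
    excludes: str
    minsize: int
    main_session_only: bool


def parse_cache_name(name: str) -> Optional[CachedAgg]:
    """Split a cache file name into its fields; None if it does not match."""
    m = NAME_RE.match(name)
    if m is None:
        return None
    fmt = "%Y-%m-%dT%H-%M-%S"
    return CachedAgg(
        symbol=m["symbol"],
        agg_type=m["agg_type"],
        resolution=int(m["resolution"]),
        start=datetime.strptime(m["start"], fmt),
        end=datetime.strptime(m["end"], fmt),
        excludes=m["excludes"],
        minsize=int(m["minsize"]),
        main_session_only=(m["main"] == "True"),
    )


def covers(cached: CachedAgg, start: datetime, end: datetime) -> bool:
    """True if the cached span is at least as wide as the requested one."""
    return cached.start <= start and cached.end >= end


info = parse_cache_name(
    "SPY-AggType.OHLCV-12-2024-01-15T09-30-00-2024-10-20T16-00-00"
    "-4679BCFMOPUVWZ-100-True.parquet"
)
# The excludes field equals the sorted, joined exclude_conditions list
# used earlier in this notebook (an observation, not a documented rule).
excludes_str = "".join(sorted(
    ['C', 'O', '4', 'B', '7', 'V', 'P', 'W', 'U', 'Z', 'F', '9', 'M', '6']
))
```

A request for a single day inside the span, such as 2024-01-15 09:30 to 16:00, is covered by this file; a request that ends after 2024-10-20 16:00 is not.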
The requested date subset is then loaded from the parquet file. Usually all of this is done by load_data in the loader.
start = zoneNY.localize(datetime(2024, 1, 15, 9, 30))
end = zoneNY.localize(datetime(2024, 10, 20, 16, 0))

ohlcv_df = pd.read_parquet(
    AGG_CACHE / "SPY-AggType.OHLCV-1-2024-01-15T09-30-00-2024-10-20T16-00-00-4679BCFMOPUVWZ-100-True.parquet",
    engine='pyarrow',
    filters=[('time', '>=', start), ('time', '<=', end)]
)
ohlcv_df
| time | open | high | low | close | volume | trades | updated | vwap | buyvolume | sellvolume |
|---|---|---|---|---|---|---|---|---|---|---|
| 2024-01-16 09:30:00-05:00 | 475.250 | 475.3600 | 475.20 | 475.285 | 255386.0 | 93.0 | 2024-01-16 09:30:01.002183-05:00 | 475.251725 | 3692.0 | 242756.0 |
| 2024-01-16 09:30:01-05:00 | 475.335 | 475.3350 | 475.23 | 475.260 | 15161.0 | 100.0 | 2024-01-16 09:30:02.007313-05:00 | 475.283390 | 4386.0 | 4944.0 |
| 2024-01-16 09:30:02-05:00 | 475.250 | 475.3000 | 475.24 | 475.300 | 6993.0 | 39.0 | 2024-01-16 09:30:03.008912-05:00 | 475.262507 | 1900.0 | 2256.0 |
| 2024-01-16 09:30:03-05:00 | 475.290 | 475.3200 | 475.24 | 475.270 | 8497.0 | 47.0 | 2024-01-16 09:30:04.201093-05:00 | 475.275280 | 1300.0 | 3200.0 |
| 2024-01-16 09:30:04-05:00 | 475.250 | 475.2700 | 475.22 | 475.270 | 5367.0 | 37.0 | 2024-01-16 09:30:05.004980-05:00 | 475.234353 | 1613.0 | 1247.0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2024-10-18 15:59:55-04:00 | 584.520 | 584.5800 | 584.51 | 584.580 | 10357.0 | 47.0 | 2024-10-18 15:59:56.008928-04:00 | 584.543870 | 1600.0 | 1100.0 |
| 2024-10-18 15:59:56-04:00 | 584.570 | 584.6091 | 584.55 | 584.550 | 6527.0 | 32.0 | 2024-10-18 15:59:57.007658-04:00 | 584.566643 | 1525.0 | 1002.0 |
| 2024-10-18 15:59:57-04:00 | 584.560 | 584.6100 | 584.56 | 584.600 | 5068.0 | 23.0 | 2024-10-18 15:59:58.000435-04:00 | 584.596249 | 1960.0 | 900.0 |
| 2024-10-18 15:59:58-04:00 | 584.590 | 584.6200 | 584.56 | 584.560 | 8786.0 | 23.0 | 2024-10-18 15:59:59.041984-04:00 | 584.592217 | 2859.0 | 3921.0 |
| 2024-10-18 15:59:59-04:00 | 584.560 | 584.6100 | 584.56 | 584.570 | 12583.0 | 69.0 | 2024-10-18 15:59:59.982132-04:00 | 584.583131 | 5303.0 | 1980.0 |
3384529 rows × 10 columns