Files
strategy-lab/to_explore/pyquantnews/75_ArcticDBOptions.ipynb
David Brazda e3da60c647 daily update
2024-10-21 20:57:56 +02:00

409 lines
11 KiB
Plaintext

{
"cells": [
{
"cell_type": "markdown",
"id": "e2c3f24b",
"metadata": {},
"source": [
"<div style=\"background-color:#000;\"><img src=\"pqn.png\"></img></div>"
]
},
{
"cell_type": "markdown",
"id": "35d6d902",
"metadata": {},
"source": [
"This code processes and analyzes equity options data, extracting volatility curves for specific options chains. It reads options data from CSV files, stores them in an Arctic database, and then queries the database to retrieve options chains for analysis. The code includes functions to read the options data, filter it based on specific criteria, and query it for expiration dates. Finally, it plots the implied volatility curves for the options, visualizing how volatility changes with different strikes and expiration dates."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c11513ab",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import glob\n",
"import matplotlib.pyplot as plt\n",
"import datetime as dt"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "92e29c6f",
"metadata": {},
"outputs": [],
"source": [
"import arcticdb as adb\n",
"import pandas as pd"
]
},
{
"cell_type": "markdown",
"id": "ae881285",
"metadata": {},
"source": [
"Initialize an Arctic database connection and get the \"options\" library"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8669ee9a",
"metadata": {
"lines_to_next_cell": 1
},
"outputs": [],
"source": [
"arctic = adb.Arctic(\"lmdb://equity_options\")\n",
"lib = arctic.get_library(\"options\", create_if_missing=True)"
]
},
{
"cell_type": "markdown",
"id": "23e2dfbe",
"metadata": {},
"source": [
"Define a function to read options chains from CSV files"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c99d8bf0",
"metadata": {
"lines_to_next_cell": 1
},
"outputs": [],
"source": [
"def read_chains(fl):\n",
" \"\"\"Reads options chains from a CSV file\n",
" \n",
" Parameters\n",
" ----------\n",
" fl : str\n",
" Path to the CSV file\n",
" \n",
" Returns\n",
" -------\n",
" df : pd.DataFrame\n",
" Dataframe containing the options chains\n",
" \"\"\"\n",
" \n",
" # Read the CSV file, set the \"date\" column as the index, and convert to datetime\n",
" df = (\n",
" pd\n",
" .read_csv(fl)\n",
" .set_index(\"date\")\n",
" )\n",
" df.index = pd.to_datetime(df.index)\n",
" return df"
]
},
{
"cell_type": "markdown",
"id": "691e962c",
"metadata": {},
"source": [
"Read all CSV files from the \"rut-eod\" directory and store their data in the Arctic database"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "12d9662e",
"metadata": {},
"outputs": [],
"source": [
"files = glob.glob(os.path.join(\"rut-eod\", \"*.csv\"))\n",
"for fl in files:\n",
" chains = read_chains(fl)\n",
" chains.option_expiration = pd.to_datetime(chains.option_expiration)\n",
" underlyings = chains.symbol.unique()\n",
" for underlying in underlyings:\n",
" df = chains[chains.symbol == underlying]\n",
" adb_sym = f\"options/{underlying}\"\n",
" adb_fcn = lib.update if lib.has_symbol(adb_sym) else lib.write\n",
" adb_fcn(adb_sym, df)"
]
},
{
"cell_type": "markdown",
"id": "7921ad3e",
"metadata": {},
"source": [
"Read data for the \"RUT\" underlying from the Arctic database"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d66b0041",
"metadata": {},
"outputs": [],
"source": [
"rut = lib.read(\"options/RUT\").data"
]
},
{
"cell_type": "markdown",
"id": "e836326c",
"metadata": {},
"source": [
"Print the minimum and maximum dates from the \"RUT\" data"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9868b99f",
"metadata": {},
"outputs": [],
"source": [
"rut.index.min(), rut.index.max()"
]
},
{
"cell_type": "markdown",
"id": "c6bbb827",
"metadata": {},
"source": [
"Print information about the \"RUT\" dataframe"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "83ec5802",
"metadata": {
"lines_to_next_cell": 1
},
"outputs": [],
"source": [
"rut.info()"
]
},
{
"cell_type": "markdown",
"id": "1f7d0264",
"metadata": {},
"source": [
"Define a function to read the volatility curve for a specific options chain"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "18f49a7c",
"metadata": {
"lines_to_next_cell": 1
},
"outputs": [],
"source": [
"def read_vol_curve(as_of_date, underlying, expiry, delta_low, delta_high):\n",
" \"\"\"Reads the volatility curve for specific options\n",
" \n",
" Parameters\n",
" ----------\n",
" as_of_date : pd.Timestamp\n",
" The date for which to read the data\n",
" underlying : str\n",
" The underlying asset symbol\n",
" expiry : pd.Timestamp\n",
" The expiration date of the options\n",
" delta_low : float\n",
" The lower bound of the delta range\n",
" delta_high : float\n",
" The upper bound of the delta range\n",
" \n",
" Returns\n",
" -------\n",
" df : pd.DataFrame\n",
" Dataframe containing the volatility curve\n",
" \"\"\"\n",
" \n",
" # Build a query to filter options data based on expiration date and delta range\n",
" q = adb.QueryBuilder()\n",
" filter = (\n",
" (q[\"option_expiration\"] == expiry) & \n",
" (\n",
" \t(\n",
" \t\t(q[\"delta\"] >= delta_low) & (q[\"delta\"] <= delta_high)\n",
" \t) | (\n",
" \t\t(q[\"delta\"] >= -delta_high) & (q[\"delta\"] <= -delta_low)\n",
" \t)\n",
" )\n",
" )\n",
" q = (\n",
" q[filter]\n",
" .groupby(\"strike\")\n",
" .agg({\"iv\": \"mean\"})\n",
" )\n",
" \n",
" # Read the filtered data from the Arctic database\n",
" return lib.read(\n",
" f\"options/{underlying}\", \n",
" date_range=(as_of_date, as_of_date),\n",
" query_builder=q\n",
" ).data"
]
},
{
"cell_type": "markdown",
"id": "86dbe228",
"metadata": {},
"source": [
"Define a function to query expiration dates for options"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9d68eb55",
"metadata": {
"lines_to_next_cell": 1
},
"outputs": [],
"source": [
"def query_expirations(as_of_date, underlying, dte=30):\n",
" \"\"\"Queries expiration dates for options\n",
" \n",
" Parameters\n",
" ----------\n",
" as_of_date : pd.Timestamp\n",
" The date for which to query the data\n",
" underlying : str\n",
" The underlying asset symbol\n",
" dte : int, optional\n",
" Days to expiration threshold, by default 30\n",
" \n",
" Returns\n",
" -------\n",
" expirations : pd.Index\n",
" Index containing the expiration dates\n",
" \"\"\"\n",
" \n",
" # Build a query to filter options data based on expiration date threshold\n",
" q = adb.QueryBuilder()\n",
" filter = (q.option_expiration > as_of_date + dt.timedelta(days=dte))\n",
" q = q[filter].groupby(\"option_expiration\").agg({\"volume\": \"sum\"})\n",
" \n",
" # Read the filtered data from the Arctic database and sort by expiration date\n",
" return (\n",
" lib\n",
" .read(\n",
" f\"options/{underlying}\", \n",
" date_range=(as_of_date, as_of_date), \n",
" query_builder=q\n",
" )\n",
" .data\n",
" .sort_index()\n",
" .index\n",
" )"
]
},
{
"cell_type": "markdown",
"id": "a344b771",
"metadata": {},
"source": [
"Define input parameters for querying and plotting the volatility curves"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d8c667fe",
"metadata": {},
"outputs": [],
"source": [
"as_of_date = pd.Timestamp(\"2013-06-03\")\n",
"expiry = pd.Timestamp(\"2013-06-22\")\n",
"underlying = \"RUT\"\n",
"dte = 30\n",
"delta_low = 0.05\n",
"delta_high = 0.50"
]
},
{
"cell_type": "markdown",
"id": "421cce89",
"metadata": {},
"source": [
"Query expiration dates for the given underlying asset and date"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f138d4b9",
"metadata": {},
"outputs": [],
"source": [
"expiries = query_expirations(as_of_date, underlying, dte)"
]
},
{
"cell_type": "markdown",
"id": "37f2012c",
"metadata": {},
"source": [
"Plot the implied volatility curves for the retrieved expiration dates"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8152b931",
"metadata": {},
"outputs": [],
"source": [
"_, ax = plt.subplots(1, 1)\n",
"cmap = plt.get_cmap(\"rainbow\", len(expiries))\n",
"format_kw = {\"linewidth\": 0.5, \"alpha\": 0.85}\n",
"for i, expiry in enumerate(expiries):\n",
" curve = read_vol_curve(\n",
" as_of_date, \n",
" underlying, \n",
" expiry, \n",
" delta_low, \n",
" delta_high\n",
" )\n",
" (\n",
" curve\n",
" .sort_index()\n",
" .plot(\n",
" ax=ax, \n",
" y=\"iv\", \n",
" label=expiry.strftime(\"%Y-%m-%d\"),\n",
" grid=True,\n",
" color=cmap(i),\n",
" **format_kw\n",
" )\n",
" )\n",
"ax.set_ylabel(\"implied volatility\")\n",
"ax.legend(loc=\"upper right\", framealpha=0.7)"
]
},
{
"cell_type": "markdown",
"id": "f120b062",
"metadata": {},
"source": [
"<a href=\"https://pyquantnews.com/\">PyQuant News</a> is where finance practitioners level up with Python for quant finance, algorithmic trading, and market data analysis. Looking to get started? Check out the fastest growing, top-selling course to <a href=\"https://gettingstartedwithpythonforquantfinance.com/\">get started with Python for quant finance</a>. For educational purposes. Not investment advise. Use at your own risk."
]
}
],
"metadata": {
"jupytext": {
"cell_metadata_filter": "-all",
"main_language": "python",
"notebook_metadata_filter": "-all"
}
},
"nbformat": 4,
"nbformat_minor": 5
}