Files
strategy-lab/to_explore/pyquantnews/45_AlphaLens.ipynb
David Brazda e3da60c647 daily update
2024-10-21 20:57:56 +02:00

730 lines
16 KiB
Plaintext

{
"cells": [
{
"cell_type": "markdown",
"id": "af91626b",
"metadata": {},
"source": [
"<div style=\"background-color:#000;\"><img src=\"pqn.png\"></img></div>"
]
},
{
"cell_type": "markdown",
"id": "42f5827b",
"metadata": {},
"source": [
"This notebook implements a momentum-based trading strategy using Zipline and performs performance analysis with Alphalens. It defines custom factors to measure stock momentum and uses these factors to build a trading pipeline. The strategy is then backtested over a specified period, and results are analyzed using information coefficients and other metrics. This is useful for quantitative trading, backtesting, and performance evaluation of trading strategies."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "45bbe3ff",
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"from scipy import stats\n",
"import matplotlib.pyplot as plt"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f537421d",
"metadata": {},
"outputs": [],
"source": [
"from zipline import run_algorithm\n",
"from zipline.pipeline import Pipeline, CustomFactor\n",
"from zipline.pipeline.data import USEquityPricing\n",
"from zipline.api import (\n",
" attach_pipeline,\n",
" calendars,\n",
" pipeline_output,\n",
" date_rules,\n",
" time_rules,\n",
" set_commission,\n",
" set_slippage,\n",
" record,\n",
" order_target_percent,\n",
" get_open_orders,\n",
" schedule_function\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8182c403",
"metadata": {},
"outputs": [],
"source": [
"from alphalens.utils import (\n",
" get_clean_factor_and_forward_returns,\n",
" get_forward_returns_columns\n",
")\n",
"from alphalens.plotting import plot_ic_ts\n",
"from alphalens.performance import factor_information_coefficient, mean_information_coefficient"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "16f879d5",
"metadata": {
"lines_to_next_cell": 2
},
"outputs": [],
"source": [
"import warnings\n",
"warnings.filterwarnings(\"ignore\")"
]
},
{
"cell_type": "markdown",
"id": "9aa346e9",
"metadata": {},
"source": [
"Define the number of long and short positions to hold in the portfolio"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c5a639c8",
"metadata": {},
"outputs": [],
"source": [
"N_LONGS = N_SHORTS = 10"
]
},
{
"cell_type": "markdown",
"id": "cd497509",
"metadata": {},
"source": [
"Define a custom factor class to compute momentum"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f9d77ed6",
"metadata": {},
"outputs": [],
"source": [
"class Momentum(CustomFactor):\n",
" \"\"\"Computes momentum factor for assets.\"\"\"\n",
" \n",
" inputs = [USEquityPricing.close]\n",
"\n",
" def compute(self, today, assets, out, close):\n",
" \"\"\"Computes momentum as the ratio of latest to earliest closing price over the window.\"\"\"\n",
" out[:] = close[-1] / close[0]"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "17b1ae3f",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"id": "977f61af",
"metadata": {},
"source": [
"Define a pipeline to fetch and filter stocks based on momentum"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ea9970cd",
"metadata": {},
"outputs": [],
"source": [
"def make_pipeline():\n",
" \"\"\"Creates a pipeline for fetching and filtering stocks based on momentum.\"\"\"\n",
" \n",
" twenty_day_momentum = Momentum(window_length=20)\n",
" thirty_day_momentum = Momentum(window_length=30)\n",
"\n",
" positive_momentum = (\n",
" (twenty_day_momentum > 1) & \n",
" (thirty_day_momentum > 1)\n",
" )\n",
"\n",
" return Pipeline(\n",
" columns={\n",
" 'longs': thirty_day_momentum.top(N_LONGS),\n",
" 'shorts': thirty_day_momentum.top(N_SHORTS),\n",
" 'ranking': thirty_day_momentum.rank(ascending=False)\n",
" },\n",
" screen=positive_momentum\n",
" )"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1c8c73f8",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"id": "1494ac97",
"metadata": {},
"source": [
"Define a function that runs before the trading day starts"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ac225d0c",
"metadata": {},
"outputs": [],
"source": [
"def before_trading_start(context, data):\n",
" \"\"\"Fetches factor data before trading starts.\"\"\"\n",
" \n",
" context.factor_data = pipeline_output(\"factor_pipeline\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c7b1854b",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"id": "f1bd1bb6",
"metadata": {},
"source": [
"Define the initialize function to set up the algorithm"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cf08b200",
"metadata": {},
"outputs": [],
"source": [
"def initialize(context):\n",
" \"\"\"Initializes the trading algorithm and schedules rebalancing.\"\"\"\n",
" \n",
" attach_pipeline(make_pipeline(), \"factor_pipeline\")\n",
" schedule_function(\n",
" rebalance,\n",
" date_rules.week_start(),\n",
" time_rules.market_open(),\n",
" calendar=calendars.US_EQUITIES,\n",
" )"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cda91bd5",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"id": "a1eebf33",
"metadata": {},
"source": [
"Define the rebalancing function to execute trades based on factor data"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "dd4c8964",
"metadata": {},
"outputs": [],
"source": [
"def rebalance(context, data):\n",
" \"\"\"Rebalances the portfolio based on factor data.\"\"\"\n",
" \n",
" factor_data = context.factor_data\n",
" record(factor_data=factor_data.ranking)\n",
"\n",
" assets = factor_data.index\n",
" record(prices=data.current(assets, 'price'))\n",
"\n",
" longs = assets[factor_data.longs]\n",
" shorts = assets[factor_data.shorts]\n",
" divest = set(context.portfolio.positions.keys()) - set(longs.union(shorts))\n",
"\n",
" exec_trades(data, assets=divest, target_percent=0)\n",
" exec_trades(data, assets=longs, target_percent=1 / N_LONGS)\n",
" exec_trades(data, assets=shorts, target_percent=-1 / N_SHORTS)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "25e1653e",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"id": "ae353e46",
"metadata": {},
"source": [
"Define a function to execute trades for given assets and target percentages"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2d55bfdf",
"metadata": {},
"outputs": [],
"source": [
"def exec_trades(data, assets, target_percent):\n",
" \"\"\"Executes trades for given assets to achieve target portfolio weights.\"\"\"\n",
" \n",
" for asset in assets:\n",
" if data.can_trade(asset) and not get_open_orders(asset):\n",
" order_target_percent(asset, target_percent)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6b8b6b5a",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"id": "17866c5a",
"metadata": {},
"source": [
"Define the backtesting period"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cffe0933",
"metadata": {},
"outputs": [],
"source": [
"start = pd.Timestamp('2015')\n",
"end = pd.Timestamp('2018')"
]
},
{
"cell_type": "markdown",
"id": "9cc5a44e",
"metadata": {},
"source": [
"Run the backtest using the specified start and end dates"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9bdcefdf",
"metadata": {},
"outputs": [],
"source": [
"perf = run_algorithm(\n",
" start=start,\n",
" end=end,\n",
" initialize=initialize,\n",
" before_trading_start=before_trading_start,\n",
" capital_base=100_000,\n",
" bundle=\"quandl\",\n",
")"
]
},
{
"cell_type": "markdown",
"id": "1b744283",
"metadata": {},
"source": [
"Get the final portfolio value"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "fe5fc4f4",
"metadata": {
"lines_to_next_cell": 2
},
"outputs": [],
"source": [
"perf.portfolio_value"
]
},
{
"cell_type": "markdown",
"id": "76841808",
"metadata": {},
"source": [
"Construct a DataFrame with stock prices and corresponding dates"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d155f3a5",
"metadata": {},
"outputs": [],
"source": [
"prices = pd.concat(\n",
" [df.to_frame(d) for d, df in perf.prices.dropna().items()], \n",
" axis=1\n",
").T"
]
},
{
"cell_type": "markdown",
"id": "1b843ad7",
"metadata": {},
"source": [
"Convert column names to stock symbols"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d2c08533",
"metadata": {},
"outputs": [],
"source": [
"prices.columns = [col.symbol for col in prices.columns]"
]
},
{
"cell_type": "markdown",
"id": "80f6ea3a",
"metadata": {},
"source": [
"Normalize the index to midnight, preserving timezone information"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f0e039f8",
"metadata": {
"lines_to_next_cell": 2
},
"outputs": [],
"source": [
"prices.index = prices.index.normalize()"
]
},
{
"cell_type": "markdown",
"id": "024c97d3",
"metadata": {},
"source": [
"Construct a DataFrame with factor ranks and corresponding dates"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "eca244f6",
"metadata": {},
"outputs": [],
"source": [
"factor_data = pd.concat(\n",
" [df.to_frame(d) for d, df in perf.factor_data.dropna().items()],\n",
" axis=1\n",
").T"
]
},
{
"cell_type": "markdown",
"id": "37bcda09",
"metadata": {},
"source": [
"Convert column names to stock symbols"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0e14e9b5",
"metadata": {},
"outputs": [],
"source": [
"factor_data.columns = [col.symbol for col in factor_data.columns]"
]
},
{
"cell_type": "markdown",
"id": "e0827b1d",
"metadata": {},
"source": [
"Normalize the index to midnight, preserving timezone information"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "50c1200d",
"metadata": {},
"outputs": [],
"source": [
"factor_data.index = factor_data.index.normalize()"
]
},
{
"cell_type": "markdown",
"id": "6ecbce84",
"metadata": {},
"source": [
"Create a MultiIndex with date and asset"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "79888180",
"metadata": {},
"outputs": [],
"source": [
"factor_data = factor_data.stack()"
]
},
{
"cell_type": "markdown",
"id": "675d5dc9",
"metadata": {},
"source": [
"Rename the MultiIndex levels"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2050795f",
"metadata": {
"lines_to_next_cell": 2
},
"outputs": [],
"source": [
"factor_data.index.names = ['date', 'asset']"
]
},
{
"cell_type": "markdown",
"id": "d6a5240f",
"metadata": {},
"source": [
"Compile forward returns, factor ranks, and factor quantiles"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "fe1db4bd",
"metadata": {},
"outputs": [],
"source": [
"alphalens_data = get_clean_factor_and_forward_returns(\n",
" factor=factor_data, prices=prices, periods=(5, 10, 21, 63), quantiles=5\n",
")"
]
},
{
"cell_type": "markdown",
"id": "cda0df84",
"metadata": {},
"source": [
"Display the compiled Alphalens data"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "70e9a889",
"metadata": {
"lines_to_next_cell": 2
},
"outputs": [],
"source": [
"alphalens_data"
]
},
{
"cell_type": "markdown",
"id": "38144b65",
"metadata": {},
"source": [
"Generate the information coefficient for each holding period on each date"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3af97fa7",
"metadata": {
"lines_to_next_cell": 2
},
"outputs": [],
"source": [
"ic = factor_information_coefficient(alphalens_data)"
]
},
{
"cell_type": "markdown",
"id": "b256162d",
"metadata": {},
"source": [
"Calculate the mean information coefficient over all periods"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "254c0617",
"metadata": {},
"outputs": [],
"source": [
"ii = mean_information_coefficient(alphalens_data)"
]
},
{
"cell_type": "markdown",
"id": "3007dad5",
"metadata": {},
"source": [
"Plot the mean information coefficient"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3ba48590",
"metadata": {},
"outputs": [],
"source": [
"ii.plot()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "56d5a42a",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"id": "f12e37c8",
"metadata": {},
"source": [
"Plot the information coefficient for the 5-day holding period"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "56d4b894",
"metadata": {},
"outputs": [],
"source": [
"plot_ic_ts(ic[[\"5D\"]])\n",
"plt.tight_layout()"
]
},
{
"cell_type": "markdown",
"id": "b24451da",
"metadata": {},
"source": [
"Calculate mean information coefficient per holding period per year"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "dda93f62",
"metadata": {},
"outputs": [],
"source": [
"ic_by_year = ic.resample('A').mean()\n",
"ic_by_year.index = ic_by_year.index.year"
]
},
{
"cell_type": "markdown",
"id": "a2167b6a",
"metadata": {},
"source": [
"Plot the mean information coefficient by year"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1ea9e9bb",
"metadata": {},
"outputs": [],
"source": [
"ic_by_year.plot.bar(figsize=(14, 6))\n",
"plt.tight_layout()"
]
},
{
"cell_type": "markdown",
"id": "2003869d",
"metadata": {},
"source": [
"<a href=\"https://pyquantnews.com/\">PyQuant News</a> is where finance practitioners level up with Python for quant finance, algorithmic trading, and market data analysis. Looking to get started? Check out the fastest growing, top-selling course to <a href=\"https://gettingstartedwithpythonforquantfinance.com/\">get started with Python for quant finance</a>. For educational purposes. Not investment advise. Use at your own risk."
]
}
],
"metadata": {
"jupytext": {
"cell_metadata_filter": "-all",
"main_language": "python",
"notebook_metadata_filter": "-all"
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}