This code downloads historical stock prices, computes daily returns, and measures how correlated those returns are. It plots the prices alongside a correlation heatmap, then uses OLS regression to strip market and sector effects from each stock's returns, leaving residuals. Finally, it plots the cumulative returns of the residuals and shows how forecast correlation reduces effective portfolio breadth. These steps are useful for risk management and portfolio optimization.
import warnings

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
import statsmodels.api as sm
import yfinance as yf
warnings.filterwarnings("ignore")
Define a list of stock tickers to download data for
tickers = ["WFC", "JPM", "USB", "XOM", "VLO", "SLB"]
Download historical stock price data from Yahoo Finance for the specified tickers
data = yf.download(tickers, start="2015-01-01", end="2023-12-31")["Close"]
Calculate daily percentage returns and drop any missing values
returns = data.pct_change().dropna()
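To make the returns step concrete, here is a minimal sketch on a hypothetical price table. The tickers `AAA` and `BBB` and their prices are made up for illustration and are not part of the original data. `pct_change` computes `p_t / p_{t-1} - 1` column by column, and `dropna` removes any row containing a missing value, which always includes the first row because it has no prior price.

```python
import pandas as pd

# Hypothetical prices for two made-up tickers (illustration only)
prices = pd.DataFrame(
    {"AAA": [100.0, 110.0, 99.0], "BBB": [50.0, 50.0, 55.0]},
    index=pd.date_range("2023-01-02", periods=3, freq="D"),
)

# Daily simple returns: p_t / p_{t-1} - 1.
# The first row is NaN for every column, so dropna() removes it.
toy_returns = prices.pct_change().dropna()
print(toy_returns)
```

Because `dropna` works row-wise, a date is discarded if any single ticker lacks a price on that date. That keeps all columns aligned, at the cost of shortening the sample when one ticker has gaps.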
Create a figure with two subplots to plot stock prices and correlation heatmap
fig, (ax1, ax2) = plt.subplots(ncols=2)
fig.tight_layout()
Calculate the correlation matrix of the returns
corr = returns.corr()
Plot the historical stock prices on the first subplot
left = data.plot(ax=ax1)
Plot the correlation heatmap on the second subplot
right = sns.heatmap(
    corr,
    ax=ax2,
    vmin=-1,
    vmax=1,
    xticklabels=corr.columns,
    yticklabels=corr.columns,
)
Calculate and print the average pairwise correlation among the stocks
average_corr = np.mean(
    corr.values[np.triu_indices_from(corr.values, k=1)]
)
print(f"Average pairwise correlation: {average_corr}")
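The indexing in the average-correlation step can be checked on a small example. The sketch below uses a hypothetical 3x3 correlation matrix (values made up for illustration) to show that `k=1` selects only the entries strictly above the diagonal, so each pair is counted once and the ones on the diagonal are excluded.

```python
import numpy as np

# Hypothetical 3x3 correlation matrix (values made up for illustration)
toy_corr = np.array([
    [1.0, 0.6, 0.2],
    [0.6, 1.0, 0.4],
    [0.2, 0.4, 1.0],
])

# k=1 skips the diagonal, so only the pairs (0,1), (0,2), (1,2) are selected
upper = toy_corr[np.triu_indices_from(toy_corr, k=1)]
toy_average = upper.mean()

# Equivalent check: subtract the N ones on the diagonal, then average
# over the N * (N - 1) remaining off-diagonal entries
n = toy_corr.shape[0]
alt_average = (toy_corr.sum() - n) / (n * (n - 1))

print(upper, toy_average, alt_average)
```

Both routes give the same number because the matrix is symmetric. Averaging the full matrix instead would bias the result upward by including the diagonal.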
Define the ETF tickers (SPY for the broad market, XLF for financials, XLE for energy) and the stocks in each sector
market_symbols = ["XLF", "SPY", "XLE"]
sector_1_stocks = ["WFC", "JPM", "USB"]
sector_2_stocks = ["XOM", "VLO", "SLB"]
Combine all tickers into one list
tickers = market_symbols + sector_1_stocks + sector_2_stocks
Download historical price data for the combined list of tickers
price = yf.download(tickers, start="2015-01-01", end="2023-12-31").Close
Calculate daily percentage returns and drop any missing values
returns = price.pct_change().dropna()
Separate market returns and sector returns from the combined returns data
market_returns = returns["SPY"]
sector_1_returns = returns["XLF"]
sector_2_returns = returns["XLE"]
Initialize DataFrames to store residuals after regression against market returns
stock_returns = returns.drop(market_symbols, axis=1)
residuals_market = stock_returns.copy() * 0.0
residuals = stock_returns.copy() * 0.0
def ols_residual(y, x):
    """Calculate OLS residuals between two series

    Parameters
    ----------
    y : pd.Series
        Dependent variable series
    x : pd.Series
        Independent variable series

    Returns
    -------
    residuals : pd.Series
        Residuals from OLS regression
    """
    results = sm.OLS(y, x).fit()
    return results.resid
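One detail worth making explicit: `sm.OLS(y, x)` does not add an intercept on its own, so `ols_residual` fits a regression through the origin. The sketch below reproduces that calculation with plain NumPy on synthetic data (the slope of 1.5 and the noise levels are made-up assumptions), which shows what the residual is: `y` minus its projection onto `x`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (made up): y depends on x with slope 1.5 plus noise
x = rng.normal(0.0, 0.01, size=1000)
y = 1.5 * x + rng.normal(0.0, 0.005, size=1000)

# Single-regressor OLS through the origin: beta = (x . y) / (x . x)
beta = np.dot(x, y) / np.dot(x, x)
resid = y - beta * x

# By construction the residual is orthogonal to the regressor
print(beta, np.dot(resid, x))
```

Without an intercept, the residuals are orthogonal to `x` but need not average to zero. If an intercept is wanted, statsmodels provides `sm.add_constant` to append a constant column to the regressors before fitting. For daily returns the intercept is typically small, so the no-intercept form is a reasonable simplification here.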
Calculate residuals of sector returns after removing market returns influence
sector_1_excess = ols_residual(sector_1_returns, market_returns)
sector_2_excess = ols_residual(sector_2_returns, market_returns)
Calculate residuals for each stock in sector 1 after removing market and sector influence
for stock in sector_1_stocks:
    residuals_market[stock] = ols_residual(returns[stock], market_returns)
    residuals[stock] = ols_residual(residuals_market[stock], sector_1_excess)
Calculate residuals for each stock in sector 2 after removing market and sector influence
for stock in sector_2_stocks:
    residuals_market[stock] = ols_residual(returns[stock], market_returns)
    residuals[stock] = ols_residual(residuals_market[stock], sector_2_excess)
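The two-stage regression above can be illustrated without downloading anything. The following is a sketch on a synthetic factor model, not the real tickers: all loadings and volatilities are made-up assumptions, and the `residual` helper is a hypothetical NumPy stand-in for `ols_residual`. It shows the correlation between two same-sector stocks shrinking as the market component, and then the sector component, is removed.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 2000


def residual(y, x):
    """Residual of a no-intercept, single-regressor OLS of y on x."""
    return y - (np.dot(x, y) / np.dot(x, x)) * x


# Synthetic factor model (all numbers made up for illustration)
market = rng.normal(0.0, 0.010, n)
sector_factor = rng.normal(0.0, 0.008, n)  # sector move independent of the market
sector_etf = market + sector_factor        # stand-in for a sector ETF return
stock_a = 1.1 * market + 0.9 * sector_factor + rng.normal(0.0, 0.010, n)
stock_b = 0.9 * market + 1.2 * sector_factor + rng.normal(0.0, 0.010, n)

# Stage 1: strip the market from the sector ETF and from each stock
sector_excess = residual(sector_etf, market)
a_market = residual(stock_a, market)
b_market = residual(stock_b, market)

# Stage 2: strip the sector-specific component from the stage-1 residuals
a_resid = residual(a_market, sector_excess)
b_resid = residual(b_market, sector_excess)

raw_corr = np.corrcoef(stock_a, stock_b)[0, 1]
market_only_corr = np.corrcoef(a_market, b_market)[0, 1]
resid_corr = np.corrcoef(a_resid, b_resid)[0, 1]
print(raw_corr, market_only_corr, resid_corr)
```

In this setup the correlation should fall at each stage and end up close to zero, because the only thing the two synthetic stocks share is the market and sector factors. Real stocks have additional common drivers, so their residual correlations will generally not vanish so cleanly.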
Create a two-panel figure and calculate the correlation matrix of the residuals
fig, (ax1, ax2) = plt.subplots(ncols=2)
fig.tight_layout()
corr = residuals.corr()
Plot cumulative returns of residuals
left = (1 + residuals).cumprod().plot(ax=ax1)
Plot correlation heatmap of residuals
right = sns.heatmap(
    corr,
    ax=ax2,
    fmt="d",
    vmin=-1,
    vmax=1,
    xticklabels=residuals.columns,
    yticklabels=residuals.columns,
)
Calculate and print the average pairwise correlation among the residuals
average_corr = np.mean(corr.values[np.triu_indices_from(corr.values, k=1)])
print(f"Average pairwise correlation: {average_corr}")
def buckle_BR_const(N, rho):
    """Calculate effective breadth based on correlation

    Parameters
    ----------
    N : int
        Number of assets
    rho : np.ndarray
        Array of correlation values

    Returns
    -------
    effective_breadth : np.ndarray
        Effective number of independent bets
    """
    return N / (1 + rho * (N - 1))
Generate a range of correlation values and plot the effective breadth
corr = np.linspace(start=0, stop=1.0, num=500)
plt.plot(corr, buckle_BR_const(6, corr))
plt.ylabel('Effective Breadth (Number of Bets)')
plt.xlabel('Forecast Correlation')
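A few points on that curve are easy to verify by hand. The sketch below restates the same formula, N / (1 + rho * (N - 1)), under the hypothetical name `effective_breadth` so that it runs on its own, and evaluates it for six assets at several correlation levels.

```python
def effective_breadth(n_assets, rho):
    """Effective number of independent bets for a common pairwise correlation rho."""
    return n_assets / (1 + rho * (n_assets - 1))


# Six assets, matching the plot above
for rho in (0.0, 0.1, 0.5, 1.0):
    print(f"rho={rho:.1f} -> breadth={effective_breadth(6, rho):.2f}")
# rho=0.0 -> breadth=6.00
# rho=0.1 -> breadth=4.00
# rho=0.5 -> breadth=1.71
# rho=1.0 -> breadth=1.00
```

Uncorrelated forecasts give the full six bets, while perfectly correlated forecasts collapse to a single bet. The formula also implies a ceiling: as the number of assets grows, breadth approaches 1 / rho, so at a correlation of 0.1 adding more assets can never yield more than ten effective bets.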
PyQuant News is where finance practitioners level up with Python for quant finance, algorithmic trading, and market data analysis. For educational purposes. Not investment advice. Use at your own risk.
