Portfolio Optimization Basics with Python

Portfolio optimization is very popular in quantitative research and analysis. When you have a budget and several investment instruments to choose from, you might read analysts’ opinions or form your own (based on news, tweets, your network, etc.). Hedge funds mostly use advanced mathematical models to optimize and allocate their investments, and they prefer not to share their methods. But in the age of open source you can easily find similar models, basic or complicated (if you can read papers), and use them to invest your own money.

Here I will do some basic coding in Python to collect and clean data from different sources, then apply a simple optimization and allocation model to the collected data.

I will use pandas, json, dateutil and datetime to manipulate and clean the data; urllib, requests and evds to collect it; plotly for visualization; and finally PyPortfolioOpt for modeling. The code is given as plain text blocks, but you can easily copy, paste and try it yourself; just install the required libraries with pip install. Sources and a Jupyter notebook are also provided.

The main module here is PyPortfolioOpt, which is built on cvxpy, a Python-embedded modeling language for convex optimization problems. It makes it easy to run optimizations with the most popular methods. You can also check out Empyrial, which is easy to use and has great visualizations, but it is not easy to use with custom data.

Step 1: Importing modules, collecting and preparing data

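import json
import urllib.request  # the urllib.request submodule has to be imported explicitly
from datetime import date, timedelta
import dateutil.parser
import pandas as pd
import requests
from dateutil.relativedelta import relativedelta
from evds import evdsAPI
# Replace 'api ticket' with your own EVDS API key
evds = evdsAPI('api ticket')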

I will use three kinds of investment instruments for the optimization. For reproducibility, I will share the code for collecting the data.
The data covers the last year of prices.

usdtry currency data:

periods = 365  # last 365 days
TODAY = date.today()
# Dates are passed to EVDS as dd-mm-YYYY strings
end = TODAY.strftime("%d-%m-%Y")
start = (TODAY + relativedelta(days=-periods)).strftime("%d-%m-%Y")
df_currency = evds.get_data(['TP.DK.USD.S.YTL'], startdate=start, enddate=end)
df_currency.columns = ['date', 'usdtry']
# Normalize dates to YYYY-MM-DD strings so they match the other data sets
df_currency['date'] = pd.to_datetime(df_currency['date'], format="%d-%m-%Y").dt.strftime('%Y-%m-%d')
df_currency['usdtry'] = pd.to_numeric(df_currency['usdtry'])
df_currency = df_currency.sort_values(by='date')

Bitcoin cryptocurrency data:

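coindeskURL = 'https://api.coindesk.com/v1/bpi/historical/close.json?'
endd = date.today()
startt = endd - timedelta(days=periods)
url = f'{coindeskURL}start={startt:%Y-%m-%d}&end={endd:%Y-%m-%d}'
result = requests.get(url)
# Assumption: the daily closing prices sit under the 'bpi' key of the JSON response
data = result.json()['bpi']
# One row of prices keyed by date, transposed into one row per date
df_bitcoin = pd.DataFrame(data, index=[0]).T.reset_index()
df_bitcoin.columns = ['date', 'bitcoin']
df_bitcoin['date'] = pd.to_datetime(df_bitcoin.date).dt.strftime('%Y-%m-%d')
# sort_values returns a new frame, so assign the result back
df_bitcoin = df_bitcoin.sort_values(by='date')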
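
To see what the reshaping step does without calling the API, here is a minimal offline sketch. The payload below is made up for illustration (both the 'bpi' layout and the prices are assumptions, not real data); it only shows how a date-to-price mapping becomes a two-column frame sorted by date.

```python
import pandas as pd

# Hypothetical payload shaped like {"bpi": {date: close}}; the values are invented
payload = {"bpi": {"2021-01-02": 32000.5, "2021-01-01": 29000.0}}

# A dict of scalars needs an explicit index; transposing gives one row per date
df_sample = pd.DataFrame(payload["bpi"], index=[0]).T.reset_index()
df_sample.columns = ["date", "bitcoin"]
df_sample = df_sample.sort_values(by="date").reset_index(drop=True)
print(df_sample)
```

An arguably simpler alternative would be pd.Series(payload["bpi"]), but the version above stays close to the code used in this post.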

Garanti bank stock data:

For this data I used this helpful script 🙏. You can try another stock from the Turkish stock market.

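starttt = str(startt).replace("-", "")
enddd = str(endd).replace("-", "")
name_stock = 'GARAN'
# temp_val is the request URL produced by the script mentioned above,
# presumably from name_stock, starttt and enddd; its definition is not included here
with urllib.request.urlopen(temp_val) as url:
    # Assumption: the response body is JSON containing a 'dataSet' list of records
    data_stock = json.loads(url.read().decode())
df_stock = pd.DataFrame(data_stock['dataSet'])[['date', 'close']]
# Timestamps arrive in milliseconds; convert them to YYYY-MM-DD strings
df_stock['date'] = pd.to_datetime(df_stock.date, unit='ms').dt.strftime('%Y-%m-%d')
df_stock.columns = ['date', name_stock.lower()]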

I will merge all the data. Since the currency and stock data only include business days, I will use the ffill method for the missing values; ffill fills each null value with the previous non-null value.

df = df_bitcoin.merge(df_stock, on='date', how='left').merge(df_currency, on='date', how='left')
df['date'] = pd.to_datetime(df['date'])
df = df.set_index('date').ffill(axis=0)
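
As a small illustration of the left merge plus ffill, here is a toy example with invented numbers: a series quoted every day and a series quoted only on some days. The frame names and values are hypothetical.

```python
import pandas as pd

# Toy data: 'bitcoin' is quoted every day, 'garan' only on two of the four days
daily = pd.DataFrame({
    "date": ["2021-01-01", "2021-01-02", "2021-01-03", "2021-01-04"],
    "bitcoin": [1.0, 2.0, 3.0, 4.0],
})
business = pd.DataFrame({
    "date": ["2021-01-01", "2021-01-04"],
    "garan": [10.0, 11.0],
})

# The left merge keeps every daily row; the missing garan quotes become NaN
merged = daily.merge(business, on="date", how="left")
merged["date"] = pd.to_datetime(merged["date"])

# ffill carries the last known quote forward over the gaps
merged = merged.set_index("date").ffill()
print(merged)
```

Note that ffill can only fill a gap when an earlier value exists, so if the first row falls on a non-business day it stays null.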

Let’s look at the trends (the right y-axis represents bitcoin, the left y-axis represents garan and usdtry):

import plotly.graph_objects as go
from plotly.subplots import make_subplots
# Create figure with secondary y-axis
fig = make_subplots(specs=[[{"secondary_y": True}]])
# garan and usdtry share the left (primary) y-axis
fig.add_trace(
    go.Scatter(x=df.index, y=df.garan, name="garanti"),
    secondary_y=False,
)
fig.add_trace(
    go.Scatter(x=df.index, y=df.usdtry, name="usd-try"),
    secondary_y=False,
)
# bitcoin goes on the right (secondary) y-axis because of its much larger scale
fig.add_trace(
    go.Scatter(x=df.index, y=df.bitcoin, name="bitcoin"),
    secondary_y=True,
)
fig.update_layout(
    title_text="Data Plot"
)
# Set x-axis title
fig.update_xaxes(title_text="date")
fig.show()

Step 2: Optimization

I will use the Efficient Frontier method. The efficient frontier is the set of portfolios that offer the highest expected return for a given level of risk, or equivalently the lowest risk for a given expected return.
Its inputs are the expected returns, mu, and the sample covariance matrix, S, both estimated from the price history.
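
To make mu and S less of a black box, here is a rough pandas-only sketch of how such estimates can be computed from prices. It assumes compounded (geometric) returns and 252 periods per year, which is my understanding of PyPortfolioOpt's defaults; the function name annualized_inputs and the toy prices are hypothetical.

```python
import pandas as pd

def annualized_inputs(prices: pd.DataFrame, frequency: int = 252):
    """Estimate annualized expected returns and covariance from a price table."""
    returns = prices.pct_change().dropna(how="all")
    # Compound the period returns, then scale the exponent to one year
    mu = (1 + returns).prod() ** (frequency / returns.count()) - 1
    # Sample covariance of period returns, scaled to one year
    S = returns.cov() * frequency
    return mu, S

# Toy prices: 'a' gains 1% per period, 'b' dips and then recovers
toy_prices = pd.DataFrame({"a": [100.0, 101.0, 102.01], "b": [50.0, 49.0, 50.0]})
mu_toy, S_toy = annualized_inputs(toy_prices)
print(mu_toy)
print(S_toy)
```

The frequency argument matters: with forward-filled calendar-day data like ours there are about 365 rows per year rather than 252, so the annualized figures depend on that choice.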

from pypfopt import EfficientFrontier, risk_models, expected_returns
# Calculate expected returns and sample covariance
mu = expected_returns.mean_historical_return(df)
S = risk_models.sample_cov(df)
# Optimize for maximal Sharpe ratio
ef = EfficientFrontier(mu, S)
raw_weights = ef.max_sharpe()
cleaned_weights = ef.clean_weights()
# ef.save_weights_to_file("weights.csv")  # to save file
print(pd.Series(raw_weights))
ef.portfolio_performance(verbose=True)

output:

bitcoin    0.339164
garan      0.182989
usdtry     0.477847
dtype: float64
Expected annual return: 67.9%
Annual volatility: 23.7%
Sharpe Ratio: 2.79
(0.6790690047824147, 0.23663700707863036, 2.7851476525961028)

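That is a 67.9% expected annual return with 23.7% volatility. The first three rows are the portfolio weights. The Sharpe ratio measures the return of an investment relative to its risk, and Sharpe ratios above 1.00 are generally considered “good” according to Investopedia. The optimization maximizes this metric. The tuple on the last line is simply the return value of portfolio_performance(): the same three figures (expected return, volatility, Sharpe ratio) without rounding.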
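
As a quick sanity check, the Sharpe ratio can be recomputed by hand from the two other figures. The sketch below assumes a 2% risk-free rate; that value is an assumption on my part, chosen because it reproduces the reported number.

```python
# Figures copied from the output above
expected_return = 0.6790690047824147
volatility = 0.23663700707863036

# Assumption: a 2% annual risk-free rate
risk_free_rate = 0.02

# Sharpe ratio = excess return over the risk-free rate per unit of volatility
sharpe = (expected_return - risk_free_rate) / volatility
print(round(sharpe, 2))  # 2.79
```

With a different risk-free rate the ratio changes, so it is worth checking which value your version of the library uses.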

PyPortfolioOpt also provides a method that calculates what to buy, and how many units of each, for a given budget. I will assume a budget of one million 💵.

from pypfopt.discrete_allocation import DiscreteAllocation, get_latest_prices
# Use the most recent price of each asset
latest_prices = get_latest_prices(df)
da = DiscreteAllocation(cleaned_weights, latest_prices, total_portfolio_value=1000000)
allocation, leftover = da.greedy_portfolio()
print("Discrete allocation:", allocation)
print("Leftover: ${:.2f}".format(leftover))

output:

Discrete allocation: {'usdtry': 57071, 'bitcoin': 6, 'garan': 19141}
Leftover: $45620.96
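
To get a feel for what a discrete allocation means, here is a deliberately simplified sketch: it buys the whole number of units that each weight can afford and reports the unspent cash. This is not the library's greedy algorithm, only an illustration; naive_allocation and the toy numbers are hypothetical.

```python
import math

def naive_allocation(weights, prices, budget):
    """Buy floor(weight * budget / price) units of each asset."""
    allocation = {}
    spent = 0.0
    for asset, weight in weights.items():
        units = math.floor(weight * budget / prices[asset])
        if units > 0:
            allocation[asset] = units
            spent += units * prices[asset]
    # Whatever could not be converted into whole units is left over
    return allocation, budget - spent

toy_weights = {"x": 0.5, "y": 0.5}
toy_prices = {"x": 30.0, "y": 400.0}
toy_allocation, toy_leftover = naive_allocation(toy_weights, toy_prices, 1000.0)
print(toy_allocation, toy_leftover)
```

Expensive assets are the main source of leftover cash here, since a single unit can exceed the amount the weight assigns to it.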

I also tried to calculate returns for the above allocation with a Monte Carlo simulation, but that can be another story; you can still see it in the Jupyter notebook.

Thanks for reading. Since this is my first attempt in such a massive area, please let me know if you see something wrong here. -ytd.

Umar Igan, Data Analyst/Scientist, Engineering of Physics @ tapu.com.