Bokeh dashboards using Quandl datasets¶
Though I am not from financial industry, I can understand the need to track how precious commodities and stocks are performing over years. So I started looking into commodity/stock datasets available for creating useful charts/dashboard. One of my colleague told me about Quandl which has datasets realted to finance and economy sectors. So I set out to create a bokeh application using quandl data and this article outlines the steps performed to achieve it. In order to demonstrate how the app is built, I am going to choose a couple of stocks and commodities. You can extend it other quandl datasets.
About Quandl¶
You need to sign up for Quandl account which allows you to download free data and samples of premium data. Once you have signed up, you will be provided with API key and you can find the API key under account settings. To know more about API limits for a free account click here
You can use Quandl API in R, Python and through Excel plugin. I went for Python as Quandl has a package that you can install using pip. I won't go into the details as it is well documented by quandl's team.
Data Processing¶
main.py¶
Data can be accessed by calling the quandl api using the following code:
quandl.ApiConfig.api_key = "GIVE YOUR API KEY HERE"
Once you have set the api key you can quandl datatables using .get method
data1 = quandl.get("LBMA/SILVER", collapse="monthly")
data2 = quandl.get("LBMA/GOLD", collapse="monthly")
Importing tab_commodity() function from commodity_select.py
from scripts.commodity_select import tab_commodity
We will be doing a bit a data processing to make sure data is consistent across commodities. Some data processing steps performed are
- Convert the output from quandl api get call into pandas dataframe
df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)
- Fill NA values with 0
df1 = df1.fillna(0)
df2 = df2.fillna(0)
- Resetting the index on the dataframe
df1 = df1.reset_index()
df2 = df2.reset_index()
- Converting the Date column to datetime object
df1['Date'] = pd.to_datetime(df1['Date'])
df1['Date'] = pd.DatetimeIndex(df1['Date'])
df2['Date'] = pd.to_datetime(df2['Date'])
df2['Date'] = pd.DatetimeIndex(df2['Date'])
- Selecting a subset of dataframe
df1 = df1[['Date','USD']]
df2 = df2[['Date','USD (PM)']]
- Renaming the column in df2 and assigning new column to each dataframe
df2.rename(columns={'USD (PM)': 'USD'}, inplace=True)
df1['Commodity']='SILVER'
df2['Commodity']='GOLD'
- Combining the dataframes
frames = [df1,df2]
df_commodity = pd.concat(frames)
Also lets create a list of commodities that will be passed to the function in commodity_select.py
tab2 = tab_commodity(commodities,df_commodity)
Tab1 - Precious Commodities¶
commodity_select.py¶
Here is the commodity_select.py which is used to created Commodity Trend tab.
First step is to import the libraries required for passing data to the plot and dropdown select
import pandas as pd
from bokeh.layouts import column
from bokeh.models import ColumnDataSource, Select, Panel, HoverTool
from bokeh.plotting import figure
Now lets define the function tab_commodity that will be called from main.py
def tab_commodity(commodities, df_commodity):
p = figure(plot_width=960, plot_height=540, title="Commodity Trend",x_axis_type="datetime",sizing_mode="scale_width")
p.background_fill_color="#f5f5f5"
p.background_fill_alpha = 0.5
p.grid.grid_line_color="white"
p.xaxis.axis_label = 'Date'
p.yaxis.axis_label = 'Value in USD'
p.axis.axis_line_color = None
hover = HoverTool()
select_commodity = Select(title='Commodity:',value=commodities[0],options=commodities)
x_val = df_commodity.loc[df_commodity['Commodity'] == select_commodity.value].Date.tolist()
y_val = df_commodity.loc[df_commodity['Commodity'] == select_commodity.value].USD.tolist()
source = ColumnDataSource(data=(dict(x=x_val,y=y_val)))
p.line('x','y',source=source,color='#ebbd5b',alpha=0.5,line_width=2)
hover.tooltips = [("Date", "@x{%F}"),("Value", "$@{y}")]
hover.formatters={
'x' : 'datetime'
}
hover.mode = 'vline'
p.tools.append(hover)
def callback(attr,old,new):
new_commodity = select_commodity.value
x_new = df_commodity.loc[df_commodity['Commodity'] == new_commodity].Date.tolist()
y_new = df_commodity.loc[df_commodity['Commodity'] == new_commodity].USD.tolist()
source_new = ColumnDataSource(data=(dict(x=x_new,y=y_new)))
source.data = source_new.data
select_commodity.on_change('value',callback)
layout = column(select_commodity,p)
tab2 = Panel(child = layout, title = 'Commodity Performance')
return tab2
Tab2 - Stocks¶
main.py¶
Quandl has data on stocks which can be used to create dashboards that are useful for financial institutions. To get the data from quandl, we are going to use get_table method. Below is the code block for doing it
from scripts.Stocks_Select import stock_tab
tickerlist = ['AAPL','IBM','MSFT','GOOG']
df = quandl.get_table('WIKI/PRICES', qopts = { 'columns': ['ticker','date', 'close'] },
ticker = tickerlist, date = { 'gte': '2016-01-01', 'lte': '2016-12-31' })
df = df[['ticker','date','close']]
df['date'] = pd.DatetimeIndex(df['date'])
Calling the stock_tab function from Stocks_Select.py
tab1 = stock_tab(tickerlist, df)
Stocks_Select.py¶
import pandas as pd
from bokeh.layouts import column
from bokeh.models import ColumnDataSource, Select, Panel, HoverTool
from bokeh.plotting import figure
def stock_tab(tickerlist, df):
p = figure(plot_width=960, plot_height=540, x_axis_type="datetime",sizing_mode="scale_width")
p.background_fill_color="#f5f5f5"
p.background_fill_alpha = 0.5
p.grid.grid_line_color="white"
p.xaxis.axis_label = 'Date'
p.yaxis.axis_label = 'Value in USD'
p.axis.axis_line_color = None
hover = HoverTool()
hover.tooltips = [("Date", "@x{%F}"),("Value", "$@{y}")]
hover.formatters={'x':'datetime'}
hover.mode = 'vline'
p.tools.append(hover)
def callback(attr,old,new):
ticker_new = ticker.value
x = df.loc[df['ticker'] == ticker_new].date.tolist()
y = df.loc[df['ticker'] == ticker_new].close.tolist()
source_new = ColumnDataSource(data=dict(x=x,y=y))
source.data = source_new.data
ticker.value = ticker_new
ticker = Select(title="Ticker:",value=tickerlist[0],options=tickerlist)
x = df.loc[df['ticker'] == ticker.value].date.tolist()
y = df.loc[df['ticker'] == ticker.value].close.tolist()
ticker.on_change('value',callback)
source = ColumnDataSource(data=dict(x=x, y=y))
p.line('x','y', line_width=2, color='navy', alpha=0.5,source=source)
layout = column(ticker, p)
tab = Panel(child = layout, title = 'Stock Performance')
return tab
from IPython.display import HTML
HTML('<iframe width="560" height="315" src="https://www.youtube.com/embed/jj_GTBJcNr0" frameborder="0" allow="accelerometer; encrypted-media; gyroscope" allowfullscreen></iframe>')