[OpenDATA] Price Volume usage example

Some examples of utilizing the free data provided by Finter.

Finter offers various datasets for free. In this article, we demonstrate how to fetch this data and build a simple strategy example.

First, if you don't have finter installed, please install it.

! pip install finter
! pip install mplfinance
! pip install statsmodels

How to call the Price Volume dataset

Currently, stock market data from the Korean Public Data Portal can be fetched via the Finter API. In line with the data policy of the Public Data Portal, you can retrieve trading prices (close, high, and low), trading volume, the number of listed shares, and market capitalization. The opening price is not available.

from finter.data import ContentFactory

cf = ContentFactory("kr_stock", 20000101, 20200201)
df = cf.get_df("price_close")

# Now available!
# price_close, price_high, price_low
# volume_sum, listed_shares, mkt_cap

The columns of the fetched DataFrame are 6-digit codes, which are the numerical codes of stocks traded on the KRX (Korea Exchange). The index represents the trading days.
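As a rough illustration of that layout, the sketch below builds a small frame with the same shape from made-up numbers (not real Finter data). It assumes the index is a pandas DatetimeIndex and the column labels are 6-digit strings; the dates and values are placeholders chosen only for the example.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for cf.get_df("price_close"); all values are made up.
# Assumption: index is a DatetimeIndex of trading days, columns are 6-digit code strings.
dates = pd.bdate_range("2020-01-02", periods=5)
codes = ["005930", "000660", "035420"]
rng = np.random.default_rng(0)
close = pd.DataFrame(
    rng.integers(10_000, 100_000, size=(len(dates), len(codes))).astype(float),
    index=dates,
    columns=codes,
)

# One column is the price history of a single stock (a Series indexed by date).
samsung_close = close["005930"]

# One row is the cross-section of all stocks on a single trading day.
first_day = close.iloc[0]

# Label-based slicing on the date index selects a date range (both ends inclusive).
window = close.loc["2020-01-03":"2020-01-07"]
print(window)
```

With a wide layout like this, column selection gives a per-stock time series and row selection gives a per-day cross-section, which is why the later examples index the fetched frames by stock code.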

Simple Data Visualization

When handling data, visualization skills are essential. In this example, we will draw a simple candlestick chart.

Fetching the data takes only a few lines. By looping over the item_list attribute of ContentFactory and storing each result in a Python dictionary, you can collect all the available datasets.

# Fetch every available item into a dict keyed by item name
from finter.data import ContentFactory

data = dict()
cf = ContentFactory("kr_stock", 20240301, 20240520)
for i in cf.item_list:
    data[i] = cf.get_df(i)

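To draw the chart, OHLCV (Open, High, Low, Close, Volume) data is needed. Because the opening price is not provided, the example below approximates it with the previous trading day's close. Here is a simple example of preprocessing the data and plotting it with mplfinance.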

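import matplotlib.pyplot as plt
import mplfinance as mpf
import pandas as pd


def get_stock(c):
    # Collect close, high, low, and volume for stock code c into one frame
    items = ["price_close", "price_high", "price_low", "volume_sum"]
    stock = pd.concat([data[a][c] for a in items], axis=1).dropna(how="all", axis=0)
    stock.columns = ["Close", "High", "Low", "Volume"]
    # The opening price is not provided, so approximate it with the previous close
    stock["Open"] = stock["Close"].shift(1)
    return stock.dropna()


# returnfig=True returns (fig, axes) instead of showing the plot immediately
fig, axes = mpf.plot(
    get_stock("005930"),
    type="candle",
    title="Samsung Electronics",
    style="yahoo",
    volume=True,
    figratio=(12, 6),
    returnfig=True,
)

plt.tight_layout()
plt.show()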

Ta-da! With this simple code snippet, you can draw a candlestick chart for your desired stock.
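Because get_stock approximates the opening price with the previous close, some rows can end up with an Open outside that day's Low to High range (for example after an overnight gap), which renders as an odd-looking candle. The sketch below, on made-up numbers rather than Finter data, counts such rows and shows one possible way to handle them: widening High and Low so they contain the approximated Open. The helper names open_outside_range and widen_range are hypothetical, and widening is only one choice among several (dropping or flagging the rows would also work).

```python
import pandas as pd


def open_outside_range(stock: pd.DataFrame) -> pd.Series:
    """Boolean mask of rows whose Open lies outside [Low, High]."""
    return (stock["Open"] > stock["High"]) | (stock["Open"] < stock["Low"])


def widen_range(stock: pd.DataFrame) -> pd.DataFrame:
    """Return a copy where High and Low are widened to contain Open."""
    out = stock.copy()
    out["High"] = out[["High", "Open"]].max(axis=1)
    out["Low"] = out[["Low", "Open"]].min(axis=1)
    return out


# Made-up rows in the same column layout that get_stock returns.
demo = pd.DataFrame(
    {
        "Close": [100.0, 110.0, 105.0, 106.0],
        "High": [102.0, 111.0, 108.0, 107.0],
        "Low": [98.0, 107.0, 104.0, 103.0],
        "Volume": [1000, 1500, 1200, 900],
    },
    index=pd.to_datetime(["2024-03-04", "2024-03-05", "2024-03-06", "2024-03-07"]),
)
demo["Open"] = demo["Close"].shift(1)  # same approximation as get_stock
demo = demo.dropna()

mask = open_outside_range(demo)   # rows where the approximated Open is inconsistent
fixed = widen_range(demo)         # High/Low adjusted so every candle is well-formed
print(f"{int(mask.sum())} of {len(demo)} rows have Open outside [Low, High]")
```

Running the same check on the output of get_stock before plotting gives a quick sense of how much the approximation distorts the chart for a given stock.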

Example of Stock Price Prediction

Below is example code that fits an ARIMA model to stock prices and plots its predictions.

import pandas as pd

from statsmodels.tsa.arima.model import ARIMA
from statsmodels.graphics.tsaplots import plot_predict

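def show_trained(inp):
    # Fit an ARIMA model with order (p=2, d=1, q=2) and plot its in-sample predictions
    model = ARIMA(inp, order=(2, 1, 2))
    model_trained = model.fit()
    plot_predict(model_trained)
    # Plot the residuals of the fit in a separate figure
    pd.DataFrame(model_trained.resid).plot(secondary_y=True)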

samsung = get_stock("005930").Close.values
show_trained(samsung)

Looking at the results, the model appears to predict the stock prices well. However, it is generally not recommended to use prices directly for quantitative modeling.

This becomes clearer when the same test is run on returns instead of prices. The predictions on prices looked accurate, but once the input is changed to returns, the model predicts far less well. This suggests that the apparent accuracy on prices is misleading, and that fitting a model directly on prices can lead to overfitting to the price data.

# pct_change leaves NaN in the first row, so drop it before fitting
samsung = get_stock("005930").Close.pct_change().dropna().values
show_trained(samsung)
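One way to see why a fit on prices can look better than it is: when a series behaves like a random walk, yesterday's price alone already tracks today's price closely, while yesterday's return says almost nothing about today's return. The sketch below illustrates this on a simulated random walk. It uses synthetic data rather than Finter data, and a naive lag-1 correlation rather than the ARIMA model, so it is an illustration of the idea and not a reproduction of the results above.

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulate daily returns with no predictable structure, then build a price path.
returns = rng.normal(loc=0.0, scale=0.01, size=500)
prices = 100.0 * np.cumprod(1.0 + returns)


def lag1_corr(x: np.ndarray) -> float:
    """Correlation between a series and itself shifted by one step."""
    return float(np.corrcoef(x[1:], x[:-1])[0, 1])


price_corr = lag1_corr(prices)    # high: price levels are very persistent
return_corr = lag1_corr(returns)  # near zero: returns are unpredictable by construction

print(f"lag-1 correlation of prices:  {price_corr:.3f}")
print(f"lag-1 correlation of returns: {return_corr:.3f}")
```

A high score on price levels can therefore come from persistence alone, which is why evaluating a model on returns is the more informative test.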

Besides the ARIMA model, feel free to add any models and variables you desire to create your preferred model!

Starting with Price Volume, more data will be made available in the future. Please refer to the page below for the list of available datasets.

https://pypi.org/project/finter/
