Brad Lucas

A blog mostly about programming
June 5, 2017

Yahoo Finance Quote Download Clojure

Overview

Today's post is the final version of the Yahoo Finance Quote Download application, this time written in Clojure. If you are interested, there are three previous versions of this application in Bash, Python and Java.

For the impatient, you can find the source for this post in the following Git repo.

https://github.com/bradlucas/get-yahoo-quotes-clojure

Investigation

The first post in this series details the new way the Yahoo Finance pages work with their cookie and a crumb value embedded in the page data. Here I'll just state the steps and how they can be easily accomplished in Clojure.

First, just as in the Java version, we need a library to make the HTTP requests against the Yahoo site. A popular choice is clj-http, which is a Clojure wrapper around the Apache HttpClient library. Since the Java version already used HttpClient, I decided to use clj-http here.

So the steps we need to accomplish are as follows:

  • Read the finance page for a specific symbol
  • Grep the page for CrumbStore and save the associated crumb value
  • Keep the cookie which comes with the interaction with the page
  • Build a url with the crumb and data parameters to request a data download
  • Call the download url and pass along the cookie we received earlier

clj-http

To use clj-http in our project, first add the following to the :dependencies vector in the project.clj file.

[clj-http "3.6.1"]

Then add this require to the ns declaration:

 (:require [clj-http.client :as client])
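The functions later in this post also call into clojure.string, clojure.java.io and StringEscapeUtils, so a fuller ns declaration might look like the sketch below. The namespace name is hypothetical, and the StringEscapeUtils package is an assumption on my part (Apache Commons Lang 3, which would also have to be listed in :dependencies); check the repo for the exact form.

```clojure
;; A sketch of a complete ns declaration, not necessarily the one in the repo.
(ns get-yahoo-quotes.core                      ; hypothetical namespace name
  (:require [clj-http.client :as client]       ; HTTP requests
            [clojure.java.io :as io]           ; io/copy, io/file, io/input-stream
            [clojure.string :as str])          ; str/split, str/replace, str/includes?
  ;; Assumed package: Commons Lang 3 must be on the classpath for this import.
  (:import [org.apache.commons.lang3 StringEscapeUtils]))
```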

Steps

Read the finance page for a specific symbol

The get function of clj-http accepts a url and returns a map with the results. This map has a number of keys, of which the :body key contains the data from the page. To make the page's data easier to search, I've split it into chunks on the "}" character and treat each chunk as a line.

(defn get-page-data [symbol]
  (let [url (format "https://finance.yahoo.com/quote/%s/?p=%s" symbol symbol)
        page (client/get url)]
    (str/split (:body page) #"}")))
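To see what the split produces without hitting the network, here is a small sketch that runs the same split over a stand-in response map. The map and its body are made up for illustration; a real response has many more keys and a far larger body.

```clojure
(require '[clojure.string :as str])

;; Hypothetical, heavily shortened stand-in for what client/get returns.
(def sample-response
  {:status 200
   :body "{\"context\":{\"a\":1},\"CrumbStore\":{\"crumb\":\"abc\"},\"x\":2}"})

;; Same split as get-page-data: every "}" ends a chunk.
(def chunks (str/split (:body sample-response) #"}"))

;; Print each chunk on its own line; the second one holds the CrumbStore entry.
(doseq [chunk chunks]
  (println chunk))
```

Because every "}" closes a chunk, the crumb entry ends up alone in one short string, which is what makes the filter in the next step so simple.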

Grep the page for CrumbStore and save the associated crumb value

With the list of lines of data we can simply filter for the first one with "CrumbStore" in it. This returns a value that looks like ,"CrumbStore": {"crumb":"FWP\u002F5EFll3U". The third section is the crumb value. To get it you can split on the ":" character, take the last piece and then remove the quotes. In practice this value sometimes contains a Unicode escape such as \u002F (a forward slash). To convert that I've called out to the same StringEscapeUtils class I used in the Java application.

In the following code, look at the get-crumb function and notice how it uses the other functions to build its result. Building up small functions like this is one of the great things about Clojure.

(defn split-crumbstore [v]
  ;; Example input: ,"CrumbStore": {"crumb":"FWP\u002F5EFll3U"
  ;; get the last field delimited by :
  ;; strip the quotes
  ;; unescape any unicode-escaped values
  (-> (str/split v #":")
      last
      (str/replace "\"" "")
      (StringEscapeUtils/unescapeJava)))

(defn find-crumbstore [lines]
  (first (filter #(str/includes? % "CrumbStore") lines)))

(defn get-crumb [symbol]
  (let [crumb (split-crumbstore (find-crumbstore (get-page-data symbol)))]
    ;; (println crumb)
    crumb))
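If you want to try the crumb extraction at the REPL without the page or the StringEscapeUtils dependency, the following sketch does the same split-and-strip on the sample line from above. Note that it swaps in a small regex-based unescape of my own in place of StringEscapeUtils; unescape-unicode and extract-crumb are names I made up for this example, and the helper only handles \uXXXX escapes.

```clojure
(require '[clojure.string :as str])

(defn unescape-unicode
  "Replace each \\uXXXX escape in s with the character it encodes.
  A stand-in for StringEscapeUtils/unescapeJava, limited to this one case."
  [s]
  (str/replace s
               #"\\u([0-9a-fA-F]{4})"
               (fn [[_ hex]] (str (char (Integer/parseInt hex 16))))))

(defn extract-crumb
  "Take the last :-delimited field of the CrumbStore line, drop the quotes
  and unescape it."
  [line]
  (-> (str/split line #":")
      last
      (str/replace "\"" "")
      unescape-unicode))

;; The sample line from the text, written as a Clojure string literal.
(def sample-line ",\"CrumbStore\": {\"crumb\":\"FWP\\u002F5EFll3U\"")

(println (extract-crumb sample-line))
;; prints FWP/5EFll3U
```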

Keep the cookie which comes with the interaction with the page

The repo for clj-http has plenty of documentation, and the section on cookies is the one to point out. It shows how our two url calls need to be wrapped in a binding to a cookie store, so that the download request carries the cookie dropped by our first request.

    (binding [clj-http.core/*cookie-store* (clj-http.cookies/cookie-store)]
       ;; Do stuff here...
       )
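Why both calls have to sit inside the same binding is easier to see with a toy model. The sketch below does not use clj-http at all: *jar* is a hypothetical stand-in for the cookie store var, and the two "request" functions are fakes that only write to and read from it.

```clojure
;; Hypothetical stand-in for clj-http.core/*cookie-store*; unbound by default.
(def ^:dynamic *jar* nil)

(defn fake-first-request
  "Pretend to load the quote page: the 'server' drops a cookie into the jar."
  []
  (when *jar*
    (swap! *jar* assoc "B" "cookie-value")))

(defn fake-download-request
  "Pretend to download the data: return the cookies that would be sent."
  []
  (if *jar* @*jar* {}))

(defn run-both
  "Run both fake requests inside one binding so they share the same jar."
  []
  (binding [*jar* (atom {})]
    (fake-first-request)
    (fake-download-request)))

(println (run-both))              ; the second call sees the first call's cookie
(println (fake-download-request)) ; outside the binding there is no jar at all
```

Inside the binding the second call sees the cookie stored by the first. Outside of it the var is back to nil, which is the situation we would be in if the page request and the download request were made under separate bindings.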

Build a url with the crumb and data parameters to request a data download

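Downloading the data file is similar to the original get-page-data routine, except that in this case we want to write the data out to a file. Notice in the following how the return value from client/get is passed to :body to get the data, which is then copied to a file. Because get-crumb is called inside the binding, the page request and the download request share the same cookie store.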

(defn download-data [symbol crumb]
  ;; we should use a cookiestore so our requests pass the received cookie from the first page along with subsequent requests
  ;; @see https://github.com/dakrone/clj-http#cookies
  (binding [clj-http.core/*cookie-store* (clj-http.cookies/cookie-store)]

    ;; build the download url
    (let [filename (format "%s.csv" symbol)
          start-date 0
          ;; period1 and period2 are Unix timestamps in seconds, not milliseconds
          end-date (quot (System/currentTimeMillis) 1000)
          url (format "https://query1.finance.yahoo.com/v7/finance/download/%s?period1=%s&period2=%s&interval=1d&events=history&crumb=%s" symbol start-date end-date (get-crumb symbol))]
      (println "--------------------------------------------------")
      (println (format "Downloading %s to %s" symbol filename))

      ;; @see https://stackoverflow.com/a/32745253
      (-> 
       (client/get url {:as :byte-array})
       (:body)
       (io/input-stream)
       (io/copy (io/file filename)))
      (println "--------------------------------------------------"))))
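One way to make the url building easier to check is to pull it out into a pure function. This is only a sketch: download-url and epoch-seconds are names I've made up here, and it assumes period1 and period2 are Unix timestamps in seconds.

```clojure
(defn epoch-seconds
  "Current time as a Unix timestamp in seconds."
  []
  (quot (System/currentTimeMillis) 1000))

(defn download-url
  "Build the history download url for ticker between start and end
  (both in epoch seconds), using the crumb scraped from the quote page."
  [ticker start end crumb]
  (format (str "https://query1.finance.yahoo.com/v7/finance/download/%s"
               "?period1=%d&period2=%d&interval=1d&events=history&crumb=%s")
          ticker start end crumb))

;; Example with fixed values so the output does not depend on the clock.
(println (download-url "AAPL" 0 1496620800 "abc"))
```

With the url in its own function, download-data only has to deal with the cookie binding and the file copy, and the url itself can be checked at the REPL without touching the network.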

The Final Application

My final version of the above is up on GitHub in the following repo:

https://github.com/bradlucas/get-yahoo-quotes-clojure


Tags: clojure yahoo quotes trading