Yahoo Finance Quote Download Java

June 4, 2017

Overview

Today's post includes an overview of a Java implementation created to support the new Yahoo Finance page changes. Previous posts cover implementations in Bash and Python. Please review them if you are curious, or if the following doesn't explain enough about the issue created when Yahoo changed its finance page back in May.

For the impatient, you can find the program in the following git repo.

https://github.com/bradlucas/get-yahoo-quotes-java

Investigating

The issue detailed previously is that to download quote data from the new Yahoo pages you need to do two things. First, navigate to the symbol's data page, where your browser or client is given a cookie that is needed for the later download. Second, parse the crumb value associated with CrumbStore out of that page's data.

We'll do this in Java and use the Apache HttpClient library for our requests. One immediate bonus: by sharing the HttpClient instance, we don't have to grab the B cookie manually and resend it, because the client stores it and sends it along with the subsequent download request.

So, for our first page request we only need to focus on retrieving the crumb value.

Data in the page

We want to call the initial http://finance.yahoo.com/quote page to get the crumb. Working from the top down, the first function is getCrumb. It finds the crumb by searching the pieces of the page returned from getPage.

getPage

Simply call the appropriate page on Yahoo and return the page's content as a string.

    public String getPage(String symbol) {
        String rtn = null;
        String url = String.format("https://finance.yahoo.com/quote/%s/?p=%s", symbol, symbol);
        HttpGet request = new HttpGet(url);
        System.out.println(url);

        request.addHeader("User-Agent", "Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.13) Gecko/20101206 Ubuntu/10.10 (maverick) Firefox/3.6.13");
        try {
            HttpResponse response = client.execute(request, context);
            System.out.println("Response Code : " + response.getStatusLine().getStatusCode());

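            // Read the body as text, assuming the page is UTF-8 (needs java.nio.charset.StandardCharsets).
            // Line breaks are dropped, which is fine because splitPageData splits on } rather than on lines.
            StringBuilder result = new StringBuilder();
            try (BufferedReader rd = new BufferedReader(new InputStreamReader(response.getEntity().getContent(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = rd.readLine()) != null) {
                    result.append(line);
                }
            }
            rtn = result.toString();
            HttpClientUtils.closeQuietly(response);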
        } catch (Exception ex) {
            System.out.println("Exception");
            System.out.println(ex);
        }
        System.out.println("returning from getPage");
        return rtn;
    }

splitPageData

The page data is returned as a single string. This makes it difficult to search, so we'll split the page on } characters and return a list of strings.

    public List<String> splitPageData(String page) {
        // Return the page as a list of string using } to split the page
        return Arrays.asList(page.split("}"));
    }
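To see why splitting on } helps, here is a small self-contained sketch run against a made-up fragment. The JSON below is illustrative only, not a real Yahoo response, and the SplitDemo class name is hypothetical.

```java
import java.util.Arrays;
import java.util.List;

public class SplitDemo {
    // Same approach as splitPageData above: break the page into chunks at each '}'
    static List<String> splitPageData(String page) {
        return Arrays.asList(page.split("}"));
    }

    public static void main(String[] args) {
        // Hypothetical fragment shaped like the data embedded in the page
        String page = "{\"user\":{\"id\":1},\"CrumbStore\":{\"crumb\":\"OKSUqghoLs8\"},\"other\":{}}";
        for (String chunk : splitPageData(page)) {
            System.out.println(chunk);
        }
    }
}
```

Each chunk ends just before a closing brace, so the chunk containing CrumbStore ends right after the crumb's closing quote. That is what makes the later colon split workable.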

findCrumb

With our list of strings we'll look for CrumbStore, extract the crumb value and then unescape the data.

    public String findCrumb(List<String> lines) {
        String crumb = "";
        String rtn = "";
        for (String l : lines) {
            if (l.indexOf("CrumbStore") > -1) {
                rtn = l;
                break;
            }
        }
        // Example chunk: ,"CrumbStore":{"crumb":"OKSUqghoLs8"
        if (rtn != null && !rtn.isEmpty()) {
            String[] vals = rtn.split(":");                     // third item holds the crumb
            if (vals.length > 2) {                              // guard against an unexpected chunk shape
                crumb = vals[2].replace("\"", "");              // strip quotes
                crumb = StringEscapeUtils.unescapeJava(crumb);  // unescape escaped values (particularly \u002f)
            }
        }
        return crumb;
    }
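The colon split relies on the crumb being the third field of its chunk. As an alternative sketch (not the approach used in the repo), a regular expression can pull the value straight out of the page string, and a small helper can decode the escaped characters without Apache Commons Lang. The class and method names here are hypothetical, and the pattern assumes the CrumbStore text looks like the example in the comment above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CrumbRegexDemo {
    // Matches "CrumbStore":{"crumb":"<value>" and captures <value>
    private static final Pattern CRUMB =
        Pattern.compile("\"CrumbStore\":\\{\"crumb\":\"([^\"]*)\"");

    // Matches a literal backslash, the letter u, and four hex digits
    private static final Pattern UNICODE_ESCAPE =
        Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    // Replace each escaped code unit with the character it stands for
    static String unescapeUnicode(String s) {
        Matcher m = UNICODE_ESCAPE.matcher(s);
        StringBuffer out = new StringBuffer(); // appendReplacement accepts StringBuffer on all Java versions
        while (m.find()) {
            char decoded = (char) Integer.parseInt(m.group(1), 16);
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(decoded)));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Returns the unescaped crumb, or an empty string when none is found
    static String findCrumb(String page) {
        Matcher m = CRUMB.matcher(page);
        return m.find() ? unescapeUnicode(m.group(1)) : "";
    }

    public static void main(String[] args) {
        String page = "...,\"CrumbStore\":{\"crumb\":\"abc\\u002Fdef\"},...";
        System.out.println(findCrumb(page)); // prints abc/def
    }
}
```

This version works on the whole page string, so it doesn't need the split step, at the cost of being tied to the exact CrumbStore layout.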

getCrumb

Calling the above functions in sequence gives us getCrumb.

    public String getCrumb(String symbol) {
        return findCrumb(splitPageData(getPage(symbol)));
    }

Downloading Data

With our crumb value retrieved we can now download data. This requires a call to http://query1.finance.yahoo.com with start and end date values in epoch time. Nicely, we can get our end date for today with System.currentTimeMillis().

    public void downloadData(String symbol, long startDate, long endDate, String crumb) {
        String filename = String.format("%s.csv", symbol);
        String url = String.format("https://query1.finance.yahoo.com/v7/finance/download/%s?period1=%s&period2=%s&interval=1d&events=history&crumb=%s", symbol, startDate, endDate, crumb);
        HttpGet request = new HttpGet(url);
        System.out.println(url);

        request.addHeader("User-Agent", "Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.13) Gecko/20101206 Ubuntu/10.10 (maverick) Firefox/3.6.13");
        try {
            HttpResponse response = client.execute(request, context);
            System.out.println("Response Code : " + response.getStatusLine().getStatusCode());
            HttpEntity entity = response.getEntity();

            String reasonPhrase = response.getStatusLine().getReasonPhrase();
            int statusCode = response.getStatusLine().getStatusCode();
            
            System.out.println(String.format("statusCode: %d", statusCode));
            System.out.println(String.format("reasonPhrase: %s", reasonPhrase));

            if (entity != null) {
                // try-with-resources closes both streams even if the copy fails partway
                try (BufferedInputStream bis = new BufferedInputStream(entity.getContent());
                     BufferedOutputStream bos = new BufferedOutputStream(new FileOutputStream(new File(filename)))) {
                    int inByte;
                    while ((inByte = bis.read()) != -1) {
                        bos.write(inByte);
                    }
                }
            }
            HttpClientUtils.closeQuietly(response);

        } catch (Exception ex) {
            System.out.println("Exception");
            System.out.println(ex);
        }
    }
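Because findCrumb unescapes values such as \u002f, the crumb can end up containing characters like / that are safer to URL-encode before they go into a query string. Here is a minimal sketch of building the download URL that way. The buildDownloadUrl helper and the DownloadUrlDemo class are hypothetical, and I'm assuming the server decodes the crumb parameter in the usual way.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class DownloadUrlDemo {
    // Build the download URL with the crumb percent-encoded for use in a query string
    static String buildDownloadUrl(String symbol, long startDate, long endDate, String crumb) {
        try {
            String encodedCrumb = URLEncoder.encode(crumb, "UTF-8");
            return String.format(
                "https://query1.finance.yahoo.com/v7/finance/download/%s?period1=%d&period2=%d&interval=1d&events=history&crumb=%s",
                symbol, startDate, endDate, encodedCrumb);
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is required to exist on every Java platform
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        // Made-up crumb containing a slash
        System.out.println(buildDownloadUrl("GOOG", 0L, 1496534400L, "abc/def"));
    }
}
```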

Lastly, here is our setup and call to downloadData.

    String symbol = "goog";
    GetYahooQuotes c = new GetYahooQuotes();
    c.downloadData(symbol, 0, System.currentTimeMillis(), c.getCrumb(symbol));
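One detail worth checking: System.currentTimeMillis() returns milliseconds, while the period1 and period2 values in Yahoo's own download links appear to be epoch seconds. The sketch below assumes seconds are what the endpoint expects and shows one way to build both values with java.time. The PeriodDemo class and toEpochSeconds helper are hypothetical names.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;

public class PeriodDemo {
    // Epoch seconds at 00:00 UTC on the given date
    static long toEpochSeconds(LocalDate date) {
        return date.atStartOfDay(ZoneOffset.UTC).toEpochSecond();
    }

    public static void main(String[] args) {
        long start = toEpochSeconds(LocalDate.of(2017, 1, 1)); // 1483228800
        long end = Instant.now().getEpochSecond();             // "now" in seconds, not milliseconds
        System.out.println(start + " " + end);
    }
}
```

With these, the call would pass start and end in place of 0 and System.currentTimeMillis().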

The Final Application

My final version of the above is up on GitHub in the following repo:

https://github.com/bradlucas/get-yahoo-quotes-java


Tags: yahoo java quotes trading