Brad Lucas

Programming, Clojure and other interests

Unzip File If Necessary

May 17, 2018

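I ran into a situation where an external vendor started sending CSV files gzipped. The trouble was that they gzipped only the larger files. Since the system that ingests the files breaks on a gzip file, I needed a routine that decompresses the gzipped files and leaves the others alone. The routine below checks the first two bytes of the file for the gzip magic number and, if it finds it, overwrites the file with its decompressed contents.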

    import gzip

    def unzip_if_necessary(self, filename):
        print("unzip_if_necessary: " + filename)

        # Read the magic number (the first 2 bytes) in binary mode.
        with open(filename, "rb") as f:
            magic_number = f.read(2)

        if magic_number == b"\x1f\x8b":
            print("gzip file")
            # Decompress fully into memory before overwriting the file in place.
            with gzip.open(filename, "rb") as f:
                data = f.read()
            print("Writing to " + filename)
            # Write bytes back in binary mode so the contents are not altered.
            with open(filename, "wb") as f:
                f.write(data)
        else:
            print("csv file")

Gist
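
For very large files, reading the whole decompressed payload into memory may not be desirable. The following is a minimal sketch of a streaming variant, not the routine from the Gist: the helper names `is_gzipped` and `gunzip_in_place` are made up for illustration, and it assumes Python 3 and that a temporary file can be created next to the original.

```python
import gzip
import os
import shutil
import tempfile

GZIP_MAGIC = b"\x1f\x8b"


def is_gzipped(path):
    """Return True if the file starts with the gzip magic number."""
    with open(path, "rb") as f:
        return f.read(2) == GZIP_MAGIC


def gunzip_in_place(path):
    """Decompress path in place if it is gzipped; return True if it was."""
    if not is_gzipped(path):
        return False
    # Create the temp file in the same directory so the final rename
    # stays on one filesystem.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as dst, gzip.open(path, "rb") as src:
            shutil.copyfileobj(src, dst)  # copies in chunks, not all at once
        shutil.copymode(path, tmp_path)   # keep the original permission bits
        os.replace(tmp_path, path)        # swap in the decompressed file
    except BaseException:
        os.remove(tmp_path)
        raise
    return True
```

Writing to a temporary file first means the original is only replaced once decompression has finished, so a corrupt archive should leave the original file untouched.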


Download All Files From Url

January 12, 2018

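Use wget to download a web page and all associated files. The -r flag follows links recursively, -np keeps wget from ascending to the parent directory, and -k converts links in the downloaded pages so they work locally.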

$ wget -r -np -k http://domain.com/url

If you don't have wget and you are on a Mac, install it with Homebrew.

$ brew install wget

Advent Of Code

December 5, 2017

After taking a month off from writing blog posts, I've found something interesting to point out. The other day I came across this "Advent Calendar" idea for programmers. Each day you solve a puzzle or two to unlock the day. With twenty-five days to complete, it looks like an interesting activity for the month of December.

If you are curious, a bunch of Clojurians are listing their solution repos here:


Gpg Signing Failed Inappropriate Ioctl For Device

November 5, 2017

Received:

gpg: signing failed: Inappropriate ioctl for device

To fix:

$ GPG_TTY=$(tty)
$ export GPG_TTY

Clojure Ads.txt Crawler Reporting Site

October 19, 2017

Continuing work on the Ads.txt crawler led to the idea of storing the crawler results in a database and making them available from a web site. This post introduces a first pass at such a site, with the source code available in the following repository:

Ads.txt Review

As a quick review, the Ads.txt standard lets publishers host a simple text file listing the ad networks authorized to sell the publisher's inventory. There is a reference Python crawler for such files, and I've built a crawler in Clojure as an alternative. See this link for a series of posts about the Ads.txt specification and the development of the crawler. The crawler project is here.
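
To make the file format concrete, here is a rough Python sketch of parsing a single ads.txt line. It is not taken from the Clojure crawler or the reference Python crawler. It assumes the comma-separated layout as I understand it from the specification (ad system domain, publisher account ID, DIRECT or RESELLER, and an optional certification authority ID, with `#` starting a comment). The names `AdsTxtRecord` and `parse_ads_txt_line` are hypothetical.

```python
from typing import NamedTuple, Optional


class AdsTxtRecord(NamedTuple):
    domain: str                       # ad system domain
    account_id: str                   # publisher's account ID at that ad system
    relationship: str                 # "DIRECT" or "RESELLER"
    cert_authority_id: Optional[str]  # optional fourth field


def parse_ads_txt_line(line):
    """Parse one line; return None for blanks, comments, and non-records."""
    # Drop anything after a '#' and surrounding whitespace.
    text = line.split("#", 1)[0].strip()
    if not text:
        return None
    fields = [field.strip() for field in text.split(",")]
    if len(fields) < 3:
        return None  # e.g. variable lines such as contact=...
    relationship = fields[2].upper()
    if relationship not in ("DIRECT", "RESELLER"):
        return None
    cert = fields[3] if len(fields) > 3 and fields[3] else None
    return AdsTxtRecord(fields[0].lower(), fields[1], relationship, cert)


# Made-up sample line, not real publisher data.
record = parse_ads_txt_line("example-ads.com, 12345, DIRECT")
```

A real crawler would need more care than this, for example around extension fields and malformed lines, which this sketch simply skips.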
