SQLite is a small self contained SQL database engine that reads and writes to a normal file. It is a good first pass for adding database storage to your application. It's easy to use and with a single file to deal with offers no real issues. Being a SQL database though means you won't be far off if you decide to up your database requirements to Postgres or some other database system.
My recent project of building a Clojure Ads.txt file crawler is a good target for SQLite support for a couple reasons. One is that the project is simple and works just fine writing to result files but having a database would be good in the long term especially as the project grows. Second is that the original Python Ads.txt file crawler defaulted to writing to a SQLite database. I figure the Clojure project should at the very least be able to do the same.
The results of adding SQLite are in the repository if you are curious.
Since we'll be storing the output from the crawler we'll need to create a table that will support this data. We can start with the sql over in the https://github.com/InteractiveAdvertisingBureau/adstxtcrawler project and tweak it a bit.
BEGIN TRANSACTION;
DROP TABLE IF EXISTS adstxt;
CREATE TABLE adstxt(
SITE_DOMAIN TEXT NOT NULL,
EXCHANGE_DOMAIN TEXT NOT NULL,
SELLER_ACCOUNT_ID TEXT NOT NULL,
ACCOUNT_TYPE TEXT NOT NULL,
TAG_ID TEXT NOT NULL,
ENTRY_COMMENT TEXT NOT NULL,
UPDATED DATE DEFAULT (datetime('now','localtime')),
PRIMARY KEY (SITE_DOMAIN,EXCHANGE_DOMAIN,SELLER_ACCOUNT_ID,ACCOUNT_TYPE,TAG_ID)
);
END TRANSACTION;
To use this to create a database:
$ sqlite3 database.db < ./sql/create.sql
Note that there is a bit of a mismatch between the names I have used in the code. This will change in a subsequent version as I move away from the https://github.com/InteractiveAdvertisingBureau/adstxtcrawler version as a model.
There are two libraries you'll need to add to the project. The first is the jdbc library and the second is the SQLite driver library. The links for each of these are as follows.
Note though that the sqlite-jdbc library doesn't give you the hint on how to use it with Leiningen. This when first encountered is confusing. The secret to remember is to go look in Maven. In this case there is a button called "maven cntral 3.20.0" which you can click. It goes to http://search.maven.org/#artifactdetails%7Corg.xerial%7Csqlite-jdbc%7C3.20.0%7C and you can click on the word Leiningen under Dependency Information. In other cases you'll need to search the MVNRepository yourself. Here is a link https://mvnrepository.com/. Remember the project name and the latest version when searching.
Add the following to the dependencies map of the project.clj to add the libraries
[org.clojure/java.jdbc "0.7.2"]
[org.xerial/sqlite-jdbc "3.20.0"]
See the src/adstxtcrawler/crawl.clj file for the crawl-domain
function. If you pass it a dbname
you end up in the src/adstxtcrawler/database.clj file's save routine otherwise you'll use the previous print
routine.
The src/adstxtcrawler/database.clj file contains the routines for writing to the database. The key points to show include the connection specification and the actual inssert.
To accept a parameter to build a connection spec there is a routine called db-spec
.
(defn db-spec [dbname]
{:classname "org.sqlite.JDBC"
:subprotocol "sqlite"
:subname dbname
})
The actual insert is done in save. It first unpacks the data by destructuring the data map. Then the jdbc/insert!
does the insert. Note that duplicates will throw and exception which is ignored. In practice Ads.txt files can contain duplicates which for out purposes we can ignore.
(defn save [domain data dbname]
(let [db-spec (db-spec dbname)]
(doseq [m data]
(let [{:keys [exchange-domain account-id account-type tag-id comment]} m]
;; some ads.txt files have duplicates
;; for these we can ignore the exception causes by trying to insert a duplicate
(try
(jdbc/insert! db-spec :adstxt {:site_domain domain
:exchange_domain exchange-domain
:seller_account_id account-id
:account_type account-type
:tag_id tag-id
:entry_comment comment})
(catch org.sqlite.SQLiteException e
;; Uncomment if you want to see the duplicates
;; (.println *err* (format "Duplicate: %s,%s,%s,%s,%s" domain exchange-domain account-id account-type tag-id))
))))))
Hopefully, this has been enough to show how easy it is to add SQLite support to a Clojure application. See the [https://github.com/bradlucas/ads-txt-crawler for all of the project's source code. Also, if you are interested in previous posts on this project and the Ads.txt specification feel free to visit list posts list http://blog.bradlucas.com/tags/ads.txt//