Brad Lucas

Programming, Clojure and other interests
July 3, 2017

Tokenwatch (Part 1)

Continuing with my research into tokens and blockchain assets, I found another site called TokenMarket.net. It lists ICOs that are currently running as well as upcoming campaigns.

As in my previous post (http://blog.bradlucas.com/posts/2017-07-01-coin-market-cap/), I decided to scrape the site so I could group the entries by status more easily. The result is a Python script called tokenwatch.py.

The following are the highlights of how the script works.

Table Data

When you inspect the page at https://tokenmarket.net/blockchain/all-assets you'll find the data is in a table with the class table-assets. The following function fetches the page with requests and parses it with BeautifulSoup.

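import requests
from bs4 import BeautifulSoup

def get_table():
    url = 'https://tokenmarket.net/blockchain/all-assets'
    # Send a browser-like User-agent header with the request.
    html = requests.get(url, headers={'User-agent': 'Mozilla/5.0'}).text
    soup = BeautifulSoup(html, "lxml")  # the "lxml" parser needs the lxml package
    # select_one returns the first element matching the CSS selector, or None.
    table = soup.select_one("table.table-assets")
    return table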
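
To see what select_one hands back without hitting the live site, here is a small sketch that parses an inline HTML fragment and walks its rows. The sample markup and the sample_rows helper are made up for illustration (the real page has more columns), and the sketch uses Python's built-in html.parser rather than lxml so that BeautifulSoup is the only extra package needed.

```python
from bs4 import BeautifulSoup

# Hypothetical, heavily simplified stand-in for the real page markup.
SAMPLE_HTML = """
<table class="table-assets">
  <thead><tr><th>Status</th><th>Name</th></tr></thead>
  <tbody>
    <tr><td>Running</td><td><a href="/assets/foo">Foo</a></td></tr>
    <tr><td>Upcoming</td><td><a href="/assets/bar">Bar</a></td></tr>
  </tbody>
</table>
"""


def sample_rows(html):
    """Return the stripped cell text of every data row in the table."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.select_one("table.table-assets")
    rows = []
    for tr in table.find_all("tr"):
        tds = tr.find_all("td")
        if not tds:  # the header row only has th cells, so skip it
            continue
        rows.append([td.text.strip() for td in tds])
    return rows


print(sample_rows(SAMPLE_HTML))
```

The check for an empty tds list matters because the header row is also a tr, and indexing into its cells would fail.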

Each row in the table has a set of td cells. I'm interested in the details page link, the status, project name, symbol and description. The following function pulls these out of a row's cells.

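def get_data(tds):
    # tds is the list of td cells from one table row.
    link = tds[3].find("a")['href']
    # Replace non-breaking spaces in the status with regular spaces.
    status = tds[2].text.strip().replace(u'\xa0', ' ')
    # Keep only the first line of the name cell.
    name = tds[3].text.strip().split("\n")[0]
    symbol = tds[4].text.strip()
    # Drop non-ASCII characters; decode so the result is a str on Python 3 too.
    description = tds[5].text.encode('ascii', 'ignore').decode('ascii').strip().replace('\n', '')
    return [symbol, name, status, description, link]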
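
The two string clean-ups used for the status and description can be tried on their own, without any HTML. This is a minimal sketch: the helper names clean_status and clean_description and the sample strings are hypothetical, and the decode('ascii') step is my assumption about what is needed for the result to be a str rather than bytes on Python 3.

```python
def clean_status(text):
    # Trim outer whitespace, then turn non-breaking spaces into plain spaces.
    return text.strip().replace(u'\xa0', ' ')


def clean_description(text):
    # Drop every non-ASCII character, then decode back to text so the
    # later str operations also work on Python 3.
    ascii_only = text.encode('ascii', 'ignore').decode('ascii')
    # Trim outer whitespace and remove embedded newlines.
    return ascii_only.strip().replace('\n', '')


print(repr(clean_status(u'  ICO\xa0running \n')))
print(repr(clean_description(u'\n A \u201csmart\u201d token \nplatform \n')))
```

Note that the newline is removed rather than replaced with a space, so words separated only by a line break would run together.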

The data is most useful once it is in a pandas DataFrame.

import pandas as pd

def build_dataframe(records):
    return pd.DataFrame.from_records(records, columns=['SYMBOL', 'NAME', 'STATUS', 'DESCRIPTION', 'LINK'])

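Lastly, I wanted the data grouped by status. Sorting on the STATUS column places entries with the same status next to each other.

# sort_values returns a new, sorted DataFrame; df itself is left unchanged.
df.sort_values(['STATUS'], ascending=False)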
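
As a rough illustration of the last two steps, the sketch below builds a DataFrame from a few made-up records, sorts it by status, and then uses groupby as an alternative way to split the entries per status. The records and the variable names by_status and symbols_by_status are invented for the example; the post's script itself only sorts.

```python
import pandas as pd

# Made-up records in the same [symbol, name, status, description, link] order.
records = [
    ['AAA', 'Alpha', 'Running', 'First sample', '/assets/alpha'],
    ['BBB', 'Beta', 'Upcoming', 'Second sample', '/assets/beta'],
    ['CCC', 'Gamma', 'Running', 'Third sample', '/assets/gamma'],
]
columns = ['SYMBOL', 'NAME', 'STATUS', 'DESCRIPTION', 'LINK']
df = pd.DataFrame.from_records(records, columns=columns)

# sort_values returns a new DataFrame, so the result has to be kept.
by_status = df.sort_values(['STATUS'], ascending=False)

# groupby yields one (status, sub-frame) pair per distinct status value.
symbols_by_status = {
    status: group['SYMBOL'].tolist()
    for status, group in df.groupby('STATUS')
}

print(by_status[['STATUS', 'SYMBOL']])
print(symbols_by_status)
```

Sorting is enough when you only want to read the table top to bottom. groupby is handier when each status needs to be processed separately.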

See the complete project in the GitHub repo listed below.

Tags: ethereum python