Continuing with my research into tokens and blockchain assets, I found another site called TokenMarket.net. The site lists ICOs that are currently running as well as upcoming campaigns.
As in my previous post (http://blog.bradlucas.com/posts/2017-07-01-coin-market-cap/), I decided to scrape the site so I could group the entries by status more easily. The result is a Python script called tokenwatch.py.
The following are some highlights to get the script working.
When you inspect the page at https://tokenmarket.net/blockchain/all-assets you'll find the data is in a table identified by a class of table-assets. To fetch and parse it with requests and BeautifulSoup you can use the following function.
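import requests
from bs4 import BeautifulSoup

def get_table():
    url = 'https://tokenmarket.net/blockchain/all-assets'
    # Send a browser-like User-agent header with the request
    html = requests.get(url, headers={'User-agent': 'Mozilla/5.0'}).text
    soup = BeautifulSoup(html, "lxml")  # the "lxml" parser requires the lxml package
    # The asset list is the table with class "table-assets"
    table = soup.select_one("table.table-assets")
    return table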
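If the site changes its markup, select_one returns None instead of a table, and the failure only surfaces later. The following is a minimal sketch of an explicit check, not part of tokenwatch.py. The helper name find_assets_table and the two tiny sample pages are made up for illustration, and it uses the built-in "html.parser" so it runs without lxml.

```python
from bs4 import BeautifulSoup


def find_assets_table(html):
    """Return the table.table-assets element, or raise if it is missing."""
    # "html.parser" ships with Python; the post's function uses "lxml" instead
    soup = BeautifulSoup(html, "html.parser")
    table = soup.select_one("table.table-assets")
    if table is None:
        raise ValueError("no table with class 'table-assets' found")
    return table


# Hypothetical minimal pages, used only to exercise both outcomes
page_with_table = '<table class="table-assets"><tr><td>x</td></tr></table>'
page_without_table = '<table class="other"><tr><td>x</td></tr></table>'

table = find_assets_table(page_with_table)

try:
    find_assets_table(page_without_table)
    missing_detected = False
except ValueError:
    missing_detected = True
```

Failing early like this makes a layout change on the site easier to diagnose than an AttributeError raised further down the script.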
Each row in the table has a set of td cells. I'm interested in the details page link, the status, project name, symbol and description. The following function extracts them from one row's cells.
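def get_data(tds):
    # The details page link sits in the same cell as the project name
    link = tds[3].find("a")['href'] # tds[1].find("a")['href']
    # Replace non-breaking spaces in the status with regular spaces
    status = tds[2].text.strip().replace(u'\xa0', ' ')
    name = tds[3].text.strip().split("\n")[0]
    symbol = tds[4].text.strip()
    # Drop non-ASCII characters; decode back to str so replace() works on Python 3
    description = tds[5].text.encode('ascii', 'ignore').decode('ascii').strip().replace('\n', '')
    return [symbol, name, status, description, link]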
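To call a function like get_data you first need the td cells of each row. Here is one way that step might look, as a sketch run against a small inline HTML sample rather than the live site. The sample markup, the example.com links and the helper name iter_row_cells are assumptions for illustration. Only the column positions (status in cell 2, name and link in cell 3, symbol in cell 4, description in cell 5) follow the function above.

```python
from bs4 import BeautifulSoup

# Hypothetical, heavily simplified stand-in for the real page markup
SAMPLE_HTML = """
<table class="table-assets">
  <thead>
    <tr><th></th><th></th><th>Status</th><th>Name</th><th>Symbol</th><th>Description</th></tr>
  </thead>
  <tbody>
    <tr><td></td><td></td><td>ICO&nbsp;running</td>
        <td><a href="https://example.com/a">Alpha</a></td><td>ALP</td><td>First sample</td></tr>
    <tr><td></td><td></td><td>Upcoming</td>
        <td><a href="https://example.com/b">Beta</a></td><td>BET</td><td>Second sample</td></tr>
  </tbody>
</table>
"""


def iter_row_cells(table):
    """Yield the list of td cells for each data row, skipping header rows."""
    for tr in table.find_all("tr"):
        tds = tr.find_all("td")
        if tds:  # header rows contain only th cells, so the list is empty
            yield tds


table = BeautifulSoup(SAMPLE_HTML, "html.parser").select_one("table.table-assets")
rows = list(iter_row_cells(table))

# Same clean-up as in get_data: turn non-breaking spaces into regular spaces
statuses = [tds[2].text.strip().replace(u'\xa0', ' ') for tds in rows]
links = [tds[3].find("a")['href'] for tds in rows]
```

Each list of cells yielded here can be passed to get_data to build one record per row.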
It is most useful to get the data into a Pandas DataFrame.
import pandas as pd

def build_dataframe(records):
    return pd.DataFrame.from_records(records, columns=['SYMBOL', 'NAME', 'STATUS', 'DESCRIPTION', 'LINK'])
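Lastly, I wanted the data grouped by status. Sorting on the STATUS column places rows with the same status next to each other.
# sort_values returns a new DataFrame, so assign the result
df = df.sort_values(['STATUS'], ascending=False)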
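Sorting keeps rows with the same status adjacent. If you want explicit groups instead, pandas groupby is one option. This is a minimal sketch with made-up records (the symbols, names and example.com links are placeholders, not data from TokenMarket), using the same column names as above.

```python
import pandas as pd

# Made-up records in the [symbol, name, status, description, link] order
records = [
    ['ALP', 'Alpha', 'ICO running', 'First sample', 'https://example.com/a'],
    ['BET', 'Beta', 'Upcoming', 'Second sample', 'https://example.com/b'],
    ['GAM', 'Gamma', 'ICO running', 'Third sample', 'https://example.com/c'],
]
df = pd.DataFrame.from_records(
    records, columns=['SYMBOL', 'NAME', 'STATUS', 'DESCRIPTION', 'LINK'])

# Number of entries per status
counts = df.groupby('STATUS').size()

# Symbols collected under each status
symbols_by_status = {
    status: group['SYMBOL'].tolist()
    for status, group in df.groupby('STATUS')
}
```

Here counts is a Series indexed by status, and symbols_by_status is a plain dictionary that is easy to print or loop over.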
See the complete project in the GitHub repo listed below.