There is a useful page called CryptoCurrency Market Capitalizations for viewing the current state of the cryptocurrency markets.
https://coinmarketcap.com/assets/views/all/
The site shows all currencies running today on a number of platforms. I'm interested in the ones running on Ethereum that have a Market Cap. Since the site doesn't have this specific filtering capability, I thought it would make a good project to grab the data from the page and filter it the way I'd like.
To do this I decided to investigate Pandas and its read_html function for pulling data in from HTML tables.
The following are notes for a Python script that I wrote to pull data from CryptoCurrency Market Capitalizations, massage the data and show it in useful formats.
Set up a virtualenv with the following libraries.
tabulate
pandas
beautifulsoup4
html5lib
lxml
numpy
When you inspect the HTML returned for the page, you need to find how the table of data is identified. On inspection you'll see that the table has an id of assets-all. The following shows how you can read this table with Pandas into a DataFrame.
import pandas as pd

url = 'https://coinmarketcap.com/assets/views/all/'
# Use Pandas to return the first table on the page with id 'assets-all'
#
df = pd.read_html(url, attrs={'id': 'assets-all'})[0]
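To see how the attrs lookup behaves without hitting the live site, here is a minimal sketch that runs read_html against an inline HTML snippet. The snippet and its rows are made up for illustration; only the id assets-all comes from the real page. It assumes lxml (or beautifulsoup4 plus html5lib) is installed, as in the library list above.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-in for the real page: two tables, only one has the target id.
HTML = """
<table id="other">
  <tr><th>A</th></tr>
  <tr><td>1</td></tr>
</table>
<table id="assets-all">
  <thead><tr><th>#</th><th>Name</th><th>Platform</th></tr></thead>
  <tbody>
    <tr><td>1</td><td>Golem</td><td>Ethereum</td></tr>
    <tr><td>2</td><td>Example Coin</td><td>Other</td></tr>
  </tbody>
</table>
"""

# read_html returns a list of DataFrames, one per <table> that matches attrs.
tables = pd.read_html(StringIO(HTML), attrs={'id': 'assets-all'})
sample = tables[0]

print(len(tables))
print(list(sample.columns))
```

Because read_html always returns a list, the [0] index is still needed even when only one table matches.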
The columns take their names from the table's column headers, which I think are a bit unwieldy to use because they have symbols and spaces in them. I changed them to shorter single-word names.
# Original column names
#
# [ 0,  1,      2,          3,            4,       5,                    6,              7,      8,       9     ]
# ['#', 'Name', 'Platform', 'Market Cap', 'Price', 'Circulating Supply', 'Volume (24h)', '% 1h', '% 24h', '% 7d']
# New column names
#
df.columns = ['#', 'Name', 'Platform', 'MarketCap', 'Price', 'Supply', 'VolumeDay', 'pctHour', 'pctDay', 'pctWeek']
Looking at the data you'll see that the number fields contain $, % and comma characters. These need to be removed so we can sort them numerically. Also, all the columns have an object type, and we'll need them to be some sort of numeric type for proper behavior.
# Clean the data with 'numbers' by removing $, % and , characters
# (regex=False: treat each pattern as a literal string, not a regular expression)
df['Price'] = df['Price'].str.replace('$', '', regex=False)
df['MarketCap'] = df['MarketCap'].str.replace('$', '', regex=False)
df['MarketCap'] = df['MarketCap'].str.replace(',', '', regex=False)
df['VolumeDay'] = df['VolumeDay'].str.replace('$', '', regex=False)
df['VolumeDay'] = df['VolumeDay'].str.replace(',', '', regex=False)
df['VolumeDay'] = df['VolumeDay'].str.replace('Low Vol', '0', regex=False)
df['pctHour'] = df['pctHour'].str.replace('%', '', regex=False)
df['pctDay'] = df['pctDay'].str.replace('%', '', regex=False)
df['pctWeek'] = df['pctWeek'].str.replace('%', '', regex=False)
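The repeated replace calls could also be folded into a small helper. The following is only a sketch: strip_chars is a hypothetical name, not part of the original script, and the sample values are made up to resemble the scraped strings.

```python
import pandas as pd


def strip_chars(series, chars):
    """Remove each literal character in `chars` from a string Series."""
    for ch in chars:
        # regex=False so '$' is a plain character, not a regex anchor.
        series = series.str.replace(ch, '', regex=False)
    return series


# Made-up sample values shaped like the scraped strings.
sample = pd.DataFrame({
    'MarketCap': ['$77,232,067', '$1,306', '?'],
    'pctDay': ['1.5%', '-0.25%', '?'],
})

sample['MarketCap'] = strip_chars(sample['MarketCap'], '$,')
sample['pctDay'] = strip_chars(sample['pctDay'], '%')

print(sample)
```

The '?' placeholder passes through untouched here; it only becomes a missing value when the column is converted to a numeric type.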
# Convert 'number' columns to numeric type so they will sort as we'd like
#
def coerce_df_columns_to_numeric(df, column_list):
    df[column_list] = df[column_list].apply(pd.to_numeric, errors='coerce')

coerce_df_columns_to_numeric(df, ['MarketCap', 'Price', 'Supply', 'VolumeDay', 'pctHour', 'pctDay', 'pctWeek'])
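As a quick illustration of what errors='coerce' does, the sketch below converts a tiny made-up frame. Strings that parse become numbers, and anything that does not parse (such as '?') becomes NaN instead of raising an error.

```python
import pandas as pd

# Made-up sample: one clean value and one '?' placeholder per the site's convention.
sample = pd.DataFrame({
    'MarketCap': ['77232067', '?'],
    'Price': ['2.3', '0.00307'],
})

cols = ['MarketCap', 'Price']
# Apply pd.to_numeric column by column; unparseable strings turn into NaN.
sample[cols] = sample[cols].apply(pd.to_numeric, errors='coerce')

print(sample.dtypes)
print(sample['MarketCap'].isna().tolist())
```

This matters for later filtering: once a column has been coerced, a missing value has to be detected with isna() or notna() rather than by comparing to the original placeholder string.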
To have a column that sorts the names nicely regardless of case, you can create an upper-case copy of the name.
# Build an upper case name column so we can sort on it more easily
#
df['NameUpper'] = df['Name'].str.upper()
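The reason for the extra column is that a plain string sort puts every upper-case letter before every lower-case one. A small sketch, using four names from the report and nothing else from the real data, shows the difference.

```python
import pandas as pd

sample = pd.DataFrame({'Name': ['vSlice', 'Aragon', 'iDice', 'WeTrust']})

# Plain sort: 'W' sorts before 'i' and 'v' because of character codes.
plain = sample.sort_values('Name')['Name'].tolist()
print(plain)     # ['Aragon', 'WeTrust', 'iDice', 'vSlice']

# Sorting on an upper-cased copy gives the case-insensitive order.
sample['NameUpper'] = sample['Name'].str.upper()
by_upper = sample.sort_values('NameUpper')['Name'].tolist()
print(by_upper)  # ['Aragon', 'iDice', 'vSlice', 'WeTrust']
```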
And lastly, we only want the Ethereum rows that have a MarketCap value.
# Filter so we only have rows which are Ethereum and which have a value for Market Cap.
# After the numeric conversion above, a missing Market Cap ('?') is NaN.
#
df = df.loc[(df['Platform'] == 'Ethereum') & (df['MarketCap'].notna())]
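Here is a minimal sketch of the filter on a made-up frame. It assumes MarketCap has already been coerced to a numeric type, so a missing value is NaN and is tested with notna(). The 'NoCap Token' and 'Other Coin' rows are hypothetical.

```python
import pandas as pd

# Made-up rows: one valid Ethereum asset, one Ethereum asset with no market cap,
# and one asset on a different platform.
sample = pd.DataFrame({
    'Name': ['Golem', 'NoCap Token', 'Other Coin'],
    'Platform': ['Ethereum', 'Ethereum', 'Other'],
    'MarketCap': ['384864116', '?', '1000'],
})
sample['MarketCap'] = pd.to_numeric(sample['MarketCap'], errors='coerce')

# Combine the two boolean masks with &; each comparison needs its own parentheses.
is_ethereum = sample['Platform'] == 'Ethereum'
has_cap = sample['MarketCap'].notna()
filtered = sample.loc[is_ethereum & has_cap]

print(filtered['Name'].tolist())  # ['Golem']
```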
The following is one report displayed using tabulate. The source code in the repo listed below shows a few other example reports. The following was generated at 2017-07-01 09:07.
Name                MarketCap     Price      Supply    VolumeDay
----------------  -----------  --------  ----------  -----------
Aragon               77232067       2.3    33605167       581475
Arcade Token          2605530       1.2     2164691            0
Augur               289609100     26.33    11000000      4210830
Basic Attenti...    140634000  0.140634  1000000000      1450090
BCAP                 17505300      1.75    10000000       123788
Bitpark Coin          5708738  0.076117    75000000            0
Chronobank           14923518     21.02      710113       541856
Cofound.it           22082125  0.176657   125000000       580521
Creditbit             8750042  0.736853    11874881       359370
DigixDAO            162637000     81.32     2000000       303509
Edgeless             44011846  0.538422    81742288       699753
Ethbits                  1306   0.00307      425388            0
Ethereum Movi...      3605906  0.540886     6666666         3786
Etheroll             28676126       4.1     7001623        25658
FirstBlood          128400869       1.5    85558371      8141070
Gnosis              357368003    323.53     1104590     12010600
Golem               384864116  0.462004   833032000      4593380
Humaniq              26695588  0.163919   162858414       349474
Iconomi             320927340      3.69    87000000      1517730
iDice                 1439145  0.916062     1571013         8511
iExec RLC            43572989  0.551063    79070793       210195
Legends Room          3330340      1.67     2000000       571581
Lunyr                 6812330      2.96     2297853       194785
Matchpool            18527325  0.247031    75000000       206184
MCAP                 98717644      4.84    20383236       265530
Melon                42662775     71.18      599400       318787
Minereum              3193667      5.29      603585        39461
Nexium               19845651  0.298334    66521586      1003330
Numeraire            66873477     54.66     1223451     11890600
Patientory           13120520  0.187436    70000000      1151120
Pluton               11699988     13.76      850000       129772
Quantum              23432939  0.284194    82454023       118966
Quantum Resis...     37007412  0.711681    52000000       546218
RouletteToken         5681468  0.562946    10092385        78663
Round                49461925   0.05819   850000000       304667
SingularDTV         100459800  0.167433   600000000       280495
Status              159849442   0.04606  3470483788     11907400
Swarm City           17365793      2.36     7357576        40417
TaaS                 21002345      2.58     8146001       194969
TokenCard            25058443      1.06    23644056       513609
Unity Ingot          16015247  0.079283   202000000       413868
Veritaseum          161739277     82.21     1967282       370198
VOISE                 1294737      1.57      825578         5121
vSlice               32649327  0.977803    33390496       175566
WeTrust              21296301  0.231111    92147500       251720
Wings                36238040  0.403954    89708333       425818
Xaurum               30743467  0.241862   127111604        74506
Yocoin                 720973  0.006826   105618830        87033