Querying Blockchain APIs for a Streamlit App-A-Thon: My process

Streamlit recently posted an opportunity for developers to build and share a dashboard in a challenge to win swag, on any subject of interest. This caught my attention, and I decided that for this exercise I would explore something live, something with real-time value.

Hence, this post walks through my submission for the App-A-Thon: the basic components of the application, the dependencies, the data explored, and the plots generated.
Although my knowledge of blockchain is severely limited, the post on [enter title] by Tony presented an opportunity to play around with APIs for that real-time value mentioned earlier. Read it here to see how the data from that post is queried, and to download the files used.
Next, let’s create an environment. This will hold all of our Python packages and dependencies, and will also make it easy to pin down the application’s requirements when it is finally time to deploy. Because we will be using the Subgrounds library, which requires Python 3.10 or newer, we have to specify the version when creating the environment, in one of two ways:
With conda, by following the code below:
conda create -n "myenv" python=3.10
Or, with virtualenv: download a Python version >= 3.10, copy the path to its executable, and point virtualenv at it:
virtualenv -p <path-to-new-python-installation> <new-venv-name>
For example, on Windows:
virtualenv -p C:\Users\ssharma\AppData\Local\Programs\Python\Python310\python.exe <new-venv-name>
This will create a new environment with the name you gave at <new-venv-name>. Activate it, and you can now pip install the following packages:
scipy==1.10.1
subgrounds==1.1.1
plotly==5.13.1
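When it is time to deploy, the same pins can go into a `requirements.txt` at the project root, which Streamlit Community Cloud installs from automatically. I have added `streamlit` and `pandas` here as assumptions, since the app imports both:

```
scipy==1.10.1
subgrounds==1.1.1
plotly==5.13.1
streamlit
pandas
```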
Having satisfied the requirements, let’s query the first 1000 records from the dataset and clean them; for this we will use a Jupyter notebook. If you are not familiar with Jupyter notebooks, no need to worry; just work through this section with me to get familiar with the syntax and flow, as we will reuse the code later. We will work through the following tasks:
Query Data
```python
## Save the downloaded file as csv
import pandas as pd

## Import Dataset
df = pd.read_csv('<path/to/csv/file.csv>')
```
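The CSV itself comes from a subgraph query. Here is a minimal sketch of what the `get_subgraphs()` helper used in `clean_store_data()` below could look like, assuming Subgrounds 1.1.1 and the ENS subgraph on The Graph's hosted service; the endpoint URL and field selection are my assumptions, reverse-engineered from the column names we rename next:

```python
from subgrounds import Subgrounds

def get_subgraphs():
    sg = Subgrounds()
    ## Assumed endpoint: the ENS subgraph on The Graph's hosted service
    ens = sg.load_subgraph('https://api.thegraph.com/subgraphs/name/ensdomains/ens')

    ## First 1000 registrations, newest first
    registrations = ens.Query.registrations(
        first=1000,
        orderBy=ens.Registration.registrationDate,
        orderDirection='desc',
    )

    ## Flatten the selected fields into a pandas DataFrame
    return sg.query_df([
        registrations.domain.name,
        registrations.registrant.id,
        registrations.registrationDate,
        registrations.cost,
        registrations.expiryDate,
    ])
```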
Clean and Wrangle Data
```python
## Rename Columns
df = df.rename(columns={
    'registrations_domain_name': 'ens_name',
    'registrations_registrant_id': 'owner_address',
    'registrations_registrationdate': 'registration_date',
    'registrations_cost': 'registration_cost_ether',
    'registrations_expirydate': 'expiry_date'
})

## Make a copy and drop the column we don't need
ens_df = df.copy()
ens_df = ens_df.drop(columns=['owner_address'])

## Standardize the registration ether value because of the outliers
from scipy import stats

# Z-score using scipy.stats
ens_df['st_registration_cost_ether'] = stats.zscore(ens_df['registration_cost_ether'])

## Convert to datetime values
ens_df["expiry_date"] = pd.to_datetime(ens_df["expiry_date"])
ens_df["registration_date"] = pd.to_datetime(ens_df["registration_date"])
ens_df['registration_hour'] = ens_df['registration_date'].dt.hour
```
The above creates a standardized version of the ether cost using the z-score, z = (x - mean) / std, i.e. how many standard deviations each registration cost sits from the mean, which tames the outliers.
Turn all of this into a function
```python
from datetime import date

import pandas as pd
from scipy import stats

def clean_store_data():
    data_path = './<path/to/daily/downloads>/'
    today_date = str(date.today())
    filepath = data_path + today_date + '.csv'
    ens_df = get_subgraphs()
    ## Cleaning
    ens_df = ens_df.rename(columns={
        'registrations_domain_name': 'ens_name',
        'registrations_registrant_id': 'owner_address',
        'registrations_registrationdate': 'registration_date',
        'registrations_cost': 'registration_cost_ether',
        'registrations_expirydate': 'expiry_date'
    })
    ens_df = ens_df.drop(columns=['owner_address'])
    ens_df["expiry_date"] = pd.to_datetime(ens_df["expiry_date"])
    ens_df["registration_date"] = pd.to_datetime(ens_df["registration_date"])
    ens_df['registration_hour'] = ens_df['registration_date'].dt.hour
    ## Standardize the ether cost (z-score)
    ens_df['st_registration_cost_ether'] = stats.zscore(ens_df['registration_cost_ether'])
    ## Persist today's pull so the app can read it back
    ens_df.to_csv(filepath, index=False)
    print(f'Saved data to {filepath}')
    return ens_df
```
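In the notebook, the whole pipeline is then a single call (assuming the `get_subgraphs()` sketch from earlier is defined in the same session):

```python
ens_df = clean_store_data()
ens_df.head()
```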
Now let’s have fun and start building our application with Streamlit. Do the following:

- Create a directory; this is where all of our code and files will live.
- Within the above directory, create another directory and call it what you like; it takes the place of the <path/to/daily/downloads> placeholder above, and it is where we will pull our daily records into.
- Now, create a Python file called app.py and paste the code here into the directory.

Your working directory should look like this:
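A sketch, assuming you named the project folder `ens-dashboard` and the data folder `daily_downloads`:

```
ens-dashboard/
├── app.py
└── daily_downloads/
    └── 2023-04-07.csv   # one file per day, named by date (illustrative)
```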
The code you pasted into app.py imports our packages and establishes the paths to the files we will be using later.
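As a rough sketch of what that opening section looks like (the folder name and constants here are my assumptions, not necessarily the exact ones in the linked file):

```python
## app.py -- opening section (a sketch; adapt names and paths to yours)
from datetime import date

import pandas as pd
import plotly.express as px
import streamlit as st
from scipy import stats

## Where the daily CSVs live, and today's expected file
DATA_PATH = './daily_downloads/'
TODAY_FILE = DATA_PATH + str(date.today()) + '.csv'
```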
Dashboard Breakdown
The Streamlit application splits the dashboard into two main parts, the main area and the sidebar; notice how in the code below the sidebar elements are all prefixed with `st.sidebar`:
```python
## e.g. to view the highest ETH registration for the day
import streamlit as st

## Assumed context: curr_df holds the current day's cleaned records
max_rce = curr_df['registration_cost_ether'].max()
min_rce = curr_df['registration_cost_ether'].min()

st.sidebar.markdown(f"### Highest Registration: {round(max_rce, 2)} ETH")
if st.sidebar.button("View Max Distribution"):
    max_rce_df = curr_df[curr_df['registration_cost_ether'] == max_rce]
    st.sidebar.dataframe(max_rce_df.T.style.highlight_max(axis=1))

st.sidebar.markdown(f"### Lowest Registration: {round(min_rce, 2)} ETH")
if st.sidebar.button("View Min Distribution"):
    min_rce_df = curr_df[curr_df['registration_cost_ether'] == min_rce]
    st.sidebar.dataframe(min_rce_df.T.style.highlight_max(axis=1))
```
Once we are satisfied with the result, we can build the reactive part of the dashboard around the sliders created earlier; the Plotly library is recommended for dashboards because of the extra interaction it gives plots. In this web application, the sliding filter is based on the hour of the day, as in the sketch below.
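A minimal sketch of that pattern (the slider bounds, filter, and chart are illustrative, not necessarily the exact ones in the app; `curr_df` is the day's cleaned DataFrame from earlier):

```python
import plotly.express as px
import streamlit as st

## Slider over the hour of the day (0-23)
hour = st.slider('Hour of day', min_value=0, max_value=23, value=12)

## Keep only registrations made up to the chosen hour
hour_df = curr_df[curr_df['registration_hour'] <= hour]

## Interactive Plotly histogram that re-renders whenever the slider moves
fig = px.histogram(hour_df, x='registration_cost_ether',
                   title=f'Registration costs up to hour {hour}')
st.plotly_chart(fig, use_container_width=True)
```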
Remember to cache.
The data is not read from a local file but from an API, and sometimes we pay per query/call; hence the importance of implementing caching. Streamlit covers functions like @st.cache_data() on this page.
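A sketch of that idea, wrapping the daily pull so that reruns reuse the cached result instead of hitting the subgraph again (the one-hour `ttl` and the wrapper name are my assumptions; tune them to your quota):

```python
import streamlit as st

## Cache the API pull; reruns within an hour reuse the stored result
@st.cache_data(ttl=3600)
def load_daily_data():
    return clean_store_data()

curr_df = load_daily_data()
```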
Versioning and Repositories
Your beautiful work should not live only on your computer, and it is time to share it: run git init in your Git Bash and push this to your GitHub repository, then follow up by sharing it on Streamlit Community Cloud.
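The usual sequence, with placeholders for your own username and repository name:

git init
git add .
git commit -m "streamlit app-a-thon dashboard"
git remote add origin https://github.com/<your-username>/<your-repo>.git
git push -u origin main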
Explore the Streamlit dashboard and repository.
If you followed this post to this point, then go one step further and share what you created in the comments. Can’t wait.
