print(result) "Part 2 of NotADev"

Isa

Fetching Stock Data with yFinance

With the idea in place and my setup ready, it was time to start coding—or, more accurately, instructing AI to code for me.


Getting the Data

I needed historical stock data. The AI suggested using the yfinance library, which pulls market data from Yahoo Finance.

import yfinance as yf

def get_stock_data(ticker, interval='1d', period='5y'):
    stock = yf.Ticker(ticker)
    data = stock.history(interval=interval, period=period)
    return data

The AI decided to fetch data for companies in the S&P 500, and I narrowed that down to the technology, energy and utilities sectors. It used Wikipedia's list of S&P 500 companies and extracted the tickers I was interested in.
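
The sector filter could look something like the sketch below. It is only an illustration, not the code the AI actually produced: it assumes the Wikipedia table has already been loaded into rows with "Symbol" and "GICS Sector" keys, and `filter_tickers_by_sector` and the sample rows are made up for the example.

```python
# Sector names as they are assumed to appear in the "GICS Sector" column
WANTED_SECTORS = {"Information Technology", "Energy", "Utilities"}


def filter_tickers_by_sector(rows, sectors=WANTED_SECTORS):
    """Return the symbols whose sector is in `sectors`.

    `rows` is any iterable of dicts with "Symbol" and "GICS Sector" keys,
    for example the S&P 500 table after it has been read from Wikipedia.
    """
    tickers = []
    for row in rows:
        if row["GICS Sector"] in sectors:
            tickers.append(row["Symbol"].strip())
    return tickers


# Hand-written sample rows standing in for the real table
sample_rows = [
    {"Symbol": "AKAM", "GICS Sector": "Information Technology"},
    {"Symbol": "XOM", "GICS Sector": "Energy"},
    {"Symbol": "JPM", "GICS Sector": "Financials"},
    {"Symbol": "NEE", "GICS Sector": "Utilities"},
]

print(filter_tickers_by_sector(sample_rows))  # ['AKAM', 'XOM', 'NEE']
```

The returned list can then be passed, one ticker at a time, to get_stock_data.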

🛑
Challenge: Some tickers returned empty dataframes or had missing data.

After running the initial version, I realized that for some companies, especially smaller ones or those less actively traded, the data returned was sparse or even nonexistent. This would obviously create issues for the machine learning models down the line.

💡
AI's Solution: Implement error handling and logging to skip tickers with insufficient data.

import logging

def get_stock_data(ticker, interval='1d', period='5y'):
    stock = yf.Ticker(ticker)
    try:
        data = stock.history(interval=interval, period=period)
        if data.empty:
            logging.warning(f"No data for {ticker}")
            return None
        return data
    except Exception as e:
        logging.error(f"Error fetching data for {ticker}: {e}")
        return None

This modification allowed the script to log warnings or errors for tickers with issues and proceed with the rest, enhancing the robustness of the data fetching process.
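
The function above only rejects completely empty results. To also skip sparse ones, a loop along these lines could work. This is a rough sketch under a few assumptions: `collect_usable`, `fake_fetch` and the `min_rows` threshold of 200 are invented for the example, and plain lists stand in for dataframes so it runs without a network connection (`len()` on a pandas dataframe gives its row count in the same way).

```python
import logging

logger = logging.getLogger(__name__)


def collect_usable(tickers, fetch, min_rows=200):
    """Fetch each ticker and keep only results with at least `min_rows` rows."""
    usable = {}
    for ticker in tickers:
        try:
            data = fetch(ticker)
        except Exception as exc:
            # Log the failure and carry on with the next ticker
            logger.error("Error fetching data for %s: %s", ticker, exc)
            continue
        if data is None or len(data) < min_rows:
            logger.warning("Skipping %s: not enough rows", ticker)
            continue
        usable[ticker] = data
    return usable


def fake_fetch(ticker):
    """Hypothetical fetcher: one full result, one sparse, one missing, one error."""
    if ticker == "BROKEN":
        raise ValueError("simulated network failure")
    return {"FULL": list(range(250)), "SPARSE": list(range(10))}.get(ticker)


result = collect_usable(["FULL", "SPARSE", "MISSING", "BROKEN"], fake_fetch)
print(sorted(result))  # ['FULL']
```

In the real script, get_stock_data would be passed in place of `fake_fetch`.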


Asynchronous Data Fetching

Fetching data for multiple tickers sequentially was time-consuming. The AI assistant recommended using asynchronous programming with asyncio to speed up the process.

import asyncio

async def fetch_all_data(tickers, interval='1d', period='5y'):
    data = {}
    for ticker in tickers:
        stock_data = await asyncio.to_thread(get_stock_data, ticker, interval, period)
        if stock_data is not None:
            data[ticker] = stock_data
    return data

🛑
Challenge: Initially, I hit the error TypeError: cannot unpack non-iterable coroutine object. This message typically means an async function was called without await, so the caller received a coroutine object instead of the function's result and then tried to unpack it.

💡
AI's Solution: The AI explained that any function using await must itself be defined with async, and that every coroutine has to be awaited before its result is used.

Corrected Code:

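async def fetch_all_data(tickers, interval='1d', period='5y'):
    # Build one coroutine per ticker; each runs the blocking fetch in a worker thread
    tasks = [asyncio.to_thread(get_stock_data, ticker, interval, period) for ticker in tickers]
    # Run them all concurrently; results come back in the same order as tickers
    results = await asyncio.gather(*tasks)
    # Keep only the tickers that actually returned data
    data = {ticker: result for ticker, result in zip(tickers, results) if result is not None}
    return data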

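This adjustment fixed the error. It also made the fetching concurrent: asyncio.gather runs all the to_thread calls together, whereas the first version awaited them one at a time inside the loop.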
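
For completeness, here is a small self-contained sketch of how a coroutine like this gets called from ordinary synchronous code. `slow_fetch` is a made-up stand-in for get_stock_data that just sleeps, so the example runs without yfinance. It needs Python 3.9 or later for asyncio.to_thread.

```python
import asyncio
import time


def slow_fetch(ticker):
    """Hypothetical blocking fetch: waits briefly, returns None for one ticker."""
    time.sleep(0.1)
    return None if ticker == "EMPTY" else f"data for {ticker}"


async def fetch_all(tickers):
    # One coroutine per ticker, each running the blocking call in a thread
    tasks = [asyncio.to_thread(slow_fetch, ticker) for ticker in tickers]
    results = await asyncio.gather(*tasks)
    return {t: r for t, r in zip(tickers, results) if r is not None}


def main():
    tickers = ["AAA", "BBB", "EMPTY", "CCC"]
    start = time.perf_counter()
    # asyncio.run starts the event loop and waits for the coroutine to finish.
    # Calling fetch_all(tickers) on its own would only create a coroutine object.
    data = asyncio.run(fetch_all(tickers))
    elapsed = time.perf_counter() - start
    return data, elapsed


if __name__ == "__main__":
    data, elapsed = main()
    print(sorted(data), f"{elapsed:.2f}s")
```

Because the four sleeps overlap, the total time should be close to a single 0.1 second wait rather than four of them back to back.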


Scheduling with crontab

To automate the bot's execution, I used crontab on my Linode instance to schedule it to run daily. This way, the bot would fetch new data and perform analysis every day without manual intervention.

crontab -e
# Add the following line to run the bot every day at 00:10 (ten past midnight)
10 0 * * * /usr/bin/python3 /path/to/the/bot.py
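
If the five schedule fields look cryptic, the sketch below labels them. `describe_cron_entry` is a hypothetical helper written for this post, not something cron provides, and it only splits the line; it does not validate the values.

```python
# Order of the five schedule fields in a crontab line
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def describe_cron_entry(entry):
    """Split a crontab line into its five schedule fields and the command."""
    parts = entry.split(None, 5)
    if len(parts) < 6:
        raise ValueError("expected five schedule fields followed by a command")
    schedule = dict(zip(CRON_FIELDS, parts[:5]))
    return schedule, parts[5]


schedule, command = describe_cron_entry(
    "10 0 * * * /usr/bin/python3 /path/to/the/bot.py"
)
print(schedule)
print(command)
```

Here minute is 10 and hour is 0, and the three asterisks mean every day of the month, every month and every day of the week.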

Here is a sample of the bot's log output from a run:

INFO:__main__:Telegram message sent.
INFO:__main__:Analyzing 91 tickers.
INFO:__main__:Fetched data for AKAM from yfinance, Data Shape: (262, 7)
INFO:__main__:Fetched data for ADBE from yfinance, Data Shape: (262, 7)
INFO:__main__:Fetched data for ACN from yfinance, Data Shape: (262, 7)
INFO:__main__:Fetched data for AMD from yfinance, Data Shape: (262, 7)
INFO:__main__:Fetched data for APH from yfinance, Data Shape: (262, 7)
INFO:__main__:Combined data for AKAM, Data Shape: (262, 139)
INFO:__main__:Combined data for ACN, Data Shape: (262, 142)
INFO:__main__:Combined data for APH, Data Shape: (262, 144)
INFO:__main__:Combined data for AMD, Data Shape: (262, 143)
INFO:__main__:Combined data for ADBE, Data Shape: (262, 143)

Preference

I run a few things on a schedule and, for my own bot built on a similar concept, I use a .sh file to run the script rather than calling Python directly. That way I can activate the Python virtual environment before the script runs.

There may be better ways of doing it but, like I mentioned, I’m not a developer. I just dabble, so I usually fall back on shell scripts, the kind of thing I used to write whilst doing web2 ethical hacking.

#!/bin/bash
# Change to the project directory; stop if it is missing
cd /home/REDACTED/REDACTED/testing || exit 1
# Activate the virtual environment
source /home/REDACTED/REDACTED/testing/.TEST/bin/activate
# Run the Python script
python3 /home/REDACTED/REDACTED/REDACTED/bot_24_draft.py
# Deactivate the virtual environment
deactivate

We will stop here, and I’ll see you on the next one.

pxng0lin.


Written by

Isa

Former analyst with expertise in data, forecasting, and resource modeling, transitioned to cybersecurity over the past 4 years (as of May 2024). Passionate about security and problem-solving, utilising skills in data and analysis, for cybersecurity challenges. Experience: Extensive background in data analytics, forecasting, and predictive modelling. Experience with platforms like Bugcrowd, Intigriti, and HackerOne. Transitioned to Web3 cybersecurity with Immunefi, exploring smart contract vulnerabilities. Spoken languages: English (Native, British), Arabic (Fus-ha)