How to Pre-train an AI Model with the HelpingAI Architecture

Abhay Koul

Hello, amazing AI enthusiasts! Ready to embark on an exciting journey of pre-training an AI model using the innovative HelpingAI architecture? Let's dive in!

Step 1: Set Up Your Environment

First, let's get everything ready! We'll clone the repository and install the necessary packages. Here's how:

import sys
import subprocess
from typing import List

def install(package_name: str) -> None:
    try:
        subprocess.check_call([sys.executable, "-m", "pip", "install", package_name])
        print(f"Yay! ๐ŸŽ‰ {package_name} installed successfully.")
    except subprocess.CalledProcessError as e:
        print(f"Oops! ๐Ÿ˜… Couldn't install {package_name}. Error: {e}")

def clone_repository(url: str, destination: str) -> None:
    try:
        subprocess.check_call(['git', 'clone', url, destination], stderr=subprocess.STDOUT)
        print(f"Woohoo! ๐ŸŽŠ Repository cloned successfully from {url} to {destination}.")
    except subprocess.CalledProcessError as e:
        print(f"Uh-oh! ๐Ÿ˜Ÿ Couldn't clone repository from {url}. Error: {e}")

# Let's clone the repository!
repository_url: str = "https://github.com/OEvortex/HelpingAI.git"
repository_destination: str = "HelpingAI"
clone_repository(repository_url, repository_destination)

# Time to install our packages!
packages_to_install: List[str] = [
    "datasets",
    "trl==0.8.4",
    "accelerate",
    "bitsandbytes"
]

for package in packages_to_install:
    install(package)

# The name we'll use later when saving the trained model and pushing it to the Hub
my_model: str = "LLM"
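
Before moving on, a quick optional sanity check can save you a long wait later. This is just a sketch of a habit I like, assuming torch is already available in your environment (accelerate pulls it in, and we import it in the next step anyway) and that you're running from the directory you cloned into:

import os
import torch

# The cloned folder should exist, and the current directory should be importable
# so that `from HelpingAI.HelpingAI_...` works in Step 2.
assert os.path.isdir(repository_destination), "The HelpingAI folder wasn't created, check the clone step."
if os.path.abspath(".") not in sys.path:
    sys.path.insert(0, os.path.abspath("."))

# bitsandbytes' paged optimizer (used in Step 4) expects a CUDA GPU.
print(f"CUDA available: {torch.cuda.is_available()}")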

Step 2: Configure Your Model

Now, let's set up our model configuration. We're going for a smaller model (6 hidden layers, a hidden size of 512, and a small mixture-of-experts block with 4 experts) to keep things manageable:

import torch
from HelpingAI.HelpingAI_.configuration_HelpingAI import HelpingAIConfig
from HelpingAI.HelpingAI_.modeling_HelpingAI import HelpingAIForCausalLM
from transformers import TrainingArguments, AutoTokenizer
from datasets import load_dataset, Dataset
from trl import SFTTrainer
from HelpingAI.HelpingAI_.tokenization_HelpingAI_fast import HelpingAITokenizerFast

# Let's create our model configuration!
configuration: HelpingAIConfig = HelpingAIConfig(
    vocab_size=50281,
    hidden_size=512,
    num_hidden_layers=6,
    num_attention_heads=8,
    head_dim=64,
    num_local_experts=4,
    num_experts_per_tok=1,
    intermediate_size=1024,
    hidden_act="silu",
    hidden_dropout=0.1,
    attention_dropout=0.1,
    classifier_dropout=0.1,
    max_position_embeddings=2048,
    initializer_range=0.02,
    rms_norm_eps=1e-6,
    layer_norm_eps=1e-5,
    use_cache=True,
    bos_token_id=50278,
    eos_token_id=50279,
    num_key_value_heads=8,
    norm_eps=1e-05,
)

# Time to create our model!
model: HelpingAIForCausalLM = HelpingAIForCausalLM(configuration)

# Let's get our tokenizer ready!
tokenizer: HelpingAITokenizerFast = HelpingAITokenizerFast.from_pretrained("Abhaykoul/HelpingAI-tokenizer", local_files_only=False)
tokenizer.pad_token = tokenizer.eos_token

print(f"Wow! Our model has {model.num_parameters():,} parameters!")

Step 3: Prepare Your Dataset

Now, let's get our dataset ready for training:

# Time to load our dataset!
dataset: Dataset = load_dataset('roneneldan/TinyStories', split="train")

# Let's shake things up a bit!
dataset = dataset.shuffle(seed=42)

print(f'Amazing! We have {len(dataset)} prompts to work with!')
print(f'Our dataset columns are: {dataset.column_names}')
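
It never hurts to eyeball one example and confirm there really is a "text" column, since that's the field we'll point the trainer at in the next step:

# Peek at the first example; TinyStories stores each story under the "text" column.
print(dataset[0]["text"][:200])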

Step 4: Train Your Model

Now for the exciting part: training our model!

from trl import SFTTrainer
from transformers import TrainingArguments

trainer: SFTTrainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    dataset_num_proc=2,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=1,
        warmup_steps=2,
        max_steps=20000,
        learning_rate=1e-4,
        logging_steps=100,
        output_dir="M_outputs",
        overwrite_output_dir=True,
        save_steps=20000,
        optim="paged_adamw_32bit",
        report_to="none"
    )
)

# Let's train our model!
trainer.train()
trainer.save_model(my_model)
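
Once training finishes, you'll probably want to see what the model has picked up. Here's a quick generation sketch, assuming the HelpingAI model reuses transformers' standard generate() API:

# Generate a short continuation to eyeball what the freshly trained model learned.
prompt = "Once upon a time"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
generated = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.9, temperature=0.8)
print(tokenizer.decode(generated[0], skip_special_tokens=True))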

Step 5: Share Your Model with the World!

Finally, let's push our model to the Hugging Face Hub (swap in your own access token for the masked one):

model.push_to_hub(my_model, use_temp_dir=False, token="hf_***************************")
tokenizer.push_to_hub(my_model, use_temp_dir=False, token="hf_***************************")
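
If you'd rather not paste a raw token into your script, logging in once with huggingface_hub works just as well, and others can then pull the model back down. The repo id below is a placeholder for your own username, and trust_remote_code=True is only needed if the custom HelpingAI code was pushed along with the weights:

from huggingface_hub import login
login()  # prompts for your Hugging Face token once, instead of passing it per call

# Later, to load the pushed model back (placeholder repo id):
# from transformers import AutoModelForCausalLM
# loaded = AutoModelForCausalLM.from_pretrained("your-username/LLM", trust_remote_code=True)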

And there you have it! You've successfully pre-trained an AI model using the HelpingAI architecture. Remember, the journey of AI is all about exploration and learning. Keep experimenting, stay curious, and most importantly, have fun!

Happy coding, AI adventurers!
