How to Pre-train an AI Model with the HelpingAI Architecture
Hello, amazing AI enthusiasts! Ready to embark on an exciting journey of pre-training an AI model using the innovative HelpingAI architecture? Let's dive in!
Step 1: Set Up Your Environment
First, let's get everything ready! We'll clone the repository and install the necessary packages. Here's how:
import sys
import subprocess
from typing import List
def install(package_name: str) -> None:
    """Install a package with pip, reporting success or failure."""
    try:
        subprocess.check_call([sys.executable, "-m", "pip", "install", package_name])
        print(f"Yay! {package_name} installed successfully.")
    except subprocess.CalledProcessError as e:
        print(f"Oops! Couldn't install {package_name}. Error: {e}")

def clone_repository(url: str, destination: str) -> None:
    """Clone a git repository into the given destination directory."""
    try:
        subprocess.check_call(['git', 'clone', url, destination], stderr=subprocess.STDOUT)
        print(f"Woohoo! Repository cloned successfully from {url} to {destination}.")
    except subprocess.CalledProcessError as e:
        print(f"Uh-oh! Couldn't clone repository from {url}. Error: {e}")
# Let's clone the repository!
repository_url: str = "https://github.com/OEvortex/HelpingAI.git"
repository_destination: str = "HelpingAI"
clone_repository(repository_url, repository_destination)
# Time to install our packages!
packages_to_install: List[str] = [
    "datasets",
    "trl==0.8.4",
    "accelerate",
    "bitsandbytes",
]

for package in packages_to_install:
    install(package)
# Name we'll use later when saving and pushing the trained model
my_model: str = "LLM"
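Before moving on, it's worth checking that a GPU is visible, since the paged_adamw_32bit optimizer we'll use in Step 4 relies on bitsandbytes and CUDA. Here's a minimal sketch, assuming torch is already available in your runtime (it isn't in the package list above, so think of something like a Colab GPU runtime):

import torch

# Assumption: torch is pre-installed in the runtime (e.g., Colab); install it separately if not.
if torch.cuda.is_available():
    print(f"GPU detected: {torch.cuda.get_device_name(0)}")
else:
    print("No GPU detected - paged_adamw_32bit training in Step 4 will likely fail without CUDA.")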
Step 2: Configure Your Model
Now, let's set up our model configuration. We're going for a smaller model to keep things manageable:
import torch
from HelpingAI.HelpingAI_.configuration_HelpingAI import HelpingAIConfig
from HelpingAI.HelpingAI_.modeling_HelpingAI import HelpingAIForCausalLM
from transformers import TrainingArguments, AutoTokenizer
from datasets import load_dataset, Dataset
from trl import SFTTrainer
from HelpingAI.HelpingAI_.tokenization_HelpingAI_fast import HelpingAITokenizerFast
# Let's create our amazing model configuration!
configuration: HelpingAIConfig = HelpingAIConfig(
    vocab_size=50281,
    hidden_size=512,
    num_hidden_layers=6,
    num_attention_heads=8,
    head_dim=64,
    num_local_experts=4,
    num_experts_per_tok=1,
    intermediate_size=1024,
    hidden_act="silu",
    hidden_dropout=0.1,
    attention_dropout=0.1,
    classifier_dropout=0.1,
    max_position_embeddings=2048,
    initializer_range=0.02,
    rms_norm_eps=1e-6,
    layer_norm_eps=1e-5,
    use_cache=True,
    bos_token_id=50278,
    eos_token_id=50279,
    num_key_value_heads=8,
    norm_eps=1e-05,
)
# Time to create our fantastic model!
model: HelpingAIForCausalLM = HelpingAIForCausalLM(configuration)

# Let's get our tokenizer ready!
tokenizer: HelpingAITokenizerFast = HelpingAITokenizerFast.from_pretrained("Abhaykoul/HelpingAI-tokenizer", local_files_only=False)
tokenizer.pad_token = tokenizer.eos_token

print(f"Wow! Our model has {model.num_parameters():,} parameters!")
Step 3: Prepare Your Dataset
Now, let's get our dataset ready for training:
# Time to load our exciting dataset!
dataset: Dataset = load_dataset('roneneldan/TinyStories', split="train")

# Let's shake things up a bit!
dataset = dataset.shuffle(seed=42)

print(f'Amazing! We have {len(dataset)} prompts to work with!')
print(f'Our dataset columns are: {dataset.column_names}')
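It can also help to peek at a single example and see how long it is in tokens, since we'll set max_seq_length to 2048 in the next step. A quick, purely illustrative check:

# Peek at the first story and its tokenized length (illustrative only).
example_text = dataset[0]["text"]
token_count = len(tokenizer(example_text)["input_ids"])
print(example_text[:200])
print(f"This example is {token_count} tokens long (max_seq_length in Step 4 is 2048).")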
Step 4: Train Your Model
Now for the exciting part - training our model!
from trl import SFTTrainer
from transformers import TrainingArguments
trainer: SFTTrainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    dataset_num_proc=2,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=1,
        warmup_steps=2,
        max_steps=20000,
        learning_rate=1e-4,
        logging_steps=100,
        output_dir="M_outputs",
        overwrite_output_dir=True,
        save_steps=20000,
        optim="paged_adamw_32bit",
        report_to="none",
    ),
)
# Let's train our model!
trainer.train()
trainer.save_model(my_model)
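Once training finishes, a quick generation from the freshly trained model makes a nice smoke test. Again, this is a sketch that assumes the model supports the standard Hugging Face generate API:

# Smoke test: generate a short continuation with the trained model.
prompt = "Once upon a time"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
generated = model.generate(
    **inputs,
    max_new_tokens=50,
    do_sample=True,
    temperature=0.8,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(generated[0], skip_special_tokens=True))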
Step 5: Share Your Model with the World!
Finally, let's push our amazing model to the Hugging Face Hub:
model.push_to_hub(my_model, use_temp_dir=False, token="hf_***************************")
tokenizer.push_to_hub(my_model, use_temp_dir=False,token="hf_***************************")
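A small aside: rather than pasting your Hugging Face token directly into the notebook, it's safer to read it from an environment variable or an interactive prompt. A minimal sketch (the HF_TOKEN variable name here is just an illustrative choice):

import os
from getpass import getpass

# Read the Hugging Face token from the environment, falling back to an interactive prompt.
hf_token = os.environ.get("HF_TOKEN") or getpass("Hugging Face token: ")

model.push_to_hub(my_model, use_temp_dir=False, token=hf_token)
tokenizer.push_to_hub(my_model, use_temp_dir=False, token=hf_token)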
And there you have it! You've successfully pre-trained an AI model using the HelpingAI architecture. Remember, the journey of AI is all about exploration and learning. Keep experimenting, stay curious, and most importantly, have fun!
Happy coding, AI adventurers!
Written by Abhay Koul
I am a developer from KP Colony, Vessu, Anantnag. I have been actively involved in AI projects since January 2023, and I am dedicated to crafting emotionally intelligent conversational AI models. My work includes developing HelpingAI, an advanced AI that provides personalized assistance and empathetic interactions.