Depth-first search (DFS) is an algorithm for traversing or searching tree or graph data structures. The algorithm starts at the root node (selecting some arbitrary node as the root in the case of a graph) and explores as far as possible along each branch before backtracking. Extra memory, usually a stack, is needed to keep track of the nodes discovered so far along the current branch, which is what makes backtracking possible. A version of depth-first search was investigated in the 19th century by French mathematician Charles Pierre Trémaux[1] as a strategy for solving mazes.
Depth-first traversal of a graph is similar to depth-first traversal of a tree. The only catch is that, unlike trees, graphs may contain cycles, so the same node may be reached more than once. To avoid processing a node twice, keep a set (or boolean array) of visited nodes. A graph can have more than one valid DFS traversal.
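Here is a minimal recursive sketch in Python (the adjacency-list `graph` below is a made-up example); a set of visited nodes guards against cycles:

def dfs(graph, node, visited=None):
    """Visit every node reachable from `node`, processing each node exactly once."""
    if visited is None:
        visited = set()
    visited.add(node)
    print(node)  # process the node
    for neighbour in graph[node]:
        if neighbour not in visited:
            dfs(graph, neighbour, visited)

# Example graph with a cycle: the edge 2 -> 0 closes the loop 0 -> 1 -> 2 -> 0.
graph = {0: [1, 2], 1: [2], 2: [0, 3], 3: []}
dfs(graph, 0)  # prints 0, 1, 2, 3 -- each node exactly once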
Llama 2 is a powerful new tool with the potential to change the way we interact with computers. It is still under development, but it is already being used by researchers and developers to build new and innovative applications. In the years to come, Llama 2 is likely to play an increasingly important role in our lives.
Below is an example of Python code for fine-tuning the Llama 2 model on the Instacart dataset.
# -*- coding: utf-8 -*-
"""Llama2 finetuning on Instacart Dataset-Parikshit Sangar
Automatically generated by Colaboratory.
Original file is located at
https://colab.research.google.com/drive/1Y9NUqu-Do99h-5ZqzzUU56lMSmhwtafJ
## Installing Necessary Libraries
"""
!pip install -q -U trl transformers accelerate git+https://github.com/huggingface/peft.git
!pip install -q datasets bitsandbytes einops wandb
"""# Dataset details
Instacart data can be downloaded from [here](https://www.kaggle.com/competitions/instacart-market-basket-analysis/data); we only need the products and departments CSV files.
"""
from google.colab import drive
drive.mount('/content/drive')
!ls /content/drive/MyDrive/'Colab Notebooks'
import pandas as pd

# Load the product and department tables and join them on department_id.
df_product = pd.read_csv("/content/drive/MyDrive/Colab Notebooks/products.csv")
df_dept = pd.read_csv("/content/drive/MyDrive/Colab Notebooks/departments.csv")
df_joined = pd.merge(df_product, df_dept, on=["department_id"])

# Build the training prompt for each row: "<product_name> ->: <department>".
df_joined['text'] = df_joined.apply(lambda row: row['product_name'] + " ->: " + row['department'], axis=1)
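# Peek at a few generated prompts; each looks like "Organic Bananas ->: produce"
# (illustrative values -- actual rows come from the Instacart CSVs).
df_joined['text'].head()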
from sklearn.model_selection import train_test_split

# Hold out 20% of the rows for evaluation.
train_df, test_df = train_test_split(df_joined, test_size=0.2, random_state=42)
train_df.head(10)
test_df.head(10)
from datasets import Dataset, DatasetDict

train_dataset_dict = DatasetDict({
    "train": Dataset.from_pandas(train_df),
})
"""## Loading the model"""
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "TinyPixel/Llama-2-7B-bf16-sharded"

# Quantize to 4-bit NF4 weights with fp16 compute, so the 7B model fits on a Colab GPU.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    trust_remote_code=True
)
model.config.use_cache = False  # the KV cache is only needed at inference time
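"""As an optional sanity check, `get_memory_footprint` reports how much memory the quantized weights take:"""
print(f"Model memory footprint: {model.get_memory_footprint() / 1e9:.2f} GB")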
"""Let's also load the tokenizer below"""
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token  # Llama has no pad token, so reuse EOS for padding
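"""Optional: round-trip one of our prompts to confirm the tokenizer handles the format."""
print(tokenizer.decode(tokenizer("Bread Rolls ->:")["input_ids"]))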
"""**Let's check what the base model predicts before finetuning. :)**"""
import transformers
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
)
sequences = pipeline(
    ["Free & Clear Stage 4 Overnight Diapers ->:", "Bread Rolls ->:", "French Milled Oval Almond Gourmande Soap ->:"],
    max_length=200,
    do_sample=True,
    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
)
for seq in sequences:
    print(f"Result: {seq[0]['generated_text']}")
"""Below we will load the configuration file in order to create the LoRA model. According to QLoRA paper, it is important to consider all linear layers in the transformer block for maximum performance."""
from peft import LoraConfig

lora_alpha = 16
lora_dropout = 0.1
lora_r = 64

peft_config = LoraConfig(
    lora_alpha=lora_alpha,
    lora_dropout=lora_dropout,
    r=lora_r,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "v_proj"],
)
"""## Loading the trainer
Here we will use the [`SFTTrainer` from the TRL library](https://huggingface.co/docs/trl/main/en/sft_trainer), which wraps the transformers `Trainer` to make it easy to fine-tune models on instruction-style datasets using PEFT adapters. Let's first set up the training arguments below.
"""
from transformers import TrainingArguments
output_dir = "./results"
per_device_train_batch_size = 4
gradient_accumulation_steps = 4
optim = "paged_adamw_32bit"
save_steps = 10
logging_steps = 1
learning_rate = 2e-4
max_grad_norm = 0.3
max_steps = 120
warmup_ratio = 0.03
lr_scheduler_type = "constant"
training_arguments = TrainingArguments(
    output_dir=output_dir,
    per_device_train_batch_size=per_device_train_batch_size,
    gradient_accumulation_steps=gradient_accumulation_steps,
    optim=optim,
    save_steps=save_steps,
    logging_steps=logging_steps,
    learning_rate=learning_rate,
    fp16=True,
    max_grad_norm=max_grad_norm,
    max_steps=max_steps,
    warmup_ratio=warmup_ratio,
    group_by_length=True,
    lr_scheduler_type=lr_scheduler_type,
)
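"""With per_device_train_batch_size=4 and gradient_accumulation_steps=4, the effective batch size is 4 × 4 = 16 examples per optimizer step, so the 120 steps above cover roughly 1,920 training rows."""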
"""Then finally pass everthing to the trainer"""
from trl import SFTTrainer
max_seq_length = 512
trainer = SFTTrainer(
    model=model,
    train_dataset=train_dataset_dict["train"],
    peft_config=peft_config,
    dataset_text_field="text",
    max_seq_length=max_seq_length,
    tokenizer=tokenizer,
    args=training_arguments,
)
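"""Because we passed `peft_config`, the trainer wraps the model in a PEFT adapter. An optional check that only the LoRA weights are trainable:"""
trainer.model.print_trainable_parameters()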
"""We will also pre-process the model by upcasting the layer norms in float 32 for more stable training"""
for name, module in trainer.model.named_modules():
    if "norm" in name:
        module = module.to(torch.float32)
"""## Train the model
Now let's train the model! Simply call `trainer.train()`
"""
trainer.train()
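"""Optionally, save the trained LoRA adapter so it outlives the Colab session; `save_model` writes the adapter weights to `output_dir`:"""
trainer.save_model(output_dir)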
# Build evaluation prompts from the held-out rows, keeping only the part up to
# "->:" so the department label is not leaked to the model.
lst_test_data = [t.split("->:")[0] + "->:" for t in test_df['text']]
len(lst_test_data)

sample_size = 25
lst_test_data_short = lst_test_data[:sample_size]
import transformers

pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.float16,
    trust_remote_code=True,
    device_map="auto",
)
sequences = pipeline(
    lst_test_data_short,
    max_length=100,
    do_sample=True,
    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
)
for ix, seq in enumerate(sequences):
    print(ix, seq[0]['generated_text'])
def correct_answer(ans):
    """Extract the predicted department: the text generated after '->:'."""
    return ans.split("->:")[1].strip()

answers = [correct_answer(seq[0]['generated_text']) for seq in sequences]
answers
df_evaluate = test_df.iloc[:sample_size][['product_name','department']]
df_evaluate = df_evaluate.reset_index(drop=True)
df_evaluate['department_predicted'] = answers
df_evaluate
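"""As a quick sanity check, compute exact-match accuracy on the sample (a minimal sketch; the generated text may run past the department name, so compare only its first line):"""
predicted = df_evaluate['department_predicted'].str.split('\n').str[0].str.strip()
accuracy = (predicted == df_evaluate['department']).mean()
print(f"Exact-match accuracy on {sample_size} products: {accuracy:.2%}")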