!pip install -q transformers datasets peft accelerate bitsandbytes

Train

import torch
import re
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForCausalLM, Trainer, TrainingArguments
from peft import LoraConfig, get_peft_model, PeftModelForCausalLM
from transformers import BitsAndBytesConfig
from copy import deepcopy

torch.random.manual_seed(0)
<torch._C.Generator at 0x7fe5ae6cc250>
# Load the Trump Tweets dataset
dataset = load_dataset("yunfan-y/trump-tweets-cleaned")

# Load the tokenizer and the base model
model_name = "unsloth/Llama-3.2-3B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True)
)
# Define a function for generating text
def generate_example(prompt, model, tokenizer, max_length=50):
    model.eval()  # Make sure the model is in evaluation (inference) mode
    inputs = tokenizer(prompt, return_tensors="pt", padding=True)
    input_ids = inputs.input_ids.to(model.device)
    attention_mask = inputs.attention_mask.to(model.device)
    output = model.generate(
        input_ids=input_ids,
        attention_mask=attention_mask,
        max_length=max_length,
        num_return_sequences=1,
        do_sample=True,
        temperature=0.8,
        top_p=0.95
    )
    return tokenizer.decode(output[0], skip_special_tokens=True)

def clean_tweet(tweet):
    # Remove URLs
    tweet = re.sub(r'http\S+|www\S+|https\S+', '', tweet, flags=re.MULTILINE)
    # Remove retweets
    tweet = re.sub(r'^RT\s+', '', tweet)
    # Remove user @ references and '#' from hashtags
    tweet = re.sub(r'\@\w+|\#', '', tweet)
    # Special characters and numbers are kept (this stricter filter is disabled)
    # tweet = re.sub(r'[^A-Za-z\d\s]', '', tweet)
    # Case is preserved; only surrounding whitespace is trimmed
    return tweet.strip()

# Preprocessing the data
def preprocess_function(examples):
    tweets = examples["text"]  # Use "text" instead of "content"
    inputs = [f"You are Donald Trump writing a tweet about politics. Your tweet: {clean_tweet(tweet)}" for tweet in tweets]
    model_inputs = tokenizer(inputs, max_length=512, truncation=True, padding="max_length")
    model_inputs["labels"] = model_inputs["input_ids"].copy()
    return model_inputs
    
tokenizer.pad_token = tokenizer.eos_token
tokenized_datasets = dataset.map(preprocess_function, batched=True, remove_columns=["text"])

# Split the dataset into training and validation sets
train_test_split = tokenized_datasets["train"].train_test_split(test_size=0.01, seed=42)
train_dataset = train_test_split["train"]
eval_dataset = train_test_split["test"]
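To see what `clean_tweet` actually keeps and drops, here is a small self-contained check. The sample tweet is made up for illustration and is not taken from the dataset.

```python
import re

def clean_tweet(tweet):
    # Same three substitutions as in the cell above
    tweet = re.sub(r'http\S+|www\S+|https\S+', '', tweet, flags=re.MULTILINE)
    tweet = re.sub(r'^RT\s+', '', tweet)
    tweet = re.sub(r'\@\w+|\#', '', tweet)
    return tweet.strip()

# Hypothetical sample tweet (not from the dataset)
sample = "RT @someone: Make America Great Again! #MAGA https://t.co/abc123"
cleaned = clean_tweet(sample)
print(repr(cleaned))  # ': Make America Great Again! MAGA'
```

Note the leading colon: removing the `@someone` mention leaves the punctuation that followed it, so a retweet-style prefix is not stripped completely.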
# Generate text before training
print("=== Text Generation Before Training ===")
prompt = "Immigrants"
print(generate_example(prompt, model, tokenizer))

# Prepare LoRA configuration
lora_config = LoraConfig(
    r=64,  # Rank of the LoRA update matrices
    lora_alpha=32,  # LoRA scaling factor
    target_modules=["q_proj", "v_proj"],  # Modules to apply LoRA
    lora_dropout=0.1,  # Dropout rate for LoRA
    bias="none",  # Do not train biases
    task_type="CAUSAL_LM"  # Task type for causal language modeling
)

# Wrap the model with PEFT
peft_model = get_peft_model(model, lora_config)
output_dir = "./trump_lora"

# Define training arguments
training_args = TrainingArguments(
    output_dir=output_dir,  # Directory for saving the model
    eval_strategy="steps",  # Evaluate periodically (eval_steps defaults to logging_steps)
    logging_steps=10,
    save_strategy="steps",
    save_steps=10,
    learning_rate=3e-4,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=4,
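    num_train_epochs=1,  # Ignored because max_steps is set
    max_steps=100,  # Total number of optimizer steps
    weight_decay=0.01,
    gradient_accumulation_steps=16,  # Effective batch size per device: 1 * 16
    warmup_steps=100,  # Equal to max_steps, so the whole run is warmup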
    logging_dir="./logs",
    fp16=True,
    report_to='none',
)


# Define the Trainer with eval_dataset
trainer = Trainer(
    model=peft_model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
)

# Train the model
trainer.train()

# Save the fine-tuned model
peft_model.save_pretrained(output_dir)
tokenizer.save_pretrained(output_dir)

# Generate text after training
print("=== Text Generation After Training ===")
print(generate_example(prompt, peft_model, tokenizer))

print("Model fine-tuned and saved!")
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
=== Text Generation Before Training ===
Immigrants and the American Dream: The Political and Economic Legacy of Early New York
Immigrants and the American Dream: The Political and Economic Legacy of Early New York by Anthony F. C. Wallace
English | February 3, 199
[100/100 14:01, Epoch 0/1]
Step   Training Loss   Validation Loss
  10      100.107800          4.980176
  20       32.612900          0.334947
  30        5.274200          0.308010
  40        4.525100          0.257844
  50        3.852100          0.193377
  60        2.864500          0.175147
  70        2.653900          0.171719
  80        2.623700          0.169075
  90        2.646900          0.167437
 100        2.612200          0.165744

Repo card metadata block was not found. Setting CardData to empty.
=== Text Generation After Training ===
Immigrants’ rights advocates are worried that changes to immigration policies are making it harder for people to seek asylum in the United States. And those changes have also made it harder to prove that someone is eligible for asylum, according to the experts interviewed by
Model fine-tuned and saved!
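The training arguments above combine `max_steps=100` (which takes precedence over `num_train_epochs`) with `warmup_steps=100`. The sketch below works out what that implies, assuming the Trainer's default linear schedule and a single device; the helper name `linear_schedule_factor` is made up for this illustration.

```python
BASE_LR = 3e-4
WARMUP_STEPS = 100
MAX_STEPS = 100
PER_DEVICE_BATCH = 1
GRAD_ACCUM = 16

def linear_schedule_factor(step, warmup=WARMUP_STEPS, total=MAX_STEPS):
    """LR multiplier for linear warmup followed by linear decay."""
    if step < warmup:
        return step / max(1, warmup)
    return max(0.0, (total - step) / max(1, total - warmup))

# Learning rate applied at each of the 100 optimizer steps (0-indexed)
lrs = [BASE_LR * linear_schedule_factor(s) for s in range(MAX_STEPS)]

# Sequences consumed over the run, assuming one device
sequences_seen = MAX_STEPS * PER_DEVICE_BATCH * GRAD_ACCUM

print(f"first LR: {lrs[0]:.2e}, last LR: {lrs[-1]:.2e}")
print(f"sequences seen (single device): {sequences_seen}")
```

Under these assumptions the learning rate rises for the whole run and never decays, and the model sees 1600 sequences. A shorter warmup (for example 10 steps) would spend more of the run near the peak rate.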

Inference

Loading the fine-tuned adapter appears to modify the original model object in place, so restart the notebook before running inference to be safe.

import os
os.environ["CUDA_DEVICE_ORDER"]='PCI_BUS_ID'
os.environ["XLA_PYTHON_CLIENT_PREALLOCATE"] = 'false'
i = 6 # device number to use
os.environ["CUDA_VISIBLE_DEVICES"] = f'{i}'
import torch
import re
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForCausalLM, Trainer, TrainingArguments
from peft import LoraConfig, get_peft_model, PeftModelForCausalLM
from transformers import BitsAndBytesConfig
from copy import deepcopy

torch.random.manual_seed(0)
<torch._C.Generator at 0x7f4c68630310>
# Load the Trump Tweets dataset
dataset = load_dataset("yunfan-y/trump-tweets-cleaned")

# Load the tokenizer and the base model
model_name = "unsloth/Llama-3.2-3B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True)
)
# Define a function for generating text
def generate_example(prompt, model, tokenizer, max_length=50):
    model.eval()  # Make sure the model is in evaluation (inference) mode
    inputs = tokenizer(prompt, return_tensors="pt", padding=True)
    input_ids = inputs.input_ids.to(model.device)
    attention_mask = inputs.attention_mask.to(model.device)
    output = model.generate(
        input_ids=input_ids,
        attention_mask=attention_mask,
        max_length=max_length,
        num_return_sequences=1,
        do_sample=True,
        temperature=0.8,
        top_p=0.95
    )
    return tokenizer.decode(output[0], skip_special_tokens=True)

def clean_tweet(tweet):
    # Remove URLs
    tweet = re.sub(r'http\S+|www\S+|https\S+', '', tweet, flags=re.MULTILINE)
    # Remove retweets
    tweet = re.sub(r'^RT\s+', '', tweet)
    # Remove user @ references and '#' from hashtags
    tweet = re.sub(r'\@\w+|\#', '', tweet)
    # Special characters and numbers are kept (this stricter filter is disabled)
    # tweet = re.sub(r'[^A-Za-z\d\s]', '', tweet)
    # Case is preserved; only surrounding whitespace is trimmed
    return tweet.strip()

# Preprocessing the data
def preprocess_function(examples):
    tweets = examples["text"]  # Use "text" instead of "content"
    inputs = [f"You are Donald Trump writing a tweet about politics. Your tweet: {clean_tweet(tweet)}" for tweet in tweets]
    model_inputs = tokenizer(inputs, max_length=512, truncation=True, padding="max_length")
    model_inputs["labels"] = model_inputs["input_ids"].copy()
    return model_inputs
    
tokenizer.pad_token = tokenizer.eos_token
tokenized_datasets = dataset.map(preprocess_function, batched=True, remove_columns=["text"])

# Split the dataset into training and validation sets
train_test_split = tokenized_datasets["train"].train_test_split(test_size=0.01, seed=42)
train_dataset = train_test_split["train"]
eval_dataset = train_test_split["test"]

Original model

from torch.utils.data import DataLoader
import math
train_test_split = tokenized_datasets["train"].train_test_split(test_size=0.01, seed=42)
train_dataset = train_test_split["train"]
eval_dataset = train_test_split["test"]

# Evaluate both models on the evaluation dataset
print("=== Comparing Model Performance on Evaluation Dataset ===")

# Evaluate original model
model.eval()
eval_loss = 0
eval_steps = 0
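def collate_batch(examples):
    # Stack each field into a tensor and move it to the model's device
    return {k: torch.tensor([ex[k] for ex in examples]).to(model.device) for k in examples[0].keys()}

eval_data_loader = DataLoader(eval_dataset, batch_size=4, collate_fn=collate_batch)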
with torch.no_grad():
    for batch in eval_data_loader:
        outputs = model(**{k: v for k, v in batch.items() if k != 'labels'}, labels=batch['labels'])
        eval_loss += outputs.loss.item()
        eval_steps += 1
original_perplexity = math.exp(eval_loss / eval_steps)
print(f"Original Model Perplexity: {original_perplexity:.2f}")
# Note: eval_loss is the sum of per-batch losses; divide by eval_steps for the mean
print(f"Original Model Loss: {eval_loss:.2f}")
=== Comparing Model Performance on Evaluation Dataset ===
Original Model Perplexity: 732.55
Original Model Loss: 554.11
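One thing that may inflate these numbers: `preprocess_function` pads every example to 512 tokens and copies `input_ids` into `labels`, so padded positions are scored too (the pad token is the EOS token here). A common remedy is to set the label to -100, the value ignored by the causal-LM loss, wherever `attention_mask` is 0. Below is a minimal sketch on plain Python lists; `mask_padding_labels` is a hypothetical helper, not part of the notebook.

```python
IGNORE_INDEX = -100  # label value skipped by the loss

def mask_padding_labels(input_ids, attention_mask):
    """Copy input_ids to labels, replacing padded positions with IGNORE_INDEX."""
    return [
        [tok if keep == 1 else IGNORE_INDEX for tok, keep in zip(ids, mask)]
        for ids, mask in zip(input_ids, attention_mask)
    ]

# Toy batch: 128001 is the pad/EOS id reported in the generation warning
batch = {
    "input_ids": [[10, 11, 12, 128001, 128001], [20, 21, 128001, 128001, 128001]],
    "attention_mask": [[1, 1, 1, 0, 0], [1, 1, 0, 0, 0]],
}
batch["labels"] = mask_padding_labels(batch["input_ids"], batch["attention_mask"])
print(batch["labels"])
```

In `preprocess_function` this would take the place of the `.copy()` line. The numbers printed in this notebook were produced without such masking.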
test_prompts = [
    "I think, Donald Trump",
    "I think, Barack Obama",
    "I think, Joe Biden",
    "I think, the Democrats are",
    "We need to",
]
for prompt in test_prompts:
    for _ in range(3):
        print(generate_example(prompt, model, tokenizer))
        print("---")
    print()
I think, Donald Trump will be the next President of the United States. There is a chance that he won’t but if he doesn’t then I’ll be very disappointed. To me, it’s the only way that America can survive.
America
---
I think, Donald Trump, the newly elected 45th President of the United States has just pulled a fast one on us. I think, he is going to be a great President. I’m not going to hold my breath, however,
---
I think, Donald Trump is the most intelligent President the US ever had, and the current one, Barack Obama is the most stupid one. I know it is hard to believe, but it is a fact. Trump is the most intelligent, because
---
I think, Barack Obama would be the first African American president, who will make history in the USA, but he is not the first Black president. That is actually George Washington Carver, the inventor and scientist who was the first African American president
---
I think, Barack Obama will win the US election. He has a great chance, since the opposition, John McCain, has made a big mistake. He has promised to give the same tax reductions to people with a high income as to those with
---
I think, Barack Obama is a really nice man. I mean, his wife is really nice too. I mean, I can't think of a better word to describe them. And I like his daughter Malia. She seems really sweet.

---
I think, Joe Biden, the vice president, is the strongest candidate that we’ve had in 25 years.
If you want to do more than just talk about how terrible you think Donald Trump is, you have to offer a specific vision of
---
I think, Joe Biden is the first president in recent history who has the potential to be reelected to the presidency. The reason is he is a person with a sense of humor, he is a good orator, and he has a
---
I think, Joe Biden’s political career has been at the brink of collapse since he was elected Vice President. It seems to me that his political career is going to be over sooner than later. The political environment is changing. His party is losing
---
I think, the Democrats are in trouble this time. They are the ones who have been saying that the Republicans can not be trusted. I think that they have the right to be scared. They are getting very, very, very, very,
---
I think, the Democrats are now afraid of losing their super-majority in the House and Senate. And they are now trying to get the votes by offering amnesty to illegal immigrants.
The Democrats are now trying to pass amnesty, but the Republicans are
---
I think, the Democrats are going to try and impeach Trump for the things they said they’d do. But it didn’t work in 1998 and it won’t work now. They tried to impeach Bush for lying about Iraq,
---
We need to start doing things differently. We must begin to take care of our planet if we want to pass it on to our children. The way we travel is contributing to global warming and pollution. We need to look at ways to make our
---
We need to think hard about the impact of our choices in life.
The life of a 6 year old girl was saved by her mother’s choice to make a different kind of dinner.
We need to think hard about the impact of our choices
---
We need to look at the whole picture and get our minds around it. We need to understand what we are up against. We need to be able to see the big picture and be able to make sense of it.
---
test_prompts = [
    "The United States",
]
for prompt in test_prompts:
    for _ in range(10):
        print(generate_example(prompt, model, tokenizer))
        print("---")
    print()
The United States has been working with its partners in the United Nations Security Council to impose sanctions on the Syrian government for its alleged use of chemical weapons in April against civilians in the town of Khan Sheikhoun in Idlib governorate. The US,
---
The United States government is a massive employer, and is always looking for qualified candidates to fill a wide variety of open employment positions in locations across the country. Below you’ll find a Qualification Summary for an active, open job listing from the Department
---
The United States of America is a federal country made up of fifty states and one federal district. It is the world’s largest industrial, military and economic power, and is the leading global superpower.
The United States of America is the third largest
---
The United States Postal Service is a government-owned corporation that has been operating under the radar for a long time. It is the sole mail delivery company in the United States and is one of the biggest companies in the world, with more than 1
---
The United States is the only country in the world that does not have a national health insurance system. In 2019, 28.5 million people had no health insurance coverage, and in 2021, 37 million people remained uninsured
---
The United States is currently ranked No. 28 on the list of the world’s countries by life expectancy at birth. It is estimated that the US will have a population of 325 million in 2015. Currently, there are nearly 
---
The United States is home to an estimated 1.2 million refugees, about half of whom arrived after 9/11. For decades, America has welcomed refugees, granting permanent residence to those fleeing persecution and war. But, in recent years
---
The United States is a nation with a long history of slavery and the legacy of slavery still affects many people today. One such legacy is the practice of slavery in the United States. This practice has been around for centuries and has had a significant impact
---
The United States has been home to a number of notable African Americans, including athletes, politicians, entertainers, and military figures. One of the most famous was Jackie Robinson, who broke the MLB color barrier in 1947. Here are 
---
The United States and the rest of the world are facing an unprecedented crisis of misinformation. As the COVID-19 pandemic and other crises have unfolded, millions of people have been bombarded with lies and fake news on social media and other platforms. Mis
---
test_prompts = [
    "Hillary Clinton is a",
]
for prompt in test_prompts:
    for _ in range(10):
        print(generate_example(prompt, model, tokenizer))
        print("---")
    print()
Hillary Clinton is a master of the backroom deal. She's not afraid to reach across the aisle to forge a deal that can get something done. Even though it might not be her ideal, she's willing to compromise to get something done.
---
Hillary Clinton is a hypocrite for calling for tougher laws to stop sexual harassment in the workplace — and she should know better.
Last week, the former first lady and secretary of state blasted a federal judge who released the record of a settlement between President
---
Hillary Clinton is a former U.S. Secretary of State, a former U.S. Senator from New York, and the 67th U.S. Secretary of State. She served as the First Lady of the United States from 1993 to
---
Hillary Clinton is a former United States Senator from New York, and the First Lady of the United States. She is a former U.S. Secretary of State and a presidential candidate in the 2008 Democratic primaries. She is running for the Democratic
---
Hillary Clinton is a racist. The former secretary of state and presidential hopeful has a long record of promoting racially charged stereotypes that were designed to undermine the civil rights movement.
But a recent poll shows that she’s also a racist among her own party.
---
Hillary Clinton is a liar. She lies like a rug. The evidence is clear and strong. She lies and then she lies some more. That’s why she’s not fit to be president, but that’s not her only problem. She’s
---
Hillary Clinton is a terrible liar. It’s not that she’s a terrible person, it’s that she’s a terrible politician. She should have known better than to even try to lie her way out of this one. She’s so good at
---
Hillary Clinton is a "warmonger" and should not be president, Donald Trump said on Monday, setting up a likely confrontation with the former secretary of state as the presidential election looms.
In a speech at the New York Hilton hotel on
---
Hillary Clinton is a dangerous woman
by Mark Steyn / September 11, 2015 / Leave a comment
Hillary Clinton and her husband Bill Clinton at the 2015 Clinton Global Initiative © Alain Jocard/AFP/Getty Images
The
---
Hillary Clinton is a former U.S. Secretary of State and former First Lady of the United States. A graduate of Yale University and Yale Law School, she served as U.S. Senator for New York from 2001 to 2009,
---

LoRA model

Most prompts produce much the same output from both models; I probably did not find prompts that bring out the difference. The quantitative results (loss and perplexity on the evaluation dataset), however, speak for themselves.
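One possible reason the test prompts in this notebook do not separate the two models: the adapter was trained on text that always starts with the instruction prefix from `preprocess_function`, while the test prompts omit it. Below is a minimal sketch of building prompts with the same prefix. `build_prompt` is a hypothetical helper, and whether this actually changes the generations has not been checked here.

```python
# Prefix copied from preprocess_function in the training cell
TRAIN_PREFIX = "You are Donald Trump writing a tweet about politics. Your tweet: "

def build_prompt(tweet_start=""):
    """Prepend the training-time prefix to the beginning of a tweet."""
    return TRAIN_PREFIX + tweet_start

prompts = [build_prompt(p) for p in ["The Democrats are", "We need to"]]
for p in prompts:
    print(p)
    # With the notebook's model loaded, one could then call (max_length counts
    # prompt tokens too, so it likely needs to be raised above the default 50):
    # print(generate_example(p, peft_model, tokenizer, max_length=80))
```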

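# Evaluate the LoRA model (loading the adapter wraps `model` in place,
# so the base model has to be evaluated before this cell runs)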
output_dir = "./trump_lora"
peft_model = PeftModelForCausalLM.from_pretrained(model, output_dir)
peft_eval_loss = 0
peft_eval_steps = 0
with torch.no_grad():
    for batch in eval_data_loader:
        outputs = peft_model(**{k: v for k, v in batch.items() if k != 'labels'}, labels=batch['labels'])
        peft_eval_loss += outputs.loss.item()
        peft_eval_steps += 1
peft_perplexity = math.exp(peft_eval_loss / peft_eval_steps)
print(f"LORA Model Perplexity: {peft_perplexity:.2f}")
# Note: peft_eval_loss is the sum of per-batch losses, not the mean
print(f"LORA Model Loss: {peft_eval_loss:.2f}")
print(f"Perplexity Improvement: {original_perplexity - peft_perplexity:.2f}")
print(f"Loss improvement: {eval_loss - peft_eval_loss:.2f}")
LORA Model Perplexity: 1.18
LORA Model Loss: 13.93
Perplexity Improvement: 731.37
Loss improvement: 540.18
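The printed "Loss" values are sums over batches, while perplexity is computed from the mean: perplexity = exp(total_loss / num_batches). The quick check below uses only the values printed above to recover the number of batches and the mean losses.

```python
import math

# Values printed by the two evaluation cells
orig_total_loss, orig_ppl = 554.11, 732.55
lora_total_loss, lora_ppl = 13.93, 1.18

# perplexity = exp(total_loss / num_batches)  =>  num_batches = total_loss / ln(perplexity)
num_batches = orig_total_loss / math.log(orig_ppl)
print(f"implied number of batches: {num_batches:.2f}")

n = round(num_batches)
print(f"original mean loss: {orig_total_loss / n:.4f}")
print(f"LoRA mean loss:     {lora_total_loss / n:.4f}")
print(f"LoRA perplexity:    {math.exp(lora_total_loss / n):.2f}")
```

The implied count is about 84 batches, and the LoRA mean loss of roughly 0.166 agrees with the final validation loss in the training table.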
test_prompts = [
    "I think, Donald Trump",
    "I think, Barack Obama",
    "I think, Joe Biden",
    "I think, the Democrats are",
    "We need to",
]
for prompt in test_prompts:
    for _ in range(3):
        print(generate_example(prompt, peft_model, tokenizer))
        print("---")
    print()
I think, Donald Trump, for all his faults, is the most interesting presidential candidate since perhaps William Jennings Bryan. The thing is, I can’t be sure what Trump is thinking. He’s an enigma.
Now, I don’t want
---
I think, Donald Trump, a lot, about politics, and about the country, and about the world, and about everything else. He is a great thinker, and he thinks about everything. The reason he is successful is because he is a
---
I think, Donald Trump just won the Presidential election! He will be a great President for our Country! I’m proud to call myself a Donald Trump fan! He’s smart, a great businessman, a great TV personality and a great President!
---
I think, Barack Obama, has set a dangerous precedent with his latest tweet regarding politics and the press. Let me explain.
There are a lot of writers out there who just write about politics, and they don’t understand politics at all. They
---
I think, Barack Obama is a great man, I don't care what his politics are. He's a great man and we need a lot more men like him.
I'm the best friend you'll ever have. I'm the worst enemy
---
I think, Barack Obama is a good man. I hope he is reelected as president of the United States. He has done a great job as president of the United States. I hope he will be reelected as president of the
---
I think, Joe Biden is a great person and a great politician. He has always been a great president and I think that he has always done a great job of leading the United States and I think that he is a great leader and I think
---
I think, Joe Biden is one of the best presidents the United States of America has ever had. He is the man who brought down the Berlin Wall and the Iron Curtain. He helped end the Cold War and the threat of nuclear war.
He
---
I think, Joe Biden is the best for America. He is a very intelligent and kind person, who really cares about others. He knows what is going on in America and what the best for America is. He has a lot of experience in
---
I think, the Democrats are being too clever by half. They are playing a game of “chicken” with the Republicans. The Republicans have not played the game of “chicken” very well. The Democrats have not played the game of
---
I think, the Democrats are scared of Trump because he is a very strong leader, unlike Obama. He has a lot of political skills and I hope he wins the next election.
I think, the Democrats are scared of Trump because he is a
---
I think, the Democrats are a better party than the Republicans. I would say the Republicans are a joke. And the only thing they really have going is their big money. I think they have the most corrupt politicians in the world. I just
---
We need to protect our environment and the lives of all creatures on this planet. We must protect our planet, our natural resources, our environment, and the animals that inhabit it. We must all take action to save the Earth.
We all have
---
We need to have a plan on how to deal with a crisis and the best place to start is with a crisis plan. It is important to know what type of crisis you might be facing and to have a plan in place that can be used
---
We need to protect the environment from pollution by plastic. But what about the plastic inside of us? What is your body made of? Your body is made of different parts and they all serve a different purpose. We need these parts to stay alive
---
test_prompts = [
    "The United States",
]
for prompt in test_prompts:
    for _ in range(10):
        print(generate_example(prompt, peft_model, tokenizer))
        print("---")
    print()
The United States Supreme Court has ruled that federal law does not grant standing to individuals who simply make claims that they were damaged by another party’s actions. In Spokeo, Inc. v. Robins, the Court explained that “standing is
---
The United States’ trade deficit with China was $34.9 billion in the third quarter of 2017, the Commerce Department reported on Tuesday. The trade deficit with China was $375.2 billion in 2017, up from $
---
The United States Air Force announced today that the B-52H Stratofortress has earned a top spot on the prestigious 2017 Time Magazine’s Top 100 of the World’s Greatest Machines list. The venerable long-range bomber took
---
The United States Department of State has announced that it is accepting applications for the Diversity Visa Lottery program for the year 2020. The DV-2020 lottery will offer 55,000 visas for immigrants to the USA who would like to live
---
The United States is set to host its first ever esports World Cup in Dallas, Texas at the Dallas Convention Center from August 2nd to 4th. The event will be held in the same venue as the International eSports Federation (iSF
---
The United States is a nation of immigrants, with people from around the world coming to America in search of a better life. However, the process of immigrating to the United States can be complex and daunting, especially for those who are not familiar
---
The United States has become a global superpower as the result of the country's abundant natural resources, efficient markets, and large and motivated population. The United States has a great deal of natural resources, including forests, minerals, oil, and gas
---
The United States is a country of immigrants, and immigrants have always been an important part of our culture. This is why we are such a diverse and multicultural nation, with people from all over the world coming to make their lives here. If you
---
The United States of America is a federal republic of 50 states and the District of Columbia (DC). The 50 states are the principal political divisions of the country. The District of Columbia is a federal district and is not a state. There
---
The United States is home to many unique attractions, including the Statue of Liberty and the Golden Gate Bridge. The country is also well known for its delicious cuisine, which ranges from the iconic burgers and pizza to the diverse range of international dishes. From
---
test_prompts = [
    "Hillary Clinton is a",
]
for prompt in test_prompts:
    for _ in range(10):
        print(generate_example(prompt, peft_model, tokenizer))
        print("---")
    print()
Hillary Clinton is a bad role model for our children.
This should be the biggest concern for all parents with children of voting age, but especially parents who support a candidate for President.
I am not a Clinton supporter, but I do believe that the
---
Hillary Clinton is a good writer. She’s a good speaker. She’s a good politician. I don’t believe she’s a good person. And if you want to be president you need to be a good person.
I don’t know if
---
Hillary Clinton is a woman of many contradictions: one of the smartest in the world, and yet a political novice with little experience running for office; a brilliant negotiator, and yet a poor campaigner who has never been able to rally her
---
Hillary Clinton is a liar, a hypocrite, and a crook. She has been for years. She is a disgusting person who does not deserve the office of President. She is a weakling who would not be a good leader, she
---
Hillary Clinton is a real estate investor and owner of a company with a multimillion-dollar portfolio of real estate investments. Her investments include a building in New York City and in Florida. She has also invested in businesses in the United States and abroad.
---
Hillary Clinton is a terrible candidate for president of the United States. She is bad on domestic issues. She is bad on foreign issues. She is bad on economics. And she is bad on the Constitution. She would be the worst president we have
---
Hillary Clinton is a self-serving politician who uses her position to sell access and influence to donors and special interests, including foreign governments. She is the most corrupt and untrustworthy candidate ever to seek the presidency. She is a liar who has been
---
Hillary Clinton is a true fighter and one of the most extraordinary women of all time. Her book Living History is a great read. I’m sure she will continue to do great things for America and the world.
---
Hillary Clinton is a major liar and manipulator who would do anything to gain power.
Her policies are a disaster for America and the world. Her agenda is a threat to our freedoms and the Constitution.
She has a history of dishonesty and corruption
---
Hillary Clinton is a lying scum sucking whore who is only interested in selling her ass for a buck and lining her pockets. I don't know why she is even allowed to run for office. She is a disgrace to the country and should be
---