Large Language Models (LLMs)

What is an LLM?

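Large Language Models (LLMs) are text predictors trained on hundreds of billions of words or more. They read and write text as tokens (small units of text such as words or word pieces) and store what they learn in billions of parameters (the weighted connections of the network). Because they continue whatever text they are given, the instruction can be customized to almost anything, which is why they are best treated as reasoning engines. They exist in two forms: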

Aspect	Base LLMs	Instruction Tuned LLMs
Definition	Predict next token based on training data	Base models fine-tuned to follow instructions (often with RLHF)
Examples	LLaMA 2 Base, Mistral Base, BLOOM	LLaMA 2 Chat, Claude, GPT-4
Availability	Mainly open-source models	Commercial APIs and open-source chat models
Primary Use	Foundation for further training	Direct user interaction and task completion
Sample Input	"The capital of France is"	"What's the capital of France?"
Sample Output	"Paris, the most populous city in France, and serves as the country's major" (continues predicting)	"Paris is the capital of France." (direct answer)

The major commercial LLMs (like GPT-4, Claude, Gemini) are instruction-tuned versions of private base models. Open-source projects often release both base and instruction-tuned versions, enabling further customization by the community.
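
To make "predict next token based on training data" concrete, here is a toy sketch. It is not how a real LLM works internally (those use neural networks over subword tokens); it only counts which word follows which in a tiny made-up corpus and then continues a prompt greedily. The corpus and the function names are illustrative assumptions.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for each token, which tokens follow it in the training text."""
    tokens = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(tokens, tokens[1:]):
        follows[current][nxt] += 1
    return follows

def continue_text(follows, prompt, max_new_tokens=5):
    """Greedily append the most frequent next token, one token at a time."""
    tokens = prompt.lower().split()
    for _ in range(max_new_tokens):
        candidates = follows.get(tokens[-1])
        if not candidates:
            break  # nothing in the training data follows this token
        tokens.append(candidates.most_common(1)[0][0])
    return " ".join(tokens)

# Made-up training text; whitespace-separated words stand in for tokens.
corpus = (
    "the capital of france is paris . "
    "the capital of italy is rome . "
    "the capital of france is paris ."
)
model = train_bigram(corpus)
print(continue_text(model, "The capital of France is", max_new_tokens=2))
```

Like a base model, the sketch just keeps continuing the text. Because it only looks at the last word, it would also continue "The capital of Italy is" with "paris", which is a small reminder that a predictor is not a knowledge store.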

Key Capabilities

API Integration, Iterative Processing, Summarizing, Inferring, Transforming, Expanding

Important Notes:

LLMs are reasoning engines, not knowledge stores
Reliability improves when answers are grounded in retrieved sources, for example through RAG (Retrieval-Augmented Generation)
LLMs should not be used as primary information sources
Accuracy depends on clear, complete instructions
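
The RAG idea from the notes above can be sketched in a few lines: retrieve the most relevant text first, then put it in the prompt so the model reasons over supplied facts instead of recalling them. This is a minimal sketch under assumptions: the retriever is a toy word-overlap ranking (real systems typically use embeddings and a vector index), the documents are made up, and no model is actually called.

```python
import re

def words(text):
    """Lowercase a string and split it into words, dropping punctuation."""
    return set(re.findall(r"\w+", text.lower()))

def retrieve(question, documents, top_k=1):
    """Rank documents by how many words they share with the question (toy retriever)."""
    question_words = words(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & words(doc)),
        reverse=True,
    )
    return ranked[:top_k]

def build_rag_prompt(question, documents, top_k=1):
    """Place the retrieved context ahead of the question in a single prompt."""
    context = "\n".join(retrieve(question, documents, top_k))
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )

# Hypothetical knowledge base for illustration.
docs = [
    "Paris is the capital of France.",
    "Rome is the capital of Italy.",
]
print(build_rag_prompt("What is the capital of Italy?", docs))
```

The resulting prompt string is what would be sent to an instruction-tuned model; the instruction to admit insufficient context is one simple way to keep the instructions clear and complete.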

Effective Prompting Principles

Use delimiters (such as ```) to mark the text the model should act on
Request structured outputs (CSV, JSON)
Ask the model to check whether conditions are met before it responds
Implement few-shot prompting (provide examples)
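
The four principles above can be combined in one prompt. The sketch below is only an illustration: the sentiment task, the example reviews, and the function names are assumptions, and the code builds and validates strings without calling any model.

```python
import json

DELIM = "`" * 3  # triple backticks, used as the delimiter around user text

def build_prompt(review):
    """Combine delimiters, a structured-output request, a condition check, and few-shot examples."""
    few_shot = [
        ("Great battery life, love it.", {"sentiment": "positive"}),
        ("Stopped working after a week.", {"sentiment": "negative"}),
    ]
    lines = [
        "Classify the sentiment of the review delimited by triple backticks.",
        'Respond with JSON of the form {"sentiment": "positive" | "negative"}.',
        'If the text is not a product review, respond with {"sentiment": "n/a"}.',
        "",
    ]
    for text, label in few_shot:
        lines += [f"Review: {DELIM}{text}{DELIM}", f"Answer: {json.dumps(label)}", ""]
    lines += [f"Review: {DELIM}{review}{DELIM}", "Answer:"]
    return "\n".join(lines)

def parse_answer(raw):
    """Validate the model's reply instead of trusting it blindly."""
    data = json.loads(raw)
    if data.get("sentiment") not in {"positive", "negative", "n/a"}:
        raise ValueError(f"unexpected sentiment: {data!r}")
    return data["sentiment"]

print(build_prompt("Arrived late but works fine."))
```

Asking for JSON pays off on the way back: `parse_answer` can reject a malformed or unexpected reply before it reaches the rest of a program.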

AGI Development Levels

Level	DeepMind (Nov/2023)	OpenAI (Jul/2024)
Level 0	No AI	-
Level 1	Emerging (Equal to or somewhat better than an unskilled human)	Chatbots (AI with conversational language)
Level 2	Competent (At least 50th percentile of skilled adults)	Reasoners (Human-level problem solving)
Level 3*	Expert (At least 90th percentile of skilled adults)	Agents (Systems that can take actions)
Level 4	Virtuoso (At least 99th percentile of skilled adults)	Innovators (AI that can aid in invention)
Level 5	Superhuman (Outperforms 100% of humans)	Organizations (AI that can do the work of an organization)

Decision Tree for Model Selection

[Figure: decision tree for model selection]

Model Selection Guide

Type	Use Cases	Scenarios
Fine-Tuning	RAG Implementation, Agent Development, One-Shot Learning	Small dataset (under 300 rows): start from instruct models; large dataset (over 1000 rows): start from base models
Instruct Models	Limited Training, Direct Interaction	Prompt-Response Tasks, Out-of-box Solutions
Base Models	Large Datasets, Custom Training	Deep Customization, Domain Tasks
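
The dataset-size rule of thumb in the table can be written down as a small helper. This is a sketch of the thresholds exactly as stated in the table; the function name is an assumption, and the table gives no guidance for 300 to 1000 rows, so that range is reported as unspecified rather than guessed.

```python
def suggest_starting_model(num_rows):
    """Map fine-tuning dataset size to the model type suggested by the table."""
    if num_rows < 0:
        raise ValueError("num_rows must be non-negative")
    if num_rows < 300:
        return "instruct"      # small dataset: start from an instruct model
    if num_rows > 1000:
        return "base"          # large dataset: start from a base model
    return "unspecified"       # the table does not cover 300 to 1000 rows

for rows in (120, 500, 25_000):
    print(rows, "->", suggest_starting_model(rows))
```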

Model Deployment Options

Model Type	When to Use	Key Advantages
External Frontier LLMs	When you need cutting-edge performance and prefer a hosted API over running high-end hardware yourself	State-of-the-art performance, access to latest improvements
Local/Quantized Models	When you need economical solutions or local/edge deployment	Lower resource requirements, often faster inference on modest hardware
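
To see why quantized models have lower resource requirements, a rough back-of-the-envelope estimate helps: the memory needed just to hold the weights is the parameter count times the bits per weight. The sketch below assumes a hypothetical 7-billion-parameter model and ignores activations, the KV cache, and runtime overhead, so real usage will be higher.

```python
def weight_memory_gb(num_params, bits_per_weight):
    """Approximate memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9

params = 7e9  # hypothetical 7-billion-parameter model
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: about {weight_memory_gb(params, bits):.1f} GB")
```

Going from 16-bit to 4-bit weights cuts the weight memory to a quarter, which is often what makes local or edge deployment feasible.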