Large Language Models (LLMs)

An Overview of LLMs

What is an LLM?

Large Language Models (LLMs) are reasoning engines trained on hundreds of billions of words. They read and write text as tokens, and what they learn is stored in billions or even trillions of learned connections (parameters). Remember that LLMs are text predictors: they continue whatever text they are given, so we can customize the instruction to anything we like. They exist in two forms:

| Aspect | Base LLMs | Instruction-Tuned LLMs |
| --- | --- | --- |
| Definition | Predict next token based on training data | Base models fine-tuned to follow instructions (often with RLHF) |
| Examples | LLaMA 2 Base, Mistral Base, BLOOM | LLaMA 2 Chat, Claude, GPT-4 |
| Availability | Mainly open-source models | Commercial APIs and open-source chat models |
| Primary Use | Foundation for further training | Direct user interaction and task completion |
| Sample Input | "The capital of France is" | "What's the capital of France?" |
| Sample Output | "Paris the most populous city in France and serves as the country's major" (continues predicting) | "Paris is the capital of France." (direct answer) |

Commercial LLMs (such as GPT-4, Claude, and Gemini) are offered as instruction-tuned versions of private base models. Open-source projects often release both base and instruction-tuned versions, enabling further customization by the community.

Key Capabilities

API Integration, Iterative Processing, Summarizing, Inferring, Transforming, Expanding

Important Notes:

  • LLMs are reasoning engines, not knowledge stores
  • Reliability improves when answers are grounded in retrieved documents, as in RAG (Retrieval-Augmented Generation)
  • LLMs should not be used as primary information sources
  • Accuracy depends on clear, complete instructions
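To make the RAG note above concrete, here is a minimal sketch of the idea: look up relevant text first, then ask the model to answer from that text only. Everything in it is an assumption made for illustration: the `DOCUMENTS` list, the word-overlap scoring, and the helper names `tokenize`, `retrieve`, and `build_rag_prompt` are hypothetical, and real systems usually retrieve with embeddings and a vector store rather than word overlap. No model is called; the sketch only builds the prompt.

```python
import re

# Hypothetical in-memory knowledge base; a real system would query a document store.
DOCUMENTS = [
    "Paris is the capital of France.",
    "Berlin is the capital of Germany.",
    "The Eiffel Tower was completed in 1889.",
]

def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents that share the most words with the question."""
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:k]

def build_rag_prompt(question: str, documents: list[str]) -> str:
    """Place the retrieved passages in the prompt and restrict the answer to them."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question, documents))
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

prompt = build_rag_prompt("What is the capital of France?", DOCUMENTS)
print(prompt)
```

The resulting prompt would then be sent to an instruction-tuned model. The model supplies the reasoning, while the facts come from the retrieved context.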

Effective Prompting Principles

  1. Use delimiters (```)
  2. Request structured outputs (CSV, JSON)
  3. Ask the model to check that conditions are met before it responds
  4. Implement few-shot prompting (provide examples)
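The four principles above can be combined in one prompt. The sketch below is only an illustration: the task (a one-sentence summary), the JSON keys `summary` and `language`, the helper names `build_prompt` and `parse_reply`, and the stand-in reply are all assumptions, and no model API is called.

```python
import json

FENCE = "`" * 3  # the triple-backtick delimiter from principle 1

def build_prompt(text: str) -> str:
    """Assemble a prompt that applies all four principles."""
    return "\n".join([
        "Summarize the text delimited by triple backticks in one sentence.",
        # Principle 2: request a structured output.
        'Reply with JSON only, using the keys "summary" and "language".',
        # Principle 3: tell the model what to do when a condition is not met.
        'If the delimited text is empty, reply with {"error": "no text provided"}.',
        # Principle 4: give one worked example (few-shot prompting).
        "Example input: " + FENCE + "Bonjour tout le monde" + FENCE,
        'Example output: {"summary": "A greeting to everyone.", "language": "French"}',
        # Principle 1: wrap the user's text in delimiters.
        FENCE + text + FENCE,
    ])

def parse_reply(reply: str) -> dict:
    """Parse the JSON reply and verify that the requested keys are present."""
    data = json.loads(reply)
    if "error" in data:
        return data
    missing = {"summary", "language"} - set(data)
    if missing:
        raise ValueError(f"reply is missing keys: {sorted(missing)}")
    return data

prompt = build_prompt("LLMs predict the next token in a sequence.")
print(prompt)

# Stand-in for a model reply, used here only to exercise the parser.
fake_reply = '{"summary": "LLMs are next-token predictors.", "language": "English"}'
result = parse_reply(fake_reply)
print(result["summary"])
```

Requesting JSON makes the reply easy to validate in code, which is why the parser can reject a response that omits a requested key.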

AGI Development Levels

| Level | DeepMind (Nov 2023) | OpenAI (Jul 2024) |
| --- | --- | --- |
| Level 0 | No AI | - |
| Level 1 | Emerging (Equal to or somewhat better than an unskilled human) | Chatbots (AI with conversational language) |
| Level 2 | Competent (At least 50th percentile of skilled adults) | Reasoners (Human-level problem solving) |
| Level 3* | Expert (At least 90th percentile of skilled adults) | Agents (Systems that can take actions) |
| Level 4 | Virtuoso (At least 99th percentile of skilled adults) | Innovators (AI that can aid in invention) |
| Level 5 | Superhuman (Outperforms 100% of humans) | Organizations (AI that can do the work of an organization) |

Decision Tree for Model Selection

[Figure: decision tree for model selection]

Model Selection Guide

| Type | Use Cases | Scenarios |
| --- | --- | --- |
| Fine-Tuning | RAG Implementation, Agent Development, One-Shot Learning | Small Dataset (under 300 rows): Use Instruct Models; Large Dataset (over 1000 rows): Base Models |
| Instruct Models | Limited Training, Direct Interaction | Prompt-Response Tasks, Out-of-box Solutions |
| Base Models | Large Datasets, Custom Training | Deep Customization, Domain Tasks |
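The dataset-size rule in the Fine-Tuning row can be written as a small helper. This is a sketch of that rule of thumb only: the function name `suggest_starting_model` and its return labels are hypothetical, and because the guide gives no recommendation for datasets between 300 and 1000 rows, that range is reported as unspecified rather than guessed.

```python
def suggest_starting_model(num_rows: int) -> str:
    """Map a fine-tuning dataset size to the model type suggested by the guide."""
    if num_rows < 0:
        raise ValueError("num_rows must be non-negative")
    if num_rows < 300:
        # Small dataset: start from an instruction-tuned model.
        return "instruct"
    if num_rows > 1000:
        # Large dataset: start from a base model.
        return "base"
    # The guide does not cover 300 to 1000 rows, so no recommendation is made.
    return "unspecified"

for rows in (200, 600, 5000):
    print(rows, suggest_starting_model(rows))
```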

Model Deployment Options

| Model Type | When to Use | Key Advantages |
| --- | --- | --- |
| External Frontier LLMs | When you need cutting-edge performance and have high-end hardware | State-of-the-art performance, access to latest improvements |
| Local/Quantized Models | When you need economical solutions or local/edge deployment | Lower resource requirements, faster inference times |
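A rough calculation shows why quantized models have lower resource requirements: the memory needed for the weights scales with the number of bits stored per parameter. The sketch below assumes a hypothetical 7-billion-parameter model and counts the weights only; activations, the KV cache, and the small overhead of quantization metadata are ignored, so real usage will be somewhat higher.

```python
def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate size of the weights alone, in gigabytes (1 GB = 10**9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9

# Hypothetical 7-billion-parameter model at three common precisions.
for bits in (16, 8, 4):
    print(f"{bits:>2}-bit: {weight_memory_gb(7e9, bits):.1f} GB")
```

Under these assumptions the weights take about 14 GB at 16 bits, 7 GB at 8 bits, and 3.5 GB at 4 bits.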
