OpenAI — GPT & o-Series
Latest flagship. Supersedes GPT-5.2. Enhanced reasoning, coding, and multi-step planning.
New default in ChatGPT, replacing GPT-4o, o3, o4-mini, GPT-4.1, and GPT-4.5. Pro variant uses scaled parallel test-time compute (best-of-N sketch below).
First GPT-5 series release. 400K context, 100% on AIME 2025, 6.2% hallucination rate (~40% reduction).
OpenAI's first open-weight models. 120B and 20B parameter variants.
Major advances in coding, instruction following, and long-context understanding. Supports up to 1M tokens. Retired from ChatGPT Feb 2026.
Small reasoning model optimized for science, math, and coding. Part of the o-series reasoning family.
Next-gen reasoning models. o3-mini optimized for STEM tasks. o3 full model for complex multi-step reasoning.
General-purpose LLM with improved EQ and reduced hallucinations. Released as research preview.
First reasoning models trained with RL for complex multi-step thinking. Preview release.
"Omni" model with native multimodal capabilities (text, vision, audio). 128K context. GPT-4o mini is a smaller, faster variant.
Flagship model with vision capabilities. GPT-4 Turbo expanded context to 128K and reduced cost.
Cost-effective model that powered the original ChatGPT. 16K context variant released mid-2023.
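
OpenAI has not published how the Pro tier's scaled parallel test-time compute works. As a generic illustration of the best-of-N idea behind such schemes, here is a minimal sketch in which `generate` and `score` are hypothetical stand-ins for a model sampler and a verifier:

```python
# Hypothetical best-of-N sketch; generate() and score() stand in for a
# real sampler and a reward model / verifier.
import random

def generate(prompt: str, seed: int) -> str:
    rng = random.Random(seed)
    return f"candidate #{rng.randint(0, 999)} for: {prompt}"

def score(answer: str) -> float:
    return random.random()  # a real system would use a learned verifier

def best_of_n(prompt: str, n: int = 8) -> str:
    # Sample n independent candidates (run in parallel in practice),
    # then keep the one the verifier scores highest.
    candidates = [generate(prompt, seed=i) for i in range(n)]
    return max(candidates, key=score)

print(best_of_n("Prove that sqrt(2) is irrational."))
```
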
Anthropic — Claude
Most capable Claude model. 1M context window, Agent Teams for native multi-agent collaboration. API: claude-opus-4-6
Best balance of speed and quality. Same pricing as Sonnet 4.5. API: claude-sonnet-4-6
Major upgrade to Opus tier with improved agentic capabilities and long-form reasoning.
Fastest and most cost-effective Claude model. API: claude-haiku-4-5-20251001
Claude 4 generation. Opus 4 excels at long-running tasks and agentic workflows. Sonnet 4 improved coding, reasoning, and instruction-following.
Introduced extended thinking — hybrid reasoning that lets Claude pause and think step-by-step (API sketch below). Major quality jump on complex math, science, and code.
Upgraded Sonnet with computer use capability. Haiku 3.5 introduced as fast, affordable tier.
Outperformed the larger Claude 3 Opus on benchmarks. Set a new standard for the Sonnet tier.
Introduced the three-tier model structure. Opus (most capable), Sonnet (balanced), Haiku (fastest).
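
A rough sketch of how extended thinking is exposed through the Anthropic Python SDK, using a model id from the entries above; the token budget and prompt are illustrative, and parameter shapes may differ across SDK versions:

```python
# Illustrative extended-thinking request via the Anthropic Python SDK.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=4096,  # must exceed the thinking budget
    thinking={"type": "enabled", "budget_tokens": 2048},
    messages=[{"role": "user", "content": "How many primes are below 100?"}],
)

# The reply interleaves "thinking" blocks with the final "text" blocks.
for block in response.content:
    print(block.type)
```
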
Google DeepMind — Gemini
Most advanced Pro-tier model. 1M context, 77.1% on ARC-AGI-2. Deep Think reasoning. Multimodal (text, image, audio, video, code).
Lightweight, fast model for high-throughput tasks. Preview release.
Gemini 3 generation. Pro replaces Ultra tier. Powerful agentic and coding capabilities.
Improved reasoning and coding. Flash variants optimized for speed and cost.
2.0 Flash became default model. Pro variant released Feb 2025.
MoE architecture. First to offer 1M token context. Flash variant for speed-sensitive tasks.
Original Gemini family. Ultra (complex tasks), Pro (general), Nano (on-device).
Meta AI — Llama
First MoE architecture in Llama family. Scout: 109B total / 17B active, 10M context. Maverick: 400B total / 17B active, optimized for quality. Multimodal (text + image + video).
First multimodal Llama. Text-only: 1B, 3B. Vision-enabled: 11B, 90B.
8B, 70B, and 405B parameters. 128K context. Multilingual. Strong tool use and reasoning.
8B and 70B parameters. Significant quality improvements over Llama 2.
7B, 13B, 70B. Commercial use license. Partnership with Microsoft.
7B to 65B parameters. Research-only license. Sparked the open-source LLM movement.
Mistral AI — Mistral & Mixtral
Sparse MoE. 675B total / 41B active parameters. Frontier-level performance.
Three small dense models for edge and cost-sensitive deployments.
Specialized code models for software development workflows.
First Mistral reasoning models with chain-of-thought capabilities. The Small variant is open-source.
Mid-tier model balancing quality and cost.
Efficient small model for edge and embedded use cases.
Dense 123B parameter model. 128K context, 80+ languages. Pixtral Large adds multimodal (124B).
Sparse MoE: 8 expert networks of 7B each (~47B total, ~13B active; toy routing sketch below). Outperformed Llama 2 70B.
First Mistral release. 7B dense model that punched far above its weight class.
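
The expert arithmetic in the Mixtral entry (8 experts of ~7B, ~13B active per token) falls out of top-k routing: a small router picks 2 of the 8 experts for each token and mixes their outputs, so most parameters sit idle on any given token. A toy PyTorch sketch of that routing step, with arbitrary sizes:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    """Toy top-k MoE layer: route each token to its top_k experts and
    combine their outputs, weighted by renormalized router scores."""
    def __init__(self, d_model: int, n_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x):               # x: (n_tokens, d_model)
        scores = self.router(x)         # (n_tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for k in range(self.top_k):     # loops for clarity, not speed
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e   # tokens whose k-th choice is e
                if mask.any():
                    out[mask] += weights[mask, k, None] * expert(x[mask])
        return out

tokens = torch.randn(10, 32)
print(SparseMoE(32)(tokens).shape)      # torch.Size([10, 32])
```

The same skeleton covers the 16-expert, top-4 configuration listed under Other Notable Models; only `n_experts` and `top_k` change.
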
DeepSeek — V-Series & R-Series
V3.2-Exp uses DeepSeek Sparse Attention (generic sparse-attention sketch below). Speciale variant surpasses GPT-5 on AIME/HMMT benchmarks.
Hybrid model combining V3 + R1 strengths. 671B (37B active). 128K context. Switchable thinking/non-thinking modes.
Major R1 upgrade pushing reasoning and inference capabilities further. Built on V3 Base.
Improved post-training drawing lessons from R1. Better reasoning, coding, and tool use.
The "DeepSeek moment." ChatGPT-level reasoning at fraction of training cost. MIT License. Includes distilled variants (1.5B–70B).
671B total / 37B active MoE. Trained on 14.8T tokens. Competitive with GPT-4o at a fraction of the cost.
V2 introduced Multi-head Latent Attention (MLA). V2.5 merged chat and coder capabilities. Coder V2 specialized for code.
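
DeepSeek's Sparse Attention uses a learned token-selection scheme of its own; as a generic illustration of why sparse attention cuts cost, the simplest fixed pattern is a causal sliding window, sketched here in PyTorch:

```python
import torch
import torch.nn.functional as F

def sliding_window_attention(q, k, v, window: int = 4):
    """Causal attention where each token sees only the last `window`
    tokens -- a generic sparse pattern, not DeepSeek's exact mechanism."""
    seq = q.size(-2)
    i = torch.arange(seq)[:, None]
    j = torch.arange(seq)[None, :]
    mask = (j <= i) & (j > i - window)          # causal + local window
    scores = q @ k.transpose(-2, -1) / q.size(-1) ** 0.5
    # This toy version materializes full scores and masks them; a real
    # sparse kernel would skip the masked blocks entirely.
    scores = scores.masked_fill(~mask, float("-inf"))
    return F.softmax(scores, dim=-1) @ v

q = k = v = torch.randn(1, 8, 16)               # (batch, seq, head_dim)
print(sliding_window_attention(q, k, v).shape)  # torch.Size([1, 8, 16])
```
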
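
The R1 distilled variants were reportedly produced by fine-tuning smaller open models on R1-generated reasoning traces. For contrast, the classic logit-matching form of distillation captures the same teacher-to-student idea in a single loss:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Hinton-style soft-label distillation: pull the student's output
    distribution toward the teacher's, softened by a temperature."""
    t = temperature
    teacher_probs = F.softmax(teacher_logits / t, dim=-1)
    student_log_probs = F.log_softmax(student_logits / t, dim=-1)
    # KL(teacher || student); the t^2 factor keeps gradient magnitudes
    # comparable across temperatures.
    return F.kl_div(student_log_probs, teacher_probs,
                    reduction="batchmean") * t * t

teacher = torch.randn(4, 32_000)    # e.g. logits from a large teacher
student = torch.randn(4, 32_000, requires_grad=True)
print(distillation_loss(student, teacher).item())
```
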
xAI — Grok
#1 on LMArena Elo (1483) and EQ-Bench. Hallucination rate ~4% (65% reduction from Grok 4).
Specialized for agentic coding — automating dev workflows, debugging, and code generation.
314B parameter MoE model. Open-sourced under Apache 2.0 license.
Alibaba Cloud — Qwen
Latest flagship MoE. 8.6×–19× higher decoding throughput than Qwen3-Max. Ultra-long context. Multimodal reasoning.
Dense: 0.6B, 1.7B, 4B, 8B, 14B, 32B. MoE: 30B-A3B, 235B-A22B. 100+ open-weight models total.
Vision-language models in 3B, 7B, 32B, and 72B parameter sizes.
Reasoning model similar to OpenAI's o1. 32B parameters.
Qwen2 released Jun 2024. Qwen2.5 refresh in Sep 2024 with improved quality across all sizes.
Cohere — Command
Specialized enterprise variants: Vision for multimodal, Reasoning for complex tasks, Translate for 200+ languages. On-premises deployment available.
RAG-optimized models. Command R+ (104B) for complex tasks, Command R (35B) for efficiency. Multilingual, grounded generation (minimal RAG sketch below).
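
In practice, "RAG-optimized, grounded generation" means retrieving passages first and instructing the model to answer only from them, with citations. A minimal end-to-end sketch; the keyword retriever is a toy and `call_model` is a hypothetical stand-in for a real API call:

```python
# Toy RAG pipeline: retrieve, build a grounded prompt, call the model.
docs = [
    "Command R has 35B parameters and targets efficient RAG workloads.",
    "Command R+ has 104B parameters for more complex tasks.",
    "Both models are multilingual and produce grounded, citable answers.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    words = set(query.lower().split())
    return sorted(docs, key=lambda d: -len(words & set(d.lower().split())))[:k]

def call_model(prompt: str) -> str:
    # Hypothetical stand-in; a real system would call an LLM API here.
    return f"(grounded answer)\n---\n{prompt}"

query = "How many parameters does Command R+ have?"
context = "\n".join(f"[{i}] {d}" for i, d in enumerate(retrieve(query)))
prompt = (f"Answer using ONLY the sources below and cite them as [n].\n"
          f"{context}\n\nQuestion: {query}")
print(call_model(prompt))
```

The docs list reuses parameter figures from the entry above; everything else is illustrative.
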
NVIDIA — Nemotron
Compact model achieving results that previously required 600B+ parameter models.
340B dense model for enterprise applications. Instruct and reward model variants.
Zhipu AI — GLM
94.2 HumanEval, 95.7 AIME 2025, 85.7 GPQA Diamond, 84.9 LiveCodeBench. 200K context. Arguably the most well-rounded open-source model.
GLM-4 general model. GLM-4V adds vision capabilities. Multiple size variants.
Other Notable Models
80.2 on SWE-bench Verified (highest). 230B parameters, 205K context. S-tier for software engineering.
StepFun's 196B model achieving frontier results at reduced compute. Part of the "small model revolution."
Baidu's MoE-based models. ERNIE 4.5 series open-sourced in 2025.
Small language models (SLMs). Phi-4: 14B. Phi-3 family: 3.8B mini, 7B small, 14B medium. Punches above its weight class.
Hybrid SSM-Transformer architecture combining Mamba and attention layers (toy interleaving sketch below). 52B (12B active). 256K context.
132B MoE (36B active). 16 experts, top-4 routing. Strong on code and enterprise tasks.
Falcon 180B was largest open model at release. Falcon 2 (11B) added vision. From Technology Innovation Institute (UAE).
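
The hybrid SSM-Transformer entry above interleaves a few attention layers among many SSM layers. A toy PyTorch sketch of that interleaving, with a simple diagonal linear recurrence standing in for a real Mamba block:

```python
import torch
import torch.nn as nn

class ToySSMBlock(nn.Module):
    """Minimal diagonal linear recurrence (stand-in for a Mamba block)."""
    def __init__(self, d_model: int):
        super().__init__()
        self.log_decay = nn.Parameter(torch.zeros(d_model))
        self.in_proj = nn.Linear(d_model, d_model)
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, x):                        # x: (batch, seq, d_model)
        u = self.in_proj(x)
        a = torch.sigmoid(self.log_decay)        # per-channel decay in (0, 1)
        h, outs = torch.zeros_like(u[:, 0]), []
        for t in range(u.size(1)):               # sequential scan over time
            h = a * h + (1 - a) * u[:, t]
            outs.append(h)
        return self.out_proj(torch.stack(outs, dim=1))

class ToyAttentionBlock(nn.Module):
    def __init__(self, d_model: int, n_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x):
        out, _ = self.attn(x, x, x, need_weights=False)
        return out

class HybridStack(nn.Module):
    """One attention block per `ratio` layers; the rest are SSM blocks."""
    def __init__(self, d_model: int, n_layers: int, ratio: int = 4):
        super().__init__()
        self.layers = nn.ModuleList(
            ToyAttentionBlock(d_model) if i % ratio == ratio - 1
            else ToySSMBlock(d_model)
            for i in range(n_layers)
        )

    def forward(self, x):
        for layer in self.layers:
            x = x + layer(x)                     # residual connection
        return x

x = torch.randn(2, 16, 64)
print(HybridStack(64, n_layers=8)(x).shape)      # torch.Size([2, 16, 64])
```

The real architecture's exact layer ratio, normalization, and MoE components are omitted; the point is only the alternating layer types.
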