Summary: The table below compares the context window and maximum output tokens of several large language models (LLMs), including Claude 3, GPT-4 Turbo, GPT-3.5 Turbo, Gemini, Mixtral, Llama 3, and Code Llama.
Every model has a limit on the number of tokens in the input prompt, normally called the "context window." Additionally, each model has a limit on the number of tokens it can generate in the output, sometimes called "maximum new tokens" or "maximum output tokens."
| Model Name | Context Window | Maximum Output Tokens | Source |
| --- | --- | --- | --- |
| Claude 3 Haiku | 200,000 tokens | 4,096 tokens | https://docs.anthropic.com/claude/docs/models-overview#model-comparison |
| Claude 3 Sonnet | 200,000 tokens | 4,096 tokens | https://docs.anthropic.com/claude/docs/models-overview#model-comparison |
| Claude 3 Opus | 200,000 tokens | 4,096 tokens | https://docs.anthropic.com/claude/docs/models-overview#model-comparison |
| GPT-4o | 128,000 tokens | 4,096 tokens | https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models |
| GPT-4 Turbo | 128,000 tokens | 4,096 tokens | https://platform.openai.com/docs/models/gpt-4-turbo-and-gpt-4 |
| GPT-3.5 Turbo | 16,385 tokens | 4,096 tokens | https://platform.openai.com/docs/models/gpt-3-5-turbo |
| Gemini 1.0 Ultra | 8,192 tokens | 2,048 tokens | https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models#gemini-models |
| Gemini 1.5 Pro | 1,000,000 tokens | 8,192 tokens | https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models#gemini-models |
| Gemini 1.0 Pro | 32,760 tokens | 8,192 tokens | https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models#gemini-models |
| Mixtral 8x7B | 32,768 tokens | 16,384 tokens | https://www.ibm.com/docs/en/watsonx/saas?topic=solutions-supported-foundation-models#mixtral-8x7b-instruct-v01 |
| Llama 3 | 8,192 tokens | 4,096 tokens | https://www.ibm.com/docs/en/watsonx/saas?topic=solutions-supported-foundation-models#llama-3 |
| Codellama-34b-instruct | 16,384 tokens | 8,192 tokens | https://www.ibm.com/docs/en/watsonx/saas?topic=solutions-supported-foundation-models#codellama-34b-instruct |
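To make the two limits concrete, here is a minimal sketch of a pre-flight check for a request. The model names, the `fits` helper, and the 4-characters-per-token heuristic are my own illustrative assumptions (real tokenizers vary); the limit values come from the table above. It also assumes that generated tokens count against the context window, which is how most of these APIs behave.

```python
# Sketch: check whether a prompt plus a requested output budget fits
# within a model's limits. Values are from the table above; the
# 4-chars-per-token estimate is a rough heuristic, not a real tokenizer.

MODEL_LIMITS = {
    # model name: (context window, maximum output tokens)
    "claude-3-opus": (200_000, 4_096),
    "gpt-4-turbo": (128_000, 4_096),
    "gemini-1.5-pro": (1_000_000, 8_192),
}

def estimate_tokens(text: str) -> int:
    """Very rough estimate: about 4 characters per token for English text."""
    return max(1, len(text) // 4)

def fits(model: str, prompt: str, requested_output_tokens: int) -> bool:
    context_window, max_output = MODEL_LIMITS[model]
    # The output budget cannot exceed the model's hard output cap.
    if requested_output_tokens > max_output:
        return False
    # Prompt tokens plus generated tokens must fit in the context window.
    return estimate_tokens(prompt) + requested_output_tokens <= context_window

print(fits("gpt-4-turbo", "Summarize this article.", 4_096))  # → True
print(fits("gpt-4-turbo", "x" * 600_000, 4_096))              # → False
```

Note that the two failures are different: an oversized prompt overflows the context window, while asking for more than 4,096 output tokens from GPT-4 Turbo fails regardless of how short the prompt is.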
Let me know if you would like me to add more information to the table, whether other columns that might be useful or other LLMs.