Large Language Models available on Yaddle

gpt-5-nano

GPT-5 Nano is OpenAI's fastest and most cost-efficient GPT-5 model. It is the best fit on Yaddle for quick answers, follow-up question generation, summarization, classification, and other cost-sensitive high-volume tasks.

Notable features include text input and output, image input support, a 400k context window, up to 128k output tokens, streaming, function calling, structured outputs, and configurable reasoning effort.

Learn more about this model from OpenAI or ask Yaddle about gpt-5-nano.

OpenAI

Released August 7, 2025

gpt-4o-mini

GPT-4o Mini is a compact, cost-effective OpenAI model retained on Yaddle for users who want the previous small-model behavior.

Notable features include support for text and vision inputs, a 128k context window, and up to 16k output tokens per request. It remains useful for cost-sensitive tasks that need legacy OpenAI compatibility.

Learn more about this model from OpenAI or ask Yaddle about gpt-4o-mini.

OpenAI

Released July 18, 2024

GPT-OSS-120B

GPT-OSS-120B is OpenAI's flagship open-weight reasoning model. It is built for higher-capability agentic tasks, with stronger multi-step reasoning, coding, and tool-use performance than smaller open models while still running quickly on Groq.

Notable features include a 131k context window, up to 65,536 output tokens, and strong benchmark performance for coding and advanced research workflows. It is a better fit when answer quality matters more than the lowest possible cost or latency.

Learn more about this model on Groq or ask Yaddle about GPT-OSS-120B.

OpenAI

Released August 5, 2025
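
The context-window and output-token limits quoted for gpt-5-nano, gpt-4o-mini, and GPT-OSS-120B can be turned into a quick budget check. The following is a minimal sketch, not part of Yaddle: the `MODEL_LIMITS` table and the `fits` helper are hypothetical names, "k" is read as 1,000 tokens, and the context window is assumed to cover prompt and output tokens together.

```python
# Limits copied from the model descriptions above.
# Assumption: "400k" means 400,000 tokens, "128k" means 128,000, and so on.
MODEL_LIMITS = {
    "gpt-5-nano": {"context": 400_000, "max_output": 128_000},
    "gpt-4o-mini": {"context": 128_000, "max_output": 16_000},
    "GPT-OSS-120B": {"context": 131_000, "max_output": 65_536},
}


def fits(model: str, prompt_tokens: int, output_tokens: int) -> bool:
    """Return True if a request stays within the listed limits.

    Assumption: the context window is shared by prompt and output tokens.
    """
    limits = MODEL_LIMITS[model]
    # The requested output must not exceed the per-request output cap.
    if output_tokens > limits["max_output"]:
        return False
    # Prompt plus output must fit inside the context window.
    return prompt_tokens + output_tokens <= limits["context"]


if __name__ == "__main__":
    for name in MODEL_LIMITS:
        print(name, fits(name, prompt_tokens=100_000, output_tokens=20_000))
```

With a 100,000-token prompt and a 20,000-token answer, only gpt-4o-mini is ruled out, because the requested output exceeds its 16k output cap.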

GPT-OSS-20B

GPT-OSS-20B is OpenAI's smaller open-weight language model, released alongside GPT-OSS-120B. This 20-billion-parameter model offers strong performance across a wide range of tasks, and its weights are openly available to developers and researchers.

Notable features include multilingual capabilities, strong reasoning abilities, and efficient inference. It is designed for developers and researchers who want to build on and customize a model for their specific use cases, with full access to the model weights and architecture.

Learn more about this model from OpenAI or ask Yaddle about GPT-OSS-20B.

OpenAI

Released August 5, 2025

Claude Haiku 4.5

Claude Haiku 4.5 is a fast, cost-efficient AI model from Anthropic. It is designed for low-latency responses and strong performance across coding, agent, and everyday assistant tasks.

Learn more about this model from Anthropic or ask Yaddle about Claude Haiku 4.5.

Anthropic

Released October 15, 2025

Llama-3.1-8B

Llama-3.1-8B is a low-latency model from Meta designed for quick responses and efficient inference while retaining strong general capability across common language tasks.

Learn more about this model family from Meta or ask Yaddle about llama-3.1-8b.

Llama-3.3-70B

Llama-3.3-70B is a high-capability model from Meta with stronger reasoning and coding performance for more complex prompts and multi-step tasks.

Learn more about this model family from Meta or ask Yaddle about llama-3.3-70b.

Qwen3-32B

Qwen3-32B is a fast reasoning model from Alibaba, designed for efficiency and strong performance on a variety of language tasks. It is well-suited for users seeking a balance of speed and capability.

Learn more about this model from Alibaba or ask Yaddle about Qwen3-32B.

Alibaba

Released April 2025

Gemini 2.5 Flash Lite

Gemini 2.5 Flash Lite is a lightweight, fast model from Google, designed for quick, high-quality responses. It suits a wide range of general-purpose tasks where speed and efficiency matter most.

Learn more about this model from Google or ask Yaddle about Gemini 2.5 Flash Lite.

Google

Released June 17, 2025 (preview)

Llama-4-Scout

Llama-4-Scout is a fast model from Meta's Llama 4 family, designed for high efficiency and strong performance across a variety of tasks. It is ideal for users who want recent advances in open-weight language models.

Learn more about this model from Meta or ask Yaddle about Llama-4-Scout.

Choose for me

Not sure which model to use? Select "Choose for me" and Yaddle will automatically pick the best model for your question based on speed, accuracy, and current availability.
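
To make the idea of picking by speed, accuracy, and availability concrete, here is a purely illustrative sketch. It is not Yaddle's actual selection logic: the `Candidate` type, the `choose` function, the 0-to-1 scores, and the weights are all hypothetical, and the model names are placeholders rather than real models.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Candidate:
    name: str
    speed: float      # hypothetical score in [0, 1], higher is faster
    accuracy: float   # hypothetical score in [0, 1], higher is better
    available: bool   # whether the model can currently serve requests


def choose(
    candidates: Sequence[Candidate],
    speed_weight: float = 0.5,
    accuracy_weight: float = 0.5,
) -> Optional[str]:
    """Pick the available candidate with the highest weighted score."""
    # Unavailable models are excluded before any scoring happens.
    usable = [c for c in candidates if c.available]
    if not usable:
        return None
    best = max(
        usable,
        key=lambda c: speed_weight * c.speed + accuracy_weight * c.accuracy,
    )
    return best.name


if __name__ == "__main__":
    # Placeholder entries with made-up scores, for illustration only.
    pool = [
        Candidate("model-a", speed=0.9, accuracy=0.6, available=True),
        Candidate("model-b", speed=0.5, accuracy=0.9, available=True),
        Candidate("model-c", speed=1.0, accuracy=1.0, available=False),
    ]
    print(choose(pool))
    print(choose(pool, speed_weight=0.2, accuracy_weight=0.8))
```

Filtering on availability first means an unavailable model is never chosen, however high its scores. Shifting the weights toward accuracy moves the choice from the faster placeholder to the more accurate one.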