| Creator | Model name | Price ($ per 1M tokens) | Context size (tokens) | Description | Vision enabled |
|---|---|---|---|---|---|
| OpenAI | GPT-3.5 | in: $0.50<br>out: $1.50 | 16k | The model used in the free version of ChatGPT - fast and effective for most needs | |
| | GPT-4 | in: $30.00<br>out: $60.00 | 8k | A more powerful but slower model - useful when complex reasoning is required | |
| | GPT-4 Turbo | in: $10.00<br>out: $30.00 | 128k | The model used in the paid version of ChatGPT - a powerful GPT-4 level reasoning model with speed similar to GPT-3.5 and vision capabilities | Yes<br>See OpenAI's image token pricing, approx. $0.011 per 1080p image |
| | GPT-4o | in: $5.00<br>out: $15.00 | 128k | | Yes<br>See OpenAI's image token pricing, approx. $0.0055 per 1080p image |
| | GPT-4o mini | in: $0.15<br>out: $0.60 | 128k | The most cost-efficient model available from OpenAI | Yes<br>See OpenAI's image token pricing, approx. $0.0055 per 1080p image |
| | o1 | in: $15.00<br>out: $60.00 | 128k | Reasoning model designed to solve hard problems across domains | |
| | o1-mini | in: $3.00<br>out: $12.00 | 128k | Faster and cheaper reasoning model, particularly good at coding, math and science | |
| Google | Gemini Pro | in: $0.50<br>out: $1.50 | 32k | A multi-modal LLM from Google | Yes<br>$0.0025/image |
| | Gemini 1.5 Flash | in: $0.35/$0.70<br>out: $0.53/$1.05<br>(≤128k/>128k tokens) | 1M | A very good value price-performance model with a huge 1 million token context length and high speeds | Yes<br>~$0.00265/image |
| | Gemini 1.5 Pro | in: $3.50/$7.00<br>out: $10.50/$21.00<br>(≤128k/>128k tokens) | 1M | Google's latest multi-modal LLM with a huge 1 million token context window | Yes<br>~$0.00265/image |
| | Gemma 2B | $0.10 | 8k | An older generation fast and cheap open-source LLM from Google | |
| | Gemma 2 9B | $0.20 | 8k | A next-generation fast and cheap open-source LLM from Google | |
| | Gemma 2 27B | $0.80 | 8k | A larger next-generation open-source LLM from Google | |
| Mistral AI | Mistral 7B | $0.20 | 8k | Fast and small (7B) model | |
| | Mixtral 8x7B | $0.60<br>($0.24 superfast) | 32k | A mixture-of-experts (8x7B) model that outperforms LLaMA-2 70B but with an inference time and cost equivalent to a 13B model. A 'superfast' mode is available that uses Groq for model inference; it's currently not recommended for production deployments but is worth experimenting with for low-latency applications | |
| | Mixtral 8x22B | $1.20<br>(in: $2.00 out: $6.00 structured) | 64k | A mixture-of-experts model fluent in five languages that excels in math and coding, outperforming nearly all open models | |
| | Mistral NeMo | $0.30 | 128k | Mistral's best small model with 128k context length | |
| | Mistral Medium | in: $2.70<br>out: $8.10 | 32k | Another Mistral model, fairly performant | |
| | Mistral Large | in: $8.00<br>out: $24.00 | 32k | A good reasoning model with strong multilingual capabilities | |
| Anthropic | Claude 2 | in: $8.00<br>out: $24.00 | 200k | Claude 2.1 is a powerful model with a 200k token context, capable of complex reasoning | |
| | Claude 3 Haiku | in: $0.25<br>out: $1.25 | 200k | Fast and cheap model with big context | Yes<br>See Anthropic's image token pricing, approx. $0.008 per 1080p image |
| | Claude 3.5 Haiku | in: $1.00<br>out: $5.00 | 200k | Upgraded fast and cheap model with big context | |
| | Claude 3 Sonnet | in: $3.00<br>out: $15.00 | 200k | Ideal balance of intelligence and speed for enterprise workloads | Yes<br>See Anthropic's image token pricing, approx. $0.008 per 1080p image |
| | Claude 3.5 Sonnet | in: $3.00<br>out: $15.00 | 200k | Anthropic's current best model - faster, cheaper and more capable than Claude 3 Opus | Yes<br>See Anthropic's image token pricing, approx. $0.008 per 1080p image |
| | Claude 3 Opus | in: $15.00<br>out: $75.00 | 200k | A powerful model for highly complex tasks, though likely a worse choice than Claude 3.5 Sonnet nowadays | Yes<br>See Anthropic's image token pricing, approx. $0.008 per 1080p image |
| Meta | Llama 3 8B | $0.20<br>($0.05/$0.08 superfast) | 8k | The best small open model currently available. A 'superfast' mode is available that uses Groq for model inference; it's currently not recommended for production deployments but is worth experimenting with for low-latency applications | |
| | Llama 3 70B | $0.90<br>($0.59/$0.79 superfast) | 8k | A powerful open model from Meta. A 'superfast' mode is available that uses Groq for model inference; it's currently not recommended for production deployments but is worth experimenting with for low-latency applications | |
| | Llama 3.1 8B | $0.18 | 128k | A small, fast and effective model from Meta | |
| | Llama 3.1 70B | $0.54 | 128k | A medium-sized model from Meta | |
| | Llama 3.1 405B | $3.50 | 128k | A heavyweight model from Meta that's one of the most capable open models | |
| | Llama 3.2 3B | $0.06 | 128k | A super small and fast model from Meta, useful for low-latency inference | |
| | Llama 3.2 11B | $0.18 | 128k | A medium-sized model from Meta that can also perform visual understanding | Yes<br>~$0.00115/image |
| | Llama 3.2 90B | $1.20 | 128k | Meta's most advanced model with vision capabilities | Yes<br>~$0.0077/image |
| Databricks | DBRX | $1.20 | 32k | An open-source mixture-of-experts model from Databricks that outperforms GPT-3.5 and all other open-source models | |
| Cohere | Command R | in: $0.15<br>out: $0.60 | 128k | Cohere's latest mid-weight model with the ability to enable 'online' mode for web search before answering | |
| | Command R+ | in: $2.50<br>out: $10.00 | 128k | Cohere's latest and best model, optimised for RAG and tool use, with the ability to enable 'online' mode for web search before answering | |
| Perplexity | Sonar Small | $0.20<br>(+ $0.005 per request) | 128k | Perplexity AI's small 'online' model with up-to-date information, based on Llama 3.1 8B | |
| | Sonar Large | $1.00<br>(+ $0.005 per request) | 128k | Perplexity AI's large 'online' model with up-to-date information, based on Llama 3.1 70B | |
| | Sonar Huge | $5.00<br>(+ $0.005 per request) | 128k | Perplexity AI's huge mixture-of-experts 'online' model with up-to-date information, based on Llama 3.1 405B | |
| OpenChat | OpenChat 3.5 | $0.20 | 8k | A fast and small (7B) model that outperforms GPT-3.5 | |
| Amazon | Nova Micro | in: $0.035<br>out: $0.14 | 128k | A small and fast text-only model by Amazon | |
| | Nova Lite | in: $0.06<br>out: $0.24 | 300k | A low-cost multimodal model by Amazon | |
| | Nova Pro | in: $0.80<br>out: $3.20 | 300k | Amazon's largest model of the Nova family yet | |
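All prices in the table are quoted per 1 million tokens, so the cost of a single request is straightforward to estimate. Below is a minimal sketch of that arithmetic; the helper function name and the token counts are illustrative, and the prices shown are the GPT-4o mini figures from the table (always check the provider's current pricing page before relying on them):

```python
# Estimate the dollar cost of a single request from per-1M-token prices.
# The prices used in the example are the GPT-4o mini rates from the table
# above (in: $0.15, out: $0.60 per 1M tokens); treat them as illustrative.

def request_cost(input_tokens: int, output_tokens: int,
                 in_price_per_1m: float, out_price_per_1m: float) -> float:
    """Return the cost in dollars for one request."""
    return (input_tokens * in_price_per_1m
            + output_tokens * out_price_per_1m) / 1_000_000

# Example: 2,000 input tokens and 500 output tokens with GPT-4o mini pricing.
cost = request_cost(2_000, 500, 0.15, 0.60)
print(f"${cost:.6f}")  # (2000*0.15 + 500*0.60) / 1e6 = $0.000600
```

Note that for the Gemini 1.5 models the in/out rates themselves change once a prompt exceeds 128k tokens, and the Perplexity Sonar models add a flat $0.005 per-request fee on top of the token cost.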
<aside> 💡 To request a specific model or enquire about using a fine-tuned model reach out to [[email protected]](mailto:[email protected]?subject=Model%20request)
</aside>