Fastest LLM inference — Llama, Mixtral, Gemma at unprecedented speed.
Groq provides the fastest LLM inference API. Run open-source models like Llama 3, Mixtral, and Gemma with industry-leading speed on custom LPU hardware.
- Click "Try It" above to test the API in the playground
- Click "Add to Agent" to get your API key and integrate
Get started quickly with these code examples in your favorite language
```shell
curl -X GET \
  'https://callio.app/api/proxy/groq/forward?target=https%3A%2F%2Fapi.groq.com%2Fopenai%2Fv1%2Fendpoint' \
  -H 'Authorization: Bearer YOUR_CALLIO_KEY' \
  -H 'Content-Type: application/json'
```

💡 Tip: Replace YOUR_CALLIO_KEY with your actual Callio API key from the dashboard.
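The same request can be built in Python. This is a minimal sketch, assuming the proxy forwards to whatever Groq path you percent-encode into the `target` query parameter, as the curl example shows; the `chat/completions` path and the `proxied_url` helper name are illustrative:

```python
# Sketch: constructing a Callio proxy URL that forwards to a Groq API path.
# Assumption: the proxy accepts the full upstream Groq URL, percent-encoded,
# in the `target` query parameter (mirroring the curl example above).
from urllib.parse import urlencode

CALLIO_PROXY = "https://callio.app/api/proxy/groq/forward"
GROQ_BASE = "https://api.groq.com/openai/v1"

def proxied_url(groq_path: str) -> str:
    """Return the Callio proxy URL forwarding to the given Groq API path."""
    target = f"{GROQ_BASE}/{groq_path}"
    # urlencode percent-encodes the target URL, e.g. '/' -> '%2F'
    return f"{CALLIO_PROXY}?{urlencode({'target': target})}"

# Example: Groq's OpenAI-compatible chat completions endpoint
url = proxied_url("chat/completions")
headers = {
    "Authorization": "Bearer YOUR_CALLIO_KEY",  # replace with your Callio key
    "Content-Type": "application/json",
}
```

Pass `url` and `headers` to your HTTP client of choice (e.g. `requests.post`) along with a JSON body naming the model and messages.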
Test endpoints live or generate your API key and start building in minutes