With version api-scheduler:2024-07-25-0b303
of our inference stack API-scheduler, we now support a /chat/completions
endpoint. Given a conversation history and a prompt, it asks a chat-capable LLM to generate a continuation of the conversation. The endpoint is available for all models that support the chat capability and is compatible with OpenAI's /chat/completions endpoint.
Documentation for the endpoint can be found at https://docs.aleph-alpha.com/api/chat-completions/.
Currently, the endpoint supports the following models:
- llama-3-8b-instruct
- llama-3-70b-instruct
- llama-2-7b-chat
- llama-2-13b-chat
- llama-2-70b-chat
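
Since the endpoint follows the OpenAI /chat/completions schema, a request can be sketched as below. This is a minimal example, not an official client: the base URL, token placeholder, and response handling are assumptions; consult the documentation linked above for the authoritative request format.

```python
import json
import urllib.request

# Assumed base URL and placeholder token -- substitute your deployment's values.
API_URL = "https://api.aleph-alpha.com/chat/completions"
API_TOKEN = "YOUR_TOKEN"

# OpenAI-compatible request body: a supported model name plus a list of
# chat messages, each with a role ("system", "user", "assistant") and content.
payload = {
    "model": "llama-3-8b-instruct",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize the plot of Hamlet in one sentence."},
    ],
}

request = urllib.request.Request(
    API_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {API_TOKEN}",
        "Content-Type": "application/json",
    },
)

# Uncomment to actually send the request; the assistant's reply sits in
# the first choice of the response, as in OpenAI's schema:
# with urllib.request.urlopen(request) as response:
#     reply = json.loads(response.read())
#     print(reply["choices"][0]["message"]["content"])
```

The same payload works with any of the models listed above; only the "model" field changes.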