Completion() | liteLLM

📄️ Input Params

Common Params

📄️ Prompt Formatting

LiteLLM automatically translates the OpenAI ChatCompletions prompt format, to other models. You can control this by setting a custom prompt template for a model as well.

📄️ Output

Format

📄️ Exception Mapping

LiteLLM maps exceptions across all providers to their OpenAI counterparts.

📄️ Streaming + Async

- Streaming Responses

📄️ Trimming Input Messages

Use litellm.trim_messages() to ensure messages does not exceed a model's token limit or specified max_tokens

📄️ Function Calling

LiteLLM only supports: OpenAI gpt-4-0613 and gpt-3.5-turbo-0613 for function calling

📄️ Model Alias

The model name you show an end-user might be different from the one you pass to LiteLLM - e.g. Displaying GPT-3.5 while calling gpt-3.5-turbo-16k on the backend.

📄️ Reliability

Helper utils

📄️ Model Config

Model-specific changes can make our code complicated, making it harder to debug errors. Use model configs to simplify this.

📄️ Batching Completion()

LiteLLM allows you to:

📄️ Mock Completion() Responses - Save Testing Costs 💰

For testing purposes, you can use completion() with mock_response to mock calling the completion endpoint.