Large Language Models (LLM)
This project supports multiple large language model backends and models.
Almost all large language model (LLM) APIs and inference engines support the OpenAI format. So, if an LLM API you want to use isn't explicitly supported by this project, you can usually just fill in the relevant information (base URL, API key, model name) under the `openai_compatible_llm` section, and it should work right away.
How to configure and switch between different large language model backends
The project's default agent is `basic_memory_agent`, so to switch the language model for the default agent, set the `llm_provider` option of `basic_memory_agent`.
1. Configure large language model settings
Refer to the Supported Large Language Model Backends section below to configure the corresponding large language model backend.
Under `llm_configs`, you can configure the connection settings for the backends and the various LLMs.
2. Switch to the corresponding large language model (LLM) in the settings of the respective agent
Some agents may not support custom LLMs
Go to the `basic_memory_agent` settings:

```yaml
basic_memory_agent:
  llm_provider: "openai_compatible_llm" # LLM backend to use
  faster_first_response: True
```
Note that `llm_provider` can only be set to large language model backends that exist under `llm_configs`. Currently, only `openai_compatible_llm` is supported.
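Putting the two steps together, the relevant parts of the configuration might look like this (the surrounding nesting may differ in your config file; the keys shown are the ones described in this document):

```yaml
# Step 1: backend connection settings live under llm_configs
llm_configs:
  openai_compatible_llm:
    base_url: "https://api.moonshot.cn/v1"
    llm_api_key: "your-moonshot-api-key"
    model: "kimi-k2-turbo-preview"
    temperature: 1.0

# Step 2: the agent selects one of the backends defined above by name
basic_memory_agent:
  llm_provider: "openai_compatible_llm"
  faster_first_response: True
```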
Supported Large Language Model Backends
OpenAI Compatible API (openai_compatible_llm)
Compatible with all API endpoints that support the OpenAI Chat Completion format. This includes LM Studio, vLLM, Ollama, OpenAI Official API, Gemini, Zhipu, DeepSeek, Mistral, Groq, and most inference tools and API providers.
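For example, to point the same section at a local Ollama server instead, you can use Ollama's OpenAI-compatible endpoint, which is served at `http://localhost:11434/v1` by default (the API key can be any placeholder string, since Ollama does not check it; the model name below is just an example and must be a model you have pulled locally):

```yaml
openai_compatible_llm:
  base_url: "http://localhost:11434/v1" # local Ollama, OpenAI-compatible endpoint
  llm_api_key: "ollama"                 # placeholder; Ollama ignores the key
  model: "qwen2.5:7b"                   # example; use any locally pulled model
  temperature: 1.0
```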
Default Configuration (Kimi)
The project uses Moonshot AI's Kimi model by default. Kimi is a large language model developed by Moonshot AI, supporting ultra-long context windows with powerful performance and high-speed output capabilities.
Default Configuration Example:
```yaml
# OpenAI compatible inference backend (default: Kimi)
openai_compatible_llm:
  base_url: "https://api.moonshot.cn/v1" # Moonshot AI API endpoint
  llm_api_key: "your-moonshot-api-key"   # Your Moonshot API key
  model: "kimi-k2-turbo-preview"         # Model to use
  temperature: 1.0                       # Temperature, between 0 and 2
```
- Visit the Moonshot AI Platform
- Register and log in
- Create an API key in the console
- Fill in the API key in the `llm_api_key` field
For more information, refer to the Moonshot AI Official Documentation.
`kimi-k2-turbo-preview` is a high-speed version of the Kimi K2 model, with output speed increased from 10 tokens per second to 40 tokens per second while maintaining the same performance parameters as the original K2 model.