Large Language Models (LLM)
This project supports multiple large language model backends and models.
Almost all large language model (LLM) APIs and inference engines support the OpenAI format. So, if an LLM API you want to use isn't explicitly supported by this project, you can usually just fill in the relevant information (base URL, API key, model name) under the `openai_compatible_llm` section, and it should work right away.
How to configure and switch between different large language model backends
The project's default agent is `basic_memory_agent`, so to switch the language model for the default agent, set the `llm_provider` option of `basic_memory_agent`.
1. Configure large language model settings
Refer to the Supported Large Language Model Backends section below to configure the corresponding large language model backend.
Under `llm_configs`, you can configure the connection settings for the backends and the various LLMs.
2. Switch to the corresponding large language model (LLM) in the settings of the respective agent
Some agents may not support custom LLMs
Go to the `basic_memory_agent` settings:

```yaml
basic_memory_agent:
  llm_provider: "openai_compatible_llm" # LLM backend to use
  faster_first_response: True
```
Note that `llm_provider` can only be set to large language model backends that exist under `llm_configs`. Currently, only `openai_compatible_llm` is supported.
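Putting the two steps together, the relevant parts of the configuration might look like this (the surrounding nesting may differ in your config file; the keys shown are the ones described in this document):

```yaml
# Step 1: backend connection settings live under llm_configs
llm_configs:
  openai_compatible_llm:
    base_url: "https://api.moonshot.cn/v1"
    llm_api_key: "your-moonshot-api-key"
    model: "kimi-k2-turbo-preview"
    temperature: 1.0

# Step 2: the agent selects one of the backends defined above by name
basic_memory_agent:
  llm_provider: "openai_compatible_llm"
  faster_first_response: True
```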
Supported Large Language Model Backends
OpenAI Compatible API (openai_compatible_llm)
Compatible with all API endpoints that support the OpenAI Chat Completion format. This includes LM Studio, vLLM, Ollama, OpenAI Official API, Gemini, Zhipu, DeepSeek, Mistral, Groq, and most inference tools and API providers.
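For example, to point the same section at a local Ollama server instead, you can use Ollama's OpenAI-compatible endpoint, which is served at `http://localhost:11434/v1` by default (the API key can be any placeholder string, since Ollama does not check it; the model name below is just an example and must be a model you have pulled locally):

```yaml
openai_compatible_llm:
  base_url: "http://localhost:11434/v1" # local Ollama, OpenAI-compatible endpoint
  llm_api_key: "ollama"                 # placeholder; Ollama ignores the key
  model: "qwen2.5:7b"                   # example; use any locally pulled model
  temperature: 1.0
```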
Default Configuration (Kimi)
The project uses Moonshot AI's Kimi model by default. Kimi is a large language model developed by Moonshot AI, supporting ultra-long context windows with powerful performance and high-speed output capabilities.
Default Configuration Example:
```yaml
# OpenAI compatible inference backend (default: Kimi)
openai_compatible_llm:
  base_url: "https://api.moonshot.cn/v1" # Moonshot AI API endpoint
  llm_api_key: "your-moonshot-api-key"   # Your Moonshot API key
  model: "kimi-k2-turbo-preview"         # Model to use
  temperature: 1.0                       # Temperature, between 0 and 2
```
- Visit the Moonshot AI Platform
- Register and log in
- Create an API key in the console
- Fill in the API key in the `llm_api_key` field
For more information, refer to the Moonshot AI Official Documentation.
`kimi-k2-turbo-preview` is a high-speed version of the Kimi K2 model, with output speed increased from 10 tokens per second to 40 tokens per second while maintaining the same performance parameters as the original K2 model.