05 - LM Studio Getting-Started Tutorial, Chinese Edition (including methods and code for removing the chain of thought)

AI悦创 (original), 2025/10/26

1. Install the library

lmstudio-python is published on PyPI, so you can install it with pip.

pip install lmstudio

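2. Quick example: chatting with a model

2.1 Install a model from the terminal

The code below requires the qwen3-4b-2507 model. If you do not have it yet, run the following command in a terminal to download it.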

lms get qwen/qwen3-4b-2507

For more about lms get, see the LM Studio CLI documentation.

2.2 Basic usage example

import lmstudio as lms
model = lms.llm("qwen/qwen3-4b-2507")
result = model.respond("What is the meaning of life?")
print(result)

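4. List the loaded models and use them

4.1 List the loaded models

Local

import lmstudio as lms

# list_loaded_models() returns the models currently loaded in LM Studio,
# not every model downloaded to disk.
loaded_models = lms.list_loaded_models()
for idx, model in enumerate(loaded_models):
    print(f"{idx:>3} {model}")

LAN API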

import lmstudio as lms

SERVER_API_HOST = "192.168.3.17:19978"

lms.configure_default_client(SERVER_API_HOST)
# Optional: query a specific model instead of listing the loaded ones.
# model = lms.llm("openai/gpt-oss-20b")
# result = model.respond("说出三种水果")  # "Name three fruits"
# print(result)
loaded_models = lms.list_loaded_models()
for idx, model in enumerate(loaded_models):
    print(f"{idx:>3} {model}")

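What {idx:>3} means

This part is the format specification applied to the variable idx.

: starts the format specification
> means right-aligned
3 means a field width of 3 characters

So {idx:>3} means: print idx right-aligned in a field 3 characters wide, padding any unused positions on the left with spaces. Values wider than 3 characters are printed in full, not truncated.

For example:

idx	Output
1	`'  1'`
23	`' 23'`
456	`'456'`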
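To see the format specification in action, here is a small self-contained check in plain Python (no LM Studio needed). The `<` and `^` variants are not used in this tutorial and are shown only for comparison.

```python
# Right-align numbers in a 3-character field, as in f"{idx:>3}".
for idx in (1, 23, 456, 7890):
    # repr() adds quotes so the padding spaces are visible.
    print(repr(f"{idx:>3}"))

# For comparison: left-aligned and centered in the same width.
left = f"{7:<3}"
centered = f"{7:^3}"
print(repr(left), repr(centered))
```

The last value in the loop, 7890, is wider than the field, so it is printed in full.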

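4.2 Use a model

The code below shows how LM Studio remembers a multi-turn conversation: every user message and every reply is appended to the same Chat object, which is passed to the model again on each turn.

model = loaded_models[0]  # reuse the list from section 4.1
# The string passed to lms.Chat is the system prompt.
chat = lms.Chat("你要简明扼要地回答问题")  # "Answer concisely"
# Creating the Chat again simply replaces the previous object and starts with an empty history.
chat = lms.Chat("你要简明扼要地回答问题")
# 1
chat.add_user_message("说出三种水果")  # "Name three fruits"
print(model.respond(chat, on_message=chat.append))
# 2
chat.add_user_message("再说出三种水果")  # "Name three more fruits"
print(model.respond(chat, on_message=chat.append))
# 3
chat.add_user_message("你已经告诉了我多少种水果？")  # "How many fruits have you told me so far?"
print(model.respond(chat, on_message=chat.append))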

# ---output---
<|channel|>analysis<|message|>User asks "说出三种水果" meaning "name three fruits". Just answer with three fruit names.<|end|><|start|>assistant<|channel|>final<|message|>苹果、香蕉、西瓜。
<|channel|>analysis<|message|>Need 3 more fruits.<|end|><|start|>assistant<|channel|>final<|message|>葡萄、橙子、芒果。
<|channel|>analysis<|message|>Count: first 3 + second 3 =6.<|end|><|start|>assistant<|channel|>final<|message|>你已被告知 **六** 种水果。
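The raw output above still contains the model's `analysis` channel (its chain of thought) in front of the answer. The following is a minimal sketch for keeping only the text after the `final` marker, assuming the markers appear exactly as in the output above. `keep_final` is a hypothetical helper written for this illustration and is not part of lmstudio.

```python
import re

# Everything after the "final" channel marker is the user-facing answer.
# re.DOTALL lets "." match newlines, so multi-line answers are kept whole.
FINAL_RE = re.compile(r"<\|channel\|>final<\|message\|>(.*)", re.DOTALL)


def keep_final(raw: str) -> str:
    """Return only the final answer; fall back to the raw text if no marker is found."""
    match = FINAL_RE.search(raw)
    return match.group(1).strip() if match else raw.strip()


# Sample taken from the output shown above.
raw = (
    "<|channel|>analysis<|message|>Need 3 more fruits.<|end|>"
    "<|start|>assistant<|channel|>final<|message|>葡萄、橙子、芒果。"
)
print(keep_final(raw))
```

In a real script you would pass the response text to `keep_final` before printing it. Text without any marker is returned unchanged, so the helper is harmless for models that do not emit these channels.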

5. Streaming responses

import lmstudio as lms
model = lms.llm()  # no argument: use any currently loaded model

for fragment in model.respond_stream("What is the meaning of life?"):
    print(fragment.content, end="", flush=True)
print() # Advance to a new line at the end of the response

6. Managing chat context

import lmstudio as lms

# Create a chat with an initial system prompt.
chat = lms.Chat("You are a resident AI philosopher.")

# Build the chat context by adding messages of relevant types.
chat.add_user_message("What is the meaning of life?")
# ... continued in next example

Basic

# The `chat` object is created in the previous step.
result = model.respond(chat)

print(result)

Streaming

# The `chat` object is created in the previous step.
prediction_stream = model.respond_stream(chat)

for fragment in prediction_stream:
    print(fragment.content, end="", flush=True)
print() # Advance to a new line at the end of the response

7. Customizing inference parameters

Basic

result = model.respond(chat, config={
    "temperature": 0.6,
    "maxTokens": 50,
})

Streaming

prediction_stream = model.respond_stream(chat, config={
    "temperature": 0.6,
    "maxTokens": 50,
})

8. Printing resource usage information

Basic

# `result` is the response returned by the model.
print("Model used:", result.model_info.display_name)
print("Predicted tokens:", result.stats.predicted_tokens_count)
print("Time to first token (seconds):", result.stats.time_to_first_token_sec)
print("Stop reason:", result.stats.stop_reason)

# ---output---
Model used: OpenAI's gpt-oss 20B
Predicted tokens: 40
Time to first token (seconds): 0.263
Stop reason: eosFound

Streaming

# After iterating through the prediction fragments,
# the overall prediction result may be obtained from the stream
result = prediction_stream.result()

print("Model used:", result.model_info.display_name)
print("Predicted tokens:", result.stats.predicted_tokens_count)
print("Time to first token (seconds):", result.stats.time_to_first_token_sec)
print("Stop reason:", result.stats.stop_reason)

9. Multi-turn conversation

Multi-turn chat over the LAN

import lmstudio as lms

SERVER_API_HOST = "192.168.3.17:19978"

lms.configure_default_client(SERVER_API_HOST)

model = lms.llm("openai/gpt-oss-20b")
chat = lms.Chat("You are a task focused AI assistant")  # system prompt (role setup)

while True:
    user_input = input("You (leave blank to exit): ")

    if not user_input:
        break
    chat.add_user_message(user_input)
    prediction = model.respond(
        chat,
        on_message=chat.append,
    )
    print("Bot: ", end="", flush=True)
    print(prediction)
    print()
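The loop above assumes the server at SERVER_API_HOST is reachable. As a quick pre-flight check, here is a sketch that uses only the standard library. `split_host_port` and `is_listening` are hypothetical helpers, not part of lmstudio, and a successful TCP connection only shows that something is listening on that port, not that it is LM Studio.

```python
import socket


def split_host_port(api_host: str) -> tuple[str, int]:
    """Split a "host:port" string into its host and integer port."""
    host, _, port = api_host.rpartition(":")
    return host, int(port)


def is_listening(api_host: str, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to api_host succeeds within the timeout."""
    host, port = split_host_port(api_host)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False


SERVER_API_HOST = "192.168.3.17:19978"  # same address as in the example above
print(split_host_port(SERVER_API_HOST))
```

Calling `is_listening(SERVER_API_HOST)` before `lms.configure_default_client(SERVER_API_HOST)` lets the script print a clear message instead of failing later inside the chat loop.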

base-code

import lmstudio as lms

model = lms.llm()
chat = lms.Chat("You are a task focused AI assistant")

while True:
    try:
        user_input = input("You (leave blank to exit): ")
    except EOFError:
        print()
        break
    if not user_input:
        break
    chat.add_user_message(user_input)
    prediction_stream = model.respond_stream(
        chat,
        on_message=chat.append,
    )
    print("Bot: ", end="", flush=True)
    for fragment in prediction_stream:
        print(fragment.content, end="", flush=True)
    print()

Annotated code

import lmstudio as lms      # Import the lmstudio library under the alias lms

model = lms.llm()           # Get a handle to a large language model; it generates the replies below
chat = lms.Chat("You are a task focused AI assistant")
                           # Create the conversation object with its initial system prompt,
                           # which tells the model to act as a task-focused AI assistant.

while True:                # Loop forever, talking to the user until they exit
    try:
        user_input = input("You (leave blank to exit): ")
                           # Show the prompt and wait for the user to type a line and press Enter;
                           # the typed text is stored in user_input.
                           # The prompt tells the user that a blank line (just Enter) exits.
    except EOFError:       # Input ended (for example, a closed pipe or Ctrl+D)
        print()            # Print a newline to keep the terminal output tidy
        break              # Leave the while loop and end the program

    if not user_input:     # Empty string: the user pressed Enter without typing anything
        break              # Leave the loop and end the program

    chat.add_user_message(user_input)
                           # Append the user's message to the conversation history.

    prediction_stream = model.respond_stream(
        chat,              # Pass the whole conversation so the model answers with full history
        on_message=chat.append,
                           # Callback invoked with the complete assistant message once the
                           # response is finished; chat.append adds it to the history.
    )

    print("Bot: ", end="", flush=True)
                           # Print the "Bot: " prefix. end="" suppresses the newline;
                           # flush=True forces the text out of the buffer immediately.

    for fragment in prediction_stream:
                           # prediction_stream is iterable; each iteration yields a small
                           # piece of the reply (a fragment).
        print(fragment.content, end="", flush=True)
                           # Print each fragment's text as soon as it arrives. end="" keeps the
                           # pieces on one line, and flush=True produces the "live typing" effect.

    print()                # Final newline so the next prompt starts on a fresh line
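The streaming loop above prints every fragment as it arrives, so a model that emits channel markers will also stream its chain of thought. The following sketch filters a stream down to the final answer, assuming the `<|channel|>final<|message|>` marker seen in the gpt-oss output of section 4.2. `final_only` is a hypothetical helper, and the fragments here are simulated strings so the snippet runs without a server.

```python
from typing import Iterable, Iterator

FINAL_MARKER = "<|channel|>final<|message|>"


def final_only(fragments: Iterable[str]) -> Iterator[str]:
    """Yield only the text that follows FINAL_MARKER in a stream of text pieces."""
    buffer = ""
    seen_marker = False
    for text in fragments:
        if seen_marker:
            yield text  # already past the marker: pass text straight through
            continue
        buffer += text  # the marker may be split across several fragments
        _, marker, tail = buffer.partition(FINAL_MARKER)
        if marker:
            seen_marker = True
            if tail:
                yield tail
    if not seen_marker:
        yield buffer  # no marker at all: fall back to the raw text


# Simulated fragments; the marker is deliberately split across two pieces.
pieces = [
    "<|channel|>analysis<|message|>Need 3 fruits.<|end|>",
    "<|start|>assistant<|channel|>fin",
    "al<|message|>",
    "苹果、",
    "香蕉。",
]
print("".join(final_only(pieces)))
```

With a real stream you could wrap the iteration as `final_only(fragment.content for fragment in prediction_stream)` and print each yielded piece with `end=""` and `flush=True`. Nothing is printed until the marker has arrived, so the answer starts later than with the unfiltered loop.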

Limiting tokens

import lmstudio as lms

SERVER_API_HOST = "192.168.3.17:19978"

lms.configure_default_client(SERVER_API_HOST)

model = lms.llm("openai/gpt-oss-20b")
chat = lms.Chat("You are a task focused AI assistant")  # system prompt (role setup)

while True:
    user_input = input("You (leave blank to exit): ")

    if not user_input:
        break
    chat.add_user_message(user_input)
    prediction = model.respond(
        chat,
        on_message=chat.append,
        config={
            "maxTokens": 50,
        }
    )
    print("Bot: ", end="", flush=True)
    print(prediction)
    print()

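11. Cancelling a prediction

Sometimes the model takes a very long time to answer, or you simply want to abort a question you have already asked. In those cases you can cancel the in-progress prediction.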
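A minimal sketch of one cancellation pattern: stop a streamed reply once it exceeds a character budget. It assumes the stream object exposes a `cancel()` method, which is how the lmstudio-python documentation describes the object returned by `respond_stream`; check the official docs for your version. `FakeStream` and `read_with_budget` are hypothetical names used so the snippet runs without a server.

```python
class FakeStream:
    """Hypothetical stand-in for a prediction stream, so this runs offline."""

    def __init__(self, pieces):
        self._pieces = pieces
        self.cancelled = False

    def __iter__(self):
        for piece in self._pieces:
            if self.cancelled:
                return  # a cancelled stream stops producing fragments
            yield piece

    def cancel(self):
        self.cancelled = True


def read_with_budget(stream, get_text, max_chars):
    """Collect streamed text and cancel the stream once max_chars is reached."""
    collected = []
    total = 0
    cancel_sent = False
    for fragment in stream:
        text = get_text(fragment)
        collected.append(text)
        total += len(text)
        if total >= max_chars and not cancel_sent:
            stream.cancel()
            cancel_sent = True
            # Keep iterating: the stream ends on its own after cancel().
    return "".join(collected)


stream = FakeStream(["abc", "def", "ghi", "jkl"])
result = read_with_budget(stream, lambda piece: piece, max_chars=5)
print(result)
```

Under that assumption, real usage would look like `read_with_budget(model.respond_stream(chat), lambda f: f.content, 200)`. The loop keeps iterating after `cancel()` instead of breaking out, so the stream can finish cleanly.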

More information

See the official documentation: https://lmstudio.ai/docs/python/getting-started/project-setup


Update log

Last updated: 2025/11/27 20:09


Contributors

AndersonHJB, AI悦创
