此页是 2026-05-19 的观测快照,查看该模型当前信息 → /m/unsloth__qwen36-35b-a3b-gguf/

归档 / 2026-05-19 / Qwen3.6-35B-A3B-GGUF (Unsloth)

Qwen3.6-35B-A3B-GGUF (Unsloth)

支持MTP加速的3B激活参数量化视觉语言模型，面向本地编程代理

入选理由: GGUF 格式可直接用 llama.cpp 部署，下载量 23 万+，是 Qwen3.6 量化版，非新能力。
对位: 对位 Qwen3.5-35B-A3B、Gemma4-26B-A4B
适合: 本地多模态推理与编程代理 (MTP加速) / 视觉问答、文档理解与工具调用
不适合: 要求原始精度的量化敏感场景
规模: 35B (3B active, Q4_K_XL) · 262k (可扩展至1M)
授权: Apache-2.0 · 可商用
框架: llama.cpp / ollama / Unsloth Studio
血统: 量化自 Qwen3.6-35B-A3B
可信度: HuggingFace 23.7万下载，Qwen官方Apache-2.0，Unsloth提供原生MTP GGUF量化

社区实测

社区普遍认为这是 Qwen3.5 的明显升级，工具调用能力在低比特量化下依然出色，Unsloth 的 Dynamic 量化质量在同类 GGUFs 中排名靠前，适合消费级硬件本地部署；但在纯 CPU 场景下推理速度可能比部分其他来源的 GGUF 慢约 30%。

2-bit 量化即可完成 30+ 次工具调用、搜索 20 个站点并执行 Python 代码，仅需 13GB RAM
1-bit GGUF 的工具调用表现依然很好
SVG 绘图质量在特定场景下可媲美甚至超过 Claude Opus 4.7
开放式探索搜索任务相比 Qwen3.5 有明显提升
Unsloth Dynamic 量化在 KL 散度基准上达到 SOTA（99.9%），覆盖编码、聊天、工具调用、科学、非拉丁文字等场景
支持广泛的推理框架（llama.cpp、vLLM、SGLang、LM Studio、Docker Model Runner 等）
可在 RTX 4060 8GB 显存上运行并获得不错的评估结果

纯 CPU 推理时 Unsloth GGUF 比同等大小的其他来源 GGUF 慢约 30%，后续响应处理时间也更长
量化模型中存在 ssm_conv1d 张量漂移问题
SVG 细节仍有瑕疵（如火烈鸟坐在轮胎上而非车座上）
不同用户的体验差异较大

来源

2-bit Qwen3.6-35B-A3B GGUF is amazing! Made 30+ successful tool calls : r/unsloth I've been running this on my laptop with the Unsloth 20.9GB GGUF in LM Studio: h... | Hacker News Qwen3.6-35B-A3B GGUF Performance Benchmarks. : r/unsloth Qwen3.6-35B-A3B GGUF from Unsloth is quite a bit slower? : r/LocalLLaMA unsloth/Qwen3.6-35B-A3B-GGUF - Hugging Face Qwen 3.6 35B A3B GGUF Quality Benchmark: unsloth, bartowski, lmstudio-community, ggml-org, mudler, AesSedai compared Qwen3.6-35B-A3B-Uncensored-Wasserstein-GGUF : r/LocalLLaMA Getting Crazy Eval using Unsloth Qwen3.6 35B A3B on a 4060 with ...Qwen3.6-35-A3B outperforms on 21 of 22 model sizes in GGUF ...Qwen3.6 - How to Run Locally | Unsloth Documentation Qwen3.6-35B-A3B on my laptop drew me a better pelican than ...Qwen3.5 GGUF Benchmarks | Unsloth Documentation

截至 2026-06-21

快速上手示例

llama-server -hf unsloth/Qwen3.6-35B-A3B-MTP-GGUF:UD-Q4_K_XL --spec-type draft-mtp --spec-draft-n-max 6

依赖版本和硬件参数请以源仓库说明为准。

评分详情

Q1: 今天能接上用吗 5 / 5
Q2: 有可信证据吗 5 / 5
Q3: 是新东西吗 1 / 5
总分: 11

HuggingFace 原始数据 (抓取于 2026-05-19)

作者: unsloth
任务类型: image-text-to-text
推理库: transformers
下载: 237,613
点赞: 249
许可证: Apache-2.0
标签: transformers, gguf, unsloth, qwen, qwen3_5_moe, image-text-to-text, base_model:Qwen/Qwen3.6-35B-A3B, base_model:quantized:Qwen/Qwen3.6-35B-A3B, license:apache-2.0, endpoints_compatible, region:us, imatrix, conversational

探索