<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Expert Networks on Answer</title>
    <link>https://answer.freetools.me/tags/%E4%B8%93%E5%AE%B6%E7%BD%91%E7%BB%9C/</link>
    <description>Recent content in Expert Networks on Answer</description>
    <generator>Hugo -- 0.152.2</generator>
    <language>zh-cn</language>
    <lastBuildDate>Sun, 08 Mar 2026 13:47:29 +0800</lastBuildDate>
    <atom:link href="https://answer.freetools.me/tags/%E4%B8%93%E5%AE%B6%E7%BD%91%E7%BB%9C/index.xml" rel="self" type="application/rss+xml" />
    <item>
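      <title>Why Does a Model with Hundreds of Billions of Parameters Activate Only Tens of Billions? The Thirty-Year Technical Breakthrough of the MoE Architecture</title>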
      <link>https://answer.freetools.me/%E4%B8%BA%E4%BB%80%E4%B9%88%E5%8D%83%E4%BA%BF%E5%8F%82%E6%95%B0%E7%9A%84%E6%A8%A1%E5%9E%8B%E5%8F%AA%E9%9C%80%E6%BF%80%E6%B4%BB%E7%99%BE%E4%BA%BFmoe%E6%9E%B6%E6%9E%84%E7%9A%84%E4%B8%89%E5%8D%81%E5%B9%B4%E6%8A%80%E6%9C%AF%E7%AA%81%E5%9B%B4/</link>
      <pubDate>Sun, 08 Mar 2026 13:47:29 +0800</pubDate>
      <guid>https://answer.freetools.me/%E4%B8%BA%E4%BB%80%E4%B9%88%E5%8D%83%E4%BA%BF%E5%8F%82%E6%95%B0%E7%9A%84%E6%A8%A1%E5%9E%8B%E5%8F%AA%E9%9C%80%E6%BF%80%E6%B4%BB%E7%99%BE%E4%BA%BFmoe%E6%9E%B6%E6%9E%84%E7%9A%84%E4%B8%89%E5%8D%81%E5%B9%B4%E6%8A%80%E6%9C%AF%E7%AA%81%E5%9B%B4/</guid>
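      <description>An in-depth look at the principles and evolution of the Mixture of Experts (MoE) architecture. From the early theoretical groundwork of Jordan and Jacobs in 1991 to the 2024 design of DeepSeek-V3, which activates only 37B of its 671B total parameters, the article systematically explains the core mechanisms of MoE: sparse activation, gated routing, and load balancing. It covers milestone models such as Switch Transformer, Mixtral 8x7B, and GShard, and analyzes expert specialization, the challenges of distributed training, and the technical advance of auxiliary-loss-free load balancing strategies.</description>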
    </item>
  </channel>
</rss>
