<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>LLM Principles on Answer</title>
    <link>https://answer.freetools.me/tags/%E5%A4%A7%E6%A8%A1%E5%9E%8B%E5%8E%9F%E7%90%86/</link>
    <description>Recent content in LLM Principles on Answer</description>
    <generator>Hugo -- 0.152.2</generator>
    <language>zh-cn</language>
    <lastBuildDate>Thu, 12 Mar 2026 07:50:29 +0800</lastBuildDate>
    <atom:link href="https://answer.freetools.me/tags/%E5%A4%A7%E6%A8%A1%E5%9E%8B%E5%8E%9F%E7%90%86/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Token ID: How Large Models Represent a Word with a Single Number</title>
      <link>https://answer.freetools.me/token-id%E5%A4%A7%E6%A8%A1%E5%9E%8B%E5%A6%82%E4%BD%95%E7%94%A8%E4%B8%80%E4%B8%AA%E6%95%B0%E5%AD%97%E4%BB%A3%E8%A1%A8%E4%B8%80%E4%B8%AA%E8%AF%8D/</link>
      <pubDate>Thu, 12 Mar 2026 07:50:29 +0800</pubDate>
      <guid>https://answer.freetools.me/token-id%E5%A4%A7%E6%A8%A1%E5%9E%8B%E5%A6%82%E4%BD%95%E7%94%A8%E4%B8%80%E4%B8%AA%E6%95%B0%E5%AD%97%E4%BB%A3%E8%A1%A8%E4%B8%80%E4%B8%AA%E8%AF%8D/</guid>
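      <description>From tokenizer vocabulary construction to the embedding lookup table, an in-depth look at the mathematical nature of Token IDs, how they are implemented, and their full lifecycle during model inference. Covers how the BPE algorithm assigns Token IDs, differences in token efficiency across languages, the principle of weight sharing, and how Token IDs affect a model's arithmetic ability and multilingual performance.</description>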
    </item>
    <item>
      <title>The Transformer Feed-Forward Layer: Why This &#34;Supporting Role&#34; Holds Two-Thirds of the Model's Parameters</title>
      <link>https://answer.freetools.me/transformer%E7%9A%84%E5%89%8D%E9%A6%88%E5%B1%82%E4%B8%BA%E4%BB%80%E4%B9%88%E8%BF%99%E4%B8%AA%E9%85%8D%E8%A7%92%E5%8D%A0%E6%8D%AE%E4%BA%86%E6%A8%A1%E5%9E%8B%E4%B8%89%E5%88%86%E4%B9%8B%E4%BA%8C%E7%9A%84%E5%8F%82%E6%95%B0/</link>
      <pubDate>Wed, 11 Mar 2026 23:19:42 +0800</pubDate>
      <guid>https://answer.freetools.me/transformer%E7%9A%84%E5%89%8D%E9%A6%88%E5%B1%82%E4%B8%BA%E4%BB%80%E4%B9%88%E8%BF%99%E4%B8%AA%E9%85%8D%E8%A7%92%E5%8D%A0%E6%8D%AE%E4%BA%86%E6%A8%A1%E5%9E%8B%E4%B8%89%E5%88%86%E4%B9%8B%E4%BA%8C%E7%9A%84%E5%8F%82%E6%95%B0/</guid>
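      <description>An in-depth look at the most underrated component of the Transformer architecture: the feed-forward network (FFN). From the mathematics to the design trade-offs, it explains why this seemingly simple two-layer fully connected network carries most of the model's parameters, and the central role it plays in knowledge storage and feature transformation.</description>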
    </item>
  </channel>
</rss>
