Easy Derivation of Rotary Position Embeddings for Large Language Models
Abstract: Rotary Position Embeddings (RoPE) are widely used in open-source large language models such as LLAMA. The original paper derives the formulas using complex-valued functions. In this paper, I re-derive RoPE's formulas with linear algebra, hoping to better understand the method; I also point out a questionable aspect of the method and offer a suggestion for improvement.
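For readers skimming only this abstract, the following is a minimal sketch of the standard RoPE formulation in its linear-algebra form, following the conventions of the original RoPE paper rather than any notation specific to this one. For each coordinate pair $(x_{2i}, x_{2i+1})$ of a query or key vector of dimension $d$ at position $m$, RoPE applies a plane rotation by the angle $m\theta_i$, with $\theta_i = 10000^{-2i/d}$:

$$
\begin{pmatrix} x'_{2i} \\ x'_{2i+1} \end{pmatrix}
=
\begin{pmatrix} \cos m\theta_i & -\sin m\theta_i \\ \sin m\theta_i & \cos m\theta_i \end{pmatrix}
\begin{pmatrix} x_{2i} \\ x_{2i+1} \end{pmatrix}.
$$

Because rotation matrices satisfy $R(m\theta)^{\top} R(n\theta) = R((n-m)\theta)$, the dot product between a rotated query at position $m$ and a rotated key at position $n$ depends only on the relative offset $n-m$; this relative-position property is what the derivation establishes.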
[V3] 2023-07-12 15:57:43 | ChinaXiv:202307.00071v3
[V2] 2023-07-11 19:26:53 | ChinaXiv:202307.00071v2
[V1] 2023-07-10 16:54:18 | ChinaXiv:202307.00071v1