机器之心 editorial: Just over ten hours ago (on the evening of January 12, per reports dated January 13), DeepSeek released a new paper, "Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models" (基于可扩展查找的条件记忆:大型语言模型稀疏性的新维度).
Open-weight LLMs can unlock significant strategic advantages, delivering customization and independence in an increasingly AI ...
Ollama supports common operating systems and is typically installed via a desktop installer (Windows/macOS) or a ...
Researchers show that LLMs can reproduce copyrighted training data almost verbatim. This means headaches for model providers.
Google announced a technology called CALM that speeds up large language models (like GPT-3 and LaMDA) without compromising performance. Larger Training Data Is Better But Comes ...
This important study introduces a new biology-informed strategy for deep learning models aiming to predict mutational effects in antibody sequences. It provides solid evidence that separating ...
Multimodal large language models have shown powerful abilities to understand and reason across text and images, but their ...
Is the inside of a vision model at all like a language model? Researchers argue that as the models grow more powerful, they ...