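生物通
January
Evaluating large language models (LLMs) in explainable deep reinforcement learning (explainable deep ...
This article evaluates three methods, CoT, MCTS augmentation, and SFT, for generating explanations of reinforcement learning. It finds that MCTS significantly improves the explanation quality of large models in complex environments (such as Lunar Lander), whereas SFT is more effective for small and medium-sized models. Using LLMs as judges, it confirms that the automated evaluation framework agrees closely with human evaluation (Cohen's κ = 0.77, Spearman ρ = 0.88).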