[AI] Meituan's LongCat team releases and open-sources VitaBench, a large-model evaluation benchmark

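Meituan's LongCat team on the 20th officially released VitaBench (Versatile Interactive Tasks Benchmark), an evaluation benchmark for large-model agents that targets complex problems and is designed to closely mirror real-life scenarios. The benchmark has been fully open-sourced.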

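According to the team, VitaBench is built around three high-frequency real-life scenarios: food-delivery ordering, in-restaurant dining, and travel. It provides an interactive evaluation environment containing 66 tools and includes composite tasks that span multiple scenarios. In a travel-planning task, for example, the agent must reason, call tools, and interact with the user to carry the task through to its end state, from purchasing tickets to booking a restaurant.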
(ET Net News Agency, special report, 21st)

