Program

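December 3, Morning
Time Agenda Speaker
Main Forum (Chair: Prof. Lin Zou, Civil Aviation Flight University of China) - Shenshu Hall, 1st Floor
9:00-9:05 Opening Ceremony Prof. Weiping Li, Chief Scientist, Civil Aviation Flight University of China
9:05-9:10 Release of the English-Language Evaluatology Monograph / Founding of the Expert Committee for the Evaluatology Book Series BenchCouncil (International Open Benchmark Council)
9:10-9:55 ICICLE: Intelligent Cyberinfrastructure for Next-Generation AI Applications using Computing Continuum Prof. D.K. Panda, IEEE/ACM Fellow
9:55-10:40 AI: Automation of Intelligence Powered by Data Prof. Aoying Zhou, East China Normal University
10:40-10:50 Tea Break
10:50-10:55 BenchCouncil Press Launch BenchCouncil (International Open Benchmark Council)
10:55-11:40 Results Release: International Research Center for Aviation Evaluatology (Civil Aviation Flight University of China) Prof. Weiping Li, Prof. Lin Zou
11:40-12:00 Results Release: CPU Evaluation Standards Group BenchCouncil CPU Evaluation Standards Working Group
12:00-12:15 2025: Large Model Development in China and Abroad, Computing Infrastructure Comparison, and Application Practice Beijing Anliantong (北京安联通)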
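December 3, Afternoon
Time Agenda Speaker
Main Forum (Chair: Dr. Guoxin Kang, Institute of Computing Technology, Chinese Academy of Sciences) - Shenshu Hall, 1st Floor
14:00-14:45 Results Release: International Research Center for Open Source Evaluatology (East China Normal University) Prof. Wei Wang
14:45-15:30 Recent Advances in European Open-Source AI Systems and Applications Hajdi Cenan & Davor Runje (European AI experts)
15:30-15:40 Tea Break
15:40-16:15 Panel: Topic Selection for the Evaluatology Book Series Prof. Jianfeng Zhan, Prof. Maodeng Li, and other experts
16:15-16:30 A Benchmark Evaluation System for On-Device Large Models on Mobile Phones China Certification & Inspection (Group) Co., Ltd. & China Academy of Information and Communications Technology
16:30-16:45 Results Release: Large Model Evaluation Standards Group BenchCouncil Large Model Evaluation Standards Working Group
16:45-17:00 Database Evaluatology: Evaluating Vector Databases Dr. Guoxin Kang
17:00-17:30 Panel: Science and Technology Evaluation Dr. Fanda Fan, Dr. Guoxin Kang, and domain experts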
December 4, Morning
Time Agenda Speaker
Workshop on AI-Based Publishing - Yucong Meeting Room, 1st Floor
9:45-12:00 BenchCouncil Press: AI-Based Publishing Open discussion
December 4, Afternoon - December 5, Morning
Time Agenda Speaker
Evaluatology Training - Yucong Meeting Room, 1st Floor
December 4, 14:00-17:00 Evaluatology Training, Part I Open training
December 5, 9:00-12:00 Evaluatology Training, Part II Open training
December 4, All Day: Bench Conference Presentations I - Qingcong Meeting Room, 1st Floor
Time Title Speaker
9:00-9:45 Keynote: Challenges to Evaluation from a LET Perspective Prof. Weining Qian, East China Normal University
9:45-10:05 Meta Evaluation Hongxiao Li, ICT, CAS
10:05-10:25 Leveraging Network and Content Features for Open Source Software Value Assessment Wentong Dai, East China Normal University
10:25-10:45 Compiler Tuning Method Based on Program Feature Extraction and Model Prediction Chenghua Xu, University of Science and Technology of China
10:45-11:05 Examining TPC-C Characteristics on Modern E-Commerce Applications Xueyuan Ren, The Ohio State University
11:05-11:25 Multidimensional Identification and Complex System Transmission Pathway Analysis of Scale-up Risks for Sustainable Aviation Fuel (SAF) in China Zhujun Liu, Civil Aviation Flight University of China
11:25-11:45 An Empirical Analysis of Contribution Evaluation in Open Source Courses Using OpenRank Wentong Dai, East China Normal University
11:45-12:05 Relationship Evaluation for Developer Recommendation in Open Source Communities Xuanhao Zhao, East China Normal University
12:05-14:00 Lunch Break
14:00-14:20 Current Status and Future Trends of Evaluation Methods for Artificial Intelligence Chips Xiaotong Yu, Aerospace Science & Industry Defense Technology Research And Test Center
14:20-14:40 Phys-TSGAIN: A Physics-Informed Generative Imputation for Data Completeness Governance of Lithium-Ion Battery Time Series Wei Zuo, Shanghai University
14:40-15:00 Empirical Bias in Theoretical Frameworks: Validation of Distance and Load Assumptions in ICAO Aviation Carbon Emissions Calculation Methodologies Jianxiong Chen, Civil Aviation Flight University of China
15:00-15:20 Auto-tuning Compiler Flags with Pretrained Language Models and Surrogate-guided Search Yinjun Pan, University of Science and Technology of China
15:20-15:40 PerfMamba: Performance Analysis and Pruning of Selective State Space Models Abdullah Al Asif, Iowa State University
15:40-16:00 Review of LLM Jailbreaks: White-Box and Black-Box Perspectives on Attacks, Defenses, and Critical Metrics Shuyuan Liu, East China Normal University
16:00-16:20 OceanNNEval: Benchmark for Three-Dimensional Temperature and Salinity Reconstruction Qingsong Zou, Sun Yat-sen University
December 4, All Day: Bench Conference Presentations II - Baicong Meeting Room, 1st Floor
Time Title Speaker
9:45-10:05 iDATA: An Open-source Vectorization Dataset for AI-EDA Xingquan Li, Pengcheng Laboratory
10:05-10:25 M-CORE: A Dual-Axis Grading Framework for Evaluating the Completeness and Openness of Model Release Units Zhen Zhang, East China Normal University
10:25-10:45 The Global Pulse of Code: A Framework for Evaluating the Globalization of Open Source Projects Jiaheng Peng, East China Normal University
10:45-11:05 Open Source Development Goals: A Comprehensive Framework for Evaluating and Guiding Global Open Source Initiatives Fanyu Han, East China Normal University
11:05-11:25 GeoClaim: Programmable Geoscientific Fact Verification and Judge-Guided Evaluation for Open-Ended Mineral Exploration QA Yuang Zhang, China University of Geosciences
11:25-11:45 AC Bench: An open Artificial intelligence chip performance benchmark tool Qian Zhang, China Academy of Information and Communications Technology
12:05-14:00 Lunch Break
14:00-14:20 OpenChartInsight: A Lightweight Framework for Automatic Interpretation of GitHub Repository Charts Xie Siyi, East China Normal University
14:20-14:40 Dynamic Multi-View RAG Mitigating Hallucinations of Large Language Models in Education Weijun Zhao, China Academy of Information and Communications Technology
14:40-15:00 MicroGen: Agent-Driven Automated Extraction of Realistic Microbenchmarks from Complex Software Systems Fei Tang, Inspur Data Co., Ltd.
15:00-15:20 Systematic Evaluation of Miniaturized Lunar Navigation and Communication Satellite Systems Maodeng Li, Deep Space Exploration Lab
15:20-15:40 FashionAtlas: Enhancing Semantics and Control in Multimodal Fashion Image Editing Enzhen Gu, Beijing Institute of Fashion Technology
15:40-16:00 Research on the Effectiveness Evaluation System of Lunar-Based Near-Earth Asteroid Monitoring Systems Zhiliu Lu, Deep Space Exploration Lab