OpenCompass 大模型评测作业(lesson 7)

作者 : admin 本文共95940个字,预计阅读时间需要240分钟 发布时间: 2024-06-10 共2人阅读

书生·浦语大模型实战系列文章目录

书生·浦语大模型全链路开源体系发展历程和特点(lesson 1)
部署 InternLM2-Chat-1.8B(lesson 2-1)
部署八戒demo InternLM2-Chat-1.8B(lesson 2-2)
部署InternLM2-Chat-7B 模型(lesson 2-3)
部署浦语·灵笔2 模型(lesson 2-4)
部署InternLM Studio“茴香豆”知识助手(lesson 3)
XTuner 微调 LLM: 1.8B、多模态和 Agent(lesson 4)
LMDeploy 量化部署 LLM & VLM 实践(lesson 5)
Lagent & AgentLego 智能体应用搭建(lesson 6)
OpenCompass 大模型评测实战(lesson 7)


OpenCompass 大模型评测作业(lesson 7)

  • 书生·浦语大模型实战系列文章目录
    • 一、环境配置
      • 1.1创建开发机和 conda 环境
      • 1.2 安装
      • 1.3 数据准备
      • 1.4 启动评测 (10% A100 8GB 资源)

OpenCompass 大模型评测作业(lesson 7)插图
模型评估阶段:配置 -> 推理 -> 评估 -> 可视化。

一、环境配置

1.1创建开发机和 conda 环境

选择镜像为 Cuda11.7-conda,并选择 GPU 为10% A100。
OpenCompass 大模型评测作业(lesson 7)插图(1)

1.2 安装

面向GPU的环境安装

studio-conda -o internlm-base -t opencompass
source activate opencompass
git clone -b 0.2.4 https://github.com/open-compass/opencompass
cd opencompass
pip install -e .

OpenCompass 大模型评测作业(lesson 7)插图(2)

如果pip install -e .安装未成功,请运行:

dart pip install -r requirements.txt

1.3 数据准备

解压评测数据集到 data/ 处

cp /share/temp/datasets/OpenCompassData-core-20231110.zip /root/opencompass/
unzip OpenCompassData-core-20231110.zip

OpenCompass 大模型评测作业(lesson 7)插图(3)

查看支持的数据集和模型
列出所有跟 InternLM 及 C-Eval 相关的配置

python tools/list_configs.py internlm ceval

报错了:

(opencompass) root-studio-50092202:~/opencompass# python tools/list_configs.py internlm ceval
Traceback (most recent call last):
  File "/root/opencompass/tools/list_configs.py", line 3, in <module>
    import tabulate
ModuleNotFoundError: No module named 'tabulate'

安装该模块:

(opencompass) root-studio-50092202:~/opencompass# pip install tabulate
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Collecting tabulate
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/40/44/4a5f08c96eb108af5cb50b41f76142f0afa346dfa99d5296fe7202a11854/tabulate-0.9.0-py3-none-any.whl (35 kB)
Installing collected packages: tabulate
Successfully installed tabulate-0.9.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv

安装 tabulate 模块后,再次尝试运行 list_configs.py 脚本

python tools/list_configs.py internlm ceval

依然报错,尝试执行以下代码:

pip install -r requirements.txt

OpenCompass 大模型评测作业(lesson 7)插图(4)

再次尝试运行 list_configs.py 脚本

python tools/list_configs.py internlm ceval

看到以下内容:

(opencompass) root-studio-50092202:~/opencompass# python tools/list_configs.py internlm ceval
+----------------------------------------+----------------------------------------------------------------------+
| Model                                  | Config Path                                                          |
|----------------------------------------+----------------------------------------------------------------------|
| hf_internlm2_1_8b                      | configs/models/hf_internlm/hf_internlm2_1_8b.py                      |
| hf_internlm2_20b                       | configs/models/hf_internlm/hf_internlm2_20b.py                       |
| hf_internlm2_7b                        | configs/models/hf_internlm/hf_internlm2_7b.py                        |
| hf_internlm2_base_20b                  | configs/models/hf_internlm/hf_internlm2_base_20b.py                  |
| hf_internlm2_base_7b                   | configs/models/hf_internlm/hf_internlm2_base_7b.py                   |
| hf_internlm2_chat_1_8b                 | configs/models/hf_internlm/hf_internlm2_chat_1_8b.py                 |
| hf_internlm2_chat_1_8b_sft             | configs/models/hf_internlm/hf_internlm2_chat_1_8b_sft.py             |
| hf_internlm2_chat_20b                  | configs/models/hf_internlm/hf_internlm2_chat_20b.py                  |
| hf_internlm2_chat_20b_sft              | configs/models/hf_internlm/hf_internlm2_chat_20b_sft.py              |
| hf_internlm2_chat_20b_with_system      | configs/models/hf_internlm/hf_internlm2_chat_20b_with_system.py      |
| hf_internlm2_chat_7b                   | configs/models/hf_internlm/hf_internlm2_chat_7b.py                   |
| hf_internlm2_chat_7b_sft               | configs/models/hf_internlm/hf_internlm2_chat_7b_sft.py               |
| hf_internlm2_chat_7b_with_system       | configs/models/hf_internlm/hf_internlm2_chat_7b_with_system.py       |
| hf_internlm2_chat_math_20b             | configs/models/hf_internlm/hf_internlm2_chat_math_20b.py             |
| hf_internlm2_chat_math_20b_with_system | configs/models/hf_internlm/hf_internlm2_chat_math_20b_with_system.py |
| hf_internlm2_chat_math_7b              | configs/models/hf_internlm/hf_internlm2_chat_math_7b.py              |
| hf_internlm2_chat_math_7b_with_system  | configs/models/hf_internlm/hf_internlm2_chat_math_7b_with_system.py  |
| hf_internlm_20b                        | configs/models/hf_internlm/hf_internlm_20b.py                        |
| hf_internlm_7b                         | configs/models/hf_internlm/hf_internlm_7b.py                         |
| hf_internlm_chat_20b                   | configs/models/hf_internlm/hf_internlm_chat_20b.py                   |
| hf_internlm_chat_7b                    | configs/models/hf_internlm/hf_internlm_chat_7b.py                    |
| hf_internlm_chat_7b_8k                 | configs/models/hf_internlm/hf_internlm_chat_7b_8k.py                 |
| hf_internlm_chat_7b_v1_1               | configs/models/hf_internlm/hf_internlm_chat_7b_v1_1.py               |
| internlm_7b                            | configs/models/internlm/internlm_7b.py                               |
| lmdeploy_internlm2_chat_20b            | configs/models/hf_internlm/lmdeploy_internlm2_chat_20b.py            |
| lmdeploy_internlm2_chat_7b             | configs/models/hf_internlm/lmdeploy_internlm2_chat_7b.py             |
| ms_internlm_chat_7b_8k                 | configs/models/ms_internlm/ms_internlm_chat_7b_8k.py                 |
+----------------------------------------+----------------------------------------------------------------------+
+--------------------------------+------------------------------------------------------------------+
| Dataset                        | Config Path                                                      |
|--------------------------------+------------------------------------------------------------------|
| ceval_clean_ppl                | configs/datasets/ceval/ceval_clean_ppl.py                        |
| ceval_contamination_ppl_810ec6 | configs/datasets/contamination/ceval_contamination_ppl_810ec6.py |
| ceval_gen                      | configs/datasets/ceval/ceval_gen.py                              |
| ceval_gen_2daf24               | configs/datasets/ceval/ceval_gen_2daf24.py                       |
| ceval_gen_5f30c7               | configs/datasets/ceval/ceval_gen_5f30c7.py                       |
| ceval_internal_ppl_1cd8bf      | configs/datasets/ceval/ceval_internal_ppl_1cd8bf.py              |
| ceval_ppl                      | configs/datasets/ceval/ceval_ppl.py                              |
| ceval_ppl_1cd8bf               | configs/datasets/ceval/ceval_ppl_1cd8bf.py                       |
| ceval_ppl_578f8d               | configs/datasets/ceval/ceval_ppl_578f8d.py                       |
| ceval_ppl_93e5ce               | configs/datasets/ceval/ceval_ppl_93e5ce.py                       |
| ceval_zero_shot_gen_bd40ef     | configs/datasets/ceval/ceval_zero_shot_gen_bd40ef.py             |
+--------------------------------+------------------------------------------------------------------+
(opencompass) root-studio-50092202:~/opencompass# 

1.4 启动评测 (10% A100 8GB 资源)

通过以下命令评测 InternLM2-Chat-1.8B 模型在 C-Eval 数据集上的性能。由于 OpenCompass 默认并行启动评估过程,我们可以在第一次运行时以 –debug 模式启动评估,并检查是否存在问题。在 –debug 模式下,任务将按顺序执行,并实时打印输出。

python run.py --datasets ceval_gen --hf-path /share/new_models/Shanghai_AI_Laboratory/internlm2-chat-1_8b --tokenizer-path /share/new_models/Shanghai_AI_Laboratory/internlm2-chat-1_8b --tokenizer-kwargs padding_side='left' truncation='left' trust_remote_code=True --model-kwargs trust_remote_code=True device_map='auto' --max-seq-len 1024 --max-out-len 16 --batch-size 2 --num-gpus 1 --debug

命令解析

--datasets ceval_gen \
--hf-path /share/new_models/Shanghai_AI_Laboratory/internlm2-chat-1_8b \  # HuggingFace 模型路径
--tokenizer-path /share/new_models/Shanghai_AI_Laboratory/internlm2-chat-1_8b \  #
HuggingFace tokenizer 路径(如果与模型路径相同,可以省略)
--tokenizer-kwargs padding_side='left' truncation='left' trust_remote_code=True \  # 构建 tokenizer 的参数
--model-kwargs device_map='auto' trust_remote_code=True \  # 构建模型的参数
--max-seq-len 1024 \  # 模型可以接受的最大序列长度
--max-out-len 16 \  # 生成的最大 token 数
--batch-size 2  \  # 批量大小
--num-gpus 1  # 运行模型所需的 GPU 数量
--debug ```

出现大量报错:

(opencompass) root-studio-50092202:~/opencompass# python run.py --datasets ceval_gen --hf-path /share/new_models/Shanghai_AI_Laboratory/internlm2-chat-1_8b --tokenizer-path /share/new_models/Shanghai_AI_Laboratory/internlm2-chat-1_8b --tokenizer-kwargs padding_side='left' truncation='left' trust_remote_code=True --model-kwargs trust_remote_code=True device_map='auto' --max-seq-len 1024 --max-out-len 16 --batch-size 2 --num-gpus 1 --debug
06/09 07:17:23 - OpenCompass - INFO - Loading ceval_gen: configs/datasets/ceval/ceval_gen.py
06/09 07:17:23 - OpenCompass - INFO - Loading example: configs/summarizers/example.py
06/09 07:17:24 - OpenCompass - WARNING - SlurmRunner is not used, so the partition argument is ignored.
06/09 07:17:27 - OpenCompass - INFO - Partitioned into 1 tasks.
Error: mkl-service + Intel(R) MKL: MKL_THREADING_LAYER=INTEL is incompatible with libgomp.so.1 library.
Try to import numpy first or set the threading layer accordingly. Set MKL_SERVICE_FORCE_INTEL to force it.
06/09 07:17:31 - OpenCompass - INFO - Partitioned into 52 tasks.
06/09 07:17:33 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-computer_network]: No predictions found.
06/09 07:17:35 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-operating_system]: No predictions found.
06/09 07:17:37 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-computer_architecture]: No predictions found.
06/09 07:17:38 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_programming]: No predictions found.
06/09 07:17:40 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_physics]: No predictions found.
06/09 07:17:42 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_chemistry]: No predictions found.
06/09 07:17:44 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-advanced_mathematics]: No predictions found.
06/09 07:17:46 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-probability_and_statistics]: No predictions found.
06/09 07:17:48 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-discrete_mathematics]: No predictions found.
06/09 07:17:49 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-electrical_engineer]: No predictions found.
06/09 07:17:51 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-metrology_engineer]: No predictions found.
06/09 07:17:53 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_mathematics]: No predictions found.
06/09 07:17:55 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_physics]: No predictions found.
06/09 07:17:57 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_chemistry]: No predictions found.
06/09 07:17:59 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_biology]: No predictions found.
06/09 07:18:00 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_mathematics]: No predictions found.
06/09 07:18:02 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_biology]: No predictions found.
06/09 07:18:04 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_physics]: No predictions found.
06/09 07:18:06 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_chemistry]: No predictions found.
06/09 07:18:08 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-veterinary_medicine]: No predictions found.
06/09 07:18:10 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_economics]: No predictions found.
06/09 07:18:11 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-business_administration]: No predictions found.
06/09 07:18:13 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-marxism]: No predictions found.
06/09 07:18:15 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-mao_zedong_thought]: No predictions found.
06/09 07:18:17 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-education_science]: No predictions found.
06/09 07:18:19 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-teacher_qualification]: No predictions found.
06/09 07:18:21 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_politics]: No predictions found.
06/09 07:18:22 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_geography]: No predictions found.
06/09 07:18:24 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_politics]: No predictions found.
06/09 07:18:26 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_geography]: No predictions found.
06/09 07:18:28 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-modern_chinese_history]: No predictions found.
06/09 07:18:30 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-ideological_and_moral_cultivation]: No predictions found.
06/09 07:18:32 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-logic]: No predictions found.
06/09 07:18:33 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-law]: No predictions found.
06/09 07:18:35 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-chinese_language_and_literature]: No predictions found.
06/09 07:18:37 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-art_studies]: No predictions found.
06/09 07:18:39 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-professional_tour_guide]: No predictions found.
06/09 07:18:41 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-legal_professional]: No predictions found.
06/09 07:18:43 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_chinese]: No predictions found.
06/09 07:18:45 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_history]: No predictions found.
06/09 07:18:46 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_history]: No predictions found.
06/09 07:18:48 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-civil_servant]: No predictions found.
06/09 07:18:50 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-sports_science]: No predictions found.
06/09 07:18:52 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-plant_protection]: No predictions found.
06/09 07:18:54 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-basic_medicine]: No predictions found.
06/09 07:18:56 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-clinical_medicine]: No predictions found.
06/09 07:18:57 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-urban_and_rural_planner]: No predictions found.
06/09 07:18:59 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-accountant]: No predictions found.
06/09 07:19:01 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-fire_engineer]: No predictions found.
06/09 07:19:03 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-environmental_impact_assessment_engineer]: No predictions found.
06/09 07:19:05 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-tax_accountant]: No predictions found.
06/09 07:19:06 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-physician]: No predictions found.
dataset                                         version    metric    mode    opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b
----------------------------------------------  ---------  --------  ------  ---------------------------------------------------------------------------------------
ceval-computer_network                          -          -         -       -
ceval-operating_system                          -          -         -       -
ceval-computer_architecture                     -          -         -       -
ceval-college_programming                       -          -         -       -
ceval-college_physics                           -          -         -       -
ceval-college_chemistry                         -          -         -       -
ceval-advanced_mathematics                      -          -         -       -
ceval-probability_and_statistics                -          -         -       -
ceval-discrete_mathematics                      -          -         -       -
ceval-electrical_engineer                       -          -         -       -
ceval-metrology_engineer                        -          -         -       -
ceval-high_school_mathematics                   -          -         -       -
ceval-high_school_physics                       -          -         -       -
ceval-high_school_chemistry                     -          -         -       -
ceval-high_school_biology                       -          -         -       -
ceval-middle_school_mathematics                 -          -         -       -
ceval-middle_school_biology                     -          -         -       -
ceval-middle_school_physics                     -          -         -       -
ceval-middle_school_chemistry                   -          -         -       -
ceval-veterinary_medicine                       -          -         -       -
ceval-college_economics                         -          -         -       -
ceval-business_administration                   -          -         -       -
ceval-marxism                                   -          -         -       -
ceval-mao_zedong_thought                        -          -         -       -
ceval-education_science                         -          -         -       -
ceval-teacher_qualification                     -          -         -       -
ceval-high_school_politics                      -          -         -       -
ceval-high_school_geography                     -          -         -       -
ceval-middle_school_politics                    -          -         -       -
ceval-middle_school_geography                   -          -         -       -
ceval-modern_chinese_history                    -          -         -       -
ceval-ideological_and_moral_cultivation         -          -         -       -
ceval-logic                                     -          -         -       -
ceval-law                                       -          -         -       -
ceval-chinese_language_and_literature           -          -         -       -
ceval-art_studies                               -          -         -       -
ceval-professional_tour_guide                   -          -         -       -
ceval-legal_professional                        -          -         -       -
ceval-high_school_chinese                       -          -         -       -
ceval-high_school_history                       -          -         -       -
ceval-middle_school_history                     -          -         -       -
ceval-civil_servant                             -          -         -       -
ceval-sports_science                            -          -         -       -
ceval-plant_protection                          -          -         -       -
ceval-basic_medicine                            -          -         -       -
ceval-clinical_medicine                         -          -         -       -
ceval-urban_and_rural_planner                   -          -         -       -
ceval-accountant                                -          -         -       -
ceval-fire_engineer                             -          -         -       -
ceval-environmental_impact_assessment_engineer  -          -         -       -
ceval-tax_accountant                            -          -         -       -
ceval-physician                                 -          -         -       -
06/09 07:19:07 - OpenCompass - INFO - write summary to /root/opencompass/outputs/default/20240609_071723/summary/summary_20240609_071723.txt
06/09 07:19:07 - OpenCompass - INFO - write csv to /root/opencompass/outputs/default/20240609_071723/summary/summary_20240609_071723.csv

尝试安装protobuf

(opencompass) root-studio-50092202:~/opencompass# pip install protobuf
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Collecting protobuf
Downloading https://pypi.tuna.tsinghua.edu.cn/packages/3d/50/85a4f42e14eebfc9091e8e38e77ba4874d52fae08984c97725ba3265a1e1/protobuf-5.27.1-cp38-abi3-manylinux2014_x86_64.whl (309 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 309.2/309.2 kB 4.0 MB/s eta 0:00:00
Installing collected packages: protobuf
Successfully installed protobuf-5.27.1
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv

再次启动,依然报错,设置一下:

export MKL_SERVICE_FORCE_INTEL=1

再次启动,似乎没在报错:

(opencompass) root-studio-50092202:~/opencompass# python run.py --datasets ceval_gen --hf-path /share/new_models/Shanghai_AI_Laboratory/internlm2-chat-1_8b --tokenizer-path /share/new_models/Shanghai_AI_Laboratory/internlm2-chat-1_8b --tokenizer-kwargs padding_side='left' truncation='left' trust_remote_code=True --model-kwargs trust_remote_code=True device_map='auto' --max-seq-len 1024 --max-out-len 16 --batch-size 2 --num-gpus 1 --debug
06/09 07:35:11 - OpenCompass - INFO - Loading ceval_gen: configs/datasets/ceval/ceval_gen.py
06/09 07:35:11 - OpenCompass - INFO - Loading example: configs/summarizers/example.py
06/09 07:35:11 - OpenCompass - WARNING - SlurmRunner is not used, so the partition argument is ignored.
06/09 07:35:11 - OpenCompass - INFO - Partitioned into 1 tasks.
Error: mkl-service + Intel(R) MKL: MKL_THREADING_LAYER=INTEL is incompatible with libgomp.so.1 library.
Try to import numpy first or set the threading layer accordingly. Set MKL_SERVICE_FORCE_INTEL to force it.
06/09 07:35:24 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_economics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-accountant,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-tax_accountant,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-physician,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-civil_servant,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-urban_and_rural_planner,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-teacher_qualification,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_programming,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-electrical_engineer,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-business_administration,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-art_studies,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-fire_engineer,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-environmental_impact_assessment_engineer,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-education_science,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-professional_tour_guide,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_chemistry,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-metrology_engineer,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-mao_zedong_thought,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-law,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-veterinary_medicine,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-modern_chinese_history,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-chinese_language_and_literature,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-legal_professional,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-logic,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_history,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-plant_protection,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-clinical_medicine,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-computer_architecture,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_biology,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_politics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_chemistry,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_history,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-computer_network,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-operating_system,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_physics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-advanced_mathematics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_physics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_chemistry,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_biology,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_mathematics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_physics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-marxism,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_politics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_geography,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-ideological_and_moral_cultivation,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_chinese,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-sports_science,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-basic_medicine,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-probability_and_statistics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_mathematics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-discrete_mathematics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_geography]
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:38<00:00, 19.36s/it]
。。。

OpenCompass 大模型评测作业(lesson 7)插图(5)

这一句似乎有问题:

Error: mkl-service + Intel(R) MKL: MKL_THREADING_LAYER=INTEL is incompatible with libgomp.so.1 library.
Try to import numpy first or set the threading layer accordingly. Set MKL_SERVICE_FORCE_INTEL to force it.

遇到错误mkl-service + Intel® MKL MKL_THREADING_LAYER=INTEL is
incompatible with libgomp.so.1 … 解决方案:

·“`
dart export MKL_SERVICE_FORCE_INTEL=1
#或 export MKL_THREADING_LAYER=GNU

(opencompass) root-studio-50092202:~/opencompass# export MKL_SERVICE_FORCE_INTEL=1
(opencompass) root-studio-50092202:~/opencompass# python run.py --datasets ceval_gen --hf-path /share/new_models/Shanghai_AI_Laboratory/internlm2-chat-1_8b --tokenizer-path /share/new_models/Shanghai_AI_Laboratory/internlm2-chat-1_8b --tokenizer-kwargs padding_side='left' truncation='left' trust_remote_code=True --model-kwargs trust_remote_code=True device_map='auto' --max-seq-len 1024 --max-out-len 16 --batch-size 2 --num-gpus 1 --debug
06/09 07:35:11 - OpenCompass - INFO - Loading ceval_gen: configs/datasets/ceval/ceval_gen.py
06/09 07:35:11 - OpenCompass - INFO - Loading example: configs/summarizers/example.py
06/09 07:35:11 - OpenCompass - WARNING - SlurmRunner is not used, so the partition argument is ignored.
06/09 07:35:11 - OpenCompass - INFO - Partitioned into 1 tasks.
Error: mkl-service + Intel(R) MKL: MKL_THREADING_LAYER=INTEL is incompatible with libgomp.so.1 library.
Try to import numpy first or set the threading layer accordingly. Set MKL_SERVICE_FORCE_INTEL to force it.
06/09 07:35:24 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_economics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-accountant,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-tax_accountant,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-physician,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-civil_servant,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-urban_and_rural_planner,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-teacher_qualification,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_programming,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-electrical_engineer,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-business_administration,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-art_studies,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-fire_engineer,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-environmental_impact_assessment_engineer,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-education_science,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-professional_tour_guide,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_chemistry,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-metrology_engineer,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-mao_zedong_thought,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-law,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-veterinary_medicine,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-modern_chinese_history,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-chinese_language_and_literature,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-legal_professional,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-logic,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_history,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-plant_protection,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-clinical_medicine,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-computer_architecture,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_biology,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_politics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_chemistry,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_history,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-computer_network,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-operating_system,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_physics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-advanced_mathematics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_physics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_chemistry,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_biology,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_mathematics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_physics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-marxism,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_politics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_geography,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-ideological_and_moral_cultivation,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_chinese,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-sports_science,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-basic_medicine,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-probability_and_statistics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_mathematics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-discrete_mathematics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_geography]
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:38<00:00, 19.36s/it]
06/09 07:36:56 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_economics]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 55/55 [00:00<00:00, 1460042.53it/s]
[2024-06-09 07:36:56,575] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 28/28 [01:00<00:00,  2.15s/it]
06/09 07:37:56 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-accountant]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 49/49 [00:00<00:00, 1684597.51it/s]
[2024-06-09 07:37:56,864] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 25/25 [01:01<00:00,  2.48s/it]
06/09 07:38:58 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-tax_accountant]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 49/49 [00:00<00:00, 1533738.03it/s]
[2024-06-09 07:38:58,918] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 25/25 [00:55<00:00,  2.22s/it]
06/09 07:39:54 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-physician]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 49/49 [00:00<00:00, 1657426.58it/s]
[2024-06-09 07:39:54,521] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 25/25 [00:37<00:00,  1.51s/it]
06/09 07:40:32 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-civil_servant]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 47/47 [00:00<00:00, 1564541.97it/s]
[2024-06-09 07:40:32,398] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 24/24 [01:06<00:00,  2.79s/it]
06/09 07:41:39 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-urban_and_rural_planner]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 46/46 [00:00<00:00, 1429170.25it/s]
[2024-06-09 07:41:39,474] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 23/23 [00:41<00:00,  1.80s/it]
06/09 07:42:21 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-teacher_qualification]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 44/44 [00:00<00:00, 1453144.69it/s]
[2024-06-09 07:42:21,081] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 22/22 [00:36<00:00,  1.65s/it]
06/09 07:42:57 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_programming]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 37/37 [00:00<00:00, 1326403.83it/s]
[2024-06-09 07:42:57,517] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 19/19 [00:39<00:00,  2.09s/it]
06/09 07:43:37 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-electrical_engineer]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 37/37 [00:00<00:00, 1337838.34it/s]
[2024-06-09 07:43:37,399] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 19/19 [00:36<00:00,  1.94s/it]
06/09 07:44:14 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-business_administration]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 33/33 [00:00<00:00, 1235821.71it/s]
[2024-06-09 07:44:14,280] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 17/17 [00:30<00:00,  1.82s/it]
06/09 07:44:45 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-art_studies]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 33/33 [00:00<00:00, 302870.97it/s]
[2024-06-09 07:44:45,339] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 17/17 [00:25<00:00,  1.49s/it]
06/09 07:45:10 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-fire_engineer]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 31/31 [00:00<00:00, 1160923.43it/s]
[2024-06-09 07:45:10,721] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 16/16 [00:35<00:00,  2.19s/it]
06/09 07:45:45 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-environmental_impact_assessment_engineer]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 31/31 [00:00<00:00, 1150649.77it/s]
[2024-06-09 07:45:45,836] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 16/16 [00:30<00:00,  1.91s/it]
06/09 07:46:16 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-education_science]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 29/29 [00:00<00:00, 273336.67it/s]
[2024-06-09 07:46:16,516] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 15/15 [00:23<00:00,  1.59s/it]
06/09 07:46:40 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-professional_tour_guide]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 29/29 [00:00<00:00, 1136773.98it/s]
[2024-06-09 07:46:40,406] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 15/15 [00:24<00:00,  1.64s/it]
06/09 07:47:04 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_chemistry]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 24/24 [00:00<00:00, 949653.74it/s]
[2024-06-09 07:47:05,015] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 12/12 [00:25<00:00,  2.13s/it]
06/09 07:47:30 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-metrology_engineer]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 24/24 [00:00<00:00, 949653.74it/s]
[2024-06-09 07:47:30,681] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 12/12 [00:25<00:00,  2.16s/it]
06/09 07:47:56 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-mao_zedong_thought]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 24/24 [00:00<00:00, 932067.56it/s]
[2024-06-09 07:47:56,665] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 12/12 [00:23<00:00,  1.98s/it]
06/09 07:48:20 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-law]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 24/24 [00:00<00:00, 996666.30it/s]
[2024-06-09 07:48:20,468] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 12/12 [00:29<00:00,  2.46s/it]
06/09 07:48:49 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-veterinary_medicine]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 23/23 [00:00<00:00, 955138.53it/s]
[2024-06-09 07:48:50,029] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 12/12 [00:19<00:00,  1.66s/it]
06/09 07:49:09 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-modern_chinese_history]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 23/23 [00:00<00:00, 910084.83it/s]
[2024-06-09 07:49:09,981] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 12/12 [00:24<00:00,  2.05s/it]
06/09 07:49:34 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-chinese_language_and_literature]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 23/23 [00:00<00:00, 910084.83it/s]
[2024-06-09 07:49:34,721] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 12/12 [00:19<00:00,  1.62s/it]
06/09 07:49:54 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-legal_professional]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 23/23 [00:00<00:00, 910084.83it/s]
[2024-06-09 07:49:54,224] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 12/12 [00:37<00:00,  3.11s/it]
06/09 07:50:31 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-logic]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 22/22 [00:00<00:00, 913610.77it/s]
[2024-06-09 07:50:31,735] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 11/11 [00:34<00:00,  3.12s/it]
06/09 07:51:06 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_history]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 22/22 [00:00<00:00, 887256.62it/s]
[2024-06-09 07:51:06,088] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 11/11 [00:15<00:00,  1.42s/it]
06/09 07:51:21 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-plant_protection]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 22/22 [00:00<00:00, 904653.80it/s]
[2024-06-09 07:51:21,801] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 11/11 [00:15<00:00,  1.40s/it]
06/09 07:51:37 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-clinical_medicine]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 22/22 [00:00<00:00, 971312.51it/s]
[2024-06-09 07:51:37,281] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 11/11 [00:20<00:00,  1.90s/it]
06/09 07:51:58 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-computer_architecture]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 21/21 [00:00<00:00, 863533.18it/s]
[2024-06-09 07:51:58,276] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 11/11 [00:23<00:00,  2.14s/it]
06/09 07:52:21 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_biology]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 21/21 [00:00<00:00, 898779.43it/s]
[2024-06-09 07:52:21,905] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 11/11 [00:17<00:00,  1.62s/it]
06/09 07:52:39 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_politics]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 21/21 [00:00<00:00, 855149.36it/s]
[2024-06-09 07:52:39,795] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 11/11 [00:26<00:00,  2.42s/it]
06/09 07:53:06 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_chemistry]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:00<00:00, 855980.41it/s]
[2024-06-09 07:53:06,519] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:21<00:00,  2.16s/it]
06/09 07:53:28 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_history]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:00<00:00, 855980.41it/s]
[2024-06-09 07:53:28,157] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:19<00:00,  1.94s/it]
06/09 07:53:47 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-computer_network]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 19/19 [00:00<00:00, 796917.76it/s]
[2024-06-09 07:53:47,609] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:15<00:00,  1.51s/it]
06/09 07:54:02 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-operating_system]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 19/19 [00:00<00:00, 804967.43it/s]
[2024-06-09 07:54:02,804] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:13<00:00,  1.31s/it]
06/09 07:54:15 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_physics]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 19/19 [00:00<00:00, 198732.61it/s]
[2024-06-09 07:54:15,940] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:25<00:00,  2.53s/it]
06/09 07:54:41 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-advanced_mathematics]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 19/19 [00:00<00:00, 744782.95it/s]
[2024-06-09 07:54:41,363] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:31<00:00,  3.11s/it]
06/09 07:55:12 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_physics]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 19/19 [00:00<00:00, 198732.61it/s]
[2024-06-09 07:55:12,508] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:16<00:00,  1.69s/it]
06/09 07:55:29 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_chemistry]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 19/19 [00:00<00:00, 565189.90it/s]
[2024-06-09 07:55:29,494] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:18<00:00,  1.86s/it]
06/09 07:55:48 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_biology]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 19/19 [00:00<00:00, 711533.71it/s]
[2024-06-09 07:55:48,150] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:18<00:00,  1.80s/it]
06/09 07:56:06 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_mathematics]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 19/19 [00:00<00:00, 866214.96it/s]
[2024-06-09 07:56:06,263] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:24<00:00,  2.46s/it]
06/09 07:56:30 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_physics]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 19/19 [00:00<00:00, 773706.56it/s]
[2024-06-09 07:56:30,951] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:18<00:00,  1.83s/it]
06/09 07:56:49 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-marxism]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 19/19 [00:00<00:00, 766267.08it/s]
[2024-06-09 07:56:49,334] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:14<00:00,  1.47s/it]
06/09 07:57:04 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_politics]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 19/19 [00:00<00:00, 758969.30it/s]
[2024-06-09 07:57:04,081] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:30<00:00,  3.01s/it]
06/09 07:57:34 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_geography]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 19/19 [00:00<00:00, 781291.92it/s]
[2024-06-09 07:57:34,239] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:13<00:00,  1.38s/it]
06/09 07:57:48 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-ideological_and_moral_cultivation]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 19/19 [00:00<00:00, 813181.39it/s]
[2024-06-09 07:57:48,118] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:12<00:00,  1.27s/it]
06/09 07:58:00 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_chinese]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 19/19 [00:00<00:00, 821564.70it/s]
[2024-06-09 07:58:00,899] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:32<00:00,  3.24s/it]
06/09 07:58:33 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-sports_science]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 19/19 [00:00<00:00, 201751.33it/s]
[2024-06-09 07:58:33,373] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:13<00:00,  1.31s/it]
06/09 07:58:46 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-basic_medicine]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 19/19 [00:00<00:00, 796917.76it/s]
[2024-06-09 07:58:46,584] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:12<00:00,  1.26s/it]
06/09 07:58:59 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-probability_and_statistics]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 18/18 [00:00<00:00, 794710.23it/s]
[2024-06-09 07:58:59,282] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 9/9 [00:29<00:00,  3.32s/it]
06/09 07:59:29 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_mathematics]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 18/18 [00:00<00:00, 747499.72it/s]
[2024-06-09 07:59:29,271] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 9/9 [00:26<00:00,  2.99s/it]
06/09 07:59:56 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-discrete_mathematics]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 16/16 [00:00<00:00, 677867.31it/s]
[2024-06-09 07:59:56,234] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:14<00:00,  1.80s/it]
06/09 08:00:10 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_geography]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 12/12 [00:00<00:00, 131414.22it/s]
[2024-06-09 08:00:10,676] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:08<00:00,  1.48s/it]
06/09 08:00:19 - OpenCompass - INFO - time elapsed: 1494.75s
06/09 08:00:22 - OpenCompass - INFO - Partitioned into 52 tasks.
06/09 08:00:24 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-computer_network]: {'accuracy': 47.368421052631575}
06/09 08:00:26 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-operating_system]: {'accuracy': 47.368421052631575}
06/09 08:00:28 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-computer_architecture]: {'accuracy': 23.809523809523807}
06/09 08:00:30 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_programming]: {'accuracy': 13.513513513513514}
06/09 08:00:32 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_physics]: {'accuracy': 42.10526315789473}
06/09 08:00:34 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_chemistry]: {'accuracy': 33.33333333333333}
06/09 08:00:36 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-advanced_mathematics]: {'accuracy': 10.526315789473683}
06/09 08:00:37 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-probability_and_statistics]: {'accuracy': 38.88888888888889}
06/09 08:00:39 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-discrete_mathematics]: {'accuracy': 25.0}
06/09 08:00:41 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-electrical_engineer]: {'accuracy': 27.027027027027028}
06/09 08:00:43 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-metrology_engineer]: {'accuracy': 54.166666666666664}
06/09 08:00:45 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_mathematics]: {'accuracy': 16.666666666666664}
06/09 08:00:47 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_physics]: {'accuracy': 42.10526315789473}
06/09 08:00:49 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_chemistry]: {'accuracy': 47.368421052631575}
06/09 08:00:50 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_biology]: {'accuracy': 26.31578947368421}
06/09 08:00:52 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_mathematics]: {'accuracy': 36.84210526315789}
06/09 08:00:54 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_biology]: {'accuracy': 80.95238095238095}
06/09 08:00:56 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_physics]: {'accuracy': 47.368421052631575}
06/09 08:00:58 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_chemistry]: {'accuracy': 80.0}
06/09 08:01:00 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-veterinary_medicine]: {'accuracy': 43.47826086956522}
06/09 08:01:02 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_economics]: {'accuracy': 32.72727272727273}
06/09 08:01:03 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-business_administration]: {'accuracy': 36.36363636363637}
06/09 08:01:05 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-marxism]: {'accuracy': 68.42105263157895}
06/09 08:01:07 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-mao_zedong_thought]: {'accuracy': 70.83333333333334}
06/09 08:01:09 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-education_science]: {'accuracy': 55.172413793103445}
06/09 08:01:11 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-teacher_qualification]: {'accuracy': 59.09090909090909}
06/09 08:01:13 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_politics]: {'accuracy': 57.89473684210527}
06/09 08:01:14 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_geography]: {'accuracy': 47.368421052631575}
06/09 08:01:16 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_politics]: {'accuracy': 71.42857142857143}
06/09 08:01:18 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_geography]: {'accuracy': 75.0}
06/09 08:01:20 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-modern_chinese_history]: {'accuracy': 52.17391304347826}
06/09 08:01:22 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-ideological_and_moral_cultivation]: {'accuracy': 73.68421052631578}
06/09 08:01:24 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-logic]: {'accuracy': 27.27272727272727}
06/09 08:01:26 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-law]: {'accuracy': 29.166666666666668}
06/09 08:01:28 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-chinese_language_and_literature]: {'accuracy': 47.82608695652174}
06/09 08:01:30 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-art_studies]: {'accuracy': 42.42424242424242}
06/09 08:01:32 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-professional_tour_guide]: {'accuracy': 51.724137931034484}
06/09 08:01:33 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-legal_professional]: {'accuracy': 34.78260869565217}
06/09 08:01:35 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_chinese]: {'accuracy': 42.10526315789473}
06/09 08:01:37 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_history]: {'accuracy': 65.0}
06/09 08:01:39 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_history]: {'accuracy': 86.36363636363636}
06/09 08:01:41 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-civil_servant]: {'accuracy': 42.5531914893617}
06/09 08:01:43 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-sports_science]: {'accuracy': 52.63157894736842}
06/09 08:01:45 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-plant_protection]: {'accuracy': 40.909090909090914}
06/09 08:01:47 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-basic_medicine]: {'accuracy': 68.42105263157895}
06/09 08:01:48 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-clinical_medicine]: {'accuracy': 31.818181818181817}
06/09 08:01:50 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-urban_and_rural_planner]: {'accuracy': 47.82608695652174}
06/09 08:01:52 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-accountant]: {'accuracy': 36.734693877551024}
06/09 08:01:54 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-fire_engineer]: {'accuracy': 38.70967741935484}
06/09 08:01:56 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-environmental_impact_assessment_engineer]: {'accuracy': 51.61290322580645}
06/09 08:01:58 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-tax_accountant]: {'accuracy': 36.734693877551024}
06/09 08:02:00 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-physician]: {'accuracy': 42.857142857142854}
dataset                                         version    metric         mode      opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b
----------------------------------------------  ---------  -------------  ------  ---------------------------------------------------------------------------------------
ceval-computer_network                          db9ce2     accuracy       gen                                                                                       47.37
ceval-operating_system                          1c2571     accuracy       gen                                                                                       47.37
ceval-computer_architecture                     a74dad     accuracy       gen                                                                                       23.81
ceval-college_programming                       4ca32a     accuracy       gen                                                                                       13.51
ceval-college_physics                           963fa8     accuracy       gen                                                                                       42.11
ceval-college_chemistry                         e78857     accuracy       gen                                                                                       33.33
ceval-advanced_mathematics                      ce03e2     accuracy       gen                                                                                       10.53
ceval-probability_and_statistics                65e812     accuracy       gen                                                                                       38.89
ceval-discrete_mathematics                      e894ae     accuracy       gen                                                                                       25
ceval-electrical_engineer                       ae42b9     accuracy       gen                                                                                       27.03
ceval-metrology_engineer                        ee34ea     accuracy       gen                                                                                       54.17
ceval-high_school_mathematics                   1dc5bf     accuracy       gen                                                                                       16.67
ceval-high_school_physics                       adf25f     accuracy       gen                                                                                       42.11
ceval-high_school_chemistry                     2ed27f     accuracy       gen                                                                                       47.37
ceval-high_school_biology                       8e2b9a     accuracy       gen                                                                                       26.32
ceval-middle_school_mathematics                 bee8d5     accuracy       gen                                                                                       36.84
ceval-middle_school_biology                     86817c     accuracy       gen                                                                                       80.95
ceval-middle_school_physics                     8accf6     accuracy       gen                                                                                       47.37
ceval-middle_school_chemistry                   167a15     accuracy       gen                                                                                       80
ceval-veterinary_medicine                       b4e08d     accuracy       gen                                                                                       43.48
ceval-college_economics                         f3f4e6     accuracy       gen                                                                                       32.73
ceval-business_administration                   c1614e     accuracy       gen                                                                                       36.36
ceval-marxism                                   cf874c     accuracy       gen                                                                                       68.42
ceval-mao_zedong_thought                        51c7a4     accuracy       gen                                                                                       70.83
ceval-education_science                         591fee     accuracy       gen                                                                                       55.17
ceval-teacher_qualification                     4e4ced     accuracy       gen                                                                                       59.09
ceval-high_school_politics                      5c0de2     accuracy       gen                                                                                       57.89
ceval-high_school_geography                     865461     accuracy       gen                                                                                       47.37
ceval-middle_school_politics                    5be3e7     accuracy       gen                                                                                       71.43
ceval-middle_school_geography                   8a63be     accuracy       gen                                                                                       75
ceval-modern_chinese_history                    fc01af     accuracy       gen                                                                                       52.17
ceval-ideological_and_moral_cultivation         a2aa4a     accuracy       gen                                                                                       73.68
ceval-logic                                     f5b022     accuracy       gen                                                                                       27.27
ceval-law                                       a110a1     accuracy       gen                                                                                       29.17
ceval-chinese_language_and_literature           0f8b68     accuracy       gen                                                                                       47.83
ceval-art_studies                               2a1300     accuracy       gen                                                                                       42.42
ceval-professional_tour_guide                   4e673e     accuracy       gen                                                                                       51.72
ceval-legal_professional                        ce8787     accuracy       gen                                                                                       34.78
ceval-high_school_chinese                       315705     accuracy       gen                                                                                       42.11
ceval-high_school_history                       7eb30a     accuracy       gen                                                                                       65
ceval-middle_school_history                     48ab4a     accuracy       gen                                                                                       86.36
ceval-civil_servant                             87d061     accuracy       gen                                                                                       42.55
ceval-sports_science                            70f27b     accuracy       gen                                                                                       52.63
ceval-plant_protection                          8941f9     accuracy       gen                                                                                       40.91
ceval-basic_medicine                            c409d6     accuracy       gen                                                                                       68.42
ceval-clinical_medicine                         49e82d     accuracy       gen                                                                                       31.82
ceval-urban_and_rural_planner                   95b885     accuracy       gen                                                                                       47.83
ceval-accountant                                002837     accuracy       gen                                                                                       36.73
ceval-fire_engineer                             bc23f5     accuracy       gen                                                                                       38.71
ceval-environmental_impact_assessment_engineer  c64e2d     accuracy       gen                                                                                       51.61
ceval-tax_accountant                            3a5e3c     accuracy       gen                                                                                       36.73
ceval-physician                                 6e277d     accuracy       gen                                                                                       42.86
ceval-stem                                      -          naive_average  gen                                                                                       39.21
ceval-social-science                            -          naive_average  gen                                                                                       57.43
ceval-humanities                                -          naive_average  gen                                                                                       50.23
ceval-other                                     -          naive_average  gen                                                                                       44.62
ceval-hard                                      -          naive_average  gen                                                                                       32
ceval                                           -          naive_average  gen                                                                                       46.19
06/09 08:02:00 - OpenCompass - INFO - write summary to /root/opencompass/outputs/default/20240609_073511/summary/summary_20240609_073511.txt
06/09 08:02:00 - OpenCompass - INFO - write csv to /root/opencompass/outputs/default/20240609_073511/summary/summary_20240609_073511.csv

OpenCompass 大模型评测作业(lesson 7)插图(6)
为了解决

is incompatible with libgomp.so.1 library.
Try to import numpy first or set the threading layer accordingly. Set MKL_SERVICE_FORCE_INTEL to force it. ```
export MKL_THREADING_LAYER=GNU

上述错误没再出现

opencompass) root-studio-50092202:~/opencompass# export MKL_THREADING_LAYER=GNU
(opencompass) root-studio-50092202:~/opencompass# python run.py --datasets ceval_gen --hf-path /share/new_models/Shanghai_AI_Laboratory/internlm2-chat-1_8b --tokenizer-path /share/new_models/Shanghai_AI_Laboratory/internlm2-chat-1_8b --tokenizer-kwargs padding_side='left' truncation='left' trust_remote_code=True --model-kwargs trust_remote_code=True device_map='auto' --max-seq-len 1024 --max-out-len 16 --batch-size 2 --num-gpus 1 --debug
06/09 09:20:40 - OpenCompass - INFO - Loading ceval_gen: configs/datasets/ceval/ceval_gen.py
06/09 09:20:40 - OpenCompass - INFO - Loading example: configs/summarizers/example.py
06/09 09:20:40 - OpenCompass - WARNING - SlurmRunner is not used, so the partition argument is ignored.
06/09 09:20:40 - OpenCompass - INFO - Partitioned into 1 tasks.
06/09 09:20:51 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_economics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-accountant,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-tax_accountant,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-physician,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-civil_servant,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-urban_and_rural_planner,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-teacher_qualification,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_programming,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-electrical_engineer,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-business_administration,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-art_studies,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-fire_engineer,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-environmental_impact_assessment_engineer,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-education_science,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-professional_tour_guide,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_chemistry,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-metrology_engineer,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-mao_zedong_thought,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-law,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-veterinary_medicine,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-modern_chinese_history,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-chinese_language_and_literature,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-legal_professional,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-logic,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_history,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-plant_protection,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-clinical_medicine,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-computer_architecture,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_biology,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_politics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_chemistry,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_history,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-computer_network,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-operating_system,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_physics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-advanced_mathematics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_physics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_chemistry,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_biology,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_mathematics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_physics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-marxism,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_politics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_geography,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-ideological_and_moral_cultivation,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_chinese,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-sports_science,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-basic_medicine,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-probability_and_statistics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_mathematics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-discrete_mathematics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_geography]

OpenCompass 大模型评测作业(lesson 7)插图(7)
OpenCompass 大模型评测作业(lesson 7)插图(8)

20240609_092040
tabulate format
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
dataset                                         version    metric         mode      opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b
----------------------------------------------  ---------  -------------  ------  ---------------------------------------------------------------------------------------
ceval-computer_network                          db9ce2     accuracy       gen                                                                                       47.37
ceval-operating_system                          1c2571     accuracy       gen                                                                                       47.37
ceval-computer_architecture                     a74dad     accuracy       gen                                                                                       23.81
ceval-college_programming                       4ca32a     accuracy       gen                                                                                       13.51
ceval-college_physics                           963fa8     accuracy       gen                                                                                       42.11
ceval-college_chemistry                         e78857     accuracy       gen                                                                                       33.33
ceval-advanced_mathematics                      ce03e2     accuracy       gen                                                                                       10.53
ceval-probability_and_statistics                65e812     accuracy       gen                                                                                       38.89
ceval-discrete_mathematics                      e894ae     accuracy       gen                                                                                       25
ceval-electrical_engineer                       ae42b9     accuracy       gen                                                                                       27.03
ceval-metrology_engineer                        ee34ea     accuracy       gen                                                                                       54.17
ceval-high_school_mathematics                   1dc5bf     accuracy       gen                                                                                       16.67
ceval-high_school_physics                       adf25f     accuracy       gen                                                                                       42.11
ceval-high_school_chemistry                     2ed27f     accuracy       gen                                                                                       47.37
ceval-high_school_biology                       8e2b9a     accuracy       gen                                                                                       26.32
ceval-middle_school_mathematics                 bee8d5     accuracy       gen                                                                                       36.84
ceval-middle_school_biology                     86817c     accuracy       gen                                                                                       80.95
ceval-middle_school_physics                     8accf6     accuracy       gen                                                                                       47.37
ceval-middle_school_chemistry                   167a15     accuracy       gen                                                                                       80
ceval-veterinary_medicine                       b4e08d     accuracy       gen                                                                                       43.48
ceval-college_economics                         f3f4e6     accuracy       gen                                                                                       32.73
ceval-business_administration                   c1614e     accuracy       gen                                                                                       36.36
ceval-marxism                                   cf874c     accuracy       gen                                                                                       68.42
ceval-mao_zedong_thought                        51c7a4     accuracy       gen                                                                                       70.83
ceval-education_science                         591fee     accuracy       gen                                                                                       55.17
ceval-teacher_qualification                     4e4ced     accuracy       gen                                                                                       59.09
ceval-high_school_politics                      5c0de2     accuracy       gen                                                                                       57.89
ceval-high_school_geography                     865461     accuracy       gen                                                                                       47.37
ceval-middle_school_politics                    5be3e7     accuracy       gen                                                                                       71.43
ceval-middle_school_geography                   8a63be     accuracy       gen                                                                                       75
ceval-modern_chinese_history                    fc01af     accuracy       gen                                                                                       52.17
ceval-ideological_and_moral_cultivation         a2aa4a     accuracy       gen                                                                                       73.68
ceval-logic                                     f5b022     accuracy       gen                                                                                       27.27
ceval-law                                       a110a1     accuracy       gen                                                                                       29.17
ceval-chinese_language_and_literature           0f8b68     accuracy       gen                                                                                       47.83
ceval-art_studies                               2a1300     accuracy       gen                                                                                       42.42
ceval-professional_tour_guide                   4e673e     accuracy       gen                                                                                       51.72
ceval-legal_professional                        ce8787     accuracy       gen                                                                                       34.78
ceval-high_school_chinese                       315705     accuracy       gen                                                                                       42.11
ceval-high_school_history                       7eb30a     accuracy       gen                                                                                       65
ceval-middle_school_history                     48ab4a     accuracy       gen                                                                                       86.36
ceval-civil_servant                             87d061     accuracy       gen                                                                                       42.55
ceval-sports_science                            70f27b     accuracy       gen                                                                                       52.63
ceval-plant_protection                          8941f9     accuracy       gen                                                                                       40.91
ceval-basic_medicine                            c409d6     accuracy       gen                                                                                       68.42
ceval-clinical_medicine                         49e82d     accuracy       gen                                                                                       31.82
ceval-urban_and_rural_planner                   95b885     accuracy       gen                                                                                       47.83
ceval-accountant                                002837     accuracy       gen                                                                                       36.73
ceval-fire_engineer                             bc23f5     accuracy       gen                                                                                       38.71
ceval-environmental_impact_assessment_engineer  c64e2d     accuracy       gen                                                                                       51.61
ceval-tax_accountant                            3a5e3c     accuracy       gen                                                                                       36.73
ceval-physician                                 6e277d     accuracy       gen                                                                                       42.86
ceval-stem                                      -          naive_average  gen                                                                                       39.21
ceval-social-science                            -          naive_average  gen                                                                                       57.43
ceval-humanities                                -          naive_average  gen                                                                                       50.23
ceval-other                                     -          naive_average  gen                                                                                       44.62
ceval-hard                                      -          naive_average  gen                                                                                       32
ceval                                           -          naive_average  gen                                                                                       46.19
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
-------------------------------------------------------------------------------------------------------------------------------- THIS IS A DIVIDER --------------------------------------------------------------------------------------------------------------------------------
csv format
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
dataset,version,metric,mode,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b
ceval-computer_network,db9ce2,accuracy,gen,47.37
ceval-operating_system,1c2571,accuracy,gen,47.37
ceval-computer_architecture,a74dad,accuracy,gen,23.81
ceval-college_programming,4ca32a,accuracy,gen,13.51
ceval-college_physics,963fa8,accuracy,gen,42.11
ceval-college_chemistry,e78857,accuracy,gen,33.33
ceval-advanced_mathematics,ce03e2,accuracy,gen,10.53
ceval-probability_and_statistics,65e812,accuracy,gen,38.89
ceval-discrete_mathematics,e894ae,accuracy,gen,25.00
ceval-electrical_engineer,ae42b9,accuracy,gen,27.03
ceval-metrology_engineer,ee34ea,accuracy,gen,54.17
ceval-high_school_mathematics,1dc5bf,accuracy,gen,16.67
ceval-high_school_physics,adf25f,accuracy,gen,42.11
ceval-high_school_chemistry,2ed27f,accuracy,gen,47.37
ceval-high_school_biology,8e2b9a,accuracy,gen,26.32
ceval-middle_school_mathematics,bee8d5,accuracy,gen,36.84
ceval-middle_school_biology,86817c,accuracy,gen,80.95
ceval-middle_school_physics,8accf6,accuracy,gen,47.37
ceval-middle_school_chemistry,167a15,accuracy,gen,80.00
ceval-veterinary_medicine,b4e08d,accuracy,gen,43.48
ceval-college_economics,f3f4e6,accuracy,gen,32.73
ceval-business_administration,c1614e,accuracy,gen,36.36
ceval-marxism,cf874c,accuracy,gen,68.42
ceval-mao_zedong_thought,51c7a4,accuracy,gen,70.83
ceval-education_science,591fee,accuracy,gen,55.17
ceval-teacher_qualification,4e4ced,accuracy,gen,59.09
ceval-high_school_politics,5c0de2,accuracy,gen,57.89
ceval-high_school_geography,865461,accuracy,gen,47.37
ceval-middle_school_politics,5be3e7,accuracy,gen,71.43
ceval-middle_school_geography,8a63be,accuracy,gen,75.00
ceval-modern_chinese_history,fc01af,accuracy,gen,52.17
ceval-ideological_and_moral_cultivation,a2aa4a,accuracy,gen,73.68
ceval-logic,f5b022,accuracy,gen,27.27
ceval-law,a110a1,accuracy,gen,29.17
ceval-chinese_language_and_literature,0f8b68,accuracy,gen,47.83
ceval-art_studies,2a1300,accuracy,gen,42.42
ceval-professional_tour_guide,4e673e,accuracy,gen,51.72
ceval-legal_professional,ce8787,accuracy,gen,34.78
ceval-high_school_chinese,315705,accuracy,gen,42.11
ceval-high_school_history,7eb30a,accuracy,gen,65.00
ceval-middle_school_history,48ab4a,accuracy,gen,86.36
ceval-civil_servant,87d061,accuracy,gen,42.55
ceval-sports_science,70f27b,accuracy,gen,52.63
ceval-plant_protection,8941f9,accuracy,gen,40.91
ceval-basic_medicine,c409d6,accuracy,gen,68.42
ceval-clinical_medicine,49e82d,accuracy,gen,31.82
ceval-urban_and_rural_planner,95b885,accuracy,gen,47.83
ceval-accountant,002837,accuracy,gen,36.73
ceval-fire_engineer,bc23f5,accuracy,gen,38.71
ceval-environmental_impact_assessment_engineer,c64e2d,accuracy,gen,51.61
ceval-tax_accountant,3a5e3c,accuracy,gen,36.73
ceval-physician,6e277d,accuracy,gen,42.86
ceval-stem,-,naive_average,gen,39.21
ceval-social-science,-,naive_average,gen,57.43
ceval-humanities,-,naive_average,gen,50.23
ceval-other,-,naive_average,gen,44.62
ceval-hard,-,naive_average,gen,32.00
ceval,-,naive_average,gen,46.19
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
-------------------------------------------------------------------------------------------------------------------------------- THIS IS A DIVIDER --------------------------------------------------------------------------------------------------------------------------------
raw format
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-------------------------------
Model: opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b
ceval-computer_network: {'accuracy': 47.368421052631575}
ceval-operating_system: {'accuracy': 47.368421052631575}
ceval-computer_architecture: {'accuracy': 23.809523809523807}
ceval-college_programming: {'accuracy': 13.513513513513514}
ceval-college_physics: {'accuracy': 42.10526315789473}
ceval-college_chemistry: {'accuracy': 33.33333333333333}
ceval-advanced_mathematics: {'accuracy': 10.526315789473683}
ceval-probability_and_statistics: {'accuracy': 38.88888888888889}
ceval-discrete_mathematics: {'accuracy': 25.0}
ceval-electrical_engineer: {'accuracy': 27.027027027027028}
ceval-metrology_engineer: {'accuracy': 54.166666666666664}
ceval-high_school_mathematics: {'accuracy': 16.666666666666664}
ceval-high_school_physics: {'accuracy': 42.10526315789473}
ceval-high_school_chemistry: {'accuracy': 47.368421052631575}
ceval-high_school_biology: {'accuracy': 26.31578947368421}
ceval-middle_school_mathematics: {'accuracy': 36.84210526315789}
ceval-middle_school_biology: {'accuracy': 80.95238095238095}
ceval-middle_school_physics: {'accuracy': 47.368421052631575}
ceval-middle_school_chemistry: {'accuracy': 80.0}
ceval-veterinary_medicine: {'accuracy': 43.47826086956522}
ceval-college_economics: {'accuracy': 32.72727272727273}
ceval-business_administration: {'accuracy': 36.36363636363637}
ceval-marxism: {'accuracy': 68.42105263157895}
ceval-mao_zedong_thought: {'accuracy': 70.83333333333334}
ceval-education_science: {'accuracy': 55.172413793103445}
ceval-teacher_qualification: {'accuracy': 59.09090909090909}
ceval-high_school_politics: {'accuracy': 57.89473684210527}
ceval-high_school_geography: {'accuracy': 47.368421052631575}
ceval-middle_school_politics: {'accuracy': 71.42857142857143}
ceval-middle_school_geography: {'accuracy': 75.0}
ceval-modern_chinese_history: {'accuracy': 52.17391304347826}
ceval-ideological_and_moral_cultivation: {'accuracy': 73.68421052631578}
ceval-logic: {'accuracy': 27.27272727272727}
ceval-law: {'accuracy': 29.166666666666668}
ceval-chinese_language_and_literature: {'accuracy': 47.82608695652174}
ceval-art_studies: {'accuracy': 42.42424242424242}
ceval-professional_tour_guide: {'accuracy': 51.724137931034484}
ceval-legal_professional: {'accuracy': 34.78260869565217}
ceval-high_school_chinese: {'accuracy': 42.10526315789473}
ceval-high_school_history: {'accuracy': 65.0}
ceval-middle_school_history: {'accuracy': 86.36363636363636}
ceval-civil_servant: {'accuracy': 42.5531914893617}
ceval-sports_science: {'accuracy': 52.63157894736842}
ceval-plant_protection: {'accuracy': 40.909090909090914}
ceval-basic_medicine: {'accuracy': 68.42105263157895}
ceval-clinical_medicine: {'accuracy': 31.818181818181817}
ceval-urban_and_rural_planner: {'accuracy': 47.82608695652174}
ceval-accountant: {'accuracy': 36.734693877551024}
ceval-fire_engineer: {'accuracy': 38.70967741935484}
ceval-environmental_impact_assessment_engineer: {'accuracy': 51.61290322580645}
ceval-tax_accountant: {'accuracy': 36.734693877551024}
ceval-physician: {'accuracy': 42.857142857142854}
ceval-stem: {'naive_average': 39.210234139009884}
ceval-social-science: {'naive_average': 57.43003472631422}
ceval-humanities: {'naive_average': 50.22940845801545}
ceval-other: {'naive_average': 44.618935819046335}
ceval-hard: {'naive_average': 31.99926900584795}
ceval: {'naive_average': 46.18916955944266}
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
本站无任何商业行为
个人在线分享 » OpenCompass 大模型评测作业(lesson 7)
E-->