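[Help] Trying a large model from the model library on a free GPU compute instance fails: GPU cannot be used, model file not found - Alibaba Cloud Developer Community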


2024-02-22 809



Summary: I started a free GPU compute instance, but running a model on it raises errors.

I started an instance with the following GPU:

I ran the following code:

from modelscope import AutoModelForCausalLM, AutoTokenizer
import torch
torch.manual_seed(0)

path = 'OpenBMB/MiniCPM-2B-dpo-bf16'
tokenizer = AutoTokenizer.from_pretrained(path)
model = AutoModelForCausalLM.from_pretrained(path, torch_dtype=torch.bfloat16, device_map='cuda', trust_remote_code=True)

responds, history = model.chat(tokenizer, "山东省最高的山是哪座山, 它比黄山高还是矮？差距多少？", temperature=0.8, top_p=0.8)
print(responds)

It produced the following execution log:

2024-02-22 20:25:41,904 - modelscope - INFO - PyTorch version 2.1.2+cu121 Found.
2024-02-22 20:25:41,907 - modelscope - INFO - TensorFlow version 2.14.0 Found.
2024-02-22 20:25:41,909 - modelscope - INFO - Loading ast index from /mnt/workspace/.cache/modelscope/ast_indexer
2024-02-22 20:25:41,909 - modelscope - INFO - No valid ast index found from /mnt/workspace/.cache/modelscope/ast_indexer, generating ast index from prebuilt!
2024-02-22 20:25:41,967 - modelscope - INFO - Loading done! Current index file version is 1.12.0, with md5 509123dba36c5e70a95f6780df348471 and a total number of 964 components indexed
/opt/conda/lib/python3.10/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html
  from .autonotebook import tqdm as notebook_tqdm
2024-02-22 20:25:43.223187: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-02-22 20:25:43.225595: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-02-22 20:25:43.263147: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-02-22 20:25:43.263200: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-02-22 20:25:43.263228: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-02-22 20:25:43.271228: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-02-22 20:25:43.271779: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-02-22 20:25:44.177266: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Downloading: 100%|██████████| 0.99k/0.99k [00:00<00:00, 4.45MB/s]
Downloading: 100%|██████████| 9.54k/9.54k [00:00<00:00, 31.0MB/s]
Downloading: 100%|██████████| 113/113 [00:00<00:00, 1.00MB/s]
Downloading: 100%|██████████| 66.3k/66.3k [00:00<00:00, 8.84MB/s]
Downloading: 100%|██████████| 11.3k/11.3k [00:00<00:00, 36.5MB/s]
Downloading: 100%|██████████| 414/414 [00:00<00:00, 3.11MB/s]
Downloading: 100%|██████████| 5.92M/5.92M [00:00<00:00, 39.3MB/s]
Downloading: 100%|██████████| 1.90M/1.90M [00:00<00:00, 128MB/s]
Downloading: 100%|██████████| 1.11k/1.11k [00:00<00:00, 6.25MB/s]
2024-02-22 20:25:49,918 - modelscope - WARNING - Download interval is too small, use local cache
OSError: Error no file named pytorch_model.bin, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory /mnt/workspace/.cache/modelscope/OpenBMB/MiniCPM-2B-dpo-bf16.
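
The "Could not find cuda drivers" lines in the log above are emitted by TensorFlow at info level (the `I` prefix), while the script itself loads the model through PyTorch with `device_map='cuda'`. A minimal sketch for checking what PyTorch itself can see is given below. The helper name `describe_gpu_visibility` and the keys of its report are hypothetical, chosen only for illustration; the `torch.cuda` calls are standard PyTorch functions. It is a diagnostic aid under these assumptions, not a confirmed fix for this instance.

```python
import importlib.util
import shutil


def describe_gpu_visibility():
    """Collect basic facts about whether PyTorch can see a CUDA device.

    Hypothetical helper for illustration; the report keys are not part of
    any library API.
    """
    report = {
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "nvidia_smi_on_path": shutil.which("nvidia-smi") is not None,
        "cuda_available": False,
        "device_count": 0,
        "device_names": [],
    }
    if not report["torch_installed"]:
        return report

    # Imported lazily so the check also runs where torch is not installed.
    import torch

    report["cuda_available"] = torch.cuda.is_available()
    if report["cuda_available"]:
        report["device_count"] = torch.cuda.device_count()
        report["device_names"] = [
            torch.cuda.get_device_name(i) for i in range(report["device_count"])
        ]
    return report


if __name__ == "__main__":
    for key, value in describe_gpu_visibility().items():
        print(f"{key}: {value}")
```

If `cuda_available` comes back `True`, the TensorFlow message is probably unrelated to the PyTorch model load; if it comes back `False`, the instance's driver or GPU allocation would be the next thing to look at.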

I see two errors in it:

2024-02-22 20:25:43.271228: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
OSError: Error no file named pytorch_model.bin, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory /mnt/workspace/.cache/modelscope/OpenBMB/MiniCPM-2B-dpo-bf16.
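
For the `OSError`, one thing worth checking is which weight files actually exist in the cache directory it names. The largest download in the log is 5.92M, far smaller than the weights of a 2B-parameter bf16 model would be, so one possible reading is that the weights were never fetched. The sketch below lists candidate weight files in a directory. The helper `find_weight_files` is hypothetical; the first four patterns are taken from the error message, while the sharded `pytorch_model-*.bin` and `*.safetensors` patterns are assumptions about other common checkpoint layouts.

```python
from pathlib import Path

# The first four names come from the OSError message; the sharded and
# safetensors patterns are assumptions about other common layouts.
WEIGHT_PATTERNS = (
    "pytorch_model.bin",
    "tf_model.h5",
    "model.ckpt.index",
    "flax_model.msgpack",
    "pytorch_model-*.bin",
    "*.safetensors",
)


def find_weight_files(model_dir):
    """Return sorted names of files in model_dir that look like model weights."""
    root = Path(model_dir)
    if not root.is_dir():
        return []
    found = set()
    for pattern in WEIGHT_PATTERNS:
        found.update(p.name for p in root.glob(pattern) if p.is_file())
    return sorted(found)


if __name__ == "__main__":
    # Directory copied from the OSError in the log.
    cache_dir = "/mnt/workspace/.cache/modelscope/OpenBMB/MiniCPM-2B-dpo-bf16"
    print(find_weight_files(cache_dir) or "no weight files found")
```

An empty result would suggest the download was incomplete. A result containing only `.safetensors` files would instead suggest that the installed loader does not look for that format, which the file list in the error message hints at but does not prove.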