
How do I resolve an execution error on Alibaba Cloud Elastic Accelerated Computing Instances (EAIS) in the ModelScope community?

"在modelscope社区,阿里云弹性加速计算EAIS 执行错误,怎么解决?:

from swift.llm import (
    ModelType, get_vllm_engine, get_default_template_type,
    get_template, inference_vllm
)
import torch
import os

os.environ['CUDA_VISIBLE_DEVICES'] = '0'

model_type = ModelType.qwen1half_1_8b_chat
llm_engine = get_vllm_engine(model_type)  # load the model into a vLLM engine
template_type = get_default_template_type(model_type)
template = get_template(template_type, llm_engine.hf_tokenizer)
llm_engine.generation_config.max_new_tokens = 2048

# Batch inference over two queries.
request_list = [{'query': '蚂蚁'}, {'query': '大象'}]
resp_list = inference_vllm(llm_engine, template, request_list)
for request, resp in zip(request_list, resp_list):
    print(f"query: {request['query']}")
    print(f"response: {resp['response']}")

Error: ValueError: Bfloat16 is only supported on GPUs with compute capability of at least 8.0. Your Tesla P100-PCIE-16GB GPU has compute capability 6.0. You can use float16 instead by explicitly setting the dtype flag in CLI, for example: --dtype=half.
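
As the message states, bfloat16 requires compute capability 8.0 (Ampere) or newer, and the P100 is a Pascal-generation card at 6.0. A quick way to confirm what the current GPU supports, using only standard PyTorch calls:

import torch

# (major, minor) compute capability of device 0; a P100 reports (6, 0).
print(torch.cuda.get_device_capability(0))

# Whether bfloat16 is usable on this device; some PyTorch versions also count
# emulated support here, so (major, minor) >= (8, 0) is the more reliable test.
print(torch.cuda.is_bf16_supported())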

Change:
llm_engine = get_vllm_engine(model_type, torch_dtype=torch.float16)

Error: CUDA error: no kernel image is available for execution on the device. CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect. For debugging consider passing CUDA_LAUNCH_BLOCKING=1. Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
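
This second error usually means the installed binaries ship no CUDA kernels compiled for the P100's sm_60 architecture, so no dtype setting can work around it. You can list the architectures the installed PyTorch wheel was built for with a standard call (note that vLLM additionally ships its own compiled extensions with their own architecture list):

import torch

# Architectures the installed PyTorch build ships kernels for,
# e.g. ['sm_70', 'sm_75', 'sm_80', ...]; 'sm_60' must be present for a P100.
print(torch.cuda.get_arch_list())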
"

小小爱吃香菜 2024-04-16 20:28:47
1 Answer
  • vLLM does not support the P100. This answer was compiled from the DingTalk group "魔搭ModelScope开发者联盟群 ①".

    2024-04-16 22:04:00
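
Since vLLM's prebuilt kernels target compute capability 7.0 (Volta) and newer, the practical workaround on a P100 is to run the model through swift's plain PyTorch backend instead of the vLLM engine. A minimal sketch, assuming the installed swift version exposes get_model_tokenizer and inference alongside the vLLM helpers used above (verify against your swift version's documentation):

from swift.llm import (
    ModelType, get_model_tokenizer, get_default_template_type,
    get_template, inference
)
import torch

model_type = ModelType.qwen1half_1_8b_chat
# Use float16: the P100 (compute capability 6.0) has no native bfloat16.
model, tokenizer = get_model_tokenizer(model_type, torch_dtype=torch.float16)
template_type = get_default_template_type(model_type)
template = get_template(template_type, tokenizer)
model.generation_config.max_new_tokens = 2048

response, history = inference(model, template, '蚂蚁')
print(f'response: {response}')

This trades vLLM's batched throughput for compatibility: each query here runs through the ordinary transformers generation path, which only needs kernels your GPU actually has.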

