[Azure Environment] Tutorial: Deploying the DeepSeek R1 Model (1.5B Parameters) on an Azure Virtual Machine [Failed]

2025-02-08

Summary: Error 1 encountered: operator torchvision::nms does not exist. Error 2 encountered: RuntimeError: Failed to infer device type

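Preface

Anyone learning about large models wants to deploy one in their own environment and experiment with it. The recent arrival of DeepSeek prompted me to try exactly that.

Preparation

This article uses a virtual machine on Azure (16 vCPUs, 64 GB of memory) with a Python environment. The model is DeepSeek-R1-Distill-Qwen-1.5B.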

VLLM CPU : https://docs.vllm.ai/en/latest/getting_started/installation/cpu/index.html
VLLM Project : https://github.com/vllm-project/vllm
DeepSeek-R1 : https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B

Hands-on

Step 1: Create a Linux virtual machine in Azure

Use the Ubuntu Server 24.04 LTS - Gen2 image and select the size Standard D16ads v5 (16 vCPUs, 64 GiB memory).

Step 2: Log in to the Linux VM and install the Python environment

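sudo apt update
apt list --upgradable
# pip and venv are separate packages on Ubuntu; the tracebacks below show a virtual environment at ~/.venv
sudo apt install -y python3 python3-pip python3-venv
python3 -m venv ~/.venv
source ~/.venv/bin/activate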

Step 3: Install the compiler for the vLLM CPU build and download vLLM (https://github.com/vllm-project/vllm)

sudo apt-get update  -y
sudo apt-get install -y gcc-12 g++-12 libnuma-dev
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-12 10 --slave /usr/bin/g++ g++ /usr/bin/g++-12
 
git clone https://github.com/vllm-project/vllm.git
cd vllm
pip install -r requirements-cpu.txt
pip install -e .

Error message:

~$ git clone https://github.com/vllm-project/vllm.git

Cloning into 'vllm'...

fatal: unable to access 'https://github.com/vllm-project/vllm.git/': Failed to connect to github.com port 443 after 133171 ms: Couldn't connect to server

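The plan was to git clone the vLLM source and build it, but GitHub could not be reached from the VM. The vLLM ZIP file therefore had to be prepared in advance: upload it to a Storage Account, then download it on the Linux VM with wget and unzip it. With this roundabout approach, the vLLM source was downloaded successfully.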
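
The wget-and-unzip workaround above can also be scripted. The following is a minimal sketch, assuming the ZIP has been uploaded somewhere readable over HTTPS. The function name fetch_and_unpack is made up for illustration, and the URL in the usage note is a placeholder.

```python
import shutil
import urllib.request
import zipfile
from pathlib import Path


def fetch_and_unpack(url: str, dest_dir: str) -> list[str]:
    """Download a ZIP archive from `url` and extract it into `dest_dir`.

    Returns the names of the archive members that were extracted.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / "download.zip"

    # Stream the response to disk instead of holding the archive in memory.
    with urllib.request.urlopen(url) as response, open(archive, "wb") as out:
        shutil.copyfileobj(response, out)

    with zipfile.ZipFile(archive) as zf:
        names = zf.namelist()
        zf.extractall(dest)

    archive.unlink()  # the archive is no longer needed once extracted
    return names
```

For example, fetch_and_unpack("https://<your-storage-url>/vllm.zip", "vllm-src") would leave the unpacked source tree under vllm-src, where the URL stands for wherever the ZIP was uploaded.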

Once vLLM has been downloaded, change into the vLLM directory and install the dependencies listed in requirements-cpu.txt:

pip install --upgrade pip
pip install "cmake>=3.26" wheel packaging ninja "setuptools-scm>=8" numpy
pip install -v -r requirements-cpu.txt --extra-index-url https://download.pytorch.org/whl/cpu

Step 4: Once the dependencies are installed, install the CPU build of vLLM with the following command

VLLM_TARGET_DEVICE=cpu python setup.py install

Note: setup.py has a bug at this point, so the file must be edited before installing. Change the get_vllm_version() function in setup.py as follows:

def get_vllm_version() -> str:
    try:
        version = get_version(
            write_to="vllm/_version.py",  # TODO: move this to pyproject.toml
        )
    except LookupError:
        # Fall back to a placeholder when the version cannot be determined
        version = "0.0.0"

PS: Running "VLLM_TARGET_DEVICE=cpu python setup.py install" takes quite a long time.
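
One plausible reason the patch is needed (an assumption, not something the log above confirms) is that a source tree unpacked from a ZIP carries no .git metadata, so the version lookup raises LookupError. The fallback pattern can be sketched in isolation as follows. Both version_or_placeholder and no_git_metadata are hypothetical stand-ins written for this illustration and are not part of vLLM or setuptools-scm.

```python
from typing import Callable


def version_or_placeholder(detect_version: Callable[[], str],
                           placeholder: str = "0.0.0") -> str:
    """Return the detected version, or `placeholder` if detection fails."""
    try:
        return detect_version()
    except LookupError:
        # Same exception type that the patched setup.py catches.
        return placeholder


def no_git_metadata() -> str:
    # Hypothetical stand-in simulating a source tree without version metadata.
    raise LookupError("unable to detect version")
```

Calling version_or_placeholder(no_git_metadata) returns the placeholder instead of aborting the build, which is the behaviour the edited get_vllm_version() relies on.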

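Step 5: When the installation finishes, the CPU build of vLLM is in place

Because the VM runs in the China region, Hugging Face can be reached through a mirror endpoint. Add the following environment variable on the Linux VM so that requests go to the Hugging Face mirror.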

export HF_ENDPOINT=https://hf-mirror.com
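
To make the effect of HF_ENDPOINT concrete, here is a small sketch that builds the download URL for one file in a model repository. It assumes the usual Hugging Face URL layout (endpoint, repository, "resolve", revision, file name). The helper name model_file_url is made up for illustration and is not part of any library.

```python
import os

DEFAULT_ENDPOINT = "https://huggingface.co"


def model_file_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Build the URL for `filename` in `repo_id`, honouring HF_ENDPOINT."""
    # Read the variable at call time so a later `export` or os.environ change is seen.
    endpoint = os.environ.get("HF_ENDPOINT", DEFAULT_ENDPOINT).rstrip("/")
    return f"{endpoint}/{repo_id}/resolve/{revision}/{filename}"
```

With the variable exported as above, model_file_url("deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B", "config.json") points at hf-mirror.com rather than huggingface.co, which is a quick way to confirm the shell session picked up the setting.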

Step 6: Load the DeepSeek 1.5B model

vllm serve "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"

Error 1: operator torchvision::nms does not exist

File "/home/lbadmin/.venv/lib/python3.12/site-packages/transformers/utils/import_utils.py", line 1805, in __getattr__
    module = self._get_module(self._class_to_module[name])
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lbadmin/.venv/lib/python3.12/site-packages/transformers/utils/import_utils.py", line 1819, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import transformers.processing_utils because of the following error (look up to see its traceback):
operator torchvision::nms does not exist

Reinstalling torchvision resolved this problem.

pip install --force-reinstall torchvision --extra-index-url https://download.pytorch.org/whl/torchvision/
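
The torchvision::nms error is commonly a sign that torch and torchvision come from mismatched builds (a general observation, not something this log proves). A quick check is to compare the installed versions and their local build tags, such as +cpu. The sketch below does only that. The helper names are made up for illustration, and a matching tag does not by itself guarantee that the two packages are compatible.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def build_tag(version: Optional[str]) -> Optional[str]:
    """Return the local build tag after '+', e.g. 'cpu' for '2.5.1+cpu'."""
    if version is None or "+" not in version:
        return None
    return version.split("+", 1)[1]


def same_build(torch_version: Optional[str], vision_version: Optional[str]) -> bool:
    """True when both packages are installed and carry the same build tag."""
    if torch_version is None or vision_version is None:
        return False
    return build_tag(torch_version) == build_tag(vision_version)


def report() -> None:
    """Print the installed torch/torchvision versions and whether the tags match."""
    torch_v = installed_version("torch")
    vision_v = installed_version("torchvision")
    print(f"torch={torch_v} torchvision={vision_v} "
          f"same build tag: {same_build(torch_v, vision_v)}")
```

Running report() inside the virtual environment before and after the reinstall shows whether the two packages ended up on the same kind of build.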

Error 2: RuntimeError: Failed to infer device type

File "/home/lbadmin/.venv/lib/python3.12/site-packages/vllm/engine/arg_utils.py", line 1074, in create_engine_config
    device_config = DeviceConfig(device=self.device)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lbadmin/.venv/lib/python3.12/site-packages/vllm/config.py", line 1626, in __init__
    raise RuntimeError("Failed to infer device type")
RuntimeError: Failed to infer device type

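The operation failed! Judging from material found online, the cause appears to be that the selected model cannot currently run on CPU. (The traceback shows the exception being raised while vLLM builds its DeviceConfig, so a vLLM installation that did not detect the CPU backend may be another possible cause; this was not verified.)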

Experiment failed!

Had the model loaded successfully in Step 6, it could have been tested with the following request:

curl -X POST "http://<public ip>:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
}'
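
The same request can be sent from Python using only the standard library. This is a minimal sketch under the same assumption as the curl command, namely that the server is listening on port 8000 and answers in the OpenAI-style chat completions format. The function names build_chat_request and ask are made up for illustration.

```python
import json
import urllib.request

MODEL = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"


def build_chat_request(host: str, prompt: str, port: int = 8000) -> urllib.request.Request:
    """Build the same POST request as the curl command, without sending it."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"http://{host}:{port}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(host: str, prompt: str, timeout: float = 60.0) -> str:
    """Send the request and return the assistant's reply text."""
    request = build_chat_request(host, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.load(resp)
    # Assumes the OpenAI-style response layout: choices[0].message.content
    return body["choices"][0]["message"]["content"]
```

Calling ask("<public ip>", "What is the capital of France?") would print nothing by itself; it returns the reply text, so wrap it in print() when running it interactively.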

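This article is simply a note to myself. In 2025, keep learning about large models!

When facing problems in a complex environment, the way to investigate is this: let muddy water be still and it slowly clears; let what is at rest move and it slowly comes to life. In the cloud, it is exactly so!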