【人工智能】Transformers之Pipeline(概述):30w+大模型极简应用

简介: 【人工智能】Transformers之Pipeline(概述):30w+大模型极简应用

一、引言

pipeline(管道)是huggingface transformers库中一种极简方式使用大模型推理的抽象,将所有大模型分为语音(Audio)、计算机视觉(Computer vision)、自然语言处理(NLP)、多模态(Multimodal)等4大类,28小类任务(tasks)。共计覆盖32万个模型

本文对pipeline进行整体介绍,之后本专栏以每个task为主题,分别介绍各种task使用方法。

二、pipeline库

2.1 概述

管道是一种使用模型进行推理的简单而好用的方法。这些管道是从库中抽象出大部分复杂代码的对象,提供了专用于多项任务的简单 API,包括命名实体识别、掩码语言建模、情感分析、特征提取和问答。在使用上,主要有2种方法

  • 使用task实例化pipeline对象
  • 使用model实例化pipeline对象

2.2 使用task实例化pipeline对象

2.2.1 基于task实例化“自动语音识别

自动语音识别的task为automatic-speech-recognition:

import os
os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"
os.environ["CUDA_VISIBLE_DEVICES"] = "2"
 
from transformers import pipeline
 
speech_file = "./output_video_enhanced.mp3"
pipe = pipeline(task="automatic-speech-recognition")
result = pipe(speech_file)
print(result)

2.2.2 task列表

task共计28类,按首字母排序,列表如下,直接替换2.2.1代码中的pipeline的task即可应用:

2.2.3 task默认模型

针对每一个task,pipeline默认配置了模型,可以通过pipeline源代码查看:

SUPPORTED_TASKS = {
    "audio-classification": {
        "impl": AudioClassificationPipeline,
        "tf": (),
        "pt": (AutoModelForAudioClassification,) if is_torch_available() else (),
        "default": {"model": {"pt": ("superb/wav2vec2-base-superb-ks", "372e048")}},
        "type": "audio",
    },
    "automatic-speech-recognition": {
        "impl": AutomaticSpeechRecognitionPipeline,
        "tf": (),
        "pt": (AutoModelForCTC, AutoModelForSpeechSeq2Seq) if is_torch_available() else (),
        "default": {"model": {"pt": ("facebook/wav2vec2-base-960h", "55bb623")}},
        "type": "multimodal",
    },
    "text-to-audio": {
        "impl": TextToAudioPipeline,
        "tf": (),
        "pt": (AutoModelForTextToWaveform, AutoModelForTextToSpectrogram) if is_torch_available() else (),
        "default": {"model": {"pt": ("suno/bark-small", "645cfba")}},
        "type": "text",
    },
    "feature-extraction": {
        "impl": FeatureExtractionPipeline,
        "tf": (TFAutoModel,) if is_tf_available() else (),
        "pt": (AutoModel,) if is_torch_available() else (),
        "default": {
            "model": {
                "pt": ("distilbert/distilbert-base-cased", "935ac13"),
                "tf": ("distilbert/distilbert-base-cased", "935ac13"),
            }
        },
        "type": "multimodal",
    },
    "text-classification": {
        "impl": TextClassificationPipeline,
        "tf": (TFAutoModelForSequenceClassification,) if is_tf_available() else (),
        "pt": (AutoModelForSequenceClassification,) if is_torch_available() else (),
        "default": {
            "model": {
                "pt": ("distilbert/distilbert-base-uncased-finetuned-sst-2-english", "af0f99b"),
                "tf": ("distilbert/distilbert-base-uncased-finetuned-sst-2-english", "af0f99b"),
            },
        },
        "type": "text",
    },
    "token-classification": {
        "impl": TokenClassificationPipeline,
        "tf": (TFAutoModelForTokenClassification,) if is_tf_available() else (),
        "pt": (AutoModelForTokenClassification,) if is_torch_available() else (),
        "default": {
            "model": {
                "pt": ("dbmdz/bert-large-cased-finetuned-conll03-english", "f2482bf"),
                "tf": ("dbmdz/bert-large-cased-finetuned-conll03-english", "f2482bf"),
            },
        },
        "type": "text",
    },
    "question-answering": {
        "impl": QuestionAnsweringPipeline,
        "tf": (TFAutoModelForQuestionAnswering,) if is_tf_available() else (),
        "pt": (AutoModelForQuestionAnswering,) if is_torch_available() else (),
        "default": {
            "model": {
                "pt": ("distilbert/distilbert-base-cased-distilled-squad", "626af31"),
                "tf": ("distilbert/distilbert-base-cased-distilled-squad", "626af31"),
            },
        },
        "type": "text",
    },
    "table-question-answering": {
        "impl": TableQuestionAnsweringPipeline,
        "pt": (AutoModelForTableQuestionAnswering,) if is_torch_available() else (),
        "tf": (TFAutoModelForTableQuestionAnswering,) if is_tf_available() else (),
        "default": {
            "model": {
                "pt": ("google/tapas-base-finetuned-wtq", "69ceee2"),
                "tf": ("google/tapas-base-finetuned-wtq", "69ceee2"),
            },
        },
        "type": "text",
    },
    "visual-question-answering": {
        "impl": VisualQuestionAnsweringPipeline,
        "pt": (AutoModelForVisualQuestionAnswering,) if is_torch_available() else (),
        "tf": (),
        "default": {
            "model": {"pt": ("dandelin/vilt-b32-finetuned-vqa", "4355f59")},
        },
        "type": "multimodal",
    },
    "document-question-answering": {
        "impl": DocumentQuestionAnsweringPipeline,
        "pt": (AutoModelForDocumentQuestionAnswering,) if is_torch_available() else (),
        "tf": (),
        "default": {
            "model": {"pt": ("impira/layoutlm-document-qa", "52e01b3")},
        },
        "type": "multimodal",
    },
    "fill-mask": {
        "impl": FillMaskPipeline,
        "tf": (TFAutoModelForMaskedLM,) if is_tf_available() else (),
        "pt": (AutoModelForMaskedLM,) if is_torch_available() else (),
        "default": {
            "model": {
                "pt": ("distilbert/distilroberta-base", "ec58a5b"),
                "tf": ("distilbert/distilroberta-base", "ec58a5b"),
            }
        },
        "type": "text",
    },
    "summarization": {
        "impl": SummarizationPipeline,
        "tf": (TFAutoModelForSeq2SeqLM,) if is_tf_available() else (),
        "pt": (AutoModelForSeq2SeqLM,) if is_torch_available() else (),
        "default": {
            "model": {"pt": ("sshleifer/distilbart-cnn-12-6", "a4f8f3e"), "tf": ("google-t5/t5-small", "d769bba")}
        },
        "type": "text",
    },
    # This task is a special case as it's parametrized by SRC, TGT languages.
    "translation": {
        "impl": TranslationPipeline,
        "tf": (TFAutoModelForSeq2SeqLM,) if is_tf_available() else (),
        "pt": (AutoModelForSeq2SeqLM,) if is_torch_available() else (),
        "default": {
            ("en", "fr"): {"model": {"pt": ("google-t5/t5-base", "686f1db"), "tf": ("google-t5/t5-base", "686f1db")}},
            ("en", "de"): {"model": {"pt": ("google-t5/t5-base", "686f1db"), "tf": ("google-t5/t5-base", "686f1db")}},
            ("en", "ro"): {"model": {"pt": ("google-t5/t5-base", "686f1db"), "tf": ("google-t5/t5-base", "686f1db")}},
        },
        "type": "text",
    },
    "text2text-generation": {
        "impl": Text2TextGenerationPipeline,
        "tf": (TFAutoModelForSeq2SeqLM,) if is_tf_available() else (),
        "pt": (AutoModelForSeq2SeqLM,) if is_torch_available() else (),
        "default": {"model": {"pt": ("google-t5/t5-base", "686f1db"), "tf": ("google-t5/t5-base", "686f1db")}},
        "type": "text",
    },
    "text-generation": {
        "impl": TextGenerationPipeline,
        "tf": (TFAutoModelForCausalLM,) if is_tf_available() else (),
        "pt": (AutoModelForCausalLM,) if is_torch_available() else (),
        "default": {"model": {"pt": ("openai-community/gpt2", "6c0e608"), "tf": ("openai-community/gpt2", "6c0e608")}},
        "type": "text",
    },
    "zero-shot-classification": {
        "impl": ZeroShotClassificationPipeline,
        "tf": (TFAutoModelForSequenceClassification,) if is_tf_available() else (),
        "pt": (AutoModelForSequenceClassification,) if is_torch_available() else (),
        "default": {
            "model": {
                "pt": ("facebook/bart-large-mnli", "c626438"),
                "tf": ("FacebookAI/roberta-large-mnli", "130fb28"),
            },
            "config": {
                "pt": ("facebook/bart-large-mnli", "c626438"),
                "tf": ("FacebookAI/roberta-large-mnli", "130fb28"),
            },
        },
        "type": "text",
    },
    "zero-shot-image-classification": {
        "impl": ZeroShotImageClassificationPipeline,
        "tf": (TFAutoModelForZeroShotImageClassification,) if is_tf_available() else (),
        "pt": (AutoModelForZeroShotImageClassification,) if is_torch_available() else (),
        "default": {
            "model": {
                "pt": ("openai/clip-vit-base-patch32", "f4881ba"),
                "tf": ("openai/clip-vit-base-patch32", "f4881ba"),
            }
        },
        "type": "multimodal",
    },
    "zero-shot-audio-classification": {
        "impl": ZeroShotAudioClassificationPipeline,
        "tf": (),
        "pt": (AutoModel,) if is_torch_available() else (),
        "default": {
            "model": {
                "pt": ("laion/clap-htsat-fused", "973b6e5"),
            }
        },
        "type": "multimodal",
    },
    "image-classification": {
        "impl": ImageClassificationPipeline,
        "tf": (TFAutoModelForImageClassification,) if is_tf_available() else (),
        "pt": (AutoModelForImageClassification,) if is_torch_available() else (),
        "default": {
            "model": {
                "pt": ("google/vit-base-patch16-224", "5dca96d"),
                "tf": ("google/vit-base-patch16-224", "5dca96d"),
            }
        },
        "type": "image",
    },
    "image-feature-extraction": {
        "impl": ImageFeatureExtractionPipeline,
        "tf": (TFAutoModel,) if is_tf_available() else (),
        "pt": (AutoModel,) if is_torch_available() else (),
        "default": {
            "model": {
                "pt": ("google/vit-base-patch16-224", "3f49326"),
                "tf": ("google/vit-base-patch16-224", "3f49326"),
            }
        },
        "type": "image",
    },
    "image-segmentation": {
        "impl": ImageSegmentationPipeline,
        "tf": (),
        "pt": (AutoModelForImageSegmentation, AutoModelForSemanticSegmentation) if is_torch_available() else (),
        "default": {"model": {"pt": ("facebook/detr-resnet-50-panoptic", "fc15262")}},
        "type": "multimodal",
    },
    "image-to-text": {
        "impl": ImageToTextPipeline,
        "tf": (TFAutoModelForVision2Seq,) if is_tf_available() else (),
        "pt": (AutoModelForVision2Seq,) if is_torch_available() else (),
        "default": {
            "model": {
                "pt": ("ydshieh/vit-gpt2-coco-en", "65636df"),
                "tf": ("ydshieh/vit-gpt2-coco-en", "65636df"),
            }
        },
        "type": "multimodal",
    },
    "object-detection": {
        "impl": ObjectDetectionPipeline,
        "tf": (),
        "pt": (AutoModelForObjectDetection,) if is_torch_available() else (),
        "default": {"model": {"pt": ("facebook/detr-resnet-50", "2729413")}},
        "type": "multimodal",
    },
    "zero-shot-object-detection": {
        "impl": ZeroShotObjectDetectionPipeline,
        "tf": (),
        "pt": (AutoModelForZeroShotObjectDetection,) if is_torch_available() else (),
        "default": {"model": {"pt": ("google/owlvit-base-patch32", "17740e1")}},
        "type": "multimodal",
    },
    "depth-estimation": {
        "impl": DepthEstimationPipeline,
        "tf": (),
        "pt": (AutoModelForDepthEstimation,) if is_torch_available() else (),
        "default": {"model": {"pt": ("Intel/dpt-large", "e93beec")}},
        "type": "image",
    },
    "video-classification": {
        "impl": VideoClassificationPipeline,
        "tf": (),
        "pt": (AutoModelForVideoClassification,) if is_torch_available() else (),
        "default": {"model": {"pt": ("MCG-NJU/videomae-base-finetuned-kinetics", "4800870")}},
        "type": "video",
    },
    "mask-generation": {
        "impl": MaskGenerationPipeline,
        "tf": (),
        "pt": (AutoModelForMaskGeneration,) if is_torch_available() else (),
        "default": {"model": {"pt": ("facebook/sam-vit-huge", "997b15")}},
        "type": "multimodal",
    },
    "image-to-image": {
        "impl": ImageToImagePipeline,
        "tf": (),
        "pt": (AutoModelForImageToImage,) if is_torch_available() else (),
        "default": {"model": {"pt": ("caidas/swin2SR-classical-sr-x2-64", "4aaedcb")}},
        "type": "image",
    },
}

2.3 使用model实例化pipeline对象

2.3.1 基于model实例化“自动语音识别”

如果不想使用task中默认的模型,可以指定huggingface中的模型:

import os
os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"
os.environ["CUDA_VISIBLE_DEVICES"] = "2"
 
from transformers import pipeline
 
speech_file = "./output_video_enhanced.mp3"
#transcriber = pipeline(task="automatic-speech-recognition", model="openai/whisper-medium")
pipe = pipeline(model="openai/whisper-medium")
result = pipe(speech_file)
print(result)

2.3.2 查看model与task的对应关系

可以登录https://huggingface.co/tasks查看

三、总结

本文为transformers之pipeline专栏的第0篇,后面会以每个task为一篇,共计讲述28+个tasks的用法,通过28个tasks的pipeline使用学习,可以掌握语音、计算机视觉、自然语言处理、多模态乃至强化学习等30w+个huggingface上的开源大模型。让你成为大模型领域的专家!


目录
相关文章
|
17天前
|
机器学习/深度学习 人工智能 运维
人工智能在事件管理中的应用
人工智能在事件管理中的应用
67 21
|
1月前
|
机器学习/深度学习 人工智能 自然语言处理
人工智能在医疗诊断中的应用与前景####
本文深入探讨了人工智能(AI)技术在医疗诊断领域的应用现状、面临的挑战及未来发展趋势。通过分析AI如何辅助医生进行疾病诊断,提高诊断效率和准确性,以及其在个性化医疗中的潜力,文章揭示了AI技术对医疗行业变革的推动作用。同时,也指出了数据隐私、算法偏见等伦理问题,并展望了AI与人类医生协同工作的前景。 ####
124 0
|
2天前
|
机器学习/深度学习 数据采集 人工智能
人工智能在变更管理中的应用:变革的智能化之路
人工智能在变更管理中的应用:变革的智能化之路
28 13
|
6天前
|
机器学习/深度学习 人工智能 自然语言处理
人工智能在客服领域有哪些应用?
人工智能正在彻底改变着传统客服行业,它不仅拓展了业务边界,还推动着整个行业向更高效、更人性化方向迈进。
39 7
|
21天前
|
机器学习/深度学习 数据采集 人工智能
人工智能在农业中的应用:智慧农业的未来
人工智能在农业中的应用:智慧农业的未来
55 11
|
1月前
|
人工智能 缓存 异构计算
云原生AI加速生成式人工智能应用的部署构建
本文探讨了云原生技术背景下,尤其是Kubernetes和容器技术的发展,对模型推理服务带来的挑战与优化策略。文中详细介绍了Knative的弹性扩展机制,包括HPA和CronHPA,以及针对传统弹性扩展“滞后”问题提出的AHPA(高级弹性预测)。此外,文章重点介绍了Fluid项目,它通过分布式缓存优化了模型加载的I/O操作,显著缩短了推理服务的冷启动时间,特别是在处理大规模并发请求时表现出色。通过实际案例,展示了Fluid在vLLM和Qwen模型推理中的应用效果,证明了其在提高模型推理效率和响应速度方面的优势。
云原生AI加速生成式人工智能应用的部署构建
|
1月前
|
数据采集 人工智能 移动开发
盘点人工智能在医疗诊断领域的应用
人工智能在医疗诊断领域的应用广泛,包括医学影像诊断、疾病预测与风险评估、病理诊断、药物研发、医疗机器人、远程医疗诊断和智能辅助诊断系统等。这些应用提高了诊断的准确性和效率,改善了患者的治疗效果和生活质量。然而,数据质量和安全性、AI系统的透明度等问题仍需关注和解决。
247 10
|
1月前
|
机器学习/深度学习 人工智能 算法
探索人工智能在医疗诊断中的应用
本文深入探讨了人工智能(AI)技术在医疗诊断领域的革新性应用,通过分析AI如何助力提高诊断准确性、效率以及个性化治疗方案的制定,揭示了AI技术为现代医学带来的巨大潜力和挑战。文章还展望了AI在未来医疗中的发展趋势,强调了跨学科合作的重要性。 ###
110 9
|
1月前
|
机器学习/深度学习 数据采集 人工智能
深度探索:人工智能在医疗影像诊断中的应用与挑战####
本文旨在深入剖析人工智能(AI)技术在医疗影像诊断领域的最新进展、核心优势、面临的挑战及未来发展趋势。通过综合分析当前AI算法在提高诊断准确性、效率及可解释性方面的贡献,结合具体案例,揭示其在临床实践中的实际价值与潜在局限。文章还展望了AI如何与其他先进技术融合,以推动医疗影像学迈向更高层次的智能化时代。 ####
|
1月前
|
机器学习/深度学习 人工智能 自然语言处理
探索未来编程:Python在人工智能领域的深度应用与前景###
本文将深入探讨Python语言在人工智能(AI)领域的广泛应用,从基础原理到前沿实践,揭示其如何成为推动AI技术创新的关键力量。通过分析Python的简洁性、灵活性以及丰富的库支持,展现其在机器学习、深度学习、自然语言处理等子领域的卓越贡献,并展望Python在未来AI发展中的核心地位与潜在变革。 ###

热门文章

最新文章