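pipeline in Detail

Introduction

The pipeline abstraction in the transformers library is a minimal way to run inference with pretrained models. It groups the supported tasks into four broad categories, Audio, Computer Vision, Natural Language Processing (NLP), and Multimodal, covering 28 task types in total.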

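1. Basic concepts

Pipeline: an object that wraps the whole flow from input to output, including data preprocessing, model inference, and result post-processing.
Task: the specific kind of task to run, such as text classification or question answering. Tasks are not limited to NLP; audio, vision, and multimodal tasks are selected the same way.
Model: the pretrained model that performs the task.
Tokenizer: converts raw text into a form the model can consume, namely token IDs plus auxiliary tensors such as the attention mask. The model itself maps those IDs to embeddings.
Processor: for some tasks, components other than a tokenizer (for example an image processor or an audio feature extractor) are needed to prepare or interpret the data.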
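
To make the three stages concrete, here is a minimal sketch of a pipeline-like object in plain Python. ToyPipeline, its word lists, and its scoring rule are hypothetical stand-ins invented for illustration. They are not how transformers implements anything; they only share the preprocess / forward / postprocess shape.

```python
class ToyPipeline:
    """Hypothetical stand-in mirroring the preprocess -> forward -> postprocess flow."""

    POSITIVE_WORDS = {"love", "great", "good"}  # assumed toy vocabulary
    NEGATIVE_WORDS = {"hate", "bad", "awful"}   # assumed toy vocabulary

    def preprocess(self, text):
        # Stands in for the tokenizer: lowercase, drop punctuation, split on whitespace.
        return text.lower().replace("!", "").replace(".", "").split()

    def forward(self, tokens):
        # Stands in for the model: one raw score per class.
        positive = sum(token in self.POSITIVE_WORDS for token in tokens)
        negative = sum(token in self.NEGATIVE_WORDS for token in tokens)
        return {"POSITIVE": positive, "NEGATIVE": negative}

    def postprocess(self, scores):
        # Pick the class with the highest raw score.
        label = max(scores, key=scores.get)
        return {"label": label, "raw_score": scores[label]}

    def __call__(self, text):
        return self.postprocess(self.forward(self.preprocess(text)))


toy = ToyPipeline()
print(toy("I love using the transformers library!"))
```

Calling the object runs all three stages in order, which is the same convenience a real pipeline offers around a tokenizer and a model.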

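2. Supported tasks

audio-classification - Audio classification: classifies an audio clip, for example by the kind of sound it contains (a dog barking, a car, and so on).
automatic-speech-recognition (ASR) - Automatic speech recognition: transcribes human speech into text.
text-to-audio - Text to audio: generates audio, such as speech, from text.
feature-extraction - Feature extraction: returns the model's hidden states for the input as feature vectors, typically for use in later processing steps.
text-classification - Text classification: assigns a text to one of a set of predefined classes.
token-classification - Token classification: classifies each word or subword unit in a text, as in named entity recognition.
question-answering - Question answering: given a question and a context passage, extracts the answer from the context.
table-question-answering - Table question answering: answers questions based on tabular data.
visual-question-answering (VQA) - Visual question answering: answers a text question about an image.
document-question-answering - Document question answering: answers questions about a document, typically supplied as an image such as a scanned page.
fill-mask - Fill mask: predicts the masked token(s) in a sentence.
summarization - Summarization: produces a summary of a text.
translation - Translation: translates text from one language into another.
text2text-generation - Text-to-text generation: generates output text conditioned on input text.
text-generation - Text generation: continues a prompt with newly generated text.
zero-shot-classification - Zero-shot classification: classifies text against labels supplied at inference time, with no task-specific training.
zero-shot-image-classification - Zero-shot image classification: classifies images against labels supplied at inference time, including classes not seen during training.
zero-shot-audio-classification - Zero-shot audio classification: classifies audio against labels supplied at inference time, including sound classes not seen during training.
image-classification - Image classification: predicts the class of an image.
image-feature-extraction - Image feature extraction: extracts feature vectors from an image for further analysis.
image-segmentation - Image segmentation: partitions an image into regions or segments.
image-to-text - Image to text: describes the content of an image in text.
object-detection - Object detection: identifies and localizes objects in an image.
zero-shot-object-detection - Zero-shot object detection: detects objects described by free-text labels, including classes not seen during training.
depth-estimation - Depth estimation: estimates the distance between the camera and the objects in an image.
video-classification - Video classification: classifies a video clip.
mask-generation - Mask generation: produces masks that mark specific regions of an image.
image-to-image - Image to image: transforms an image into another image with different style or content.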

Usage

Basic usage: sentiment analysis

from transformers import pipeline

# Create a pipeline for sentiment analysis (uses the task's default model)
classifier = pipeline("sentiment-analysis")

# Run sentiment analysis on a piece of text
result = classifier("I love using the transformers library!")
print(result)
# Output:
# [{'label': 'POSITIVE', 'score': 0.9993904829025269}]
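
The task string "sentiment-analysis" used above does not appear in the task list because it is an alias of text-classification. The sketch below mimics alias resolution with a plain dictionary. TASK_ALIASES_SKETCH and resolve_task are hypothetical names, and the table is only assumed to mirror aliases the library accepts, so check the installed version before relying on it.

```python
# Hypothetical alias table; assumed to mirror aliases accepted by transformers.
TASK_ALIASES_SKETCH = {
    "sentiment-analysis": "text-classification",
    "ner": "token-classification",
}


def resolve_task(task):
    """Return the canonical task name, or the input unchanged if it is not an alias."""
    return TASK_ALIASES_SKETCH.get(task, task)


print(resolve_task("sentiment-analysis"))
print(resolve_task("fill-mask"))
```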

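Using a specific model and tokenizer

Model used: ahmedrachid/FinancialBERT-Sentiment-Analysis
FinancialBERT is a BERT model pretrained on a large corpus of financial text, intended to support financial NLP research and practice. This checkpoint classifies sentiment as negative, neutral, or positive, and currently supports English only.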

from transformers import BertTokenizer, BertForSequenceClassification
from transformers import pipeline

model = BertForSequenceClassification.from_pretrained("ahmedrachid/FinancialBERT-Sentiment-Analysis", num_labels=3)
tokenizer = BertTokenizer.from_pretrained("ahmedrachid/FinancialBERT-Sentiment-Analysis")

nlp = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)

sentences = ["Operating profit rose to EUR 13.1 mn from EUR 8.7 mn in the corresponding period in 2007 representing 7.7 % of net sales.",  
             "Bids or offers include at least 1,000 shares and the value of the shares must correspond to at least EUR 4,000.", 
             "Raute reported a loss per share of EUR 0.86 for the first half of 2009 , against EPS of EUR 0.74 in the corresponding period of 2008.", 
             ]
results = nlp(sentences)
print(results)
# Output:
# [{'label': 'positive', 'score': 0.9998133778572083}, {'label': 'neutral', 'score': 0.9997822642326355}, {'label': 'negative', 'score': 0.9877365231513977}]
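
The pipeline returns one result per input, in input order, so the results can be zipped back onto the sentences. The sketch below reuses the output printed above as literal data (no model is loaded) together with shortened versions of the sentences. The summarize helper and its 0.99 threshold are hypothetical choices for illustration.

```python
# Shortened versions of the three input sentences, in the same order.
sentences = [
    "Operating profit rose to EUR 13.1 mn ...",
    "Bids or offers include at least 1,000 shares ...",
    "Raute reported a loss per share of EUR 0.86 ...",
]
# Output copied from the run above.
results = [
    {"label": "positive", "score": 0.9998133778572083},
    {"label": "neutral", "score": 0.9997822642326355},
    {"label": "negative", "score": 0.9877365231513977},
]


def summarize(sentences, results, min_score=0.99):
    """Pair each sentence with its label and flag predictions below min_score."""
    rows = []
    for sentence, result in zip(sentences, results):
        confident = result["score"] >= min_score
        rows.append((result["label"], round(result["score"], 4), confident, sentence))
    return rows


rows = summarize(sentences, results)
for row in rows:
    print(row)
```

With this threshold the third prediction is flagged as not confident, which is the kind of check one might add before acting on a label.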

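Inside the pipeline: implementation details

1. Initialize the tokenizer and model

First, we initialize a tokenizer and a model. Here we use distilbert-base-uncased-finetuned-sst-2-english, a DistilBERT checkpoint fine-tuned on SST-2 for sentiment analysis.

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification, TextClassificationPipeline

model_name = "distilbert-base-uncased-finetuned-sst-2-english"
# Initialize the tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Initialize the model
model = AutoModelForSequenceClassification.from_pretrained(model_name)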

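2. Create the pipeline

Now we can create a TextClassificationPipeline object. It wraps the tokenizer and the model and exposes a concise interface for processing inputs.

# Create the pipeline (GPU 0 if CUDA is available, otherwise CPU)
# Note: this variable name shadows transformers.pipeline if that function was imported earlier
pipeline = TextClassificationPipeline(model=model, tokenizer=tokenizer, device=0 if torch.cuda.is_available() else -1)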

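3. Preprocess the data

When we call the pipeline, it automatically runs the tokenizer on the input. Here we perform that step by hand:

# Input text
text = "I love using the transformers library!"

# Preprocess with the tokenizer, returning PyTorch tensors
inputs = tokenizer(text, return_tensors="pt")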
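
The tokenizer's job is to turn text into integer token IDs plus an attention mask. The toy sketch below shows the idea with a made-up vocabulary: the text is split into tokens, wrapped in special tokens, and mapped to IDs. TOY_VOCAB, its IDs, and toy_encode are hypothetical; a real tokenizer uses a learned subword vocabulary and would split unknown words into pieces rather than map them to [UNK] wholesale.

```python
# Made-up vocabulary; real tokenizers ship a learned subword vocabulary.
TOY_VOCAB = {"[PAD]": 0, "[UNK]": 1, "[CLS]": 2, "[SEP]": 3,
             "i": 4, "love": 5, "using": 6, "the": 7, "library": 8, "!": 9}


def toy_encode(text):
    """Map text to token IDs plus an attention mask (1 marks a real token)."""
    words = text.lower().replace("!", " !").split()
    tokens = ["[CLS]"] + words + ["[SEP]"]
    input_ids = [TOY_VOCAB.get(token, TOY_VOCAB["[UNK]"]) for token in tokens]
    return {"input_ids": input_ids, "attention_mask": [1] * len(input_ids)}


encoded = toy_encode("I love using the transformers library!")
print(encoded["input_ids"])
print(encoded["attention_mask"])
```

The word "transformers" is missing from the toy vocabulary, so it maps to the [UNK] ID.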

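4. Run model inference

The preprocessed inputs are then passed to the model for inference. Here we perform that step by hand:

# Creating the pipeline may have moved the model to the GPU, so put the inputs on the same device
inputs = inputs.to(model.device)

# Pass the inputs to the model without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)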

5. Post-process the results

The model's output is usually a tensor of logits. We convert the logits into a probability distribution and pick the class with the highest probability. Here we perform that step by hand:

# Get the logits
logits = outputs.logits

# Convert the logits to a probability distribution
probabilities = torch.softmax(logits, dim=-1)

# Get the class with the highest probability
predicted_class_id = torch.argmax(probabilities, dim=-1).item()
predicted_label = model.config.id2label[predicted_class_id]
predicted_probability = probabilities[0][predicted_class_id].item()

print(f"Predicted label: {predicted_label}, Probability: {predicted_probability:.4f}")
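
The softmax step can be checked by hand. The sketch below reimplements it with the standard library on made-up logits. The values in logits and the two-entry id2label mapping are illustrative assumptions, not output from the model.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


logits = [-4.0, 4.0]                       # made-up logits for two classes
id2label = {0: "NEGATIVE", 1: "POSITIVE"}  # assumed label mapping

probabilities = softmax(logits)
# argmax: index of the largest probability
predicted_class_id = max(range(len(probabilities)), key=probabilities.__getitem__)
print(id2label[predicted_class_id], round(probabilities[predicted_class_id], 4))
```

Because softmax is monotonic, taking the argmax of the logits directly would select the same class; the conversion matters only when the probability itself is reported.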

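Processing multiple inputs with the pipeline
Finally, we show how to pass several inputs to the pipeline at once. The pipeline accepts a list and returns one result per item. By default it does not batch the forward passes; passing a batch_size argument when calling it enables batching, which may or may not improve speed depending on the hardware and the input lengths.

# A list of input texts
texts = ["I love this place.", "This is not good."]

# Run the pipeline on the whole list
results = pipeline(texts)

# Print the results
for result in results:
    print(f"Label: {result['label']}, Score: {result['score']:.4f}")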
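
Batching groups several inputs into one forward pass, which requires padding every sequence in a batch to the length of the longest one. The sketch below shows that bookkeeping on plain lists of token IDs. make_batches, the pad ID 0, and the example IDs are hypothetical and only illustrate the idea.

```python
def make_batches(sequences, batch_size, pad_id=0):
    """Split sequences into batches and pad each batch to its longest member."""
    batches = []
    for start in range(0, len(sequences), batch_size):
        chunk = sequences[start:start + batch_size]
        width = max(len(seq) for seq in chunk)
        padded = [seq + [pad_id] * (width - len(seq)) for seq in chunk]
        batches.append(padded)
    return batches


token_ids = [[2, 4, 5, 3], [2, 6, 3], [2, 7, 8, 9, 5, 3]]  # made-up token IDs
batches = make_batches(token_ids, batch_size=2)
print(batches)
```

Padding is wasted computation, so when input lengths vary widely a larger batch is not automatically faster.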

Complete code

from transformers import AutoTokenizer, AutoModelForSequenceClassification, TextClassificationPipeline
import torch

# Initialize the tokenizer and model
model_name = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Create the pipeline (GPU 0 if CUDA is available, otherwise CPU)
pipeline = TextClassificationPipeline(model=model, tokenizer=tokenizer, device=0 if torch.cuda.is_available() else -1)

# A single input text
text = "I love using the transformers library!"

# Preprocess with the tokenizer, returning PyTorch tensors
inputs = tokenizer(text, return_tensors="pt")

# Creating the pipeline may have moved the model to the GPU, so put the inputs on the same device
inputs = inputs.to(model.device)

# Pass the inputs to the model without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)

# Get the logits
logits = outputs.logits

# Convert the logits to a probability distribution
probabilities = torch.softmax(logits, dim=-1)

# Get the class with the highest probability
predicted_class_id = torch.argmax(probabilities, dim=-1).item()
predicted_label = model.config.id2label[predicted_class_id]
predicted_probability = probabilities[0][predicted_class_id].item()

print(f"Predicted label: {predicted_label}, Probability: {predicted_probability:.4f}")

# A list of input texts
texts = ["I love this place.", "This is not good."]

# Run the pipeline on the whole list
results = pipeline(texts)

# Print the results
for result in results:
    print(f"Label: {result['label']}, Score: {result['score']:.4f}")