Get started with ONNX Runtime in Python

Below is a quick guide to installing the packages needed to use ONNX for model serialization and ORT for inference.

Install ONNX Runtime

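There are two Python packages for ONNX Runtime. Only one of them should be installed at a time in any one environment. The GPU package includes most of the CPU functionality.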
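Because only one of the two packages should be present in an environment, it can help to check what is already installed before running pip. The following is a minimal sketch using only the standard library. The helper name installed_ort_packages is hypothetical, and the distribution names are the ones used in the install commands of this guide.

```python
from importlib import metadata

# Distribution names taken from the pip commands in this guide.
ORT_PACKAGES = ("onnxruntime", "onnxruntime-gpu")


def installed_ort_packages():
    """Return {package name: version} for each ONNX Runtime package found."""
    found = {}
    for name in ORT_PACKAGES:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass  # this package is not installed in the current environment
    return found


found = installed_ort_packages()
if len(found) > 1:
    print("Warning: more than one ONNX Runtime package is installed:", found)
else:
    print("Installed ONNX Runtime packages:", found or "none")
```

If the warning appears, uninstalling both packages and reinstalling only the one you need is a reasonable way to return to a clean state.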

Install ONNX Runtime CPU

Use the CPU package if you are running on Arm®-based CPUs and/or macOS.

pip install onnxruntime

Install ONNX Runtime GPU (CUDA 12.x)

The default CUDA version for ORT is 12.x.

pip install onnxruntime-gpu
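After installation, one way to see which execution providers the installed package exposes is ort.get_available_providers(). The sketch below is illustrative only: choose_providers is a hypothetical helper, and the preference order (CUDA first, CPU as fallback) is an assumption rather than a requirement of ORT.

```python
def choose_providers(available,
                     preferred=("CUDAExecutionProvider", "CPUExecutionProvider")):
    """Return the preferred providers that are actually available, in order."""
    return [p for p in preferred if p in available]


try:
    import onnxruntime as ort
    available = ort.get_available_providers()
except ImportError:
    available = []  # ONNX Runtime is not installed in this environment

print("Available providers:", available)
print("Providers to request:", choose_providers(available))
```

The resulting list can be passed as the providers argument of InferenceSession. If CUDAExecutionProvider is missing even though the GPU package is installed, a CUDA version mismatch is a common cause worth checking.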

Install ONNX Runtime GPU (CUDA 11.8)

For CUDA 11.8, use the following command to install from the ORT Azure DevOps feed.

pip install onnxruntime-gpu --extra-index-url https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/onnxruntime-cuda-11/pypi/simple/

Install ONNX for model export

## ONNX is built into PyTorch
pip install torch

## tensorflow
pip install tf2onnx

## sklearn
pip install skl2onnx

Quickstart examples for PyTorch, TensorFlow, and SciKit Learn

Train a model using your favorite framework, export it to ONNX format, and run inference in any supported ONNX Runtime language!

PyTorch CV

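In this example we will go over how to export a PyTorch CV model into ONNX format and then run inference with ORT. The code to create the model is from the PyTorch Fundamentals learning path on Microsoft Learn.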

Export the model using torch.onnx.export

torch.onnx.export(model,                                # model being run
                  torch.randn(1, 28, 28).to(device),    # model input (or a tuple for multiple inputs)
                  "fashion_mnist_model.onnx",           # where to save the model (can be a file or file-like object)
                  input_names = ['input'],              # the model's input names
                  output_names = ['output'])            # the model's output names

Load the ONNX model with onnx.load

import onnx
onnx_model = onnx.load("fashion_mnist_model.onnx")
onnx.checker.check_model(onnx_model)

Create an inference session using ort.InferenceSession

import onnxruntime as ort
import numpy as np
x, y = test_data[0][0], test_data[0][1]
ort_sess = ort.InferenceSession('fashion_mnist_model.onnx')
outputs = ort_sess.run(None, {'input': x.numpy()})

# Print Result
predicted, actual = classes[outputs[0][0].argmax(0)], classes[y]
print(f'Predicted: "{predicted}", Actual: "{actual}"')
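The argmax above picks a class but does not say how confident the model is. If the model's output is a vector of raw scores (logits), which is an assumption about this particular model, a softmax turns it into probabilities. The following dependency-free sketch uses made-up scores, and the softmax helper is illustrative rather than part of ORT.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


example_logits = [2.0, 1.0, 0.1]  # made-up scores, for illustration only
probs = softmax(example_logits)
best = max(range(len(probs)), key=probs.__getitem__)
print(f"class {best} with probability {probs[best]:.3f}")
```

With the session output from this example, the same helper could be applied to outputs[0][0], since math.exp also accepts NumPy scalar values.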

PyTorch NLP

In this example we will go over how to export a PyTorch NLP model into ONNX format and then run inference with ORT. The code to create the AG News model is from this PyTorch tutorial.

Process the text and create the sample input and offsets used for export.

import torch
text = "Text from the news article"
text = torch.tensor(text_pipeline(text))
offsets = torch.tensor([0])

Export the model

# Export the model
torch.onnx.export(model,                     # model being run
                (text, offsets),           # model input (or a tuple for multiple inputs)
                "ag_news_model.onnx",      # where to save the model (can be a file or file-like object)
                export_params=True,        # store the trained parameter weights inside the model file
                opset_version=10,          # the ONNX version to export the model to
                do_constant_folding=True,  # whether to execute constant folding for optimization
                input_names = ['input', 'offsets'],   # the model's input names
                output_names = ['output'], # the model's output names
                dynamic_axes={'input' : {0 : 'batch_size'},    # variable length axes
                              'output' : {0 : 'batch_size'}})

Load the model using onnx.load

import onnx
onnx_model = onnx.load("ag_news_model.onnx")
onnx.checker.check_model(onnx_model)

Create an inference session using ort.InferenceSession

import onnxruntime as ort
import numpy as np
ort_sess = ort.InferenceSession('ag_news_model.onnx')
outputs = ort_sess.run(None, {'input': text.numpy(),
                            'offsets':  torch.tensor([0]).numpy()})
# Print Result
result = outputs[0].argmax(axis=1)+1
print("This is a %s news" %ag_news_label[result[0]])
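The +1 in the code above is there because argmax returns a zero-based index, while the label dictionary is keyed from 1. The sketch below makes that mapping explicit without any dependencies. The ag_news_label dictionary is assumed to match the one defined in the PyTorch AG News tutorial, label_from_logits is a hypothetical helper, and the scores are made up.

```python
# Label map assumed to match the PyTorch AG News tutorial (keys start at 1).
ag_news_label = {1: "World", 2: "Sports", 3: "Business", 4: "Sci/Tec"}


def label_from_logits(logits):
    """Map one row of class scores to its AG News label."""
    # Zero-based index of the highest score.
    class_index = max(range(len(logits)), key=lambda i: logits[i])
    # Shift by one to match the one-based keys of ag_news_label.
    return ag_news_label[class_index + 1]


example_logits = [0.1, 2.3, -0.5, 0.4]  # made-up scores, for illustration only
print("This is a %s news" % label_from_logits(example_logits))
```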

TensorFlow CV

In this example we will go over how to export a TensorFlow CV model into ONNX format and then run inference with ORT. The model used is from this GitHub notebook for Keras ResNet50.

Get the pretrained model

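import os
import tensorflow as tf
from tensorflow.keras.applications.resnet50 import ResNet50, decode_predictions
import onnxruntime

model = ResNet50(weights='imagenet')

# x is assumed to be a preprocessed image batch of shape (N, 224, 224, 3),
# prepared as in the source notebook (not shown here).
preds = model.predict(x)
print('Keras Predicted:', decode_predictions(preds, top=3)[0])
model.save(os.path.join("/tmp", model.name))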

Convert the model to ONNX and export

import tf2onnx
import onnxruntime as rt

spec = (tf.TensorSpec((None, 224, 224, 3), tf.float32, name="input"),)
output_path = model.name + ".onnx"

model_proto, _ = tf2onnx.convert.from_keras(model, input_signature=spec, opset=13, output_path=output_path)
output_names = [n.name for n in model_proto.graph.output]

Create an inference session using rt.InferenceSession

providers = ['CPUExecutionProvider']
m = rt.InferenceSession(output_path, providers=providers)
onnx_pred = m.run(output_names, {"input": x})

print('ONNX Predicted:', decode_predictions(onnx_pred[0], top=3)[0])
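Printing the top predictions from both runtimes gives a visual check, but a numeric comparison is a more reliable way to confirm that the conversion preserved the model. The following is a dependency-free sketch. The all_close helper and the tolerances are assumptions chosen for illustration, and the scores are made up.

```python
import math


def all_close(a, b, rel_tol=1e-5, abs_tol=1e-6):
    """True if two equal-length flat sequences of floats agree within tolerance."""
    return len(a) == len(b) and all(
        math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol)
        for x, y in zip(a, b)
    )


keras_scores = [0.70, 0.20, 0.10]           # made-up values, for illustration
onnx_scores = [0.7000001, 0.1999999, 0.10]  # made-up values, for illustration
print("Outputs match:", all_close(keras_scores, onnx_scores))
```

With the arrays from this example, the inputs could be flattened first, for instance preds.ravel().tolist() and onnx_pred[0].ravel().tolist(). Small differences are expected because the two runtimes may order floating-point operations differently.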

SciKit Learn CV

In this example we will go over how to export a SciKit Learn CV model into ONNX format and then run inference with ORT. We will use the well-known iris dataset.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
iris = load_iris()
X, y = iris.data, iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y)

from sklearn.linear_model import LogisticRegression
clr = LogisticRegression()
clr.fit(X_train, y_train)
print(clr)

LogisticRegression()

Convert or export the model into ONNX format

from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType

initial_type = [('float_input', FloatTensorType([None, 4]))]
onx = convert_sklearn(clr, initial_types=initial_type)
with open("logreg_iris.onnx", "wb") as f:
    f.write(onx.SerializeToString())

Load and run the model using ONNX Runtime. We will use ONNX Runtime to compute the predictions for this machine learning model.

import numpy
import onnxruntime as rt

sess = rt.InferenceSession("logreg_iris.onnx")
input_name = sess.get_inputs()[0].name
pred_onx = sess.run(None, {input_name: X_test.astype(numpy.float32)})[0]
print(pred_onx)

OUTPUT:
 [0 1 0 0 1 2 2 0 0 2 1 0 2 2 1 1 2 2 2 0 2 2 1 2 1 1 1 0 2 1 1 1 1 0 1 0 0
  1]

Get the predicted class

The code can be changed to return one specific output by passing its name in a list.

import numpy
import onnxruntime as rt

sess = rt.InferenceSession("logreg_iris.onnx")
input_name = sess.get_inputs()[0].name
label_name = sess.get_outputs()[0].name
pred_onx = sess.run(
    [label_name], {input_name: X_test.astype(numpy.float32)})[0]
print(pred_onx)
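The same pattern can be used to request the second output instead of the label. For classifiers converted with skl2onnx, that output is typically a list with one {class label: probability} dictionary per input row. This is an assumption about the converter's default settings, so check sess.get_outputs() on your own model. The sketch below works on made-up rows of that shape, and top_class is a hypothetical helper.

```python
def top_class(prob_row):
    """Return (label, probability) for the most likely class in one row."""
    label = max(prob_row, key=prob_row.get)
    return label, prob_row[label]


# Made-up rows shaped like the assumed probability output, for illustration only.
example_probs = [
    {0: 0.90, 1: 0.08, 2: 0.02},
    {0: 0.05, 1: 0.25, 2: 0.70},
]
for row in example_probs:
    print(top_class(row))
```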

Python API reference docs

Go to the ORT Python API docs

Builds

If using pip, run pip install --upgrade pip prior to downloading.

Artifact	Description	Supported platforms
onnxruntime	CPU (Release)	Windows (x64), Linux (x64, ARM64), Mac (X64)
nightly	CPU (Dev)	Same as above
onnxruntime-gpu	GPU (Release)	Windows (x64), Linux (x64, ARM64)
onnxruntime-gpu for CUDA 11.*	GPU (Dev)	Windows (x64), Linux (x64, ARM64)
onnxruntime-gpu for CUDA 12.*	GPU (Dev)	Windows (x64), Linux (x64, ARM64)

Example of installing onnxruntime-gpu for CUDA 11.*:

python -m pip install onnxruntime-gpu --extra-index-url=https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/ort-cuda-11-nightly/pypi/simple/

Example of installing onnxruntime-gpu for CUDA 12.*:

python -m pip install onnxruntime-gpu --pre --extra-index-url=https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/ORT-Nightly/pypi/simple/

For notes on Python compiler versions, see this page.