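Object detection with Faster RCNN Deep Learning in C#

This sample walks through how to run a pretrained Faster R-CNN object detection ONNX model using the ONNX Runtime C# API.

The source code for this sample is available here.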

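Prerequisites

To run this sample, you'll need the following:

Install .NET Core 3.1 or higher for your OS (Mac, Windows or Linux).
Download the Faster R-CNN ONNX model to your local system.
Download this demo image to test the model. You can also use any image you like.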

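Get started

Now that everything is set up, we can start adding code to run the model on the image. For simplicity, we do this in the program's Main method.

Read paths

First, let's read the path to the model, the path to the image we want to test, and the path to the output image: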

string modelFilePath = args[0];
string imageFilePath = args[1];
string outImageFilePath = args[2];
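
The lines above assume that exactly three arguments are supplied. As a minimal sketch, a check like the following could run first. TryValidateArgs is a hypothetical helper that is not part of the original sample, and the error messages are illustrative only.

```csharp
using System;
using System.IO;

// Hypothetical helper (not in the original sample): checks that three
// arguments were passed and that the model and image files exist.
static bool TryValidateArgs(string[] cliArgs, out string error)
{
    if (cliArgs.Length != 3)
    {
        error = "Usage: dotnet run [path-to-model] [path-to-image] [path-to-output-image]";
        return false;
    }
    if (!File.Exists(cliArgs[0]))
    {
        error = $"Model file not found: {cliArgs[0]}";
        return false;
    }
    if (!File.Exists(cliArgs[1]))
    {
        error = $"Image file not found: {cliArgs[1]}";
        return false;
    }
    error = string.Empty;
    return true;
}

// Example call with a deliberately incomplete argument list.
if (!TryValidateArgs(new[] { "model.onnx" }, out string message))
{
    Console.Error.WriteLine(message);
}
```

Failing early with a usage message is easier to diagnose than an IndexOutOfRangeException from args[1].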

Read image

Next, we read the image using ImageSharp, a cross-platform image library:

using Image<Rgb24> image = Image.Load<Rgb24>(imageFilePath, out IImageFormat format);

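Note that we read the image specifically as Rgb24 so that we can preprocess it efficiently in a later step.

Resize image

Next, we resize the image to a size the model expects; it is recommended that both the height and the width fall within the range [800, 1333].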

float ratio = 800f / Math.Min(image.Width, image.Height);
using Stream imageStream = new MemoryStream();
image.Mutate(x => x.Resize((int)(ratio * image.Width), (int)(ratio * image.Height)));
image.Save(imageStream, format);
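
The resize code above scales the shorter side to 800 but does not cap the longer side, so a very elongated image can exceed 1333. The following is a sketch of one way to respect both bounds, not the sample's own method. ComputeTargetSize is a hypothetical helper, and when the cap applies the shorter side ends up below 800.

```csharp
using System;

// Hypothetical helper (an assumption, not part of the original sample):
// scale so the shorter side becomes minSide, then shrink further if the
// longer side would exceed maxSide. The aspect ratio is preserved.
static (int Width, int Height) ComputeTargetSize(
    int width, int height, int minSide = 800, int maxSide = 1333)
{
    double ratio = (double)minSide / Math.Min(width, height);
    if (ratio * Math.Max(width, height) > maxSide)
    {
        ratio = (double)maxSide / Math.Max(width, height);
    }
    return ((int)Math.Round(ratio * width), (int)Math.Round(ratio * height));
}

// A square image needs no capping; a 4:1 panorama does.
var square = ComputeTargetSize(1000, 1000);
var panorama = ComputeTargetSize(4000, 1000);
Console.WriteLine($"{square.Width}x{square.Height}");
Console.WriteLine($"{panorama.Width}x{panorama.Height}");
```

The resulting width and height could then be passed to the Resize call in place of the values computed from ratio.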

Preprocess image

Next, we preprocess the image according to the requirements of the model.

var paddedHeight = (int)(Math.Ceiling(image.Height / 32f) * 32f);
var paddedWidth = (int)(Math.Ceiling(image.Width / 32f) * 32f);
var mean = new[] { 102.9801f, 115.9465f, 122.7717f };

// Preprocessing image
// We use DenseTensor for multi-dimensional access
DenseTensor<float> input = new(new[] { 3, paddedHeight, paddedWidth });
image.ProcessPixelRows(accessor =>
{
    // Copy every pixel, starting at row and column 0. Cells beyond the image
    // bounds keep their default value of 0 and act as padding.
    for (int y = 0; y < accessor.Height; y++)
    {
        Span<Rgb24> pixelSpan = accessor.GetRowSpan(y);
        for (int x = 0; x < accessor.Width; x++)
        {
            // Channels are stored in BGR order with the per-channel mean subtracted.
            input[0, y, x] = pixelSpan[x].B - mean[0];
            input[1, y, x] = pixelSpan[x].G - mean[1];
            input[2, y, x] = pixelSpan[x].R - mean[2];
        }
    }
});

Here, we create a tensor of the required size (channels, paddedHeight, paddedWidth), read the pixel values, preprocess them, and assign them to the tensor at the corresponding index.
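
To make the padding and the tensor layout concrete, here is a small sketch using only the base class library. PadToMultiple and ChwOffset are hypothetical helpers introduced for illustration. The offset formula assumes a row-major (channels, height, width) buffer, which is the default layout of DenseTensor.

```csharp
using System;

// Hypothetical helper: rounds a dimension up to the next multiple of
// `multiple` (32 in the text above) using integer arithmetic.
static int PadToMultiple(int value, int multiple = 32) =>
    (value + multiple - 1) / multiple * multiple;

// Hypothetical helper: flat offset of element [c, y, x] in a row-major
// (channels, height, width) buffer.
static int ChwOffset(int c, int y, int x, int height, int width) =>
    (c * height + y) * width + x;

int exampleHeight = PadToMultiple(800);   // 800 is already a multiple of 32
int exampleWidth = PadToMultiple(1066);   // rounds up to 1088

// The first element of channel 1 sits one full plane past the start.
int offset = ChwOffset(1, 0, 0, exampleHeight, exampleWidth);
Console.WriteLine($"{exampleHeight} x {exampleWidth}, channel 1 starts at {offset}");
```

Because each channel occupies a contiguous plane, the three assignments in the inner loop write to three locations that are paddedHeight * paddedWidth elements apart.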

Setup inputs

// Pin the DenseTensor memory and use it directly in the OrtValue tensor.
// It will be unpinned when the OrtValue is disposed.

using var inputOrtValue = OrtValue.CreateTensorValueFromMemory(OrtMemoryInfo.DefaultInstance,
    input.Buffer, new long[] { 3, paddedHeight, paddedWidth });

Next, we create the inputs to the model:

var inputs = new Dictionary<string, OrtValue>
{
    { "image", inputOrtValue }
};

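To check the input node names of an ONNX model, you can use Netron to visualize the model and see the input/output names. In this case, the model's input node name is image.

Run inference

Next, we create an inference session and run the input through it: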

using var session = new InferenceSession(modelFilePath);
using var runOptions = new RunOptions();
using IDisposableReadOnlyCollection<OrtValue> results = session.Run(runOptions, inputs, session.OutputNames);

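Postprocess output

Next, we need to postprocess the output to get each box's coordinates together with its associated label and confidence score.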

var boxesSpan = results[0].GetTensorDataAsSpan<float>();
var labelsSpan = results[1].GetTensorDataAsSpan<long>();
var confidencesSpan = results[2].GetTensorDataAsSpan<float>();

const float minConfidence = 0.7f;
var predictions = new List<Prediction>();

for (int i = 0; i + 3 < boxesSpan.Length; i += 4) // visit every box, including the last one
{
    var index = i / 4;
    if (confidencesSpan[index] >= minConfidence)
    {
        predictions.Add(new Prediction
        {
            Box = new Box(boxesSpan[i], boxesSpan[i + 1], boxesSpan[i + 2], boxesSpan[i + 3]),
            Label = LabelMap.Labels[labelsSpan[index]],
            Confidence = confidencesSpan[index]
        });
    }
}

Note that we keep only boxes with a confidence of at least 0.7, to filter out false positives.
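
The loop above relies on the Prediction and Box types, which are not shown in this walkthrough. The following is a self-contained sketch of how they might look, together with the same filtering logic applied to made-up data. The type shapes, the Detections.Filter helper, and the small label table are all assumptions for illustration; the sample's own definitions may differ.

```csharp
using System;
using System.Collections.Generic;

// Made-up model outputs: two boxes, flattened as [xmin, ymin, xmax, ymax, ...].
float[] boxes = { 10f, 20f, 110f, 220f, 5f, 5f, 50f, 50f };
long[] labels = { 1, 3 };
float[] scores = { 0.95f, 0.40f };
// Hypothetical label table; LabelMap.Labels plays this role in the loop above.
string[] labelNames = { "__background", "person", "bicycle", "car" };

List<Prediction> kept = Detections.Filter(boxes, labels, scores, labelNames, 0.7f);
Console.WriteLine(kept.Count); // prints 1: only the first box passes the threshold

// Assumed shape of the Box type.
public class Box
{
    public Box(float xmin, float ymin, float xmax, float ymax)
    {
        Xmin = xmin; Ymin = ymin; Xmax = xmax; Ymax = ymax;
    }
    public float Xmin { get; }
    public float Ymin { get; }
    public float Xmax { get; }
    public float Ymax { get; }
}

// Assumed shape of the Prediction type.
public class Prediction
{
    public Box Box { get; set; }
    public string Label { get; set; }
    public float Confidence { get; set; }
}

// Hypothetical helper that mirrors the filtering loop on plain spans.
public static class Detections
{
    public static List<Prediction> Filter(
        ReadOnlySpan<float> boxes, ReadOnlySpan<long> labels,
        ReadOnlySpan<float> scores, string[] labelNames, float minConfidence)
    {
        var predictions = new List<Prediction>();
        // Each detection owns four consecutive box values and one score.
        int count = Math.Min(scores.Length, boxes.Length / 4);
        for (int i = 0; i < count; i++)
        {
            if (scores[i] < minConfidence) continue;
            int b = i * 4;
            predictions.Add(new Prediction
            {
                Box = new Box(boxes[b], boxes[b + 1], boxes[b + 2], boxes[b + 3]),
                Label = labelNames[labels[i]],
                Confidence = scores[i]
            });
        }
        return predictions;
    }
}
```

Iterating by detection index rather than by flat offset makes it harder to skip the final box by accident.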

View prediction

Next, we draw the boxes with their associated labels and confidence scores on the image to see how the model performed.

// File.Create truncates an existing file; File.OpenWrite would leave stale trailing bytes.
using var outputImage = File.Create(outImageFilePath);
Font font = SystemFonts.CreateFont("Arial", 16);
foreach (var p in predictions)
{
    image.Mutate(x =>
    {
        x.DrawLines(Color.Red, 2f, new PointF[] {

            new PointF(p.Box.Xmin, p.Box.Ymin),
            new PointF(p.Box.Xmax, p.Box.Ymin),

            new PointF(p.Box.Xmax, p.Box.Ymin),
            new PointF(p.Box.Xmax, p.Box.Ymax),

            new PointF(p.Box.Xmax, p.Box.Ymax),
            new PointF(p.Box.Xmin, p.Box.Ymax),

            new PointF(p.Box.Xmin, p.Box.Ymax),
            new PointF(p.Box.Xmin, p.Box.Ymin)
        });
        x.DrawText($"{p.Label}, {p.Confidence:0.00}", font, Color.White, new PointF(p.Box.Xmin, p.Box.Ymin));
    });
}
image.Save(outputImage, format);

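For each box prediction, we use ImageSharp to draw red lines that form the box, and to draw the label and confidence text.

Running the program

Now that the program is written, we can run it with the following command: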

dotnet run [path-to-model] [path-to-image] [path-to-output-image]

For example, running

dotnet run ~/Downloads/FasterRCNN-10.onnx ~/Downloads/demo.jpg ~/Downloads/out.jpg

detects the following objects in the image: