YOLOv8/v10 Detection and Segmentation

Feature Description

This operator uses advanced YOLOv8 or YOLOv10 deep learning models to perform object detection, instance segmentation, or rotated object detection on input color images. It supports multiple model formats, including .pt, .onnx, and .epicnn.

Use Cases

  • Object Detection: Quickly locate and identify multiple objects in an image, outputting their bounding boxes and categories. Suitable for part recognition, defect localization, item counting, etc.

  • Instance Segmentation: On top of object detection, further generate precise pixel-level segmentation masks (contours) for each identified object instance. Suitable for scenarios requiring precise shape information, such as grasp localization, area measurement, etc.

  • Rotated Object Detection: Detect objects with arbitrary orientations and output rotated bounding boxes that tightly enclose them, along with their angles. Suitable for detecting tilted or arbitrarily placed objects.

Inputs and Outputs

Input Item

Image: The color image to be detected or segmented (must be in RGB format). Currently, only single image input is supported.

Output Item

Detection results: A list containing detection/segmentation results.

Parameter Descriptions

  • Input Image: The input must be a color image in RGB format. Epic series cameras apply special processing to their data, so the images they output are directly compatible with the YOLO detection and segmentation algorithms.

  • Single Image Processing: The current implementation of the operator processes only one image at a time (see the inference sketch after this list).

  • GPU Environment: If GPU is enabled, especially when using ONNX models, ensure that the CUDA environment and onnxruntime-gpu library are correctly installed and compatible.

  • Epicnn Model: When using an .epicnn model, the "Inferred Type" parameter must be set correctly.
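
A minimal standalone sketch of the same workflow, assuming the Ultralytics Python package; the weight file and image path are placeholders, and the operator's internal pipeline may differ in detail:

```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")  # placeholder weight file (.pt)

# One image per call; conf mirrors the Confidence threshold parameter below.
results = model.predict("example.png", conf=0.8, verbose=False)

for box in results[0].boxes:
    # class id, confidence score, and bounding box in pixel coordinates
    print(int(box.cls), float(box.conf), box.xyxy.tolist())
```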

Weight files

Parameter Description

Specifies the YOLO model weight file to be used for inference. Supports PyTorch (.pt), ONNX (.onnx), and epicnn (.epicnn) formats. A valid model file must be selected.

Parameter Tuning Guide

Select a model file appropriate for your task requirements and hardware capabilities.

  • .pt files are typically used for training and debugging;

  • .onnx files offer better cross-platform compatibility and generally run faster on CPUs (an export sketch follows this list);

  • .epicnn files are for dedicated intelligent camera platforms to achieve optimal performance.
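
If only a .pt file is available, it can typically be converted to .onnx with the Ultralytics exporter, as sketched below; the file name is a placeholder, and conversion to .epicnn is done with the vendor's own tooling, which is not shown here.

```python
from ultralytics import YOLO

model = YOLO("yolov8n-seg.pt")           # placeholder segmentation weights
onnx_path = model.export(format="onnx")  # writes a .onnx file next to the .pt
print("Exported:", onnx_path)
```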

Enable GPU

Parameter Description

Select whether to use GPU for model inference computation. If checked, ensure the computer has an available NVIDIA graphics card and the corresponding CUDA environment.

Parameter Tuning Guide

Checking this option can significantly improve processing speed, especially for large models or high-resolution images.

  • If using an .onnx model with GPU enabled, you need to install the onnxruntime-gpu library matching your CUDA version as prompted (a quick availability check is sketched after this list);

  • If there is no compatible GPU or the environment is not configured correctly, this should be unchecked (use CPU).

  • For .epicnn models, this option has no effect.
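
Before enabling the GPU option for an .onnx model, it can help to verify outside the operator that onnxruntime actually exposes a CUDA provider; the standalone check below is only a sketch and is not part of the operator itself.

```python
import onnxruntime as ort

providers = ort.get_available_providers()
print(providers)

if "CUDAExecutionProvider" in providers:
    print("onnxruntime-gpu with CUDA is available; the GPU option can be enabled.")
else:
    print("No CUDA provider found; leave the GPU option unchecked (use CPU).")
```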

Inferred Type

Parameter Description

Only valid when selecting an .epicnn weight file. Used to explicitly inform the operator which task the .epicnn model is for (detection, segmentation, rotated detection) and which YOLO version it is based on (v8 or v10).

Parameter Tuning Guide

When loading an .epicnn file, the inference type must match the model’s actual training task; otherwise, post-processing errors may occur.

For example, if loading an .epicnn file converted from a YOLOv8 segmentation model, "yolov8 segmentation" should be selected. For .pt and .onnx models, the operator automatically identifies the task type, and this parameter is ignored.
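
The sketch below is purely illustrative of why this parameter matters: an .epicnn file carries no task metadata, so the operator has to be told how to decode the raw network output. All function and key names here are hypothetical.

```python
# Hypothetical decoders for each supported task/version combination.
def decode_v8_detection(raw): ...
def decode_v8_segmentation(raw): ...
def decode_v8_rotated(raw): ...
def decode_v10_detection(raw): ...

POSTPROCESSORS = {
    "yolov8 detection": decode_v8_detection,
    "yolov8 segmentation": decode_v8_segmentation,
    "yolov8 rotated detection": decode_v8_rotated,
    "yolov10 detection": decode_v10_detection,
}

def postprocess(raw_output, inferred_type):
    # Selecting the wrong entry here is what produces the
    # post-processing errors mentioned above.
    try:
        return POSTPROCESSORS[inferred_type](raw_output)
    except KeyError:
        raise ValueError(f"Unknown inferred type: {inferred_type}")
```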

Confidence threshold

Parameter Description

Confidence score threshold for filtering detection/segmentation results. Only instances with scores higher than this threshold will be output.

Parameter Tuning Guide

This is the most commonly adjusted parameter. Increasing it yields fewer outputs, keeping only objects the model is very confident about and thereby reducing false positives. Decreasing it yields more detections, which may recover missed targets but can also include less confident, lower-quality results. Balance precision and recall according to the actual application scenario; usually, start with the default value and adjust based on the observed results (see the sketch below).
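
A small sketch of the threshold's effect, assuming the Ultralytics package and a placeholder image; the same filtering happens inside the operator, so this is only for offline experimentation.

```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")  # placeholder weights

# Higher thresholds keep fewer, more confident instances.
for conf in (0.25, 0.5, 0.8):
    results = model.predict("example.png", conf=conf, verbose=False)
    print(f"conf={conf}: {len(results[0].boxes)} instances kept")
```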

Parameter Range

[0.005, 1], Default value: 0.8