Segment Anything (Fast)
Feature Description
This operator uses FastSAM (Fast Segment Anything Model) to perform rapid instance segmentation on input images based on user-provided prompts (such as bounding boxes, points, or text descriptions), identifying and segmenting the image regions that correspond to the prompts.
Use Cases
- Interactive Segmentation: When specific objects of interest need to be segmented quickly, prompts such as bounding boxes, points, or text descriptions supplied via the parameters can guide the model's segmentation.
- Target Extraction: Precisely extract the contour information of specific targets from complex backgrounds.
- Automated Annotation Assistance: As part of an automated annotation workflow, quickly generate initial segmentation masks for targets from simple prompts.
Inputs and Outputs
Input Item
- Image: The color image to be segmented (must be in RGB format).
- Prompt point list: A list of point coordinates [X, Y] indicating the target region to segment, for example several clicks on the target.
- Prompt box list: A list of bounding boxes, each defined by four corner points, framing the target region to segment.
Output Item
- Detection results: A list of segmentation results. Each result represents one segmented instance and contains its bounding box (rotated or horizontal), category (defaults to 0 or the specified category), confidence score, and segmented polygon contour.
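The output described above can be sketched as a simple data structure. This is a minimal illustration only; the field names are assumptions, not the operator's exact output schema:

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Sketch of one detection result; field names are illustrative
# assumptions, not the operator's exact output schema.
@dataclass
class Detection:
    box: Tuple[float, float, float, float]   # horizontal box (x1, y1, x2, y2); may also be rotated
    category: int = 0                        # defaults to 0 or the configured class
    score: float = 0.0                       # confidence score
    polygon: List[Tuple[float, float]] = field(default_factory=list)  # segmented contour

det = Detection(box=(10, 20, 110, 220), score=0.87,
                polygon=[(10, 20), (110, 20), (110, 220)])
print(det.category)  # 0
```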
Parameter Descriptions
This operator relies on the fastsam Python library. If it is not yet installed in your environment, install it from Qianyi's internal PyPI source with pip install fastsam.
Weight files
Parameter Description
Specifies the FastSAM model weight file (usually .pt format) to be used for segmentation. A valid model file must be selected.
Parameter Tuning Guide
Select a model that matches your task requirements and hardware capabilities. Generally, larger models (like FastSAM-x) have higher accuracy but are slower, while smaller models (like FastSAM-s) are faster but may have slightly lower accuracy.
Enable GPU
Parameter Description
Select whether to use the GPU for model inference. If checked, ensure the computer has an available NVIDIA graphics card and the corresponding CUDA environment.
Parameter Tuning Guide
Enabling the GPU can significantly improve processing speed. If no compatible GPU is available, this should be unchecked (use the CPU).
Image size
Parameter Description
The size to which the input image will be scaled before being fed into the model for segmentation.
Parameter Tuning Guide
Larger image sizes generally lead to higher segmentation accuracy but also increase computation time and GPU/CPU memory consumption. Smaller sizes have the opposite effect. Common values include 640, 1024, etc. A trade-off between accuracy and speed needs to be made based on the specific application scenario.
Parameter Range
Default value: 640
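One common way an image-size parameter like this is applied is to scale the longer side to the target size while keeping the aspect ratio, then pad each side up to a multiple of the model stride. This is a sketch of that YOLO-style convention; the operator's actual preprocessing may differ:

```python
import math

def scaled_shape(width, height, imgsz=640, stride=32):
    """Scale the longer side to imgsz, keep the aspect ratio, and round
    each side up to a multiple of the model stride (a common YOLO-style
    convention; the operator's exact preprocessing is an assumption here)."""
    ratio = imgsz / max(width, height)
    new_w = math.ceil(width * ratio / stride) * stride
    new_h = math.ceil(height * ratio / stride) * stride
    return new_w, new_h

print(scaled_shape(1920, 1080))  # (640, 384)
```

A smaller imgsz shrinks both dimensions proportionally, which is where the speed/accuracy trade-off described above comes from.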
Confidence threshold
Parameter Description
Confidence score threshold for filtering FastSAM's initial segmentation results. Only segmentation results with confidence higher than this threshold will be retained.
Parameter Tuning Guide
Increasing this value will result in fewer segmentation outputs, retaining only objects the model is very confident about, which can reduce missegmentations and speed up post-processing. Decreasing this value will yield more segmentation results, potentially including some lower-confidence targets, but may increase missegmentations and post-processing time. Usually, start with the default value and adjust.
Parameter Range
[0, 1], Default value: 0.5
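The filtering this threshold performs can be sketched as follows. The (score, label) layout is an assumption made for illustration:

```python
def filter_by_confidence(results, conf_thresh=0.5):
    """Keep only results whose confidence exceeds the threshold.
    Each result is a (score, label) pair in this sketch."""
    return [r for r in results if r[0] > conf_thresh]

raw = [(0.92, "a"), (0.48, "b"), (0.63, "c")]
print(filter_by_confidence(raw))       # [(0.92, 'a'), (0.63, 'c')]
print(filter_by_confidence(raw, 0.9))  # [(0.92, 'a')]
```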
Overlap Filter Threshold
Parameter Description
Intersection over Union (IoU) threshold for Non-Maximum Suppression (NMS). When multiple segmentation results (masks or boxes) overlap by more than this threshold, the lower-confidence results will be suppressed.
Parameter Tuning Guide
Increasing this value allows more overlapping results to coexist, which might be suitable for scenes with dense and mutually occluding objects. Decreasing this value will more aggressively remove overlapping results, ensuring each target outputs only one best result. The default value is usually suitable for most scenarios.
Parameter Range
[0, 1], Default value: 0.9
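Non-Maximum Suppression with an IoU threshold can be sketched with a minimal greedy implementation over horizontal boxes (the operator may also suppress by mask overlap; this is an illustration, not its exact algorithm):

```python
def iou(a, b):
    """IoU of two horizontal boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def nms(dets, iou_thresh=0.9):
    """Greedy NMS: visit boxes in descending score order and drop any
    box whose IoU with an already-kept box exceeds the threshold."""
    dets = sorted(dets, key=lambda d: d[1], reverse=True)
    kept = []
    for box, score in dets:
        if all(iou(box, k[0]) <= iou_thresh for k in kept):
            kept.append((box, score))
    return kept

dets = [((0, 0, 10, 10), 0.9), ((1, 1, 10, 10), 0.8), ((20, 20, 30, 30), 0.7)]
# With a 0.5 threshold, the 0.8 box (IoU 0.81 with the 0.9 box) is suppressed.
print(nms(dets, iou_thresh=0.5))
```

A higher threshold lets heavily overlapping results coexist, matching the tuning guidance above.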
Extra class name
Parameter Description
Assigns a category name (ID) to the output segmentation results. FastSAM itself does not distinguish specific categories; this parameter is used to tag these segmentation results for subsequent processing (such as filtering or statistics).
Parameter Tuning Guide
Assign a meaningful category ID to the segmented objects according to your application scenario.
Parameter Range
Provides category name options from "0" to "29", defaults to 0.
Prompt
Parameter Description
A text description used to guide the model toward segmenting objects related to the text content. For example, "bag" or "red box".
Parameter Tuning Guide
Try using concise, specific nouns or phrases to describe the target you want to segment. You can use commas to separate multiple prompts, for example "a blue car, the traffic light". The effectiveness of text prompts depends on the model's comprehension ability.
Text prompt threshold
Parameter Description
When using text prompts, this threshold filters segmentation results based on their text similarity scores. Only results with similarity scores higher than this threshold will be retained.
Parameter Tuning Guide
This is a relatively sensitive parameter that needs adjustment based on actual results. If text prompts do not segment the desired results, try lowering this threshold; if many irrelevant results are segmented, try raising it. Note that this threshold is not the confidence score in the final output results and is usually set relatively low.
Parameter Range
[0, 10], Default value: 0.01
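The text-prompt filtering can be sketched as below, assuming each candidate mask carries a text-similarity score (for example a CLIP-style probability, which is typically small — hence the low default threshold). Both the scoring mechanism and the data layout are assumptions for illustration:

```python
def filter_by_text_similarity(masks, scores, text_thresh=0.01):
    """Keep masks whose text-similarity score (assumed to be a small
    CLIP-style probability) exceeds the threshold. This score is
    separate from the confidence score in the final output."""
    return [m for m, s in zip(masks, scores) if s > text_thresh]

masks = ["bag", "floor", "shelf"]          # hypothetical candidate regions
scores = [0.034, 0.004, 0.012]             # hypothetical similarity scores
print(filter_by_text_similarity(masks, scores))  # ['bag', 'shelf']
```

Lowering the threshold admits weaker matches; raising it rejects loosely related regions, mirroring the tuning guidance above.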