SAM Annotator API Reference
This document provides a comprehensive guide to the SAM Annotator API, enabling programmatic access to the annotation functionality.
Overview
The SAM Annotator API allows you to:

- Load and initialize the SAM model
- Process images programmatically
- Generate masks using box or point prompts
- Manage annotations
- Export annotations to various formats
Installation
To use the SAM Annotator API, install the package:
pip install sam-annotator
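To verify the installation, confirm that the core entry point imports cleanly (the same import used throughout this guide):

python -c "from sam_annotator.core import SAMAnnotator"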
Example Scripts
The SAM Annotator package includes example scripts to help you get started quickly:
Simple Example
simple_api_example.py demonstrates the core functionality in a minimal script:
import os
import cv2
import numpy as np
from sam_annotator.core import SAMAnnotator
# Initialize the annotator
annotator = SAMAnnotator(
checkpoint_path=None, # Will use default
category_path="work_dir",
classes_csv="classes.csv",
sam_version="sam1"
)
# Load an image
image = cv2.imread("path/to/image.jpg")
# Set the image in the predictor
predictor = annotator.predictor
predictor.set_image(image)
# Generate a mask using a box prompt
box = np.array([100, 100, 300, 300]).reshape(1, 4) # [x1, y1, x2, y2]
masks, scores, _ = predictor.predict(
point_coords=None,
point_labels=None,
box=box,
multimask_output=True
)
# Get the best mask
mask = masks[np.argmax(scores)]
# Add the mask as an annotation
annotation = {
'mask': mask,
'class_id': 1,
'box': box[0],
'area': np.sum(mask),
'metadata': {'annotation_mode': 'box'}
}
# Store the annotation (annotations live directly on the annotator;
# see the Command Manager section for the undo-aware alternative)
annotator.annotations.append(annotation)
# Set the current image path (needed for saving)
annotator.session_manager.current_image_path = "path/to/image.jpg"
# Save the annotation
annotator.file_manager.save_annotations(
annotations=[annotation],
image_name="path/to/image.jpg",
original_dimensions=image.shape[:2],
display_dimensions=image.shape[:2],
class_names=["class1"],
save_visualization=True
)
Comprehensive Example
api_example.py is a full-featured example that demonstrates:
- Command-line argument handling
- Multiple annotation methods (box and point prompts)
- Working with multiple classes
- Exporting to all supported formats
- Creating visualizations
These example scripts can be found in the examples/ directory of the SAM Annotator repository.
Core Components
The API is organized into several core components:
1. SAMAnnotator
The main class that coordinates the annotation functionality.
from sam_annotator.core import SAMAnnotator
# Initialize the annotator
annotator = SAMAnnotator(
checkpoint_path="path/to/checkpoint.pth",
category_path="path/to/category",
classes_csv="path/to/classes.csv",
sam_version="sam1", # or "sam2"
model_type="vit_h" # depends on the SAM version
)
# Access to components
predictor = annotator.predictor # The SAM model predictor
annotations = annotator.annotations # List of annotations
file_manager = annotator.file_manager # Handles file operations
session_manager = annotator.session_manager # Manages the current session
command_manager = annotator.command_manager # Manages undo/redo and commands
2. Predictor
The interface to the SAM model for generating masks.
# Get the predictor from the annotator
predictor = annotator.predictor
# Load an image
image = cv2.imread("path/to/image.jpg")
# Important: Always set the image in the predictor before prediction
predictor.set_image(image)
# Predict with a box
box = np.array([100, 100, 300, 300]).reshape(1, 4) # [x1, y1, x2, y2]
masks, scores, logits = predictor.predict(
point_coords=None,
point_labels=None,
box=box,
multimask_output=True
)
# Predict with points
point_coords = np.array([[100, 100], [200, 200]])
point_labels = np.array([1, 1]) # 1 for foreground, 0 for background
masks, scores, logits = predictor.predict(
point_coords=point_coords,
point_labels=point_labels,
box=None,
multimask_output=True
)
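Box and point prompts can also be combined in a single call, which is useful for nudging a box-based mask with a few corrective clicks. A short sketch, assuming the predictor accepts both prompt types at once (as the upstream segment-anything predictor does):

# Predict with a box plus a corrective background point
box = np.array([100, 100, 300, 300]).reshape(1, 4)
point_coords = np.array([[150, 150]])
point_labels = np.array([0])  # 0 = exclude this region from the mask
masks, scores, logits = predictor.predict(
    point_coords=point_coords,
    point_labels=point_labels,
    box=box,
    multimask_output=True
)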
3. Command Manager
Manages operations on annotations, including adding, deleting, and modifying them, with undo/redo support.
# Import the necessary command
from sam_annotator.core.command_manager import AddAnnotationCommand
# Create an annotation structure
mask = masks[np.argmax(scores)] # Get the best mask
# Need to create contours from the mask
mask_uint8 = mask.astype(np.uint8) * 255
contours, _ = cv2.findContours(mask_uint8, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_TC89_KCOS)
contour = max(contours, key=cv2.contourArea) # Get largest contour
# Create flattened contour list
contour_list = contour.tolist()
if len(contour_list) > 0 and isinstance(contour_list[0], list) and len(contour_list[0]) == 1:
contour_list = [point[0] for point in contour_list]
# Create annotation dictionary
annotation = {
'id': len(annotator.annotations), # Next available ID
'class_id': 1,
'class_name': annotator.class_names[1],
'box': [100, 100, 300, 300],
'contour': contour_list, # Flattened points
'contour_points': contour, # Original OpenCV contour
'mask': mask, # Boolean numpy array
'display_box': [100, 100, 300, 300],
'area': np.sum(mask),
'metadata': {'annotation_mode': 'box'}
}
# Add annotation using the command manager
command = AddAnnotationCommand(annotator.annotations, annotation, annotator.window_manager)
annotator.command_manager.execute(command)
# Note: There's no direct "annotation_manager" - annotations are stored directly in annotator.annotations
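Because every change goes through a command object, undo and redo fall out naturally. A minimal sketch, assuming the command manager exposes undo() and redo() methods as its description above suggests:

# Roll back the annotation we just added, then re-apply it
annotator.command_manager.undo()
annotator.command_manager.redo()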
4. SessionManager
Manages the current annotation session, including the current image path and navigation between images.
# Get the session manager from the annotator
session_manager = annotator.session_manager
# Set the current image path (required for saving)
session_manager.current_image_path = "path/to/image.jpg"
# Save annotations for the current image
session_manager.save_annotations()
# Navigate to next/previous image
next_path = session_manager.next_image()
prev_path = session_manager.previous_image()
# Check if navigation is possible
can_go_next = session_manager.can_move_next()
can_go_prev = session_manager.can_move_prev()
# Get the current image path
current_path = session_manager.get_current_image_path()
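The navigation helpers compose naturally into a loop over the whole session; a small sketch using only the methods shown above:

# Walk forward through every remaining image in the session
while session_manager.can_move_next():
    image_path = session_manager.next_image()
    print(f"Now at: {image_path}")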
5. FileManager
Handles file operations like loading/saving annotations and exporting to different formats.
# Get the file manager from the annotator
file_manager = annotator.file_manager
# Export annotations to different formats
coco_path = file_manager.handle_export("coco", annotator.class_names)
yolo_path = file_manager.handle_export("yolo", annotator.class_names)
pascal_path = file_manager.handle_export("pascal", annotator.class_names)
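If you want every supported format at once, the three calls collapse into a loop; for example:

# Export to all supported formats in one pass
for fmt in ("coco", "yolo", "pascal"):
    export_path = file_manager.handle_export(fmt, annotator.class_names)
    print(f"Exported {fmt} annotations to {export_path}")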
Basic Usage
Initializing the Annotator
from sam_annotator.core import SAMAnnotator
# Initialize the annotator
annotator = SAMAnnotator(
checkpoint_path="path/to/checkpoint.pth",
category_path="path/to/category",
classes_csv="path/to/classes.csv",
sam_version="sam1",
model_type="vit_h"
)
Loading an Image
# Set the current image path in session manager
annotator.session_manager.current_image_path = "path/to/image.jpg"
# Or load directly for prediction
image = cv2.imread("path/to/image.jpg")
annotator.predictor.set_image(image)
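Note that cv2.imread returns None rather than raising when the path is wrong, so it is worth checking the result before handing it to the predictor:

image = cv2.imread("path/to/image.jpg")
if image is None:
    raise FileNotFoundError("Could not read path/to/image.jpg")
annotator.predictor.set_image(image)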
Generating a Mask with a Box Prompt
# Define a bounding box [x1, y1, x2, y2]
box = np.array([100, 100, 300, 300]).reshape(1, 4)
# Set the image in the predictor
predictor = annotator.predictor
predictor.set_image(image)
# Generate masks
masks, scores, _ = predictor.predict(
point_coords=None,
point_labels=None,
box=box,
multimask_output=True
)
# Get the best mask
mask_idx = np.argmax(scores)
mask = masks[mask_idx]
Generating a Mask with Point Prompts
# Define foreground points [[x1, y1], [x2, y2], ...]
foreground_points = np.array([[150, 150], [200, 200]])
foreground_labels = np.array([1, 1]) # 1 for foreground
# Define background points (optional)
background_points = np.array([[50, 50], [350, 350]])
background_labels = np.array([0, 0]) # 0 for background
# Combine points and labels
point_coords = np.vstack((foreground_points, background_points))
point_labels = np.hstack((foreground_labels, background_labels))
# Generate masks
masks, scores, _ = predictor.predict(
point_coords=point_coords,
point_labels=point_labels,
box=None,
multimask_output=True
)
# Get the best mask
mask_idx = np.argmax(scores)
mask = masks[mask_idx]
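To eyeball a mask before committing it as an annotation, a quick overlay helps (plain OpenCV here, not a SAM Annotator API):

# Blend the boolean mask over the image as a green highlight
overlay = image.copy()
overlay[mask] = (0, 255, 0)
preview = cv2.addWeighted(image, 0.6, overlay, 0.4, 0)
cv2.imwrite("mask_preview.png", preview)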
Adding an Annotation
from sam_annotator.core.command_manager import AddAnnotationCommand
# Create contours from the mask
mask_uint8 = mask.astype(np.uint8) * 255
contours, _ = cv2.findContours(mask_uint8, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_TC89_KCOS)
if contours:
contour = max(contours, key=cv2.contourArea)
# Create flattened contour list
contour_list = contour.tolist()
if len(contour_list) > 0 and isinstance(contour_list[0], list) and len(contour_list[0]) == 1:
contour_list = [point[0] for point in contour_list]
# Create an annotation structure
annotation = {
'id': len(annotator.annotations),
'class_id': 1,
'class_name': annotator.class_names[1],
'box': [100, 100, 300, 300],
'contour': contour_list,
'contour_points': contour,
'mask': mask,
'display_box': [100, 100, 300, 300],
'area': np.sum(mask),
'metadata': {'annotation_mode': 'box'}
}
# Add the annotation using the command manager
command = AddAnnotationCommand(annotator.annotations, annotation, annotator.window_manager)
annotator.command_manager.execute(command)
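The box fields above are hard-coded for brevity; in practice you can derive them from the contour with OpenCV before building the dictionary:

# Tight bounding box around the largest contour, as [x1, y1, x2, y2]
x, y, w, h = cv2.boundingRect(contour)
box = [x, y, x + w, y + h]  # use for 'box' and 'display_box'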
Saving Annotations
# Make sure the session manager knows about the current image
annotator.session_manager.current_image_path = "path/to/image.jpg"
# Save annotations
success = annotator.file_manager.save_annotations(
annotations=[annotation],
image_name="path/to/image.jpg",
original_dimensions=image.shape[:2],
display_dimensions=image.shape[:2],
class_names=["class1"],
save_visualization=True
)
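save_annotations returns a boolean, so failures can be surfaced explicitly:

if not success:
    print("Failed to save annotations for path/to/image.jpg")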
Exporting Annotations
# Export all annotations to COCO format
coco_path = annotator.file_manager.handle_export("coco", annotator.class_names)
# Export to YOLO format
yolo_path = annotator.file_manager.handle_export("yolo", annotator.class_names)
# Export to Pascal VOC format
pascal_path = annotator.file_manager.handle_export("pascal", annotator.class_names)
Advanced Usage
Batch Processing
Process multiple images in a folder:
import os
import cv2
import numpy as np
from sam_annotator.core import SAMAnnotator
# Initialize
annotator = SAMAnnotator(
checkpoint_path="path/to/checkpoint.pth",
category_path="path/to/category",
classes_csv="path/to/classes.csv"
)
# Get all images in the images folder
image_folder = os.path.join(annotator.category_path, "images")
image_files = [f for f in os.listdir(image_folder)
if f.endswith(('.jpg', '.jpeg', '.png'))]
# Get the predictor
predictor = annotator.predictor
for image_file in image_files:
image_path = os.path.join(image_folder, image_file)
# Load image
image = cv2.imread(image_path)
predictor.set_image(image)
# Set the current image path in the session manager
annotator.session_manager.current_image_path = image_path
# Example: generate a mask for center of the image
height, width = image.shape[:2]
center_x, center_y = width // 2, height // 2
box_size = min(width, height) // 3
box = np.array([
center_x - box_size // 2,
center_y - box_size // 2,
center_x + box_size // 2,
center_y + box_size // 2
]).reshape(1, 4)
# Generate mask
masks, scores, _ = predictor.predict(
point_coords=None,
point_labels=None,
box=box,
multimask_output=True
)
# Get the best mask
if scores.size > 0:
mask_idx = np.argmax(scores)
mask = masks[mask_idx]
# Add annotation
annotation = {
'mask': mask,
'class_id': 1,
'box': box[0].tolist(),
'area': np.sum(mask),
'metadata': {'annotation_mode': 'box'}
}
        # Store the annotation directly (there is no annotation_manager attribute)
        annotator.annotations.append(annotation)
# Save annotations
annotator.file_manager.save_annotations(
annotations=[annotation],
image_name=image_file,
original_dimensions=(height, width),
display_dimensions=(height, width),
class_names=["class1"],
save_visualization=True
)
# Export all annotations
annotator.file_manager.handle_export("coco", annotator.class_names)
API Reference
SAMAnnotator Class
class SAMAnnotator:
"""Main class coordinating SAM-based image annotation."""
def __init__(self,
checkpoint_path: str,
category_path: str,
classes_csv: str,
sam_version: str = 'sam1',
model_type: Optional[str] = None):
"""Initialize SAM annotator with all components."""
# Internal method, not meant for direct API use:
def _load_image(self, image_path: str) -> None:
"""Internal method to load image and its existing annotations."""
Predictor Classes
class BaseSAMPredictor:
"""Base class for SAM predictors."""
def initialize(self, checkpoint_path: str) -> None:
"""Initialize the predictor with a model checkpoint."""
def predict(self,
point_coords: np.ndarray = None,
point_labels: np.ndarray = None,
box: np.ndarray = None,
multimask_output: bool = True) -> Tuple[np.ndarray, np.ndarray, np.ndarray]:
"""Generate masks from the provided prompts."""
AnnotationManager Class
Note that SAMAnnotator does not expose this as an annotation_manager attribute; annotations are stored directly in annotator.annotations (see the Command Manager section above).
class AnnotationManager:
"""Manages annotations and their operations."""
def add_annotation(self, annotation: Dict) -> int:
"""Add a new annotation."""
def delete_annotation(self, annotation_id: int) -> bool:
"""Delete an annotation by ID."""
def select_annotation(self, annotation_id: int) -> Dict:
"""Select an annotation by ID."""
def modify_annotation(self, annotation_id: int, properties: Dict) -> bool:
"""Modify properties of an annotation."""
FileManager Class
class FileManager:
"""Manages file operations for annotations."""
def load_annotations(self, image_path: str) -> List[Dict]:
"""Load annotations for the specified image."""
def save_annotations(self,
                     annotations: List[Dict],
                     image_name: str,
                     original_dimensions: Tuple[int, int],
                     display_dimensions: Tuple[int, int],
                     class_names: List[str],
                     save_visualization: bool) -> bool:
"""Save annotations for the specified image."""
def handle_export(self, format: str, class_names: List[str]) -> str:
"""Export annotations to the specified format."""
Building Custom Extensions
Creating a Custom Predictor
You can extend the predictor functionality by creating a custom predictor:
from typing import Tuple

import numpy as np

from sam_annotator.core import BaseSAMPredictor

class CustomPredictor(BaseSAMPredictor):
    def initialize(self, checkpoint_path: str) -> None:
        """Initialize with custom logic."""
        # Your custom initialization code goes here
        pass

    def predict(self,
                point_coords: np.ndarray = None,
                point_labels: np.ndarray = None,
                box: np.ndarray = None,
                multimask_output: bool = True) -> Tuple[np.ndarray, np.ndarray, np.ndarray]:
        """Custom prediction implementation."""
        # Your custom prediction logic goes here. Whatever it does, it must
        # return a (masks, scores, logits) tuple; empty arrays stand in as
        # placeholders so this skeleton runs as written.
        masks = np.zeros((0, 1, 1), dtype=bool)
        scores = np.zeros((0,), dtype=np.float32)
        logits = np.zeros((0, 1, 1), dtype=np.float32)
        return masks, scores, logits
Creating a Custom Exporter
You can create a custom exporter for a new annotation format:
from sam_annotator.data.exporters import BaseExporter
class CustomExporter(BaseExporter):
def __init__(self, base_path: str):
super().__init__(base_path)
def export(self) -> str:
"""Export annotations to a custom format."""
# Get annotation data
annotations = self.load_all_annotations()
# Process annotations into custom format
# ...
# Save to file
export_path = self._get_export_path("custom")
with open(export_path, 'w') as f:
# Write your custom format
pass
return export_path
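Given the constructor shown above, a custom exporter can be invoked directly; a usage sketch, assuming base_path points at the category directory:

exporter = CustomExporter(base_path="path/to/category")
print(exporter.export())  # prints the path of the written file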
Error Handling
The API raises standard Python exceptions for common failure modes, which you can catch and handle in the usual way:
try:
    # Initialize the annotator
    annotator = SAMAnnotator(
        checkpoint_path="path/to/checkpoint.pth",
        category_path="path/to/category",
        classes_csv="path/to/classes.csv"
    )

    # cv2.imread returns None rather than raising, so turn a bad
    # path into an explicit, catchable error
    image = cv2.imread("non_existent_image.jpg")
    if image is None:
        raise FileNotFoundError("Could not read non_existent_image.jpg")
    annotator.predictor.set_image(image)

    # Try to generate a mask with questionable inputs
    try:
        box = np.array([-100, -100, 100, 100]).reshape(1, 4)
        masks, scores, _ = annotator.predictor.predict(
            point_coords=None,
            point_labels=None,
            box=box,
            multimask_output=True
        )
    except ValueError as e:
        print(f"Invalid box coordinates: {e}")
except FileNotFoundError as e:
    print(f"Error loading image: {e}")
except Exception as e:
    print(f"General error: {e}")
Performance Considerations
When using the API programmatically, consider the following for optimal performance:
- Batch Processing: Process images in batches rather than one by one to amortize model loading time
- Memory Management: Clear unused objects to free memory, especially after processing large images
- GPU Utilization: SAM benefits significantly from GPU acceleration; ensure CUDA is properly configured
- Image Sizing: Consider resizing large images before processing to improve performance (see the sketch after this list)
- Error Handling: Implement robust error handling to avoid interruptions during batch processing
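As an illustration of the image-sizing point, here is a small helper (hypothetical, not part of the package; the 1024-pixel cap is an illustrative choice). Coordinates predicted on the resized image must be divided by the returned scale to map back to the original:

import cv2

def resize_for_sam(image, max_side=1024):
    """Downscale so the longest side is at most max_side, keeping aspect ratio."""
    h, w = image.shape[:2]
    scale = max_side / max(h, w)
    if scale >= 1.0:
        return image, 1.0  # already small enough
    new_size = (int(w * scale), int(h * scale))  # cv2.resize expects (width, height)
    return cv2.resize(image, new_size, interpolation=cv2.INTER_AREA), scale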
Coming Soon
The following API features are planned for future releases:
- Automatic Annotation: Functionality for automatic annotation suggestions
- Annotation Refinement: Methods to refine existing annotations
- Multi-Model Support: Integration with additional segmentation models
- Async Processing: Asynchronous processing for improved performance
- Web API: REST API for remote access to annotation functionality