Object Detection with AngelCV

AngelCV's object detection support is currently built around the YOLOv10 model architecture. This section covers the YoloDetectionModel and its high-level interface, ObjectDetectionModel.

Core Model: `YoloDetectionModel`

The YoloDetectionModel (located in angelcv.model.yolo) is the core of AngelCV's object detection support. It is a PyTorch Lightning module that implements the YOLOv10 architecture.

Key features:

Configurable Architecture: The model's backbone and detection head can be customized through configuration files.
PyTorch Lightning Integration: Leverages PyTorch Lightning for streamlined training, validation, and testing loops, along with features like multi-GPU support, mixed-precision training, and logging.
Weight Initialization: Applies weight initialization suited to YOLO models, which helps keep training stable.
Loss Calculation: Utilizes DetectionLoss and EndToEndDetectionLoss for computing losses during training.
Metrics: Integrates torchmetrics.detection.mean_ap.MeanAveragePrecision for robust mAP calculation.

Initialization

Typically, you won't interact directly with YoloDetectionModel for initialization. Instead, you'll use the ObjectDetectionModel interface.

However, understanding its core constructor is useful:

# from angelcv.model.yolo import YoloDetectionModel
# from angelcv.config import Config
# model = YoloDetectionModel(config: Config)

config: A Config object containing the model architecture, training parameters, dataset information, etc.

Loading from Checkpoint

To resume training or deploy a trained model, use the load_from_checkpoint_custom class method:

# from angelcv.model.yolo import YoloDetectionModel
# from pathlib import Path
# model = YoloDetectionModel.load_from_checkpoint_custom(checkpoint_path: Path | str)

This class method loads model weights and configuration from a saved checkpoint file (.ckpt or .pt). It matches the checkpoint weights to the current model definition even when the two differ slightly (e.g., a different number of classes).
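One common way to implement this kind of tolerant loading is to keep only the checkpoint entries whose name and shape match the current model. The sketch below illustrates that idea only; it is not AngelCV's actual implementation, and `filter_compatible` is a hypothetical helper. It accepts any mapping whose values expose a `.shape` attribute, such as a PyTorch `state_dict`.

```python
from typing import Any, Mapping


def filter_compatible(
    checkpoint_state: Mapping[str, Any],
    model_state: Mapping[str, Any],
) -> tuple[dict[str, Any], list[str]]:
    """Split checkpoint entries into loadable ones and skipped names.

    Hypothetical helper: an entry is loadable when the current model has a
    parameter with the same name and the same shape.
    """
    compatible: dict[str, Any] = {}
    skipped: list[str] = []
    for name, value in checkpoint_state.items():
        target = model_state.get(name)
        if target is not None and tuple(target.shape) == tuple(value.shape):
            compatible[name] = value
        else:
            skipped.append(name)  # missing in the model, or shape differs
    return compatible, skipped
```

With PyTorch, the `compatible` dictionary could then be passed to `model.load_state_dict(compatible, strict=False)`, leaving the skipped parameters at their freshly initialized values.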

Forward Pass (Inference)

The forward method processes an input image tensor and returns raw model predictions.

# images: torch.Tensor # Input image(s)
# predictions = model.forward(images)

Training, Validation, and Testing Steps

These methods are standard PyTorch Lightning hooks:

training_step(batch, batch_idx): Processes a batch of data, calculates loss, and logs training metrics.
validation_step(batch, batch_idx): Processes a validation batch, calculates loss, and updates mAP metrics.
test_step(batch, batch_idx): Similar to validation_step, but for the test set.

Epoch-level hooks like on_validation_epoch_end and on_test_epoch_end compute and log aggregated metrics (e.g., mAP over the entire validation set).
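mAP is built on intersection over union (IoU), which decides whether a predicted box counts as a match for a ground-truth box. torchmetrics computes this internally; the plain-Python sketch below is only meant to show the quantity involved. It assumes boxes in (x1, y1, x2, y2) corner format, and `box_iou` here is an illustrative function, not an AngelCV API.

```python
def box_iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0
```

For example, two 2x2 boxes offset by one unit in each direction overlap in a 1x1 square, giving an IoU of 1/7.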

Optimizer Configuration

The configure_optimizers method sets up the optimizer (AdamW) and learning rate schedulers (Linear Warmup followed by Linear Decay). It separates parameters into groups with and without weight decay (BatchNorm layers and biases typically don't use weight decay).
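The two ideas in this paragraph can be sketched without any framework. The code below is a minimal illustration under stated assumptions, not AngelCV's code: `lr_multiplier` and `split_decay_groups` are hypothetical names, the warmup and decay lengths are free parameters, and the grouping rule (parameters with more than one dimension get weight decay) is a common heuristic rather than a confirmed detail of the library.

```python
def lr_multiplier(step: int, warmup_steps: int, total_steps: int,
                  min_factor: float = 0.0) -> float:
    """Factor applied to the base learning rate at a given step."""
    if step < warmup_steps:
        # Linear warmup: ramp from 1/warmup_steps up to 1.0.
        return (step + 1) / warmup_steps
    # Linear decay: go from 1.0 down to min_factor over the remaining steps.
    decay_steps = max(1, total_steps - warmup_steps)
    progress = min(1.0, (step - warmup_steps) / decay_steps)
    return 1.0 - (1.0 - min_factor) * progress


def split_decay_groups(param_dims: dict[str, int]) -> tuple[list[str], list[str]]:
    """Split parameter names by dimensionality.

    Biases and BatchNorm weights are 1-D, so they land in the no-decay group.
    """
    decay = [name for name, ndim in param_dims.items() if ndim > 1]
    no_decay = [name for name, ndim in param_dims.items() if ndim <= 1]
    return decay, no_decay
```

With PyTorch, a function like `lr_multiplier` can be passed to `torch.optim.lr_scheduler.LambdaLR`, and the two name lists map onto optimizer parameter groups with different `weight_decay` values.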

Updating Number of Classes

The update_num_classes(num_classes: int) method is vital for transfer learning. If you load a pre-trained model (e.g., on COCO with 80 classes) and want to fine-tune it on a custom dataset with a different number of classes, this method adapts the model's detection head. It attempts to preserve weights where possible for a smoother transition.
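To make "preserving weights where possible" concrete, here is one plausible strategy, shown on plain lists so it runs anywhere: keep the per-class weight rows that still have a counterpart and create fresh rows for any extra classes. This is an assumption about the general technique, not a description of what update_num_classes actually does; `resize_class_rows` is a hypothetical helper, and a real implementation would typically initialize new rows randomly rather than with a constant.

```python
def resize_class_rows(old_rows, new_num_classes, fill=0.0):
    """Return a per-class weight table resized to new_num_classes rows.

    Rows for class indices that exist in both tables are copied; rows for
    additional classes are filled with a placeholder value.
    """
    if not old_rows:
        raise ValueError("old_rows must contain at least one row")
    width = len(old_rows[0])

    # Copy the overlapping rows (truncates when the class count shrinks).
    kept = [list(row) for row in old_rows[:new_num_classes]]
    # Add placeholder rows when the class count grows.
    fresh = [[fill] * width for _ in range(new_num_classes - len(kept))]
    return kept + fresh
```

Going from 80 COCO classes to 3 custom classes would keep the first 3 rows under this scheme; whether index-based reuse is meaningful depends on how the new classes relate to the old ones.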

High-Level Interface: `ObjectDetectionModel`

The ObjectDetectionModel (in angelcv.interface.object_detection) provides a user-friendly API, abstracting many of the complexities of YoloDetectionModel and PyTorch Lightning.

from angelcv.interface import ObjectDetectionModel

Initialization

# Initialize from a configuration file (e.g., "yolov10n.yaml")
model = ObjectDetectionModel("yolov10n.yaml")

# Load a pre-trained model from a checkpoint file
model = ObjectDetectionModel("yolov10n.ckpt") # or "path/to/your/model.ckpt"

If a .yaml file is provided, a new model is built based on that configuration.
If a .ckpt or .pt file is provided, a pre-trained model is loaded from that checkpoint.

AngelCV can automatically resolve file paths, looking in predefined locations (like angelcv/config/model/) or S3 buckets if configured.
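The extension-based choice between building a new model and loading a checkpoint can be pictured as a small dispatch on the file suffix. The sketch below is only an illustration: `source_kind` is a hypothetical helper, not part of AngelCV, and it ignores the path and S3 resolution mentioned above.

```python
from __future__ import annotations

from pathlib import Path


def source_kind(source: str | Path) -> str:
    """Classify a model source by its file extension."""
    suffix = Path(source).suffix.lower()
    if suffix == ".yaml":
        return "config"      # build a new model from a configuration file
    if suffix in {".ckpt", ".pt"}:
        return "checkpoint"  # load a pre-trained model
    raise ValueError(f"Unsupported model source: {source!r}")
```

For instance, `source_kind("yolov10n.yaml")` yields `"config"`, while `source_kind("path/to/your/model.ckpt")` yields `"checkpoint"`.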

Prediction (Inference)

The primary way to perform inference is by calling the model instance or using the predict method:

# results = model("image.jpg") # Equivalent to model.predict(...)
results = model.predict(
    source: str | Path | torch.Tensor | np.ndarray | list[...],
    confidence_th: float = 0.3,
    image_size: int | None = None,
)

source: Can be a single image path, URL, PIL image, NumPy array, PyTorch tensor, or a list of these for batch processing.
confidence_th: Threshold to filter detections. Default is 0.3.
image_size: Target size for the longest side of the image during preprocessing. If None, uses the model's default image size.

This method returns a list of InferenceResult objects, one for each input image.
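The effect of image_size and confidence_th can be sketched in plain Python. This is a rough illustration under assumptions, not AngelCV's preprocessing or post-processing code: both function names are hypothetical, detections are modeled as dictionaries with a "score" key, and whether a score exactly equal to the threshold is kept is a guess.

```python
def resize_longest_side(width: int, height: int, image_size: int) -> tuple[int, int]:
    """Scale (width, height) so the longest side equals image_size."""
    scale = image_size / max(width, height)
    # Keep the aspect ratio; never let a side collapse to zero pixels.
    return max(1, round(width * scale)), max(1, round(height * scale))


def filter_by_confidence(detections: list[dict], confidence_th: float = 0.3) -> list[dict]:
    """Keep detections whose score reaches the threshold."""
    return [det for det in detections if det["score"] >= confidence_th]
```

A 1280x720 image with image_size=640 would be resized to 640x360 under this scheme, and a detection scored 0.1 would be dropped at the default threshold of 0.3.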

Training

results = model.train(
    dataset: str | Path, # Path to dataset YAML config (e.g., "coco.yaml")
    image_size: int | None = None,
    batch_size: int | None = None,
    num_workers: int | None = None,
    patience: int = 0, # For early stopping
    max_epochs: int = 100,
    # ... other PyTorch Lightning Trainer arguments (accelerator, devices, etc.)
)

Key arguments:

dataset: Path to a dataset configuration YAML file (see Configuration section).
image_size, batch_size, num_workers: Override default training parameters.
patience: Number of epochs to wait for validation loss improvement before early stopping.
**kwargs: You can pass any valid argument for the PyTorch Lightning Trainer (e.g., accelerator="gpu", devices=1, precision="16-mixed").

This method handles:

Updating the model configuration with the dataset information.
Adjusting the model's number of classes if the new dataset has a different count.
Setting up the appropriate data module (CocoDataModule or YOLODataModule).
Configuring PyTorch Lightning callbacks (ModelCheckpoint, EarlyStopping, LearningRateMonitor, etc.).
Running the training loop.

Returns a dictionary with training results.
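The patience argument follows the usual early-stopping rule: training ends once the monitored validation loss has failed to improve for that many consecutive epochs. The sketch below models only that rule. `early_stop_epoch` is a hypothetical function written for illustration, it assumes patience is at least 1, and it does not model what the default value of 0 means in AngelCV.

```python
def early_stop_epoch(val_losses: list[float], patience: int):
    """Return the epoch index at which training would stop, or None."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            # New best loss: reset the counter.
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None  # the run finished without triggering early stopping
```

With validation losses of 1.0, 0.8, 0.9, 0.85, 0.7 and a patience of 2, the rule triggers at epoch index 3; with a patience of 3, the improvement at the last epoch resets the counter and training runs to completion.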

Validation

results = model.validation(
    dataset: str | Path | None = None, # Optional: path to dataset YAML for validation
    # ... other arguments similar to train() for overriding datamodule settings
)

If dataset is not provided, the method falls back to the dataset information from the last training run or from the model's loaded configuration.

Testing

results = model.test(
    dataset: str | Path | None = None, # Optional: path to dataset YAML for testing
    # ... other arguments similar to train() for overriding datamodule settings
)

Similar to validation, this runs the model on a test set and computes metrics.

Exporting Models

AngelCV supports exporting models to standard formats like ONNX for deployment.

exported_path = model.export(
    format: str = "onnx", # Currently ONNX is the primary supported format
    output_path: str | None = None, # Optional: specify output file path
    # ... other format-specific export arguments (e.g., opset for ONNX)
)

format: The target export format (e.g., "onnx").
output_path: If not provided, a default path is generated.

Returns the Path to the exported model file.
