biapy.models.unext_v2

This module implements U-NeXt V2, a U-Net-based architecture whose encoder and decoder are built from ConvNeXt V2 blocks.

It combines the hierarchical encoder-decoder feature learning of U-Net with the design of ConvNeXt V2, whose blocks were co-designed and scaled together with masked autoencoder pretraining.

U_NeXt_V2 is designed for both 2D and 3D image segmentation tasks. It features a ConvNeXt V2-style encoder and decoder, with specialized blocks for downsampling, upsampling, and the bottleneck. It supports several optional configurations: super-resolution, multi-head outputs, and stochastic depth for regularization.
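The stochastic-depth option can be illustrated with a small schedule sketch. ConvNeXt-style networks commonly raise the drop probability linearly from 0 to a maximum across all blocks; whether U_NeXt_V2 distributes `stochastic_depth_prob` over `cn_layers` in exactly this way is an assumption here, and `linear_drop_schedule` is a hypothetical helper, not part of BiaPy.

```python
from typing import List


def linear_drop_schedule(cn_layers: List[int], max_prob: float) -> List[List[float]]:
    """Split a linear 0 -> max_prob ramp into one list of drop rates per level."""
    total = sum(cn_layers)
    if total == 0:
        return [[] for _ in cn_layers]
    # Assumed schedule: linear ramp over all blocks; a single block gets 0.0.
    step = max_prob / (total - 1) if total > 1 else 0.0
    rates = [i * step for i in range(total)]
    schedule, start = [], 0
    for n_blocks in cn_layers:
        # Each encoder level takes the next n_blocks rates from the ramp.
        schedule.append(rates[start:start + n_blocks])
        start += n_blocks
    return schedule


# Values taken from the documented defaults: cn_layers and stochastic_depth_prob.
schedule = linear_drop_schedule([2, 2, 2, 2], 0.1)
```

Under this assumed schedule, the first block is never dropped and the deepest block is dropped with the full `stochastic_depth_prob`.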

Classes:

U_NeXt_V2: The main U-NeXt model (Version 2).

This module relies on building blocks defined in biapy.models.blocks, such as UpConvNeXtBlock_V2, ConvNeXtBlock_V2, and ProjectionHead.

References:


class biapy.models.unext_v2.U_NeXt_V2(image_shape=(256, 256, 1), feature_maps=[32, 64, 128, 256], upsample_layer='convtranspose', z_down=[2, 2, 2, 2], yx_down=[2, 2, 2, 2], output_channels=[1], separated_decoders=False, output_channel_info=['F'], explicit_activations: bool = False, head_activations: List[str] = ['ce_sigmoid'], upsampling_factor=(), upsampling_position='pre', stochastic_depth_prob=0.1, cn_layers=[2, 2, 2, 2], isotropy=True, stem_k_size=2, contrast: bool = False, contrast_proj_dim: int = 256, return_one_tensor: bool = False)[source]

Bases: Module

Create 2D/3D U-NeXt V2 (U-Net based model with ConvNeXt V2 blocks).

U-NeXt V2 combines the classic U-Net architecture with modern ConvNeXt V2 blocks, leveraging the co-design and scaling principles from Masked Autoencoders. This model aims to achieve high performance in biomedical image segmentation by integrating strong hierarchical feature learning with efficient and robust convolutional designs.

References: U-Net: Convolutional Networks for Biomedical Image Segmentation and ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders.
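A minimal construction sketch, using only keyword names that appear in the signature above. Two points are assumptions rather than documented behavior: that a four-element `image_shape` of the form (z, y, x, channels) selects the 3D variant, and that the remaining defaults are suitable for it. `build_model` is a hypothetical wrapper that imports BiaPy lazily, so the configuration dictionaries can be inspected without BiaPy installed.

```python
from typing import Any, Dict

# Keyword arguments copied from the documented signature (2D defaults).
kwargs_2d: Dict[str, Any] = dict(
    image_shape=(256, 256, 1),
    feature_maps=[32, 64, 128, 256],
    upsample_layer="convtranspose",
    yx_down=[2, 2, 2, 2],
    output_channels=[1],
    stochastic_depth_prob=0.1,
    cn_layers=[2, 2, 2, 2],
)

# Assumption: a 4-element image_shape (z, y, x, channels) selects the 3D model.
kwargs_3d: Dict[str, Any] = dict(
    kwargs_2d,
    image_shape=(64, 256, 256, 1),
    z_down=[2, 2, 2, 2],
)


def build_model(kwargs: Dict[str, Any]):
    """Instantiate U_NeXt_V2; requires BiaPy and PyTorch to be installed."""
    from biapy.models.unext_v2 import U_NeXt_V2  # imported lazily on purpose

    return U_NeXt_V2(**kwargs)
```

Calling `build_model(kwargs_2d)` should then return a `torch.nn.Module`, since the class derives from `Module`.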

forward(x) → Dict | Tensor[source]

Forward pass of the model.

Parameters: x (torch.Tensor) – Input tensor of shape (batch_size, channels, height, width) for 2D or (batch_size, channels, depth, height, width) for 3D.
Returns: Model output. Returns a dictionary if multi-head or contrastive outputs are enabled, otherwise returns the main prediction tensor.
Return type: Dict or torch.Tensor
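Because `forward` expects a channels-first tensor and may return either a dictionary or a tensor, calling code usually needs two small conversions. The sketch below shows one way to handle both. It assumes that `image_shape` is channels-last, as the default `(256, 256, 1)` suggests, and the dictionary key `"pred"` is a guess for illustration, not a documented key. Both helpers are hypothetical and not part of BiaPy.

```python
from typing import Any, Tuple


def input_shape(image_shape: Tuple[int, ...], batch_size: int = 1) -> Tuple[int, ...]:
    """Convert a channels-last image_shape to the channels-first tensor shape."""
    *spatial, channels = image_shape
    return (batch_size, channels, *spatial)


def main_prediction(output: Any, key: str = "pred") -> Any:
    """Return the main prediction whether forward() gave a dict or a tensor."""
    if isinstance(output, dict):
        # Assumed key name; check the keys the model actually returns.
        return output[key]
    return output


shape_2d = input_shape((256, 256, 1), batch_size=4)
shape_3d = input_shape((64, 256, 256, 1))
```

A tensor created with `torch.zeros(shape_2d)` would then match the 2D layout described for `x`, and `main_prediction(model(x))` gives the same downstream code path for both return types.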