biapy.models.unext_v1

This module implements the U-NeXt architecture (Version 1), a U-Net variant that integrates elements from the ConvNeXt model.

It aims to combine the strong hierarchical feature learning of U-Nets with the modern ConvNeXt design principles, which are inspired by Vision Transformers but retain the efficiency and inductive biases of convolutional networks.

U-NeXt_V1 is designed for both 2D and 3D image segmentation tasks. It features a ConvNeXt-style encoder and decoder, with specialized blocks for downsampling, upsampling, and the bottleneck. It supports various configurations, including optional super-resolution, multi-head outputs, and stochastic depth for regularization.
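As a rough illustration of stochastic depth, the sketch below spreads drop probabilities linearly across all blocks, from 0 at the first block to the maximum at the last, as the ConvNeXt reference implementation does. Whether U_NeXt_V1 uses exactly this schedule is an assumption. The helper name linear_drop_rates is hypothetical, and its two arguments only mirror the cn_layers and stochastic_depth_prob arguments of the class signature.

```python
from typing import List


def linear_drop_rates(cn_layers: List[int], max_prob: float) -> List[List[float]]:
    """Spread drop probabilities linearly from 0 to max_prob over all blocks.

    Hypothetical helper: it illustrates a ConvNeXt-style schedule and is
    not taken from the BiaPy source.
    """
    total = sum(cn_layers)
    rates: List[List[float]] = []
    block_id = 0
    for n_blocks in cn_layers:
        stage = []
        for _ in range(n_blocks):
            # Guard against a single-block network (avoids division by zero).
            frac = block_id / (total - 1) if total > 1 else 0.0
            stage.append(max_prob * frac)
            block_id += 1
        rates.append(stage)
    return rates


# Defaults from the class signature: cn_layers=[2, 2, 2, 2], stochastic_depth_prob=0.1
schedule = linear_drop_rates([2, 2, 2, 2], 0.1)
for level, stage in enumerate(schedule):
    print(level, [round(p, 4) for p in stage])
```

Under this schedule, early blocks are almost never dropped while the deepest blocks are dropped with a probability close to the configured maximum.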

Classes:

  • U_NeXt_V1: The main U-NeXt model (Version 1).

This module relies on building blocks defined in biapy.models.blocks, such as UpConvNeXtBlock_V1, ConvNeXtBlock_V1, and ProjectionHead.

References:

Image representation:

[Figure: U-NeXt architecture diagram (unext.png)]

class biapy.models.unext_v1.U_NeXt_V1(image_shape=(256, 256, 1), feature_maps=[32, 64, 128, 256], upsample_layer='convtranspose', z_down=[2, 2, 2, 2], yx_down=[2, 2, 2, 2], output_channels=[1], separated_decoders=False, output_channel_info=['F'], explicit_activations: bool = False, head_activations: List[str] = ['ce_sigmoid'], upsampling_factor=(), upsampling_position='pre', stochastic_depth_prob=0.1, layer_scale=1e-06, cn_layers=[2, 2, 2, 2], isotropy=True, stem_k_size=2, contrast: bool = False, contrast_proj_dim: int = 256, return_one_tensor: bool = False)

Bases: Module

Create 2D/3D U-NeXt (Version 1) model.

U-NeXt combines the classic U-Net architecture with modern ConvNeXt blocks, aiming to leverage both the strong hierarchical feature learning of U-Nets and the efficiency and inductive biases of ConvNeXt. It is designed for biomedical image segmentation tasks.

References: "U-Net: Convolutional Networks for Biomedical Image Segmentation" and "A ConvNet for the 2020s".
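To make the effect of the yx_down factors concrete, the following sketch traces the in-plane size of the feature maps through successive downsampling steps. It assumes that image_shape is channels-last, as the default (256, 256, 1) suggests, and that one downsampling step is applied per entry of yx_down. The real model may apply a different number of steps (for example through its stem), so read this as a back-of-the-envelope check only. The helper name trace_yx_sizes is hypothetical.

```python
from typing import List, Tuple


def trace_yx_sizes(image_shape: Tuple[int, ...], yx_down: List[int]) -> List[Tuple[int, int]]:
    """Return (height, width) before and after each downsampling step.

    Hypothetical helper. Assumes a channels-last image_shape, i.e.
    (y, x, c) for 2D or (z, y, x, c) for 3D.
    """
    h, w = image_shape[-3], image_shape[-2]
    sizes = [(h, w)]
    for step, factor in enumerate(yx_down):
        # A size that is not divisible by the factor would lose pixels
        # on the way down and not match the skip connection on the way up.
        if h % factor or w % factor:
            raise ValueError(
                f"size {(h, w)} is not divisible by factor {factor} at step {step}"
            )
        h, w = h // factor, w // factor
        sizes.append((h, w))
    return sizes


# Defaults from the class signature.
sizes = trace_yx_sizes((256, 256, 1), [2, 2, 2, 2])
print(sizes)
```

A check of this kind is useful before training, because an input size that does not divide evenly by the accumulated downsampling factor is a common source of shape mismatches in U-Net-style decoders.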

forward(x) → Dict | Tensor

Forward pass of the model.

Parameters:

x (torch.Tensor) – Input tensor of shape (batch_size, channels, height, width) for 2D or (batch_size, channels, depth, height, width) for 3D.
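The constructor's image_shape default (256, 256, 1) appears to be channels-last, while forward expects channels right after the batch dimension. The sketch below converts one layout to the other. The helper name input_tensor_shape is hypothetical, and the channels-last reading of image_shape is an assumption based on the default value.

```python
from typing import Tuple


def input_tensor_shape(image_shape: Tuple[int, ...], batch_size: int = 1) -> Tuple[int, ...]:
    """Map (y, x, c) or (z, y, x, c) to the layout expected by forward().

    Hypothetical helper. The result is (batch, c, y, x) for 2D or
    (batch, c, z, y, x) for 3D.
    """
    *spatial, channels = image_shape
    if len(spatial) not in (2, 3):
        raise ValueError(f"expected a 2D or 3D image shape, got {image_shape}")
    return (batch_size, channels, *spatial)


print(input_tensor_shape((256, 256, 1)))                    # 2D case
print(input_tensor_shape((40, 128, 128, 2), batch_size=4))  # 3D case
```

The resulting tuple can be passed to a tensor constructor such as torch.zeros to build a dummy input for a quick shape test.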

Returns:

Model output. Returns a dictionary if multi-head or contrastive outputs are enabled, otherwise returns the main prediction tensor.

Return type:

Dict or torch.Tensor
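Because forward may return either a dictionary or a single tensor, calling code often benefits from normalizing the result first. The sketch below shows one way to do that. The helper name as_output_dict and the key "pred" are assumptions made for illustration. The key names the model actually uses are not stated here, so check them against the returned dictionary.

```python
from typing import Any, Dict


def as_output_dict(output: Any, main_key: str = "pred") -> Dict[str, Any]:
    """Wrap a bare prediction in a dict and pass dict outputs through unchanged.

    Hypothetical helper; main_key is an assumed name for the main prediction.
    """
    if isinstance(output, dict):
        return output
    return {main_key: output}


# Stand-in values are used instead of real tensors to keep the sketch dependency-free.
single = as_output_dict([0.1, 0.9])
multi = as_output_dict({"pred": [0.1, 0.9], "embed": [1.0, 2.0]})
print(sorted(single), sorted(multi))
```

Downstream code can then always index the result by key, regardless of whether multi-head or contrastive outputs are enabled.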