Configuration
In order to use BiaPy, a plain text YAML configuration file must be created using YACS. This configuration file includes information about the hardware to be used, such as the number of CPUs or GPUs, the specific task or workflow, the model to be used, optional hyperparameters, the optimizer, and the paths for loading and storing data.
As an example, a full pipeline for semantic segmentation can be created using this configuration file. This file would include information on the specific model to be used, any necessary hyperparameters, the optimizer to be used during training, and the paths for loading and storing data. This configuration file is an essential component of BiaPy and is used to streamline the execution of the pipeline and ensure reproducibility of the results.
PROBLEM:
TYPE: SEMANTIC SEG
NDIM: 2D
DATA:
PATCH_SIZE: (256, 256, 1)
TRAIN:
PATH: /TRAIN_PATH
GT_PATH: /TRAIN_GT_PATH
VAL:
SPLIT_TRAIN: 0.1
TEST:
PATH: /TEST_PATH
AUGMENTOR:
ENABLE: True
RANDOM_ROT: True
MODEL:
ARCHITECTURE: unet
TRAIN:
OPTIMIZER: ADAMW
LR: 1.E−4
BATCH_SIZE: 6
EPOCHS: 360
TEST:
POST_PROCESSING:
YZ_FILTERING: True
In order to run BiaPy, a YAML configuration file must be created. Examples for each workflow can be found in the templates folder on the BiaPy GitHub repository. If you are unsure about which workflow is most suitable for your data, you can refer to the Select Workflow page for guidance.
The options for the configuration file can be found in the config.py file on the BiaPy GitHub repository. However, some of the most commonly used options are explained below:
System
To limit the number of CPUs used by the program, use the SYSTEM.NUM_WORKERS
option.
Problem specification
To specify the type of workflow, use the PROBLEM.TYPE
option and select one of the following options: SEMANTIC_SEG
, INSTANCE_SEG
, DETECTION
, DENOISING
, SUPER_RESOLUTION
, SELF_SUPERVISED
, or CLASSIFICATION
.
To specify whether the data is 2D
or 3D
, use the PROBLEM.NDIM
option and select either 2D
or 3D
.
Data management
The DATA.PATCH_SIZE
variable is used to specify the shape of the images that will be used in the workflow. The order of the dimensions for 2D
images is (y,x,c)
and for 3D
images it is (z,y,x,c)
.
The paths for the training data are set using the DATA.TRAIN.PATH
and DATA.TRAIN.GT_PATH
variables (if necessary, depending on the specific workflow). Similarly, the paths for the validation data can be set using DATA.VAL.PATH
and DATA.VAL.GT_PATH
unless DATA.VAL.FROM_TRAIN
is set, in which case these variables do not need to be defined. For test data, the DATA.TEST.PATH
variable should be set if TEST.ENABLE
is True
. However, DATA.TEST.GT_PATH
is not used when DATA.TEST.LOAD_GT
is disabled, as there is usually no ground truth for test data.
There are two ways to handle the data during the workflow: 1) loading all images into memory at once, or 2) loading each image individually as it is needed. This behavior can be set for the training, validation, and test data using the DATA.TRAIN.IN_MEMORY
, DATA.VAL.IN_MEMORY
, and DATA.TEST.IN_MEMORY
variables, respectively.
When loading training data into memory, i.e. setting DATA.TRAIN.IN_MEMORY
to True
, all the images will be loaded into memory only once. During this process, each image will be divided into patches of size DATA.PATCH_SIZE
using DATA.TRAIN.OVERLAP
and DATA.TRAIN.PADDING
. By default, the minimum overlap is used, and the patches will always cover the entire image. In this configuration, the validation data can be created from the training data by setting DATA.VAL.SPLIT_TRAIN
to the desired percentage of the training data to be used as validation. For this, DATA.VAL.FROM_TRAIN
and DATA.VAL.IN_MEMORY
must be set to True
. In general, loading training data in memory is the fastest approach, but it relies on having enough memory available on the computer.
Alternatively, when data is not loaded into memory, i.e. DATA.TRAIN.IN_MEMORY
is set to False, a number of images equal to TRAIN.BATCH_SIZE
will be loaded from the disk for each training epoch. If an image does not match the selected shape, i.e. DATA.PATCH_SIZE
, you can use DATA.EXTRACT_RANDOM_PATCH
to extract a random patch from the image. As this approach requires loading each image multiple times, it is slower than the first approach but it saves memory.
In the case of test data, even if DATA.TEST.IN_MEMORY
is selected or not, each image is cropped to DATA.PATCH_SIZE
using DATA.TEST.OVERLAP
and DATA.TEST.PADDING
. Minimum overlap is made by default and the patches always cover the entire image. If ground truth is available you can set DATA.TEST.LOAD_GT
to load it and measure the performance of the model. The metrics used depends on the workflow selected.
See also
In general, in test phase, if for some reason the images loaded are smaller than the given patch size, i.e. DATA.PATCH_SIZE
, there will be no option to extract a patch from it. For that purpose the variable DATA.REFLECT_TO_COMPLETE_SHAPE
was created so the image can be reshaped in those dimensions to complete DATA.PATCH_SIZE
shape when needed.
See also
Set DATA.TRAIN.RESOLUTION
and DATA.TEST.RESOLUTION
to let the model know the resolution of training and test data respectively. In training, that information will be taken into account for some data augmentations. In test, that information will be used when the user selects to remove points from predictions in detection workflow.
Data normalization
Two options are available for normalizing the data:
Adjusting it to the
[0-1]
range, which is the default option. This can be done by settingDATA.NORMALIZATION.TYPE
todiv
.Custom normalization using a specified mean (
DATA.NORMALIZATION.CUSTOM_MEAN
) and standard deviation (DATA.NORMALIZATION.CUSTOM_STD
). This can be done by settingDATA.NORMALIZATION.TYPE
tocustom
. If the mean and standard deviation are both set to-1
, which is the default, they will be calculated based on the training data. These values will be stored in the job’s folder to be used at the inference phase, so that the test images are normalized using the same values. If specific values for mean and standard deviation are provided, those values will be used for normalization.
Pre-processing
There are a few pre-processing functions (controlled by DATA.PREPROCESS
) that can be applied to the train (DATA.PREPROCESS.TRAIN
), validation (DATA.PREPROCESS.VAL
) or test data (DATA.PREPROCESS.TEST
). So they can be applied the images need to be loaded in memory (DATA.*.IN_MEMORY
to True
). The pre-processing is done right after loading the images, when no normalization has been done yet. These is the list of available functions:
Resize (controlled by
DATA.PREPROCESS.RESIZE
): to resize images to the desired shape.Gaussian blur (controlled by
DATA.PREPROCESS.GAUSSIAN_BLUR
): to add gaussian blur.Median blur (controlled by
DATA.PREPROCESS.MEDIAN_BLUR
): to add median blur.CLAHE (controlled by
DATA.PREPROCESS.CLAHE
): to apply a contrast limited adaptive histogram equalization.Canny (controlled by
DATA.PREPROCESS.CANNY
): to apply Canny or edge detection (only for2D
images, grayscale or RGB).
Data augmentation
The AUGMENTOR.ENABLE
variable must be set to True
to enable data augmentation (DA). The probability of each transformation is set using the AUGMENTOR.DA_PROB
variable. BiaPy offers a wide range of transformations, which can be found in the config.py file in the BiaPy repository on GitHub.
Images generated using data augmentation will be saved in the PATHS.DA_SAMPLES
directory (which is aug
by default). This allows you to check the data augmentation applied to the images. If you want a more exhaustive check, you can save all the augmented training data by enabling DATA.CHECK_GENERATORS
. The images will be saved in PATHS.GEN_CHECKS
and PATHS.GEN_MASK_CHECKS
. Be aware that this option can consume a large amount of disk space as the training data will be entirely copied.
Model definition
BiaPy offers three different backends to be used to choose a model (controlled by MODEL.SOURCE
):
biapy
, which uses BiaPy as the backend for the model definition. UseMODEL.ARCHITECTURE
to select the model. Different models for each workflow are implemented:Semantic segmentation:
unet
,resunet
,resunet++
,attention_unet
,seunet
,multiresunet
andunetr
.Instance segmentation:
unet
,resunet
,resunet++
,attention_unet
,seunet
,multiresunet
andunetr
.Detection:
unet
,resunet
,resunet++
,attention_unet
andseunet
.Denoising:
unet
,resunet
,resunet++
,attention_unet
andseunet
.Super-resolution:
edsr
,rcan
,dfcan
,wdsr
,unet
,resunet
,resunet++
,attention_unet
,seunet
andmultiresunet
.Self-supervision:
edsr
,rcan
,dfcan
,wdsr
,unet
,resunet
,resunet++
,attention_unet
,seunet
,multiresunet
,unetr
,vit
andmae
.Classification:
simple_cnn
,efficientnet_b0
,efficientnet_b1
,efficientnet_b2
,efficientnet_b3
,efficientnet_b4
,efficientnet_b5
,efficientnet_b6
,efficientnet_b7
,vit
.
For
unet
,resunet
,resunet++
,attention_unet
andseunet
architectures you can setMODEL.FEATURE_MAPS
to determine the feature maps to use on each network level. In the same way,MODEL.DROPOUT_VALUES
can be set for each level in those networks. Forunetr
andvit
networks only the first value of those variables will be taken into account.The
MODEL.BATCH_NORMALIZATION
variable can be used to enable batch normalization on theunet
,resunet
,resunet++
,attention_unet
,seunet
andunetr
models. For the3D
versions of these networks (except forunetr
), theMODEL.Z_DOWN
option can also be used to avoid downsampling in the z-axis, which is typically beneficial for anisotropic data.The
MODEL.N_CLASSES
variable can be used to specify the number of classes for the classification problem, excluding the background class (labeled as0
). If the number of classes is set to1
or2
, the problem is considered binary, and the behavior is the same. For more than2
classes, the problem is considered multi-class, and the output of the models will have the corresponding number of channels.Finally, the
MODEL.LOAD_CHECKPOINT
variable can be used to load a pre-trained checkpoint of the network (for finetunning).torchvision
, which uses models defined in TorchVision. UseMODEL.TORCHVISION_MODEL_NAME
to select the model. All the models load pretrained their corresponding default weights. Currently, BiaPy supports the following models for each workflow:Semantic segmentation (defined here):
deeplabv3_mobilenet_v3_large
,deeplabv3_resnet101
,deeplabv3_resnet50
,fcn_resnet101
,fcn_resnet50
andlraspp_mobilenet_v3_large
.Instance segmentation (defined here):
maskrcnn_resnet50_fpn
andmaskrcnn_resnet50_fpn_v2
.Detection (defined here):
fasterrcnn_mobilenet_v3_large_320_fpn
,fasterrcnn_mobilenet_v3_large_fpn
,fasterrcnn_resnet50_fpn
,fasterrcnn_resnet50_fpn_v2
,fcos_resnet50_fpn
,ssd300_vgg16
,ssdlite320_mobilenet_v3_large
,retinanet_resnet50_fpn
,retinanet_resnet50_fpn_v2
.Denoising: no model available with TorchVision backend.
Super-resolution: no model available with TorchVision backend.
Self-supervision: no model available with TorchVision backend.
Classification (defined here):
alexnet
,convnext_base
,convnext_large
,convnext_small
,convnext_tiny
,densenet121
,densenet161
,densenet169
,densenet201
,efficientnet_b0
,efficientnet_b1
,efficientnet_b2
,efficientnet_b3
,efficientnet_b4
,efficientnet_b5
,efficientnet_b6
,efficientnet_b7
,efficientnet_v2_l
,efficientnet_v2_m
,efficientnet_v2_s
,googlenet
,inception_v3
,maxvit_t
,mnasnet0_5
,mnasnet0_75
,mnasnet1_0
,mnasnet1_3
,mobilenet_v2
,mobilenet_v3_large
,mobilenet_v3_small
,quantized_googlenet
,quantized_inception_v3
,quantized_mobilenet_v2
,quantized_mobilenet_v3_large
,quantized_resnet18
,quantized_resnet50
,quantized_resnext101_32x8d
,quantized_resnext101_64x4d
,quantized_shufflenet_v2_x0_5
,quantized_shufflenet_v2_x1_0
,quantized_shufflenet_v2_x1_5
,quantized_shufflenet_v2_x2_0
,regnet_x_16gf
,regnet_x_1_6gf
,regnet_x_32gf
,regnet_x_3_2gf
,regnet_x_400mf
,regnet_x_800mf
,regnet_x_8gf
,regnet_y_128gf
,regnet_y_16gf
,regnet_y_1_6gf
,regnet_y_32gf
,regnet_y_3_2gf
,regnet_y_400mf
,regnet_y_800mf
,regnet_y_8gf
,resnet101
,resnet152
,resnet18
,resnet34
,resnet50
,resnext101_32x8d
,resnext101_64x4d
,resnext50_32x4d
,retinanet_resnet50_fpn
,shufflenet_v2_x0_5
,shufflenet_v2_x1_0
,shufflenet_v2_x1_5
,shufflenet_v2_x2_0
,squeezenet1_0
,squeezenet1_1
,swin_b
,swin_s
,swin_t
,swin_v2_b
,swin_v2_s
,swin_v2_t
,vgg11
,vgg11_bn
,vgg13
,vgg13_bn
,vgg16
,vgg16_bn
,vgg19
,vit_b_16
,vit_b_32
,vit_h_14
,vit_l_16
,vit_l_32
,wide_resnet101_2
andwide_resnet50_2
.
bmz
, which uses Bioimage Model Zoo (bioimage.io) pretrained models. More a more models are added to the zoo so please check Bioimage Model Zoo page to see available models. BiaPy can only consume models exported with Torchscript.
Training phase
To activate the training phase, set the TRAIN.ENABLE
variable to True
. The TRAIN.OPTIMIZER
variable can be set to either SGD
, ADAM
or ADAMW
, and the learning rate can be set using the TRAIN.LR
variable. If you do not have much expertise in choosing these settings, you can use ADAMW
and 1.E-4
as a starting point. It is also possible to use a learning rate scheduler with TRAIN.LR_SCHEDULER
variable.
Additionally, you need to specify how many images will be fed into the network at the same time using the TRAIN.BATCH_SIZE
variable. For example, if you have 100
training samples and you select a batch size of 6
, this means that 17
batches (100/6 = 16.6
) are needed to input all the training data to the network, after which one epoch is completed.
To train the network, you need to specify the number of epochs using the TRAIN.EPOCHS
variable. You can also set the patience using TRAIN.PATIENCE
, which will stop the training process if no improvement is made on the validation data for that number of epochs.
Test phase
To initiate the testing phase, also referred to as inference or prediction, one must set the variable TEST.ENABLE
to True
within the BiaPy framework. BiaPy provides two distinct prediction options contingent upon the dimensions of the test images to be predicted. It is essential to consider that not only must the test image fit into memory, but also the model’s prediction, characterized by a data type of float32
(or float16
if TEST.REDUCE_MEMORY
is activated). Moreover, if the test image cannot be accommodated within the GPU memory, a cropping procedure becomes necessary. Typically, this entails cropping into patches with overlap and/or padding to circumvent border effects during the reconstruction of the original shape, albeit at the expense of increased memory usage. Given these considerations, two alternative procedures are available for predicting a test image:
When each test image can be fit in memory the procedure is the following. In this scenario there are two options:
First option, and the default, is where each test image is divided into patches of size
DATA.PATCH_SIZE
and passed through the network individually. Then, the original image will be reconstructed. Apart from this, it will automatically calculate performance metrics per patch and per reconstructed image if the ground truth is available (enabled byDATA.TEST.LOAD_GT
).Second option is to enable
TEST.FULL_IMG
, to pass entire images through the model without cropping them. This option requires enough GPU memory to fit the images into, so to prevent possible errors it is only available for2D
images.
In both options described above you can also use test-time augmentation by setting
TEST.AUGMENTATION
toTrue
, which will create multiple augmented copies of each patch, or image ifTEST.FULL_IMG
selected, by all possible rotations (8
copies in2D
and16
in3D
). This will slow down the inference process, but it will return more robust predictions.You can use also use
DATA.REFLECT_TO_COMPLETE_SHAPE
to ensure that the patches can be made as pointed out in Data management.See also
If the test images are large and you experience memory issues during the testing phase, you can set the
TEST.REDUCE_MEMORY
variable toTrue
. This will reduce memory usage as much as possible, but it may slow down the inference process.When each test image can not fit into memory (scalable solution):
BiaPy offers to use H5 or Zarr files to generate predictions by configuring
TEST.BY_CHUNKS
variable. In this setting,TEST.BY_CHUNKS.FORMAT
decides which files are you working with andTEST.BY_CHUNKS.INPUT_IMG_AXES_ORDER
sets the axis order (all the test images need to be order in the same way). This way, BiaPy enables multi-GPU processing per image by chunking large images into patches with overlap and padding to mitigate artifacts at the edges. Each GPU processes a chunk of the large image, storing the patch in its designated location using Zarr or H5 file formats. This is possible because these file formats facilitate reading and storing data chunks without requiring the entire file to be loaded into memory. Consequently, our approach allows the generation of predictions for large images, overcoming potential memory bottlenecks.Warning
There is also an option to generate a TIFF file from the predictions with
TEST.BY_CHUNKS.SAVE_OUT_TIF
. However, take into account that this option require to load the entire data into memory, which is sometimes not fleasible.After the prediction is generated the variable
TEST.BY_CHUNKS.WORKFLOW_PROCESS.ENABLE
controls whether the rest of the workflow process is going to be done or not (as may require large memory consumption depending on the workflow). If enabled, the prediction can be processed in two different ways (controlled byTEST.BY_CHUNKS.WORKFLOW_PROCESS.TYPE
):chunk_by_chunk
: prediction will be processed by chunks, where each chunk will be considered as an individual image. Select this operation if you have not enough memory to process the entire prediction image withentire_pred
.entire_pred
: the predicted image will be loaded in memory at once and processed entirely (be aware of your memory budget).
The option
chunk_by_chunk
is not trivial depending on the workflow, e.g. in instance segmentation different instances on each chunk need to be merged into one. Three workflows need to post-process the predictions to have a final result, semantic segmentation, instance segmentation and detection. Currently,chunk_by_chunk
is only supported in detection workflow.
Metric measurement
We can divide the metrics in two types:
Ones calculated in all the workflows that return probabilities from their models, e.g. semantic/instance segmentation and detection. Currently, it is measured the Intersection over Union (IoU).
Workflow-specific metrics tailored for the task at hand.
Post-processing
BiaPy is equipped with several post-processing methods that are primarily applied in two distinct stages:
After the network’s prediction. These post-processing methods are common among workflows that return probabilities from their models, e.g. semantic/instance segmentation and detection. These post-processing methods aim to improve the resulting probabilities. Currently, these post-processing methods are only avaialable for
3D
images (e.g.PROBLEM.NDIM
is2D
orTEST.ANALIZE_2D_IMGS_AS_3D_STACK
isTrue
):
TEST.POST_PROCESSING.APPLY_MASK
: a binary mask is applied to remove anything not contained within the mask. For this, theDATA.TEST.BINARY_MASKS
path needs to be set.
TEST.POST_PROCESSING.Z_FILTERING
: Z-axis filtering is applied for3D
data. Additionally, you can apply a YZ-axes filtering usingTEST.POST_PROCESSING.YZ_FILTERING
variable.
After each workflow main process is done there is another post-processing step on some of the workflows to achieve the final results, i.e. workflow-specific post-processing methods. Find a full description of each method inside the workflow description:
Instance segmentation:
Big instance repair
Filter instances by morphological features
Detection:
Remove close points
Create instances from points