biapy.data.data_manipulation
Data Manipulation Module for BiaPy.
This module provides a collection of functions for loading, processing, and manipulating biological image data for deep learning applications. It supports both 2D and 3D data formats, including common file types like TIFF, HDF5, Zarr, and NumPy arrays.
Key Functionalities:
Loading training, validation, and test data from various formats
Data preprocessing and normalization
Image cropping and patching with overlap
Data filtering based on various properties
Cross-validation and train-test splitting
Data augmentation and shape manipulation
Format conversion (e.g., to one-hot encoding)
Data saving in multiple formats
The module supports:
Both 2D and 3D image data
Multiple input formats (TIFF, HDF5, Zarr, NumPy arrays)
Classification and segmentation workflows
Memory-efficient loading of large datasets
Parallel processing capabilities
Data validation and consistency checks
Main Classes and Functions:
load_and_prepare_train_data(): Main function for loading training data
load_and_prepare_test_data(): Function for loading test data
load_and_prepare_cls_test_data(): For classification test data
samples_from_image_list(): Creates dataset from image list
samples_from_zarr(): Handles Zarr/HDF5 datasets
filter_samples_by_properties(): Filters data based on conditions
img_to_onehot_encoding(): Converts masks to one-hot format
save_tif(), save_npy_files(): Data saving utilities
Typical Workflow:
Load data using one of the load_and_prepare_* functions
Apply preprocessing/normalization
Filter or augment data as needed
Use in training or save processed data
- biapy.data.data_manipulation.load_and_prepare_train_data(train_path: str, train_mask_path: str, train_in_memory: str, train_ov: Tuple[float, ...], train_padding: Tuple[int, ...], val_path: str, val_mask_path: str, val_in_memory: bool, val_ov: Tuple[float, ...], val_padding: Tuple[int, ...], norm_module: Dict, crop_shape: Tuple[int, ...], cross_val: bool = False, cross_val_nsplits: int = 5, cross_val_fold: int = 1, val_split: float = 0.1, seed: int = 0, shuffle_val: bool = True, train_preprocess_f: Callable | None = None, train_preprocess_cfg: CfgNode | None = None, train_filter_props: List[List[str]] = [], train_filter_vals: List[List[float]] = [], train_filter_signs: List[List[str]] = [], val_preprocess_f: Callable | None = None, val_preprocess_cfg: CfgNode | None = None, val_filter_props: List[List[str]] = [], val_filter_vals: List[List[float]] = [], val_filter_signs: List[List[str]] = [], filter_by_entire_image: bool = True, norm_before_filter: bool = False, random_crops_in_DA: bool = False, y_upscaling: Tuple[int, ...] = (1, 1), gt_channels_expected: int = 1, reflect_to_complete_shape: bool = False, convert_to_rgb: bool = False, is_y_mask: bool = False, is_3d: bool = False, train_zarr_data_information: Dict | None = None, val_zarr_data_information: Dict | None = None, multiple_raw_images: bool = False, save_filtered_images: bool = True, save_filtered_images_dir: str | None = None, save_filtered_images_num: int = 3) -> Tuple[BiaPyDataset, BiaPyDataset, BiaPyDataset, BiaPyDataset]
Load training and validation data.
- Parameters:
train_path (str) – Path to the training data.
train_mask_path (str) – Path to the training data masks.
train_in_memory (str) – Whether the training data must be loaded in memory or not.
train_ov (2D/3D float tuple, optional) – Minimum amount of overlap on the x and y dimensions for the train data. The values must be in the range [0, 1), that is, from 0% to 99% overlap. Shape is (y, x) for 2D or (z, y, x) for 3D.
train_padding (2D/3D int tuple, optional) – Size of padding to be added on each axis to the train data. Shape is (y, x) for 2D or (z, y, x) for 3D.
val_path (str) – Path to the validation data.
val_mask_path (str) – Path to the validation data masks.
val_in_memory (bool) – Whether the validation data must be loaded in memory or not.
val_ov (2D/3D float tuple, optional) – Minimum amount of overlap on the x and y dimensions for the validation data. The values must be in the range [0, 1), that is, from 0% to 99% overlap. Shape is (y, x) for 2D or (z, y, x) for 3D.
val_padding (2D/3D int tuple, optional) – Size of padding to be added on each axis to the validation data. Shape is (y, x) for 2D or (z, y, x) for 3D.
norm_module (Dict) – Information about the normalization.
crop_shape (3D/4D int tuple, optional) – Shape of the crops. E.g. (y, x, channels) for 2D and (z, y, x, channels) for 3D.
cross_val (bool, optional) – Whether to use cross validation or not.
cross_val_nsplits (int, optional) – Number of folds for the cross validation.
cross_val_fold (int, optional) – Number of the fold to be used as validation.
val_split (float, optional) – Fraction of the train data used as validation (value between 0 and 1).
seed (int, optional) – Seed value.
shuffle_val (bool, optional) – Take random training examples to create validation data.
train_preprocess_f (function, optional) – The train preprocessing function. Needed only if you want to apply any preprocessing.
train_preprocess_cfg (dict, optional) – Configuration parameters for train preprocessing. Needed only if you want to apply any preprocessing.
train_filter_props (list of lists of str) – Filter conditions to be applied to the train data. The three variables, filter_props, filter_vals and filter_signs, together compose a list of conditions used to remove samples from the list. Each one is a list of lists of conditions, for instance [['A'], ['B','C']]. A sample is removed if it satisfies the first list of conditions (only 'A' in this case, from the ['A'] list) or if it satisfies both 'B' and 'C' (from the ['B','C'] list). Within each sublist all the conditions must be satisfied. Available properties are: ['foreground','mean','min','max']. Each property description:
    'foreground' is defined as the mask foreground percentage.
    'mean' is defined as the mean value.
    'min' is defined as the min value.
    'max' is defined as the max value.
    'diff' is defined as the difference between ground truth and raw images. Requires y_dataset to be provided.
    'diff_by_min_max_ratio' is defined as the difference between ground truth and raw images multiplied by the ratio between raw image max and min.
    'target_mean' is defined as the mean intensity value of the raw image targets. Requires y_dataset to be provided.
    'target_min' is defined as the min intensity value of the raw image targets. Requires y_dataset to be provided.
    'target_max' is defined as the max intensity value of the raw image targets. Requires y_dataset to be provided.
    'diff_by_target_min_max_ratio' is defined as the difference between ground truth and raw images multiplied by the ratio between ground truth image max and min.
train_filter_vals (list of lists of int/float) – Threshold values, one per property listed in train_filter_props, against which each property is compared.
train_filter_signs (list of lists of str) – Signs used for the comparisons in train data filtering. Options: ['gt','ge','lt','le'], which correspond to "greater than" (>), "greater than or equal" (>=), "less than" (<) and "less than or equal" (<=) comparisons.
val_preprocess_f (function, optional) – The validation preprocessing function. Needed only if you want to apply any preprocessing.
val_preprocess_cfg (dict, optional) – Configuration parameters for validation preprocessing. Needed only if you want to apply any preprocessing.
val_filter_props (list of lists of str) – Filter conditions to be applied to the validation data. The three variables, filter_props, filter_vals and filter_signs, together compose a list of conditions used to remove images from the list. Each one is a list of lists of conditions, for instance [['A'], ['B','C']]. A sample is removed if it satisfies the first list of conditions (only 'A' in this case, from the ['A'] list) or if it satisfies both 'B' and 'C' (from the ['B','C'] list). Within each sublist all the conditions must be satisfied. Available properties are: ['foreground','mean','min','max']. Each property description:
    'foreground' is defined as the mask foreground percentage.
    'mean' is defined as the mean value.
    'min' is defined as the min value.
    'max' is defined as the max value.
    'diff' is defined as the difference between ground truth and raw images. Requires y_dataset to be provided.
    'diff_by_min_max_ratio' is defined as the difference between ground truth and raw images multiplied by the ratio between raw image max and min.
    'target_mean' is defined as the mean intensity value of the raw image targets. Requires y_dataset to be provided.
    'target_min' is defined as the min intensity value of the raw image targets. Requires y_dataset to be provided.
    'target_max' is defined as the max intensity value of the raw image targets. Requires y_dataset to be provided.
    'diff_by_target_min_max_ratio' is defined as the difference between ground truth and raw images multiplied by the ratio between ground truth image max and min.
val_filter_vals (list of lists of int/float) – Threshold values, one per property listed in val_filter_props, against which each property is compared.
val_filter_signs (list of lists of str) – Signs used for the comparisons in validation data filtering. Options: ['gt','ge','lt','le'], which correspond to "greater than" (>), "greater than or equal" (>=), "less than" (<) and "less than or equal" (<=) comparisons.
filter_by_entire_image (bool, optional) – If filtering is done, this decides how it is applied:
    True: apply the filter image by image.
    False: apply the filter sample by sample. Each sample represents a patch within an image.
norm_before_filter (bool, optional) – Whether to apply normalization before filtering. Be aware that the values used for filtering may then change.
random_crops_in_DA (bool, optional) – Tells the method that no preparation of the data must be done, as random subvolumes will be created during data augmentation (DA) and the whole volume will be used for that.
y_upscaling (2D/3D int tuple, optional) – Upscaling to be done when loading Y data. Used for the super-resolution workflow.
gt_channels_expected (int, optional) – Expected number of channels in the ground truth (GT).
reflect_to_complete_shape (bool, optional) – Whether to enlarge any dimension that is smaller than the selected patch size by padding it with 'reflect'.
convert_to_rgb (bool, optional) – In case RGB images are expected, e.g. if crop_shape channel is 3, those images that are grayscale are converted into RGB.
is_y_mask (bool, optional) – Whether the data are masks. It is used to control the preprocessing of the data.
is_3d (bool, optional) – Whether the expected images to read are 3D or not.
train_zarr_data_information (dict, optional) – Additional information when using Zarr/H5 files for training. The following keys are expected:
    "raw_path", str: path where the raw images reside within the zarr (used when multiple_data_within_zarr is True).
    "gt_path", str: path where the mask images reside within the zarr (used when multiple_data_within_zarr is True).
    "use_gt_path", bool: whether the GT should be used or not.
    "multiple_data_within_zarr", bool: whether your input Zarr contains the raw images and labels together or not.
    "input_img_axes", tuple of int: order of the axes of the images.
    "input_mask_axes", tuple of int: order of the axes of the masks.
val_zarr_data_information (dict, optional) – Additional information when using Zarr/H5 files for validation. Same keys as train_zarr_data_information are expected.
multiple_raw_images (bool, optional) – When a folder of folders for each image is expected. In each of those subfolders, different versions of the same image are placed. Visit the following tutorial for a real use case and a more detailed description: Light My Cells. This is used when PROBLEM.IMAGE_TO_IMAGE.MULTIPLE_RAW_ONE_TARGET_LOADER is selected.
save_filtered_images (bool, optional) – Whether to save the filtered images or not.
save_filtered_images_dir (str, optional) – Directory to save filtered images.
save_filtered_images_num (int, optional) – Number of filtered images to save. Only works when save_filtered_images is True.
- Returns:
X_train (BiaPyDataset) – Loaded train X dataset.
Y_train (BiaPyDataset) – Loaded train Y dataset.
X_val (BiaPyDataset) – Loaded validation X dataset.
Y_val (BiaPyDataset) – Loaded validation Y dataset.
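To illustrate how val_split, seed and shuffle_val interact when no separate validation path and no cross validation are used, here is a minimal, standalone sketch of a hold-out split. The function holdout_split is hypothetical and is not part of BiaPy; the exact rounding and ordering that BiaPy applies internally are assumptions here.

```python
import random
from typing import List, Sequence, Tuple


def holdout_split(
    samples: Sequence[str],
    val_split: float = 0.1,
    seed: int = 0,
    shuffle_val: bool = True,
) -> Tuple[List[str], List[str]]:
    """Hypothetical helper: split samples into (train, val) lists."""
    if not 0.0 < val_split < 1.0:
        raise ValueError("val_split must be strictly between 0 and 1")
    indices = list(range(len(samples)))
    if shuffle_val:
        # A seeded generator makes the split reproducible.
        random.Random(seed).shuffle(indices)
    # Assumption: keep at least one validation sample.
    n_val = max(1, round(len(samples) * val_split))
    val_ids = sorted(indices[:n_val])
    train_ids = sorted(indices[n_val:])
    return [samples[i] for i in train_ids], [samples[i] for i in val_ids]


files = [f"img_{i}.tif" for i in range(20)]
train, val = holdout_split(files, val_split=0.1, seed=0)
print(len(train), len(val))
```

With the same seed the function always returns the same partition, which is the property the seed parameter is meant to provide.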
- biapy.data.data_manipulation.load_and_prepare_test_data(test_path: str, test_mask_path: str | None, multiple_raw_images: bool | None = False, test_zarr_data_information: Dict | None = None) -> Tuple[BiaPyDataset, BiaPyDataset | None, List]
Load test data.
- Parameters:
test_path (str) – Path to the test data.
test_mask_path (str) – Path to the test data masks.
multiple_raw_images (bool, optional) – When a folder of folders for each image is expected. In each of those subfolders, different versions of the same image are placed. Visit the following tutorial for a real use case and a more detailed description: Light My Cells. This is used when PROBLEM.IMAGE_TO_IMAGE.MULTIPLE_RAW_ONE_TARGET_LOADER is selected.
test_zarr_data_information (dict, optional) – Additional information when using Zarr/H5 files for test. The following keys are expected:
    "raw_path", str: path where the raw images reside within the zarr.
    "gt_path", str: path where the mask images reside within the zarr.
    "use_gt_path", str: whether the GT should be used or not.
- Returns:
X_test (list of dict) – Loaded test X data. Each item in the list represents a sample of the dataset. Each sample is represented as follows:
    "filename", str: name of the image to extract the data sample from.
    "dir", str: directory where the image resides.
Y_test (list of dict, optional) – Loaded test Y data. Each item in the list represents a sample of the dataset. Each sample is represented as follows:
    "train_path", str: name of the image to extract the data sample from.
    "dir", str: directory where the image resides.
test_filenames (list of str) – List of test filenames.
- biapy.data.data_manipulation.load_and_prepare_cls_test_data(test_path: str, norm_module: Dict, use_val_as_test: bool, expected_classes: int, crop_shape: Tuple[int, ...], is_3d: bool = True, reflect_to_complete_shape: bool = True, convert_to_rgb: bool = False, use_val_as_test_info: Dict | None = None)
Load test data.
- Parameters:
test_path (str) – Path to the test data.
norm_module (Dict) – Information about the normalization.
use_val_as_test (bool) – Whether to use validation data as test.
expected_classes (int) – Expected number of classes to be loaded.
crop_shape (3D/4D int tuple) – Shape of the crops. E.g. (y, x, channels) for 2D and (z, y, x, channels) for 3D.
is_3d (bool, optional) – Whether the data to load is expected to be 3D or not.
reflect_to_complete_shape (bool, optional) – Whether to enlarge any dimension that is smaller than the selected patch size by padding it with 'reflect'.
convert_to_rgb (bool, optional) – In case RGB images are expected, e.g. if crop_shape channel is 3, those images that are grayscale are converted into RGB.
use_val_as_test_info (dict, optional) – Additional information to create the test set based on the validation. Used when use_val_as_test is True. The expected keys of the dictionary are as follows:
    "cross_val_samples_ids", list of int: ids of the validation samples (out of the cross validation).
    "train_path", str: training path, as the data must be extracted from there.
    "selected_fold", int: fold selected in cross validation.
    "n_splits", int: folds to create in cross validation.
    "shuffle", bool: whether to shuffle the data or not.
    "seed", int: random seed.
- Returns:
X_test (list of dict) – Loaded test data. Each item in the list represents a sample of the dataset. Each sample is represented as follows:
    "filename", str: name of the image to extract the data sample from.
    "dir", str: directory where the image resides.
    "class_name", str: name of the class.
    "class", int: represents the class (-1 if no ground truth provided).
test_filenames (list of str) – List of test filenames.
- biapy.data.data_manipulation.load_data_from_dir(data_path: str, is_3d: bool = False) -> List[ndarray[tuple[int, ...], dtype[_ScalarType_co]]]
Create dataset samples from the given list.
- Parameters:
data_path (str) – Path to read the images from.
is_3d (bool, optional) – Whether the expected images to read are 3D or not.
- biapy.data.data_manipulation.load_cls_data_from_dir(data_path: str, norm_module: Dict, expected_classes: int, crop_shape: Tuple[int, ...] | None, is_3d: bool = True, reflect_to_complete_shape: bool = True, convert_to_rgb: bool = False, preprocess_f: Callable | None = None, preprocess_cfg: Dict | None = None) -> BiaPyDataset
Create dataset samples from the given list following a classification workflow directory tree.
- Parameters:
data_path (str) – Path to read the images from.
norm_module (Dict) – Information about the normalization.
expected_classes (int) – Expected number of classes to be loaded.
crop_shape (3D/4D int tuple, optional) – Shape of the crops. E.g. (y, x, channels) for 2D and (z, y, x, channels) for 3D.
is_3d (bool, optional) – Whether the expected images to read are 3D or not.
reflect_to_complete_shape (bool, optional) – Whether to enlarge any dimension that is smaller than the selected patch size by padding it with 'reflect'.
convert_to_rgb (bool, optional) – In case RGB images are expected, e.g. if crop_shape channel is 3, those images that are grayscale are converted into RGB.
preprocess_f (function, optional) – The preprocessing function. Needed only if you want to apply any preprocessing.
preprocess_cfg (dict, optional) – Configuration parameters for preprocessing. Needed only if you want to apply any preprocessing.
- Returns:
data_samples – Dataset created out of data_path.
- Return type:
BiaPyDataset
- biapy.data.data_manipulation.load_and_prepare_train_data_cls(train_path: str, train_in_memory: bool, val_path: str, val_in_memory: bool, expected_classes: int, norm_module: Dict, crop_shape: Tuple[int, ...], cross_val: bool = False, cross_val_nsplits: int = 5, cross_val_fold: int = 1, val_split: float = 0.1, seed: int = 0, shuffle_val: bool = True, train_preprocess_f: Callable | None = None, train_preprocess_cfg: Dict | None = None, train_filter_props: List[List[str]] = [], train_filter_vals: List[List[float | int]] = [], train_filter_signs: List[List[str]] = [], val_preprocess_f: Callable | None = None, val_preprocess_cfg: Dict | None = None, val_filter_props: List[List[str]] = [], val_filter_vals: List[List[float | int]] = [], val_filter_signs: List[List[str]] = [], norm_before_filter: bool = False, reflect_to_complete_shape: bool = False, convert_to_rgb: bool = False, is_3d: bool = False)
Load data to train classification methods.
- Parameters:
train_path (str) – Path to the training data.
train_in_memory (bool) – Whether the train data must be loaded in memory or not.
val_path (str) – Path to the validation data.
val_in_memory (bool) – Whether the validation data must be loaded in memory or not.
expected_classes (int) – Expected number of classes to be loaded.
norm_module (Dict) – Information about the normalization.
crop_shape (3D/4D int tuple) – Shape of the crops. E.g. (y, x, channels) for 2D and (z, y, x, channels) for 3D.
cross_val (bool, optional) – Whether to use cross validation or not.
cross_val_nsplits (int, optional) – Number of folds for the cross validation.
cross_val_fold (int, optional) – Number of the fold to be used as validation.
val_split (float, optional) – Fraction of the train data used as validation (value between 0 and 1).
seed (int, optional) – Seed value.
shuffle_val (bool, optional) – Take random training examples to create validation data.
train_preprocess_f (function, optional) – The train preprocessing function. Needed only if you want to apply any preprocessing.
train_preprocess_cfg (dict, optional) – Configuration parameters for train preprocessing. Needed only if you want to apply any preprocessing.
train_filter_props (list of lists of str) – Filter conditions to be applied to the train data. The three variables, filter_props, filter_vals and filter_signs, together compose a list of conditions used to remove samples from the list. Each one is a list of lists of conditions, for instance [['A'], ['B','C']]. A sample is removed if it satisfies the first list of conditions (only 'A' in this case, from the ['A'] list) or if it satisfies both 'B' and 'C' (from the ['B','C'] list). Within each sublist all the conditions must be satisfied. Available properties are: ['foreground','mean','min','max']. Each property description:
    'foreground' is defined as the mask foreground percentage.
    'mean' is defined as the mean value.
    'min' is defined as the min value.
    'max' is defined as the max value.
    'diff' is defined as the difference between ground truth and raw images. Requires y_dataset to be provided.
    'diff_by_min_max_ratio' is defined as the difference between ground truth and raw images multiplied by the ratio between raw image max and min.
    'target_mean' is defined as the mean intensity value of the raw image targets. Requires y_dataset to be provided.
    'target_min' is defined as the min intensity value of the raw image targets. Requires y_dataset to be provided.
    'target_max' is defined as the max intensity value of the raw image targets. Requires y_dataset to be provided.
    'diff_by_target_min_max_ratio' is defined as the difference between ground truth and raw images multiplied by the ratio between ground truth image max and min.
train_filter_vals (list of lists of int/float) – Threshold values, one per property listed in train_filter_props, against which each property is compared.
train_filter_signs (list of lists of str) – Signs used for the comparisons in train data filtering. Options: ['gt','ge','lt','le'], which correspond to "greater than" (>), "greater than or equal" (>=), "less than" (<) and "less than or equal" (<=) comparisons.
val_preprocess_f (function, optional) – The validation preprocessing function. Needed only if you want to apply any preprocessing.
val_preprocess_cfg (dict, optional) – Configuration parameters for validation preprocessing. Needed only if you want to apply any preprocessing.
val_filter_props (list of lists of str) – Filter conditions to be applied to the validation data. The three variables, filter_props, filter_vals and filter_signs, together compose a list of conditions used to remove images from the list. Each one is a list of lists of conditions, for instance [['A'], ['B','C']]. A sample is removed if it satisfies the first list of conditions (only 'A' in this case, from the ['A'] list) or if it satisfies both 'B' and 'C' (from the ['B','C'] list). Within each sublist all the conditions must be satisfied. Available properties are: ['foreground','mean','min','max']. Each property description:
    'foreground' is defined as the mask foreground percentage.
    'mean' is defined as the mean value.
    'min' is defined as the min value.
    'max' is defined as the max value.
    'diff' is defined as the difference between ground truth and raw images. Requires y_dataset to be provided.
    'diff_by_min_max_ratio' is defined as the difference between ground truth and raw images multiplied by the ratio between raw image max and min.
    'target_mean' is defined as the mean intensity value of the raw image targets. Requires y_dataset to be provided.
    'target_min' is defined as the min intensity value of the raw image targets. Requires y_dataset to be provided.
    'target_max' is defined as the max intensity value of the raw image targets. Requires y_dataset to be provided.
    'diff_by_target_min_max_ratio' is defined as the difference between ground truth and raw images multiplied by the ratio between ground truth image max and min.
val_filter_vals (list of lists of int/float) – Threshold values, one per property listed in val_filter_props, against which each property is compared.
val_filter_signs (list of lists of str) – Signs used for the comparisons in validation data filtering. Options: ['gt','ge','lt','le'], which correspond to "greater than" (>), "greater than or equal" (>=), "less than" (<) and "less than or equal" (<=) comparisons.
reflect_to_complete_shape (bool, optional) – Whether to enlarge any dimension that is smaller than the selected patch size by padding it with 'reflect'.
convert_to_rgb (bool, optional) – In case RGB images are expected, e.g. if crop_shape channel is 3, those images that are grayscale are converted into RGB.
is_3d (bool, optional) – Whether the expected images to read are 3D or not.
- Returns:
X_train (list of dict) – Loaded train data. Each item in the list represents a sample of the dataset. Each sample is represented as follows:
    "filename", str: name of the image to extract the data sample from.
    "dir", str: directory where the image resides.
    "class_name", str: name of the class.
    "class", int: represents the class (-1 if no ground truth provided).
    "img", ndarray (optional): image sample itself. Its shape is (y, x, channels) in 2D and (z, y, x, channels) in 3D. Provided when train_in_memory is True.
X_val (list of dict) – Loaded validation data. Each item in the list represents a sample of the dataset. Each sample is represented as follows:
    "filename", str: name of the image to extract the data sample from.
    "dir", str: directory where the image resides.
    "class_name", str: name of the class.
    "class", int: represents the class (-1 if no ground truth provided).
    "img", ndarray (optional): image sample itself. Its shape is (y, x, channels) in 2D and (z, y, x, channels) in 3D. Provided when val_in_memory is True.
x_val_ids (list of int) – Indexes of the samples belonging to the validation set. Used in cross-validation.
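As a rough illustration of how cross_val_nsplits, cross_val_fold and seed could yield the validation indexes returned as x_val_ids, here is a standalone k-fold sketch. The helper kfold_ids is hypothetical, the 1-based fold numbering is an assumption based on the default cross_val_fold = 1, and BiaPy's own fold assignment may differ (for example, it may stratify by class).

```python
import random
from typing import List, Tuple


def kfold_ids(
    n_samples: int,
    n_splits: int = 5,
    fold: int = 1,
    seed: int = 0,
    shuffle: bool = True,
) -> Tuple[List[int], List[int]]:
    """Hypothetical helper: return (train_ids, val_ids) for one fold."""
    if not 1 <= fold <= n_splits:
        raise ValueError("fold must be in [1, n_splits]")
    ids = list(range(n_samples))
    if shuffle:
        random.Random(seed).shuffle(ids)
    # Strided slices give folds whose sizes differ by at most one.
    folds = [ids[i::n_splits] for i in range(n_splits)]
    val_ids = sorted(folds[fold - 1])
    train_ids = sorted(
        i for k, members in enumerate(folds) if k != fold - 1 for i in members
    )
    return train_ids, val_ids


train_ids, val_ids = kfold_ids(10, n_splits=5, fold=1, seed=0)
print(val_ids)
```

Iterating fold from 1 to n_splits visits every sample exactly once as validation, which is the defining property of k-fold cross validation.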
- biapy.data.data_manipulation.samples_from_image_list(list_of_data: List[str], data_path: str, crop_shape: Tuple[int, ...], ov: Tuple[float, ...], padding: Tuple[int, ...], norm_module: Dict, crop: bool = True, is_mask: bool = False, is_3d: bool = True, reflect_to_complete_shape: bool = True, convert_to_rgb: bool = False, preprocess_f: Callable | None = None, preprocess_cfg: Dict | None = None) -> BiaPyDataset
Create dataset samples from the given list. This function does not load the data.
- Parameters:
list_of_data (list of str) – Filenames of the images to read.
data_path (str) – Directory of the images to read.
crop_shape (3D/4D int tuple) – Shape of the crops. E.g. (y, x, channels) for 2D and (z, y, x, channels) for 3D.
ov (2D/3D float tuple) – Minimum amount of overlap on the x and y dimensions. The values must be in the range [0, 1), that is, from 0% to 99% overlap. Shape is (y, x) for 2D or (z, y, x) for 3D.
padding (2D/3D int tuple) – Size of padding to be added on each axis. Shape is (y, x) for 2D or (z, y, x) for 3D.
norm_module (Dict) – Information about the normalization.
crop (bool, optional) – Whether the data needs to be cropped or not.
is_mask (bool, optional) – Whether the data are masks. It is used to control the preprocessing of the data.
is_3d (bool, optional) – Whether the data to load is expected to be 3D or not.
reflect_to_complete_shape (bool, optional) – Whether to enlarge any dimension that is smaller than the selected patch size by padding it with 'reflect'.
convert_to_rgb (bool, optional) – In case RGB images are expected, e.g. if crop_shape channel is 3, those images that are grayscale are converted into RGB.
preprocess_f (function, optional) – The preprocessing function. Needed only if you want to apply any preprocessing.
preprocess_cfg (dict, optional) – Configuration parameters for preprocessing. Needed only if you want to apply any preprocessing.
- Returns:
dataset – Dataset.
- Return type:
BiaPyDataset
- biapy.data.data_manipulation.samples_from_zarr(list_of_data: List[str], data_path: str, zarr_data_info: Dict, crop_shape: Tuple[int, ...], ov: Tuple[float, ...], padding: Tuple[int, ...], is_mask: bool = False, is_3d: bool = True) -> BiaPyDataset
Create dataset samples from the given list. This function does not load the data.
- Parameters:
list_of_data (list of str) – Filenames of the images to read.
data_path (str) – Directory of the images to read.
zarr_data_info (dict) – Additional information when using Zarr/H5 files for training. The following keys are expected:
    "raw_path": path where the raw images reside within the zarr (used when multiple_data_within_zarr is True).
    "gt_path": path where the mask images reside within the zarr (used when multiple_data_within_zarr is True).
    "multiple_data_within_zarr": whether your input Zarr contains the raw images and labels together or not.
    "input_img_axes": order of the axes of the images.
    "input_mask_axes": order of the axes of the masks.
crop_shape (3D/4D int tuple) – Shape of the crops. E.g. (y, x, channels) for 2D and (z, y, x, channels) for 3D.
ov (2D/3D float tuple, optional) – Minimum amount of overlap on the x and y dimensions. The values must be in the range [0, 1), that is, from 0% to 99% overlap. Shape is (y, x) for 2D or (z, y, x) for 3D.
padding (2D/3D int tuple, optional) – Size of padding to be added on each axis. Shape is (y, x) for 2D or (z, y, x) for 3D.
is_mask (bool, optional) – Whether the data are masks. It is used to control the preprocessing of the data.
is_3d (bool, optional) – Whether the data to load is expected to be 3D or not.
- Returns:
dataset – Dataset.
- Return type:
BiaPyDataset
- biapy.data.data_manipulation.samples_from_image_list_multiple_raw_one_gt(data_path: str, gt_path: str, crop_shape: Tuple[int, ...], ov: Tuple[float, ...], padding: Tuple[int, ...], norm_module: Dict, crop: bool = True, is_3d: bool = True, reflect_to_complete_shape: bool = True, convert_to_rgb: bool = False, preprocess_f: Callable | None = None, preprocess_cfg: Dict | None = None) -> Tuple[BiaPyDataset, BiaPyDataset]
Create dataset samples from the given lists. This function does not load the data.
- Parameters:
data_path (str) – Directory of the images to read.
gt_path (str) – Directory to read ground truth images from.
crop_shape (3D/4D int tuple) – Shape of the crops. E.g. (y, x, channels) for 2D and (z, y, x, channels) for 3D.
ov (2D/3D float tuple) – Minimum amount of overlap on the x and y dimensions. The values must be in the range [0, 1), that is, from 0% to 99% overlap. Shape is (y, x) for 2D or (z, y, x) for 3D.
padding (2D/3D int tuple) – Size of padding to be added on each axis. Shape is (y, x) for 2D or (z, y, x) for 3D.
norm_module (Dict) – Information about the normalization.
crop (bool, optional) – Whether the data needs to be cropped or not.
is_3d (bool, optional) – Whether the data to load is expected to be 3D or not.
reflect_to_complete_shape (bool, optional) – Whether to enlarge any dimension that is smaller than the selected patch size by padding it with 'reflect'.
convert_to_rgb (bool, optional) – In case RGB images are expected, e.g. if crop_shape channel is 3, those images that are grayscale are converted into RGB.
preprocess_f (function, optional) – The preprocessing function. Needed only if you want to apply any preprocessing.
preprocess_cfg (dict, optional) – Configuration parameters for preprocessing. Needed only if you want to apply any preprocessing.
- Returns:
dataset (BiaPyDataset) – X dataset.
gt_dataset (BiaPyDataset) – Y dataset.
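The ov parameter used by the sample-creation functions sets a minimum overlap between neighbouring patches. A minimal sketch of how patch start offsets along a single axis could be derived from it follows; patch_starts is a hypothetical helper, padding is ignored, and the formula BiaPy actually uses is not confirmed by this page.

```python
import math
from typing import List


def patch_starts(dim: int, patch: int, min_overlap: float = 0.0) -> List[int]:
    """Hypothetical helper: start offsets of patches along one axis."""
    if not 0.0 <= min_overlap < 1.0:
        raise ValueError("min_overlap must be in [0, 1)")
    if patch > dim:
        raise ValueError("patch must not exceed the axis length")
    # Largest integer stride that still guarantees the requested overlap.
    max_stride = max(1, math.floor(patch * (1.0 - min_overlap)))
    span = dim - patch  # start offset of the last patch
    n_patches = math.ceil(span / max_stride) + 1
    if n_patches == 1:
        return [0]
    # Spread the starts evenly so the last patch ends exactly at the border.
    return [(i * span) // (n_patches - 1) for i in range(n_patches)]


print(patch_starts(1000, 256, 0.25))  # [0, 186, 372, 558, 744]
```

Because the starts are spread evenly, the real overlap is usually larger than the requested minimum (70 pixels instead of 64 in the call above), which is why the parameter is described as a minimum.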
- biapy.data.data_manipulation.samples_from_class_list(data_path: str, norm_module: Dict, crop_shape: Tuple[int, ...] | None = None, expected_classes: int = -1, is_3d: bool = True, reflect_to_complete_shape: bool = True, convert_to_rgb: bool = False) -> BiaPyDataset
Create dataset samples from the given path taking into account that each subfolder represents a class. This function does not load the data.
- Parameters:
data_path (str) – Directory of the images to read.
norm_module (Dict) – Information about the normalization.
crop_shape (3D/4D int tuple, optional) – Shape of the crops. E.g. (y, x, channels) for 2D and (z, y, x, channels) for 3D.
expected_classes (int, optional) – Expected number of classes to be loaded. Set to -1 if you don't expect any.
is_3d (bool, optional) – Whether the data to load is expected to be 3D or not.
reflect_to_complete_shape (bool, optional) – Whether to enlarge any dimension that is smaller than the selected patch size by padding it with 'reflect'.
convert_to_rgb (bool, optional) – In case RGB images are expected, e.g. if crop_shape channel is 3, those images that are grayscale are converted into RGB.
- Returns:
sample_list β Samples generated out of
data_path.- Return type:
list of DataSample
- biapy.data.data_manipulation.filter_samples_by_properties(x_dataset: BiaPyDataset, is_3d: bool, filter_props: List[List[str]], filter_vals: List[List[float | int]], filter_signs: List[List[str]], crop_shape: Tuple[int, ...], reflect_to_complete_shape: bool = False, filter_by_entire_image: bool = True, norm_before_filter: bool = False, norm_module: Dict | None = None, y_dataset: BiaPyDataset | None = None, zarr_data_information: Dict | None = None, save_filtered_images: bool = True, save_filtered_images_dir: str | None = None, save_filtered_images_num: int = 3)
Filter samples from x_dataset using defined conditions.
The filtering is done using the images each sample is extracted from. However, if zarr_data_information is provided, the function assumes that Zarr/H5 files are provided, so the filtering is performed sample by sample.
- Parameters:
x_dataset (BiaPyDataset) – X dataset to filter samples from.
is_3d (bool) – Whether the data to load is expected to be 3D.
filter_props (list of lists of str) – Filter conditions to be applied. The three variables filter_props, filter_vals and filter_signs together compose a list of conditions used to remove images from the list. Each is a list of lists of conditions. For instance, the conditions can look like this: [['A'], ['B','C']]. A sample is then removed if it satisfies the first list of conditions (only 'A' in this case, from the ['A'] list) or if it satisfies both 'B' and 'C' (from the ['B','C'] list). Within each sublist all the conditions must be satisfied. Available properties are: ['foreground', 'mean', 'min', 'max', 'diff', 'diff_by_min_max_ratio', 'target_mean', 'target_min', 'target_max', 'diff_by_target_min_max_ratio']. Description of each property:
'foreground' is defined as the mask foreground percentage.
'mean' is defined as the mean value.
'min' is defined as the min value.
'max' is defined as the max value.
'diff' is defined as the difference between ground truth and raw images. Requires y_dataset to be provided.
'diff_by_min_max_ratio' is defined as the difference between ground truth and raw images multiplied by the ratio between raw image max and min.
'target_mean' is defined as the mean intensity value of the raw image targets. Requires y_dataset to be provided.
'target_min' is defined as the min intensity value of the raw image targets. Requires y_dataset to be provided.
'target_max' is defined as the max intensity value of the raw image targets. Requires y_dataset to be provided.
'diff_by_target_min_max_ratio' is defined as the difference between ground truth and raw images multiplied by the ratio between ground truth image max and min.
filter_vals (list of lists of int/float) – Values of the properties listed in filter_props that the images need to satisfy to not be dropped.
filter_signs (list of lists of str) – Signs used for the comparison. Options: ['gt', 'ge', 'lt', 'le'], which correspond to "greater than" (>), "greater or equal" (>=), "less than" (<) and "less or equal" (<=) comparisons.
crop_shape (3D/4D int tuple) – Shape of the crops. E.g. (y, x, channels) for 2D and (z, y, x, channels) for 3D.
reflect_to_complete_shape (bool, optional) – Whether to enlarge any dimension that is smaller than the selected patch size by padding it in "reflect" mode.
filter_by_entire_image (bool, optional) – This decides how the filtering is done:
True: apply filtering image by image.
False: apply filtering sample by sample. Each sample represents a patch within an image.
norm_before_filter (bool, optional) – Whether to apply normalization before filtering. Be aware that the values used for filtering may then change.
norm_module (Dict) – Information about the normalization.
y_dataset (BiaPyDataset, optional) – Y dataset to filter samples from.
zarr_data_information (dict, optional) – Additional information when using Zarr/H5 files for training. The following keys are expected:
"raw_path": path where the raw images reside within the Zarr (used when multiple_data_within_zarr is True).
"gt_path": path where the mask images reside within the Zarr (used when multiple_data_within_zarr is True).
"multiple_data_within_zarr": whether the input Zarr contains the raw images and labels together.
"input_img_axes": order of the axes of the images.
"input_mask_axes": order of the axes of the masks.
save_filtered_images (bool, optional) – Whether to save the filtered images.
save_filtered_images_dir (str, optional) – Directory to save the filtered images.
save_filtered_images_num (int, optional) – Number of filtered images to save. Only works when save_filtered_images is True.
- Returns:
new_x_filenames (list of dict) – Filtered x_dataset list.
new_y_filenames (list of dict, optional) – Filtered y_dataset list.
- biapy.data.data_manipulation.sample_satisfy_conds(img: ndarray[tuple[int, ...], dtype[_ScalarType_co]], filter_props: List[List[str]], filter_vals: List[List[float | int]], filter_signs: List[List[str]], mask: ndarray[tuple[int, ...], dtype[_ScalarType_co]] | None = None, img_ratio: float = 0, mask_ratio: float | None = 0) → bool
Check whether img satisfies at least one of the conditions composed by filter_props, filter_vals and filter_signs.
- Parameters:
img (4D/5D Numpy array) – Image to check against the conditions. E.g. (z, y, x, num_classes) for 3D or (y, x, num_classes) for 2D.
filter_props (list of lists of str) – Filter conditions to be applied. The three variables filter_props, filter_vals and filter_signs together compose a list of conditions used to remove images from the list. Each is a list of lists of conditions. For instance, the conditions can look like this: [['A'], ['B','C']]. A sample is then removed if it satisfies the first list of conditions (only 'A' in this case, from the ['A'] list) or if it satisfies both 'B' and 'C' (from the ['B','C'] list). Within each sublist all the conditions must be satisfied. Available properties are: ['foreground', 'mean', 'min', 'max', 'diff', 'diff_by_min_max_ratio', 'target_mean', 'target_min', 'target_max', 'diff_by_target_min_max_ratio']. Description of each property:
'foreground' is defined as the mask foreground percentage.
'mean' is defined as the mean value of the input.
'min' is defined as the min value of the input.
'max' is defined as the max value of the input.
'diff' is defined as the difference between ground truth and raw images. Requires y_dataset to be provided.
'diff_by_min_max_ratio' is defined as the difference between ground truth and raw images multiplied by the ratio between raw image max and min.
'target_mean' is defined as the mean intensity value of the raw image targets. Requires y_dataset to be provided.
'target_min' is defined as the min intensity value of the raw image targets. Requires y_dataset to be provided.
'target_max' is defined as the max intensity value of the raw image targets. Requires y_dataset to be provided.
'diff_by_target_min_max_ratio' is defined as the difference between ground truth and raw images multiplied by the ratio between ground truth image max and min.
filter_vals (list of lists of int/float) – Values of the properties listed in filter_props that the images need to satisfy to not be dropped.
filter_signs (list of lists of str) – Signs used for the comparison. Options: ['gt', 'ge', 'lt', 'le'], which correspond to "greater than" (>), "greater or equal" (>=), "less than" (<) and "less or equal" (<=) comparisons.
mask (4D/5D Numpy array, optional) – Mask used to check the 'foreground' condition in filter_props. E.g. (z, y, x, num_classes) for 3D or (y, x, num_classes) for 2D.
img_ratio (float, optional) – Ratio of the input image. Expected to be (img.max - img.min) of the entire image.
mask_ratio (float, optional) – Ratio of the input mask. Expected to be (mask.max - mask.min) of the entire image.
- Returns:
satisfy_conds – Whether the sample satisfies at least one of the conditions.
- Return type:
bool
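The "OR over sublists, AND within a sublist" rule described above can be illustrated with a small standalone sketch. It does not call BiaPy: `satisfies_any_group` is a hypothetical helper, it covers only the 'mean', 'min' and 'max' properties, and the way each property is computed here is an assumption.

```python
import operator

import numpy as np

# Comparison signs as listed in the documentation above.
SIGNS = {"gt": operator.gt, "ge": operator.ge, "lt": operator.lt, "le": operator.le}


def satisfies_any_group(img, filter_props, filter_vals, filter_signs):
    """Return True if every condition of at least one sublist holds (hypothetical helper)."""
    # Assumption: only the three simplest properties are handled in this sketch.
    measures = {"mean": img.mean(), "min": img.min(), "max": img.max()}
    for props, vals, signs in zip(filter_props, filter_vals, filter_signs):
        # Inside a sublist, all conditions must be satisfied (logical AND).
        if all(SIGNS[s](measures[p], v) for p, v, s in zip(props, vals, signs)):
            return True  # one satisfied sublist is enough (logical OR)
    return False


img = np.array([[0.0, 0.2], [0.4, 0.6]])
props = [["mean"], ["min", "max"]]
vals = [[0.5], [0.0, 0.5]]
signs = [["gt"], ["ge", "gt"]]
print(satisfies_any_group(img, props, vals, signs))
```

Here the first sublist fails (the mean is not greater than 0.5) but the second holds (min >= 0.0 and max > 0.5), so under the documented semantics this sample would be removed.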
- biapy.data.data_manipulation.load_images_to_dataset(dataset: BiaPyDataset, crop_shape: Tuple[int, ...] | None, reflect_to_complete_shape: bool = False, convert_to_rgb: bool = False, is_mask: bool = False, is_3d: bool = False, preprocess_cfg: Dict | None = None, preprocess_f: Callable | None = None, zarr_data_information: Dict | None = None)
Load images into the dataset, creating the "img" key.
The process is faster if the samples extracted from the same image are in contiguous positions within the list.
- Parameters:
dataset (BiaPyDataset) – Loaded data.
crop_shape (3D/4D int tuple) – Shape of the expected crops. E.g. (y, x, channels) for 2D and (z, y, x, channels) for 3D.
reflect_to_complete_shape (bool, optional) – Whether to enlarge any dimension that is smaller than the selected patch size by padding it in "reflect" mode.
convert_to_rgb (bool, optional) – When RGB images are expected, e.g. if the crop_shape channel is 3, grayscale images are converted to RGB.
preprocess_cfg (dict, optional) – Configuration parameters for preprocessing. Needed only if you want to apply preprocessing.
is_mask (bool, optional) – Whether the data are masks. It is used to control the preprocessing of the data.
preprocess_f (function, optional) – Preprocessing function. Needed only if you want to apply preprocessing.
is_3d (bool, optional) – Whether the data to load is expected to be 3D.
zarr_data_information (dict, optional) – Additional information about where to find the data within the Zarr files.
- biapy.data.data_manipulation.pad_and_reflect(img: ndarray[tuple[int, ...], dtype[_ScalarType_co]], crop_shape: Tuple[int, ...], verbose: bool = False) → ndarray[tuple[int, ...], dtype[_ScalarType_co]]
Pad an image in "reflect" mode so that it is large enough for the given crop shape.
- Parameters:
img (3D/4D Numpy array) – Image to pad. E.g. (y, x, channels) or (z, y, x, channels).
crop_shape (Tuple of 3/4 int, optional) – Shape of the subvolumes to create when cropping. E.g. (y, x, channels) or (z, y, x, channels).
verbose (bool, optional) – Whether to output information.
- Returns:
img – Image padded. E.g. (y, x, channels) for 2D and (z, y, x, channels) for 3D.
- Return type:
3D/4D Numpy array
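As a rough illustration of reflect padding up to a crop shape, the sketch below uses `numpy.pad` directly. `pad_to_shape_reflect` is a hypothetical helper, not BiaPy's implementation, and the way the padding is split between the two sides of each axis is an assumption.

```python
import numpy as np


def pad_to_shape_reflect(img, crop_shape):
    """Reflect-pad `img` so each axis is at least as long as in `crop_shape` (hypothetical helper)."""
    pad_width = []
    for size, target in zip(img.shape, crop_shape):
        missing = max(target - size, 0)
        # Assumption: split the padding between both sides, any odd remainder goes at the end.
        pad_width.append((missing // 2, missing - missing // 2))
    return np.pad(img, pad_width, mode="reflect")


img = np.arange(12, dtype=np.float32).reshape(3, 4, 1)  # (y, x, channels)
padded = pad_to_shape_reflect(img, (5, 6, 1))
print(padded.shape)  # (5, 6, 1)
```

Axes that are already large enough receive zero padding, so an image bigger than the crop shape is returned unchanged.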
- biapy.data.data_manipulation.extract_patch_within_image(img: ndarray[tuple[int, ...], dtype[_ScalarType_co]], coords: PatchCoords, is_3d=False) → ndarray[tuple[int, ...], dtype[_ScalarType_co]]
Extract patch within the image.
- Parameters:
img (3D/4D Numpy array) – Input image to extract the patch from. E.g. (y, x, channels) in 2D and (z, y, x, channels) in 3D.
coords (dict) – Coordinates of the crop where the following keys are expected:
"z_start": starting point of the patch in Z axis.
"z_end": end point of the patch in Z axis.
"y_start": starting point of the patch in Y axis.
"y_end": end point of the patch in Y axis.
"x_start": starting point of the patch in X axis.
"x_end": end point of the patch in X axis.
is_3d (bool, optional) – Whether the expected image to read is 3D.
- Returns:
img – X element. E.g. (y, x, channels) in 2D and (z, y, x, channels) in 3D.
- Return type:
3D/4D Numpy array
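The coordinate keys listed above map naturally onto NumPy slicing. The sketch below is only an illustration: `extract_patch` is a hypothetical helper, `coords` is treated as a plain dict (the signature uses a `PatchCoords` type), and end points are assumed to be exclusive, as in Python slices.

```python
import numpy as np


def extract_patch(img, coords, is_3d=False):
    """Slice a patch out of `img` using start/end coordinates (hypothetical helper)."""
    ys = slice(coords["y_start"], coords["y_end"])
    xs = slice(coords["x_start"], coords["x_end"])
    if is_3d:
        zs = slice(coords["z_start"], coords["z_end"])
        return img[zs, ys, xs]  # the channel axis is kept whole
    return img[ys, xs]


img = np.arange(5 * 6).reshape(5, 6, 1)  # (y, x, channels)
patch = extract_patch(img, {"y_start": 1, "y_end": 3, "x_start": 2, "x_end": 5})
print(patch.shape)  # (2, 3, 1)
```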
- biapy.data.data_manipulation.img_to_onehot_encoding(img: ndarray[tuple[int, ...], dtype[_ScalarType_co]], num_classes: int = 2) → ndarray[tuple[int, ...], dtype[_ScalarType_co]]
Convert the given image into one-hot encoded format.
The opposite function is onehot_encoding_to_img().
- Parameters:
img (Numpy 3D/4D array) – Image. E.g. (y, x, channels) or (z, y, x, channels).
num_classes (int, optional) – Number of classes to distinguish.
- Returns:
one_hot_labels – Data one-hot encoded. E.g. (y, x, num_classes) or (z, y, x, num_classes).
- Return type:
Numpy 3D/4D array
- biapy.data.data_manipulation.onehot_encoding_to_img(encoded_image: ndarray[tuple[int, ...], dtype[_ScalarType_co]]) → ndarray[tuple[int, ...], dtype[_ScalarType_co]]
Convert a one-hot encoded image into an image with just one channel in which each class is represented by an integer.
The opposite function is img_to_onehot_encoding().
- Parameters:
encoded_image (Numpy 3D/4D array) – One-hot encoded image. E.g. (y, x, num_classes) or (z, y, x, num_classes).
- Returns:
img – Decoded image with a single channel, each class represented by an integer.
- Return type:
Numpy 3D/4D array
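The round trip between integer labels and one-hot encoding can be sketched with plain NumPy. `to_onehot` and `from_onehot` are hypothetical stand-ins for the two functions above; BiaPy's own handling (for example of multi-channel inputs or output dtypes) may differ.

```python
import numpy as np


def to_onehot(labels, num_classes=2):
    """Encode integer labels of shape (..., 1) as (..., num_classes) (hypothetical helper)."""
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype=np.uint8)[labels[..., 0]]


def from_onehot(encoded):
    """Decode (..., num_classes) back to integer labels of shape (..., 1)."""
    return np.argmax(encoded, axis=-1)[..., np.newaxis]


labels = np.array([[0, 1], [2, 1]])[..., np.newaxis]  # (y, x, 1)
encoded = to_onehot(labels, num_classes=3)            # (y, x, 3)
decoded = from_onehot(encoded)                        # (y, x, 1)
print(encoded.shape, decoded.shape)
```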
- biapy.data.data_manipulation.load_img_data(path: str, is_3d: bool = False, data_within_zarr_path: str | None = None) → Tuple[ndarray[tuple[int, ...], dtype[Any]], str]
Load data from a given path.
- Parameters:
path (str) – Path to the image to read.
is_3d (bool, optional) – Whether the expected image to read is 3D.
data_within_zarr_path (str, optional) – Path to find the data within the Zarr file. E.g. "volumes.labels.neuron_ids".
- Returns:
data (Zarr, H5 or Numpy 3D/4D array) – Data read. E.g. (z, y, x, channels) for 3D or (y, x, channels) for 2D.
file (str) – File of the data read. Useful to close it in case it is an H5 file.
- biapy.data.data_manipulation.read_img_as_ndarray(path: str, is_3d: bool = False) → ndarray[tuple[int, ...], dtype[_ScalarType_co]]
Read an image from a given path.
- Parameters:
path (str) – Path to the image to read.
is_3d (bool, optional) – Whether the expected image to read is 3D.
- Returns:
img – Image read. E.g. (z, y, x, channels) for 3D or (y, x, channels) for 2D.
- Return type:
Numpy 3D/4D array
- biapy.data.data_manipulation.imread(path: str) → ndarray[tuple[int, ...], dtype[_ScalarType_co]] | Tuple[ndarray[tuple[int, ...], dtype[_ScalarType_co]], str | None]
Read an image from a given path.
In the past, from skimage.io import imread was used, but it is now deprecated.
- Parameters:
path (str) – Path to the image to read.
- Returns:
img – Image read.
- Return type:
Numpy array
- biapy.data.data_manipulation.imwrite(path: str, image: ndarray[tuple[int, ...], dtype[_ScalarType_co]])
Write image to the given path.
In the past, from skimage.io import imsave was used, but it is now deprecated.
- Parameters:
path (str) – Path where the image will be written.
image (Numpy array) – Image to store.
- biapy.data.data_manipulation.check_value(value: int | float | Tuple[int | float] | List[float | int] | ndarray[tuple[int, ...], dtype[_ScalarType_co]], value_range: Tuple[int | float, int | float] = (0, 1)) → bool
Check whether a value or a collection of values falls within a specified range.
This function supports individual values (int, float), lists or tuples of values, and NumPy arrays. If value is a list or tuple, all elements must fall within the specified value_range. For NumPy arrays, both the minimum and maximum values of the array must be within the range.
- Parameters:
value (int, float, list, tuple or np.ndarray) – The value or collection of values to check.
value_range (tuple of (int or float), optional) – A (min, max) tuple specifying the inclusive range of valid values. Default is (0, 1).
- Returns:
True if all values are within the specified range; False otherwise.
- Return type:
bool
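The three cases described above (scalar, list or tuple, NumPy array) could be handled along the following lines. This is a minimal sketch written from the description only; `in_range` is a hypothetical name and the actual function may treat edge cases differently.

```python
import numpy as np


def in_range(value, value_range=(0, 1)):
    """Inclusive range check for scalars, lists/tuples and arrays (hypothetical helper)."""
    low, high = value_range
    if isinstance(value, np.ndarray):
        # For arrays it is enough to compare the extremes.
        return bool(low <= value.min() and value.max() <= high)
    if isinstance(value, (list, tuple)):
        return all(low <= v <= high for v in value)
    return bool(low <= value <= high)


print(in_range(0.5))
print(in_range((0.2, 1.5)))
print(in_range(np.array([0, 255]), value_range=(0, 255)))
```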
- biapy.data.data_manipulation.data_range(x: ndarray[tuple[int, ...], dtype[_ScalarType_co]]) → str
Determine the value range of a NumPy array commonly used in image data.
This function checks whether the input array falls within one of the standard intensity ranges used in image processing: [0, 1], [0, 255], or [0, 65535], corresponding to normalized float, 8-bit, or 16-bit unsigned integer images, respectively.
- Parameters:
x (np.ndarray) – The input array whose range is to be determined.
- Returns:
A string indicating the value range:
"01 range" for values in [0, 1]
"uint8 range" for values in [0, 255]
"uint16 range" for values in [0, 65535]
"none_range" if values fall outside these common ranges
- Return type:
str
- Raises:
ValueError – If the input is not a NumPy array.
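A possible reading of this classification, based only on the minimum and maximum of the array and testing the narrowest range first, is sketched below. `describe_range` is a hypothetical name, and the assumption that only min/max are inspected (not the dtype) is ours.

```python
import numpy as np


def describe_range(x):
    """Classify an array by its value range (hypothetical helper)."""
    if not isinstance(x, np.ndarray):
        raise ValueError("Input must be a NumPy array")
    lo, hi = x.min(), x.max()
    # Assumption: ranges are tested from narrowest to widest.
    for upper, name in ((1, "01 range"), (255, "uint8 range"), (65535, "uint16 range")):
        if lo >= 0 and hi <= upper:
            return name
    return "none_range"


print(describe_range(np.array([0.0, 0.3, 1.0])))
print(describe_range(np.array([0, 128, 255])))
print(describe_range(np.array([-5, 10])))
```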
- biapy.data.data_manipulation.check_masks(path: str, n_classes: int = 2, is_3d: bool = False)
Check whether the data masks have the correct labels by inspecting a few random images from the given path.
If the function raises no error, the masks can be assumed to be correct.
- Parameters:
path (str) – Path to the data mask.
n_classes (int, optional) – Maximum number of classes that the masks must contain.
is_3d (bool, optional) – Whether the expected image to read is 3D.
- biapy.data.data_manipulation.shape_mismatch_message(X_data: BiaPyDataset, Y_data: BiaPyDataset) → str
Build an error message describing the shape mismatch between the two provided datasets, X_data and Y_data.
- Parameters:
X_data (BiaPyDataset) – X data.
Y_data (BiaPyDataset) – Y data.
- Returns:
mistmatch_message – Message containing which samples mismatch.
- Return type:
str
- biapy.data.data_manipulation.save_tif(X: ndarray[tuple[int, ...], dtype[_ScalarType_co]], data_dir: str, filenames: List[str] | None = None, verbose: bool = True)
Save images in the given directory.
If the input data has a dtype other than np.uint8, np.uint16 or np.float32, it is automatically cast to np.float32. Otherwise the axes are not set correctly when the resulting images are opened in Fiji/ImageJ.
- Parameters:
X (4D/5D numpy array) – Data to save as images. The first dimension must be the number of images. E.g. (num_of_images, y, x, channels) or (num_of_images, z, y, x, channels).
data_dir (str) – Path to store X images.
filenames (List, optional) – Filenames that should be used when saving each image.
verbose (bool, optional) – Whether to print saving information.
- biapy.data.data_manipulation.save_tif_pair_discard(X: ndarray[tuple[int, ...], dtype[_ScalarType_co]], Y: ndarray[tuple[int, ...], dtype[_ScalarType_co]], data_dir: str, suffix: str = '', filenames: List | None = None, discard: bool = True, verbose: bool = True)
Save images in the given directory.
- Parameters:
X (4D/5D numpy array) – Data to save as images. The first dimension must be the number of images. E.g. (num_of_images, y, x, channels) or (num_of_images, z, y, x, channels).
Y (4D/5D numpy array) – Data mask to save. The first dimension must be the number of images. E.g. (num_of_images, y, x, channels) or (num_of_images, z, y, x, channels).
data_dir (str) – Path to store X images.
suffix (str, optional) – Suffix to apply on output directory.
filenames (List, optional) – Filenames that should be used when saving each image.
discard (bool, optional) – Whether to discard image/mask pairs if the mask has no label information.
verbose (bool, optional) – Whether to print saving information.
- biapy.data.data_manipulation.save_npy_files(X: ndarray[tuple[int, ...], dtype[_ScalarType_co]], data_dir: str, filenames: List[str] | None = None, verbose: bool = True)
Save images in the given directory.
- Parameters:
X (4D/5D numpy array) – Data to save as images. The first dimension must be the number of images. E.g. (num_of_images, y, x, channels) or (num_of_images, z, y, x, channels).
data_dir (str) – Path to store X images.
filenames (List, optional) – Filenames that should be used when saving each image.
verbose (bool, optional) – Whether to print saving information.
- biapy.data.data_manipulation.reduce_dtype(x: ndarray[tuple[int, ...], dtype[_ScalarType_co]], x_min: float, x_max: float, out_min: float = 0, out_max: float = 1, out_type: str = 'float32', eps: float = 1e-06) → ndarray[tuple[int, ...], dtype[_ScalarType_co]]
Reduce the data type of the given input to the selected range.
It uses the following formula:
results = ((x - x_min)/(x_max - x_min)) * (out_max - out_min)
- Parameters:
x (3D/4D Numpy array) – Image whose data type is to be reduced. E.g. (y, x, channels) in 2D and (z, y, x, channels) in 3D.
x_min (float) – x_min in the formula above.
x_max (float) – x_max in the formula above.
out_min (float, optional) – out_min in the formula above.
out_max (float, optional) – out_max in the formula above.
out_type (str, optional) – Type of the output data.
eps (float, optional) – Epsilon used to avoid division by zero.
- Returns:
x – Data type reduced image. E.g. (y, x, channels) in 2D and (z, y, x, channels) in 3D.
- Return type:
3D/4D Numpy array
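The formula above can be tried out with a few lines of NumPy. The sketch applies it literally; `rescale` is a hypothetical name, and both the placement of `eps` in the denominator and the initial cast to float64 are assumptions, not confirmed details of the library.

```python
import numpy as np


def rescale(x, x_min, x_max, out_min=0.0, out_max=1.0, out_type="float32", eps=1e-6):
    """Apply the documented rescaling formula (hypothetical helper)."""
    x = x.astype(np.float64)  # avoid unsigned integer wrap-around in the subtraction
    # Assumption: eps is added to the denominator to avoid dividing by zero.
    scaled = ((x - x_min) / (x_max - x_min + eps)) * (out_max - out_min)
    return scaled.astype(out_type)


x = np.array([0, 5, 10], dtype=np.uint16)
y = rescale(x, x_min=0, x_max=10)
print(y.dtype, y)
```

With the defaults `out_min=0` and `out_max=1` the result lies approximately in [0, 1]; the tiny deviation at the upper end comes from `eps`.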
- biapy.data.data_manipulation.resize(input_data, size, mode='bilinear', **kwargs)
Resize a multi-dimensional image tensor or array to a specified size.
This function resizes 2D or 3D image data in either PyTorch tensor or NumPy array format using appropriate interpolation methods. The input is expected to follow common conventions for image dimensions.
Supported input formats:
PyTorch tensor of shape (B, C, H, W) for 2D or (B, C, D, H, W) for 3D data
NumPy array of shape (B, H, W, C) for 2D or (B, D, H, W, C) for 3D data
- Parameters:
input_data (torch.Tensor or np.ndarray) – The image data to be resized.
size (tuple of int) – Target size for each dimension. Must match the number of dimensions in input_data. Only spatial dimensions are resized (e.g., H, W, D); batch and channel dimensions are preserved.
mode (str, optional) – Interpolation mode to use. Must be one of the keys in interp_mode_map. Defaults to "bilinear".
**kwargs (dict) – Additional arguments passed to torch.nn.functional.interpolate or skimage.transform.resize.
- Returns:
The resized image data in the same format as the input.
- Return type:
torch.Tensor or np.ndarray
- Raises:
ValueError – If the length of size does not match the number of dimensions in input_data, or if an unsupported interpolation mode is specified.
TypeError – If input_data is neither a PyTorch tensor nor a NumPy array.
- biapy.data.data_manipulation.decide_dtype(num_values: int) → dtype
Decide the smallest unsigned integer dtype that can hold the given number of values.
- Parameters:
num_values (int) – The number of distinct values that need to be represented.
- Returns:
The smallest unsigned integer dtype that can represent num_values distinct values. Possible return values are np.uint8, np.uint16, or np.uint32.
- Return type:
np.dtype
- Raises:
ValueError – If num_values is negative or exceeds the maximum representable by np.uint32.
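One way to pick such a dtype is to walk through the candidates from smallest to largest, as in the sketch below. `smallest_uint_dtype` is a hypothetical name, and the exact threshold (whether a dtype "fits" at 255 or at 256 values, for instance) is an assumption that may not match the library.

```python
import numpy as np


def smallest_uint_dtype(num_values):
    """Pick the smallest unsigned dtype offering `num_values` distinct values (hypothetical helper)."""
    if num_values < 0:
        raise ValueError("num_values must be non-negative")
    for candidate in (np.uint8, np.uint16, np.uint32):
        # Assumption: a dtype with maximum M can represent M + 1 distinct values (0..M).
        if num_values <= np.iinfo(candidate).max + 1:
            return np.dtype(candidate)
    raise ValueError("num_values exceeds what np.uint32 can represent")


print(smallest_uint_dtype(2))
print(smallest_uint_dtype(300))
print(smallest_uint_dtype(70000))
```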