Pair base generator
- class biapy.data.generators.pair_base_data_generator.PairBaseDataGenerator(ndim, X, Y, seed=0, data_mode='', data_paths=None, da=True, da_prob=0.5, rotation90=False, rand_rot=False, rnd_rot_range=(-180, 180), shear=False, shear_range=(-20, 20), zoom=False, zoom_range=(0.8, 1.2), shift=False, shift_range=(0.1, 0.2), affine_mode='constant', vflip=False, hflip=False, elastic=False, e_alpha=(240, 250), e_sigma=25, e_mode='constant', g_blur=False, g_sigma=(1.0, 2.0), median_blur=False, mb_kernel=(3, 7), motion_blur=False, motb_k_range=(3, 8), gamma_contrast=False, gc_gamma=(1.25, 1.75), brightness=False, brightness_factor=(1, 3), brightness_mode='2D', contrast=False, contrast_factor=(1, 3), contrast_mode='2D', brightness_em=False, brightness_em_factor=(1, 3), brightness_em_mode='2D', contrast_em=False, contrast_em_factor=(1, 3), contrast_em_mode='2D', dropout=False, drop_range=(0, 0.2), cutout=False, cout_nb_iterations=(1, 3), cout_size=(0.2, 0.4), cout_cval=0, cout_apply_to_mask=False, cutblur=False, cblur_size=(0.1, 0.5), cblur_down_range=(2, 8), cblur_inside=True, cutmix=False, cmix_size=(0.2, 0.4), cutnoise=False, cnoise_scale=(0.1, 0.2), cnoise_nb_iterations=(1, 3), cnoise_size=(0.2, 0.4), misalignment=False, ms_displacement=16, ms_rotate_ratio=0.0, missing_sections=False, missp_iterations=(30, 40), grayscale=False, channel_shuffle=False, gridmask=False, grid_ratio=0.6, grid_d_range=(0.4, 1), grid_rotate=1, grid_invert=False, gaussian_noise=False, gaussian_noise_mean=0, gaussian_noise_var=0.01, gaussian_noise_use_input_img_mean_and_var=False, poisson_noise=False, salt=False, salt_amount=0.05, pepper=False, pepper_amount=0.05, salt_and_pepper=False, salt_pep_amount=0.05, salt_pep_proportion=0.5, random_crops_in_DA=False, shape=(256, 256, 1), resolution=(-1,), prob_map=None, val=False, n_classes=1, extra_data_factor=1, n2v=False, n2v_perc_pix=0.198, n2v_manipulator='uniform_withCP', n2v_neighborhood_radius=5, n2v_structMask=array([[0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
0]]), norm_dict=None, instance_problem=False, random_crop_scale=(1, 1), convert_to_rgb=False, multiple_raw_images=False)[source]
Bases: Dataset
Custom BaseDataGenerator based on imgaug and our own augmentors.py transformations.
Based on microDL and Shervine’s blog.
- Parameters:
ndim (int) – Dimensions of the data (2 for 2D and 3 for 3D).
X (4D/5D Numpy array) – Data. E.g. (num_of_images, y, x, channels) for 2D or (num_of_images, z, y, x, channels) for 3D.
Y (4D/5D Numpy array) – Mask data. E.g. (num_of_images, y, x, channels) for 2D or (num_of_images, z, y, x, channels) for 3D.
seed (int, optional) – Seed for random functions.
data_mode (str, optional) – Information about how the data needs to be managed. Options: ['in_memory', 'not_in_memory', 'chunked_data'].
data_paths (List of str, optional) – If the data is in memory (data_mode == 'in_memory'), this list should contain the paths to load data and masks. data_paths[0] should be the data path and data_paths[1] the masks path.
da (bool, optional) – To activate the data augmentation.
da_prob (float, optional) – Probability of doing each transformation.
rotation90 (bool, optional) – To make square (90, 180, 270) degree rotations.
rand_rot (bool, optional) – To make random degree range rotations.
rnd_rot_range (tuple of float, optional) – Range of random rotations. E.g. (-180, 180).
shear (bool, optional) – To make shear transformations.
shear_range (tuple of int, optional) – Degree range to make shear. E.g. (-20, 20).
zoom (bool, optional) – To make zoom on images.
zoom_range (tuple of floats, optional) – Zoom range to apply. E.g. (0.8, 1.2).
shift (bool, optional) – To make shifts.
shift_range (tuple of float, optional) – Range to make a shift. E.g. (0.1, 0.2).
affine_mode (str, optional) – Method to use when filling in newly created pixels. Same meaning as in skimage (and numpy.pad()). E.g. constant, reflect, etc.
vflip (bool, optional) – To activate vertical flips.
hflip (bool, optional) – To activate horizontal flips.
elastic (bool, optional) – To make elastic deformations.
e_alpha (tuple of ints, optional) – Strength of the distortion field. E.g. (240, 250).
e_sigma (int, optional) – Standard deviation of the Gaussian kernel used to smooth the distortion fields.
e_mode (str, optional) – Parameter that defines the handling of newly created pixels with the elastic transformation.
g_blur (bool, optional) – To insert Gaussian blur on the images.
g_sigma (tuple of floats, optional) – Standard deviation of the Gaussian kernel. E.g. (1.0, 2.0).
median_blur (bool, optional) – To blur an image by computing median values over neighbourhoods.
mb_kernel (tuple of ints, optional) – Median blur kernel size. E.g. (3, 7).
motion_blur (bool, optional) – Blur images in a way that fakes camera or object movements.
motb_k_range (tuple of ints, optional) – Range of kernel sizes to use in motion blur. E.g. (3, 8).
gamma_contrast (bool, optional) – To insert gamma contrast changes on images.
gc_gamma (tuple of floats, optional) – Exponent for the contrast adjustment. Higher values darken the image. E.g. (1.25, 1.75).
brightness (bool, optional) – To apply brightness to the images as PyTorch Connectomics.
brightness_factor (tuple of 2 floats, optional) – Strength of the brightness range, with valid values being 0 <= brightness_factor <= 1. E.g. (0.1, 0.3).
brightness_mode (str, optional) – Apply the same brightness change to the whole image or a different one slice by slice.
contrast (bool, optional) – To apply contrast changes to the images as PyTorch Connectomics.
contrast_factor (tuple of 2 floats, optional) – Strength of the contrast change range, with valid values being 0 <= contrast_factor <= 1. E.g. (0.1, 0.3).
contrast_mode (str, optional) – Apply the same contrast change to the whole image or a different one slice by slice.
brightness_em (bool, optional) – To apply brightness to the images as PyTorch Connectomics.
brightness_em_factor (tuple of 2 floats, optional) – Strength of the brightness range, with valid values being 0 <= brightness_em_factor <= 1. E.g. (0.1, 0.3).
brightness_em_mode (str, optional) – Apply the same brightness change to the whole image or a different one slice by slice.
contrast_em (bool, optional) – To apply contrast changes to the images as PyTorch Connectomics.
contrast_em_factor (tuple of 2 floats, optional) – Strength of the contrast change range, with valid values being 0 <= contrast_em_factor <= 1. E.g. (0.1, 0.3).
contrast_em_mode (str, optional) – Apply the same contrast change to the whole image or a different one slice by slice.
dropout (bool, optional) – To set a certain fraction of pixels in images to zero.
drop_range (tuple of floats, optional) – Range to take a probability p to drop pixels. E.g. (0, 0.2) will take a p following 0<=p<=0.2 and then drop p percent of all pixels in the image (i.e. convert them to black pixels).
cutout (bool, optional) – To fill one or more rectangular areas in an image using a fill mode.
cout_nb_iterations (tuple of ints, optional) – Range of number of areas to fill the image with. E.g. (1, 3).
cout_size (tuple of floats, optional) – Range to select the size of the areas in % of the corresponding image size. Values between 0 and 1. E.g. (0.2, 0.4).
cout_cval (int, optional) – Value to fill the area of cutout with.
cout_apply_to_mask (bool, optional) – Whether to apply cutout to the mask.
cutblur (boolean, optional) – Blur a rectangular area of the image by downsampling and upsampling it again.
cblur_size (tuple of floats, optional) – Range to select the size of the area to apply cutblur on. E.g. (0.2, 0.4).
cblur_inside (boolean, optional) – If True only the region inside will be modified (cut LR into HR image). If False, 50% of the time the region inside will be modified (cut LR into HR image) and the other 50% the inverse will be done (cut HR into LR image). See Figure 1 of the official paper.
cutmix (boolean, optional) – Combine two images pasting a region of one image to another.
cmix_size (tuple of floats, optional) – Range to select the size of the area to paste one image into another. E.g. (0.2, 0.4).
cutnoise (boolean, optional) – Randomly add noise to a cuboid region in the image.
cnoise_scale (tuple of floats, optional) – Range to choose a value that will represent the % of the maximum value of the image that will be used as the std of the Gaussian noise distribution. E.g. (0.1, 0.2).
cnoise_nb_iterations (tuple of ints, optional) – Number of areas with noise to create. E.g. (1, 3).
cnoise_size (tuple of floats, optional) – Range to choose the size of the areas to transform. E.g. (0.2, 0.4).
misalignment (boolean, optional) – To add misalignment augmentation.
ms_displacement (int, optional) – Maximum pixel displacement in xy-plane for misalignment.
ms_rotate_ratio (float, optional) – Ratio of rotation-based misalignment.
missing_sections (boolean, optional) – Augment the image by creating a black line in a random position.
missp_iterations (tuple of 2 ints, optional) – Iterations to dilate the missing line with. E.g. (30, 40).
grayscale (bool, optional) – Whether to augment images by partially converting them to grayscale.
gridmask (bool, optional) – Whether to apply gridmask to the image. See the official paper for more information about it and its parameters.
grid_ratio (float, optional) – Determines the keep ratio of an input image (r in the original paper).
grid_d_range (tuple of floats, optional) – Range to choose a d value. It represents the % of the image size. E.g. (0.4, 1).
grid_rotate (float, optional) – Rotation of the mask in GridMask. Needs to be in [0,1], where 1 is 360 degrees.
grid_invert (bool, optional) – Whether to invert the mask of GridMask.
channel_shuffle (bool, optional) – Whether to shuffle the channels of the images.
gaussian_noise (bool, optional) – To apply Gaussian noise to the images.
gaussian_noise_mean (float, optional) – Mean of the Gaussian noise.
gaussian_noise_var (float, optional) – Variance of the Gaussian noise.
gaussian_noise_use_input_img_mean_and_var (bool, optional) – Whether to use the mean and variance of the input image instead of gaussian_noise_mean and gaussian_noise_var.
poisson_noise (bool, optional) – To apply Poisson noise to the images.
salt (bool, optional) – To apply salt noise to the images.
salt_amount (float, optional) – Amount of salt noise to add.
pepper (bool, optional) – To apply pepper noise to the images.
pepper_amount (float, optional) – Amount of pepper noise to add.
salt_and_pepper (bool, optional) – To apply salt and pepper noise to the images.
salt_pep_amount (float, optional) – Amount of salt and pepper noise to add.
salt_pep_proportion (float, optional) – Proportion of salt versus pepper noise.
random_crops_in_DA (bool, optional) – Decide to make random crops in DA (before transformations).
shape (3D int tuple, optional) – Shape of the desired images when using ‘random_crops_in_DA’.
resolution (2D tuple of floats, optional) – Resolution of the given data (y,x). E.g. (8,8).
prob_map (4D Numpy array or str, optional) – If it is an array, it should represent the probability map used to make random crops when random_crops_in_DA is set. If a str is given, it should be the path to read these maps from.
val (bool, optional) – Tells the generator that the images will be used to validate the model, so that no random crops are made (the validation data must be the same on each epoch). Valid when random_crops_in_DA is set.
n_classes (int, optional) – Number of classes.
extra_data_factor (int, optional) – Factor to multiply the batches yielded in an epoch. It acts as if X and Y were concatenated extra_data_factor times.
n2v (bool, optional) – Whether to create Noise2Void mask. Used in DENOISING problem type.
n2v_perc_pix (float, optional) – Input image pixels to be manipulated.
n2v_manipulator (str, optional) – How to manipulate the input pixels. Most pixel manipulators will compute the replacement value based on a neighborhood. Possible options: normal_withoutCP: samples the neighborhood according to a normal gaussian distribution, but without the center pixel; normal_additive: adds a random number to the original pixel value. The random number is sampled from a gaussian distribution with zero-mean and sigma = n2v_neighborhood_radius ; normal_fitted: uses a random value from a gaussian normal distribution with mean equal to the mean of the neighborhood and standard deviation equal to the standard deviation of the neighborhood ; identity: performs no pixel manipulation.
n2v_neighborhood_radius (int, optional) – Neighborhood size to use when manipulating the values.
n2v_structMask (Array of ints, optional) – Masking kernel for StructN2V to hide pixels adjacent to main blind spot. Value 1 = 'hidden', Value 0 = 'non hidden'. Nested lists equivalent to ndarray. Must have odd length in each dimension (center pixel is blind spot). None implies normal N2V masking.
norm_dict (dict, optional) – Normalization instructions.
instance_problem (bool, optional) – Advise the class that the workflow is of instance segmentation to divide the labels by channels.
random_crop_scale (tuple of ints, optional) – Scale factor of the mask used in the super-resolution workflow. E.g. (2,2).
convert_to_rgb (bool, optional) – In case RGB images are expected, e.g. if crop_shape channel is 3, those images that are grayscale are converted into RGB.
multiple_raw_images (bool, optional) – Whether to consider more than one raw image or not. In this case, a folder per sample is expected. Visit LightMyCells challenge approach for a real use case.
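The dropout and cutout parameters above can be read as simple array operations. The following is a minimal NumPy sketch of that reading, not the library's implementation: the function names dropout_sketch and cutout_sketch are hypothetical, only 2D (y, x, channels) input is handled, and drawing one value uniformly from each range is an assumption.

```python
import numpy as np


def dropout_sketch(img, drop_range=(0, 0.2), rng=None):
    """Zero a random fraction p of pixels, with p drawn from drop_range."""
    rng = np.random.default_rng() if rng is None else rng
    p = rng.uniform(*drop_range)
    # One keep/drop decision per (y, x) position, shared by all channels.
    keep = rng.random(img.shape[:2]) >= p
    return img * keep[..., np.newaxis]


def cutout_sketch(img, nb_iterations=(1, 3), size=(0.2, 0.4), cval=0, rng=None):
    """Fill a few random rectangles with cval; sizes are fractions of the image."""
    rng = np.random.default_rng() if rng is None else rng
    out = img.copy()
    h, w = out.shape[:2]
    # Number of rectangles, inclusive of both ends of the range (assumption).
    n_areas = rng.integers(nb_iterations[0], nb_iterations[1] + 1)
    for _ in range(n_areas):
        frac = rng.uniform(*size)
        rh, rw = max(1, int(h * frac)), max(1, int(w * frac))
        y0 = rng.integers(0, h - rh + 1)
        x0 = rng.integers(0, w - rw + 1)
        out[y0:y0 + rh, x0:x0 + rw] = cval
    return out
```

Passing a seeded np.random.default_rng makes both sketches reproducible, which mirrors the role of the seed argument described above.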
- load_sample(_idx)[source]
Load one data sample given its corresponding index.
- Parameters:
_idx (int) – Sample index counter.
- Returns:
img (3D/4D Numpy array) – X element. E.g. (y, x, channels) in 2D and (z, y, x, channels) in 3D.
mask (3D/4D Numpy array) – Y element. E.g. (y, x, channels) in 2D and (z, y, x, channels) in 3D.
- norm_X(img)[source]
X data normalization.
- Parameters:
img (3D/4D Numpy array) – X element, for instance, an image. E.g. (y, x, channels) in 2D and (z, y, x, channels) in 3D.
- Returns:
img – X element normalized. E.g. (y, x, channels) in 2D and (z, y, x, channels) in 3D.
- Return type:
3D/4D Numpy array
- norm_Y(mask)[source]
Y data normalization.
- Parameters:
mask (3D/4D Numpy array) – Y element, for instance, an image's mask. E.g. (y, x, channels) in 2D and (z, y, x, channels) in 3D.
- Returns:
mask – Y element normalized. E.g. (y, x, channels) in 2D and (z, y, x, channels) in 3D.
- Return type:
3D/4D Numpy array
- __getitem__(index)[source]
Generation of one pair of data.
- Parameters:
index (int) – Index counter.
- Returns:
item – X and Y (if available) elements. Each one has shape (y, x, channels) if 2D or (z, y, x, channels) if 3D.
- Return type:
tuple of 3D/4D Numpy arrays
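To make the indexing contract concrete, here is a toy map-style dataset that returns one (X, Y) pair per index. It is only a sketch: ToyPairDataset is a hypothetical stand-in written without PyTorch, it skips normalization and augmentation, and returning a one-element tuple when Y is missing is an assumption.

```python
import numpy as np


class ToyPairDataset:
    """Hypothetical stand-in for a paired generator; not BiaPy code."""

    def __init__(self, X, Y=None):
        self.X = X
        self.Y = Y

    def __len__(self):
        # Number of samples is the size of the leading axis.
        return len(self.X)

    def __getitem__(self, index):
        # Indexing drops the leading axis: (y, x, c) in 2D, (z, y, x, c) in 3D.
        img = self.X[index]
        if self.Y is None:
            return (img,)
        return img, self.Y[index]
```

With X of shape (4, 32, 32, 1), indexing yields arrays of shape (32, 32, 1); with X of shape (2, 8, 32, 32, 1), it yields (8, 32, 32, 1).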
- apply_transform(image, mask, e_im=None, e_mask=None)[source]
Transform the input image and its mask at the same time with one of the selected choices based on a probability.
- Parameters:
image (3D/4D Numpy array) – Image to transform. E.g. (y, x, channels) in 2D and (z, y, x, channels) in 3D.
mask (3D/4D Numpy array) – Mask to transform. E.g. (y, x, channels) in 2D and (z, y, x, channels) in 3D.
e_im (3D/4D Numpy array) – Extra image to help transforming image. E.g. (y, x, channels) in 2D or (z, y, x, channels) in 3D.
e_mask (3D/4D Numpy array) – Extra mask to help transforming mask. E.g. (y, x, channels) in 2D or (z, y, x, channels) in 3D.
- Returns:
image (3D/4D Numpy array) – Transformed image. E.g. (y, x, channels) in 2D or (y, x, z, channels) in 3D.
mask (3D/4D Numpy array) – Transformed image mask. E.g. (y, x, channels) in 2D or (y, x, z, channels) in 3D.
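The key property of a paired transform is that the image and its mask receive exactly the same geometric change, each one gated by a probability. A minimal sketch of that idea using only flips follows; apply_paired_flips is a hypothetical helper, and gating each flip independently with da_prob is an assumption about how the probability is used.

```python
import numpy as np


def apply_paired_flips(image, mask, da_prob=0.5, vflip=True, hflip=True, rng=None):
    """Flip image and mask together, each flip applied with probability da_prob."""
    rng = np.random.default_rng() if rng is None else rng
    # y and x are the two axes just before channels in both layouts:
    # (y, x, channels) in 2D and (z, y, x, channels) in 3D.
    y_axis, x_axis = image.ndim - 3, image.ndim - 2
    if vflip and rng.random() < da_prob:
        image, mask = np.flip(image, y_axis), np.flip(mask, y_axis)
    if hflip and rng.random() < da_prob:
        image, mask = np.flip(image, x_axis), np.flip(mask, x_axis)
    return image, mask
```

Drawing the random decision once per transform and reusing it for both arrays is what keeps the mask aligned with the image.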
- get_transformed_samples(num_examples, random_images=True, save_to_dir=True, out_dir='aug', train=False, draw_grid=True)[source]
Apply selected transformations to a defined number of images from the dataset.
- Parameters:
num_examples (int) – Number of examples to generate.
random_images (bool, optional) – Randomly select images from the dataset. If False the examples will be generated from the start of the dataset.
save_to_dir (bool, optional) – Save the images generated. The purpose of this variable is to check the images generated by data augmentation.
out_dir (str, optional) – Name of the folder where the examples will be stored.
train (bool, optional) – To avoid drawing a grid on the generated images. This should be set when the samples will be used for training.
draw_grid (bool, optional) – Draw a grid in the generated samples. Useful to see some types of deformations.
- Returns:
sample_x (List of 3D/4D Numpy array) – Transformed images. E.g. list of (y, x, channels) in 2D and (z, y, x, channels) in 3D.
sample_y (List of 3D/4D Numpy array) – Transformed image mask. E.g. list of (y, x, channels) in 2D and (z, y, x, channels) in 3D.
Examples
import numpy as np

# EXAMPLE 1
# Generate 10 samples following example 1 of the class definition
X_train = np.ones((1776, 256, 256, 1))
Y_train = np.ones((1776, 256, 256, 1))
data_gen_args = dict(ndim=2, X=X_train, Y=Y_train, shape=(256, 256, 1), rand_rot=True, vflip=True, hflip=True)
train_generator = BaseDataGenerator(**data_gen_args)
train_generator.get_transformed_samples(10, save_to_dir=True, train=False, out_dir='da_dir')

# EXAMPLE 2
# If random crop in DA-time is chosen, as in example 2 of the class definition, the call should be the
# same but two more images will be stored: img and mask representing the random crop extracted. There a
# red point is painted representing the pixel chosen to be the center of the random crop and a blue
# square which delimits crop boundaries
prob_map = calculate_2D_volume_prob_map(Y_train, 0.94, 0.06, save_file='prob_map.npy')
data_gen_args = dict(ndim=2, X=X_train, Y=Y_train, shape=(256, 256, 1), rand_rot=True, vflip=True, hflip=True,
                     random_crops_in_DA=True, prob_map=prob_map)
train_generator = BaseDataGenerator(**data_gen_args)
train_generator.get_transformed_samples(10, save_to_dir=True, train=False, out_dir='da_dir')
Example 2 will store two additional images: the image and the mask, each marked with the random crop that was extracted.
Together with these images another pair of images will be stored: the crop made and a transformed version of it, which is really the generator output.
For instance, setting elastic=True transforms the extracted crop with an elastic deformation. The grid is only painted if train=False, which should be used just to display the transformations made. Random rotations between 0 and 180 degrees can be inspected in the same way.
- draw_grid(im, grid_width=None)[source]
Draw grid of the specified size on an image.
- Parameters:
im (3D/4D Numpy array) – Image to draw the grid into. E.g. (y, x, channels) in 2D or (z, y, x, channels) in 3D.
grid_width (int, optional) – Grid's width.
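A grid overlay makes geometric deformations such as elastic transforms easy to see. The sketch below shows one way to draw such a grid with NumPy slicing. It is an illustration under assumptions: draw_grid_sketch is a hypothetical name, grid_width is read as the spacing between lines, the default spacing of one eighth of the image width and the line value 255 are arbitrary choices, and a copy is returned instead of editing in place.

```python
import numpy as np


def draw_grid_sketch(im, grid_width=None, value=255):
    """Return a copy of im with grid lines every grid_width pixels in y and x."""
    if grid_width is None:
        # Assumed default: about eight cells across the x axis.
        grid_width = max(1, im.shape[-2] // 8)
    out = im.copy()
    # The last three axes are (y, x, channels) in both the 2D and 3D layouts,
    # so the ellipsis absorbs the optional leading z axis.
    out[..., ::grid_width, :, :] = value  # horizontal lines
    out[..., :, ::grid_width, :] = value  # vertical lines
    return out
```

For a 3D volume the same lines are drawn on every z slice.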
- prepare_n2v(_img)[source]
Creates Noise2Void mask.
- Parameters:
_img (3D/4D Numpy array) – Image to wipe some pixels from. E.g. (y, x, channels) in 2D or (z, y, x, channels) in 3D.
- Returns:
img (3D/4D Numpy array) – Input image modified removing some pixels. E.g. (y, x, channels) in 2D or (y, x, z, channels) in 3D.
mask (3D/4D Numpy array) – Noise2Void mask created. E.g. (y, x, channels) in 2D or (y, x, z, channels) in 3D.
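The Noise2Void idea is to replace a small set of pixels with values taken from their neighborhood and to record which pixels were touched. A rough 2D sketch follows. It is not the library's routine: n2v_mask_sketch is a hypothetical name, perc_pix is assumed to be a percentage of pixels, and only a uniform neighbor pick that may land on the center pixel is shown.

```python
import numpy as np


def n2v_mask_sketch(img, perc_pix=0.198, radius=5, rng=None):
    """Blind-spot masking for a (y, x, channels) image; returns (img, mask)."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = img.shape[:2]
    out = img.copy()
    mask = np.zeros_like(img, dtype=np.float32)
    # Assumption: perc_pix is a percentage, so 0.198 means 0.198% of pixels.
    n_pix = max(1, int(h * w * perc_pix / 100.0))
    ys = rng.integers(0, h, n_pix)
    xs = rng.integers(0, w, n_pix)
    for y, x in zip(ys, xs):
        # Pick a neighbor uniformly within the radius, clipped to the image;
        # an offset of (0, 0) is allowed, i.e. the center pixel can be chosen.
        ny = np.clip(y + rng.integers(-radius, radius + 1), 0, h - 1)
        nx = np.clip(x + rng.integers(-radius, radius + 1), 0, w - 1)
        out[y, x] = img[ny, nx]
        mask[y, x] = 1
    return out, mask
```

Pixels where the mask is 0 are left untouched, so a loss restricted to the masked positions only sees the manipulated pixels.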