Single base generator

class biapy.data.generators.single_base_data_generator.SingleBaseDataGenerator(ndim, X, Y, data_path, ptype, n_classes, seed=0, data_mode='', da=True, da_prob=0.5, rotation90=False, rand_rot=False, rnd_rot_range=(-180, 180), shear=False, shear_range=(-20, 20), zoom=False, zoom_range=(0.8, 1.2), shift=False, shift_range=(0.1, 0.2), affine_mode='constant', vflip=False, hflip=False, elastic=False, e_alpha=(240, 250), e_sigma=25, e_mode='constant', g_blur=False, g_sigma=(1.0, 2.0), median_blur=False, mb_kernel=(3, 7), motion_blur=False, motb_k_range=(3, 8), gamma_contrast=False, gc_gamma=(1.25, 1.75), dropout=False, drop_range=(0, 0.2), val=False, resize_shape=None, norm_dict=None, convert_to_rgb=False)[source]

Bases: Dataset

Custom BaseDataGenerator based on imgaug and our own augmentors.py transformations.

Based on microDL and Shervine’s blog.

Parameters:
  • ndim (int) – Dimensions of the data (2 for 2D and 3 for 3D).

  • X (4D/5D Numpy array) – Data. E.g. (num_of_images, y, x, channels) for 2D or (num_of_images, z, y, x, channels) for 3D.

  • Y (2D Numpy array) – Image classes, with shape (num_of_images, class).

  • data_path (List of str, optional) – If the data is not in memory (data_mode != 'in_memory') this should contain the paths from which to load the images.

  • ptype (str) – Problem type. Options: ['mae', 'classification'].

  • n_classes (int) – Number of classes to predict.

  • seed (int, optional) – Seed for random functions.

  • data_mode (str, optional) – Information about how the data needs to be managed. Options: ['in_memory', 'not_in_memory', 'chunked_data'].

  • da (bool, optional) – To activate the data augmentation.

  • da_prob (float, optional) – Probability of doing each transformation.

  • rotation90 (bool, optional) – To make square (90, 180 or 270) degree rotations.

  • rand_rot (bool, optional) – To make random degree range rotations.

  • rnd_rot_range (tuple of float, optional) – Range of random rotations. E.g. (-180, 180).

  • shear (bool, optional) – To make shear transformations.

  • shear_range (tuple of int, optional) – Degree range to make shear. E.g. (-20, 20).

  • zoom (bool, optional) – To make zoom on images.

  • zoom_range (tuple of floats, optional) – Zoom range to apply. E.g. (0.8, 1.2).

  • shift (bool, optional) – To make shifts.

  • shift_range (tuple of float, optional) – Range to make a shift. E.g. (0.1, 0.2).

  • affine_mode (str, optional) – Method to use when filling in newly created pixels. Same meaning as in skimage (and numpy.pad()). E.g. constant, reflect etc.

  • vflip (bool, optional) – To activate vertical flips.

  • hflip (bool, optional) – To activate horizontal flips.

  • elastic (bool, optional) – To make elastic deformations.

  • e_alpha (tuple of ints, optional) – Strength of the distortion field. E.g. (240, 250).

  • e_sigma (int, optional) – Standard deviation of the gaussian kernel used to smooth the distortion fields.

  • e_mode (str, optional) – Parameter that defines the handling of newly created pixels with the elastic transformation.

  • g_blur (bool, optional) – To insert gaussian blur on the images.

  • g_sigma (tuple of floats, optional) – Standard deviation of the gaussian kernel. E.g. (1.0, 2.0).

  • median_blur (bool, optional) – To blur an image by computing median values over neighbourhoods.

  • mb_kernel (tuple of ints, optional) – Median blur kernel size. E.g. (3, 7).

  • motion_blur (bool, optional) – Blur images in a way that fakes camera or object movements.

  • motb_k_range (tuple of ints, optional) – Kernel size range to use in motion blur.

  • gamma_contrast (bool, optional) – To insert gamma contrast changes on images.

  • gc_gamma (tuple of floats, optional) – Exponent for the contrast adjustment. Higher values darken the image. E.g. (1.25, 1.75).

  • dropout (bool, optional) – To set a certain fraction of pixels in images to zero.

  • drop_range (tuple of floats, optional) – Range from which a probability p is drawn. E.g. (0, 0.2) will take a p satisfying 0 <= p <= 0.2 and then drop p percent of all pixels in the image (i.e. convert them to black pixels).

  • val (bool, optional) – Advises the generator that the images will be used to validate the model, so no random crops are made (the validation data must be the same on each epoch). Valid when random_crops_in_DA is set.

  • resize_shape (tuple of ints, optional) – If defined the input samples will be scaled into that shape.

  • norm_dict (dict, optional) – Normalization instructions.

  • convert_to_rgb (bool, optional) – In case RGB images are expected (e.g. if crop_shape has 3 channels), grayscale images are converted into RGB.
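
As a sketch of what convert_to_rgb implies (this is an illustrative stand-in, not BiaPy's actual implementation, and to_rgb_if_needed is a hypothetical name): a single-channel image has its channel repeated three times so it matches an RGB-expecting network input.

```python
import numpy as np

def to_rgb_if_needed(img):
    """Repeat a grayscale image's channel so its shape matches RGB input.

    Sketch of the behaviour described for ``convert_to_rgb``; the real
    generator may differ in detail.
    """
    if img.shape[-1] == 1:
        img = np.repeat(img, 3, axis=-1)
    return img

gray = np.zeros((64, 64, 1), dtype=np.float32)
rgb = to_rgb_if_needed(gray)  # shape becomes (64, 64, 3)
```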

abstract save_aug_samples(img, orig_image, i, pos, out_dir, draw_grid)[source]
abstract ensure_shape(img, mask)[source]
load_sample(idx)[source]

Load one data sample given its corresponding index.

Parameters:

idx (int) – Sample index counter.

Returns:

  • img (3D/4D Numpy array) – X element. E.g. (y, x, channels) in 2D and (z, y, x, channels) in 3D.

  • class (int) – Y element.

__getitem__(index)[source]

Generation of one pair of data.

Parameters:

index (int) – Index counter.

Returns:

item – X and Y (if available) elements. X is (z, y, x, channels) if 3D or (y, x, channels) if 2D. Y is an integer.

Return type:

tuple of 3D/4D Numpy arrays
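
Independent of BiaPy, the contract of __getitem__ can be sketched with a minimal in-memory dataset (ToyDataset is a hypothetical name used only for illustration):

```python
import numpy as np

class ToyDataset:
    """Minimal stand-in illustrating the (X, Y) pair returned per index."""

    def __init__(self, X, Y):
        self.X, self.Y = X, Y

    def __len__(self):
        return len(self.X)

    def __getitem__(self, index):
        # X sample is (y, x, channels) in 2D; Y label is an integer
        return self.X[index], int(self.Y[index])

ds = ToyDataset(np.zeros((10, 32, 32, 1)), np.arange(10))
img, label = ds[3]
```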

apply_transform(image)[source]

Transform the input image with one of the selected choices based on a probability.

Parameters:

image (3D/4D Numpy array) – Image to transform. E.g. (y, x, channels) in 2D or (z, y, x, channels) in 3D.

Returns:

image – Transformed image. E.g. (y, x, channels) in 2D or (z, y, x, channels) in 3D.

Return type:

3D/4D Numpy array
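
The per-transformation gating controlled by da_prob can be sketched as follows. This simplified stand-in uses only flips; the real generator draws from the full set of enabled imgaug/augmentors.py transformations:

```python
import numpy as np

def apply_transform(image, da_prob=0.5, rng=None):
    """Apply each enabled transform independently with probability ``da_prob``."""
    rng = rng or np.random.default_rng()
    if rng.random() < da_prob:
        image = np.flip(image, axis=0)  # vertical flip
    if rng.random() < da_prob:
        image = np.flip(image, axis=1)  # horizontal flip
    return image

out = apply_transform(np.ones((16, 16, 1)), da_prob=1.0,
                      rng=np.random.default_rng(0))
```

The shape of the input is preserved; only the pixel layout changes.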

draw_grid(im, grid_width=None)[source]

Draw grid of the specified size on an image.

Parameters:
  • im (3D/4D Numpy array) – Image to be modified. E.g. (y, x, channels) in 2D or (z, y, x, channels) in 3D.

  • grid_width (int, optional) – Grid’s width.
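
A minimal numpy sketch of grid drawing; the default width chosen here (one fifth of the image height) and the grid value (image maximum) are assumptions, not necessarily BiaPy's defaults:

```python
import numpy as np

def draw_grid(im, grid_width=None):
    """Overwrite evenly spaced rows and columns with a bright grid value."""
    gw = grid_width or max(1, im.shape[0] // 5)
    v = im.max() or 1.0  # fall back to 1.0 on an all-zero image
    im = im.copy()
    im[gw::gw, :] = v
    im[:, gw::gw] = v
    return im

img = draw_grid(np.zeros((50, 50, 1)), grid_width=10)
```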

get_transformed_samples(num_examples, random_images=True, save_to_dir=True, out_dir='aug', train=False, draw_grid=True)[source]

Apply selected transformations to a defined number of images from the dataset.

Parameters:
  • num_examples (int) – Number of examples to generate.

  • random_images (bool, optional) – Randomly select images from the dataset. If False the examples will be generated from the start of the dataset.

  • save_to_dir (bool, optional) – Save the images generated. The purpose of this variable is to check the images generated by data augmentation.

  • out_dir (str, optional) – Name of the folder where the examples will be stored.

  • train (bool, optional) – To avoid drawing a grid on the generated images. This should be set when the samples will be used for training.

  • draw_grid (bool, optional) – Draw a grid in the generated samples. Useful to see some types of deformations.

Returns:

sample_x – Batch of data. E.g. (num_examples, y, x, channels) in 2D or (num_examples, z, y, x, channels) in 3D.

Return type:

4D/5D Numpy array
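
The selection-and-stacking logic this method describes can be sketched without BiaPy (the horizontal flip stands in for the full augmentation pipeline, and the saving/grid steps are omitted):

```python
import numpy as np

def get_transformed_samples(X, num_examples, random_images=True, rng=None):
    """Pick samples (randomly or from the start), transform, and stack them."""
    rng = rng or np.random.default_rng()
    if random_images:
        idx = rng.integers(0, len(X), size=num_examples)
    else:
        idx = np.arange(num_examples)
    # stand-in transform: a horizontal flip
    out = [np.flip(X[i], axis=1) for i in idx]
    return np.stack(out, axis=0)  # (num_examples, y, x, channels)

X = np.zeros((10, 8, 8, 1))
sample_x = get_transformed_samples(X, 4, random_images=False)
```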

get_data_normalization()[source]