biapy.data.data_2D_manipulation

2D data manipulation utilities for biomedical image processing.

This module provides functions for processing 2D image data, particularly focused on:

  • Overlapping patch extraction and reconstruction

  • Image cropping and merging with configurable overlap

  • Shape validation and normalization

  • Memory-efficient handling of large 2D datasets

Key Features:

The module is optimized for biomedical image analysis workflows and supports:

  • Both HDF5 and numpy array inputs

  • Configurable padding and overlap strategies

  • Multi-image batch processing

  • Mask handling for segmentation tasks

Typical usage involves:

  1. Extracting patches from large images using crop_data_with_overlap()

  2. Processing patches through a neural network

  3. Reconstructing full images using merge_data_with_overlap()
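
The three steps above can be sketched in plain NumPy for a single image. This is an illustration of the idea only, not BiaPy's implementation: the helpers `starts`, `crop_image` and `merge_patches` are hypothetical names, overlapping regions are merged by averaging (an assumption), and the image is assumed to be at least as large as the crop.

```python
import numpy as np

def starts(length, crop, step):
    """Start offsets so that patches of size `crop` cover [0, length)."""
    if length <= crop:
        return [0]
    pos = list(range(0, length - crop, step))
    pos.append(length - crop)  # last patch sits flush with the border
    return pos

def crop_image(img, crop_yx, step_yx):
    """Cut a (y, x, channels) image into overlapping patches."""
    patches, coords = [], []
    for y in starts(img.shape[0], crop_yx[0], step_yx[0]):
        for x in starts(img.shape[1], crop_yx[1], step_yx[1]):
            patches.append(img[y:y + crop_yx[0], x:x + crop_yx[1]])
            coords.append((y, x))
    return np.stack(patches), coords

def merge_patches(patches, coords, shape):
    """Rebuild an image, averaging pixels covered by several patches."""
    total = np.zeros(shape, dtype=np.float64)
    count = np.zeros(shape[:2] + (1,), dtype=np.float64)
    for patch, (y, x) in zip(patches, coords):
        h, w = patch.shape[:2]
        total[y:y + h, x:x + w] += patch
        count[y:y + h, x:x + w] += 1
    return total / count

image = np.random.default_rng(0).random((512, 512, 1))
patches, coords = crop_image(image, (256, 256), (192, 192))  # step 1: crop
predictions = patches * 2.0                                   # step 2: stand-in for a network
merged = merge_patches(predictions, coords, image.shape)      # step 3: merge
```

Because the stand-in "network" is a per-pixel operation, the merged result equals the same operation applied to the full image, which makes the round trip easy to check.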

Examples:
>>> from biapy.data.data_2D_manipulation import crop_data_with_overlap
>>> # Crop 512x512 images into 256x256 patches with 25% overlap
>>> patches, coords = crop_data_with_overlap(images, (256,256,1), overlap=(0.25,0.25))
Note:

All functions expect and return images in (y, x, channels) format by default.

biapy.data.data_2D_manipulation.crop_data_with_overlap(data: ndarray[tuple[int, ...], dtype[_ScalarType_co]], crop_shape: Tuple[int, ...], data_mask: ndarray[tuple[int, ...], dtype[_ScalarType_co]] | None = None, overlap: Tuple[float, ...] = (0, 0), padding: Tuple[int, ...] = (0, 0), verbose: bool = True, load_data: bool = True) Tuple[ndarray[tuple[int, ...], dtype[_ScalarType_co]], ndarray[tuple[int, ...], dtype[_ScalarType_co]], List[PatchCoords]] | Tuple[ndarray[tuple[int, ...], dtype[_ScalarType_co]], List[PatchCoords]] | List[PatchCoords][source]

Crop data into small square pieces with overlap.

Unlike crop_data(), this function can create patches that overlap.

The opposite function is merge_data_with_overlap().

Parameters:
  • data (4D Numpy array) – Data to crop. E.g. (num_of_images, y, x, channels).

  • crop_shape (3 int tuple) – Shape of the crops to create. E.g. (y, x, channels).

  • data_mask (4D Numpy array, optional) – Data mask to crop. E.g. (num_of_images, y, x, channels).

  • overlap (Tuple of 2 floats, optional) – Minimum amount of overlap on the y and x dimensions. The values must be in the range [0, 1), that is, from 0% to 99% overlap. E.g. (y, x).

  • padding (tuple of ints, optional) – Size of padding to be added on each axis (y, x). E.g. (24, 24).

  • verbose (bool, optional) – To print information about the crop to be made.

  • load_data (bool, optional) – Whether to create the patches or not. Setting it to False saves memory when you only need the coordinates of the cropped patches.

Returns:

  • cropped_data (4D Numpy array, optional) – Cropped image data. E.g. (num_of_images, y, x, channels). Returned if load_data is True.

  • cropped_data_mask (4D Numpy array, optional) – Cropped image data masks. E.g. (num_of_images, y, x, channels). Returned if load_data is True and data_mask is provided.

  • crop_coords (list of dict) –

    Coordinates of each crop where the following keys are available:
    • "z": image used to extract the crop.

    • "y_start": starting point of the patch in Y axis.

    • "y_end": end point of the patch in Y axis.

    • "x_start": starting point of the patch in X axis.

    • "x_end": end point of the patch in X axis.
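
A minimal sketch of how such coordinates could be used to slice a patch back out of the original array. It assumes dict-style access with the keys listed above and that the end points are exclusive, as in Python slices; both are assumptions, and since the signature returns `PatchCoords` objects, attribute access may be needed instead. The coordinate values are made up for illustration.

```python
import numpy as np

# Toy data in (num_of_images, y, x, channels) layout.
data = np.arange(2 * 4 * 6, dtype=np.float32).reshape(2, 4, 6, 1)

# Hand-written coordinates using the documented keys (illustrative values).
crop_coords = [
    {"z": 0, "y_start": 0, "y_end": 2, "x_start": 0, "x_end": 3},
    {"z": 1, "y_start": 2, "y_end": 4, "x_start": 3, "x_end": 6},
]

def extract(data, c):
    """Slice one patch out of `data` using a coordinate entry."""
    return data[c["z"], c["y_start"]:c["y_end"], c["x_start"]:c["x_end"]]

patches = np.stack([extract(data, c) for c in crop_coords])
```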

Examples

# EXAMPLE 1
# Divide the given data into (256, 256) crops with the minimum overlap
X_train = np.ones((165, 768, 1024, 1))
Y_train = np.ones((165, 768, 1024, 1))

# With a mask provided, three values are returned: crops, mask crops and coordinates
X_crops, Y_crops, crop_coords = crop_data_with_overlap(X_train, (256, 256, 1), Y_train, (0, 0))

# Notice that the shape of the data is exactly divisible by the wanted crop shape, so no overlap will be
# made. The function will print the following information:
#     Minimum overlap selected: (0, 0)
#     Real overlapping (%): (0.0, 0.0)
#     Real overlapping (pixels): (0.0, 0.0)
#     (3, 4) patches per (x,y) axis
#     **** New data shape is: (1980, 256, 256, 1)

# EXAMPLE 2
# Same as example 1 but with 25% of overlap between crops
X_crops, Y_crops, crop_coords = crop_data_with_overlap(X_train, (256, 256, 1), Y_train, (0.25, 0.25))

# The function will print the following information:
#     Minimum overlap selected: (0.25, 0.25)
#     Real overlapping (%): (0.33203125, 0.3984375)
#     Real overlapping (pixels): (85.0, 102.0)
#     (4, 6) patches per (x,y) axis
#     **** New data shape is: (3960, 256, 256, 1)

# EXAMPLE 3
# Same as example 1 but with 50% of overlap between crops
X_crops, Y_crops, crop_coords = crop_data_with_overlap(X_train, (256, 256, 1), Y_train, (0.5, 0.5))

# The function will print the shape of the created array. In this example:
#     Minimum overlap selected: (0.5, 0.5)
#     Real overlapping (%): (0.59765625, 0.5703125)
#     Real overlapping (pixels): (153.0, 146.0)
#     (6, 8) patches per (x,y) axis
#     **** New data shape is: (7920, 256, 256, 1)

# EXAMPLE 4
# Same as example 1 but with 50% of overlap only in the y axis (overlap is given as (y, x))
X_crops, Y_crops, crop_coords = crop_data_with_overlap(X_train, (256, 256, 1), Y_train, (0.5, 0))

# The function will print the shape of the created array. In this example:
#     Minimum overlap selected: (0.5, 0)
#     Real overlapping (%): (0.59765625, 0.0)
#     Real overlapping (pixels): (153.0, 0.0)
#     (6, 4) patches per (x,y) axis
#     **** New data shape is: (3960, 256, 256, 1)
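
The patch counts and "real overlapping" figures printed in the examples above can be reproduced with a short calculation. The sketch below is inferred from those printed numbers, assuming padding of (0, 0); the function name `overlap_layout` is hypothetical and the code is not taken from BiaPy itself. The idea: the minimum overlap fixes an initial step, the number of patches follows from it, and any excess past the image border is then spread evenly over the gaps between patches, which increases the real overlap.

```python
import math

def overlap_layout(length, crop, min_overlap):
    """Return (patches, real overlap in pixels, real overlap fraction) for one axis."""
    step = int(crop * (1 - min_overlap))       # initial stride from the minimum overlap
    n = math.ceil(length / step)               # patches needed along this axis
    if n > 1:
        excess = (n - 1) * step + crop - length  # pixels hanging past the border
        step -= excess // (n - 1)                # spread the excess over the gaps
    real_px = crop - step if n > 1 else 0
    return n, real_px, real_px / crop

# Example 2 from above: 768 x 1024 images, 256 x 256 crops, 25% minimum overlap
print(overlap_layout(768, 256, 0.25))   # y axis
print(overlap_layout(1024, 256, 0.25))  # x axis
```

For example 2 this gives 4 patches with 85 pixels of overlap along y and 6 patches with 102 pixels along x, matching the printed output.
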
biapy.data.data_2D_manipulation.merge_data_with_overlap(data: ndarray[tuple[int, ...], dtype[_ScalarType_co]], original_shape: Tuple[int, ...], data_mask: ndarray[tuple[int, ...], dtype[_ScalarType_co]] | None = None, overlap: Tuple[float, ...] = (0, 0), padding: Tuple[int, ...] = (0, 0), verbose: bool = True) ndarray[tuple[int, ...], dtype[_ScalarType_co]] | Tuple[ndarray[tuple[int, ...], dtype[_ScalarType_co]], ndarray[tuple[int, ...], dtype[_ScalarType_co]]][source]

Merge data with an amount of overlap.

The opposite function is crop_data_with_overlap().

Parameters:
  • data (4D Numpy array) – Data to merge. E.g. (num_of_images, y, x, channels).

  • original_shape (tuple of 4 ints) – Shape of the original data. E.g. (num_of_images, y, x, channels).

  • data_mask (4D Numpy array, optional) – Data mask to merge. E.g. (num_of_images, y, x, channels).

  • overlap (Tuple of 2 floats, optional) – Minimum amount of overlap on the y and x dimensions. Should be the same as used in crop_data_with_overlap(). The values must be in the range [0, 1), that is, from 0% to 99% overlap. E.g. (y, x).

  • padding (tuple of ints, optional) – Size of padding to be added on each axis (y, x). E.g. (24, 24).

  • verbose (bool, optional) – To print information about the merge to be made.

  • out_dir (str, optional) – If provided, an image that represents the overlap made will be saved. The image will be colored as follows: green where exactly 2 crops overlap, yellow where more than 2 and fewer than 6 crops overlap, and red where 6 or more crops are merged.

  • prefix (str, optional) – Prefix to save overlap map with.

Returns:

  • merged_data (4D Numpy array) – Merged image data. E.g. (num_of_images, y, x, channels).

  • merged_data_mask (4D Numpy array, optional) – Merged image data mask. E.g. (num_of_images, y, x, channels).
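
The overlap map described for out_dir can be pictured as a per-pixel count of how many crops cover each position. The sketch below is a hypothetical illustration, not BiaPy's code: it reads the colour rule as green for exactly 2 crops, yellow for 3 to 5, and red for 6 or more, uses dict-style coordinates with exclusive end points (an assumption), and returns labels instead of saving an image.

```python
import numpy as np

def coverage_classes(shape_yx, coords):
    """Count how many crops cover each pixel and label the overlap level."""
    counts = np.zeros(shape_yx, dtype=np.int32)
    for c in coords:
        counts[c["y_start"]:c["y_end"], c["x_start"]:c["x_end"]] += 1
    classes = np.full(shape_yx, "none", dtype="<U6")
    classes[counts == 2] = "green"
    classes[(counts > 2) & (counts < 6)] = "yellow"
    classes[counts >= 6] = "red"
    return counts, classes

# Four 3x3 crops on a 4x4 image, shifted by one pixel in each direction.
coords = [
    {"y_start": y, "y_end": y + 3, "x_start": x, "x_end": x + 3}
    for y in (0, 1) for x in (0, 1)
]
counts, classes = coverage_classes((4, 4), coords)
```

In this toy layout the corners are covered once, the remaining border pixels twice, and the central 2x2 block four times.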

biapy.data.data_2D_manipulation.ensure_2d_shape(img: ndarray[tuple[int, ...], dtype[_ScalarType_co]], path: str | None = None) ndarray[tuple[int, ...], dtype[_ScalarType_co]][source]

Ensure that an already-read image has a 2D image shape, i.e. (y, x, channels).

Parameters:
  • img (ndarray) – Image read.

  • path (str, optional) – Path of the image (used only to print possible errors).

Returns:

img – Image read. E.g. (y, x, channels).

Return type:

Numpy 3D array
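
As a rough picture of this kind of shape normalisation, the sketch below adds a channel axis to a plain (y, x) image and rejects anything that is not 2D or 3D. The function name `to_yxc` is hypothetical and the rules are assumptions. The real ensure_2d_shape() may apply further checks, for example on channel-first inputs, which are not covered here.

```python
import numpy as np

def to_yxc(img, path=None):
    """Return `img` as (y, x, channels); `path` is only used in the error message."""
    if img.ndim == 2:
        return img[..., np.newaxis]  # (y, x) -> (y, x, 1)
    if img.ndim == 3:
        return img                   # assumed to be (y, x, channels) already
    where = f" ({path})" if path else ""
    raise ValueError(f"Expected a 2D or 3D image{where}, got shape {img.shape}")

gray = to_yxc(np.zeros((4, 5)))
rgb = to_yxc(np.zeros((4, 5, 3)))
```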