ct.image

Functions for manipulating images.

ct.image.crop_white_borders(im, padding=(0, 0, 0, 0))

Crop white borders from an image and apply optional padding.

Parameters:
  • im (Float[ndarray, 'h w 3']) – Input float image in range [0.0, 1.0].

  • padding (Tuple[int, int, int, int]) – Padding to apply after cropping in the format (top, bottom, left, right). Defaults to (0, 0, 0, 0).

Returns:

Cropped and padded image.

Return type:

Float[ndarray, 'h_cropped w_cropped 3']

ct.image.compute_cropping(im)

Compute white border sizes in pixels for 3-channel RGB images.

This function calculates the number of white pixels on each edge of a 3-channel RGB image. White pixels are defined as having values of (1.0, 1.0, 1.0).

Parameters:

im (Float[ndarray, 'h w 3']) – Input float image in range [0.0, 1.0].

Returns:

  • crop_t: Number of white pixels on the top edge

  • crop_b: Number of white pixels on the bottom edge

  • crop_l: Number of white pixels on the left edge

  • crop_r: Number of white pixels on the right edge

Return type:

Tuple[int, int, int, int]

Raises:

ValueError – If input image has invalid dtype or dimensions.
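
The border computation described above can be sketched in plain NumPy. This is an illustrative sketch, not the library's implementation: the helper name compute_white_borders is hypothetical, and it assumes that an edge's border size is the number of consecutive all-white rows or columns starting from that edge.

```python
import numpy as np


def compute_white_borders(im):
    """Count consecutive all-white rows/columns on each edge (hypothetical helper)."""
    if im.ndim != 3 or im.shape[2] != 3:
        raise ValueError("expected an image of shape (h, w, 3)")

    is_white = np.all(im == 1.0, axis=2)  # (h, w): True where the pixel is (1, 1, 1)
    rows_white = is_white.all(axis=1)     # (h,): True where the whole row is white
    cols_white = is_white.all(axis=0)     # (w,): True where the whole column is white

    def leading_true(flags):
        # Number of consecutive True values at the start of `flags`.
        non_white = np.flatnonzero(~flags)
        return len(flags) if non_white.size == 0 else int(non_white[0])

    crop_t = leading_true(rows_white)
    crop_b = leading_true(rows_white[::-1])
    crop_l = leading_true(cols_white)
    crop_r = leading_true(cols_white[::-1])
    return crop_t, crop_b, crop_l, crop_r


# White 6x8 image with a black block in rows 2..3 and columns 1..4.
im = np.ones((6, 8, 3), dtype=np.float32)
im[2:4, 1:5] = 0.0
borders = compute_white_borders(im)  # (2, 2, 1, 3)
```

The sketch does not special-case a fully white image; how the library handles that input is not stated here.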

ct.image.apply_cropping_padding(im_src, cropping, padding)

Apply cropping and padding to an RGB image.

Parameters:
  • im_src (Float[ndarray, 'h w 3']) – Source float image in range [0.0, 1.0].

  • cropping (Tuple[int, int, int, int]) – Cropping values in the format (crop_top, crop_bottom, crop_left, crop_right).

  • padding (Tuple[int, int, int, int]) – Padding values in the format (pad_top, pad_bottom, pad_left, pad_right).

Returns:

Cropped and padded image.

Raises:

ValueError – If input image has invalid dtype or dimensions.

Return type:

Float[ndarray, 'h_cropped w_cropped 3']
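
A minimal NumPy sketch of cropping followed by padding, using the (top, bottom, left, right) tuple order documented above. The helper name crop_then_pad and the white fill value are assumptions made for illustration; the padding color used by the library is not stated in this reference.

```python
import numpy as np


def crop_then_pad(im, cropping, padding, fill=1.0):
    """Crop, then pad, an (h, w, 3) image (hypothetical helper; `fill` is assumed)."""
    crop_t, crop_b, crop_l, crop_r = cropping
    pad_t, pad_b, pad_l, pad_r = padding
    h, w = im.shape[:2]

    # Slice away the requested number of rows/columns on each edge.
    cropped = im[crop_t:h - crop_b, crop_l:w - crop_r]

    # Pad height and width only; the channel axis is left untouched.
    return np.pad(
        cropped,
        ((pad_t, pad_b), (pad_l, pad_r), (0, 0)),
        mode="constant",
        constant_values=fill,
    )


im = np.zeros((6, 8, 3), dtype=np.float32)
out = crop_then_pad(im, cropping=(2, 2, 1, 3), padding=(1, 1, 2, 2))
# Cropped size is 2 x 4; padded size is 4 x 8.
```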

ct.image.apply_croppings_paddings(src_ims, croppings, paddings)

Apply cropping and padding to a list of RGB images.

Parameters:
  • src_ims (List[Float[ndarray, 'h w 3']]) – List of source images.

  • croppings (List[Tuple[int, int, int, int]]) – List of 4-tuples: [(crop_t, crop_b, crop_l, crop_r), …]

  • paddings (List[Tuple[int, int, int, int]]) – List of 4-tuples: [(pad_t, pad_b, pad_l, pad_r), …]

Returns:

List of cropped and padded images.

Raises:

ValueError – If the number of croppings or paddings does not match the number of images, or if any cropping tuple has invalid length.

Return type:

List[Float[ndarray, 'h_cropped w_cropped 3']]

ct.image.get_post_croppings_paddings_shapes(src_shapes, croppings, paddings)

Compute the shapes of images after applying cropping and padding.

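Parameters:
  • src_shapes – List of source image shapes, one per image, in the format (height, width, channels).

  • croppings – List of 4-tuples: [(crop_t, crop_b, crop_l, crop_r), …]

  • paddings – List of 4-tuples: [(pad_t, pad_b, pad_l, pad_r), …]

Returns: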

List of resulting image shapes in format (height_cropped, width_cropped, channels).

Return type:

List[Tuple[int, int, int]]
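
The resulting shape follows from simple arithmetic: each dimension shrinks by its two crop values and grows by its two pad values. The sketch below illustrates that arithmetic in pure Python; the helper name post_crop_pad_shape is hypothetical and the (height, width, channels) input format is assumed to match the documented output format.

```python
def post_crop_pad_shape(src_shape, cropping, padding):
    """Shape after cropping then padding (hypothetical helper)."""
    h, w, c = src_shape
    crop_t, crop_b, crop_l, crop_r = cropping
    pad_t, pad_b, pad_l, pad_r = padding
    return (
        h - crop_t - crop_b + pad_t + pad_b,
        w - crop_l - crop_r + pad_l + pad_r,
        c,
    )


src_shapes = [(6, 8, 3), (100, 200, 3)]
croppings = [(2, 2, 1, 3), (10, 20, 30, 40)]
paddings = [(1, 1, 2, 2), (0, 0, 5, 5)]

dst_shapes = [
    post_crop_pad_shape(s, cr, pa)
    for s, cr, pa in zip(src_shapes, croppings, paddings)
]
# dst_shapes == [(4, 8, 3), (70, 140, 3)]
```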

ct.image.overlay_mask_on_rgb(im_rgb, im_mask, overlay_alpha=0.4, overlay_color=array([0, 0, 1]))

Overlay a mask on top of an RGB image with specified transparency and color.

Parameters:
  • im_rgb (Float[ndarray, 'h w 3']) – RGB image in range [0.0, 1.0].

  • im_mask (Float[ndarray, 'h w']) – Mask image in range [0.0, 1.0].

  • overlay_alpha (float) – Transparency level for the overlay, in range [0.0, 1.0]. Defaults to 0.4.

  • overlay_color (Float[ndarray, '3']) – Color for the overlay as a 3-channel array in range [0.0, 1.0]. Defaults to blue.

Returns:

Resulting image with mask overlay applied.

Raises:

AssertionError – If input images have invalid shapes, dtypes, or value ranges.

Return type:

Float[ndarray, 'h w 3']
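
One common way to overlay a mask is per-pixel alpha blending, sketched below in NumPy. The exact blending formula used by the library is not given in this reference, so the formula here (per-pixel weight = overlay_alpha * mask) and the helper name blend_mask are assumptions.

```python
import numpy as np


def blend_mask(im_rgb, im_mask, overlay_alpha=0.4, overlay_color=(0.0, 0.0, 1.0)):
    """Alpha-blend a color over an RGB image where the mask is set (assumed formula)."""
    color = np.asarray(overlay_color, dtype=im_rgb.dtype)
    # Per-pixel blend weight, shaped (h, w, 1) so it broadcasts over channels.
    weight = (overlay_alpha * im_mask)[..., None]
    return im_rgb * (1.0 - weight) + color * weight


im_rgb = np.full((2, 2, 3), 0.5, dtype=np.float32)
im_mask = np.array([[0.0, 1.0], [0.0, 0.0]], dtype=np.float32)
out = blend_mask(im_rgb, im_mask)
# Unmasked pixels stay (0.5, 0.5, 0.5); the masked pixel becomes (0.3, 0.3, 0.7).
```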

ct.image.ndc_coords_to_pixels(ndc_coords, im_size_wh, align_corners=False)

Convert Normalized Device Coordinates (NDC) to pixel coordinates.

Parameters:
  • ndc_coords (Float[ndarray, 'n 2']) – NDC coordinates. Each row represents (x, y) or (c, r). Values typically lie in [-1, 1], where (-1, -1) is the top-left corner and (1, 1) is the bottom-right corner.

  • im_size_wh (Tuple[int, int]) – Image size (width, height).

  • align_corners (bool) – Determines how NDC coordinates map to pixel coordinates. If True: -1 and 1 are aligned to the center of the corner pixels. If False: -1 and 1 are aligned to the corner of the corner pixels.

Returns:

Pixel coordinates as a float array.

Return type:

Float[ndarray, 'n 2']

Notes

This function is commonly used in computer graphics to map normalized coordinates to specific pixel locations in an image.

When align_corners is True, src and dst images are aligned by the center point of their corner pixels; when align_corners is False, src and dst images are aligned by the corner points of the corner pixels.

The NDC space does not have a “pixel size”, so we precisely align the extrema -1 and 1 to either the center or corner of the corner pixels.
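
The two alignment modes can be written as linear maps. The sketch below assumes that pixel centers sit at integer coordinates, so the top-left pixel's center is (0, 0) and its outer corner is (-0.5, -0.5); the helper name ndc_to_pixels is hypothetical and this is not necessarily how the library computes the result.

```python
import numpy as np


def ndc_to_pixels(ndc_coords, im_size_wh, align_corners=False):
    """Map NDC in [-1, 1] to pixel coordinates (hypothetical helper)."""
    ndc = np.asarray(ndc_coords, dtype=np.float64)
    size = np.array(im_size_wh, dtype=np.float64)  # (w, h)
    unit = (ndc + 1.0) / 2.0                       # [-1, 1] -> [0, 1]
    if align_corners:
        # -1 -> center of first pixel (0), 1 -> center of last pixel (size - 1).
        return unit * (size - 1.0)
    # -1 -> outer corner of first pixel (-0.5), 1 -> outer corner of last (size - 0.5).
    return unit * size - 0.5


ndc = np.array([[-1.0, -1.0], [0.0, 0.0], [1.0, 1.0]])
px_corner = ndc_to_pixels(ndc, (4, 2), align_corners=False)
px_center = ndc_to_pixels(ndc, (4, 2), align_corners=True)
```

For a 4x2 image, the NDC origin maps to (1.5, 0.5) in both modes; only the extrema differ.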

ct.image.rotate(im, ccw_degrees)

Rotate an image by a specified counter-clockwise angle.

Parameters:
  • im (Float[ndarray, 'h w c']) – Input image.

  • ccw_degrees (int) – Counter-clockwise rotation angle in degrees. Must be one of: 0, 90, 180, or 270.

Returns:

Rotated image. The shape depends on the rotation angle:

  • 0 or 180 degrees: (height, width, channels)

  • 90 or 270 degrees: (width, height, channels)

Raises:

ValueError – If ccw_degrees is not one of the allowed values.
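
Rotation by multiples of 90 degrees can be sketched with NumPy's np.rot90, which rotates counter-clockwise in the plane of the first two axes. The helper name rotate_ccw is hypothetical; this is an illustration of the documented behavior, not the library's code.

```python
import numpy as np


def rotate_ccw(im, ccw_degrees):
    """Rotate an (h, w, c) image counter-clockwise by 0/90/180/270 degrees."""
    if ccw_degrees not in (0, 90, 180, 270):
        raise ValueError(f"ccw_degrees must be 0, 90, 180, or 270, got {ccw_degrees}")
    # k quarter-turns counter-clockwise over the (row, col) axes.
    return np.rot90(im, k=ccw_degrees // 90)


im = np.arange(6, dtype=np.float32).reshape(2, 3, 1)  # h=2, w=3
rot90 = rotate_ccw(im, 90)    # shape (3, 2, 1)
rot180 = rotate_ccw(im, 180)  # shape (2, 3, 1)
```

With a 90-degree counter-clockwise turn, the top-right source pixel ends up at the top-left of the result.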

ct.image.recover_rotated_pixels(dst_pixels, src_wh, ccw_degrees)

Convert pixel coordinates from a rotated image back to the original image space.

Parameters:
  • dst_pixels – Pixel coordinates in the rotated image. Each row is (col, row).

  • src_wh – Width and height of the original image.

  • ccw_degrees – Counter-clockwise rotation angle in degrees that was applied to create the rotated image. Must be one of: 0, 90, 180, or 270.

Returns:

Pixel coordinates in the original image space.

Raises:

ValueError – If ccw_degrees is not one of the allowed values.

Notes

This function is the inverse operation of image rotation. It maps coordinates from the rotated image back to the original image space.
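
The inverse mapping can be derived per angle. The sketch below assumes the rotation was done as np.rot90 does it and that pixel centers sit at integer coordinates; the helper name recover_rotated is hypothetical.

```python
import numpy as np


def recover_rotated(dst_pixels, src_wh, ccw_degrees):
    """Map (col, row) pixels in a rotated image back to the source image."""
    w, h = src_wh
    dst = np.asarray(dst_pixels)
    c, r = dst[:, 0], dst[:, 1]
    if ccw_degrees == 0:
        src_c, src_r = c, r
    elif ccw_degrees == 90:
        src_c, src_r = w - 1 - r, c
    elif ccw_degrees == 180:
        src_c, src_r = w - 1 - c, h - 1 - r
    elif ccw_degrees == 270:
        src_c, src_r = r, h - 1 - c
    else:
        raise ValueError(f"ccw_degrees must be 0, 90, 180, or 270, got {ccw_degrees}")
    return np.stack([src_c, src_r], axis=1)


# Round-trip check: every pixel of the rotated image maps to the source pixel
# holding the same value.
src = np.arange(12).reshape(3, 4)  # h=3, w=4
dst = np.rot90(src, k=1)           # 90 degrees counter-clockwise
rows, cols = np.mgrid[0:dst.shape[0], 0:dst.shape[1]]
dst_pixels = np.stack([cols.ravel(), rows.ravel()], axis=1)
src_pixels = recover_rotated(dst_pixels, src_wh=(4, 3), ccw_degrees=90)
same = np.array_equal(src[src_pixels[:, 1], src_pixels[:, 0]], dst.ravel())
```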

ct.image.resize(im, shape_wh, aspect_ratio_fill=None, interpolation=1)

Resize an image to a specified width and height, optionally maintaining aspect ratio.

Parameters:
  • im (Float[ndarray, 'h_src w_src'] | Float[ndarray, 'h_src w_src 3'] | UInt8[ndarray, 'h_src w_src'] | UInt8[ndarray, 'h_src w_src 3'] | UInt16[ndarray, 'h_src w_src'] | UInt16[ndarray, 'h_src w_src 3']) – Input image.

  • shape_wh (Tuple[int, int]) – Target size as (width, height) in pixels.

  • aspect_ratio_fill (float | Tuple[float, float, float] | ndarray | None) – Value(s) to use for padding when maintaining aspect ratio. If None, image is directly resized without maintaining aspect ratio. If provided, must match the number of channels in the input image.

  • interpolation (int) – OpenCV interpolation method (e.g., cv2.INTER_LINEAR).

Returns:

Resized image.

Return type:

Float[ndarray, 'h_dst w_dst'] | Float[ndarray, 'h_dst w_dst 3'] | UInt8[ndarray, 'h_dst w_dst'] | UInt8[ndarray, 'h_dst w_dst 3'] | UInt16[ndarray, 'h_dst w_dst'] | UInt16[ndarray, 'h_dst w_dst 3']

Notes

  • When maintaining aspect ratio, the image is resized to fit within the target dimensions and padded with aspect_ratio_fill values as needed.

  • OpenCV uses (width, height) for image size while numpy uses (height, width).

ct.image.recover_resized_pixels(dst_pixels, src_wh, dst_wh, keep_aspect_ratio=True)

Convert pixel coordinates from a resized image back to the original image space.

Parameters:
  • dst_pixels (Float[ndarray, 'n 2']) – Pixel coordinates in the resized image. Each row is (col, row).

  • src_wh (Tuple[int, int]) – Width and height of the original image.

  • dst_wh (Tuple[int, int]) – Width and height of the resized image.

  • keep_aspect_ratio (bool) – Whether aspect ratio was maintained during resizing. If True, accounts for any padding that was added to maintain aspect ratio.

Returns:

Pixel coordinates in the original image space.

Return type:

Float[ndarray, 'n 2']

Notes

  1. This function is paired with OpenCV’s cv2.resize() function, where the center of the top-left pixel is considered to be (0, 0).

    • Top-left corner: (-0.5, -0.5)

    • Bottom-right corner: (w - 0.5, h - 0.5)

    However, most other implementations in computer graphics treat the corner of the top-left pixel as (0, 0). For further discussion, see: https://www.realtimerendering.com/blog/the-center-of-the-pixel-is-0-50-5/

  2. OpenCV’s image size is (width, height), while numpy’s array shape is (height, width) or (height, width, 3). Be careful with the order.

  3. This function is the inverse operation of image resizing.

  4. Coordinates are not rounded to integers and out-of-bound values are not corrected.
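
Under the pixel-center convention in note 1, the inverse of a plain resize is a scale about the image's outer corner at (-0.5, -0.5). The sketch below covers only the case without aspect-ratio padding, since how padding is laid out is not specified here; the helper name recover_resized_no_padding is hypothetical.

```python
import numpy as np


def recover_resized_no_padding(dst_pixels, src_wh, dst_wh):
    """Map (col, row) pixels from a directly resized image back to the source."""
    scale = np.asarray(src_wh, dtype=np.float64) / np.asarray(dst_wh, dtype=np.float64)
    dst = np.asarray(dst_pixels, dtype=np.float64)
    # Shift so the outer corner is the origin, scale, then shift back.
    return (dst + 0.5) * scale - 0.5


dst_pixels = np.array([[-0.5, -0.5], [0.0, 0.0], [3.5, 1.5]])
src_pixels = recover_resized_no_padding(dst_pixels, src_wh=(8, 4), dst_wh=(4, 2))
# Corners map to corners: (-0.5, -0.5) stays put and (3.5, 1.5) maps to (7.5, 3.5).
```

As in note 4, the result is left as floats and is not clipped to the image bounds.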

ct.image.make_corres_image(im_src, im_dst, src_pixels, dst_pixels, confidences=None, texts=None, point_color=(0, 1, 0, 1.0), line_color=(0, 0, 1, 0.75), text_color=(1, 1, 1), point_size=1, line_width=1, sample_ratio=None)

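Create a correspondence image that places the source and destination images side by side and marks the corresponding pixels with points and connecting lines.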

Parameters:
  • im_src (Float[ndarray, 'h w 3']) – Source image in range [0, 1].

  • im_dst (Float[ndarray, 'h w 3']) – Destination image in range [0, 1].

  • src_pixels (Int[ndarray, 'n 2']) – Source pixel coordinates. Each row represents (x, y) or (c, r).

  • dst_pixels (Int[ndarray, 'n 2']) – Destination pixel coordinates. Each row represents (x, y) or (c, r).

  • confidences (Float[ndarray, 'n'] | None) – Confidence values for each correspondence in range [0, 1].

  • texts (List[str] | None) – List of texts to draw on the top-left of the image.

  • point_color (Tuple[float, ...] | None) –

    RGB or RGBA color of the point in range [0, 1].

    • If point_color is None: points are never drawn.

    • If point_color is not None and confidences is None: the point color is determined by point_color.

    • If point_color is not None and confidences is not None: the point color is determined by the "viridis" colormap.

  • line_color (Tuple[float, ...] | None) – RGB or RGBA color of the line in range [0, 1].

  • text_color (Tuple[float, float, float]) – RGB color of the text in range [0, 1].

  • point_size (int) – Size of the point.

  • line_width (int) – Width of the line.

  • sample_ratio (float | None) – Ratio of correspondences to sample for drawing, as a float in range [0, 1]. If None, all points are drawn.

Returns:

Correspondence image.

Return type:

Float[ndarray, 'h 2*w 3']
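
The 'h 2*w 3' return shape suggests a side-by-side layout. The sketch below shows only the geometry of such a layout: it assumes the destination image sits to the right of the source image, so destination pixels are shifted by the source width. The helper name side_by_side is hypothetical, and drawing of points, lines, and text is omitted.

```python
import numpy as np


def side_by_side(im_src, im_dst, src_pixels, dst_pixels):
    """Build the (h, 2*w, 3) canvas and line endpoints in canvas coordinates."""
    assert im_src.shape == im_dst.shape, "both images must share a shape"
    w = im_src.shape[1]
    canvas = np.hstack([im_src, im_dst])  # (h, 2*w, 3)

    # Destination (x, y) pixels move right by the source width.
    dst_on_canvas = np.asarray(dst_pixels) + np.array([w, 0])

    # One segment per correspondence: [[x_src, y_src], [x_dst + w, y_dst]].
    segments = np.stack([np.asarray(src_pixels), dst_on_canvas], axis=1)
    return canvas, segments


im_src = np.zeros((2, 3, 3), dtype=np.float32)
im_dst = np.ones((2, 3, 3), dtype=np.float32)
src_pixels = np.array([[0, 0], [2, 1]])
dst_pixels = np.array([[1, 1], [0, 0]])
canvas, segments = side_by_side(im_src, im_dst, src_pixels, dst_pixels)
```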

ct.image.vstack_images(ims, alignment='left', background_color=(1.0, 1.0, 1.0))

Vertically stack multiple images with optional alignment and background color.

Parameters:
  • ims (List[Float[ndarray, 'h w 3']]) – List of RGB images in range [0.0, 1.0].

  • alignment (Literal['left', 'center', 'right']) –

    Horizontal alignment of images in the stack. Must be one of:

    • "left": Align images to the left

    • "center": Center align images

    • "right": Align images to the right

    Defaults to "left".

  • background_color (Tuple[float, float, float]) – Background color for the stacked image as (R, G, B) values in range [0.0, 1.0]. Defaults to white (1.0, 1.0, 1.0).

Returns:

Stacked image.

Raises:

ValueError – If images have invalid shapes, dtypes, or value ranges, or if alignment is not one of the allowed values.

Return type:

Float[ndarray, 'h_stacked w_stacked 3']
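
A minimal NumPy sketch of vertical stacking with horizontal alignment: each image is placed on a background row as wide as the widest image, then the rows are concatenated. The helper name vstack_aligned is hypothetical, and rounding the center offset down is an assumption.

```python
import numpy as np


def vstack_aligned(ims, alignment="left", background_color=(1.0, 1.0, 1.0)):
    """Stack (h, w, 3) images vertically on a common-width background."""
    if alignment not in ("left", "center", "right"):
        raise ValueError(f"invalid alignment: {alignment}")

    max_w = max(im.shape[1] for im in ims)
    rows = []
    for im in ims:
        h, w = im.shape[:2]
        # Background strip for this image, filled with the background color.
        row = np.empty((h, max_w, 3), dtype=im.dtype)
        row[:] = background_color
        if alignment == "left":
            x0 = 0
        elif alignment == "center":
            x0 = (max_w - w) // 2  # assumed: extra pixel goes to the right
        else:
            x0 = max_w - w
        row[:, x0:x0 + w] = im
        rows.append(row)
    return np.concatenate(rows, axis=0)


ims = [np.zeros((2, 4, 3), dtype=np.float32), np.zeros((1, 2, 3), dtype=np.float32)]
stacked_right = vstack_aligned(ims, alignment="right")
stacked_center = vstack_aligned(ims, alignment="center")
```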