ct.project

Functions for projecting 2D->3D or 3D->2D.

ct.project.points_to_pixels(points, K, T)

Project 3D points in world coordinates to 2D pixel coordinates using the camera intrinsic and extrinsic parameters.

Parameters:

points (Float[ndarray, 'n 3']) – (N, 3) array of 3D points in world coordinates.
K (Float[ndarray, '3 3']) – (3, 3) camera intrinsic matrix.
T (Float[ndarray, '4 4']) – (4, 4) camera extrinsic matrix (world-to-camera transformation).

Returns:

(N, 2) array of pixel coordinates, where each row contains [x, y] coordinates. The x-coordinate corresponds to the image width (columns) and the y-coordinate corresponds to the image height (rows).

Return type:

Float[ndarray, 'n 2']

Examples

import numpy as np
import camtools as ct

# points: (N, 3) world points; K, T: camera intrinsics and extrinsics.
# width, height: image size in pixels.
pixels = ct.project.points_to_pixels(points, K, T)

# Extract and round pixel coordinates
cols = np.round(pixels[:, 0]).astype(np.int32)  # x-coordinates (width dimension)
rows = np.round(pixels[:, 1]).astype(np.int32)  # y-coordinates (height dimension)

# Clamp to image boundaries
cols = np.clip(cols, 0, width - 1)
rows = np.clip(rows, 0, height - 1)
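
For intuition, the projection described above can be sketched in plain NumPy. This is a minimal sketch of the standard pinhole model, not necessarily how ct.project.points_to_pixels is implemented. The helper name project_points_sketch and the sample K, T, and points are hypothetical.

```python
import numpy as np


def project_points_sketch(points, K, T):
    """Hypothetical helper: pinhole projection of (N, 3) world points to (N, 2) pixels."""
    # Homogeneous world coordinates, shape (N, 4).
    points_h = np.hstack([points, np.ones((len(points), 1))])
    # World -> camera frame using the top three rows of T, shape (N, 3).
    points_cam = points_h @ T[:3, :].T
    # Apply the intrinsics, then divide by the third (depth) component.
    uvw = points_cam @ K.T
    return uvw[:, :2] / uvw[:, 2:3]


# Assumed sample camera: focal length 500 px, principal point (320, 240).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
T = np.eye(4)  # camera at the world origin, looking down +z
points = np.array([[0.0, 0.0, 2.0],
                   [1.0, -0.5, 4.0]])

pixels = project_points_sketch(points, K, T)
print(pixels)  # each row is [x, y]
```

A point on the optical axis, such as the first sample point, lands on the principal point. The division by depth means points with zero or negative camera-frame z do not give meaningful pixel coordinates and should be filtered out beforehand.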

ct.project.points_to_depths(points, T)

Convert 3D points in world coordinates to z-depths in camera coordinates.

Parameters:

points (Float[ndarray, 'n 3']) – (N, 3) array of 3D points in world coordinates.
T (Float[ndarray, '4 4']) – (4, 4) camera extrinsic matrix (world-to-camera transformation).

Returns:

(N,) array of z-depths in camera coordinates. Positive values indicate points in front of the camera, negative values indicate points behind the camera.

Return type:

Float[ndarray, 'n']

Note: The returned depth is the z-coordinate of each point in the camera frame, not the Euclidean distance from the point to the camera center.
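
The difference between z-depth and distance to the camera center can be illustrated with a short NumPy sketch. It assumes the usual world-to-camera convention for T (rotation in the top-left 3x3 block, translation in the last column). The helper name to_camera_frame and the sample values are hypothetical and not part of ct.project.

```python
import numpy as np


def to_camera_frame(points, T):
    """Hypothetical helper: transform (N, 3) world points into the camera frame."""
    R, t = T[:3, :3], T[:3, 3]
    return points @ R.T + t


# Assumed sample extrinsics: no rotation, translation of 1 along z.
T = np.eye(4)
T[2, 3] = 1.0
points = np.array([[3.0, 0.0, 3.0],
                   [0.0, 0.0, -2.0]])

points_cam = to_camera_frame(points, T)
z_depths = points_cam[:, 2]                     # signed depth along the optical axis
distances = np.linalg.norm(points_cam, axis=1)  # distance to the camera center

print(z_depths)
print(distances)
```

The first point sits off the optical axis, so its distance is larger than its z-depth. The second point is behind the camera, so its z-depth is negative while its distance stays positive.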

ct.project.im_depth_to_point_cloud(im_depth, K, T, im_color=None, to_image=False, ignore_invalid=True, scale_factor=1.0)

Convert a depth image to a 3D point cloud in world coordinates, optionally including color information. The point cloud can be returned in either a sparse format (N, 3) or a dense format matching the input image dimensions (H, W, 3).

Parameters:

im_depth (Float[ndarray, 'h w']) – (H, W) depth image in world scale, float32 or float64.
K (Float[ndarray, '3 3']) – (3, 3) camera intrinsic matrix.
T (Float[ndarray, '4 4']) – (4, 4) camera extrinsic matrix (world-to-camera transformation).
im_color (Float[ndarray, 'h w 3'] | None) – Optional (H, W, 3) color image in range [0, 1], float32/float64.
to_image (bool) – If True, returns a dense point cloud with shape (H, W, 3). If False, returns a sparse point cloud with shape (N, 3).
ignore_invalid (bool) – If True, filters out points with invalid depths (non-positive or infinite).
scale_factor (float) – Scaling factor for the input images. When scale_factor < 1, the images are downsampled and the intrinsic matrix is adjusted accordingly.

Returns:

im_color == None, to_image == False: returns (N, 3) array of 3D points
im_color == None, to_image == True: returns (H, W, 3) array of 3D points
im_color != None, to_image == False: returns (N, 3) array of 3D points and (N, 3) array of colors
im_color != None, to_image == True: returns (H, W, 3) array of 3D points and (H, W, 3) array of colors

Return type:

Single array or a tuple of two arrays
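
The sparse case (im_color is None, to_image is False) can be sketched as a standard pinhole unprojection. This is a minimal sketch under stated assumptions, not the library's implementation: it assumes im_depth stores z-depth, that integer pixel coordinates are used directly as [x, y], and that invalid depths are dropped. The helper name depth_to_points_sketch and the sample inputs are hypothetical.

```python
import numpy as np


def depth_to_points_sketch(im_depth, K, T):
    """Hypothetical helper: unproject an (H, W) z-depth image to (N, 3) world points."""
    height, width = im_depth.shape
    # Pixel grid: cols are x (width dimension), rows are y (height dimension).
    cols, rows = np.meshgrid(np.arange(width), np.arange(height))
    # Keep only positive, finite depths.
    valid = (im_depth > 0) & np.isfinite(im_depth)
    z = im_depth[valid]
    # Homogeneous pixel coordinates [x, y, 1], shape (N, 3).
    pixels_h = np.stack([cols[valid], rows[valid], np.ones_like(z)], axis=1)
    # Back-project through the inverse intrinsics, then scale each ray by its depth.
    points_cam = (pixels_h @ np.linalg.inv(K).T) * z[:, None]
    # Camera -> world is the inverse of the world-to-camera matrix T.
    T_inv = np.linalg.inv(T)
    return points_cam @ T_inv[:3, :3].T + T_inv[:3, 3]


# Assumed sample inputs: a 2x2 depth image with two invalid pixels.
K = np.array([[2.0, 0.0, 0.5],
              [0.0, 2.0, 0.5],
              [0.0, 0.0, 1.0]])
T = np.eye(4)
im_depth = np.array([[2.0, 0.0],
                     [np.inf, 4.0]])

points = depth_to_points_sketch(im_depth, K, T)
print(points.shape)
```

Only the two valid pixels survive, so the result has two rows. A dense (H, W, 3) variant would keep the full grid instead of masking, which is why invalid pixels still occupy a slot in that layout.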