ct.render

Functions for controlled rendering of 3D geometries to images or depth images.

ct.render.render_geometries(geometries, K=None, T=None, view_status_str=None, height=720, width=1280, point_size=1.0, line_radius=None, to_depth=False, visible=False)

Render Open3D geometries to an image using the specified camera parameters. This function may require a display.

Parameters:

geometries (List[Geometry3D]) – List of Open3D geometries to render. Supported types are TriangleMesh, PointCloud, and LineSet.
K (Float[ndarray, '3 3'] | None) – Camera intrinsic matrix. If None, uses Open3D’s default camera inferred from the geometries. Must be provided if T is provided.
T (Float[ndarray, '4 4'] | None) – Camera extrinsic matrix (world-to-camera transformation). If None, uses Open3D’s default camera inferred from the geometries. Must be provided if K is provided.
view_status_str (str | None) – JSON string containing viewing camera parameters from o3d.visualization.Visualizer.get_view_status(). This does not include window size or point size.
height (int) – Height of the output image in pixels.
width (int) – Width of the output image in pixels.
point_size (float) – Size of points for PointCloud objects, in pixels.
line_radius (float | None) – Radius of lines for LineSet objects, in world units. When set, LineSets are converted to cylinder meshes with this radius. Unlike point_size, this is in world metric space, not pixel space.
to_depth (bool) – If True, renders a depth image instead of RGB. Invalid depths are set to 0.
visible (bool) – If True, shows the rendering window.

Returns:

If to_depth is False: (H, W, 3) float32 RGB image array with values in [0, 1].
If to_depth is True: (H, W) float32 depth image array with depth values in world units.

Return type:

Float[np.ndarray, 'h w 3'] if to_depth is False, Float[np.ndarray, 'h w'] if to_depth is True
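
The return contract above (float32 RGB in [0, 1], or float32 depth with 0 marking invalid pixels) can be handled as in the following minimal sketch. It runs on synthetic arrays that stand in for the outputs of render_geometries. The helper names to_uint8 and valid_depth_mask are hypothetical and not part of ct.render.

```python
import numpy as np


def to_uint8(image: np.ndarray) -> np.ndarray:
    """Convert a float image with values in [0, 1] to uint8 in [0, 255]."""
    return np.round(np.clip(image, 0.0, 1.0) * 255.0).astype(np.uint8)


def valid_depth_mask(depth: np.ndarray) -> np.ndarray:
    """Return True where depth is valid, i.e. not the 0 used for invalid pixels."""
    return depth > 0


# Synthetic stand-ins for render_geometries outputs (assumed shapes and dtypes).
rgb = np.full((4, 6, 3), 0.5, dtype=np.float32)
depth = np.zeros((4, 6), dtype=np.float32)
depth[1:3, 2:5] = 2.0  # a 2x3 patch of valid depth

rgb_u8 = to_uint8(rgb)
mask = valid_depth_mask(depth)
mean_valid_depth = float(depth[mask].mean())
```

Masking before taking statistics keeps the zero-valued invalid pixels from biasing the mean toward the camera.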

Examples

import numpy as np
import open3d as o3d
import camtools as ct

# Create some geometries
mesh = o3d.geometry.TriangleMesh.create_box()
pcd = o3d.geometry.PointCloud()
pcd.points = o3d.utility.Vector3dVector(np.random.rand(100, 3))

# Render with default camera
image = ct.render.render_geometries([mesh, pcd])

# Render with specific camera parameters (K and T must be given together)
K = np.array([[1000.0, 0.0, 640.0], [0.0, 1000.0, 360.0], [0.0, 0.0, 1.0]])
T = np.eye(4)
depth_image = ct.render.render_geometries([mesh], K=K, T=T, to_depth=True)
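
When K and T are not already known, they can be assembled from more intuitive quantities. The sketch below builds K from a horizontal field of view and T from a rotation and camera center, using only NumPy. The helpers intrinsics_from_fov and extrinsics_from_pose are hypothetical names introduced for illustration. Square pixels and a centered principal point are assumptions.

```python
import numpy as np


def intrinsics_from_fov(fov_x_deg: float, width: int, height: int) -> np.ndarray:
    """Build a 3x3 pinhole K, assuming square pixels and a centered principal point."""
    fx = (width / 2.0) / np.tan(np.radians(fov_x_deg) / 2.0)
    return np.array(
        [[fx, 0.0, width / 2.0], [0.0, fx, height / 2.0], [0.0, 0.0, 1.0]]
    )


def extrinsics_from_pose(R: np.ndarray, C: np.ndarray) -> np.ndarray:
    """Build a 4x4 world-to-camera T from rotation R and camera center C (world frame)."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = -R @ C  # the camera center maps to the camera-frame origin
    return T


# A 90 degree horizontal field of view at the default 1280x720 resolution.
K = intrinsics_from_fov(90.0, width=1280, height=720)
# Camera placed at (0.5, 0.5, -3) with axes aligned to the world axes.
T = extrinsics_from_pose(np.eye(3), np.array([0.5, 0.5, -3.0]))
```

The resulting K and T have the shapes expected by the K and T parameters and would be passed together, since each one requires the other.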

ct.render.get_render_view_status_str(geometries, K=None, T=None, height=720, width=1280)

Get a view status string containing camera parameters from the Open3D visualizer. Passing the same string to several render calls keeps the camera view consistent across different geometries. This function may require a display.

The view status string contains camera parameters in JSON format, including:

Camera position and orientation
Field of view
Zoom level
Other view control settings

Parameters:

geometries (List[Geometry3D]) – List of Open3D geometries to set up the view. Supported types are TriangleMesh, PointCloud, and LineSet.
K (Float[ndarray, '3 3'] | None) – Camera intrinsic matrix. If None, uses Open3D’s default camera inferred from the geometries. Must be provided if T is provided.
T (Float[ndarray, '4 4'] | None) – Camera extrinsic matrix (world-to-camera transformation). If None, uses Open3D’s default camera inferred from the geometries. Must be provided if K is provided.
height (int) – Height of the view window in pixels.
width (int) – Width of the view window in pixels.

Returns:

JSON string containing camera view parameters from o3d.visualization.Visualizer.get_view_status(). This includes:

Camera position and orientation
Field of view
Zoom level
Other view control settings

Note: Does not include window size or point size.

Return type:

str

Examples

# mesh and pcd are the geometries created in the render_geometries example
# Get view status for default camera
view_str = ct.render.get_render_view_status_str([mesh, pcd])

# Get view status for specific camera
K = np.array([[1000.0, 0.0, 640.0], [0.0, 1000.0, 360.0], [0.0, 0.0, 1.0]])
T = np.eye(4)
view_str = ct.render.get_render_view_status_str([mesh], K=K, T=T)

# Use view status for consistent rendering
image1 = ct.render.render_geometries([mesh], view_status_str=view_str)
image2 = ct.render.render_geometries([pcd], view_status_str=view_str)
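
Because the view status is a JSON string, it can be inspected with the standard library. The sketch below parses a hand-written sample. The key names follow the layout Open3D has typically emitted for a view status, but both the keys and the values here are assumptions for illustration. Inspect the string returned on your installation before relying on them.

```python
import json

# Illustrative view status string; the values are made up for this sketch.
view_str = """
{
    "class_name": "ViewTrajectory",
    "interval": 29,
    "is_loop": false,
    "trajectory": [
        {
            "boundingbox_max": [1.0, 1.0, 1.0],
            "boundingbox_min": [0.0, 0.0, 0.0],
            "field_of_view": 60.0,
            "front": [0.0, 0.0, -1.0],
            "lookat": [0.5, 0.5, 0.5],
            "up": [0.0, 1.0, 0.0],
            "zoom": 0.7
        }
    ],
    "version_major": 1,
    "version_minor": 0
}
"""

view = json.loads(view_str)
camera = view["trajectory"][0]  # assumed: a single camera entry
summary = {key: camera[key] for key in ("field_of_view", "zoom", "lookat")}
print(summary)
```

Since window size and point size are not part of the string, height, width, and point_size still have to be passed separately when rendering.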

ct.render.get_render_K_T(geometries, view_status_str=None, height=720, width=1280)

Get the camera intrinsic (K) and extrinsic (T) matrices from the Open3D visualizer. These matrices describe the camera that the visualizer would use to render the given geometries.

The matrices follow the standard pinhole camera model:

λ[x, y, 1]^T = K @ [R | t] @ [X, Y, Z, 1]^T

where:

[X, Y, Z, 1]^T is a homogeneous 3D point in world coordinates
[R | t] is the 3x4 extrinsic matrix (world-to-camera transformation)
K is the 3x3 intrinsic matrix
[x, y, 1]^T is the projected homogeneous 2D point in pixel coordinates
λ is the depth of the point, i.e. its z-coordinate in the camera frame
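
The projection equation above can be checked numerically. This is a minimal NumPy sketch with illustrative values for K, T, and the world point. It does not call ct.render.

```python
import numpy as np

K = np.array([[1000.0, 0.0, 640.0], [0.0, 1000.0, 360.0], [0.0, 0.0, 1.0]])
T = np.eye(4)  # world-to-camera; identity puts the camera at the world origin

point_world = np.array([0.5, 0.25, 2.0, 1.0])  # homogeneous [X, Y, Z, 1]

# lambda * [x, y, 1]^T = K @ [R | t] @ [X, Y, Z, 1]^T
projected = K @ T[:3, :] @ point_world
depth = projected[2]           # lambda
pixel = projected[:2] / depth  # [x, y] in pixel coordinates
```

Here the point projects to pixel (890, 485) with λ = 2, which matches its z-coordinate because the camera frame coincides with the world frame.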

Parameters:

geometries (List[Geometry3D]) – List of Open3D geometries to set up the view. Supported types include TriangleMesh, PointCloud, and LineSet.
view_status_str (str | None) – Optional JSON string containing camera parameters from o3d.visualization.Visualizer.get_view_status(). If provided, uses these parameters to set up the view.
height (int) – Height of the view window in pixels.
width (int) – Width of the view window in pixels.

Returns:

K: camera intrinsic matrix
T: camera extrinsic matrix, world-to-camera transformation

Return type:

Tuple[Float[np.ndarray, '3 3'], Float[np.ndarray, '4 4']]

Examples

# mesh and pcd are the geometries created in the render_geometries example
# Get camera matrices for default view
K, T = ct.render.get_render_K_T([mesh, pcd])

# Get camera matrices for specific view
view_str = ct.render.get_render_view_status_str([mesh])
K, T = ct.render.get_render_K_T([mesh], view_status_str=view_str)

# Use matrices for consistent rendering
image = ct.render.render_geometries([mesh], K=K, T=T)
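
With K and T in hand, a depth image can be lifted back to 3D by inverting the pinhole model. The sketch below uses a synthetic depth image. The helper depth_to_points is a hypothetical name, and it assumes that depth stores the z-coordinate in the camera frame and that pixel coordinates sit at integer indices. Verify both against your renderer's output.

```python
import numpy as np


def depth_to_points(depth: np.ndarray, K: np.ndarray, T: np.ndarray) -> np.ndarray:
    """Back-project valid (non-zero) depth pixels to (N, 3) world points."""
    v, u = np.nonzero(depth > 0)  # row (y) and column (x) indices of valid pixels
    z = depth[v, u].astype(np.float64)
    pixels = np.stack([u, v, np.ones_like(u)], axis=0).astype(np.float64)
    points_cam = (np.linalg.inv(K) @ pixels) * z  # (3, N) in the camera frame
    points_cam_h = np.vstack([points_cam, np.ones((1, points_cam.shape[1]))])
    points_world = np.linalg.inv(T) @ points_cam_h  # camera-to-world
    return points_world[:3].T


K = np.array([[1000.0, 0.0, 640.0], [0.0, 1000.0, 360.0], [0.0, 0.0, 1.0]])
T = np.eye(4)
T[:3, 3] = [0.0, 0.0, 1.0]  # camera center at world (0, 0, -1)

depth = np.zeros((720, 1280), dtype=np.float32)
depth[360, 640] = 2.0  # a single valid pixel at the principal point

points = depth_to_points(depth, K, T)
```

The single valid pixel lies on the optical axis at depth 2, so it maps to the world point (0, 0, 1), one unit in front of the world origin.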

ct.render.render_text(text, font_size=72, font_type='tex', font_color=(0, 0, 0), tight_layout=False, multiline_alignment='left', padding_tblr=(0, 0, 0, 0))

Render a text string to an image using the specified font settings.

Parameters:

text (str) – The text to render.
font_size (int) – The font size to use.
font_type (Literal['tex', 'serif', 'sans', 'mono']) – The type of font.
font_color (Tuple[float, float, float]) – The color of the font, as an RGB tuple in the range [0, 1].
tight_layout (bool) – If True, renders the text without padding. If False, the result may include padding at the top, which helps keep text top-aligned when it is placed in images.
multiline_alignment (Literal['left', 'center', 'right']) – Alignment of multi-line text. Can be “left”, “center”, or “right”.
padding_tblr (Tuple[int, int, int, int]) – The padding to add to the top, bottom, left, and right of the rendered text, in pixels.

Returns:

The rendered text image as a float32 NumPy array.

Return type:

Float[ndarray, 'h w']
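
The (top, bottom, left, right) ordering of padding_tblr can be illustrated with plain NumPy. This sketch does not call render_text. The helper pad_tblr is a hypothetical name, the input is a synthetic 2D array matching the documented return shape, and the white fill value of 1.0 is an assumption.

```python
import numpy as np


def pad_tblr(image: np.ndarray, padding_tblr, fill: float = 1.0) -> np.ndarray:
    """Pad a 2D image by (top, bottom, left, right) pixels with a constant fill."""
    top, bottom, left, right = padding_tblr
    return np.pad(
        image, ((top, bottom), (left, right)), mode="constant", constant_values=fill
    )


# Synthetic stand-in for a rendered text image (assumed 2D float32).
text_image = np.zeros((10, 30), dtype=np.float32)
padded = pad_tblr(text_image, (2, 4, 6, 8))
```

The output grows by top + bottom rows and left + right columns, here from (10, 30) to (16, 44).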

ct.render.render_texts(texts, font_size=72, font_type='tex', font_color=(0.0, 0.0, 0.0), multiline_alignment='center', same_height=False, same_width=False, padding_tblr=(0, 0, 0, 0))

Render multiple text strings into images with consistent formatting options.

Parameters:

texts (List[str]) – List of text strings to render.
font_size (int) – Font size in points. Default is 72.
font_type (Literal['tex', 'serif', 'sans', 'mono']) – Type of font to use. Default is “tex”.
font_color (Tuple[float, float, float]) – Font color as RGB tuple in range [0, 1]. Default is black (0, 0, 0).
multiline_alignment (Literal['left', 'center', 'right']) – Text alignment for multi-line text. Can be “left”, “center”, or “right”. Default is “center”.
same_height (bool) – If True, makes all rendered images the same height by padding. Default is False.
same_width (bool) – If True, makes all rendered images the same width by padding. Default is False.
padding_tblr (Tuple[int, int, int, int]) – Padding to add to top, bottom, left, and right of rendered text in pixels. Default is (0, 0, 0, 0).

Returns:

List of rendered text images as float32 NumPy arrays with values in range [0, 1].

Return type:

List[Float[ndarray, 'h w']]
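
The effect of same_height and same_width can be pictured as padding every image up to the largest size in the list. The sketch below shows that idea on synthetic arrays. The helper pad_to_common_size is a hypothetical name, and both the side on which padding is added (bottom and right) and the fill value of 1.0 are assumptions, not the documented behavior of render_texts.

```python
import numpy as np


def pad_to_common_size(images, same_height=True, same_width=True, fill=1.0):
    """Pad 2D images at the bottom and right so they share height and/or width."""
    max_h = max(im.shape[0] for im in images)
    max_w = max(im.shape[1] for im in images)
    padded = []
    for im in images:
        pad_h = max_h - im.shape[0] if same_height else 0
        pad_w = max_w - im.shape[1] if same_width else 0
        padded.append(
            np.pad(im, ((0, pad_h), (0, pad_w)), mode="constant", constant_values=fill)
        )
    return padded


# Synthetic stand-ins for rendered text images of different sizes.
images = [
    np.zeros((10, 30), dtype=np.float32),
    np.zeros((14, 20), dtype=np.float32),
]
same_h = pad_to_common_size(images, same_width=False)
same_hw = pad_to_common_size(images)
```

Images that share a height can be concatenated horizontally with np.hstack, and images that share a width can be stacked vertically with np.vstack.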