ct.render
Functions for controlled rendering of 3D geometries to images or depth images.
- ct.render.render_geometries(geometries, K=None, T=None, view_status_str=None, height=720, width=1280, point_size=1.0, line_radius=None, to_depth=False, visible=False)
Render Open3D geometries to an image using the specified camera parameters. This function may require a display.
- Parameters:
geometries (List[Geometry3D]) – List of Open3D geometries to render. Supported types are TriangleMesh, PointCloud, and LineSet.
K (Float[ndarray, '3 3'] | None) – Camera intrinsic matrix. If None, uses Open3D’s default camera inferred from the geometries. Must be provided if T is provided.
T (Float[ndarray, '4 4'] | None) – Camera extrinsic matrix (world-to-camera transformation). If None, uses Open3D’s default camera inferred from the geometries. Must be provided if K is provided.
view_status_str (str | None) – JSON string containing viewing camera parameters from o3d.visualization.Visualizer.get_view_status(). This does not include window size or point size.
height (int) – Height of the output image in pixels.
width (int) – Width of the output image in pixels.
point_size (float) – Size of points for PointCloud objects, in pixels.
line_radius (float | None) – Radius of lines for LineSet objects, in world units. When set, LineSets are converted to cylinder meshes with this radius. Unlike point_size, this is in world metric space, not pixel space.
to_depth (bool) – If True, renders a depth image instead of RGB. Invalid depths are set to 0.
visible (bool) – If True, shows the rendering window.
- Returns:
If to_depth is False: (H, W, 3) float32 RGB image array with values in [0, 1].
If to_depth is True: (H, W) float32 depth image array with depth values in world units.
- Return type:
Float[np.ndarray, "h w 3"] if to_depth is False, otherwise Float[np.ndarray, "h w"]
Examples
    # Create some geometries
    mesh = o3d.geometry.TriangleMesh.create_box()
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(np.random.rand(100, 3))

    # Render with default camera
    image = render_geometries([mesh, pcd])

    # Render with specific camera parameters
    K = np.array([[1000, 0, 640], [0, 1000, 360], [0, 0, 1]])
    T = np.eye(4)
    depth_image = render_geometries([mesh], K=K, T=T, to_depth=True)
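Because invalid depths are set to 0, downstream code usually needs to separate valid from invalid pixels before computing statistics. The following is a minimal sketch using a small synthetic array in place of a real render; the array values and the variable names are illustrative assumptions, not output of render_geometries.

```python
import numpy as np

# Synthetic stand-in for a depth image rendered with to_depth=True:
# float32, shape (H, W), where 0 marks pixels without a valid depth.
depth = np.array(
    [[0.0, 1.5, 2.0],
     [0.0, 0.0, 4.0]],
    dtype=np.float32,
)

# Boolean mask of pixels that carry a valid depth value.
valid = depth > 0

# Statistics should be computed over valid pixels only.
valid_depths = depth[valid]
print("valid pixels:", int(valid.sum()))
print("depth range:", float(valid_depths.min()), float(valid_depths.max()))

# Optional: replace invalid pixels with NaN so they are ignored by
# functions such as np.nanmean.
depth_nan = np.where(valid, depth, np.nan)
print("mean valid depth:", float(np.nanmean(depth_nan)))
```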
- ct.render.get_render_view_status_str(geometries, K=None, T=None, height=720, width=1280)
Get a view status string containing camera parameters from Open3D visualizer. This is useful for rendering multiple geometries with consistent camera views. This function may require a display.
The view status string contains camera parameters in JSON format, including:
Camera position and orientation
Field of view
Zoom level
Other view control settings
- Parameters:
geometries (List[Geometry3D]) – List of Open3D geometries to set up the view. Supported types are TriangleMesh, PointCloud, and LineSet.
K (Float[ndarray, '3 3'] | None) – Camera intrinsic matrix. If None, uses Open3D’s default camera inferred from the geometries. Must be provided if T is provided.
T (Float[ndarray, '4 4'] | None) – Camera extrinsic matrix (world-to-camera transformation). If None, uses Open3D’s default camera inferred from the geometries. Must be provided if K is provided.
height (int) – Height of the view window in pixels.
width (int) – Width of the view window in pixels.
- Returns:
JSON string containing camera view parameters from o3d.visualization.Visualizer.get_view_status(). This includes:
Camera position and orientation
Field of view
Zoom level
Other view control settings
Note: Does not include window size or point size.
- Return type:
str
Examples
    # Get view status for default camera
    view_str = get_render_view_status_str([mesh, pcd])

    # Get view status for specific camera
    K = np.array([[1000, 0, 640], [0, 1000, 360], [0, 0, 1]])
    T = np.eye(4)
    view_str = get_render_view_status_str([mesh], K=K, T=T)

    # Use view status for consistent rendering
    image1 = render_geometries([mesh], view_status_str=view_str)
    image2 = render_geometries([pcd], view_status_str=view_str)
- ct.render.get_render_K_T(geometries, view_status_str=None, height=720, width=1280)
Get the camera intrinsic (K) and extrinsic (T) matrices from Open3D visualizer. These matrices represent the current rendering camera parameters.
- The matrices follow the standard pinhole camera model:
λ[x, y, 1]^T = K @ [R | t] @ [X, Y, Z, 1]^T
- where:
[X, Y, Z, 1]^T is a homogeneous 3D point in world coordinates
[R | t] is the 3x4 extrinsic matrix (world-to-camera transformation)
K is the 3x3 intrinsic matrix
[x, y, 1]^T is the projected homogeneous 2D point in pixel coordinates
λ is the depth value
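The projection equation above can be checked numerically. This is a minimal NumPy sketch, not part of the library: the values of K, T, and the world point are illustrative assumptions, with T set to the identity so that the camera frame coincides with the world frame.

```python
import numpy as np

# Illustrative intrinsics: focal length 1000 px, principal point (640, 360).
K = np.array(
    [[1000.0, 0.0, 640.0],
     [0.0, 1000.0, 360.0],
     [0.0, 0.0, 1.0]]
)

# World-to-camera extrinsics; the identity places the camera at the world origin.
T = np.eye(4)

# Homogeneous 3D point [X, Y, Z, 1]^T in world coordinates.
point_world = np.array([0.5, -0.25, 2.0, 1.0])

# λ[x, y, 1]^T = K @ [R | t] @ [X, Y, Z, 1]^T, where [R | t] is T[:3, :].
projected = K @ T[:3, :] @ point_world

# λ is the third component; dividing by it yields pixel coordinates.
depth = projected[2]
pixel = projected[:2] / depth

print("pixel:", pixel)  # [890. 235.]
print("depth:", depth)  # 2.0
```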
- Parameters:
geometries (List[Geometry3D]) – List of Open3D geometries to set up the view. Supported types include TriangleMesh, PointCloud, and LineSet.
view_status_str (str | None) – Optional JSON string containing camera parameters from o3d.visualization.Visualizer.get_view_status(). If provided, uses these parameters to set up the view.
height (int) – Height of the view window in pixels.
width (int) – Width of the view window in pixels.
- Returns:
K: camera intrinsic matrix
T: camera extrinsic matrix, world-to-camera transformation
- Return type:
Tuple[Float[np.ndarray, "3 3"], Float[np.ndarray, "4 4"]]
Examples
    # Get camera matrices for default view
    K, T = get_render_K_T([mesh, pcd])

    # Get camera matrices for specific view
    view_str = get_render_view_status_str([mesh])
    K, T = get_render_K_T([mesh], view_status_str=view_str)

    # Use matrices for consistent rendering
    image = render_geometries([mesh], K=K, T=T)
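Since T is a world-to-camera transformation, the camera position in world coordinates is not simply its translation column. The sketch below shows one way to recover the camera center and the focal parameters from a K and T pair; the helper name camera_center_from_T and the sample matrices are hypothetical and are not part of ct.render.

```python
import numpy as np

def camera_center_from_T(T):
    """Return the camera center in world coordinates for a world-to-camera T.

    With x_cam = R @ x_world + t, the center is the world point that maps to
    the camera origin, which gives C = -R^T @ t.
    """
    R = T[:3, :3]
    t = T[:3, 3]
    return -R.T @ t

# Sample extrinsics: 90 degree rotation about the z axis plus a translation.
T = np.array(
    [[0.0, -1.0, 0.0, 1.0],
     [1.0, 0.0, 0.0, 2.0],
     [0.0, 0.0, 1.0, 3.0],
     [0.0, 0.0, 0.0, 1.0]]
)
center = camera_center_from_T(T)

# The same result appears as the translation column of the inverse
# (camera-to-world) matrix.
center_from_inverse = np.linalg.inv(T)[:3, 3]

# Sample intrinsics and their focal lengths and principal point.
K = np.array(
    [[1000.0, 0.0, 640.0],
     [0.0, 1000.0, 360.0],
     [0.0, 0.0, 1.0]]
)
fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]

print("camera center:", center)
print("fx, fy, cx, cy:", fx, fy, cx, cy)
```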
- ct.render.render_text(text, font_size=72, font_type='tex', font_color=(0, 0, 0), tight_layout=False, multiline_alignment='left', padding_tblr=(0, 0, 0, 0))
Render a text string to an image using the specified font settings.
- Parameters:
text (str) – The text to render.
font_size (int) – The font size to use.
font_type (Literal['tex', 'serif', 'sans', 'mono']) – The type of font.
font_color (Tuple[float, float, float]) – The color of the font, as an RGB tuple in the range [0, 1].
tight_layout (bool) – If True, renders the text without padding. If False, may include padding on top for top alignment in images.
multiline_alignment (Literal['left', 'center', 'right']) – The alignment of multi-line text. Can be "left", "center", or "right".
padding_tblr (Tuple[int, int, int, int]) – The padding to add to the top, bottom, left, and right of the rendered text, in pixels.
- Returns:
The rendered text image as a float32 NumPy array.
- Return type:
Float[ndarray, 'h w']
- ct.render.render_texts(texts, font_size=72, font_type='tex', font_color=(0.0, 0.0, 0.0), multiline_alignment='center', same_height=False, same_width=False, padding_tblr=(0, 0, 0, 0))
Render multiple text strings into images with consistent formatting options.
- Parameters:
texts (List[str]) – List of text strings to render.
font_size (int) – Font size in points. Default is 72.
font_type (Literal['tex', 'serif', 'sans', 'mono']) – Type of font to use. Default is “tex”.
font_color (Tuple[float, float, float]) – Font color as RGB tuple in range [0, 1]. Default is black (0, 0, 0).
multiline_alignment (Literal['left', 'center', 'right']) – Text alignment for multi-line text. Can be “left”, “center”, or “right”. Default is “center”.
same_height (bool) – If True, makes all rendered images the same height by padding. Default is False.
same_width (bool) – If True, makes all rendered images the same width by padding. Default is False.
padding_tblr (Tuple[int, int, int, int]) – Padding to add to top, bottom, left, and right of rendered text in pixels. Default is (0, 0, 0, 0).
- Returns:
List of rendered text images as float32 NumPy arrays with values in range [0, 1].
- Return type:
List[Float[ndarray, 'h w']]
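To illustrate what making images "the same height by padding" can look like, here is a minimal NumPy sketch. It is not the library's implementation: the helper pad_to_same_height is hypothetical, and both the choice to pad at the bottom and the white fill value of 1.0 are assumptions made for this example.

```python
import numpy as np

def pad_to_same_height(images, fill=1.0):
    """Pad 2D images at the bottom so that all share the tallest height.

    Assumption: padding goes at the bottom and uses a constant fill value.
    """
    max_h = max(im.shape[0] for im in images)
    return [
        np.pad(
            im,
            ((0, max_h - im.shape[0]), (0, 0)),  # (top, bottom), (left, right)
            mode="constant",
            constant_values=fill,
        )
        for im in images
    ]

# Two synthetic "rendered text" images with different sizes.
images = [
    np.zeros((2, 3), dtype=np.float32),
    np.zeros((4, 5), dtype=np.float32),
]
padded = pad_to_same_height(images)

print([im.shape for im in padded])  # [(4, 3), (4, 5)]
```

Padding to a common width would follow the same pattern with the pad amounts applied to the second axis.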