
allenact.embodiedai.mapping.mapping_utils.map_builders#

[view_source]

BinnedPointCloudMapBuilder#

class BinnedPointCloudMapBuilder(object)

[view_source]

Class used to iteratively construct a map of "free space" based on input depth maps (i.e. pointclouds).

Adapted from https://github.com/devendrachaplot/Neural-SLAM

This class can be used to (iteratively) construct a metric map of free space in an environment as an agent moves around. After every step the agent takes, you should call the update function and pass in the agent's egocentric depth image along with the agent's new position. This depth map will be converted into a pointcloud, binned along the up/down axis, and then projected onto a 3-dimensional tensor of shape (HxWxC) where HxW represents the ground plane and C equals the number of bins the up-down coordinate was binned into. This 3d map counts the number of points in each bin, so a lack of points within a region can be used to infer that the region is free space. A usage sketch follows the attribute list below.

Attributes

  • fov: FOV of the camera used to produce the depth images given when calling update.
  • vision_range_in_map_units: The maximum distance (in number of rows/columns) that will be updated when calling update; points outside of this range are ignored.
  • map_size_in_cm: Total map size in cm.
  • resolution_in_cm: Number of cm per row/column in the map.
  • height_bins: The bins used to bin the up-down coordinate (for us the y-coordinate). For example, if height_bins = [0.1, 1] then all y-values < 0.1 will be mapped to 0, all y values in [0.1, 1) will be mapped to 1, and all y-values >= 1 will be mapped to 2.
  • Importantly: these y-values will first be recentered by the min_xyz value passed when calling reset(...).
  • device: A torch.device on which to run computations. If this device is a GPU you can potentially obtain significant speed-ups.
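
A minimal usage sketch is below. The constructor keyword arguments mirror the attributes listed above but are an assumption (check the source for the actual signature); the depth frame, camera pose, and map dimensions are illustrative placeholders.

```python
import numpy as np
import torch

from allenact.embodiedai.mapping.mapping_utils.map_builders import (
    BinnedPointCloudMapBuilder,
)

# Hypothetical constructor call: keyword names mirror the attributes above and
# may not match the actual signature.
map_builder = BinnedPointCloudMapBuilder(
    fov=90.0,                      # field of view of the depth camera (degrees)
    vision_range_in_map_units=40,  # max update distance in rows/columns
    map_size_in_cm=1000,           # a 10m x 10m map in total
    resolution_in_cm=5,            # 5cm per row/column -> a 200x200 map
    height_bins=[0.1, 1.0],        # (recentered) y-values binned into <0.1, [0.1, 1), >=1
    device=torch.device("cpu"),
)

# The (0, 0, :) entry of all world-space maps will correspond to this corner.
map_builder.reset(min_xyz=np.array([0.0, 0.0, 0.0]))

# Placeholder observation standing in for a real depth image and camera pose;
# in practice these come from the environment after each agent step.
depth_frame = np.ones((224, 224), dtype=np.float32)  # depths in meters
camera_xyz = np.array([5.0, 1.5, 5.0])               # camera position (x, y, z)

result = map_builder.update(
    depth_frame=depth_frame,
    camera_xyz=camera_xyz,
    camera_rotation=0.0,  # rotation about the up axis, in degrees
    camera_horizon=0.0,   # camera pitch, in degrees
)
cumulative_map = result["map"]  # shape (200, 200, len(height_bins) + 1)
```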

BinnedPointCloudMapBuilder.update#

 | update(depth_frame: np.ndarray, camera_xyz: np.ndarray, camera_rotation: float, camera_horizon: float) -> Dict[str, np.ndarray]

[view_source]

Updates the map with the input depth frame from the agent.

See the allenact.embodiedai.mapping.mapping_utils.point_cloud_utils.project_point_cloud_to_map function for more information on the input parameter definitions. We assume that the input depth_frame has depths recorded in meters.

Returns

Let map_size = self.map_size_in_cm // self.resolution_in_cm. Returns a dictionary with the following key-value pairs (a short consumption sketch follows the list):

  • "egocentric_update" - A tensor of shape (vision_range_in_map_units)x(vision_range_in_map_units)x(len(self.height_bins) + 1) corresponding to the binned pointcloud after having been centered on the agent and rotated so that points ahead of the agent correspond to larger row indices and points further to the right of the agent correspond to larger column indices. Note that by "centered" we mean that one can picture the agent as being positioned at (0, vision_range_in_map_units/2) and facing downward. Each entry in this tensor is a count equaling the number of points in the pointcloud that, once binned, fell into this entry. This is likely the output you want to use if you want to build a model to predict free space from an image.
  • "allocentric_update" - A (map_size)x(map_size)x(len(self.height_bins) + 1) corresponding to "egocentric_update" but rotated to the world-space coordinates. This allocentric_update is what is used to update the internally stored representation of the map.
  • "map" - A (map_size)x(map_size)x(len(self.height_bins) + 1) tensor corresponding to the sum of all "allocentric_update" values since the last reset().

BinnedPointCloudMapBuilder.reset#

 | reset(min_xyz: np.ndarray)
[view_source]

Reset the map.

Resets the internally stored map.

Parameters

  • min_xyz : An array of size (3,) corresponding to the minimum possible x, y, and z values that will be observed as a point in a pointcloud when calling .update(...). The (world-space) maps returned by calls to update will have been normalized so that the (0,0,:) entry corresponds to these minimum values.

ObjectHull2d#

class ObjectHull2d()

[view_source]

ObjectHull2d.__init__#

 | __init__(object_id: str, object_type: str, hull_points: Union[np.ndarray, Sequence[Sequence[float]]])

[view_source]

A class used to represent 2d convex hulls of objects when projected to the ground plane.

Parameters

  • object_id : A unique id for the object.
  • object_type : The type of the object.
  • hull_points : A Nx2 matrix with hull_points[:, 0] being the x coordinates and hull_points[:, 1] being the z coordinates (this is using the Unity game engine convention where the y axis is up/down). A construction sketch follows this list.
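
The construction sketch below uses made-up corner coordinates and an AI2-THOR-style object id purely for illustration.

```python
import numpy as np

from allenact.embodiedai.mapping.mapping_utils.map_builders import ObjectHull2d

# The hull points are the (x, z) ground-plane coordinates of the object's
# 2d convex hull; all values below are illustrative.
table_hull = ObjectHull2d(
    object_id="Table|+01.00|+00.00|-02.00",  # made-up AI2-THOR-style id
    object_type="Table",
    hull_points=np.array(
        [
            [0.5, -2.5],  # (x, z) of one hull corner
            [1.5, -2.5],
            [1.5, -1.5],
            [0.5, -1.5],
        ]
    ),
)
```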

SemanticMapBuilder#

class SemanticMapBuilder(object)

[view_source]

Class used to iteratively construct a semantic map based on input depth maps (i.e. pointclouds).

Adapted from https://github.com/devendrachaplot/Neural-SLAM

This class can be used to (iteratively) construct a semantic map of objects in the environment.

This map is similar to that generated by BinnedPointCloudMapBuilder (see its documentation for more information) but the various channels correspond to different object types. Thus if the (i,j,k) entry of a map generated by this class is True, this means that an object of type k is present at position i,j in the map. In particular, by "present" we mean that, after projecting the object to the ground plane and taking the convex hull of the resulting 2d object, a non-trivial portion of this convex hull overlaps the i,j position.

For attribute information, see the documentation of the BinnedPointCloudMapBuilder class. The only attribute present in this class that is not present in BinnedPointCloudMapBuilder is ordered_object_types, a list of unique object types where object type ordered_object_types[i] corresponds to the ith channel of the map generated by this class.
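
A sketch of how the channel ordering can be used to read out a single object type from a map produced by this class. The attribute and key names follow the documentation above; the builder, the map array, and the "Television" type are assumed for illustration.

```python
# `semantic_map_builder` is an already-constructed SemanticMapBuilder and
# `semantic_map` is a (map_size, map_size, num_object_types) map it produced.
# Channel k corresponds to ordered_object_types[k].
object_types = semantic_map_builder.ordered_object_types

tv_channel = object_types.index("Television")     # "Television" is illustrative
tv_presence = semantic_map[:, :, tv_channel]      # True wherever a TV's hull overlaps a cell
```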

SemanticMapBuilder.update#

 | update(depth_frame: np.ndarray, camera_xyz: np.ndarray, camera_rotation: float, camera_horizon: float) -> Dict[str, np.ndarray]

[view_source]

Updates the map with the input depth frame from the agent.

See the documentation for BinnedPointCloudMapBuilder.update; the inputs and outputs are similar, except that channels are used to represent the presence/absence of objects of given types. Unlike BinnedPointCloudMapBuilder.update, this function also returns two masks with keys "egocentric_mask" and "mask" that can be used to determine what portions of the map have been observed by the agent so far in the egocentric and world-space reference frames respectively.
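
A sketch of combining the returned map with the world-space observation mask. The key names follow the documentation above; the mask's exact shape is an assumption and should be checked against the source.

```python
# `result` is the dictionary returned by SemanticMapBuilder.update.
semantic_map = result["map"]    # (map_size, map_size, num_object_types)
observed_mask = result["mask"]  # marks which world-space cells the agent has observed

# Restrict object presence to observed cells so that unobserved regions are not
# mistaken for "object absent"; assumes the mask is a (map_size, map_size) array.
observed_presence = semantic_map * observed_mask[:, :, None]
```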

SemanticMapBuilder.reset#

 | reset(min_xyz: np.ndarray, object_hulls: Sequence[ObjectHull2d])

[view_source]

Reset the map.

Resets the internally stored map.

Parameters

  • min_xyz : An array of size (3,) corresponding to the minimum possible x, y, and z values that will be observed as a point in a pointcloud when calling .update(...). The (world-space) maps returned by calls to update will have been normalized so that the (0,0,:) entry corresponds to these minimum values.
  • object_hulls : The object hulls corresponding to objects in the scene. These will be used to construct the map. A reset sketch follows this list.
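
A reset sketch combining the pieces above. `semantic_map_builder` and `table_hull` are assumed to have been constructed as in the earlier sketches, and the min_xyz values are illustrative.

```python
import numpy as np

semantic_map_builder.reset(
    min_xyz=np.array([0.0, 0.0, 0.0]),  # minimum x, y, z that can appear in any pointcloud
    object_hulls=[table_hull],          # 2d hulls of all objects in the scene
)
```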