Skip to content

dimos map tooling, stream alignment#2306

Open
leshy wants to merge 81 commits into
mainfrom
feat/ivan/stream_alignment
Open

dimos map tooling, stream alignment#2306
leshy wants to merge 81 commits into
mainfrom
feat/ivan/stream_alignment

Conversation

@leshy
Copy link
Copy Markdown
Member

@leshy leshy commented May 29, 2026

dimos map utilities for dataset collection team

you can ignore those, they are temporary,

changing pose source for observation stream, rendering global map, loop closures etc. lots of tooling (temporary and will change a lot)

dataset validation instructions document

stream alignment

a very simple stream.align tool - this needs much more work. multi stream input and output, interpolation of poses/transforms etc

leshy added 30 commits May 24, 2026 18:53
Replaces the monolithic pgo_then_voxels with four primitives over
mem2 Streams and Transforms:

  pgo_keyframes(lidar)            -> Stream[Keyframe]
  keyframes_to_corrections(kfs)   -> Stream[Transform]   (world_corrected <- world_raw)
  make_interpolator(corrections)  -> (ts) -> Transform   (SLERP + linear, endpoint-clipped)
  apply_corrections(stream, corr) -> Stream[T]           (shuffles obs.pose)

Drift correction is now a first-class, reusable Stream[Transform] that
any pose-stamped consumer can apply — same math as before (rigid
T_corr = T_global @ T_local^-1 per keyframe, SLERP + linear interpolation
between bracketing keyframes), just composable.

dimos map --pgo migrates to the new primitives; pgo_then_voxels is
deleted. The internal _SimplePGO / PGOConfig / _KeyPose machinery stays
in pgo.py and is imported by pgo2.
Renames:
  pgo.py  -> pgo_internals.py   (gtsam/ICP machinery)
  pgo2.py -> pgo.py             (public Stream-shaped API)

`dimos.mapping.relocalization.pgo` is now the canonical import path
(`pgo_keyframes`, `keyframes_to_corrections`, `make_interpolator`,
`apply_corrections`, `correction_at`, `Keyframe`). The internal
_SimplePGO / PGOConfig / _KeyPose / _icp / _voxel_downsample helpers
live in pgo_internals.py and are imported lazily inside pgo.py to keep
gtsam off the public-API import path.
Adds test_pgo.py covering pose normalization on Observation, the Transform↔Pose3
conversion helpers, the interpolator edge cases, apply_corrections behavior, and
a real-recording smoke test (skipped when data/go2_short.db is absent).
Annotates the two map.py helpers so they pass mypy. Removes leftover section
divider comments to satisfy the no-section-markers project rule.
The colors-copy block in `PointCloud2.transform` calls `self.pointcloud`
to check `has_colors()`, which forces a tensor->legacy conversion on
every invocation. With `pgo --full-pgo` rebuilding from hundreds of
lidar frames, that hidden allocation dominates the per-frame cost. The
colors path was lidar-irrelevant (lidar clouds have no colors anyway)
and the feature it was meant to support never landed; remove it so
transform() stays a clean numpy round-trip.
Comment thread dimos/memory2/stream.py Outdated
These end-to-end CLI tests forked python -m dimos.robot.cli.dimos per case, re-paying the heavy per-verb imports (rerun, open3d, voxel) 7 times. Run them in-process via Typer's CliRunner so those imports are paid once: 33.7s -> 7.5s, all 7 still pass. Trade-off: shared interpreter, no per-test process isolation.
@leshy leshy changed the title Feat/ivan/stream alignment dimos map tooling, stream alignment, autoresearch sketch Jun 1, 2026
@leshy leshy added the PlzReview label Jun 1, 2026
leshy added a commit that referenced this pull request Jun 1, 2026
Add from_time/to_time (relative to the first observation) and
from_timestamp/to_timestamp (absolute epoch seconds) for windowing a
stream by time. A trailing to_time is a duration measured from the
current start, so from_time(2).to_time(30) reads as "skip 2s, take the
following 30s"; frames mix freely (from_timestamp(ts).to_time(30)).

Shared base for the stream-alignment (#2306) and go2dds (#2314)
branches, which both need this windowing API.
Comment thread dimos/memory2/stream.py Outdated
…arch

- pgo_keyframes: widen stream param to Iterable[Observation[PointCloud2]]
  (clears the mypy gate; it never uses Stream methods)
- pose_fill_db: order_by("ts") on both align inputs — align requires
  ts-ascending iteration, but sqlite defaults to id ASC (insertion order),
  which would silently mis-pair out-of-order recordings
- add CI-running unit tests for Observation.with_pose (lazy payload kept)
  and pose_fill(mount=...) composition
- remove the unreferenced autoresearch/ research drop (near-duplicate of
  pgo_auto.py + scaffolding); pgo_auto.py kept
@leshy leshy changed the title dimos map tooling, stream alignment, autoresearch sketch dimos map tooling, stream alignment Jun 1, 2026
Comment thread dimos/robot/cli/dimos.py
leshy added a commit that referenced this pull request Jun 1, 2026
Add from_time/to_time (relative to the first observation) and
from_timestamp/to_timestamp (absolute epoch seconds) for windowing a
stream by time. A trailing to_time is a duration measured from the
current start, so from_time(2).to_time(30) reads as "skip 2s, take the
following 30s"; frames mix freely (from_timestamp(ts).to_time(30)).

Shared base for the stream-alignment (#2306) and go2dds (#2314)
branches, which both need this windowing API.
Pulls the stairs dataset LFS pointer from feat/ivan/go2stairs so the map
tooling can use it; object already on remote LFS store.
Comment thread dimos/mapping/loop_closure/pgo_auto.py
…nment

# Conflicts:
#	dimos/memory2/stream.py
#	dimos/memory2/test_stream.py
@leshy leshy mentioned this pull request Jun 1, 2026
leshy added 2 commits June 1, 2026 20:51
The self-hosted CI image's published :dev tag is stale (predates
libturbojpeg0-dev in docker/python/Dockerfile), so TurboJPEG() raises at
runtime and the four verbs that decode the color_image stream fail. Guard
them with skipif so they skip on a lib-less image and still run where the
native lib is present.
if no_gui:
print(f"open with: rerun {out}")
else:
subprocess.Popen(["rerun", str(out)])
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You maintain to reference to this? How is it ever closed?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wait I don't get it - reference to what?

Comment thread dimos/mapping/utils/cli/rename.py Outdated
--out mid360_renamed.db \\
--rename go2_lidar=lidar \\
--rename lidar=fastlio_lidar \\
--rename odometry=fastlio_odometry
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is meant to be used through dimos map rename, no? Then please don't document running the file itself.

Comment thread dimos/memory2/stream.py

for p in primary_iter:
# Advance the cursor until `nxt` sits past `p`; `prev` trails just behind.
while nxt is not None and nxt.ts <= p.ts:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why <=? If nxt.ts == p.ts isn't that the best possible solution?

Comment thread dimos/memory2/type/observation.py Outdated
Comment thread dimos/mapping/utils/cli/summary.py Outdated
Comment on lines +186 to +190
# Skip placeholder poses (origin position OR zero quaternion).
if pose[0] == 0 and pose[1] == 0 and pose[2] == 0:
continue
if pose[3] == 0 and pose[4] == 0 and pose[5] == 0 and (pose[6] == 0 or pose[6] == 1):
continue
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Valid values shouldn't be placeholder values. I don't see why valid values like this should be skipped? Can't you use None (an invalid pose) as a placeholder?

Copy link
Copy Markdown
Member Author

@leshy leshy Jun 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a null/identity quaternion, famous placeholder value actually, but yeah those actually shouldn't be in the dataset, this was an ad-hoc cleanup for some datasets, will be removed later

leshy added 2 commits June 2, 2026 16:58
- dimos map global / replay: new --bottom-cutoff option, threaded to
  PointCloud2.to_rerun(bottom_cutoff=) for global/accumulated maps only
  (raw lidar frames untouched); e.g. --bottom-cutoff 0 strips the floor.
- rename/summary/replay/replay-marker: usage docs now reference the
  `dimos map <verb>` commands instead of `python -m ...cli.<mod>`.
- observation: drop redundant explicit _data_lock=Lock() in with_pose/tag/
  derive; __init__ creates a fresh per-instance lock when None.
Comment on lines 204 to 227
@@ -208,7 +208,21 @@ def tag(self, **tags: Any) -> Self:
tags={**self.tags, **tags},
_data=_UNLOADED,
_loader=lambda: self.data,
_data_lock=threading.Lock(),
)
return type(self)(**kwargs)

def with_pose(self, pose: Any) -> Self:
"""Return a new observation with ``pose`` attached, payload kept lazy.

``pose`` accepts anything :func:`_to_tuple` handles (a 3-/7-tuple,
``Pose``/``PoseStamped``/``Transform``, or ``None`` to clear).
"""
kwargs: dict[str, Any] = {f.name: getattr(self, f.name) for f in fields(self)}
kwargs.pop("pose_tuple", None)
kwargs.update(
pose=pose,
_data=_UNLOADED,
_loader=lambda: self.data,
)
return type(self)(**kwargs)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Shared _data_lock deadlocks on first lazy data access

Both tag() and the new with_pose() pass the source observation's _data_lock through to the derived observation (via the fields(self) dict copy). Both also set _data=_UNLOADED with _loader=lambda: self.data. When the derived observation's .data is accessed while the source's data is still unloaded:

  1. derived.data acquires source._data_lock
  2. Calls the loader: lambda: source.data
  3. source.data checks _data (still _UNLOADED), then tries with source._data_lock: — same lock, same thread, non-reentrant → deadlock (hangs forever)

The old tag() prevented this by explicitly passing _data_lock=threading.Lock() in its kwargs.update(...). That line was removed in this PR, regressing tag(). The new with_pose() has the same omission from the start.

This is exercised directly by pose_fill._fill, which calls primary.with_pose(secondary.data) on lazily-loaded stream observations; when the result is later decoded (e.g. by DetectMarkers), it deadlocks. The new test test_attaches_pose_and_keeps_payload_lazy also hits this path and will hang at assert posed.data == "payload".

Fix: add _data_lock=threading.Lock() to the kwargs.update(...) call in both tag() and with_pose().

leshy added 2 commits June 2, 2026 18:46
The entry point is `dimos map <verb>` (self-documented via --help); don't
document running the modules directly. summary.py keeps its rewritten usage
line per reviewer suggestion.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants