Plan: unify Search Explorer and Interactive Explorer into one page #156

@rdhyee

Context

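iSamples currently hosts two interactive map pages that have been converging for months: the Search Explorer (tutorials/isamples_explorer.qmd) and the Interactive Explorer (tutorials/progressive_globe.qmd).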

Commit bd0a094 (April 30) explicitly "rewrote the explorer on the progressive_globe foundation for speed + added results table." The two files already share assets/js/source-palette.js, Cesium 1.127, ION token, and preload hints. The globe page already links to the explorer in See Also. Convergence is in motion — this issue tracks completing it.

Direction: a single unified page at the new canonical URL /explorer.html (top-level). Implementation is built in tutorials/progressive_globe.qmd during Phases 1–4, then renamed to explorer.qmd at site root in Phase 5 with redirects from both old URLs. The DOM-first architecture is the destination model; Explorer's facet UI, cross-filter counts, table view, URL params, and SKOS labels port into it as plain-JS DOM handlers (no OJS reactive cells added).

Why now: the grant ends in July 2026, and the May website-cleanup deadline falls ahead of the June 2 keynote and the June workshop. Mandate: don't start new infrastructure; polish what exists. Two pages doing the same job is technical debt blocking the cleanup.

Plan v3 — further hardened after second Codex review. Five implementation contracts now explicit in the relevant phases: (a) facet data contract standardized on v2 parquet with URI-valued checkboxes (Phase 1); (b) portable predicate builder using EXISTS or pid IN (...) instead of alias-dependent fragments (Phase 1, reused in Phases 2 and 4); (c) all filter dimensions named in URL params with comma-list pattern (Phase 3); (d) preview-safe redirects using document-relative URLs (Phase 5); (e) asset path adjustment after moving to root (Phase 5).

Plan v2 changes (preserved): canonical URL decided up front; cluster-mode filter honesty merged into Phase 1; Phase 4 table mode scope-narrowed to Globe/Table toggle.

Phased PR Plan

Five PRs, each smoke-tested independently. Phases 1–4 modify tutorials/progressive_globe.qmd only. Phase 5 handles the rename, redirects, navbar, and tests.

Phase 1 — Specimen Type filter + SKOS prefLabels + honest cluster-mode UX + portable predicate refactor (medium-large, ~7 hr)

Facet data contract (new in v3): standardize on isamples_202601_sample_facets_v2.parquet (URI strings), which is the explorer's facets file, not the globe's older short-label file. Update the globe's samples view registration accordingly. Checkbox value attributes store full URIs; SKOS / prettyLabel() is display-only. If the checkboxes kept the older short labels instead, prettyLabel() would have no URI to look up.

Portable predicate refactor (new in v3): refactor facetFilterSQL() from emitting alias-dependent fragments (AND f.material IN (...)) to a portable predicate using EXISTS (SELECT 1 FROM sample_facets f WHERE f.pid = l.pid AND f.material IN (...)) or equivalently pid IN (SELECT DISTINCT pid FROM sample_facets WHERE material IN (...)). Avoids alias mismatch and duplicate rows from JOINs against multi-valued facets. Required for Phase 4 table mode but ship in Phase 1 to avoid backward refactoring. Smoke-test that the existing JOIN-based query in progressive_globe.qmd:855-878 still produces correct counts after the refactor.
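
The refactor described above could look roughly like the sketch below, which uses the pid IN (...) form. The sample_facets table and the material column come from the SQL quoted in this plan; the function name facetPredicateSQL, the filters argument shape, the pidExpr parameter, and the context / object_type column names are assumptions made for illustration, not the existing facetFilterSQL() code.

```javascript
// Sketch only. Column names other than "material" are assumed, not confirmed.
const FACET_COLUMNS = ["material", "context", "object_type"];

// Escape a value for use inside a single-quoted SQL string literal.
function sqlQuote(value) {
  return "'" + String(value).replace(/'/g, "''") + "'";
}

// Build one "pid IN (subquery)" predicate per active facet.
// The subquery carries its own FROM clause, so the fragment does not rely on
// the caller having joined sample_facets under any particular alias, and a
// sample with several facet values still matches as a single row.
function facetPredicateSQL(filters, pidExpr = "pid") {
  const clauses = [];
  for (const column of FACET_COLUMNS) {
    const values = filters[column] || [];
    if (values.length === 0) continue; // facet not active
    const list = values.map(sqlQuote).join(", ");
    clauses.push(
      `${pidExpr} IN (SELECT DISTINCT pid FROM sample_facets WHERE ${column} IN (${list}))`
    );
  }
  // Each clause is prefixed with AND so the result can be appended to an
  // existing WHERE clause; an empty string means "no facet filter".
  return clauses.map((clause) => ` AND ${clause}`).join("");
}
```

Because column names are taken from a fixed list rather than from user input, only the checkbox values need quoting. Passing pidExpr (for example "l.pid") lets a caller that does use an alias keep it without the fragment itself depending on one.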

Specimen Type filter + SKOS labels:

  • Preload vocab_labels.parquet (~60 KB) in YAML include-in-header.
  • Add collapsible "Specimen Type" panel (#objectTypeFilter) in the side panel, mirroring the Material/Feature pattern at progressive_globe.qmd:155-187.
  • Extend facetFilters OJS cell at progressive_globe.qmd:662-707 to also pull object_type rows from facet_summaries_url and populate #objectTypeFilterBody.
  • Port prettyLabel(uri) from isamples_explorer.qmd:486-495 (pure JS, no reactivity) — apply when rendering material/feature/object_type checkboxes so URIs display as human labels.
  • Extend getCheckedValues() and the refactored facetFilterSQL() (progressive_globe.qmd:256-275) to include object_type.
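
To illustrate the "URI as value, label as display" split from the facet data contract, here is a minimal sketch. It is not the prettyLabel() from isamples_explorer.qmd, which is not reproduced in this issue; the function names, the Map-based label table, the example.org URIs, and the last-segment fallback are all hypothetical.

```javascript
// Hypothetical label table. In the plan, labels come from vocab_labels.parquet.
const exampleLabels = new Map([
  ["https://example.org/vocab/rock", "Rock"],
]);

// Return a human-readable label for a URI, falling back to the last
// path or fragment segment when the URI is not in the label table.
function prettyLabelSketch(uri, labels) {
  if (labels.has(uri)) return labels.get(uri);
  const tail = String(uri).split(/[\/#]/).filter(Boolean).pop();
  return tail || String(uri);
}

// Describe one checkbox: the value stays the full URI (used in SQL and URL
// params), while the label is for display only.
function checkboxModel(uri, labels) {
  return { value: uri, label: prettyLabelSketch(uri, labels) };
}
```

Keeping the URI in value means getCheckedValues() and the SQL predicate never see display text, so relabeling a term cannot change query results.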

Cluster-mode honesty (ships in same PR): the H3 summary parquets only carry dominant_source, so material/feature/specimen filters cannot apply at cluster zoom. When any non-source facet filter is active and mode is cluster, show a persistent status note: "These filters apply at neighborhood zoom — zoom in or click a cluster to see filtered samples." If the camera is at res4/res6, auto-enter res8 on filter change to minimize the gap. At point mode (<120 km), the existing JOIN handles filters correctly. Document the limitation in a code comment; revisit if/when DuckDB-WASM gains H3 extension support.
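
The decision logic in this paragraph can be kept as a small pure function, separate from the DOM and camera code. The sketch below assumes a mode string of "cluster" or "point", a numeric H3 resolution (4, 6, or 8), and a filters object keyed by facet name; those shapes and the function name are illustrative, not taken from progressive_globe.qmd.

```javascript
// Facets that the H3 summary parquets cannot filter on (they only carry
// dominant_source). The key names are assumed.
const NON_SOURCE_FACETS = ["material", "context", "object_type"];

// Decide whether to show the status note and which cluster resolution to use.
function clusterModeAdvice({ mode, resolution, filters }) {
  const nonSourceActive = NON_SOURCE_FACETS.some(
    (facet) => (filters[facet] || []).length > 0
  );
  const showNote = mode === "cluster" && nonSourceActive;
  // From res4/res6 the plan jumps to res8 when such a filter changes.
  const targetResolution = showNote && resolution < 8 ? 8 : resolution;
  return { showNote, targetResolution };
}
```

A filter-change listener would call this once and then toggle the note element and request the resolution change, which keeps the rule easy to cover with a unit test alongside the Playwright regression test.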

Phase 2 — Cross-filtered live counts (medium, ~6 hr)

Port the explorer's cross-filter machinery to plain-JS, using the portable predicate from Phase 1:

  • Add cross_filter_url constant (already-defined cache parquet, 6 KB).
  • Copy buildCrossFilterWhere(excludeFacet) from isamples_explorer.qmd:500-548: strip OJS reactive references (searchInput?.trim() → document.getElementById('sampleSearch').value.trim()) and adapt it to call the Phase 1 portable predicate builder rather than emitting alias-dependent fragments.
  • Copy crossFilteredFacets cell logic from isamples_explorer.qmd:565-652 as an async function updateCrossFilteredCounts() triggered from each filter change listener.
  • Add data-facet and data-value attributes to the count <span> elements so updates are in-place mutations (no re-render).
  • Use the globe's existing db.query() API (DuckDBClient.of, progressive_globe.qmd:438) — not the explorer's manual runQuery().
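
The core idea behind cross-filtered counts is that the counts for one facet are computed with every other active filter applied, but not the facet's own. The sketch below shows only that exclusion step. It is not the explorer's buildCrossFilterWhere(); the state shape and the predicateFor callback (a stand-in for the Phase 1 portable predicate builder) are assumptions, and the search term is left out for brevity.

```javascript
// Build the WHERE clause used when counting values of `excludeFacet`.
// `state` maps facet name -> array of selected values.
// `predicateFor(facet, values)` returns one SQL boolean expression.
function buildCrossFilterWhereSketch(state, excludeFacet, predicateFor) {
  const parts = [];
  for (const [facet, values] of Object.entries(state)) {
    if (facet === excludeFacet) continue; // a facet never filters its own counts
    if (!values || values.length === 0) continue; // inactive facet
    parts.push(predicateFor(facet, values));
  }
  return parts.length ? "WHERE " + parts.join(" AND ") : "";
}

// Produce one WHERE clause per facet, ready to feed per-facet count queries.
function crossFilterClauses(state, facets, predicateFor) {
  const clauses = {};
  for (const facet of facets) {
    clauses[facet] = buildCrossFilterWhereSketch(state, facet, predicateFor);
  }
  return clauses;
}
```

An updateCrossFilteredCounts() along the lines described above would run one count query per facet with these clauses, then write each result into the matching span located through its data-facet and data-value attributes.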

Phase 3 — URL query params + multi-term search (small-medium, ~5 hr)

Add readQueryParams() and writeQueryParams() alongside the existing readHash()/buildHash() at progressive_globe.qmd:277. Reconcile URL state model:

  • Hash (#v=1&lat=&lng=&alt=&pid=) — camera + selected sample (already working, unchanged).
  • Query params — all bookmarkable filter dimensions (updated in v3 — full list):
    • q= — search query
    • sources=A,B,C — source filter (comma-list, matching existing sources=)
    • material=A,B,C — material URIs (comma-list, URI-encoded)
    • context=A,B,C — sampled feature URIs (comma-list, URI-encoded)
    • object_type=A,B,C — specimen type URIs (comma-list, URI-encoded)
    • maxSamples=N — table row cap
    • view=globe|table
    • perf=1 — opt-in performance panel
  • Both hash and query params coexist. Share URL example: /explorer.html?q=basalt&sources=SESAR&material=https%3A%2F%2Fw3id.org%2Fisample%2Fcontrolledvocabulary%2Fmaterialtype%2Frockorsediment#v=1&lat=37.5&lng=-122&alt=200000.
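
A possible shape for the serialization half of readQueryParams() / writeQueryParams(), written as pure functions so they can be tested without a browser. The parameter names are the ones listed above; the function names, the state object, and the assumption that values never contain a literal comma are illustrative. URLSearchParams handles the percent-encoding, so URIs are stored unencoded in state.

```javascript
// Comma-list parameters named in the plan.
const LIST_PARAMS = ["sources", "material", "context", "object_type"];

// Turn filter state into a query string (without the leading "?").
function serializeFilterParams(state) {
  const params = new URLSearchParams();
  if (state.q) params.set("q", state.q);
  for (const key of LIST_PARAMS) {
    const values = state[key] || [];
    if (values.length) params.set(key, values.join(","));
  }
  if (state.maxSamples != null) params.set("maxSamples", String(state.maxSamples));
  if (state.view) params.set("view", state.view);
  if (state.perf) params.set("perf", "1");
  return params.toString();
}

// Parse a query string (with or without the leading "?") back into state.
function parseFilterParams(search) {
  const params = new URLSearchParams(search);
  const state = { q: params.get("q") || "" };
  for (const key of LIST_PARAMS) {
    const raw = params.get(key);
    state[key] = raw ? raw.split(",").filter(Boolean) : [];
  }
  const max = Number.parseInt(params.get("maxSamples") || "", 10);
  state.maxSamples = Number.isNaN(max) ? null : max;
  // Unknown or missing view values fall back to the globe.
  state.view = params.get("view") === "table" ? "table" : "globe";
  state.perf = params.get("perf") === "1";
  return state;
}
```

In the page, the write side would pass the result to history.replaceState(null, "", "?" + serializeFilterParams(state) + location.hash) so the camera hash survives each filter change. Note that URLSearchParams emits the separating comma as %2C; parsing accepts either form.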

On load: hydrate #sampleSearch, all filter checkboxes, view mode, and maxSamples from query params. On every filter change and search submit: call writeQueryParams() via history.replaceState.

Fold in the explorer's multi-term search + FTS relevance ranking from #95 in this phase since the search wiring already changes here. Search state becomes bookmarkable on day one of the new search behavior.

Phase 4 — Table view (medium, ~4 hr — scope-narrowed)

Add a binary view toggle (Globe / Table only — drop the explorer's three-way Globe/List/Table) above .globe-layout:

<div id="viewToggle">
  <button data-view="globe" class="active">Globe</button>
  <button data-view="table">Table</button>
</div>

When view === 'table': hide .globe-layout with display:none (do not destroy the Cesium viewer) and show #tableContainer. Reuse the Phase 1 portable predicate for the WHERE clause: no separate query builder, no alias mismatch risk, no duplicate rows. Render a paginated HTML <table> (default page size 100, configurable up to 1K). No upfront row dump; pagination keeps memory bounded.

The maxSamples slider applies only to the table mode's hard cap (1K–100K, default 25K); globe stays at its 5K viewport budget. If table parity becomes too large or risky, ship Phases 1–3 and 5 first and defer table mode to a follow-up issue.
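
One way to combine the page size and the maxSamples hard cap is to compute LIMIT/OFFSET per page and stop once the cap is reached. The sketch below assumes integer inputs, a samples view with a pid column to order by, and a whereSQL string made of " AND ..." fragments such as the Phase 1 predicate; the function name and the SELECT * projection are placeholders.

```javascript
// Build the SQL for one table page, or return null when the requested page
// lies entirely beyond the maxSamples hard cap.
function tablePageSQL({ whereSQL = "", page = 0, pageSize = 100, maxSamples = 25000 } = {}) {
  const size = Math.min(Math.max(1, pageSize), 1000);       // 1 to 1K rows per page
  const cap = Math.min(Math.max(1000, maxSamples), 100000); // 1K to 100K total rows
  const offset = Math.max(0, page) * size;
  if (offset >= cap) return null;
  const limit = Math.min(size, cap - offset); // last page may be short
  // ORDER BY keeps page boundaries stable between queries.
  return `SELECT * FROM samples WHERE 1=1${whereSQL} ORDER BY pid LIMIT ${limit} OFFSET ${offset}`;
}
```

Only one page of rows is ever materialized, which is what keeps memory bounded; the cap limits how far a user can page, not how much is fetched at once.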

Test: /explorer.html?q=basalt&sources=SESAR&view=table should land on a pre-filtered, paginated table.

Phase 5 — Rename, redirects, navbar, tests (small-medium, ~3 hr)

Files: new explorer.qmd at site root, tutorials/progressive_globe.qmd (→ redirect stub), tutorials/isamples_explorer.qmd (→ redirect stub), _quarto.yml, how-to-use.qmd, tutorials/index.qmd, index.qmd, tests/test_explorer.py → tests/test_globe.py migration.

  1. Rename: move the unified content from tutorials/progressive_globe.qmd to explorer.qmd at site root. Output is /explorer.html.

  2. Asset path fix (new in v3): after the move, change the source palette import from ../assets/js/source-palette.js to assets/js/source-palette.js. The current .. accidentally still works on production (browsers swallow .. at root) but breaks GitHub Pages PR previews whose base path is /isamplesorg.github.io/. Audit the unified file for any other ../ paths and resolve them similarly.

  3. Two preview-safe redirect stubs (updated in v3) — one per old URL — each passes location.search + location.hash through using a document-relative URL so previews work:

    location.replace(new URL(`../explorer.html${location.search}${location.hash}`, location.href).href);

    Absolute /explorer.html would break GitHub Pages PR previews at username.github.io/isamplesorg.github.io/.... Keep both files in _quarto.yml so Quarto continues to build them at their public URLs. This preserves all inbound deep links from search engines, shared URLs, and external sites.

  4. _quarto.yml: navbar Interactive Explorer href changes from tutorials/progressive_globe.qmd → explorer.qmd. Remove the Search Explorer sidebar entries at lines 21-22 and 68-69.

  5. Update internal links: how-to-use.qmd:39, tutorials/index.qmd:12, and any reference in index.qmd to point at explorer.html.

  6. Migrate Playwright tests: rename tests/test_explorer.py → tests/test_globe.py (or test_explorer_v2.py) targeting /explorer.html. Unskip the cross-filter tests deferred in #155 (explorer: dynamic cross-filter facet counts); native checkboxes respond to programmatic .click(), unlike OJS Inputs.checkbox.

Phase 5 ships last so both pages remain live and independently functional through the migration — any regression up to this point is a single-PR revert away.
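
The path behavior behind steps 2 and 3 can be checked with the standard URL constructor, which resolves relative references the same way a browser does. In the sketch below, the helper names and the example.github.io preview host are placeholders; the redirect expression itself is the one from step 3.

```javascript
// Where a redirect stub under tutorials/ would send the browser,
// given the stub's own full URL.
function redirectTarget(href) {
  const here = new URL(href);
  return new URL(`../explorer.html${here.search}${here.hash}`, here.href).href;
}

// Resolve an asset reference against the page that imports it.
function resolveAsset(relativePath, pageHref) {
  return new URL(relativePath, pageHref).pathname;
}

const production = "https://isamples.org/tutorials/progressive_globe.html?q=basalt#v=1";
const preview = "https://example.github.io/isamplesorg.github.io/tutorials/progressive_globe.html?q=basalt#v=1";

console.log(redirectTarget(production));
console.log(redirectTarget(preview));

// After the move to the site root, "../" climbs out of the preview base path,
// while the plain relative path stays inside it.
const previewRoot = "https://example.github.io/isamplesorg.github.io/explorer.html";
console.log(resolveAsset("../assets/js/source-palette.js", previewRoot));
console.log(resolveAsset("assets/js/source-palette.js", previewRoot));
```

On the production URL the target is https://isamples.org/explorer.html with the query and hash intact; on the preview URL the isamplesorg.github.io path prefix is preserved, which an absolute /explorer.html would drop.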

Critical Files

  • tutorials/progressive_globe.qmd — host page during Phases 1–4; renamed to explorer.qmd in Phase 5
  • tutorials/isamples_explorer.qmd — source of facet UI, cross-filter, table; reduced to redirect in Phase 5
  • _quarto.yml — navbar updates in Phase 5
  • how-to-use.qmd:39, tutorials/index.qmd:12, index.qmd — link updates in Phase 5
  • tests/test_explorer.py — migrate in Phase 5
  • tests/playwright/cesium-queries.spec.js — extend with new selectors per phase

Reused functions / patterns:

  • getCheckedValues(elementId), sourceFilterSQL(), facetFilterSQL() — already in globe at lines 234–275; refactor facetFilterSQL() to portable predicate in Phase 1 and extend to handle object_type.
  • buildCrossFilterWhere(), crossFilteredFacets, prettyLabel() — port from explorer lines 500–652, 486–495.
  • DuckDBClient.of() db.query() — globe's existing pattern at line 438.
  • Hash read/write — readHash(), buildHash() at globe lines 211–252; mirror for query params.
  • Source palette — already centralized in assets/js/source-palette.js.

Verification

Per-phase smoke test (Playwright): render with quarto render, run smoke test on the built HTML, visual check, fix-and-repeat, then commit + PR.

Phase-specific Playwright tests to add to tests/playwright/cesium-queries.spec.js:

  • Phase 1:
    • #objectTypeFilterBody input[type=checkbox] count > 0 after 10s; material labels are human-readable (not URIs).
    • Facet honesty regression test (per Codex): selecting material/context/specimen at high altitude shows the explanatory status note; selecting the same filter at point zoom (<120 km) constrains the sample query.
    • Portable predicate regression test (new in v3): a sample in the lite parquet with two material URIs appears exactly once in cluster-zoom counts and point-zoom rendering, not duplicated.
  • Phase 2: span.facet-count[data-facet='source'] exists; SESAR count > 4M; selecting SESAR drops other-source counts.
  • Phase 3: navigate to ?q=basalt&sources=SESAR&material=<uri> → search input, source checkbox, and material checkbox all hydrate correctly.
  • Phase 4: ?view=table → table visible, globe hidden; pagination visible; toggling back to globe re-renders points.
  • Phase 5:
    • tutorials/progressive_globe.html?q=basalt&sources=SESAR and tutorials/isamples_explorer.html?q=basalt&sources=SESAR both redirect to /explorer.html with params intact.
    • Preview-safe redirect test: redirect works on a GitHub Pages preview URL with non-root base path, not just on isamples.org.
    • Asset path test: source palette loads correctly from /explorer.html on both production and preview deploys.

Manual browser verification per phase: cross-filter latency under 5s; mobile 900 px breakpoint collapses cleanly; hash deep-link round-trip via incognito; ?perf=1 panel works.

Rollback: each phase is one PR. Reverting Phase N leaves earlier phases intact. Until Phase 5 ships, both pages remain live and independently functional.

Resolved Decisions

  1. Canonical URL: /explorer.html (top-level). Matches the page's hero navbar position; "progressive_globe" describes an implementation, not a user task.
  2. Single page: yes — no two-page fallback.
  3. Branch / PR strategy: each PR branched from main (easier review, smaller blast radius per merge).
  4. Facet data contract (v3): standardize on sample_facets_v2.parquet with URI-valued checkboxes from Phase 1.
  5. Predicate shape (v3): portable EXISTS / pid IN (...) predicate built in Phase 1, reused in Phases 2 and 4.

Plan prepared collaboratively with Claude Code, hardened across two rounds of Codex review. Not yet implemented — filing for visibility and future execution.
