HBASE-30019 Introduce CacheEngine and CacheTopology abstractions #8155

Open
VladRodionov wants to merge 1 commit into apache:HBASE-30018 from VladRodionov:HBASE-30019-cache-engine-topology-v2

@VladRodionov
Contributor

This PR targets the HBASE-30018 feature branch.

Introduces foundational internal abstractions for the pluggable block cache architecture:

  • CacheEngine
  • CacheEngineType
  • CacheTopology
  • CacheTopologyType
  • CacheTier
  • CacheEngineView
  • CacheTopologyView
  • initial topology stubs

No production wiring is added in this PR.
No behavior change intended.

JIRA: HBASE-30019

* </ul>
*/
@InterfaceAudience.Private
public interface CacheEngine {
Contributor

Is this the future substitute for the current existing BlockCache interface?

Contributor Author

Partially, yes. Overall, though, CacheAccessService will be the access point to the caching system; that is a future ticket.

Contributor

Yeah, I was about to ask why some of the BlockCache interface methods aren't defined here, e.g. iterator and getBlockCaches, but let's see whether the future task covers those functions.

Contributor Author

API will evolve for sure. We will see later if anything is missing.

@taklwu taklwu requested a review from Copilot April 27, 2026 22:32
Copilot AI left a comment

Pull request overview

Introduces new internal abstractions to support a future pluggable block cache architecture (engines + multi-engine topologies), along with initial topology stubs and read-only “view” wrappers intended for policy/metrics/diagnostics.

Changes:

  • Added CacheEngine (+ CacheEngineType) as the storage/backend abstraction for block cache implementations.
  • Added CacheTopology (+ CacheTopologyType, CacheTier) to describe orchestration across one or more cache engines.
  • Added initial topology implementations (SingleEngineTopology, TieredExclusiveTopology, TieredInclusiveTopology) and view wrappers (CacheEngineView, CacheTopologyView).
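The read-only "view" idea mentioned above can be sketched roughly as follows. This is a minimal illustration, not the PR's CacheEngineView: the `SketchEngine` and `SketchEngineView` classes and their methods are hypothetical stand-ins, and blocks are plain strings for brevity. The view keeps the engine private and re-exposes only non-mutating accessors.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical mutable engine (stand-in only, not the PR's CacheEngine).
final class SketchEngine {
  private final Map<String, String> blocks = new HashMap<>();

  String getName() {
    return "sketch";
  }

  void cacheBlock(String key, String block) {
    blocks.put(key, block);
  }

  long getBlockCount() {
    return blocks.size();
  }
}

// Hypothetical read-only view: no method here can change the engine's contents.
final class SketchEngineView {
  private final SketchEngine engine;

  SketchEngineView(SketchEngine engine) {
    this.engine = Objects.requireNonNull(engine, "engine");
  }

  String getName() {
    return engine.getName();
  }

  long getBlockCount() {
    return engine.getBlockCount();
  }
}

final class SketchEngineViewDemo {
  public static void main(String[] args) {
    SketchEngine engine = new SketchEngine();
    SketchEngineView view = new SketchEngineView(engine);
    engine.cacheBlock("k1", "block-1");
    // The view reflects the live engine state but offers no way to modify it.
    System.out.println(view.getName() + " holds " + view.getBlockCount() + " block(s)");
  }
}
```

Policy, metrics, or diagnostics code handed only a view of this kind can inspect an engine without being able to insert or evict blocks.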

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 10 comments.

| File | Description |
| --- | --- |
| hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/cache/CacheEngine.java | New storage-layer interface for cache backends (API modeled after BlockCache). |
| hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/cache/CacheEngineType.java | Enum for engine backend family. |
| hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/cache/CacheEngineView.java | Read-only wrapper around CacheEngine for inspection/capability checks. |
| hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/cache/CacheTier.java | Enum describing logical tier roles (SINGLE/L1/L2). |
| hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/cache/CacheTopology.java | New interface describing multi-engine orchestration and promotion/demotion hooks. |
| hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/cache/CacheTopologyType.java | Enum for topology “shape” (single, tiered exclusive/inclusive, custom). |
| hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/cache/CacheTopologyView.java | Read-only wrapper around CacheTopology exposing engine views. |
| hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/cache/SingleEngineTopology.java | Topology stub for a single engine. |
| hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/cache/TieredExclusiveTopology.java | Topology stub modeling move-on-promotion semantics (exclusive tiers). |
| hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/cache/TieredInclusiveTopology.java | Topology stub modeling copy-on-promotion semantics (inclusive tiers). |
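
To make the difference between move-on-promotion and copy-on-promotion concrete, here is a small sketch. It does not reproduce the PR's topology stubs: the `TwoTierSketch` class is hypothetical, both tiers are plain maps, and promoting a block on an L2 hit is an assumption made only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical two-tier cache; both tiers are simple in-memory maps.
final class TwoTierSketch {
  final Map<String, String> l1 = new HashMap<>();
  final Map<String, String> l2 = new HashMap<>();
  private final boolean exclusive;

  TwoTierSketch(boolean exclusive) {
    this.exclusive = exclusive;
  }

  // Assumption: a hit in L2 promotes the block into L1.
  String get(String key) {
    String block = l1.get(key);
    if (block != null) {
      return block;
    }
    // Exclusive: move (remove from L2). Inclusive: copy (leave it in L2).
    block = exclusive ? l2.remove(key) : l2.get(key);
    if (block != null) {
      l1.put(key, block);
    }
    return block;
  }
}

final class TwoTierSketchDemo {
  public static void main(String[] args) {
    for (boolean exclusive : new boolean[] { true, false }) {
      TwoTierSketch cache = new TwoTierSketch(exclusive);
      cache.l2.put("k", "block");
      cache.get("k");
      System.out.println((exclusive ? "exclusive" : "inclusive")
          + ": stillInL2=" + cache.l2.containsKey("k"));
    }
  }
}
```

In the exclusive case a block lives in exactly one tier at a time, so total capacity is roughly the sum of both tiers; in the inclusive case the L2 copy survives promotion, trading capacity for a cheaper demotion path.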

@VladRodionov VladRodionov force-pushed the HBASE-30019-cache-engine-topology-v2 branch 3 times, most recently from ae2eb87 to 101f264 Compare April 28, 2026 02:18
Contributor

@taklwu taklwu left a comment

A few questions; mostly it looks good.

/**
* CarrotCache-based engine.
*/
CARROT
Contributor

nit: will CarrotCache be a dependency of HBase? If not, maybe this could be removed?

Or should we use a factory interface with getName instead of an enum?

Contributor Author

Good point. CarrotCache should not need to be a hard dependency of hbase-server for this foundational API.

I think using a fixed enum for engine types is too restrictive for a pluggable architecture anyway. It works for built-ins like LRU/Bucket, but does not scale well for optional or third-party engines.

I’ll remove CacheEngineType and rely on getName() / configuration identity instead. That keeps the core API independent of optional engines and avoids baking specific external implementations into HBase.
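
One way the name-based approach could look is sketched below. Everything here is hypothetical: `NamedEngine` stands in for the engine abstraction, and `EngineRegistry` and its methods are illustrative names only, not part of the PR or of HBase. Engines are resolved by a configured name, so optional or third-party engines can register themselves without the core API enumerating them.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical stand-in for the engine abstraction; only the name matters here.
interface NamedEngine {
  String getName();
}

// Hypothetical registry: engines are looked up by name rather than by enum constant.
final class EngineRegistry {
  private final Map<String, Supplier<NamedEngine>> factories = new ConcurrentHashMap<>();

  void register(String name, Supplier<NamedEngine> factory) {
    if (factories.putIfAbsent(name, factory) != null) {
      throw new IllegalArgumentException("Engine already registered: " + name);
    }
  }

  // Returns an empty Optional when no engine is registered under the given name.
  Optional<NamedEngine> create(String name) {
    Supplier<NamedEngine> factory = factories.get(name);
    return factory == null ? Optional.empty() : Optional.of(factory.get());
  }
}

final class EngineRegistryDemo {
  public static void main(String[] args) {
    EngineRegistry registry = new EngineRegistry();
    // A built-in engine and an optional one register the same way.
    registry.register("lru", () -> () -> "lru");
    registry.register("third-party", () -> () -> "third-party");
    System.out.println(registry.create("lru").map(NamedEngine::getName).orElse("<none>"));
    System.out.println(registry.create("unknown").isPresent());
  }
}
```

With a registry like this, dropping an optional engine means removing one registration call rather than changing a shared enum.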

Comment on lines +141 to +142
default Cacheable getBlock(BlockCacheKey cacheKey, boolean caching, boolean repeat,
boolean updateCacheMetrics, BlockType blockType) {
Contributor

Is this BlockType blockType parameter an optimization for fast lookup?

Contributor Author

BlockType here is a hint rather than part of the lookup key.
It is used by some cache implementations to avoid re-inspecting the block payload
(e.g. distinguishing data vs index/meta blocks), enabling minor optimizations in
metrics, prioritization, or eviction behavior.

It is not required for correctness and does not affect lookup, which is based
solely on BlockCacheKey. Implementations are free to ignore it, which is why a
default method is provided. This will be moved into a Context object later.
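
The hint-versus-key distinction can be illustrated with a simplified sketch. The types below are stand-ins, not the PR's signatures: keys and blocks are plain strings instead of BlockCacheKey and Cacheable, and `SketchBlockType`, `SketchEngineLookup`, `PlainEngine`, and `TypeCountingEngine` are hypothetical names. The default overload drops the hint and delegates to the key-only lookup, so an implementation that ignores it needs no extra code.

```java
import java.util.EnumMap;
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for a block type enum.
enum SketchBlockType { DATA, INDEX, META }

interface SketchEngineLookup {
  // Core lookup: the key alone determines the result.
  String getBlock(String key);

  // Overload carrying a type hint; the default ignores it entirely.
  default String getBlock(String key, SketchBlockType typeHint) {
    return getBlock(key);
  }
}

// Implementation that never looks at the hint.
final class PlainEngine implements SketchEngineLookup {
  final Map<String, String> blocks = new HashMap<>();

  @Override
  public String getBlock(String key) {
    return blocks.get(key);
  }
}

// Implementation that uses the hint for per-type metrics only, never for lookup.
final class TypeCountingEngine implements SketchEngineLookup {
  final Map<String, String> blocks = new HashMap<>();
  final Map<SketchBlockType, Integer> lookupsByType = new EnumMap<>(SketchBlockType.class);

  @Override
  public String getBlock(String key) {
    return blocks.get(key);
  }

  @Override
  public String getBlock(String key, SketchBlockType typeHint) {
    if (typeHint != null) {
      lookupsByType.merge(typeHint, 1, Integer::sum);
    }
    return getBlock(key);
  }
}
```

Both engines return the same block for the same key whatever hint is passed, including a missing one; only the side effects differ.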

@VladRodionov VladRodionov force-pushed the HBASE-30019-cache-engine-topology-v2 branch from 101f264 to 8b80e9d Compare April 28, 2026 19:15

4 participants