[improvement] Control HNSW build chunk memory #62869

Open
yx-keith wants to merge 1 commit into apache:master from yx-keith:optimize-HNSW-build

@yx-keith
Contributor

What problem does this PR solve?

Issue Number: None

Related PR: None

Problem Summary: Reduce ANN/HNSW index build memory spikes by switching chunk sizing to a byte-aware policy, releasing temporary vector buffers more aggressively, and adding a first-pass memory budget admission path for ANN index writers.
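As an illustration of what "byte-aware" chunk sizing can mean, here is a minimal sketch. It is not code from this PR: the function name `rows_per_chunk`, its parameters, and the assumption that vectors are stored as `float` are all hypothetical. The idea is to derive the row count of a build chunk from a byte budget and the vector dimension, rather than using a fixed row count regardless of dimension.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper, not taken from the PR.
// Returns how many float vectors of dimension `dim` fit in one build chunk
// under `chunk_byte_budget`, clamped to [min_rows, max_rows].
// Precondition: min_rows <= max_rows (required by std::clamp).
inline std::size_t rows_per_chunk(std::size_t dim,
                                  std::size_t chunk_byte_budget,
                                  std::size_t min_rows = 1,
                                  std::size_t max_rows = 1000000) {
    const std::size_t bytes_per_row = dim * sizeof(float);
    if (bytes_per_row == 0) {
        return min_rows;  // degenerate dimension: fall back to the minimum
    }
    const std::size_t rows = chunk_byte_budget / bytes_per_row;
    // Always make progress (at least min_rows), never exceed max_rows.
    return std::clamp(rows, min_rows, max_rows);
}
```

Under these assumptions, a 64 MiB budget yields 131072 rows per chunk at dimension 128 but only 10922 rows at dimension 1536, so the peak size of the temporary buffer stays roughly constant as the dimension grows.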

Release note

None

Check List (For Author)

  • Test: Partial
    • Unit Test: added/updated BE unit tests in ann_index_writer_test.cpp
    • Not executed locally: ./run-be-ut.sh --run --filter=AnnIndexWriterTest could not run because the local environment is pinned to JDK 8 and the Doris BE UT requires JDK 17
  • Behavior changed: Yes (the ANN/HNSW build now sizes chunks by a byte budget, and an index writer may wait, skip, or fail depending on the configured memory admission policy)
  • Does this need documentation: No
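
The wait/skip/fail admission behavior described above can be pictured with the following sketch. It is an assumption-laden illustration, not the PR's implementation: `MemoryBudget`, `AdmissionPolicy`, and `AdmissionResult` are hypothetical names, and the class is single-threaded for brevity (a real writer path would need locking and an actual blocking or retry mechanism for the wait case).

```cpp
#include <cstddef>

// Hypothetical types, not taken from the PR.
enum class AdmissionPolicy { kWait, kSkip, kFail };
enum class AdmissionResult { kAdmitted, kRetryLater, kSkipped, kFailed };

// Tracks bytes reserved by in-flight index builds against a fixed limit.
// Not thread-safe: this sketch only shows the decision logic.
class MemoryBudget {
public:
    explicit MemoryBudget(std::size_t limit_bytes) : _limit(limit_bytes) {}

    // Reserve `bytes` if they fit; otherwise report what the policy dictates.
    AdmissionResult try_admit(std::size_t bytes, AdmissionPolicy policy) {
        if (bytes <= _limit - _used) {  // _used <= _limit always holds
            _used += bytes;
            return AdmissionResult::kAdmitted;
        }
        switch (policy) {
        case AdmissionPolicy::kWait: return AdmissionResult::kRetryLater;
        case AdmissionPolicy::kSkip: return AdmissionResult::kSkipped;
        case AdmissionPolicy::kFail: return AdmissionResult::kFailed;
        }
        return AdmissionResult::kFailed;  // unreachable for valid policies
    }

    // Return a reservation once the temporary buffers have been freed.
    void release(std::size_t bytes) {
        _used = bytes > _used ? 0 : _used - bytes;
    }

    std::size_t used() const { return _used; }

private:
    std::size_t _limit;
    std::size_t _used = 0;
};
```

In this sketch a writer calls `try_admit` with its estimated chunk size before building, and calls `release` as soon as the chunk's temporary buffers are dropped, so that a waiting writer can be admitted on its next attempt.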

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?
