Skip to content

Toolchain/CMake: Improve dependency handling#7449

Open
Growl1234 wants to merge 6 commits into
deepmodeling:developfrom
Growl1234:toolchain
Open

Toolchain/CMake: Improve dependency handling#7449
Growl1234 wants to merge 6 commits into
deepmodeling:developfrom
Growl1234:toolchain

Conversation

@Growl1234

@Growl1234 Growl1234 commented Jun 7, 2026

Copy link
Copy Markdown

Cereal and RapidJSON have their own CMake build systems, which should be used during toolchain installation rather than relying on direct file copies. Therefore, the custom logic for finding these two dependencies should no longer be needed, as it brings avoidable issues and maintenance work.

This PR switches the installation method of the two packages to CMake, and removes the related custom logic, including the FindCereal.cmake file. The FetchContent fallback is also dropped, since installing these dependencies is the responsibility of the toolchain.

On the ABACUS CMake side, cereal and RapidJSON are now consumed through interface targets, so their include directories and compile definitions are propagated through CMake usage requirements in a more explicit and stable way.

The FindLibxc.cmake file is also removed in favor of libxc's own CMake config files.

Copilot AI review requested due to automatic review settings June 7, 2026 16:19

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR shifts RapidJSON and cereal integration to rely on proper CMake installs/packages (instead of manual header copying / FetchContent fallbacks) and updates CI/toolchain workflows to run consistently under bash and install required stage4 tools.

Changes:

  • Install RapidJSON and cereal via CMake build+install in the toolchain, and stop exporting include paths via CPATH.
  • Update project CMake to require RapidJSON and cereal via find_package(...), removing the custom FindCereal.cmake and RapidJSON FetchContent fallback.
  • CI workflows: standardize defaults.run.shell: bash, ensure xz tooling is present, and install stage4 external tools before building.

Reviewed changes

Copilot reviewed 26 out of 26 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
toolchain/scripts/stage4/install_rapidjson.sh Switch RapidJSON toolchain install to CMake configure/install; remove CPATH export.
toolchain/scripts/stage4/install_cereal.sh Switch cereal toolchain install to CMake configure/install; remove CPATH export.
toolchain/root_requirements/install_requirements_ubuntu.sh Add xz-utils dependency.
toolchain/root_requirements/install_requirements_fedora.sh Add xz dependency.
toolchain/build_abacus_intel.sh Remove manual CEREAL/RAPIDJSON paths passed to CMake.
toolchain/build_abacus_gnu.sh Remove manual CEREAL/RAPIDJSON paths passed to CMake.
toolchain/build_abacus_gcc-mkl.sh Remove manual CEREAL/RAPIDJSON paths passed to CMake.
toolchain/build_abacus_gcc-aocl.sh Remove manual CEREAL/RAPIDJSON paths passed to CMake.
cmake/FindCereal.cmake Remove custom cereal find-module (move to config package).
CMakeLists.txt Require RapidJSON/cereal via find_package(...) and drop FetchContent/include-directories logic.
.github/workflows/version_check.yml Set default bash shell for run: steps.
.github/workflows/toolchain_quick.yaml Set default bash shell for run: steps.
.github/workflows/toolchain_full.yaml Set default bash shell for run: steps.
.github/workflows/test.yml Install stage4 external tools via toolchain script (broader than dftd4-only).
.github/workflows/pytest.yml Set default bash shell for run: steps.
.github/workflows/performance.yml Set default bash shell for run: steps.
.github/workflows/mirror_gitee.yml Set default bash shell for run: steps.
.github/workflows/interface.yml Set default bash shell for run: steps.
.github/workflows/dynamic.yml Set default bash shell; add toolchain external tools install; source toolchain env before build.
.github/workflows/doxygen.yml Set default bash shell for run: steps.
.github/workflows/devcontainer.yml Set default bash shell for run: steps.
.github/workflows/cuda.yml Install xz-utils; install stage4 external tools; source toolchain env before build.
.github/workflows/coverage.yml Install xz-utils; install stage4 external tools; source toolchain env before build.
.github/workflows/build_test_makefile.yml Set default bash shell for run: steps.
.github/workflows/build_test_cmake.yml Add toolchain install step, source toolchain env; set default bash shell; matrix toolchain args.
.github/workflows/ase_plugin_test.yml Install xz-utils; install stage4 external tools; source toolchain env before build.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread toolchain/scripts/stage4/install_rapidjson.sh
Comment thread toolchain/scripts/stage4/install_cereal.sh
Comment thread toolchain/scripts/stage4/install_cereal.sh
Comment thread CMakeLists.txt Outdated
Comment thread CMakeLists.txt Outdated
Comment thread .github/workflows/dynamic.yml
Comment thread .github/workflows/coverage.yml
Comment thread .github/workflows/build_test_cmake.yml
Comment thread .github/workflows/ase_plugin_test.yml
@Growl1234

Growl1234 commented Jun 7, 2026

Copy link
Copy Markdown
Author

If possible, it's better that some other libraries such as libRI and libComm also be applied a simple CMake system along with properly-generated config files.

@Growl1234 Growl1234 changed the title Toolchain: Rely on installed CMake config file for cereal and RapidJSON Toolchain: Rely on installed CMake config file for libxc, cereal, and RapidJSON Jun 7, 2026
@Growl1234 Growl1234 force-pushed the toolchain branch 2 times, most recently from b58a7b3 to 833f30e Compare June 7, 2026 18:33
@Growl1234 Growl1234 marked this pull request as draft June 7, 2026 18:33
@Growl1234

Copy link
Copy Markdown
Author

I have switched to interface targets, which provide a more modern CMake approach for ensuring stable propagation of dependency usage requirements.

@Growl1234 Growl1234 marked this pull request as ready for review June 7, 2026 19:37
@Growl1234 Growl1234 changed the title Toolchain: Rely on installed CMake config file for libxc, cereal, and RapidJSON CMake: Improve dependency handling for libxc, cereal, and RapidJSON Jun 7, 2026
@Growl1234 Growl1234 changed the title CMake: Improve dependency handling for libxc, cereal, and RapidJSON Toolchain/CMake: Improve dependency handling for libxc, cereal, and RapidJSON Jun 7, 2026
@Growl1234

Growl1234 commented Jun 7, 2026

Copy link
Copy Markdown
Author

I have switched to interface targets, which provide a more modern CMake approach for ensuring stable propagation of dependency usage requirements.

Following this change there must be some more refactors, for example, remove_definitions(...) is no longer appropriate.

@Growl1234 Growl1234 changed the title Toolchain/CMake: Improve dependency handling for libxc, cereal, and RapidJSON Toolchain/CMake: Improve dependency handling Jun 7, 2026
@mohanchen mohanchen added the Compile & CICD & Docs & Dependencies Issues related to compiling ABACUS label Jun 9, 2026
@mohanchen mohanchen requested a review from QuantumMisaka June 9, 2026 00:49
@QuantumMisaka

Copy link
Copy Markdown
Collaborator

@Growl1234

LGTM mostly, the comments below are making by GPT for reference:

Thanks for the cleanup here. The overall direction makes sense: moving cereal / RapidJSON / Libxc discovery toward the toolchain-provided standard CMake package flow, and removing some of the legacy custom lookup logic.

That said, I do not think this is ready to merge yet. I see two blocking risks in the current version:

  1. Libxc compatibility regression

The switch to find_package(Libxc REQUIRED) removes behavior that the current tree still relies on through cmake/FindLibxc.cmake:

  • fallback discovery through pkg-config
  • compatibility with the existing "prefix-style" Libxc_DIR usage, where callers pass the install prefix rather than the exact config directory

At the same time, the toolchain wrapper scripts no longer pass -DLibxc_DIR=..., so this is not just an internal cleanup. It changes the supported invocation contract for existing user environments and wrapper scripts.

Even if the default in-tree toolchain flow still works because setup injects CMAKE_PREFIX_PATH, this is still a real backward-compatibility regression for environments that currently depend on pkg-config discovery or the old prefix-style Libxc_DIR behavior.

I think we need to preserve one of those compatibility paths before merging, either by keeping a fallback FindLibxc.cmake path or by explicitly preserving the prefix-style Libxc_DIR handling in the top-level CMake logic.

  1. The new feature propagation mechanism is not aligned with existing test configuration

This PR moves feature/macro propagation to target-level target_compile_definitions() via the new recursive apply logic. However, many test directories in the current tree still use the old directory-scope remove_definitions(...) pattern to disable things like __MPI or USE_LIBXC.

Those old remove_definitions(...) calls do not prevent the new logic from re-applying the macros to the test targets later, which means the semantics of those tests have changed silently. In other words, tests that were previously configured to build without certain feature macros may now build with them enabled again.

Even if this does not break every test immediately, it is still a configuration regression and should be resolved in the same PR. I would strongly recommend migrating the existing test directories to the new disabling mechanism (for example, the new abacus_disable_feature_definitions(...) interface) before this lands.

Because of these two points, I would request changes for now.

@Growl1234

Growl1234 commented Jun 9, 2026

Copy link
Copy Markdown
Author

@QuantumMisaka I will push back on these points.

For Libxc, removing cmake/FindLibxc.cmake is exactly one of the things that this PR is intended to do. Since Libxc has its own CMake build system and can provide LibxcConfig.cmake, there is no reason to maintain a separate FindLibxc.cmake for it. Such custom discovery logic is fragile and tends to bring more problems and maintenance work than convenience.

The correct CMake-side usage should be to rely on the dependency's own config file if it's built with CMake. For example, if LibxcConfig.cmake is installed under /opt/libxc/lib/cmake/, then the installation prefix /opt/libxc should be added to CMAKE_PREFIX_PATH. This is the standard CMake package-discovery model, and it is also the model already used by the toolchain setup.

Preserving the old prefix-style Libxc_DIR behavior is even a worse idea. In CMake, <PackageName>*_DIR is expected to point to the directory containing the package config file, not to the installation prefix. Keeping a package-specific exception for Libxc would make the behavior inconsistent with standard CMake semantics and with the other dependencies handled by this PR. For providing an installation prefix, CMAKE_PREFIX_PATH is the only appropriate interface.

For the feature propagation point, I agree that remove_definitions(...) is no longer an appropriate mechanism once feature macros are handled through target-level compile definitions. However, I do not think every old remove_definitions(...) call should be mechanically preserved. Some of them were likely typos, no-ops, or inconsistent with the actual code path.

For example, the TDDFT LCAO test failure exposed by CI was caused by an old remove_definitions(-D __MPI) line. Because of the incorrect space between -D and __MPI, this probably did not actually disable __MPI before. After translating it into a real target-level opt-out, __MPI was finally disabled there, and the test failed. The failure itself shows that these tests actually require the MPI/ScaLAPACK code path: symbols and members such as numroc_, descinit_, Parallel_Orbitals::desc, and module_rt::half_Hmatrix are only available in that configuration. So disabling __MPI for those tests is not a valid behavior to preserve.

Therefore, I agree that remaining remove_definitions(...) usages should be cleaned up, but I do not agree that their old literal semantics should always be kept. Each case should be checked against the actual test requirements. Valid opt-outs should be migrated to the new disabling interface, while invalid or ineffective old opt-outs should be removed rather than preserved.

If there are still remove_definitions(...) calls left in the current tree, please point them out and I will migrate or remove them accordingly.

@QuantumMisaka QuantumMisaka left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the cleanup. I agree with the CMake direction here: consuming Libxc/cereal/RapidJSON through their installed config packages and using CMAKE_PREFIX_PATH for installation prefixes is the right long-term interface. I do not think we should preserve prefix-style Libxc_DIR as the preferred API.

However, the current PR is not yet consistent with that new contract in the Toolchain and user-facing install paths. I verified this on the SAI environment:

  • find_package(Libxc REQUIRED) works with CMAKE_PREFIX_PATH=/opt/devtools/libxc/libxc-7.0.0-avx2.
  • It also works with exact Libxc_DIR=/opt/devtools/libxc/libxc-7.0.0-avx2/lib/cmake/Libxc.
  • It fails with the old documented prefix-style Libxc_DIR=/opt/devtools/libxc/libxc-7.0.0-avx2, which is expected under standard CMake semantics.
  • More importantly, sourcing an existing ABACUS Toolchain setup generated with --with-libxc=system does not expose Libxc through CMAKE_PREFIX_PATH; configuring this PR then fails at CMakeLists.txt:859 (find_package) until the Libxc prefix is added manually.
  • I also verified the PR's default CMake-install approach for cereal and RapidJSON: installing both with the new CMake install commands produces cerealConfig.cmake / RapidJSONConfig.cmake, and find_package(... CONFIG REQUIRED) finds the expected cereal::cereal and RapidJSON targets. So the default install path is fine; the problem is the non-install/system/user discovery contract.

Requests:

  1. toolchain/scripts/stage3/install_libxc.sh: __SYSTEM__ and user-provided modes still validate headers/libs and export Makefile-style flags, but the CMake build now needs a discoverable LibxcConfig.cmake. Please make these modes export a usable Libxc prefix through CMAKE_PREFIX_PATH, or fail early with a clear message that Libxc must be CMake-installed and discoverable as a package config.

  2. toolchain/scripts/stage4/install_cereal.sh: __SYSTEM__ mode currently only checks for cereal/cereal.hpp. A header-only system install will pass Toolchain setup and then fail at ABACUS configure with find_package(cereal CONFIG REQUIRED). Please validate cerealConfig.cmake / cereal::cereal and export the prefix through CMAKE_PREFIX_PATH, or fail in the Toolchain stage.

  3. toolchain/scripts/stage4/install_rapidjson.sh: same issue as cereal. The system path only proves rapidjson/rapidjson.h exists, but ABACUS now requires RapidJSONConfig.cmake and the RapidJSON target.

  4. Docs: docs/advanced/install.md and docs/quick_start/easy_install.md still recommend prefix-style Libxc_DIR, CEREAL_INCLUDE_DIR, and say defining Libxc_DIR enables Libxc. Please update these to the new contract: -DENABLE_LIBXC=ON plus CMAKE_PREFIX_PATH=<install-prefixes>, while noting that <Package>_DIR must point to the exact config-file directory if used directly.

  5. Optional cleanup: toolchain/build_abacus_aocc-aocl.sh still passes old CEREAL_INCLUDE_DIR and RapidJSON_DIR=$RAPIDJSON_ROOT. These are not the main blocker if setup exports the right prefix, but they keep the obsolete interface alive and should be removed for consistency with the other wrapper scripts.

@Growl1234

Growl1234 commented Jun 9, 2026

Copy link
Copy Markdown
Author
  • It also works with exact Libxc_DIR=/opt/devtools/libxc/libxc-7.0.0-avx2/lib/cmake/Libxc.
  • It fails with the old documented prefix-style Libxc_DIR=/opt/devtools/libxc/libxc-7.0.0-avx2, which is expected under standard CMake semantics.

These two sentences sounds inconsistent... And as mentioned above, Libxc_DIR should not be used from user's perspective.

More importantly, sourcing an existing ABACUS Toolchain setup generated with --with-libxc=system does not expose Libxc through CMAKE_PREFIX_PATH; configuring this PR then fails at CMakeLists.txt:859 (find_package) until the Libxc prefix is added manually.

The variable CMAKE_PREFIX_PATH is actually similar to PATH, LD_LIBRARY_PATH, and PKG_CONFIG_PATH, all of which are assumed to have been properly set by users themselves when detecting with __SYSTEM__.

  • So the default install path is fine; the problem is the non-install/system/user discovery contract.

It's clear that with current clean-up the "non-install" path is no longer supported. I don't really understand why one would prefer that way...

4. Docs: docs/advanced/install.md and docs/quick_start/easy_install.md still recommend prefix-style Libxc_DIR, CEREAL_INCLUDE_DIR, and say defining Libxc_DIR enables Libxc. Please update these to the new contract: -DENABLE_LIBXC=ON plus CMAKE_PREFIX_PATH=<install-prefixes>, while noting that <Package>_DIR must point to the exact config-file directory if used directly.
5. Optional cleanup: toolchain/build_abacus_aocc-aocl.sh still passes old CEREAL_INCLUDE_DIR and RapidJSON_DIR=$RAPIDJSON_ROOT. These are not the main blocker if setup exports the right prefix, but they keep the obsolete interface alive and should be removed for consistency with the other wrapper scripts.

Thanks, I will work on this.

@Growl1234

Copy link
Copy Markdown
Author

Regarding Cereal and RapidJSON detection, I'll also try cleanups

@Growl1234

Growl1234 commented Jun 9, 2026

Copy link
Copy Markdown
Author

And currently, the system finding logic of cereal would possibly be not always valid because INCLUDE_PATHS does not seem to contain user-defined installtion path. I will not rely on CPATH; although it can be defined by users like PATH etc, it's neither general nor prefix variable.

Unfortunately, building cereal would not pull out a library (e.g. libcereal.a, so we can at first check the existence of library and then assume the pkg_install_dir accordingly), so I have no idea how to deal with it well currently.

Regarding libxc, I've checked the system installation of Ubuntu (to 22.04) and Fedora (to 43) through https://pkgs.org, and the results show that all those systems build DEB/RPM package of libXC with CMake, and find_package can actually find those CMake config files under /usr/lib64/cmake or /usr/share/cmake. Another point is, although libxc documents itself as preferring Autotools, the actual release package does not provide a present configure file, which means user has to re-generate it through autoreconf -i. Therefore, I think there's actually no reason to persist on building libxc with Autotools; we should assume CMake as the only convenience path to build libxc. I don't think it's worth effort optimizing libxc system finding logic.

@QuantumMisaka QuantumMisaka left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I re-tested the current head (06dca8d8768169dd09c40f14a18af0bfd0eff976) with the PR Toolchain-installed cereal/RapidJSON packages and SAI's existing CMake-package Libxc.

The Toolchain side for the default install path looks good for the packages touched here: the PR installs

  • cereal-22a1b36/lib/cmake/cereal/cerealConfig.cmake
  • rapidjson-24b5e7a/lib/cmake/RapidJSON/RapidJSONConfig.cmake

and exports their install prefixes through CMAKE_PREFIX_PATH. A minimal GNU Toolchain run also succeeds with and without --enable-cuda --gpu-ver=70.

However, the source build is still blocked when ENABLE_LCAO=ON and ENABLE_LIBXC=ON, even though CMake configuration succeeds and finds all package configs:

  • Libxc_DIR=/opt/devtools/libxc/libxc-7.0.0-avx2/lib/cmake/Libxc
  • cereal_DIR=/tmp/abacus-pr7449-verify/toolchain/install/cereal-22a1b36/lib/cmake/cereal
  • RapidJSON_DIR=/tmp/abacus-pr7449-verify/toolchain/install/rapidjson-24b5e7a/lib/cmake/RapidJSON

CPU/GNU build command failed at rdmft_tools.cpp:

source/source_hamilt/module_xc/xc_functional.h:9:10: fatal error: xc.h: No such file or directory

The same failure appears in the GNU+CUDA configuration after CMake finds CUDA 12.4/nvcc and generates successfully.

Root cause: the PR correctly propagates RapidJSON and cereal through abacus_external_deps, but Libxc is only linked to ${ABACUS_BIN_NAME}:

target_link_libraries(${ABACUS_BIN_NAME} PRIVATE Libxc::xc)
abacus_add_feature_definitions(USE_LIBXC)

At the same time, USE_LIBXC is propagated to object libraries such as rdmft; those targets include headers that reach <xc.h>, but they do not receive Libxc::xc's INTERFACE_INCLUDE_DIRECTORIES. Previously this was hidden by global include_directories(${Libxc_INCLUDE_DIRS}).

So I do not think this PR is ready aside from documentation. Please propagate Libxc usage requirements to the targets that compile Libxc-dependent sources, preferably consistently with the new design, e.g. by adding Libxc::xc to abacus_external_deps when ENABLE_LIBXC=ON rather than linking it only to the final executable.

@Growl1234

Copy link
Copy Markdown
Author

Thanks for reporting this issue. I'll have a look at why this could happen and give a further clean-up.

@Growl1234

Copy link
Copy Markdown
Author

This cleanup can hopefully fix the issue. It might take some time before submitting a change related to Docs as I'm currently busy on CMake regression fixings at another repos.

@Growl1234 Growl1234 force-pushed the toolchain branch 2 times, most recently from 4483166 to 908b32d Compare June 11, 2026 04:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Compile & CICD & Docs & Dependencies Issues related to compiling ABACUS

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants