GH-1067: Close cached HDFS FileSystem instances #1141

Open
xborder wants to merge 2 commits into apache:main from xborder:gh-1067

Conversation

@xborder (Contributor) commented May 7, 2026

What's Changed

  • This PR fixes JVM shutdown hangs after reading HDFS datasets through Arrow Java.
  • FileSystemDatasetFactory now tracks the hdfs:// URIs it was created with. On close(), after releasing the native dataset factory, it makes a best-effort attempt to close the matching Hadoop FileSystem instances.
  • The Hadoop cleanup is done via reflection so Arrow Java does not add a production dependency on Hadoop. Non-HDFS URIs are ignored.
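
The reflection-based cleanup described above could look roughly like the sketch below. This is not the code from this PR: the class name `HdfsCleanupSketch` and its methods are hypothetical, and the native factory release is omitted. The only Hadoop calls it looks up are `FileSystem.get(URI, Configuration)` and `FileSystem.close()`. Whether `FileSystem.get` returns the same cached instance that the native reader used depends on Hadoop's cache settings and is assumed here.

```java
import java.lang.reflect.Method;
import java.net.URI;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch of best-effort HDFS cleanup without a compile-time Hadoop dependency. */
final class HdfsCleanupSketch implements AutoCloseable {
  private final List<URI> hdfsUris = new ArrayList<>();

  HdfsCleanupSketch(List<String> uris) {
    for (String s : uris) {
      URI uri = URI.create(s);
      // Only hdfs:// URIs are tracked; every other scheme is ignored.
      if ("hdfs".equalsIgnoreCase(uri.getScheme())) {
        hdfsUris.add(uri);
      }
    }
  }

  List<URI> trackedHdfsUris() {
    return Collections.unmodifiableList(hdfsUris);
  }

  @Override
  public void close() {
    // A real factory would release its native resources first (omitted in this sketch).
    for (URI uri : hdfsUris) {
      closeHadoopFileSystem(uri);
    }
  }

  private static void closeHadoopFileSystem(URI uri) {
    try {
      // Hadoop classes are resolved by name, so they are only needed at runtime.
      Class<?> confClass = Class.forName("org.apache.hadoop.conf.Configuration");
      Class<?> fsClass = Class.forName("org.apache.hadoop.fs.FileSystem");
      Object conf = confClass.getDeclaredConstructor().newInstance();
      Method get = fsClass.getMethod("get", URI.class, confClass);
      Object fs = get.invoke(null, uri, conf);
      fsClass.getMethod("close").invoke(fs);
    } catch (ReflectiveOperationException | LinkageError | RuntimeException e) {
      // Best effort: Hadoop missing from the classpath or a failed close must not break close().
    }
  }
}
```

With this shape, a caller without Hadoop on the classpath pays no cost beyond a failed class lookup, and a failure to close one FileSystem does not prevent the attempt on the remaining URIs.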

Closes #1067.


@jbonofre added this to the 20.0.0 milestone May 7, 2026
@jbonofre added the bug-fix label (PRs that fix a bug) May 7, 2026

Labels

bug-fix: PRs that fix a bug.


Development

Successfully merging this pull request may close these issues.

[ARROW Java][HDFS] JVM hangs after reading HDFS files via Arrow Dataset API due to non-daemon native threads

2 participants