Pull requests: pytorch/ci-infra
[CRCR] Rename OOT → CRCR in Terraform config and README
#845
opened Jun 26, 2026 by
subinz1
2 tasks
Node compactor: pending-pod awareness and peak-window taint dampening
#839
opened Jun 25, 2026 by
jeanschmidt
Contributor
osdc/hf-cache: scripts/hf-cache-seed.py to seed models into bucket(s)
#833
opened Jun 25, 2026 by
huydhn
Contributor
osdc/hf-cache: integration test — refresh (online download → S3 sync)
#818
opened Jun 24, 2026 by
huydhn
Contributor
osdc/hf-cache: integration tests — mount, strict offline read, large-model read
#817
opened Jun 24, 2026 by
huydhn
Contributor
osdc/hf-cache: wire into clusters.yaml + arc-runners (gated, not enabled)
#813
opened Jun 23, 2026 by
huydhn
Contributor
osdc/hf-cache: rclone read-only mount DaemonSet + scheduling gate
#812
opened Jun 23, 2026 by
huydhn
Contributor
osdc/hf-cache: terraform — per-cluster S3 bucket + read-only IRSA role
#811
opened Jun 23, 2026 by
huydhn
Contributor
bin-pack-scheduler: enable on meta-staging-aws-ue1
#807
opened Jun 23, 2026 by
georgehong
Contributor
Draft
[CRCR] Add EventBridge sweeper configuration and interval variable
#806
opened Jun 23, 2026 by
can-gaa-hou
Collaborator
arc-runners: let runner defs select a workflow scheduler
#804
opened Jun 22, 2026 by
georgehong
Contributor
Draft
Fix recurring dcgm-exporter OOMKill on dense-GPU nodes
#799
opened Jun 19, 2026 by
huydhn
Contributor
[TEST-ONLY] Validate NUMA scheduling on A100 (p4d) in ue1 staging
#778
opened Jun 16, 2026 by
georgehong
Contributor
Draft
arc-runners: support per-def workflow schedulerName (#696)
#759
opened Jun 15, 2026 by
georgehong
Contributor
Draft
[TEST-ONLY] Enable NUMA modules on arc-staging with g4dn.metal (T4)
#748
opened Jun 12, 2026 by
georgehong
Contributor
Draft
Enable NUMA-aware scheduling for H100 4-GPU runner (#696)
#740
opened Jun 11, 2026 by
georgehong
Contributor
Draft
Add NFD topology-updater module with startup taint remover (#696)
#738
opened Jun 11, 2026 by
georgehong
Contributor
Draft
Enable NUMA-aware scheduling for H100 4-GPU runner (#696)
#737
opened Jun 11, 2026 by
georgehong
Contributor
Draft
Document ARC stale scale set recovery runbook
#717
opened Jun 9, 2026 by
zxiiro
Collaborator
Add NFD topology-updater and numa-scheduler modules (#696)
#716
opened Jun 9, 2026 by
georgehong
Contributor
Draft