fix: bounds-check ELF section and symbol-table parsing#568
Conversation
This comment has been minimized.
This comment has been minimized.
jbachorik
left a comment
There was a problem hiding this comment.
Sphinx multi-stream review (generalist + specialist + adversary).
Overall: The core hardening is correct — offset-based iteration, division-based index bounds, inImage/sectionAt/strAt layering, and NUL-termination enforcement are all sound. Two additional gaps in the same file are noted below as they are directly related to the PR's stated goal.
⚠ HIGH (out-of-diff, same function as the fix) — symbols_linux.cpp ~line 642
parseDynamicSection captures strsz from DT_STRSZ but never uses it to guard the two addImport calls in the .rela.plt/.rela.dyn relocation loops:
_cc->addImport((void**)(base + r->r_offset), strtab + sym->st_name); // unboundedEvery other name lookup in the file now routes through strAt(); these two are the only survivors — an internally inconsistent gap given the PR's own discipline.
Fix: Replace strtab + sym->st_name with strAt(strtab, strsz, sym->st_name) and skip the import when it returns NULL, mirroring loadSymbolTable.
MEDIUM (out-of-diff) — symbols_linux.cpp ~line 497
findProgramHeader and calcVirtualLoadAddress iterate the program-header table with raw e_phoff/e_phnum/e_phentsize arithmetic and no inImage() check — asymmetric with the now-hardened section-header path. The only guard in parseProgramHeaders validates the start of the first entry but not the full table.
Fix: Add a phdrAt() accessor analogous to sectionAt(): validate e_phentsize >= sizeof(ElfProgramHeader), index < e_phnum, and inImage(pheader, sizeof(ElfProgramHeader)) before each dereference.
| // strsz (DT_STRSZ) bounds string-table reads. The dynamic symbol | ||
| // region is virtual-address-relative live memory, so it is iterated | ||
| // as before rather than clamped to the file image. | ||
| loadSymbolTable(symtab, syment * nsyms, syment, strtab, strsz); |
There was a problem hiding this comment.
[MEDIUM · correctness] When DT_STRSZ is absent from the dynamic section, strsz stays 0. strAt(strings, 0, off) rejects every offset ≥ 1, so all dynamic symbols are silently dropped — a functional regression for any DSO that loads correctly without DT_STRSZ.
Suggestion: when strsz == 0, either fall back to the pre-hardening path for dynamic symbols, or emit a warning so the loss is observable rather than silent.
|
[HIGH · security] captures if (sym->st_name != 0) {
_cc->addImport((void**)(base + r->r_offset), strtab + sym->st_name); // unbounded
}Every other name lookup in the file now routes through Fix: Replace |
|
[MEDIUM · security] const char* pheaders = (const char*)_header + _header->e_phoff;
for (int i = 0; i < _header->e_phnum; i++) {
ElfProgramHeader* pheader = (ElfProgramHeader*)(pheaders + i * _header->e_phentsize);
if (pheader->p_type == type) { ... } // no inImage() guard
}This is asymmetric with the now-hardened section-header path ( Fix: Add a |
7e4c56b to
990a849
Compare
jbachorik
left a comment
There was a problem hiding this comment.
Sphinx Review — fix/elf-parser-oob — 2026-06-02
3 findings: 2 × MEDIUM, 1 × LOW. The bounds-hardening is solid overall; the issues below are incomplete coverage of the same pattern the PR itself introduced.
Streams: generalist (Opus) · specialist (Haiku lens-match · Sonnet scout · Opus review, 0 lenses applied) · adversary (Opus, 0 refutations)
Total reviewed: 7 → 4 valid, 3 structurally invalid (mutation findings outside diff)
| return NULL; | ||
| } | ||
| ElfProgramHeader* ph = (ElfProgramHeader*)( | ||
| (const char*)_header + _header->e_phoff + (size_t)index * _header->e_phentsize); |
There was a problem hiding this comment.
[CONSENSUS · MEDIUM · security] phdrAt() / contentAt() — pointer-arithmetic UB before inImage() guard
Both phdrAt() and contentAt() form the file-offset pointer before inImage() validates it:
// phdrAt (this line):
ElfProgramHeader* ph = (ElfProgramHeader*)((const char*)_header + _header->e_phoff + (size_t)index * _header->e_phentsize);
return inImage(ph, sizeof(ElfProgramHeader)) ? ph : NULL;
// contentAt (line ~411):
const char* p = (const char*)_header + s->sh_offset;
return inImage(p, want) ? p : NULL;With an attacker-controlled e_phoff / sh_offset near UINT64_MAX, the addition overflows (UB). An optimising compiler may assume the addition does not overflow and elide the >= _header branch inside inImage(), letting the wrapped pointer pass the guard.
The constructor already applies the correct integer-first pattern for _sections / e_shoff:
_sections = (image_size >= sizeof(ElfHeader) && _header->e_shoff < image_size)
? (const char*)addr + _header->e_shoff : NULL;Suggested fix — apply the same pattern in both accessors:
// contentAt — before forming the pointer:
if (s->sh_offset > (size_t)(_image_end - (const char*)_header)) return NULL;
const char* p = (const char*)_header + s->sh_offset;
// phdrAt — before forming the pointer:
if (_header->e_phoff > (size_t)(_image_end - (const char*)_header)) return NULL;
ElfProgramHeader* ph = (ElfProgramHeader*)((const char*)_header + _header->e_phoff + (size_t)index * _header->e_phentsize);| @@ -569,7 +657,10 @@ void ElfParser::parseDynamicSection() { | |||
| ElfRelocation* r = (ElfRelocation*)(jmprel + offs); | |||
| ElfSymbol* sym = (ElfSymbol*)(symtab + ELF_R_SYM(r->r_info) * syment); | |||
There was a problem hiding this comment.
[SPECIALIST · LOW · security] Dynamic-section relocation loops have no symbol-index bound (unlike addRelocationSymbols)
ElfSymbol* sym = (ElfSymbol*)(symtab + ELF_R_SYM(r->r_info) * syment);
if (sym->st_name != 0) { ... } // dereferenced with no index checkaddRelocationSymbols (file-section path) was hardened with max_sym_index in this PR; the parallel dynamic-section path here was not. A corrupt dynamic section with a large r_info symbol index or large syment yields an out-of-range ElfSymbol* that is then dereferenced.
Lower severity because this path operates on linker-validated live memory rather than an attacker-supplied file. Consider applying the same max_sym_index bound, or adding a comment that explicitly acknowledges the asymmetry.
jbachorik
left a comment
There was a problem hiding this comment.
Sphinx Review (run 2) — fix/elf-parser-oob — 2026-06-02
This run surfaced 1 × HIGH and 2 × MEDIUM not in the previous review. Posting only those here.
Streams: generalist (Opus) · specialist (Haiku + Sonnet + Opus) · adversary (Opus, HIGH confirmed)
Total: 10 reviewed → 9 valid
| if (s == NULL) { | ||
| return NULL; | ||
| } | ||
| const char* p = (const char*)_header + s->sh_offset; |
There was a problem hiding this comment.
[CONSENSUS · HIGH · correctness · ADVERSARY-TESTED] contentAt() / phdrAt() — pointer-arithmetic overflow UB fires under project's own sanitizer config
Both contentAt() (this line) and phdrAt() (lines 433-434) form _header + attacker-controlled uint64 before inImage() validates the result:
// contentAt:
const char* p = (const char*)_header + s->sh_offset; // ← UB if sh_offset near UINT64_MAX
return inImage(p, want) ? p : NULL;
// phdrAt:
(const char*)_header + _header->e_phoff + (size_t)index * _header->e_phentsizeThe project builds with -fsanitize=pointer-overflow -fno-sanitize-recover=all (ConfigurationPresets.kt:171-176) and the fuzz target with -fsanitize=fuzzer,address,undefined (FuzzTargetsPlugin.kt:203). Forming _header + huge_uint64 is a non-recoverable abort under the project's own CI config when the fuzz corpus exercises a large-offset section. A wrap-around value (e.g. UINT64_MAX - delta) can also land back inside [_header, _image_end), passing inImage() and returning a bogus in-image pointer.
The constructor already applies the safe integer-first pattern for _sections/e_shoff:
_sections = (image_size >= sizeof(ElfHeader) && _header->e_shoff < image_size)
? (const char*)addr + _header->e_shoff : NULL;extractBuildIdFromMemory uses it too. contentAt/phdrAt should match:
// contentAt — validate offset before pointer formation:
size_t img_size = (size_t)(_image_end - (const char*)_header);
if (s->sh_offset > img_size || want > img_size - s->sh_offset) return NULL;
const char* p = (const char*)_header + s->sh_offset;
// phdrAt — similarly in integer space before casting:
size_t phoff = _header->e_phoff;
size_t stride = (size_t)index * _header->e_phentsize;
if (phoff > img_size || stride > img_size - phoff) return NULL;
ElfProgramHeader* ph = (ElfProgramHeader*)((const char*)_header + phoff + stride);| void ElfParser::parseProgramHeaders(CodeCache* cc, const char* base, const char* end, bool relocate_dyn) { | ||
| ElfParser elf(cc, base, base, NULL, relocate_dyn); | ||
| ElfParser elf(cc, base, base, (size_t)(end - base), NULL, relocate_dyn); | ||
| if (elf.validHeader() && base + elf._header->e_phoff < end) { |
There was a problem hiding this comment.
[SPECIALIST · MEDIUM · correctness] parseProgramHeaders outer guard — same pointer-overflow UB on e_phoff
if (elf.validHeader() && base + elf._header->e_phoff < end) {e_phoff is Elf64_Off (uint64_t) and fully attacker-controlled. The addition base + e_phoff is pointer arithmetic on an unbounded uint64 — same UBSan abort risk as contentAt/phdrAt above. Additionally a crafted e_phoff designed to overflow back to a small value satisfies base + e_phoff < end even when e_phoff clearly exceeds the image size, allowing the body to execute when it should be skipped.
The constructor checks e_shoff < image_size (safe integer form) a few lines above. Apply the same here:
if (elf.validHeader() && elf._header->e_phoff < (size_t)(end - base)) {This avoids forming the overflowed pointer entirely.
| if (strsz == 0) { | ||
| Log::warn("DT_STRSZ absent from dynamic section in %s; string lookups will be unbounded", | ||
| _file_name != NULL ? _file_name : "unknown"); | ||
| strsz = (size_t)-1; |
There was a problem hiding this comment.
[SPECIALIST · MEDIUM · robustness] strsz = (size_t)-1 fallback — unbounded memchr over live memory
if (strsz == 0) {
Log::warn("No DT_STRSZ in dynamic section of %s", _file_name);
strsz = (size_t)-1; // ← SIZE_MAX
}Every subsequent strAt(strtab, strsz, sym->st_name) call resolves to memchr(strtab + off, '\0', SIZE_MAX - off). memchr stops at the first NUL, so for a well-formed library this is fast. However:
- Fault: if
DT_STRTABpoints to a memory region that has no NUL byte before an unmapped page (corrupt or hand-crafted library),memchrwalks off the end of mapped memory and the profiler thread faults. - DoS: a pathological-but-valid library with very long symbol names causes
memchrto scan megabytes per symbol per library load.
Cap the fallback to a sane upper bound — the fuzz target already uses 4 MB as its limit:
strsz = 4 * 1024 * 1024; // bound worst-case scan; matches fuzz target capThis still handles all real-world libraries while preventing both failure modes.
jbachorik
left a comment
There was a problem hiding this comment.
Looks ok - not an ELF format expert but nothing seems to be obviously off.
7c46187 to
9943627
Compare
- phdrAt(): validate e_phoff before forming pointer (UB when e_phoff wraps image bounds), mirroring the e_shoff pre-check in the constructor - parseDynamicSection(): cap strsz fallback to 1 MB instead of SIZE_MAX; a SIZE_MAX memchr scan crashes on libraries missing DT_STRSZ - parseDwarfInfo(): migrate program-header loop to phdrAt(), matching the fixes already applied to findProgramHeader() and calcVirtualLoadAddress() in the previous commit Tests: - elfparser_ut: programHeaderOffsetOutOfBounds, symbolNameOffsetOutOfBounds - dwarf_ut: FdeAtExactImageBoundary, FdeExceedsImageEnd
Both functions formed `_header + attacker-controlled-uint64` before inImage() could validate the result. Under the project's own CI config (-fsanitize=pointer-overflow -fno-sanitize-recover=all / fuzzer -fsanitize=fuzzer,address,undefined) this is a non-recoverable abort. contentAt(): validate sh_offset in integer space before the cast: if (sh_offset > img_size || want > img_size - sh_offset) return NULL; phdrAt(): validate both e_phoff and index*e_phentsize in integer space: if (phoff > img_size || stride > img_size - phoff) return NULL; Both now match the safe integer-first pattern already used for _sections/e_shoff in the constructor.
6b3605e
into
DataDog:main
What does this PR do?:
Add bound checks to the elf section and symbol table parsing.
This adds both unit test and minimized fuzz seeds to the repo (full reproducers are a bit large to be in Git IMO)
Motivation:
Fix overflow / out of bound reads.
Additional Notes:
How to test the change?:
For Datadog employees:
credentials of any kind, I've requested a review from
@DataDog/security-design-and-guidance.Unsure? Have a question? Request a review!