Corrupted output from base64-encode stream filter over slow http stream #22360

@dmjohnsson23

Description

I've been having an issue where the 'base64-encode' stream filter produces corrupt output under certain circumstances.

This code:

$streamUrl = "http://localhost/generate.pdf.php";
$streamSessionId = internal_function_here_gets_valid_session_id();
$streamContext = stream_context_create([
    'http'=> [
        'method'=>"GET",
        'header'=>"Accept-language: en\r\n" .
        "Cookie: ".session_name()."=$streamSessionId\r\n"
    ]
]);
$stream = fopen($streamUrl, 'r', false, $streamContext);
\stream_filter_append($stream, 'convert.base64-encode');
header('Content-Type: application/pdf');
echo base64_decode(\stream_get_contents($stream));

Produces this corrupt PDF:

bad.pdf

Here's the raw base64 that was produced:

$streamUrl = "http://localhost/generate.pdf.php";
$streamSessionId = internal_function_here_gets_valid_session_id();
$streamContext = stream_context_create([
    'http'=> [
        'method'=>"GET",
        'header'=>"Accept-language: en\r\n" .
        "Cookie: ".session_name()."=$streamSessionId\r\n"
    ]
]);
$stream = fopen($streamUrl, 'r', false, $streamContext);
\stream_filter_append($stream, 'convert.base64-encode');
header('Content-Type: text/plain');
echo \stream_get_contents($stream);

bad.pdf.base64.txt

By contrast, if I use the base64_encode() function rather than the stream filter, I get different (valid) results:

$streamUrl = "http://localhost/generate.pdf.php";
$streamSessionId = internal_function_here_gets_valid_session_id();
$streamContext = stream_context_create([
    'http'=> [
        'method'=>"GET",
        'header'=>"Accept-language: en\r\n" .
        "Cookie: ".session_name()."=$streamSessionId\r\n"
    ]
]);
$stream = fopen($streamUrl, 'r', false, $streamContext);
header('Content-Type: text/plain');
echo base64_encode(\stream_get_contents($stream));

good.pdf.base64.txt
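For anyone who needs to avoid the filter without holding the whole response in memory, one option is to encode incrementally by hand. This is only a sketch: `base64_encode_stream()` is a hypothetical helper name and the chunk size is arbitrary. It carries leftover bytes forward so that `base64_encode()` only ever sees a multiple of 3 bytes until the final call, which keeps padding out of the middle of the output.

```php
<?php
/**
 * Hypothetical helper: base64-encode $in into $out without using the
 * convert.base64-encode stream filter.
 *
 * @param resource $in  readable stream
 * @param resource $out writable stream
 */
function base64_encode_stream($in, $out, int $chunkSize = 8190): void
{
    $carry = '';
    while (!feof($in)) {
        $data = fread($in, $chunkSize);
        if ($data === false) {
            throw new RuntimeException('Reading from the input stream failed');
        }
        $carry .= $data;
        // Network reads may return any number of bytes, so encode only whole
        // 3-byte groups and keep the remainder for the next iteration.
        $usable = strlen($carry) - (strlen($carry) % 3);
        if ($usable > 0) {
            fwrite($out, base64_encode(substr($carry, 0, $usable)));
            $carry = substr($carry, $usable);
        }
    }
    // Only the final 1 or 2 leftover bytes may produce padding.
    if ($carry !== '') {
        fwrite($out, base64_encode($carry));
    }
}
```

With the `$stream` from the snippets above, `base64_encode_stream($stream, fopen('php://output', 'w'));` would stand in for `echo \stream_get_contents($stream);`.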

And the actual PDF:

$streamUrl = "http://localhost/generate.pdf.php";
$streamSessionId = internal_function_here_gets_valid_session_id();
$streamContext = stream_context_create([
    'http'=> [
        'method'=>"GET",
        'header'=>"Accept-language: en\r\n" .
        "Cookie: ".session_name()."=$streamSessionId\r\n"
    ]
]);
$stream = fopen($streamUrl, 'r', false, $streamContext);
header('Content-Type: application/pdf');
echo \stream_get_contents($stream);

good.pdf

Note that the internal URL above generates the PDF dynamically (using PHP); I cannot replicate the issue with a static PDF served directly by Apache. Nevertheless, I don't think the generating URL is to blame, as it works fine as long as the stream filter is not applied. Something about the way PHP interacts with itself over HTTP seems to trigger the bug.

Reducing the dynamic PDF generation endpoint proved to be a bit of a challenge, as seemingly unrelated things would affect whether the bug occurred or not. Eventually, though, I was able to reduce it to:

// This is generate.pdf.php
$pdf = file_get_contents('good.pdf');
header('Content-Type: application/pdf');
echo $pdf;
sleep(1);

The sleep() call at the end turned out to be the trigger: if the HTTP connection doesn't close immediately after the body is sent, the stream filter's output is corrupted.

A few observations about the corruption itself:

  • The first 73,719 bytes of both PDFs are identical, but they diverge completely from that point forward
  • Likewise, the first 1,223,189 characters of both base64 strings are identical, but they diverge completely from that point forward
  • The stream filter inserted two stray padding characters in the middle of the string, right at the point where things started to go wrong
  • Despite those added padding characters, both base64 strings and both PDFs are the same size (the good base64 string has two padding characters in the normal place at the end of the string)
  • Although PHP is able to decode the bad base64, other systems (such as Python) cannot
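For reference, checks like the ones above can be reproduced with a few lines of PHP. The helper names `first_divergence()` and `misplaced_padding_offset()` are hypothetical, and the two short strings are toy stand-ins for the attached files. The last call uses the strict mode of `base64_decode()`, which (unlike the default lenient mode) rejects input where data follows padding.

```php
<?php
// Hypothetical helpers for comparing two outputs; names are illustrative.

/** Offset of the first differing byte, or null if the strings are identical. */
function first_divergence(string $a, string $b): ?int
{
    // XOR yields "\0" wherever the bytes match (result has the shorter length).
    $common = strspn($a ^ $b, "\0");
    if ($common === strlen($a) && $common === strlen($b)) {
        return null;
    }
    return $common;
}

/** Offset of the first '=' run that is followed by more base64 data, or null. */
function misplaced_padding_offset(string $b64): ?int
{
    if (preg_match('/=+(?=[A-Za-z0-9+\/])/', $b64, $m, PREG_OFFSET_CAPTURE)) {
        return $m[0][1];
    }
    return null;
}

// Toy data: "AAAA" encoded in one go, versus "A" and "AAA" encoded separately.
$good = 'QUFBQQ==';
$bad  = 'QQ==QUFB';

var_dump(first_divergence($good, $bad));   // where the two strings stop matching
var_dump(misplaced_padding_offset($bad));  // where padding appears mid-string
var_dump(base64_decode($bad, true));       // strict mode refuses the input
```

To run the same checks against the attachments, load them with `file_get_contents('good.pdf.base64.txt')` and `file_get_contents('bad.pdf.base64.txt')` in place of the toy strings.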

My best guess is that the bug has something to do with timing and/or buffering in the stream filter implementation.
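As an illustration of that guess (not a confirmed diagnosis), the observed symptoms are what one would see if the encoder were finalized partway through the data and then started over. The sketch below emulates that with plain `base64_encode()` calls; `encode_with_restart()` is a hypothetical helper, and the 19-byte input and restart offset of 10 are arbitrary values chosen so that both leave a remainder of 1 when divided by 3.

```php
<?php
// Illustration only: emulates an encoder that is finalized early and then
// starts over. Whether the stream filter really does this is an assumption.
function encode_with_restart(string $data, int $restartAt): string
{
    return base64_encode(substr($data, 0, $restartAt))
         . base64_encode(substr($data, $restartAt));
}

$data = substr(str_repeat('0123456789', 2), 0, 19); // 19 bytes, 19 % 3 === 1
$good = base64_encode($data);                       // padding only at the end
$bad  = encode_with_restart($data, 10);             // 10 % 3 === 1

echo $good, "\n", $bad, "\n";
printf("same length: %s\n", strlen($good) === strlen($bad) ? 'yes' : 'no');
printf("first '=' in restarted output at offset %d\n", strpos($bad, '='));
```

In this toy case the restarted output has two padding characters in the middle and none at the end, while its total length equals that of the correct output, which matches the pattern described in the list above.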

PHP Version

PHP 8.5.1 (cli) (built: Dec 16 2025 15:59:07) (NTS gcc x86_64)
Copyright (c) The PHP Group
Built by Amazon Linux
Zend Engine v4.5.1, Copyright (c) Zend Technologies
    with Zend OPcache v8.5.1, Copyright (c), by Zend Technologies

Operating System

Amazon Linux 2023
