From d0bfb66641685bf085eb325c511eec2d2054866b Mon Sep 17 00:00:00 2001 From: Michael Vandeberg Date: Mon, 11 May 2026 12:06:38 -0600 Subject: [PATCH] Pass buffer sequences by value in read() and write() The free-function `read` and `write` overloads previously bound the buffer sequence by const reference. With lazy coroutines this dangles: the returned `io_task` can be stored past the full-expression that created the sequence, by which point the const reference points at a destroyed temporary. Switch both to by-value so the sequence lives in the coroutine frame. Update the ReadStream/WriteStream concept docstrings to list only the by-value signature as conforming and document the `std::views::all` caller-side workaround for expensive owning sequences. Append a Resolution section to doc/buffers-passing-rationale.md recording the decision and why both `const&` and `&&` fail under lazy coroutines. Add regression tests that store the returned awaitable past a temporary buffer sequence; verified to trip ASan stack-use-after-scope with the previous signatures and to pass cleanly with by-value. Resolves cppalliance/capy#263. --- doc/buffers-passing-rationale.md | 52 +++++++++++++++++++++ include/boost/capy/concept/read_stream.hpp | 27 ++++++----- include/boost/capy/concept/write_stream.hpp | 21 ++++++--- include/boost/capy/read.hpp | 2 +- include/boost/capy/write.hpp | 2 +- test/unit/read.cpp | 33 +++++++++++++ test/unit/write.cpp | 33 +++++++++++++ 7 files changed, 149 insertions(+), 21 deletions(-) diff --git a/doc/buffers-passing-rationale.md b/doc/buffers-passing-rationale.md index 676014caf..5f172f202 100644 --- a/doc/buffers-passing-rationale.md +++ b/doc/buffers-passing-rationale.md @@ -316,3 +316,55 @@ accept by `const&`. Whether the standard should mandate that implementations keep at least one copy alive for the duration of the operation - regardless of how the parameter is passed - remains an open question. + +## Resolution + +Tracked in [cppalliance/capy#263](https://github.com/cppalliance/capy/issues/263). + +**All capy I/O entry points that accept a buffer sequence take it by +value.** This applies uniformly to member operations (`read_some`, +`write_some`) and to the free-function composed operations (`read`, +`write`). + +The "distinguish the two cases" idea floated in the section above did +not survive scrutiny. The argument that coroutine tasks can safely +accept by `const&` because "the sequence is not a temporary relative +to the suspension" assumed that the coroutine body has run by the +point of suspension. With lazy coroutines, it has not: + +```cpp +auto aw = capy::read(stream, mutable_buffer{p, n}); +// The temporary mutable_buffer dies here, at the end of the +// full-expression. The coroutine body has not begun executing. +auto [ec, k] = co_await std::move(aw); +// If read() took the sequence by const&, this co_await +// dereferences a dangling reference. +``` + +By-rvalue-reference (`Buffers&&`) fails for the same reason: the +coroutine has no opportunity to copy the rvalue into its frame before +the full-expression ends and the rvalue is destroyed. + +**By-value is therefore the only safe convention for any lazy +awaitable** - coroutine-backed or not. The same rule applies to raw +awaitables for the reasons already given (the awaitable must own its +state to support sender pipelines and detached storage). + +### Caller-side workaround for expensive sequences + +The Asio-style assumption that buffer sequences are cheap to copy +still leaks through in cases like `std::vector` with +many entries. Callers in that situation can opt into a reference-like +view at the call site: + +```cpp +std::vector bufs = /* many entries */; +auto [ec, n] = co_await capy::read(stream, std::views::all(bufs)); +``` + +`std::views::all(bufs)` produces a `std::ranges::ref_view` that +satisfies the buffer-sequence concepts and copies in O(1). The caller +takes on the lifetime obligation in exchange for the cheap copy - +the same trade-off any reference-passing convention would impose, +but now opt-in and visible at the call site rather than baked into +the API. diff --git a/include/boost/capy/concept/read_stream.hpp b/include/boost/capy/concept/read_stream.hpp index c79aaac8a..15180b872 100644 --- a/include/boost/capy/concept/read_stream.hpp +++ b/include/boost/capy/concept/read_stream.hpp @@ -68,21 +68,24 @@ namespace capy { @par Conforming Signatures @code - // Templated for any MutableBufferSequence template< MutableBufferSequence MB > - IoAwaitable auto read_some( MB const& buffers ); - - template< MutableBufferSequence MB > - IoAwaitable auto read_some( MB buffers ); // by-value also permitted + IoAwaitable auto read_some( MB buffers ); @endcode - @warning **Coroutine Buffer Lifetime**: When implementing coroutine - member functions, prefer accepting buffer sequences **by value** - rather than by reference. Buffer sequences passed by reference may - become dangling if the caller's stack frame is destroyed before the - coroutine completes. Passing by value ensures the buffer sequence - is copied into the coroutine frame and remains valid across - suspension points. + @warning **Pass buffer sequences by value.** A by-value parameter + is copied into the coroutine frame (or the awaitable's state), + so the returned awaitable is self-contained and may be stored, + moved across threads, or wrapped into a sender without lifetime + concerns. A by-const-reference parameter binds to caller storage + and is only safe when the awaitable is consumed immediately by + `co_await` in the same scope; storing such an awaitable produces + a dangling reference. + + @note Callers who want to avoid copying an expensive buffer + sequence (for example, a `std::vector` with many + entries) can pass `std::views::all(seq)` at the call site. The + resulting `ref_view` satisfies the buffer-sequence concepts and + copies in O(1). See `doc/buffers-passing-rationale.md`. @par Example @code diff --git a/include/boost/capy/concept/write_stream.hpp b/include/boost/capy/concept/write_stream.hpp index d7b020322..c4bd725d2 100644 --- a/include/boost/capy/concept/write_stream.hpp +++ b/include/boost/capy/concept/write_stream.hpp @@ -79,13 +79,20 @@ namespace capy { IoAwaitable auto write_some( Buffers buffers ); @endcode - @warning **Coroutine Buffer Lifetime**: When implementing coroutine - member functions, prefer accepting buffer sequences **by value** - rather than by reference. Buffer sequences passed by reference may - become dangling if the caller's stack frame is destroyed before the - coroutine completes. Passing by value ensures the buffer sequence - is copied into the coroutine frame and remains valid across - suspension points. + @warning **Pass buffer sequences by value.** A by-value parameter + is copied into the coroutine frame (or the awaitable's state), + so the returned awaitable is self-contained and may be stored, + moved across threads, or wrapped into a sender without lifetime + concerns. A by-const-reference parameter binds to caller storage + and is only safe when the awaitable is consumed immediately by + `co_await` in the same scope; storing such an awaitable produces + a dangling reference. + + @note Callers who want to avoid copying an expensive buffer + sequence (for example, a `std::vector` with many + entries) can pass `std::views::all(seq)` at the call site. The + resulting `ref_view` satisfies the buffer-sequence concepts and + copies in O(1). See `doc/buffers-passing-rationale.md`. @par Example diff --git a/include/boost/capy/read.hpp b/include/boost/capy/read.hpp index ea163084b..322d54f16 100644 --- a/include/boost/capy/read.hpp +++ b/include/boost/capy/read.hpp @@ -70,7 +70,7 @@ namespace capy { auto read( ReadStream auto& stream, - MutableBufferSequence auto const& buffers) -> + MutableBufferSequence auto buffers) -> io_task { consuming_buffers consuming(buffers); diff --git a/include/boost/capy/write.hpp b/include/boost/capy/write.hpp index bed7fba99..114b2d601 100644 --- a/include/boost/capy/write.hpp +++ b/include/boost/capy/write.hpp @@ -64,7 +64,7 @@ namespace capy { auto write( WriteStream auto& stream, - ConstBufferSequence auto const& buffers) -> + ConstBufferSequence auto buffers) -> io_task { consuming_buffers consuming(buffers); diff --git a/test/unit/read.cpp b/test/unit/read.cpp index 05095ab96..206b16a38 100644 --- a/test/unit/read.cpp +++ b/test/unit/read.cpp @@ -320,12 +320,45 @@ struct read_test })); } + // Regression: capy#263. Free-function read() must take its buffer + // sequence by value so that storing the returned awaitable past + // the full-expression that created the sequence does not dangle. + void + testReadStoredAwaitableTemporarySequence() + { + BOOST_TEST(test::fuse().armed([](test::fuse& f) -> task + { + test::read_stream rs(f); + rs.provide("helloworld"); + + char storage[10] = {}; + + // The std::array argument is a temporary + // that ends its lifetime at the end of this full-expression. + auto aw = read(rs, std::array{{ + mutable_buffer(storage, 5), + mutable_buffer(storage + 5, 5) + }}); + + // If read() bound the sequence by const&, the awaitable now + // holds a dangling reference and the next line trips ASan + // (or silently reads stale stack). + auto [ec, n] = co_await std::move(aw); + if(ec) + co_return; + + BOOST_TEST_EQ(n, 10u); + BOOST_TEST_EQ(std::string_view(storage, 10), "helloworld"); + })); + } + void testReadStream() { testReadSingleBuffer(); testReadBufferArray(); testReadBufferPair(); + testReadStoredAwaitableTemporarySequence(); } //---------------------------------------------------------- diff --git a/test/unit/write.cpp b/test/unit/write.cpp index a80185e8b..54b56f9c9 100644 --- a/test/unit/write.cpp +++ b/test/unit/write.cpp @@ -252,12 +252,45 @@ struct write_test })); } + // Regression: capy#263. Free-function write() must take its buffer + // sequence by value so that storing the returned awaitable past + // the full-expression that created the sequence does not dangle. + void + testWriteStoredAwaitableTemporarySequence() + { + BOOST_TEST(test::fuse().armed([](test::fuse& f) -> task + { + test::write_stream ws(f); + + char const data1[] = "hello"; + char const data2[] = "world"; + + // The std::array argument is a temporary + // that ends its lifetime at the end of this full-expression. + auto aw = write(ws, std::array{{ + const_buffer(data1, 5), + const_buffer(data2, 5) + }}); + + // If write() bound the sequence by const&, the awaitable now + // holds a dangling reference and the next line trips ASan + // (or silently reads stale stack). + auto [ec, n] = co_await std::move(aw); + if(ec) + co_return; + + BOOST_TEST_EQ(n, 10u); + BOOST_TEST_EQ(ws.data(), "helloworld"); + })); + } + void testWriteStream() { testWriteSingleBuffer(); testWriteBufferArray(); testWriteBufferPair(); + testWriteStoredAwaitableTemporarySequence(); } //----------------------------------------------------------