Skip to content

zstring_view: opt-in zstring_view_safe variant, P3655R0-aligned substr, contains() polyfill#635

Open
mmthomas wants to merge 6 commits into
microsoft:masterfrom
mmthomas:users/mmthomas/zstring-view-cxx20-23-improvements
Open

zstring_view: opt-in zstring_view_safe variant, P3655R0-aligned substr, contains() polyfill#635
mmthomas wants to merge 6 commits into
microsoft:masterfrom
mmthomas:users/mmthomas/zstring-view-cxx20-23-improvements

Conversation

@mmthomas

@mmthomas mmthomas commented May 28, 2026

Copy link
Copy Markdown

Summary

This PR adds a P3655R0-aligned safe variant of basic_zstring_view. A Traits template parameter selects between the existing trust-the-caller semantics and a new variant with runtime fail-fast on null inputs and a non-null data() invariant. Two aliases ship and both are usable in the same TU: wil::zstring_view keeps its existing trust-the-caller semantics, and wil::zstring_view_safe adds the safer behaviour. Smaller P3655R0-alignment items also land: the substr split and the contains() polyfill. A SAL annotation pass on the raw-pointer surface partially closes #278.

What ships

1. Traits-parameterised basic_zstring_view

template <class TChar, class Traits = std::char_traits<TChar>>
class basic_zstring_view : public std::basic_string_view<TChar, Traits> { ... };

// Legacy aliases
using zstring_view       = basic_zstring_view<char>;
using zwstring_view      = basic_zstring_view<wchar_t>;

// Safe aliases - non-null `data()` invariant + runtime fail-fast on nulls
using zstring_view_safe  = basic_zstring_view<char,    zstring_view_traits_safe<char>>;
using zwstring_view_safe = basic_zstring_view<wchar_t, zstring_view_traits_safe<wchar_t>>;

The dispatch is a zstring_view_traits_safe<TChar> traits type that derives from std::char_traits<TChar> and adds a nested empty_strings_are_non_null typedef; a SFINAE detector in wil::details (zsv_empty_is_nonnull_v<T>) picks up that typedef and branches the relevant ctors with if constexpr / enable_if.

The two variants are layout-compatible (both are { const TChar*, size_type } with no vtable); they differ only in the runtime checks that the safe variant interposes at construction.

2. (ptr, len) ctor: enforcement on the safe variant

The safe variant's (ptr, len) ctor routes the pointer through a private _require_non_null helper that calls WI_STL_FAIL_FAST_IF on null before the base ctor sees it. The body additionally guards the trailing-null read on a non-null pointer so a test-harness DoesCodeFailFast detour returns rather than really AVing. It's the same WI_STL_FAIL_FAST_IF primitive (and the same REQUIRE_ERROR-style testability) as the no-NULL-in-range fail-fast that both variants previously shipped.

The legacy variant's (ptr, len) ctor is unchanged: it keeps the same trust-the-caller semantics and the same latent UB on null input, so existing callers see no behaviour change.

3. Cross-variant conversion

  • Safe to default is implicit. The safe variant's invariants are strictly stronger than the default variant's, so a safe-to-default conversion can never violate the target's invariants.
  • Default to safe is explicit and runtime-checked. The default variant can hold data() == nullptr, so the conversion has to verify the source satisfies the safe invariant.
  • Safe to std::basic_string_view<TChar> is implicit, and only present on the safe variant. The legacy variant gets this for free via the base class; the safe variant has to provide it explicitly because its base is std::basic_string_view<TChar, zstring_view_traits_safe<TChar>> (a different Traits).

Cross-variant ctors take their argument by const-reference rather than by value: the standard treats a ctor whose first parameter is the enclosing class type as a copy ctor and forbids the by-value form.

4. Why does the safe variant fail-fast on (nullptr, 0)?

The safe variant's (ptr, len) constructor fail-fasts on a nullptr pointer regardless of length. The C++23 stdlib's basic_string_view(nullptr, 0) is well-defined and produces data() == nullptr, size() == 0 ([string.view.cons]/4: [nullptr, nullptr) is a trivially valid empty range, and nullptr + 0 == nullptr per C++20 [expr.add]). The legacy wil::zstring_view inherits that behaviour.

The safe variant is intentionally stricter. The C++29 draft (P3655R0) specifies a stronger class invariant for basic_zstring_view in [zstring.view.general]: "In all cases, [data(), data() + size()] is a valid range and data() + size() points at an object with value charT()." (nullptr, 0) cannot satisfy that - data() + size() is nullptr and can't be dereferenced for the null terminator. P3655R0 leaves the invariant as a caller precondition (so violating it is UB) and notes that "all [non-(charT*) ctors] could be inadvertently misused."

wil::zstring_view_safe honors the same invariant as P3655R0 and enforces it with fail-fast at the precondition boundary - same contract as the draft, with diagnostics where the draft leaves UB. The choice of fail-fast (not throw) matches the existing no-NULL-in-range path previously shipped on the legacy wil::zstring_view: both ctors are noexcept, both diagnose precondition violations via WI_STL_FAIL_FAST_IF, and both surface as REQUIRE_ERROR in the test suite.

Callers who want the stdlib's nullable-maybe-empty semantics should use the legacy wil::zstring_view. Callers who want "empty intent" should use the default constructor wil::zstring_view_safe{}, which gives a dereferenceable empty buffer at zero runtime cost.

5. Other changes

  • substr is split into two overloads with different return types. substr(pos) returns a basic_zstring_view (the tail of a null-terminated string is itself null-terminated); substr(pos, count) returns a std::basic_string_view (the slice isn't guaranteed null-terminated). This matches P3655R0.
  • A contains() polyfill is added for callers building below C++23, gated on __cpp_lib_string_contains. The gate mirrors the existing __cpp_lib_format pattern in this file.
  • A SAL annotation pass on the raw-pointer surface partially closes #278. The (ptr, length) ctor gets _In_reads_z_(stringLength + 1); the new contains() overload takes _In_z_; c_str() gets _Ret_maybenull_z_ (truthful for both variants).

The doc-block leads with the Traits split, then enumerates the safe variant's invariants, then calls out the slicing escape hatch that public inheritance leaves open (which P3655R0 closes in C++29 by rejecting inheritance outright).

Tests

Tests are parameterised so both variants exercise the same scenarios:

  • A TEMPLATE_TEST_CASE matrix exercises both variants over {wil::zstring_view, wil::zwstring_view, wil::zstring_view_safe, wil::zwstring_view_safe} for construction, element access, starts_with / ends_with, contains, substr(pos) covariance, substr(pos, count) returning string_view, remove_prefix preserving null-termination, conversion to string_view, and constexpr usage. It branches on is_safe_zsv_v<ZSV> where the safe variant's invariants diverge.
  • The hand-rolled per-char-type cases (TestZStringView, TestZWStringView, the literal block, and the std::format block) cover the legacy variant explicitly, and the templated suite gives the safe variant equivalent coverage.
  • A separate TEST_CASE pins down the safe-variant-specific invariants: nullptr_t ctor deletion, (nullptr, 0) fail-fast at runtime, the non-null buffer on default-construct, and the implicit-to-string_view round-trip.

Divergence from P3655R0

The structural difference between WIL and P3655R0 is:

WIL's basic_zstring_view publicly inherits from std::basic_string_view. P3655R0 explicitly rejects inheritance, on the grounds that string_view's operator= can assign a non-null-terminated view onto a zstring_view instance and break the invariant. The proposal's reference implementation is a stand-alone class that holds data_ / size_ members directly.

WIL has shipped public inheritance since the class was introduced; dropping it would require reimplementing every basic_string_view member on the wrapper. The doc-block's @note acknowledges the slicing escape hatch and names P3655R0 as where the design is heading.

Other P3655R0 differences are about WIL's existing shipped public surface, not anything this PR changes:

  • Type alias spelling. WIL ships zwstring_view; P3655R0 prefers wzstring_view.
  • Additional character-type aliases. P3655R0 includes u8zstring_view / u16zstring_view / u32zstring_view; WIL doesn't.

Questions for maintainers

  1. Are zstring_view_safe / zwstring_view_safe the right alias names? We considered safe_zstring_view and checked_zstring_view, and settled on the _safe suffix to keep the legacy aliases as the lexical lead.
  2. Should we override data() to add _Ret_z_ on the safe variant? That would fully close #278 but would require hiding the inherited base method. c_str() is the documented null-terminated accessor and is annotated in this PR; we left data() inherited as the lower-risk choice. We're happy to add the override if you'd rather close wil::zstring_view data and c_str are not SAL annotated to indicate _Ret_z_, triggering analyzers #278 outright.

@mmthomas mmthomas force-pushed the users/mmthomas/zstring-view-cxx20-23-improvements branch 6 times, most recently from c8560cd to ab8ab3c Compare May 28, 2026 15:52
@mmthomas mmthomas marked this pull request as ready for review May 28, 2026 16:35
@mmthomas

Copy link
Copy Markdown
Author

@jonwis — this is the WIL upstream you mentioned wanting. Putting it in your queue for a look when you have time.

Comment thread include/wil/stl.h
Comment thread tests/StlTests.cpp Outdated
REQUIRE(wil::zstring_view{}.empty());
REQUIRE(wil::zstring_view{}.data() == nullptr);
REQUIRE(wil::zstring_view{}.c_str() == nullptr);
REQUIRE(wil::zstring_view{}.data() != nullptr);

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do wonder about the compat impact of this change. (Don't get me wrong, I think it is reasonable and probably better behavior to have a valid empty string for default construction). If there are any APIs that treat nullptr different than an empty string then this change will have side effects. Passing a default-constructed zstring_view will no longer be a nullptr missing parameter.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth thinking about what changes for existing code that does this:

if (zv.data()) {
    log_message(zv.c_str());
}

Under the previous behaviour, a default-constructed zv skipped the body. Under the new behaviour, the guard is always true on a default-constructed view, so log_message("") runs. The C runtime function (or whatever the body calls) sees an empty null-terminated string instead of not being invoked at all.

The observable effect at the call site depends on what the downstream function does with "":

  • A logger emits an empty log entry instead of skipping the line.
  • printf("%s", "") prints nothing.
  • fopen("", "r") fails with ENOENT instead of the path-open being skipped.
  • strlen("") returns 0.

So existing callers that were using nullptr as a sentinel to skip downstream work now invoke that work with an empty string. Within this repo the only call sites exercising the data() == nullptr pattern are the three test-line assertions in the diff. For production code that genuinely needs to distinguish "no string" from "empty string", the idiomatic replacement is std::optional<wil::zstring_view>.

Any recommendations on how to resolve the tension here? Happy to defer to your read on how WIL has handled similar trade-offs before.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have any recommendations here. This is an intentional behavior change so it is all-or-nothing. Either this change is accepted and the behavior changes, or it isn't. I can't see a more-compatible middle ground.

@jonwis @dunhor any thoughts?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to guess that there is existing code that relies on this behavior directly or indirectly. https://en.cppreference.com/cpp/string/basic_string_view/basic_string_view is documented as default-constructing to size==0 and data()==nullptr. My read of this overall change is trying to bring some better compat with the existing standards, so retaining the "it's nullptr" seems right. Nullptr as "not a value" is very common in the APIs that this type is designed to interoperate with.

@mmthomas mmthomas May 29, 2026

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I fully follow, @jonwis. The standard's proposed std::basic_zstring_view would change the behaviour you identified. It also does not derive from std::basic_string_view (perhaps for this reason, plus the slicing problem). What would you propose we do here?

Keep wil::basic_zstring_view aligned with the behaviour that std::basic_string_view brings (since it derives), or update it to match the better future?

Is there an opt-in or opt-out feature macro pattern we can use here that wil has used in the past to turn on the stricter/safer behaviour without breaking the data() == nullptr assumtion earlier code uses?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found a pre-existing pattern in WIL to use to enable an opt-in approach that preserves existing caller assumptions regarding the data() value. The default behaviour is restored to = default; with data() == nullptr; callers that want the P3655R0 behaviour opt in by defining the new WIL_ZSV_DEFAULT_TO_EMPTY_BUFFER macro before including wil/stl.h, which flips the default constructor to point at a static empty buffer.

The mechanism is WI_ODR_PRAGMA, the same approach used by WIL_KERNEL_MODE and WIL_EXCEPTION_MODE; LNK2038 fires on the tag ODR_violation_WIL_ZSV_DEFAULT_TO_EMPTY_BUFFER_mismatch if two translation units disagree, and the developer can resolve the error by aligning either side with the other. The new tests/zsvsafe/ flavor mirrors tests/normal/ with the macro defined, so CI exercises both code paths.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seems to be a pre-existing assumption that this type is intended to implement/follow P3655R0 when in reality it was introduced independently to solve similar problems, and the two just naturally have different designs. In a vacuum I think I'd agree with the "always return a non-null string" design, however I agree with Jon and David's assessment that changing the behavior by default is too risky. I see three possible paths:

  1. Use an ifdef (as you have now) to opt-in to the alternative behavior. Personally, I'm not a big fan of this. If you are not careful, you can easily run into annoying ODR issues. You work around this by using the pragma, however that comes with its own headaches. In particular, all libraries that get linked together need to agree. This is not a huge deal in smaller projects; however larger projects are basically prohibited from adopting the new behavior (e.g. the Windows source code). This is made worse by the fact that this mechanism of detecting ODR issues is at the include level - it does not consider whether or not a translation unit even uses the type. Overall, this is not too bad; such projects just wouldn't adopt the new behavior, which is equivalent to the new behavior having never been added, so I don't outright object to it.
  2. Adopt the new behavior through the Traits template argument. E.g. something like a boolean Traits::empty_strings_maybe_null or Traits::c_str() that translates null pointers to empty string literals. Of course, a layer of indirection would be necessary to allow for something like char_traits which doesn't have this behavior. This would also require introducing Traits as a template argument, but that's probably fine. Defaulting to char_traits<TChar> would give the same behavior as before and likely wouldn't be a source breaking change for a majority of consumers. This would also allow multiple typedefs to co-exist next to each other that expose each behavior.
  3. Introduce functions with stricter guarantees. This way the type can internally hold a null pointer, but if the caller needs a null terminated, non-null string, they can ask for one. E.g. (and these names are just me spitballing.. they are not necessarily well named) str.c_str_or_empty() and str.data_or_empty().

Personally, I like 2 the most, although a combination of 2 and 3 would be nice (allows consumers of the current type to more easily get a guaranteed-to-not-be-null string. That may be a bit overkill though...

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An additional complexity with option 2: the obvious thing would be to derive from std::basic_string_view<TChar, Traits>, however that would mean any attempts to use the traits type to enable this alternative behavior would cause the type to no longer decay to std::string_view/std::wstring_view, so there would need to be some rebind-like logic to get it to work as desired.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We went with option 2. The doc-block's "Cross-instantiation conversion" @note covers the rebind/overload-resolution caveat from the follow-up.

Comment thread tests/StlTests.cpp Outdated
… polyfill

Three changes to basic_zstring_view, aligned with the WG21 P3655R0
proposal (std::zstring_view, Feb 2025).

substr is split into two overloads with different return types. The
one-argument substr(pos) returns a basic_zstring_view -- the tail of
a null-terminated string is itself null-terminated. The two-argument
substr(pos, count) returns a std::basic_string_view -- an arbitrary
slice generally isn't null-terminated. The two-argument form is
re-declared explicitly so the inherited overload isn't hidden once
the one-argument override is added.

The default constructor now points data() at a static empty buffer
(a single TChar()) so c_str() always returns a dereferenceable
null-terminated pointer. This is a strict safety upgrade, not a
behaviour change for any operation that was previously defined:

  Previously undefined behaviour is now defined: c_str() deref,
  strlen(c_str()), printf("%s", c_str()), std::string(zv.c_str()),
  passing the result to any C API. All produce the obvious
  empty-string result.

  Previously defined operations produce the same observable result:
  empty(), size(), comparisons, iteration, implicit decay to
  string_view.

The single observable change is for callers using data() == nullptr
to distinguish "no string" from "empty string". That sentinel was
already fragile (operator[] asserts on null data) and is better
encoded via std::optional<basic_zstring_view>. The three existing
TestZ*StringView assertions that depended on the sentinel are
updated to match the new semantics.

This aligns with P3655R0's wording that [data(), data() + size()]
must always be a valid range, and matches its reference
implementation's static-empty-buffer pattern.
P3655R0: https://wg21.link/p3655r0

contains() is added as a polyfill (string_view, CharT, const CharT*
overloads) gated on __cpp_lib_string_contains, mirroring the existing
__cpp_lib_format gate at the bottom of this file. Once the STL
provides the methods natively, the polyfill compiles out and the
base-class versions resolve transparently.

The class doc block is rewritten to lead with the null-termination
invariant, the safe default state, and the substr design, with one
@note acknowledging the public-inheritance slicing escape hatch that
P3655R0 calls out and proposes to close in C++29.

Tests:

  Existing TestZStringView / TestZWStringView / literal / formatting
  cases are unchanged except for the three default-ctor assertions
  that need to reflect the new non-null data().

  Two new TEST_CASEs (char and wchar_t) make the default-ctor
  migration argument runnable: previously-UB c_str/strlen/printf
  operations now produce defined empty-string results;
  previously-defined empty/size/iteration produce the same observable
  result; defensive `if (zv.data()) { ... }` checks still compile and
  behave correctly.

  A new templated suite covers behaviour that is sensitive to the
  null-termination invariant, the new code added in this PR, and
  the post-C++17 base-class additions (starts_with, ends_with,
  contains) that the existing hand-rolled tests pre-date:

    * Mirror cases that reproduce the existing hand-rolled
      TestZStringView / TestZWStringView / TestZ*StringView literal /
      TestZStringView formatting coverage in templated form, so the
      hand-rolled cases can be retired in a follow-up commit once
      maintainers confirm the templated coverage is equivalent. A
      cross-reference comment at the top of the mirror block maps
      each existing scenario to a SECTION.
    * Element access (data()/c_str() identity, null-termination
      after access)
    * starts_with / ends_with (gated on __cpp_lib_starts_ends_with)
    * contains (exercises the polyfill under pre-C++23 STL and the
      inherited base implementation under C++23 STL; same observable
      behaviour)
    * substr(pos) covariant return and null-termination preservation
    * substr(pos, count) returns std::basic_string_view
    * remove_prefix preserves null-termination
    * Conversion to string_view (inheritance contract) and to
      std::basic_string (ctor)
    * constexpr usage
    * Type aliases (zstring_view, zwstring_view, _zv literal)

  The templated suite deliberately omits coverage that would only
  relitigate C++17 base-class behaviour the STL vendor already tests
  (find family, comparison operators, iterators, copy, capacity,
  at()/front()/back()).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@mmthomas mmthomas force-pushed the users/mmthomas/zstring-view-cxx20-23-improvements branch from ab8ab3c to 36a9ce6 Compare May 28, 2026 19:01
…_EMPTY_BUFFER

The default-construct change from this PR (data() returning a non-null
empty buffer rather than nullptr) breaks source compatibility with
callers that use data() == nullptr as a "no string" sentinel. Make the
safer default opt-in via WIL_ZSV_DEFAULT_TO_EMPTY_BUFFER and restore
the legacy default (= default; data() == nullptr) for unchanged
callers.

WI_ODR_PRAGMA emits a /detect_mismatch directive so MSVC produces
LNK2038 on the mismatch tag
'ODR_violation_WIL_ZSV_DEFAULT_TO_EMPTY_BUFFER_mismatch' if two
translation units linked into the same image disagree on the macro's
value. The _empty_storage static member is gated on the macro so it
isn't present in the legacy build.

Also fix a latent (nullptr, 0) round-trip in the new one-arg
substr(pos) override: tail.data() == nullptr is now short-circuited
to a default-constructed view rather than passed to the verifying
(ptr, length) ctor, which would deref the null pointer. The
short-circuit deliberately uses data() rather than empty() so
substr(size()) on non-default source views keeps pointing data() at
the trailing nul, preserving the existing null-termination contract.

Tests: a new tests/zsvsafe build flavor mirrors normal with the macro
defined so both code paths are exercised by CI. Existing default-ctor
assertions are now mode-gated; the two safe-default TEST_CASEs run
only in opt-in mode. Two new regression tests cover the substr(0)
round-trip on a default-constructed view in both modes, with
mode-conditional assertions that prove the short-circuit fires in
legacy and the ctor fall-through happens in opt-in.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Comment thread include/wil/stl.h Outdated
Address dunhor review feedback that the "Core invariant" claim
(every constructed view satisfies data()[size()] == TChar()) is
not true when WIL_ZSV_DEFAULT_TO_EMPTY_BUFFER is undefined and the
view is default-constructed (data() == nullptr makes the dereference
UB to even evaluate).

Reorganize the class doc-block so the opt-in macro leads the
behavioural contract:

  * The macro choice is introduced as the active control on
    default-construction behaviour, not as a footnote.
  * Two parallel bulleted sections ("With WIL_ZSV_DEFAULT_TO_EMPTY_BUFFER
    defined" and "Without...") spell out each mode's invariant,
    default-ctor state, and c_str() safety contract so a reader can
    skip straight to the one relevant to their build.
  * The redundant "Default construction matches..." paragraph is
    removed; its content is now in the per-mode bullets.
  * Construction sources and substr design fold into a shared
    "Other behaviour (both modes)" section.
  * The class-purpose / C-API rationale stays as the lead paragraph,
    tightened to "enforces it at construction for views built from
    a real buffer" so the default-ctor exception is acknowledged.

No code or test changes; doc-only.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Comment thread include/wil/stl.h Outdated
// we disable this constructor if the value is an array (including string literal).
template <typename TPtr, std::enable_if_t<std::is_convertible<TPtr, const TChar*>::value && !std::is_array<TPtr>::value>* = nullptr>
constexpr basic_zstring_view(TPtr&& pStr) noexcept : std::basic_string_view<TChar>(std::forward<TPtr>(pStr))
constexpr basic_zstring_view(_In_z_ TPtr&& pStr) noexcept : std::basic_string_view<TChar>(std::forward<TPtr>(pStr))

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the argument is constrained on is_convertible, I don't think _In_z_ is correct here. See: https://godbolt.org/z/bdzq9WT3s

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed on skipping the annotation here. Thanks for the godbolt example.

Comment thread include/wil/stl.h Outdated
}

WI_NODISCARD constexpr const TChar* c_str() const noexcept
WI_NODISCARD _Ret_z_ constexpr const TChar* c_str() const noexcept

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should (conditionally) be _Ret_maybenull_z_. Or maybe use _When_ to scope the conditions when it may be null, though I'm not sure if that's easily done here

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, we looked at _When_ here too and didn't find a path forward that plays well with /analyze at the template level. Splitting c_str() into per-Traits overloads would also change its mangled name, which is an ABI hazard for any caller taking &basic_zstring_view::c_str. The annotation stays as a single _Ret_maybenull_z_, loose-but-correct for the safe variant.

Comment thread include/wil/stl.h Outdated
const auto tail = std::basic_string_view<TChar>(*this).substr(pos);
// Short-circuit the (nullptr, 0) tail case (substr(0) on a default-constructed view
// when WIL_ZSV_DEFAULT_TO_EMPTY_BUFFER is not defined) so we don't round-trip a null
// pointer through the verifying (ptr, length) constructor.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative would be to introduce a (private) constructor that doesn't do this validation (e.g. via tag dispatch). That would reduce the complexity of this, and potentially other, functions

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, the current revision takes that route. See the _trusted_tag ctor at L545-551 and substr() at L523-524.

Replace the WIL_ZSV_DEFAULT_TO_EMPTY_BUFFER opt-in with a Traits-based
split. wil::zstring_view continues to ship the long-established behavior
(default ctor wraps the empty string literal). wil::zstring_view_safe
(and the wide alias) opt into the stricter behavior via
Traits::safer_default_ctor, which constructs the view from (nullptr, 0)
so a default-constructed instance carries no buffer.

The Traits split lets both variants live in the same translation unit
and the same test binary, which removes the separate tests/zsvsafe/
target added in earlier commits.

SAL annotations on the raw-pointer surface:
- (ptr, length) ctors: _In_reads_z_(stringLength + 1)
- c_str(): _Ret_maybenull_z_ (truthful for both variants)
- contains() polyfill: _In_z_ const TChar*

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@mmthomas mmthomas changed the title basic_zstring_view: P3655R0-aligned substr, safer default, contains() polyfill zstring_view: opt-in safe variant, P3655R0-aligned substr, contains() polyfill Jun 4, 2026
@mmthomas mmthomas changed the title zstring_view: opt-in safe variant, P3655R0-aligned substr, contains() polyfill zstring_view: opt-in zstring_view_safe variant, P3655R0-aligned substr, contains() polyfill Jun 4, 2026
Monroe Thomas and others added 2 commits June 4, 2026 13:33
Format Check on PR microsoft#635 reported four hunks needing line-joins in
include/wil/stl.h and tests/StlTests.cpp. Pure formatting, no
behaviour change.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ndent return type

The implicit operator std::basic_string_view<TChar>() on the safe variant
is SFINAE'd onto the safe-Traits instantiation only. Clang's -Wclass-conversion
fires syntactically on the default-Traits instantiation (where the inherited
base IS std::basic_string_view<TChar>) before SFINAE removes the operator
from the candidate set, so the warning is reported and -Werror flips it to a
compile error.

Move the enable_if onto the return type so the operator's conversion target
is template-dependent on T. Clang then defers its base-class shape check
until template-argument deduction, at which point SFINAE has already
eliminated the candidate for default-Traits and no warning fires. MSVC
remained silent under both forms; this only affects clang.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Comment thread include/wil/stl.h
*/
template <class TChar>
class basic_zstring_view : public std::basic_string_view<TChar>
struct zstring_view_traits_safe : std::char_traits<TChar>

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"safe" is a confusing name here. Would prefer something like "non-null"

Comment thread include/wil/stl.h
buffer at zero runtime cost.
*/
template <class TChar, class Traits = std::char_traits<TChar>>
class basic_zstring_view : public std::basic_string_view<TChar, Traits>

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the negative thing here is what I called out in my second comment - this will no longer decay to std::(w)string_view when using the non-null traits type. That's not going to break any existing code, so I don't object to it, but worth additional consideration

@dunhor dunhor Jun 5, 2026

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My proposed solution. Update the non-null traits type to something like:

template <typename TChar, typename Traits = std::char_traits<TChar>>
struct nonnull_zstring_view_traits
{
    using char_traits = Traits;
    static constexpr bool empty_strings_are_non_null = true;
};

And then create a new traits type (a separate zsv_empty_is_nonnull is no longer necessary):

namespace details
{
    template <typename TChar, typename Traits>
    struct zstring_view_traits
    {
        template <typename T = Traits>
        static std::bool_constant<T::empty_strings_are_non_null> deduce_empty_strings_are_non_null(int);
        template <typename = Traits>
        static std::false_type deduce_empty_strings_are_non_null(...);

        template <typename T = Traits>
        static typename T::char_traits deduce_char_traits(int);
        template <typename T = Traits>
        static T deduce_char_traits(...);

        static constexpr bool empty_strings_are_non_null = decltype(deduce_empty_strings_are_non_null(0))::value;
        using char_traits = decltype(deduce_char_traits(0));
    };
}

And then update to derive from basic_string_view using this deduced traits type:

template <class TChar, class Traits = std::char_traits<TChar>>
class basic_zstring_view : public std::basic_string_view<TChar, typename details::zstring_view_traits<TChar, Traits>::char_traits>
{
    // Helper typedef to avoid needing to continuously deduce the traits type... 
    using BaseType = std::basic_string_view<TChar, typename details::zstring_view_traits<TChar, Traits>::char_traits>;

    // ...

Here's an example of the traits type working: https://godbolt.org/z/f5cshPG16

Edit: small changes because I copy/pasted too much... there's no need for the non-null traits type to derive from Traits any more, and fixed a minor issue with zstring_view_traits hard-coding char_traits

Comment thread include/wil/stl.h

// Self-guard helper for the explicit `std::basic_string_view<TChar>` ctor and for the safe
// variant's `(ptr, len)` ctor.
static constexpr const TChar* _require_non_null(const TChar* p) noexcept

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
static constexpr const TChar* _require_non_null(const TChar* p) noexcept
static constexpr const TChar* require_non_null(const TChar* p) noexcept

Comment thread include/wil/stl.h
constexpr basic_zstring_view(_In_reads_z_(stringLength + 1) const TChar* pStringData, size_type stringLength) noexcept :
std::basic_string_view<TChar, Traits>(_require_non_null(pStringData), stringLength)
{
if (pStringData != nullptr && pStringData[stringLength] != 0)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (pStringData != nullptr && pStringData[stringLength] != 0)
if (pStringData[stringLength] != 0)

_require_non_null will already fail fast if it's null

Comment thread include/wil/stl.h
{
WI_STL_FAIL_FAST_IF(true);
}
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can simplify and merge these two constructors simply by baking the "should be non-null" check into the _require_non_null function (though I'd suggest renaming it then). Something like:

    static constexpr const TChar* check_pointer(const TChar* p) noexcept
    {
        if constexpr (details::zsv_empty_is_nonnull_v<Traits>)
        {
            if (p == nullptr)
            {
                WI_STL_FAIL_FAST_IF(true);
            }
        }
        return p;
    }

And then

    constexpr basic_zstring_view(_In_reads_z_(stringLength + 1) const TChar* pStringData, size_type stringLength) noexcept :
        std::basic_string_view<TChar, Traits>(check_pointer(pStringData), stringLength)
    {
        if (pStringData[stringLength] != 0)
        {
            WI_STL_FAIL_FAST_IF(true);
        }
    }

@dunhor dunhor Jun 5, 2026

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although... Since it reads pStringData unconditionally, the pointer can never be null under normal execution, so the conditional null check isn't even necessary...

Comment thread include/wil/stl.h
* source view has `data() == nullptr` (the only way `std::basic_string_view` can violate the
* non-null contract); otherwise delegates to the validating `(ptr, len)` ctor.
*/
constexpr explicit basic_zstring_view(std::basic_string_view<TChar> sv) noexcept :

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would probably be more desirable if the Traits type matched as well.

Comment thread include/wil/stl.h
Comment on lines +428 to +438
template <typename T = Traits, std::enable_if_t<!details::zsv_empty_is_nonnull_v<T>, int> = 0>
constexpr basic_zstring_view(basic_zstring_view<TChar, zstring_view_traits_safe<TChar>> const& safe) noexcept :
std::basic_string_view<TChar, Traits>(safe.data(), safe.size())
{
}

template <typename T = Traits, std::enable_if_t<details::zsv_empty_is_nonnull_v<T>, int> = 0>
constexpr explicit basic_zstring_view(basic_zstring_view<TChar, std::char_traits<TChar>> const& other) noexcept :
basic_zstring_view(other.data(), other.size())
{
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although it's highly unlikely anyone ever uses something other than std::char_traits (and now the new one), I'd rather we not hard-code the types here. We could just accept a templated TraitsOther type and SFINAE on some is_compatible_v<Traits, TraitsOther> (exact implementation will depend on some other factors... ideally the check would actually be something more like is_same_v<MyBaseType, OtherBaseType> in a world where both derive from the same std::basic_string_view<CharT, std::char_traits<CharT>> or other custom traits type).

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you go with my proposal, you'd basically gate on std::is_same_v<BaseType, typename basic_zstring_view<...>::BaseType> with possibly a friend thrown in there if necessary

Comment thread include/wil/stl.h
Comment on lines +463 to +467
template <typename T = Traits>
constexpr operator std::enable_if_t<std::is_same_v<T, Traits> && std::is_same_v<Traits, zstring_view_traits_safe<TChar>>, std::basic_string_view<TChar>>() const noexcept
{
return {this->data(), this->size()};
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a hack to work around the fact that we don't "rebind" the traits type.

This would also mean that a static_cast to a reference type wouldn't work e.g. static_cast<std::string_view&>(str). Nor would implicit casts to functions accepting non-const references work

Comment thread include/wil/stl.h
Comment on lines +588 to +589
using zstring_view_safe = basic_zstring_view<char, zstring_view_traits_safe<char>>;
using zwstring_view_safe = basic_zstring_view<wchar_t, zstring_view_traits_safe<wchar_t>>;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same thing, "safe" is misleading here... "non-null" is a much clearer name

@dunhor

dunhor commented Jun 5, 2026

Copy link
Copy Markdown
Member

I'd suggest factoring out the non-null changes from the other improvements to make reviewing easier and so that those changes can be checked in quicker.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

wil::zstring_view data and c_str are not SAL annotated to indicate _Ret_z_, triggering analyzers

4 participants