diff --git a/Doc/deprecations/index.rst b/Doc/deprecations/index.rst index bb8bfb5c227c2d..eedcd2e9c9dd42 100644 --- a/Doc/deprecations/index.rst +++ b/Doc/deprecations/index.rst @@ -15,6 +15,8 @@ Deprecations .. include:: pending-removal-in-future.rst +.. include:: soft-deprecations.rst + C API deprecations ------------------ diff --git a/Doc/deprecations/soft-deprecations.rst b/Doc/deprecations/soft-deprecations.rst new file mode 100644 index 00000000000000..a270052788ef2a --- /dev/null +++ b/Doc/deprecations/soft-deprecations.rst @@ -0,0 +1,21 @@ +Soft deprecations +----------------- + +There are no plans to remove :term:`soft deprecated` APIs. + +* :func:`re.match` and :meth:`re.Pattern.match` are now + :term:`soft deprecated` in favor of the new :func:`re.prefixmatch` and + :meth:`re.Pattern.prefixmatch` APIs, which have been added as alternate, + more explicit names. These are intended to be used to alleviate confusion + around what *match* means by following the Zen of Python's *"Explicit is + better than implicit"* mantra. Most other language regular expression + libraries use an API named *match* to mean what Python has always called + *search*. + + We **do not** plan to remove the older :func:`!match` name, as it has been + used in code for over 30 years. Code supporting older versions of Python + should continue to use :func:`!match`, while new code should prefer + :func:`!prefixmatch`. See :ref:`prefixmatch-vs-match`. + + (Contributed by Gregory P. Smith in :gh:`86519` and + Hugo van Kemenade in :gh:`148100`.) diff --git a/Doc/library/re.rst b/Doc/library/re.rst index 43fb7295876fe1..7e0a00cba2f63f 100644 --- a/Doc/library/re.rst +++ b/Doc/library/re.rst @@ -52,7 +52,7 @@ fine-tuning parameters. .. _re-syntax: -Regular Expression Syntax +Regular expression syntax ------------------------- A regular expression (or RE) specifies a set of strings that matches it; the @@ -205,7 +205,7 @@ The special characters are: *without* establishing any backtracking points. This is the possessive version of the quantifier above. For example, on the 6-character string ``'aaaaaa'``, ``a{3,5}+aa`` - attempt to match 5 ``'a'`` characters, then, requiring 2 more ``'a'``\ s, + attempts to match 5 ``'a'`` characters, then, requiring 2 more ``'a'``\ s, will need more characters than available and thus fail, while ``a{3,5}aa`` will match with ``a{3,5}`` capturing 5, then 4 ``'a'``\ s by backtracking and then the final 2 ``'a'``\ s are matched by the final @@ -717,7 +717,7 @@ three digits in length. .. _contents-of-module-re: -Module Contents +Module contents --------------- The module defines several functions, constants, and an exception. Some of the @@ -833,8 +833,8 @@ Flags will be conditionally ORed with other flags. Example of use as a default value:: - def myfunc(text, flag=re.NOFLAG): - return re.search(text, flag) + def myfunc(pattern, text, flag=re.NOFLAG): + return re.search(pattern, text, flag) .. versionadded:: 3.11 @@ -954,9 +954,10 @@ Functions :func:`~re.match`. Use that name when you need to retain compatibility with older Python versions. - .. versionchanged:: 3.15 - The alternate :func:`~re.prefixmatch` name of this API was added as a - more explicitly descriptive name than :func:`~re.match`. Use it to better + .. deprecated:: 3.15 + :func:`~re.match` has been :term:`soft deprecated` in favor of + the alternate :func:`~re.prefixmatch` name of this API which is + more explicitly descriptive. Use it to better express intent. The norm in other languages and regular expression implementations is to use the term *match* to refer to the behavior of what Python has always called :func:`~re.search`. @@ -1246,7 +1247,7 @@ Exceptions .. _re-objects: -Regular Expression Objects +Regular expression objects -------------------------- .. class:: Pattern @@ -1309,9 +1310,10 @@ Regular Expression Objects :meth:`~Pattern.match`. Use that name when you need to retain compatibility with older Python versions. - .. versionchanged:: 3.15 - The alternate :meth:`~Pattern.prefixmatch` name of this API was added as - a more explicitly descriptive name than :meth:`~Pattern.match`. Use it to + .. deprecated:: 3.15 + :meth:`~Pattern.match` has been :term:`soft deprecated` in favor of + the alternate :meth:`~Pattern.prefixmatch` name of this API which is + more explicitly descriptive. Use it to better express intent. The norm in other languages and regular expression implementations is to use the term *match* to refer to the behavior of what Python has always called :meth:`~Pattern.search`. @@ -1396,7 +1398,7 @@ Regular Expression Objects .. _match-objects: -Match Objects +Match objects ------------- Match objects always have a boolean value of ``True``. @@ -1615,11 +1617,11 @@ when there is no match, you can test whether there was a match with a simple .. _re-examples: -Regular Expression Examples +Regular expression examples --------------------------- -Checking for a Pair +Checking for a pair ^^^^^^^^^^^^^^^^^^^ In this example, we'll use the following helper function to display match @@ -1705,15 +1707,21 @@ expressions. | ``%x``, ``%X`` | ``[-+]?(0[xX])?[\dA-Fa-f]+`` | +--------------------------------+---------------------------------------------+ -To extract the filename and numbers from a string like :: +To extract the filename and numbers from a string like: + +.. code-block:: text /usr/sbin/sendmail - 0 errors, 4 warnings -you would use a :c:func:`!scanf` format like :: +you would use a :c:func:`!scanf` format like: + +.. code-block:: text %s - %d errors, %d warnings -The equivalent regular expression would be :: +The equivalent regular expression would be: + +.. code-block:: text (\S+) - (\d+) errors, (\d+) warnings @@ -1772,18 +1780,24 @@ not familiar with the Python API's divergence from what otherwise become the industry norm. Quoting from the Zen Of Python (``python3 -m this``): *"Explicit is better than -implicit"*. Anyone reading the name :func:`~re.prefixmatch` is likely to -understand the intended semantics. When reading :func:`~re.match` there remains +implicit"*. Anyone reading the name :func:`!prefixmatch` is likely to +understand the intended semantics. When reading :func:`!match` there remains a seed of doubt about the intended behavior to anyone not already familiar with this old Python gotcha. -We **do not** plan to deprecate and remove the older *match* name, +We **do not** plan to remove the older :func:`!match` name, as it has been used in code for over 30 years. -Code supporting older versions of Python should continue to use *match*. +It has been :term:`soft deprecated`: +code supporting older versions of Python should continue to use :func:`!match`, +while new code should prefer :func:`!prefixmatch`. .. versionadded:: 3.15 + :func:`!prefixmatch` + +.. deprecated:: 3.15 + :func:`!match` is :term:`soft deprecated` -Making a Phonebook +Making a phonebook ^^^^^^^^^^^^^^^^^^ :func:`split` splits a string into a list delimited by the passed pattern. The @@ -1844,7 +1858,7 @@ house number from the street name: ['Heather', 'Albrecht', '548.326.4584', '919', 'Park Place']] -Text Munging +Text munging ^^^^^^^^^^^^ :func:`sub` replaces every occurrence of a pattern with a string or the @@ -1864,7 +1878,7 @@ in each word of a sentence except for the first and last characters:: 'Pofsroser Aodlambelk, plasee reoprt yuor asnebces potlmrpy.' -Finding all Adverbs +Finding all adverbs ^^^^^^^^^^^^^^^^^^^ :func:`findall` matches *all* occurrences of a pattern, not just the first @@ -1877,7 +1891,7 @@ the following manner:: ['carefully', 'quickly'] -Finding all Adverbs and their Positions +Finding all adverbs and their positions ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ If one wants more information about all matches of a pattern than the matched @@ -1893,7 +1907,7 @@ to find all of the adverbs *and their positions* in some text, they would use 40-47: quickly -Raw String Notation +Raw string notation ^^^^^^^^^^^^^^^^^^^ Raw string notation (``r"text"``) keeps regular expressions sane. Without it, @@ -1917,7 +1931,7 @@ functionally identical:: -Writing a Tokenizer +Writing a tokenizer ^^^^^^^^^^^^^^^^^^^ A `tokenizer or scanner `_ diff --git a/Doc/using/windows.rst b/Doc/using/windows.rst index 1a913c624e99b0..eea1e2f64a468d 100644 --- a/Doc/using/windows.rst +++ b/Doc/using/windows.rst @@ -778,6 +778,14 @@ directory containing the configuration file that specified them. - True to suppress visible warnings when a shebang launches an application other than a Python runtime. + * - ``source_settings`` + - A mapping from source URL to settings specific to that index. + When multiple configuration files include this section, URL settings are + added or overwritten, but individual settings are not merged. + These settings are currently only for :ref:`index signatures + `. + + .. _install-freethreaded-windows: Installing free-threaded binaries @@ -799,6 +807,101 @@ installed, then ``python`` will launch this one. Otherwise, you will need to use ``py -V:3.14t ...`` or, if you have added the global aliases directory to your :envvar:`PATH` environment variable, the ``python3.14t.exe`` commands. + +.. _pymanager-index-signatures: + +Index signatures +---------------- + +.. versionadded:: 26.2 + +Index files may be signed to detect tampering. A signature is a catalog file +at the same URL as the index with ``.cat`` added to the filename. The catalog +file should contain the hash of its matching index file, and should be signed +with a valid Authenticode signature. This allows standard tooling (on Windows) +to generate a signature, and any certificate may be used as long as the client +operating system already trusts its certification authority (root CA). + +Index signatures are only downloaded and checked when the local configuration's +``source_settings`` section includes the index URL and ``requires_signature`` is +true, or the index JSON contains ``requires_signature`` set to true. When the +setting exists in local configuration, even when false, settings in the index +are ignored. + +As well as requiring a valid signature, the ``required_root_subject`` and +``required_publisher_subject`` settings can further restrict acceptable +signatures based on the certificate Subject fields. Any attribute specified in +the configuration must match the attribute in the certificate (additional +attributes in the certificate are ignored). Typical attributes are ``CN=`` for +the common name, ``O=`` for the organizational unit, and ``C=`` for the +publisher's country. + +Finally, the ``required_publisher_eku`` setting allows requiring that a specific +Enhanced Key Usage (EKU) has been assigned to the publisher certificate. For +example, the EKU ``1.3.6.1.5.5.7.3.3`` indicates that the certificate was +intended for code signing (as opposed to server or client authentication). +In combination with a specific root CA, this provides another mechanism to +verify a legitimate signature. + +This is an example ``source_settings`` section from a configuration file. In +this case, the publisher of the feed is uniquely identified by the combination +of the Microsoft Identity Verification root and the EKU assigned by that root. +The signature for this case would be found at +``https://www.python.org/ftp/python/index-windows.json.cat``. + +.. code:: json5 + + { + "source_settings": { + "https://www.python.org/ftp/python/index-windows.json": { + "requires_signature": true, + "required_root_subject": "CN=Microsoft Identity Verification Root Certificate Authority 2020", + "required_publisher_subject": "CN=Python Software Foundation", + "required_publisher_eku": "1.3.6.1.4.1.311.97.608394634.79987812.305991749.578777327" + } + } + } + +The same settings could be specified in the ``index.json`` file instead. In this +case, the root and EKU are omitted, meaning that the signature must be valid and +have a specific common name in the publisher's certificate, but no other checks +are used. + +.. code:: json5 + + { + "requires_signature": true, + "required_publisher_subject": "CN=Python Software Foundation", + "versions": [ + // ... + ] + } + +When settings from inside a feed are used, the user is notified and the settings +are shown in the log file or verbose output. It is recommended to copy these +settings into a local configuration file for feeds that will be used frequently, +so that unauthorised modifications to the feed cannot disable verification. + +It is not possible to override the location of the signature file in the feed or +through a configuration file. Administrators can provide their own +``source_settings`` in a mandatory configuration file (see +:ref:`pymanager-admin-config`). + +If signature validation fails, you will be notified and prompted to continue. +When interactive confirmation is not allowed (for example, because ``--yes`` was +specified), it will always abort. To use a feed with invalid configuration in +this scenario, you must provide a configuration file that disables signature +checking for that feed. + +.. code:: json5 + + "source_settings": { + "https://www.example.com/feed-with-invalid-signature.json": { + "requires_signature": false + } + } + + .. _pymanager-troubleshoot: Troubleshooting diff --git a/Doc/whatsnew/3.15.rst b/Doc/whatsnew/3.15.rst index 485480802feb97..ed07afd2277cc0 100644 --- a/Doc/whatsnew/3.15.rst +++ b/Doc/whatsnew/3.15.rst @@ -945,9 +945,10 @@ pickle re -- -* :func:`re.prefixmatch` and a corresponding :meth:`~re.Pattern.prefixmatch` - have been added as alternate more explicit names for the existing - :func:`re.match` and :meth:`~re.Pattern.match` APIs. These are intended +* :func:`re.prefixmatch` and a corresponding :meth:`re.Pattern.prefixmatch` + have been added as alternate, more explicit names for the existing + and now :term:`soft deprecated` + :func:`re.match` and :meth:`re.Pattern.match` APIs. These are intended to be used to alleviate confusion around what *match* means by following the Zen of Python's *"Explicit is better than implicit"* mantra. Most other language regular expression libraries use an API named *match* to mean what @@ -1685,6 +1686,27 @@ New deprecations (Contributed by Bénédikt Tran in :gh:`134978`.) + +* :mod:`re`: + + * :func:`re.match` and :meth:`re.Pattern.match` are now + :term:`soft deprecated` in favor of the new :func:`re.prefixmatch` and + :meth:`re.Pattern.prefixmatch` APIs, which have been added as alternate, + more explicit names. These are intended to be used to alleviate confusion + around what *match* means by following the Zen of Python's *"Explicit is + better than implicit"* mantra. Most other language regular expression + libraries use an API named *match* to mean what Python has always called + *search*. + + We **do not** plan to remove the older :func:`!match` name, as it has been + used in code for over 30 years. Code supporting older versions of Python + should continue to use :func:`!match`, while new code should prefer + :func:`!prefixmatch`. See :ref:`prefixmatch-vs-match`. + + (Contributed by Gregory P. Smith in :gh:`86519` and + Hugo van Kemenade in :gh:`148100`.) + + * :mod:`struct`: * Calling the ``Struct.__new__()`` without required argument now is @@ -1757,6 +1779,8 @@ New deprecations .. include:: ../deprecations/pending-removal-in-future.rst +.. include:: ../deprecations/soft-deprecations.rst + C API changes ============= diff --git a/Misc/NEWS.d/next/Library/2026-04-04-20-22-02.gh-issue-148100.lSmGQi.rst b/Misc/NEWS.d/next/Library/2026-04-04-20-22-02.gh-issue-148100.lSmGQi.rst new file mode 100644 index 00000000000000..dd5bc6022e0e5c --- /dev/null +++ b/Misc/NEWS.d/next/Library/2026-04-04-20-22-02.gh-issue-148100.lSmGQi.rst @@ -0,0 +1,3 @@ +:term:`Soft deprecate ` :func:`re.match` and +:meth:`re.Pattern.match` in favour of :func:`re.prefixmatch` and +:meth:`re.Pattern.prefixmatch`. Patch by Hugo van Kemenade.