Skip to content

[pull] master from git:master#211

Merged
pull[bot] merged 17 commits into
turkdevops:masterfrom
git:master
May 25, 2026
Merged

[pull] master from git:master#211
pull[bot] merged 17 commits into
turkdevops:masterfrom
git:master

Conversation

@pull
Copy link
Copy Markdown

@pull pull Bot commented May 25, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

spkrka and others added 17 commits May 12, 2026 09:33
Replace the boolean ignore_missing_commits parameter in
paint_down_to_common() with an enum merge_base_flags, and thread
the flags through merge_bases_many(), get_merge_bases_many_0(),
and the public repo_get_merge_bases_many_dirty() API.

This makes callsites with boolean parameters easier to read and
prepares the function for additional flags in a subsequent commit.

No functional change: the single caller that used
ignore_missing_commits (repo_in_merge_bases_many) now sets
MERGE_BASE_IGNORE_MISSING_COMMITS in the flags word, and all
other callers pass 0.

Signed-off-by: Kristofer Karlsson <krka@spotify.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commits not in the commit-graph get GENERATION_NUMBER_INFINITY and
sort to the top of the priority queue.  After those, commits with
finite generation numbers are popped in non-increasing order.
When MERGE_BASE_FIND_ALL is not set the first doubly-painted commit
with a finite generation is therefore a best merge-base: no commit
still in the queue can be a descendant of it.  Skip the expensive
STALE drain in this case.

Add MERGE_BASE_FIND_ALL to the merge_base_flags enum.  Callers that
need every merge-base (repo_get_merge_bases_many, repo_get_merge_bases,
repo_in_merge_bases_many, remove_redundant_no_gen) pass the flag to
preserve existing behavior.  git merge-base (without --all) passes 0,
triggering the early exit.

On a 2.2M-commit merge-heavy monorepo with commit-graph:

  HEAD vs ~500:   5,229ms -> 24ms
  HEAD vs ~1000:  4,214ms -> 39ms
  HEAD vs ~5000:  3,799ms -> 46ms
  HEAD vs ~10000: 3,827ms -> 61ms

Signed-off-by: Kristofer Karlsson <krka@spotify.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'commit' field in 'struct todo_add_branch_context' is unused.
It's written to, but never read from.
add_decorations_to_list() gets the commit passed to it explicitly
as an argument.

Signed-off-by: Abhinav Gupta <mail@abhinavg.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
At the top of add_rfc2047() we do this:

  strbuf_grow(sb, len * 3 + strlen(encoding) + 100);

where "len" is the size of the header (like an author name) we are about
to encode into the buffer. This pre-sizing is purely an optimization; we
use strbuf_addf() and friends to actually write into the buffer, and
they will grow the buffer as necessary.

But there's a problem with the code above: the input can be arbitrarily
large, so we might overflow a size_t while doing that computation,
ending up with a too-small allocation request. Overflowing requires an
impractically large input on a 64-bit system, but is easy to demonstrate
on a 32-bit system with a commit whose author name is ~1.4GB.

Because this pre-sizing is just an optimization, there's no real harm.
We'll start with a smaller buffer and grow it as necessary. But it
_looks_ like a vulnerability, since some other code may pre-size a
strbuf and then write directly into its buffer. So it's worth avoiding
the overflow in the first place.

The obvious way to do that is via checked operations like st_add() and
friends. But taking a step back, is this pre-sizing actually helping
anything?

The computation goes all the way back to 4234a76 (Extend
--pretty=oneline to cover the first paragraph,, 2007-06-11), but back
then we really were sizing the array to write into directly! In
674d172 (Rework pretty_print_commit to use strbufs instead of custom
buffers., 2007-09-10) that switched to a strbuf, and at that point it
was a pure optimization.

Is the optimization helping? I don't think so. Even for a gigantic case
like the 1.4GB author name, I couldn't measure any slowdown when
removing it. And most input will be much smaller, and added to a running
strbuf containing the rest of the email-header output. We can just rely
on strbuf's usual amortized-linear growth.

So deleting the line seems like the best way to go. It eliminates the
integer overflow and makes the code a tiny bit simpler.

Reported-by: Luke Martin <lmartin@paramenoeng.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a dumb http server reports alternates with an absolute path, we try
to paste that onto the root of the URL we're trying to fetch from. So if
we go to "http://example.com/path/to/child.git" and it tells us about an
alternate at "/parent.git", we'll hit "http://example.com/parent.git".

But there's a bug in computing the base when the URL does not have any
path component at all, like "http://example.com". When looking for the
first slash after the host, strchr() returns NULL, and we compute a
nonsense value for the length of the host portion. And then when we use
that length to copy the base of the URL into a strbuf, we're likely to
fail.

The security implications are minimal here. We store the nonsense length
("serverlen") as an int, so on a 64-bit system it may effectively be
anything (it is zero minus a 64-bit heap pointer, then truncated to
32-bits and stuffed into a signed value). When we feed that length to
strbuf_add(), it is cast into a size_t and one of four things will
happen:

  1. If serverlen was negative, it will turn into a very large positive
     value and strbuf_add() will fail to allocate, ending the program.
     Ditto if serverlen was positive but just very large.

     This doesn't really get an attacker anything; the victim will just
     fail to clone their evil repo.

  2. If serverlen was small enough, we'll successfully extend the target
     strbuf, and then copy an arbitrary set of bytes from "base". And
     then one of these is true:

       a. That set of bytes is much larger than the length of the "base"
          string. This is an out-of-bounds read, but there's no
          out-of-bounds write, since the strbuf code both allocates and
          copies using the same size_t. This is likely to cause a
          segfault as we try to read unmapped pages of memory.

       b. Like (2a), but if the set of bytes is small enough we might
          not segfault. We might read random memory from the process and
          copy it into the "target" strbuf.

          What happens then? We know that "base" ends with a NUL
          terminator, which will be copied into "target" as well. So
          even though target.len might be 1000 bytes (or whatever), when
          interpreted as a NUL-terminated string, target.buf is still
          the exact same string as "base".

          And that's all we ever do with target: pass it around as a C
          string, and then eventually strbuf_detach() it to become a C
          string. So even though there was arbitrary memory copied into
          the strbuf, we never access it.

       c. The other interesting case is when serverlen is actually
          _shorter_ than the length of base. And there we truncate the
          string. Probably in a way that makes it totally invalid, but
          if you were very unlucky you could turn something like:

             http://victim.com.evil.domain:8000

          into:

            http://victim.com

	  Which looks like the start of a redirect attack, except that
	  the attacker could just have written "http://victim.com" in
	  the first place! Either way we feed it to
	  is_alternate_allowed(), which is where we check redirect and
	  protocol rules.

I think we can just treat this like a regular bug.

And it's quite a weird setup in the first place, as it implies that the
root of the web server is serving a repository (i.e., that you can get
something useful from "http://example.com/info/refs"). The bug has been
there since b366156 ([PATCH] Add support for alternates in HTTP,
2005-09-14) without anybody noticing.

I kind of doubt anybody really cares about making this work, but it's
easy enough to do so: the host-portion of the URL ends at either the
first slash or the end-of-string. So we can just replace strchr() with
strchrnul().

The test setup is a little gross, as we take over the httpd document
root by shoving our bare-repo components into it. But it demonstrates
the problem and shows that our solution actually allows the alternate to
function, if the server is configured to allow it.

Reported-by: slonkazoid <slonkazoid@slonk.ing>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Negative values for --inter-hunk-context produce structurally invalid
diff output with overlapping hunks:

    $ git log -1 -p -U3 --inter-hunk-context=-100 791aedd \
        -- git-compat-util.h | grep '^@@'
    @@ -110,6 +110,9 @@
    @@ -115,6 +118,9 @@
    @@ -116,6 +122,7 @@

Hunk 1 covers lines 110-115, hunk 2 starts at 115 (overlap), hunk 3
starts at 116 (overlaps both). The resulting patch cannot be applied.

The config variable diff.interHunkContext already rejects negative
values, but the command line option does not.

Change the type of diff_options.interhunkcontext and its static
default from int to unsigned int, and switch the option parser from
OPT_INTEGER_F to OPT_UNSIGNED. This rejects negative values at parse
time via git_parse_unsigned() and enforces the correct type at compile
time via BARF_UNLESS_UNSIGNED.

Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Passing a negative value to -U is silently accepted and produces
corrupt unified diff output with malformed hunk headers:

    $ git log -1 -p -U-500 -- GIT-VERSION-GEN | grep '^@@'
    @@ -503,999- +503,999- @@

Line 503 of a 106-line file, count "999-" is not a valid integer.

The config variable diff.context already rejects negative values, but
the command line callback diff_opt_unified() uses strtol() with no
range check.

Change the type of diff_options.context and its static default from
int to unsigned int, matching the change to interhunkcontext in the
previous commit. The type change requires reworking the callback and
config parsing to validate in a local variable before assigning to
the now-unsigned field.

Unlike --inter-hunk-context which could be converted to OPT_UNSIGNED,
-U needs OPT_CALLBACK_F for PARSE_OPT_OPTARG (bare -U with no value
enables patch output). Add a range check in the callback instead.

Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The xdemitconf_t fields ctxlen and interhunkctxlen are typed as long
(signed), but negative values are not meaningful for context line
counts. Unlike the diff_options fields changed in the previous two
commits, these cannot be converted to unsigned because the xdiff
arithmetic relies on signed subtraction:

    s1 = XDL_MAX(xch->i1 - xecfg->ctxlen, 0);

If ctxlen were unsigned long, the signed operand would be implicitly
converted to unsigned, and the subtraction would wrap to a large
positive value when i1 < ctxlen, defeating the XDL_MAX clamp. The
signed type is required for correct context-window calculations.

The previous two commits reject negative values at the parse layer
for --inter-hunk-context and -U/--unified, so negative values should
no longer reach xdiff in normal use. Add BUG() guards at the top of
xdl_get_hunk() as defense in depth to catch programming errors in
current or future callers that bypass option parsing.

xdl_get_hunk() is called by both xdl_emit_diff() and
xdl_call_hunk_func(), so a single guard covers all xdiff consumers.

Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The documentation says the flag prevents an option from being
"negated" without specifying what that means. Add a parenthetical
to clarify that it rejects the "--no-<option>" form.

Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
gitignore(5) says that the per-repository ignore file is
$GIT_DIR/info/exclude, but in a worktree that is not the case:

    git rev-parse --git-path info/exclude
    /path/to/main/worktree/.git/info/exclude
    git rev-parse --git-common-dir
    /path/to/main/worktree/.git

We actually use $GIT_COMMON_DIR/info/exclude. Adjust the documentation
and some code comments to say so.

Signed-off-by: D. Ben Knoble <ben.knoble+github@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git merge-base" optimization.

* kk/paint-down-to-common-optim:
  commit-reach: early exit paint_down_to_common for single merge-base
  commit-reach: introduce merge_base_flags enum
Document the fact that .git/info/exclude is shared across worktrees
linked to the same repository.

* dk/doc-exclude-is-shared-per-repo:
  ignore: note info/exclude lives in GIT_COMMON_DIR, not GIT_DIR
The command line parser for "git diff" learned a few options take
only non-negative integers.

* mm/diff-U-takes-no-negative-values:
  parse-options: clarify what "negated" means for PARSE_OPT_NONEG
  xdiff: guard against negative context lengths
  diff: reject negative values for -U/--unified
  diff: reject negative values for --inter-hunk-context
Code clean-up.

* ag/sequencer-remove-unused-struct-member:
  sequencer: remove todo_add_branch_context.commit
Remove ineffective strbuf presizing that would have computed an
allocation that would not have fit in the available memory anyway,
or too small due to integer wraparound to cause immediate automatic
growing.

* jk/pretty-no-strbuf-presizing:
  pretty: drop strbuf pre-sizing from add_rfc2047()
The HTTP walker misinterpreted the alternates file that gives an
absolute path when the server URL does not have the final slash
(i.e., "https://example.com" not "https://example.com/").

* jk/dumb-http-alternate-fix:
  http: handle absolute-path alternates from server root
Signed-off-by: Junio C Hamano <gitster@pobox.com>
@pull pull Bot locked and limited conversation to collaborators May 25, 2026
@pull pull Bot added the ⤵️ pull label May 25, 2026
@pull pull Bot merged commit 56a4f3c into turkdevops:master May 25, 2026
2 of 3 checks passed
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants