<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/git-filter-branch.sh, branch jch</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/git/git.git/
</subtitle>
<id>https://git.shady.money/git/atom?h=jch</id>
<link rel='self' href='https://git.shady.money/git/atom?h=jch'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/'/>
<updated>2025-04-16T14:30:29Z</updated>
<entry>
<title>filter-branch: stop depending on Perl</title>
<updated>2025-04-16T14:30:29Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2025-04-16T12:16:07Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=f6d855091e73fdab8a39185a8392b9d0df7ed46f'/>
<id>urn:sha1:f6d855091e73fdab8a39185a8392b9d0df7ed46f</id>
<content type='text'>
While git-filter-branch(1) is written as a shell script, the
`--state-branch` feature depends on Perl to save and extract the object
ID mappings. This can lead to subtle breakage though:

  - We execute `perl` directly without respecting the `PERL_PATH`
    configured by the distribution. As such, it may happen that we use
    the wrong version of Perl.

  - We install the script unchanged even if Perl isn't available at all
    on the system, so using `--state-branch` would lead to failure
    altogether in that case.

Fix this by dropping Perl and instead implementing the feature with
shell scripting exclusively.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>git-sh-setup: remove "sane_grep", it's not needed anymore</title>
<updated>2021-10-21T23:17:57Z</updated>
<author>
<name>Ævar Arnfjörð Bjarmason</name>
<email>avarab@gmail.com</email>
</author>
<published>2021-10-21T19:58:00Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=ebeb39faad6e3a67c31884c3dc6b76ce58b3f15b'/>
<id>urn:sha1:ebeb39faad6e3a67c31884c3dc6b76ce58b3f15b</id>
<content type='text'>
Remove the sane_grep() shell function in git-sh-setup. The two reasons
for why it existed don't apply anymore:

1. It was added due to GNU grep supporting GREP_OPTIONS. See
   e1622bfcbad (Protect scripted Porcelains from GREP_OPTIONS insanity,
   2009-11-23).

   Newer versions of GNU grep ignore that, but even on older versions
   its existence won't matter, none of these sane_grep() uses care
   about grep's output, they're merely using it to check if a string
   exists in a file or stream. We also don't care about the "LC_ALL=C"
   that "sane_grep" was using, these greps for fixed or ASCII strings
   will behave the same under any locale.

2. The SANE_TEXT_GREP added in 71b401032b9 (sane_grep: pass "-a" if
   grep accepts it, 2016-03-08) isn't needed either, none of these grep
   uses deal with binary data.

Signed-off-by: Ævar Arnfjörð Bjarmason &lt;avarab@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>filter-branch: drop $_x40 glob</title>
<updated>2021-03-10T22:16:58Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2021-03-10T17:07:37Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=42efa1231aee85932058cc6d1571ab4ceb3e7eff'/>
<id>urn:sha1:42efa1231aee85932058cc6d1571ab4ceb3e7eff</id>
<content type='text'>
When checking whether a commit was rewritten to a single object id, we
use a glob that insists on a 40-hex result. This works for sha1, but
fails t7003 when run with GIT_TEST_DEFAULT_HASH=sha256.

Since the previous commit simplified the case statement here, we only
have two arms: an empty string or a single object id. We can just loosen
our glob to match anything, and still distinguish those cases (we lose
the ability to notice bogus input, but that's not a problem; we are the
one who wrote the map in the first place, and anyway update-ref will
complain loudly if the input isn't a valid hash).

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>filter-branch: drop multiple-ancestor warning</title>
<updated>2021-03-10T22:14:52Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2021-03-10T17:07:27Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=98fe9e666fe4a595cf5396fd8d5c57b380c782b2'/>
<id>urn:sha1:98fe9e666fe4a595cf5396fd8d5c57b380c782b2</id>
<content type='text'>
When a ref maps to a commit that is neither rewritten nor kept by
filter-branch (e.g., because it was eliminated by rev-list's pathspec
selection), we rewrite it to its nearest ancestor.

Since the initial commit in 6f6826c52b (Add git-filter-branch,
2007-06-03), we have warned when there are multiple such ancestors in
the map file. However, the warning code is impossible to trigger these
days. Since a0e46390d3 (filter-branch: fix ref rewriting with
--subdirectory-filter, 2008-08-12), we find the ancestor using "rev-list
-1", so it can only ever have a single value.

This code is made doubly confusing by the fact that we append to the map
file when mapping ancestors. However, this can never yield multiple
values because:

  - we explicitly check whether the map already exists, and if so, do
    nothing (so our "append" will always be to a file that does not
    exist)

  - even if we were to try mapping twice, the process to do so is
    deterministic. I.e., we'd always end up with the same ancestor for a
    given sha1. So warning about it would be pointless; there is no
    ambiguity.

So swap out the warning code for a BUG (which we'll simplify further in
the next commit). And let's stop using the append operator to make the
ancestor-mapping code less confusing.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Recommend git-filter-repo instead of git-filter-branch</title>
<updated>2019-09-05T20:01:48Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2019-09-04T22:32:38Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=9df53c5de6e687df9cd7b36e633360178b65a0ef'/>
<id>urn:sha1:9df53c5de6e687df9cd7b36e633360178b65a0ef</id>
<content type='text'>
filter-branch suffers from a deluge of disguised dangers that disfigure
history rewrites (i.e. deviate from the deliberate changes).  Many of
these problems are unobtrusive and can easily go undiscovered until the
new repository is in use.  This can result in problems ranging from an
even messier history than what led folks to filter-branch in the first
place, to data loss or corruption.  These issues cannot be backward
compatibly fixed, so add a warning to both filter-branch and its manpage
recommending that another tool (such as filter-repo) be used instead.

Also, update other manpages that referenced filter-branch.  Several of
these needed updates even if we could continue recommending
filter-branch, either due to implying that something was unique to
filter-branch when it applied more generally to all history rewriting
tools (e.g. BFG, reposurgeon, fast-import, filter-repo), or because
something about filter-branch was used as an example despite other more
commonly known examples now existing.  Reword these sections to fix
these issues and to avoid recommending filter-branch.

Finally, remove the section explaining BFG Repo Cleaner as an
alternative to filter-branch.  I feel somewhat bad about this,
especially since I feel like I learned so much from BFG that I put to
good use in filter-repo (which is much more than I can say for
filter-branch), but keeping that section presented a few problems:
  * In order to recommend that people quit using filter-branch, we need
    to provide them a recommendation for something else to use that
    can handle all the same types of rewrites.  To my knowledge,
    filter-repo is the only such tool.  So it needs to be mentioned.
  * I don't want to give conflicting recommendations to users
  * If we recommend two tools, we shouldn't expect users to learn both
    and pick which one to use; we should explain which problems one
    can solve that the other can't or when one is much faster than
    the other.
  * BFG and filter-repo have similar performance
  * All filtering types that BFG can do, filter-repo can also do.  In
    fact, filter-repo comes with a reimplementation of BFG named
    bfg-ish which provides the same user-interface as BFG but with
    several bugfixes and new features that are hard to implement in
    BFG due to its technical underpinnings.
While I could still mention both tools, it seems like I would need to
provide some kind of comparison and I would ultimately just say that
filter-repo can do everything BFG can, so ultimately it seems that it
is just better to remove that section altogether.

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'mb/filter-branch-optim'</title>
<updated>2018-07-18T19:20:32Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2018-07-18T19:20:32Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=676c7e50b10f1fbb2ce06e46260f15f3a20679f7'/>
<id>urn:sha1:676c7e50b10f1fbb2ce06e46260f15f3a20679f7</id>
<content type='text'>
"git filter-branch" when used with the "--state-branch" option
still attempted to rewrite the commits whose filtered result is
known from the previous attempt (which is recorded on the state
branch); the command has been corrected not to waste cycles doing
so.

* mb/filter-branch-optim:
  filter-branch: skip commits present on --state-branch
</content>
</entry>
<entry>
<title>filter-branch: skip commits present on --state-branch</title>
<updated>2018-06-26T22:44:53Z</updated>
<author>
<name>Michael Barabanov</name>
<email>michael.barabanov@gmail.com</email>
</author>
<published>2018-06-26T04:07:33Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=709cfe848ad2312f80e6f4f7a27aa5d23992a0e3'/>
<id>urn:sha1:709cfe848ad2312f80e6f4f7a27aa5d23992a0e3</id>
<content type='text'>
The commits in state:filter.map have already been processed, so don't
filter them again. This makes incremental git filter-branch much faster.

Also add tests for --state-branch option.

Signed-off-by: Michael Barabanov &lt;michael.barabanov@gmail.com&gt;
Acked-by: Ian Campbell &lt;ijc@hellion.org.uk&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Update shell scripts to compute empty tree object ID</title>
<updated>2018-05-02T04:59:53Z</updated>
<author>
<name>brian m. carlson</name>
<email>sandals@crustytoothpaste.net</email>
</author>
<published>2018-05-02T00:26:08Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=03a7f388dafaee0aa084144efe7a8f9c151e5221'/>
<id>urn:sha1:03a7f388dafaee0aa084144efe7a8f9c151e5221</id>
<content type='text'>
Several of our shell scripts hard-code the object ID of the empty tree.
To avoid any problems when changing hashes, compute this value on
startup of the script.  For performance, store the value in a variable
and reuse it throughout the life of the script.

Signed-off-by: brian m. carlson &lt;sandals@crustytoothpaste.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'yk/filter-branch-non-committish-refs'</title>
<updated>2018-04-10T07:28:23Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2018-04-10T07:28:23Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=9aa3a4c406e1db08269cc8f6e8757555bd771120'/>
<id>urn:sha1:9aa3a4c406e1db08269cc8f6e8757555bd771120</id>
<content type='text'>
When refs that do not point at committish are given, "git
filter-branch" gave misleading error messages.  This has been
corrected.

* yk/filter-branch-non-committish-refs:
  filter-branch: fix errors caused by refs that point at non-committish
</content>
</entry>
<entry>
<title>Merge branch 'ml/filter-branch-no-op-error'</title>
<updated>2018-04-09T23:25:44Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2018-04-09T23:25:44Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/git/commit/?id=cb3e97dae806d66ea1859c48b9936d5bfbac9c09'/>
<id>urn:sha1:cb3e97dae806d66ea1859c48b9936d5bfbac9c09</id>
<content type='text'>
"git filter-branch" learned to use a different exit code to allow
the callers to tell the case where there were no new commits to
rewrite from other error cases.

* ml/filter-branch-no-op-error:
  filter-branch: return 2 when nothing to rewrite
</content>
</entry>
</feed>
