<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/net/core/dev.c, branch v6.7</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v6.7</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v6.7'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2023-12-21T09:15:10Z</updated>
<entry>
<title>net: check dev-&gt;gso_max_size in gso_features_check()</title>
<updated>2023-12-21T09:15:10Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2023-12-19T12:53:31Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=24ab059d2ebd62fdccc43794796f6ffbabe49ebc'/>
<id>urn:sha1:24ab059d2ebd62fdccc43794796f6ffbabe49ebc</id>
<content type='text'>
Some drivers might misbehave if TSO packets get too big.

GVE for instance uses a 16bit field in its TX descriptor,
and will do bad things if a packet is bigger than 2^16 bytes.

Linux TCP stack honors dev-&gt;gso_max_size, but there are
other ways for too big packets to reach an ndo_start_xmit()
handler : virtio_net, af_packet, GRO...

Add a generic check in gso_features_check() and fallback
to GSO when needed.

gso_max_size was added in the blamed commit.

Fixes: 82cc1a7a5687 ("[NET]: Add per-connection option to set max TSO frame size")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Link: https://lore.kernel.org/r/20231219125331.4127498-1-edumazet@google.com
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
</content>
</entry>
<entry>
<title>bpf: Fix dev's rx stats for bpf_redirect_peer traffic</title>
<updated>2023-11-20T18:15:16Z</updated>
<author>
<name>Peilin Ye</name>
<email>peilin.ye@bytedance.com</email>
</author>
<published>2023-11-14T00:42:17Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=024ee930cb3c9ae49e4266aee89cfde0ebb407e1'/>
<id>urn:sha1:024ee930cb3c9ae49e4266aee89cfde0ebb407e1</id>
<content type='text'>
Traffic redirected by bpf_redirect_peer() (used by recent CNIs like Cilium)
is not accounted for in the RX stats of supported devices (that is, veth
and netkit), confusing user space metrics collectors such as cAdvisor [0],
as reported by Youlun.

Fix it by calling dev_sw_netstats_rx_add() in skb_do_redirect(), to update
RX traffic counters. Devices that support ndo_get_peer_dev _must_ use the
@tstats per-CPU counters (instead of @lstats, or @dstats).

To make this more fool-proof, error out when ndo_get_peer_dev is set but
@tstats are not selected.

  [0] Specifically, the "container_network_receive_{byte,packet}s_total"
      counters are affected.

Fixes: 9aa1206e8f48 ("bpf: Add redirect_peer helper")
Reported-by: Youlun Zhang &lt;zhangyoulun@bytedance.com&gt;
Signed-off-by: Peilin Ye &lt;peilin.ye@bytedance.com&gt;
Co-developed-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Reviewed-by: Nikolay Aleksandrov &lt;razor@blackwall.org&gt;
Link: https://lore.kernel.org/r/20231114004220.6495-6-daniel@iogearbox.net
Signed-off-by: Martin KaFai Lau &lt;martin.lau@kernel.org&gt;
</content>
</entry>
<entry>
<title>net: Move {l,t,d}stats allocation to core and convert veth &amp; vrf</title>
<updated>2023-11-20T18:15:16Z</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2023-11-14T00:42:14Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=34d21de99cea9cb17967874313e5b0262527833c'/>
<id>urn:sha1:34d21de99cea9cb17967874313e5b0262527833c</id>
<content type='text'>
Move {l,t,d}stats allocation to the core and let netdevs pick the stats
type they need. That way the driver doesn't have to bother with error
handling (allocation failure checking, making sure free happens in the
right spot, etc) - all happening in the core.

Co-developed-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Reviewed-by: Nikolay Aleksandrov &lt;razor@blackwall.org&gt;
Cc: David Ahern &lt;dsahern@kernel.org&gt;
Link: https://lore.kernel.org/r/20231114004220.6495-3-daniel@iogearbox.net
Signed-off-by: Martin KaFai Lau &lt;martin.lau@kernel.org&gt;
</content>
</entry>
<entry>
<title>net: Fix undefined behavior in netdev name allocation</title>
<updated>2023-11-15T11:05:08Z</updated>
<author>
<name>Gal Pressman</name>
<email>gal@nvidia.com</email>
</author>
<published>2023-11-14T07:56:18Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=674e318089468ece99aef4796eaef7add57f36b2'/>
<id>urn:sha1:674e318089468ece99aef4796eaef7add57f36b2</id>
<content type='text'>
Cited commit removed the strscpy() call and kept the snprintf() only.

It is common to use 'dev-&gt;name' as the format string before a netdev is
registered, this results in 'res' and 'name' pointers being equal.
According to POSIX, if copying takes place between objects that overlap
as a result of a call to sprintf() or snprintf(), the results are
undefined.

Add back the strscpy() and use 'buf' as an intermediate buffer.

Fixes: 7ad17b04dc7b ("net: trust the bitmap in __dev_alloc_name()")
Cc: Jakub Kicinski &lt;kuba@kernel.org&gt;
Reviewed-by: Vlad Buslov &lt;vladbu@nvidia.com&gt;
Signed-off-by: Gal Pressman &lt;gal@nvidia.com&gt;
Reviewed-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Reviewed-by: Simon Horman &lt;horms@kernel.org&gt;
Reviewed-by: Jiri Pirko &lt;jiri@nvidia.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>Merge tag 'for-netdev' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next</title>
<updated>2023-10-27T03:02:41Z</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2023-10-27T03:02:40Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c6f9b7138bf5c6b826175c9e0ad5f5dbfff4fa36'/>
<id>urn:sha1:c6f9b7138bf5c6b826175c9e0ad5f5dbfff4fa36</id>
<content type='text'>
Daniel Borkmann says:

====================
pull-request: bpf-next 2023-10-26

We've added 51 non-merge commits during the last 10 day(s) which contain
a total of 75 files changed, 5037 insertions(+), 200 deletions(-).

The main changes are:

1) Add open-coded task, css_task and css iterator support.
   One of the use cases is customizable OOM victim selection via BPF,
   from Chuyi Zhou.

2) Fix BPF verifier's iterator convergence logic to use exact states
   comparison for convergence checks, from Eduard Zingerman,
   Andrii Nakryiko and Alexei Starovoitov.

3) Add BPF programmable net device where bpf_mprog defines the logic
   of its xmit routine. It can operate in L3 and L2 mode,
   from Daniel Borkmann and Nikolay Aleksandrov.

4) Batch of fixes for BPF per-CPU kptr and re-enable unit_size checking
   for global per-CPU allocator, from Hou Tao.

5) Fix libbpf which eagerly assumed that SHT_GNU_verdef ELF section
   was going to be present whenever a binary has SHT_GNU_versym section,
   from Andrii Nakryiko.

6) Fix BPF ringbuf correctness to fold smp_mb__before_atomic() into
   atomic_set_release(), from Paul E. McKenney.

7) Add a warning if NAPI callback missed xdp_do_flush() under
   CONFIG_DEBUG_NET which helps checking if drivers were missing
   the former, from Sebastian Andrzej Siewior.

8) Fix missed RCU read-lock in bpf_task_under_cgroup() which was throwing
   a warning under sleepable programs, from Yafang Shao.

9) Avoid unnecessary -EBUSY from htab_lock_bucket by disabling IRQ before
   checking map_locked, from Song Liu.

10) Make BPF CI linked_list failure test more robust,
    from Kumar Kartikeya Dwivedi.

11) Enable samples/bpf to be built as PIE in Fedora, from Viktor Malik.

12) Fix xsk starving when multiple xsk sockets were associated with
    a single xsk_buff_pool, from Albert Huang.

13) Clarify the signed modulo implementation for the BPF ISA standardization
    document that it uses truncated division, from Dave Thaler.

14) Improve BPF verifier's JEQ/JNE branch taken logic to also consider
    signed bounds knowledge, from Andrii Nakryiko.

15) Add an option to XDP selftests to use multi-buffer AF_XDP
    xdp_hw_metadata and mark used XDP programs as capable to use frags,
    from Larysa Zaremba.

16) Fix bpftool's BTF dumper wrt printing a pointer value and another
    one to fix struct_ops dump in an array, from Manu Bretelle.

* tag 'for-netdev' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (51 commits)
  netkit: Remove explicit active/peer ptr initialization
  selftests/bpf: Fix selftests broken by mitigations=off
  samples/bpf: Allow building with custom bpftool
  samples/bpf: Fix passing LDFLAGS to libbpf
  samples/bpf: Allow building with custom CFLAGS/LDFLAGS
  bpf: Add more WARN_ON_ONCE checks for mismatched alloc and free
  selftests/bpf: Add selftests for netkit
  selftests/bpf: Add netlink helper library
  bpftool: Extend net dump with netkit progs
  bpftool: Implement link show support for netkit
  libbpf: Add link-based API for netkit
  tools: Sync if_link uapi header
  netkit, bpf: Add bpf programmable net device
  bpf: Improve JEQ/JNE branch taken logic
  bpf: Fold smp_mb__before_atomic() into atomic_set_release()
  bpf: Fix unnecessary -EBUSY from htab_lock_bucket
  xsk: Avoid starving the xsk further down the list
  bpf: print full verifier states on infinite loop detection
  selftests/bpf: test if state loops are detected in a tricky case
  bpf: correct loop detection for iterators convergence
  ...
====================

Link: https://lore.kernel.org/r/20231026150509.2824-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>net: remove else after return in dev_prep_valid_name()</title>
<updated>2023-10-24T20:02:59Z</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2023-10-23T15:23:46Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ce4cfa2318afcd74cc41992e306a28fa04e5d484'/>
<id>urn:sha1:ce4cfa2318afcd74cc41992e306a28fa04e5d484</id>
<content type='text'>
Remove unnecessary else clauses after return.
I copied this if / else construct from somewhere,
it makes the code harder to read.

Reviewed-by: Jiri Pirko &lt;jiri@nvidia.com&gt;
Link: https://lore.kernel.org/r/20231023152346.3639749-7-kuba@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>net: remove dev_valid_name() check from __dev_alloc_name()</title>
<updated>2023-10-24T20:02:58Z</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2023-10-23T15:23:45Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=70e1b14c1bcbbb0854311ff8bed6cf4db75d5f05'/>
<id>urn:sha1:70e1b14c1bcbbb0854311ff8bed6cf4db75d5f05</id>
<content type='text'>
__dev_alloc_name() is only called by dev_prep_valid_name(),
which already checks that name is valid.

Reviewed-by: Jiri Pirko &lt;jiri@nvidia.com&gt;
Link: https://lore.kernel.org/r/20231023152346.3639749-6-kuba@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>net: trust the bitmap in __dev_alloc_name()</title>
<updated>2023-10-24T20:02:58Z</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2023-10-23T15:23:44Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7ad17b04dc7bdcdd1f85e460c38da55b0afa2422'/>
<id>urn:sha1:7ad17b04dc7bdcdd1f85e460c38da55b0afa2422</id>
<content type='text'>
Prior to restructuring __dev_alloc_name() handled both printf
and non-printf names. In a clever attempt at code reuse it
always prints the name into a buffer and checks if it's
a duplicate.

Trust the bitmap, and return an error if its full.

This shrinks the possible ID space by one from 32K to 32K - 1,
as previously the max value would have been tried as a valid ID.
It seems very unlikely that anyone would care as we heard
no requests to increase the max beyond 32k.

Reviewed-by: Jiri Pirko &lt;jiri@nvidia.com&gt;
Link: https://lore.kernel.org/r/20231023152346.3639749-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>net: reduce indentation of __dev_alloc_name()</title>
<updated>2023-10-24T20:02:58Z</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2023-10-23T15:23:43Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9a810468126c846299d867f73dd7053064c29be1'/>
<id>urn:sha1:9a810468126c846299d867f73dd7053064c29be1</id>
<content type='text'>
All callers of __dev_valid_name() go thru dev_prep_valid_name()
which handles the non-printf case. Focus __dev_alloc_name() on
the sprintf case, remove the indentation level.

Minor functional change of returning -EINVAL if % is not found,
which should now never happen.

Reviewed-by: Jiri Pirko &lt;jiri@nvidia.com&gt;
Link: https://lore.kernel.org/r/20231023152346.3639749-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>net: make dev_alloc_name() call dev_prep_valid_name()</title>
<updated>2023-10-24T20:02:58Z</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2023-10-23T15:23:42Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=556c755a4d8143007c745b6ad894611a04173b53'/>
<id>urn:sha1:556c755a4d8143007c745b6ad894611a04173b53</id>
<content type='text'>
__dev_alloc_name() handles both the sprintf and non-sprintf
target names. This complicates the code.

dev_prep_valid_name() already handles the non-sprintf case,
before calling __dev_alloc_name(), make the only other caller
also go thru dev_prep_valid_name(). This way we can drop
the non-sprintf handling in __dev_alloc_name() in one of
the next changes.

commit 55a5ec9b7710 ("Revert "net: core: dev_get_valid_name is now the same as dev_alloc_name_ns"") and
commit 029b6d140550 ("Revert "net: core: maybe return -EEXIST in __dev_alloc_name"")
tell us that we can't start returning -EEXIST from dev_alloc_name()
on name duplicates. Bite the bullet and pass the expected errno to
dev_prep_valid_name().

dev_prep_valid_name() must now propagate out the allocated id
for printf names.

Reviewed-by: Jiri Pirko &lt;jiri@nvidia.com&gt;
Link: https://lore.kernel.org/r/20231023152346.3639749-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
</feed>
