<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/net/core/skmsg.c, branch v6.3</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v6.3</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v6.3'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2023-01-23T11:26:50Z</updated>
<entry>
<title>net/sock: Introduce trace_sk_data_ready()</title>
<updated>2023-01-23T11:26:50Z</updated>
<author>
<name>Peilin Ye</name>
<email>peilin.ye@bytedance.com</email>
</author>
<published>2023-01-20T00:45:16Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=40e0b09081420853542571c38875b48b60404ebb'/>
<id>urn:sha1:40e0b09081420853542571c38875b48b60404ebb</id>
<content type='text'>
As suggested by Cong, introduce a tracepoint for all -&gt;sk_data_ready()
callback implementations.  For example:

&lt;...&gt;
  iperf-609  [002] .....  70.660425: sk_data_ready: family=2 protocol=6 func=sock_def_readable
  iperf-609  [002] .....  70.660436: sk_data_ready: family=2 protocol=6 func=sock_def_readable
&lt;...&gt;

Suggested-by: Cong Wang &lt;cong.wang@bytedance.com&gt;
Signed-off-by: Peilin Ye &lt;peilin.ye@bytedance.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf, sockmap: Fix missing BPF_F_INGRESS flag when using apply_bytes</title>
<updated>2022-12-01T00:07:32Z</updated>
<author>
<name>Pengcheng Yang</name>
<email>yangpc@wangsu.com</email>
</author>
<published>2022-11-29T10:40:39Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a351d6087bf7d3d8440d58d3bf244ec64b89394a'/>
<id>urn:sha1:a351d6087bf7d3d8440d58d3bf244ec64b89394a</id>
<content type='text'>
When redirecting, we use sk_msg_to_ingress() to get the BPF_F_INGRESS
flag from the msg-&gt;flags. If apply_bytes is used and it is larger than
the current data being processed, sk_psock_msg_verdict() will not be
called when sendmsg() is called again. At this time, the msg-&gt;flags is 0,
and we lost the BPF_F_INGRESS flag.

So we need to save the BPF_F_INGRESS flag in sk_psock and use it when
redirection.

Fixes: 8934ce2fd081 ("bpf: sockmap redirect ingress support")
Signed-off-by: Pengcheng Yang &lt;yangpc@wangsu.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: Jakub Sitnicki &lt;jakub@cloudflare.com&gt;
Link: https://lore.kernel.org/bpf/1669718441-2654-3-git-send-email-yangpc@wangsu.com
</content>
</entry>
<entry>
<title>bpf, sock_map: Move cancel_work_sync() out of sock lock</title>
<updated>2022-11-03T12:51:06Z</updated>
<author>
<name>Cong Wang</name>
<email>cong.wang@bytedance.com</email>
</author>
<published>2022-11-02T04:34:17Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=8bbabb3fddcd0f858be69ed5abc9b470a239d6f2'/>
<id>urn:sha1:8bbabb3fddcd0f858be69ed5abc9b470a239d6f2</id>
<content type='text'>
Stanislav reported a lockdep warning, which is caused by the
cancel_work_sync() called inside sock_map_close(), as analyzed
below by Jakub:

psock-&gt;work.func = sk_psock_backlog()
  ACQUIRE psock-&gt;work_mutex
    sk_psock_handle_skb()
      skb_send_sock()
        __skb_send_sock()
          sendpage_unlocked()
            kernel_sendpage()
              sock-&gt;ops-&gt;sendpage = inet_sendpage()
                sk-&gt;sk_prot-&gt;sendpage = tcp_sendpage()
                  ACQUIRE sk-&gt;sk_lock
                    tcp_sendpage_locked()
                  RELEASE sk-&gt;sk_lock
  RELEASE psock-&gt;work_mutex

sock_map_close()
  ACQUIRE sk-&gt;sk_lock
  sk_psock_stop()
    sk_psock_clear_state(psock, SK_PSOCK_TX_ENABLED)
    cancel_work_sync()
      __cancel_work_timer()
        __flush_work()
          // wait for psock-&gt;work to finish
  RELEASE sk-&gt;sk_lock

We can move the cancel_work_sync() out of the sock lock protection,
but still before saved_close() was called.

Fixes: 799aa7f98d53 ("skmsg: Avoid lock_sock() in sk_psock_backlog()")
Reported-by: Stanislav Fomichev &lt;sdf@google.com&gt;
Signed-off-by: Cong Wang &lt;cong.wang@bytedance.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Tested-by: Jakub Sitnicki &lt;jakub@cloudflare.com&gt;
Acked-by: John Fastabend &lt;john.fastabend@gmail.com&gt;
Acked-by: Jakub Sitnicki &lt;jakub@cloudflare.com&gt;
Link: https://lore.kernel.org/bpf/20221102043417.279409-1-xiyou.wangcong@gmail.com
</content>
</entry>
<entry>
<title>skmsg: pass gfp argument to alloc_sk_msg()</title>
<updated>2022-10-16T19:57:17Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2022-10-15T21:24:41Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=2d1f274b95c6e4ba6a813b3b8e7a1a38d54a0a08'/>
<id>urn:sha1:2d1f274b95c6e4ba6a813b3b8e7a1a38d54a0a08</id>
<content type='text'>
syzbot found that alloc_sk_msg() could be called from a
non sleepable context. sk_psock_verdict_recv() uses
rcu_read_lock() protection.

We need the callers to pass a gfp_t argument to avoid issues.

syzbot report was:

BUG: sleeping function called from invalid context at include/linux/sched/mm.h:274
in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 3613, name: syz-executor414
preempt_count: 0, expected: 0
RCU nest depth: 1, expected: 0
INFO: lockdep is turned off.
CPU: 0 PID: 3613 Comm: syz-executor414 Not tainted 6.0.0-syzkaller-09589-g55be6084c8e0 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/22/2022
Call Trace:
&lt;TASK&gt;
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x1e3/0x2cb lib/dump_stack.c:106
__might_resched+0x538/0x6a0 kernel/sched/core.c:9877
might_alloc include/linux/sched/mm.h:274 [inline]
slab_pre_alloc_hook mm/slab.h:700 [inline]
slab_alloc_node mm/slub.c:3162 [inline]
slab_alloc mm/slub.c:3256 [inline]
kmem_cache_alloc_trace+0x59/0x310 mm/slub.c:3287
kmalloc include/linux/slab.h:600 [inline]
kzalloc include/linux/slab.h:733 [inline]
alloc_sk_msg net/core/skmsg.c:507 [inline]
sk_psock_skb_ingress_self+0x5c/0x330 net/core/skmsg.c:600
sk_psock_verdict_apply+0x395/0x440 net/core/skmsg.c:1014
sk_psock_verdict_recv+0x34d/0x560 net/core/skmsg.c:1201
tcp_read_skb+0x4a1/0x790 net/ipv4/tcp.c:1770
tcp_rcv_established+0x129d/0x1a10 net/ipv4/tcp_input.c:5971
tcp_v4_do_rcv+0x479/0xac0 net/ipv4/tcp_ipv4.c:1681
sk_backlog_rcv include/net/sock.h:1109 [inline]
__release_sock+0x1d8/0x4c0 net/core/sock.c:2906
release_sock+0x5d/0x1c0 net/core/sock.c:3462
tcp_sendmsg+0x36/0x40 net/ipv4/tcp.c:1483
sock_sendmsg_nosec net/socket.c:714 [inline]
sock_sendmsg net/socket.c:734 [inline]
__sys_sendto+0x46d/0x5f0 net/socket.c:2117
__do_sys_sendto net/socket.c:2129 [inline]
__se_sys_sendto net/socket.c:2125 [inline]
__x64_sys_sendto+0xda/0xf0 net/socket.c:2125
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

Fixes: 43312915b5ba ("skmsg: Get rid of unncessary memset()")
Reported-by: syzbot &lt;syzkaller@googlegroups.com&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Cong Wang &lt;cong.wang@bytedance.com&gt;
Cc: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Cc: John Fastabend &lt;john.fastabend@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>skmsg: Schedule psock work if the cached skb exists on the psock</title>
<updated>2022-09-26T15:48:05Z</updated>
<author>
<name>Liu Jian</name>
<email>liujian56@huawei.com</email>
</author>
<published>2022-09-07T07:13:11Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=bec217197b412d74168c6a42fc0f76d0cc9cad00'/>
<id>urn:sha1:bec217197b412d74168c6a42fc0f76d0cc9cad00</id>
<content type='text'>
In sk_psock_backlog function, for ingress direction skb, if no new data
packet arrives after the skb is cached, the cached skb does not have a
chance to be added to the receive queue of psock. As a result, the cached
skb cannot be received by the upper-layer application. Fix this by reschedule
the psock work to dispose the cached skb in sk_msg_recvmsg function.

Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Liu Jian &lt;liujian56@huawei.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: John Fastabend &lt;john.fastabend@gmail.com&gt;
Link: https://lore.kernel.org/bpf/20220907071311.60534-1-liujian56@huawei.com
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf</title>
<updated>2022-08-26T11:19:09Z</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2022-08-26T11:19:09Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=2e085ec0e2d7afa14bcfbcd4c41240f5d3372bfe'/>
<id>urn:sha1:2e085ec0e2d7afa14bcfbcd4c41240f5d3372bfe</id>
<content type='text'>
Daniel borkmann says:

====================
The following pull-request contains BPF updates for your *net* tree.

We've added 11 non-merge commits during the last 14 day(s) which contain
a total of 13 files changed, 61 insertions(+), 24 deletions(-).

The main changes are:

1) Fix BPF verifier's precision tracking around BPF ring buffer, from Kumar Kartikeya Dwivedi.

2) Fix regression in tunnel key infra when passing FLOWI_FLAG_ANYSRC, from Eyal Birger.

3) Fix insufficient permissions for bpf_sys_bpf() helper, from YiFei Zhu.

4) Fix splat from hitting BUG when purging effective cgroup programs, from Pu Lehui.

5) Fix range tracking for array poke descriptors, from Daniel Borkmann.

6) Fix corrupted packets for XDP_SHARED_UMEM in aligned mode, from Magnus Karlsson.

7) Fix NULL pointer splat in BPF sockmap sk_msg_recvmsg(), from Liu Jian.

8) Add READ_ONCE() to bpf_jit_limit when reading from sysctl, from Kuniyuki Iwashima.

9) Add BPF selftest lru_bug check to s390x deny list, from Daniel Müller.
====================

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tcp: handle pure FIN case correctly</title>
<updated>2022-08-18T18:04:56Z</updated>
<author>
<name>Cong Wang</name>
<email>cong.wang@bytedance.com</email>
</author>
<published>2022-08-17T19:54:45Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=2e23acd99efacfd2a63cb9725afbc65e4e964fb7'/>
<id>urn:sha1:2e23acd99efacfd2a63cb9725afbc65e4e964fb7</id>
<content type='text'>
When skb-&gt;len==0, the recv_actor() returns 0 too, but we also use 0
for error conditions. This patch amends this by propagating the errors
to tcp_read_skb() so that we can distinguish skb-&gt;len==0 case from
error cases.

Fixes: 04919bed948d ("tcp: Introduce tcp_read_skb()")
Reported-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: John Fastabend &lt;john.fastabend@gmail.com&gt;
Cc: Jakub Sitnicki &lt;jakub@cloudflare.com&gt;
Signed-off-by: Cong Wang &lt;cong.wang@bytedance.com&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>skmsg: Fix wrong last sg check in sk_msg_recvmsg()</title>
<updated>2022-08-17T20:33:20Z</updated>
<author>
<name>Liu Jian</name>
<email>liujian56@huawei.com</email>
</author>
<published>2022-08-09T09:49:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=583585e48d965338e73e1eb383768d16e0922d73'/>
<id>urn:sha1:583585e48d965338e73e1eb383768d16e0922d73</id>
<content type='text'>
Fix one kernel NULL pointer dereference as below:

[  224.462334] Call Trace:
[  224.462394]  __tcp_bpf_recvmsg+0xd3/0x380
[  224.462441]  ? sock_has_perm+0x78/0xa0
[  224.462463]  tcp_bpf_recvmsg+0x12e/0x220
[  224.462494]  inet_recvmsg+0x5b/0xd0
[  224.462534]  __sys_recvfrom+0xc8/0x130
[  224.462574]  ? syscall_trace_enter+0x1df/0x2e0
[  224.462606]  ? __do_page_fault+0x2de/0x500
[  224.462635]  __x64_sys_recvfrom+0x24/0x30
[  224.462660]  do_syscall_64+0x5d/0x1d0
[  224.462709]  entry_SYSCALL_64_after_hwframe+0x65/0xca

In commit 9974d37ea75f ("skmsg: Fix invalid last sg check in
sk_msg_recvmsg()"), we change last sg check to sg_is_last(),
but in sockmap redirection case (without stream_parser/stream_verdict/
skb_verdict), we did not mark the end of the scatterlist. Check the
sk_msg_alloc, sk_msg_page_add, and bpf_msg_push_data functions, they all
do not mark the end of sg. They are expected to use sg.end for end
judgment. So the judgment of '(i != msg_rx-&gt;sg.end)' is added back here.

Fixes: 9974d37ea75f ("skmsg: Fix invalid last sg check in sk_msg_recvmsg()")
Signed-off-by: Liu Jian &lt;liujian56@huawei.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: John Fastabend &lt;john.fastabend@gmail.com&gt;
Acked-by: Jakub Sitnicki &lt;jakub@cloudflare.com&gt;
Link: https://lore.kernel.org/bpf/20220809094915.150391-1-liujian56@huawei.com
</content>
</entry>
<entry>
<title>Merge tag 'net-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net</title>
<updated>2022-08-11T20:45:37Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-08-11T20:45:37Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7ebfc85e2cd7b08f518b526173e9a33b56b3913b'/>
<id>urn:sha1:7ebfc85e2cd7b08f518b526173e9a33b56b3913b</id>
<content type='text'>
Pull networking fixes from Jakub Kicinski:
 "Including fixes from bluetooth, bpf, can and netfilter.

  A little larger than usual but it's all fixes, no late features. It's
  large partially because of timing, and partially because of follow ups
  to stuff that got merged a week or so before the merge window and
  wasn't as widely tested. Maybe the Bluetooth fixes are a little
  alarming so we'll address that, but the rest seems okay and not scary.

  Notably we're including a fix for the netfilter Kconfig [1], your WiFi
  warning [2] and a bluetooth fix which should unblock syzbot [3].

  Current release - regressions:

   - Bluetooth:
      - don't try to cancel uninitialized works [3]
      - L2CAP: fix use-after-free caused by l2cap_chan_put

   - tls: rx: fix device offload after recent rework

   - devlink: fix UAF on failed reload and leftover locks in mlxsw

  Current release - new code bugs:

   - netfilter:
      - flowtable: fix incorrect Kconfig dependencies [1]
      - nf_tables: fix crash when nf_trace is enabled

   - bpf:
      - use proper target btf when exporting attach_btf_obj_id
      - arm64: fixes for bpf trampoline support

   - Bluetooth:
      - ISO: unlock on error path in iso_sock_setsockopt()
      - ISO: fix info leak in iso_sock_getsockopt()
      - ISO: fix iso_sock_getsockopt for BT_DEFER_SETUP
      - ISO: fix memory corruption on iso_pinfo.base
      - ISO: fix not using the correct QoS
      - hci_conn: fix updating ISO QoS PHY

   - phy: dp83867: fix get nvmem cell fail

  Previous releases - regressions:

   - wifi: cfg80211: fix validating BSS pointers in
     __cfg80211_connect_result [2]

   - atm: bring back zatm uAPI after ATM had been removed

   - properly fix old bug making bonding ARP monitor mode not being able
     to work with software devices with lockless Tx

   - tap: fix null-deref on skb-&gt;dev in dev_parse_header_protocol

   - revert "net: usb: ax88179_178a needs FLAG_SEND_ZLP" it helps some
     devices and breaks others

   - netfilter:
      - nf_tables: many fixes rejecting cross-object linking which may
        lead to UAFs
      - nf_tables: fix null deref due to zeroed list head
      - nf_tables: validate variable length element extension

   - bgmac: fix a BUG triggered by wrong bytes_compl

   - bcmgenet: indicate MAC is in charge of PHY PM

  Previous releases - always broken:

   - bpf:
      - fix bad pointer deref in bpf_sys_bpf() injected via test infra
      - disallow non-builtin bpf programs calling the prog_run command
      - don't reinit map value in prealloc_lru_pop
      - fix UAFs during the read of map iterator fd
      - fix invalidity check for values in sk local storage map
      - reject sleepable program for non-resched map iterator

   - mptcp:
      - move subflow cleanup in mptcp_destroy_common()
      - do not queue data on closed subflows

   - virtio_net: fix memory leak inside XDP_TX with mergeable

   - vsock: fix memory leak when multiple threads try to connect()

   - rework sk_user_data sharing to prevent psock leaks

   - geneve: fix TOS inheriting for ipv4

   - tunnels &amp; drivers: do not use RT_TOS for IPv6 flowlabel

   - phy: c45 baset1: do not skip aneg configuration if clock role is
     not specified

   - rose: avoid overflow when /proc displays timer information

   - x25: fix call timeouts in blocking connects

   - can: mcp251x: fix race condition on receive interrupt

   - can: j1939:
      - replace user-reachable WARN_ON_ONCE() with netdev_warn_once()
      - fix memory leak of skbs in j1939_session_destroy()

  Misc:

   - docs: bpf: clarify that many things are not uAPI

   - seg6: initialize induction variable to first valid array index (to
     silence clang vs objtool warning)

   - can: ems_usb: fix clang 14's -Wunaligned-access warning"

* tag 'net-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (117 commits)
  net: atm: bring back zatm uAPI
  dpaa2-eth: trace the allocated address instead of page struct
  net: add missing kdoc for struct genl_multicast_group::flags
  nfp: fix use-after-free in area_cache_get()
  MAINTAINERS: use my korg address for mt7601u
  mlxsw: minimal: Fix deadlock in ports creation
  bonding: fix reference count leak in balance-alb mode
  net: usb: qmi_wwan: Add support for Cinterion MV32
  bpf: Shut up kern_sys_bpf warning.
  net/tls: Use RCU API to access tls_ctx-&gt;netdev
  tls: rx: device: don't try to copy too much on detach
  tls: rx: device: bound the frag walk
  net_sched: cls_route: remove from list when handle is 0
  selftests: forwarding: Fix failing tests with old libnet
  net: refactor bpf_sk_reuseport_detach()
  net: fix refcount bug in sk_psock_get (2)
  selftests/bpf: Ensure sleepable program is rejected by hash map iter
  selftests/bpf: Add write tests for sk local storage map iterator
  selftests/bpf: Add tests for reading a dangling map iter fd
  bpf: Only allow sleepable program for resched-able iterator
  ...
</content>
</entry>
<entry>
<title>net: fix refcount bug in sk_psock_get (2)</title>
<updated>2022-08-11T04:47:58Z</updated>
<author>
<name>Hawkins Jiawei</name>
<email>yin31149@gmail.com</email>
</author>
<published>2022-08-05T07:48:34Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=2a0133723f9ebeb751cfce19f74ec07e108bef1f'/>
<id>urn:sha1:2a0133723f9ebeb751cfce19f74ec07e108bef1f</id>
<content type='text'>
Syzkaller reports refcount bug as follows:
------------[ cut here ]------------
refcount_t: saturated; leaking memory.
WARNING: CPU: 1 PID: 3605 at lib/refcount.c:19 refcount_warn_saturate+0xf4/0x1e0 lib/refcount.c:19
Modules linked in:
CPU: 1 PID: 3605 Comm: syz-executor208 Not tainted 5.18.0-syzkaller-03023-g7e062cda7d90 #0
 &lt;TASK&gt;
 __refcount_add_not_zero include/linux/refcount.h:163 [inline]
 __refcount_inc_not_zero include/linux/refcount.h:227 [inline]
 refcount_inc_not_zero include/linux/refcount.h:245 [inline]
 sk_psock_get+0x3bc/0x410 include/linux/skmsg.h:439
 tls_data_ready+0x6d/0x1b0 net/tls/tls_sw.c:2091
 tcp_data_ready+0x106/0x520 net/ipv4/tcp_input.c:4983
 tcp_data_queue+0x25f2/0x4c90 net/ipv4/tcp_input.c:5057
 tcp_rcv_state_process+0x1774/0x4e80 net/ipv4/tcp_input.c:6659
 tcp_v4_do_rcv+0x339/0x980 net/ipv4/tcp_ipv4.c:1682
 sk_backlog_rcv include/net/sock.h:1061 [inline]
 __release_sock+0x134/0x3b0 net/core/sock.c:2849
 release_sock+0x54/0x1b0 net/core/sock.c:3404
 inet_shutdown+0x1e0/0x430 net/ipv4/af_inet.c:909
 __sys_shutdown_sock net/socket.c:2331 [inline]
 __sys_shutdown_sock net/socket.c:2325 [inline]
 __sys_shutdown+0xf1/0x1b0 net/socket.c:2343
 __do_sys_shutdown net/socket.c:2351 [inline]
 __se_sys_shutdown net/socket.c:2349 [inline]
 __x64_sys_shutdown+0x50/0x70 net/socket.c:2349
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x46/0xb0
 &lt;/TASK&gt;

During SMC fallback process in connect syscall, kernel will
replaces TCP with SMC. In order to forward wakeup
smc socket waitqueue after fallback, kernel will sets
clcsk-&gt;sk_user_data to origin smc socket in
smc_fback_replace_callbacks().

Later, in shutdown syscall, kernel will calls
sk_psock_get(), which treats the clcsk-&gt;sk_user_data
as psock type, triggering the refcnt warning.

So, the root cause is that smc and psock, both will use
sk_user_data field. So they will mismatch this field
easily.

This patch solves it by using another bit(defined as
SK_USER_DATA_PSOCK) in PTRMASK, to mark whether
sk_user_data points to a psock object or not.
This patch depends on a PTRMASK introduced in commit f1ff5ce2cd5e
("net, sk_msg: Clear sk_user_data pointer on clone if tagged").

For there will possibly be more flags in the sk_user_data field,
this patch also refactor sk_user_data flags code to be more generic
to improve its maintainability.

Reported-and-tested-by: syzbot+5f26f85569bd179c18ce@syzkaller.appspotmail.com
Suggested-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Acked-by: Wen Gu &lt;guwen@linux.alibaba.com&gt;
Signed-off-by: Hawkins Jiawei &lt;yin31149@gmail.com&gt;
Reviewed-by: Jakub Sitnicki &lt;jakub@cloudflare.com&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
</feed>
