<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/include/net/netfilter, branch v6.7</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v6.7</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v6.7'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2023-12-20T09:43:21Z</updated>
<entry>
<title>netfilter: nf_tables: set transport offset from mac header for netdev/egress</title>
<updated>2023-12-20T09:43:21Z</updated>
<author>
<name>Pablo Neira Ayuso</name>
<email>pablo@netfilter.org</email>
</author>
<published>2023-12-14T10:50:12Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=0ae8e4cca78781401b17721bfb72718fdf7b4912'/>
<id>urn:sha1:0ae8e4cca78781401b17721bfb72718fdf7b4912</id>
<content type='text'>
Before this patch, transport offset (pkt-&gt;thoff) provides an offset
relative to the network header. This is fine for the inet families
because skb-&gt;data points to the network header in such case. However,
from netdev/egress, skb-&gt;data points to the mac header (if available),
thus, pkt-&gt;thoff is missing the mac header length.

Add skb_network_offset() to the transport offset (pkt-&gt;thoff) for
netdev, so transport header mangling works as expected. Adjust payload
fast eval function to use skb-&gt;data now that pkt-&gt;thoff provides an
absolute offset. This explains why users report that matching on
egress/netdev works but payload mangling does not.

This patch implicitly fixes payload mangling for IPv4 packets in
netdev/egress given skb_store_bits() requires an offset from skb-&gt;data
to reach the transport header.

I suspect that nft_exthdr and the trace infra were also broken from
netdev/egress because they also take skb-&gt;data as start, and pkt-&gt;thoff
was not correct.

Note that IPv6 is fine because ipv6_find_hdr() already provides a
transport offset starting from skb-&gt;data, which includes
skb_network_offset().

The bridge family also uses nft_set_pktinfo_ipv4_validate(), but there
skb_network_offset() is zero, so the update in this patch does not alter
the existing behaviour.

Fixes: 42df6e1d221d ("netfilter: Introduce egress hook")
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
</entry>
<entry>
<title>net/sched: act_ct: Take per-cb reference to tcf_ct_flow_table</title>
<updated>2023-12-11T09:59:58Z</updated>
<author>
<name>Vlad Buslov</name>
<email>vladbu@nvidia.com</email>
</author>
<published>2023-12-05T17:25:54Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=125f1c7f26ffcdbf96177abe75b70c1a6ceb17bc'/>
<id>urn:sha1:125f1c7f26ffcdbf96177abe75b70c1a6ceb17bc</id>
<content type='text'>
The referenced change added custom cleanup code to act_ct to delete any
callbacks registered on the parent block when deleting the
tcf_ct_flow_table instance. However, the underlying issue is that the
drivers don't obtain the reference to the tcf_ct_flow_table instance when
registering callbacks which means that not only driver callbacks may still
be on the table when deleting it but also that the driver can still have
pointers to its internal nf_flowtable and can use it concurrently which
results either warning in netfilter[0] or use-after-free.

Fix the issue by taking a reference to the underlying struct
tcf_ct_flow_table instance when registering the callback and release the
reference when unregistering. Expose new API required for such reference
counting by adding two new callbacks to nf_flowtable_type and implementing
them for act_ct flowtable_ct type. This fixes the issue by extending the
lifetime of nf_flowtable until all users have unregistered.

[0]:
[106170.938634] ------------[ cut here ]------------
[106170.939111] WARNING: CPU: 21 PID: 3688 at include/net/netfilter/nf_flow_table.h:262 mlx5_tc_ct_del_ft_cb+0x267/0x2b0 [mlx5_core]
[106170.940108] Modules linked in: act_ct nf_flow_table act_mirred act_skbedit act_tunnel_key vxlan cls_matchall nfnetlink_cttimeout act_gact cls_flower sch_ingress mlx5_vdpa vringh vhost_iotlb vdpa bonding openvswitch nsh rpcrdma rdma_ucm
ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm mlx5_ib ib_uverbs ib_core xt_MASQUERADE nf_conntrack_netlink nfnetlink iptable_nat xt_addrtype xt_conntrack nf_nat br_netfilter rpcsec_gss_krb5 auth_rpcgss oid_regis
try overlay mlx5_core
[106170.943496] CPU: 21 PID: 3688 Comm: kworker/u48:0 Not tainted 6.6.0-rc7_for_upstream_min_debug_2023_11_01_13_02 #1
[106170.944361] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[106170.945292] Workqueue: mlx5e mlx5e_rep_neigh_update [mlx5_core]
[106170.945846] RIP: 0010:mlx5_tc_ct_del_ft_cb+0x267/0x2b0 [mlx5_core]
[106170.946413] Code: 89 ef 48 83 05 71 a4 14 00 01 e8 f4 06 04 e1 48 83 05 6c a4 14 00 01 48 83 c4 28 5b 5d 41 5c 41 5d c3 48 83 05 d1 8b 14 00 01 &lt;0f&gt; 0b 48 83 05 d7 8b 14 00 01 e9 96 fe ff ff 48 83 05 a2 90 14 00
[106170.947924] RSP: 0018:ffff88813ff0fcb8 EFLAGS: 00010202
[106170.948397] RAX: 0000000000000000 RBX: ffff88811eabac40 RCX: ffff88811eabad48
[106170.949040] RDX: ffff88811eab8000 RSI: ffffffffa02cd560 RDI: 0000000000000000
[106170.949679] RBP: ffff88811eab8000 R08: 0000000000000001 R09: ffffffffa0229700
[106170.950317] R10: ffff888103538fc0 R11: 0000000000000001 R12: ffff88811eabad58
[106170.950969] R13: ffff888110c01c00 R14: ffff888106b40000 R15: 0000000000000000
[106170.951616] FS:  0000000000000000(0000) GS:ffff88885fd40000(0000) knlGS:0000000000000000
[106170.952329] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[106170.952834] CR2: 00007f1cefd28cb0 CR3: 000000012181b006 CR4: 0000000000370ea0
[106170.953482] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[106170.954121] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[106170.954766] Call Trace:
[106170.955057]  &lt;TASK&gt;
[106170.955315]  ? __warn+0x79/0x120
[106170.955648]  ? mlx5_tc_ct_del_ft_cb+0x267/0x2b0 [mlx5_core]
[106170.956172]  ? report_bug+0x17c/0x190
[106170.956537]  ? handle_bug+0x3c/0x60
[106170.956891]  ? exc_invalid_op+0x14/0x70
[106170.957264]  ? asm_exc_invalid_op+0x16/0x20
[106170.957666]  ? mlx5_del_flow_rules+0x10/0x310 [mlx5_core]
[106170.958172]  ? mlx5_tc_ct_block_flow_offload_add+0x1240/0x1240 [mlx5_core]
[106170.958788]  ? mlx5_tc_ct_del_ft_cb+0x267/0x2b0 [mlx5_core]
[106170.959339]  ? mlx5_tc_ct_del_ft_cb+0xc6/0x2b0 [mlx5_core]
[106170.959854]  ? mapping_remove+0x154/0x1d0 [mlx5_core]
[106170.960342]  ? mlx5e_tc_action_miss_mapping_put+0x4f/0x80 [mlx5_core]
[106170.960927]  mlx5_tc_ct_delete_flow+0x76/0xc0 [mlx5_core]
[106170.961441]  mlx5_free_flow_attr_actions+0x13b/0x220 [mlx5_core]
[106170.962001]  mlx5e_tc_del_fdb_flow+0x22c/0x3b0 [mlx5_core]
[106170.962524]  mlx5e_tc_del_flow+0x95/0x3c0 [mlx5_core]
[106170.963034]  mlx5e_flow_put+0x73/0xe0 [mlx5_core]
[106170.963506]  mlx5e_put_flow_list+0x38/0x70 [mlx5_core]
[106170.964002]  mlx5e_rep_update_flows+0xec/0x290 [mlx5_core]
[106170.964525]  mlx5e_rep_neigh_update+0x1da/0x310 [mlx5_core]
[106170.965056]  process_one_work+0x13a/0x2c0
[106170.965443]  worker_thread+0x2e5/0x3f0
[106170.965808]  ? rescuer_thread+0x410/0x410
[106170.966192]  kthread+0xc6/0xf0
[106170.966515]  ? kthread_complete_and_exit+0x20/0x20
[106170.966970]  ret_from_fork+0x2d/0x50
[106170.967332]  ? kthread_complete_and_exit+0x20/0x20
[106170.967774]  ret_from_fork_asm+0x11/0x20
[106170.970466]  &lt;/TASK&gt;
[106170.970726] ---[ end trace 0000000000000000 ]---

Fixes: 77ac5e40c44e ("net/sched: act_ct: remove and free nf_table callbacks")
Signed-off-by: Vlad Buslov &lt;vladbu@nvidia.com&gt;
Reviewed-by: Paul Blakey &lt;paulb@nvidia.com&gt;
Acked-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>netfilter: nf_tables: fix pointer math issue in nft_byteorder_eval()</title>
<updated>2023-11-14T15:16:21Z</updated>
<author>
<name>Dan Carpenter</name>
<email>dan.carpenter@linaro.org</email>
</author>
<published>2023-11-03T06:42:51Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c301f0981fdd3fd1ffac6836b423c4d7a8e0eb63'/>
<id>urn:sha1:c301f0981fdd3fd1ffac6836b423c4d7a8e0eb63</id>
<content type='text'>
The problem is in nft_byteorder_eval() where we are iterating through a
loop and writing to dst[0], dst[1], dst[2] and so on...  On each
iteration we are writing 8 bytes.  But dst[] is an array of u32 so each
element only has space for 4 bytes.  That means that every iteration
overwrites part of the previous element.

I spotted this bug while reviewing commit caf3ef7468f7 ("netfilter:
nf_tables: prevent OOB access in nft_byteorder_eval") which is a related
issue.  I think that the reason we have not detected this bug in testing
is that most of time we only write one element.

Fixes: ce1e7989d989 ("netfilter: nft_byteorder: provide 64bit le/be conversion")
Signed-off-by: Dan Carpenter &lt;dan.carpenter@linaro.org&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
</entry>
<entry>
<title>net/sched: act_ct: Always fill offloading tuple iifidx</title>
<updated>2023-11-09T01:47:08Z</updated>
<author>
<name>Vlad Buslov</name>
<email>vladbu@nvidia.com</email>
</author>
<published>2023-11-03T15:14:10Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9bc64bd0cd765f696fcd40fc98909b1f7c73b2ba'/>
<id>urn:sha1:9bc64bd0cd765f696fcd40fc98909b1f7c73b2ba</id>
<content type='text'>
Referenced commit doesn't always set iifidx when offloading the flow to
hardware. Fix the following cases:

- nf_conn_act_ct_ext_fill() is called before extension is created with
nf_conn_act_ct_ext_add() in tcf_ct_act(). This can cause rule offload with
unspecified iifidx when connection is offloaded after only single
original-direction packet has been processed by tc data path. Always fill
the new nf_conn_act_ct_ext instance after creating it in
nf_conn_act_ct_ext_add().

- Offloading of unidirectional UDP NEW connections is now supported, but ct
flow iifidx field is not updated when connection is promoted to
bidirectional which can result reply-direction iifidx to be zero when
refreshing the connection. Fill in the extension and update flow iifidx
before calling flow_offload_refresh().

Fixes: 9795ded7f924 ("net/sched: act_ct: Fill offloading tuple iifidx")
Reviewed-by: Paul Blakey &lt;paulb@nvidia.com&gt;
Signed-off-by: Vlad Buslov &lt;vladbu@nvidia.com&gt;
Reviewed-by: Simon Horman &lt;horms@kernel.org&gt;
Fixes: 6a9bad0069cf ("net/sched: act_ct: offload UDP NEW connections")
Link: https://lore.kernel.org/r/20231103151410.764271-1-vladbu@nvidia.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net</title>
<updated>2023-10-26T20:46:28Z</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2023-10-26T20:42:19Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ec4c20ca09831ddba8fac10a7d82a9902e96e717'/>
<id>urn:sha1:ec4c20ca09831ddba8fac10a7d82a9902e96e717</id>
<content type='text'>
Cross-merge networking fixes after downstream PR.

Conflicts:

net/mac80211/rx.c
  91535613b609 ("wifi: mac80211: don't drop all unprotected public action frames")
  6c02fab72429 ("wifi: mac80211: split ieee80211_drop_unencrypted_mgmt() return value")

Adjacent changes:

drivers/net/ethernet/apm/xgene/xgene_enet_main.c
  61471264c018 ("net: ethernet: apm: Convert to platform remove callback returning void")
  d2ca43f30611 ("net: xgene: Fix unused xgene_enet_of_match warning for !CONFIG_OF")

net/vmw_vsock/virtio_transport.c
  64c99d2d6ada ("vsock/virtio: support to send non-linear skb")
  53b08c498515 ("vsock/virtio: initialize the_virtio_vsock before using VQs")

Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>netfilter: flowtable: GC pushes back packets to classic path</title>
<updated>2023-10-25T09:35:46Z</updated>
<author>
<name>Pablo Neira Ayuso</name>
<email>pablo@netfilter.org</email>
</author>
<published>2023-10-24T19:09:47Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=735795f68b37e9bb49f642407a0d49b1631ea1c7'/>
<id>urn:sha1:735795f68b37e9bb49f642407a0d49b1631ea1c7</id>
<content type='text'>
Since 41f2c7c342d3 ("net/sched: act_ct: Fix promotion of offloaded
unreplied tuple"), flowtable GC pushes back flows with IPS_SEEN_REPLY
back to classic path in every run, ie. every second. This is because of
a new check for NF_FLOW_HW_ESTABLISHED which is specific of sched/act_ct.

In Netfilter's flowtable case, NF_FLOW_HW_ESTABLISHED never gets set on
and IPS_SEEN_REPLY is unreliable since users decide when to offload the
flow before, such bit might be set on at a later stage.

Fix it by adding a custom .gc handler that sched/act_ct can use to
deal with its NF_FLOW_HW_ESTABLISHED bit.

Fixes: 41f2c7c342d3 ("net/sched: act_ct: Fix promotion of offloaded unreplied tuple")
Reported-by: Vladimir Smelhaus &lt;vl.sm@email.cz&gt;
Reviewed-by: Paul Blakey &lt;paulb@nvidia.com&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
</entry>
<entry>
<title>netfilter: nf_tables: set-&gt;ops-&gt;insert returns opaque set element in case of EEXIST</title>
<updated>2023-10-24T11:37:46Z</updated>
<author>
<name>Pablo Neira Ayuso</name>
<email>pablo@netfilter.org</email>
</author>
<published>2023-10-18T20:23:35Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=078996fcd657e6e0c3a72b7d5806d04c32e74250'/>
<id>urn:sha1:078996fcd657e6e0c3a72b7d5806d04c32e74250</id>
<content type='text'>
Return struct nft_elem_priv instead of struct nft_set_ext for
consistency with ("netfilter: nf_tables: expose opaque set element as
struct nft_elem_priv") and to prepare the introduction of element
timeout updates from control path.

Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
</entry>
<entry>
<title>netfilter: nf_tables: shrink memory consumption of set elements</title>
<updated>2023-10-24T11:37:42Z</updated>
<author>
<name>Pablo Neira Ayuso</name>
<email>pablo@netfilter.org</email>
</author>
<published>2023-10-16T12:29:27Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=0e1ea651c9717ddcd8e0648d8468477a31867b0a'/>
<id>urn:sha1:0e1ea651c9717ddcd8e0648d8468477a31867b0a</id>
<content type='text'>
Instead of copying struct nft_set_elem into struct nft_trans_elem, store
the pointer to the opaque set element object in the transaction. Adapt
set backend API (and set backend implementations) to take the pointer to
opaque set element representation whenever required.

This patch deconstifies .remove() and .activate() set backend API since
these modify the set element opaque object. And it also constify
nft_set_elem_ext() this provides access to the nft_set_ext struct
without updating the object.

According to pahole on x86_64, this patch shrinks struct nft_trans_elem
size from 216 to 24 bytes.

This patch also reduces stack memory consumption by removing the
template struct nft_set_elem object, using the opaque set element object
instead such as from the set iterator API, catchall elements and the get
element command.

Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
</entry>
<entry>
<title>netfilter: nf_tables: expose opaque set element as struct nft_elem_priv</title>
<updated>2023-10-24T11:16:30Z</updated>
<author>
<name>Pablo Neira Ayuso</name>
<email>pablo@netfilter.org</email>
</author>
<published>2023-10-18T20:23:07Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9dad402b89e81a0516bad5e0ac009b7a0a80898f'/>
<id>urn:sha1:9dad402b89e81a0516bad5e0ac009b7a0a80898f</id>
<content type='text'>
Add placeholder structure and place it at the beginning of each struct
nft_*_elem for each existing set backend, instead of exposing elements
as void type to the frontend which defeats compiler type checks. Use
this pointer to this new type to replace void *.

This patch updates the following set backend API to use this new struct
nft_elem_priv placeholder structure:

- update
- deactivate
- flush
- get

as well as the following helper functions:

- nft_set_elem_ext()
- nft_set_elem_init()
- nft_set_elem_destroy()
- nf_tables_set_elem_destroy()

This patch adds nft_elem_priv_cast() to cast struct nft_elem_priv to
native element representation from the corresponding set backend.
BUILD_BUG_ON() makes sure this .priv placeholder is always at the top
of the opaque set element representation.

Suggested-by: Florian Westphal &lt;fw@strlen.de&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
</entry>
<entry>
<title>netfilter: nf_tables: set backend .flush always succeeds</title>
<updated>2023-10-24T11:16:30Z</updated>
<author>
<name>Pablo Neira Ayuso</name>
<email>pablo@netfilter.org</email>
</author>
<published>2023-10-18T20:20:23Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6509a2e410c3cb36c78a0a85c6102debe171337e'/>
<id>urn:sha1:6509a2e410c3cb36c78a0a85c6102debe171337e</id>
<content type='text'>
.flush is always successful since this results from iterating over the
set elements to toggle mark the element as inactive in the next
generation.

Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
</entry>
</feed>
