linux/drivers/virtio, branch v2.6.28

virtio_balloon: fix towards_target when deflating balloon

2008-08-25T14:19:25Z

Both v and vb->num_pages are u32 and unsigned int respectively. If v is less than vb->num_pages (and it is, when deflating the balloon), the result is a very large 32-bit number. Since we're returning a s64, instead of getting the same negative number we desire, we get a very large positive number. This handles the case where v < vb->num_pages and ensures we get a small, negative, s64 as the result. Rusty: please push this for 2.6.27-rc4. It's probably appropriate for the stable tree too as it will cause an unexpected OOM when ballooning. Signed-off-by: Anthony Liguori Signed-off-by: Rusty Russell (simplified)

virtio: Add transport feature handling stub for virtio_ring.

2008-07-25T02:06:14Z

To prepare for virtio_ring transport feature bits, hook in a call in all the users to manipulate them. This currently just clears all the bits, since it doesn't understand any features. Signed-off-by: Rusty Russell

virtio: Rename set_features to finalize_features

2008-07-25T02:06:12Z

Rather than explicitly handing the features to the lower-level, we just hand the virtio_device and have it set the features. This make it clear that it has the chance to manipulate the features of the device at this point (and that all feature negotiation is already done). Signed-off-by: Rusty Russell

virtio: Formally reserve bits 28-31 to be 'transport' features.

2008-07-25T02:06:07Z

We assign feature bits as required, but it makes sense to reserve some for the particular transport, rather than the particular device. Signed-off-by: Rusty Russell

virtio: Use bus_type probe and remove methods

2008-07-25T02:06:05Z

Hook up to the probe() and remove() methods in bus_type rather than device_driver. The latter has been preferred since 2.6.16. Signed-off-by: Mark McLoughlin Signed-off-by: Rusty Russell

virtio: don't always force a notification when ring is full

2008-07-25T02:06:04Z

We force notification when the ring is full, even if the host has indicated it doesn't want to know. This seemed like a good idea at the time: if we fill the transmit ring, we should tell the host immediately. Unfortunately this logic also applies to the receiving ring, which is refilled constantly. We should introduce real notification thesholds to replace this logic. Meanwhile, removing the logic altogether breaks the heuristics which KVM uses, so we use a hack: only notify if there are outgoing parts of the new buffer. Here are the number of exits with lguest's crappy network implementation: Before: network xmit 7859051 recv 236420 After: network xmit 7858610 recv 118136 Signed-off-by: Rusty Russell

virtio: Complete feature negotation before updating status

2008-06-15T20:46:16Z

lguest (in rusty's use-tun-ringfd patch) assumes that the guest has updated its feature bits before setting its status to VIRTIO_CONFIG_S_DRIVER_OK. That's pretty reasonable, so let's make it so. Signed-off-by: Mark McLoughlin Signed-off-by: Rusty Russell Signed-off-by: Linus Torvalds

virtio: force callback on empty.

2008-05-30T05:09:46Z

virtio allows drivers to suppress callbacks (ie. interrupts) for efficiency (no locking, it's just an optimization). There's a similar mechanism for the host to suppress notifications coming from the guest: in that case, we ignore the suppression if the ring is completely full. It turns out that life is simpler if the host similarly ignores callback suppression when the ring is completely empty: the network driver wants to free up old packets in a timely manner, and otherwise has to use a timer to poll. We have to remove the code which ignores interrupts when the driver has disabled them (again, it had no locking and hence was unreliable anyway). Signed-off-by: Rusty Russell

virtio_net: another race with virtio_net and enable_cb

2008-05-30T05:09:45Z

Hello Rusty, seems that we still have a problem with virtio_net and the enable_cb callback. During a long running network stress tests with virtio and got the following oops: ------------[ cut here ]------------ kernel BUG at drivers/virtio/virtio_ring.c:230! illegal operation: 0001 [#1] SMP Modules linked in: CPU: 0 Not tainted 2.6.26-rc2-kvm-00436-gc94c08b-dirty #34 Process netserver (pid: 2582, task: 000000000fbc4c68, ksp: 000000000f42b990) Krnl PSW : 0704c00180000000 00000000002d0ec8 (vring_enable_cb+0x1c/0x60) R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 EA:3 Krnl GPRS: 0000000000000000 0000000000000000 000000000ef3d000 0000000010009800 0000000000000000 0000000000419ce0 0000000000000080 000000000000007b 000000000adb5538 000000000ef40900 000000000ef40000 000000000ef40920 0000000000000000 0000000000000005 000000000029c1b0 000000000fea7d18 Krnl Code: 00000000002d0ebc: a7110001 tmll %r1,1 00000000002d0ec0: a7740004 brc 7,2d0ec8 00000000002d0ec4: a7f40001 brc 15,2d0ec6 >00000000002d0ec8: a517fffe nill %r1,65534 00000000002d0ecc: 40103000 sth %r1,0(%r3) 00000000002d0ed0: 07f0 bcr 15,%r0 00000000002d0ed2: e31020380004 lg %r1,56(%r2) 00000000002d0ed8: a7480000 lhi %r4,0 Call Trace: ([<000000000029c0fc>] virtnet_poll+0x290/0x3b8) [<0000000000333fb8>] net_rx_action+0x9c/0x1b8 [<00000000001394bc>] __do_softirq+0x74/0x108 [<000000000010d16a>] do_softirq+0x92/0xac [<0000000000139826>] irq_exit+0x72/0xc8 [<000000000010a7b6>] do_extint+0xe2/0x104 [<0000000000110508>] ext_no_vtime+0x16/0x1a Last Breaking-Event-Address: [<00000000002d0ec4>] vring_enable_cb+0x18/0x60 I looked into the virtio_net code for some time and I think the following scenario happened. Please look at virtnet_poll: [...] /* Out of packets? */ if (received < budget) { netif_rx_complete(vi->dev, napi); if (unlikely(!vi->rvq->vq_ops->enable_cb(vi->rvq)) && napi_schedule_prep(napi)) { vi->rvq->vq_ops->disable_cb(vi->rvq); __netif_rx_schedule(vi->dev, napi); goto again; } } If an interrupt arrives after netif_rx_complete, a second poll routine can run on a different cpu. The second check for napi_schedule_prep would prevent any harm in the network stack, but we have called enable_cb possibly after the disable_cb in skb_recv_done. static void skb_recv_done(struct virtqueue *rvq) { struct virtnet_info *vi = rvq->vdev->priv; /* Schedule NAPI, Suppress further interrupts if successful. */ if (netif_rx_schedule_prep(vi->dev, &vi->napi)) { rvq->vq_ops->disable_cb(rvq); __netif_rx_schedule(vi->dev, &vi->napi); } } That means that the second poll routine runs with interrupts enabled, which is ok, since we can handle additional interrupts. The problem is now that the second poll routine might also call enable_cb, triggering the BUG. The only solution I can come up with, is to remove the BUG statement in enable_cb - similar to disable_cb. Opinions or better ideas where the oops could come from? Signed-off-by: Christian Borntraeger Signed-off-by: Rusty Russell

virtio: set device index in common code.

2008-05-30T05:09:42Z

Anthony Liguori points out that three different transports use the virtio code, but each one keeps its own counter to set the virtio_device's index field. In theory (though not in current practice) this means that names could be duplicated, and that risk grows as more transports are created. So we move the selection of the unique virtio_device.index into the common code in virtio.c, which has the side-benefit of removing duplicate code. The only complexity is that lguest and S/390 use the index to uniquely identify the device in case of catastrophic failure before register_virtio_device() is called: now we use the offset within the descriptor page as a unique identifier for the printks. Signed-off-by: Rusty Russell Cc: Christian Borntraeger Cc: Martin Schwidefsky Cc: Carsten Otte Cc: Heiko Carstens Cc: Chris Lalancette Cc: Anthony Liguori