<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/virt, branch v5.4</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.4</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.4'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2019-11-15T09:14:04Z</updated>
<entry>
<title>KVM: Add a comment describing the /dev/kvm no_compat handling</title>
<updated>2019-11-15T09:14:04Z</updated>
<author>
<name>Marc Zyngier</name>
<email>maz@kernel.org</email>
</author>
<published>2019-11-14T13:17:39Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9cb09e7c1c9af2968d5186ef9085f05641ab65d9'/>
<id>urn:sha1:9cb09e7c1c9af2968d5186ef9085f05641ab65d9</id>
<content type='text'>
Add a comment explaining the rational behind having both
no_compat open and ioctl callbacks to fend off compat tasks.

Signed-off-by: Marc Zyngier &lt;maz@kernel.org&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</content>
</entry>
<entry>
<title>KVM: Forbid /dev/kvm being opened by a compat task when CONFIG_KVM_COMPAT=n</title>
<updated>2019-11-13T16:13:12Z</updated>
<author>
<name>Marc Zyngier</name>
<email>maz@kernel.org</email>
</author>
<published>2019-11-13T16:05:23Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b9876e6de123adb52ac693bac08c493e989bd93e'/>
<id>urn:sha1:b9876e6de123adb52ac693bac08c493e989bd93e</id>
<content type='text'>
On a system without KVM_COMPAT, we prevent IOCTLs from being issued
by a compat task. Although this prevents most silly things from
happening, it can still confuse a 32bit userspace that is able
to open the kvm device (the qemu test suite seems to be pretty
mad with this behaviour).

Take a more radical approach and return a -ENODEV to the compat
task.

Reported-by: Peter Maydell &lt;peter.maydell@linaro.org&gt;
Signed-off-by: Marc Zyngier &lt;maz@kernel.org&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</content>
</entry>
<entry>
<title>Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm</title>
<updated>2019-11-12T21:19:15Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2019-11-12T21:19:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=8c5bd25bf42effd194d4b0b43895c42b374e620b'/>
<id>urn:sha1:8c5bd25bf42effd194d4b0b43895c42b374e620b</id>
<content type='text'>
Pull kvm fixes from Paolo Bonzini:
 "Fix unwinding of KVM_CREATE_VM failure, VT-d posted interrupts,
  DAX/ZONE_DEVICE, and module unload/reload"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: MMU: Do not treat ZONE_DEVICE pages as being reserved
  KVM: VMX: Introduce pi_is_pir_empty() helper
  KVM: VMX: Do not change PID.NDST when loading a blocked vCPU
  KVM: VMX: Consider PID.PIR to determine if vCPU has pending interrupts
  KVM: VMX: Fix comment to specify PID.ON instead of PIR.ON
  KVM: X86: Fix initialization of MSR lists
  KVM: fix placement of refcount initialization
  KVM: Fix NULL-ptr deref after kvm_create_vm fails
</content>
</entry>
<entry>
<title>KVM: MMU: Do not treat ZONE_DEVICE pages as being reserved</title>
<updated>2019-11-12T09:17:42Z</updated>
<author>
<name>Sean Christopherson</name>
<email>sean.j.christopherson@intel.com</email>
</author>
<published>2019-11-11T22:12:27Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a78986aae9b2988f8493f9f65a587ee433e83bc3'/>
<id>urn:sha1:a78986aae9b2988f8493f9f65a587ee433e83bc3</id>
<content type='text'>
Explicitly exempt ZONE_DEVICE pages from kvm_is_reserved_pfn() and
instead manually handle ZONE_DEVICE on a case-by-case basis.  For things
like page refcounts, KVM needs to treat ZONE_DEVICE pages like normal
pages, e.g. put pages grabbed via gup().  But for flows such as setting
A/D bits or shifting refcounts for transparent huge pages, KVM needs to
to avoid processing ZONE_DEVICE pages as the flows in question lack the
underlying machinery for proper handling of ZONE_DEVICE pages.

This fixes a hang reported by Adam Borowski[*] in dev_pagemap_cleanup()
when running a KVM guest backed with /dev/dax memory, as KVM straight up
doesn't put any references to ZONE_DEVICE pages acquired by gup().

Note, Dan Williams proposed an alternative solution of doing put_page()
on ZONE_DEVICE pages immediately after gup() in order to simplify the
auditing needed to ensure is_zone_device_page() is called if and only if
the backing device is pinned (via gup()).  But that approach would break
kvm_vcpu_{un}map() as KVM requires the page to be pinned from map() 'til
unmap() when accessing guest memory, unlike KVM's secondary MMU, which
coordinates with mmu_notifier invalidations to avoid creating stale
page references, i.e. doesn't rely on pages being pinned.

[*] http://lkml.kernel.org/r/20190919115547.GA17963@angband.pl

Reported-by: Adam Borowski &lt;kilobyte@angband.pl&gt;
Analyzed-by: David Hildenbrand &lt;david@redhat.com&gt;
Acked-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: stable@vger.kernel.org
Fixes: 3565fce3a659 ("mm, x86: get_user_pages() for dax mappings")
Signed-off-by: Sean Christopherson &lt;sean.j.christopherson@intel.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</content>
</entry>
<entry>
<title>KVM: fix placement of refcount initialization</title>
<updated>2019-11-11T14:48:03Z</updated>
<author>
<name>Paolo Bonzini</name>
<email>pbonzini@redhat.com</email>
</author>
<published>2019-11-04T12:23:53Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e2d3fcaf939dded3da604a25ebbea9fb954c2280'/>
<id>urn:sha1:e2d3fcaf939dded3da604a25ebbea9fb954c2280</id>
<content type='text'>
Reported by syzkaller:

   =============================
   WARNING: suspicious RCU usage
   -----------------------------
   ./include/linux/kvm_host.h:536 suspicious rcu_dereference_check() usage!

   other info that might help us debug this:

   rcu_scheduler_active = 2, debug_locks = 1
   no locks held by repro_11/12688.

   stack backtrace:
   Call Trace:
    dump_stack+0x7d/0xc5
    lockdep_rcu_suspicious+0x123/0x170
    kvm_dev_ioctl+0x9a9/0x1260 [kvm]
    do_vfs_ioctl+0x1a1/0xfb0
    ksys_ioctl+0x6d/0x80
    __x64_sys_ioctl+0x73/0xb0
    do_syscall_64+0x108/0xaa0
    entry_SYSCALL_64_after_hwframe+0x49/0xbe

Commit a97b0e773e4 (kvm: call kvm_arch_destroy_vm if vm creation fails)
sets users_count to 1 before kvm_arch_init_vm(), however, if kvm_arch_init_vm()
fails, we need to decrease this count.  By moving it earlier, we can push
the decrease to out_err_no_arch_destroy_vm without introducing yet another
error label.

syzkaller source: https://syzkaller.appspot.com/x/repro.c?x=15209b84e00000

Reported-by: syzbot+75475908cd0910f141ee@syzkaller.appspotmail.com
Fixes: a97b0e773e49 ("kvm: call kvm_arch_destroy_vm if vm creation fails")
Cc: Jim Mattson &lt;jmattson@google.com&gt;
Analyzed-by: Wanpeng Li &lt;wanpengli@tencent.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</content>
</entry>
<entry>
<title>KVM: Fix NULL-ptr deref after kvm_create_vm fails</title>
<updated>2019-11-11T14:48:02Z</updated>
<author>
<name>Paolo Bonzini</name>
<email>pbonzini@redhat.com</email>
</author>
<published>2019-11-04T11:16:49Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=8a44119a98bee4381d28f3ed1e41dfacf5c3aa6d'/>
<id>urn:sha1:8a44119a98bee4381d28f3ed1e41dfacf5c3aa6d</id>
<content type='text'>
Reported by syzkaller:

    kasan: CONFIG_KASAN_INLINE enabled
    kasan: GPF could be caused by NULL-ptr deref or user memory access
    general protection fault: 0000 [#1] PREEMPT SMP KASAN
    CPU: 0 PID: 14727 Comm: syz-executor.3 Not tainted 5.4.0-rc4+ #0
    RIP: 0010:kvm_coalesced_mmio_init+0x5d/0x110 arch/x86/kvm/../../../virt/kvm/coalesced_mmio.c:121
    Call Trace:
     kvm_dev_ioctl_create_vm arch/x86/kvm/../../../virt/kvm/kvm_main.c:3446 [inline]
     kvm_dev_ioctl+0x781/0x1490 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3494
     vfs_ioctl fs/ioctl.c:46 [inline]
     file_ioctl fs/ioctl.c:509 [inline]
     do_vfs_ioctl+0x196/0x1150 fs/ioctl.c:696
     ksys_ioctl+0x62/0x90 fs/ioctl.c:713
     __do_sys_ioctl fs/ioctl.c:720 [inline]
     __se_sys_ioctl fs/ioctl.c:718 [inline]
     __x64_sys_ioctl+0x6e/0xb0 fs/ioctl.c:718
     do_syscall_64+0xca/0x5d0 arch/x86/entry/common.c:290
     entry_SYSCALL_64_after_hwframe+0x49/0xbe

Commit 9121923c457d ("kvm: Allocate memslots and buses before calling kvm_arch_init_vm")
moves memslots and buses allocations around, however, if kvm-&gt;srcu/irq_srcu fails
initialization, NULL will be returned instead of error code, NULL will not be intercepted
in kvm_dev_ioctl_create_vm() and be dereferenced by kvm_coalesced_mmio_init(), this patch
fixes it.

Moving the initialization is required anyway to avoid an incorrect synchronize_srcu that
was also reported by syzkaller:

 wait_for_completion+0x29c/0x440 kernel/sched/completion.c:136
 __synchronize_srcu+0x197/0x250 kernel/rcu/srcutree.c:921
 synchronize_srcu_expedited kernel/rcu/srcutree.c:946 [inline]
 synchronize_srcu+0x239/0x3e8 kernel/rcu/srcutree.c:997
 kvm_page_track_unregister_notifier+0xe7/0x130 arch/x86/kvm/page_track.c:212
 kvm_mmu_uninit_vm+0x1e/0x30 arch/x86/kvm/mmu.c:5828
 kvm_arch_destroy_vm+0x4a2/0x5f0 arch/x86/kvm/x86.c:9579
 kvm_create_vm arch/x86/kvm/../../../virt/kvm/kvm_main.c:702 [inline]

so do it.

Reported-by: syzbot+89a8060879fa0bd2db4f@syzkaller.appspotmail.com
Reported-by: syzbot+e27e7027eb2b80e44225@syzkaller.appspotmail.com
Fixes: 9121923c457d ("kvm: Allocate memslots and buses before calling kvm_arch_init_vm")
Cc: Jim Mattson &lt;jmattson@google.com&gt;
Cc: Wanpeng Li &lt;wanpengli@tencent.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</content>
</entry>
<entry>
<title>kvm: x86: mmu: Recovery of shattered NX large pages</title>
<updated>2019-11-04T19:26:00Z</updated>
<author>
<name>Junaid Shahid</name>
<email>junaids@google.com</email>
</author>
<published>2019-11-04T19:26:00Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=1aa9b9572b10529c2e64e2b8f44025d86e124308'/>
<id>urn:sha1:1aa9b9572b10529c2e64e2b8f44025d86e124308</id>
<content type='text'>
The page table pages corresponding to broken down large pages are zapped in
FIFO order, so that the large page can potentially be recovered, if it is
not longer being used for execution.  This removes the performance penalty
for walking deeper EPT page tables.

By default, one large page will last about one hour once the guest
reaches a steady state.

Signed-off-by: Junaid Shahid &lt;junaids@google.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;

</content>
</entry>
<entry>
<title>kvm: Add helper function for creating VM worker threads</title>
<updated>2019-11-04T11:22:02Z</updated>
<author>
<name>Junaid Shahid</name>
<email>junaids@google.com</email>
</author>
<published>2019-11-04T11:22:02Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c57c80467f90e5504c8df9ad3555d2c78800bf94'/>
<id>urn:sha1:c57c80467f90e5504c8df9ad3555d2c78800bf94</id>
<content type='text'>
Add a function to create a kernel thread associated with a given VM. In
particular, it ensures that the worker thread inherits the priority and
cgroups of the calling thread.

Signed-off-by: Junaid Shahid &lt;junaids@google.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;

</content>
</entry>
<entry>
<title>kvm: call kvm_arch_destroy_vm if vm creation fails</title>
<updated>2019-10-31T11:13:16Z</updated>
<author>
<name>Jim Mattson</name>
<email>jmattson@google.com</email>
</author>
<published>2019-10-25T11:34:58Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a97b0e773e492ae319a7e981e98962a1060215f9'/>
<id>urn:sha1:a97b0e773e492ae319a7e981e98962a1060215f9</id>
<content type='text'>
In kvm_create_vm(), if we've successfully called kvm_arch_init_vm(), but
then fail later in the function, we need to call kvm_arch_destroy_vm()
so that it can do any necessary cleanup (like freeing memory).

Fixes: 44a95dae1d229a ("KVM: x86: Detect and Initialize AVIC support")

Signed-off-by: John Sperbeck &lt;jsperbeck@google.com&gt;
Signed-off-by: Jim Mattson &lt;jmattson@google.com&gt;
Reviewed-by: Junaid Shahid &lt;junaids@google.com&gt;
[Remove dependency on "kvm: Don't clear reference count on
 kvm_create_vm() error path" which was not committed. - Paolo]
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</content>
</entry>
<entry>
<title>kvm: Allocate memslots and buses before calling kvm_arch_init_vm</title>
<updated>2019-10-25T11:32:33Z</updated>
<author>
<name>Jim Mattson</name>
<email>jmattson@google.com</email>
</author>
<published>2019-10-24T23:03:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9121923c457d1d8667a6e3a67302c29e5c5add6b'/>
<id>urn:sha1:9121923c457d1d8667a6e3a67302c29e5c5add6b</id>
<content type='text'>
This reorganization will allow us to call kvm_arch_destroy_vm in the
event that kvm_create_vm fails after calling kvm_arch_init_vm.

Suggested-by: Junaid Shahid &lt;junaids@google.com&gt;
Signed-off-by: Jim Mattson &lt;jmattson@google.com&gt;
Reviewed-by: Junaid Shahid &lt;junaids@google.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</content>
</entry>
</feed>
