git.osdn.net Git - qmiga/qemu.git/log

or32: Remove ELF_MACHINE from cpu.h

The only generic code relying on this is linux-user, but linux users'
default behaviour of defaulting ELF_MACHINE to ELF_ARCH will handle
this.

The bootloader can just pass EM_OPENRISC directly, as that is
architecture specific code.

This removes another architecture specific definition from the global
namespace.

Cc: Jia Liu <proljc@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Acked-By: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

lm32: Remove ELF_MACHINE from cpu.h

The bootloaders can just pass EM_LATTICEMICO32 directly, as that is
architecture specific code.

This removes another architecture specific definition from the global
namespace.

Cc: Michael Walle <michael@walle.cc>
Acked-By: Michael Walle <michael@walle.cc>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Acked-By: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

unicore: Remove ELF_MACHINE from cpu.h

The only generic code relying on this is linux-user, but linux users'
default behaviour of defaulting ELF_MACHINE to ELF_ARCH will handle
this.

This removes another architecture specific definition from the global
namespace.

Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Acked-By: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

moxie: Remove ELF_MACHINE from cpu.h

The bootloader can just pass EM_MOXIE directly, as that is architecture
specific code.

This removes another architecture specific definition from the global
namespace.

Cc: Anthony Green <green@moxielogic.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Acked-By: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

cris: Remove ELF_MACHINE from cpu.h

The only generic code relying on this is linux-user, but linux users'
default behaviour of defaulting ELF_MACHINE to ELF_ARCH will handle
this.

The bootloader can just pass EM_CRIS directly, as that is architecture
specific code.

This removes another architecture specific definition from the global
namespace.

Cc: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Acked-By: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

m68k: Remove ELF_MACHINE from cpu.h

The only generic code relying on this is linux-user, but linux users'
default behaviour of defaulting ELF_MACHINE to ELF_ARCH will handle
this.

The machine model bootloaders can just pass EM_68K directly, as that
is architecture specific code.

This removes another architecture specific definition from the global
namespace.

Cc: Laurent Vivier <laurent@vivier.eu>
Cc: Greg Ungerer <gerg@uclinux.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Acked-By: Riku Voipio <riku.voipio@linaro.org>
Reviewed-by: Greg Ungerer <gerg@uclinux.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

mb: Remove ELF_MACHINE from cpu.h

The only generic code relying on this is linux-user, but linux-users'
default behaviour or setting ELF_MACHINE to ELF_ARCH will handle this.

The microblaze bootloader can just pass EM_MICROBLAZE directly, as that
is architecture specific code.

This removes another architecture specific definition from the global
namespace.

Cc: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Acked-By: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

arm: Remove ELF_MACHINE from cpu.h

The only generic code relying on this is linux-user. Linux user
already has a lot of #ifdef TARGET_ customisation so instead, define
ELF_ARCH as either EM_ARM or EM_AARCH64 appropriately.

The armv7m bootloader can just pass EM_ARM directly, as that
is architecture specific code. Note that arm_boot already has its own
logic selecting an arm specific elf machine so this makes V7M more
consistent with arm_boot.

This removes another architecture specific definition from the global
namespace.

Cc: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Acked-By: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

elf: Update EM_MOXIE definition

EM_MOXIE now has a proper assigned elf code. Use it. Register the old
interim value as EM_MOXIE_OLD and accept either in elf loading.

Cc: Anthony Green <green@moxielogic.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

elf_ops: Fix coding style for EM alias case statement

Fix the coding style for these cases as per CODING_STYLE. Reverse the
Yoda conditions and add missing if braces.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

linux-user: elfload: Provide default for elf_check_arch

For many arch's this macro is defined as the predicatable behaviour
of checking the argument for eqaulity against ELF_ARCH. Provide a
default define as such, so only archs with special handling (usually
allowing multiple EM values) need to provide a def.

Arches that do any of:

1: provide this def exactly the same way as the new default
        (alpha, x86_64)
2: check against ELF_MACHINE while defining ELF_ARCH == ELF_MACHINE
        (arm, aarch64)
3: check against EM_FOO directly while defining ELF_ARCH == EM_FOO
        (unicore32, sparc32, ppc32, mips, openrisc, sh4, cris, m86k)

have their elf_check_arch removed as the default will provide the
correct behaviour.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Acked-By: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

linux_user: elfload: Default ELF_MACHINE to ELF_ARCH

In most (but not all) cases, ELF_MACHINE and ELF_ARCH are safely the
same. Default ELF_MACHINE to ELF_ARCH. This makes defining ELF_MACHINE
optional for target-*/cpu.h when they are known to match.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Acked-By: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

hmp: implemented io apic dump state for TCG

Added support emulator for the hmp command "info ioapic"

Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Andreas FÃ¤rber <afaerber@suse.de>
Message-Id: <1442927901-1084-10-git-send-email-den@openvz.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

hmp: added io apic dump state

Added the hmp command to query io apic state, may be usefull after guest
crashes to understand IRQ routing in guest.

Implementation is only for kvm here. The dump will look like
(qemu) info ioapic
ioapic id=0x00 sel=0x26 (redir[11])
pin 0  0x0000000000010000 dest=0 vec=0   active-hi edge  masked fixed  physical
pin 1  0x0000000000000031 dest=0 vec=49  active-hi edge         fixed  physical
...
pin 23 0x0000000000010000 dest=0 vec=0   active-hi edge  masked fixed  physical
IRR        (none)
Remote IRR (none)

Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Andreas FÃ¤rber <afaerber@suse.de>
Message-Id: <1442927901-1084-9-git-send-email-den@openvz.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

ioapic_internal.h: added more constants

Added the masks for easy access to fields of the redirection table entry

Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Andreas FÃ¤rber <afaerber@suse.de>
Message-Id: <1442927901-1084-8-git-send-email-den@openvz.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

hmp: added local apic dump state

Added the hmp command to query local apic registers state, may be
usefull after guest crashes to understand IRQ routing in guest.

(qemu) info lapic
dumping local APIC state for CPU 0

LVT0    0x00010700 active-hi edge  masked                      ExtINT (vec 0)
LVT1    0x00000400 active-hi edge                              NMI
LVTPC   0x00010000 active-hi edge  masked                      Fixed  (vec 0)
LVTERR  0x000000fe active-hi edge                              Fixed  (vec 254)
LVTTHMR 0x00010000 active-hi edge  masked                      Fixed  (vec 0)
LVTT    0x000000ef active-hi edge                 one-shot     Fixed  (vec 239)
Timer   DCR=0x3 (divide by 16) initial_count = 61360
SPIV    0x000001ff APIC enabled, focus=off, spurious vec 255
ICR     0x000000fd physical edge de-assert no-shorthand
ICR2    0x00000001 cpu 1 (X2APIC ID)
ESR     0x00000000
ISR     (none)
IRR     239

APR 0x00 TPR 0x00 DFR 0x0f LDR 0x00 PPR 0x00

Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Andreas FÃ¤rber <afaerber@suse.de>
Message-Id: <1442927901-1084-7-git-send-email-den@openvz.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

monitor: make monitor_fprintf and mon_get_cpu externally visible

monitor_fprintf and mon_get_cpu will be used in the target-specific monitor,
so it is advisable to make it external.

Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Andreas FÃ¤rber <afaerber@suse.de>
Message-Id: <1442927901-1084-6-git-send-email-den@openvz.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

apic_internal.h: fix formatting and drop unused consts

Fix formatting of local apic definitions and drop unused constant
APIC_INPUT_POLARITY, APIC_SEND_PENDING. Magic numbers in shifts are
replaced with constants defined just above.

Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Andreas FÃ¤rber <afaerber@suse.de>
Message-Id: <1442927901-1084-5-git-send-email-den@openvz.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

apic_internal.h: added more constants

These constants are needed for optimal access to
bit fields local apic registers without magic numbers.

Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Andreas FÃ¤rber <afaerber@suse.de>
Message-Id: <1442927901-1084-4-git-send-email-den@openvz.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

apic_internal.h: rename ESR_ILLEGAL_ADDRESS to APIC_ESR_ILLEGAL_ADDRESS

Added prefix APIC_ for determining the constant of a particular subsystem,
improve the overall readability and match other constant names.

Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Andreas FÃ¤rber <afaerber@suse.de>
Message-Id: <1442927901-1084-3-git-send-email-den@openvz.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

apic_internal.h: make some apic_get_* functions externally visible

Move apic_get_bit(), apic_set_bit() to apic_internal.h, make the apic_get_ppr
symbol external. It's necessary to work with isr, tmr, irr and ppr outside
hw/intc/apic.c

Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Andreas FÃ¤rber <afaerber@suse.de>
Message-Id: <1442927901-1084-2-git-send-email-den@openvz.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

ioapic: fix contents of arbitration register

The arbitration register should read to the same value as the
IOAPIC id register. Fixes kvm-unit-tests ioapic.flat.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

ioapic: coalesce level interrupts

If a level-triggered interrupt goes down and back up before the
corresponding EOI, it should be coalesced. This fixes one testcase
in kvm-unit-tests' ioapic.flat.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

MAINTAINERS: add maintainer for network device front-ends

Only "Odd Fixes" status, but let's add a point of contact.

Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

MAINTAINERS: add maintainer for character device front-ends

Only "Odd Fixes" status, but let's add a point of contact.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

MAINTAINERS: add IPack section

Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

MAINTAINERS: Add more s390 files

Cc: Alexander Graf <agraf@suse.de>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

MAINTAINERS: Add disassemblers to the various backends

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

MAINTAINERS: there is no PPC64 TCG backend anymore

PPC32 and PPC64 were unified.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

get_maintainer.pl: \C is deprecated

"Match a single C-language char (octet) even if that is part of a larger
UTF-8 character. Thus it breaks up characters into their UTF-8 bytes,
so you may end up with malformed pieces of UTF-8."

Just use a period instead.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

vhost-scsi: include linux/vhost.h

Replace ad-hoc declarations with the linux header.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1442585920-28373-1-git-send-email-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Makefile: fix build when VPATH is outside GIT tree

Steve Ellcey / Leon Alrae reported that QEMU fails to build when
the VPATH directory is outside of the GIT tree, and the system
emulators & tools build is disabled. eg

   cd ..
   mkdir build
   cd build
   ../qemu/configure --disable-system --disable-tools
   make
   (...)
   make[1]: *** No rule to make target `../qom/object.o', needed by `qemu-aarch64'. Stop.
   make: *** [subdir-aarch64-linux-user] Error 2

The problem is due to the fact that some sub directory deps
were listed against SOFTMMU_SUBDIR_RULES instead of SUBDIR_RULES,
so were only processed for system emulators, not user emalutors.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1442570495-22029-1-git-send-email-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

scsi-generic: let guests recognize readonly=on on passthrough devices

Passed-through SCSI devices can be opened with the readonly=on option.
When this happens, Linux filters away write commands so that the guest
cannot overwrite the contents of the device.

However, the guest does not know that the device is read-only, and
accepts writes. The writes only fail later when the page cache is
flushed.

This patch modifies scsi-generic to modify the MODE SENSE data and
set the read-only bit in the device-specific parameters, so that
the guest OS treats the disk as write protected.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

checkpatch: do not recommend qemu_strtok over strtok

If anything it should recommend strtok_r!

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

tests: add some qemu_strtosz() tests

While reading the function I decided to write some tests.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1442419377-9309-2-git-send-email-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

utils: rename strtosz to use qemu prefix

Not only it makes sense, but it gets rid of checkpatch warning:
WARNING: consider using qemu_strtosz in preference to strtosz

Also remove get rid of tabs to please checkpatch.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1442419377-9309-1-git-send-email-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

qemu-nbd: convert to use the QAPI SocketAddress object

The qemu-nbd program currently uses a QemuOpts objects
when setting up sockets. Switch it over to use the
QAPI SocketAddress objects instead.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1442411543-28513-3-git-send-email-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

nbd: convert to use the QAPI SocketAddress object

The nbd block driver currently uses a QemuOpts object
when setting up sockets. Switch it over to use the
QAPI SocketAddress object instead.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1442411543-28513-2-git-send-email-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Merge remote-tracking branch 'remotes/weil/tags/pull-wxx-20150924' into staging

wxx patch queue

# gpg: Signature made Thu 24 Sep 2015 20:24:50 BST using RSA key ID 677450AD
# gpg: Good signature from "Stefan Weil <sw@weilnetz.de>"
# gpg:                 aka "Stefan Weil <stefan.weil@weilnetz.de>"
# gpg:                 aka "Stefan Weil <stefan.weil@bib.uni-mannheim.de>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 4923 6FEA 75C9 5D69 8EC2  B78A E08C 21D5 6774 50AD

* remotes/weil/tags/pull-wxx-20150924:
  oslib-win32: only provide localtime_r/gmtime_r if missing
  gtk: avoid redefining _WIN32_WINNT macro
  qemu-thread: add a fast path to the Win32 QemuEvent
  slirp: Fix non blocking connect for w32
  nsis: Add QEMU version information to Windows registry

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

oslib-win32: only provide localtime_r/gmtime_r if missing

The oslib-win32 file currently provides a localtime_r and
gmtime_r replacement unconditionally. Some versions of
Mingw-w64 would provide crude macros for localtime_r/gmtime_r
which QEMU takes care to disable. Latest versions of Mingw-w64
now provide actual functions for localtime_r/gmtime_r, but
with a twist that you have to include unistd.h or pthread.h
before including time.h.  By luck some files in QEMU have
such an include order, resulting in compile errors:

  CC    util/osdep.o
In file included from include/qemu-common.h:48:0,
                 from util/osdep.c:48:
include/sysemu/os-win32.h:77:12: error: redundant redeclaration of 'gmtime_r' [-Werror=redundant-decls]
struct tm *gmtime_r(const time_t *timep, struct tm *result);
            ^
In file included from include/qemu-common.h:35:0,
                 from util/osdep.c:48:
/usr/i686-w64-mingw32/sys-root/mingw/include/time.h:272:107: note: previous definition of 'gmtime_r' was here
In file included from include/qemu-common.h:48:0,
                 from util/osdep.c:48:
include/sysemu/os-win32.h:79:12: error: redundant redeclaration of 'localtime_r' [-Werror=redundant-decls]
struct tm *localtime_r(const time_t *timep, struct tm *result);
            ^
In file included from include/qemu-common.h:35:0,
                 from util/osdep.c:48:
/usr/i686-w64-mingw32/sys-root/mingw/include/time.h:269:107: note: previous definition of 'localtime_r' was here

This change adds a configure test to see if localtime_r
exits, and only enables the QEMU impl if missing. We also
re-arrange qemu-common.h try attempt to guarantee that all
source files get unistd.h before time.h and thus see the
localtime_r/gmtime_r defs.

[sw: Use "official" spellings for Mingw-w64, MinGW in comments.]
[sw: Terminate sentences with a dot in comments.]

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Stefan Weil <sw@weilnetz.de>

gtk: avoid redefining _WIN32_WINNT macro

When building for Mingw64 target on Fedora 22 a warning
is issued about _WIN32_WINNT being redefined.

In file included from ui/gtk.c:40:0:
include/ui/gtk.h:5:0: warning: "_WIN32_WINNT" redefined
# define _WIN32_WINNT 0x0601 /* needed to get definition of MAPVK_VK_TO_VSC */
  ^
In file included from /usr/i686-w64-mingw32/sys-root/mingw/include/crtdefs.h:10:0,
                 from /usr/i686-w64-mingw32/sys-root/mingw/include/stdio.h:9,
                 from /home/berrange/src/virt/qemu/include/qemu/fprintf-fn.h:12,
                 from /home/berrange/src/virt/qemu/include/qemu-common.h:18,
                 from ui/gtk.c:37:
/usr/i686-w64-mingw32/sys-root/mingw/include/_mingw.h:225:0: note: this is the location of the previous definition
#define _WIN32_WINNT 0x502
^

Rather than try to get MAPVK_VK_TO_VSC defined indirectly
by defining _WIN32_WINNT, instead just define it explicitly
if missing.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>

qemu-thread: add a fast path to the Win32 QemuEvent

QemuEvents are used heavily by call_rcu.  We do not want them to be slow,
but the current implementation does a kernel call on every invocation
of qemu_event_* and won't cut it.

So, wrap a Win32 manual-reset event with a fast userspace path.  The
states and transitions are the same as for the futex and mutex/condvar
implementations, but the slow path is different of course.  The idea
is to reset the Win32 event lazily, as part of a test-reset-test-wait
sequence.  Such a sequence is, indeed, how QemuEvents are used by
RCU and other subsystems!

The patch includes a formal model of the algorithm.

Tested-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Weil <sw@weilnetz.de>

slirp: Fix non blocking connect for w32

Signed-off-by: Stefan Weil <sw@weilnetz.de>

nsis: Add QEMU version information to Windows registry

The uninstall keys include an option key "DisplayVersion" which we set
now. By default the version value is read from file VERSION, but it is
also possible to pass VERSION=#.#.# to make.

Signed-off-by: Stefan Weil <sw@weilnetz.de>

Merge remote-tracking branch 'remotes/elmarco/tags/rm-libcacard' into staging

Remove libcacard

# gpg: Signature made Wed 23 Sep 2015 22:37:11 BST using RSA key ID 75969CE5
# gpg: Good signature from "Marc-André Lureau <marcandre.lureau@redhat.com>"
# gpg:                 aka "Marc-André Lureau <marcandre.lureau@gmail.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 87A9 BD93 3F87 C606 D276  F62D DAE8 E109 7596 9CE5

* remotes/elmarco/tags/rm-libcacard:
  libcacard: use the standalone project

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20150924' into staging

target-arm queue:
* support VGICv3 in KVM
* fix bug in ACPI table entries for flash devices in virt board
* update Allwinner entry in MAINTAINERS

# gpg: Signature made Thu 24 Sep 2015 01:29:55 BST using RSA key ID 14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>"
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"

* remotes/pmaydell/tags/pull-target-arm-20150924:
  MAINTAINERS: update Allwinner A10 maintainer
  hw/arm/virt-acpi-build: Fix wrong size of flash in ACPI table
  hw/arm/virt: Add gic-version option to virt machine
  hw/intc: Initial implementation of vGICv3
  arm_kvm: Do not assume particular GIC type in kvm_arch_irqchip_create()
  intc/gic: Extract some reusable vGIC code
  hw/intc: Implement GIC-500 base class

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

MAINTAINERS: update Allwinner A10 maintainer

Change the maintainer for Allwinner A10 to myself as Li Guang's mail
address bounces. While at it, extend the file pattern for the entry to
include allwinner_emac.[ch].

Signed-off-by: Beniamino Galvani <b.galvani@gmail.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 1442865156-5598-1-git-send-email-b.galvani@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

hw/arm/virt-acpi-build: Fix wrong size of flash in ACPI table

While virt machine creates two flash devices with total size 0x08000000,
the ACPI table generation code was wrongly using this total size as the
size of each flash device, so it would overlap other MMIO spaces.
Make each device entry in the table half the total; this brings the
ACPI table into line with the code which generates the device tree
and which creates the flash devices themselves.

Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Wei Huang <wei@redhat.com>
Tested-by: Graeme Gregory <graeme.gregory@linaro.org>
Message-id: 1442455041-6596-1-git-send-email-shannon.zhao@linaro.org
[PMM: edited commit message]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

hw/arm/virt: Add gic-version option to virt machine

Add gic_version to VirtMachineState, set it to value of the option
and pass it around where necessary. Instantiate devices and fdt
nodes according to the choice.

max_cpus for virt machine increased to 123 (calculated from redistributor
space available in the memory map). GICv2 compatibility check happens
inside arm_gic_common_realize().

ITS region is added to the memory map too, however currently it not used,
just reserved.

Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Tested-by: Ashok kumar <ashoks@broadcom.com>
[PMM: Added missing cpu_to_le* calls, thanks to Shannon Zhao]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

hw/intc: Initial implementation of vGICv3

This is the initial version of KVM-accelerated GICv3 support.
State load and save are not yet supported, live migration is
not possible.

In order to get correct class name in a simpler way, gicv3_class_name()
function is implemented, similar to gic_class_name().

Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Ashok kumar <ashoks@broadcom.com>
Message-id: 69d8f01d14994d7a1a140e96aef59fd332d02293.1441784344.git.p.fedin@samsung.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

arm_kvm: Do not assume particular GIC type in kvm_arch_irqchip_create()

This allows us to use different GIC types from v2. There are no kernels
which could advertise KVM_CAP_DEVICE_CTRL without the actual ability to
create GIC with it.

GIC version probe code moved to kvm_arm_vgic_probe() which will be used
later.

Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
Tested-by: Ashok kumar <ashoks@broadcom.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 015f4d9e4a8a50dfbdd734c4730558e24a69c6dc.1441784344.git.p.fedin@samsung.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

intc/gic: Extract some reusable vGIC code

Some functions previously used only by vGICv2 are useful also for vGICv3
implementation. Untie them from GICState and make accessible from within
other modules:
- kvm_arm_gic_set_irq()
- kvm_gic_supports_attr() - moved to common code and renamed to
  kvm_device_check_attr()
- kvm_gic_access() - turned into GIC-independent kvm_device_access().
  Data pointer changed to void * because some GICv3 registers are
  64-bit wide

Some of these changes are not used right now, but they will be helpful for
implementing live migration.

Actually kvm_dist_get() and kvm_dist_put() could also be made reusable, but
they would require two extra parameters (s->dev_fd and s->num_cpu) as well as
lots of typecasts of 's' to DeviceState * and back to GICState *. This makes
the code very ugly so i decided to stop at this point. I tried also an
approach with making a base class for all possible GICs, but it would contain
only three variables (dev_fd, cpu_num and irq_num), and accessing them through
the rest of the code would be again tedious (either ugly casts or qemu-style
separate object pointer). So i disliked it too.

Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Tested-by: Ashok kumar <ashoks@broadcom.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 2ef56d1dd64ffb75ed02a10dcdaf605e5b8ff4f8.1441784344.git.p.fedin@samsung.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

hw/intc: Implement GIC-500 base class

This class is to be used by both software and KVM implementations of GICv3

Currently it is mostly a placeholder, but in future it is supposed to hold
qemu's representation of GICv3 state, which is necessary for migration.

The interface of this class is fully compatible with GICv2 one. This is
done in order to simplify integration with existing code.

Signed-off-by: Shlomo Pongratz <shlomo.pongratz@huawei.com>
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
Tested-by: Ashok kumar <ashoks@broadcom.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: aff8baaee493cdcab0694b4a1d4dd5ff27c37ed2.1441784344.git.p.fedin@samsung.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

libcacard: use the standalone project

libcacard is now a standalone project hosted with the Spice project (see
the 2.5.0 release announcement), remove it from qemu tree.

Use the library if found during configure or if --enable-smartcard.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Paolo Bonzini <pbonzini@redhat.com>

Merge remote-tracking branch 'remotes/awilliam/tags/vfio-update-20150923.0' into staging

VFIO updates 2015-09-23

- Tracing improvements to use common prefixes for functional areas
- Quirks overhaul:
   - Split PCI quirks to separate file
   - Make them understandable and more extensible
   - Improve use of MemoryRegions and eliminate use of target pagesize
- Eliminate build-time debugging, everything migrated to runtime opts

# gpg: Signature made Wed 23 Sep 2015 21:09:05 BST using RSA key ID 3BB08B22
# gpg: Good signature from "Alex Williamson <alex.williamson@redhat.com>"
# gpg:                 aka "Alex Williamson <alex@shazbot.org>"
# gpg:                 aka "Alex Williamson <alwillia@redhat.com>"
# gpg:                 aka "Alex Williamson <alex.l.williamson@gmail.com>"

* remotes/awilliam/tags/vfio-update-20150923.0:
  vfio/pci: Add emulated PCI IDs
  vfio/pci: Cache vendor and device ID
  vfio/pci: Move AMD device specific reset to quirks
  vfio/pci: Remove old config window and mirror quirks
  vfio/pci: Config mirror quirk
  vfio/pci: Config window quirks
  vfio/pci: Rework RTL8168 quirk
  vfio/pci: Cleanup Nvidia 0x3d0 quirk
  vfio/pci: Cleanup ATI 0x3c3 quirk
  vfio/pci: Foundation for new quirk structure
  vfio/pci: Cleanup ROM blacklist quirk
  vfio/pci: Split quirks to a separate file
  vfio/pci: Extract PCI structures to a separate header
  vfio: Change polarity of our no-mmap option
  vfio/pci: Make interrupt bypass runtime configurable
  vfio/pci: Rename MSI/X functions for easier tracing
  vfio/pci: Rename INTx functions for easier tracing
  vfio/pci: Cleanup vfio_early_setup_msix() error path
  vfio/pci: Cleanup RTL8168 quirk and tracing

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

vfio/pci: Add emulated PCI IDs

Specifying an emulated PCI vendor/device ID can be useful for testing
various quirk paths, even though the behavior and functionality of
the device with bogus IDs is fully unsupportable.  We need to use a
uint32_t for the vendor/device IDs, even though the registers
themselves are only 16-bit in order to be able to determine whether
the value is valid and user set.

The same support is added for subsystem vendor/device ID, though these
have the possibility of being useful and supported for more than a
testing tool.  An emulated platform might want to impose their own
subsystem IDs or at least hide the physical subsystem ID.  Windows
guests will often reinstall drivers due to a change in subsystem IDs,
something that VM users may want to avoid.  Of course careful
attention would be required to ensure that guest drivers do not rely
on the subsystem ID as a basis for device driver quirks.

All of these options are added using the standard experimental option
prefix and should not be considered stable.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

vfio/pci: Cache vendor and device ID

Simplify access to commonly referenced PCI vendor and device ID by
caching it on the VFIOPCIDevice struct.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

vfio/pci: Move AMD device specific reset to quirks

This is just another quirk, for reset rather than affecting memory
regions. Move it to our new quirks file.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

vfio/pci: Remove old config window and mirror quirks

These are now unused.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

vfio/pci: Config mirror quirk

Re-implement our mirror quirk using the new infrastructure.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

vfio/pci: Config window quirks

Config windows make use of an address register and a data register.
In VGA cards, these are often used to provide real mode code in the
BIOS an easy way to access MMIO registers since the window often
resides in an I/O port register.  When the MMIO register has a mirror
of PCI config space, we need to trap those accesses and redirect them
to emulated config space.

The previous version of this functionality made use of a single
MemoryRegion and single match address.  This version uses separate
MemoryRegions for each of the address and data registers and allows
for multiple match addresses.  This is useful for Nvidia cards which
have two ranges which index into PCI config space.

The previous implementation is left for the follow-on patch for a more
reviewable diff.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

vfio/pci: Rework RTL8168 quirk

Another rework of this quirk, this time to update to the new quirk
structure. We can handle the address and data registers with
separate MemoryRegions and a quirk specific data structure, making the
code much more understandable.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

vfio/pci: Cleanup Nvidia 0x3d0 quirk

The Nvidia 0x3d0 quirk makes use of a two separate registers and gives
us our first chance to make use of separate memory regions for each to
simplify the code a bit.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

vfio/pci: Cleanup ATI 0x3c3 quirk

This is an easy quirk that really doesn't need a data structure if
its own. We can pass vdev as the opaque data and access to the
MemoryRegion isn't required.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

vfio/pci: Foundation for new quirk structure

VFIOQuirk hosts a single memory region and a fixed set of data fields
that try to handle all the quirk cases, but end up making those that
don't exactly match really confusing.  This patch introduces a struct
intended to provide more flexibility and simpler code.  VFIOQuirk is
stripped to its basics, an opaque data pointer for quirk specific
data and a pointer to an array of MemoryRegions with a counter.  This
still allows us to have common teardown routines, but adds much
greater flexibility to support multiple memory regions and quirk
specific data structures that are easier to maintain.  The existing
VFIOQuirk is transformed into VFIOLegacyQuirk, which further patches
will eliminate entirely.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

vfio/pci: Cleanup ROM blacklist quirk

Create a vendor:device ID helper that we'll also use as we rework the
rest of the quirks.  Re-reading the config entries, even if we get
more blacklist entries, is trivial overhead and only incurred during
device setup.  There's no need to typedef the blacklist structure,
it's a static private data type used once.  The elements get bumped
up to uint32_t to avoid future maintenance issues if PCI_ANY_ID gets
used for a blacklist entry (avoiding an actual hardware match).  Our
test loop is also crying out to be simplified as a for loop.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

vfio/pci: Split quirks to a separate file

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

vfio/pci: Extract PCI structures to a separate header

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

vfio: Change polarity of our no-mmap option

The default should be to allow mmap and new drivers shouldn't need to
expose an option or set it to other than the allocation default in
their initfn. Take advantage of the experimental flag to change this
option to the correct polarity.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

vfio/pci: Make interrupt bypass runtime configurable

Tracing is more effective when we can completely disable all KVM
bypass paths. Make these runtime rather than build-time configurable.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

vfio/pci: Rename MSI/X functions for easier tracing

This allows vfio_msi* tracing. The MSI/X interrupt tracing is also
pulled out of #ifdef DEBUG_VFIO to avoid a recompile for tracing this
path. A few cycles to read the message is hardly anything if we're
already in QEMU.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

vfio/pci: Rename INTx functions for easier tracing

Rename functions and tracing callbacks so that we can trace vfio_intx*
to see all the INTx related activities.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

vfio/pci: Cleanup vfio_early_setup_msix() error path

With the addition of the Chelsio quirk we have an error path out of
vfio_early_setup_msix() that doesn't free the allocated VFIOMSIXInfo
struct. This doesn't introduce a leak as it still gets freed in the
vfio_put_device() path, but it's complicated and sloppy to rely on
that. Restructure to free the allocated data on error and only link
it into the vdev on success.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reported-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>

vfio/pci: Cleanup RTL8168 quirk and tracing

There's quite a bit of cleanup that can be done to the RTL8168 quirk,
as well as the tracing to prevent a spew of uninteresting accesses
for anything else the driver might choose to use the window registers
for besides the MSI-X table. There should be no functional change,
but it's now possible to get compact and useful traces by enabling
vfio_rtl8168_quirk*, ex:

vfio_rtl8168_quirk_write 0000:04:00.0 [address]: 0x1f000
vfio_rtl8168_quirk_read 0000:04:00.0 [address]: 0x8001f000
vfio_rtl8168_quirk_read 0000:04:00.0 [data]: 0xfee0100c
vfio_rtl8168_quirk_write 0000:04:00.0 [address]: 0x1f004
vfio_rtl8168_quirk_read 0000:04:00.0 [address]: 0x8001f004
vfio_rtl8168_quirk_read 0000:04:00.0 [data]: 0x0
vfio_rtl8168_quirk_write 0000:04:00.0 [address]: 0x1f008
vfio_rtl8168_quirk_read 0000:04:00.0 [address]: 0x8001f008
vfio_rtl8168_quirk_read 0000:04:00.0 [data]: 0x49b1
vfio_rtl8168_quirk_write 0000:04:00.0 [address]: 0x1f00c
vfio_rtl8168_quirk_read 0000:04:00.0 [address]: 0x8001f00c
vfio_rtl8168_quirk_read 0000:04:00.0 [data]: 0x0

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

Merge remote-tracking branch 'remotes/dgibson/tags/spapr-next-20150923' into staging

sPAPR Patch Queue: 2015-09-23

Highlights:
    * pseries-2.5 machine type
    * Memory hotplug for "pseries" guests
    * Fixes to the PAPR Dynamic Reconfiguration hotplug code
    * Several PAPR compliance fixes
    * New SLOF with:
        * GPT support
        * Much faster VGA handling

# gpg: Signature made Wed 23 Sep 2015 02:50:10 BST using DSA key ID FDDA6FC6
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: F730 2185 38B4 D13E FD80  34F2 6882 CAC6 FDDA 6FC6

* remotes/dgibson/tags/spapr-next-20150923: (36 commits)
  sPAPR: Enable EEH on VFIO PCI device only
  sPAPR: Revert don't enable EEH on emulated PCI devices
  ppc/spapr: Implement H_RANDOM hypercall in QEMU
  ppc/spapr: Fix buffer overflow in spapr_populate_drconf_memory()
  spapr: Fix default NUMA node allocation for threads
  spapr: Move memory hotplug to RTAS_LOG_V6_HP_ID_DRC_COUNT type
  spapr: Support hotplug by specifying DRC count
  spapr: Revert to memory@XXXX representation for non-hotplugged memory
  spapr: Populate ibm,associativity-lookup-arrays correctly for non-NUMA
  spapr: Provide better error message when slots exceed max allowed
  spapr: Don't allow memory hotplug to memory less nodes
  spapr: Memory hotplug support
  spapr: Make hash table size a factor of maxram_size
  spapr: Support ibm,dynamic-reconfiguration-memory
  spapr: Add LMB DR connectors
  spapr: Use QEMU limit for maximum CPUs number
  spapr: Don't use QOM [*] syntax for DR connectors.
  spapr_drc: use RTAS return codes for methods called by RTAS
  spapr: Initialize hotplug memory address space
  spapr_drc: don't allow 'empty' DRCs to be unisolated or allocated
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

sPAPR: Enable EEH on VFIO PCI device only

This checks if the PCI device retrieved from the PCI device address
is VFIO PCI device when enabling EEH functionality. If it's not
VFIO PCI device, the EEH functonality isn't enabled.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

sPAPR: Revert don't enable EEH on emulated PCI devices

This reverts commit 7cb18007 ("sPAPR: Don't enable EEH on emulated
PCI devices") as rtas_ibm_set_eeh_option() isn't the right place
to check if there has the corresponding PCI device for the input
address, which can be PE address, not PCI device address.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

ppc/spapr: Implement H_RANDOM hypercall in QEMU

The PAPR interface defines a hypercall to pass high-quality
hardware generated random numbers to guests. Recent kernels can
already provide this hypercall to the guest if the right hardware
random number generator is available. But in case the user wants
to use another source like EGD, or QEMU is running with an older
kernel, we should also have this call in QEMU, so that guests that
do not support virtio-rng yet can get good random numbers, too.

This patch now adds a new pseudo-device to QEMU that either
directly provides this hypercall to the guest or is able to
enable the in-kernel hypercall if available. The in-kernel
hypercall can be enabled with the use-kvm property, e.g.:

qemu-system-ppc64 -device spapr-rng,use-kvm=true

For handling the hypercall in QEMU instead, a "RngBackend" is
required since the hypercall should provide "good" random data
instead of pseudo-random (like from a "simple" library function
like rand() or g_random_int()). Since there are multiple RngBackends
available, the user must select an appropriate back-end via the
"rng" property of the device, e.g.:

qemu-system-ppc64 -object rng-random,filename=/dev/hwrng,id=gid0 \
-device spapr-rng,rng=gid0 ...

See http://wiki.qemu-project.org/Features-Done/VirtIORNG for
other example of specifying RngBackends.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

ppc/spapr: Fix buffer overflow in spapr_populate_drconf_memory()

The buffer that is allocated in spapr_populate_drconf_memory()
is used for setting both, the "ibm,dynamic-memory" and the
"ibm,associativity-lookup-arrays" property. However, only the
size of the first one is taken into account when allocating the
memory. So if the length of the second property is larger than
the length of the first one, we run into a buffer overflow here!
Fix it by taking the length of the second property into account,
too.

Fixes: "spapr: Support ibm,dynamic-reconfiguration-memory" patch
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

spapr: Fix default NUMA node allocation for threads

At present, if guest numa nodes are requested, but the cpus in each node
are not specified, spapr just uses the default behaviour or assigning each
vcpu round-robin to nodes.

If smp_threads != 1, that will assign adjacent threads in a core to
different NUMA nodes. As well as being just weird, that's a configuration
that can't be represented in the device tree we give to the guest, which
means the guest and qemu end up with different ideas of the NUMA topology.

This patch implements mc->cpu_index_to_socket_id in the spapr code to
make sure vcpus get assigned to nodes only at the socket granularity.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>

spapr: Move memory hotplug to RTAS_LOG_V6_HP_ID_DRC_COUNT type

Till now memory hotplug used RTAS_LOG_V6_HP_ID_DRC_INDEX hotplug type
which meant that we generated one hotplug type of EPOW event for every
256MB (SPAPR_MEMORY_BLOCK_SIZE). This quickly overruns the kernel
rtas log buffer thus resulting in loss of memory hotplug events. Switch
to RTAS_LOG_V6_HP_ID_DRC_COUNT hotplug type for memory so that we
generate only one event per hotplug request.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

spapr: Support hotplug by specifying DRC count

Support hotplug identifier type RTAS_LOG_V6_HP_ID_DRC_COUNT that allows
hotplugging of DRCs by specifying the DRC count.

While we are here, rename

spapr_hotplug_req_add_event() to spapr_hotplug_req_add_by_index()
spapr_hotplug_req_remove_event() to spapr_hotplug_req_remove_by_index()

so that they match with spapr_hotplug_req_add_by_count().

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

spapr: Revert to memory@XXXX representation for non-hotplugged memory

Don't represent non-hotluggable memory under drconf node. With this
we don't have to create DRC objects for them.

The effect of this patch is that we revert back to memory@XXXX representation
for all the memory specified with -m option and represent the cold
plugged memory and hot-pluggable memory under
ibm,dynamic-reconfiguration-memory.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

spapr: Populate ibm,associativity-lookup-arrays correctly for non-NUMA

When NUMA isn't configured explicitly, assume node 0 is present for
the purpose of creating ibm,associativity-lookup-arrays property
under ibm,dynamic-reconfiguration-memory DT node. This ensures that
the associativity index property is correctly updated in ibm,dynamic-memory
for the LMB that is hotplugged.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

spapr: Provide better error message when slots exceed max allowed

Currently when user specifies more slots than allowed max of
SPAPR_MAX_RAM_SLOTS (32), we error out like this:

qemu-system-ppc64: unsupported amount of memory slots: 64

Let the user know about the max allowed slots like this:

qemu-system-ppc64: Specified number of memory slots 64 exceeds max supported 32

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

spapr: Don't allow memory hotplug to memory less nodes

Currently PowerPC kernel doesn't allow hot-adding memory to memory-less
node, but instead will silently add the memory to the first node that has
some memory. This causes two unexpected behaviours for the user.

- Memory gets hotplugged to a different node than what the user specified.
- Since pc-dimm subsystem in QEMU still thinks that memory belongs to
  memory-less node, a reboot will set things accordingly and the previously
  hotplugged memory now ends in the right node. This appears as if some
  memory moved from one node to another.

So until kernel starts supporting memory hotplug to memory-less
nodes, just prevent such attempts upfront in QEMU.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

spapr: Memory hotplug support

Make use of pc-dimm infrastructure to support memory hotplug
for PowerPC.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

spapr: Make hash table size a factor of maxram_size

The hash table size is dependent on ram_size, but since with hotplug
the memory can grow till maxram_size. Hence make hash table size dependent
on maxram_size.

This allows to hotplug huge amounts of memory to the guest.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

spapr: Support ibm,dynamic-reconfiguration-memory

Parse ibm,architecture.vec table obtained from the guest and enable
memory node configuration via ibm,dynamic-reconfiguration-memory if guest
supports it. This is in preparation to support memory hotplug for
sPAPR guests.

This changes the way memory node configuration is done. Currently all
memory nodes are built upfront. But after this patch, only memory@0 node
for RMA is built upfront. Guest kernel boots with just that and rest of
the memory nodes (via memory@XXX or ibm,dynamic-reconfiguration-memory)
are built when guest does ibm,client-architecture-support call.

Note: This patch needs a SLOF enhancement which is already part of
SLOF binary in QEMU.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

spapr: Add LMB DR connectors

Enable memory hotplug for pseries 2.4 and add LMB DR connectors.
With memory hotplug, enforce RAM size, NUMA node memory size and maxmem
to be a multiple of SPAPR_MEMORY_BLOCK_SIZE (256M) since that's the
granularity in which LMBs are represented and hot-added.

LMB DR connectors will be used by the memory hotplug code.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
[spapr_drc_reset implementation]
[since this missed the 2.4 cutoff, changing to only enable for 2.5]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

spapr: Use QEMU limit for maximum CPUs number

sPAPR uses hard coded limit of maximum 255 supported CPUs which is
exactly the same as QEMU-wide limit which is MAX_CPUMASK_BITS and also
defined as 255.

This makes use of a global CPU number limit for the "pseries" machine.

In order to anticipate future increase of the MAX_CPUMASK_BITS
(or to help debugging large systems), this also bumps the FDT_MAX_SIZE
limit from 256K to 1M assuming that 1 CPU core needs roughly 512 bytes
in the device tree so the new limit can cover up to 2048 CPU cores.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

spapr: Don't use QOM [*] syntax for DR connectors.

The dynamic reconfiguration (hotplug) code for the pseries machine type
uses a "DR connector" QOM object for each resource it will be possible
to hotplug.  Each of these is added to its owner using
    object_property_add_child(owner, "dr-connector[*], ...);

That works ok, mostly, but it means that the property indices are
arbitrary, depending on the order in which the connectors are constructed.
That might line up to something useful, but it doesn't have to.

It will get worse once we add hotplug RAM support.  That will add a DR
connector object for every 256MB of potential memory.  So if maxmem=2T,
for example, there are 8192 objects under the same parent.

The QOM interfaces aren't really designed for this.  In particular
object_property_add() with [*] has O(n^2) time complexity (in the number of
existing children): first it has a linear search through array indices to
find a free slot, each of which is attempted to a recursive call to
object_property_add() with a specific [N].  Those calls are O(n) because
there's a linear search through all properties to check for duplicates.

By using a meaningful index value, which we already know is unique we can
avoid the [*] special behaviour.  That lets us reduce the total time for
creating the DR objects from O(n^3) to O(n^2).

O(n^2) is still kind of crappy, but it's enough to reduce the startup time
of qemu (with in-progress memory hotplug support) with maxmem=2T from ~20
minutes to ~4 seconds.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
Tested-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>

spapr_drc: use RTAS return codes for methods called by RTAS

Certain methods in sPAPRDRConnector objects are only ever called by
RTAS and in many cases are responsible for the logic that determines
the RTAS return codes.

Rather than having a level of indirection requiring RTAS code to
re-interpret return values from such methods to determine the
appropriate return code, just pass them through directly.

This requires changing method return types to uint32_t to match the
type of values currently passed to RTAS helpers.

In the case of read accesses like drc->entity_sense() where we weren't
previously reporting any errors, just the read value, we modify the
function to return RTAS return code, and pass the read value back via
reference.

Suggested-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Suggested-by: David Gibson <david@gibson.dropbear.id.au>
Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

spapr: Initialize hotplug memory address space

Initialize a hotplug memory region under which all the hotplugged
memory is accommodated. Also enable memory hotplug by setting
CONFIG_MEM_HOTPLUG.

Modelled on i386 memory hotplug.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

spapr_drc: don't allow 'empty' DRCs to be unisolated or allocated

Logical resources start with allocation-state:UNUSABLE /
isolation-state:ISOLATED. During hotplug, guests will transition
them to allocation-state:USABLE, and then to
isolation-state:UNISOLATED.

For cases where we cannot transition to allocation-state:USABLE,
in this case due to no device/resource being association with
the logical DRC, we should return an error -3.

For physical DRCs, we default to allocation-state:USABLE and stay
there, so in this case we should report an error -3 when the guest
attempts to make the isolation-state:ISOLATED transition for a DRC
with no device associated.

These are as documented in PAPR 2.7, 13.5.3.4.

We also ensure allocation-state:USABLE when the guest attempts
transition to isolation-state:UNISOLATED to deal with misbehaving
guests attempting to bring online an unallocated logical resource.

This is as documented in PAPR 2.7, 13.7.

Currently we implement no such error logic. Fix this by handling
these error cases as PAPR defines.

Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

spapr_pci: fix device tree props for MSI/MSI-X

PAPR requires ibm,req#msi and ibm,req#msi-x to be present in the
device node to define the number of msi/msi-x interrupts the device
supports, respectively.

Currently we have ibm,req#msi-x hardcoded to a non-sensical constant
that happens to be 2, and are missing ibm,req#msi entirely. The result
of that is that msi-x capable devices get limited to 2 msi-x
interrupts (which can impact performance), and msi-only devices likely
wouldn't work at all. Additionally, if devices expect a minimum that
exceeds 2, the guest driver may fail to load entirely.

SLOF still owns the generation of these properties at boot-time
(although other device properties have since been offloaded to QEMU),
but for hotplugged devices we rely on the values generated by QEMU
and thus hit the limitations above.

Fix this by generating these properties in QEMU as expected by guests.

In the future it may make sense to modify SLOF to pass through these
values directly as we do with other props since we're duplicating SLOF
code.

Cc: qemu-ppc@nongnu.org
Cc: qemu-stable@nongnu.org
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

spapr: Enable in-kernel H_SET_MODE handling

For setting debug watchpoints, sPAPR guests use H_SET_MODE hypercall.
The existing QEMU H_SET_MODE handler does not support this but
the KVM handler in HV KVM does. However it is not enabled.

This enables the in-kernel H_SET_MODE handler which handles:
- Completed Instruction Address Breakpoint Register
- Watch point 0 registers.

The rest is still handled in QEMU.

Reported-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

pseries: Fix incorrect calculation of threads per socket for chip-id

The device tree presented to pseries machine type guests includes an
ibm,chip-id property which gives essentially the socket number of each
vcpu core (individual vcpu threads don't get a node in the device
tree).

To calculate this, it uses a vcpus_per_socket variable computed as
(smp_cpus / #sockets). This is correct for the usual case where
smp_cpus == smp_threads * smp_cores * #sockets.

However, you can start QEMU with the number of cores and threads
mismatching the total number of vcpus (whether that _should_ be
permitted is a topic for another day). It's a bit hard to say what
the "real" number of vcpus per socket here is, but for most purposes
(smp_threads * smp_cores) will more meaningfully match how QEMU
behaves with respect to socket boundaries.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>

pseries: Update SLOF firmware image to qemu-slof-20150813

The changes are:
1. GPT support;
2. Much faster VGA support.

The full changelog is:
  > Add missing half word access case to _FASTRMOVE and _FASTMOVE
  > Remove unused RMOVE64 stub
  > fbuffer: Implement RFILL as an accelerated primitive
  > fbuffer: Implement MRMOVE as an accelerated primitive
  > fbuffer: Precalculate line length in bytes
  > terminal: Disable the terminal-write trace by default
  > boot: remove trailing ":" in the bootpath
  > ci: implement boot client interface
  > boot: bootpath should be complete device path
  > fbuffer: Use a smaller cursor
  > fbuffer: Improve invert-region helper
  > usb-hid: Caps is not always shift
  > cas: Increase FDT buffer size to accomodate larger ibm, cas node properties
  > README: Update with patch submittion note
  > disk-label: add support for booting from GPT FAT partition
  > disk-label: introduce helper to check fat filesystem
  > introduce 8-byte LE helpers
  > disk-label: simplify gpt-prep-partition? routine
  > fbuffer: introduce the invert-region-x helper
  > fbuffer: introduce the invert-region helper
  > fbuffer: simplify address computations in fb8-toggle-cursor

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

pseries: define coldplugged devices as "configured"

When a device is hotplugged, attach() sets "configured" to
false, waiting an action from the OS to configure it and then
to call ibm,configure-connector. On ibm,configure-connector,
the hypervisor sets "configured" to true.

In case of coldplugged device, attach() sets "configured" to
false, but firmware and OS never call the ibm,configure-connector
in this case, so it remains set to false.

It could be harmless, but when we unplug a device, hypervisor
waits the device becomes configured because for it, a not configured
device is a device being configured, so it waits the end of configuration
to unplug it... and it never happens, so it is never unplugged.

This patch set by default coldplugged device to "configured=true",
hotplugged device to "configured=false".

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>