git.osdn.net Git - qmiga/qemu.git/log

Merge remote-tracking branch 'remotes/vivier2/tags/linux-user-for-2.12-pull-request' into staging

# gpg: Signature made Tue 20 Mar 2018 20:43:37 GMT
# gpg:                using RSA key F30C38BD3F2FBE3C
# gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>"
# gpg:                 aka "Laurent Vivier <laurent@vivier.eu>"
# gpg:                 aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>"
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F  5173 F30C 38BD 3F2F BE3C

* remotes/vivier2/tags/linux-user-for-2.12-pull-request:
  linux-user: init_guest_space: Try to make ARM space+commpage continuous

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Update version for v2.12.0-rc0 release

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/dgilbert/tags/pull-hmp-20180320' into staging

HMP fixes for 2.12

# gpg: Signature made Tue 20 Mar 2018 12:39:24 GMT
# gpg:                using RSA key 0516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>"
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A  9FA9 0516 331E BC5B FDE7

* remotes/dgilbert/tags/pull-hmp-20180320:
  hmp: free sev info
  HMP: Initialize err before using

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

linux-user: init_guest_space: Try to make ARM space+commpage continuous

At a fixed distance after the usable memory that init_guest_space maps, for
32-bit ARM targets we also need to map a commpage.  The normal
init_guest_space logic doesn't keep this in mind when searching for an
address range.

If !host_start, then try to find a big continuous segment where we can put
both the usable memory and the commpage; we then munmap that segment and
set current_start to that address; and let the normal code mmap the usable
memory and the commpage separately.  That is: if we don't have hint of
where to start looking for memory, come up with one that is better than
NULL.  Depending on host_size and guest_start, there may or may not be a
gap between the usable memory and the commpage, so this is slightly more
restrictive than it needs to be; but it's only a hint, so that's OK.

We only do that for !host start, because if host_start, then either:
- we got an address passed in with -B, in which case we don't want to
   interfere with what the user said;
- or host_start is based off of the ELF image's loaddr.  The check "if
   (host_start && real_start != current_start)" suggests that we really
   want lowest available address that is >= loaddr.  I don't know why that
   is, but I'm trusting that Paul Brook knew what he was doing when he
   wrote the original version of that check in
   c581deda322080e8beb88b2e468d4af54454e4b3 way back in 2010.

Signed-off-by: Luke Shumaker <lukeshu@parabola.nu>
Message-Id: <20171228180814.9749-11-lukeshu@lukeshu.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>

Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

virtio,vhost,pci,pc: features, cleanups

SRAT tables for DIMM devices
new virtio net flags for speed/duplex
post-copy migration support in vhost
cleanups in pci

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Tue 20 Mar 2018 14:40:43 GMT
# gpg:                using RSA key 281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream: (51 commits)
  postcopy shared docs
  libvhost-user: Claim support for postcopy
  postcopy: Allow shared memory
  vhost: Huge page align and merge
  vhost+postcopy: Wire up POSTCOPY_END notify
  vhost-user: Add VHOST_USER_POSTCOPY_END message
  libvhost-user: mprotect & madvises for postcopy
  vhost+postcopy: Call wakeups
  vhost+postcopy: Add vhost waker
  postcopy: postcopy_notify_shared_wake
  postcopy: helper for waking shared
  vhost+postcopy: Resolve client address
  postcopy-ram: add a stub for postcopy_request_shared_page
  vhost+postcopy: Helper to send requests to source for shared pages
  vhost+postcopy: Stash RAMBlock and offset
  vhost+postcopy: Send address back to qemu
  libvhost-user+postcopy: Register new regions with the ufd
  migration/ram: ramblock_recv_bitmap_test_byte_offset
  postcopy+vhost-user: Split set_mem_table for postcopy
  vhost+postcopy: Transmit 'listen' to slave
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
# Conflicts:
# scripts/update-linux-headers.sh

postcopy shared docs

Add some notes to the migration documentation for shared memory
postcopy.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

libvhost-user: Claim support for postcopy

Tell QEMU we understand the protocol features needed for postcopy.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

postcopy: Allow shared memory

Now that we have the mechanisms in here, allow shared memory in a
postcopy.

Note that QEMU can't tell who all the users of shared regions are
and thus can't tell whether all the users of the shared regions
have appropriate support for postcopy. Those devices that explicitly
support shared memory (e.g. vhost-user) must check, but it doesn't
stop weirder configurations causing problems.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

vhost: Huge page align and merge

Align RAMBlocks to page size alignment, and adjust the merging code
to deal with partial overlap due to that alignment.

This is needed for postcopy so that we can place/fetch whole hugepages
when under userfault.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

vhost+postcopy: Wire up POSTCOPY_END notify

Wire up a call to VHOST_USER_POSTCOPY_END message to the vhost clients
right before we ask the listener thread to shutdown.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

vhost-user: Add VHOST_USER_POSTCOPY_END message

This message is sent just before the end of postcopy to get the
client to stop using userfault since we wont respond to any more
requests. It should close userfaultfd so that any other pages
get mapped to the backing file automatically by the kernel, since
at this point we know we've received everything.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

libvhost-user: mprotect & madvises for postcopy

Clear the area and turn off THP.
PROT_NONE the area until after we've userfault advised it
to catch any unexpected changes.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

vhost+postcopy: Call wakeups

Cause the vhost-user client to be woken up whenever:
a) We place a page in postcopy mode
b) We get a fault and the page has already been received

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

vhost+postcopy: Add vhost waker

Register a waker function in vhost-user code to be notified when
pages arrive or requests to previously mapped pages get requested.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

postcopy: postcopy_notify_shared_wake

Add a hook to allow a client userfaultfd to be 'woken'
when a page arrives, and a walker that calls that
hook for relevant clients given a RAMBlock and offset.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

postcopy: helper for waking shared

Provide a helper to send a 'wake' request on a userfaultfd for
a shared process.
The address in the clients address space is specified together
with the RAMBlock it was resolved to.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

vhost+postcopy: Resolve client address

Resolve fault addresses read off the clients UFD into RAMBlock
and offset, and call back to the postcopy code to ask for the page.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

postcopy-ram: add a stub for postcopy_request_shared_page

This fixes the build on systems without userfaultfd.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

Merge remote-tracking branch 'remotes/vivier/tags/m68k-for-2.12-pull-request' into staging

# gpg: Signature made Tue 20 Mar 2018 09:07:55 GMT
# gpg:                using RSA key F30C38BD3F2FBE3C
# gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>"
# gpg:                 aka "Laurent Vivier <laurent@vivier.eu>"
# gpg:                 aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>"
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F  5173 F30C 38BD 3F2F BE3C

* remotes/vivier/tags/m68k-for-2.12-pull-request:
  target/m68k: add a mechanism to automatically free TCGv
  target/m68k: add DisasContext parameter to gen_extend()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/ehabkost/tags/machine-next-pull-request' into staging

Machine and x86 queue, 2018-03-19

* cpu_model/cpu_type cleanups
* x86: Fix on Intel Processor Trace CPUID checks

# gpg: Signature made Mon 19 Mar 2018 20:07:14 GMT
# gpg:                using RSA key 2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* remotes/ehabkost/tags/machine-next-pull-request:
  i386: Disable Intel PT if packets IP payloads have LIP values
  cpu: drop unnecessary NULL check and cpu_common_class_by_name()
  cpu: get rid of unused cpu_init() defines
  Use cpu_create(type) instead of cpu_init(cpu_model)
  cpu: add CPU_RESOLVING_TYPE macro
  tests: add machine 'none' with -cpu test
  nios2: 10m50_devboard: replace cpu_model with cpu_type

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

hmp: free sev info

Found thanks to ASAN:

Direct leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0x7efe20417a38 in __interceptor_calloc (/lib64/libasan.so.4+0xdea38)
    #1 0x7efe1f7b2f75 in g_malloc0 ../glib/gmem.c:124
    #2 0x7efe1f7b3249 in g_malloc0_n ../glib/gmem.c:355
    #3 0x558272879162 in sev_get_info /home/elmarco/src/qemu/target/i386/sev.c:414
    #4 0x55827285113b in hmp_info_sev /home/elmarco/src/qemu/target/i386/monitor.c:684
    #5 0x5582724043b8 in handle_hmp_command /home/elmarco/src/qemu/monitor.c:3333

Fixes: 63036314
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180319175823.22111-1-marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

HMP: Initialize err before using

When bdrv_snapshot_delete return fail, the errp will not be
assigned a valid value in error_propagate as errp didn't be
initialized in hmp_delvm, then error_reportf_err will use an
uninitialized value(call by hmp_delvm), and qemu crash.

Signed-off-by: zhangjixiang <jixiang_zhang@h3c.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

RISC-V: Fix riscv_isa_string memory size bug

This version uses a constant size memory buffer sized for
the maximum possible ISA string length. It also uses g_new
instead of g_new0, uses more efficient logic to append
extensions and adds manual zero termination of the string.

Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Clark <mjc@sifive.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
[PMM: Use qemu_tolower() rather than tolower()]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/ericb/tags/pull-qapi-2018-03-12-v4' into staging

qapi patches for 2018-03-12, 2.12 softfreeze

- Marc-André Lureau: 0/4 qapi: generate a literal qobject for introspection
- Max Reitz: 0/7 block: Handle null backing link
- Daniel P. Berrange: chardev: tcp: postpone TLS work until machine done
- Peter Xu: 00/23 QMP: out-of-band (OOB) execution support
- Vladimir Sementsov-Ogievskiy: 0/2 block latency histogram
- Eric Blake: qapi: Pass '-u' when doing non-silent diff

# gpg: Signature made Mon 19 Mar 2018 19:59:04 GMT
# gpg:                using RSA key A7A16B4A2527436A
# gpg: Good signature from "Eric Blake <eblake@redhat.com>"
# gpg:                 aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>"
# gpg:                 aka "[jpeg image of size 6874]"
# Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2  F3AA A7A1 6B4A 2527 436A

* remotes/ericb/tags/pull-qapi-2018-03-12-v4: (38 commits)
  qapi: Pass '-u' when doing non-silent diff
  qapi: add block latency histogram interface
  block/accounting: introduce latency histogram
  tests: qmp-test: add oob test
  tests: qmp-test: verify command batching
  qmp: add command "x-oob-test"
  monitor: enable IO thread for (qmp & !mux) typed
  qmp: isolate responses into io thread
  qmp: support out-of-band (oob) execution
  qapi: introduce new cmd option "allow-oob"
  monitor: send event when command queue full
  qmp: add new event "command-dropped"
  monitor: separate QMP parser and dispatcher
  monitor: let suspend/resume work even with QMPs
  monitor: let suspend_cnt be thread safe
  monitor: introduce monitor_qmp_respond()
  qmp: introduce QMPCapability
  monitor: allow using IO thread for parsing
  monitor: let mon_list be tail queue
  monitor: unify global init
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

target/m68k: add a mechanism to automatically free TCGv

SRC_EA() and gen_extend() can return either a temporary
TCGv or a memory allocated one. Mark them when they are
allocated, and free them automatically at end of the
instruction translation.

We want to free locally allocated TCGv to avoid
overflow in sequence like:

  0xc00ae406:  movel %fp@(-132),%fp@(-268)
  0xc00ae40c:  movel %fp@(-128),%fp@(-264)
  0xc00ae412:  movel %fp@(-20),%fp@(-212)
  0xc00ae418:  movel %fp@(-16),%fp@(-208)
  0xc00ae41e:  movel %fp@(-60),%fp@(-220)
  0xc00ae424:  movel %fp@(-56),%fp@(-216)
  0xc00ae42a:  movel %fp@(-124),%fp@(-252)
  0xc00ae430:  movel %fp@(-120),%fp@(-248)
  0xc00ae436:  movel %fp@(-12),%fp@(-260)
  0xc00ae43c:  movel %fp@(-8),%fp@(-256)
  0xc00ae442:  movel %fp@(-52),%fp@(-276)
  0xc00ae448:  movel %fp@(-48),%fp@(-272)
  ...

That can fill a lot of TCGv entries in a sequence,
especially since 15fa08f845 ("tcg: Dynamically allocate TCGOps")
we have no limit to fill the TCGOps cache and we can fill
the entire TCG variables array and overflow it.

Suggested-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180319113544.704-3-laurent@vivier.eu>

target/m68k: add DisasContext parameter to gen_extend()

This parameter will be needed to manage automatic release
of temporary allocated TCG variables.

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180319113544.704-2-laurent@vivier.eu>

vhost+postcopy: Helper to send requests to source for shared pages

Provide a helper to be used by shared waker functions to request
shared pages from the source.
The last_rb pointer is moved into the incoming state since this
helper can update it as well as the main fault thread function.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

vhost+postcopy: Stash RAMBlock and offset

Stash the RAMBlock and offset for later use looking up
addresses.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

vhost+postcopy: Send address back to qemu

We need a better way, but at the moment we need the address of the
mappings sent back to qemu so it can interpret the messages on the
userfaultfd it reads.

This is done as a 3 stage set:
   QEMU -> client
      set_mem_table

   mmap stuff, get addresses

   client -> qemu
       here are the addresses

   qemu -> client
       OK - now you can use them

That ensures that qemu has registered the new addresses in it's
userfault code before the client starts accessing them.

Note: We don't ask for the default 'ack' reply since we've got our own.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

libvhost-user+postcopy: Register new regions with the ufd

When new regions are sent to the client using SET_MEM_TABLE, register
them with the userfaultfd.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

migration/ram: ramblock_recv_bitmap_test_byte_offset

Utility for testing the map when you already know the offset
in the RAMBlock.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

postcopy+vhost-user: Split set_mem_table for postcopy

Split the set_mem_table routines in both qemu and libvhost-user
because the postcopy versions are going to be quite different
once changes in the later patches are added. However, this patch
doesn't produce any functional change, just the split.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

vhost+postcopy: Transmit 'listen' to slave

Notify the vhost-user slave on reception of the 'postcopy-listen'
event from the source.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

vhost+postcopy: Register shared ufd with postcopy

Register the UFD that comes in as the response to the 'advise' method
with the postcopy code.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

postcopy: Allow registering of fd handler

Allow other userfaultfd's to be registered into the fault thread
so that handlers for shared memory can get responses.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

libvhost-user: Open userfaultfd

Open a userfaultfd (on a postcopy_advise) and send it back in
the reply to the qemu for it to monitor.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

libvhost-user: Support sending fds back to qemu

Allow replies with fds (for postcopy)

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

vhost-user: Add 'VHOST_USER_POSTCOPY_ADVISE' message

Wire up a notifier to send a VHOST_USER_POSTCOPY_ADVISE
message on an incoming advise.

Later patches will fill in the behaviour/contents of the
message.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

postcopy: Add vhost-user flag for postcopy and check it

Add a vhost feature flag for postcopy support, and
use the postcopy notifier to check it before allowing postcopy.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

postcopy: Add notifier chain

Add a notifier chain for postcopy with a 'reason' flag
and an opportunity for a notifier member to return an error.

Call it when enabling postcopy.

This will initially used to enable devices to declare they're unable
to postcopy and later to notify of devices of stages within postcopy.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

postcopy: use UFFDIO_ZEROPAGE only when available

Use a flag on the RAMBlock to state whether it has the
UFFDIO_ZEROPAGE capability, use it when it's available.

This allows the use of postcopy on tmpfs as well as hugepage
backed files.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

qemu_ram_block_host_offset

Utility to give the offset of a host pointer within a RAMBlock
(assuming we already know it's in that RAMBlock)

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

migrate: Update ram_block_discard_range for shared

The choice of call to discard a block is getting more complicated
for other cases. We use fallocate PUNCH_HOLE in any file cases;
it works for both hugepage and for tmpfs.
We use the DONTNEED for non-hugepage cases either where they're
anonymous or where they're private.

Care should be taken when trying other backing files.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

Makefile: add target to print generated files

This is helpful for automatic code analysis.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

test/acpi-test-data: add ACPI tables for dimmpxm test

Reviewers can use ACPI tables in this patch to run
test_acpi_{piix4,q35}_tcg_dimm_pxm cases.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

tests/bios-tables-test: add test cases for DIMM proximity

QEMU now builds one SRAT memory affinity structure for each PC-DIMM
and NVDIMM device presented at boot time with the proximity domain
specified in the device option 'node', rather than only one SRAT
memory affinity structure covering the entire hotpluggable address
space with the proximity domain of the last node.

Add test cases on PC and Q35 machines with 4 proximity domains, and
one PC-DIMM and one NVDIMM attached to the 2nd and 3rd proximity
domains respectively. Check whether the QEMU-built SRAT tables match
with the expected ones.

The following ACPI tables need to be added for this test:
  tests/acpi-test-data/pc/APIC.dimmpxm
  tests/acpi-test-data/pc/DSDT.dimmpxm
  tests/acpi-test-data/pc/NFIT.dimmpxm
  tests/acpi-test-data/pc/SRAT.dimmpxm
  tests/acpi-test-data/pc/SSDT.dimmpxm
  tests/acpi-test-data/q35/APIC.dimmpxm
  tests/acpi-test-data/q35/DSDT.dimmpxm
  tests/acpi-test-data/q35/NFIT.dimmpxm
  tests/acpi-test-data/q35/SRAT.dimmpxm
  tests/acpi-test-data/q35/SSDT.dimmpxm
New APIC and DSDT are needed because of the multiple processors
configuration. New NFIT and SSDT are needed because of NVDIMM.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

hw/acpi-build: build SRAT memory affinity structures for DIMM devices

ACPI 6.2A Table 5-129 "SPA Range Structure" requires the proximity
domain of a NVDIMM SPA range must match with corresponding entry in
SRAT table.

The address ranges of vNVDIMM in QEMU are allocated from the
hot-pluggable address space, which is entirely covered by one SRAT
memory affinity structure. However, users can set the vNVDIMM
proximity domain in NFIT SPA range structure by the 'node' property of
'-device nvdimm' to a value different than the one in the above SRAT
memory affinity structure.

In order to solve such proximity domain mismatch, this patch builds
one SRAT memory affinity structure for each DIMM device present at
boot time, including both PC-DIMM and NVDIMM, with the proximity
domain specified in '-device pc-dimm' or '-device nvdimm'.

The remaining hot-pluggable address space is covered by one or multiple
SRAT memory affinity structures with the proximity domain of the last
node as before.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

qmp: distinguish PC-DIMM and NVDIMM in MemoryDeviceInfoList

It may need to treat PC-DIMM and NVDIMM differently, e.g., when
deciding the necessity of non-volatile flag bit in SRAT memory
affinity structures.

A new field 'nvdimm' is added to the union type MemoryDeviceInfo for
such purpose. Its type is currently PCDIMMDeviceInfo and will be
updated when necessary in the future.

It also fixes "info memory-devices"/query-memory-devices which
currently show nvdimm devices as dimm devices since
object_dynamic_cast(obj, TYPE_PC_DIMM) happily cast nvdimm to
TYPE_PC_DIMM which it's been inherited from.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

pc-dimm: make qmp_pc_dimm_device_list() sort devices by address

Make qmp_pc_dimm_device_list() return sorted by start address
list of devices so that it could be reused in places that
would need sorted list*. Reuse existing pc_dimm_built_list()
to get sorted list.

While at it hide recursive callbacks from callers, so that:

qmp_pc_dimm_device_list(qdev_get_machine(), &list);

could be replaced with simpler:

list = qmp_pc_dimm_device_list();

* follow up patch will use it in build_srat()

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au> for ppc part
Reviewed-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

hw/pci: remove obsolete PCIDevice->init()

All PCI devices are now QOM'ified.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

standard-headers: update virtio_net.h

include speed/duplex fields

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

i386: Disable Intel PT if packets IP payloads have LIP values

Intel processor trace should be disabled when
CPUID.(EAX=14H,ECX=0H).ECX.[bit31] is set.
Generated packets which contain IP payloads will have LIP
values when this bit is set, or IP payloads will have RIP
values.
Currently, The information of CPUID 14H is constant to make
live migration safty and this bit is always 0 in guest even
if host support LIP values.
Guest sees the bit is 0 will expect IP payloads with RIP
values, but the host CPU will generate IP payloads with
LIP values if this bit is set in HW.
To make sure the value of IP payloads correctly, Intel PT
should be disabled when bit[31] is set.

Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Message-Id: <1520969191-18162-1-git-send-email-luwei.kang@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>

qapi: Pass '-u' when doing non-silent diff

Ed-script diffs are awful compared to context diffs. Fix another
'diff -q' while in the area (if the files are different, being
noisy makes it easier to diagnose why).

While at it, diff .err before .out, because if a test fails, .err
is more likely to contain the most important information for
fixing the failure.

Fixes: 46ec4fce
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180315125116.804342-1-eblake@redhat.com>

qapi: add block latency histogram interface

Set (and clear) histograms through new command
block-latency-histogram-set and show new statistics in
query-blockstats results.

For now, the command is marked experimental with prefix 'x-',
to gain experience with the interface without being stuck
with design decisions.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180309165212.97144-3-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[eblake: fix typos, mention x- prefix in commit message]
Signed-off-by: Eric Blake <eblake@redhat.com>

block/accounting: introduce latency histogram

Introduce latency histogram statics for block devices.
For each accounted operation type, the latency region [0, +inf) is
divided into subregions by several points. Then, calculate
hits for each subregion.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180309165212.97144-2-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>

tests: qmp-test: add oob test

Test the new OOB capability. Here we used the new "x-oob-test" command.
First, we send a lock=true and oob=false command to hang the main
thread. Then send another lock=false and oob=true command (which will
be run inside parser this time) to free that hanged command.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-24-peterx@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: grammar tweaks]
Signed-off-by: Eric Blake <eblake@redhat.com>

tests: qmp-test: verify command batching

OOB introduced DROP event for flow control. This should not affect old
QMP clients. Add a command batching check to make sure of it.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-23-peterx@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>

qmp: add command "x-oob-test"

This command is only used to test OOB functionality. It should not be
used for any other purposes.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-22-peterx@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: grammar tweak]
Signed-off-by: Eric Blake <eblake@redhat.com>

monitor: enable IO thread for (qmp & !mux) typed

Start to use dedicate IO thread for QMP monitors that are not using
MUXed chardev.

Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-21-peterx@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>

qmp: isolate responses into io thread

For those monitors who have enabled IO thread, we'll offload the
responding procedure into IO thread.  The main reason is that chardev is
not thread safe, and we need to do all the read/write IOs in the same
thread.  For use_io_thr=true monitors, that thread is the IO thread.

We do this isolation in similar pattern as what we have done to the
request queue: we first create one response queue for each monitor, then
instead of replying directly in the main thread, we queue the responses
and kick the IO thread to do the rest of the job for us.

A funny thing after doing this is that, when the QMP clients send "quit"
to QEMU, it's possible that we close the IOThread even earlier than
replying to that "quit".  So another thing we need to do before cleaning
up the monitors is that we need to flush the response queue (we don't
need to do that for command queue; after all we are quitting) to make
sure replies for handled commands are always flushed back to clients.

Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-20-peterx@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>

qmp: support out-of-band (oob) execution

Having "allow-oob":true for a command does not mean that this command
will always be run in out-of-band mode.  The out-of-band quick path will
only be executed if we specify the extra "run-oob" flag when sending the
QMP request:

    { "execute":   "command-that-allows-oob",
      "arguments": { ... },
      "control":   { "run-oob": true } }

The "control" key is introduced to store this extra flag.  "control"
field is used to store arguments that are shared by all the commands,
rather than command specific arguments.  Let "run-oob" be the first.

Note that in the patch I exported qmp_dispatch_check_obj() to be used to
check the request earlier, and at the same time allowed "id" field to be
there since actually we always allow that.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-19-peterx@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: rebase to qobject_to(), spelling fix]
Signed-off-by: Eric Blake <eblake@redhat.com>

qapi: introduce new cmd option "allow-oob"

Here "oob" stands for "Out-Of-Band".  When "allow-oob" is set, it means
the command allows out-of-band execution.

The "oob" idea is proposed by Markus Armbruster in following thread:

  https://lists.gnu.org/archive/html/qemu-devel/2017-09/msg02057.html

This new "allow-oob" boolean will be exposed by "query-qmp-schema" as
well for command entries, so that QMP clients can know which commands
can be used in out-of-band calls. For example the command "migrate"
originally looks like:

  {"name": "migrate", "ret-type": "17", "meta-type": "command",
   "arg-type": "86"}

And it'll be changed into:

  {"name": "migrate", "ret-type": "17", "allow-oob": false,
   "meta-type": "command", "arg-type": "86"}

This patch only provides the QMP interface level changes.  It does not
contain the real out-of-band execution implementation yet.

Suggested-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-18-peterx@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: rebase on introspection done by qlit]
Signed-off-by: Eric Blake <eblake@redhat.com>

monitor: send event when command queue full

Set maximum QMP command queue length to 8. If the queue is full,
instead of queuing the command, we directly return a "command-dropped"
event, telling the client that a specific command is dropped.

Note that this flow control mechanism is only valid if OOB is enabled.
If it's not, the effective queue length will always be 1, which strictly
follows original behavior of QMP command handling (which never drops
messages).

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-17-peterx@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: commit message grammar, abort on failure to send event]
Signed-off-by: Eric Blake <eblake@redhat.com>

qmp: add new event "command-dropped"

This event will be emitted if one QMP command is dropped. Also,
declare an enum for the reasons.

Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-16-peterx@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: rebase to master]
Signed-off-by: Eric Blake <eblake@redhat.com>

monitor: separate QMP parser and dispatcher

Originally QMP goes through these steps:

  JSON Parser --> QMP Dispatcher --> Respond
      /|\    (2)                (3)     |
   (1) |                               \|/ (4)
       +---------  main thread  --------+

This patch does this:

  JSON Parser     QMP Dispatcher --> Respond
      /|\ |           /|\       (4)     |
       |  | (2)        | (3)            |  (5)
   (1) |  +----->      |               \|/
       +---------  main thread  <-------+

So the parsing job and the dispatching job is isolated now.  It gives us
a chance in follow up patches to totally move the parser outside.

The isolation is done using one QEMUBH. Only one dispatcher QEMUBH is
used for all the monitors.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-15-peterx@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: grammar tweaks, rebase to qobject_to()]
Signed-off-by: Eric Blake <eblake@redhat.com>

monitor: let suspend/resume work even with QMPs

This patches allows QMP monitors to be suspended/resumed.

One thing to mention is that for QMPs that are using IOThreads, we need
an explicit kick for the IOThread in case it is sleeping.

Meanwhile, we need to take special care on non-interactive HMPs.
Currently only gdbserver is using that. For these monitors, we still
don't allow suspend/resume operations.

Since at it, add traces for the operations.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-14-peterx@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>

monitor: let suspend_cnt be thread safe

Monitor code now can be run in more than one thread. Let it be thread
safe when accessing suspend_cnt counter.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-13-peterx@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>

monitor: introduce monitor_qmp_respond()

A tiny refactoring, preparing to split the QMP dispatcher away.

Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-12-peterx@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: rebase to qobject_to() usage]
Signed-off-by: Eric Blake <eblake@redhat.com>

qmp: introduce QMPCapability

There were no QMP capabilities defined.  Define the first capability,
"oob", to allow out-of-band messages.

After this patch, we will allow QMP clients to enable QMP capabilities
when sending the first "qmp_capabilities" command.  Originally we are
starting QMP session with no arguments like:

  { "execute": "qmp_capabilities" }

Now we can enable some QMP capabilities using (take OOB as example,
which is the only capability that we support):

  { "execute": "qmp_capabilities",
    "arguments": { "enable": [ "oob" ] } }

When the "arguments" key is not provided, no capability is enabled.

For capability "oob", the monitor needs to be run on a dedicated IO
thread, otherwise the command will fail.  For example, trying to enable
OOB on a MUXed typed QMP monitor will fail.

One thing to mention is that QMP capabilities are per-monitor, and also
when the connection is closed due to some reason, the capabilities will
be reset.

Also, touch up qmp-test.c to test the new bits.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-11-peterx@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: touch up commit message]
Signed-off-by: Eric Blake <eblake@redhat.com>

monitor: allow using IO thread for parsing

For each Monitor, add one field "use_io_thr" to show whether it will be
using the dedicated monitor IO thread to handle input/output.  When set,
monitor IO parsing work will be offloaded to the dedicated monitor IO
thread, rather than the original main loop thread.

This only works for QMP.  HMP will always be run on the main loop
thread.

Currently we're still keeping use_io_thr off always.  Will turn it on
later at some point.

One thing to mention is that we cannot set use_io_thr for every QMP
monitor.  The problem is that MUXed typed chardevs may not work well
with it now. When MUX is used, frontend of chardev can be the monitor
plus something else.  The only thing we know would be safe to be run
outside main thread so far is the monitor frontend. All the rest of the
frontends should still be run in main thread only.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-10-peterx@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: squash in Peter's followup patch to avoid test failures]
Signed-off-by: Eric Blake <eblake@redhat.com>

monitor: let mon_list be tail queue

It was QLIST. I want to use this list to do monitor priority job later,
which need tail insertion ability. So switching to a tail queue.

Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-9-peterx@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>

monitor: unify global init

There are many places where the monitor initializes its globals:

- monitor_init_qmp_commands() at the very beginning
- single function to init monitor_lock
- in the first entry of monitor_init() using "is_first_init"

Unify them a bit.

monitor_lock is not used before monitor_init() (as confirmed by code
analysis and gdb watchpoints); so we are safe delaying what was a
constructor-time initialization of the mutex into the later first call
to monitor_init().

Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-8-peterx@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>

monitor: move the cur_mon hack deeper for QMP

In monitor_qmp_read(), we have the hack to temporarily replace the
cur_mon pointer. Now we move this hack deeper inside the QMP dispatcher
routine since the Monitor pointer can be actually obtained using
container_of() upon the parser object, just like most of the other JSON
parser users do.

This does not make much sense as a single patch. However, this will be
a big step for the next patch, when the QMP dispatcher routine will be
split from the QMP parser.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-7-peterx@redhat.com>
[eblake: rebase context of qobject_to() macro]
Signed-off-by: Eric Blake <eblake@redhat.com>

monitor: move skip_flush into monitor_data_init

It's part of the data init. Collect it.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-6-peterx@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>

qobject: let object_property_get_str() use new API

We can simplify object_property_get_str() using the new
qobject_get_try_str().

Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-5-peterx@redhat.com>
[eblake: rebase context of qobject_to() macro]
Signed-off-by: Eric Blake <eblake@redhat.com>

qobject: introduce qobject_get_try_str()

A quick way to fetch string from qobject when it's a QString.

Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-4-peterx@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: rebase to qobject_to() macro]
Signed-off-by: Eric Blake <eblake@redhat.com>

qobject: introduce qstring_get_try_str()

The only difference from qstring_get_str() is that it allows the qstring
to be NULL. If so, NULL is returned.

CC: Eric Blake <eblake@redhat.com>
CC: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-3-peterx@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>

docs: update QMP documents for OOB commands

Update both the developer and spec for the new QMP OOB (Out-Of-Band)
command.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-2-peterx@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: grammar tweaks]
Signed-off-by: Eric Blake <eblake@redhat.com>

chardev: tcp: postpone TLS work until machine done

TLS handshake may create background GSource tasks, while we won't know
the correct GMainContext until the whole chardev (including frontend)
inited. Let's postpone the initial TLS handshake until machine done.

For dynamically created tcp chardev, we don't postpone that by checking
the init_machine_done variable.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
[peterx: add missing include line, do unit test]
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180308140714.28906-1-peterx@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>

block: Deprecate "backing": ""

We have a clear replacement, so let's deprecate it.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-Id: <20180224154033.29559-8-mreitz@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>

block: Handle null backing link

Instead of converting all "backing": null instances into "backing": "",
handle a null value directly in bdrv_open_inherit().

This enables explicitly null backing links for json:{} filenames.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-Id: <20180224154033.29559-7-mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: rebase to qobject_to() parameter order and qapi headers split]
Signed-off-by: Eric Blake <eblake@redhat.com>

qapi: Make more of qobject_to()

This patch reworks some places which use either qobject_type() checks
plus qobject_to(), where the latter alone is sufficient, or NULL checks
plus qobject_type() checks where we can simply do a qobject_to() != NULL
check.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-Id: <20180224154033.29559-6-mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: rebase to qobject_to() parameter ordering]
Signed-off-by: Eric Blake <eblake@redhat.com>

qapi: Remove qobject_to_X() functions

They are no longer needed now.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-Id: <20180224154033.29559-5-mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>

qapi: Replace qobject_to_X(o) by qobject_to(X, o)

This patch was generated using the following Coccinelle script:

@@
expression Obj;
@@
(
- qobject_to_qnum(Obj)
+ qobject_to(QNum, Obj)
|
- qobject_to_qstring(Obj)
+ qobject_to(QString, Obj)
|
- qobject_to_qdict(Obj)
+ qobject_to(QDict, Obj)
|
- qobject_to_qlist(Obj)
+ qobject_to(QList, Obj)
|
- qobject_to_qbool(Obj)
+ qobject_to(QBool, Obj)
)

and a bit of manual fix-up for overly long lines and three places in
tests/check-qjson.c that Coccinelle did not find.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-Id: <20180224154033.29559-4-mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: swap order from qobject_to(o, X), rebase to master, also a fix
to latent false-positive compiler complaint about hw/i386/acpi-build.c]
Signed-off-by: Eric Blake <eblake@redhat.com>

qapi: Add qobject_to()

This is a dynamic casting macro that, given a QObject type, returns an
object as that type or NULL if the object is of a different type (or
NULL itself).

The macro uses lower-case letters because:
1. There does not seem to be a hard rule on whether qemu macros have to
   be upper-cased,
2. The current situation in qapi/qmp is inconsistent (compare e.g.
   QINCREF() vs. qdict_put()),
3. qobject_to() will evaluate its @obj parameter only once, thus it is
   generally not important to the caller whether it is a macro or not,
4. I prefer it aesthetically.

The macro parameter order is chosen with typename first for
consistency with other QAPI macros like QAPI_CLONE(), as well as
for legibility (read it as "qobject to" type "applied to" obj).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20180224154033.29559-3-mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
[eblake: swap parameter order to list type first, avoid clang ubsan
warning on QOBJECT(NULL) and container_of(NULL,type,base)]
Signed-off-by: Eric Blake <eblake@redhat.com>

Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20180319' into staging

target-arm queue:
* fsl-imx6: Fix incorrect Ethernet interrupt defines
* dump: Update correct kdump phys_base field for AArch64
* char: i.MX: Add support for "TX complete" interrupt
* bcm2836/raspi: Fix various bugs resulting in panics trying
   to boot a Debian Linux kernel on raspi3

# gpg: Signature made Mon 19 Mar 2018 18:30:33 GMT
# gpg:                using RSA key 3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>"
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE

* remotes/pmaydell/tags/pull-target-arm-20180319:
  hw/arm/raspi: Provide spin-loop code for AArch64 CPUs
  hw/arm/bcm2836: Hardcode correct CPU type
  hw/arm/bcm2836: Use correct affinity values for BCM2837
  hw/arm/bcm2836: Create proper bcm2837 device
  hw/arm/bcm2836: Rename bcm2836 type/struct to bcm283x
  hw/arm/bcm2386: Fix parent type of bcm2386
  hw/arm/boot: If booting a kernel in EL2, set SCR_EL3.HCE
  hw/arm/boot: assert that secure_boot and secure_board_setup are false for AArch64
  hw/arm/raspi: Don't do board-setup or secure-boot for raspi3
  char: i.MX: Add support for "TX complete" interrupt
  char: i.MX: Simplify imx_update()
  dump: Update correct kdump phys_base field for AArch64
  fsl-imx6: Swap Ethernet interrupt defines

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

hw/arm/raspi: Provide spin-loop code for AArch64 CPUs

The raspi3 has AArch64 CPUs, which means that our smpboot
code for keeping the secondary CPUs in a pen needs to have
a version for A64 as well as A32. Without this, the
secondary CPUs go into an infinite loop of taking undefined
instruction exceptions.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180313153458.26822-10-peter.maydell@linaro.org

hw/arm/bcm2836: Hardcode correct CPU type

Now we have separate types for BCM2386 and BCM2387, we might as well
just hard-code the CPU type they use rather than having it passed
through as an object property. This then lets us put the initialization
of the CPU object in init rather than realize.

Note that this change means that it's no longer possible on
the command line to use -cpu to ask for a different kind of
CPU than the SoC supports. This was never a supported thing to
do anyway; we were just not sanity-checking the command line.

This does require us to only build the bcm2837 object on
TARGET_AARCH64 configs, since otherwise it won't instantiate
due to the missing cortex-a53 device and "make check" will fail.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180313153458.26822-9-peter.maydell@linaro.org

hw/arm/bcm2836: Use correct affinity values for BCM2837

The BCM2837 sets the Aff1 field of the MPIDR affinity values for the
CPUs to 0, whereas the BCM2836 uses 0xf. Set this correctly, as it
is required for Linux to boot.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180313153458.26822-8-peter.maydell@linaro.org

hw/arm/bcm2836: Create proper bcm2837 device

The bcm2837 is pretty similar to the bcm2836, but it does have
some differences. Notably, the MPIDR affinity aff1 values it
sets for the CPUs are 0x0, rather than the 0xf that the bcm2836
uses, and if this is wrong Linux will not boot.

Rather than trying to have one device with properties that
configure it differently for the two cases, create two
separate QOM devices for the two SoCs. We use the same approach
as hw/arm/aspeed_soc.c and share code and have a data table
that might differ per-SoC. For the moment the two types don't
actually have different behaviour.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180313153458.26822-7-peter.maydell@linaro.org

hw/arm/bcm2836: Rename bcm2836 type/struct to bcm283x

Our BCM2836 type is really a generic one that can be any of
the bcm283x family. Rename it accordingly. We change only
the names which are visible via the header file to the
rest of the QEMU code, leaving private function names
in bcm2836.c as they are.

This is a preliminary to making bcm283x be an abstract
parent class to specific types for the bcm2836 and bcm2837.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180313153458.26822-6-peter.maydell@linaro.org

hw/arm/bcm2386: Fix parent type of bcm2386

The TypeInfo and state struct for bcm2386 disagree about what the
parent class is -- the TypeInfo says it's TYPE_SYS_BUS_DEVICE,
but the BCM2386State struct only defines the parent_obj field
as DeviceState. This would have caused problems if anything
actually tried to treat the object as a TYPE_SYS_BUS_DEVICE.
Fix the TypeInfo to use TYPE_DEVICE as the parent, since we don't
need any of the additional functionality TYPE_SYS_BUS_DEVICE
provides.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180313153458.26822-5-peter.maydell@linaro.org

hw/arm/boot: If booting a kernel in EL2, set SCR_EL3.HCE

If we're directly booting a Linux kernel and the CPU supports both
EL3 and EL2, we start the kernel in EL2, as it expects. We must also
set the SCR_EL3.HCE bit in this situation, so that the HVC
instruction is enabled rather than UNDEFing. Otherwise at least some
kernels will panic when trying to initialize KVM in the guest.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20180313153458.26822-4-peter.maydell@linaro.org

hw/arm/boot: assert that secure_boot and secure_board_setup are false for AArch64

Add some assertions that if we're about to boot an AArch64 kernel,
the board code has not mistakenly set either secure_boot or
secure_board_setup. It doesn't make sense to set secure_boot,
because all AArch64 kernels must be booted in non-secure mode.

It might in theory make sense to set secure_board_setup, but
we don't currently support that, because only the AArch32
bootloader[] code calls this hook; bootloader_aarch64[] does not.
Since we don't have a current need for this functionality, just
assert that we don't try to use it. If it's needed we'll add
it later.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180313153458.26822-3-peter.maydell@linaro.org

hw/arm/raspi: Don't do board-setup or secure-boot for raspi3

For the rpi1 and 2 we want to boot the Linux kernel via some
custom setup code that makes sure that the SMC instruction
acts as a no-op, because it's used for cache maintenance.
The rpi3 boots AArch64 kernels, which don't need SMC for
cache maintenance and always expect to be booted non-secure.
Don't fill in the aarch32-specific parts of the binfo struct.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180313153458.26822-2-peter.maydell@linaro.org

char: i.MX: Add support for "TX complete" interrupt

Add support for "TX complete"/TXDC interrupt generate by real HW since
it is needed to support guests other than Linux.

Based on the patch by Bill Paul as found here:
https://bugs.launchpad.net/qemu/+bug/1753314

Cc: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Cc: Bill Paul <wpaul@windriver.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Bill Paul <wpaul@windriver.com>
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Message-id: 20180315191141.6789-2-andrew.smirnov@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

char: i.MX: Simplify imx_update()

Code of imx_update() is slightly confusing since the "flags" variable
doesn't really corespond to anything in real hardware and server as a
kitchensink accumulating events normally reported via USR1 and USR2
registers.

Change the code to explicitly evaluate state of interrupts reported
via USR1 and USR2 against corresponding masking bits and use the to
detemine if IRQ line should be asserted or not.

NOTE: Check for UTS1_TXEMPTY being set has been dropped for two
reasons:

    1. Emulation code implements a single character FIFO, so this flag
       will always be set since characters are trasmitted as a part of
       the code emulating "push" into the FIFO

    2. imx_update() is really just a function doing ORing and maksing
       of reported events, so checking for UTS1_TXEMPTY should happen,
       if it's ever really needed should probably happen outside of
       it.

Cc: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Cc: Bill Paul <wpaul@windriver.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Message-id: 20180315191141.6789-1-andrew.smirnov@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

dump: Update correct kdump phys_base field for AArch64

For guest kernel that supports KASLR, the load address can change every
time when guest VM runs. To find the physical base address correctly,
current QEMU dump searches VMCOREINFO for the string "NUMBER(phys_base)=".
However this string pattern is only available on x86_64. AArch64 uses a
different field, called "NUMBER(PHYS_OFFSET)=". This patch makes sure
QEMU dump uses the correct string on AArch64.

Signed-off-by: Wei Huang <wei@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 1520615003-20869-1-git-send-email-wei@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

fsl-imx6: Swap Ethernet interrupt defines

The sabrelite machine model used by qemu-system-arm is based on the
Freescale/NXP i.MX6Q processor. This SoC has an on-board ethernet
controller which is supported in QEMU using the imx_fec.c module
(actually called imx.enet for this model.)

The include/hw/arm/fsm-imx6.h file defines the interrupt vectors for the
imx.enet device like this:

#define FSL_IMX6_ENET_MAC_1588_IRQ 118
#define FSL_IMX6_ENET_MAC_IRQ 119

According to https://www.nxp.com/docs/en/reference-manual/IMX6DQRM.pdf,
page 225, in Table 3-1. ARM Cortex A9 domain interrupt summary,
interrupts are as follows.

150 ENET MAC 0 IRQ
151 ENET MAC 0 1588 Timer interrupt

where

150 - 32 == 118
151 - 32 == 119

In other words, the vector definitions in the fsl-imx6.h file are reversed.

Fixing the interrupts alone causes problems with older Linux kernels:
The Ethernet interface will fail to probe with Linux v4.9 and earlier.
Linux v4.1 and earlier will crash due to a bug in Ethernet driver probe
error handling. This is a Linux kernel problem, not a qemu problem:
the Linux kernel only worked by accident since it requested both interrupts.

For backward compatibility, generate the Ethernet interrupt on both interrupt
lines. This was shown to work from all Linux kernel releases starting with
v3.16.

Link: https://bugs.launchpad.net/qemu/+bug/1753309
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Message-id: 1520723090-22130-1-git-send-email-linux@roeck-us.net
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

cpu: drop unnecessary NULL check and cpu_common_class_by_name()

both do nothing as for the first all callers
parse_cpu_model() and qmp_query_cpu_model_()
should provide non NULL value, so just abort if it's not so.

While at it drop cpu_common_class_by_name() which is not need
any more as every target has CPUClass::class_by_name callback
by now, though abort in case a new arch will forget to define one.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1518013857-4372-1-git-send-email-imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>