git.osdn.net Git - qmiga/qemu.git/log

vl.c: fix memory leak spotted by valgrind

valgrind complains about:
==42062== 16 bytes in 1 blocks are definitely lost in loss record 387 of 1,048
==42062==    at 0x402DCB2: malloc (vg_replace_malloc.c:299)
==42062==    by 0x40C1BE3: g_malloc (in /usr/lib64/libglib-2.0.so.0.3800.2)
==42062==    by 0x40DA133: g_slice_alloc (in /usr/lib64/libglib-2.0.so.0.3800.2)
==42062==    by 0x40DB2E5: g_slist_prepend (in /usr/lib64/libglib-2.0.so.0.3800.2)
==42062==    by 0x801637FF: object_class_get_list_tramp (object.c:690)
==42062==    by 0x40A96C9: g_hash_table_foreach (in /usr/lib64/libglib-2.0.so.0.3800.2)
==42062==    by 0x80164885: object_class_foreach (object.c:665)
==42062==    by 0x80164975: object_class_get_list (object.c:698)
==42062==    by 0x800100A5: machine_parse (vl.c:2447)
==42062==    by 0x800100A5: main (vl.c:3756)

Lets free machines in case of mc.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

aes: remove a dead return statement

bits is checked to be 128, 192 or 256 at the beginning of the function.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

qemu-sockets: improve error reporting in unix_listen_opts

Coverity complains about not checking the returned value of mkstemp. While
at it, also improve error checking for snprintf, and refine error messages
in general.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

cpu-exec: simplify icount code

Use MIN instead of an "if" statement. Move "tb" assignment where
the value is actually used.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

cpu-exec: drop dead assignment

All uses of TB inside cpu_exec are dominated by "tb = tb_find_fast(env)",
and there are no uses after the switch statement. So the assignment
is dead, as reported by Coverity.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

qemu-log: Correct help text of 'log cpu_reset'

The logging of the CPU state during reset is done for all architectures
nowadays (see cpu_common_reset() in qom/cpu.c), so the "x86 only" text
does not apply here anymore.

Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

linux-user/syscall.c: do_ioctl_dm: Need to call unlock_user() before going to failure return in default case

In abi_long do_ioctl_dm(), after lock_user() call, the code does
not call unlock_user() before going to failure return in default case.

Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

linux-user/main.c: Use TARGET_SIG* instead of SIG*

In main.c, all SIG* should be TARGET_SIG*, since the relevant functions
(queue_signal() and gdb_handlesig()) expect TARGET_SIG*.

The corresponding vi command is "1,$ s/\<SIG/TARGET_SIG/g".

Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

linux-user/syscall.c: Fix typo issue for using target_vec[i].iov_len instead of target_vec[i].iov_base

It is only a typo issue, need use tswapal(target_vec[i].iov_len) for the
len.

Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

linux-user/syscall.c: lock_iovec: unlock vec[i] in failure processing code block

When failure occurs during locking of vec[i], we also need to unlock all
already locked vec[i] in failure processing code block before return.

Code in unlock_user() checks vec[i].iov_base for NULL, so there's no
need not check it .

If error is EFAULT when "i == 0", vec[i].iov_base is NULL, we can just
skip it, so can still use "while (--i >= 0)" loop condition.

Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

virtfs-proxy-helper: Fix possible socket leak.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

vl: Fix bogus error message for implied mon ID clashing

monitor_parse() desugars --monitor, --qmp and -qmp-pretty to --mon.
The ID it picks can clash with a user-specified ID.  When it happens,
the error message is misleading.

Reproducer:

    $ qemu --mon id=compat_monitor0 --monitor stdio

Message before the patch:

    duplicate chardev: compat_monitor0

There's no "duplicate chardev" here.  The problem is a duplicate
monitor ID.  Moreover, the message provides no clue which option
caused the problem.  The patch changes the message to:

    qemu: --monitor stdio: Duplicate ID 'compat_monitor0' for mon

monitor_parse() is also used for creating a default monitor, but
that's not done when the user specifies a monitor, so an ID clash is
impossible then.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

Convert some debugging printfs to trace calls in pcnet.c.

Signed-off-by: Don Koch <dkoch@verizon.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

Add/convert trace calls in pcnet-pci.c.

Add trace calls. Convert some #ifdef DEBUG printfs to trace.

Signed-off-by: Don Koch <dkoch@verizon.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

Add trace to ps2.c.

Signed-off-by: Don Koch <dkoch@verizon.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

Add tracing to xenfb.

Signed-off-by: Don Koch <dkoch@verizon.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

fw_cfg: fix typos in comments: patch -> path

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

target-mips: Clean up switch fall through after commit fecd264

Commit fecd264 added a number of fall-throughs, but neglected to
properly document them as intentional. Commit d922445 cleaned that up
for many, but not all cases. Take care of the remaining ones.

Spotted by Coverity.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

qmp: unbreak build for non-vnc configuration

Signed-off-by: Leon Yu <chianglungyu@gmail.com>
Message-id: 1422853731-5282-1-git-send-email-chianglungyu@gmail.com
Fixes: df887684603a ("monitor: add query-vnc-servers command")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block patches for 2.3

# gpg: Signature made Fri 06 Feb 2015 17:14:10 GMT using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"

* remotes/kevin/tags/for-upstream: (47 commits)
  block/raw-posix.c: Fix raw_getlength() on Mac OS X block devices
  block: Eliminate silly QERR_ macros used for encryption keys
  block: New bdrv_add_key(), convert monitor to use it
  blockdev: Eliminate silly QERR_BLOCK_JOB_NOT_ACTIVE macro
  blockdev: Give find_block_job() an Error ** parameter
  qcow2: Rewrite qcow2_alloc_bytes()
  block: Give always priority to unused entries in the qcow2 L2 cache
  nbd: fix max_discard/max_transfer_length
  block: introduce BDRV_REQUEST_MAX_SECTORS
  nbd: Improve error messages
  iotests: Fix 104 for NBD
  iotests: Fix 100 for nbd
  iotests: Fix 083
  block: fix off-by-one error in qcow and qcow2
  qemu-iotests: add 116 invalid QED input file tests
  qed: check for header size overflow
  block/dmg: improve zeroes handling
  block/dmg: support bzip2 block entry types
  block/dmg: factor out block type check
  block/dmg: use SectorNumber from BLKX header
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

block/raw-posix.c: Fix raw_getlength() on Mac OS X block devices

This patch replaces the dummy code in raw_getlength() for block devices
on OS X, which always returned LLONG_MAX, with a real implementation
that returns the actual block device size.

Signed-off-by: John Arbuckle <programmingkidx@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

Merge remote-tracking branch 'mreitz/block' into queue-block

* mreitz/block:
  block: Eliminate silly QERR_ macros used for encryption keys
  block: New bdrv_add_key(), convert monitor to use it
  blockdev: Eliminate silly QERR_BLOCK_JOB_NOT_ACTIVE macro
  blockdev: Give find_block_job() an Error ** parameter

block: Eliminate silly QERR_ macros used for encryption keys

The QERR_ macros are leftovers from the days of "rich" error objects.
They're used with error_set() and qerror_report(), and expand into the
first *two* arguments. This trickiness has become pointless. Clean
up QERR_DEVICE_ENCRYPTED and QERR_DEVICE_NOT_ENCRYPTED.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1422524221-8566-5-git-send-email-armbru@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>

block: New bdrv_add_key(), convert monitor to use it

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1422524221-8566-4-git-send-email-armbru@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>

blockdev: Eliminate silly QERR_BLOCK_JOB_NOT_ACTIVE macro

The QERR_ macros are leftovers from the days of "rich" error objects.
They're used with error_set() and qerror_report(), and expand into the
first *two* arguments. This trickiness has become pointless. Clean
this one up.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1422524221-8566-3-git-send-email-armbru@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>

blockdev: Give find_block_job() an Error ** parameter

When find_block_job() fails, all its callers build the same Error
object. Build it in find_block_job() instead.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1422524221-8566-2-git-send-email-armbru@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>

qcow2: Rewrite qcow2_alloc_bytes()

qcow2_alloc_bytes() is a function with insufficient error handling and
an unnecessary goto. This patch rewrites it.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: Give always priority to unused entries in the qcow2 L2 cache

The current algorithm to replace entries from the L2 cache gives
priority to newer hits by dividing the hit count of all existing
entries by two everytime there is a cache miss.

However, if there are several cache misses the hit count of the
existing entries can easily go down to 0. This will result in those
entries being replaced even when there are others that have never been
used.

This problem is more noticeable with larger disk images and cache
sizes, since the chances of having several misses before the cache is
full are higher.

If we make sure that the hit count can never go down to 0 again,
unused entries will always have priority.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

nbd: fix max_discard/max_transfer_length

nbd_co_discard calls nbd_client_session_co_discard which uses uint32_t
as the length in bytes of the data to discard due to the following
definition:

struct nbd_request {
    uint32_t magic;
    uint32_t type;
    uint64_t handle;
    uint64_t from;
    uint32_t len; <-- the length of data to be discarded, in bytes
} QEMU_PACKED;

Thus we should limit bl_max_discard to UINT32_MAX >> BDRV_SECTOR_BITS to
avoid overflow.

NBD read/write code uses the same structure for transfers. Fix
max_transfer_length accordingly.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Peter Lieven <pl@kamp.de>
CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: introduce BDRV_REQUEST_MAX_SECTORS

we check and adjust request sizes at several places with
sometimes inconsistent checks or default values:
INT_MAX
INT_MAX >> BDRV_SECTOR_BITS
UINT_MAX >> BDRV_SECTOR_BITS
SIZE_MAX >> BDRV_SECTOR_BITS

This patches introdocues a macro for the maximal allowed sectors
per request and uses it at several places.

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

nbd: Improve error messages

This patch makes use of the Error object for nbd_receive_negotiate() so
that errors during negotiation look nicer.

Furthermore, this patch adds an additional error message if the received
magic was wrong, but would be correct for the other protocol version,
respectively: So if an export name was specified, but the NBD server
magic corresponds to an old handshake, this condition is explicitly
signaled to the user, and vice versa.

As these messages are now part of the "Could not open image" error
message, additional filtering has to be employed in iotest 083, which
this patch does as well.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

iotests: Fix 104 for NBD

_make_test_img sets up an NBD server, _cleanup_test_img shuts it down;
thus, _cleanup_test_img has to be called before _make_test_img is
invoked another time.

Furthermore, the pipe through _filter_test_img was unnecessary;
_make_test_img already takes care of that.

And finally, a filter is added to _filter_img_info to replace
"nbd://127.0.0.1:10810" by "TEST_DIR/t.IMGFMT", since the former is the
way to express the full image path (normally the latter) for NBD tests.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

iotests: Fix 100 for nbd

In case of NBD, _make_test_img starts a new NBD server. Therefore,
_cleanup_test_img (which shuts that server down) has to be invoked
before the next _make_test_img call in order to make 100 work for NBD.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

iotests: Fix 083

As of 8f9e835fd2e687d2bfe936819c3494af4343614d, probing should be
disabled in the qemu-iotests (at least when using qemu-io). This broke
083's reference output (which consisted mostly of "Could not read image
for determining its format").

This patch fixes it.

Note that one case which failed before is now successful: Disconnect
after data. This is due to qemu having read twice before (once for
probing, once for the qemu-io read command), but only once now (the
qemu-io read command). Therefore, reading is successful (which is
correct).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: fix off-by-one error in qcow and qcow2

This fixes an off-by-one error introduced in 9a29e18. Both qcow and
qcow2 need to make sure to leave room for string terminator '\0' for
the backing file, so the max length of the non-terminated string is
either 1023 or PATH_MAX - 1.

Reported-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qemu-iotests: add 116 invalid QED input file tests

These tests exercise error code paths in the QED image format. The
tests are very simple, they just prove that the error path exits
cleanly.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1421065893-18875-3-git-send-email-stefanha@redhat.com
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qed: check for header size overflow

Header size is denoted in clusters. The maximum cluster size is 64 MB
but there is no limit on header size. Check for uint32_t overflow in
case the header size field has a whacky value.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1421065893-18875-2-git-send-email-stefanha@redhat.com
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block/dmg: improve zeroes handling

Disk images may contain large all-zeroes gaps (1.66k sectors or 812 MiB
is seen in the real world). These blocks (type 2) do not need to be
extracted into a temporary buffer, there is no need to allocate memory
for these blocks nor to check its length.

(For the test image, the maximum uncompressed size is 1054371 bytes,
probably for a bzip2-compressed block.)

Signed-off-by: Peter Wu <peter@lekensteyn.nl>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1420566495-13284-13-git-send-email-peter@lekensteyn.nl
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block/dmg: support bzip2 block entry types

This patch adds support for bzip2-compressed block entries as introduced
with OS X 10.4 (source: https://en.wikipedia.org/wiki/Apple_Disk_Image).

It was tested against a 5.2G "OS X Yosemite" installation image which
stores the BLXX block in the XML property list (instead of resource
forks) and has over 5k chunks.

New configure entries are added (--enable-bzip2 / --disable-bzip2) to
control inclusion of bzip2 functionality (which requires linking against
libbz2). The help message suggests that this option is needed for DMG
files, but the tests are generic enough that other parts of QEMU can use
bzip2 if needed.

The identifiers are based on http://newosxbook.com/DMG.html.

The decompression routines are based on the zlib case, but as there is
no way to reset the decompression state (unlike zlib), memory is
allocated and deallocated for every decompression. This should not be
problematic as the decompression takes most of the time and as blocks
are typically about/over 1 MiB in size, only one allocation is done
every 2000 sectors.

Signed-off-by: Peter Wu <peter@lekensteyn.nl>
Reviewed-by: John Snow <jsnow@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1420566495-13284-12-git-send-email-peter@lekensteyn.nl
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block/dmg: factor out block type check

In preparation for adding bzip2 support, split the type check into a
separate function. Make all offsets relative to the begin of a chunk
such that it is easier to recognize the position without having to
add up all offsets. Some comments are added to describe the fields.

There is no functional change.

Signed-off-by: Peter Wu <peter@lekensteyn.nl>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1420566495-13284-11-git-send-email-peter@lekensteyn.nl
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block/dmg: use SectorNumber from BLKX header

Previously the sector table parsing relied on the previous offset of
the DMG file. Now it uses the sector number from the BLKX header
(see http://newosxbook.com/DMG.html).

The implementation of dmg2img (from vu1tur) does not base the output
sector on the location of the terminator (0xffffffff) either so it
should be safe to drop this dependency on the previous state.

(It makes somehow makes sense, a terminator should halt further
processing of a block and is perhaps used to preallocate some space.)

Signed-off-by: Peter Wu <peter@lekensteyn.nl>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1420566495-13284-10-git-send-email-peter@lekensteyn.nl
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block/dmg: fix sector data offset calculation

This patch addresses two issues:

- The data fork offset was not taken into account, resulting in failure
   to read an InstallESD.dmg file (5164763151 bytes) which had a
   non-zero DataForkOffset field.
- The offset of the previous block ("partition") was unconditionally
   added to the current block because older files would start the input
   offset of a new block at zero. Newer files (including vlc-2.1.5.dmg,
   tuxpaint-0.9.15-macosx.dmg and OS X Yosemite [MAS].dmg) failed in
   reads because these files have chunk offsets, relative to the begin
   of a data fork.

Now the data offset of the mish is taken into account. While we could
check that the data_offset is within the data fork, let's not do that
here as it would only result in parse failures on invalid files (rather
than gracefully handling such bad files). dmg_read will error out if
the offset is incorrect.

Signed-off-by: Peter Wu <peter@lekensteyn.nl>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1420566495-13284-9-git-send-email-peter@lekensteyn.nl
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block/dmg: set virtual size to a non-zero value

Right now the virtual size is always reported as zero which makes it
impossible to convert between formats.

After this patch, the number of sectors will be read from the trailer
("koly" block).

To verify the behavior, the output of `dmg2img foo.dmg foo.img` was
compared against `qemu-img convert -f dmg -O raw foo.dmg foo.raw`. The
tests showed that the file contents are exactly the same, except that
QEMU creates a slightly larger file (it matches the total sectors
count).

Signed-off-by: Peter Wu <peter@lekensteyn.nl>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1420566495-13284-8-git-send-email-peter@lekensteyn.nl
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block/dmg: process XML plists

The format is simple enough to avoid using a full-blown XML parser. It
assumes that all BLKX items begin with the "mish" magic word, therefore
it is not a problem if other values get matched which are not a BLKX
block.

The offsets are based on the description at
http://newosxbook.com/DMG.html

For compatibility with glib 2.12, use g_base64_decode (which
additionally requires an extra buffer allocation) instead of
g_base64_decode_inplace (which is only available since glib 2.20).

Signed-off-by: Peter Wu <peter@lekensteyn.nl>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1420566495-13284-7-git-send-email-peter@lekensteyn.nl
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block/dmg: validate chunk size to avoid overflow

Previously the chunk size was not checked, allowing for a large memory
allocation. This patch checks whether the chunks size is within the
resource fork length, and whether the resource fork is below the
trailer of the dmg file.

Signed-off-by: Peter Wu <peter@lekensteyn.nl>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1420566495-13284-6-git-send-email-peter@lekensteyn.nl
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block/dmg: process a buffer instead of reading ints

As the decoded plist XML is not a pointer in the file,
dmg_read_mish_block must be able to process a buffer instead of a file
pointer. Since the full buffer must be processed, let's change the
return value again to just a success flag.

Signed-off-by: Peter Wu <peter@lekensteyn.nl>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1420566495-13284-5-git-send-email-peter@lekensteyn.nl
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block/dmg: extract processing of resource forks

Besides the offset, also read the resource length. This length is now
used in the extracted function to verify the end of the resource fork
against "count" from the resource fork.

Instead of relying on the value of offset to conclude whether the
resource fork is available or not (info_begin==0), check the
rsrc_fork_length instead. This would allow a dmg file to begin with a
resource fork. This seemingly unnecessary restriction was found while
trying to craft a DMG file by hand.

Other changes:

- Do not require resource data offset to be 0x100 (but check that it
   is within bounds though).
- Further improve boundary checking (resource data must be within
   the resource fork).
- Use correct value for resource data length (spotted by John Snow)
- Consider the resource data offset when determining info_end.
   This fixes an EINVAL on the tuxpaint dmg example.

The resource fork format is documented at
https://developer.apple.com/legacy/library/documentation/mac/pdf/MoreMacintoshToolbox.pdf#page=151

Signed-off-by: Peter Wu <peter@lekensteyn.nl>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1420566495-13284-4-git-send-email-peter@lekensteyn.nl
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block/dmg: extract mish block decoding functionality

Extract the mish block decoder such that this can be used for other
formats in the future. A new DmgHeaderState struct is introduced to
share state while decoding.

The code is kept unchanged as much as possible, a "fail" label is added
for example where a simple return would probably do. In dmg_open, the
variable "tmp" is renamed to "rsrc_data_offset" for clarity and comments
have been added explaining various data.

Note that this patch has one subtle difference with the previous
version which should not affect functionality. In the previous code,
the end of a resource was inferred from the mish block (the offsets
would be increased by the fields). In this patch, the resource length
is used instead to avoid the need to rely on the previous offsets.

Signed-off-by: Peter Wu <peter@lekensteyn.nl>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1420566495-13284-3-git-send-email-peter@lekensteyn.nl
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block/dmg: properly detect the UDIF trailer

DMG files have a variable length with a UDIF trailer at the end of a
file. This UDIF trailer is essential as it describes the contents of
the image. At the moment however, the start of this trailer is almost
always incorrect as bdrv_getlength() returns a multiple of the block
size (rounded up). This results in a failure to recognize DMG files,
resulting in Invalid argument (EINVAL) errors.

As there is no API to retrieve the real file size, look for the magic
header in the last two sectors to find the start of this 512-byte UDIF
trailer (the "koly" block).

The resource fork offset ("info_begin") has its offset adjusted as the
initial value of offset does not mean "end of file" anymore, but "begin
of UDIF trailer".

[Replaced error_set(errp, ERROR_CLASS_GENERIC_ERROR, ...) with
error_setg(errp, ...) as discussed with Peter.
--Stefan]

Signed-off-by: Peter Wu <peter@lekensteyn.nl>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1420566495-13284-2-git-send-email-peter@lekensteyn.nl
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: add event when disk usage exceeds threshold

Managing applications, like oVirt (http://www.ovirt.org), make extensive
use of thin-provisioned disk images.
To let the guest run smoothly and be not unnecessarily paused, oVirt sets
a disk usage threshold (so called 'high water mark') based on the occupation
of the device,  and automatically extends the image once the threshold
is reached or exceeded.

In order to detect the crossing of the threshold, oVirt has no choice but
aggressively polling the QEMU monitor using the query-blockstats command.
This lead to unnecessary system load, and is made even worse under scale:
deployments with hundreds of VMs are no longer rare.

To fix this, this patch adds:
* A new monitor command `block-set-write-threshold', to set a mark for
  a given block device.
* A new event `BLOCK_WRITE_THRESHOLD', to report if a block device
  usage exceeds the threshold.
* A new `write_threshold' field into the `BlockDeviceInfo' structure,
  to report the configured threshold.

This will allow the managing application to use smarter and more
efficient monitoring, greatly reducing the need of polling.

[Updated qemu-iotests 067 output to add the new 'write_threshold'
property. --Stefan]
[Changed g_assert_false() to !g_assert() to fix the build on older glib
versions. --Kevin]

Signed-off-by: Francesco Romani <fromani@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1421068273-692-1-git-send-email-fromani@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

iotests: Specify format for qemu-nbd

This patch is necessary to suppress the "probed raw" warning when
running raw over nbd tests.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qemu-iotests: Fix supported_oses check

There is a bug in the recently added sys.platform test, and we no longer
run python tests, because "linux2" is the value to compare here. So do a
prefix match. According to python doc [1], the way to use sys.platform
is "unless you want to test for a specific system version, it is
therefore recommended to use the following idiom":

if sys.platform.startswith('freebsd'):
# FreeBSD-specific code here...
elif sys.platform.startswith('linux'):
# Linux-specific code here...

[1]: https://docs.python.org/2.7/library/sys.html#sys.platform

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

virtio-blk: add a knob to disable request merging

this adds a knob to disable request merging for debugging or benchmarks if dedired.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

virtio-blk: introduce multiread

this patch finally introduces multiread support to virtio-blk. While
multiwrite support was there for a long time, read support was missing.

The complete merge logic is moved into virtio-blk.c which has
been the only user of request merging ever since. This is required
to be able to merge chunks of requests and immediately invoke callbacks
for those requests. Secondly, this is required to switch to
direct invocation of coroutines which is planned at a later stage.

The following benchmarks show the performance of running fio with
4 worker threads on a local ram disk. The numbers show the average
of 10 test runs after 1 run as warmup phase.

              |        4k        |       64k        |        4k
MB/s          | rd seq | rd rand | rd seq | rd rand | wr seq | wr rand
--------------+--------+---------+--------+---------+--------+--------
master        | 1221   | 1187    | 4178   | 4114    | 1745   | 1213
multiread     | 1829   | 1189    | 4639   | 4110    | 1894   | 1216

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block-backend: expose bs->bl.max_transfer_length

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

hw/virtio-blk: add a constant for max number of merged requests

As it was not obvious (at least for me) where the 32 comes from;
add a constant for it.

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: add accounting for merged requests

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qed: Really remove unused field QEDAIOCB.finished

The commit 533ffb17a that removed qed_aiocb_info.cancel said to remove
this but didn't do it.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qemu-img: Add QEMU_PKGVERSION to QEMU_IMG_VERSION

This is the same way vl.c handles this.

Signed-off-by: Don Slutz <dslutz@verizon.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: change default for discard and write zeroes to INT_MAX

do not trim requests if the driver does not supply a limit
through BlockLimits. For write zeroes we still keep a limit
for the unsupported path to avoid allocating a big bounce buffer.

Suggested-by: Kevin Wolf <kwolf@redhat.com>
Suggested-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: use fallocate(FALLOC_FL_PUNCH_HOLE) & fallocate(0) to write zeroes

This sequence works efficiently if FALLOC_FL_ZERO_RANGE is not supported.
Unfortunately, FALLOC_FL_ZERO_RANGE is supported on really modern systems
and only for a couple of filesystems. FALLOC_FL_PUNCH_HOLE is much more
mature.

The sequence of 2 operations FALLOC_FL_PUNCH_HOLE and 0 is necessary due
to the following reasons:
- FALLOC_FL_PUNCH_HOLE creates a hole in the file, the file becomes
  sparse. In order to retain original functionality we must allocate
  disk space afterwards. This is done using fallocate(0) call
- fallocate(0) without preceeding FALLOC_FL_PUNCH_HOLE will do nothing
  if called above already allocated areas of the file, i.e. the content
  will not be zeroed

This should increase the performance a bit for not-so-modern kernels.

CC: Max Reitz <mreitz@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Peter Lieven <pl@kamp.de>
CC: Fam Zheng <famz@redhat.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block/raw-posix: call plain fallocate in handle_aiocb_write_zeroes

There is a possibility that we are extending our image and thus writing
zeroes beyond the end of the file. In this case we do not need to care
about the hole to make sure that there is no data in the file under
this offset (pre-condition to fallocate(0) to work). We could simply call
fallocate(0).

This improves the performance of writing zeroes even on really old
platforms which do not have even FALLOC_FL_PUNCH_HOLE.

Before the patch do_fallocate was used when either
CONFIG_FALLOCATE_PUNCH_HOLE or CONFIG_FALLOCATE_ZERO_RANGE are defined.
Now the story is different. CONFIG_FALLOCATE is defined when Linux
fallocate is defined, posix_fallocate is completely different story
(CONFIG_POSIX_FALLOCATE). CONFIG_FALLOCATE is mandatory prerequite
for both CONFIG_FALLOCATE_PUNCH_HOLE and CONFIG_FALLOCATE_ZERO_RANGE
thus we are on the safe side.

CC: Max Reitz <mreitz@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Peter Lieven <pl@kamp.de>
CC: Fam Zheng <famz@redhat.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: use fallocate(FALLOC_FL_ZERO_RANGE) in handle_aiocb_write_zeroes

This efficiently writes zeroes on Linux if the kernel is capable enough.
FALLOC_FL_ZERO_RANGE correctly handles all cases, including and not
including file expansion.

CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Peter Lieven <pl@kamp.de>
CC: Fam Zheng <famz@redhat.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block/raw-posix: refactor handle_aiocb_write_zeroes a bit

move code dealing with a block device to a separate function. This will
allow to implement additional processing for ordinary files.

Please note, that xfs_code has been moved before checking for
s->has_write_zeroes as xfs_write_zeroes does not touch this flag inside.
This makes code a bit more consistent.

CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Peter Lieven <pl@kamp.de>
CC: Fam Zheng <famz@redhat.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block/raw-posix: create do_fallocate helper

The pattern
    do {
        if (fallocate(s->fd, mode, offset, len) == 0) {
            return 0;
        }
    } while (errno == EINTR);
    ret = translate_err(-errno);
will be commonly useful in next patches. Create helper for it.

CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Peter Lieven <pl@kamp.de>
CC: Fam Zheng <famz@redhat.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block/raw-posix: create translate_err helper to merge errno values

actually the code
    if (ret == -ENODEV || ret == -ENOSYS || ret == -EOPNOTSUPP ||
        ret == -ENOTTY) {
        ret = -ENOTSUP;
    }
is present twice and will be added a couple more times. Create helper
for this.

CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Peter Lieven <pl@kamp.de>
CC: Fam Zheng <famz@redhat.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

atapi migration: Throw recoverable error to avoid recovery

(With the previous atapi_dma flag recovery)
If migration happens between the ATAPI command being written and the
bmdma being started, the DMA is dropped. Eventually the guest times
out and recovers, but that can take many seconds.
(This is rare, on a pingpong reading the CD continuously I hit
this about ~1/30-1/50 migrates)

I don't think we've got enough state to be able to recover safely
at this point, so I throw a 'medium error, no seek complete'
that I'm assuming guests will try and recover from an apparently
dirty CD.

OK, it's a hack, the real solution is probably to push a lot of
ATAPI state into the migration stream, but this is a fix that
works with no stream changes. Tested only on Linux (both RHEL5
(pre-libata) and RHEL7).

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

Restore atapi_dma flag across migration

If a migration happens just after the guest has kicked
off an ATAPI command and kicked off DMA, we lose the atapi_dma
flag, and the destination tries to complete the command as PIO
rather than DMA. This upsets Linux; modern libata based kernels
stumble and recover OK, older kernels end up passing bad data
to userspace.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

softfloat: expand out STATUS macro

Expand out and remove the STATUS macro.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>

softfloat: expand out STATUS_VAR

Expand out and remove the STATUS_VAR macro.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>

softfloat: Expand out the STATUS_PARAM macro

Expand out STATUS_PARAM wherever it is used and delete the definition.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>

Merge remote-tracking branch 'remotes/stefanha/tags/net-pull-request' into staging

# gpg: Signature made Fri 06 Feb 2015 14:10:40 GMT using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/net-pull-request:
  monitor: more accurate completion for host_net_remove()
  net: del hub port when peer is deleted
  net: remove the wrong comment in net_init_hubport()
  monitor: print hub port name during info network
  rtl8139: simplify timer logic
  MAINTAINERS: add Jason Wang as net subsystem maintainer

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

monitor: more accurate completion for host_net_remove()

Current completion for host_net_remove will show hub ports and clients
that were not peered with hub ports. Fix this.

Cc: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-id: 1422860798-17495-4-git-send-email-jasowang@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

net: del hub port when peer is deleted

We should del hub port when peer is deleted since it will not be reused
and will only be freed during exit.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-id: 1422860798-17495-3-git-send-email-jasowang@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

net: remove the wrong comment in net_init_hubport()

Not only nic could be the one to peer.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-id: 1422860798-17495-2-git-send-email-jasowang@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

monitor: print hub port name during info network

Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-id: 1422860798-17495-1-git-send-email-jasowang@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

rtl8139: simplify timer logic

Pavel Dovgalyuk reports that TimerExpire and the timer are not restored
correctly on the receiving end of migration.

It is not clear to me whether this is really the case, but we can take
the occasion to get rid of the complicated code that computes PCSTimeout
on the fly upon changes to IntrStatus/IntrMask. Just always keep a
timer running, it will fire every ~130 seconds at most if the interrupt
is masked with TimerInt != 0.

This makes rtl8139_set_next_tctr_time idempotent (when the virtual clock
is stopped between two calls, as is the case during migration).

Tested with Frediano's qtest.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1421765099-26190-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging

# gpg: Signature made Fri 06 Feb 2015 13:45:06 GMT using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/tracing-pull-request:
trace: Print PID and time in stderr traces

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

trace: Print PID and time in stderr traces

When debugging migration it's useful to know the PID of
each trace message so you can figure out if it came from the source
or the destination.

Printing the time makes it easy to do latency measurements or timings
between trace points.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-id: 1421746875-9962-1-git-send-email-dgilbert@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

Merge remote-tracking branch 'remotes/juanquintela/tags/migration/20150205' into staging

migration/next for 20150205

# gpg: Signature made Thu 05 Feb 2015 16:17:08 GMT using RSA key ID 5872D723
# gpg: Can't check signature: public key not found

* remotes/juanquintela/tags/migration/20150205:
  fix mc146818rtc wrong subsection name to avoid vmstate_subsection_load() fail
  Tracify migration/rdma.c
  Add migration stream analyzation script
  migration: Append JSON description of migration stream
  qemu-file: Add fast ftell code path
  QJSON: Add JSON writer
  Print errors in some of the early migration failure cases.
  Migration: Add lots of trace events
  savevm: Convert fprintf to error_report
  vmstate-static-checker: update whitelist

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/armbru/tags/pull-cov-model-2015-02-05' into staging

coverity: Improve and extend model

# gpg: Signature made Thu 05 Feb 2015 16:20:49 GMT using RSA key ID EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"

* remotes/armbru/tags/pull-cov-model-2015-02-05:
  MAINTAINERS: Add myself as Coverity model maintainer
  coverity: Model g_free() isn't necessarily free()
  coverity: Model GLib string allocation partially
  coverity: Improve model for GLib memory allocation

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

fix mc146818rtc wrong subsection name to avoid vmstate_subsection_load() fail

fix mc146818rtc wrong subsection name to avoid vmstate_subsection_load() fail
during incoming migration or loadvm.

Signed-off-by: Zhang Haoyu <zhanghy@sangfor.com.cn>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>

MAINTAINERS: Add myself as Coverity model maintainer

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>

Tracify migration/rdma.c

Turn all the D/DD/DDDPRINTFs into trace events
Turn most of the fprintf(stderr, into error_report

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>

Add migration stream analyzation script

This patch adds a python tool to the scripts directory that can read
a dumped migration stream if it contains the JSON description of the
device states. I constructs a human readable JSON stream out of it.

It's very simple to use:

  $ qemu-system-x86_64
    (qemu) migrate "exec:cat > mig"
  $ ./scripts/analyze_migration.py -f mig

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>

migration: Append JSON description of migration stream

One of the annoyances of the current migration format is the fact that
it's not self-describing. In fact, it's not properly describing at all.
Some code randomly scattered throughout QEMU elaborates roughly how to
read and write a stream of bytes.

We discussed an idea during KVM Forum 2013 to add a JSON description of
the migration protocol itself to the migration stream. This patch
adds a section after the VM_END migration end marker that contains
description data on what the device sections of the stream are composed of.

This approach is backwards compatible with any QEMU version reading the
stream, because QEMU just stops reading after the VM_END marker and ignores
any data following it.

With an additional external program this allows us to decipher the
contents of any migration stream and hopefully make migration bugs easier
to track down.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>

qemu-file: Add fast ftell code path

For ftell we flush the output buffer to ensure that we don't have anything
lingering in our internal buffers. This is a very safe thing to do.

However, with the dynamic size measurement that the dynamic vmstate
description will bring this would turn out quite slow.

Instead, we can fast path this specific measurement and just take the
internal buffers into account when telling the kernel our position.

I'm sure I overlooked some corner cases where this doesn't work, so
instead of tuning the safe, existing version, this patch adds a fast
variant of ftell that gets used by the dynamic vmstate description code
which isn't critical when it fails.

Signed-off-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>

QJSON: Add JSON writer

To support programmatic JSON assembly while keeping the code that generates it
readable, this patch introduces a simple JSON writer. It emits JSON serially
into a buffer in memory.

The nice thing about this writer is its simplicity and low memory overhead.
Unlike the QMP JSON writer, this one does not need to spawn QObjects for every
element it wants to represent.

This is a prerequisite for the migration stream format description generator.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>

Print errors in some of the early migration failure cases.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>

Migration: Add lots of trace events

Mostly on the load side, so that when we get a complaint about
a migration failure we can figure out what it didn't like.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>

savevm: Convert fprintf to error_report

Convert a bunch of fprintfs to error_reports

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>

vmstate-static-checker: update whitelist

Commit 22382bb96c8bd88370c1ff0cb28c3ee6bee79ed3 renamed the
'hw_cursor_x' and 'hw_cursor_y' fields in cirrus_vga. Update the static
checker's whitelist to allow matching against the old and new names.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>

coverity: Model g_free() isn't necessarily free()

Memory allocated with GLib needs to be freed with GLib.  Freeing it
with free() instead of g_free() is a common error.  Harmless when
g_free() is a trivial wrapper around free(), which is commonly the
case.  But model the difference anyway.

In a local scan, this flags four ALLOC_FREE_MISMATCH.  Requires
--enable ALLOC_FREE_MISMATCH, because the checker is still preview.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>

coverity: Model GLib string allocation partially

Without a model, Coverity can't know that the result of g_strdup()
needs to be fed to g_free().

One way to get such a model is to scan GLib, build a derived model
file with cov-collect-models, and use that when scanning QEMU.
Unfortunately, the Coverity Scan service we use doesn't support that.

Thus, we're stuck with the other way: write a user model.  Doing that
for all of GLib is hardly practical.  I'm doing it for the "String
Utility Functions" we actually use that return dynamically allocated
strings.

In a local scan, this flags 20 additional RESOURCE_LEAKs.  The ones I
checked look genuine.

It also loses a NULL_RETURNS about ppce500_init() using
qemu_find_file() without error checking.  I don't understand why.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>

coverity: Improve model for GLib memory allocation

In current versions of GLib, g_new() may expand into g_malloc_n().
When it does, Coverity can't see the memory allocation, because we
don't model g_malloc_n().  Similarly for g_new0(), g_renew(),
g_try_new(), g_try_new0(), g_try_renew().

Model g_malloc_n(), g_malloc0_n(), g_realloc_n().  Model
g_try_malloc_n(), g_try_malloc0_n(), g_try_realloc_n() by adding
indeterminate out of memory conditions on top.

To avoid undue duplication, replace the existing models for g_malloc()
& friends by trivial wrappers around g_malloc_n() & friends.

In a local scan, this flags four additional RESOURCE_LEAKs and one
NULL_RETURNS.

The NULL_RETURNS is a false positive: Coverity can now see that
g_try_malloc(l1_sz * sizeof(uint64_t)) in
qcow2_check_metadata_overlap() may return NULL, but is too stupid to
recognize that a loop executing l1_sz times won't be entered then.

Three out of the four RESOURCE_LEAKs appear genuine.  The false
positive is in ppce500_prep_device_tree(): the pointer dies, but a
pointer to a struct member escapes, and we get the pointer back for
freeing with container_of().  Too funky for Coverity.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>

Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20150205' into staging

target-arm queue:
* refactor/clean up armv7m_init()
* some initial cleanup in the direction of supporting 64-bit EL3
* fix broken synchronization of registers between QEMU and KVM
   for 32-bit ARM hosts (which among other things broke memory
   access via gdbstub)
* fix flush-to-zero handling in FMULX, FRECPS, FRSQRTS and FRECPE
* don't crash QEMU for UNPREDICTABLE BFI insns in A32 encoding
* explain why virt board's device-to-transport mapping code is
   the way it is
* implement mmu_idx values which match the architectural
   distinctions, and introduce the concept of a translation
   regime to get_phys_addr() rather than incorrectly looking
   at the current CPU state
* update to upstream VIXL 1.7 (gives us correct code addresses
   when dissassembling pc-relative references)
* sync system register state between KVM and QEMU for 64-bit ARM
* support virtio on big-endian guests by implementing the
   "which endian is the guest now?" CPU method

# gpg: Signature made Thu 05 Feb 2015 14:02:16 GMT using RSA key ID 14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"

* remotes/pmaydell/tags/pull-target-arm-20150205: (28 commits)
  target-arm: fix for exponent comparison in recpe_f64
  target-arm: Guest cpu endianness determination for virtio KVM ARM/ARM64
  target-arm: KVM64: Get and Sync up guest register state like kvm32.
  disas/arm-a64.cc: Tell libvixl correct code addresses
  disas/libvixl: Update to upstream VIXL 1.7
  target-arm: Fix brace style in reindented code
  target-arm: Reindent ancient page-table-walk code
  target-arm: Use mmu_idx in get_phys_addr()
  target-arm: Pass mmu_idx to get_phys_addr()
  target-arm: Split AArch64 cases out of ats_write()
  target-arm: Don't define any MMU_MODE*_SUFFIXes
  target-arm: Use correct mmu_idx for unprivileged loads and stores
  target-arm: Define correct mmu_idx values and pass them in TB flags
  target-arm/translate-a64: Fix wrong mmu_idx usage for LDT/STT
  target-arm: Make arm_current_el() return sensible values for M profile
  cpu_ldst.h: Allow NB_MMU_MODES to be 7
  hw/arm/virt: explain device-to-transport mapping in create_virtio_devices()
  target-arm: check that LSB <= MSB in BFI instruction
  target-arm: Squash input denormals in FRECPS and FRSQRTS
  Fix FMULX not squashing denormalized inputs when FZ is set.
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

target-arm: fix for exponent comparison in recpe_f64

f64 exponent in HELPER(recpe_f64) should be compared to 2045 rather than 1023
(FPRecipEstimate in ARMV8 spec). This fixes incorrect underflow handling when
flushing denormals to zero in the FRECPE instructions operating on 64-bit
values.

Signed-off-by: Ildar Isaev <ild@inbox.ru>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

target-arm: Guest cpu endianness determination for virtio KVM ARM/ARM64

This patch implements a fucntion pointer "virtio_is_big_endian"
from "CPUClass" structure for arm/arm64.
Function arm_cpu_is_big_endian() is added to determine and
return the guest cpu endianness to virtio.
This is required for running cross endian guests with virtio on ARM/ARM64.

Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Message-id: 1423130382-18640-3-git-send-email-pranavkumar@linaro.org
[PMM: check CPSR_E in env->cpsr_uncached, not env->pstate.]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

target-arm: KVM64: Get and Sync up guest register state like kvm32.

This patch adds:
1. Call write_kvmstate_to_list() and write_list_to_cpustate()
in kvm_arch_get_registers() to sync guest register state.
2. Call write_list_to_kvmstate() in kvm_arch_put_registers()
to sync guest register state.

These changes are already there for kvm32 in target-arm/kvm32.c.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Message-id: 1423130382-18640-2-git-send-email-pranavkumar@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

disas/arm-a64.cc: Tell libvixl correct code addresses

disassembling relative branches in code which doesn't reside at
what the guest CPU would think its execution address is. Use
the new MapCodeAddress() API to tell libvixl where the code is
from the guest CPU's point of view so it can get the target
addresses right.

Previous disassembly:

0x0000000040000000:  580000c0      ldr x0, pc+24 (addr 0x7f6cb7020434)
0x0000000040000004:  aa1f03e1      mov x1, xzr
0x0000000040000008:  aa1f03e2      mov x2, xzr
0x000000004000000c:  aa1f03e3      mov x3, xzr
0x0000000040000010:  58000084      ldr x4, pc+16 (addr 0x7f6cb702042c)
0x0000000040000014:  d61f0080      br x4

Fixed disassembly:
0x0000000040000000:  580000c0      ldr x0, pc+24 (addr 0x40000018)
0x0000000040000004:  aa1f03e1      mov x1, xzr
0x0000000040000008:  aa1f03e2      mov x2, xzr
0x000000004000000c:  aa1f03e3      mov x3, xzr
0x0000000040000010:  58000084      ldr x4, pc+16 (addr 0x40000020)
0x0000000040000014:  d61f0080      br x4

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1422274779-13359-3-git-send-email-peter.maydell@linaro.org