OSDN Git Service

sagit-ice-cold/kernel_xiaomi_msm8998.git
10 years agoMerge branch 'xfs-misc-fixes-3-for-3.16' into for-next
Dave Chinner [Mon, 9 Jun 2014 21:32:56 +0000 (07:32 +1000)]
Merge branch 'xfs-misc-fixes-3-for-3.16' into for-next

10 years agoMerge branch 'xfs-da-geom' into for-next
Dave Chinner [Mon, 9 Jun 2014 21:32:41 +0000 (07:32 +1000)]
Merge branch 'xfs-da-geom' into for-next

10 years agoxfs: fix xfs_da_args sparse warning in xfs_readdir
Dave Chinner [Mon, 9 Jun 2014 21:30:36 +0000 (07:30 +1000)]
xfs: fix xfs_da_args sparse warning in xfs_readdir

The kbuild test robot reported:

>> fs/xfs/xfs_dir2_readdir.c:672:41: sparse: Using plain integer as NULL pointer

Fix it.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: Fix rounding in xfs_alloc_fix_len()
Jan Kara [Fri, 6 Jun 2014 06:06:37 +0000 (16:06 +1000)]
xfs: Fix rounding in xfs_alloc_fix_len()

Rounding in xfs_alloc_fix_len() is wrong. As the comment states, the
result should be a number of a form (k*prod+mod) however due to sign
mistake the result is different. As a result allocations on raid arrays
could be misaligned in some cases.

This also seems to fix occasional assertion failure:
XFS_WANT_CORRUPTED_GOTO(rlen <= flen, error0)
in xfs_alloc_ag_vextent_size().

Also add an assertion that the result of xfs_alloc_fix_len() is of
expected form.

Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: tone down writepage/releasepage WARN_ONs
Christoph Hellwig [Fri, 6 Jun 2014 06:05:15 +0000 (16:05 +1000)]
xfs: tone down writepage/releasepage WARN_ONs

I recently ran into the issue fixed by

  "xfs: kill buffers over failed write ranges properly"

which spams the log with lots of backtraces.  Make debugging any
issues like that easier by using WARN_ON_ONCE in the writeback code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: small cleanup in xfs_lowbit64()
Dan Carpenter [Fri, 6 Jun 2014 06:04:42 +0000 (16:04 +1000)]
xfs: small cleanup in xfs_lowbit64()

There are two checkpatch.pl complaints here because of the bad
indenting and because of the assignment inside the condition.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: kill xfs_buf_geterror()
Dave Chinner [Fri, 6 Jun 2014 06:02:12 +0000 (16:02 +1000)]
xfs: kill xfs_buf_geterror()

Most of the callers are just calling ASSERT(!xfs_buf_geterror())
which means they are checking for bp->b_error == 0. If bp is null in
this case, we will assert fail, and hence it's no different in
result to oopsing because of a null bp. In some cases, errors have
already been checked for or the function returning the buffer can't
return a buffer with an error, so it's just a redundant assert.
Either way, the assert can either be removed.

The other two non-assert callers can just test for a buffer and
error properly.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: xfs_readsb needs to check for magic numbers
Dave Chinner [Fri, 6 Jun 2014 06:00:43 +0000 (16:00 +1000)]
xfs: xfs_readsb needs to check for magic numbers

Commit daba542 ("xfs: skip verification on initial "guess"
superblock read") dropped the use of a verifier for the initial
superblock read so we can probe the sector size of the filesystem
stored in the superblock. It, however, now fails to validate that
what was read initially is actually an XFS superblock and hence will
fail the sector size check and return ENOSYS.

This causes probe-based mounts to fail because it expects XFS to
return EINVAL when it doesn't recognise the superblock format.

cc: <stable@vger.kernel.org>
Reported-by: Plamen Petrov <plamen.sisi@gmail.com>
Tested-by: Plamen Petrov <plamen.sisi@gmail.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: block allocation work needs to be kswapd aware
Dave Chinner [Fri, 6 Jun 2014 05:59:59 +0000 (15:59 +1000)]
xfs: block allocation work needs to be kswapd aware

Upon memory pressure, kswapd calls xfs_vm_writepage() from
shrink_page_list(). This can result in delayed allocation occurring
and that gets deferred to the the allocation workqueue.

The allocation then runs outside kswapd context, which means if it
needs memory (and it does to demand page metadata from disk) it can
block in shrink_inactive_list() waiting for IO congestion. These
blocking waits are normally avoiding in kswapd context, so under
memory pressure writeback from kswapd can be arbitrarily delayed by
memory reclaim.

To avoid this, pass the kswapd context to the allocation being done
by the workqueue, so that memory reclaim understands correctly that
the work is being done for kswapd and therefore it is not blocked
and does not delay memory reclaim.

To avoid issues with int->char conversion of flag fields (as noticed
in v1 of this patch) convert the flag fields in the struct
xfs_bmalloca to bool types. pahole indicates these variables are
still single byte variables, so no extra space is consumed by this
change.

cc: <stable@vger.kernel.org>
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: remove redundant geometry information from xfs_da_state
Dave Chinner [Fri, 6 Jun 2014 05:22:04 +0000 (15:22 +1000)]
xfs: remove redundant geometry information from xfs_da_state

It's carried in state->args->geo, so there's no need to duplicate it
and use more stack space than necessary.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: replace attr LBSIZE with xfs_da_geometry
Dave Chinner [Fri, 6 Jun 2014 05:21:45 +0000 (15:21 +1000)]
xfs: replace attr LBSIZE with xfs_da_geometry

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: pass xfs_da_args to xfs_attr_leaf_newentsize
Dave Chinner [Fri, 6 Jun 2014 05:21:27 +0000 (15:21 +1000)]
xfs: pass xfs_da_args to xfs_attr_leaf_newentsize

As it's only ever called from contexts where the xfs_da_args is
present and contains all the information needed inside the args
structure.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: use xfs_da_geometry for block size in attr code
Dave Chinner [Fri, 6 Jun 2014 05:21:10 +0000 (15:21 +1000)]
xfs: use xfs_da_geometry for block size in attr code

Rather than using the superblock value obtained through the
xfs_mount.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: remove mp->m_dir_geo from directory logging
Dave Chinner [Fri, 6 Jun 2014 05:20:54 +0000 (15:20 +1000)]
xfs: remove mp->m_dir_geo from directory logging

We don't pass the xfs_da_args or the geometry all the way down to
the directory buffer logging code, hence we have to use
mp->m_dir_geo here. Fix this to use the geometry passed via the
xfs_da_args, and convert all the directory logging functions for
consistency.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: reduce direct usage of mp->m_dir_geo
Dave Chinner [Fri, 6 Jun 2014 05:20:32 +0000 (15:20 +1000)]
xfs: reduce direct usage of mp->m_dir_geo

There are many places in the directory code were we don't pass the
args into and so have to extract the geometry direct from the mount
structure. Push the args or the geometry into these leaf functions
so that we don't need to grab it from the struct xfs_mount.

This, in turn, brings use to the point where directory geometry is
no longer a property of the struct xfs_mount; it is not a global
property anymore, and hence we can start to consider per-directory
configuration of physical geometries.

Start by converting the xfs_dir_isblock/leaf code - pass in the
xfs_da_args and convert the readdir code to use xfs_da_args like
the rest of the directory code to pass information around.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: move node entry counts to xfs_da_geometry
Dave Chinner [Fri, 6 Jun 2014 05:20:02 +0000 (15:20 +1000)]
xfs: move node entry counts to xfs_da_geometry

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: convert dir/attr btree threshold to xfs_da_geometry
Dave Chinner [Fri, 6 Jun 2014 05:18:10 +0000 (15:18 +1000)]
xfs: convert dir/attr btree threshold to xfs_da_geometry

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: convert m_dirblksize to xfs_da_geometry
Dave Chinner [Fri, 6 Jun 2014 05:15:59 +0000 (15:15 +1000)]
xfs: convert m_dirblksize to xfs_da_geometry

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: convert m_dirblkfsbs to xfs_da_geometry
Dave Chinner [Fri, 6 Jun 2014 05:14:11 +0000 (15:14 +1000)]
xfs: convert m_dirblkfsbs to xfs_da_geometry

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: convert directory segment limits to xfs_da_geometry
Dave Chinner [Fri, 6 Jun 2014 05:11:18 +0000 (15:11 +1000)]
xfs: convert directory segment limits to xfs_da_geometry

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: convert directory db conversion to xfs_da_geometry
Dave Chinner [Fri, 6 Jun 2014 05:08:18 +0000 (15:08 +1000)]
xfs: convert directory db conversion to xfs_da_geometry

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: convert directory dablk conversion to xfs_da_geometry
Dave Chinner [Fri, 6 Jun 2014 05:07:53 +0000 (15:07 +1000)]
xfs: convert directory dablk conversion to xfs_da_geometry

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: convert dir byte/off conversion to xfs_da_geometry
Dave Chinner [Fri, 6 Jun 2014 05:06:53 +0000 (15:06 +1000)]
xfs: convert dir byte/off conversion to xfs_da_geometry

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: kill XFS_DIR2...FIRSTDB macros
Dave Chinner [Fri, 6 Jun 2014 05:04:41 +0000 (15:04 +1000)]
xfs: kill XFS_DIR2...FIRSTDB macros

They are just simple wrappers around xfs_dir2_byte_to_db(), and
we've already removed one usage earlier in the patch set. Kill
the rest before we start removing the xfs_mount from conversion
functions.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: move directory block translatiosn to xfs_dir2_priv.h
Dave Chinner [Fri, 6 Jun 2014 05:04:05 +0000 (15:04 +1000)]
xfs: move directory block translatiosn to xfs_dir2_priv.h

Because they aren't actually part of the on-disk format, and so
shouldn't be in xfs_da_format.h.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: introduce directory geometry structure
Dave Chinner [Fri, 6 Jun 2014 05:01:58 +0000 (15:01 +1000)]
xfs: introduce directory geometry structure

The directory code has a dependency on the struct xfs_mount to
supply the directory block geometry. Block size, block log size,
and other parameters are pre-caclulated in the struct xfs_mount or
access directly from the superblock embedded in the struct
xfs_mount.

Extract all of this geometry information out of the struct xfs_mount
and superblock and place it into a new struct xfs_da_geometry
defined by the directory code. Allocate and initialise it at mount
time, and attach it to the struct xfs_mount so it canbe passed back
into the directory code appropriately rather than using the struct
xfs_mount.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoMerge branch 'xfs-feature-bit-cleanup' into for-next
Dave Chinner [Mon, 19 May 2014 22:57:02 +0000 (08:57 +1000)]
Merge branch 'xfs-feature-bit-cleanup' into for-next

Conflicts:
fs/xfs/xfs_inode.c

10 years agoMerge branch 'xfs-misc-fixes-2-for-3.16' into for-next
Dave Chinner [Mon, 19 May 2014 22:56:00 +0000 (08:56 +1000)]
Merge branch 'xfs-misc-fixes-2-for-3.16' into for-next

Conflicts:
fs/xfs/xfs_ialloc.c

10 years agoxfs: fix compile error when libxfs header used in C++ code
Roger Willcocks [Mon, 19 May 2014 22:52:21 +0000 (08:52 +1000)]
xfs: fix compile error when libxfs header used in C++ code

xfs_ialloc.h:102: error: expected ',' or '...' before 'delete'

Simple parameter rename, no changes to behaviour.

Signed-off-by: Roger Willcocks <roger@filmlight.ltd.uk>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: fix infinite loop at xfs_vm_writepage on 32bit system
Jie Liu [Mon, 19 May 2014 22:24:26 +0000 (08:24 +1000)]
xfs: fix infinite loop at xfs_vm_writepage on 32bit system

Write to a file with an offset greater than 16TB on 32-bit system and
then trigger page write-back via sync(1) will cause task hang.

# block_size=4096
# offset=$(((2**32 - 1) * $block_size))
# xfs_io -f -c "pwrite $offset $block_size" /storage/test_file
# sync

INFO: task sync:2590 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
sync            D c1064a28     0  2590   2097 0x00000000
.....
Call Trace:
[<c1064a28>] ? ttwu_do_wakeup+0x18/0x130
[<c1066d0e>] ? try_to_wake_up+0x1ce/0x220
[<c1066dbf>] ? wake_up_process+0x1f/0x40
[<c104fc2e>] ? wake_up_worker+0x1e/0x30
[<c15b6083>] schedule+0x23/0x60
[<c15b3c2d>] schedule_timeout+0x18d/0x1f0
[<c12a143e>] ? do_raw_spin_unlock+0x4e/0x90
[<c10515f1>] ? __queue_delayed_work+0x91/0x150
[<c12a12ef>] ? do_raw_spin_lock+0x3f/0x100
[<c12a143e>] ? do_raw_spin_unlock+0x4e/0x90
[<c15b5b5d>] wait_for_completion+0x7d/0xc0
[<c1066d60>] ? try_to_wake_up+0x220/0x220
[<c116a4d2>] sync_inodes_sb+0x92/0x180
[<c116fb05>] sync_inodes_one_sb+0x15/0x20
[<c114a8f8>] iterate_supers+0xb8/0xc0
[<c116faf0>] ? fdatawrite_one_bdev+0x20/0x20
[<c116fc21>] sys_sync+0x31/0x80
[<c15be18d>] sysenter_do_call+0x12/0x28

This issue can be triggered via xfstests/generic/308.

The reason is that the end_index is unsigned long with maximum value
'2^32-1=4294967295' on 32-bit platform, and the given offset cause it
wrapped to 0, so that the following codes will repeat again and again
until the task schedule time out:

end_index = offset >> PAGE_CACHE_SHIFT;
last_index = (offset - 1) >> PAGE_CACHE_SHIFT;
if (page->index >= end_index) {
unsigned offset_into_page = offset & (PAGE_CACHE_SIZE - 1);
        /*
         * Just skip the page if it is fully outside i_size, e.g. due
         * to a truncate operation that is in progress.
         */
        if (page->index >= end_index + 1 || offset_into_page == 0) {
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
unlock_page(page);
return 0;
}

In order to check if a page is fully outsids i_size or not, we can fix
the code logic as below:
if (page->index > end_index ||
    (page->index == end_index && offset_into_page == 0))

Secondly, there still has another similar issue when calculating the
end offset for mapping the filesystem blocks to the file blocks for
delalloc.  With the same tests to above, run unmount(8) will cause
kernel panic if CONFIG_XFS_DEBUG is enabled:

XFS: Assertion failed: XFS_FORCED_SHUTDOWN(ip->i_mount) || \
ip->i_delayed_blks == 0, file: fs/xfs/xfs_super.c, line: 964

kernel BUG at fs/xfs/xfs_message.c:108!
invalid opcode: 0000 [#1] SMP
task: edddc100 ti: ec6ee000 task.ti: ec6ee000
EIP: 0060:[<f83d87cb>] EFLAGS: 00010296 CPU: 1
EIP is at assfail+0x2b/0x30 [xfs]
..............
Call Trace:
[<f83d9cd4>] xfs_fs_destroy_inode+0x74/0x120 [xfs]
[<c115ddf1>] destroy_inode+0x31/0x50
[<c115deff>] evict+0xef/0x170
[<c115dfb2>] dispose_list+0x32/0x40
[<c115ea3a>] evict_inodes+0xca/0xe0
[<c1149706>] generic_shutdown_super+0x46/0xd0
[<c11497b9>] kill_block_super+0x29/0x70
[<c1149a14>] deactivate_locked_super+0x44/0x70
[<c114a427>] deactivate_super+0x47/0x60
[<c1161c3d>] mntput_no_expire+0xcd/0x120
[<c1162ae8>] SyS_umount+0xa8/0x370
[<c1162dce>] SyS_oldumount+0x1e/0x20
[<c15be18d>] sysenter_do_call+0x12/0x28

That because the end_offset is evaluated to 0 which is the same reason
to above, hence the mapping and covertion for dealloc file blocks to
file system blocks did not happened.

This patch just fixed both issues.

Reported-by: Michael L. Semon <mlsemon35@gmail.com>
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: remove redundant checks from xfs_da_read_buf
Dave Chinner [Mon, 19 May 2014 22:23:06 +0000 (08:23 +1000)]
xfs: remove redundant checks from xfs_da_read_buf

All of the verification checks of magic numbers are now done by
verifiers, so ther eis no need to check them again once the buffer
has been successfully read. If the magic number is bad, it won't
even get to that code to verify it so it really serves no purpose at
all anymore. Remove it.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: log vector rounding leaks log space
Dave Chinner [Mon, 19 May 2014 22:18:09 +0000 (08:18 +1000)]
xfs: log vector rounding leaks log space

The addition of direct formatting of log items into the CIL
linear buffer added alignment restrictions that the start of each
vector needed to be 64 bit aligned. Hence padding was added in
xlog_finish_iovec() to round up the vector length to ensure the next
vector started with the correct alignment.

This adds a small number of bytes to the size of
the linear buffer that is otherwise unused. The issue is that we
then use the linear buffer size to determine the log space used by
the log item, and this includes the unused space. Hence when we
account for space used by the log item, it's more than is actually
written into the iclogs, and hence we slowly leak this space.

This results on log hangs when reserving space, with threads getting
stuck with these stack traces:

Call Trace:
[<ffffffff81d15989>] schedule+0x29/0x70
[<ffffffff8150d3a2>] xlog_grant_head_wait+0xa2/0x1a0
[<ffffffff8150d55d>] xlog_grant_head_check+0xbd/0x140
[<ffffffff8150ee33>] xfs_log_reserve+0x103/0x220
[<ffffffff814b7f05>] xfs_trans_reserve+0x2f5/0x310
.....

The 4 bytes is significant. Brain Foster did all the hard work in
tracking down a reproducable leak to inode chunk allocation (it went
away with the ikeep mount option). His rough numbers were that
creating 50,000 inodes leaked 11 log blocks. This turns out to be
roughly 800 inode chunks or 1600 inode cluster buffers. That
works out at roughly 4 bytes per cluster buffer logged, and at that
I started looking for a 4 byte leak in the buffer logging code.

What I found was that a struct xfs_buf_log_format structure for an
inode cluster buffer is 28 bytes in length. This gets rounded up to
32 bytes, but the vector length remains 28 bytes. Hence the CIL
ticket reservation is decremented by 32 bytes (via lv->lv_buf_len)
for that vector rather than 28 bytes which are written into the log.

The fix for this problem is to separately track the bytes used by
the log vectors in the item and use that instead of the buffer
length when accounting for the log space that will be used by the
formatted log item.

Again, thanks to Brian Foster for doing all the hard work and long
hours to isolate this leak and make finding the bug relatively
simple.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: remove XFS_TRANS_RESERVE in collapse range
Namjae Jeon [Mon, 19 May 2014 22:15:57 +0000 (08:15 +1000)]
xfs: remove XFS_TRANS_RESERVE in collapse range

There is no need to dip into reserve pool. Reserve pool is used for much
more important things. And xfs_trans_reserve will never return ENOSPC
because punch hole is already done. If we get ENOSPC, collapse range
will be simply failed.

Cc: Brian Foster <bfoster@redhat.com>
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: remove shared supberlock feature checking
Dave Chinner [Mon, 19 May 2014 21:47:05 +0000 (07:47 +1000)]
xfs: remove shared supberlock feature checking

We reject any filesystem that is mounted with this feature bit set,
so we don't need to check for it anywhere else. Remove the function
for checking if the feature bit is set and any code that uses it.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: don't need dirv2 checks anymore
Dave Chinner [Mon, 19 May 2014 21:46:55 +0000 (07:46 +1000)]
xfs: don't need dirv2 checks anymore

If the the V2 directory feature bit is not set in the superblock
feature mask the filesystem will fail the good version check.
Hence we don't need any other version checking on the dir2 feature
bit in the code as the filesystem will not mount without it set.
Remove the checking code.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: turn NLINK feature on by default
Dave Chinner [Mon, 19 May 2014 21:46:40 +0000 (07:46 +1000)]
xfs: turn NLINK feature on by default

mkfs has turned on the XFS_SB_VERSION_NLINKBIT feature bit by
default since November 2007. It's about time we simply made the
kernel code turn it on by default and so always convert v1 inodes to
v2 inodes when reading them in from disk or allocating them. This
This removes needless version checks and modification when bumping
link counts on inodes, and will take code out of a few common code
paths.

   text    data     bss     dec     hex filename
 783251  100867     616  884734   d7ffe fs/xfs/xfs.o.orig
 782664  100867     616  884147   d7db3 fs/xfs/xfs.o.patched

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: keep sb_bad_features2 the same a sb_features2
Dave Chinner [Mon, 19 May 2014 21:41:43 +0000 (07:41 +1000)]
xfs: keep sb_bad_features2 the same a sb_features2

Whenever we update sb_features2, we need to update sb_bad_features2
so that they remain identical on disk. This prevents future mounts
or userspace utilities from getting confused over which features the
filesystem supports.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: make superblock version checks reflect reality
Dave Chinner [Mon, 19 May 2014 21:41:16 +0000 (07:41 +1000)]
xfs: make superblock version checks reflect reality

We only support filesystems that have v2 directory support, and than
means all the checking and handling of superblock versions prior to
this support being added is completely unnecessary overhead.

Strip out all the version 1-3 support, sanitise the good version
checking to reflect the supported versions, update all the feature
supported functions and clean up all the support bit definitions to
reflect the fact that we no longer care about Irix bootloader flag
regions for v4 feature bits. Also, convert the return values to
boolean types and remove typedefs from function declarations to
clean up calling conventions, too.

Because the feature bit checking is all inline code, this relatively
small cleanup has a noticable impact on code size:

   text    data     bss     dec     hex filename
 785195  100867     616  886678   d8796 fs/xfs/xfs.o.orig
 783595  100867     616  885078   d8156 fs/xfs/xfs.o.patched

i.e. it reduces it by 1600 bytes.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoMerge branch 'xfs-attr-cleanup' into for-next
Dave Chinner [Wed, 14 May 2014 23:39:28 +0000 (09:39 +1000)]
Merge branch 'xfs-attr-cleanup' into for-next

Conflicts:
fs/xfs/xfs_attr.c

10 years agoMerge branch 'xfs-misc-fixes-1-for-3.16' into for-next
Dave Chinner [Wed, 14 May 2014 23:38:15 +0000 (09:38 +1000)]
Merge branch 'xfs-misc-fixes-1-for-3.16' into for-next

10 years agoMerge branch 'xfs-free-inode-btree' into for-next
Dave Chinner [Wed, 14 May 2014 23:37:44 +0000 (09:37 +1000)]
Merge branch 'xfs-free-inode-btree' into for-next

10 years agoMerge branch 'xfs-filestreams-lookup' into for-next
Dave Chinner [Wed, 14 May 2014 23:36:59 +0000 (09:36 +1000)]
Merge branch 'xfs-filestreams-lookup' into for-next

10 years agoMerge branch 'xfs-unused-args-cleanup' into for-next
Dave Chinner [Wed, 14 May 2014 23:36:35 +0000 (09:36 +1000)]
Merge branch 'xfs-unused-args-cleanup' into for-next

10 years agoxfs: list_lru_init returns a negative error
Dave Chinner [Wed, 14 May 2014 23:23:24 +0000 (09:23 +1000)]
xfs: list_lru_init returns a negative error

And we don't invert it properly when initialising the dquot lru
list.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: negate xfs_icsb_init_counters error value
Dave Chinner [Wed, 14 May 2014 23:23:07 +0000 (09:23 +1000)]
xfs: negate xfs_icsb_init_counters error value

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: negate mount workqueue init error value
Dave Chinner [Wed, 14 May 2014 23:22:53 +0000 (09:22 +1000)]
xfs: negate mount workqueue init error value

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: fix wrong err sign on xfs_set_acl()
Dave Chinner [Wed, 14 May 2014 23:22:37 +0000 (09:22 +1000)]
xfs: fix wrong err sign on xfs_set_acl()

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: fix wrong errno from xfs_initxattrs
Dave Chinner [Wed, 14 May 2014 23:22:21 +0000 (09:22 +1000)]
xfs: fix wrong errno from xfs_initxattrs

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: correct error sign on COLLAPSE_RANGE errors
Dave Chinner [Wed, 14 May 2014 23:22:07 +0000 (09:22 +1000)]
xfs: correct error sign on COLLAPSE_RANGE errors

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: xfs_commit_metadata returns wrong errno
Dave Chinner [Wed, 14 May 2014 23:21:52 +0000 (09:21 +1000)]
xfs: xfs_commit_metadata returns wrong errno

Invert it.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: fix incorrect error sign in xfs_file_aio_read
Dave Chinner [Wed, 14 May 2014 23:21:37 +0000 (09:21 +1000)]
xfs: fix incorrect error sign in xfs_file_aio_read

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: xfs_dir_fsync() returns positive errno
Dave Chinner [Wed, 14 May 2014 23:21:11 +0000 (09:21 +1000)]
xfs: xfs_dir_fsync() returns positive errno

And it should be negative.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: pass struct da_args to xfs_attr_calc_size
Christoph Hellwig [Tue, 13 May 2014 06:40:19 +0000 (16:40 +1000)]
xfs: pass struct da_args to xfs_attr_calc_size

And remove a very confused comment.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: simplify attr name setup
Christoph Hellwig [Tue, 13 May 2014 06:34:43 +0000 (16:34 +1000)]
xfs: simplify attr name setup

Replace xfs_attr_name_to_xname with a new xfs_attr_args_init helper that
sets up the basic da_args structure without using a temporary xfs_name
structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: fold xfs_attr_remove_int into xfs_attr_remove
Christoph Hellwig [Tue, 13 May 2014 06:34:33 +0000 (16:34 +1000)]
xfs: fold xfs_attr_remove_int into xfs_attr_remove

Also remove a useless ilock roundtrip for the first attr fork check, it's
racy anyway and we redo it later under the ilock before we start the removal.

Plus various minor style fixes to the new xfs_attr_remove.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: fold xfs_attr_get_int into xfs_attr_get
Christoph Hellwig [Tue, 13 May 2014 06:34:24 +0000 (16:34 +1000)]
xfs: fold xfs_attr_get_int into xfs_attr_get

This allows doing an unlocked check if an attr for is present at all and
slightly reduce the lock hold time if we actually do an attr get.

Plus various minor style fixes to the new xfs_attr_get.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: fold xfs_attr_set_int into xfs_attr_set
Christoph Hellwig [Tue, 13 May 2014 06:34:14 +0000 (16:34 +1000)]
xfs: fold xfs_attr_set_int into xfs_attr_set

Plus various minor style fixes to the new xfs_attr_set.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoLinux 3.15-rc5
Linus Torvalds [Fri, 9 May 2014 20:10:52 +0000 (13:10 -0700)]
Linux 3.15-rc5

10 years agoMerge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 9 May 2014 19:24:20 +0000 (12:24 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull x86 fixes from Peter Anvin:
 "A somewhat unpleasantly large collection of small fixes.  The big ones
  are the __visible tree sweep and a fix for 'earlyprintk=efi,keep'.  It
  was using __init functions with predictably suboptimal results.

  Another key fix is a build fix which would produce output that simply
  would not decompress correctly in some configuration, due to the
  existing Makefiles picking up an unfortunate local label and mistaking
  it for the global symbol _end.

  Additional fixes include the handling of 64-bit numbers when setting
  the vdso data page (a latent bug which became manifest when i386
  started exporting a vdso with time functions), a fix to the new MSR
  manipulation accessors which would cause features to not get properly
  unblocked, a build fix for 32-bit userland, and a few new platform
  quirks"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, vdso, time: Cast tv_nsec to u64 for proper shifting in update_vsyscall()
  x86: Fix typo in MSR_IA32_MISC_ENABLE_LIMIT_CPUID macro
  x86: Fix typo preventing msr_set/clear_bit from having an effect
  x86/intel: Add quirk to disable HPET for the Baytrail platform
  x86/hpet: Make boot_hpet_disable extern
  x86-64, build: Fix stack protector Makefile breakage with 32-bit userland
  x86/reboot: Add reboot quirk for Certec BPC600
  asmlinkage: Add explicit __visible to drivers/*, lib/*, kernel/*
  asmlinkage, x86: Add explicit __visible to arch/x86/*
  asmlinkage: Revert "lto: Make asmlinkage __visible"
  x86, build: Don't get confused by local symbols
  x86/efi: earlyprintk=efi,keep fix

10 years agox86, vdso, time: Cast tv_nsec to u64 for proper shifting in update_vsyscall()
Boris Ostrovsky [Fri, 9 May 2014 15:11:27 +0000 (11:11 -0400)]
x86, vdso, time: Cast tv_nsec to u64 for proper shifting in update_vsyscall()

With tk->wall_to_monotonic.tv_nsec being a 32-bit value on 32-bit
systems, (tk->wall_to_monotonic.tv_nsec << tk->shift) in update_vsyscall()
may lose upper bits or, worse, add them since compiler will do this:
(u64)(tk->wall_to_monotonic.tv_nsec << tk->shift)
instead of
((u64)tk->wall_to_monotonic.tv_nsec << tk->shift)

So if, for example, tv_nsec is 0x800000 and shift is 8 we will end up
with 0xffffffff80000000 instead of 0x80000000. And then we are stuck in
the subsequent 'while' loop.

We need an explicit cast.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: http://lkml.kernel.org/r/1399648287-15178-1-git-send-email-boris.ostrovsky@oracle.com
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: <stable@vger.kernel.org> # v3.14
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
10 years agox86: Fix typo in MSR_IA32_MISC_ENABLE_LIMIT_CPUID macro
Andres Freund [Fri, 9 May 2014 01:29:17 +0000 (03:29 +0200)]
x86: Fix typo in MSR_IA32_MISC_ENABLE_LIMIT_CPUID macro

The spuriously added semicolon didn't have any effect because the
macro isn't currently in use.

c0a639ad0bc6b178b46996bd1f821a04643e2bde

Signed-off-by: Andres Freund <andres@anarazel.de>
Link: http://lkml.kernel.org/r/1399598957-7011-3-git-send-email-andres@anarazel.de
Cc: Borislav Petkov <bp@suse.de>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
10 years agox86: Fix typo preventing msr_set/clear_bit from having an effect
Andres Freund [Fri, 9 May 2014 01:29:16 +0000 (03:29 +0200)]
x86: Fix typo preventing msr_set/clear_bit from having an effect

Due to a typo the msr accessor function introduced in
22085a66c2fab6cf9b9393c056a3600a6b4735de didn't have any lasting
effects because they accidentally wrote the old value back.

After c0a639ad0bc6b178b46996bd1f821a04643e2bde this at the very least
this causes cpuid limits not to be lifted on some cpus leading to
missing capabilities for those.

Signed-off-by: Andres Freund <andres@anarazel.de>
Link: http://lkml.kernel.org/r/1399598957-7011-2-git-send-email-andres@anarazel.de
Cc: Borislav Petkov <bp@suse.de>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
10 years agoMerge tag 'xfs-for-linus-3.15-rc5' of git://oss.sgi.com/xfs/xfs
Linus Torvalds [Fri, 9 May 2014 02:20:45 +0000 (19:20 -0700)]
Merge tag 'xfs-for-linus-3.15-rc5' of git://oss.sgi.com/xfs/xfs

Pull xfs fixes from Dave Chinner:
 "The main fix is adding support for default ACLs on O_TMPFILE opened
  inodes to bring XFS into line with other filesystems.  Metadata CRCs
  are now also considered well enough tested to be fully supported, so
  we're removing the shouty warnings issued at mount time for
  filesystems with that format.  And there's transaction block
  reservation overrun fix.

  Summary:
   - fix a remote attribute size calculation bug that leads to a
     transaction overrun
   - add default ACLs to O_TMPFILE files
   - Remove the EXPERIMENTAL tag from filesystems with metadata CRC
     support"

* tag 'xfs-for-linus-3.15-rc5' of git://oss.sgi.com/xfs/xfs:
  xfs: remote attribute overwrite causes transaction overrun
  xfs: initialize default acls for ->tmpfile()
  xfs: fully support v5 format filesystems

10 years agoMerge tag 'trace-fixes-v3.15-rc4-v2' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Thu, 8 May 2014 21:17:13 +0000 (14:17 -0700)]
Merge tag 'trace-fixes-v3.15-rc4-v2' of git://git./linux/kernel/git/rostedt/linux-trace

Pull tracing fixes from Steven Rostedt:
 "This contains two fixes.

  The first is a long standing bug that causes bogus data to show up in
  the refcnt field of the module_refcnt tracepoint.  It was introduced
  by a merge conflict resolution back in 2.6.35-rc days.

  The result should be 'refcnt = incs - decs', but instead it did
  'refcnt = incs + decs'.

  The second fix is to a bug that was introduced in this merge window
  that allowed for a tracepoint funcs pointer to be used after it was
  freed.  Moving the location of where the probes are released solved
  the problem"

* tag 'trace-fixes-v3.15-rc4-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracepoint: Fix use of tracepoint funcs after rcu free
  trace: module: Maintain a valid user count

10 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Linus Torvalds [Thu, 8 May 2014 21:06:45 +0000 (14:06 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input

Pull input subsystem fixes from Dmitry Torokhov:
 "Just a few fixups to various drivers"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: elantech - fix touchpad initialization on Gigabyte U2442
  Input: tca8418 - fix loading this driver as a module from a device tree
  Input: bma150 - extend chip detection for bma180
  Input: atkbd - fix keyboard not working on some LG laptops
  Input: synaptics - add min/max quirk for ThinkPad Edge E431

10 years agoMerge tag 'sound-3.15-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai...
Linus Torvalds [Thu, 8 May 2014 20:51:53 +0000 (13:51 -0700)]
Merge tag 'sound-3.15-rc5' of git://git./linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "A bunch of small fixes for USB-audio and HD-audio, where most of them
  are for regressions: USB-audio PM fixes, ratelimit annoyance fix, HDMI
  offline state fix, and a couple of device-specific quirks"

* tag 'sound-3.15-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda - hdmi: Set converter channel count even without sink
  ALSA: usb-audio: work around corrupted TEAC UD-H01 feedback data
  ALSA: usb-audio: Fix deadlocks at resuming
  ALSA: usb-audio: Save mixer status only once at suspend
  ALSA: usb-audio: Prevent printk ratelimiting from spamming kernel log while DEBUG not defined
  ALSA: hda - add headset mic detect quirk for a Dell laptop

10 years agoMerge tag 'mfd-mmc-fixes-3.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Thu, 8 May 2014 19:41:14 +0000 (12:41 -0700)]
Merge tag 'mfd-mmc-fixes-3.15-rc4' of git://git./linux/kernel/git/lee/mfd

Pull mmc/rtsx revert from Lee Jones.

* tag 'mfd-mmc-fixes-3.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd:
  mmc: rtsx: Revert "mmc: rtsx: add support for pre_req and post_req"

10 years agotracepoint: Fix use of tracepoint funcs after rcu free
Mathieu Desnoyers [Thu, 8 May 2014 11:47:49 +0000 (07:47 -0400)]
tracepoint: Fix use of tracepoint funcs after rcu free

Commit de7b2973903c "tracepoint: Use struct pointer instead of name hash
for reg/unreg tracepoints" introduces a use after free by calling
release_probes on the old struct tracepoint array before the newly
allocated array is published with rcu_assign_pointer. There is a race
window where tracepoints (RCU readers) can perform a
"use-after-grace-period-after-free", which shows up as a GPF in
stress-tests.

Link: http://lkml.kernel.org/r/53698021.5020108@oracle.com
Link: http://lkml.kernel.org/p/1399549669-25465-1-git-send-email-mathieu.desnoyers@efficios.com
Reported-by: Sasha Levin <sasha.levin@oracle.com>
CC: Oleg Nesterov <oleg@redhat.com>
CC: Dave Jones <davej@redhat.com>
Fixes: de7b2973903c "tracepoint: Use struct pointer instead of name hash for reg/unreg tracepoints"
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
10 years agotrace: module: Maintain a valid user count
Romain Izard [Tue, 4 Mar 2014 09:09:39 +0000 (10:09 +0100)]
trace: module: Maintain a valid user count

The replacement of the 'count' variable by two variables 'incs' and
'decs' to resolve some race conditions during module unloading was done
in parallel with some cleanup in the trace subsystem, and was integrated
as a merge.

Unfortunately, the formula for this replacement was wrong in the tracing
code, and the refcount in the traces was not usable as a result.

Use 'count = incs - decs' to compute the user count.

Link: http://lkml.kernel.org/p/1393924179-9147-1-git-send-email-romain.izard.pro@gmail.com
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: stable@vger.kernel.org # 2.6.35
Fixes: c1ab9cab7509 "merge conflict resolution"
Signed-off-by: Romain Izard <romain.izard.pro@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
10 years agommc: rtsx: Revert "mmc: rtsx: add support for pre_req and post_req"
Micky Ching [Tue, 29 Apr 2014 01:54:54 +0000 (09:54 +0800)]
mmc: rtsx: Revert "mmc: rtsx: add support for pre_req and post_req"

This reverts commit c42deffd5b53c9e583d83c7964854ede2f12410d.

commit <mmc: rtsx: add support for pre_req and post_req> did use
mutex_unlock() in tasklet, but mutex_unlock() can't be used in
tasklet(atomic context). The driver needs to use mutex to avoid
concurrency, so we can't use tasklet here, the patch need to be
removed.

The spinlock host->lock and pcr->lock may deadlock, one way to solve
the deadlock is remove host->lock in sd_isr_done_transfer(), but if
using workqueue the we can avoid using the spinlock and also avoid
the problem.

Signed-off-by: Micky Ching <micky_ching@realsil.com.cn>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
10 years agox86/intel: Add quirk to disable HPET for the Baytrail platform
Feng Tang [Thu, 24 Apr 2014 08:18:18 +0000 (16:18 +0800)]
x86/intel: Add quirk to disable HPET for the Baytrail platform

HPET on current Baytrail platform has accuracy problem to be
used as reliable clocksource/clockevent, so add a early quirk to
disable it.

Signed-off-by: Feng Tang <feng.tang@intel.com>
Cc: Clemens Ladisch <clemens@ladisch.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1398327498-13163-2-git-send-email-feng.tang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 years agox86/hpet: Make boot_hpet_disable extern
Feng Tang [Thu, 24 Apr 2014 08:18:17 +0000 (16:18 +0800)]
x86/hpet: Make boot_hpet_disable extern

HPET on some platform has accuracy problem. Making
"boot_hpet_disable" extern so that we can runtime disable
the HPET timer by using quirk to check the platform.

Signed-off-by: Feng Tang <feng.tang@intel.com>
Cc: Clemens Ladisch <clemens@ladisch.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1398327498-13163-1-git-send-email-feng.tang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 years agoMerge tag 'for-linus-20140507' of git://git.infradead.org/linux-mtd
Linus Torvalds [Wed, 7 May 2014 23:28:52 +0000 (16:28 -0700)]
Merge tag 'for-linus-20140507' of git://git.infradead.org/linux-mtd

Pull MTD fix from Brian Norris:
 "A single update for Keystone SoC's, whose NAND controller does not
  support subpage programming"

* tag 'for-linus-20140507' of git://git.infradead.org/linux-mtd:
  mtd: davinci-nand: disable subpage write for keystone-nand

10 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
Linus Torvalds [Wed, 7 May 2014 23:07:58 +0000 (16:07 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jikos/hid

Pull HID fixes from Jiri Kosina:

 - fix a small bug in computation of report size, which might cause some
   devices (Atmel touchpad found on the Samsung Ativ 9) to reject
   reports with otherwise valid contents

 - a few device-ID specific quirks/additions piggy-backing on top of it

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
  HID: sensor-hub: Add in quirk for sensor hub in Lenovo Ideapad Yogas
  HID: add NO_INIT_REPORTS quirk for Synaptics Touch Pad V 103S
  HID: core: fix computation of the report size
  HID: multitouch: add support of EliteGroup 05D8 panels

10 years agoMerge branch 'drm-radeon-mullins' of git://people.freedesktop.org/~airlied/linux
Linus Torvalds [Wed, 7 May 2014 22:47:47 +0000 (15:47 -0700)]
Merge branch 'drm-radeon-mullins' of git://people.freedesktop.org/~airlied/linux

Pull radeon mullins support from Dave Airlie:
 "This is support for the new AMD mullins APU, it pretty much just adds
  support to the driver in the all the right places, and is pretty low
  risk wrt other GPUs"

Oh well.  I guess it ends up fitting under "support new hardware" for
merging late.

* 'drm-radeon-mullins' of git://people.freedesktop.org/~airlied/linux:
  drm/radeon: add pci ids for Mullins
  drm/radeon: add Mullins VCE support
  drm/radeon: modesetting updates for Mullins.
  drm/radeon: dpm updates for KV/KB
  drm/radeon: add Mullins dpm support.
  drm/radeon: add Mullins UVD support.
  drm/radeon: update cik init for Mullins.
  drm/radeon: add Mullins chip family

10 years agoMerge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Linus Torvalds [Wed, 7 May 2014 22:45:13 +0000 (15:45 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
 "radeon, i915 and nouveau fixes, all fixes for regressions or black
  screens, or possible oopses"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/radeon: lower the ref * post PLL maximum
  drm/radeon: check that we have a clock before PLL setup
  drm/radeon: drm/radeon: add missing radeon_semaphore_free to error path
  drm/radeon: Fix num_banks calculation for SI
  agp: info leak in agpioc_info_wrap()
  drm/gm107/gr: bump attrib cb size quite a bit
  drm/nouveau: fix another lock unbalance in nouveau_crtc_page_flip
  drm/nouveau/bios: fix shadowing from PROM on big-endian systems
  drm/nouveau/acpi: allow non-optimus setups to load vbios from acpi
  drm/radeon/dp: check for errors in dpcd reads
  drm/radeon: avoid high jitter with small frac divs
  drm/radeon: check buffer relocation offset
  drm/radeon: use pflip irq on R600+ v2
  drm/radeon/uvd: use lower clocks on old UVD to boot v2
  drm/i915: don't try DP_LINK_BW_5_4 on HSW ULX
  drm/i915: Sanitize the enable_ppgtt module option once
  drm/i915: Break encoder->crtc link separately in intel_sanitize_crtc()

10 years agox86-64, build: Fix stack protector Makefile breakage with 32-bit userland
George Spelvin [Wed, 7 May 2014 21:05:52 +0000 (17:05 -0400)]
x86-64, build: Fix stack protector Makefile breakage with 32-bit userland

If you are using a 64-bit kernel with 32-bit userland, then
scripts/gcc-x86_64-has-stack-protector.sh invokes 32-bit gcc
with -mcmodel=kernel, which produces:

<stdin>:1:0: error: code model 'kernel' not supported in the 32 bit mode

and trips the "broken compiler" test at arch/x86/Makefile:120.

There are several places a fix is possible, but the following seems
cleanest.  (But it's minimal; it would also be possible to factor
out a bunch of stuff from the two branches of the if.)

Signed-off-by: George Spelvin <linux@horizon.com>
Link: http://lkml.kernel.org/r/20140507210552.7581.qmail@ns.horizon.com
Cc: <stable@vger.kernel.org> # v3.14
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
10 years agox86/reboot: Add reboot quirk for Certec BPC600
Christian Gmeiner [Wed, 7 May 2014 07:01:54 +0000 (09:01 +0200)]
x86/reboot: Add reboot quirk for Certec BPC600

Certec BPC600 needs reboot=pci to actually reboot.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Li Aubrey <aubrey.li@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Jones <davej@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1399446114-2147-1-git-send-email-christian.gmeiner@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
10 years agoMerge branch 'mullins' of git://people.freedesktop.org/~deathsimple/linux into drm...
Dave Airlie [Tue, 6 May 2014 23:10:28 +0000 (09:10 +1000)]
Merge branch 'mullins' of git://people.freedesktop.org/~deathsimple/linux into drm-fixes

Add Mullins chips support.

* 'mullins' of git://people.freedesktop.org/~deathsimple/linux:
  drm/radeon: add pci ids for Mullins
  drm/radeon: add Mullins VCE support
  drm/radeon: modesetting updates for Mullins.
  drm/radeon: dpm updates for KV/KB
  drm/radeon: add Mullins dpm support.
  drm/radeon: add Mullins UVD support.
  drm/radeon: update cik init for Mullins.
  drm/radeon: add Mullins chip family

10 years agoMerge branch 'drm-nouveau-next' of git://anongit.freedesktop.org/git/nouveau/linux...
Dave Airlie [Tue, 6 May 2014 23:06:21 +0000 (09:06 +1000)]
Merge branch 'drm-nouveau-next' of git://anongit.freedesktop.org/git/nouveau/linux-2.6 into drm-fixes

nouveau fixes.

* 'drm-nouveau-next' of git://anongit.freedesktop.org/git/nouveau/linux-2.6:
  drm/gm107/gr: bump attrib cb size quite a bit
  drm/nouveau: fix another lock unbalance in nouveau_crtc_page_flip
  drm/nouveau/bios: fix shadowing from PROM on big-endian systems
  drm/nouveau/acpi: allow non-optimus setups to load vbios from acpi

10 years agoMerge tag 'topc/core-stuff-2014-05-05' of git://anongit.freedesktop.org/drm-intel...
Dave Airlie [Tue, 6 May 2014 22:56:03 +0000 (08:56 +1000)]
Merge tag 'topc/core-stuff-2014-05-05' of git://anongit.freedesktop.org/drm-intel into drm-fixes

Some more i915 fixes. There's still some DP issues we are looking into,
but wanted to get these moving.

* tag 'topc/core-stuff-2014-05-05' of git://anongit.freedesktop.org/drm-intel:
  drm/i915: don't try DP_LINK_BW_5_4 on HSW ULX
  drm/i915: Sanitize the enable_ppgtt module option once
  drm/i915: Break encoder->crtc link separately in intel_sanitize_crtc()

10 years agoMerge branch 'drm-fixes-3.15' of git://people.freedesktop.org/~deathsimple/linux...
Dave Airlie [Tue, 6 May 2014 22:55:27 +0000 (08:55 +1000)]
Merge branch 'drm-fixes-3.15' of git://people.freedesktop.org/~deathsimple/linux into drm-fixes

this is the next pull quested for stashed up radeon fixes for 3.15. As discussed support for Mullins was separated out and will get it's own pull request. Remaining highlights are:
1. Some more patches to better handle PLL limits.
2. Making use of the PFLIP additional to the VBLANK interrupt, otherwise we sometimes miss page flip events.
3. Fix for the UVD command stream parser.
4. Fix for bootup UVD clocks on RV7xx systems.
5. Adding missing error check on dpcd reads.
6. Fixes number of banks calculation on SI.

* 'drm-fixes-3.15' of git://people.freedesktop.org/~deathsimple/linux:
  drm/radeon: lower the ref * post PLL maximum
  drm/radeon: check that we have a clock before PLL setup
  drm/radeon: drm/radeon: add missing radeon_semaphore_free to error path
  drm/radeon: Fix num_banks calculation for SI
  drm/radeon/dp: check for errors in dpcd reads
  drm/radeon: avoid high jitter with small frac divs
  drm/radeon: check buffer relocation offset
  drm/radeon: use pflip irq on R600+ v2
  drm/radeon/uvd: use lower clocks on old UVD to boot v2

10 years agoxfs: fix directory readahead offset off-by-one
Dave Chinner [Tue, 6 May 2014 22:05:52 +0000 (08:05 +1000)]
xfs: fix directory readahead offset off-by-one

Directory readahead can throw loud scary but harmless warnings
when multiblock directories are in use a specific pattern of
discontiguous blocks are found in the directory. That is, if a hole
follows a discontiguous block, it will throw a warning like:

XFS (dm-1): xfs_da_do_buf: bno 637 dir: inode 34363923462
XFS (dm-1): [00] br_startoff 637 br_startblock 1917954575 br_blockcount 1 br_state 0
XFS (dm-1): [01] br_startoff 638 br_startblock -2 br_blockcount 1 br_state 0

And dump a stack trace.

This is because the readahead offset increment loop does a double
increment of the block index - it does an increment for the loop
iteration as well as increase the loop counter by the number of
blocks in the extent. As a result, the readahead offset does not get
incremented correctly for discontiguous blocks and hence can ask for
readahead of a directory block from an offset part way through a
directory block.  If that directory block is followed by a hole, it
will trigger a mapping warning like the above.

The bad readahead will be ignored, though, because the main
directory block read loop uses the correct mapping offsets rather
than the readahead offset and so will ignore the bad readahead
altogether.

Fix the warning by ensuring that the readahead offset is correctly
incremented.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: don't sleep in xlog_cil_force_lsn on shutdown
Dave Chinner [Tue, 6 May 2014 22:05:50 +0000 (08:05 +1000)]
xfs: don't sleep in xlog_cil_force_lsn on shutdown

Reports of a shutdown hang when fsyncing a directory have surfaced,
such as this:

[ 3663.394472] Call Trace:
[ 3663.397199]  [<ffffffff815f1889>] schedule+0x29/0x70
[ 3663.402743]  [<ffffffffa01feda5>] xlog_cil_force_lsn+0x185/0x1a0 [xfs]
[ 3663.416249]  [<ffffffffa01fd3af>] _xfs_log_force_lsn+0x6f/0x2f0 [xfs]
[ 3663.429271]  [<ffffffffa01a339d>] xfs_dir_fsync+0x7d/0xe0 [xfs]
[ 3663.435873]  [<ffffffff811df8c5>] do_fsync+0x65/0xa0
[ 3663.441408]  [<ffffffff811dfbc0>] SyS_fsync+0x10/0x20
[ 3663.447043]  [<ffffffff815fc7d9>] system_call_fastpath+0x16/0x1b

If we trigger a shutdown in xlog_cil_push() from xlog_write(), we
will never wake waiters on the current push sequence number, so
anything waiting in xlog_cil_force_lsn() for that push sequence
number to come up will not get woken and hence stall the shutdown.

Fix this by ensuring we call wake_up_all(&cil->xc_commit_wait) in
the push abort handling, in the log shutdown code when waking all
waiters, and adding a shutdown check in the sequence completion wait
loops to ensure they abort when a wakeup due to a shutdown occurs.

Reported-by: Boris Ranto <branto@redhat.com>
Reported-by: Eric Sandeen <esandeen@redhat.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoxfs: truncate_setsize should be outside transactions
Dave Chinner [Tue, 6 May 2014 22:05:45 +0000 (08:05 +1000)]
xfs: truncate_setsize should be outside transactions

truncate_setsize() removes pages from the page cache, and hence
requires page locks to be held. It is not valid to lock a page cache
page inside a transaction context as we can hold page locks when we
we reserve space for a transaction. If we do, then we expose an ABBA
deadlock between log space reservation and page locks.

That is, both the write path and writeback lock a page, then start a
transaction for block allocation, which means they can block waiting
for a log reservation with the page lock held. If we hold a log
reservation and then do something that locks a page (e.g.
truncate_setsize in xfs_setattr_size) then that page lock can block
on the page locked and waiting for a log reservation. If the
transaction that is waiting for the page lock is the only active
transaction in the system that can free log space via a commit,
then writeback will never make progress and so log space will never
free up.

This issue with xfs_setattr_size() was introduced back in 2010 by
commit fa9b227 ("xfs: new truncate sequence") which moved the page
cache truncate from outside the transaction context (what was
xfs_itruncate_data()) to inside the transaction context as a call to
truncate_setsize().

The reason truncate_setsize() was located where in this place was
that we can't shouldn't change the file size until after we are in
the transaction context and the operation will either succeed or
shut down the filesystem on failure. However, block_truncate_page()
already modifies the file contents before we enter the transaction
context, so we can't really fulfill this guarantee in any way. Hence
we may as well ensure that on success or failure, the in-memory
inode and data is truncated away and that the application cleans up
the mess appropriately.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
10 years agoMerge branch 'akpm' (incoming from Andrew)
Linus Torvalds [Tue, 6 May 2014 20:07:41 +0000 (13:07 -0700)]
Merge branch 'akpm' (incoming from Andrew)

Merge misc fixes from Andrew Morton:
 "13 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  agp: info leak in agpioc_info_wrap()
  fs/affs/super.c: bugfix / double free
  fanotify: fix -EOVERFLOW with large files on 64-bit
  slub: use sysfs'es release mechanism for kmem_cache
  revert "mm: vmscan: do not swap anon pages just because free+file is low"
  autofs: fix lockref lookup
  mm: filemap: update find_get_pages_tag() to deal with shadow entries
  mm/compaction: make isolate_freepages start at pageblock boundary
  MAINTAINERS: zswap/zbud: change maintainer email address
  mm/page-writeback.c: fix divide by zero in pos_ratio_polynom
  hugetlb: ensure hugepage access is denied if hugepages are not supported
  slub: fix memcg_propagate_slab_attrs
  drivers/rtc/rtc-pcf8523.c: fix month definition

10 years agoagp: info leak in agpioc_info_wrap()
Dan Carpenter [Tue, 6 May 2014 19:50:12 +0000 (12:50 -0700)]
agp: info leak in agpioc_info_wrap()

On 64 bit systems the agp_info struct has a 4 byte hole between
->agp_mode and ->aper_base.  We need to clear it to avoid disclosing
stack information to userspace.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10 years agofs/affs/super.c: bugfix / double free
Fabian Frederick [Tue, 6 May 2014 19:50:11 +0000 (12:50 -0700)]
fs/affs/super.c: bugfix / double free

Commit 842a859db26b ("affs: use ->kill_sb() to simplify ->put_super()
and failure exits of ->mount()") adds .kill_sb which frees sbi but
doesn't remove sbi free in case of parse_options error causing double
free+random crash.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org> [3.14.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10 years agofanotify: fix -EOVERFLOW with large files on 64-bit
Will Woods [Tue, 6 May 2014 19:50:10 +0000 (12:50 -0700)]
fanotify: fix -EOVERFLOW with large files on 64-bit

On 64-bit systems, O_LARGEFILE is automatically added to flags inside
the open() syscall (also openat(), blkdev_open(), etc).  Userspace
therefore defines O_LARGEFILE to be 0 - you can use it, but it's a
no-op.  Everything should be O_LARGEFILE by default.

But: when fanotify does create_fd() it uses dentry_open(), which skips
all that.  And userspace can't set O_LARGEFILE in fanotify_init()
because it's defined to 0.  So if fanotify gets an event regarding a
large file, the read() will just fail with -EOVERFLOW.

This patch adds O_LARGEFILE to fanotify_init()'s event_f_flags on 64-bit
systems, using the same test as open()/openat()/etc.

Addresses https://bugzilla.redhat.com/show_bug.cgi?id=696821

Signed-off-by: Will Woods <wwoods@redhat.com>
Acked-by: Eric Paris <eparis@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10 years agoslub: use sysfs'es release mechanism for kmem_cache
Christoph Lameter [Tue, 6 May 2014 19:50:08 +0000 (12:50 -0700)]
slub: use sysfs'es release mechanism for kmem_cache

debugobjects warning during netfilter exit:

    ------------[ cut here ]------------
    WARNING: CPU: 6 PID: 4178 at lib/debugobjects.c:260 debug_print_object+0x8d/0xb0()
    ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x20
    Modules linked in:
    CPU: 6 PID: 4178 Comm: kworker/u16:2 Tainted: G        W 3.11.0-next-20130906-sasha #3984
    Workqueue: netns cleanup_net
    Call Trace:
      dump_stack+0x52/0x87
      warn_slowpath_common+0x8c/0xc0
      warn_slowpath_fmt+0x46/0x50
      debug_print_object+0x8d/0xb0
      __debug_check_no_obj_freed+0xa5/0x220
      debug_check_no_obj_freed+0x15/0x20
      kmem_cache_free+0x197/0x340
      kmem_cache_destroy+0x86/0xe0
      nf_conntrack_cleanup_net_list+0x131/0x170
      nf_conntrack_pernet_exit+0x5d/0x70
      ops_exit_list+0x5e/0x70
      cleanup_net+0xfb/0x1c0
      process_one_work+0x338/0x550
      worker_thread+0x215/0x350
      kthread+0xe7/0xf0
      ret_from_fork+0x7c/0xb0

Also during dcookie cleanup:

    WARNING: CPU: 12 PID: 9725 at lib/debugobjects.c:260 debug_print_object+0x8c/0xb0()
    ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x20
    Modules linked in:
    CPU: 12 PID: 9725 Comm: trinity-c141 Not tainted 3.15.0-rc2-next-20140423-sasha-00018-gc4ff6c4 #408
    Call Trace:
      dump_stack (lib/dump_stack.c:52)
      warn_slowpath_common (kernel/panic.c:430)
      warn_slowpath_fmt (kernel/panic.c:445)
      debug_print_object (lib/debugobjects.c:262)
      __debug_check_no_obj_freed (lib/debugobjects.c:697)
      debug_check_no_obj_freed (lib/debugobjects.c:726)
      kmem_cache_free (mm/slub.c:2689 mm/slub.c:2717)
      kmem_cache_destroy (mm/slab_common.c:363)
      dcookie_unregister (fs/dcookies.c:302 fs/dcookies.c:343)
      event_buffer_release (arch/x86/oprofile/../../../drivers/oprofile/event_buffer.c:153)
      __fput (fs/file_table.c:217)
      ____fput (fs/file_table.c:253)
      task_work_run (kernel/task_work.c:125 (discriminator 1))
      do_notify_resume (include/linux/tracehook.h:196 arch/x86/kernel/signal.c:751)
      int_signal (arch/x86/kernel/entry_64.S:807)

Sysfs has a release mechanism.  Use that to release the kmem_cache
structure if CONFIG_SYSFS is enabled.

Only slub is changed - slab currently only supports /proc/slabinfo and
not /sys/kernel/slab/*.  We talked about adding that and someone was
working on it.

[akpm@linux-foundation.org: fix CONFIG_SYSFS=n build]
[akpm@linux-foundation.org: fix CONFIG_SYSFS=n build even more]
Signed-off-by: Christoph Lameter <cl@linux.com>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Tested-by: Sasha Levin <sasha.levin@oracle.com>
Acked-by: Greg KH <greg@kroah.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10 years agorevert "mm: vmscan: do not swap anon pages just because free+file is low"
Johannes Weiner [Tue, 6 May 2014 19:50:07 +0000 (12:50 -0700)]
revert "mm: vmscan: do not swap anon pages just because free+file is low"

This reverts commit 0bf1457f0cfc ("mm: vmscan: do not swap anon pages
just because free+file is low") because it introduced a regression in
mostly-anonymous workloads, where reclaim would become ineffective and
trap every allocating task in direct reclaim.

The problem is that there is a runaway feedback loop in the scan balance
between file and anon, where the balance tips heavily towards a tiny
thrashing file LRU and anonymous pages are no longer being looked at.
The commit in question removed the safe guard that would detect such
situations and respond with forced anonymous reclaim.

This commit was part of a series to fix premature swapping in loads with
relatively little cache, and while it made a small difference, the cure
is obviously worse than the disease.  Revert it.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Rafael Aquini <aquini@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: <stable@kernel.org> [3.12+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10 years agoautofs: fix lockref lookup
Ian Kent [Tue, 6 May 2014 19:50:06 +0000 (12:50 -0700)]
autofs: fix lockref lookup

autofs needs to be able to see private data dentry flags for its dentrys
that are being created but not yet hashed and for its dentrys that have
been rmdir()ed but not yet freed.  It needs to do this so it can block
processes in these states until a status has been returned to indicate
the given operation is complete.

It does this by keeping two lists, active and expring, of dentrys in
this state and uses ->d_release() to keep them stable while it checks
the reference count to determine if they should be used.

But with the recent lockref changes dentrys being freed sometimes don't
transition to a reference count of 0 before being freed so autofs can
occassionally use a dentry that is invalid which can lead to a panic.

Signed-off-by: Ian Kent <raven@themaw.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10 years agomm: filemap: update find_get_pages_tag() to deal with shadow entries
Johannes Weiner [Tue, 6 May 2014 19:50:05 +0000 (12:50 -0700)]
mm: filemap: update find_get_pages_tag() to deal with shadow entries

Dave Jones reports the following crash when find_get_pages_tag() runs
into an exceptional entry:

  kernel BUG at mm/filemap.c:1347!
  RIP: find_get_pages_tag+0x1cb/0x220
  Call Trace:
    find_get_pages_tag+0x36/0x220
    pagevec_lookup_tag+0x21/0x30
    filemap_fdatawait_range+0xbe/0x1e0
    filemap_fdatawait+0x27/0x30
    sync_inodes_sb+0x204/0x2a0
    sync_inodes_one_sb+0x19/0x20
    iterate_supers+0xb2/0x110
    sys_sync+0x44/0xb0
    ia32_do_call+0x13/0x13

  1343                         /*
  1344                          * This function is never used on a shmem/tmpfs
  1345                          * mapping, so a swap entry won't be found here.
  1346                          */
  1347                         BUG();

After commit 0cd6144aadd2 ("mm + fs: prepare for non-page entries in
page cache radix trees") this comment and BUG() are out of date because
exceptional entries can now appear in all mappings - as shadows of
recently evicted pages.

However, as Hugh Dickins notes,

  "it is truly surprising for a PAGECACHE_TAG_WRITEBACK (and probably
   any other PAGECACHE_TAG_*) to appear on an exceptional entry.

   I expect it comes down to an occasional race in RCU lookup of the
   radix_tree: lacking absolute synchronization, we might sometimes
   catch an exceptional entry, with the tag which really belongs with
   the unexceptional entry which was there an instant before."

And indeed, not only is the tree walk lockless, the tags are also read
in chunks, one radix tree node at a time.  There is plenty of time for
page reclaim to swoop in and replace a page that was already looked up
as tagged with a shadow entry.

Remove the BUG() and update the comment.  While reviewing all other
lookup sites for whether they properly deal with shadow entries of
evicted pages, update all the comments and fix memcg file charge moving
to not miss shmem/tmpfs swapcache pages.

Fixes: 0cd6144aadd2 ("mm + fs: prepare for non-page entries in page cache radix trees")
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Dave Jones <davej@redhat.com>
Acked-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10 years agomm/compaction: make isolate_freepages start at pageblock boundary
Vlastimil Babka [Tue, 6 May 2014 19:50:03 +0000 (12:50 -0700)]
mm/compaction: make isolate_freepages start at pageblock boundary

The compaction freepage scanner implementation in isolate_freepages()
starts by taking the current cc->free_pfn value as the first pfn.  In a
for loop, it scans from this first pfn to the end of the pageblock, and
then subtracts pageblock_nr_pages from the first pfn to obtain the first
pfn for the next for loop iteration.

This means that when cc->free_pfn starts at offset X rather than being
aligned on pageblock boundary, the scanner will start at offset X in all
scanned pageblock, ignoring potentially many free pages.  Currently this
can happen when

 a) zone's end pfn is not pageblock aligned, or

 b) through zone->compact_cached_free_pfn with CONFIG_HOLES_IN_ZONE
    enabled and a hole spanning the beginning of a pageblock

This patch fixes the problem by aligning the initial pfn in
isolate_freepages() to pageblock boundary.  This also permits replacing
the end-of-pageblock alignment within the for loop with a simple
pageblock_nr_pages increment.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Heesub Shin <heesub.shin@samsung.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Mel Gorman <mgorman@suse.de>
Acked-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Christoph Lameter <cl@linux.com>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Dongjun Shin <d.j.shin@samsung.com>
Cc: Sunghwan Yun <sunghwan.yun@samsung.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10 years agoMAINTAINERS: zswap/zbud: change maintainer email address
Seth Jennings [Tue, 6 May 2014 19:50:02 +0000 (12:50 -0700)]
MAINTAINERS: zswap/zbud: change maintainer email address

sjenning@linux.vnet.ibm.com is no longer a viable entity.

Signed-off-by: Seth Jennings <sjennings@variantweb.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10 years agomm/page-writeback.c: fix divide by zero in pos_ratio_polynom
Rik van Riel [Tue, 6 May 2014 19:50:01 +0000 (12:50 -0700)]
mm/page-writeback.c: fix divide by zero in pos_ratio_polynom

It is possible for "limit - setpoint + 1" to equal zero, after getting
truncated to a 32 bit variable, and resulting in a divide by zero error.

Using the fully 64 bit divide functions avoids this problem.  It also
will cause pos_ratio_polynom() to return the correct value when
(setpoint - limit) exceeds 2^32.

Also uninline pos_ratio_polynom, at Andrew's request.

Signed-off-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10 years agohugetlb: ensure hugepage access is denied if hugepages are not supported
Nishanth Aravamudan [Tue, 6 May 2014 19:50:00 +0000 (12:50 -0700)]
hugetlb: ensure hugepage access is denied if hugepages are not supported

Currently, I am seeing the following when I `mount -t hugetlbfs /none
/dev/hugetlbfs`, and then simply do a `ls /dev/hugetlbfs`.  I think it's
related to the fact that hugetlbfs is properly not correctly setting
itself up in this state?:

  Unable to handle kernel paging request for data at address 0x00000031
  Faulting instruction address: 0xc000000000245710
  Oops: Kernel access of bad area, sig: 11 [#1]
  SMP NR_CPUS=2048 NUMA pSeries
  ....

In KVM guests on Power, in a guest not backed by hugepages, we see the
following:

  AnonHugePages:         0 kB
  HugePages_Total:       0
  HugePages_Free:        0
  HugePages_Rsvd:        0
  HugePages_Surp:        0
  Hugepagesize:         64 kB

HPAGE_SHIFT == 0 in this configuration, which indicates that hugepages
are not supported at boot-time, but this is only checked in
hugetlb_init().  Extract the check to a helper function, and use it in a
few relevant places.

This does make hugetlbfs not supported (not registered at all) in this
environment.  I believe this is fine, as there are no valid hugepages
and that won't change at runtime.

[akpm@linux-foundation.org: use pr_info(), per Mel]
[akpm@linux-foundation.org: fix build when HPAGE_SHIFT is undefined]
Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10 years agoslub: fix memcg_propagate_slab_attrs
Vladimir Davydov [Tue, 6 May 2014 19:49:59 +0000 (12:49 -0700)]
slub: fix memcg_propagate_slab_attrs

After creating a cache for a memcg we should initialize its sysfs attrs
with the values from its parent.  That's what memcg_propagate_slab_attrs
is for.  Currently it's broken - we clearly muddled root-vs-memcg caches
there.  Let's fix it up.

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10 years agodrivers/rtc/rtc-pcf8523.c: fix month definition
Chris Cui [Tue, 6 May 2014 19:49:58 +0000 (12:49 -0700)]
drivers/rtc/rtc-pcf8523.c: fix month definition

PCF8523 uses 1-12 to represent month according to datasheet.
link: www.nxp.com/documents/data_sheet/PCF8523.pdf.

Signed-off-by: Chris Cui <chris.wei.cui@gmail.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Linus Torvalds [Tue, 6 May 2014 19:22:20 +0000 (12:22 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs

Pull vfs fixes from Al Viro:
 "dcache fixes + kvfree() (uninlined, exported by mm/util.c) + posix_acl
  bugfix from hch"

The dcache fixes are for a subtle LRU list corruption bug reported by
Miklos Szeredi, where people inside IBM saw list corruptions with the
LTP/host01 test.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  nick kvfree() from apparmor
  posix_acl: handle NULL ACL in posix_acl_equiv_mode
  dcache: don't need rcu in shrink_dentry_list()
  more graceful recovery in umount_collect()
  don't remove from shrink list in select_collect()
  dentry_kill(): don't try to remove from shrink list
  expand the call of dentry_lru_del() in dentry_kill()
  new helper: dentry_free()
  fold try_prune_one_dentry()
  fold d_kill() and d_free()
  fix races between __d_instantiate() and checks of dentry flags