OSDN Git Service

uclinux-h8/linux.git
8 years agofuse: move list_del_init() from request_end() into callers
Miklos Szeredi [Wed, 1 Jul 2015 14:26:04 +0000 (16:26 +0200)]
fuse: move list_del_init() from request_end() into callers

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
8 years agofuse: duplicate ->connected in pqueue
Miklos Szeredi [Wed, 1 Jul 2015 14:26:04 +0000 (16:26 +0200)]
fuse: duplicate ->connected in pqueue

This will allow checking ->connected just with the processing queue lock.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>
8 years agofuse: separate out processing queue
Miklos Szeredi [Wed, 1 Jul 2015 14:26:04 +0000 (16:26 +0200)]
fuse: separate out processing queue

This is just two fields: fc->io and fc->processing.

This patch just rearranges the fields, no functional change.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>
8 years agofuse: simplify request_wait()
Miklos Szeredi [Wed, 1 Jul 2015 14:26:03 +0000 (16:26 +0200)]
fuse: simplify request_wait()

wait_event_interruptible_exclusive_locked() will do everything
request_wait() does, so replace it.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>
8 years agofuse: no fc->lock for iqueue parts
Miklos Szeredi [Wed, 1 Jul 2015 14:26:03 +0000 (16:26 +0200)]
fuse: no fc->lock for iqueue parts

Remove fc->lock protection from input queue members, now protected by
fiq->waitq.lock.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>
8 years agofuse: allow interrupt queuing without fc->lock
Miklos Szeredi [Wed, 1 Jul 2015 14:26:03 +0000 (16:26 +0200)]
fuse: allow interrupt queuing without fc->lock

Interrupt is only queued after the request has been sent to userspace.
This is either done in request_wait_answer() or fuse_dev_do_read()
depending on which state the request is in at the time of the interrupt.
If it's not yet sent, then queuing the interrupt is postponed until the
request is read.  Otherwise (the request has already been read and is
waiting for an answer) the interrupt is queued immedidately.

We want to call queue_interrupt() without fc->lock protection, in which
case there can be a race between the two functions:

 - neither of them queue the interrupt (thinking the other one has already
   done it).

 - both of them queue the interrupt

The first one is prevented by adding memory barriers, the second is
prevented by checking (under fiq->waitq.lock) if the interrupt has already
been queued.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
8 years agofuse: iqueue locking
Miklos Szeredi [Wed, 1 Jul 2015 14:26:02 +0000 (16:26 +0200)]
fuse: iqueue locking

Use fiq->waitq.lock for protecting members of struct fuse_iqueue and
FR_PENDING request flag, previously protected by fc->lock.

Following patches will remove fc->lock protection from these members.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>
8 years agofuse: dev read: split list_move
Miklos Szeredi [Wed, 1 Jul 2015 14:26:02 +0000 (16:26 +0200)]
fuse: dev read: split list_move

Different lists will need different locks.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>
8 years agofuse: abort: group iqueue accesses
Miklos Szeredi [Wed, 1 Jul 2015 14:26:02 +0000 (16:26 +0200)]
fuse: abort: group iqueue accesses

Rearrange fuse_abort_conn() so that input queue accesses are grouped
together.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>
8 years agofuse: duplicate ->connected in iqueue
Miklos Szeredi [Wed, 1 Jul 2015 14:26:01 +0000 (16:26 +0200)]
fuse: duplicate ->connected in iqueue

This will allow checking ->connected just with the input queue lock.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>
8 years agofuse: separate out input queue
Miklos Szeredi [Wed, 1 Jul 2015 14:26:01 +0000 (16:26 +0200)]
fuse: separate out input queue

The input queue contains normal requests (fc->pending), forgets
(fc->forget_*) and interrupts (fc->interrupts).  There's also fc->waitq and
fc->fasync for waking up the readers of the fuse device when a request is
available.

The fc->reqctr is also moved to the input queue (assigned to the request
when the request is added to the input queue.

This patch just rearranges the fields, no functional change.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>
8 years agofuse: req state use flags
Miklos Szeredi [Wed, 1 Jul 2015 14:26:01 +0000 (16:26 +0200)]
fuse: req state use flags

Use flags for representing the state in fuse_req.  This is needed since
req->list will be protected by different locks in different states, hence
we'll want the state itself to be split into distinct bits, each protected
with the relevant lock in that state.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
8 years agofuse: simplify req states
Miklos Szeredi [Wed, 1 Jul 2015 14:26:00 +0000 (16:26 +0200)]
fuse: simplify req states

FUSE_REQ_INIT is actually the same state as FUSE_REQ_PENDING and
FUSE_REQ_READING and FUSE_REQ_WRITING can be merged into a common
FUSE_REQ_IO state.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>
8 years agofuse: don't hold lock over request_wait_answer()
Miklos Szeredi [Wed, 1 Jul 2015 14:26:00 +0000 (16:26 +0200)]
fuse: don't hold lock over request_wait_answer()

Only hold fc->lock over sections of request_wait_answer() that actually
need it.  If wait_event_interruptible() returns zero, it means that the
request finished.  Need to add memory barriers, though, to make sure that
all relevant data in the request is synchronized.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
8 years agofuse: simplify unique ctr
Miklos Szeredi [Wed, 1 Jul 2015 14:26:00 +0000 (16:26 +0200)]
fuse: simplify unique ctr

Since it's a 64bit counter, it's never gonna wrap around.  Remove code
dealing with that possibility.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>
8 years agofuse: rework abort
Miklos Szeredi [Wed, 1 Jul 2015 14:25:59 +0000 (16:25 +0200)]
fuse: rework abort

Splice fc->pending and fc->processing lists into a common kill list while
holding fc->lock.

By the time we release fc->lock, pending and processing lists are empty and
the io list contains only locked requests.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>
8 years agofuse: fold helpers into abort
Miklos Szeredi [Wed, 1 Jul 2015 14:25:59 +0000 (16:25 +0200)]
fuse: fold helpers into abort

Fold end_io_requests() and end_queued_requests() into fuse_abort_conn().

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>
8 years agofuse: use per req lock for lock/unlock_request()
Miklos Szeredi [Wed, 1 Jul 2015 14:25:58 +0000 (16:25 +0200)]
fuse: use per req lock for lock/unlock_request()

Reuse req->waitq.lock for protecting FR_ABORTED and FR_LOCKED flags.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>
8 years agofuse: req use bitops
Miklos Szeredi [Wed, 1 Jul 2015 14:25:58 +0000 (16:25 +0200)]
fuse: req use bitops

Finer grained locking will mean there's no single lock to protect
modification of bitfileds in fuse_req.

So move to using bitops.  Can use the non-atomic variants for those which
happen while the request definitely has only one reference.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>
8 years agofuse: simplify request abort
Miklos Szeredi [Wed, 1 Jul 2015 14:25:58 +0000 (16:25 +0200)]
fuse: simplify request abort

 - don't end the request while req->locked is true

 - make unlock_request() return an error if the connection was aborted

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>
8 years agofuse: call fuse_abort_conn() in dev release
Miklos Szeredi [Wed, 1 Jul 2015 14:25:57 +0000 (16:25 +0200)]
fuse: call fuse_abort_conn() in dev release

fuse_abort_conn() does all the work done by fuse_dev_release() and more.
"More" consists of:

end_io_requests(fc);
wake_up_all(&fc->waitq);
kill_fasync(&fc->fasync, SIGIO, POLL_IN);

All of which should be no-op (WARN_ON's added).

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>
8 years agofuse: fold fuse_request_send_nowait() into single caller
Miklos Szeredi [Wed, 1 Jul 2015 14:25:57 +0000 (16:25 +0200)]
fuse: fold fuse_request_send_nowait() into single caller

And the same with fuse_request_send_nowait_locked().

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>
8 years agofuse: check conn_error earlier
Miklos Szeredi [Wed, 1 Jul 2015 14:25:57 +0000 (16:25 +0200)]
fuse: check conn_error earlier

fc->conn_error is set once in FUSE_INIT reply and never cleared.  Check it
in request allocation, there's no sense in doing all the preparation if
sending will surely fail.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>
8 years agofuse: account as waiting before queuing for background
Miklos Szeredi [Wed, 1 Jul 2015 14:25:56 +0000 (16:25 +0200)]
fuse: account as waiting before queuing for background

Move accounting of fc->num_waiting to the point where the request actually
starts waiting.  This is earlier than the current queue_request() for
background requests, since they might be waiting on the fc->bg_queue before
being queued on fc->pending.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>
8 years agofuse: reset waiting
Miklos Szeredi [Wed, 1 Jul 2015 14:25:56 +0000 (16:25 +0200)]
fuse: reset waiting

Reset req->waiting in fuse_put_request().  This is needed for correct
accounting in fc->num_waiting for reserved requests.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
8 years agofuse: fix background request if not connected
Miklos Szeredi [Wed, 1 Jul 2015 14:25:56 +0000 (16:25 +0200)]
fuse: fix background request if not connected

request_end() expects fc->num_background and fc->active_background to have
been incremented, which is not the case in fuse_request_send_nowait()
failure path.  So instead just call the ->end() callback (which is actually
set by all callers).

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>
8 years agofuse: initialize fc->release before calling it
Miklos Szeredi [Wed, 1 Jul 2015 14:25:55 +0000 (16:25 +0200)]
fuse: initialize fc->release before calling it

fc->release is called from fuse_conn_put() which was used in the error
cleanup before fc->release was initialized.

[Jeremiah Mahler <jmmahler@gmail.com>: assign fc->release after calling
fuse_conn_init(fc) instead of before.]

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Fixes: a325f9b92273 ("fuse: update fuse_conn_init() and separate out fuse_conn_kill()")
Cc: <stable@vger.kernel.org> #v2.6.31+
8 years agoarm64: Don't report clear pmds and puds as huge
Christoffer Dall [Wed, 1 Jul 2015 12:08:31 +0000 (14:08 +0200)]
arm64: Don't report clear pmds and puds as huge

The current pmd_huge() and pud_huge() functions simply check if the table
bit is not set and reports the entries as huge in that case.  This is
counter-intuitive as a clear pmd/pud cannot also be a huge pmd/pud, and
it is inconsistent with at least arm and x86.

To prevent others from making the same mistake as me in looking at code
that calls these functions and to fix an issue with KVM on arm64 that
causes memory corruption due to incorrect page reference counting
resulting from this mistake, let's change the behavior.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: Steve Capper <steve.capper@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Fixes: 084bd29810a5 ("ARM64: mm: HugeTLB support.")
Cc: <stable@vger.kernel.org> # 3.11+
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
8 years agohwspinlock: qcom: Correct msb in regmap_field
Bjorn Andersson [Fri, 26 Jun 2015 21:47:21 +0000 (14:47 -0700)]
hwspinlock: qcom: Correct msb in regmap_field

msb of the regmap_field was mistakenly given the value 32, to set all bits
in the regmap update mask; although incorrect this worked until 921cc294,
where the mask calculation was corrected.

Signed-off-by: Bjorn Andersson <bjorn.andersson@sonymobile.com>
Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
8 years agoprintk: Increase maximum CONFIG_LOG_BUF_SHIFT from 21 to 25
Ingo Molnar [Wed, 1 Jul 2015 08:19:11 +0000 (10:19 +0200)]
printk: Increase maximum CONFIG_LOG_BUF_SHIFT from 21 to 25

So I tried to some kernel debugging that produced a ton of kernel messages
on a big box, and wanted to save them all: but CONFIG_LOG_BUF_SHIFT maxes
out at 21 (2 MB).

Increase it to 25 (32 MB).

This does not affect any existing config or defaults.

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agotime: Remove development rules from Kbuild/Makefile
Thomas Gleixner [Wed, 1 Jul 2015 07:57:35 +0000 (09:57 +0200)]
time: Remove development rules from Kbuild/Makefile

time.o gets rebuilt unconditionally due to a leftover Makefile rule
which was placed there for development purposes.

Remove it along with the commented out always rule in the toplevel
Kbuild file.

Fixes: 0a227985d4a9 'time: Move timeconst.h into include/generated'
Reported-by; Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Nicholas Mc Guire <der.herr@hofr.at>
8 years agoiommu/amd: Introduce protection_domain_init() function
Joerg Roedel [Tue, 30 Jun 2015 06:56:11 +0000 (08:56 +0200)]
iommu/amd: Introduce protection_domain_init() function

This function contains the common parts between the
initialization of dma_ops_domains and usual protection
domains. This also fixes a long-standing bug which was
uncovered by recent changes, in which the api_lock was not
initialized for dma_ops_domains.

Reported-by: George Wang <xuw2015@gmail.com>
Tested-by: George Wang <xuw2015@gmail.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
8 years agofs/file.c: __fget() and dup2() atomicity rules
Eric Dumazet [Mon, 29 Jun 2015 15:10:30 +0000 (17:10 +0200)]
fs/file.c: __fget() and dup2() atomicity rules

__fget() does lockless fetch of pointer from the descriptor
table, attempts to grab a reference and treats "it was already
zero" as "it's already gone from the table, we just hadn't
seen the store, let's fail".  Unfortunately, that breaks the
atomicity of dup2() - __fget() might see the old pointer,
notice that it's been already dropped and treat that as
"it's closed".  What we should be getting is either the
old file or new one, depending whether we come before or after
dup2().

Dmitry had following test failing sometimes :

int fd;
void *Thread(void *x) {
  char buf;
  int n = read(fd, &buf, 1);
  if (n != 1)
    exit(printf("read failed: n=%d errno=%d\n", n, errno));
  return 0;
}

int main()
{
  fd = open("/dev/urandom", O_RDONLY);
  int fd2 = open("/dev/urandom", O_RDONLY);
  if (fd == -1 || fd2 == -1)
    exit(printf("open failed\n"));
  pthread_t th;
  pthread_create(&th, 0, Thread, 0);
  if (dup2(fd2, fd) == -1)
    exit(printf("dup2 failed\n"));
  pthread_join(th, 0);
  if (close(fd) == -1)
    exit(printf("close failed\n"));
  if (close(fd2) == -1)
    exit(printf("close failed\n"));
  printf("DONE\n");
  return 0;
}

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
8 years agofs/file.c: don't acquire files->file_lock in fd_install()
Eric Dumazet [Tue, 30 Jun 2015 13:54:08 +0000 (15:54 +0200)]
fs/file.c: don't acquire files->file_lock in fd_install()

Mateusz Guzik reported :

 Currently obtaining a new file descriptor results in locking fdtable
 twice - once in order to reserve a slot and second time to fill it.

Holding the spinlock in __fd_install() is needed in case a resize is
done, or to prevent a resize.

Mateusz provided an RFC patch and a micro benchmark :
  http://people.redhat.com/~mguzik/pipebench.c

A resize is an unlikely operation in a process lifetime,
as table size is at least doubled at every resize.

We can use RCU instead of the spinlock.

__fd_install() must wait if a resize is in progress.

The resize must block new __fd_install() callers from starting,
and wait that ongoing install are finished (synchronize_sched())

resize should be attempted by a single thread to not waste resources.

rcu_sched variant is used, as __fd_install() and expand_fdtable() run
from process context.

It gives us a ~30% speedup using pipebench on a dual Intel(R) Xeon(R)
CPU E5-2696 v2 @ 2.50GHz

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Mateusz Guzik <mguzik@redhat.com>
Acked-by: Mateusz Guzik <mguzik@redhat.com>
Tested-by: Mateusz Guzik <mguzik@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
8 years agofs:super:get_anon_bdev: fix race condition could cause dev exceed its upper limitation
Wang YanQing [Tue, 23 Jun 2015 10:54:45 +0000 (18:54 +0800)]
fs:super:get_anon_bdev: fix race condition could cause dev exceed its upper limitation

Execution of get_anon_bdev concurrently and preemptive kernel all
could bring race condition, it isn't enough to check dev against
its upper limitation with equality operator only.

This patch fix it.

Signed-off-by: Wang YanQing <udknight@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
8 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile
Linus Torvalds [Wed, 1 Jul 2015 04:47:12 +0000 (21:47 -0700)]
Merge git://git./linux/kernel/git/cmetcalf/linux-tile

Pull arch/tile updates from Chris Metcalf:
 "These are a grab bag of changes to improve debugging and respond to a
  variety of issues raised on LKML over the last couple of months"

* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
  tile: avoid a "label not used" warning in do_page_fault()
  tile: vdso: use raw_read_seqcount_begin() in vdso
  tile: force CONFIG_TILEGX if ARCH != tilepro
  tile: improve stack backtrace
  tile: fix "odd fault" warning for stack backtraces
  tile: set up initial stack top to honor STACK_TOP_DELTA
  tile: support delivering NMIs for multicore backtrace
  drivers/tty/hvc/hvc_tile.c: properly return -EAGAIN
  tile: add <asm/word-at-a-time.h> and enable support functions
  tile: use READ_ONCE() in arch_spin_is_locked()
  tile: modify arch_spin_unlock_wait() semantics

8 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Linus Torvalds [Wed, 1 Jul 2015 04:44:14 +0000 (21:44 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/s390/linux

Pull more s390 updates from Martin Schwidefsky:
 "There is one larger patch for the AP bus code to make it work with the
  longer reset periods of the latest crypto cards.

  A new default configuration, a naming cleanup for SMP and a few fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/kdump: fix compile for !SMP
  s390/kdump: fix nosmt kernel parameter
  s390: new default configuration
  s390/smp: cleanup core vs. cpu in the SCLP interface
  s390/smp: fix sigp cpu detection loop
  s390/zcrypt: Fixed reset and interrupt handling of AP queues
  s390/kdump: fix REGSET_VX_LOW vector register ELF notes
  s390/bpf: Fix backward jumps

8 years agoMerge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6
Linus Torvalds [Wed, 1 Jul 2015 04:40:07 +0000 (21:40 -0700)]
Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6

Pull CIFS/SMB3 updates from Steve French:
 "Includes two bug fixes, as well as (minimal) support for the new
  protocol dialect (SMB3.1.1), and support for two ioctls including
  reflink (duplicate extents) file copy and set integrity"

* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: Unset CIFS_MOUNT_POSIX_PATHS flag when following dfs mounts
  Update negotiate protocol for SMB3.11 dialect
  Add ioctl to set integrity
  Add Get/Set Integrity Information structure definitions
  Add reflink copy over SMB3.11 with new FSCTL_DUPLICATE_EXTENTS
  Add SMB3.11 mount option synonym for new dialect
  add struct FILE_STANDARD_INFO
  Make dialect negotiation warning message easier to read
  Add defines and structs for smb3.1 dialect
  Allow parsing vers=3.11 on cifs mount
  client MUST ignore EncryptionKeyLength if CAP_EXTENDED_SECURITY is set

8 years agovfs: avoid creation of inode number 0 in get_next_ino
Carlos Maiolino [Thu, 25 Jun 2015 15:25:58 +0000 (12:25 -0300)]
vfs: avoid creation of inode number 0 in get_next_ino

currently, get_next_ino() is able to create inodes with inode number = 0.
This have a bad impact in the filesystems relying in this function to generate
inode numbers.

While there is no problem at all in having inodes with number 0, userspace tools
which handle file management tasks can have problems handling these files, like
for example, the impossiblity of users to delete these files, since glibc will
ignore them. So, I believe the best way is kernel to avoid creating them.

This problem has been raised previously, but the old thread didn't have any
other update for a year+, and I've seen too many users hitting the same issue
regarding the impossibility to delete files while using filesystems relying on
this function. So, I'm starting the thread again, with the same patch
that I believe is enough to address this problem.

Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
8 years agoMerge tag 'xfs-for-linus-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Wed, 1 Jul 2015 03:16:08 +0000 (20:16 -0700)]
Merge tag 'xfs-for-linus-4.2-rc1' of git://git./linux/kernel/git/dgc/linux-xfs

Pul xfs updates from Dave Chinner:
 "There's a couple of small API changes to the core DAX code which
  required small changes to the ext2 and ext4 code bases, but otherwise
  everything is within the XFS codebase.

  This update contains:

   - A new sparse on-disk inode record format to allow small extents to
     be used for inode allocation when free space is fragmented.

   - DAX support.  This includes minor changes to the DAX core code to
     fix problems with lock ordering and bufferhead mapping abuse.

   - transaction commit interface cleanup

   - removal of various unnecessary XFS specific type definitions

   - cleanup and optimisation of freelist preparation before allocation

   - various minor cleanups

   - bug fixes for
- transaction reservation leaks
- incorrect inode logging in unwritten extent conversion
- mmap lock vs freeze ordering
- remote symlink mishandling
- attribute fork removal issues"

* tag 'xfs-for-linus-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs: (49 commits)
  xfs: don't truncate attribute extents if no extents exist
  xfs: clean up XFS_MIN_FREELIST macros
  xfs: sanitise error handling in xfs_alloc_fix_freelist
  xfs: factor out free space extent length check
  xfs: xfs_alloc_fix_freelist() can use incore perag structures
  xfs: remove xfs_caddr_t
  xfs: use void pointers in log validation helpers
  xfs: return a void pointer from xfs_buf_offset
  xfs: remove inst_t
  xfs: remove __psint_t and __psunsigned_t
  xfs: fix remote symlinks on V5/CRC filesystems
  xfs: fix xfs_log_done interface
  xfs: saner xfs_trans_commit interface
  xfs: remove the flags argument to xfs_trans_cancel
  xfs: pass a boolean flag to xfs_trans_free_items
  xfs: switch remaining xfs_trans_dup users to xfs_trans_roll
  xfs: check min blks for random debug mode sparse allocations
  xfs: fix sparse inodes 32-bit compile failure
  xfs: add initial DAX support
  xfs: add DAX IO path support
  ...

8 years agoMerge branch 'for-linus-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mason...
Linus Torvalds [Wed, 1 Jul 2015 03:07:45 +0000 (20:07 -0700)]
Merge branch 'for-linus-4.2' of git://git./linux/kernel/git/mason/linux-btrfs

Pull btrfs updates from Chris Mason:
 "Outside of our usual batch of fixes, this integrates the subvolume
  quota updates that Qu Wenruo from Fujitsu has been working on for a
  few releases now.  He gets an extra gold star for making btrfs smaller
  this time, and fixing a number of quota corners in the process.

  Dave Sterba tested and integrated Anand Jain's sysfs improvements.
  Outside of exporting a symbol (ack'd by Greg) these are all internal
  to btrfs and it's mostly cleanups and fixes.  Anand also attached some
  of our sysfs objects to our internal device management structs instead
  of an object off the super block.  It will make device management
  easier overall and it's a better fit for how the sysfs files are used.
  None of the existing sysfs files are moved around.

  Thanks for all the fixes everyone"

* 'for-linus-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (87 commits)
  btrfs: delayed-ref: double free in btrfs_add_delayed_tree_ref()
  Btrfs: Check if kobject is initialized before put
  lib: export symbol kobject_move()
  Btrfs: sysfs: add support to show replacing target in the sysfs
  Btrfs: free the stale device
  Btrfs: use received_uuid of parent during send
  Btrfs: fix use-after-free in btrfs_replay_log
  btrfs: wait for delayed iputs on no space
  btrfs: qgroup: Make snapshot accounting work with new extent-oriented qgroup.
  btrfs: qgroup: Add the ability to skip given qgroup for old/new_roots.
  btrfs: ulist: Add ulist_del() function.
  btrfs: qgroup: Cleanup the old ref_node-oriented mechanism.
  btrfs: qgroup: Switch self test to extent-oriented qgroup mechanism.
  btrfs: qgroup: Switch to new extent-oriented qgroup mechanism.
  btrfs: qgroup: Switch rescan to new mechanism.
  btrfs: qgroup: Add new qgroup calculation function btrfs_qgroup_account_extents().
  btrfs: backref: Add special time_seq == (u64)-1 case for btrfs_find_all_roots().
  btrfs: qgroup: Add new function to record old_roots.
  btrfs: qgroup: Record possible quota-related extent for qgroup.
  btrfs: qgroup: Add function qgroup_update_counters().
  ...

8 years agoMerge branch 'for-linus' of git://git.kernel.dk/linux-block
Linus Torvalds [Wed, 1 Jul 2015 02:46:34 +0000 (19:46 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block

Pull more block layer patches from Jens Axboe:
 "A few later arrivers that I didn't fold into the first pull request,
  so we had a chance to run some testing.  This contains:

   - NVMe:
        - Set of fixes from Keith
        - 4.4 and earlier gcc build fix from Andrew

   - small set of xen-blk{back,front} fixes from Bob Liu.

   - warnings fix for bogus inline statement in I_BDEV() from Geert.

   - error code fixup for SG_IO ioctl from Paolo Bonzini"

* 'for-linus' of git://git.kernel.dk/linux-block:
  drivers/block/nvme-core.c: fix build with gcc-4.4.4
  bdi: Remove "inline" keyword from exported I_BDEV() implementation
  block: fix bogus EFAULT error from SG_IO ioctl
  NVMe: Fix filesystem deadlock on removal
  NVMe: Failed controller initialization fixes
  NVMe: Unify controller probe and resume
  NVMe: Don't use fake status on cancelled command
  NVMe: Fix device cleanup on initialization failure
  drivers: xen-blkfront: only talk_to_blkback() when in XenbusStateInitialising
  xen/block: add multi-page ring support
  driver: xen-blkfront: move talk_to_blkback to a more suitable place
  drivers: xen-blkback: delay pending_req allocation to connect_ring

8 years agogenalloc: rename of_get_named_gen_pool() to of_gen_pool_get()
Vladimir Zapolskiy [Tue, 30 Jun 2015 22:00:07 +0000 (15:00 -0700)]
genalloc: rename of_get_named_gen_pool() to of_gen_pool_get()

To be consistent with other kernel interface namings, rename
of_get_named_gen_pool() to of_gen_pool_get().  In the original function
name "_named" suffix references to a device tree property, which contains
a phandle to a device and the corresponding device driver is assumed to
register a gen_pool object.

Due to a weak relation and to avoid any confusion (e.g.  in future
possible scenario if gen_pool objects are named) the suffix is removed.

[sfr@canb.auug.org.au: crypto/marvell/cesa - fix up for of_get_named_gen_pool() rename]
Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
Cc: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Philipp Zabel <p.zabel@pengutronix.de>
Cc: Shawn Guo <shawn.guo@linaro.org>
Cc: Sascha Hauer <kernel@pengutronix.de>
Cc: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: Vinod Koul <vinod.koul@intel.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Jaroslav Kysela <perex@perex.cz>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Boris BREZILLON <boris.brezillon@free-electrons.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agogenalloc: rename dev_get_gen_pool() to gen_pool_get()
Vladimir Zapolskiy [Tue, 30 Jun 2015 22:00:03 +0000 (15:00 -0700)]
genalloc: rename dev_get_gen_pool() to gen_pool_get()

To be consistent with other genalloc interface namings, rename
dev_get_gen_pool() to gen_pool_get().  The original omitted "dev_" prefix
is removed, since it points to argument type of the function, and so it
does not bring any useful information.

[akpm@linux-foundation.org: update arch/arm/mach-socfpga/pm.c]
Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Philipp Zabel <p.zabel@pengutronix.de>
Cc: Shawn Guo <shawn.guo@linaro.org>
Cc: Sascha Hauer <kernel@pengutronix.de>
Cc: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: Vinod Koul <vinod.koul@intel.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Mark Brown <broonie@kernel.org>
Cc: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Alan Tull <atull@opensource.altera.com>
Cc: Dinh Nguyen <dinguyen@opensource.altera.com>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agox86: opt into HAVE_COPY_THREAD_TLS, for both 32-bit and 64-bit
Josh Triplett [Tue, 30 Jun 2015 22:00:00 +0000 (15:00 -0700)]
x86: opt into HAVE_COPY_THREAD_TLS, for both 32-bit and 64-bit

For 32-bit userspace on a 64-bit kernel, this requires modifying
stub32_clone to actually swap the appropriate arguments to match
CONFIG_CLONE_BACKWARDS, rather than just leaving the C argument for tls
broken.

Patch co-authored by Josh Triplett and Thiago Macieira.

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thiago Macieira <thiago.macieira@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoMAINTAINERS: add zpool
Dan Streetman [Tue, 30 Jun 2015 21:59:57 +0000 (14:59 -0700)]
MAINTAINERS: add zpool

Add entry for zpool to MAINTAINERS file.

Signed-off-by: Dan Streetman <ddstreet@ieee.org>
Cc: Seth Jennings <sjennings@variantweb.net>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoMAINTAINERS: BCACHE: Kent Overstreet has changed email address
Joe Perches [Tue, 30 Jun 2015 21:59:54 +0000 (14:59 -0700)]
MAINTAINERS: BCACHE: Kent Overstreet has changed email address

Kent's email address in MAINTAINERS seems to be invalid.
This was his last sign-off address, so use that if appropriate.

Fix the S: status entry while there.

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoMAINTAINERS: move Jens Osterkamp to CREDITS
Joe Perches [Tue, 30 Jun 2015 21:59:51 +0000 (14:59 -0700)]
MAINTAINERS: move Jens Osterkamp to CREDITS

Jens' email address bounces, so move his name and entry to CREDITS.

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Ishizaki Kou <kou.ishizaki@toshiba.co.jp>
Cc: Jens Osterkamp <jens@linux.vnet.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoMAINTAINERS: remove unused nbd.h pattern
Joe Perches [Tue, 30 Jun 2015 21:59:48 +0000 (14:59 -0700)]
MAINTAINERS: remove unused nbd.h pattern

Commit 13e71d69cc74 ("nbd: Remove kernel internal header") deleted the
file, remove the pattern.

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Markus Pargmann <mpa@pengutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoMAINTAINERS: update brcm gpio filename pattern
Joe Perches [Tue, 30 Jun 2015 21:59:45 +0000 (14:59 -0700)]
MAINTAINERS: update brcm gpio filename pattern

Commit 23a71fd616bf ("dt-bindings: brcm: rationalize Broadcom
documentation naming") renamed the file, update the pattern.

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Scott Branden <sbranden@broadcom.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoMAINTAINERS: update brcm dts pattern
Joe Perches [Tue, 30 Jun 2015 21:59:42 +0000 (14:59 -0700)]
MAINTAINERS: update brcm dts pattern

Commit 8c0b9ee8665c ("MIPS: Move device-trees into vendor
sub-directories") moved the files, update the pattern.

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Andrew Bresticker <abrestic@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoMAINTAINERS: update sound soc intel patterns
Joe Perches [Tue, 30 Jun 2015 21:59:39 +0000 (14:59 -0700)]
MAINTAINERS: update sound soc intel patterns

Commit 2106241a6803 ("ASoC: Intel: create common folder and move common
files in") moved the files around.  Update the patterns.

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Jie Yang <yang.jie@intel.com>
Acked-by: Jie Yang <yang.jie@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoMAINTAINERS: remove website for paride
Sudip Mukherjee [Tue, 30 Jun 2015 21:59:36 +0000 (14:59 -0700)]
MAINTAINERS: remove website for paride

The webpage mentioned is not working,

Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoMAINTAINERS: update Emulex ocrdma email addresses
Laurent Navet [Tue, 30 Jun 2015 21:59:33 +0000 (14:59 -0700)]
MAINTAINERS: update Emulex ocrdma email addresses

@emulex.com addresses respond to use @avagotech.com.

Signed-off-by: Laurent Navet <laurent.navet@gmail.com>
Acked-By: Devesh Sharma <devesh.sharma@avagotech.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agobcache: use kvfree() in various places
Pekka Enberg [Tue, 30 Jun 2015 21:59:30 +0000 (14:59 -0700)]
bcache: use kvfree() in various places

Use kvfree() instead of open-coding it.

Signed-off-by: Pekka Enberg <penberg@kernel.org>
Cc: Kent Overstreet <kmo@daterainc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agolibcxgbi: use kvfree() in cxgbi_free_big_mem()
Pekka Enberg [Tue, 30 Jun 2015 21:59:27 +0000 (14:59 -0700)]
libcxgbi: use kvfree() in cxgbi_free_big_mem()

Use kvfree() instead of open-coding it.

Signed-off-by: Pekka Enberg <penberg@kernel.org>
Cc: "James E.J. Bottomley" <JBottomley@odin.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agotarget: use kvfree() in session alloc and free
Pekka Enberg [Tue, 30 Jun 2015 21:59:24 +0000 (14:59 -0700)]
target: use kvfree() in session alloc and free

Use kvfree() instead of open-coding it.

Signed-off-by: Pekka Enberg <penberg@kernel.org>
Cc: "Nicholas A. Bellinger" <nab@linux-iscsi.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoIB/ehca: use kvfree() in ipz_queue_{cd}tor()
Pekka Enberg [Tue, 30 Jun 2015 21:59:21 +0000 (14:59 -0700)]
IB/ehca: use kvfree() in ipz_queue_{cd}tor()

Use kvfree() instead of open-coding it.

Signed-off-by: Pekka Enberg <penberg@kernel.org>
Cc: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Cc: Christoph Raisch <raisch@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agodrm/nouveau/gem: use kvfree() in u_free()
Pekka Enberg [Tue, 30 Jun 2015 21:59:18 +0000 (14:59 -0700)]
drm/nouveau/gem: use kvfree() in u_free()

Use kvfree() instead of open-coding it.

Signed-off-by: Pekka Enberg <penberg@kernel.org>
Cc: David Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agodrm: use kvfree() in drm_free_large()
Pekka Enberg [Tue, 30 Jun 2015 21:59:15 +0000 (14:59 -0700)]
drm: use kvfree() in drm_free_large()

Use kvfree() instead of open-coding it.

Signed-off-by: Pekka Enberg <penberg@kernel.org>
Cc: David Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agocxgb4: use kvfree() in t4_free_mem()
Pekka Enberg [Tue, 30 Jun 2015 21:59:12 +0000 (14:59 -0700)]
cxgb4: use kvfree() in t4_free_mem()

Use kvfree() instead of open-coding it.

Signed-off-by: Pekka Enberg <penberg@kernel.org>
Cc: Hariprasad S <hariprasad@chelsio.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agocxgb3: use kvfree() in cxgb_free_mem()
Pekka Enberg [Tue, 30 Jun 2015 21:59:09 +0000 (14:59 -0700)]
cxgb3: use kvfree() in cxgb_free_mem()

Use kvfree() instead of open-coding it.

Signed-off-by: Pekka Enberg <penberg@kernel.org>
Cc: Santosh Raspatur <santosh@chelsio.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agokernel/relay.c: use kvfree() in relay_free_page_array()
Pekka Enberg [Tue, 30 Jun 2015 21:59:06 +0000 (14:59 -0700)]
kernel/relay.c: use kvfree() in relay_free_page_array()

Use kvfree() instead of open-coding it.

Signed-off-by: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoprintk: improve the description of /dev/kmsg line format
Antonio Ospite [Tue, 30 Jun 2015 21:59:03 +0000 (14:59 -0700)]
printk: improve the description of /dev/kmsg line format

The comment about /dev/kmsg does not mention the additional values which
may actually be exported, fix that.

Also move up the part of the comment instructing the users to ignore these
additional values, this way the reading is more fluent and logically
compact.

Signed-off-by: Antonio Ospite <ao2@ao2.it>
Cc: Joe Perches <joe@perches.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoarch/unicore32/kernel/fpu-ucf64.c: remove unnecessary KERN_ERR
Masanari Iida [Tue, 30 Jun 2015 21:59:00 +0000 (14:59 -0700)]
arch/unicore32/kernel/fpu-ucf64.c: remove unnecessary KERN_ERR

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agodrivers/scsi/scsi_debug.c: resolve sg buffer const-ness issue
Dave Gordon [Tue, 30 Jun 2015 21:58:57 +0000 (14:58 -0700)]
drivers/scsi/scsi_debug.c: resolve sg buffer const-ness issue

do_device_access() takes a separate parameter to indicate the direction of
data transfer, which it used to use to select the appropriate function out
of sg_pcopy_{to,from}_buffer().  However these two functions now have

So this patch makes it bypass these wrappers and call the underlying
function sg_copy_buffer() directly; this has the same calling style as
do_device_access() i.e.  a separate direction-of-transfer parameter and no
pointers-to-const, so skipping the wrappers not only eliminates the
warning, it also make the code simpler :)

[akpm@linux-foundation.org: fix very broken build]
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agolib/scatterlist: mark input buffer parameters as 'const'
Dave Gordon [Tue, 30 Jun 2015 21:58:54 +0000 (14:58 -0700)]
lib/scatterlist: mark input buffer parameters as 'const'

The 'buf' parameter of sg(p)copy_from_buffer() can and should be
const-qualified, although because of the shared implementation of
_to_buffer() and _from_buffer(), we have to cast this away internally.

This means that callers who have a 'const' buffer containing the data to
be copied to the sg-list no longer have to cast away the const-ness
themselves.  It also enables improved coverage by code analysis tools.

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agolib/scatterlist.c: fix kerneldoc for sg_pcopy_{to,from}_buffer()
Dave Gordon [Tue, 30 Jun 2015 21:58:52 +0000 (14:58 -0700)]
lib/scatterlist.c: fix kerneldoc for sg_pcopy_{to,from}_buffer()

The kerneldoc for the functions doesn't match the code; the last two
parameters (buflen, skip) have been transposed, which is confusing,
especially as they're both integral types and the compiler won't warn
about swapping them.

These functions and the kerneldoc were introduced in commit:
    df642cea lib/scatterlist: introduce sg_pcopy_from_buffer() ...
    Author: Akinobu Mita <akinobu.mita@gmail.com>
    Date:   Mon Jul 8 16:01:54 2013 -0700

    The only difference between sg_pcopy_{from,to}_buffer() and
    sg_copy_{from,to}_buffer() is an additional argument that
    specifies the number of bytes to skip the SG list before
    copying.

The functions have the extra argument at the end, but the kerneldoc
lists it in penultimate position.

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoipc,sysv: return -EINVAL upon incorrect id/seqnum
Davidlohr Bueso [Tue, 30 Jun 2015 21:58:48 +0000 (14:58 -0700)]
ipc,sysv: return -EINVAL upon incorrect id/seqnum

In ipc_obtain_object_check we return -EIDRM when a bogus sequence number
is detected via ipc_checkid, while the ipc manpages state the following
return codes for such errors:

   EIDRM  <ID> points to a removed identifier.
   EINVAL Invalid <ID> value, or unaligned, etc.

EIDRM should only be returned upon a RMID call (->deleted check), and thus
return EINVAL for wrong seq.  This difference in semantics has also caused
real bugs, ie: https://bugzilla.redhat.com/show_bug.cgi?id=246509

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoipc,sysv: make return -EIDRM when racing with RMID consistent
Davidlohr Bueso [Tue, 30 Jun 2015 21:58:45 +0000 (14:58 -0700)]
ipc,sysv: make return -EIDRM when racing with RMID consistent

The ipc_lock helper is used by all forms of sysv ipc to acquire the ipc
object's spinlock.  Upon error (bogus identifier), we always return
-EINVAL, whether the problem be in the idr path or because we raced with a
task performing RMID.  For the later, however, all ipc related manpages,
state the that for:

       EIDRM  <ID> points to a removed identifier.

And return:

       EINVAL Invalid <ID> value, or unaligned, etc.

Which (EINVAL) should only return once the ipc resource is deleted.  For
all types of ipc this is done immediately upon a RMID command.  However,
shared memory behaves slightly different as it can merely mark a segment
for deletion, and delay the actual freeing until there are no more active
consumers.  Per shmctl(IPC_RMID) manpage:

""
Mark  the  segment to be destroyed.  The segment will only actually
be destroyed after the last process detaches it (i.e., when the
shm_nattch member of the associated structure shmid_ds is zero).
""

Unlike ipc_lock, paths that behave "correctly", at least per the manpage,
involve controlling the ipc resource via *ctl(), doing the exact same
validity check as ipc_lock after right acquiring the spinlock:

if (!ipc_valid_object()) {
err = -EIDRM;
goto out_unlock;
}

Thus make ipc_lock consistent with the rest of ipc code and return -EIDRM
in ipc_lock when !ipc_valid_object().

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoipc: rename ipc_obtain_object
Davidlohr Bueso [Tue, 30 Jun 2015 21:58:42 +0000 (14:58 -0700)]
ipc: rename ipc_obtain_object

...  to ipc_obtain_object_idr, which is more meaningful and makes the code
slightly easier to follow.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoipc,msg: provide barrier pairings for lockless receive
Davidlohr Bueso [Tue, 30 Jun 2015 21:58:39 +0000 (14:58 -0700)]
ipc,msg: provide barrier pairings for lockless receive

We currently use a full barrier on the sender side to to avoid receiver
tasks disappearing on us while still performing on the sender side wakeup.
 We lack however, the proper CPU-CPU interactions pairing on the receiver
side which busy-waits for the message.  Similarly, we do not need a full
smp_mb, and can relax the semantics for the writer and reader sides of the
message.  This is safe as we are only ordering loads and stores to r_msg.
And in both smp_wmb and smp_rmb, there are no stores after the calls
_anyway_.

This obviously applies for pipelined_send and expunge_all, for EIRDM when
destroying a queue.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoipc,shm: move BUG_ON check into shm_lock
Davidlohr Bueso [Tue, 30 Jun 2015 21:58:36 +0000 (14:58 -0700)]
ipc,shm: move BUG_ON check into shm_lock

Upon every shm_lock call, we BUG_ON if an error was returned, indicating
racing either in idr or in shm_destroy.  Move this logic into the locking.

[akpm@linux-foundation.org: simplify code]
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoipc/util.c: use kvfree() in ipc_rcu_free()
Pekka Enberg [Tue, 30 Jun 2015 21:58:33 +0000 (14:58 -0700)]
ipc/util.c: use kvfree() in ipc_rcu_free()

Use kvfree() instead of open-coding it.

Signed-off-by: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoarc: use for_each_sg()
Akinobu Mita [Tue, 30 Jun 2015 21:58:30 +0000 (14:58 -0700)]
arc: use for_each_sg()

This replaces the plain loop over the sglist array with for_each_sg()
macro which consists of sg_next() function calls.  Since arc doesn't
select ARCH_HAS_SG_CHAIN, it is not necessary to use for_each_sg() in
order to loop over each sg element.  But this can help find problems with
drivers that do not properly initialize their sg tables when
CONFIG_DEBUG_SG is enabled.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agodevpts: if initialization failed, don't crash when opening /dev/ptmx
Josh Triplett [Tue, 30 Jun 2015 21:58:27 +0000 (14:58 -0700)]
devpts: if initialization failed, don't crash when opening /dev/ptmx

If devpts failed to initialize, it would store an ERR_PTR in the global
devpts_mnt.  A subsequent open of /dev/ptmx would call devpts_new_index,
which would dereference devpts_mnt and crash.

Avoid storing invalid values in devpts_mnt; leave it NULL instead.  Make
both devpts_new_index and devpts_pty_new fail gracefully with ENODEV in
that case, which then becomes the return value to the userspace open call
on /dev/ptmx.

[akpm@linux-foundation.org: remove unneeded static]
Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Reviewed-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoscripts/gdb: remove useless global instruction
Thiébaud Weksteen [Tue, 30 Jun 2015 21:58:24 +0000 (14:58 -0700)]
scripts/gdb: remove useless global instruction

Signed-off-by: Thiébaud Weksteen <thiebaud@weksteen.fr>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoscripts/gdb: add ps command
Thiébaud Weksteen [Tue, 30 Jun 2015 21:58:21 +0000 (14:58 -0700)]
scripts/gdb: add ps command

Signed-off-by: Thiébaud Weksteen <thiebaud@weksteen.fr>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoscripts/gdb: fix PEP8 compliance
Thiébaud Weksteen [Tue, 30 Jun 2015 21:58:18 +0000 (14:58 -0700)]
scripts/gdb: fix PEP8 compliance

Signed-off-by: Thiébaud Weksteen <thiebaud@weksteen.fr>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoscripts/gdb: fix typo in exception name
Thiébaud Weksteen [Tue, 30 Jun 2015 21:58:16 +0000 (14:58 -0700)]
scripts/gdb: fix typo in exception name

Signed-off-by: Thiébaud Weksteen <thiebaud@weksteen.fr>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoscripts/gdb: enable completion for lx-list-check parameter
Jan Kiszka [Tue, 30 Jun 2015 21:58:13 +0000 (14:58 -0700)]
scripts/gdb: enable completion for lx-list-check parameter

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Thiébaud Weksteen <thiebaud@weksteen.fr>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoscripts/gdb: also allow list_head pointer as lx-list-check paramter
Jan Kiszka [Tue, 30 Jun 2015 21:58:10 +0000 (14:58 -0700)]
scripts/gdb: also allow list_head pointer as lx-list-check paramter

This makes the usage more flexible.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Thiébaud Weksteen <thiebaud@weksteen.fr>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoscripts/gdb: add command to check list consistency
Thiébaud Weksteen [Tue, 30 Jun 2015 21:58:07 +0000 (14:58 -0700)]
scripts/gdb: add command to check list consistency

Add a gdb script to verify the consistency of lists.

Signed-off-by: Thiébaud Weksteen <thiebaud@weksteen.fr>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agomemstick: remove deprecated use of pci api
Quentin Lambert [Tue, 30 Jun 2015 21:58:04 +0000 (14:58 -0700)]
memstick: remove deprecated use of pci api

Replace occurences of the pci api by appropriate call to the dma api.

A simplified version of the semantic patch that finds this problem is as
follows: (http://coccinelle.lip6.fr)

@deprecated@
idexpression id;
position p;
@@

(
  pci_dma_supported@p ( id, ...)
|
  pci_alloc_consistent@p ( id, ...)
)

@bad1@
idexpression id;
position deprecated.p;
@@
...when != &id->dev
   when != pci_get_drvdata ( id )
   when != pci_enable_device ( id )
(
  pci_dma_supported@p ( id, ...)
|
  pci_alloc_consistent@p ( id, ...)
)

@depends on !bad1@
idexpression id;
expression direction;
position deprecated.p;
@@

(
- pci_dma_supported@p ( id,
+ dma_supported ( &id->dev,
...
+ , GFP_ATOMIC
  )
|
- pci_alloc_consistent@p ( id,
+ dma_alloc_coherent ( &id->dev,
...
+ , GFP_ATOMIC
  )
)

Signed-off-by: Quentin Lambert <lambert.quentin@gmail.com>
Cc: Maxim Levitsky <maximlevitsky@gmail.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agofs/affs/symlink.c: remove unneeded err variable
Fabian Frederick [Tue, 30 Jun 2015 21:58:01 +0000 (14:58 -0700)]
fs/affs/symlink.c: remove unneeded err variable

err is only assigned to -EIO.  Return that value at the end of fail
context.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agofs/affs/amigaffs.c: remove unneeded initialization
Fabian Frederick [Tue, 30 Jun 2015 21:57:58 +0000 (14:57 -0700)]
fs/affs/amigaffs.c: remove unneeded initialization

bh is initialized unconditionally in affs_remove_link()

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agofs/affs/inode.c: remove unneeded initialization
Fabian Frederick [Tue, 30 Jun 2015 21:57:55 +0000 (14:57 -0700)]
fs/affs/inode.c: remove unneeded initialization

bh is initialized unconditionally in affs_add_entry()

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agofs/adfs: remove unneeded cast
Firo Yang [Tue, 30 Jun 2015 21:57:52 +0000 (14:57 -0700)]
fs/adfs: remove unneeded cast

kmem_cache_alloc() returns void*.

Signed-off-by: Firo Yang <firogm@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agogcov: add support for GCC 5.1
Lorenzo Stoakes [Tue, 30 Jun 2015 21:57:49 +0000 (14:57 -0700)]
gcov: add support for GCC 5.1

Fix kernel gcov support for GCC 5.1.  Similar to commit a992bf836f9
("gcov: add support for GCC 4.9"), this patch takes into account the
existence of a new gcov counter (see gcc's gcc/gcov-counter.def.)

Firstly, it increments GCOV_COUNTERS (to 10), which makes the data
structure struct gcov_info compatible with GCC 5.1.

Secondly, a corresponding counter function __gcov_merge_icall_topn (Top N
value tracking for indirect calls) is included in base.c with the other
gcov counters unused for kernel profiling.

Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Andrey Ryabinin <a.ryabinin@samsung.com>
Cc: Yuan Pengfei <coolypf@qq.com>
Tested-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agokernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path
HATAYAMA Daisuke [Tue, 30 Jun 2015 21:57:46 +0000 (14:57 -0700)]
kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path

Commit f06e5153f4ae2e ("kernel/panic.c: add "crash_kexec_post_notifiers"
option for kdump after panic_notifers") introduced
"crash_kexec_post_notifiers" kernel boot option, which toggles wheather
panic() calls crash_kexec() before panic_notifiers and dump kmsg or after.

The problem is that the commit overlooks panic_on_oops kernel boot option.
 If it is enabled, crash_kexec() is called directly without going through
panic() in oops path.

To fix this issue, this patch adds a check to "crash_kexec_post_notifiers"
in the condition of kexec_should_crash().

Also, put a comment in kexec_should_crash() to explain not obvious things
on this patch.

Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Acked-by: Baoquan He <bhe@redhat.com>
Tested-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Cc: Baoquan He <bhe@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agokernel/panic: call the 2nd crash_kexec() only if crash_kexec_post_notifiers is enabled
HATAYAMA Daisuke [Tue, 30 Jun 2015 21:57:43 +0000 (14:57 -0700)]
kernel/panic: call the 2nd crash_kexec() only if crash_kexec_post_notifiers is enabled

For compatibility with the behaviour before the commit f06e5153f4ae2e
("kernel/panic.c: add "crash_kexec_post_notifiers" option for kdump after
panic_notifers"), the 2nd crash_kexec() should be called only if
crash_kexec_post_notifiers is enabled.

Note that crash_kexec() returns immediately if kdump crash kernel is not
loaded, so in this case, this patch makes no functionality change, but the
point is to make it explicit, from the caller panic() side, that the 2nd
crash_kexec() does nothing.

Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Suggested-by: Ingo Molnar <mingo@kernel.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Cc: Baoquan He <bhe@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agox86/kexec: prepend elfcorehdr instead of appending it to the crash-kernel command...
KarimAllah Ahmed [Tue, 30 Jun 2015 21:57:39 +0000 (14:57 -0700)]
x86/kexec: prepend elfcorehdr instead of appending it to the crash-kernel command-line.

Any parameter passed after '--' in the kernel command-line will not be
parsed by the kernel at all, instead it will be passed directly to init
process.

Currently the kernel appends elfcorehdr=<paddr> to the cmdline passed from
kexec load, and if this command-line is used to pass parameters to init
process this means that 'elfcorehdr' will not be parsed as a kernel
parameter at all which will be a problem for vmcore subsystem since it
will know nothing about the location of the ELF structure!

Prepending 'elfcorehdr' instead of appending it fixes this problem since
it ensures that it always comes before '--' and so it's always parsed as a
kernel command-line parameter.

Even with this patch things can still go wrong if 'CONFIG_CMDLINE' was
also used to embedd a command-line to the crash dump kernel and this
command-line contains '--' since the current behavior of the kernel is to
actually append the boot loader command-line to the embedded command-line.

Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Cc: Haren Myneni <hbabu@us.ibm.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agofs: document seq_open()'s usage of file->private_data
Yann Droneaud [Tue, 30 Jun 2015 21:57:36 +0000 (14:57 -0700)]
fs: document seq_open()'s usage of file->private_data

seq_open() stores its struct seq_file in file->private_data, thus it must
not be modified by user of seq_file.

Link: http://lkml.kernel.org/r/cover.1433193673.git.ydroneaud@opteya.com
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agofs: allocate structure unconditionally in seq_open()
Yann Droneaud [Tue, 30 Jun 2015 21:57:33 +0000 (14:57 -0700)]
fs: allocate structure unconditionally in seq_open()

Since patch described below, from v2.6.15-rc1, seq_open() could use a
struct seq_file already allocated by the caller if the pointer to the
structure is stored in file->private_data before calling the function.

    Commit 1abe77b0fc4b485927f1f798ae81a752677e1d05
    Author: Al Viro <viro@zeniv.linux.org.uk>
    Date:   Mon Nov 7 17:15:34 2005 -0500

        [PATCH] allow callers of seq_open do allocation themselves

        Allow caller of seq_open() to kmalloc() seq_file + whatever else they
        want and set ->private_data to it.  seq_open() will then abstain from
        doing allocation itself.

As there's no more use for such feature, as it could be easily replaced by
calls to seq_open_private() (see commit 39699037a5c9 ("[FS] seq_file:
Introduce the seq_open_private()")) and seq_release_private() (see
v2.6.0-test3), support for this uncommon feature can be removed from
seq_open().

Link: http://lkml.kernel.org/r/cover.1433193673.git.ydroneaud@opteya.com
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agofs: use seq_open_private() for proc_mounts
Yann Droneaud [Tue, 30 Jun 2015 21:57:30 +0000 (14:57 -0700)]
fs: use seq_open_private() for proc_mounts

A patchset to remove support for passing pre-allocated struct seq_file to
seq_open().  Such feature is undocumented and prone to error.

In particular, if seq_release() is used in release handler, it will
kfree() a pointer which was not allocated by seq_open().

So this patchset drops support for pre-allocated struct seq_file: it's
only of use in proc_namespace.c and can be easily replaced by using
seq_open_private()/seq_release_private().

Additionally, it documents the use of file->private_data to hold pointer
to struct seq_file by seq_open().

This patch (of 3):

Since patch described below, from v2.6.15-rc1, seq_open() could use a
struct seq_file already allocated by the caller if the pointer to the
structure is stored in file->private_data before calling the function.

    Commit 1abe77b0fc4b485927f1f798ae81a752677e1d05
    Author: Al Viro <viro@zeniv.linux.org.uk>
    Date:   Mon Nov 7 17:15:34 2005 -0500

        [PATCH] allow callers of seq_open do allocation themselves

        Allow caller of seq_open() to kmalloc() seq_file + whatever else they
        want and set ->private_data to it.  seq_open() will then abstain from
        doing allocation itself.

Such behavior is only used by mounts_open_common().

In order to drop support for such uncommon feature, proc_mounts is
converted to use seq_open_private(), which take care of allocating the
proc_mounts structure, making it available through ->private in struct
seq_file.

Conversely, proc_mounts is converted to use seq_release_private(), in
order to release the private structure allocated by seq_open_private().

Then, ->private is used directly instead of proc_mounts() macro to access
to the proc_mounts structure.

Link: http://lkml.kernel.org/r/cover.1433193673.git.ydroneaud@opteya.com
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agomm: meminit: finish initialisation of struct pages before basic setup
Mel Gorman [Tue, 30 Jun 2015 21:57:27 +0000 (14:57 -0700)]
mm: meminit: finish initialisation of struct pages before basic setup

Waiman Long reported that 24TB machines hit OOM during basic setup when
struct page initialisation was deferred.  One approach is to initialise
memory on demand but it interferes with page allocator paths.  This patch
creates dedicated threads to initialise memory before basic setup.  It
then blocks on a rw_semaphore until completion as a wait_queue and counter
is overkill.  This may be slower to boot but it's simplier overall and
also gets rid of a section mangling which existed so kswapd could do the
initialisation.

[akpm@linux-foundation.org: include rwsem.h, use DECLARE_RWSEM, fix comment, remove unneeded cast]
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: Waiman Long <waiman.long@hp.com
Cc: Nathan Zimmer <nzimmer@sgi.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Scott Norton <scott.norton@hp.com>
Tested-by: Daniel J Blueman <daniel@numascale.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agomm: meminit: remove mminit_verify_page_links
Mel Gorman [Tue, 30 Jun 2015 21:57:23 +0000 (14:57 -0700)]
mm: meminit: remove mminit_verify_page_links

mminit_verify_page_links() is an extremely paranoid check that was
introduced when memory initialisation was being heavily reworked.
Profiles indicated that up to 10% of parallel memory initialisation was
spent on checking this for every page.  The cost could be reduced but in
practice this check only found problems very early during the
initialisation rewrite and has found nothing since.  This patch removes an
expensive unnecessary check.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Tested-by: Nate Zimmer <nzimmer@sgi.com>
Tested-by: Waiman Long <waiman.long@hp.com>
Tested-by: Daniel J Blueman <daniel@numascale.com>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Robin Holt <robinmholt@gmail.com>
Cc: Nate Zimmer <nzimmer@sgi.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Waiman Long <waiman.long@hp.com>
Cc: Scott Norton <scott.norton@hp.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agomm: meminit: reduce number of times pageblocks are set during struct page init
Mel Gorman [Tue, 30 Jun 2015 21:57:20 +0000 (14:57 -0700)]
mm: meminit: reduce number of times pageblocks are set during struct page init

During parallel sturct page initialisation, ranges are checked for every
PFN unnecessarily which increases boot times.  This patch alters when the
ranges are checked.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Tested-by: Nate Zimmer <nzimmer@sgi.com>
Tested-by: Waiman Long <waiman.long@hp.com>
Tested-by: Daniel J Blueman <daniel@numascale.com>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Robin Holt <robinmholt@gmail.com>
Cc: Nate Zimmer <nzimmer@sgi.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Waiman Long <waiman.long@hp.com>
Cc: Scott Norton <scott.norton@hp.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agomm: meminit: free pages in large chunks where possible
Mel Gorman [Tue, 30 Jun 2015 21:57:16 +0000 (14:57 -0700)]
mm: meminit: free pages in large chunks where possible

Parallel struct page frees pages one at a time. Try free pages as single
large pages where possible.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Tested-by: Nate Zimmer <nzimmer@sgi.com>
Tested-by: Waiman Long <waiman.long@hp.com>
Tested-by: Daniel J Blueman <daniel@numascale.com>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Robin Holt <robinmholt@gmail.com>
Cc: Nate Zimmer <nzimmer@sgi.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Waiman Long <waiman.long@hp.com>
Cc: Scott Norton <scott.norton@hp.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agox86: mm: enable deferred struct page initialisation on x86-64
Mel Gorman [Tue, 30 Jun 2015 21:57:13 +0000 (14:57 -0700)]
x86: mm: enable deferred struct page initialisation on x86-64

Subject says it all.  Other architectures may enable on a case-by-case
basis after auditing early_pfn_to_nid and testing.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Tested-by: Nate Zimmer <nzimmer@sgi.com>
Tested-by: Waiman Long <waiman.long@hp.com>
Tested-by: Daniel J Blueman <daniel@numascale.com>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Robin Holt <robinmholt@gmail.com>
Cc: Nate Zimmer <nzimmer@sgi.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Waiman Long <waiman.long@hp.com>
Cc: Scott Norton <scott.norton@hp.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>