Programming Languages Research Group: Git - firefly-linux-kernel-4.4.55.git/log

Merge remote-tracking branch 'lsk/v3.10/topic/coresight' into linux-linaro-lsk

Conflicts:
arch/arm/include/asm/hardware/coresight.h
drivers/Makefile
drivers/of/base.c

Merge branch 'v3.10-backport' of git://git.linaro.org/people/mathieu.poirier/coresight into lsk-v3.10-coresight

Merge tag 'v3.10.64' into linux-linaro-lsk

This is the 3.10.64 stable release

Linux 3.10.64

Btrfs: fix fs corruption on transaction abort if device supports discard

commit 678886bdc6378c1cbd5072da2c5a3035000214e3 upstream.

When we abort a transaction we iterate over all the ranges marked as dirty
in fs_info->freed_extents[0] and fs_info->freed_extents[1], clear them
from those trees, add them back (unpin) to the free space caches and, if
the fs was mounted with "-o discard", perform a discard on those regions.
Also, after adding the regions to the free space caches, a fitrim ioctl call
can see those ranges in a block group's free space cache and perform a discard
on the ranges, so the same issue can happen without "-o discard" as well.

This causes corruption, affecting one or multiple btree nodes (in the worst
case leaving the fs unmountable) because some of those ranges (the ones in
the fs_info->pinned_extents tree) correspond to btree nodes/leafs that are
referred by the last committed super block - breaking the rule that anything
that was committed by a transaction is untouched until the next transaction
commits successfully.

I ran into this while running in a loop (for several hours) the fstest that
I recently submitted:

  [PATCH] fstests: add btrfs test to stress chunk allocation/removal and fstrim

The corruption always happened when a transaction aborted and then fsck complained
like this:

   _check_btrfs_filesystem: filesystem on /dev/sdc is inconsistent
   *** fsck.btrfs output ***
   Check tree block failed, want=94945280, have=0
   Check tree block failed, want=94945280, have=0
   Check tree block failed, want=94945280, have=0
   Check tree block failed, want=94945280, have=0
   Check tree block failed, want=94945280, have=0
   read block failed check_tree_block
   Couldn't open file system

In this case 94945280 corresponded to the root of a tree.
Using frace what I observed was the following sequence of steps happened:

   1) transaction N started, fs_info->pinned_extents pointed to
      fs_info->freed_extents[0];

   2) node/eb 94945280 is created;

   3) eb is persisted to disk;

   4) transaction N commit starts, fs_info->pinned_extents now points to
      fs_info->freed_extents[1], and transaction N completes;

   5) transaction N + 1 starts;

   6) eb is COWed, and btrfs_free_tree_block() called for this eb;

   7) eb range (94945280 to 94945280 + 16Kb) is added to
      fs_info->pinned_extents (fs_info->freed_extents[1]);

   8) Something goes wrong in transaction N + 1, like hitting ENOSPC
      for example, and the transaction is aborted, turning the fs into
      readonly mode. The stack trace I got for example:

      [112065.253935]  [<ffffffff8140c7b6>] dump_stack+0x4d/0x66
      [112065.254271]  [<ffffffff81042984>] warn_slowpath_common+0x7f/0x98
      [112065.254567]  [<ffffffffa0325990>] ? __btrfs_abort_transaction+0x50/0x10b [btrfs]
      [112065.261674]  [<ffffffff810429e5>] warn_slowpath_fmt+0x48/0x50
      [112065.261922]  [<ffffffffa032949e>] ? btrfs_free_path+0x26/0x29 [btrfs]
      [112065.262211]  [<ffffffffa0325990>] __btrfs_abort_transaction+0x50/0x10b [btrfs]
      [112065.262545]  [<ffffffffa036b1d6>] btrfs_remove_chunk+0x537/0x58b [btrfs]
      [112065.262771]  [<ffffffffa033840f>] btrfs_delete_unused_bgs+0x1de/0x21b [btrfs]
      [112065.263105]  [<ffffffffa0343106>] cleaner_kthread+0x100/0x12f [btrfs]
      (...)
      [112065.264493] ---[ end trace dd7903a975a31a08 ]---
      [112065.264673] BTRFS: error (device sdc) in btrfs_remove_chunk:2625: errno=-28 No space left
      [112065.264997] BTRFS info (device sdc): forced readonly

   9) The clear kthread sees that the BTRFS_FS_STATE_ERROR bit is set in
      fs_info->fs_state and calls btrfs_cleanup_transaction(), which in
      turn calls btrfs_destroy_pinned_extent();

   10) Then btrfs_destroy_pinned_extent() iterates over all the ranges
       marked as dirty in fs_info->freed_extents[], and for each one
       it calls discard, if the fs was mounted with "-o discard", and
       adds the range to the free space cache of the respective block
       group;

   11) btrfs_trim_block_group(), invoked from the fitrim ioctl code path,
       sees the free space entries and performs a discard;

   12) After an umount and mount (or fsck), our eb's location on disk was full
       of zeroes, and it should have been untouched, because it was marked as
       dirty in the fs_info->pinned_extents tree, and therefore used by the
       trees that the last committed superblock points to.

Fix this by not performing a discard and not adding the ranges to the free space
caches - it's useless from this point since the fs is now in readonly mode and
we won't write free space caches to disk anymore (otherwise we would leak space)
nor any new superblock. By not adding the ranges to the free space caches, it
prevents other code paths from allocating that space and write to it as well,
therefore being safer and simpler.

This isn't a new problem, as it's been present since 2011 (git commit
acce952b0263825da32cf10489413dec78053347).

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Btrfs: do not move em to modified list when unpinning

commit a28046956c71985046474283fa3bcd256915fb72 upstream.

We use the modified list to keep track of which extents have been modified so we
know which ones are candidates for logging at fsync() time.  Newly modified
extents are added to the list at modification time, around the same time the
ordered extent is created.  We do this so that we don't have to wait for ordered
extents to complete before we know what we need to log.  The problem is when
something like this happens

log extent 0-4k on inode 1
copy csum for 0-4k from ordered extent into log
sync log
commit transaction
log some other extent on inode 1
ordered extent for 0-4k completes and adds itself onto modified list again
log changed extents
see ordered extent for 0-4k has already been logged
at this point we assume the csum has been copied
sync log
crash

On replay we will see the extent 0-4k in the log, drop the original 0-4k extent
which is the same one that we are replaying which also drops the csum, and then
we won't find the csum in the log for that bytenr.  This of course causes us to
have errors about not having csums for certain ranges of our inode.  So remove
the modified list manipulation in unpin_extent_cache, any modified extents
should have been added well before now, and we don't want them re-logged.  This
fixes my test that I could reliably reproduce this problem with.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

eCryptfs: Remove buggy and unnecessary write in file name decode routine

commit 942080643bce061c3dd9d5718d3b745dcb39a8bc upstream.

Dmitry Chernenkov used KASAN to discover that eCryptfs writes past the
end of the allocated buffer during encrypted filename decoding. This
fix corrects the issue by getting rid of the unnecessary 0 write when
the current bit offset is 2.

Signed-off-by: Michael Halcrow <mhalcrow@google.com>
Reported-by: Dmitry Chernenkov <dmitryc@google.com>
Suggested-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

eCryptfs: Force RO mount when encrypted view is enabled

commit 332b122d39c9cbff8b799007a825d94b2e7c12f2 upstream.

The ecryptfs_encrypted_view mount option greatly changes the
functionality of an eCryptfs mount. Instead of encrypting and decrypting
lower files, it provides a unified view of the encrypted files in the
lower filesystem. The presence of the ecryptfs_encrypted_view mount
option is intended to force a read-only mount and modifying files is not
supported when the feature is in use. See the following commit for more
information:

e77a56d [PATCH] eCryptfs: Encrypted passthrough

This patch forces the mount to be read-only when the
ecryptfs_encrypted_view mount option is specified by setting the
MS_RDONLY flag on the superblock. Additionally, this patch removes some
broken logic in ecryptfs_open() that attempted to prevent modifications
of files when the encrypted view feature was in use. The check in
ecryptfs_open() was not sufficient to prevent file modifications using
system calls that do not operate on a file descriptor.

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Reported-by: Priya Bansal <p.bansal@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

udf: Verify symlink size before loading it

commit a1d47b262952a45aae62bd49cfaf33dd76c11a2c upstream.

UDF specification allows arbitrarily large symlinks. However we support
only symlinks at most one block large. Check the length of the symlink
so that we don't access memory beyond end of the symlink block.

Reported-by: Carl Henrik Lunde <chlunde@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

exit: pidns: alloc_pid() leaks pid_namespace if child_reaper is exiting

commit 24c037ebf5723d4d9ab0996433cee4f96c292a4d upstream.

alloc_pid() does get_pid_ns() beforehand but forgets to put_pid_ns() if it
fails because disable_pid_allocation() was called by the exiting
child_reaper.

We could simply move get_pid_ns() down to successful return, but this fix
tries to be as trivial as possible.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Aaron Tomlin <atomlin@redhat.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Serge Hallyn <serge.hallyn@ubuntu.com>
Cc: Sterling Alexander <stalexan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ncpfs: return proper error from NCP_IOC_SETROOT ioctl

commit a682e9c28cac152e6e54c39efcf046e0c8cfcf63 upstream.

If some error happens in NCP_IOC_SETROOT ioctl, the appropriate error
return value is then (in most cases) just overwritten before we return.
This can result in reporting success to userspace although error happened.

This bug was introduced by commit 2e54eb96e2c8 ("BKL: Remove BKL from
ncpfs"). Propagate the errors correctly.

Coverity id: 1226925.

Fixes: 2e54eb96e2c80 ("BKL: Remove BKL from ncpfs")
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Petr Vandrovec <petr@vandrovec.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

crypto: af_alg - fix backlog handling

commit 7e77bdebff5cb1e9876c561f69710b9ab8fa1f7e upstream.

If a request is backlogged, it's complete() handler will get called
twice: once with -EINPROGRESS, and once with the final error code.

af_alg's complete handler, unlike other users, does not handle the
-EINPROGRESS but instead always completes the completion that recvmsg()
is waiting on.  This can lead to a return to user space while the
request is still pending in the driver.  If userspace closes the sockets
before the requests are handled by the driver, this will lead to
use-after-frees (and potential crashes) in the kernel due to the tfm
having been freed.

The crashes can be easily reproduced (for example) by reducing the max
queue length in cryptod.c and running the following (from
http://www.chronox.de/libkcapi.html) on AES-NI capable hardware:

$ while true; do kcapi -x 1 -e -c '__ecb-aes-aesni' \
    -k 00000000000000000000000000000000 \
    -p 00000000000000000000000000000000 >/dev/null & done

Signed-off-by: Rabin Vincent <rabin.vincent@axis.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

userns: Unbreak the unprivileged remount tests

commit db86da7cb76f797a1a8b445166a15cb922c6ff85 upstream.

A security fix in caused the way the unprivileged remount tests were
using user namespaces to break. Tweak the way user namespaces are
being used so the test works again.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

userns: Allow setting gid_maps without privilege when setgroups is disabled

commit 66d2f338ee4c449396b6f99f5e75cd18eb6df272 upstream.

Now that setgroups can be disabled and not reenabled, setting gid_map
without privielge can now be enabled when setgroups is disabled.

This restores most of the functionality that was lost when unprivileged
setting of gid_map was removed. Applications that use this functionality
will need to check to see if they use setgroups or init_groups, and if they
don't they can be fixed by simply disabling setgroups before writing to
gid_map.

Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

userns: Add a knob to disable setgroups on a per user namespace basis

commit 9cc46516ddf497ea16e8d7cb986ae03a0f6b92f8 upstream.

- Expose the knob to user space through a proc file /proc/<pid>/setgroups

  A value of "deny" means the setgroups system call is disabled in the
  current processes user namespace and can not be enabled in the
  future in this user namespace.

  A value of "allow" means the segtoups system call is enabled.

- Descendant user namespaces inherit the value of setgroups from
  their parents.

- A proc file is used (instead of a sysctl) as sysctls currently do
  not allow checking the permissions at open time.

- Writing to the proc file is restricted to before the gid_map
  for the user namespace is set.

  This ensures that disabling setgroups at a user namespace
  level will never remove the ability to call setgroups
  from a process that already has that ability.

  A process may opt in to the setgroups disable for itself by
  creating, entering and configuring a user namespace or by calling
  setns on an existing user namespace with setgroups disabled.
  Processes without privileges already can not call setgroups so this
  is a noop.  Prodcess with privilege become processes without
  privilege when entering a user namespace and as with any other path
  to dropping privilege they would not have the ability to call
  setgroups.  So this remains within the bounds of what is possible
  without a knob to disable setgroups permanently in a user namespace.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

userns: Rename id_map_mutex to userns_state_mutex

commit f0d62aec931e4ae3333c797d346dc4f188f454ba upstream.

Generalize id_map_mutex so it can be used for more state of a user namespace.

Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

userns: Only allow the creator of the userns unprivileged mappings

commit f95d7918bd1e724675de4940039f2865e5eec5fe upstream.

If you did not create the user namespace and are allowed
to write to uid_map or gid_map you should already have the necessary
privilege in the parent user namespace to establish any mapping
you want so this will not affect userspace in practice.

Limiting unprivileged uid mapping establishment to the creator of the
user namespace makes it easier to verify all credentials obtained with
the uid mapping can be obtained without the uid mapping without
privilege.

Limiting unprivileged gid mapping establishment (which is temporarily
absent) to the creator of the user namespace also ensures that the
combination of uid and gid can already be obtained without privilege.

This is part of the fix for CVE-2014-8989.

Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

userns: Check euid no fsuid when establishing an unprivileged uid mapping

commit 80dd00a23784b384ccea049bfb3f259d3f973b9d upstream.

setresuid allows the euid to be set to any of uid, euid, suid, and
fsuid. Therefor it is safe to allow an unprivileged user to map
their euid and use CAP_SETUID privileged with exactly that uid,
as no new credentials can be obtained.

I can not find a combination of existing system calls that allows setting
uid, euid, suid, and fsuid from the fsuid making the previous use
of fsuid for allowing unprivileged mappings a bug.

This is part of a fix for CVE-2014-8989.

Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

userns: Don't allow unprivileged creation of gid mappings

commit be7c6dba2332cef0677fbabb606e279ae76652c3 upstream.

As any gid mapping will allow and must allow for backwards
compatibility dropping groups don't allow any gid mappings to be
established without CAP_SETGID in the parent user namespace.

For a small class of applications this change breaks userspace
and removes useful functionality. This small class of applications
includes tools/testing/selftests/mount/unprivilged-remount-test.c

Most of the removed functionality will be added back with the addition
of a one way knob to disable setgroups. Once setgroups is disabled
setting the gid_map becomes as safe as setting the uid_map.

For more common applications that set the uid_map and the gid_map
with privilege this change will have no affect.

This is part of a fix for CVE-2014-8989.

Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

userns: Don't allow setgroups until a gid mapping has been setablished

commit 273d2c67c3e179adb1e74f403d1e9a06e3f841b5 upstream.

setgroups is unique in not needing a valid mapping before it can be called,
in the case of setgroups(0, NULL) which drops all supplemental groups.

The design of the user namespace assumes that CAP_SETGID can not actually
be used until a gid mapping is established. Therefore add a helper function
to see if the user namespace gid mapping has been established and call
that function in the setgroups permission check.

This is part of the fix for CVE-2014-8989, being able to drop groups
without privilege using user namespaces.

Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

userns: Document what the invariant required for safe unprivileged mappings.

commit 0542f17bf2c1f2430d368f44c8fcf2f82ec9e53e upstream.

The rule is simple.  Don't allow anything that wouldn't be allowed
without unprivileged mappings.

It was previously overlooked that establishing gid mappings would
allow dropping groups and potentially gaining permission to files and
directories that had lesser permissions for a specific group than for
all other users.

This is the rule needed to fix CVE-2014-8989 and prevent any other
security issues with new_idmap_permitted.

The reason for this rule is that the unix permission model is old and
there are programs out there somewhere that take advantage of every
little corner of it.  So allowing a uid or gid mapping to be
established without privielge that would allow anything that would not
be allowed without that mapping will result in expectations from some
code somewhere being violated.  Violated expectations about the
behavior of the OS is a long way to say a security issue.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

groups: Consolidate the setgroups permission checks

commit 7ff4d90b4c24a03666f296c3d4878cd39001e81e upstream.

Today there are 3 instances of setgroups and due to an oversight their
permission checking has diverged. Add a common function so that
they may all share the same permission checking code.

This corrects the current oversight in the current permission checks
and adds a helper to avoid this in the future.

A user namespace security fix will update this new helper, shortly.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

umount: Disallow unprivileged mount force

commit b2f5d4dc38e034eecb7987e513255265ff9aa1cf upstream.

Forced unmount affects not just the mount namespace but the underlying
superblock as well. Restrict forced unmount to the global root user
for now. Otherwise it becomes possible a user in a less privileged
mount namespace to force the shutdown of a superblock of a filesystem
in a more privileged mount namespace, allowing a DOS attack on root.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mnt: Update unprivileged remount test

commit 4a44a19b470a886997d6647a77bb3e38dcbfa8c5 upstream.

- MNT_NODEV should be irrelevant except when reading back mount flags,
  no longer specify MNT_NODEV on remount.

- Test MNT_NODEV on devpts where it is meaningful even for unprivileged mounts.

- Add a test to verify that remount of a prexisting mount with the same flags
  is allowed and does not change those flags.

- Cleanup up the definitions of MS_REC, MS_RELATIME, MS_STRICTATIME that are used
  when the code is built in an environment without them.

- Correct the test error messages when tests fail.  There were not 5 tests
  that tested MS_RELATIME.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mnt: Implicitly add MNT_NODEV on remount when it was implicitly added by mount

commit 3e1866410f11356a9fd869beb3e95983dc79c067 upstream.

Now that remount is properly enforcing the rule that you can't remove
nodev at least sandstorm.io is breaking when performing a remount.

It turns out that there is an easy intuitive solution implicitly
add nodev on remount when nodev was implicitly added on mount.

Tested-by: Cedric Bosdonnat <cbosdonnat@suse.com>
Tested-by: Richard Weinberger <richard@nod.at>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mac80211: free management frame keys when removing station

commit 28a9bc68124c319b2b3dc861e80828a8865fd1ba upstream.

When writing the code to allow per-station GTKs, I neglected to
take into account the management frame keys (index 4 and 5) when
freeing the station and only added code to free the first four
data frame keys.

Fix this by iterating the array of keys over the right length.

Fixes: e31b82136d1a ("cfg80211/mac80211: allow per-station GTKs")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mac80211: fix multicast LED blinking and counter

commit d025933e29872cb1fe19fc54d80e4dfa4ee5779c upstream.

As multicast-frames can't be fragmented, "dot11MulticastReceivedFrameCount"
stopped being incremented after the use-after-free fix. Furthermore, the
RX-LED will be triggered by every multicast frame (which wouldn't happen
before) which wouldn't allow the LED to rest at all.

Fixes https://bugzilla.kernel.org/show_bug.cgi?id=89431 which also had the
patch.

Fixes: b8fff407a180 ("mac80211: fix use-after-free in defragmentation")
Signed-off-by: Andreas Müller <goo@stapelspeicher.org>
[rewrite commit message]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

KEYS: Fix stale key registration at error path

commit b26bdde5bb27f3f900e25a95e33a0c476c8c2c48 upstream.

When loading encrypted-keys module, if the last check of
aes_get_sizes() in init_encrypted() fails, the driver just returns an
error without unregistering its key type. This results in the stale
entry in the list. In addition to memory leaks, this leads to a kernel
crash when registering a new key type later.

This patch fixes the problem by swapping the calls of aes_get_sizes()
and register_key_type(), and releasing resources properly at the error
paths.

Bugzilla: https://bugzilla.opensuse.org/show_bug.cgi?id=908163
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

isofs: Fix unchecked printing of ER records

commit 4e2024624e678f0ebb916e6192bd23c1f9fdf696 upstream.

We didn't check length of rock ridge ER records before printing them.
Thus corrupted isofs image can cause us to access and print some memory
behind the buffer with obvious consequences.

Reported-and-tested-by: Carl Henrik Lunde <chlunde@ping.uio.no>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86/tls: Don't validate lm in set_thread_area() after all

commit 3fb2f4237bb452eb4e98f6a5dbd5a445b4fed9d0 upstream.

It turns out that there's a lurking ABI issue.  GCC, when
compiling this in a 32-bit program:

struct user_desc desc = {
.entry_number    = idx,
.base_addr       = base,
.limit           = 0xfffff,
.seg_32bit       = 1,
.contents        = 0, /* Data, grow-up */
.read_exec_only  = 0,
.limit_in_pages  = 1,
.seg_not_present = 0,
.useable         = 0,
};

will leave .lm uninitialized.  This means that anything in the
kernel that reads user_desc.lm for 32-bit tasks is unreliable.

Revert the .lm check in set_thread_area().  The value never did
anything in the first place.

Fixes: 0e58af4e1d21 ("x86/tls: Disallow unusual TLS segments")
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/d7875b60e28c512f6a6fc0baf5714d58e7eaadbb.1418856405.git.luto@amacapital.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

dm space map metadata: fix sm_bootstrap_get_nr_blocks()

commit c1c6156fe4d4577444b769d7edd5dd503e57bbc9 upstream.

This function isn't right and it causes a static checker warning:

drivers/md/dm-thin.c:3016 maybe_resize_data_dev()
error: potentially using uninitialized 'sb_data_size'.

It should set "*count" and return zero on success the same as the
sm_metadata_get_nr_blocks() function does earlier.

Fixes: 3241b1d3e0aa ('dm: add persistent data library')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

dm bufio: fix memleak when using a dm_buffer's inline bio

commit 445559cdcb98a141f5de415b94fd6eaccab87e6d upstream.

When dm-bufio sets out to use the bio built into a struct dm_buffer to
issue an IO, it needs to call bio_reset after it's done with the bio
so that we can free things attached to the bio such as the integrity
payload. Therefore, inject our own endio callback to take care of
the bio_reset after calling submit_io's end_io callback.

Test case:
1. modprobe scsi_debug delay=0 dif=1 dix=199 ato=1 dev_size_mb=300
2. Set up a dm-bufio client, e.g. dm-verity, on the scsi_debug device
3. Repeatedly read metadata and watch kmalloc-192 leak!

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

nfs41: fix nfs4_proc_layoutget error handling

commit 4bd5a980de87d2b5af417485bde97b8eb3d6cf6a upstream.

nfs4_layoutget_release() drops layout hdr refcnt. Grab the refcnt
early so that it is safe to call .release in case nfs4_alloc_pages
fails.

Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Fixes: a47970ff78147 ("NFSv4.1: Hold reference to layout hdr in layoutget")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

megaraid_sas: corrected return of wait_event from abort frame path

commit 170c238701ec38b1829321b17c70671c101bac55 upstream.

Corrected wait_event() call which was waiting for wrong completion
status (0xFF).

Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com>
Signed-off-by: Kashyap Desai <kashyap.desai@avagotech.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mmc: block: add newline to sysfs display of force_ro

commit 0031a98a85e9fca282624bfc887f9531b2768396 upstream.

Make force_ro consistent with other sysfs entries.

Fixes: 371a689f64b0d ('mmc: MMC boot partitions support')
Cc: Andrei Warkentin <andrey.warkentin@gmail.com>
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mfd: tc6393xb: Fail ohci suspend if full state restore is required

commit 1a5fb99de4850cba710d91becfa2c65653048589 upstream.

Some boards with TC6393XB chip require full state restore during system
resume thanks to chip's VCC being cut off during suspend (Sharp SL-6000
tosa is one of them). Failing to do so would result in ohci Oops on
resume due to internal memory contentes being changed. Fail ohci suspend
on tc6393xb is full state restore is required.

Recommended workaround is to unbind tmio-ohci driver before suspend and
rebind it after resume.

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

md/bitmap: always wait for writes on unplug.

commit 4b5060ddae2b03c5387321fafc089d242225697a upstream.

If two threads call bitmap_unplug at the same time, then
one might schedule all the writes, and the other might
decide that it doesn't need to wait.  But really it does.

It rarely hurts to wait when it isn't absolutely necessary,
and the current code doesn't really focus on 'absolutely necessary'
anyway.  So just wait always.

This can potentially lead to data corruption if a crash happens
at an awkward time and data was written before the bitmap was
updated.  It is very unlikely, but this should go to -stable
just to be safe.  Appropriate for any -stable.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86, kvm: Clear paravirt_enabled on KVM guests for espfix32's benefit

commit 29fa6825463c97e5157284db80107d1bfac5d77b upstream.

paravirt_enabled has the following effects:

- Disables the F00F bug workaround warning.  There is no F00F bug
   workaround any more because Linux's standard IDT handling already
   works around the F00F bug, but the warning still exists.  This
   is only cosmetic, and, in any event, there is no such thing as
   KVM on a CPU with the F00F bug.

- Disables 32-bit APM BIOS detection.  On a KVM paravirt system,
   there should be no APM BIOS anyway.

- Disables tboot.  I think that the tboot code should check the
   CPUID hypervisor bit directly if it matters.

- paravirt_enabled disables espfix32.  espfix32 should *not* be
   disabled under KVM paravirt.

The last point is the purpose of this patch.  It fixes a leak of the
high 16 bits of the kernel stack address on 32-bit KVM paravirt
guests.  Fixes CVE-2014-8134.

Suggested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86_64, switch_to(): Load TLS descriptors before switching DS and ES

commit f647d7c155f069c1a068030255c300663516420e upstream.

Otherwise, if buggy user code points DS or ES into the TLS
array, they would be corrupted after a context switch.

This also significantly improves the comments and documents some
gotchas in the code.

Before this patch, the both tests below failed.  With this
patch, the es test passes, although the gsbase test still fails.

----- begin es test -----

/*
* Copyright (c) 2014 Andy Lutomirski
* GPL v2
*/

static unsigned short GDT3(int idx)
{
return (idx << 3) | 3;
}

static int create_tls(int idx, unsigned int base)
{
struct user_desc desc = {
.entry_number    = idx,
.base_addr       = base,
.limit           = 0xfffff,
.seg_32bit       = 1,
.contents        = 0, /* Data, grow-up */
.read_exec_only  = 0,
.limit_in_pages  = 1,
.seg_not_present = 0,
.useable         = 0,
};

if (syscall(SYS_set_thread_area, &desc) != 0)
err(1, "set_thread_area");

return desc.entry_number;
}

int main()
{
int idx = create_tls(-1, 0);
printf("Allocated GDT index %d\n", idx);

unsigned short orig_es;
asm volatile ("mov %%es,%0" : "=rm" (orig_es));

int errors = 0;
int total = 1000;
for (int i = 0; i < total; i++) {
asm volatile ("mov %0,%%es" : : "rm" (GDT3(idx)));
usleep(100);

unsigned short es;
asm volatile ("mov %%es,%0" : "=rm" (es));
asm volatile ("mov %0,%%es" : : "rm" (orig_es));
if (es != GDT3(idx)) {
if (errors == 0)
printf("[FAIL]\tES changed from 0x%hx to 0x%hx\n",
       GDT3(idx), es);
errors++;
}
}

if (errors) {
printf("[FAIL]\tES was corrupted %d/%d times\n", errors, total);
return 1;
} else {
printf("[OK]\tES was preserved\n");
return 0;
}
}

----- end es test -----

----- begin gsbase test -----

/*
* gsbase.c, a gsbase test
* Copyright (c) 2014 Andy Lutomirski
* GPL v2
*/

static unsigned char *testptr, *testptr2;

static unsigned char read_gs_testvals(void)
{
unsigned char ret;
asm volatile ("movb %%gs:%1, %0" : "=r" (ret) : "m" (*testptr));
return ret;
}

int main()
{
int errors = 0;

testptr = mmap((void *)0x200000000UL, 1, PROT_READ | PROT_WRITE,
       MAP_PRIVATE | MAP_FIXED | MAP_ANONYMOUS, -1, 0);
if (testptr == MAP_FAILED)
err(1, "mmap");

testptr2 = mmap((void *)0x300000000UL, 1, PROT_READ | PROT_WRITE,
       MAP_PRIVATE | MAP_FIXED | MAP_ANONYMOUS, -1, 0);
if (testptr2 == MAP_FAILED)
err(1, "mmap");

*testptr = 0;
*testptr2 = 1;

if (syscall(SYS_arch_prctl, ARCH_SET_GS,
    (unsigned long)testptr2 - (unsigned long)testptr) != 0)
err(1, "ARCH_SET_GS");

usleep(100);

if (read_gs_testvals() == 1) {
printf("[OK]\tARCH_SET_GS worked\n");
} else {
printf("[FAIL]\tARCH_SET_GS failed\n");
errors++;
}

asm volatile ("mov %0,%%gs" : : "r" (0));

if (read_gs_testvals() == 0) {
printf("[OK]\tWriting 0 to gs worked\n");
} else {
printf("[FAIL]\tWriting 0 to gs failed\n");
errors++;
}

usleep(100);

if (read_gs_testvals() == 0) {
printf("[OK]\tgsbase is still zero\n");
} else {
printf("[FAIL]\tgsbase was corrupted\n");
errors++;
}

return errors == 0 ? 0 : 1;
}

----- end gsbase test -----

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/509d27c9fec78217691c3dad91cec87e1006b34a.1418075657.git.luto@amacapital.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86/tls: Disallow unusual TLS segments

commit 0e58af4e1d2166e9e33375a0f121e4867010d4f8 upstream.

Users have no business installing custom code segments into the
GDT, and segments that are not present but are otherwise valid
are a historical source of interesting attacks.

For completeness, block attempts to set the L bit.  (Prior to
this patch, the L bit would have been silently dropped.)

This is an ABI break.  I've checked glibc, musl, and Wine, and
none of them look like they'll have any trouble.

Note to stable maintainers: this is a hardening patch that fixes
no known bugs.  Given the possibility of ABI issues, this
probably shouldn't be backported quickly.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86/tls: Validate TLS entries to protect espfix

commit 41bdc78544b8a93a9c6814b8bbbfef966272abbe upstream.

Installing a 16-bit RW data segment into the GDT defeats espfix.
AFAICT this will not affect glibc, Wine, or dosemu at all.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

isofs: Fix infinite looping over CE entries

commit f54e18f1b831c92f6512d2eedb224cd63d607d3d upstream.

Rock Ridge extensions define so called Continuation Entries (CE) which
define where is further space with Rock Ridge data. Corrupted isofs
image can contain arbitrarily long chain of these, including a one
containing loop and thus causing kernel to end in an infinite loop when
traversing these entries.

Limit the traversal to 32 entries which should be more than enough space
to store all the Rock Ridge data.

Reported-by: P J P <ppandit@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

coresight-replicator: remove .owner field for driver

There is no need of .owner field for driver using
module_platform_driver.

Signed-off-by: Pankaj Dubey <pankaj.dubey@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit ac610cd7c2b13d7f916ebb73c2ae6b91c1d53a61)
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>

coresight: fix typo in comment in coresight-priv.h

fixes a typo %s/eveyone/everyone/ in function CS_UNLOCK.

Signed-off-by: Pankaj Dubey <pankaj.dubey@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 84b763d5b808ee6c23d95314ac81859cab471adb)
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>

ARM: removing support for etb/etm in "arch/arm/kernel/"

Removing minimal support for etb/etm to favour an implementation
that is more flexible, extensible and capable of handling more
platforms.

Also removing the only client of the old driver. That code can
easily be replaced by entries for etb/etm in the device tree.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 184901a06a366d40386e07307bcadc9eeaabbd39)
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Conflicts:
arch/arm/include/asm/hardware/coresight.h
arch/arm/kernel/Makefile
arch/arm/kernel/etm.c
arch/arm/mach-omap2/Kconfig

coresight: adding basic support for D01 board

Support for 16 PTMs, funnel, TPIU and replicator connected
to the ETB are included.

Signed-off-by: Xia Kaixu <kaixu.xia@linaro.org>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 4d5616ca59350c47e4b00d17c1480d8b44d3c535)
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Conflicts:
arch/arm/boot/dts/hip04.dtsi

coresight: adding basic support for Vexpress TC2

Support for the 2 PTMs, 3 ETMs, funnel, TPIU and replicator
connected to the ETB are included. Proper handling of the
ITM and the replicator linked to it along with the CTIs
and SWO are not included.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 0bec8d82bd112f6aa0e16cab5d840bd7f5d3a7ce)
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>

coresight: documentation for coresight framework and drivers

Documentation containing an explanation on what the framework
provides and the drivers working with it. A minimal example
on how to use the functionality is also provided.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 872234d3fb32e26d3377590c396858d2324b8c16)
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>

coresight-etm: add CoreSight ETM/PTM driver

This driver manages CoreSight ETM (Embedded Trace Macrocell) that
supports processor tracing. Currently supported version are ARM
ETMv3.x and PTM1.x.

Signed-off-by: Pratik Patel <pratikp@codeaurora.org>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
coresight-etm3x: adding missing error checking
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit a939fc5a71ad531633610242400c262e78731532)
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>

coresight-replicator: add CoreSight Replicator driver

This driver manages non-configurable CoreSight Replicator that
takes a single input trace data stream and replicates it to
produce two identical trace data output streams. Replicators
are typically used to route single interleaved trace data
stream to two or more sinks.

Signed-off-by: Pratik Patel <pratikp@codeaurora.org>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit ceacc1d9b7ae41e4be185596306be17537682fb1)
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>

coresight-funnel: add CoreSight Funnel driver

This driver manages CoreSight Funnel which acts as a link.
Funnels have multiple input ports (typically 8) each of which
represents an input trace data stream. These multiple input trace
data streams are interleaved into a single output stream coming
out of the Funnel.

Signed-off-by: Pratik Patel <pratikp@codeaurora.org>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 6e21e3451556af6ada01e2206d5949fc654d75e1)
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>

coresight-etb: add CoreSight ETB driver

This driver manages CoreSight ETB (Embedded Trace Buffer) which
acts as a circular buffer sink collecting generated trace data.

Signed-off-by: Pratik Patel <pratikp@codeaurora.org>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit fdfc0d8a06b56c3ca8fc3d7d271b3c8e99e6d55c)
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>

coresight-tpiu: add CoreSight TPIU driver

This driver manages CoreSight TPIU (Trace Port Interface Unit)
which acts as a sink. TPIU is typically connected to some offchip
hardware hosting a storage buffer.

Signed-off-by: Pratik Patel <pratikp@codeaurora.org>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit dc161b9f01142c6b2c985290b7aa3e58e2408036)
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>

coresight-tmc: add CoreSight TMC driver

This driver manages CoreSight TMC (Trace Memory Controller) which
can act as a link or a sink depending upon its configuration. It
can present itself as an ETF (Embedded Trace FIFO) or ETR
(Embedded Trace Router).

ETF when configured in circular buffer mode acts as a trace
collection sink. When configured in HW fifo mode it acts as link.
ETR always acts as a sink and can be used to route data to memory
allocated in RAM.

Signed-off-by: Pratik Patel <pratikp@codeaurora.org>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit bc4bf7fe98daf4e64cc5ffc6cdc0e820f4d99c14)
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>

coresight: add CoreSight core layer framework

CoreSight components are compliant with the ARM CoreSight
architecture specification and can be connected in various
topologies to suit a particular SoC tracing needs. These trace
components can generally be classified as sources, links and
sinks. Trace data produced by one or more sources flows through
the intermediate links connecting the source to the currently
selected sink.

The CoreSight framework provides an interface for the CoreSight trace
drivers to register themselves with. It's intended to build up a
topological view of the CoreSight components and configure the
correct serie of components on user input via sysfs.

For eg., when enabling a source, the framework builds up a path
consisting of all the components connecting the source to the
currently selected sink(s) and enables all of them.

The framework also supports switching between available sinks
and provides status information to user space applications
through the debugfs interface.

Signed-off-by: Pratik Patel <pratikp@codeaurora.org>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit a06ae8609b3dd06b957a6e4e965772a8a14d3af5)
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Conflicts:
arch/arm/Kconfig.debug
drivers/Makefile

coresight: bindings for coresight drivers

Coresight IP blocks allow for the support of HW assisted tracing
on ARM SoCs. Bindings for the currently available blocks are
presented herein.

Signed-off-by: Pratik Patel <pratikp@codeaurora.org>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 799656de6f6587cecc0b05477501e41c211a75fa)
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>

sysfs: create __ATTR_WO()

This creates the macro __ATTR_WO() for write-only attributes, instead of
having to "open define" them.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit a65fcce75a75c0d41b938f86d09d42b6f1733309)
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>

sysfs: add more helper macro's for (bin_)attribute(_groups)

With the recent changes to sysfs there's various helper macro's.
However there's no RW, RO BIN_ helper macro's. This patch adds them.

Signed-off-by: Oliver Schinagl <oliver@schinagl.nl>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 3493f69f4c4e8703961919a9a56c2d2e6a25b46f)
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Conflicts:
include/linux/sysfs.h

backing-dev: remove compilation warning

Removed compilation warning introduced by back porting
aa01aa3ca205ea04f44423a58bae38aec886fb96 to 3.10

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>

sysfs: use file mode defines from stat.h

With the last patches stat.h was included to the header, and thus those
permission defines should be used.

Signed-off-by: Oliver Schinagl <oliver@schinagl.nl>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit aa01aa3ca205ea04f44423a58bae38aec886fb96)
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Conflicts:
include/linux/sysfs.h

bitops: Introduce a more generic BITMASK macro

This patch is back-porting the generic _portion_ of
10ef6b0dffe404bcc54e94cb2ca1a5b18445a66b to 3.10

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>

driver core: create write-only attribute macros for devices and drivers

This creates the macros DRIVER_ATTR_WO() and DEVICE_ATTR_WO() for
write-only attributes for drivers and devices.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 25d134e51f990962ecb54e5449c97b82f818ab30)
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>

of: back-porting generic graph bindings

This patch is back-porting the generic _portion_ of
f2a575f67695dcba9062acd666ae5aab2380b95c to 3.10

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>

of: back-porting v4l2 graph bindings

This patch is back-porting the generic _portion_ of
fd9fdb78a9bf85b94fb2190c82ff280c8f8375cc to 3.10

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>

driver core: device.h: add RW and RO attribute macros

Make it easier to create attributes without having to always audit the
mode settings.

Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit aead4f11289457e4c4dd8c65e901740fe18e5750)
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>

Merge tag 'v3.10.63' into linux-linaro-lsk

This is the 3.10.63 stable release

Linux 3.10.63

ALSA: usb-audio: Don't resubmit pending URBs at MIDI error recovery

commit 66139a48cee1530c91f37c145384b4ee7043f0b7 upstream.

In snd_usbmidi_error_timer(), the driver tries to resubmit MIDI input
URBs to reactivate the MIDI stream, but this causes the error when
some of URBs are still pending like:

WARNING: CPU: 0 PID: 0 at ../drivers/usb/core/urb.c:339 usb_submit_urb+0x5f/0x70()
URB ef705c40 submitted while active
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.16.6-2-desktop #1
Hardware name: FOXCONN TPS01/TPS01, BIOS 080015  03/23/2010
  c0984bfa f4009ed4 c078deaf f4009ee4 c024c884 c09a135c f4009f00 00000000
  c0984bfa 00000153 c061ac4f c061ac4f 00000009 00000001 ef705c40 e854d1c0
  f4009eec c024c8d3 00000009 f4009ee4 c09a135c f4009f00 f4009f04 c061ac4f
Call Trace:
  [<c0205df6>] try_stack_unwind+0x156/0x170
  [<c020482a>] dump_trace+0x5a/0x1b0
  [<c0205e56>] show_trace_log_lvl+0x46/0x50
  [<c02049d1>] show_stack_log_lvl+0x51/0xe0
  [<c0205eb7>] show_stack+0x27/0x50
  [<c078deaf>] dump_stack+0x45/0x65
  [<c024c884>] warn_slowpath_common+0x84/0xa0
  [<c024c8d3>] warn_slowpath_fmt+0x33/0x40
  [<c061ac4f>] usb_submit_urb+0x5f/0x70
  [<f7974104>] snd_usbmidi_submit_urb+0x14/0x60 [snd_usbmidi_lib]
  [<f797483a>] snd_usbmidi_error_timer+0x6a/0xa0 [snd_usbmidi_lib]
  [<c02570c0>] call_timer_fn+0x30/0x130
  [<c0257442>] run_timer_softirq+0x1c2/0x260
  [<c0251493>] __do_softirq+0xc3/0x270
  [<c0204732>] do_softirq_own_stack+0x22/0x30
  [<c025186d>] irq_exit+0x8d/0xa0
  [<c0795228>] smp_apic_timer_interrupt+0x38/0x50
  [<c0794a3c>] apic_timer_interrupt+0x34/0x3c
  [<c0673d9e>] cpuidle_enter_state+0x3e/0xd0
  [<c028bb8d>] cpu_idle_loop+0x29d/0x3e0
  [<c028bd23>] cpu_startup_entry+0x53/0x60
  [<c0bfac1e>] start_kernel+0x415/0x41a

For avoiding these errors, check the pending URBs and skip
resubmitting such ones.

Reported-and-tested-by: Stefan Seyfried <stefan.seyfried@googlemail.com>
Acked-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

powerpc: 32 bit getcpu VDSO function uses 64 bit instructions

commit 152d44a853e42952f6c8a504fb1f8eefd21fd5fd upstream.

I used some 64 bit instructions when adding the 32 bit getcpu VDSO
function. Fix it.

Fixes: 18ad51dd342a ("powerpc: Add VDSO version of getcpu")
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ARM: sched_clock: Load cycle count after epoch stabilizes

commit 336ae1180df5f69b9e0fb6561bec01c5f64361cf upstream.

There is a small race between when the cycle count is read from
the hardware and when the epoch stabilizes. Consider this
scenario:

CPU0                           CPU1
----                           ----
cyc = read_sched_clock()
cyc_to_sched_clock()
                                 update_sched_clock()
                                  ...
                                  cd.epoch_cyc = cyc;
  epoch_cyc = cd.epoch_cyc;
  ...
  epoch_ns + cyc_to_ns((cyc - epoch_cyc)

The cyc on cpu0 was read before the epoch changed. But we
calculate the nanoseconds based on the new epoch by subtracting
the new epoch from the old cycle count. Since epoch is most likely
larger than the old cycle count we calculate a large number that
will be converted to nanoseconds and added to epoch_ns, causing
time to jump forward too much.

Fix this problem by reading the hardware after the epoch has
stabilized.

Cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

igb: bring link up when PHY is powered up

commit aec653c43b0c55667355e26d7de1236bda9fb4e3 upstream.

Call igb_setup_link() when the PHY is powered up.

Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Reported-by: Jeff Westfahl <jeff.westfahl@ni.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Vincent Donnefort <vdonnefort@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ext2: Fix oops in ext2_get_block() called from ext2_quota_write()

commit df4e7ac0bb70abc97fbfd9ef09671fc084b3f9db upstream.

ext2_quota_write() doesn't properly setup bh it passes to
ext2_get_block() and thus we hit assertion BUG_ON(maxblocks == 0) in
ext2_get_blocks() (or we could actually ask for mapping arbitrary number
of blocks depending on whatever value was on stack).

Fix ext2_quota_write() to properly fill in number of blocks to map.

Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reported-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

nEPT: Nested INVEPT

commit bfd0a56b90005f8c8a004baf407ad90045c2b11e upstream.

If we let L1 use EPT, we should probably also support the INVEPT instruction.

In our current nested EPT implementation, when L1 changes its EPT table
for L2 (i.e., EPT12), L0 modifies the shadow EPT table (EPT02), and in
the course of this modification already calls INVEPT. But if last level
of shadow page is unsync not all L1's changes to EPT12 are intercepted,
which means roots need to be synced when L1 calls INVEPT. Global INVEPT
should not be different since roots are synced by kvm_mmu_load() each
time EPTP02 changes.

Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
Signed-off-by: Jun Nakajima <jun.nakajima@intel.com>
Signed-off-by: Xinhao Xu <xinhao.xu@intel.com>
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[bwh: Backported to 3.2:
- Adjust context, filename
- Simplify handle_invept() as recommended by Paolo - nEPT is not
supported so we always raise #UD]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: Vinson Lee <vlee@twopensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: sctp: use MAX_HEADER for headroom reserve in output path

[ Upstream commit 9772b54c55266ce80c639a80aa68eeb908f8ecf5 ]

To accomodate for enough headroom for tunnels, use MAX_HEADER instead
of LL_MAX_HEADER. Robert reported that he has hit after roughly 40hrs
of trinity an skb_under_panic() via SCTP output path (see reference).
I couldn't reproduce it from here, but not using MAX_HEADER as elsewhere
in other protocols might be one possible cause for this.

In any case, it looks like accounting on chunks themself seems to look
good as the skb already passed the SCTP output path and did not hit
any skb_over_panic(). Given tunneling was enabled in his .config, the
headroom would have been expanded by MAX_HEADER in this case.

Reported-by: Robert Święcki <robert@swiecki.net>
Reference: https://lkml.org/lkml/2014/12/1/507
Fixes: 594ccc14dfe4d ("[SCTP] Replace incorrect use of dev_alloc_skb with alloc_skb in sctp_packet_transmit().")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: mvneta: fix Tx interrupt delay

[ Upstream commit aebea2ba0f7495e1a1c9ea5e753d146cb2f6b845 ]

The mvneta driver sets the amount of Tx coalesce packets to 16 by
default. Normally that does not cause any trouble since the driver
uses a much larger Tx ring size (532 packets). But some sockets
might run with very small buffers, much smaller than the equivalent
of 16 packets. This is what ping is doing for example, by setting
SNDBUF to 324 bytes rounded up to 2kB by the kernel.

The problem is that there is no documented method to force a specific
packet to emit an interrupt (eg: the last of the ring) nor is it
possible to make the NIC emit an interrupt after a given delay.

In this case, it causes trouble, because when ping sends packets over
its raw socket, the few first packets leave the system, and the first
15 packets will be emitted without an IRQ being generated, so without
the skbs being freed. And since the socket's buffer is small, there's
no way to reach that amount of packets, and the ping ends up with
"send: no buffer available" after sending 6 packets. Running with 3
instances of ping in parallel is enough to hide the problem, because
with 6 packets per instance, that's 18 packets total, which is enough
to grant a Tx interrupt before all are sent.

The original driver in the LSP kernel worked around this design flaw
by using a software timer to clean up the Tx descriptors. This timer
was slow and caused terrible network performance on some Tx-bound
workloads (such as routing) but was enough to make tools like ping
work correctly.

Instead here, we simply set the packet counts before interrupt to 1.
This ensures that each packet sent will produce an interrupt. NAPI
takes care of coalescing interrupts since the interrupt is disabled
once generated.

No measurable performance impact nor CPU usage were observed on small
nor large packets, including when saturating the link on Tx, and this
fixes tools like ping which rely on too small a send buffer. If one
wants to increase this value for certain workloads where it is safe
to do so, "ethtool -C $dev tx-frames" will override this default
setting.

This fix needs to be applied to stable kernels starting with 3.10.

Tested-By: Maggie Mae Roxas <maggie.mae.roxas@gmail.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

rtnetlink: release net refcnt on error in do_setlink()

[ Upstream commit e0ebde0e131b529fd721b24f62872def5ec3718c ]

rtnl_link_get_net() holds a reference on the 'struct net', we need to release
it in case of error.

CC: Eric W. Biederman <ebiederm@xmission.com>
Fixes: b51642f6d77b ("net: Enable a userns root rtnl calls that are safe for unprivilged users")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net/mlx4_core: Limit count field to 24 bits in qp_alloc_res

[ Upstream commit 2d5c57d7fbfaa642fb7f0673df24f32b83d9066c ]

Some VF drivers use the upper byte of "param1" (the qp count field)
in mlx4_qp_reserve_range() to pass flags which are used to optimize
the range allocation.

Under the current code, if any of these flags are set, the 32-bit
count field yields a count greater than 2^24, which is out of range,
and this VF fails.

As these flags represent a "best-effort" allocation hint anyway, they may
safely be ignored. Therefore, the PF driver may simply mask out the bits.

Fixes: c82e9aa0a8 "mlx4_core: resource tracking for HCA resources used by guests"
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

tg3: fix ring init when there are more TX than RX channels

[ Upstream commit a620a6bc1c94c22d6c312892be1e0ae171523125 ]

If TX channels are set to 4 and RX channels are set to less than 4,
using ethtool -L, the driver will try to initialize more RX channels
than it has allocated, causing an oops.

This fix only initializes the RX ring if it has been allocated.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ipv6: gre: fix wrong skb->protocol in WCCP

[ Upstream commit be6572fdb1bfbe23b2624d477de50af50b02f5d6 ]

When using GRE redirection in WCCP, it sets the wrong skb->protocol,
that is, ETH_P_IP instead of ETH_P_IPV6 for the encapuslated traffic.

Fixes: c12b395a4664 ("gre: Support GRE over IPv6")
Cc: Dmitry Kozlov <xeb@mail.ru>
Signed-off-by: Yuri Chislov <yuri.chislov@gmail.com>
Tested-by: Yuri Chislov <yuri.chislov@gmail.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

sata_fsl: fix error handling of irq_of_parse_and_map

commit aad0b624129709c94c2e19e583b6053520353fa8 upstream.

irq_of_parse_and_map() returns 0 on error (the result is unsigned int),
so testing for negative result never works.

Signed-off-by: Dmitry Torokhov <dtor@chromium.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ahci: disable MSI on SAMSUNG 0xa800 SSD

commit 2b21ef0aae65f22f5ba86b13c4588f6f0c2dbefb upstream.

Just like 0x1600 which got blacklisted by 66a7cbc303f4 ("ahci: disable
MSI instead of NCQ on Samsung pci-e SSDs on macbooks"), 0xa800 chokes
on NCQ commands if MSI is enabled. Disable MSI.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Dominik Mierzejewski <dominik@greysector.net>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=89171
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

AHCI: Add DeviceIDs for Sunrise Point-LP SATA controller

commit 249cd0a187ed4ef1d0af7f74362cc2791ec5581b upstream.

This patch adds DeviceIDs for Sunrise Point-LP.

Signed-off-by: Devin Ryles <devin.ryles@intel.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

media: smiapp: Only some selection targets are settable

commit b31eb901c4e5eeef4c83c43dfbc7fe0d4348cb21 upstream.

Setting a non-settable selection target caused BUG() to be called. The check
for valid selections only takes the selection target into account, but does
not tell whether it may be set, or only get. Fix the issue by simply
returning an error to the user.

Signed-off-by: Sakari Ailus <sakari.ailus@iki.fi>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/i915: Unlock panel even when LVDS is disabled

commit b0616c5306b342ceca07044dbc4f917d95c4f825 upstream.

Otherwise we'll have backtraces in assert_panel_unlocked because the
BIOS locks the register. In the reporter's case this regression was
introduced in

commit c31407a3672aaebb4acddf90944a114fa5c8af7b
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Thu Oct 18 21:07:01 2012 +0100

drm/i915: Add no-lvds quirk for Supermicro X7SPA-H

Reported-by: Alexey Orishko <alexey.orishko@gmail.com>
Cc: Alexey Orishko <alexey.orishko@gmail.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Francois Tigeot <ftigeot@wolfpond.org>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Tested-by: Alexey Orishko <alexey.orishko@gmail.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/radeon: kernel panic in drm_calc_vbltimestamp_from_scanoutpos with 3.18.0-rc6

commit f5475cc43c899e33098d4db44b7c5e710f16589d upstream.

I was unable too boot 3.18.0-rc6 because of the following kernel
panic in drm_calc_vbltimestamp_from_scanoutpos():

    [drm] Initialized drm 1.1.0 20060810
    [drm] radeon kernel modesetting enabled.
    [drm] initializing kernel modesetting (RV100 0x1002:0x515E 0x15D9:0x8080).
    [drm] register mmio base: 0xC8400000
    [drm] register mmio size: 65536
    radeon 0000:0b:01.0: VRAM: 128M 0x00000000D0000000 - 0x00000000D7FFFFFF (16M used)
    radeon 0000:0b:01.0: GTT: 512M 0x00000000B0000000 - 0x00000000CFFFFFFF
    [drm] Detected VRAM RAM=128M, BAR=128M
    [drm] RAM width 16bits DDR
    [TTM] Zone  kernel: Available graphics memory: 3829346 kiB
    [TTM] Zone   dma32: Available graphics memory: 2097152 kiB
    [TTM] Initializing pool allocator
    [TTM] Initializing DMA pool allocator
    [drm] radeon: 16M of VRAM memory ready
    [drm] radeon: 512M of GTT memory ready.
    [drm] GART: num cpu pages 131072, num gpu pages 131072
    [drm] PCI GART of 512M enabled (table at 0x0000000037880000).
    radeon 0000:0b:01.0: WB disabled
    radeon 0000:0b:01.0: fence driver on ring 0 use gpu addr 0x00000000b0000000 and cpu addr 0xffff8800bbbfa000
    [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
    [drm] Driver supports precise vblank timestamp query.
    [drm] radeon: irq initialized.
    [drm] Loading R100 Microcode
    radeon 0000:0b:01.0: Direct firmware load for radeon/R100_cp.bin failed with error -2
    radeon_cp: Failed to load firmware "radeon/R100_cp.bin"
    [drm:r100_cp_init] *ERROR* Failed to load firmware!
    radeon 0000:0b:01.0: failed initializing CP (-2).
    radeon 0000:0b:01.0: Disabling GPU acceleration
    [drm] radeon: cp finalized
    BUG: unable to handle kernel NULL pointer dereference at 000000000000025c
    IP: [<ffffffff8150423b>] drm_calc_vbltimestamp_from_scanoutpos+0x4b/0x320
    PGD 0
    Oops: 0000 [#1] SMP
    Modules linked in:
    CPU: 1 PID: 1 Comm: swapper/0 Not tainted 3.18.0-rc6-4-default #2649
    Hardware name: Supermicro X7DB8/X7DB8, BIOS 6.00 07/26/2006
    task: ffff880234da2010 ti: ffff880234da4000 task.ti: ffff880234da4000
    RIP: 0010:[<ffffffff8150423b>]  [<ffffffff8150423b>] drm_calc_vbltimestamp_from_scanoutpos+0x4b/0x320
    RSP: 0000:ffff880234da7918  EFLAGS: 00010086
    RAX: ffffffff81557890 RBX: 0000000000000000 RCX: ffff880234da7a48
    RDX: ffff880234da79f4 RSI: 0000000000000000 RDI: ffff880232e15000
    RBP: ffff880234da79b8 R08: 0000000000000000 R09: 0000000000000000
    R10: 000000000000000a R11: 0000000000000001 R12: ffff880232dda1c0
    R13: ffff880232e1518c R14: 0000000000000292 R15: ffff880232e15000
    FS:  0000000000000000(0000) GS:ffff88023fc40000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    CR2: 000000000000025c CR3: 0000000002014000 CR4: 00000000000007e0
    Stack:
     ffff880234da79d8 0000000000000286 ffff880232dcbc00 0000000000002480
     ffff880234da7958 0000000000000296 ffff880234da7998 ffffffff8151b51d
     ffff880234da7a48 0000000032dcbeb0 ffff880232dcbc00 ffff880232dcbc58
    Call Trace:
     [<ffffffff8151b51d>] ? drm_vma_offset_remove+0x1d/0x110
     [<ffffffff8152dc98>] radeon_get_vblank_timestamp_kms+0x38/0x60
     [<ffffffff8152076a>] ? ttm_bo_release_list+0xba/0x180
     [<ffffffff81503751>] drm_get_last_vbltimestamp+0x41/0x70
     [<ffffffff81503933>] vblank_disable_and_save+0x73/0x1d0
     [<ffffffff81106b2f>] ? try_to_del_timer_sync+0x4f/0x70
     [<ffffffff81505245>] drm_vblank_cleanup+0x65/0xa0
     [<ffffffff815604fa>] radeon_irq_kms_fini+0x1a/0x70
     [<ffffffff8156c07e>] r100_init+0x26e/0x410
     [<ffffffff8152ae3e>] radeon_device_init+0x7ae/0xb50
     [<ffffffff8152d57f>] radeon_driver_load_kms+0x8f/0x210
     [<ffffffff81506965>] drm_dev_register+0xb5/0x110
     [<ffffffff8150998f>] drm_get_pci_dev+0x8f/0x200
     [<ffffffff815291cd>] radeon_pci_probe+0xad/0xe0
     [<ffffffff8141a365>] local_pci_probe+0x45/0xa0
     [<ffffffff8141b741>] pci_device_probe+0xd1/0x130
     [<ffffffff81633dad>] driver_probe_device+0x12d/0x3e0
     [<ffffffff8163413b>] __driver_attach+0x9b/0xa0
     [<ffffffff816340a0>] ? __device_attach+0x40/0x40
     [<ffffffff81631cd3>] bus_for_each_dev+0x63/0xa0
     [<ffffffff8163378e>] driver_attach+0x1e/0x20
     [<ffffffff81633390>] bus_add_driver+0x180/0x240
     [<ffffffff81634914>] driver_register+0x64/0xf0
     [<ffffffff81419cac>] __pci_register_driver+0x4c/0x50
     [<ffffffff81509bf5>] drm_pci_init+0xf5/0x120
     [<ffffffff821dc871>] ? ttm_init+0x6a/0x6a
     [<ffffffff821dc908>] radeon_init+0x97/0xb5
     [<ffffffff810002fc>] do_one_initcall+0xbc/0x1f0
     [<ffffffff810e3278>] ? __wake_up+0x48/0x60
     [<ffffffff8218e256>] kernel_init_freeable+0x18a/0x215
     [<ffffffff8218d983>] ? initcall_blacklist+0xc0/0xc0
     [<ffffffff818a78f0>] ? rest_init+0x80/0x80
     [<ffffffff818a78fe>] kernel_init+0xe/0xf0
     [<ffffffff818c0c3c>] ret_from_fork+0x7c/0xb0
     [<ffffffff818a78f0>] ? rest_init+0x80/0x80
    Code: 45 ac 0f 88 a8 01 00 00 3b b7 d0 01 00 00 49 89 ff 0f 83 99 01 00 00 48 8b 47 20 48 8b 80 88 00 00 00 48 85 c0 0f 84 cd 01 00 00 <41> 8b b1 5c 02 00 00 41 8b 89 58 02 00 00 89 75 98 41 8b b1 60
    RIP  [<ffffffff8150423b>] drm_calc_vbltimestamp_from_scanoutpos+0x4b/0x320
     RSP <ffff880234da7918>
    CR2: 000000000000025c
    ---[ end trace ad2c0aadf48e2032 ]---
    Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009

It has helped me to add a NULL pointer check that was suggested at
http://lists.freedesktop.org/archives/dri-devel/2014-October/070663.html

I am not familiar with the code. But the change looks sane
and we need something fast at this stage of 3.18 development.

Suggested-by: Helge Deller <deller@gmx.de>
Signed-off-by: Petr Mladek <pmladek@suse.cz>
Tested-by: Petr Mladek <pmladek@suse.cz>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

i2c: davinci: generate STP always when NACK is received

commit 9ea359f7314132cbcb5a502d2d8ef095be1f45e4 upstream.

According to I2C specification the NACK should be handled as follows:
"When SDA remains HIGH during this ninth clock pulse, this is defined as the Not
Acknowledge signal. The master can then generate either a STOP condition to
abort the transfer, or a repeated START condition to start a new transfer."
[I2C spec Rev. 6, 3.1.6: http://www.nxp.com/documents/user_manual/UM10204.pdf]

Currently the Davinci i2c driver interrupts the transfer on receipt of a
NACK but fails to send a STOP in some situations and so makes the bus
stuck until next I2C IP reset (idle/enable).

For example, the issue will happen during SMBus read transfer which
consists from two i2c messages write command/address and read data:

S Slave Address Wr A Command Code A Sr Slave Address Rd A D1..Dn A P
<--- write -----------------------> <--- read --------------------->

The I2C client device will send NACK if it can't recognize "Command Code"
and it's expected from I2C master to generate STP in this case.
But now, Davinci i2C driver will just exit with -EREMOTEIO and STP will
not be generated.

Hence, fix it by generating Stop condition (STP) always when NACK is received.

This patch fixes Davinci I2C in the same way it was done for OMAP I2C
commit cda2109a26eb ("i2c: omap: query STP always when NACK is received").

Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reported-by: Hein Tibosch <hein_tibosch@yahoo.es>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

i2c: omap: fix i207 errata handling

commit ccfc866356674cb3a61829d239c685af6e85f197 upstream.

commit 6d9939f651419a63e091105663821f9c7d3fec37 (i2c: omap: split out [XR]DR
and [XR]RDY) changed the way how errata i207 (I2C: RDR Flag May Be Incorrectly
Set) get handled. 6d9939f6514 code doesn't correspond to workaround provided by
errata.

According to errata ISR must filter out spurious RDR before data read not after.
ISR must read RXSTAT to get number of bytes available to read. Because RDR
could be set while there could no data in the receive FIFO.

Restored pre 6d9939f6514 way of handling errata.

Found by code review. Real impact haven't seen.
Tested on Beagleboard XM C.

Signed-off-by: Alexander Kochetkov <al.kochet@gmail.com>
Fixes: 6d9939f651419a63e09110 i2c: omap: split out [XR]DR and [XR]RDY
Tested-by: Felipe Balbi <balbi@ti.com>
Reviewed-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

i2c: omap: fix NACK and Arbitration Lost irq handling

commit 27caca9d2e01c92b26d0690f065aad093fea01c7 upstream.

commit 1d7afc95946487945cc7f5019b41255b72224b70 (i2c: omap: ack IRQ in parts)
changed the interrupt handler to complete transfers without clearing
XRDY (AL case) and ARDY (NACK case) flags. XRDY or ARDY interrupts will be
fired again. As a result, ISR keep processing transfer after it was already
complete (from the driver code point of view).

A didn't see real impacts of the 1d7afc9, but it is really bad idea to
have ISR running on user data after transfer was complete.

It looks, what 1d7afc9 violate TI specs in what how AL and NACK should be
handled (see Note 1, sprugn4r, Figure 17-31 and Figure 17-32).

According to specs (if I understood correctly), in case of NACK and AL driver
must reset NACK, AL, ARDY, RDR, and RRDY (Master Receive Mode), and
NACK, AL, ARDY, and XDR (Master Transmitter Mode).

All that is done down the code under the if condition:
if (stat & (OMAP_I2C_STAT_ARDY | OMAP_I2C_STAT_NACK | OMAP_I2C_STAT_AL)) ...

The patch restore pre 1d7afc9 logic of handling NACK and AL interrupts, so
no interrupts is fired after ISR informs the rest of driver what transfer
complete.

Note: instead of removing break under NACK case, we could just replace 'break'
with 'continue' and allow NACK transfer to finish using ARDY event. I found
that NACK and ARDY bits usually set together. That case confirm TI wiki:
http://processors.wiki.ti.com/index.php/I2C_Tips#Detecting_and_handling_NACK

In order if someone interested in the event traces for NACK and AL cases,
I sent them to mailing list.

Tested on Beagleboard XM C.

Signed-off-by: Alexander Kochetkov <al.kochet@gmail.com>
Fixes: 1d7afc9 i2c: omap: ack IRQ in parts
Acked-by: Felipe Balbi <balbi@ti.com>
Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

xen-netfront: Remove BUGs on paged skb data which crosses a page boundary

commit 8d609725d4357f499e2103e46011308b32f53513 upstream.

These BUGs can be erroneously triggered by frags which refer to
tail pages within a compound page. The data in these pages may
overrun the hardware page while still being contained within the
compound page, but since compound_order() evaluates to 0 for tail
pages the assertion fails. The code already iterates through
subsequent pages correctly in this scenario, so the BUGs are
unnecessary and can be removed.

Fixes: f36c374782e4 ("xen/netfront: handle compound page fragments on transmit")
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm: fix swapoff hang after page migration and fork

commit 2022b4d18a491a578218ce7a4eca8666db895a73 upstream.

I've been seeing swapoff hangs in recent testing: it's cycling around
trying unsuccessfully to find an mm for some remaining pages of swap.

I have been exercising swap and page migration more heavily recently,
and now notice a long-standing error in copy_one_pte(): it's trying to
add dst_mm to swapoff's mmlist when it finds a swap entry, but is doing
so even when it's a migration entry or an hwpoison entry.

Which wouldn't matter much, except it adds dst_mm next to src_mm,
assuming src_mm is already on the mmlist: which may not be so. Then if
pages are later swapped out from dst_mm, swapoff won't be able to find
where to replace them.

There's already a !non_swap_entry() test for stats: move that up before
the swap_duplicate() and the addition to mmlist.

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Kelley Nielsen <kelleynnn@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm: frontswap: invalidate expired data on a dup-store failure

commit fb993fa1a2f669215fa03a09eed7848f2663e336 upstream.

If a frontswap dup-store failed, it should invalidate the expired page
in the backend, or it could trigger some data corruption issue.
Such as:
1. use zswap as the frontswap backend with writeback feature
2. store a swap page(version_1) to entry A, success
3. dup-store a newer page(version_2) to the same entry A, fail
4. use __swap_writepage() write version_2 page to swapfile, success
5. zswap do shrink, writeback version_1 page to swapfile
6. version_2 page is overwrited by version_1, data corrupt.

This patch fixes this issue by invalidating expired data immediately
when meet a dup-store failure.

Signed-off-by: Weijie Yang <weijie.yang@samsung.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Seth Jennings <sjennings@variantweb.net>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Bob Liu <bob.liu@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Merge remote-tracking branch 'lsk/v3.10/topic/genpd' into linux-linaro-lsk

PM / Domains: Add generic OF-based PM domain look-up

This patch introduces generic code to perform PM domain look-up using
device tree and automatically bind devices to their PM domains.

Generic device tree bindings are introduced to specify PM domains of
devices in their device tree nodes.

Backwards compatibility with legacy Samsung-specific PM domain bindings
is provided, but for now the new code is not compiled when
CONFIG_ARCH_EXYNOS is selected to avoid collision with legacy code.
This will change as soon as the Exynos PM domain code gets converted to
use the generic framework in further patch.

This patch was originally submitted by Tomasz Figa when he was employed
by Samsung.

Link: http://marc.info/?l=linux-pm&m=139955349702152&w=2
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Rob Herring <robh@kernel.org>
Tested-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit aa42240ab2544a8bcb2efb400193826f57f3175e)
Signed-off-by: Mark Brown <broonie@kernel.org>
Conflicts:
include/linux/pm_domain.h

ACPI / PM: Assign the ->detach() callback when attaching the PM domain

As as preparation to simplify the detachment of devices from their PM
domains, we assign the ->detach() callback to genpd_dev_pm_detach().

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit 86f1e15f5646b4855bd77025c950239650c4843e)
Signed-off-by: Mark Brown <broonie@kernel.org>

PM / Domains: Add a detach callback to the struct dev_pm_domain

The intent of this callback is to simplify detachment of devices from
their PM domains. Further patches will show the benefit.

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit c3099a5294f2c7266234e8ea35cbffc20a41aa9a)
Signed-off-by: Mark Brown <broonie@kernel.org>

Merge remote-tracking branch 'lsk/v3.10/topic/clk-divider' into linux-linaro-lsk

clk: divider: Fix overflow in clk_divider_bestdiv

Commit c686078 ("clk: divider: Add round to closest divider") introduced
a helper function to check whether given divisor is the best one instead
of direct check. However due to int type used instead of unsigned long
for passing calculated rates to this function in certain cases an
overflow could occur, for example when trying to obtain maximum possible
clock rate by calling clk_round_rate(..., UINT_MAX).

This patch fixes this issue by changing the type of rate, now and best
arguments of the function to unsigned long, which is the type that
should be used for clock rates.

Signed-off-by: Tomasz Figa <t.figa@samsung.com>
Acked-by: Maxime Coquelin <maxime.coquelin@st.com>
Signed-off-by: Mike Turquette <mturquette@linaro.org>
(cherry picked from commit 3c17296f28820d2a9fa23e549a1e808a4df5dfc6)
Signed-off-by: Mark Brown <broonie@kernel.org>

clk: divider: Fix table round up function

Commit 1d9fe6b97 ("clk: divider: Fix best div calculation for power-of-two and
table dividers") introduces a regression in its _table_round_up function.

When the divider passed to this function is greater than the max divider
available in the table, this function returns table's max divider.
Problem is that it causes an infinite loop in clk_divider_bestdiv() because
_next_div() will never return a value greater than maxdiv.

Instead of returning table's max divider, this patch returns INT_MAX.

Reported-by: Fabio Estevam <festevam@gmail.com>
Reported-by: Shawn Guo <shawn.guo@freescale.com>
Tested-by: Fabio Estevam <festevam@gmail.com>
Tested-by: Shawn Guo <shawn.guo@freescale.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@st.com>
Signed-off-by: Mike Turquette <mturquette@linaro.org>
(cherry picked from commit fe52e7505f8bf365d5ab0eeee19ababe406cbaaf)
Signed-off-by: Mark Brown <broonie@kernel.org>

clk: divider: Add round to closest divider

In some cases, we want to be able to round the divider to the closest one,
instead than rounding up.

This patch adds a new CLK_DIVIDER_ROUND_CLOSEST flag to specify the divider
has to round to closest div, keeping rounding up as de default behaviour.

Signed-off-by: Maxime Coquelin <maxime.coquelin@st.com>
Signed-off-by: Mike Turquette <mturquette@linaro.org>
(cherry picked from commit 774b514390b1eb8476bc759262790762bd1ef45a)
Signed-off-by: Mark Brown <broonie@kernel.org>
Conflicts:
include/linux/clk-provider.h

clk: divider: Fix best div calculation for power-of-two and table dividers

The divider returned by clk_divider_bestdiv() is likely to be invalid in case
of power-of-two and table dividers when CLK_SET_RATE_PARENT flag isn't set.

Fixes boot on STiH416 platform.

Signed-off-by: Maxime Coquelin <maxime.coquelin@st.com>
Signed-off-by: Mike Turquette <mturquette@linaro.org>
[mturquette@linaro.org: trivial merge conflict & updated changelog]

(cherry picked from commit dd23c2cd38da2c64af381b19795d2c4f115e8ecb)
Signed-off-by: Mark Brown <broonie@kernel.org>