firefly-linux-kernel-4.4.55.git
8 years agoRevert "[media] videobuf2-v4l2: Verify planes array in buffer dequeueing"
Mauro Carvalho Chehab [Wed, 11 May 2016 16:09:34 +0000 (13:09 -0300)]
Revert "[media] videobuf2-v4l2: Verify planes array in buffer dequeueing"

commit 93f0750dcdaed083d6209b01e952e98ca730db66 upstream.

This patch causes a Kernel panic when called on a DVB driver.

This was also reported by David R <david@unsolicited.net>:

May  7 14:47:35 server kernel: [  501.247123] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
May  7 14:47:35 server kernel: [  501.247239] IP: [<ffffffffa0222c71>] __verify_planes_array.isra.3+0x1/0x80 [videobuf2_v4l2]
May  7 14:47:35 server kernel: [  501.247354] PGD cae6f067 PUD ca99c067 PMD 0
May  7 14:47:35 server kernel: [  501.247426] Oops: 0000 [#1] SMP
May  7 14:47:35 server kernel: [  501.247482] Modules linked in: xfs tun xt_connmark xt_TCPMSS xt_tcpmss xt_owner xt_REDIRECT nf_nat_redirect xt_nat ipt_MASQUERADE nf_nat_masquerade_ipv4 ts_kmp ts_bm xt_string ipt_REJECT nf_reject_ipv4 xt_recent xt_conntrack xt_multiport xt_pkttype xt_tcpudp xt_mark nf_log_ipv4 nf_log_common xt_LOG xt_limit iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables ip6table_filter ip6_tables x_tables pppoe pppox dm_crypt ts2020 regmap_i2c ds3000 cx88_dvb dvb_pll cx88_vp3054_i2c mt352 videobuf2_dvb cx8800 cx8802 cx88xx pl2303 tveeprom videobuf2_dma_sg ppdev videobuf2_memops videobuf2_v4l2 videobuf2_core dvb_usb_digitv snd_hda_codec_via snd_hda_codec_hdmi snd_hda_codec_generic radeon dvb_usb snd_hda_intel amd64_edac_mod serio_raw snd_hda_codec edac_core fbcon k10temp bitblit softcursor snd_hda_core font snd_pcm_oss i2c_piix4 snd_mixer_oss tileblit drm_kms_helper syscopyarea snd_pcm snd_seq_dummy sysfillrect snd_seq_oss sysimgblt fb_sys_fops ttm snd_seq_midi r8169 snd_rawmidi drm snd_seq_midi_event e1000e snd_seq snd_seq_device snd_timer snd ptp pps_core i2c_algo_bit soundcore parport_pc ohci_pci shpchp tpm_tis tpm nfsd auth_rpcgss oid_registry hwmon_vid exportfs nfs_acl mii nfs bonding lockd grace lp sunrpc parport
May  7 14:47:35 server kernel: [  501.249564] CPU: 1 PID: 6889 Comm: vb2-cx88[0] Not tainted 4.5.3 #3
May  7 14:47:35 server kernel: [  501.249644] Hardware name: System manufacturer System Product Name/M4A785TD-V EVO, BIOS 0211    07/08/2009
May  7 14:47:35 server kernel: [  501.249767] task: ffff8800aebf3600 ti: ffff8801e07a0000 task.ti: ffff8801e07a0000
May  7 14:47:35 server kernel: [  501.249861] RIP: 0010:[<ffffffffa0222c71>]  [<ffffffffa0222c71>] __verify_planes_array.isra.3+0x1/0x80 [videobuf2_v4l2]
May  7 14:47:35 server kernel: [  501.250002] RSP: 0018:ffff8801e07a3de8  EFLAGS: 00010086
May  7 14:47:35 server kernel: [  501.250071] RAX: 0000000000000283 RBX: ffff880210dc5000 RCX: 0000000000000283
May  7 14:47:35 server kernel: [  501.250161] RDX: ffffffffa0222cf0 RSI: 0000000000000000 RDI: ffff880210dc5014
May  7 14:47:35 server kernel: [  501.250251] RBP: ffff8801e07a3df8 R08: ffff8801e07a0000 R09: 0000000000000000
May  7 14:47:35 server kernel: [  501.250348] R10: 0000000000000000 R11: 0000000000000001 R12: ffff8800cda2a9d8
May  7 14:47:35 server kernel: [  501.250438] R13: ffff880210dc51b8 R14: 0000000000000000 R15: ffff8800cda2a828
May  7 14:47:35 server kernel: [  501.250528] FS:  00007f5b77fff700(0000) GS:ffff88021fc40000(0000) knlGS:00000000adaffb40
May  7 14:47:35 server kernel: [  501.250631] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
May  7 14:47:35 server kernel: [  501.250704] CR2: 0000000000000004 CR3: 00000000ca19d000 CR4: 00000000000006e0
May  7 14:47:35 server kernel: [  501.250794] Stack:
May  7 14:47:35 server kernel: [  501.250822]  ffff8801e07a3df8 ffffffffa0222cfd ffff8801e07a3e70 ffffffffa0236beb
May  7 14:47:35 server kernel: [  501.250937]  0000000000000283 ffff8801e07a3e94 0000000000000000 0000000000000000
May  7 14:47:35 server kernel: [  501.251051]  ffff8800aebf3600 ffffffff8108d8e0 ffff8801e07a3e38 ffff8801e07a3e38
May  7 14:47:35 server kernel: [  501.251165] Call Trace:
May  7 14:47:35 server kernel: [  501.251200]  [<ffffffffa0222cfd>] ? __verify_planes_array_core+0xd/0x10 [videobuf2_v4l2]
May  7 14:47:35 server kernel: [  501.251306]  [<ffffffffa0236beb>] vb2_core_dqbuf+0x2eb/0x4c0 [videobuf2_core]
May  7 14:47:35 server kernel: [  501.251398]  [<ffffffff8108d8e0>] ? prepare_to_wait_event+0x100/0x100
May  7 14:47:35 server kernel: [  501.251482]  [<ffffffffa023855b>] vb2_thread+0x1cb/0x220 [videobuf2_core]
May  7 14:47:35 server kernel: [  501.251569]  [<ffffffffa0238390>] ? vb2_core_qbuf+0x230/0x230 [videobuf2_core]
May  7 14:47:35 server kernel: [  501.251662]  [<ffffffffa0238390>] ? vb2_core_qbuf+0x230/0x230 [videobuf2_core]
May  7 14:47:35 server kernel: [  501.255982]  [<ffffffff8106f984>] kthread+0xc4/0xe0
May  7 14:47:35 server kernel: [  501.260292]  [<ffffffff8106f8c0>] ? kthread_park+0x50/0x50
May  7 14:47:35 server kernel: [  501.264615]  [<ffffffff81697a5f>] ret_from_fork+0x3f/0x70
May  7 14:47:35 server kernel: [  501.268962]  [<ffffffff8106f8c0>] ? kthread_park+0x50/0x50
May  7 14:47:35 server kernel: [  501.273216] Code: 0d 01 74 16 48 8b 46 28 48 8b 56 30 48 89 87 d0 01 00 00 48 89 97 d8 01 00 00 5d c3 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 <8b> 46 04 48 89 e5 8d 50 f7 31 c0 83 fa 01 76 02 5d c3 48 83 7e
May  7 14:47:35 server kernel: [  501.282146] RIP  [<ffffffffa0222c71>] __verify_planes_array.isra.3+0x1/0x80 [videobuf2_v4l2]
May  7 14:47:35 server kernel: [  501.286391]  RSP <ffff8801e07a3de8>
May  7 14:47:35 server kernel: [  501.290619] CR2: 0000000000000004
May  7 14:47:35 server kernel: [  501.294786] ---[ end trace b2b354153ccad110 ]---

This reverts commit 2c1f6951a8a82e6de0d82b1158b5e493fc6c54ab.

Cc: Sakari Ailus <sakari.ailus@linux.intel.com>
Cc: Hans Verkuil <hans.verkuil@cisco.com>
Fixes: 2c1f6951a8a8 ("[media] videobuf2-v4l2: Verify planes array in buffer dequeueing")
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoInput: max8997-haptic - fix NULL pointer dereference
Marek Szyprowski [Mon, 9 May 2016 16:31:47 +0000 (09:31 -0700)]
Input: max8997-haptic - fix NULL pointer dereference

commit 6ae645d5fa385f3787bf1723639cd907fe5865e7 upstream.

NULL pointer derefence happens when booting with DTB because the
platform data for haptic device is not set in supplied data from parent
MFD device.

The MFD device creates only platform data (from Device Tree) for itself,
not for haptic child.

Unable to handle kernel NULL pointer dereference at virtual address 0000009c
pgd = c0004000
[0000009c] *pgd=00000000
Internal error: Oops: 5 [#1] PREEMPT SMP ARM
(max8997_haptic_probe) from [<c03f9cec>] (platform_drv_probe+0x4c/0xb0)
(platform_drv_probe) from [<c03f8440>] (driver_probe_device+0x214/0x2c0)
(driver_probe_device) from [<c03f8598>] (__driver_attach+0xac/0xb0)
(__driver_attach) from [<c03f67ac>] (bus_for_each_dev+0x68/0x9c)
(bus_for_each_dev) from [<c03f7a38>] (bus_add_driver+0x1a0/0x218)
(bus_add_driver) from [<c03f8db0>] (driver_register+0x78/0xf8)
(driver_register) from [<c0101774>] (do_one_initcall+0x90/0x1d8)
(do_one_initcall) from [<c0a00dbc>] (kernel_init_freeable+0x15c/0x1fc)
(kernel_init_freeable) from [<c06bb5b4>] (kernel_init+0x8/0x114)
(kernel_init) from [<c0107938>] (ret_from_fork+0x14/0x3c)

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Fixes: 104594b01ce7 ("Input: add driver support for MAX8997-haptic")
[k.kozlowski: Write commit message, add CC-stable]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoget_rock_ridge_filename(): handle malformed NM entries
Al Viro [Thu, 5 May 2016 20:25:35 +0000 (16:25 -0400)]
get_rock_ridge_filename(): handle malformed NM entries

commit 99d825822eade8d827a1817357cbf3f889a552d6 upstream.

Payloads of NM entries are not supposed to contain NUL.  When we run
into such, only the part prior to the first NUL goes into the
concatenation (i.e. the directory entry name being encoded by a bunch
of NM entries).  We do stop when the amount collected so far + the
claimed amount in the current NM entry exceed 254.  So far, so good,
but what we return as the total length is the sum of *claimed*
sizes, not the actual amount collected.  And that can grow pretty
large - not unlimited, since you'd need to put CE entries in
between to be able to get more than the maximum that could be
contained in one isofs directory entry / continuation chunk and
we are stop once we'd encountered 32 CEs, but you can get about 8Kb
easily.  And that's what will be passed to readdir callback as the
name length.  8Kb __copy_to_user() from a buffer allocated by
__get_free_page()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agotools lib traceevent: Do not reassign parg after collapse_tree()
Steven Rostedt [Wed, 11 May 2016 19:09:36 +0000 (15:09 -0400)]
tools lib traceevent: Do not reassign parg after collapse_tree()

commit 106b816cb46ebd87408b4ed99a2e16203114daa6 upstream.

At the end of process_filter(), collapse_tree() was changed to update
the parg parameter, but the reassignment after the call wasn't removed.

What happens is that the "current_op" gets modified and freed and parg
is assigned to the new allocated argument. But after the call to
collapse_tree(), parg is assigned again to the just freed "current_op",
and this causes the tool to crash.

The current_op variable must also be assigned to NULL in case of error,
otherwise it will cause it to be free()ed twice.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Fixes: 42d6194d133c ("tools lib traceevent: Refactor process_filter()")
Link: http://lkml.kernel.org/r/20160511150936.678c18a1@gandalf.local.home
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoqla1280: Don't allocate 512kb of host tags
Johannes Thumshirn [Wed, 27 Apr 2016 08:48:52 +0000 (10:48 +0200)]
qla1280: Don't allocate 512kb of host tags

commit 2bcbc81421c511ef117cadcf0bee9c4340e68db0 upstream.

The qla1280 driver sets the scsi_host_template's can_queue field to 0xfffff
which results in an allocation failure when allocating the block layer tags
for the driver's queues. This was introduced with the change for host wide
tags in commit 64d513ac31b - "scsi: use host wide tags by default".

Reduce can_queue to MAX_OUTSTANDING_COMMANDS (512) to solve the allocation
error.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Fixes: 64d513ac31b - "scsi: use host wide tags by default"
Cc: Laura Abbott <labbott@redhat.com>
Cc: Michael Reed <mdr@sgi.com>
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <jejb@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoatomic_open(): fix the handling of create_error
Al Viro [Wed, 27 Apr 2016 05:11:55 +0000 (01:11 -0400)]
atomic_open(): fix the handling of create_error

commit 10c64cea04d3c75c306b3f990586ffb343b63287 upstream.

* if we have a hashed negative dentry and either CREAT|EXCL on
r/o filesystem, or CREAT|TRUNC on r/o filesystem, or CREAT|EXCL
with failing may_o_create(), we should fail with EROFS or the
error may_o_create() has returned, but not ENOENT.  Which is what
the current code ends up returning.

* if we have CREAT|TRUNC hitting a regular file on a read-only
filesystem, we can't fail with EROFS here.  At the very least,
not until we'd done follow_managed() - we might have a writable
file (or a device, for that matter) bound on top of that one.
Moreover, the code downstream will see that O_TRUNC and attempt
to grab the write access (*after* following possible mount), so
if we really should fail with EROFS, it will happen.  No need
to do that inside atomic_open().

The real logics is much simpler than what the current code is
trying to do - if we decided to go for simple lookup, ended
up with a negative dentry *and* had create_error set, fail with
create_error.  No matter whether we'd got that negative dentry
from lookup_real() or had found it in dcache.

Acked-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoregulator: axp20x: Fix axp22x ldo_io voltage ranges
Hans de Goede [Wed, 27 Apr 2016 13:59:27 +0000 (15:59 +0200)]
regulator: axp20x: Fix axp22x ldo_io voltage ranges

commit a2262e5a12e05389ab4c7fc5cf60016b041dd8dc upstream.

The minium voltage of 1800mV is a copy and paste error from the axp20x
regulator info. The correct minimum voltage for the ldo_io regulators
on the axp22x is 700mV.

Fixes: 1b82b4e4f954 ("regulator: axp20x: Add support for AXP22X regulators")
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoregulator: s2mps11: Fix invalid selector mask and voltages for buck9
Krzysztof Kozlowski [Mon, 28 Mar 2016 04:09:56 +0000 (13:09 +0900)]
regulator: s2mps11: Fix invalid selector mask and voltages for buck9

commit 3b672623079bb3e5685b8549e514f2dfaa564406 upstream.

The buck9 regulator of S2MPS11 PMIC had incorrect vsel_mask (0xff
instead of 0x1f) thus reading entire register as buck9's voltage. This
effectively caused regulator core to interpret values as higher voltages
than they were and then to set real voltage much lower than intended.

The buck9 provides power to other regulators, including LDO13
and LDO19 which supply the MMC2 (SD card). On Odroid XU3/XU4 the lower
voltage caused SD card detection errors on Odroid XU3/XU4:
mmc1: card never left busy state
mmc1: error -110 whilst initialising SD card

During driver probe the regulator core was checking whether initial
voltage matches the constraints. With incorrect vsel_mask of 0xff and
default value of 0x50, the core interpreted this as 5 V which is outside
of constraints (3-3.775 V). Then the regulator core was adjusting the
voltage to match the constraints. With incorrect vsel_mask this new
voltage mapped to a vere low voltage in the driver.

Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Reviewed-by: Javier Martinez Canillas <javier@osg.samsung.com>
Tested-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoworkqueue: fix rebind bound workers warning
Wanpeng Li [Wed, 11 May 2016 09:55:18 +0000 (17:55 +0800)]
workqueue: fix rebind bound workers warning

commit f7c17d26f43d5cc1b7a6b896cd2fa24a079739b9 upstream.

------------[ cut here ]------------
WARNING: CPU: 0 PID: 16 at kernel/workqueue.c:4559 rebind_workers+0x1c0/0x1d0
Modules linked in:
CPU: 0 PID: 16 Comm: cpuhp/0 Not tainted 4.6.0-rc4+ #31
Hardware name: IBM IBM System x3550 M4 Server -[7914IUW]-/00Y8603, BIOS -[D7E128FUS-1.40]- 07/23/2013
 0000000000000000 ffff881037babb58 ffffffff8139d885 0000000000000010
 0000000000000000 0000000000000000 0000000000000000 ffff881037babba8
 ffffffff8108505d ffff881037ba0000 000011cf3e7d6e60 0000000000000046
Call Trace:
 dump_stack+0x89/0xd4
 __warn+0xfd/0x120
 warn_slowpath_null+0x1d/0x20
 rebind_workers+0x1c0/0x1d0
 workqueue_cpu_up_callback+0xf5/0x1d0
 notifier_call_chain+0x64/0x90
 ? trace_hardirqs_on_caller+0xf2/0x220
 ? notify_prepare+0x80/0x80
 __raw_notifier_call_chain+0xe/0x10
 __cpu_notify+0x35/0x50
 notify_down_prepare+0x5e/0x80
 ? notify_prepare+0x80/0x80
 cpuhp_invoke_callback+0x73/0x330
 ? __schedule+0x33e/0x8a0
 cpuhp_down_callbacks+0x51/0xc0
 cpuhp_thread_fun+0xc1/0xf0
 smpboot_thread_fn+0x159/0x2a0
 ? smpboot_create_threads+0x80/0x80
 kthread+0xef/0x110
 ? wait_for_completion+0xf0/0x120
 ? schedule_tail+0x35/0xf0
 ret_from_fork+0x22/0x50
 ? __init_kthread_worker+0x70/0x70
---[ end trace eb12ae47d2382d8f ]---
notify_down_prepare: attempt to take down CPU 0 failed

This bug can be reproduced by below config w/ nohz_full= all cpus:

CONFIG_BOOTPARAM_HOTPLUG_CPU0=y
CONFIG_DEBUG_HOTPLUG_CPU0=y
CONFIG_NO_HZ_FULL=y

As Thomas pointed out:

| If a down prepare callback fails, then DOWN_FAILED is invoked for all
| callbacks which have successfully executed DOWN_PREPARE.
|
| But, workqueue has actually two notifiers. One which handles
| UP/DOWN_FAILED/ONLINE and one which handles DOWN_PREPARE.
|
| Now look at the priorities of those callbacks:
|
| CPU_PRI_WORKQUEUE_UP        = 5
| CPU_PRI_WORKQUEUE_DOWN      = -5
|
| So the call order on DOWN_PREPARE is:
|
| CB 1
| CB ...
| CB workqueue_up() -> Ignores DOWN_PREPARE
| CB ...
| CB X ---> Fails
|
| So we call up to CB X with DOWN_FAILED
|
| CB 1
| CB ...
| CB workqueue_up() -> Handles DOWN_FAILED
| CB ...
| CB X-1
|
| So the problem is that the workqueue stuff handles DOWN_FAILED in the up
| callback, while it should do it in the down callback. Which is not a good idea
| either because it wants to be called early on rollback...
|
| Brilliant stuff, isn't it? The hotplug rework will solve this problem because
| the callbacks become symetric, but for the existing mess, we need some
| workaround in the workqueue code.

The boot CPU handles housekeeping duty(unbound timers, workqueues,
timekeeping, ...) on behalf of full dynticks CPUs. It must remain
online when nohz full is enabled. There is a priority set to every
notifier_blocks:

workqueue_cpu_up > tick_nohz_cpu_down > workqueue_cpu_down

So tick_nohz_cpu_down callback failed when down prepare cpu 0, and
notifier_blocks behind tick_nohz_cpu_down will not be called any
more, which leads to workers are actually not unbound. Then hotplug
state machine will fallback to undo and online cpu 0 again. Workers
will be rebound unconditionally even if they are not unbound and
trigger the warning in this progress.

This patch fix it by catching !DISASSOCIATED to avoid rebind bound
workers.

Cc: Tejun Heo <tj@kernel.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Suggested-by: Lai Jiangshan <jiangshanlai@gmail.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoARM: dts: at91: sam9x5: Fix the memory range assigned to the PMC
Boris Brezillon [Wed, 11 May 2016 09:00:02 +0000 (11:00 +0200)]
ARM: dts: at91: sam9x5: Fix the memory range assigned to the PMC

commit aab0a4c83ceb344d2327194bf354820e50607af6 upstream.

The memory range assigned to the PMC (Power Management Controller) was
not including the PMC_PCR register which are used to control peripheral
clocks.

This was working fine thanks to the page granularity of ioremap(), but
started to fail when we switched to syscon/regmap, because regmap is
making sure that all accesses are falling into the reserved range.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Reported-by: Richard Genoud <richard.genoud@gmail.com>
Tested-by: Richard Genoud <richard.genoud@gmail.com>
Fixes: 863a81c3be1d ("clk: at91: make use of syscon to share PMC registers in several drivers")
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agovfs: rename: check backing inode being equal
Miklos Szeredi [Tue, 10 May 2016 23:16:37 +0000 (01:16 +0200)]
vfs: rename: check backing inode being equal

commit 9409e22acdfc9153f88d9b1ed2bd2a5b34d2d3ca upstream.

If a file is renamed to a hardlink of itself POSIX specifies that rename(2)
should do nothing and return success.

This condition is checked in vfs_rename().  However it won't detect hard
links on overlayfs where these are given separate inodes on the overlayfs
layer.

Overlayfs itself detects this condition and returns success without doing
anything, but then vfs_rename() will proceed as if this was a successful
rename (detach_mounts(), d_move()).

The correct thing to do is to detect this condition before even calling
into overlayfs.  This patch does this by calling vfs_select_inode() to get
the underlying inodes.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agovfs: add vfs_select_inode() helper
Miklos Szeredi [Tue, 10 May 2016 23:16:37 +0000 (01:16 +0200)]
vfs: add vfs_select_inode() helper

commit 54d5ca871e72f2bb172ec9323497f01cd5091ec7 upstream.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoperf/core: Disable the event on a truncated AUX record
Alexander Shishkin [Tue, 10 May 2016 13:18:33 +0000 (16:18 +0300)]
perf/core: Disable the event on a truncated AUX record

commit 9f448cd3cbcec8995935e60b27802ae56aac8cc0 upstream.

When the PMU driver reports a truncated AUX record, it effectively means
that there is no more usable room in the event's AUX buffer (even though
there may still be some room, so that perf_aux_output_begin() doesn't take
action). At this point the consumer still has to be woken up and the event
has to be disabled, otherwise the event will just keep spinning between
perf_aux_output_begin() and perf_aux_output_end() until its context gets
unscheduled.

Again, for cpu-wide events this means never, so once in this condition,
they will be forever losing data.

Fix this by disabling the event and waking up the consumer in case of a
truncated AUX record.

Reported-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/1462886313-13660-3-git-send-email-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoregmap: spmi: Fix regmap_spmi_ext_read in multi-byte case
Jack Pham [Fri, 15 Apr 2016 06:37:26 +0000 (23:37 -0700)]
regmap: spmi: Fix regmap_spmi_ext_read in multi-byte case

commit dec8e8f6e6504aa3496c0f7cc10c756bb0e10f44 upstream.

Specifically for the case of reads that use the Extended Register
Read Long command, a multi-byte read operation is broken up into
8-byte chunks.  However the call to spmi_ext_register_readl() is
incorrectly passing 'val_size', which if greater than 8 will
always fail.  The argument should instead be 'len'.

Fixes: c9afbb05a9ff ("regmap: spmi: support base and extended register spaces")
Signed-off-by: Jack Pham <jackp@codeaurora.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agopinctrl: at91-pio4: fix pull-up/down logic
Ludovic Desroches [Tue, 19 Apr 2016 14:03:45 +0000 (16:03 +0200)]
pinctrl: at91-pio4: fix pull-up/down logic

commit 5305a7b7e860bb40ab226bc7d58019416073948a upstream.

The default configuration of a pin is often with a value in the
pull-up/down field at chip reset. So, even if the internal logic of the
controller prevents writing a configuration with pull-up and pull-down at
the same time, we must ensure explicitly this condition before writing the
register.

This was leading to a pull-down condition not taken into account for
instance.

Signed-off-by: Ludovic Desroches <ludovic.desroches@atmel.com>
Fixes: 776180848b57 ("pinctrl: introduce driver for Atmel PIO4 controller")
Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agospi: spi-ti-qspi: Handle truncated frames properly
Ben Hutchings [Tue, 12 Apr 2016 11:58:14 +0000 (12:58 +0100)]
spi: spi-ti-qspi: Handle truncated frames properly

commit 1ff7760ff66b98ef244bf0e5e2bd5310651205ad upstream.

We clamp frame_len_words to a maximum of 4096, but do not actually
limit the number of words written or read through the DATA registers
or the length added to spi_message::actual_length.  This results in
silent data corruption for commands longer than this maximum.

Recalculate the length of each transfer, taking frame_len_words into
account.  Use this length in qspi_{read,write}_msg(), and to increment
spi_message::actual_length.

Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agospi: spi-ti-qspi: Fix FLEN and WLEN settings if bits_per_word is overridden
Ben Hutchings [Tue, 12 Apr 2016 11:56:25 +0000 (12:56 +0100)]
spi: spi-ti-qspi: Fix FLEN and WLEN settings if bits_per_word is overridden

commit ea1b60fb085839a9544cb3a0069992991beabb7f upstream.

Each transfer can specify 8, 16 or 32 bits per word independently of
the default for the device being addressed.  However, currently we
calculate the number of words in the frame assuming that the word size
is the device default.

If multiple transfers in the same message have differing
bits_per_word, we bitwise-or the different values in the WLEN register
field.

Fix both of these.  Also rename 'frame_length' to 'frame_len_words' to
make clear that it's not a byte count like spi_message::frame_length.

Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agospi: pxa2xx: Do not detect number of enabled chip selects on Intel SPT
Jarkko Nikula [Tue, 26 Apr 2016 07:08:26 +0000 (10:08 +0300)]
spi: pxa2xx: Do not detect number of enabled chip selects on Intel SPT

commit 66ec246eb9982e7eb8e15e1fc55f543230310dd0 upstream.

Certain Intel Sunrisepoint PCH variants report zero chip selects in SPI
capabilities register even they have one per port. Detection in
pxa2xx_spi_probe() sets master->num_chipselect to 0 leading to -EINVAL
from spi_register_master() where chip select count is validated.

Fix this by not using SPI capabilities register on Sunrisepoint. They don't
have more than one chip select so use the default value 1 instead of
detection.

Fixes: 8b136baa5892 ("spi: pxa2xx: Detect number of enabled Intel LPSS SPI chip select signals")
Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoALSA: hda - Fix broken reconfig
Takashi Iwai [Tue, 10 May 2016 08:24:02 +0000 (10:24 +0200)]
ALSA: hda - Fix broken reconfig

commit addacd801e1638f41d659cb53b9b73fc14322cb1 upstream.

The HD-audio reconfig function got broken in the recent kernels,
typically resulting in a failure like:
  snd_hda_intel 0000:00:1b.0: control 3:0:0:Playback Channel Map:0 is already present

This is because of the code restructuring to move the PCM and control
instantiation into the codec drive probe, by the commit [bcd96557bd0a:
ALSA: hda - Build PCMs and controls at codec driver probe].  Although
the commit above removed the calls of snd_hda_codec_build_pcms() and
*_build_controls() at the controller driver probe, the similar calls
in the reconfig were still left forgotten.  This caused the
conflicting and duplicated PCMs and controls.

The fix is trivial: just remove these superfluous calls from
reconfig_codec().

Fixes: bcd96557bd0a ('ALSA: hda - Build PCMs and controls at codec driver probe')
Reported-by: Jochen Henneberg <jh@henneberg-systemdesign.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoALSA: hda - Fix white noise on Asus UX501VW headset
Kaho Ng [Sun, 8 May 2016 16:27:49 +0000 (00:27 +0800)]
ALSA: hda - Fix white noise on Asus UX501VW headset

commit 2da2dc9ead232f25601404335cca13c0f722d41b upstream.

For reducing the noise from the headset output on ASUS UX501VW,
call the existing fixup, alc_fixup_headset_mode_alc668(), additionally.

Thread: https://bbs.archlinux.org/viewtopic.php?id=209554

Signed-off-by: Kaho Ng <ngkaho1234@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoALSA: hda - Fix subwoofer pin on ASUS N751 and N551
Yura Pakhuchiy [Sat, 7 May 2016 16:53:36 +0000 (23:53 +0700)]
ALSA: hda - Fix subwoofer pin on ASUS N751 and N551

commit 3231e2053eaeee70bdfb216a78a30f11e88e2243 upstream.

Subwoofer does not work out of the box on ASUS N751/N551 laptops. This
patch fixes it. Patch tested on N751 laptop. N551 part is not tested,
but according to [1] and [2] this laptop requires similar changes, so I
included them in the patch.

1. https://github.com/honsiorovskyi/asus-n551-hda-fix
2. https://bugs.launchpad.net/ubuntu/+source/alsa-tools/+bug/1405691

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=117781
Signed-off-by: Yura Pakhuchiy <pakhuchiy@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoALSA: usb-audio: Yet another Phoneix Audio device quirk
Takashi Iwai [Wed, 11 May 2016 15:48:00 +0000 (17:48 +0200)]
ALSA: usb-audio: Yet another Phoneix Audio device quirk

commit 84add303ef950b8d85f54bc2248c2bc73467c329 upstream.

Phoenix Audio has yet another device with another id (even a different
vendor id, 0556:0014) that requires the same quirk for the sample
rate.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=110221
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoALSA: usb-audio: Quirk for yet another Phoenix Audio devices (v2)
Takashi Iwai [Fri, 29 Apr 2016 09:20:15 +0000 (11:20 +0200)]
ALSA: usb-audio: Quirk for yet another Phoenix Audio devices (v2)

commit 2d2c038a9999f423e820d89db2b5d7774b67ba49 upstream.

Phoenix Audio MT202pcs (1de7:0114) and MT202exe (1de7:0013) need the
same workaround as TMX320 for avoiding the firmware bug.  It fixes the
frequent error about the sample rate inquiries and the slow device
probe as consequence.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=117321
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agocrypto: testmgr - Use kmalloc memory for RSA input
Herbert Xu [Thu, 5 May 2016 08:42:49 +0000 (16:42 +0800)]
crypto: testmgr - Use kmalloc memory for RSA input

commit df27b26f04ed388ff4cc2b5d8cfdb5d97678816f upstream.

As akcipher uses an SG interface, you must not use vmalloc memory
as input for it.  This patch fixes testmgr to copy the vmalloc
test vectors to kmalloc memory before running the test.

This patch also removes a superfluous sg_virt call in do_test_rsa.

Reported-by: Anatoly Pugachev <matorola@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agocrypto: hash - Fix page length clamping in hash walk
Herbert Xu [Wed, 4 May 2016 09:52:56 +0000 (17:52 +0800)]
crypto: hash - Fix page length clamping in hash walk

commit 13f4bb78cf6a312bbdec367ba3da044b09bf0e29 upstream.

The crypto hash walk code is broken when supplied with an offset
greater than or equal to PAGE_SIZE.  This patch fixes it by adjusting
walk->pg and walk->offset when this happens.

Reported-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agocrypto: qat - fix invalid pf2vf_resp_wq logic
Tadeusz Struk [Mon, 25 Apr 2016 14:32:19 +0000 (07:32 -0700)]
crypto: qat - fix invalid pf2vf_resp_wq logic

commit 9e209fcfb804da262e38e5cd2e680c47a41f0f95 upstream.

The pf2vf_resp_wq is a global so it has to be created at init
and destroyed at exit, instead of per device.

Tested-by: Suresh Marikkannu <sureshx.marikkannu@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agos390/mm: fix asce_bits handling with dynamic pagetable levels
Gerald Schaefer [Fri, 15 Apr 2016 14:38:40 +0000 (16:38 +0200)]
s390/mm: fix asce_bits handling with dynamic pagetable levels

commit 723cacbd9dc79582e562c123a0bacf8bfc69e72a upstream.

There is a race with multi-threaded applications between context switch and
pagetable upgrade. In switch_mm() a new user_asce is built from mm->pgd and
mm->context.asce_bits, w/o holding any locks. A concurrent mmap with a
pagetable upgrade on another thread in crst_table_upgrade() could already
have set new asce_bits, but not yet the new mm->pgd. This would result in a
corrupt user_asce in switch_mm(), and eventually in a kernel panic from a
translation exception.

Fix this by storing the complete asce instead of just the asce_bits, which
can then be read atomically from switch_mm(), so that it either sees the
old value or the new value, but no mixture. Both cases are OK. Having the
old value would result in a page fault on access to the higher level memory,
but the fault handler would see the new mm->pgd, if it was a valid access
after the mmap on the other thread has completed. So as worst-case scenario
we would have a page fault loop for the racing thread until the next time
slice.

Also remove dead code and simplify the upgrade/downgrade path, there are no
upgrades from 2 levels, and only downgrades from 3 levels for compat tasks.
There are also no concurrent upgrades, because the mmap_sem is held with
down_write() in do_mmap, so the flush and table checks during upgrade can
be removed.

Reported-by: Michael Munday <munday@ca.ibm.com>
Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agozsmalloc: fix zs_can_compact() integer overflow
Sergey Senozhatsky [Mon, 9 May 2016 23:28:49 +0000 (16:28 -0700)]
zsmalloc: fix zs_can_compact() integer overflow

commit 44f43e99fe70833058482d183e99fdfd11220996 upstream.

zs_can_compact() has two race conditions in its core calculation:

unsigned long obj_wasted = zs_stat_get(class, OBJ_ALLOCATED) -
zs_stat_get(class, OBJ_USED);

1) classes are not locked, so the numbers of allocated and used
   objects can change by the concurrent ops happening on other CPUs
2) shrinker invokes it from preemptible context

Depending on the circumstances, thus, OBJ_ALLOCATED can become
less than OBJ_USED, which can result in either very high or
negative `total_scan' value calculated later in do_shrink_slab().

do_shrink_slab() has some logic to prevent those cases:

 vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62
 vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62
 vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-64
 vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62
 vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62
 vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62

However, due to the way `total_scan' is calculated, not every
shrinker->count_objects() overflow can be spotted and handled.
To demonstrate the latter, I added some debugging code to do_shrink_slab()
(x86_64) and the results were:

 vmscan: OVERFLOW: shrinker->count_objects() == -1 [18446744073709551615]
 vmscan: but total_scan > 0: 92679974445502
 vmscan: resulting total_scan: 92679974445502
[..]
 vmscan: OVERFLOW: shrinker->count_objects() == -1 [18446744073709551615]
 vmscan: but total_scan > 0: 22634041808232578
 vmscan: resulting total_scan: 22634041808232578

Even though shrinker->count_objects() has returned an overflowed value,
the resulting `total_scan' is positive, and, what is more worrisome, it
is insanely huge. This value is getting used later on in
shrinker->scan_objects() loop:

        while (total_scan >= batch_size ||
               total_scan >= freeable) {
                unsigned long ret;
                unsigned long nr_to_scan = min(batch_size, total_scan);

                shrinkctl->nr_to_scan = nr_to_scan;
                ret = shrinker->scan_objects(shrinker, shrinkctl);
                if (ret == SHRINK_STOP)
                        break;
                freed += ret;

                count_vm_events(SLABS_SCANNED, nr_to_scan);
                total_scan -= nr_to_scan;

                cond_resched();
        }

`total_scan >= batch_size' is true for a very-very long time and
'total_scan >= freeable' is also true for quite some time, because
`freeable < 0' and `total_scan' is large enough, for example,
22634041808232578. The only break condition, in the given scheme of
things, is shrinker->scan_objects() == SHRINK_STOP test, which is a
bit too weak to rely on, especially in heavy zsmalloc-usage scenarios.

To fix the issue, take a pool stat snapshot and use it instead of
racy zs_stat_get() calls.

Link: http://lkml.kernel.org/r/20160509140052.3389-1-sergey.senozhatsky@gmail.com
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoocfs2: fix posix_acl_create deadlock
Junxiao Bi [Thu, 12 May 2016 22:42:18 +0000 (15:42 -0700)]
ocfs2: fix posix_acl_create deadlock

commit c25a1e0671fbca7b2c0d0757d533bd2650d6dc0c upstream.

Commit 702e5bc68ad2 ("ocfs2: use generic posix ACL infrastructure")
refactored code to use posix_acl_create.  The problem with this function
is that it is not mindful of the cluster wide inode lock making it
unsuitable for use with ocfs2 inode creation with ACLs.  For example,
when used in ocfs2_mknod, this function can cause deadlock as follows.
The parent dir inode lock is taken when calling posix_acl_create ->
get_acl -> ocfs2_iop_get_acl which takes the inode lock again.  This can
cause deadlock if there is a blocked remote lock request waiting for the
lock to be downconverted.  And same deadlock happened in ocfs2_reflink.
This fix is to revert back using ocfs2_init_acl.

Fixes: 702e5bc68ad2 ("ocfs2: use generic posix ACL infrastructure")
Signed-off-by: Tariq Saeed <tariq.x.saeed@oracle.com>
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Joseph Qi <joseph.qi@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoocfs2: revert using ocfs2_acl_chmod to avoid inode cluster lock hang
Junxiao Bi [Thu, 12 May 2016 22:42:15 +0000 (15:42 -0700)]
ocfs2: revert using ocfs2_acl_chmod to avoid inode cluster lock hang

commit 5ee0fbd50fdf1c1329de8bee35ea9d7c6a81a2e0 upstream.

Commit 743b5f1434f5 ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()")
introduced this issue.  ocfs2_setattr called by chmod command holds
cluster wide inode lock when calling posix_acl_chmod.  This latter
function in turn calls ocfs2_iop_get_acl and ocfs2_iop_set_acl.  These
two are also called directly from vfs layer for getfacl/setfacl commands
and therefore acquire the cluster wide inode lock.  If a remote
conversion request comes after the first inode lock in ocfs2_setattr,
OCFS2_LOCK_BLOCKED will be set.  And this will cause the second call to
inode lock from the ocfs2_iop_get_acl() to block indefinetly.

The deleted version of ocfs2_acl_chmod() calls __posix_acl_chmod() which
does not call back into the filesystem.  Therefore, we restore
ocfs2_acl_chmod(), modify it slightly for locking as needed, and use that
instead.

Fixes: 743b5f1434f5 ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()")
Signed-off-by: Tariq Saeed <tariq.x.saeed@oracle.com>
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Joseph Qi <joseph.qi@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agonet/route: enforce hoplimit max value
Paolo Abeni [Fri, 13 May 2016 16:33:41 +0000 (18:33 +0200)]
net/route: enforce hoplimit max value

[ Upstream commit 626abd59e51d4d8c6367e03aae252a8aa759ac78 ]

Currently, when creating or updating a route, no check is performed
in both ipv4 and ipv6 code to the hoplimit value.

The caller can i.e. set hoplimit to 256, and when such route will
 be used, packets will be sent with hoplimit/ttl equal to 0.

This commit adds checks for the RTAX_HOPLIMIT value, in both ipv4
ipv6 route code, substituting any value greater than 255 with 255.

This is consistent with what is currently done for ADVMSS and MTU
in the ipv4 code.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agotcp: refresh skb timestamp at retransmit time
Eric Dumazet [Tue, 10 May 2016 03:55:16 +0000 (20:55 -0700)]
tcp: refresh skb timestamp at retransmit time

[ Upstream commit 10a81980fc47e64ffac26a073139813d3f697b64 ]

In the very unlikely case __tcp_retransmit_skb() can not use the cloning
done in tcp_transmit_skb(), we need to refresh skb_mstamp before doing
the copy and transmit, otherwise TCP TS val will be an exact copy of
original transmit.

Fixes: 7faee5c0d514 ("tcp: remove TCP_SKB_CB(skb)->when")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agonet: thunderx: avoid exposing kernel stack
xypron.glpk@gmx.de [Sun, 8 May 2016 22:46:18 +0000 (00:46 +0200)]
net: thunderx: avoid exposing kernel stack

[ Upstream commit 161de2caf68c549c266e571ffba8e2163886fb10 ]

Reserved fields should be set to zero to avoid exposing
bits from the kernel stack.

Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agonet: fix a kernel infoleak in x25 module
Kangjie Lu [Sun, 8 May 2016 16:10:14 +0000 (12:10 -0400)]
net: fix a kernel infoleak in x25 module

[ Upstream commit 79e48650320e6fba48369fccf13fd045315b19b8 ]

Stack object "dte_facilities" is allocated in x25_rx_call_request(),
which is supposed to be initialized in x25_negotiate_facilities.
However, 5 fields (8 bytes in total) are not initialized. This
object is then copied to userland via copy_to_user, thus infoleak
occurs.

Signed-off-by: Kangjie Lu <kjlu@gatech.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agouapi glibc compat: fix compile errors when glibc net/if.h included before linux/if...
Mikko Rapeli [Sun, 24 Apr 2016 15:45:00 +0000 (17:45 +0200)]
uapi glibc compat: fix compile errors when glibc net/if.h included before linux/if.h MIME-Version: 1.0

[ Upstream commit 4a91cb61bb995e5571098188092e296192309c77 ]

glibc's net/if.h contains copies of definitions from linux/if.h and these
conflict and cause build failures if both files are included by application
source code. Changes in uapi headers, which fixed header file dependencies to
include linux/if.h when it was needed, e.g. commit 1ffad83d, made the
net/if.h and linux/if.h incompatibilities visible as build failures for
userspace applications like iproute2 and xtables-addons.

This patch fixes compile errors when glibc net/if.h is included before
linux/if.h:

./linux/if.h:99:21: error: redeclaration of enumerator ‘IFF_NOARP’
./linux/if.h:98:23: error: redeclaration of enumerator ‘IFF_RUNNING’
./linux/if.h:97:26: error: redeclaration of enumerator ‘IFF_NOTRAILERS’
./linux/if.h:96:27: error: redeclaration of enumerator ‘IFF_POINTOPOINT’
./linux/if.h:95:24: error: redeclaration of enumerator ‘IFF_LOOPBACK’
./linux/if.h:94:21: error: redeclaration of enumerator ‘IFF_DEBUG’
./linux/if.h:93:25: error: redeclaration of enumerator ‘IFF_BROADCAST’
./linux/if.h:92:19: error: redeclaration of enumerator ‘IFF_UP’
./linux/if.h:252:8: error: redefinition of ‘struct ifconf’
./linux/if.h:203:8: error: redefinition of ‘struct ifreq’
./linux/if.h:169:8: error: redefinition of ‘struct ifmap’
./linux/if.h:107:23: error: redeclaration of enumerator ‘IFF_DYNAMIC’
./linux/if.h:106:25: error: redeclaration of enumerator ‘IFF_AUTOMEDIA’
./linux/if.h:105:23: error: redeclaration of enumerator ‘IFF_PORTSEL’
./linux/if.h:104:25: error: redeclaration of enumerator ‘IFF_MULTICAST’
./linux/if.h:103:21: error: redeclaration of enumerator ‘IFF_SLAVE’
./linux/if.h:102:22: error: redeclaration of enumerator ‘IFF_MASTER’
./linux/if.h:101:24: error: redeclaration of enumerator ‘IFF_ALLMULTI’
./linux/if.h:100:23: error: redeclaration of enumerator ‘IFF_PROMISC’

The cases where linux/if.h is included before net/if.h need a similar fix in
the glibc side, or the order of include files can be changed userspace
code as a workaround.

This change was tested in x86 userspace on Debian unstable with
scripts/headers_compile_test.sh:

$ make headers_install && \
  cd usr/include && ../../scripts/headers_compile_test.sh -l -k
...
cc -Wall -c -nostdinc -I /usr/lib/gcc/i586-linux-gnu/5/include -I /usr/lib/gcc/i586-linux-gnu/5/include-fixed -I . -I /home/mcfrisk/src/linux-2.6/usr/headers_compile_test_include.2uX2zH -I /home/mcfrisk/src/linux-2.6/usr/headers_compile_test_include.2uX2zH/i586-linux-gnu -o /dev/null ./linux/if.h_libc_before_kernel.h
PASSED libc before kernel test: ./linux/if.h

Reported-by: Jan Engelhardt <jengelh@inai.de>
Reported-by: Josh Boyer <jwboyer@fedoraproject.org>
Reported-by: Stephen Hemminger <shemming@brocade.com>
Reported-by: Waldemar Brodkorb <mail@waldemar-brodkorb.de>
Cc: Gabriel Laskar <gabriel@lse.epita.fr>
Signed-off-by: Mikko Rapeli <mikko.rapeli@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agobridge: fix igmp / mld query parsing
Linus Lüssing [Wed, 4 May 2016 15:25:02 +0000 (17:25 +0200)]
bridge: fix igmp / mld query parsing

[ Upstream commit 856ce5d083e14571d051301fe3c65b32b8cbe321 ]

With the newly introduced helper functions the skb pulling is hidden
in the checksumming function - and undone before returning to the
caller.

The IGMP and MLD query parsing functions in the bridge still
assumed that the skb is pointing to the beginning of the IGMP/MLD
message while it is now kept at the beginning of the IPv4/6 header.

If there is a querier somewhere else, then this either causes
the multicast snooping to stay disabled even though it could be
enabled. Or, if we have the querier enabled too, then this can
create unnecessary IGMP / MLD query messages on the link.

Fixing this by taking the offset between IP and IGMP/MLD header into
account, too.

Fixes: 9afd85c9e455 ("net: Export IGMP/MLD message validation code")
Reported-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agonet: bridge: fix old ioctl unlocked net device walk
Nikolay Aleksandrov [Wed, 4 May 2016 14:18:45 +0000 (16:18 +0200)]
net: bridge: fix old ioctl unlocked net device walk

[ Upstream commit 31ca0458a61a502adb7ed192bf9716c6d05791a5 ]

get_bridge_ifindices() is used from the old "deviceless" bridge ioctl
calls which aren't called with rtnl held. The comment above says that it is
called with rtnl but that is not really the case.
Here's a sample output from a test ASSERT_RTNL() which I put in
get_bridge_ifindices and executed "brctl show":
[  957.422726] RTNL: assertion failed at net/bridge//br_ioctl.c (30)
[  957.422925] CPU: 0 PID: 1862 Comm: brctl Tainted: G        W  O
4.6.0-rc4+ #157
[  957.423009] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.8.1-20150318_183358- 04/01/2014
[  957.423009]  0000000000000000 ffff880058adfdf0 ffffffff8138dec5
0000000000000400
[  957.423009]  ffffffff81ce8380 ffff880058adfe58 ffffffffa05ead32
0000000000000001
[  957.423009]  00007ffec1a444b0 0000000000000400 ffff880053c19130
0000000000008940
[  957.423009] Call Trace:
[  957.423009]  [<ffffffff8138dec5>] dump_stack+0x85/0xc0
[  957.423009]  [<ffffffffa05ead32>]
br_ioctl_deviceless_stub+0x212/0x2e0 [bridge]
[  957.423009]  [<ffffffff81515beb>] sock_ioctl+0x22b/0x290
[  957.423009]  [<ffffffff8126ba75>] do_vfs_ioctl+0x95/0x700
[  957.423009]  [<ffffffff8126c159>] SyS_ioctl+0x79/0x90
[  957.423009]  [<ffffffff8163a4c0>] entry_SYSCALL_64_fastpath+0x23/0xc1

Since it only reads bridge ifindices, we can use rcu to safely walk the net
device list. Also remove the wrong rtnl comment above.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoVSOCK: do not disconnect socket when peer has shutdown SEND only
Ian Campbell [Wed, 4 May 2016 13:21:53 +0000 (14:21 +0100)]
VSOCK: do not disconnect socket when peer has shutdown SEND only

[ Upstream commit dedc58e067d8c379a15a8a183c5db318201295bb ]

The peer may be expecting a reply having sent a request and then done a
shutdown(SHUT_WR), so tearing down the whole socket at this point seems
wrong and breaks for me with a client which does a SHUT_WR.

Looking at other socket family's stream_recvmsg callbacks doing a shutdown
here does not seem to be the norm and removing it does not seem to have
had any adverse effects that I can see.

I'm using Stefan's RFC virtio transport patches, I'm unsure of the impact
on the vmci transport.

Signed-off-by: Ian Campbell <ian.campbell@docker.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Cc: Andy King <acking@vmware.com>
Cc: Dmitry Torokhov <dtor@vmware.com>
Cc: Jorgen Hansen <jhansen@vmware.com>
Cc: Adit Ranadive <aditr@vmware.com>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agonet/mlx4_en: Fix endianness bug in IPV6 csum calculation
Daniel Jurgens [Wed, 4 May 2016 12:00:33 +0000 (15:00 +0300)]
net/mlx4_en: Fix endianness bug in IPV6 csum calculation

[ Upstream commit 82d69203df634b4dfa765c94f60ce9482bcc44d6 ]

Use htons instead of unconditionally byte swapping nexthdr.  On a little
endian systems shifting the byte is correct behavior, but it results in
incorrect csums on big endian architectures.

Fixes: f8c6455bb04b ('net/mlx4_en: Extend checksum offloading by CHECKSUM COMPLETE')
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Carol Soto <clsoto@us.ibm.com>
Tested-by: Carol Soto <clsoto@us.ibm.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agonet: fix infoleak in rtnetlink
Kangjie Lu [Tue, 3 May 2016 20:46:24 +0000 (16:46 -0400)]
net: fix infoleak in rtnetlink

[ Upstream commit 5f8e44741f9f216e33736ea4ec65ca9ac03036e6 ]

The stack object “map” has a total size of 32 bytes. Its last 4
bytes are padding generated by compiler. These padding bytes are
not initialized and sent out via “nla_put”.

Signed-off-by: Kangjie Lu <kjlu@gatech.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agonet: fix infoleak in llc
Kangjie Lu [Tue, 3 May 2016 20:35:05 +0000 (16:35 -0400)]
net: fix infoleak in llc

[ Upstream commit b8670c09f37bdf2847cc44f36511a53afc6161fd ]

The stack object “info” has a total size of 12 bytes. Its last byte
is padding which is not initialized and leaked via “put_cmsg”.

Signed-off-by: Kangjie Lu <kjlu@gatech.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agonet: fec: only clear a queue's work bit if the queue was emptied
Uwe Kleine-König [Tue, 3 May 2016 14:38:53 +0000 (16:38 +0200)]
net: fec: only clear a queue's work bit if the queue was emptied

[ Upstream commit 1c021bb717a70aaeaa4b25c91f43c2aeddd922de ]

In the receive path a queue's work bit was cleared unconditionally even
if fec_enet_rx_queue only read out a part of the available packets from
the hardware. This resulted in not reading any packets in the next napi
turn and so packets were delayed or lost.

The obvious fix is to only clear a queue's bit when the queue was
emptied.

Fixes: 4d494cdc92b3 ("net: fec: change data structure to support multiqueue")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Fugang Duan <fugang.duan@nxp.com>
Acked-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agonetem: Segment GSO packets on enqueue
Neil Horman [Mon, 2 May 2016 16:20:15 +0000 (12:20 -0400)]
netem: Segment GSO packets on enqueue

[ Upstream commit 6071bd1aa13ed9e41824bafad845b7b7f4df5cfd ]

This was recently reported to me, and reproduced on the latest net kernel,
when attempting to run netperf from a host that had a netem qdisc attached
to the egress interface:

[  788.073771] ---------------------[ cut here ]---------------------------
[  788.096716] WARNING: at net/core/dev.c:2253 skb_warn_bad_offload+0xcd/0xda()
[  788.129521] bnx2: caps=(0x00000001801949b3, 0x0000000000000000) len=2962
data_len=0 gso_size=1448 gso_type=1 ip_summed=3
[  788.182150] Modules linked in: sch_netem kvm_amd kvm crc32_pclmul ipmi_ssif
ghash_clmulni_intel sp5100_tco amd64_edac_mod aesni_intel lrw gf128mul
glue_helper ablk_helper edac_mce_amd cryptd pcspkr sg edac_core hpilo ipmi_si
i2c_piix4 k10temp fam15h_power hpwdt ipmi_msghandler shpchp acpi_power_meter
pcc_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c
sd_mod crc_t10dif crct10dif_generic mgag200 syscopyarea sysfillrect sysimgblt
i2c_algo_bit drm_kms_helper ahci ata_generic pata_acpi ttm libahci
crct10dif_pclmul pata_atiixp tg3 libata crct10dif_common drm crc32c_intel ptp
serio_raw bnx2 r8169 hpsa pps_core i2c_core mii dm_mirror dm_region_hash dm_log
dm_mod
[  788.465294] CPU: 16 PID: 0 Comm: swapper/16 Tainted: G        W
------------   3.10.0-327.el7.x86_64 #1
[  788.511521] Hardware name: HP ProLiant DL385p Gen8, BIOS A28 12/17/2012
[  788.542260]  ffff880437c036b8 f7afc56532a53db9 ffff880437c03670
ffffffff816351f1
[  788.576332]  ffff880437c036a8 ffffffff8107b200 ffff880633e74200
ffff880231674000
[  788.611943]  0000000000000001 0000000000000003 0000000000000000
ffff880437c03710
[  788.647241] Call Trace:
[  788.658817]  <IRQ>  [<ffffffff816351f1>] dump_stack+0x19/0x1b
[  788.686193]  [<ffffffff8107b200>] warn_slowpath_common+0x70/0xb0
[  788.713803]  [<ffffffff8107b29c>] warn_slowpath_fmt+0x5c/0x80
[  788.741314]  [<ffffffff812f92f3>] ? ___ratelimit+0x93/0x100
[  788.767018]  [<ffffffff81637f49>] skb_warn_bad_offload+0xcd/0xda
[  788.796117]  [<ffffffff8152950c>] skb_checksum_help+0x17c/0x190
[  788.823392]  [<ffffffffa01463a1>] netem_enqueue+0x741/0x7c0 [sch_netem]
[  788.854487]  [<ffffffff8152cb58>] dev_queue_xmit+0x2a8/0x570
[  788.880870]  [<ffffffff8156ae1d>] ip_finish_output+0x53d/0x7d0
...

The problem occurs because netem is not prepared to handle GSO packets (as it
uses skb_checksum_help in its enqueue path, which cannot manipulate these
frames).

The solution I think is to simply segment the skb in a simmilar fashion to the
way we do in __dev_queue_xmit (via validate_xmit_skb), with some minor changes.
When we decide to corrupt an skb, if the frame is GSO, we segment it, corrupt
the first segment, and enqueue the remaining ones.

tested successfully by myself on the latest net kernel, to which this applies

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Jamal Hadi Salim <jhs@mojatatu.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: netem@lists.linux-foundation.org
CC: eric.dumazet@gmail.com
CC: stephen@networkplumber.org
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agosch_dsmark: update backlog as well
WANG Cong [Thu, 25 Feb 2016 22:55:03 +0000 (14:55 -0800)]
sch_dsmark: update backlog as well

[ Upstream commit bdf17661f63a79c3cb4209b970b1cc39e34f7543 ]

Similarly, we need to update backlog too when we update qlen.

Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agosch_htb: update backlog as well
WANG Cong [Thu, 25 Feb 2016 22:55:02 +0000 (14:55 -0800)]
sch_htb: update backlog as well

[ Upstream commit 431e3a8e36a05a37126f34b41aa3a5a6456af04e ]

We saw qlen!=0 but backlog==0 on our production machine:

qdisc htb 1: dev eth0 root refcnt 2 r2q 10 default 1 direct_packets_stat 0 ver 3.17
 Sent 172680457356 bytes 222469449 pkt (dropped 0, overlimits 123575834 requeues 0)
 backlog 0b 72p requeues 0

The problem is we only count qlen for HTB qdisc but not backlog.
We need to update backlog too when we update qlen, so that we
can at least know the average packet length.

Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agonet_sched: update hierarchical backlog too
WANG Cong [Thu, 25 Feb 2016 22:55:01 +0000 (14:55 -0800)]
net_sched: update hierarchical backlog too

[ Upstream commit 2ccccf5fb43ff62b2b96cc58d95fc0b3596516e4 ]

When the bottom qdisc decides to, for example, drop some packet,
it calls qdisc_tree_decrease_qlen() to update the queue length
for all its ancestors, we need to update the backlog too to
keep the stats on root qdisc accurate.

Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agonet_sched: introduce qdisc_replace() helper
WANG Cong [Thu, 25 Feb 2016 22:55:00 +0000 (14:55 -0800)]
net_sched: introduce qdisc_replace() helper

[ Upstream commit 86a7996cc8a078793670d82ed97d5a99bb4e8496 ]

Remove nearly duplicated code and prepare for the following patch.

Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agogre: do not pull header in ICMP error processing
Jiri Benc [Fri, 29 Apr 2016 21:31:32 +0000 (23:31 +0200)]
gre: do not pull header in ICMP error processing

[ Upstream commit b7f8fe251e4609e2a437bd2c2dea01e61db6849c ]

iptunnel_pull_header expects that IP header was already pulled; with this
expectation, it pulls the tunnel header. This is not true in gre_err.
Furthermore, ipv4_update_pmtu and ipv4_redirect expect that skb->data points
to the IP header.

We cannot pull the tunnel header in this path. It's just a matter of not
calling iptunnel_pull_header - we don't need any of its effects.

Fixes: bda7bb463436 ("gre: Allow multiple protocol listener for gre protocol.")
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agonet: Implement net_dbg_ratelimited() for CONFIG_DYNAMIC_DEBUG case
Tim Bingham [Fri, 29 Apr 2016 17:30:23 +0000 (13:30 -0400)]
net: Implement net_dbg_ratelimited() for CONFIG_DYNAMIC_DEBUG case

[ Upstream commit 2c94b53738549d81dc7464a32117d1f5112c64d3 ]

Prior to commit d92cff89a0c8 ("net_dbg_ratelimited: turn into no-op
when !DEBUG") the implementation of net_dbg_ratelimited() was buggy
for both the DEBUG and CONFIG_DYNAMIC_DEBUG cases.

The bug was that net_ratelimit() was being called and, despite
returning true, nothing was being printed to the console. This
resulted in messages like the following -

"net_ratelimit: %d callbacks suppressed"

with no other output nearby.

After commit d92cff89a0c8 ("net_dbg_ratelimited: turn into no-op when
!DEBUG") the bug is fixed for the DEBUG case. However, there's no
output at all for CONFIG_DYNAMIC_DEBUG case.

This patch restores debug output (if enabled) for the
CONFIG_DYNAMIC_DEBUG case.

Add a definition of net_dbg_ratelimited() for the CONFIG_DYNAMIC_DEBUG
case. The implementation takes care to check that dynamic debugging is
enabled before calling net_ratelimit().

Fixes: d92cff89a0c8 ("net_dbg_ratelimited: turn into no-op when !DEBUG")
Signed-off-by: Tim Bingham <tbingham@akamai.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agosamples/bpf: fix trace_output example
Alexei Starovoitov [Thu, 28 Apr 2016 01:56:22 +0000 (18:56 -0700)]
samples/bpf: fix trace_output example

[ Upstream commit 569cc39d39385a74b23145496bca2df5ac8b2fb8 ]

llvm cannot always recognize memset as builtin function and optimize
it away, so just delete it. It was a leftover from testing
of bpf_perf_event_output() with large data structures.

Fixes: 39111695b1b8 ("samples: bpf: add bpf_perf_event_output example")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agobpf: fix check_map_func_compatibility logic
Alexei Starovoitov [Thu, 28 Apr 2016 01:56:21 +0000 (18:56 -0700)]
bpf: fix check_map_func_compatibility logic

[ Upstream commit 6aff67c85c9e5a4bc99e5211c1bac547936626ca ]

The commit 35578d798400 ("bpf: Implement function bpf_perf_event_read() that get the selected hardware PMU conuter")
introduced clever way to check bpf_helper<->map_type compatibility.
Later on commit a43eec304259 ("bpf: introduce bpf_perf_event_output() helper") adjusted
the logic and inadvertently broke it.
Get rid of the clever bool compare and go back to two-way check
from map and from helper perspective.

Fixes: a43eec304259 ("bpf: introduce bpf_perf_event_output() helper")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agobpf: fix refcnt overflow
Alexei Starovoitov [Thu, 28 Apr 2016 01:56:20 +0000 (18:56 -0700)]
bpf: fix refcnt overflow

[ Upstream commit 92117d8443bc5afacc8d5ba82e541946310f106e ]

On a system with >32Gbyte of phyiscal memory and infinite RLIMIT_MEMLOCK,
the malicious application may overflow 32-bit bpf program refcnt.
It's also possible to overflow map refcnt on 1Tb system.
Impose 32k hard limit which means that the same bpf program or
map cannot be shared by more than 32k processes.

Fixes: 1be7f75d1668 ("bpf: enable non-root eBPF programs")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agobpf: fix double-fdput in replace_map_fd_with_map_ptr()
Jann Horn [Tue, 26 Apr 2016 20:26:26 +0000 (22:26 +0200)]
bpf: fix double-fdput in replace_map_fd_with_map_ptr()

[ Upstream commit 8358b02bf67d3a5d8a825070e1aa73f25fb2e4c7 ]

When bpf(BPF_PROG_LOAD, ...) was invoked with a BPF program whose bytecode
references a non-map file descriptor as a map file descriptor, the error
handling code called fdput() twice instead of once (in __bpf_map_get() and
in replace_map_fd_with_map_ptr()). If the file descriptor table of the
current task is shared, this causes f_count to be decremented too much,
allowing the struct file to be freed while it is still in use
(use-after-free). This can be exploited to gain root privileges by an
unprivileged user.

This bug was introduced in
commit 0246e64d9a5f ("bpf: handle pseudo BPF_LD_IMM64 insn"), but is only
exploitable since
commit 1be7f75d1668 ("bpf: enable non-root eBPF programs") because
previously, CAP_SYS_ADMIN was required to reach the vulnerable code.

(posted publicly according to request by maintainer)

Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agonet/mlx4_en: fix spurious timestamping callbacks
Eric Dumazet [Sat, 23 Apr 2016 18:35:46 +0000 (11:35 -0700)]
net/mlx4_en: fix spurious timestamping callbacks

[ Upstream commit fc96256c906362e845d848d0f6a6354450059e81 ]

When multiple skb are TX-completed in a row, we might incorrectly keep
a timestamp of a prior skb and cause extra work.

Fixes: ec693d47010e8 ("net/mlx4_en: Add HW timestamping (TS) support")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoipv4/fib: don't warn when primary address is missing if in_dev is dead
Paolo Abeni [Thu, 21 Apr 2016 20:23:31 +0000 (22:23 +0200)]
ipv4/fib: don't warn when primary address is missing if in_dev is dead

[ Upstream commit 391a20333b8393ef2e13014e6e59d192c5594471 ]

After commit fbd40ea0180a ("ipv4: Don't do expensive useless work
during inetdev destroy.") when deleting an interface,
fib_del_ifaddr() can be executed without any primary address
present on the dead interface.

The above is safe, but triggers some "bug: prim == NULL" warnings.

This commit avoids warning if the in_dev is dead

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agonet/mlx5e: Fix minimum MTU
Saeed Mahameed [Thu, 21 Apr 2016 21:33:04 +0000 (00:33 +0300)]
net/mlx5e: Fix minimum MTU

[ Upstream commit d8edd2469ace550db707798180d1c84d81f93bca ]

Minimum MTU that can be set in Connectx4 device is 68.

This fixes the case where a user wants to set invalid MTU,
the driver will fail to satisfy this request and the interface
will stay down.

It is better to report an error and continue working with old
mtu.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agonet/mlx5e: Device's mtu field is u16 and not int
Saeed Mahameed [Thu, 21 Apr 2016 21:33:03 +0000 (00:33 +0300)]
net/mlx5e: Device's mtu field is u16 and not int

[ Upstream commit 046339eaab26804f52f6604877f5674f70815b26 ]

For set/query MTU port firmware commands the MTU field
is 16 bits, here I changed all the "int mtu" parameters
of the functions wrapping those firmware commands to be u16.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoopenvswitch: use flow protocol when recalculating ipv6 checksums
Simon Horman [Thu, 21 Apr 2016 01:49:15 +0000 (11:49 +1000)]
openvswitch: use flow protocol when recalculating ipv6 checksums

[ Upstream commit b4f70527f052b0c00be4d7cac562baa75b212df5 ]

When using masked actions the ipv6_proto field of an action
to set IPv6 fields may be zero rather than the prevailing protocol
which will result in skipping checksum recalculation.

This patch resolves the problem by relying on the protocol
in the flow key rather than that in the set field action.

Fixes: 83d2b9ba1abc ("net: openvswitch: Support masked set actions.")
Cc: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoatl2: Disable unimplemented scatter/gather feature
Ben Hutchings [Wed, 20 Apr 2016 22:23:08 +0000 (23:23 +0100)]
atl2: Disable unimplemented scatter/gather feature

[ Upstream commit f43bfaeddc79effbf3d0fcb53ca477cca66f3db8 ]

atl2 includes NETIF_F_SG in hw_features even though it has no support
for non-linear skbs.  This bug was originally harmless since the
driver does not claim to implement checksum offload and that used to
be a requirement for SG.

Now that SG and checksum offload are independent features, if you
explicitly enable SG *and* use one of the rare protocols that can use
SG without checkusm offload, this potentially leaks sensitive
information (before you notice that it just isn't working).  Therefore
this obscure bug has been designated CVE-2016-2117.

Reported-by: Justin Yackoski <jyackoski@crypto-nite.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Fixes: ec5f06156423 ("net: Kill link between CSUM and SG features.")
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agovlan: pull on __vlan_insert_tag error path and fix csum correction
Daniel Borkmann [Sat, 16 Apr 2016 00:27:58 +0000 (02:27 +0200)]
vlan: pull on __vlan_insert_tag error path and fix csum correction

[ Upstream commit 9241e2df4fbc648a92ea0752918e05c26255649e ]

When __vlan_insert_tag() fails from skb_vlan_push() path due to the
skb_cow_head(), we need to undo the __skb_push() in the error path
as well that was done earlier to move skb->data pointer to mac header.

Moreover, I noticed that when in the non-error path the __skb_pull()
is done and the original offset to mac header was non-zero, we fixup
from a wrong skb->data offset in the checksum complete processing.

So the skb_postpush_rcsum() really needs to be done before __skb_pull()
where skb->data still points to the mac header start and thus operates
under the same conditions as in __vlan_insert_tag().

Fixes: 93515d53b133 ("net: move vlan pop/push functions into common code")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agonet: use skb_postpush_rcsum instead of own implementations
Daniel Borkmann [Fri, 19 Feb 2016 23:29:30 +0000 (00:29 +0100)]
net: use skb_postpush_rcsum instead of own implementations

[ Upstream commit 6b83d28a55a891a9d70fc61ccb1c138e47dcbe74 ]

Replace individual implementations with the recently introduced
skb_postpush_rcsum() helper.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Tom Herbert <tom@herbertland.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agocdc_mbim: apply "NDP to end" quirk to all Huawei devices
Bjørn Mork [Tue, 12 Apr 2016 14:11:12 +0000 (16:11 +0200)]
cdc_mbim: apply "NDP to end" quirk to all Huawei devices

[ Upstream commit c5b5343cfbc9f46af65033fa4f407d7b7d98371d ]

We now have a positive report of another Huawei device needing
this quirk: The ME906s-158 (12d1:15c1).  This is an m.2 form
factor modem with no obvious relationship to the E3372 (12d1:157d)
we already have a quirk entry for.  This is reason enough to
believe the quirk might be necessary for any number of current
and future Huawei devices.

Applying the quirk to all Huawei devices, since it is crucial
to any device affected by the firmware bug, while the impact
on non-affected devices is negligible.

The quirk can if necessary be disabled per-device by writing
N to /sys/class/net/<iface>/cdc_ncm/ndp_to_end

Reported-by: Andreas Fett <andreas.fett@secunet.com>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agobpf/verifier: reject invalid LD_ABS | BPF_DW instruction
Alexei Starovoitov [Tue, 12 Apr 2016 17:26:19 +0000 (10:26 -0700)]
bpf/verifier: reject invalid LD_ABS | BPF_DW instruction

[ Upstream commit d82bccc69041a51f7b7b9b4a36db0772f4cdba21 ]

verifier must check for reserved size bits in instruction opcode and
reject BPF_LD | BPF_ABS | BPF_DW and BPF_LD | BPF_IND | BPF_DW instructions,
otherwise interpreter will WARN_RATELIMIT on them during execution.

Fixes: ddd872bc3098 ("bpf: verifier: add checks for BPF_ABS | BPF_IND instructions")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agonet: sched: do not requeue a NULL skb
Lars Persson [Tue, 12 Apr 2016 06:45:52 +0000 (08:45 +0200)]
net: sched: do not requeue a NULL skb

[ Upstream commit 3dcd493fbebfd631913df6e2773cc295d3bf7d22 ]

A failure in validate_xmit_skb_list() triggered an unconditional call
to dev_requeue_skb with skb=NULL. This slowly grows the queue
discipline's qlen count until all traffic through the queue stops.

We take the optimistic approach and continue running the queue after a
failure since it is unknown if later packets also will fail in the
validate path.

Fixes: 55a93b3ea780 ("qdisc: validate skb without holding lock")
Signed-off-by: Lars Persson <larper@axis.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agopacket: fix heap info leak in PACKET_DIAG_MCLIST sock_diag interface
Mathias Krause [Sun, 10 Apr 2016 10:52:28 +0000 (12:52 +0200)]
packet: fix heap info leak in PACKET_DIAG_MCLIST sock_diag interface

[ Upstream commit 309cf37fe2a781279b7675d4bb7173198e532867 ]

Because we miss to wipe the remainder of i->addr[] in packet_mc_add(),
pdiag_put_mclist() leaks uninitialized heap bytes via the
PACKET_DIAG_MCLIST netlink attribute.

Fix this by explicitly memset(0)ing the remaining bytes in i->addr[].

Fixes: eea68e2f1a00 ("packet: Report socket mclist info via diag module")
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoroute: do not cache fib route info on local routes with oif
Chris Friesen [Fri, 8 Apr 2016 21:21:30 +0000 (15:21 -0600)]
route: do not cache fib route info on local routes with oif

[ Upstream commit d6d5e999e5df67f8ec20b6be45e2229455ee3699 ]

For local routes that require a particular output interface we do not want
to cache the result.  Caching the result causes incorrect behaviour when
there are multiple source addresses on the interface.  The end result
being that if the intended recipient is waiting on that interface for the
packet he won't receive it because it will be delivered on the loopback
interface and the IP_PKTINFO ipi_ifindex will be set to the loopback
interface as well.

This can be tested by running a program such as "dhcp_release" which
attempts to inject a packet on a particular interface so that it is
received by another program on the same board.  The receiving process
should see an IP_PKTINFO ipi_ifndex value of the source interface
(e.g., eth1) instead of the loopback interface (e.g., lo).  The packet
will still appear on the loopback interface in tcpdump but the important
aspect is that the CMSG info is correct.

Sample dhcp_release command line:

   dhcp_release eth1 192.168.204.222 02:11:33:22:44:66

Signed-off-by: Allain Legacy <allain.legacy@windriver.com>
Signed off-by: Chris Friesen <chris.friesen@windriver.com>
Reviewed-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodecnet: Do not build routes to devices without decnet private data.
David S. Miller [Mon, 11 Apr 2016 03:01:30 +0000 (23:01 -0400)]
decnet: Do not build routes to devices without decnet private data.

[ Upstream commit a36a0d4008488fa545c74445d69eaf56377d5d4e ]

In particular, make sure we check for decnet private presence
for loopback devices.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoLinux 4.4.10
Greg Kroah-Hartman [Wed, 11 May 2016 09:23:26 +0000 (11:23 +0200)]
Linux 4.4.10

8 years agodrm/i915/skl: Fix DMC load on Skylake J0 and K0
Mat Martineau [Thu, 28 Jan 2016 23:19:23 +0000 (15:19 -0800)]
drm/i915/skl: Fix DMC load on Skylake J0 and K0

commit a41c8882592fb80458959b10e37632ce030b68ca upstream.

The driver does not load firmware for unknown steppings, so these new
steppings must be added to the list.

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1454023163-25469-1-git-send-email-mathew.j.martineau@linux.intel.com
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agolib/test-string_helpers.c: fix and improve string_get_size() tests
Vitaly Kuznetsov [Wed, 3 Feb 2016 00:57:18 +0000 (16:57 -0800)]
lib/test-string_helpers.c: fix and improve string_get_size() tests

commit 72676bb53f33fd0ef3a1484fc1ecfd306dc6ff40 upstream.

Recently added commit 564b026fbd0d ("string_helpers: fix precision loss
for some inputs") fixed precision issues for string_get_size() and broke
tests.

Fix and improve them: test both STRING_UNITS_2 and STRING_UNITS_10 at a
time, better failure reporting, test small an huge values.

Fixes: 564b026fbd0d28e9 ("string_helpers: fix precision loss for some inputs")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: James Bottomley <JBottomley@Odin.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoACPI / processor: Request native thermal interrupt handling via _OSC
Srinivas Pandruvada [Thu, 24 Mar 2016 04:07:39 +0000 (21:07 -0700)]
ACPI / processor: Request native thermal interrupt handling via _OSC

commit a21211672c9a1d730a39aa65d4a5b3414700adfb upstream.

There are several reports of freeze on enabling HWP (Hardware PStates)
feature on Skylake-based systems by the Intel P-states driver. The root
cause is identified as the HWP interrupts causing BIOS code to freeze.

HWP interrupts use the thermal LVT which can be handled by Linux
natively, but on the affected Skylake-based systems SMM will respond
to it by default.  This is a problem for several reasons:
 - On the affected systems the SMM thermal LVT handler is broken (it
   will crash when invoked) and a BIOS update is necessary to fix it.
 - With thermal interrupt handled in SMM we lose all of the reporting
   features of the arch/x86/kernel/cpu/mcheck/therm_throt driver.
 - Some thermal drivers like x86-package-temp depend on the thermal
   threshold interrupts signaled via the thermal LVT.
 - The HWP interrupts are useful for debugging and tuning
   performance (if the kernel can handle them).
The native handling of thermal interrupts needs to be enabled
because of that.

This requires some way to tell SMM that the OS can handle thermal
interrupts.  That can be done by using _OSC/_PDC in processor
scope very early during ACPI initialization.

The meaning of _OSC/_PDC bit 12 in processor scope is whether or
not the OS supports native handling of interrupts for Collaborative
Processor Performance Control (CPPC) notifications.  Since on
HWP-capable systems CPPC is a firmware interface to HWP, setting
this bit effectively tells the firmware that the OS will handle
thermal interrupts natively going forward.

For details on _OSC/_PDC refer to:
http://www.intel.com/content/www/us/en/standards/processor-vendor-specific-acpi-specification.html

To implement the _OSC/_PDC handshake as described, introduce a new
function, acpi_early_processor_osc(), that walks the ACPI
namespace looking for ACPI processor objects and invokes _OSC for
them with bit 12 in the capabilities buffer set and terminates the
namespace walk on the first success.

Also modify intel_thermal_interrupt() to clear HWP status bits in
the HWP_STATUS MSR to acknowledge HWP interrupts (which prevents
them from firing continuously).

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
[ rjw: Subject & changelog, function rename ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/i915: Fake HDMI live status
Shashank Sharma [Thu, 21 Apr 2016 11:18:32 +0000 (16:48 +0530)]
drm/i915: Fake HDMI live status

commit 60b3143c7cac7e8d2ca65c0b347466c5776395d1 upstream.

This patch does the following:
- Fakes live status of HDMI as connected (even if that's not).
  While testing certain (monitor + cable) combinations with
  various intel  platforms, it seems that live status register
  doesn't work reliably on some older devices. So limit the
  live_status check for HDMI detection, only for platforms
  from gen7 onwards.

V2: restrict faking live_status to certain platforms
V3: (Ville)
   - keep the debug message for !live_status case
   - fix indentation of comment
   - remove "warning" from the debug message

    (Jani)
   - Change format of fix details in the commit message

Fixes: 237ed86c693d ("drm/i915: Check live status before reading edid")
Suggested-by: Ville Syrjala <ville.syrjala@linux.intel.com>
Signed-off-by: Shashank Sharma <shashank.sharma@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461237606-16491-1-git-send-email-shashank.sharma@intel.com
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
(cherry picked from commit 4f4a8185011773f7520d9916c6857db946e7f9d1)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/i915: Make RPS EI/thresholds multiple of 25 on SNB-BDW
Ville Syrjälä [Wed, 20 Apr 2016 13:43:56 +0000 (16:43 +0300)]
drm/i915: Make RPS EI/thresholds multiple of 25 on SNB-BDW

commit 4ea3959018d09edfa36a9e7b5ccdbd4ec4b99e49 upstream.

Somehow my SNB GT1 (Dell XPS 8300) gets very unhappy around
GPU hangs if the RPS EI/thresholds aren't suitably aligned.
It seems like scheduling/timer interupts stop working somehow
and things get stuck eg. in usleep_range().

I bisected the problem down to
commit 8a5864377b12 ("drm/i915/skl: Restructured the gen6_set_rps_thresholds function")
I observed that before all the values were at least multiples of 25,
but afterwards they are not. And rounding things up to the next multiple
of 25 does seem to help, so lets' do that. I also tried roundup(..., 5)
but that wasn't sufficient. Also I have no idea if we might need this sort of
thing on gen9+ as well.

These are the original EI/thresholds:
 LOW_POWER
  GEN6_RP_UP_EI          12500
  GEN6_RP_UP_THRESHOLD   11800
  GEN6_RP_DOWN_EI        25000
  GEN6_RP_DOWN_THRESHOLD 21250
 BETWEEN
  GEN6_RP_UP_EI          10250
  GEN6_RP_UP_THRESHOLD    9225
  GEN6_RP_DOWN_EI        25000
  GEN6_RP_DOWN_THRESHOLD 18750
 HIGH_POWER
  GEN6_RP_UP_EI           8000
  GEN6_RP_UP_THRESHOLD    6800
  GEN6_RP_DOWN_EI        25000
  GEN6_RP_DOWN_THRESHOLD 15000

These are after 8a5864377b12:
 LOW_POWER
  GEN6_RP_UP_EI          12500
  GEN6_RP_UP_THRESHOLD   11875
  GEN6_RP_DOWN_EI        25000
  GEN6_RP_DOWN_THRESHOLD 21250
 BETWEEN
  GEN6_RP_UP_EI          10156
  GEN6_RP_UP_THRESHOLD    9140
  GEN6_RP_DOWN_EI        25000
  GEN6_RP_DOWN_THRESHOLD 18750
 HIGH_POWER
  GEN6_RP_UP_EI           7812
  GEN6_RP_UP_THRESHOLD    6640
  GEN6_RP_DOWN_EI        25000
  GEN6_RP_DOWN_THRESHOLD 15000

And these are what we have after this patch:
 LOW_POWER
  GEN6_RP_UP_EI          12500
  GEN6_RP_UP_THRESHOLD   11875
  GEN6_RP_DOWN_EI        25000
  GEN6_RP_DOWN_THRESHOLD 21250
 BETWEEN
  GEN6_RP_UP_EI          10175
  GEN6_RP_UP_THRESHOLD    9150
  GEN6_RP_DOWN_EI        25000
  GEN6_RP_DOWN_THRESHOLD 18750
 HIGH_POWER
  GEN6_RP_UP_EI           7825
  GEN6_RP_UP_THRESHOLD    6650
  GEN6_RP_DOWN_EI        25000
  GEN6_RP_DOWN_THRESHOLD 15000

Cc: Akash Goel <akash.goel@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Testcase: igt/kms_pipe_crc_basic/hang-read-crc-pipe-B
Fixes: 8a5864377b12 ("drm/i915/skl: Restructured the gen6_set_rps_thresholds function")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461159836-9108-1-git-send-email-ville.syrjala@linux.intel.com
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
(cherry picked from commit 8a292d016d1cc4938ff14b4df25328230b08a408)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/i915: Fix eDP low vswing for Broadwell
Mika Kahola [Wed, 20 Apr 2016 12:39:02 +0000 (15:39 +0300)]
drm/i915: Fix eDP low vswing for Broadwell

commit 992e7a41f9fcc7bcd10e7d346aee5ed7a2c241cb upstream.

It was noticed on bug #94087 that module parameter
i915.edp_vswing=2 that should override the VBT setting
to use default voltage swing (400 mV) was not applied
for Broadwell.

This patch provides a fix for this by checking if default
i.e. higher voltage swing is requested to be used and
applies the DDI translations table for DP instead of eDP
(low vswing) table.

v2: Combine two if statements into one (Jani)
v3: Change dev_priv->edp_low_vswing to use dev_priv->vbt.edp.low_vswing

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94087
Signed-off-by: Mika Kahola <mika.kahola@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461155942-7749-1-git-send-email-mika.kahola@intel.com
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
(cherry picked from commit 00983519214b61c1b9371ec2ed55a4dde773e384)
[Jani: s/dev_priv->vbt.edp.low_vswing/dev_priv->edp_low_vswing/ to backport]
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/i915/ddi: Fix eDP VDD handling during booting and suspend/resume
Imre Deak [Mon, 18 Apr 2016 07:04:21 +0000 (10:04 +0300)]
drm/i915/ddi: Fix eDP VDD handling during booting and suspend/resume

commit 5eaa60c7109b40f17ac81090bc8b90482da76cd1 upstream.

The driver's VDD on/off logic assumes that whenever the VDD is on we
also hold an AUX power domain reference. Since BIOS can leave the VDD on
during booting and resuming and on DDI platforms we won't take a
corresponding power reference, the above assumption won't hold on those
platforms and an eventual delayed VDD off work will do an extraneous AUX
power domain put resulting in a refcount underflow. Fix this the same
way we did this for non-DDI DP encoders:

commit 6d93c0c41760c0 ("drm/i915: fix VDD state tracking after system
resume")

At the same time call the DP encoder suspend handler the same way as the
non-DDI DP encoders do to flush any pending VDD off work. Leaving the
work running may cause a HW access where we don't expect this (at a point
where power domains are suspended already).

While at it remove an unnecessary function call indirection.

This fixed for me AUX refcount underflow problems on BXT during
suspend/resume.

CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460963062-13211-4-git-send-email-imre.deak@intel.com
(cherry picked from commit bf93ba67e9c05882f05b7ca2d773cfc8bf462c2a)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/radeon: make sure vertical front porch is at least 1
Alex Deucher [Mon, 2 May 2016 22:53:27 +0000 (18:53 -0400)]
drm/radeon: make sure vertical front porch is at least 1

commit 3104b8128d4d646a574ed9d5b17c7d10752cd70b upstream.

hw doesn't like a 0 value.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoiio: ak8975: fix maybe-uninitialized warning
Richard Leitner [Tue, 5 Apr 2016 13:03:48 +0000 (15:03 +0200)]
iio: ak8975: fix maybe-uninitialized warning

commit 05be8d4101d960bad271d32b4f6096af1ccb1534 upstream.

If i2c_device_id *id is NULL and acpi_match_device returns NULL too,
then chipset may be unitialized when accessing &ak_def_array[chipset] in
ak8975_probe. Therefore initialize chipset to AK_MAX_TYPE, which will
return an error when not changed.

This patch fixes the following maybe-uninitialized warning:

drivers/iio/magnetometer/ak8975.c: In function ‘ak8975_probe’:
drivers/iio/magnetometer/ak8975.c:788:14: warning: ‘chipset’ may be used
uninitialized in this function [-Wmaybe-uninitialized]
  data->def = &ak_def_array[chipset];

Signed-off-by: Richard Leitner <dev@g0hl1n.net>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoiio: ak8975: Fix NULL pointer exception on early interrupt
Krzysztof Kozlowski [Mon, 4 Apr 2016 05:54:59 +0000 (14:54 +0900)]
iio: ak8975: Fix NULL pointer exception on early interrupt

commit 07d2390e36ee5b3265e9cc8305f2a106c8721e16 upstream.

In certain probe conditions the interrupt came right after registering
the handler causing a NULL pointer exception because of uninitialized
waitqueue:

$ udevadm trigger
i2c-gpio i2c-gpio-1: using pins 143 (SDA) and 144 (SCL)
i2c-gpio i2c-gpio-3: using pins 53 (SDA) and 52 (SCL)
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = e8b38000
[00000000] *pgd=00000000
Internal error: Oops: 5 [#1] SMP ARM
Modules linked in: snd_soc_i2s(+) i2c_gpio(+) snd_soc_idma snd_soc_s3c_dma snd_soc_core snd_pcm_dmaengine snd_pcm snd_timer snd soundcore ac97_bus spi_s3c64xx pwm_samsung dwc2 exynos_adc phy_exynos_usb2 exynosdrm exynos_rng rng_core rtc_s3c
CPU: 0 PID: 717 Comm: data-provider-m Not tainted 4.6.0-rc1-next-20160401-00011-g1b8d87473b9e-dirty #101
Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
(...)
(__wake_up_common) from [<c0379624>] (__wake_up+0x38/0x4c)
(__wake_up) from [<c0a41d30>] (ak8975_irq_handler+0x28/0x30)
(ak8975_irq_handler) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140)
(handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68)
(handle_irq_event) from [<c0389c40>] (handle_edge_irq+0xf0/0x19c)
(handle_edge_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34)
(generic_handle_irq) from [<c05ee360>] (exynos_eint_gpio_irq+0x50/0x68)
(exynos_eint_gpio_irq) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140)
(handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68)
(handle_irq_event) from [<c0389a70>] (handle_fasteoi_irq+0xb4/0x194)
(handle_fasteoi_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34)
(generic_handle_irq) from [<c03860b4>] (__handle_domain_irq+0x5c/0xb4)
(__handle_domain_irq) from [<c0301774>] (gic_handle_irq+0x54/0x94)
(gic_handle_irq) from [<c030c910>] (__irq_usr+0x50/0x80)

The bug was reproduced on exynos4412-trats2 (with a max77693 device also
using i2c-gpio) after building max77693 as a module.

Fixes: 94a6d5cf7caa ("iio:ak8975 Implement data ready interrupt handling")
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Tested-by: Gregor Boirie <gregor.boirie@parrot.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/amdgpu: set metadata pointer to NULL after freeing.
Dave Airlie [Tue, 3 May 2016 02:44:29 +0000 (12:44 +1000)]
drm/amdgpu: set metadata pointer to NULL after freeing.

commit 0092d3edcb23fcdb8cbe4159ba94a534290ff982 upstream.

Without this there was a double free of the metadata,
which ended up freeing the fd table for me here, and taking
out the machine more often than not.

I reproduced with X.org + modesetting DDX + latest llvm/mesa,
also required using dri3.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/amdgpu: make sure vertical front porch is at least 1
Alex Deucher [Mon, 2 May 2016 22:54:39 +0000 (18:54 -0400)]
drm/amdgpu: make sure vertical front porch is at least 1

commit 0126d4b9a516256f2432ca0dc78ab293a8255378 upstream.

hw doesn't like a 0 value.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agogpu: ipu-v3: Fix imx-ipuv3-crtc module autoloading
Philipp Zabel [Wed, 27 Apr 2016 08:17:51 +0000 (10:17 +0200)]
gpu: ipu-v3: Fix imx-ipuv3-crtc module autoloading

commit 503fe87bd0a8346ba9d8b7f49115dcd0a4185226 upstream.

If of_node is set before calling platform_device_add, the driver core
will try to use of: modalias matching, which fails because the device
tree nodes don't have a compatible property set. This patch fixes
imx-ipuv3-crtc module autoloading by setting the of_node property only
after the platform modalias is set.

Fixes: 304e6be652e2 ("gpu: ipu-v3: Assign of_node of child platform devices to corresponding ports")
Reported-by: Dennis Gilmore <dennis@ausil.us>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Tested-By: Dennis Gilmore <dennis@ausil.us>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agonvmem: mxs-ocotp: fix buffer overflow in read
Stanislav Meduna [Mon, 2 May 2016 15:05:11 +0000 (16:05 +0100)]
nvmem: mxs-ocotp: fix buffer overflow in read

commit d1306eb675ad7a9a760b6b8e8e189824b8db89e7 upstream.

This patch fixes the issue where the mxs_ocotp_read is reading
the ocotp in reg_size steps but decrements the remaining size
by 1. The number of iterations is thus four times higher,
overwriting the area behind the output buffer.

Fixes: c01e9a11ab6f ("nvmem: add driver for ocotp in i.MX23 and i.MX28")
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Stanislav Meduna <stano@meduna.org>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoUSB: serial: cp210x: add Straizona Focusers device ids
Jasem Mutlaq [Tue, 19 Apr 2016 07:38:27 +0000 (10:38 +0300)]
USB: serial: cp210x: add Straizona Focusers device ids

commit 613ac23a46e10d4d4339febdd534fafadd68e059 upstream.

Adding VID:PID for Straizona Focusers to cp210x driver.

Signed-off-by: Jasem Mutlaq <mutlaqja@ikarustech.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoUSB: serial: cp210x: add ID for Link ECU
Mike Manning [Mon, 18 Apr 2016 12:13:23 +0000 (12:13 +0000)]
USB: serial: cp210x: add ID for Link ECU

commit 1d377f4d690637a0121eac8701f84a0aa1e69a69 upstream.

The Link ECU is an aftermarket ECU computer for vehicles that provides
full tuning abilities as well as datalogging and displaying capabilities
via the USB to Serial adapter built into the device.

Signed-off-by: Mike Manning <michael@bsch.com.au>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoata: ahci-platform: Add ports-implemented DT bindings.
Srinivas Kandagatla [Fri, 1 Apr 2016 07:52:57 +0000 (08:52 +0100)]
ata: ahci-platform: Add ports-implemented DT bindings.

commit 17dcc37e3e847bc0e67a5b1ec52471fcc6c18682 upstream.

On some SOCs PORTS_IMPL register value is never programmed by the
firmware and left at zero value. Which means that no sata ports are
available for software. AHCI driver used to cope up with this by
fabricating the port_map if the PORTS_IMPL register is read zero,
but recent patch broke this workaround as zero value was valid for
NVMe disks.

This patch adds ports-implemented DT bindings as workaround for this issue
in a way that DT can can override the PORTS_IMPL register in cases where
the firmware did not program it already.

Fixes: 566d1827df2e ("libata: disable forced PORTS_IMPL for >= AHCI 1.3")
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Acked-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Andy Gross <andy.gross@linaro.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agolibahci: save port map for forced port map
Srinivas Kandagatla [Fri, 1 Apr 2016 07:52:56 +0000 (08:52 +0100)]
libahci: save port map for forced port map

commit 2fd0f46cb1b82587c7ae4a616d69057fb9bd0af7 upstream.

In usecases where force_port_map is used saved_port_map is never set,
resulting in not programming the PORTS_IMPL register as part of initial
config. This patch fixes this by setting it to port_map even in case
where force_port_map is used, making it more inline with other parts of
the code.

Fixes: 566d1827df2e ("libata: disable forced PORTS_IMPL for >= AHCI 1.3")
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Acked-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Andy Gross <andy.gross@linaro.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agopowerpc: Fix bad inline asm constraint in create_zero_mask()
Anton Blanchard [Fri, 29 Apr 2016 22:29:27 +0000 (08:29 +1000)]
powerpc: Fix bad inline asm constraint in create_zero_mask()

commit b4c112114aab9aff5ed4568ca5e662bb02cdfe74 upstream.

In create_zero_mask() we have:

addi %1,%2,-1
andc %1,%1,%2
popcntd %0,%1

using the "r" constraint for %2. r0 is a valid register in the "r" set,
but addi X,r0,X turns it into an li:

li r7,-1
andc r7,r7,r0
popcntd r4,r7

Fix this by using the "b" constraint, for which r0 is not a valid
register.

This was found with a kernel build using gcc trunk, narrowed down to
when -frename-registers was enabled at -O2. It is just luck however
that we aren't seeing this on older toolchains.

Thanks to Segher for working with me to find this issue.

Fixes: d0cebfa650a0 ("powerpc: word-at-a-time optimization for 64-bit Little Endian")
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoACPICA: Dispatcher: Update thread ID for recursive method calls
Prarit Bhargava [Wed, 4 May 2016 05:48:56 +0000 (13:48 +0800)]
ACPICA: Dispatcher: Update thread ID for recursive method calls

commit 93d68841a23a5779cef6fb9aa0ef32e7c5bd00da upstream.

ACPICA commit 7a3bd2d962f221809f25ddb826c9e551b916eb25

Set the mutex owner thread ID.
Original patch from: Prarit Bhargava <prarit@redhat.com>

Link: https://bugzilla.kernel.org/show_bug.cgi?id=115121
Link: https://github.com/acpica/acpica/commit/7a3bd2d9
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Tested-by: Andy Lutomirski <luto@kernel.org> # On a Dell XPS 13 9350
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agox86/sysfb_efi: Fix valid BAR address range check
Wang YanQing [Thu, 5 May 2016 13:14:21 +0000 (14:14 +0100)]
x86/sysfb_efi: Fix valid BAR address range check

commit c10fcb14c7afd6688c7b197a814358fecf244222 upstream.

The code for checking whether a BAR address range is valid will break
out of the loop when a start address of 0x0 is encountered.

This behaviour is wrong since by breaking out of the loop we may miss
the BAR that describes the EFI frame buffer in a later iteration.

Because of this bug I can't use video=efifb: boot parameter to get
efifb on my new ThinkPad E550 for my old linux system hard disk with
3.10 kernel. In 3.10, efifb is the only choice due to DRM/I915 not
supporting the GPU.

This patch also add a trivial optimization to break out after we find
the frame buffer address range without testing later BARs.

Signed-off-by: Wang YanQing <udknight@gmail.com>
[ Rewrote changelog. ]
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Reviewed-by: Peter Jones <pjones@redhat.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: David Herrmann <dh.herrmann@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1462454061-21561-2-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoARC: Add missing io barriers to io{read,write}{16,32}be()
Vineet Gupta [Thu, 5 May 2016 08:02:34 +0000 (13:32 +0530)]
ARC: Add missing io barriers to io{read,write}{16,32}be()

commit e5bc0478ab6cf565619224536d75ecb2aedca43b upstream.

While reviewing a different change to asm-generic/io.h Arnd spotted that
ARC ioread32 and ioread32be both of which come from asm-generic versions
are not symmetrical in terms of calling the io barriers.

generic ioread32   -> ARC readl()                  [ has barriers]
generic ioread32be -> __be32_to_cpu(__raw_readl()) [ lacks barriers]

While generic ioread32be is being remediated to call readl(), that involves
a swab32(), causing double swaps on ioread32be() on Big Endian systems.

So provide our versions of big endian IO accessors to ensure io barrier
calls while also keeping them optimal

Suggested-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoARM: cpuidle: Pass on arm_cpuidle_suspend()'s return value
James Morse [Tue, 26 Apr 2016 11:15:01 +0000 (12:15 +0100)]
ARM: cpuidle: Pass on arm_cpuidle_suspend()'s return value

commit 625fe4f8ffc1b915248558481bb94249f6bd411c upstream.

arm_cpuidle_suspend() may return -EOPNOTSUPP, or any value returned
by the cpu_ops/cpuidle_ops suspend call. arm_enter_idle_state() doesn't
update 'ret' with this value, meaning we always signal success to
cpuidle_enter_state(), causing it to update the usage counters as if we
succeeded.

Fixes: 191de17aa3c1 ("ARM64: cpuidle: Replace cpu_suspend by the common ARM/ARM64 function")
Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agopropogate_mnt: Handle the first propogated copy being a slave
Eric W. Biederman [Thu, 5 May 2016 14:29:29 +0000 (09:29 -0500)]
propogate_mnt: Handle the first propogated copy being a slave

commit 5ec0811d30378ae104f250bfc9b3640242d81e3f upstream.

When the first propgated copy was a slave the following oops would result:
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
> IP: [<ffffffff811fba4e>] propagate_one+0xbe/0x1c0
> PGD bacd4067 PUD bac66067 PMD 0
> Oops: 0000 [#1] SMP
> Modules linked in:
> CPU: 1 PID: 824 Comm: mount Not tainted 4.6.0-rc5userns+ #1523
> Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
> task: ffff8800bb0a8000 ti: ffff8800bac3c000 task.ti: ffff8800bac3c000
> RIP: 0010:[<ffffffff811fba4e>]  [<ffffffff811fba4e>] propagate_one+0xbe/0x1c0
> RSP: 0018:ffff8800bac3fd38  EFLAGS: 00010283
> RAX: 0000000000000000 RBX: ffff8800bb77ec00 RCX: 0000000000000010
> RDX: 0000000000000000 RSI: ffff8800bb58c000 RDI: ffff8800bb58c480
> RBP: ffff8800bac3fd48 R08: 0000000000000001 R09: 0000000000000000
> R10: 0000000000001ca1 R11: 0000000000001c9d R12: 0000000000000000
> R13: ffff8800ba713800 R14: ffff8800bac3fda0 R15: ffff8800bb77ec00
> FS:  00007f3c0cd9b7e0(0000) GS:ffff8800bfb00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000010 CR3: 00000000bb79d000 CR4: 00000000000006e0
> Stack:
>  ffff8800bb77ec00 0000000000000000 ffff8800bac3fd88 ffffffff811fbf85
>  ffff8800bac3fd98 ffff8800bb77f080 ffff8800ba713800 ffff8800bb262b40
>  0000000000000000 0000000000000000 ffff8800bac3fdd8 ffffffff811f1da0
> Call Trace:
>  [<ffffffff811fbf85>] propagate_mnt+0x105/0x140
>  [<ffffffff811f1da0>] attach_recursive_mnt+0x120/0x1e0
>  [<ffffffff811f1ec3>] graft_tree+0x63/0x70
>  [<ffffffff811f1f6b>] do_add_mount+0x9b/0x100
>  [<ffffffff811f2c1a>] do_mount+0x2aa/0xdf0
>  [<ffffffff8117efbe>] ? strndup_user+0x4e/0x70
>  [<ffffffff811f3a45>] SyS_mount+0x75/0xc0
>  [<ffffffff8100242b>] do_syscall_64+0x4b/0xa0
>  [<ffffffff81988f3c>] entry_SYSCALL64_slow_path+0x25/0x25
> Code: 00 00 75 ec 48 89 0d 02 22 22 01 8b 89 10 01 00 00 48 89 05 fd 21 22 01 39 8e 10 01 00 00 0f 84 e0 00 00 00 48 8b 80 d8 00 00 00 <48> 8b 50 10 48 89 05 df 21 22 01 48 89 15 d0 21 22 01 8b 53 30
> RIP  [<ffffffff811fba4e>] propagate_one+0xbe/0x1c0
>  RSP <ffff8800bac3fd38>
> CR2: 0000000000000010
> ---[ end trace 2725ecd95164f217 ]---

This oops happens with the namespace_sem held and can be triggered by
non-root users.  An all around not pleasant experience.

To avoid this scenario when finding the appropriate source mount to
copy stop the walk up the mnt_master chain when the first source mount
is encountered.

Further rewrite the walk up the last_source mnt_master chain so that
it is clear what is going on.

The reason why the first source mount is special is that it it's
mnt_parent is not a mount in the dest_mnt propagation tree, and as
such termination conditions based up on the dest_mnt mount propgation
tree do not make sense.

To avoid other kinds of confusion last_dest is not changed when
computing last_source.  last_dest is only used once in propagate_one
and that is above the point of the code being modified, so changing
the global variable is meaningless and confusing.

fixes: f2ebb3a921c1ca1e2ddd9242e95a1989a50c4c68 ("smarter propagate_mnt()")
Reported-by: Tycho Andersen <tycho.andersen@canonical.com>
Reviewed-by: Seth Forshee <seth.forshee@canonical.com>
Tested-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agofs/pnode.c: treat zero mnt_group_id-s as unequal
Maxim Patlasov [Tue, 16 Feb 2016 19:45:33 +0000 (11:45 -0800)]
fs/pnode.c: treat zero mnt_group_id-s as unequal

commit 7ae8fd0351f912b075149a1e03a017be8b903b9a upstream.

propagate_one(m) calculates "type" argument for copy_tree() like this:

>    if (m->mnt_group_id == last_dest->mnt_group_id) {
>        type = CL_MAKE_SHARED;
>    } else {
>        type = CL_SLAVE;
>        if (IS_MNT_SHARED(m))
>           type |= CL_MAKE_SHARED;
>   }

The "type" argument then governs clone_mnt() behavior with respect to flags
and mnt_master of new mount. When we iterate through a slave group, it is
possible that both current "m" and "last_dest" are not shared (although,
both are slaves, i.e. have non-NULL mnt_master-s). Then the comparison
above erroneously makes new mount shared and sets its mnt_master to
last_source->mnt_master. The patch fixes the problem by handling zero
mnt_group_id-s as though they are unequal.

The similar problem exists in the implementation of "else" clause above
when we have to ascend upward in the master/slave tree by calling:

>    last_source = last_source->mnt_master;
>    last_dest = last_source->mnt_parent;

proper number of times. The last step is governed by
"n->mnt_group_id != last_dest->mnt_group_id" condition that may lie if
both are zero. The patch fixes this case in the same way as the former one.

[AV: don't open-code an obvious helper...]

Signed-off-by: Maxim Patlasov <mpatlasov@virtuozzo.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agox86/tsc: Read all ratio bits from MSR_PLATFORM_INFO
Chen Yu [Fri, 6 May 2016 03:33:39 +0000 (11:33 +0800)]
x86/tsc: Read all ratio bits from MSR_PLATFORM_INFO

commit 886123fb3a8656699dff40afa0573df359abeb18 upstream.

Currently we read the tsc radio: ratio = (MSR_PLATFORM_INFO >> 8) & 0x1f;

Thus we get bit 8-12 of MSR_PLATFORM_INFO, however according to the SDM
(35.5), the ratio bits are bit 8-15.

Ignoring the upper bits can result in an incorrect tsc ratio, which causes the
TSC calibration and the Local APIC timer frequency to be incorrect.

Fix this problem by masking 0xff instead.

[ tglx: Massaged changelog ]

Fixes: 7da7c1561366 "x86, tsc: Add static (MSR) TSC calibration on Intel Atom SoCs"
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Bin Gao <bin.gao@intel.com>
Cc: Len Brown <lenb@kernel.org>
Link: http://lkml.kernel.org/r/1462505619-5516-1-git-send-email-yu.c.chen@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoMAINTAINERS: Remove asterisk from EFI directory names
Matt Fleming [Tue, 3 May 2016 19:29:39 +0000 (20:29 +0100)]
MAINTAINERS: Remove asterisk from EFI directory names

commit e8dfe6d8f6762d515fcd4f30577f7bfcf7659887 upstream.

Mark reported that having asterisks on the end of directory names
confuses get_maintainer.pl when it encounters subdirectories, and that
my name does not appear when run on drivers/firmware/efi/libstub.

Reported-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1462303781-8686-2-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agowriteback: Fix performance regression in wb_over_bg_thresh()
Howard Cochran [Thu, 10 Mar 2016 06:12:39 +0000 (01:12 -0500)]
writeback: Fix performance regression in wb_over_bg_thresh()

commit 74d369443325063a5f0260e63971decb950fd8fa upstream.

Commit 947e9762a8dd ("writeback: update wb_over_bg_thresh() to use
wb_domain aware operations") unintentionally changed this function's
meaning from "are there more dirty pages than the background writeback
threshold" to "are there more dirty pages than the writeback threshold".
The background writeback threshold is typically half of the writeback
threshold, so this had the effect of raising the number of dirty pages
required to cause a writeback worker to perform background writeout.

This can cause a very severe performance regression when a BDI uses
BDI_CAP_STRICTLIMIT because balance_dirty_pages() and the writeback worker
can now disagree on whether writeback should be initiated.

For example, in a system having 1GB of RAM, a single spinning disk, and a
"pass-through" FUSE filesystem mounted over the disk, application code
mmapped a 128MB file on the disk and was randomly dirtying pages in that
mapping.

Because FUSE uses strictlimit and has a default max_ratio of only 1%, in
balance_dirty_pages, thresh is ~200, bg_thresh is ~100, and the
dirty_freerun_ceiling is the average of those, ~150. So, it pauses the
dirtying processes when we have 151 dirty pages and wakes up a background
writeback worker. But the worker tests the wrong threshold (200 instead of
100), so it does not initiate writeback and just returns.

Thus, balance_dirty_pages keeps looping, sleeping and then waking up the
worker who will do nothing. It remains stuck in this state until the few
dirty pages that we have finally expire and we write them back for that
reason. Then the whole process repeats, resulting in near-zero throughput
through the FUSE BDI.

The fix is to call the parameterized variant of wb_calc_thresh, so that the
worker will do writeback if the bg_thresh is exceeded which was the
behavior before the referenced commit.

Fixes: 947e9762a8dd ("writeback: update wb_over_bg_thresh() to use wb_domain aware operations")
Signed-off-by: Howard Cochran <hcochran@kernelspring.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Tested-by Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agobatman-adv: Reduce refcnt of removed router when updating route
Sven Eckelmann [Sun, 20 Mar 2016 11:27:53 +0000 (12:27 +0100)]
batman-adv: Reduce refcnt of removed router when updating route

commit d1a65f1741bfd9c69f9e4e2ad447a89b6810427d upstream.

_batadv_update_route rcu_derefences orig_ifinfo->router outside of a
spinlock protected region to print some information messages to the debug
log. But this pointer is not checked again when the new pointer is assigned
in the spinlock protected region. Thus is can happen that the value of
orig_ifinfo->router changed in the meantime and thus the reference counter
of the wrong router gets reduced after the spinlock protected region.

Just rcu_dereferencing the value of orig_ifinfo->router inside the spinlock
protected region (which also set the new pointer) is enough to get the
correct old router object.

Fixes: e1a5382f978b ("batman-adv: Make orig_node->router an rcu protected pointer")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <a@unstable.cc>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agobatman-adv: Fix broadcast/ogm queue limit on a removed interface
Linus Lüssing [Fri, 11 Mar 2016 13:04:49 +0000 (14:04 +0100)]
batman-adv: Fix broadcast/ogm queue limit on a removed interface

commit c4fdb6cff2aa0ae740c5f19b6f745cbbe786d42f upstream.

When removing a single interface while a broadcast or ogm packet is
still pending then we will free the forward packet without releasing the
queue slots again.

This patch is supposed to fix this issue.

Fixes: 6d5808d4ae1b ("batman-adv: Add missing hardif_free_ref in forw_packet_free")
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
[sven@narfation.org: fix conflicts with current version]
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <a@unstable.cc>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agobatman-adv: Check skb size before using encapsulated ETH+VLAN header
Sven Eckelmann [Fri, 26 Feb 2016 16:56:13 +0000 (17:56 +0100)]
batman-adv: Check skb size before using encapsulated ETH+VLAN header

commit c78296665c3d81f040117432ab9e1cb125521b0c upstream.

The encapsulated ethernet and VLAN header may be outside the received
ethernet frame. Thus the skb buffer size has to be checked before it can be
parsed to find out if it encapsulates another batman-adv packet.

Fixes: 420193573f11 ("batman-adv: softif bridge loop avoidance")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <a@unstable.cc>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agobatman-adv: fix DAT candidate selection (must use vid)
Antonio Quartulli [Sat, 12 Mar 2016 10:12:59 +0000 (11:12 +0100)]
batman-adv: fix DAT candidate selection (must use vid)

commit 2871734e85e920503d49b3a8bc0afbe0773b6036 upstream.

Now that DAT is VLAN aware, it must use the VID when
computing the DHT address of the candidate nodes where
an entry is going to be stored/retrieved.

Fixes: be1db4f6615b ("batman-adv: make the Distributed ARP Table vlan aware")
Signed-off-by: Antonio Quartulli <a@unstable.cc>
[sven@narfation.org: fix conflicts with current version]
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>