OSDN Git Service

tomoyo/tomoyo-test1.git
4 years agoMerge branch 'remotes/lorenzo/pci/aardvark'
Bjorn Helgaas [Thu, 4 Jun 2020 17:59:14 +0000 (12:59 -0500)]
Merge branch 'remotes/lorenzo/pci/aardvark'

  - Train link immediately after enabling link training to avoid issues
    with Compex WLE900VX and Turris MOX devices (Pali Rohár)

  - Remove ASPM config and let the PCI core do it (Pali Rohár)

  - Interpret zero 'max-link-speed' value as invalid (Pali Rohár)

  - Respect the 'max-link-speed' property and improve link training (Marek
    Behún)

  - Issue PERST via GPIO (Pali Rohár)

  - Add PHY support (Marek Behún)

  - Use standard PCIe capability macros (Pali Rohár)

  - Document new 'max-link-speed', 'phys', and 'reset-gpios' properties
    (Marek Behún)

* remotes/lorenzo/pci/aardvark:
  dt-bindings: PCI: aardvark: Describe new properties
  PCI: aardvark: Replace custom macros by standard linux/pci_regs.h macros
  PCI: aardvark: Add PHY support
  PCI: aardvark: Add FIXME comment for PCIE_CORE_CMD_STATUS_REG access
  PCI: aardvark: Issue PERST via GPIO
  PCI: aardvark: Improve link training
  PCI: of: Zero max-link-speed value is invalid
  PCI: aardvark: Don't blindly enable ASPM L0s and don't write to read-only register
  PCI: aardvark: Train link immediately after enabling training

4 years agoMerge branch 'pci/virtualization'
Bjorn Helgaas [Thu, 4 Jun 2020 17:59:13 +0000 (12:59 -0500)]
Merge branch 'pci/virtualization'

  - Remove unused xen_register_pirq() parameter (Wei Liu)

  - Quirk AMD Matisse HD Audio & USB 3.0 devices where FLR hangs the device
    (Marcos Scriven)

  - Quirk AMD Starship USB 3.0 device where FLR doesn't seem to work (Kevin
    Buettner)

  - Add ACS quirk for Intel RCiEPs (Ashok Raj)

* pci/virtualization:
  PCI: Add ACS quirk for Intel Root Complex Integrated Endpoints
  PCI: Avoid FLR for AMD Starship USB 3.0
  PCI: Avoid FLR for AMD Matisse HD Audio & USB 3.0
  x86/PCI: Drop unused xen_register_pirq() gsi_override parameter

4 years agoMerge branch 'pci/switchtec'
Bjorn Helgaas [Thu, 4 Jun 2020 17:59:13 +0000 (12:59 -0500)]
Merge branch 'pci/switchtec'

  - Fix a minor bool type issue (Krzysztof Wilczynski)

* pci/switchtec:
  PCI/switchtec: Correct bool variable type assignment

4 years agoMerge branch 'pci/resource'
Bjorn Helgaas [Thu, 4 Jun 2020 17:59:12 +0000 (12:59 -0500)]
Merge branch 'pci/resource'

  - Allow resizing BARs of devices on root bus (Ard Biesheuvel)

* pci/resource:
  PCI: Allow pci_resize_resource() for devices on root bus

4 years agoMerge branch 'pci/pm'
Bjorn Helgaas [Thu, 4 Jun 2020 17:59:12 +0000 (12:59 -0500)]
Merge branch 'pci/pm'

  - Check .bridge_d3() hook for NULL before calling it (Bjorn Helgaas)

  - Disable PME# for Pericom OHCI/UHCI USB controllers because it's
    not reliably asserted on USB hotplug (Kai-Heng Feng)

  - Assume ports without DLL Link Active train links in 100 ms to work
    around Thunderbolt bridge defects (Mika Westerberg)

* pci/pm:
  PCI/PM: Assume ports without DLL Link Active train links in 100 ms
  PCI/PM: Adjust pcie_wait_for_link_delay() for caller delay
  PCI: Avoid Pericom USB controller OHCI/EHCI PME# defect
  serial: 8250_pci: Move Pericom IDs to pci_ids.h
  PCI/PM: Call .bridge_d3() hook only if non-NULL

4 years agoMerge branch 'pci/p2pdma'
Bjorn Helgaas [Thu, 4 Jun 2020 17:59:11 +0000 (12:59 -0500)]
Merge branch 'pci/p2pdma'

  - Add AMD Zen Raven and Renoir Root Ports to P2PDMA whitelist (Alex
    Deucher)

* pci/p2pdma:
  PCI/P2PDMA: Add AMD Zen Raven and Renoir Root Ports to whitelist

4 years agoMerge branch 'pci/misc'
Bjorn Helgaas [Thu, 4 Jun 2020 17:59:11 +0000 (12:59 -0500)]
Merge branch 'pci/misc'

  - Clarify that platform_get_irq() should never return 0 (Bjorn Helgaas)

  - Check for platform_get_irq() failure consistently (Bjorn Helgaas)

  - Replace zero-length array with flexible-array (Gustavo A. R. Silva)

  - Unify pcie_find_root_port() and pci_find_pcie_root_port() (Yicong Yang)

  - Quirk Intel C620 MROMs, which have non-BARs in BAR locations (Xiaochun
    Lee)

  - Fix pcie_pme_resume() and pcie_pme_remove() kernel-doc (Jay Fang)

  - Rename _DSM constants to align with spec (Krzysztof Wilczyński)

* pci/misc:
  PCI: Rename _DSM constants to align with spec
  PCI/PME: Fix kernel-doc of pcie_pme_resume() and pcie_pme_remove()
  x86/PCI: Mark Intel C620 MROMs as having non-compliant BARs
  PCI: Unify pcie_find_root_port() and pci_find_pcie_root_port()
  PCI: Replace zero-length array with flexible-array
  PCI: Check for platform_get_irq() failure consistently
  driver core: platform: Clarify that IRQ 0 is invalid

4 years agoMerge branch 'pci/kconfig'
Bjorn Helgaas [Thu, 4 Jun 2020 17:59:10 +0000 (12:59 -0500)]
Merge branch 'pci/kconfig'

  - Remove unnecessary "default y" Kconfig options (Bjorn Helgaas)

* pci/kconfig:
  PCI/AER: Don't select CONFIG_PCIEAER by default
  PCI: keystone: Don't select CONFIG_PCI_KEYSTONE_HOST by default
  PCI: dra7xx: Don't select CONFIG_PCI_DRA7XX_HOST by default

4 years agoMerge branch 'pci/hotplug'
Bjorn Helgaas [Thu, 4 Jun 2020 17:59:10 +0000 (12:59 -0500)]
Merge branch 'pci/hotplug'

  - Remove unused pciehp EMI() and HP_SUPR_RM() macros (Ani Sinha)

  - Use of_node_name_eq() for node name comparisons (Rob Herring)

  - Convert shpchp_unconfigure_device() to void (Krzysztof Wilczynski)

* pci/hotplug:
  PCI: shpchp: Make shpchp_unconfigure_device() void
  PCI: Use of_node_name_eq() for node name comparisons
  PCI: pciehp: Remove unused EMI() and HP_SUPR_RM() macros

4 years agoMerge branch 'pci/error'
Bjorn Helgaas [Thu, 4 Jun 2020 17:59:09 +0000 (12:59 -0500)]
Merge branch 'pci/error'

  - Log only ACPI_NOTIFY_DISCONNECT_RECOVER events for EDR, not all ACPI
    SYSTEM-level events (Kuppuswamy Sathyanarayanan)

  - Rely only on _OSC (not _OSC + HEST FIRMWARE_FIRST) to negotiate AER
    Capability ownership (Alexandru Gagniuc)

  - Remove HEST/FIRMWARE_FIRST parsing that was previously used to help
    intuit AER Capability ownership (Kuppuswamy Sathyanarayanan)

  - Remove redundant pci_is_pcie() and dev->aer_cap checks (Kuppuswamy
    Sathyanarayanan)

  - Print IRQ number used by DPC (Yicong Yang)

* pci/error:
  PCI/DPC: Print IRQ number used by port
  PCI/AER: Use "aer" variable for capability offset
  PCI/AER: Remove redundant dev->aer_cap checks
  PCI/AER: Remove redundant pci_is_pcie() checks
  PCI/AER: Remove HEST/FIRMWARE_FIRST parsing for AER ownership
  PCI/AER: Use only _OSC to determine AER ownership
  PCI/EDR: Log only ACPI_NOTIFY_DISCONNECT_RECOVER events

4 years agoMerge branch 'pci/enumeration'
Bjorn Helgaas [Thu, 4 Jun 2020 17:59:09 +0000 (12:59 -0500)]
Merge branch 'pci/enumeration'

  - Fix pci_register_host_bridge() device_register() error handling (Rob
    Herring)

  - Fix pci_host_bridge struct device release/free handling (Rob Herring)

  - Program MPS for RCiEP devices (Ashok Raj)

  - Inherit PTM settings from Switch Upstream Port so we can enable PTM on
    Endpoints (Bjorn Helgaas)

  - Add #defines for bridge windows (PCI_BRIDGE_IO_WINDOW,
    PCI_BRIDGE_MEM_WINDOW, etc) (Krzysztof Wilczynski)

* pci/enumeration:
  pcmcia: Use CardBus window names (PCI_CB_BRIDGE_IO_0_WINDOW etc) when freeing
  PCI: Use bridge window names (PCI_BRIDGE_IO_WINDOW etc)
  PCI/PTM: Inherit Switch Downstream Port PTM settings from Upstream Port
  PCI: Program MPS for RCiEP devices
  PCI: Fix pci_host_bridge struct device release/free handling
  PCI: Fix pci_register_host_bridge() device_register() error handling

4 years agoMerge branch 'pci/aspm'
Bjorn Helgaas [Thu, 4 Jun 2020 17:59:09 +0000 (12:59 -0500)]
Merge branch 'pci/aspm'

  - Allow ASPM on links to PCIe-to-PCI/PCI-X Bridges (Kai-Heng Feng)

* pci/aspm:
  PCI/ASPM: Allow ASPM on links to PCIe-to-PCI/PCI-X Bridges

4 years agoPCI: Add ACS quirk for Intel Root Complex Integrated Endpoints
Ashok Raj [Thu, 28 May 2020 20:57:42 +0000 (13:57 -0700)]
PCI: Add ACS quirk for Intel Root Complex Integrated Endpoints

All Intel platforms guarantee that all root complex implementations must
send transactions up to IOMMU for address translations. Hence for Intel
RCiEP devices, we can assume some ACS-type isolation even without an ACS
capability.

From the Intel VT-d spec, r3.1, sec 3.16 ("Root-Complex Peer to Peer
Considerations"):

  When DMA remapping is enabled, peer-to-peer requests through the
  Root-Complex must be handled as follows:

  - The input address in the request is translated (through first-level,
    second-level or nested translation) to a host physical address (HPA).
    The address decoding for peer addresses must be done only on the
    translated HPA. Hardware implementations are free to further limit
    peer-to-peer accesses to specific host physical address regions (or
    to completely disallow peer-forwarding of translated requests).

  - Since address translation changes the contents (address field) of
    the PCI Express Transaction Layer Packet (TLP), for PCI Express
    peer-to-peer requests with ECRC, the Root-Complex hardware must use
    the new ECRC (re-computed with the translated address) if it
    decides to forward the TLP as a peer request.

  - Root-ports, and multi-function root-complex integrated endpoints, may
    support additional peer-to-peer control features by supporting PCI
    Express Access Control Services (ACS) capability. Refer to ACS
    capability in PCI Express specifications for details.

Since Linux didn't give special treatment to allow this exception, certain
RCiEP MFD devices were grouped in a single IOMMU group. This doesn't permit
a single device to be assigned to a guest for instance.

In one vendor system: Device 14.x were grouped in a single IOMMU group.

  /sys/kernel/iommu_groups/5/devices/0000:00:14.0
  /sys/kernel/iommu_groups/5/devices/0000:00:14.2
  /sys/kernel/iommu_groups/5/devices/0000:00:14.3

After this patch:

  /sys/kernel/iommu_groups/5/devices/0000:00:14.0
  /sys/kernel/iommu_groups/5/devices/0000:00:14.2
  /sys/kernel/iommu_groups/6/devices/0000:00:14.3 <<< new group

14.0 and 14.2 are integrated devices, but legacy end points, whereas 14.3
was a PCIe-compliant RCiEP.

  00:14.3 Network controller: Intel Corporation Device 9df0 (rev 30)
    Capabilities: [40] Express (v2) Root Complex Integrated Endpoint, MSI 00

This permits assigning this device to a guest VM.

[bhelgaas: drop "Fixes" tag since this doesn't fix a bug in that commit]
Link: https://lore.kernel.org/r/1590699462-7131-1-git-send-email-ashok.raj@intel.com
Tested-by: Darrel Goeddel <dgoeddel@forcepoint.com>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Cc: stable@vger.kernel.org
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Mark Scott <mscott@forcepoint.com>,
Cc: Romil Sharma <rsharma@forcepoint.com>
4 years agoPCI/DPC: Print IRQ number used by port
Yicong Yang [Sat, 9 May 2020 09:56:54 +0000 (17:56 +0800)]
PCI/DPC: Print IRQ number used by port

Print IRQ number used by DPC port, like AER/PME does.  It provides
convenience to track DPC interrupts counts of certain port from
/proc/interrupts.

Link: https://lore.kernel.org/r/1589018214-52752-1-git-send-email-yangyicong@hisilicon.com
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
4 years agoPCI/AER: Use "aer" variable for capability offset
Bjorn Helgaas [Fri, 29 May 2020 22:56:09 +0000 (17:56 -0500)]
PCI/AER: Use "aer" variable for capability offset

Previously we used "pos" or "aer_pos" for the offset of the AER Capability.
Use "aer" consistently and initialize it the same way everywhere.  No
functional change intended.

Link: https://lore.kernel.org/r/20200529230915.GA479883@bjorn-Precision-5520
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
4 years agoPCI/AER: Remove redundant dev->aer_cap checks
Kuppuswamy Sathyanarayanan [Tue, 26 May 2020 23:18:26 +0000 (16:18 -0700)]
PCI/AER: Remove redundant dev->aer_cap checks

pcie_aer_get_firmware_first() checks dev->aer_cap, so we can remove
redundant dev->aer_cap checks in the callers.

Link: https://lore.kernel.org/r/d5ccc7a060ec9cdc234bdae7df8a0a4410f13f42.1590534843.git.sathyanarayanan.kuppuswamy@linux.intel.com
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
4 years agoPCI/AER: Remove redundant pci_is_pcie() checks
Kuppuswamy Sathyanarayanan [Tue, 26 May 2020 23:18:25 +0000 (16:18 -0700)]
PCI/AER: Remove redundant pci_is_pcie() checks

AER is a PCIe Extended Capability, so dev->aer_cap will only be set for
PCIe devices.  Remove redundant pci_is_pcie() checks.

Link: https://lore.kernel.org/r/361c622eabe5b845b8092e0bec04a3a2c262cb38.1590534843.git.sathyanarayanan.kuppuswamy@linux.intel.com
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
4 years agoPCI/AER: Remove HEST/FIRMWARE_FIRST parsing for AER ownership
Kuppuswamy Sathyanarayanan [Tue, 26 May 2020 23:18:29 +0000 (16:18 -0700)]
PCI/AER: Remove HEST/FIRMWARE_FIRST parsing for AER ownership

Commit c100beb9ccfb ("PCI/AER: Use only _OSC to determine AER ownership")
removed the use of HEST in determining AER ownership, but the AER driver
still used HEST to verify AER ownership in some of its APIs.

Per the ACPI spec v6.3, sec 18.3.2.4, some HEST table entries contain a
FIRMWARE_FIRST bit, but that bit does not tell us anything about ownership
of the AER capability.

Remove parsing of HEST to look for FIRMWARE_FIRST.

Add pcie_aer_is_native() for the places that need to know whether the OS
owns the AER capability.

[bhelgaas: commit log, reorder patch, remove unused __aer_firmware_first]
Link: https://lore.kernel.org/r/9a37f53a4e6ff4942ff8e18dbb20b00e16c47341.1590534843.git.sathyanarayanan.kuppuswamy@linux.intel.com
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
4 years agoPCI: Rename _DSM constants to align with spec
Krzysztof Wilczyński [Tue, 26 May 2020 21:39:05 +0000 (21:39 +0000)]
PCI: Rename _DSM constants to align with spec

Rename PCI-related _DSM constants to align them with the PCI Firmware Spec,
r3.2, sec 4.6.  No functional change intended.

Link: https://lore.kernel.org/r/20200526213905.2479381-1-kw@linux.com
Signed-off-by: Krzysztof Wilczyński <kw@linux.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
4 years agoPCI: Avoid FLR for AMD Starship USB 3.0
Kevin Buettner [Sun, 24 May 2020 07:35:29 +0000 (00:35 -0700)]
PCI: Avoid FLR for AMD Starship USB 3.0

The AMD Starship USB 3.0 host controller advertises Function Level Reset
support, but it apparently doesn't work.  Add a quirk to prevent use of FLR
on this device.

Without this quirk, when attempting to assign (pass through) an AMD
Starship USB 3.0 host controller to a guest OS, the system becomes
increasingly unresponsive over the course of several minutes, eventually
requiring a hard reset.  Shortly after attempting to start the guest, I see
these messages:

  vfio-pci 0000:05:00.3: not ready 1023ms after FLR; waiting
  vfio-pci 0000:05:00.3: not ready 2047ms after FLR; waiting
  vfio-pci 0000:05:00.3: not ready 4095ms after FLR; waiting
  vfio-pci 0000:05:00.3: not ready 8191ms after FLR; waiting

And then eventually:

  vfio-pci 0000:05:00.3: not ready 65535ms after FLR; giving up
  INFO: NMI handler (perf_event_nmi_handler) took too long to run: 0.000 msecs
  perf: interrupt took too long (642744 > 2500), lowering kernel.perf_event_max_sample_rate to 1000
  INFO: NMI handler (perf_event_nmi_handler) took too long to run: 82.270 msecs
  INFO: NMI handler (perf_event_nmi_handler) took too long to run: 680.608 msecs
  INFO: NMI handler (perf_event_nmi_handler) took too long to run: 100.952 msecs
  ...
  watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [qemu-system-x86:7487]

Tested on a Micro-Star International Co., Ltd. MS-7C59/Creator TRX40
motherboard with an AMD Ryzen Threadripper 3970X.

Link: https://lore.kernel.org/r/20200524003529.598434ff@f31-4.lan
Signed-off-by: Kevin Buettner <kevinb@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
4 years agoPCI: Avoid FLR for AMD Matisse HD Audio & USB 3.0
Marcos Scriven [Wed, 20 May 2020 23:23:30 +0000 (18:23 -0500)]
PCI: Avoid FLR for AMD Matisse HD Audio & USB 3.0

The AMD Matisse HD Audio & USB 3.0 devices advertise Function Level Reset
support, but hang when an FLR is triggered.

To reproduce the problem, attach the device to a VM, then detach and try to
attach again.

Rename the existing quirk_intel_no_flr(), which was not Intel-specific, to
quirk_no_flr(), and apply it to prevent the use of FLR on these AMD
devices.

Link: https://lore.kernel.org/r/CAAri2DpkcuQZYbT6XsALhx2e6vRqPHwtbjHYeiH7MNp4zmt1RA@mail.gmail.com
Signed-off-by: Marcos Scriven <marcos@scriven.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
4 years agox86/PCI: Drop unused xen_register_pirq() gsi_override parameter
Wei Liu [Tue, 28 Apr 2020 15:36:40 +0000 (15:36 +0000)]
x86/PCI: Drop unused xen_register_pirq() gsi_override parameter

All callers of xen_register_pirq() pass -1 (no override) for the
gsi_override parameter.  Remove it and related code.

Link: https://lore.kernel.org/r/20200428153640.76476-1-wei.liu@kernel.org
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
4 years agopcmcia: Use CardBus window names (PCI_CB_BRIDGE_IO_0_WINDOW etc) when freeing
Krzysztof Wilczynski [Wed, 20 May 2020 18:34:11 +0000 (18:34 +0000)]
pcmcia: Use CardBus window names (PCI_CB_BRIDGE_IO_0_WINDOW etc) when freeing

Remove the loop used to free CardBus resources and replace it with
a yenta_free_res() helper used to release bridge resources explicitly.

Link: https://lore.kernel.org/r/20200520183411.1534621-3-kw@linux.com
Signed-off-by: Krzysztof Wilczynski <kw@linux.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Dominik Brodowski <linux@dominikbrodowski.net>
4 years agoPCI: Use bridge window names (PCI_BRIDGE_IO_WINDOW etc)
Krzysztof Wilczynski [Wed, 20 May 2020 18:34:10 +0000 (18:34 +0000)]
PCI: Use bridge window names (PCI_BRIDGE_IO_WINDOW etc)

Use bridge resource definitions instead of using the PCI_BRIDGE_RESOURCES
constant with an integer offeset.

Link: https://lore.kernel.org/r/20200520183411.1534621-2-kw@linux.com
Signed-off-by: Krzysztof Wilczynski <kw@linux.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
4 years agoPCI/PTM: Inherit Switch Downstream Port PTM settings from Upstream Port
Bjorn Helgaas [Thu, 21 May 2020 20:40:07 +0000 (15:40 -0500)]
PCI/PTM: Inherit Switch Downstream Port PTM settings from Upstream Port

Except for Endpoints, we enable PTM at enumeration-time.  Previously we did
not account for the fact that Switch Downstream Ports are not permitted to
have a PTM capability; their PTM behavior is controlled by the Upstream
Port (PCIe r5.0, sec 7.9.16).  Since Downstream Ports don't have a PTM
capability, we did not mark them as "ptm_enabled", which meant that
pci_enable_ptm() on an Endpoint failed because there was no PTM path to it.

Mark Downstream Ports as "ptm_enabled" if their Upstream Port has PTM
enabled.

Fixes: eec097d43100 ("PCI: Add pci_enable_ptm() for drivers to enable PTM on endpoints")
Reported-by: Aditya Paluri <Venkata.AdityaPaluri@synopsys.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
4 years agoPCI: shpchp: Make shpchp_unconfigure_device() void
Krzysztof Wilczynski [Thu, 21 May 2020 19:04:57 +0000 (19:04 +0000)]
PCI: shpchp: Make shpchp_unconfigure_device() void

shpchp_unconfigure_device() always returned 0, so there's no reason for a
return value.  In addition, remove_board() checked the return value for
possible error which is unnecessary.

Convert shpchp_unconfigure_device() to a void function and remove the
return value check.  This addresses the following Coccinelle warning:

  drivers/pci/hotplug/shpchp_pci.c:66:5-7: Unneeded variable: "rc".  Return "0" on line 86

Link: https://lore.kernel.org/r/20200521190457.1066600-1-kw@linux.com
Signed-off-by: Krzysztof Wilczynski <kw@linux.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
4 years agoPCI/switchtec: Correct bool variable type assignment
Krzysztof Wilczynski [Thu, 21 May 2020 20:04:39 +0000 (20:04 +0000)]
PCI/switchtec: Correct bool variable type assignment

Use "true" instead of 1 to initialize "bool use_dma_mrpc".  This resolves
the following Coccinelle warning:

  drivers/pci/switch/switchtec.c:28:12-24: WARNING: Assignment of 0/1 to bool variable

Link: https://lore.kernel.org/r/20200521200439.1076672-1-kw@linux.com
Signed-off-by: Krzysztof Wilczynski <kw@linux.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
4 years agoPCI/PME: Fix kernel-doc of pcie_pme_resume() and pcie_pme_remove()
Jay Fang [Sat, 16 May 2020 07:00:14 +0000 (15:00 +0800)]
PCI/PME: Fix kernel-doc of pcie_pme_resume() and pcie_pme_remove()

Fix kernel-doc of the "srv" parameter to pcie_pme_resume() and
pcie_pme_remove().  Building with W=1 produced these warnings:

  drivers/pci/pcie/pme.c:414: warning: Function parameter or member 'srv' not described in 'pcie_pme_resume'
  drivers/pci/pcie/pme.c:437: warning: Function parameter or member 'srv' not described in 'pcie_pme_remove'

Link: https://lore.kernel.org/r/1589612414-61682-1-git-send-email-f.fangjian@huawei.com
Signed-off-by: Jay Fang <f.fangjian@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
4 years agodt-bindings: PCI: aardvark: Describe new properties
Marek Behún [Thu, 30 Apr 2020 08:06:22 +0000 (10:06 +0200)]
dt-bindings: PCI: aardvark: Describe new properties

Document the possibility to reference a PHY and reset-gpios and to set
max-link-speed property.

Link: https://lore.kernel.org/r/20200430080625.26070-10-pali@kernel.org
Tested-by: Tomasz Maciej Nowak <tmn505@gmail.com>
Signed-off-by: Marek Behún <marek.behun@nic.cz>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Acked-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: devicetree@vger.kernel.org
4 years agoPCI: aardvark: Replace custom macros by standard linux/pci_regs.h macros
Pali Rohár [Thu, 30 Apr 2020 08:06:21 +0000 (10:06 +0200)]
PCI: aardvark: Replace custom macros by standard linux/pci_regs.h macros

PCI-E capability macros are already defined in linux/pci_regs.h.
Remove their reimplementation in pcie-aardvark.

Link: https://lore.kernel.org/r/20200430080625.26070-9-pali@kernel.org
Tested-by: Tomasz Maciej Nowak <tmn505@gmail.com>
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Acked-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
4 years agoPCI: aardvark: Add PHY support
Marek Behún [Thu, 30 Apr 2020 08:06:20 +0000 (10:06 +0200)]
PCI: aardvark: Add PHY support

With recent proposed changes for U-Boot it is possible that bootloader
won't initialize the PHY for this controller (currently the PHY is
initialized regardless whether PCI is used in U-Boot, but with these
proposed changes the PHY is initialized only on request).

Since the mvebu-a3700-comphy driver by Miquèl Raynal supports enabling
PCIe PHY, and since Linux' functionality should be independent on what
bootloader did, add code for enabling generic PHY if found in device OF
node.

The mvebu-a3700-comphy driver does PHY powering via SMC calls to ARM
Trusted Firmware. The corresponding code in ARM Trusted Firmware skips
one register write which U-Boot does not: step 7 ("Enable TX"), see [1].
Instead ARM Trusted Firmware expects PCIe driver to do this step,
probably because the register is in PCIe controller address space,
instead of PHY address space. We therefore add this step into the
advk_pcie_setup_hw function.

[1] https://git.trustedfirmware.org/TF-A/trusted-firmware-a.git/tree/drivers/marvell/comphy/phy-comphy-3700.c?h=v2.3-rc2#n836

Link: https://lore.kernel.org/r/20200430080625.26070-8-pali@kernel.org
Tested-by: Tomasz Maciej Nowak <tmn505@gmail.com>
Signed-off-by: Marek Behún <marek.behun@nic.cz>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Acked-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Cc: Miquèl Raynal <miquel.raynal@bootlin.com>
4 years agoPCI: aardvark: Add FIXME comment for PCIE_CORE_CMD_STATUS_REG access
Pali Rohár [Thu, 30 Apr 2020 08:06:19 +0000 (10:06 +0200)]
PCI: aardvark: Add FIXME comment for PCIE_CORE_CMD_STATUS_REG access

This register is applicable only when the controller is configured for
Endpoint mode, which is not the case for the current version of this
driver.

Attempting to remove this code though caused some ath10k cards to stop
working, so for some unknown reason it is needed here.

This should be investigated and a comment explaining this should be put
before the code, so we add a FIXME comment for now.

Link: https://lore.kernel.org/r/20200430080625.26070-7-pali@kernel.org
Tested-by: Tomasz Maciej Nowak <tmn505@gmail.com>
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
4 years agoPCI: aardvark: Issue PERST via GPIO
Pali Rohár [Thu, 30 Apr 2020 08:06:18 +0000 (10:06 +0200)]
PCI: aardvark: Issue PERST via GPIO

Add support for issuing PERST via GPIO specified in 'reset-gpios'
property (as described in PCI device tree bindings).

Some buggy cards (e.g. Compex WLE900VX or WLE1216) are not detected
after reboot when PERST is not issued during driver initialization.

If bootloader already enabled link training then issuing PERST has no
effect for some buggy cards (e.g. Compex WLE900VX) and these cards are
not detected. We therefore clear the LINK_TRAINING_EN register before.

It was observed that Compex WLE900VX card needs to be in PERST reset
for at least 10ms if bootloader enabled link training.

Tested on Turris MOX.

Link: https://lore.kernel.org/r/20200430080625.26070-6-pali@kernel.org
Tested-by: Tomasz Maciej Nowak <tmn505@gmail.com>
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
4 years agoPCI: aardvark: Improve link training
Marek Behún [Thu, 30 Apr 2020 08:06:17 +0000 (10:06 +0200)]
PCI: aardvark: Improve link training

Currently the aardvark driver trains link in PCIe gen2 mode. This may
cause some buggy gen1 cards (such as Compex WLE900VX) to be unstable or
even not detected. Moreover when ASPM code tries to retrain link second
time, these cards may stop responding and link goes down. If gen1 is
used this does not happen.

Unconditionally forcing gen1 is not a good solution since it may have
performance impact on gen2 cards.

To overcome this, read 'max-link-speed' property (as defined in PCI
device tree bindings) and use this as max gen mode. Then iteratively try
link training at this mode or lower until successful. After successful
link training choose final controller gen based on Negotiated Link Speed
from Link Status register, which should match card speed.

Link: https://lore.kernel.org/r/20200430080625.26070-5-pali@kernel.org
Tested-by: Tomasz Maciej Nowak <tmn505@gmail.com>
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Marek Behún <marek.behun@nic.cz>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Acked-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
4 years agoPCI: of: Zero max-link-speed value is invalid
Pali Rohár [Thu, 30 Apr 2020 08:06:16 +0000 (10:06 +0200)]
PCI: of: Zero max-link-speed value is invalid

Interpret zero value of max-link-speed property as invalid,
as the device tree bindings documentation specifies.

Link: https://lore.kernel.org/r/20200430080625.26070-4-pali@kernel.org
Tested-by: Tomasz Maciej Nowak <tmn505@gmail.com>
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
4 years agoPCI: aardvark: Don't blindly enable ASPM L0s and don't write to read-only register
Pali Rohár [Thu, 30 Apr 2020 08:06:15 +0000 (10:06 +0200)]
PCI: aardvark: Don't blindly enable ASPM L0s and don't write to read-only register

Trying to change Link Status register does not have any effect as this
is a read-only register. Trying to overwrite bits for Negotiated Link
Width does not make sense.

In future proper change of link width can be done via Lane Count Select
bits in PCIe Control 0 register.

Trying to unconditionally enable ASPM L0s via ASPM Control bits in Link
Control register is wrong. There should be at least some detection if
endpoint supports L0s as isn't mandatory.

Moreover ASPM Control bits in Link Control register are controlled by
pcie/aspm.c code which sets it according to system ASPM settings,
immediately after aardvark driver probes. So setting these bits by
aardvark driver has no long running effect.

Remove code which touches ASPM L0s bits from this driver and let
kernel's ASPM implementation to set ASPM state properly.

Some users are reporting issues that this code is problematic for some
Intel wifi cards and removing it fixes them, see e.g.:
https://bugzilla.kernel.org/show_bug.cgi?id=196339

If problems with Intel wifi cards occur even after this commit, then
pcie/aspm.c code could be modified / hooked to not enable ASPM L0s state
for affected problematic cards.

Link: https://lore.kernel.org/r/20200430080625.26070-3-pali@kernel.org
Tested-by: Tomasz Maciej Nowak <tmn505@gmail.com>
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
4 years agoPCI: aardvark: Train link immediately after enabling training
Pali Rohár [Thu, 30 Apr 2020 08:06:14 +0000 (10:06 +0200)]
PCI: aardvark: Train link immediately after enabling training

Adding even 100ms (PCI_PM_D3COLD_WAIT) delay between enabling link
training and starting link training causes detection issues with some
buggy cards (such as Compex WLE900VX).

Move the code which enables link training immediately before the one
which starts link traning.

This fixes detection issues of Compex WLE900VX card on Turris MOX after
cold boot.

Link: https://lore.kernel.org/r/20200430080625.26070-2-pali@kernel.org
Fixes: f4c7d053d7f7 ("PCI: aardvark: Wait for endpoint to be ready...")
Tested-by: Tomasz Maciej Nowak <tmn505@gmail.com>
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
4 years agoPCI/PM: Assume ports without DLL Link Active train links in 100 ms
Mika Westerberg [Thu, 14 May 2020 13:30:43 +0000 (16:30 +0300)]
PCI/PM: Assume ports without DLL Link Active train links in 100 ms

Kai-Heng Feng reported that it takes a long time (> 1 s) to resume
Thunderbolt-connected devices from both runtime suspend and system sleep
(s2idle).

This was because some Downstream Ports that support > 5 GT/s do not also
support Data Link Layer Link Active reporting.  Per PCIe r5.0 sec 6.6.1:

  With a Downstream Port that supports Link speeds greater than 5.0 GT/s,
  software must wait a minimum of 100 ms after Link training completes
  before sending a Configuration Request to the device immediately below
  that Port. Software can determine when Link training completes by polling
  the Data Link Layer Link Active bit or by setting up an associated
  interrupt (see Section 6.7.3.3).

Sec 7.5.3.6 requires such Ports to support DLL Link Active reporting, but
at least the Intel JHL6240 Thunderbolt 3 Bridge [8086:15c0] and the Intel
JHL7540 Thunderbolt 3 Bridge [8086:15ea] do not.

Previously we tried to wait for Link training to complete, but since there
was no DLL Link Active reporting, all we could do was wait the worst-case
1000 ms, then another 100 ms.

Instead of using the supported speeds to determine whether to wait for Link
training, check whether the port supports DLL Link Active reporting.  The
Ports in question do not, so we'll wait only the 100 ms required for Ports
that support Link speeds <= 5 GT/s.

This of course assumes these Ports always train the Link within 100 ms even
if they are operating at > 5 GT/s, which is not required by the spec.

[bhelgaas: commit log, comment]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206837
Link: https://lore.kernel.org/r/20200514133043.27429-1-mika.westerberg@linux.intel.com
Reported-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
4 years agoPCI/PM: Adjust pcie_wait_for_link_delay() for caller delay
Bjorn Helgaas [Fri, 15 May 2020 19:31:16 +0000 (14:31 -0500)]
PCI/PM: Adjust pcie_wait_for_link_delay() for caller delay

The caller of pcie_wait_for_link_delay() specifies the time to wait after
the link becomes active.  When the downstream port doesn't support link
active reporting, obviously we can't tell when the link becomes active, so
we waited the worst-case time (1000 ms) plus 100 ms, ignoring the delay
from the caller.

Instead, wait for 1000 ms + the delay from the caller.

Fixes: 4827d63891b6 ("PCI/PM: Add pcie_wait_for_link_delay()")
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
4 years agox86/PCI: Mark Intel C620 MROMs as having non-compliant BARs
Xiaochun Lee [Fri, 15 May 2020 03:31:07 +0000 (23:31 -0400)]
x86/PCI: Mark Intel C620 MROMs as having non-compliant BARs

The Intel C620 Platform Controller Hub has MROM functions that have non-PCI
registers (undocumented in the public spec) where BAR 0 is supposed to be,
which results in messages like this:

  pci 0000:00:11.0: [Firmware Bug]: reg 0x30: invalid BAR (can't size)

Mark these MROM functions as having non-compliant BARs so we don't try to
probe any of them.  There are no other BARs on these devices.

See the Intel C620 Series Chipset Platform Controller Hub Datasheet,
May 2019, Document Number 336067-007US, sec 2.1, 35.5, 35.6.

[bhelgaas: commit log, add 0xa26d]
Link: https://lore.kernel.org/r/1589513467-17070-1-git-send-email-lixiaochun.2888@163.com
Signed-off-by: Xiaochun Lee <lixc17@lenovo.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
4 years agoPCI: Program MPS for RCiEP devices
Ashok Raj [Fri, 27 Mar 2020 21:16:15 +0000 (14:16 -0700)]
PCI: Program MPS for RCiEP devices

Root Complex Integrated Endpoints (RCiEPs) do not have an upstream bridge,
so pci_configure_mps() previously ignored them, which may result in reduced
performance.

Instead, program the Max_Payload_Size of RCiEPs to the maximum supported
value (unless it is limited for the PCIE_BUS_PEER2PEER case).  This also
affects the subsequent programming of Max_Read_Request_Size because Linux
programs MRRS based on the MPS value.

Fixes: 9dae3a97297f ("PCI: Move MPS configuration check to pci_configure_device()")
Link: https://lore.kernel.org/r/1585343775-4019-1-git-send-email-ashok.raj@intel.com
Tested-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
4 years agoPCI: Fix pci_host_bridge struct device release/free handling
Rob Herring [Wed, 13 May 2020 22:38:59 +0000 (17:38 -0500)]
PCI: Fix pci_host_bridge struct device release/free handling

The PCI code has several paths where the struct pci_host_bridge is freed
directly. This is wrong because it contains a struct device which is
refcounted and should be freed using put_device(). This can result in
use-after-free errors. I think this problem has existed since 2012 with
commit 7b5436635800 ("PCI: add generic device into pci_host_bridge
struct"). It generally hasn't mattered as most host bridge drivers are
still built-in and can't unbind.

The problem is a struct device should never be freed directly once
device_initialize() is called and a ref is held, but that doesn't happen
until pci_register_host_bridge(). There's then a window between allocating
the host bridge and pci_register_host_bridge() where kfree should be used.
This is fragile and requires callers to do the right thing. To fix this, we
need to split device_register() into device_initialize() and device_add()
calls, so that the host bridge struct is always freed by using a
put_device().

devm_pci_alloc_host_bridge() is using devm_kzalloc() to allocate struct
pci_host_bridge which will be freed directly. Instead, we can use a custom
devres action to call put_device().

Link: https://lore.kernel.org/r/20200513223859.11295-2-robh@kernel.org
Reported-by: Anders Roxell <anders.roxell@linaro.org>
Tested-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
4 years agoPCI: Fix pci_register_host_bridge() device_register() error handling
Rob Herring [Wed, 13 May 2020 22:38:58 +0000 (17:38 -0500)]
PCI: Fix pci_register_host_bridge() device_register() error handling

If device_register() has an error, we should bail out of
pci_register_host_bridge() rather than continuing on.

Fixes: 37d6a0a6f470 ("PCI: Add pci_register_host_bridge() interface")
Link: https://lore.kernel.org/r/20200513223859.11295-1-robh@kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
4 years agoPCI: Unify pcie_find_root_port() and pci_find_pcie_root_port()
Yicong Yang [Sat, 9 May 2020 10:19:28 +0000 (18:19 +0800)]
PCI: Unify pcie_find_root_port() and pci_find_pcie_root_port()

Previously we used pcie_find_root_port() to find a Root Port from a PCIe
device and pci_find_pcie_root_port() to find a Root Port from a
Conventional PCI device.

Unify the two functions and use pcie_find_root_port() to find a Root Port
from either a Conventional PCI device or a PCIe device.  Then there is no
need to distinguish the type of the device.

Link: https://lore.kernel.org/r/1589019568-5216-1-git-send-email-yangyicong@hisilicon.com
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Kalle Valo <kvalo@codeaurora.org> # wireless
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> # thunderbolt
4 years agoPCI: Replace zero-length array with flexible-array
Gustavo A. R. Silva [Thu, 7 May 2020 19:05:44 +0000 (14:05 -0500)]
PCI: Replace zero-length array with flexible-array

The current codebase makes use of the zero-length array language extension
to the C90 standard, but the preferred mechanism to declare variable-length
types such as these as a flexible array member [1][2], introduced in C99:

  struct foo {
    int stuff;
    struct boo array[];
  };

By making use of the mechanism above, we will get a compiler warning in
case the flexible array does not occur last in the structure, which will
help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.

Also, notice that dynamic memory allocations won't be affected by this
change:

  Flexible array members have incomplete type, and so the sizeof operator
  may not be applied. As a quirk of the original implementation of
  zero-length arrays, sizeof evaluates to zero. [1]

sizeof(flexible-array-member) triggers a warning because flexible array
members have incomplete type [1]. There are some instances of code in which
the sizeof() operator is being incorrectly/erroneously applied to
zero-length arrays, and the result is zero. Such instances may be hiding
some bugs. So, this work (flexible-array member conversions) will also help
to get completely rid of those sorts of issues.

This issue was found with the help of Coccinelle.

[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour")

Link: https://lore.kernel.org/r/20200507190544.GA15633@embeddedor
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
4 years agoPCI: Check for platform_get_irq() failure consistently
Aman Sharma [Wed, 11 Mar 2020 19:19:02 +0000 (00:49 +0530)]
PCI: Check for platform_get_irq() failure consistently

The platform_get_irq*() interfaces return either a negative error number or
a valid IRQ.  0 is not a valid return value, so check for "< 0" to detect
failure as recommended by the function documentation.

On failure, return the error number from platform_get_irq*() instead of
making up a new one.

Link: https://lore.kernel.org/r/cover.1583952275.git.amanharitsh123@gmail.com
[bhelgaas: commit log, squash into one patch]
Signed-off-by: Aman Sharma <amanharitsh123@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Cc: Richard Zhu <hongxing.zhu@nxp.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Thierry Reding <thierry.reding@gmail.com>
Cc: Karthikeyan Mitran <m.karthikeyan@mobiveil.co.in>
Cc: Hou Zhiqiang <Zhiqiang.Hou@nxp.com>
Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Cc: Ryder Lee <ryder.lee@mediatek.com>
Cc: Marc Gonzalez <marc.w.gonzalez@free.fr>
4 years agodriver core: platform: Clarify that IRQ 0 is invalid
Bjorn Helgaas [Mon, 16 Mar 2020 21:43:38 +0000 (16:43 -0500)]
driver core: platform: Clarify that IRQ 0 is invalid

These interfaces return a negative error number or an IRQ:

  platform_get_irq()
  platform_get_irq_optional()
  platform_get_irq_byname()
  platform_get_irq_byname_optional()

The function comments suggest checking for error like this:

  irq = platform_get_irq(...);
  if (irq < 0)
    return irq;

which is what most callers (~900 of 1400) do, so it's implicit that IRQ 0
is invalid.  But some callers check for "irq <= 0", and it's not obvious
from the source that we never return an IRQ 0.

Make this more explicit by updating the comments to say that an IRQ number
is always non-zero and adding a WARN() if we ever do return zero.  If we do
return IRQ 0, it likely indicates a bug in the arch-specific parts of
platform_get_irq().

Relevant prior discussion at [1, 2].

[1] https://lore.kernel.org/r/Pine.LNX.4.64.0701250940220.25027@woody.linux-foundation.org/
[2] https://lore.kernel.org/r/Pine.LNX.4.64.0701252029570.25027@woody.linux-foundation.org/
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
4 years agoPCI: Avoid Pericom USB controller OHCI/EHCI PME# defect
Kai-Heng Feng [Fri, 8 May 2020 06:53:41 +0000 (14:53 +0800)]
PCI: Avoid Pericom USB controller OHCI/EHCI PME# defect

Both Pericom OHCI and EHCI devices advertise PME# support from all power
states:

  06:00.0 USB controller [0c03]: Pericom Semiconductor PI7C9X442SL USB OHCI Controller [12d8:400e] (rev 01) (prog-if 10 [OHCI])
    Subsystem: Pericom Semiconductor PI7C9X442SL USB OHCI Controller [12d8:400e]
    Capabilities: [80] Power Management version 3
      Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA PME(D0+,D1+,D2+,D3hot+,D3cold+)

  06:00.2 USB controller [0c03]: Pericom Semiconductor PI7C9X442SL USB EHCI Controller [12d8:400f] (rev 01) (prog-if 20 [EHCI])
    Subsystem: Pericom Semiconductor PI7C9X442SL USB EHCI Controller [12d8:400f]
    Capabilities: [80] Power Management version 3
      Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA PME(D0+,D1+,D2+,D3hot+,D3cold+)

But testing shows that it's unreliable: there is a 20% chance PME# won't be
asserted when a USB device is plugged.

Remove PME support for both devices to make USB plugging work reliably.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=205981
Link: https://lore.kernel.org/r/20200508065343.32751-2-kai.heng.feng@canonical.com
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
4 years agoserial: 8250_pci: Move Pericom IDs to pci_ids.h
Kai-Heng Feng [Fri, 8 May 2020 06:53:40 +0000 (14:53 +0800)]
serial: 8250_pci: Move Pericom IDs to pci_ids.h

Move the IDs to pci_ids.h so it can be used by next patch.

Link: https://lore.kernel.org/r/20200508065343.32751-1-kai.heng.feng@canonical.com
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: stable@vger.kernel.org
4 years agoPCI/ASPM: Allow ASPM on links to PCIe-to-PCI/PCI-X Bridges
Kai-Heng Feng [Tue, 5 May 2020 17:34:21 +0000 (01:34 +0800)]
PCI/ASPM: Allow ASPM on links to PCIe-to-PCI/PCI-X Bridges

7d715a6c1ae5 ("PCI: add PCI Express ASPM support") added the ability for
Linux to enable ASPM, but for some undocumented reason, it didn't enable
ASPM on links where the downstream component is a PCIe-to-PCI/PCI-X Bridge.

Remove this exclusion so we can enable ASPM on these links.

The Dell OptiPlex 7080 mentioned in the bugzilla has a TI XIO2001
PCIe-to-PCI Bridge.  Enabling ASPM on the link leading to it allows the
Intel SoC to enter deeper Package C-states, which is a significant power
savings.

[bhelgaas: commit log]
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=207571
Link: https://lore.kernel.org/r/20200505173423.26968-1-kai.heng.feng@canonical.com
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
4 years agoPCI: Allow pci_resize_resource() for devices on root bus
Ard Biesheuvel [Tue, 21 Apr 2020 16:22:56 +0000 (18:22 +0200)]
PCI: Allow pci_resize_resource() for devices on root bus

When resizing a BAR, pci_reassign_bridge_resources() is invoked to bring
the bridge windows of parent bridges in line with the new BAR assignment.

This assumes the device whose BAR is being resized lives on a subordinate
bus, but this is not necessarily the case. A device may live on the root
bus, in which case dev->bus->self is NULL, and passing a NULL pci_dev
pointer to pci_reassign_bridge_resources() will cause it to crash.

So let's make the call to pci_reassign_bridge_resources() conditional on
whether dev->bus->self is non-NULL in the first place.

Fixes: 8bb705e3e79d84e7 ("PCI: Add pci_resize_resource() for resizing BARs")
Link: https://lore.kernel.org/r/20200421162256.26887-1-ardb@kernel.org
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
4 years agoPCI/AER: Use only _OSC to determine AER ownership
Alexandru Gagniuc [Mon, 27 Apr 2020 23:25:13 +0000 (18:25 -0500)]
PCI/AER: Use only _OSC to determine AER ownership

Per the PCI Firmware spec, r3.2, sec 4.5.1, the OS can request control of
AER via bit 3 of the _OSC Control Field.  In the returned value of the
Control Field:

  The firmware sets [bit 3] to 1 to grant control over PCI Express Advanced
  Error Reporting.  ...  after control is transferred to the operating
  system, firmware must not modify the Advanced Error Reporting Capability.
  If control of this feature was requested and denied or was not requested,
  firmware returns this bit set to 0.

Previously the pci_root driver looked at the HEST FIRMWARE_FIRST bit to
determine whether to request ownership of the AER Capability.  This was
based on ACPI spec v6.3, sec 18.3.2.4, and similar sections, which say
things like:

  Bit [0] - FIRMWARE_FIRST: If set, indicates that system firmware will
            handle errors from this source first.

  Bit [1] - GLOBAL: If set, indicates that the settings contained in this
            structure apply globally to all PCI Express Devices.

These ACPI references don't say anything about ownership of the AER
Capability.

Remove use of the FIRMWARE_FIRST bit and rely only on the _OSC bit to
determine whether we have control of the AER Capability.

Link: https://lore.kernel.org/r/20181115231605.24352-1-mr.nuke.me@gmail.com/
Link: https://lore.kernel.org/r/20190326172343.28946-1-mr.nuke.me@gmail.com/
Link: https://lore.kernel.org/r/67af2931705bed9a588b5a39d369cb70b9942190.1587925636.git.sathyanarayanan.kuppuswamy@linux.intel.com
[bhelgaas: commit log, note: Alex posted this identical patch 18 months
ago, and I failed to apply it then, so I made him the author, added links
to his postings, and added his Signed-off-by]
Signed-off-by: Alexandru Gagniuc <mr.nuke.me@gmail.com>
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jon Derrick <jonathan.derrick@intel.com>
4 years agoPCI/EDR: Log only ACPI_NOTIFY_DISCONNECT_RECOVER events
Kuppuswamy Sathyanarayanan [Thu, 16 Apr 2020 00:38:32 +0000 (17:38 -0700)]
PCI/EDR: Log only ACPI_NOTIFY_DISCONNECT_RECOVER events

Previously we logged *all* ACPI SYSTEM-level events, which may include lots
of non-EDR events.  Move the message so we only log those related to EDR.

Link: https://lore.kernel.org/r/01afb4e01efbe455de0c445bef6cf3ffc59340d2.1586996350.git.sathyanarayanan.kuppuswamy@linux.intel.com
[bhelgaas: drop the pci_dbg() of all events since ACPI can log those
already]
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
4 years agoPCI: Use of_node_name_eq() for node name comparisons
Rob Herring [Thu, 16 Apr 2020 21:51:14 +0000 (16:51 -0500)]
PCI: Use of_node_name_eq() for node name comparisons

Convert string compares of DT node names to use of_node_name_eq() helper
instead. This removes direct access to the node name pointer.

Link: https://lore.kernel.org/r/20200416215114.7715-1-robh@kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
4 years agoPCI/AER: Don't select CONFIG_PCIEAER by default
Bjorn Helgaas [Wed, 8 Apr 2020 23:13:34 +0000 (18:13 -0500)]
PCI/AER: Don't select CONFIG_PCIEAER by default

PCIe Advanced Error Reporting (AER) is optional and there's no need for it
to be selected by default.

Remove the "default y" for CONFIG_PCIEAER.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Cc: Russell Currey <ruscur@russell.cc>
Cc: Sam Bobroff <sbobroff@linux.ibm.com>
Cc: Oliver O'Halloran <oohall@gmail.com>
4 years agoPCI: keystone: Don't select CONFIG_PCI_KEYSTONE_HOST by default
Bjorn Helgaas [Wed, 8 Apr 2020 23:08:53 +0000 (18:08 -0500)]
PCI: keystone: Don't select CONFIG_PCI_KEYSTONE_HOST by default

Drivers should not be selected by default because that bloats the kernel
for people who don't need them.

Remove the "default y" for CONFIG_PCI_KEYSTONE_HOST.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Murali Karicheri <m-karicheri2@ti.com>
Cc: linux-arm-kernel@lists.infradead.org
4 years agoPCI: dra7xx: Don't select CONFIG_PCI_DRA7XX_HOST by default
Bjorn Helgaas [Wed, 8 Apr 2020 23:10:54 +0000 (18:10 -0500)]
PCI: dra7xx: Don't select CONFIG_PCI_DRA7XX_HOST by default

Drivers should not be selected by default because that bloats the kernel
for people who don't need them.

Enable CONFIG_PCI_DRA7XX_HOST by default only if SOC_DRA7XX.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Kishon Vijay Abraham I <kishon@ti.com>
Cc: linux-omap@vger.kernel.org
4 years agoPCI/PM: Call .bridge_d3() hook only if non-NULL
Bjorn Helgaas [Tue, 7 Apr 2020 23:23:15 +0000 (18:23 -0500)]
PCI/PM: Call .bridge_d3() hook only if non-NULL

26ad34d510a8 ("PCI / ACPI: Whitelist D3 for more PCIe hotplug ports") added
the struct pci_platform_pm_ops.bridge_d3() function pointer and
platform_pci_bridge_d3() to use it.

The .bridge_d3() op is implemented by acpi_pci_platform_pm, but not by
mid_pci_platform_pm.  We don't expect platform_pci_bridge_d3() to be called
on Intel MID platforms, but nothing in the code itself would prevent that.

Check the .bridge_d3() pointer for NULL before calling it.

Fixes: 26ad34d510a8 ("PCI / ACPI: Whitelist D3 for more PCIe hotplug ports")
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
4 years agoPCI/P2PDMA: Add AMD Zen Raven and Renoir Root Ports to whitelist
Alex Deucher [Mon, 6 Apr 2020 19:42:01 +0000 (15:42 -0400)]
PCI/P2PDMA: Add AMD Zen Raven and Renoir Root Ports to whitelist

According to the hardware architect, pre-Zen parts support p2p writes and
Zen parts support both p2p reads and writes.

Add entries for Zen parts Raven (0x15d0) and Renoir (0x1630).

Link: https://lore.kernel.org/r/20200406194201.846411-1-alexander.deucher@amd.com
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
4 years agoPCI: pciehp: Remove unused EMI() and HP_SUPR_RM() macros
Ani Sinha [Tue, 21 Apr 2020 03:27:50 +0000 (08:57 +0530)]
PCI: pciehp: Remove unused EMI() and HP_SUPR_RM() macros

EMI() and HP_SUPR_RM() are unused, so remove them.

Link: https://lore.kernel.org/r/1587439673-39652-1-git-send-email-ani@anisinha.ca
Signed-off-by: Ani Sinha <ani@anisinha.ca>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
4 years agoLinux 5.7-rc1 v5.7-rc1
Linus Torvalds [Sun, 12 Apr 2020 19:35:55 +0000 (12:35 -0700)]
Linux 5.7-rc1

4 years agoMAINTAINERS: sort field names for all entries
Linus Torvalds [Sun, 12 Apr 2020 18:04:58 +0000 (11:04 -0700)]
MAINTAINERS: sort field names for all entries

This sorts the actual field names too, potentially causing even more
chaos and confusion at merge time if you have edited the MAINTAINERS
file.  But the end result is a more consistent layout, and hopefully
it's a one-time pain minimized by doing this just before the -rc1
release.

This was entirely scripted:

  ./scripts/parse-maintainers.pl --input=MAINTAINERS --output=MAINTAINERS --order

Requested-by: Joe Perches <joe@perches.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agoMAINTAINERS: sort entries by entry name
Linus Torvalds [Sun, 12 Apr 2020 18:03:52 +0000 (11:03 -0700)]
MAINTAINERS: sort entries by entry name

They are all supposed to be sorted, but people who add new entries don't
always know the alphabet.  Plus sometimes the entry names get edited,
and people don't then re-order the entry.

Let's see how painful this will be for merging purposes (the MAINTAINERS
file is often edited in various different trees), but Joe claims there's
relatively few patches in -next that touch this, and doing it just
before -rc1 is likely the best time.  Fingers crossed.

This was scripted with

  /scripts/parse-maintainers.pl --input=MAINTAINERS --output=MAINTAINERS

but then I also ended up manually upper-casing a few entry names that
stood out when looking at the end result.

Requested-by: Joe Perches <joe@perches.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agoMerge tag 'x86-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 12 Apr 2020 17:17:16 +0000 (10:17 -0700)]
Merge tag 'x86-urgent-2020-04-12' of git://git./linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:
 "A set of three patches to fix the fallout of the newly added split
  lock detection feature.

  It addressed the case where a KVM guest triggers a split lock #AC and
  KVM reinjects it into the guest which is not prepared to handle it.

  Add proper sanity checks which prevent the unconditional injection
  into the guest and handles the #AC on the host side in the same way as
  user space detections are handled. Depending on the detection mode it
  either warns and disables detection for the task or kills the task if
  the mode is set to fatal"

* tag 'x86-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  KVM: VMX: Extend VMXs #AC interceptor to handle split lock #AC in guest
  KVM: x86: Emulate split-lock access as a write in emulator
  x86/split_lock: Provide handle_guest_split_lock()

4 years agoMerge tag 'timers-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 12 Apr 2020 17:13:14 +0000 (10:13 -0700)]
Merge tag 'timers-urgent-2020-04-12' of git://git./linux/kernel/git/tip/tip

Pull time(keeping) updates from Thomas Gleixner:

 - Fix the time_for_children symlink in /proc/$PID/ so it properly
   reflects that it part of the 'time' namespace

 - Add the missing userns limit for the allowed number of time
   namespaces, which was half defined but the actual array member was
   not added. This went unnoticed as the array has an exessive empty
   member at the end but introduced a user visible regression as the
   output was corrupted.

 - Prevent further silent ucount corruption by adding a BUILD_BUG_ON()
   to catch half updated data.

* tag 'timers-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  ucount: Make sure ucounts in /proc/sys/user don't regress again
  time/namespace: Add max_time_namespaces ucount
  time/namespace: Fix time_for_children symlink

4 years agoMerge tag 'sched-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 12 Apr 2020 17:09:19 +0000 (10:09 -0700)]
Merge tag 'sched-urgent-2020-04-12' of git://git./linux/kernel/git/tip/tip

Pull scheduler fixes/updates from Thomas Gleixner:

 - Deduplicate the average computations in the scheduler core and the
   fair class code.

 - Fix a raise between runtime distribution and assignement which can
   cause exceeding the quota by up to 70%.

 - Prevent negative results in the imbalanace calculation

 - Remove a stale warning in the workqueue code which can be triggered
   since the call site was moved out of preempt disabled code. It's a
   false positive.

 - Deduplicate the print macros for procfs

 - Add the ucmap values to the SCHED_DEBUG procfs output for completness

* tag 'sched-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/debug: Add task uclamp values to SCHED_DEBUG procfs
  sched/debug: Factor out printing formats into common macros
  sched/debug: Remove redundant macro define
  sched/core: Remove unused rq::last_load_update_tick
  workqueue: Remove the warning in wq_worker_sleeping()
  sched/fair: Fix negative imbalance in imbalance calculation
  sched/fair: Fix race between runtime distribution and assignment
  sched/fair: Align rq->avg_idle and rq->avg_scan_cost

4 years agoMerge tag 'perf-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 12 Apr 2020 17:05:24 +0000 (10:05 -0700)]
Merge tag 'perf-urgent-2020-04-12' of git://git./linux/kernel/git/tip/tip

Pull perf fixes from Thomas Gleixner:
 "Three fixes/updates for perf:

   - Fix the perf event cgroup tracking which tries to track the cgroup
     even for disabled events.

   - Add Ice Lake server support for uncore events

   - Disable pagefaults when retrieving the physical address in the
     sampling code"

* tag 'perf-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/core: Disable page faults when getting phys address
  perf/x86/intel/uncore: Add Ice Lake server uncore support
  perf/cgroup: Correct indirection in perf_less_group_idx()
  perf/core: Fix event cgroup tracking

4 years agoMerge tag 'locking-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 12 Apr 2020 16:47:10 +0000 (09:47 -0700)]
Merge tag 'locking-urgent-2020-04-12' of git://git./linux/kernel/git/tip/tip

Pull locking fixes from Thomas Gleixner:
 "Three small fixes/updates for the locking core code:

   - Plug a task struct reference leak in the percpu rswem
     implementation.

   - Document the refcount interaction with PID_MAX_LIMIT

   - Improve the 'invalid wait context' data dump in lockdep so it
     contains all information which is required to decode the problem"

* tag 'locking-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/lockdep: Improve 'invalid wait context' splat
  locking/refcount: Document interaction with PID_MAX_LIMIT
  locking/percpu-rwsem: Fix a task_struct refcount

4 years agoMerge tag '5.7-rc-smb3-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6
Linus Torvalds [Sun, 12 Apr 2020 16:41:01 +0000 (09:41 -0700)]
Merge tag '5.7-rc-smb3-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs fixes from Steve French:
 "Ten cifs/smb fixes:

   - five RDMA (smbdirect) related fixes

   - add experimental support for swap over SMB3 mounts

   - also a fix which improves performance of signed connections"

* tag '5.7-rc-smb3-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6:
  smb3: enable swap on SMB3 mounts
  smb3: change noisy error message to FYI
  smb3: smbdirect support can be configured by default
  cifs: smbd: Do not schedule work to send immediate packet on every receive
  cifs: smbd: Properly process errors on ib_post_send
  cifs: Allocate crypto structures on the fly for calculating signatures of incoming packets
  cifs: smbd: Update receive credits before sending and deal with credits roll back on failure before sending
  cifs: smbd: Check send queue size before posting a send
  cifs: smbd: Merge code to track pending packets
  cifs: ignore cached share root handle closing errors

4 years agoMerge tag 'nfs-for-5.7-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Linus Torvalds [Sun, 12 Apr 2020 16:39:47 +0000 (09:39 -0700)]
Merge tag 'nfs-for-5.7-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfix from Trond Myklebust:
 "Fix an RCU read lock leakage in pnfs_alloc_ds_commits_list()"

* tag 'nfs-for-5.7-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  pNFS: Fix RCU lock leakage

4 years agoMerge tag 'nios2-v5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan...
Linus Torvalds [Sat, 11 Apr 2020 18:38:44 +0000 (11:38 -0700)]
Merge tag 'nios2-v5.7-rc1' of git://git./linux/kernel/git/lftan/nios2

Pull nios2 updates from Ley Foon Tan:

 - Remove nios2-dev@lists.rocketboards.org from MAINTAINERS

 - remove 'resetvalue' property

 - rename 'altr,gpio-bank-width' -> 'altr,ngpio'

 - enable the common clk subsystem on Nios2

* tag 'nios2-v5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2:
  MAINTAINERS: Remove nios2-dev@lists.rocketboards.org
  arch: nios2: remove 'resetvalue' property
  arch: nios2: rename 'altr,gpio-bank-width' -> 'altr,ngpio'
  arch: nios2: Enable the common clk subsystem on Nios2

4 years agoMerge tag 'dma-mapping-5.7-1' of git://git.infradead.org/users/hch/dma-mapping
Linus Torvalds [Sat, 11 Apr 2020 18:34:36 +0000 (11:34 -0700)]
Merge tag 'dma-mapping-5.7-1' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping fixes from Christoph Hellwig:

 - fix an integer truncation in dma_direct_get_required_mask
   (Kishon Vijay Abraham)

 - fix the display of dma mapping types (Grygorii Strashko)

* tag 'dma-mapping-5.7-1' of git://git.infradead.org/users/hch/dma-mapping:
  dma-debug: fix displaying of dma allocation type
  dma-direct: fix data truncation in dma_direct_get_required_mask()

4 years agoMerge tag 'kbuild-v5.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy...
Linus Torvalds [Sat, 11 Apr 2020 16:46:12 +0000 (09:46 -0700)]
Merge tag 'kbuild-v5.7-2' of git://git./linux/kernel/git/masahiroy/linux-kbuild

Pull more Kbuild updates from Masahiro Yamada:

 - raise minimum supported binutils version to 2.23

 - remove old CONFIG_AS_* macros that we know binutils >= 2.23 supports

 - move remaining CONFIG_AS_* tests to Kconfig from Makefile

 - enable -Wtautological-compare warnings to catch more issues

 - do not support GCC plugins for GCC <= 4.7

 - fix various breakages of 'make xconfig'

 - include the linker version used for linking the kernel into
   LINUX_COMPILER, which is used for the banner, and also exposed to
   /proc/version

 - link lib-y objects to vmlinux forcibly when CONFIG_MODULES=y, which
   allows us to remove the lib-ksyms.o workaround, and to solve the last
   known issue of the LLVM linker

 - add dummy tools in scripts/dummy-tools/ to enable all compiler tests
   in Kconfig, which will be useful for distro maintainers

 - support the single switch, LLVM=1 to use Clang and all LLVM utilities
   instead of GCC and Binutils.

 - support LLVM_IAS=1 to enable the integrated assembler, which is still
   experimental

* tag 'kbuild-v5.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (36 commits)
  kbuild: fix comment about missing include guard detection
  kbuild: support LLVM=1 to switch the default tools to Clang/LLVM
  kbuild: replace AS=clang with LLVM_IAS=1
  kbuild: add dummy toolchains to enable all cc-option etc. in Kconfig
  kbuild: link lib-y objects to vmlinux forcibly when CONFIG_MODULES=y
  MIPS: fw: arc: add __weak to prom_meminit and prom_free_prom_memory
  kbuild: remove -I$(srctree)/tools/include from scripts/Makefile
  kbuild: do not pass $(KBUILD_CFLAGS) to scripts/mkcompile_h
  Documentation/llvm: fix the name of llvm-size
  kbuild: mkcompile_h: Include $LD version in /proc/version
  kconfig: qconf: Fix a few alignment issues
  kconfig: qconf: remove some old bogus TODOs
  kconfig: qconf: fix support for the split view mode
  kconfig: qconf: fix the content of the main widget
  kconfig: qconf: Change title for the item window
  kconfig: qconf: clean deprecated warnings
  gcc-plugins: drop support for GCC <= 4.7
  kbuild: Enable -Wtautological-compare
  x86: update AS_* macros to binutils >=2.23, supporting ADX and AVX2
  crypto: x86 - clean up poly1305-x86_64-cryptogams.S by 'make clean'
  ...

4 years agomailmap: Add Sedat Dilek (replacement for expired email address)
Sedat Dilek [Sat, 11 Apr 2020 13:29:43 +0000 (15:29 +0200)]
mailmap: Add Sedat Dilek (replacement for expired email address)

I do not longer work for credativ Germany.

Please, use my private email address instead.

This is for the case when people want to CC me on
patches sent from my old business email address.

Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agopNFS: Fix RCU lock leakage
Trond Myklebust [Sat, 11 Apr 2020 15:37:18 +0000 (11:37 -0400)]
pNFS: Fix RCU lock leakage

Another brown paper bag moment. pnfs_alloc_ds_commits_list() is leaking
the RCU lock.

Fixes: a9901899b649 ("pNFS: Add infrastructure for cleaning up per-layout commit structures")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
4 years agoKVM: VMX: Extend VMXs #AC interceptor to handle split lock #AC in guest
Xiaoyao Li [Fri, 10 Apr 2020 11:54:02 +0000 (13:54 +0200)]
KVM: VMX: Extend VMXs #AC interceptor to handle split lock #AC in guest

Two types of #AC can be generated in Intel CPUs:
 1. legacy alignment check #AC
 2. split lock #AC

Reflect #AC back into the guest if the guest has legacy alignment checks
enabled or if split lock detection is disabled.

If the #AC is not a legacy one and split lock detection is enabled, then
invoke handle_guest_split_lock() which will either warn and disable split
lock detection for this task or force SIGBUS on it.

[ tglx: Switch it to handle_guest_split_lock() and rename the misnamed
  helper function. ]

Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lkml.kernel.org/r/20200410115517.176308876@linutronix.de
4 years agoKVM: x86: Emulate split-lock access as a write in emulator
Xiaoyao Li [Fri, 10 Apr 2020 11:54:01 +0000 (13:54 +0200)]
KVM: x86: Emulate split-lock access as a write in emulator

Emulate split-lock accesses as writes if split lock detection is on
to avoid #AC during emulation, which will result in a panic(). This
should never occur for a well-behaved guest, but a malicious guest can
manipulate the TLB to trigger emulation of a locked instruction[1].

More discussion can be found at [2][3].

[1] https://lkml.kernel.org/r/8c5b11c9-58df-38e7-a514-dc12d687b198@redhat.com
[2] https://lkml.kernel.org/r/20200131200134.GD18946@linux.intel.com
[3] https://lkml.kernel.org/r/20200227001117.GX9940@linux.intel.com

Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lkml.kernel.org/r/20200410115517.084300242@linutronix.de
4 years agox86/split_lock: Provide handle_guest_split_lock()
Thomas Gleixner [Fri, 10 Apr 2020 11:54:00 +0000 (13:54 +0200)]
x86/split_lock: Provide handle_guest_split_lock()

Without at least minimal handling for split lock detection induced #AC,
VMX will just run into the same problem as the VMWare hypervisor, which
was reported by Kenneth.

It will inject the #AC blindly into the guest whether the guest is
prepared or not.

Provide a function for guest mode which acts depending on the host
SLD mode. If mode == sld_warn, treat it like user space, i.e. emit a
warning, disable SLD and mark the task accordingly. Otherwise force
SIGBUS.

 [ bp: Add a !CPU_SUP_INTEL stub for handle_guest_split_lock(). ]

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lkml.kernel.org/r/20200410115516.978037132@linutronix.de
Link: https://lkml.kernel.org/r/20200402123258.895628824@linutronix.de
4 years agokbuild: fix comment about missing include guard detection
Masahiro Yamada [Wed, 8 Apr 2020 18:29:19 +0000 (03:29 +0900)]
kbuild: fix comment about missing include guard detection

The keyword here is 'twice' to explain the trick.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
4 years agoMerge branch 'akpm' (patches from Andrew)
Linus Torvalds [Sat, 11 Apr 2020 00:57:48 +0000 (17:57 -0700)]
Merge branch 'akpm' (patches from Andrew)

Merge yet more updates from Andrew Morton:

 - Almost all of the rest of MM (memcg, slab-generic, slab, pagealloc,
   gup, hugetlb, pagemap, memremap)

 - Various other things (hfs, ocfs2, kmod, misc, seqfile)

* akpm: (34 commits)
  ipc/util.c: sysvipc_find_ipc() should increase position index
  kernel/gcov/fs.c: gcov_seq_next() should increase position index
  fs/seq_file.c: seq_read(): add info message about buggy .next functions
  drivers/dma/tegra20-apb-dma.c: fix platform_get_irq.cocci warnings
  change email address for Pali Rohár
  selftests: kmod: test disabling module autoloading
  selftests: kmod: fix handling test numbers above 9
  docs: admin-guide: document the kernel.modprobe sysctl
  fs/filesystems.c: downgrade user-reachable WARN_ONCE() to pr_warn_once()
  kmod: make request_module() return an error when autoloading is disabled
  mm/memremap: set caching mode for PCI P2PDMA memory to WC
  mm/memory_hotplug: add pgprot_t to mhp_params
  powerpc/mm: thread pgprot_t through create_section_mapping()
  x86/mm: introduce __set_memory_prot()
  x86/mm: thread pgprot_t through init_memory_mapping()
  mm/memory_hotplug: rename mhp_restrictions to mhp_params
  mm/memory_hotplug: drop the flags field from struct mhp_restrictions
  mm/special: create generic fallbacks for pte_special() and pte_mkspecial()
  mm/vma: introduce VM_ACCESS_FLAGS
  mm/vma: define a default value for VM_DATA_DEFAULT_FLAGS
  ...

4 years agoMerge tag 'docs-5.7-2' of git://git.lwn.net/linux
Linus Torvalds [Sat, 11 Apr 2020 00:53:43 +0000 (17:53 -0700)]
Merge tag 'docs-5.7-2' of git://git.lwn.net/linux

Pull Documentation fixes from Jonathan Corbet:
 "A handful of late-arriving fixes for the documentation tree"

* tag 'docs-5.7-2' of git://git.lwn.net/linux:
  Documentation: android: binderfs: add 'stats' mount option
  Documentation: driver-api/usb/writing_usb_driver.rst Updates documentation links
  docs: driver-api: address duplicate label warning
  Documentation: sysrq: fix RST formatting
  docs: kernel-parameters.txt: Fix broken references
  docs: kernel-parameters.txt: Remove nompx
  docs: filesystems: fix typo in qnx6.rst

4 years agoMerge tag 'for-linus-5.7-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubca...
Linus Torvalds [Sat, 11 Apr 2020 00:50:01 +0000 (17:50 -0700)]
Merge tag 'for-linus-5.7-ofs1' of git://git./linux/kernel/git/hubcap/linux

Pull orangefs updates from Mike Marshall:
 "A fix and two cleanups.

  Fix:

   - Christoph Hellwig noticed that some logic I added to
     orangefs_file_read_iter introduced a race condition, so he sent a
     reversion patch. I had to modify his patch since reverting at this
     point broke Orangefs.

  Cleanups:

   - Christoph Hellwig noticed that we were doing some unnecessary work
     in orangefs_flush, so he sent in a patch that removed the un-needed
     code.

   - Al Viro told me he had trouble building Orangefs. Orangefs should
     be easy to build, even for Al :-).

     I looked back at the test server build notes in orangefs.txt, just
     in case that's where the trouble really is, and found a couple of
     typos and made a couple of clarifications"

* tag 'for-linus-5.7-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux:
  orangefs: clarify build steps for test server in orangefs.txt
  orangefs: don't mess with I_DIRTY_TIMES in orangefs_flush
  orangefs: get rid of knob code...

4 years agoMerge tag 'xtensa-20200410' of git://github.com/jcmvbkbc/linux-xtensa
Linus Torvalds [Sat, 11 Apr 2020 00:39:20 +0000 (17:39 -0700)]
Merge tag 'xtensa-20200410' of git://github.com/jcmvbkbc/linux-xtensa

Pull xtensa updates from Max Filippov:

 - replace setup_irq() by request_irq()

 - cosmetic fixes in xtensa Kconfig and boot/Makefile

* tag 'xtensa-20200410' of git://github.com/jcmvbkbc/linux-xtensa:
  arch/xtensa: fix grammar in Kconfig help text
  xtensa: remove meaningless export ccflags-y
  xtensa: replace setup_irq() by request_irq()

4 years agoMerge tag 'for-linus-5.7-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 11 Apr 2020 00:20:06 +0000 (17:20 -0700)]
Merge tag 'for-linus-5.7-rc1b-tag' of git://git./linux/kernel/git/xen/tip

Pull more xen updates from Juergen Gross:

 - two cleanups

 - fix a boot regression introduced in this merge window

 - fix wrong use of memory allocation flags

* tag 'for-linus-5.7-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  x86/xen: fix booting 32-bit pv guest
  x86/xen: make xen_pvmmu_arch_setup() static
  xen/blkfront: fix memory allocation flags in blkfront_setup_indirect()
  xen: Use evtchn_type_t as a type for event channels

4 years agoipc/util.c: sysvipc_find_ipc() should increase position index
Vasily Averin [Fri, 10 Apr 2020 21:34:13 +0000 (14:34 -0700)]
ipc/util.c: sysvipc_find_ipc() should increase position index

If seq_file .next function does not change position index, read after
some lseek can generate unexpected output.

https://bugzilla.kernel.org/show_bug.cgi?id=206283
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Waiman Long <longman@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: NeilBrown <neilb@suse.com>
Cc: Peter Oberparleiter <oberpar@linux.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/b7a20945-e315-8bb0-21e6-3875c14a8494@virtuozzo.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agokernel/gcov/fs.c: gcov_seq_next() should increase position index
Vasily Averin [Fri, 10 Apr 2020 21:34:10 +0000 (14:34 -0700)]
kernel/gcov/fs.c: gcov_seq_next() should increase position index

If seq_file .next function does not change position index, read after
some lseek can generate unexpected output.

https://bugzilla.kernel.org/show_bug.cgi?id=206283
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: NeilBrown <neilb@suse.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Waiman Long <longman@redhat.com>
Link: http://lkml.kernel.org/r/f65c6ee7-bd00-f910-2f8a-37cc67e4ff88@virtuozzo.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agofs/seq_file.c: seq_read(): add info message about buggy .next functions
Vasily Averin [Fri, 10 Apr 2020 21:34:06 +0000 (14:34 -0700)]
fs/seq_file.c: seq_read(): add info message about buggy .next functions

Patch series "seq_file .next functions should increase position index".

In Aug 2018 NeilBrown noticed commit 1f4aace60b0e ("fs/seq_file.c:
simplify seq_file iteration code and interface")

"Some ->next functions do not increment *pos when they return NULL...
Note that such ->next functions are buggy and should be fixed.  A simple
demonstration is dd if=/proc/swaps bs=1000 skip=1 Choose any block size
larger than the size of /proc/swaps.  This will always show the whole
last line of /proc/swaps"

Described problem is still actual.  If you make lseek into middle of
last output line following read will output end of last line and whole
last line once again.

  $ dd if=/proc/swaps bs=1  # usual output
  Filename Type Size Used Priority
  /dev/dm-0                             partition 4194812 97536 -2
  104+0 records in
  104+0 records out
  104 bytes copied

  $ dd if=/proc/swaps bs=40 skip=1    # last line was generated twice
  dd: /proc/swaps: cannot skip to specified offset
  v/dm-0                                partition 4194812 97536 -2
  /dev/dm-0                             partition 4194812 97536 -2
  3+1 records in
  3+1 records out
  131 bytes copied

There are lot of other affected files, I've found 30+ including
/proc/net/ip_tables_matches and /proc/sysvipc/*

I've sent patches into maillists of affected subsystems already, this
patch-set fixes the problem in files related to pstore, tracing, gcov,
sysvipc and other subsystems processed via linux-kernel@ mailing list
directly

https://bugzilla.kernel.org/show_bug.cgi?id=206283

This patch (of 4):

Add debug code to seq_read() to detect missed or out-of-tree incorrect
.next seq_file functions.

[akpm@linux-foundation.org: s/pr_info/pr_info_ratelimited/, per Qian Cai]
https://bugzilla.kernel.org/show_bug.cgi?id=206283
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: NeilBrown <neilb@suse.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Peter Oberparleiter <oberpar@linux.ibm.com>
Cc: Waiman Long <longman@redhat.com>
Link: http://lkml.kernel.org/r/244674e5-760c-86bd-d08a-047042881748@virtuozzo.com
Link: http://lkml.kernel.org/r/7c24087c-e280-e580-5b0c-0cdaeb14cd18@virtuozzo.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agodrivers/dma/tegra20-apb-dma.c: fix platform_get_irq.cocci warnings
kbuild test robot [Fri, 10 Apr 2020 21:34:03 +0000 (14:34 -0700)]
drivers/dma/tegra20-apb-dma.c: fix platform_get_irq.cocci warnings

Remove dev_err() messages after platform_get_irq*() failures.
platform_get_irq() already prints an error.

Generated by: scripts/coccinelle/api/platform_get_irq.cocci

Fixes: 6c41ac96ad92 ("dmaengine: tegra-apb: Support COMPILE_TEST")
Signed-off-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Julia Lawall <julia.lawall@inria.fr>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Cc: Laxman Dewangan <ldewangan@nvidia.com>
Cc: Vinod Koul <vinod.koul@linux.intel.com>
Cc: Stephen Warren <swarren@wwwdotorg.org>
Cc: Jon Hunter <jonathanh@nvidia.com>
Link: http://lkml.kernel.org/r/alpine.DEB.2.21.2002271133450.2973@hadrien
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agochange email address for Pali Rohár
Pali Rohár [Fri, 10 Apr 2020 21:34:00 +0000 (14:34 -0700)]
change email address for Pali Rohár

For security reasons I stopped using gmail account and kernel address is
now up-to-date alias to my personal address.

People periodically send me emails to address which they found in source
code of drivers, so this change reflects state where people can contact
me.

[ Added .mailmap entry as per Joe Perches  - Linus ]
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Joe Perches <joe@perches.com>
Link: http://lkml.kernel.org/r/20200307104237.8199-1-pali@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agoselftests: kmod: test disabling module autoloading
Eric Biggers [Fri, 10 Apr 2020 21:33:57 +0000 (14:33 -0700)]
selftests: kmod: test disabling module autoloading

Test that request_module() fails with -ENOENT when
/proc/sys/kernel/modprobe contains (a) a nonexistent path, and (b) an
empty path.

Case (b) is a regression test for the patch "kmod: make request_module()
return an error when autoloading is disabled".

Tested with 'kmod.sh -t 0010 && kmod.sh -t 0011', and also simply with
'kmod.sh' to run all kmod tests.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jeff Vander Stoep <jeffv@google.com>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: NeilBrown <neilb@suse.com>
Link: http://lkml.kernel.org/r/20200312202552.241885-5-ebiggers@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agoselftests: kmod: fix handling test numbers above 9
Eric Biggers [Fri, 10 Apr 2020 21:33:53 +0000 (14:33 -0700)]
selftests: kmod: fix handling test numbers above 9

get_test_count() and get_test_enabled() were broken for test numbers
above 9 due to awk interpreting a field specification like '$0010' as
octal rather than decimal.  Fix it by stripping the leading zeroes.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jeff Vander Stoep <jeffv@google.com>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: NeilBrown <neilb@suse.com>
Link: http://lkml.kernel.org/r/20200318230515.171692-5-ebiggers@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agodocs: admin-guide: document the kernel.modprobe sysctl
Eric Biggers [Fri, 10 Apr 2020 21:33:50 +0000 (14:33 -0700)]
docs: admin-guide: document the kernel.modprobe sysctl

Document the kernel.modprobe sysctl in the same place that all the other
kernel.* sysctls are documented.  Make sure to mention how to use this
sysctl to completely disable module autoloading, and how this sysctl
relates to CONFIG_STATIC_USERMODEHELPER.

[ebiggers@google.com: v5]
Link: http://lkml.kernel.org/r/20200318230515.171692-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jeff Vander Stoep <jeffv@google.com>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: NeilBrown <neilb@suse.com>
Link: http://lkml.kernel.org/r/20200312202552.241885-4-ebiggers@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agofs/filesystems.c: downgrade user-reachable WARN_ONCE() to pr_warn_once()
Eric Biggers [Fri, 10 Apr 2020 21:33:47 +0000 (14:33 -0700)]
fs/filesystems.c: downgrade user-reachable WARN_ONCE() to pr_warn_once()

After request_module(), nothing is stopping the module from being
unloaded until someone takes a reference to it via try_get_module().

The WARN_ONCE() in get_fs_type() is thus user-reachable, via userspace
running 'rmmod' concurrently.

Since WARN_ONCE() is for kernel bugs only, not for user-reachable
situations, downgrade this warning to pr_warn_once().

Keep it printed once only, since the intent of this warning is to detect
a bug in modprobe at boot time.  Printing the warning more than once
wouldn't really provide any useful extra information.

Fixes: 41124db869b7 ("fs: warn in case userspace lied about modprobe return")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Jessica Yu <jeyu@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jeff Vander Stoep <jeffv@google.com>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: NeilBrown <neilb@suse.com>
Cc: <stable@vger.kernel.org> [4.13+]
Link: http://lkml.kernel.org/r/20200312202552.241885-3-ebiggers@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agokmod: make request_module() return an error when autoloading is disabled
Eric Biggers [Fri, 10 Apr 2020 21:33:43 +0000 (14:33 -0700)]
kmod: make request_module() return an error when autoloading is disabled

Patch series "module autoloading fixes and cleanups", v5.

This series fixes a bug where request_module() was reporting success to
kernel code when module autoloading had been completely disabled via
'echo > /proc/sys/kernel/modprobe'.

It also addresses the issues raised on the original thread
(https://lkml.kernel.org/lkml/20200310223731.126894-1-ebiggers@kernel.org/T/#u)
bydocumenting the modprobe sysctl, adding a self-test for the empty path
case, and downgrading a user-reachable WARN_ONCE().

This patch (of 4):

It's long been possible to disable kernel module autoloading completely
(while still allowing manual module insertion) by setting
/proc/sys/kernel/modprobe to the empty string.

This can be preferable to setting it to a nonexistent file since it
avoids the overhead of an attempted execve(), avoids potential
deadlocks, and avoids the call to security_kernel_module_request() and
thus on SELinux-based systems eliminates the need to write SELinux rules
to dontaudit module_request.

However, when module autoloading is disabled in this way,
request_module() returns 0.  This is broken because callers expect 0 to
mean that the module was successfully loaded.

Apparently this was never noticed because this method of disabling
module autoloading isn't used much, and also most callers don't use the
return value of request_module() since it's always necessary to check
whether the module registered its functionality or not anyway.

But improperly returning 0 can indeed confuse a few callers, for example
get_fs_type() in fs/filesystems.c where it causes a WARNING to be hit:

if (!fs && (request_module("fs-%.*s", len, name) == 0)) {
fs = __get_fs_type(name, len);
WARN_ONCE(!fs, "request_module fs-%.*s succeeded, but still no fs?\n", len, name);
}

This is easily reproduced with:

echo > /proc/sys/kernel/modprobe
mount -t NONEXISTENT none /

It causes:

request_module fs-NONEXISTENT succeeded, but still no fs?
WARNING: CPU: 1 PID: 1106 at fs/filesystems.c:275 get_fs_type+0xd6/0xf0
[...]

This should actually use pr_warn_once() rather than WARN_ONCE(), since
it's also user-reachable if userspace immediately unloads the module.
Regardless, request_module() should correctly return an error when it
fails.  So let's make it return -ENOENT, which matches the error when
the modprobe binary doesn't exist.

I've also sent patches to document and test this case.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Jessica Yu <jeyu@kernel.org>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jeff Vander Stoep <jeffv@google.com>
Cc: Ben Hutchings <benh@debian.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200310223731.126894-1-ebiggers@kernel.org
Link: http://lkml.kernel.org/r/20200312202552.241885-1-ebiggers@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agomm/memremap: set caching mode for PCI P2PDMA memory to WC
Logan Gunthorpe [Fri, 10 Apr 2020 21:33:39 +0000 (14:33 -0700)]
mm/memremap: set caching mode for PCI P2PDMA memory to WC

PCI BAR IO memory should never be mapped as WB, however prior to this
the PAT bits were set WB and it was typically overridden by MTRR
registers set by the firmware.

Set PCI P2PDMA memory to be UC as this is what it currently, typically,
ends up being mapped as on x86 after the MTRR registers override the
cache setting.

Future use-cases may need to generalize this by adding flags to select
the caching type, as some P2PDMA cases may not want UC.  However, those
use-cases are not upstream yet and this can be changed when they arrive.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Eric Badger <ebadger@gigaio.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Link: http://lkml.kernel.org/r/20200306170846.9333-8-logang@deltatee.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agomm/memory_hotplug: add pgprot_t to mhp_params
Logan Gunthorpe [Fri, 10 Apr 2020 21:33:36 +0000 (14:33 -0700)]
mm/memory_hotplug: add pgprot_t to mhp_params

devm_memremap_pages() is currently used by the PCI P2PDMA code to create
struct page mappings for IO memory.  At present, these mappings are
created with PAGE_KERNEL which implies setting the PAT bits to be WB.
However, on x86, an mtrr register will typically override this and force
the cache type to be UC-.  In the case firmware doesn't set this
register it is effectively WB and will typically result in a machine
check exception when it's accessed.

Other arches are not currently likely to function correctly seeing they
don't have any MTRR registers to fall back on.

To solve this, provide a way to specify the pgprot value explicitly to
arch_add_memory().

Of the arches that support MEMORY_HOTPLUG: x86_64, and arm64 need a
simple change to pass the pgprot_t down to their respective functions
which set up the page tables.  For x86_32, set the page tables
explicitly using _set_memory_prot() (seeing they are already mapped).

For ia64, s390 and sh, reject anything but PAGE_KERNEL settings -- this
should be fine, for now, seeing these architectures don't support
ZONE_DEVICE.

A check in __add_pages() is also added to ensure the pgprot parameter
was set for all arches.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Eric Badger <ebadger@gigaio.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Link: http://lkml.kernel.org/r/20200306170846.9333-7-logang@deltatee.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agopowerpc/mm: thread pgprot_t through create_section_mapping()
Logan Gunthorpe [Fri, 10 Apr 2020 21:33:32 +0000 (14:33 -0700)]
powerpc/mm: thread pgprot_t through create_section_mapping()

In prepartion to support a pgprot_t argument for arch_add_memory().

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Eric Badger <ebadger@gigaio.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Link: http://lkml.kernel.org/r/20200306170846.9333-6-logang@deltatee.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agox86/mm: introduce __set_memory_prot()
Logan Gunthorpe [Fri, 10 Apr 2020 21:33:28 +0000 (14:33 -0700)]
x86/mm: introduce __set_memory_prot()

For use in the 32bit arch_add_memory() to set the pgprot type of the
memory to add.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Eric Badger <ebadger@gigaio.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Will Deacon <will@kernel.org>
Link: http://lkml.kernel.org/r/20200306170846.9333-5-logang@deltatee.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agox86/mm: thread pgprot_t through init_memory_mapping()
Logan Gunthorpe [Fri, 10 Apr 2020 21:33:24 +0000 (14:33 -0700)]
x86/mm: thread pgprot_t through init_memory_mapping()

In preparation to support a pgprot_t argument for arch_add_memory().

It's required to move the prototype of init_memory_mapping() seeing the
original location came before the definition of pgprot_t.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Eric Badger <ebadger@gigaio.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Will Deacon <will@kernel.org>
Link: http://lkml.kernel.org/r/20200306170846.9333-4-logang@deltatee.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agomm/memory_hotplug: rename mhp_restrictions to mhp_params
Logan Gunthorpe [Fri, 10 Apr 2020 21:33:21 +0000 (14:33 -0700)]
mm/memory_hotplug: rename mhp_restrictions to mhp_params

The mhp_restrictions struct really doesn't specify anything resembling a
restriction anymore so rename it to be mhp_params as it is a list of
extended parameters.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Eric Badger <ebadger@gigaio.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Link: http://lkml.kernel.org/r/20200306170846.9333-3-logang@deltatee.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>