OSDN Git Service

sagit-ice-cold/kernel_xiaomi_msm8998.git
9 years ago virtio: document queue state logic
Michael S. Tsirkin [Thu, 2 Apr 2015 11:05:47 +0000 (13:05 +0200)]
virtio: document queue state logic

commit d631b94e7a15277858ec5f88d674d93080506999
    virtio: change comment in transmit

started clarifying the logic behind queue state management,
but introduced an inaccuracy: TX_BUSY does not cause
a BUG message.

Clean this up some more, explaining the tradeoffs in detail.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago tc: bpf: add checksum helpers
Alexei Starovoitov [Thu, 2 Apr 2015 00:12:13 +0000 (17:12 -0700)]
tc: bpf: add checksum helpers

Commit 608cd71a9c7c ("tc: bpf: generalize pedit action") has added the
possibility to mangle packet data to BPF programs in the tc pipeline.
This patch adds two helpers bpf_l3_csum_replace() and bpf_l4_csum_replace()
for fixing up the protocol checksums after the packet mangling.

It also adds a 'flags' argument to the bpf_skb_store_bytes() helper to avoid
unnecessary checksum recomputations when BPF programs adjust l3/l4
checksums, and documents all three helpers in the uapi header.

Moreover, a sample program is added to show how BPF programs can make use
of the mangle and csum helpers.

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
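
To illustrate the kind of fix-up the checksum helpers in the commit above
are meant for, here is a minimal standalone sketch of an incremental
Internet-checksum update in the style of RFC 1624. It does not reproduce
the kernel helpers or their signatures; csum_fold32(), csum_full() and
csum_replace16() are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into a 16-bit one's-complement sum. */
static uint16_t csum_fold32(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffffu) + (sum >> 16);
    return (uint16_t)sum;
}

/* Full Internet checksum over an array of 16-bit words. */
static uint16_t csum_full(const uint16_t *words, size_t n)
{
    uint32_t sum = 0;

    assert(words != NULL);
    for (size_t i = 0; i < n; i++)
        sum += words[i];
    return (uint16_t)~csum_fold32(sum);
}

/*
 * Incremental update after one 16-bit word changes from old_val to
 * new_val (RFC 1624, eqn. 3): HC' = ~(~HC + ~m + m').
 */
static uint16_t csum_replace16(uint16_t check, uint16_t old_val,
                               uint16_t new_val)
{
    uint32_t sum = (uint16_t)~check;

    sum += (uint16_t)~old_val;
    sum += new_val;
    return (uint16_t)~csum_fold32(sum);
}
```

The point of the incremental form is that only the old and new values of
the mangled field are needed, not the whole header or payload.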
9 years ago Merge branch 'nf-hook-compress'
David S. Miller [Sat, 4 Apr 2015 19:23:15 +0000 (15:23 -0400)]
Merge branch 'nf-hook-compress'

netfilter: Compress hook function signatures.

Currently netfilter hooks have a function signature that is huge and
has many arguments.  This propagates from the hook entry points down
into the individual hook implementations themselves.

This means that if, for example, we want to change the type of one of
these arguments then we have to touch hundreds of locations.

The main initial motivation behind this is that we'd like to change
the signature of "okfn" so that a socket pointer can be passed in (and
reference counted properly) for the sake of using the proper socket
context in the case of tunnels whilst not releasing the top level user
socket from skb->sk (and thus releasing its socket memory quota
usage) in order to accommodate this.

This also makes it clear who actually uses 'okfn': nf_queue().  It is
absolutely critical to make this obvious because any user of 'okfn'
down in these hook chains has to be strictly audited for
escapability.  Specifically, escapability of references to objects
outside of the packet processing path.  And that's exactly what
nf_queue() does via its packet reinjection framework.

In fact this points out a bug in Jiri's original attempt to push the
socket pointer down through netfilter's okfn.  It didn't grab and drop
a reference to the socket in net/netfilter/nf_queue.c as needed.

Furthermore, so many code paths are simplified, and should in fact be
more efficient because we aren't passing in arguments that often are
simply not used by the netfilter hook at all.

Further simplifications are probably possible, but this series takes
care of the main cases.

Unfortunately I couldn't convert ebt_do_table() because ebtables is
complete and utter crap and uses ebt_do_table() outside of the hook
call chains.  But that should not be news to anyone.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
9 years ago netfilter: Pass nf_hook_state through arpt_do_table().
David S. Miller [Sat, 4 Apr 2015 01:18:46 +0000 (21:18 -0400)]
netfilter: Pass nf_hook_state through arpt_do_table().

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago netfilter: Pass nf_hook_state through nft_set_pktinfo*().
David S. Miller [Sat, 4 Apr 2015 01:16:25 +0000 (21:16 -0400)]
netfilter: Pass nf_hook_state through nft_set_pktinfo*().

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago netfilter: Pass nf_hook_state through ip6t_do_table().
David S. Miller [Sat, 4 Apr 2015 01:09:51 +0000 (21:09 -0400)]
netfilter: Pass nf_hook_state through ip6t_do_table().

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago netfilter: Pass nf_hook_state through nf_nat_ipv6_{in,out,fn,local_fn}().
David S. Miller [Sat, 4 Apr 2015 01:05:07 +0000 (21:05 -0400)]
netfilter: Pass nf_hook_state through nf_nat_ipv6_{in,out,fn,local_fn}().

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago netfilter: Pass nf_hook_state through ipt_do_table().
David S. Miller [Sat, 4 Apr 2015 00:56:08 +0000 (20:56 -0400)]
netfilter: Pass nf_hook_state through ipt_do_table().

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago netfilter: Pass nf_hook_state through nf_nat_ipv4_{in,out,fn,local_fn}().
David S. Miller [Sat, 4 Apr 2015 00:51:13 +0000 (20:51 -0400)]
netfilter: Pass nf_hook_state through nf_nat_ipv4_{in,out,fn,local_fn}().

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago netfilter: Make nf_hookfn use nf_hook_state.
David S. Miller [Sat, 4 Apr 2015 00:32:56 +0000 (20:32 -0400)]
netfilter: Make nf_hookfn use nf_hook_state.

Pass the nf_hook_state all the way down into the hook
functions themselves.

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago netfilter: Use nf_hook_state in nf_queue_entry.
David S. Miller [Fri, 3 Apr 2015 20:31:01 +0000 (16:31 -0400)]
netfilter: Use nf_hook_state in nf_queue_entry.

That way we don't have to reinstantiate another nf_hook_state
on the stack of the nf_reinject() path.

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago netfilter: Create and use nf_hook_state.
David S. Miller [Fri, 3 Apr 2015 20:23:58 +0000 (16:23 -0400)]
netfilter: Create and use nf_hook_state.

Instead of passing a large number of arguments down into the nf_hook()
entry points, create a structure which carries this state down through
the hook processing layers.

This makes it so that if we want to change the types or signatures of
any of these pieces of state, there are fewer places that need to be
changed.

Signed-off-by: David S. Miller <davem@davemloft.net>
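
The idea of carrying hook arguments in one structure, as the commit above
describes, can be sketched in standalone C as follows. This is not the
kernel's nf_hook_state definition; struct packet, struct device, struct
hook_state and the verdict values are hypothetical stand-ins chosen for
illustration.

```c
#include <assert.h>
#include <stddef.h>

struct packet { int len; };             /* hypothetical stand-in for a packet */
struct device { const char *name; };    /* hypothetical stand-in for a device */

/* All per-invocation hook state travels in one structure. */
struct hook_state {
    unsigned int hook;                  /* hook number */
    int pf;                             /* protocol family */
    const struct device *in;
    const struct device *out;
    int (*okfn)(struct packet *);       /* continuation, if any */
};

enum { VERDICT_DROP = 0, VERDICT_ACCEPT = 1 };

/* Hook signature: the packet plus a single state pointer. */
typedef int hook_fn(struct packet *pkt, const struct hook_state *state);

static int drop_short_packets(struct packet *pkt,
                              const struct hook_state *state)
{
    assert(pkt != NULL && state != NULL);
    return pkt->len < 20 ? VERDICT_DROP : VERDICT_ACCEPT;
}

/* Run each hook in order; the first DROP verdict ends the walk. */
static int run_hooks(hook_fn *const *hooks, size_t n, struct packet *pkt,
                     const struct hook_state *state)
{
    for (size_t i = 0; i < n; i++)
        if (hooks[i](pkt, state) == VERDICT_DROP)
            return VERDICT_DROP;
    return VERDICT_ACCEPT;
}
```

With this shape, adding or retyping a field in struct hook_state changes
the structure and the code that fills it in, while the hook signatures
stay the same.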
9 years ago test_rhashtable: Remove bogus max_size setting
Herbert Xu [Thu, 2 Apr 2015 04:29:50 +0000 (12:29 +0800)]
test_rhashtable: Remove bogus max_size setting

Now that resizing is completely automatic, we need to remove
the max_size setting or the test will fail.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago Merge branch 'mvneta-sgmii'
David S. Miller [Fri, 3 Apr 2015 19:08:20 +0000 (15:08 -0400)]
Merge branch 'mvneta-sgmii'

Stas Sergeev says:

====================
mvneta: SGMII-based in-band link state signaling

Currently the fixed-link DT binding is pre-configured and
cannot be changed in run-time. This means the cable unplug
events are not being detected, and the link parameters can't
be negotiated.

The following patches are needed when mvneta is used
in fixed-link mode (without MDIO).
They add an API to fixed_phy that allows the status to be
updated, and use that API in the mvneta driver when parsing
the SGMII in-band status.

There is also another implementation that doesn't add any API
and does everything in mvneta driver locally:
https://lkml.org/lkml/2015/3/31/327
I'll let people decide which approach is better.
No strong opinion on my side.
====================

Signed-off-by: Stas Sergeev <stsp@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago mvneta: implement SGMII-based in-band link state signaling
Stas Sergeev [Wed, 1 Apr 2015 17:32:49 +0000 (20:32 +0300)]
mvneta: implement SGMII-based in-band link state signaling

When MDIO bus is unavailable (common setup for SGMII), the in-band
signaling must be used to correctly track link state.
This patch enables the in-band status delivery for link state changes, namely:
- link up/down
- link speed
- duplex full/half
fixed_phy_update_state() is used to update phy status.

CC: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
CC: Florian Fainelli <f.fainelli@gmail.com>
CC: netdev@vger.kernel.org
CC: linux-kernel@vger.kernel.org
Signed-off-by: Stas Sergeev <stsp@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago add fixed_phy_update_state() - update state of fixed_phy
Stas Sergeev [Wed, 1 Apr 2015 17:30:31 +0000 (20:30 +0300)]
add fixed_phy_update_state() - update state of fixed_phy

Currently fixed_phy uses a callback to periodically poll the link state.
This patch adds the fixed_phy_update_state() API.
It solves the following problems:
- On link state interrupt, MAC driver can't update status.
Instead it needs to provide the callback to periodically query
the HW about the link state. It is more efficient to update status
after interrupt.
- The callback needs to be unregistered before phy_disconnect(),
or otherwise it will be called with net_dev==NULL. phy_disconnect()
does not have enough info to unregister the callback automatically.
- The callback needs to be registered before of_phy_connect() to
avoid running with outdated state, but of_phy_connect() returns the
phy_device pointer, which is needed to register the callback. Registering
it before of_phy_connect() will therefore require a hack to get the
pointer earlier.

Overall, this addition makes the subsequent patch that implements
SGMII link status for mvneta, much cleaner.

CC: Florian Fainelli <f.fainelli@gmail.com>
CC: netdev@vger.kernel.org
CC: linux-kernel@vger.kernel.org
Signed-off-by: Stas Sergeev <stsp@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
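
The difference between polling through a callback and pushing a new state
on interrupt, as discussed in the commit above, can be sketched like this.
The sketch does not reproduce the fixed_phy_update_state() signature;
struct link_status, struct link_changed and link_update_state() are
hypothetical names, and the "changed" mask is an assumption made for the
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a fixed-link status record. */
struct link_status {
    bool link;      /* true when the link is up */
    int speed;      /* Mbit/s */
    bool duplex;    /* true for full duplex */
};

/* Hypothetical mask naming which fields of a new status are valid. */
struct link_changed {
    bool link;
    bool speed;
    bool duplex;
};

/*
 * Push-style update: the MAC driver calls this from its link interrupt
 * path instead of waiting to be polled. Returns true if the stored
 * state actually changed.
 */
static bool link_update_state(struct link_status *cur,
                              const struct link_status *status,
                              const struct link_changed *changed)
{
    bool dirty = false;

    assert(cur != NULL && status != NULL && changed != NULL);
    if (changed->link && cur->link != status->link) {
        cur->link = status->link;
        dirty = true;
    }
    if (changed->speed && cur->speed != status->speed) {
        cur->speed = status->speed;
        dirty = true;
    }
    if (changed->duplex && cur->duplex != status->duplex) {
        cur->duplex = status->duplex;
        dirty = true;
    }
    return dirty;
}
```

Because the caller owns the timing, there is no callback to register
before connecting or to unregister before disconnecting.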
9 years ago ebpf: add skb->priority to offset map for usage in {cls, act}_bpf
Daniel Borkmann [Fri, 3 Apr 2015 18:52:24 +0000 (20:52 +0200)]
ebpf: add skb->priority to offset map for usage in {cls, act}_bpf

This adds the ability to read out the skb->priority from an eBPF
program, so that it can be taken into account from a tc filter
or action for the use-case where the priority is not being used
to directly override the filter classification in a qdisc, but
to tag traffic otherwise for the classifier; the priority can be
assigned from various places incl. user space, in future we may
also mangle it from an eBPF program.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago jhash: Update jhash_[321]words functions to use correct initval
Alexander Duyck [Tue, 31 Mar 2015 21:19:10 +0000 (14:19 -0700)]
jhash: Update jhash_[321]words functions to use correct initval

Looking over the implementation for jhash2 and comparing it to jhash_3words
I realized that the two hashes were in fact very different.  Doing a bit of
digging led me to "The new jhash implementation" in which lookup2 was
supposed to have been replaced with lookup3.

In reviewing the patch I noticed that jhash2 had originally initialized a
and b to JHASH_GOLDENRATIO and c to initval, but after the patch a, b, and
c were initialized to initval + (length << 2) + JHASH_INITVAL.  However the
changes in jhash_3words simply replaced the initialization of a and b with
JHASH_INITVAL.

This change corrects what I believe was an oversight so that a, b, and c in
jhash_3words all have the same value added consisting of initval + (length
<< 2) + JHASH_INITVAL so that jhash2 and jhash_3words will now produce the
same hash result given the same inputs.

Fixes: 60d509c823cca ("The new jhash implementation")
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
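
The seeding difference described in the jhash commit above can be written
out as a small standalone sketch. It covers only the initialization of a,
b and c, and omits the final mixing step. JHASH_INITVAL is assumed here to
be 0xdeadbeef, and adding initval to c in the "before" variant is inferred
from the commit text rather than quoted from the source. The function and
struct names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define JHASH_INITVAL 0xdeadbeefu   /* assumed value of the constant */

struct jhash_abc { uint32_t a, b, c; };

/* Seeding as described for jhash_3words before the fix. */
static struct jhash_abc seed_3words_before(uint32_t a, uint32_t b,
                                           uint32_t c, uint32_t initval)
{
    struct jhash_abc s = { a + JHASH_INITVAL, b + JHASH_INITVAL,
                           c + initval };
    return s;
}

/* Seeding after the fix: the same value is added to all three words. */
static struct jhash_abc seed_3words_after(uint32_t a, uint32_t b,
                                          uint32_t c, uint32_t initval)
{
    uint32_t seed = initval + (3u << 2) + JHASH_INITVAL;
    struct jhash_abc s = { a + seed, b + seed, c + seed };
    return s;
}

/* jhash2-style seeding for a key of exactly three 32-bit words. */
static struct jhash_abc seed_jhash2(const uint32_t *k, uint32_t length,
                                    uint32_t initval)
{
    uint32_t seed = initval + (length << 2) + JHASH_INITVAL;
    struct jhash_abc s = { seed, seed, seed };

    assert(k != NULL && length == 3);
    s.a += k[0];
    s.b += k[1];
    s.c += k[2];
    return s;
}
```

Since both paths then feed the same final mix, equal (a, b, c) triples
mean equal hash results for the same three input words.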
9 years ago Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next...
David S. Miller [Fri, 3 Apr 2015 16:40:50 +0000 (12:40 -0400)]
Merge branch 'master' of git://git./linux/kernel/git/jkirsher/next-queue

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2015-04-03

This series contains updates to i40e and i40evf only.

Anjali provides a fix for verifying outer UDP receive checksum.  Also
adds helpful information to display when figuring out the cause of
HMC errors.

Mitch provides a fix to prevent a malicious or buggy VF driver from
sending an invalid index into the VSI array which could panic the host.
Cleans up the code where a function was moved, but the message did
not follow.  Adds protection to the VLAN filter list, same as the
MAC filter list, to protect from corruption if the watchdog happens
to run at the same time as a VLAN filter is being added/deleted.

Jesse changes several memcpy() statements to struct assignments which
are type safe and preferable.  Fixed a bug when skb allocation fails,
where we should not continue using the skb pointer.  Also fixed a void
function in FCoE which should not be returning anything.

Greg fixes both i40e and i40evf to set the Ethernet protocol correctly
when transmit VLAN offloads are disabled.

Shannon fixes up VLAN messages when ports are added or removed, which
were giving bogus index info.  Also aligned the message text style
with other messages in the driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago netdevice: document NETDEV_TX_BUSY deprecation.
Rusty Russell [Fri, 3 Apr 2015 11:47:17 +0000 (22:17 +1030)]
netdevice: document NETDEV_TX_BUSY deprecation.

This paraphrases DaveM (and steals some of his words) explaining why
a device shouldn't return NETDEV_TX_BUSY, even though it looks so inviting
to driver authors.

See http://www.spinics.net/lists/netdev/msg322350.html

Inspired-by: David Miller <davem@davemloft.net>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago Merge branch 'ipv4-null-cmp'
David S. Miller [Fri, 3 Apr 2015 16:11:15 +0000 (12:11 -0400)]
Merge branch 'ipv4-null-cmp'

Ian Morris says:

====================
ipv4: coding style - comparisons with NULL

Per the suggestion of Joe Perches, attached is a patch which aligns the
coding style in ipv4 for comparisons with NULL.

The code uses multiple different styles when comparing with NULL (i.e.
x == NULL and !x as well as x != NULL and x). Generally the latter form
is preferred in netdev and so this change aligns the code to this style.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago ipv4: coding style: comparison for inequality with NULL
Ian Morris [Fri, 3 Apr 2015 08:17:27 +0000 (09:17 +0100)]
ipv4: coding style: comparison for inequality with NULL

The ipv4 code uses a mixture of coding styles. In some instances a check
for a non-NULL pointer is written as x != NULL and in others as x. x is
preferred according to checkpatch and this patch makes the code
consistent by adopting the latter form.

No changes detected by objdiff.

Signed-off-by: Ian Morris <ipm@chirality.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago ipv4: coding style: comparison for equality with NULL
Ian Morris [Fri, 3 Apr 2015 08:17:26 +0000 (09:17 +0100)]
ipv4: coding style: comparison for equality with NULL

The ipv4 code uses a mixture of coding styles. In some instances a check
for a NULL pointer is written as x == NULL and in others as !x. !x is
preferred according to checkpatch and this patch makes the code
consistent by adopting the latter form.

No changes detected by objdiff.

Signed-off-by: Ian Morris <ipm@chirality.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
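
For reference, the two preferred forms from the style commits above look
like this in a small standalone example. struct route and both functions
are hypothetical and exist only to show the comparisons.

```c
#include <stddef.h>

struct route { int metric; };   /* hypothetical type for illustration */

/* Equality with NULL: "!rt" is used instead of "rt == NULL". */
static int route_metric_or_default(const struct route *rt, int fallback)
{
    if (!rt)
        return fallback;
    return rt->metric;
}

/* Inequality with NULL: "rt" is used instead of "rt != NULL". */
static int route_is_set(const struct route *rt)
{
    if (rt)
        return 1;
    return 0;
}
```

Both spellings of each test are equivalent in C, which is consistent with
objdiff reporting no changes.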
9 years ago i40e: Bump to version 1.3.1
Catherine Sullivan [Tue, 31 Mar 2015 07:45:06 +0000 (00:45 -0700)]
i40e: Bump to version 1.3.1

Bump.

Change-ID: I7dc88baa33264e5919bc938adf76706573209432
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9 years ago i40evf: Refactor VF RSS code
Anjali Singhai Jain [Tue, 31 Mar 2015 07:45:06 +0000 (00:45 -0700)]
i40evf: Refactor VF RSS code

Refactor VF RSS code to allow RSS on a single queue and eliminate
the need for the next_queue function.

Change-ID: I9253bad96b7f542ee7036e15636db0e5d58d8ef2
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9 years ago i40evf: protect VLAN filter list
Mitch Williams [Tue, 31 Mar 2015 07:45:05 +0000 (00:45 -0700)]
i40evf: protect VLAN filter list

The MAC filter list is protected by a critical task bit, and the VLAN
list should be protected as well. This prevents list corruption if the
watchdog happens to run at the same time as a VLAN filter is being added
or deleted.

Change-ID: Ia4867cebbbb046a1f38012771b288a634ca5882b
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9 years ago i40e: Communicate VSI id in place of VSI index to the VFs
Anjali Singhai Jain [Tue, 31 Mar 2015 07:45:05 +0000 (00:45 -0700)]
i40e: Communicate VSI id in place of VSI index to the VFs

This does not affect the Virtual channel API as such but it changes the
meaning of what is communicated to the VSI resource struct as vsi_id.
Earlier vsi_idx was being passed in, which was the index in the PF's VSI
array. Now we pass vsi_id as communicated by the FW to the driver.
This will help with future expansion of VF and FW communication.

With this in place, the VF and virtual channel driver changes needed to
move over to VSI id use are complete and validated.

Change-ID: I14246ef82b3b3dc1fa76291d2dd0c05d12cedb7c
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9 years ago i40e: stop flow director on shutdown
Mitch Williams [Tue, 31 Mar 2015 07:45:04 +0000 (00:45 -0700)]
i40e: stop flow director on shutdown

In some cases, the hardware would continue to try to access the FDIR
ring after entering D3Hot state, which would cause either PCIe errors or
NMIs, depending upon system configuration.

Explicitly stop FDIR in our shutdown routine to eliminate this
possibility.

Change-ID: Ib98060d6352ec595ab9a78bfe252675a9fa5d8bc
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9 years ago i40e: fix up VXLAN messages
Shannon Nelson [Tue, 31 Mar 2015 07:45:04 +0000 (00:45 -0700)]
i40e: fix up VXLAN messages

When the VXLAN ports are added and removed, the messaging was giving some
bogus index info, the port was always '0' for the delete, and the message
text style didn't match other messages in the driver.  Also, there was an
over-use of the ternary operator which made reading a little harder
than necessary.

Change-ID: Ie805182a697b8b4c12024403ada87fd4e4fa2358
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9 years ago i40e: Don't register/de-register apps on NIC partitions in MFP mode
Neerav Parikh [Tue, 31 Mar 2015 07:45:03 +0000 (00:45 -0700)]
i40e: Don't register/de-register apps on NIC partitions in MFP mode

Do not register or try to de-register DCB applications with the DCBNL
layer in case of NIC partitions when adapter is in MFP mode.

Change-ID: I603d042a61983a6562be471c6a2b181572504118
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9 years ago i40e/i40evf: Set Ethernet protocol correctly when Tx VLAN offloads are disabled
Greg Rose [Tue, 31 Mar 2015 07:45:03 +0000 (00:45 -0700)]
i40e/i40evf: Set Ethernet protocol correctly when Tx VLAN offloads are disabled

If transmit VLAN HW offloads are disabled then the network stack sends up
an skb with the protocol set to 8021q. In that case to get the correct
checksum offloads we have to reset the skb protocol to the encapsulated
ethertype.

Change-ID: I903d78533de09b1c5d3ec695ee1990dd0fa5dd0d
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
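
What "the encapsulated ethertype" means in the commit above can be shown
on a raw Ethernet frame. This is a standalone sketch that parses bytes
directly and does not use any kernel or driver API; the function names are
hypothetical, and only a single 802.1Q tag is handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETHERTYPE_8021Q 0x8100u  /* IEEE 802.1Q tag protocol identifier */
#define ETH_HDR_LEN     14u      /* dst MAC + src MAC + ethertype */
#define VLAN_TAG_LEN    4u       /* TPID + TCI */

/* Read a big-endian 16-bit value. */
static uint16_t read_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Return the ethertype of the payload, looking past one 802.1Q tag if
 * present. Returns 0 if the frame is too short to tell.
 */
static uint16_t frame_payload_ethertype(const uint8_t *frame, size_t len)
{
    uint16_t proto;

    assert(frame != NULL);
    if (len < ETH_HDR_LEN)
        return 0;
    proto = read_be16(frame + 12);
    if (proto == ETHERTYPE_8021Q) {
        if (len < ETH_HDR_LEN + VLAN_TAG_LEN)
            return 0;
        /* The real ethertype follows the 4-byte VLAN tag. */
        proto = read_be16(frame + 16);
    }
    return proto;
}
```

Checksum offload decisions depend on the inner value (for example IPv4 or
IPv6), which is why the outer 8021q value has to be replaced.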
9 years ago i40e: warn at the right time
Mitch Williams [Tue, 31 Mar 2015 07:45:02 +0000 (00:45 -0700)]
i40e: warn at the right time

The call to pci_disable_sriov got moved, but the message about not
disabling VFs didn't move. So move it. While we're at it, reword the
message a bit to make it more consistent with other driver messages.

Change-ID: I17d3e15e4fcfd5c9431a96ecb0117d728d3da18b
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9 years ago i40e: fix invalid void return in FCoE code
Jesse Brandeburg [Tue, 31 Mar 2015 07:45:02 +0000 (00:45 -0700)]
i40e: fix invalid void return in FCoE code

A function was calling i40e_tx_map with return, but tx_map returns
void, and the caller returns void, so just drop the return, and
everything is good.

Change-ID: I53fc676d517864761e7cbb8ca83f1ef0c15b1f8f
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9 years ago i40e/i40evf: fix bug when skb allocation fails
Jesse Brandeburg [Tue, 31 Mar 2015 07:45:01 +0000 (00:45 -0700)]
i40e/i40evf: fix bug when skb allocation fails

If the skb allocation fails we should not continue using the skb
pointer.  Breaking out at the point of failure means that at the next
RX interrupt the driver will try the allocation again.

Change-ID: Iefaad69856ced7418bfd92afe55322676341f82e
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9 years ago i40e: Change some memcpys to struct assignments
Jesse Brandeburg [Tue, 31 Mar 2015 07:45:01 +0000 (00:45 -0700)]
i40e: Change some memcpys to struct assignments

Several memcpys are not necessary and can be changed to structure
assignments.  Struct assignments are always type safe so this
is preferable.

Change-ID: I7daf45a4b5e799c686b9d5c8ba9db047584ab82b
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
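
The memcpy-versus-assignment point in the commit above can be shown with
a short standalone example. struct ring_stats and both functions are
hypothetical and are not taken from the i40e driver.

```c
#include <string.h>

struct ring_stats {             /* hypothetical type for illustration */
    unsigned long packets;
    unsigned long bytes;
};

/*
 * memcpy() takes void pointers, so the compiler cannot object if dst
 * and src are later changed to point at different struct types.
 */
static void copy_with_memcpy(struct ring_stats *dst,
                             const struct ring_stats *src)
{
    memcpy(dst, src, sizeof(*dst));
}

/*
 * Struct assignment copies the same bytes, but both sides must have
 * the same type or the code fails to compile.
 */
static void copy_with_assignment(struct ring_stats *dst,
                                 const struct ring_stats *src)
{
    *dst = *src;
}
```

The two functions produce the same result; the assignment form simply
lets the compiler check the types.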
9 years ago i40e: Print some more info to help figure out the cause of HMC error
Anjali Singhai Jain [Tue, 31 Mar 2015 07:45:01 +0000 (00:45 -0700)]
i40e: Print some more info to help figure out the cause of HMC error

HMC_ERRORINFO and HMC_ERRORDATA help explain the cause of an HMC error.

Change-ID: I053bbc175a5f4c5c3e9ec2ea7400d5c56aaa4ec1
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9 years ago i40e: validate VSI param from VFs
Mitch Williams [Tue, 31 Mar 2015 07:45:00 +0000 (00:45 -0700)]
i40e: validate VSI param from VFs

Validate that the VF has sent us a valid VSI index before actually using
that index. Without this code, a malicious or buggy VF driver could
panic the host by sending an invalid index into the VSI array.

Change-ID: I66a177687a0dcc281ec83e714d3813d70d18c8b4
Reported-by: Nick Nunley <nicholas.d.nunley@intel.com>
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
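
The validation described in the commit above amounts to a bounds check
before an untrusted index is used. A minimal standalone sketch follows;
struct pf, struct vsi, NUM_VSI and pf_find_vsi() are hypothetical and do
not mirror the driver's data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_VSI 4               /* hypothetical table size */

struct vsi { int id; };         /* hypothetical stand-in */

struct pf {
    struct vsi *vsi[NUM_VSI];   /* entries may be NULL */
};

/*
 * Look up a VSI by an index received from an untrusted sender.
 * Returns NULL for an out-of-range index or an empty slot, so the
 * caller can reject the request instead of dereferencing garbage.
 */
static struct vsi *pf_find_vsi(const struct pf *pf, uint16_t vsi_idx)
{
    assert(pf != NULL);
    if (vsi_idx >= NUM_VSI)
        return NULL;
    return pf->vsi[vsi_idx];
}
```

The range check has to come before the array access; checking the
returned pointer alone would be too late.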
9 years ago i40evf: Fix Outer UDP RX checksum code
Anjali Singhai Jain [Tue, 31 Mar 2015 07:44:59 +0000 (00:44 -0700)]
i40evf: Fix Outer UDP RX checksum code

Inner protocol being UDP should not stop us from verifying Outer UDP
checksum correctness.

If the Outer protocol is not UDP (NVGRE) we should not be doing a UDP
checksum check. If the packet has zero checksum, skip checksum check.

Change-ID: Ie7f153feb276a59f66a54a0938901b2c0a8100fa
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9 years ago Merge branch 'mlx5-next'
David S. Miller [Thu, 2 Apr 2015 20:33:43 +0000 (16:33 -0400)]
Merge branch 'mlx5-next'

Eli Cohen says:

====================
mlx5 batch of patches for net-next

This series contains small fixes to the mlx5 core driver and also
preparation steps towards adding Ethernet support for ConnectX4
devices which will be part of mlx5 driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago net/mlx5_core: Extend struct mlx5_interface to support multiple protocols
Saeed Mahameed [Thu, 2 Apr 2015 14:07:34 +0000 (17:07 +0300)]
net/mlx5_core: Extend struct mlx5_interface to support multiple protocols

Preparation for ethernet driver.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago net/mlx5_core: Modify arm CQ in preparation for upcoming Ethernet driver
Saeed Mahameed [Thu, 2 Apr 2015 14:07:33 +0000 (17:07 +0300)]
net/mlx5_core: Modify arm CQ in preparation for upcoming Ethernet driver

Pass consumer index as a parameter to arm CQ

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago net/mlx5_core: Move completion eqs from mlx5_ib to mlx5_core
Saeed Mahameed [Thu, 2 Apr 2015 14:07:32 +0000 (17:07 +0300)]
net/mlx5_core: Move completion eqs from mlx5_ib to mlx5_core

Preparation for ethernet driver.
These functions will be used in drivers other than mlx5_ib.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago net/mlx5_core: Update module info macros for ConnectX4 Support
Achiad Shochat [Thu, 2 Apr 2015 14:07:31 +0000 (17:07 +0300)]
net/mlx5_core: Update module info macros for ConnectX4 Support

Declare support for the new ConnectX4 HCA in the module info.
Bump the module version.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago IB/mlx5: Fix Mellanox copyright note
Saeed Mahameed [Thu, 2 Apr 2015 14:07:30 +0000 (17:07 +0300)]
IB/mlx5: Fix Mellanox copyright note

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago net/mlx5_core: Fix Mellanox copyright note
Saeed Mahameed [Thu, 2 Apr 2015 14:07:29 +0000 (17:07 +0300)]
net/mlx5_core: Fix Mellanox copyright note

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago net/mlx5_core: Fix a bug in alloc_token
Achiad Shochat [Thu, 2 Apr 2015 14:07:28 +0000 (17:07 +0300)]
net/mlx5_core: Fix a bug in alloc_token

In alloc_token(), the token '1' would be allocated twice consecutively.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
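
One way a token allocator can hand out '1' twice in a row, as the commit
above reports, is an 8-bit counter combined with a modulo that avoids
zero. The sketch below is an illustration of that failure mode and of a
simple fix; it is not claimed to be the code that was in alloc_token(),
and struct token_source and both functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct token_source { uint8_t counter; };   /* hypothetical state */

/*
 * Buggy variant: counter values 255 and 0 both map to token 1, so the
 * token 1 is returned twice in a row when the 8-bit counter wraps.
 */
static uint8_t alloc_token_buggy(struct token_source *ts)
{
    assert(ts != NULL);
    return (uint8_t)(ts->counter++ % 255 + 1);
}

/*
 * Fixed variant: advance the counter and skip zero explicitly, giving
 * the cycle 1, 2, ..., 255, 1, 2, ... with no consecutive repeats.
 */
static uint8_t alloc_token_fixed(struct token_source *ts)
{
    assert(ts != NULL);
    ts->counter++;
    if (ts->counter == 0)
        ts->counter++;
    return ts->counter;
}
```

Both variants never return zero; only the fixed one keeps consecutive
tokens distinct across the wrap.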
9 years ago net/mlx5_core: Avoid usage command work entry after writing command doorbell
Ira Gusinsky [Thu, 2 Apr 2015 14:07:27 +0000 (17:07 +0300)]
net/mlx5_core: Avoid usage command work entry after writing command doorbell

Avoid using the command work entry in cmd_work_handler after the doorbell
is written, since mlx5_cmd_invoke can release it before the work handler
resumes running.

Signed-off-by: Ira Gusinsky <irenag@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago net/mlx5_core: Avoid copying outbox in async command completion
Eli Cohen [Thu, 2 Apr 2015 14:07:26 +0000 (17:07 +0300)]
net/mlx5_core: Avoid copying outbox in async command completion

Avoid copying to the output buffer in cmd_exec since this is done after the
command is completed. Failure to do this may cause cases where the callback
handler is called before the copy is done by cmd_exec, which then overwrites it.

Reported-by: Tamer Hleihel <tamerh@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago net/mlx5_core: Use coherent memory for command interface page
Eli Cohen [Thu, 2 Apr 2015 14:07:25 +0000 (17:07 +0300)]
net/mlx5_core: Use coherent memory for command interface page

Use coherent memory for the commands descriptor page. Take measures to make
sure the page is aligned to MLX5_ADAPTER_PAGE_SIZE as required by the hardware.

Reported-by: Yevgeny Kliteynik <kliteyn@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
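
A common way to guarantee the kind of alignment mentioned in the commit
above is to over-allocate and round the start address up. The sketch
below shows only that arithmetic; it does not use the DMA API, the value
4096 for the page size is an assumption, and the names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ADAPTER_PAGE_SIZE 4096u     /* assumed page size for illustration */

/* Round addr up to the next multiple of align (a power of two). */
static uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/*
 * Bytes to request so that an aligned region of 'size' bytes fits no
 * matter where the allocator places the start of the buffer.
 */
static size_t padded_alloc_size(size_t size, size_t align)
{
    return size + align - 1;
}
```

For a page-sized region this means requesting just under two pages and
using the aligned page inside that buffer.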
9 years ago net/mlx5_core: Use the right inbox struct in destroy mkey command
Achiad Shochat [Thu, 2 Apr 2015 14:07:24 +0000 (17:07 +0300)]
net/mlx5_core: Use the right inbox struct in destroy mkey command

struct mlx5_query_mkey_mbox_in rather than mlx5_destroy_mkey_mbox_in

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago net/mlx5_core: Clear doorbell record inside mlx5_db_alloc()
Saeed Mahameed [Thu, 2 Apr 2015 14:07:23 +0000 (17:07 +0300)]
net/mlx5_core: Clear doorbell record inside mlx5_db_alloc()

Do it in one place instead of everywhere the function is invoked.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago net/mlx5_core: Avoid setting DC requestor/responder resources
Eli Cohen [Thu, 2 Apr 2015 14:07:22 +0000 (17:07 +0300)]
net/mlx5_core: Avoid setting DC requestor/responder resources

PRM does not support setting these values so avoid setting them.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago net/mlx5_core: Coding style fix
Eli Cohen [Thu, 2 Apr 2015 14:07:21 +0000 (17:07 +0300)]
net/mlx5_core: Coding style fix

Put a line of space before return and next statement.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago net/mlx5_core: Fix call to mlx5_core_qp_modify
Haggai Abramonvsky [Thu, 2 Apr 2015 14:07:20 +0000 (17:07 +0300)]
net/mlx5_core: Fix call to mlx5_core_qp_modify

Pass 0 in the sqd_event parameter.

Signed-off-by: Haggai Abramovsky <hagaya@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago net/mlx5_core: Allocate firmware pages from device's NUMA node
Eli Cohen [Thu, 2 Apr 2015 14:07:19 +0000 (17:07 +0300)]
net/mlx5_core: Allocate firmware pages from device's NUMA node

Allocate firmware pages from the NUMA node which is close to the device.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years ago Merge branch 'tipc-next'
David S. Miller [Thu, 2 Apr 2015 20:27:13 +0000 (16:27 -0400)]
Merge branch 'tipc-next'

Jon Maloy says:

====================
tipc: remove some unnecessary complexity

The TIPC code is unnecessarily complex in some places, often because
the conditions or assumptions that were the cause for the complexity
are not valid anymore.

In these three commits, we eliminate some cases of such redundant
complexity.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agotipc: simplify link mtu negotiation
Jon Paul Maloy [Thu, 2 Apr 2015 13:33:02 +0000 (09:33 -0400)]
tipc: simplify link mtu negotiation

When a link is being established, the two endpoints advertise their
respective interface MTU in the transmitted RESET and ACTIVATE messages.
If there is any difference, the lower of the two MTUs will be selected
for use by both endpoints.

However, as a remnant of earlier attempts to introduce TIPC level
routing, there also exists an MTU discovery mechanism. If an intermediate
node has a lower MTU than the two endpoints, they will discover this
through a bisectional approach, and finally adopt this MTU for common use.

Since there is no TIPC level routing, and probably never will be,
this mechanism doesn't make any sense, and only serves to make the
link level protocol unnecessarily complex.

In this commit, we eliminate the MTU discovery algorithm, and fall back
to the simple MTU advertising approach. This change is fully backwards
compatible.
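
The MTU advertising rule described above can be sketched in a few lines of
C. This is only an illustration of "the lower of the two MTUs"; the helper
name link_mtu() is hypothetical and is not taken from the TIPC code.

```c
#include <stdint.h>

/* Hypothetical helper, not the TIPC implementation: both endpoints
 * advertise their interface MTU and adopt the lower of the two. */
static uint32_t link_mtu(uint32_t local_mtu, uint32_t peer_mtu)
{
	return local_mtu < peer_mtu ? local_mtu : peer_mtu;
}
```

Since both endpoints evaluate the same rule on the same pair of values,
they arrive at the same MTU without any further probing.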

Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agotipc: eliminate delayed link deletion at link failover
Jon Paul Maloy [Thu, 2 Apr 2015 13:33:01 +0000 (09:33 -0400)]
tipc: eliminate delayed link deletion at link failover

When a bearer is disabled manually, all its links have to be reset
and deleted. However, if there is a remaining, parallel link ready
to take over a deleted link's traffic, we currently delay the delete
of the removed link until the failover procedure is finished. This
is because the remaining link needs to access state from the reset
link, such as the last received packet number, and any partially
reassembled buffer, in order to perform a successful failover.

In this commit, we do instead move the state data over to the new
link, so that it can fulfill the procedure autonomously, without
accessing any data on the old link. This means that we can now
proceed and delete all pertaining links immediately when a bearer
is disabled. This saves us from some unnecessary complexity in such
situations.

We also choose to change the confusing definitions CHANGEOVER_PROTOCOL,
ORIGINAL_MSG and DUPLICATE_MSG to the more descriptive TUNNEL_PROTOCOL,
FAILOVER_MSG and SYNCH_MSG respectively.

Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agotipc: drop tunneled packet duplicates at reception
Jon Paul Maloy [Thu, 2 Apr 2015 13:33:00 +0000 (09:33 -0400)]
tipc: drop tunneled packet duplicates at reception

In commit 8b4ed8634f8b3f9aacfc42b4a872d30c36b9e255
("tipc: eliminate race condition at dual link establishment")
we introduced a parallel link synchronization mechanism that
guarantees sequential delivery even for users switching from
an old to a newly established link. The new mechanism makes it
unnecessary to deliver the tunneled duplicate packets back to
the old link, as we are currently doing. It is now sufficient
to use the last tunneled packet's inner sequence number as
synchronization point between the two parallel links, whereafter
it can be dropped.

In this commit, we drop the duplicate packets arriving on the new
link, after updating the synchronization point at each new arrival.

Although it would now have been sufficient for the other endpoint
to only tunnel the last packet in its send queue, and not the
entire queue, we must still do this to maintain compatibility
with older nodes.

This commit makes it possible to get rid of some complex
interaction between the two parallel links.

Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoMerge branch 'mlx4-next'
David S. Miller [Thu, 2 Apr 2015 20:25:27 +0000 (16:25 -0400)]
Merge branch 'mlx4-next'

Or Gerlitz says:

====================
Mellanox mlx4 driver updates

The main feature added by this series is Ido's changes to support
Granular QoS for VFs, where for the time being only max rate is supported.

Muhammad added support for setting rx-fcs and rx-all through ethtool,
and Ido did the interface identify work.

Last, add Ido as a maintainer for the mlx4 Ethernet driver!

Some of next week is the Passover holiday here and I will be
mostly OOO. If needed (...) Ido or Amir will send V1 and such.

Rebased against net-next commit 033f46b "crypto: algif -
explicitly mark end of data".
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoMAINTAINERS: Update mlx4_en entry
Or Gerlitz [Thu, 2 Apr 2015 13:31:23 +0000 (16:31 +0300)]
MAINTAINERS: Update mlx4_en entry

Add Ido Shamay as co-maintainer for the mlx4 Ethernet driver.

Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet/mlx4_en: Add RX-ALL support
Muhammad Mahajna [Thu, 2 Apr 2015 13:31:22 +0000 (16:31 +0300)]
net/mlx4_en: Add RX-ALL support

Enabled when the device supports KEEP FCS and IGNORE FCS.

When the flag is set, pass all received frames up the stack,
even ones with invalid FCS, controlled by ethtool.
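
As a rough sketch of the capability check described above: RX-ALL needs
both KEEP FCS and IGNORE FCS, while RX-FCS (added by the neighbouring
commit in this series) needs only KEEP FCS. The CAP_* bit names and values
below are hypothetical stand-ins, not the driver's real flags.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capability bits; the real driver uses its own flag names. */
#define CAP_KEEP_FCS   (1u << 0)
#define CAP_IGNORE_FCS (1u << 1)

/* RX-ALL can be offered only when both capabilities are present. */
static bool rx_all_supported(uint32_t dev_caps)
{
	const uint32_t needed = CAP_KEEP_FCS | CAP_IGNORE_FCS;

	return (dev_caps & needed) == needed;
}

/* RX-FCS needs only KEEP FCS. */
static bool rx_fcs_supported(uint32_t dev_caps)
{
	return (dev_caps & CAP_KEEP_FCS) != 0;
}
```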

Signed-off-by: Muhammad Mahajna <muhammadm@mellanox.com>
Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet/mlx4_en: Add RX-FCS support
Muhammad Mahajna [Thu, 2 Apr 2015 13:31:21 +0000 (16:31 +0300)]
net/mlx4_en: Add RX-FCS support

Enabled when device supports KEEP FCS. When the flag is set, Ethernet FCS
is appended to the end of the frame, controlled by ethtool.

Signed-off-by: Muhammad Mahajna <muhammadm@mellanox.com>
Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet/mlx4_en: Add interface identify support
Ido Shamay [Thu, 2 Apr 2015 13:31:20 +0000 (16:31 +0300)]
net/mlx4_en: Add interface identify support

Add support for the interface ethtool identify feature.

Make the physical port LED blink with green and yellow colors.

The device handles the LED blink by itself (synchronous use of
set_phys_id), by returning 0 to ETHTOOL_ID_ACTIVE command.

Signed-off-by: Eyal Grossman <eyalgr@mellanox.com>
Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet/mlx4: Add SET_PORT opcode modifiers enumeration
Ido Shamay [Thu, 2 Apr 2015 13:31:19 +0000 (16:31 +0300)]
net/mlx4: Add SET_PORT opcode modifiers enumeration

The calls to SET_PORT used hard-coded numbers when supplying the command's
opcode modifiers; fix that to use well-defined constants.

Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet/mlx4: Set enhanced QoS support by default when ETS supported
Ido Shamay [Thu, 2 Apr 2015 13:31:18 +0000 (16:31 +0300)]
net/mlx4: Set enhanced QoS support by default when ETS supported

If HCA supports ETS QoS feature, set enhanced QoS bit in init_hca as default.

Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet/mlx4: Warn users of deprecated QoS Firmware
Ido Shamay [Thu, 2 Apr 2015 13:31:17 +0000 (16:31 +0300)]
net/mlx4: Warn users of deprecated QoS Firmware

A new capability bit was introduced in the past to differentiate devices
using the QoS ETS feature. The old one has been deprecated since then.
If the driver sees a device which sets only the old capability, it will print
a warning to the user suggesting to upgrade the FW.

Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet/mlx4_en: Enable TX rate limit per VF
Ido Shamay [Thu, 2 Apr 2015 13:31:16 +0000 (16:31 +0300)]
net/mlx4_en: Enable TX rate limit per VF

Support granular QoS per VF, by implementing the ndo_set_vf_rate.

Enforce a rate limit per VF when called; enforcement is enabled only for
VFs in VST mode with a user priority supported by the device.

We don't enforce VFs to be in VST mode at the moment of configuration,
but rather save the given rate limit and enforce it when the VF is
moved to VST with user priority which is supported (currently 0).

VST<->VGT or VST qos value state changes are disallowed when a rate
limit is configured. Minimum BW share is not supported yet.
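
The deferred enforcement described above can be pictured with a small
state sketch. All names here (struct vf_state, vf_set_rate(),
vf_rate_enforced(), SUPPORTED_QOS) are hypothetical and only model the
rule "save the rate now, enforce it once the VF is in VST mode with the
supported priority"; this is not the driver's code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-VF state; field names are illustrative only. */
struct vf_state {
	bool     vst_mode;    /* true: VST, false: VGT */
	uint8_t  qos;         /* user priority used in VST mode */
	uint32_t max_tx_rate; /* 0 means no rate limit configured */
};

#define SUPPORTED_QOS 0 /* the only priority supported for now */

/* Saving a rate always succeeds, whatever mode the VF is in. */
static void vf_set_rate(struct vf_state *vf, uint32_t max_tx_rate)
{
	vf->max_tx_rate = max_tx_rate;
}

/* The saved limit takes effect only in VST mode on a supported priority. */
static bool vf_rate_enforced(const struct vf_state *vf)
{
	return vf->max_tx_rate != 0 && vf->vst_mode &&
	       vf->qos == SUPPORTED_QOS;
}
```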

Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet/mlx4: Added qos_vport QP configuration in VST mode
Ido Shamay [Thu, 2 Apr 2015 13:31:15 +0000 (16:31 +0300)]
net/mlx4: Added qos_vport QP configuration in VST mode

The granular QoS per VF feature introduces a new QP field, qos_vport.

The PF administrator can connect VF QPs to a certain QoS Vport, to
inherit its properties. Connecting QPs to the default QoS Vport
(defined as 0) is always allowed, even when there are no allocated VPPs.
At this point, only the default vport is connected to QPs.

Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet/mlx4: Allocate VPPs for each port on PF init
Ido Shamay [Thu, 2 Apr 2015 13:31:14 +0000 (16:31 +0300)]
net/mlx4: Allocate VPPs for each port on PF init

Initialization of the granular QoS per VF mechanism.

Query the port's available VPPs and allocate those on all supported
priorities in an equal share. Allocation is done only in SRIOV mode,
when the feature is supported by the device and port type is Ethernet.

Allocation currently is done only on the default priority 0.

Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet/mlx4: Query device for QoS per VF support
Ido Shamay [Thu, 2 Apr 2015 13:31:13 +0000 (16:31 +0300)]
net/mlx4: Query device for QoS per VF support

Checks in QUERY_DEV_CAP if the granular QoS per VF feature is
supported by the device. Disabled for guests.

Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet/mlx4: Add mlx4_SET_VPORT_QOS implementation
Ido Shamay [Thu, 2 Apr 2015 13:31:12 +0000 (16:31 +0300)]
net/mlx4: Add mlx4_SET_VPORT_QOS implementation

Add the SET_VPORT_QOS device command, which is intended for virtual
granular QoS configuration per VF in SRIOV mode. The SET_VPORT_QOS
command sets and queries QoS parameters of a VPort. Each priority
allowed for a VPort is assigned with a share of the BW, and a BW
limitation. QoS parameters can be modified at any time, but must be
initialized before any QP is associated with the VPort.

Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet/mlx4: Add mlx4_ALLOCATE_VPP implementation
Ido Shamay [Thu, 2 Apr 2015 13:31:11 +0000 (16:31 +0300)]
net/mlx4: Add mlx4_ALLOCATE_VPP implementation

Implements device ALLOCATE_VPP command, to be used for granular QoS
configuration of VFs by the PF device. Defines and queries the amount
of VPPs assigned to each port, and the amount of VPPs assigned to each
priority of each port. Once the total VPPs are split between the priorities
of a port, they may be assigned with a share of the BW or a rate limit.

Split into two functions (get/set) which are supplied with
mlx4_alloc_vpp_context and physical port number.

Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet/mlx4: New file for QoS related firmware commands
Ido Shamay [Thu, 2 Apr 2015 13:31:10 +0000 (16:31 +0300)]
net/mlx4: New file for QoS related firmware commands

Create two new files fw_qos.h and fw_qos.c in mlx4_core module.

It gathers all relevant QoS firmware related commands etc, thus improving
encapsulation of the mlx4_core module. For now it contains the QoS existing
commands: mlx4_SET_PORT_SCHEDULER and mlx4_SET_PORT_PRIO2TC.

Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet/mlx4: Aesthetic code changes in multi_func_init
Ido Shamay [Thu, 2 Apr 2015 13:31:09 +0000 (16:31 +0300)]
net/mlx4: Aesthetic code changes in multi_func_init

Previous vf_oper and vf_admin code created very long lines, making it hard
to read the code. Added relevant in-struct pointers to reduce code
complexity and avoid code lines longer than 80 columns. Same logic is preserved.

Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet/mlx4: Make mlx4_is_eth visible inline function
Ido Shamay [Thu, 2 Apr 2015 13:31:08 +0000 (16:31 +0300)]
net/mlx4: Make mlx4_is_eth visible inline function

Currently implemented as static function in resource_tracker.c --
this change will allow other files in mlx4_core to use it as well.

Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet/mlx4_en: Change loopback only upon feature change
Ido Shamay [Thu, 2 Apr 2015 13:31:07 +0000 (16:31 +0300)]
net/mlx4_en: Change loopback only upon feature change

Currently any change of netdev features results in a call to
mlx4_en_update_loopback_state(). Those calls are unnecessary;
the function should be called only upon a loopback feature change.

Also moved some of the logic into mlx4_en_update_loopback_state().
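
A minimal sketch of the "only upon loopback feature change" test, assuming
features are kept as a bitmask. FEATURE_LOOPBACK stands in for the
kernel's NETIF_F_LOOPBACK bit; the bit positions and the helper name are
illustrative only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for netdev feature bits; the values are illustrative. */
#define FEATURE_LOOPBACK (1ull << 0)
#define FEATURE_RXCSUM   (1ull << 1)

/* True only when the loopback bit differs between the two feature sets. */
static bool loopback_changed(uint64_t old_features, uint64_t new_features)
{
	return ((old_features ^ new_features) & FEATURE_LOOPBACK) != 0;
}
```

XOR-ing the old and new masks leaves exactly the bits that changed, so
toggling an unrelated feature no longer triggers a loopback update.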

Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet/mlx4: Add RSS support for fragmented IP datagrams
Ido Shamay [Thu, 2 Apr 2015 13:31:06 +0000 (16:31 +0300)]
net/mlx4: Add RSS support for fragmented IP datagrams

Enable RSS support for fragmented IP packets, when device supports it.
Until now, fragmented IP packets were directed only to the default_qpn.
Since IP fragments (datagram) have no upper protocols (L3 IP packets),
hash is performed on 3-tuple - dst MAC, source IP and dest IP. The HW
makes sure that this holds for the 1st fragment too, so all fragments
go to the same QP.
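
Why a 3-tuple keeps all fragments together can be shown with a toy
example. The hash below is FNV-1a, used purely as a stand-in for whatever
hash the hardware applies; struct frag_key and pick_queue() are
hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Fields that are identical in every fragment of one IP datagram. */
struct frag_key {
	uint8_t  dst_mac[6];
	uint32_t src_ip;
	uint32_t dst_ip;
};

/* FNV-1a step; a stand-in for the hardware hash, not the real one. */
static uint32_t fnv1a(uint32_t h, const void *data, size_t len)
{
	const uint8_t *p = data;

	for (size_t i = 0; i < len; i++) {
		h ^= p[i];
		h *= 16777619u;
	}
	return h;
}

/* Map the 3-tuple to one of nqueues receive queues (nqueues must be > 0). */
static unsigned int pick_queue(const struct frag_key *k, unsigned int nqueues)
{
	uint32_t h = 2166136261u;

	h = fnv1a(h, k->dst_mac, sizeof(k->dst_mac));
	h = fnv1a(h, &k->src_ip, sizeof(k->src_ip));
	h = fnv1a(h, &k->dst_ip, sizeof(k->dst_ip));
	return h % nqueues;
}
```

Because no L4 port enters the key, the first fragment and the later ones
produce the same value and land on the same queue.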

Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
David S. Miller [Thu, 2 Apr 2015 20:16:53 +0000 (16:16 -0400)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Conflicts:
drivers/net/usb/asix_common.c
drivers/net/usb/sr9800.c
drivers/net/usb/usbnet.c
include/linux/usb/usbnet.h
net/ipv4/tcp_ipv4.c
net/ipv6/tcp_ipv6.c

The TCP conflicts were overlapping changes.  In 'net' we added a
READ_ONCE() to the socket cached RX route read, whilst in 'net-next'
Eric Dumazet touched the surrounding code dealing with how mini
sockets are handled.

With USB, it's a case of the same bug fix first going into net-next
and then I cherry picked it back into net.

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoMerge branch 'fixes' of git://git.infradead.org/users/vkoul/slave-dma
Linus Torvalds [Thu, 2 Apr 2015 18:30:36 +0000 (11:30 -0700)]
Merge branch 'fixes' of git://git.infradead.org/users/vkoul/slave-dma

Pull dmaengine fixes from Vinod Koul:
 "This time we have addition of caps for jz4740 which fixes intentional
  warning at boot.  Then we have memory leak issues in drivers using
  virt-dma by Peter on few drive"

* 'fixes' of git://git.infradead.org/users/vkoul/slave-dma:
  dmaengine: moxart-dma: Fix memory leak when stopping a running transfer
  dmaengine: bcm2835-dma: Fix memory leak when stopping a running transfer
  dmaengine: omap-dma: Fix memory leak when terminating running transfer
  dmaengine: edma: fix memory leak when terminating running transfers
  dmaengine: jz4740: Define capabilities

9 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Thu, 2 Apr 2015 18:09:41 +0000 (11:09 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

 1) Fix use-after-free with mac80211 RX A-MPDU reorder timer, from
    Johannes Berg.

 2) iwlwifi leaks memory every module load/unload cycle, fix from Larry
    Finger.

 3) Need to use for_each_netdev_safe() in rtnl_group_changelink()
    otherwise we can crash, from WANG Cong.

 4) mlx4 driver does register_netdev() too early in the probe sequence,
    from Ido Shamay.

 5) Don't allow router discovery hop limit to decrease the interface's
    hop limit, from D.S. Ljungmark.

 6) tx_packets and tx_bytes improperly accounted for certain classes of
    USB network devices, fix from Ben Hutchings.

 7) ip{6}mr_rules_init() mistakenly use plain kfree to release the ipmr
    tables in the error path, they must instead use ip{6}mr_free_table().
    Fix from WANG Cong.

 8) cxgb4 doesn't properly quiesce all RX activity before unregistering
    the netdevice.  Fix from Hariprasad Shenai.

 9) Fix hash corruptions in ipvlan driver, from Jiri Benc.

10) nla_memcpy(), like a real memcpy, should fully initialize the
    destination buffer, even if the source attribute is smaller.  Fix
    from Jiri Benc.

11) Fix wrong error code returned from iucv_sock_sendmsg().  We should
    use whatever sock_alloc_send_skb() put into 'err'.  From Eugene
    Crosser.

12) Fix slab object leak on module unload in TIPC, from Ying Xue.

13) Need a READ_ONCE() when reading the cached RX socket route in
    tcp_v{4,6}_early_demux().  From Michal Kubecek.

14) Still too many problems with TPC support in the ath9k driver, so
    disable it for now.  From Felix Fietkau.

15) When in AP mode the rtlwifi driver can leak DMA mappings, fix from
    Larry Finger.

16) Missing kzalloc() failure check in gs_usb CAN driver, from Colin Ian
    King.
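
Item 10 above (nla_memcpy() fully initializing the destination) can be
illustrated with a userspace sketch. copy_attr_padded() is a hypothetical
helper operating on plain buffers; it models the described behaviour and
is not the kernel function.

```c
#include <stddef.h>
#include <string.h>

/* Copy at most count bytes from a payload of src_len bytes into dest,
 * then zero whatever part of dest the payload did not cover.
 * Returns the number of payload bytes copied. */
static size_t copy_attr_padded(void *dest, size_t count,
			       const void *src, size_t src_len)
{
	size_t n = src_len < count ? src_len : count;

	memcpy(dest, src, n);
	if (count > n)
		memset((unsigned char *)dest + n, 0, count - n);
	return n;
}
```

Without the memset() step, a short source attribute would leave the tail
of the destination holding whatever was there before.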

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (52 commits)
  cxgb4: Fix to dump devlog, even if FW is crashed
  cxgb4: Firmware macro changes for fw version 1.13.32.0
  bnx2x: Fix kdump when iommu=on
  bnx2x: Fix kdump on 4-port device
  mac80211: fix RX A-MPDU session reorder timer deletion
  MAINTAINERS: Update Intel Wired Ethernet Driver info
  tipc: fix a slab object leak
  net/usb/r8152: add device id for Lenovo TP USB 3.0 Ethernet
  af_iucv: fix AF_IUCV sendmsg() errno
  openvswitch: Return vport module ref before destruction
  netlink: pad nla_memcpy dest buffer with zeroes
  bonding: Bonding Overriding Configuration logic restored.
  ipvlan: fix check for IP addresses in control path
  ipvlan: do not use rcu operations for address list
  ipvlan: protect against concurrent link removal
  ipvlan: fix addr hash list corruption
  net: fec: setup right value for mdio hold time
  net: tcp6: fix double call of tcp_v6_fill_cb()
  cxgb4vf: Fix sparse warnings
  netns: don't clear nsid too early on removal
  ...

9 years agoMerge branch 'netdev_iflink_remove'
David S. Miller [Thu, 2 Apr 2015 18:05:02 +0000 (14:05 -0400)]
Merge branch 'netdev_iflink_remove'

Nicolas Dichtel says:

====================
Remove iflink field from the net_device structure

The first goal of this series was to advertise the veth peer via the IFLA_LINK
attribute, but iflink was not ready for network namespaces.

The iflink of an interface should be set to its ifindex for a physical interface
and to another value (0 if not relevant) for a virtual interface.
This was not the case for some interfaces, like vxlan, bond, or bridge for
example.
There is also a risk, if the targeted interface moves to another netns, that the
ifindex changes without updating corresponding iflink fields (eg. vlan).

Moving the management of this property into virtual interface drivers allows
this last case to be handled better, because most virtual interface drivers have a
pointer to the link netdevice.
Anyway, dev->iflink value was always a copy of some internal data of the virtual
interface driver, thus let's use these internal data directly.

So, this series removes the iflink field and lets the drivers manage it.
Only the last patch was present in the v1, but I fully reworked it.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoveth: set iflink to the peer veth
Nicolas Dichtel [Thu, 2 Apr 2015 15:07:11 +0000 (17:07 +0200)]
veth: set iflink to the peer veth

Now that the peer netns is advertised in rtnl messages, we can set this property
so that IFLA_LINK will advertise the peer ifindex. It allows the userland to get
the full veth configuration.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agodev: set iflink to 0 for virtual interfaces
Nicolas Dichtel [Thu, 2 Apr 2015 15:07:10 +0000 (17:07 +0200)]
dev: set iflink to 0 for virtual interfaces

Virtual interfaces are supposed to set an iflink value different from their ifindex.
This was not the case for some of them, like vxlan, bond or bridge.
Let's set iflink to 0 when dev->rtnl_link_ops is set.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet: remove iflink field from struct net_device
Nicolas Dichtel [Thu, 2 Apr 2015 15:07:09 +0000 (17:07 +0200)]
net: remove iflink field from struct net_device

Now that all users of iflink have the ndo_get_iflink handler available, it's
possible to remove this field.

By default, dev_get_iflink() returns the ifindex of the interface.
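
The lookup order described here (driver handler first, ifindex as the
default) might look roughly like the following userspace sketch. The
structures are cut-down stand-ins for the kernel's net_device and
net_device_ops, and the "lower" pointer and stacked_get_iflink() handler
are hypothetical.

```c
#include <stddef.h>

struct net_dev;

/* Userspace stand-ins for the kernel structures; only the needed fields. */
struct net_dev_ops {
	int (*ndo_get_iflink)(const struct net_dev *dev);
};

struct net_dev {
	int ifindex;
	const struct net_dev_ops *ops;
	const struct net_dev *lower; /* hypothetical link to the lower device */
};

/* Use the driver's handler if it has one, else fall back to ifindex. */
static int get_iflink(const struct net_dev *dev)
{
	if (dev->ops && dev->ops->ndo_get_iflink)
		return dev->ops->ndo_get_iflink(dev);
	return dev->ifindex;
}

/* Example handler for a vlan-like device stacked on a lower device. */
static int stacked_get_iflink(const struct net_dev *dev)
{
	return dev->lower->ifindex;
}
```

Reading the value from the lower device at call time means a changed
ifindex is picked up without any cached copy to update.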

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agodsa: implement ndo_get_iflink
Nicolas Dichtel [Thu, 2 Apr 2015 15:07:08 +0000 (17:07 +0200)]
dsa: implement ndo_get_iflink

Don't use dev->iflink anymore.

CC: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoinfiniband/ipoib: implement ndo_get_iflink
Nicolas Dichtel [Thu, 2 Apr 2015 15:07:07 +0000 (17:07 +0200)]
infiniband/ipoib: implement ndo_get_iflink

Don't use dev->iflink anymore.

CC: Roland Dreier <roland@kernel.org>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoipvlan: implement ndo_get_iflink
Nicolas Dichtel [Thu, 2 Apr 2015 15:07:06 +0000 (17:07 +0200)]
ipvlan: implement ndo_get_iflink

Don't use dev->iflink anymore.

CC: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agomacvlan: implement ndo_get_iflink
Nicolas Dichtel [Thu, 2 Apr 2015 15:07:05 +0000 (17:07 +0200)]
macvlan: implement ndo_get_iflink

Don't use dev->iflink anymore.

CC: Patrick McHardy <kaber@trash.net>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agovlan: implement ndo_get_iflink
Nicolas Dichtel [Thu, 2 Apr 2015 15:07:04 +0000 (17:07 +0200)]
vlan: implement ndo_get_iflink

Don't use dev->iflink anymore.

CC: Patrick McHardy <kaber@trash.net>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoipmr,ip6mr: implement ndo_get_iflink
Nicolas Dichtel [Thu, 2 Apr 2015 15:07:03 +0000 (17:07 +0200)]
ipmr,ip6mr: implement ndo_get_iflink

Don't use dev->iflink anymore.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoipip,gre,vti,sit: implement ndo_get_iflink
Nicolas Dichtel [Thu, 2 Apr 2015 15:07:02 +0000 (17:07 +0200)]
ipip,gre,vti,sit: implement ndo_get_iflink

Don't use dev->iflink anymore.

CC: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoip6tnl,gre6,vti6: implement ndo_get_iflink
Nicolas Dichtel [Thu, 2 Apr 2015 15:07:01 +0000 (17:07 +0200)]
ip6tnl,gre6,vti6: implement ndo_get_iflink

Don't use dev->iflink anymore.

CC: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agodev: introduce dev_get_iflink()
Nicolas Dichtel [Thu, 2 Apr 2015 15:07:00 +0000 (17:07 +0200)]
dev: introduce dev_get_iflink()

The goal of this patch is to prepare the removal of the iflink field. It
introduces a new ndo function, which will be implemented by virtual interfaces.

There is no functional change in this patch. All readers of iflink field
now call dev_get_iflink().

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agocrypto: algif - explicitly mark end of data
tadeusz.struk@intel.com [Wed, 1 Apr 2015 20:53:06 +0000 (13:53 -0700)]
crypto: algif - explicitly mark end of data

After the TX sgl is expanded we need to explicitly mark end of data
at the last buffer that contains data.

Changes in v2
 - use type 'bool' and true/false for 'mark'.

Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoMerge branch 'dsa-next'
David S. Miller [Thu, 2 Apr 2015 02:55:41 +0000 (22:55 -0400)]
Merge branch 'dsa-next'

Andrew Lunn says:

====================
DSA Marvell drivers refactoring and cleanup

v1->v2:
 * Add missing Signed-off-by: for patches authored by Guenter Roeck.
 * Add Reviewed-by from Guenter Roeck to patch #5.

This is a collection of patches against net-next from today containing
refactoring and consolidation of code, cleanups and using #define's to
replace register numbers.

Patch #1 Swaps the 6131 driver to use the consolidated setup code.

Patch #2 Moves the Switch IDs used during probe into a central
         location.  We need these later so that we can differentiate
         the different features the devices have.

Patch #3 Makes the 6131 driver set the number of ports in the private
         state structure. It then uses this, rather than hard coded
         maximum number of ports.

Patch #4 Similar to Patch #3, but for the 6123_61_65 driver.

Patch #5 Similar to Patch #3, and #4, but for all the remaining
         drivers.  This greatly increases the similarity of the code
         between drivers, allowing further patches to consolidate the
         duplicated code.

Patch #6 Consolidate the switch reset code, which has two minor
         variants. Removes around 35 lines per driver.

Patch #7 Moves phy page access functions out of the 6352 driver into
         the shared code. Currently only the 6352 driver uses this,
         but it is likely other devices will come along wanting this
         functionality.

Patch #8 Consolidates the code used to access phy registers. Removes
         around 40 lines of code per driver.

Patch #9 Fixes missing mutex locking in the EEE code, and refactors
         the code a bit to make it more understandable with respect to
         locks.

Patch #10 Consolidates reading statistics. This is very similar code
          for all devices, but the number of available statistics
          differs, which can be determined from the product ID. Removes
          around 65 lines per driver.

Patch #11 Add #defines for registers, and bits within the
          registers. For the moment, this is limited to the shared
          code. The individual drivers will be converted once the
          remaining duplicated code is consolidated.

Patch #12 Fix broken statistic counters on the 6172. The 6352 family
          requires the port number to be poked into a different set of
          bits in the register compared to other devices.

Many thanks to Guenter Roeck for repeatedly reviewing the patches and
testing them on his hardware.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet: dsa: mv88e6xxx: Fix stats counters for 6352 family
Andrew Lunn [Thu, 2 Apr 2015 02:06:40 +0000 (04:06 +0200)]
net: dsa: mv88e6xxx: Fix stats counters for 6352 family

The statistic counters for the mv88e6172 never worked. This device is
a member of the 6352 family of chips, which has a slightly different
layout of the register used for capturing statistics. Add support for
detecting this family and poking the port in the right place in the
register.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet: dsa: Use mnemonics rather than register numbers
Andrew Lunn [Thu, 2 Apr 2015 02:06:39 +0000 (04:06 +0200)]
net: dsa: Use mnemonics rather than register numbers

Rather than refer to registers by number, define mnemonics. Also
define mnemonics for the commonly used bits within the registers.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet: dsa: Consolidate getting the statistics
Andrew Lunn [Thu, 2 Apr 2015 02:06:38 +0000 (04:06 +0200)]
net: dsa: Consolidate getting the statistics

Reading the statistics from the hardware is the same for all
chips. What differs is the number of available statistics. Have just
one copy of the code in the shared mv88e6xxx.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet: dsa: mv88e6xxx: Add missing mutex's in EEE operations.
Andrew Lunn [Thu, 2 Apr 2015 02:06:37 +0000 (04:06 +0200)]
net: dsa: mv88e6xxx: Add missing mutex's in EEE operations.

The phy_mutex should be held while reading and writing to the phy.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>