tcp: return EPOLLOUT from tcp_poll only when notsent_bytes is half the limit
author: Soheil Hassas Yeganeh <soheil@google.com>
Mon, 14 Sep 2020 21:52:09 +0000 (17:52 -0400)
committer: David S. Miller <davem@davemloft.net>
Mon, 14 Sep 2020 23:58:24 +0000 (16:58 -0700)

If any event is available on the TCP socket, tcp_poll() is called
to retrieve all the events.  In tcp_poll(), we call
sk_stream_is_writeable(), which returns true as long as we are at least
one byte below notsent_lowat.  This results in quite a few spurious
EPOLLOUT events and frequent tiny sendmsg() calls.

Similar to sk_stream_write_space(), use __sk_stream_is_writeable
with a wake value of 1, so that we set EPOLLOUT only if half the
space is available for write.
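
The effect of the wake value can be illustrated with a small userspace
model. This is only a sketch, not the kernel code: the function name
notsent_below_lowat() and its parameters are hypothetical, and it assumes
the check compares the not-yet-sent byte count, shifted left by wake,
against notsent_lowat. The real helper also checks send-buffer space,
which is left out here.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical userspace model of the notsent_lowat part of the
 * writeable check (assumption: the kernel compares the shifted
 * not-yet-sent byte count against the limit).
 *
 * wake == 0: true as soon as notsent_bytes is below the limit.
 * wake == 1: true only once notsent_bytes is below half the limit.
 */
static bool notsent_below_lowat(uint32_t notsent_bytes,
                                uint32_t notsent_lowat, int wake)
{
    return (notsent_bytes << wake) < notsent_lowat;
}
```

With a limit of 131072 bytes, a socket holding 131071 unsent bytes counts
as writeable with wake 0 but not with wake 1. With wake 1 it becomes
writeable only once fewer than 65536 bytes remain unsent.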

Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
net/ipv4/tcp.c

index d3781b6..48c3518 100644
@@ -564,7 +564,7 @@ __poll_t tcp_poll(struct file *file, struct socket *sock, poll_table *wait)
                        mask |= EPOLLIN | EPOLLRDNORM;
 
                if (!(sk->sk_shutdown & SEND_SHUTDOWN)) {
-                       if (sk_stream_is_writeable(sk)) {
+                       if (__sk_stream_is_writeable(sk, 1)) {
                                mask |= EPOLLOUT | EPOLLWRNORM;
                        } else {  /* send SIGIO later */
                                sk_set_bit(SOCKWQ_ASYNC_NOSPACE, sk);
@@ -576,7 +576,7 @@ __poll_t tcp_poll(struct file *file, struct socket *sock, poll_table *wait)
                                 * pairs with the input side.
                                 */
                                smp_mb__after_atomic();
-                               if (sk_stream_is_writeable(sk))
+                               if (__sk_stream_is_writeable(sk, 1))
                                        mask |= EPOLLOUT | EPOLLWRNORM;
                        }
                } else