inet: frag: move eviction of queues to work queue

When the high_thresh limit is reached, we try to toss the 'oldest'
incomplete fragment queues until memory limits are below the low_thresh
value.  This happens in softirq/packet processing context.

This has two drawbacks:

1) processors might evict a queue that was about to be completed
by another cpu, because they will compete wrt. resource usage and
resource reclaim.

2) LRU list maintenance is expensive.

But when constantly overloaded, even the 'least recently used' element is
recent, so removing the 'lru' queue first is not 'fairer' than removing any
other fragment queue.

This moves eviction out of the fast path:

When the memory limit is reached, a work queue is scheduled
which then iterates over the table and removes the queues that exceed
the memory limits of the namespace.  It sets a new flag called
INET_FRAG_EVICTED on the evicted queues so the proper counters will get
incremented when the queue is forcefully expired.
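
A rough sketch of that worker, for illustration only -- this is not the
code added by the patch; the EVICT_BUDGET bound and the reuse of the
frag_expire callback are assumptions made for the example:

	static void inet_frag_worker(struct work_struct *work)
	{
		struct inet_frags *f = container_of(work, struct inet_frags,
						    frags_work);
		struct inet_frag_queue *fq;
		struct hlist_node *n;
		unsigned int i;

		/* resume where the previous run stopped and only scan a
		 * bounded number of buckets per run
		 * (EVICT_BUDGET is an illustrative constant, not from
		 * this patch)
		 */
		for (i = f->next_bucket; i < f->next_bucket + EVICT_BUDGET; i++) {
			struct inet_frag_bucket *hb = &f->hash[i % INETFRAGS_HASHSZ];
			HLIST_HEAD(expired);

			spin_lock(&hb->chain_lock);
			hlist_for_each_entry_safe(fq, n, &hb->chain, list) {
				/* stop evicting once the namespace is back
				 * below its low_thresh memory limit
				 */
				if (frag_mem_limit(fq->net) < fq->net->low_thresh)
					continue;

				fq->last_in |= INET_FRAG_EVICTED;
				hlist_del(&fq->list);
				hlist_add_head(&fq->list, &expired);
			}
			spin_unlock(&hb->chain_lock);

			/* assumed here: the normal expiry callback notices
			 * INET_FRAG_EVICTED and bumps the eviction counters
			 * instead of the timeout ones
			 */
			hlist_for_each_entry_safe(fq, n, &expired, list)
				f->frag_expire((unsigned long) fq);
		}

		f->next_bucket = i % INETFRAGS_HASHSZ;
	}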

When the high threshold is reached, no more fragment queues are
created until we're below the limit again.
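
As a sketch of that check (again only illustrative, assuming the test
sits on the queue-allocation path and that the eviction worker is kicked
from there):

	/* in the queue allocation path, before creating a new queue */
	if (frag_mem_limit(nf) > nf->high_thresh) {
		/* defer eviction to process context instead of doing it
		 * here in softirq; the caller gets no queue and drops
		 * the fragment
		 */
		if (!work_pending(&f->frags_work))
			schedule_work(&f->frags_work);
		return NULL;
	}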

The LRU list is now unused and will be removed in a followup patch.

Joint work with Nikolay Aleksandrov.

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
diff --git a/include/net/inet_frag.h b/include/net/inet_frag.h
index 9fe644d..e975032 100644
--- a/include/net/inet_frag.h
+++ b/include/net/inet_frag.h
@@ -32,6 +32,7 @@
 	int			meat;
 	__u8			last_in;    /* first/last segment arrived? */
 
+#define INET_FRAG_EVICTED	8
 #define INET_FRAG_COMPLETE	4
 #define INET_FRAG_FIRST_IN	2
 #define INET_FRAG_LAST_IN	1
@@ -48,7 +49,7 @@
  *	       rounded up (SKB_TRUELEN(0) + sizeof(struct ipq or
  *	       struct frag_queue))
  */
-#define INETFRAGS_MAXDEPTH		128
+#define INETFRAGS_MAXDEPTH	128
 
 struct inet_frag_bucket {
 	struct hlist_head	chain;
@@ -65,6 +66,9 @@
 	int			secret_interval;
 	struct timer_list	secret_timer;
 
+	struct work_struct	frags_work;
+	unsigned int next_bucket;
+
 	/* The first call to hashfn is responsible to initialize
 	 * rnd. This is best done with net_get_random_once.
 	 */