From anand@eis.iisc.ernet.in Thu Jan 1 01:36:19 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 01 Jan 2004 01:36:44 -0800 (PST) Received: from iisc.ernet.in ([144.16.64.3]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i019aFTa015403 for ; Thu, 1 Jan 2004 01:36:17 -0800 Received: from eis.iisc.ernet.in (eis.iisc.ernet.in [144.16.64.5]) by iisc.ernet.in (8.12.8/8.12.8) with SMTP id i019f9bk016607 for ; Thu, 1 Jan 2004 15:11:09 +0530 Received: by eis.iisc.ernet.in (SMI-8.6/SMI-4.1) id PAA19361; Thu, 1 Jan 2004 15:05:48 +0530 From: anand@eis.iisc.ernet.in (SVR Anand) Message-Id: <200401010935.PAA19361@eis.iisc.ernet.in> Subject: netfilter hook in packet capture To: netdev@oss.sgi.com Date: Thu, 1 Jan 2004 15:05:47 +0530 (GMT+05:30) X-Mailer: ELM [version 2.5 PL2] MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2193 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: anand@eis.iisc.ernet.in Precedence: bulk X-list: netdev Hi, I am hoping that this mail will be relevant to those who have a user-level application that does firewalling by promiscuously capturing packets and applying extensive decision-making rules. I have written one such application and found that what I describe below has helped me a lot. In the above-mentioned scenario, there were many instances where I preferred to do the following in the kernel itself, as much as possible, before further processing in the user firewall code. What I wanted was i) selective packet capture ii) DoS protection iii) externally controlling the kernel-level filter rules without disrupting the firewall application iv) ease of filter management v) logging vi) performance ... I found that iptables/netfilter satisfies all my requirements, compared to the existing bpf filter. 
Hence all I had to do was include NF_HOOK at a couple of places in af_packet.c, and I now have the netfilter features accessible to me. With the above trivial inclusion I am currently running my application with all the initial work done by netfilter. Thanks, netfilter group! While the sanctity, appropriateness, or compliance of the above patch within the packet capture scheme of things can be frowned upon, its utility has been beneficial to me, so I thought I would share it with you. If you find it useful I can send you the rather simple patch, which perhaps you could easily have guessed! Anand From davem@pizda.ninka.net Thu Jan 1 12:29:32 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 01 Jan 2004 12:29:46 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i01KTWTa002355 for ; Thu, 1 Jan 2004 12:29:32 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id MAA10678; Thu, 1 Jan 2004 12:24:31 -0800 Date: Thu, 1 Jan 2004 12:24:30 -0800 From: "David S. Miller" To: "YOSHIFUJI Hideaki / 吉藤英明" Cc: netdev@oss.sgi.com Subject: Re: [PATCH] IPV6: kill obsolete functions Message-Id: <20040101122430.4fcf6bfd.davem@redhat.com> In-Reply-To: <20040101.154305.69668715.yoshfuji@linux-ipv6.org> References: <20040101.154305.69668715.yoshfuji@linux-ipv6.org> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2194 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Thu, 01 Jan 2004 15:43:05 +0900 (JST) YOSHIFUJI Hideaki / 吉藤英明 wrote: > We've migrated to ip6_append_data(). Old functions such as > ip6_frag_xmit() and ip6_build_xmit() are no longer used. > > D: kill obsolete functions. Applied, arigato Yoshfuji-san. 
From davem@pizda.ninka.net Thu Jan 1 12:43:13 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 01 Jan 2004 12:43:26 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i01KhBTa002835 for ; Thu, 1 Jan 2004 12:43:13 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id MAA10719; Thu, 1 Jan 2004 12:38:14 -0800 Date: Thu, 1 Jan 2004 12:38:14 -0800 From: "David S. Miller" To: Bart De Schuymer Cc: shemminger@osdl.org, netdev@oss.sgi.com Subject: Re: [PATCH] Fix loopback over bridge port Message-Id: <20040101123814.15fdc875.davem@redhat.com> In-Reply-To: <200312271536.46940.bdschuym@pandora.be> References: <200312271536.46940.bdschuym@pandora.be> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2195 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Sat, 27 Dec 2003 15:36:46 +0100 Bart De Schuymer wrote: > I think the code (in net/ipv4/ip_output.c::ip_dev_loopback_xmit()) > __skb_pull(newskb, newskb->nh.raw - newskb->data); > is useless, as data always points to the network header at that moment. But > that's not really my territory... I think you're right about this, but the code there doesn't cause any problems either, effectively it's a NOP. 
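To illustrate why the pull is a no-op when data already points at the network header, here is a minimal userspace sketch. It is not kernel code: `toy_skb`, `toy_skb_pull` and `pull_is_nop_for_offset` are hypothetical stand-ins, and the only thing assumed about `__skb_pull()` is that it advances `data` and shrinks `len` by the amount it is given.

```c
#include <assert.h>

/* Hypothetical, minimal stand-in for the sk_buff fields involved. */
struct toy_skb {
    unsigned char *data;    /* start of the current view of the packet */
    unsigned char *nh_raw;  /* network header pointer (skb->nh.raw)    */
    unsigned int len;       /* bytes from data to the end of the packet */
};

/* Same arithmetic as a pull: drop len bytes from the front. */
static unsigned char *toy_skb_pull(struct toy_skb *skb, unsigned int len)
{
    skb->len -= len;
    return skb->data += len;
}

/*
 * Build a toy buffer whose network header sits nh_offset bytes past
 * data, pull up to the network header, and report whether anything
 * changed. Returns 1 if the pull was a no-op, 0 otherwise.
 */
static int pull_is_nop_for_offset(unsigned int nh_offset)
{
    unsigned char buf[64] = {0};
    struct toy_skb skb = { buf, buf + nh_offset, (unsigned int)sizeof(buf) };
    unsigned char *old_data = skb.data;
    unsigned int old_len = skb.len;

    assert(nh_offset <= sizeof(buf)); /* keep nh_raw inside the buffer */

    toy_skb_pull(&skb, (unsigned int)(skb.nh_raw - skb.data));
    return skb.data == old_data && skb.len == old_len;
}
```

With an offset of 0 (data already at the network header) the pull length is zero and nothing moves; with a non-zero offset, such as a 14-byte link-layer header still in front, the pull does real work.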
One could test out whether you and I are right or not by replacing the __skb_pull() call with BUG_TRAP(skb->nh.raw == skb->data) From davem@pizda.ninka.net Thu Jan 1 12:47:14 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 01 Jan 2004 12:47:26 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i01KlDTa003236 for ; Thu, 1 Jan 2004 12:47:13 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id MAA10731; Thu, 1 Jan 2004 12:42:18 -0800 Date: Thu, 1 Jan 2004 12:42:18 -0800 From: "David S. Miller" To: Jeff Garzik Cc: benh@kernel.crashing.org, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Problem with dev_kfree_skb_any() in 2.6.0 Message-Id: <20040101124218.258e8b73.davem@redhat.com> In-Reply-To: <3FF1B939.1090108@pobox.com> References: <1072567054.4112.14.camel@gaston> <20031227170755.4990419b.davem@redhat.com> <3FF0FA6A.8000904@pobox.com> <20031229205157.4c631f28.davem@redhat.com> <20031230051519.GA6916@gtf.org> <20031229220122.30078657.davem@redhat.com> <3FF11745.4060705@pobox.com> <20031229221345.31c8c763.davem@redhat.com> <3FF1B939.1090108@pobox.com> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2196 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Tue, 30 Dec 2003 12:43:21 -0500 Jeff Garzik wrote: > Luckily, I feel there is an easy solution, as shown in the attached > patch. We _already_ queue skbs in dev_kfree_skb_irq(). Therefore, > dev_kfree_skb_any() can simply use precisely that same solution. The > raise-softirq code will immediately proceed to action if we are not in > hard IRQ context, otherwise it will follow the expected path. Ok, this is reasonable and works. 
Though, is there any particular reason you don't like adding a "|| irqs_disabled()" check to the if statement instead? I prefer that solution better actually. From garzik@gtf.org Thu Jan 1 18:58:15 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 01 Jan 2004 18:58:28 -0800 (PST) Received: from havoc.gtf.org (havoc.gtf.org [63.247.75.124]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i022wETa017655 for ; Thu, 1 Jan 2004 18:58:15 -0800 Received: by havoc.gtf.org (Postfix, from userid 500) id 26E0E6611; Thu, 1 Jan 2004 21:58:07 -0500 (EST) Date: Thu, 1 Jan 2004 21:58:07 -0500 From: Jeff Garzik To: "David S. Miller" Cc: benh@kernel.crashing.org, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Problem with dev_kfree_skb_any() in 2.6.0 Message-ID: <20040102025807.GB3851@gtf.org> References: <1072567054.4112.14.camel@gaston> <20031227170755.4990419b.davem@redhat.com> <3FF0FA6A.8000904@pobox.com> <20031229205157.4c631f28.davem@redhat.com> <20031230051519.GA6916@gtf.org> <20031229220122.30078657.davem@redhat.com> <3FF11745.4060705@pobox.com> <20031229221345.31c8c763.davem@redhat.com> <3FF1B939.1090108@pobox.com> <20040101124218.258e8b73.davem@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20040101124218.258e8b73.davem@redhat.com> User-Agent: Mutt/1.3.28i X-archive-position: 2197 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev On Thu, Jan 01, 2004 at 12:42:18PM -0800, David S. Miller wrote: > On Tue, 30 Dec 2003 12:43:21 -0500 > Jeff Garzik wrote: > > > Luckily, I feel there is an easy solution, as shown in the attached > > patch. We _already_ queue skbs in dev_kfree_skb_irq(). Therefore, > > dev_kfree_skb_any() can simply use precisely that same solution. 
The > > raise-softirq code will immediately proceed to action if we are not in > > hard IRQ context, otherwise it will follow the expected path. > > Ok, this is reasonable and works. > > Though, is there any particular reason you don't like adding a > "|| irqs_disabled()" check to the if statement instead? > I prefer that solution better actually. Yep, in fact when I wrote the above message, I came across a couple of reasons when I was pondering... * the destructor runs in a more predictable context. * given the problem that started this thread, the 'if' test is a potentially problematic area. Why not eliminate all possibility that this problem will occur again? The only counter argument to this -- to which I have no data to answer -- is that there may be an advantage to calling __kfree_skb immediately instead of deferring it slightly. I didn't think that disadvantage outweighed the above, but who knows... I can possibly be convinced otherwise. (and "otherwise" would be using || irqs_disabled()) For the users who don't know/don't care about their context, it just seemed to me that they were not a hot path like users of dev_kfree_skb() and dev_kfree_skb_irq() [unconditional] are... 
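The options being weighed can be compared in a small userspace model. This is a sketch under stated assumptions, not the kernel implementation: the `toy_` names are hypothetical, the two booleans stand in for what in_irq() and irqs_disabled() would report, and it is assumed that the existing 'if' tests only for hard-IRQ context and that the troublesome caller runs with interrupts disabled outside hard-IRQ context.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_action { TOY_FREE_NOW, TOY_DEFER_TO_SOFTIRQ };

/* Assumed existing behaviour: only hard-IRQ context defers the free. */
static enum toy_action toy_free_any_current(bool in_hard_irq, bool irqs_off)
{
    (void)irqs_off; /* not consulted by the assumed existing test */
    return in_hard_irq ? TOY_DEFER_TO_SOFTIRQ : TOY_FREE_NOW;
}

/* Alternative with the extra "|| irqs_disabled()" check. */
static enum toy_action toy_free_any_irqs_check(bool in_hard_irq, bool irqs_off)
{
    return (in_hard_irq || irqs_off) ? TOY_DEFER_TO_SOFTIRQ : TOY_FREE_NOW;
}

/* The patch's approach: always queue, whatever the context. */
static enum toy_action toy_free_any_always_defer(bool in_hard_irq, bool irqs_off)
{
    (void)in_hard_irq;
    (void)irqs_off;
    return TOY_DEFER_TO_SOFTIRQ;
}

/* Walk through the two contexts that distinguish the options. */
void toy_demo(void)
{
    /* Interrupts off, not in hard IRQ: the assumed problem case. */
    assert(toy_free_any_current(false, true) == TOY_FREE_NOW);
    assert(toy_free_any_irqs_check(false, true) == TOY_DEFER_TO_SOFTIRQ);
    assert(toy_free_any_always_defer(false, true) == TOY_DEFER_TO_SOFTIRQ);

    /* Plain process context: only "always defer" gives up the immediate free. */
    assert(toy_free_any_irqs_check(false, false) == TOY_FREE_NOW);
    assert(toy_free_any_always_defer(false, false) == TOY_DEFER_TO_SOFTIRQ);
}
```

In this model both fixes handle the interrupts-off case identically; they differ only in plain process context, which is exactly the "immediate versus slightly deferred" trade-off raised above.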
Jeff From scott.feldman@intel.com Thu Jan 1 23:25:05 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 01 Jan 2004 23:25:18 -0800 (PST) Received: from caduceus.fm.intel.com (fmr02.intel.com [192.55.52.25]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i027P5Ta027452 for ; Thu, 1 Jan 2004 23:25:05 -0800 Received: from petasus.fm.intel.com (petasus.fm.intel.com [10.1.192.37]) by caduceus.fm.intel.com (8.12.9-20030918-01/8.12.9/d: major-outer.mc,v 1.12 2003/12/18 18:58:11 root Exp $) with ESMTP id i027PMgd032695; Fri, 2 Jan 2004 07:25:22 GMT Received: from fmsmsxvs042.fm.intel.com (fmsmsxvs042.fm.intel.com [132.233.42.128]) by petasus.fm.intel.com (8.12.9-20030918-01/8.12.9/d: major-inner.mc,v 1.7 2003/12/18 18:58:10 root Exp $) with SMTP id i027PGMQ030344; Fri, 2 Jan 2004 07:25:18 GMT Received: from [134.134.3.50] ([134.134.3.50]) by fmsmsxvs042.fm.intel.com (SAVSMTP 3.1.2.35) with SMTP id M2004010123240127965 ; Thu, 01 Jan 2004 23:24:01 -0800 Date: Fri, 2 Jan 2004 00:00:41 -0800 (PST) From: "Feldman, Scott" X-X-Sender: scott.feldman@localhost.localdomain Reply-To: "Feldman, Scott" To: Mirko Lindner cc: Jeff Garzik , , , , Subject: Re: [PATCH]sk98lin ethtool support In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=ISO-8859-1 X-Scanned-By: MIMEDefang 2.31 (www . roaringpenguin . com / mimedefang) Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from QUOTED-PRINTABLE to 8bit by oss.sgi.com id i027P5Ta027452 X-archive-position: 2198 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: scott.feldman@intel.com Precedence: bulk X-list: netdev On Tue, 30 Dec 2003, Mirko Lindner wrote: > > Make sure you don't duplicate any ethtool functions.  We don't need a > > NIC-specific diag tool either ;-)  ethtool is the preferred method > > moving forward, as it's already shipping in most Linux distros. > > Yes, we need it ;) No kidding! 
This is not a tool for SW checks like > media, link or driver version checks, but a tool for HW checks like > register, PROM, MAC, PHY and some other chip and card checks. The > ethtool is a great tool, but the intention of this tool is not the same. If the tool reports the results of running the h/w checks, then you can use ETHTOOL_TEST. The summary results of all the tests is reported as PASS/FAIL. Not sure if your tool needs to do more... -scott From uucp@coruscant.gnumonks.org Fri Jan 2 04:12:30 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 02 Jan 2004 04:12:50 -0800 (PST) Received: from coruscant.gnumonks.org (mail@coruscant.franken.de [193.174.159.226]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i02CCTTa003090 for ; Fri, 2 Jan 2004 04:12:30 -0800 Received: from uucp by coruscant.gnumonks.org with local-bsmtp (Exim 4.20) id 1AcOAF-0003i6-KJ for netdev@oss.sgi.com; Fri, 02 Jan 2004 13:12:27 +0100 Received: from laforge by obroa-skai.gnumonks.org with local (Exim 3.36 #1) id 1AcOA7-0001Xr-00; Fri, 02 Jan 2004 13:12:19 +0100 Date: Fri, 2 Jan 2004 13:12:18 +0100 From: Harald Welte To: SVR Anand Cc: netdev@oss.sgi.com Subject: Re: netfilter hook in packet capture Message-ID: <20040102121218.GK3530@obroa-skai.de.gnumonks.org> References: <200401010935.PAA19361@eis.iisc.ernet.in> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="KrHCbChajFcK0yQE" Content-Disposition: inline In-Reply-To: <200401010935.PAA19361@eis.iisc.ernet.in> X-Operating-System: Linux obroa-skai.de.gnumonks.org 2.6.0-test11 X-Date: Today is Prickle-Prickle, the 72nd day of The Aftermath in the YOLD 3169 User-Agent: Mutt/1.5.4i X-archive-position: 2199 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: laforge@netfilter.org Precedence: bulk X-list: netdev --KrHCbChajFcK0yQE Content-Type: text/plain; charset=us-ascii Content-Disposition: inline 
Content-Transfer-Encoding: quoted-printable On Thu, Jan 01, 2004 at 03:05:47PM +0530, SVR Anand wrote: > While the sanctity or appropriateness or compliance of the above patch within > packet capture scheme of things can be frowned upon, since its utility has > been beneficial to me, I thought I would share it with you. If you find it > useful I can send you rather simple patch which perhaps you could have easily > guessed it! We would definitely be interested in such a patch. Please first submit it to the netfilter development team, and we will then push it for kernel inclusion (after any changes/suggestions that we might have). Please send the patch to netfilter-devel@lists.netfilter.org Thanks a lot! > Anand -- - Harald Welte http://www.netfilter.org/ ============================================================================ "Fragmentation is like classful addressing -- an interesting early architectural error that shows how much experimentation was going on while IP was being designed." 
-- Paul Vixie --KrHCbChajFcK0yQE Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.3 (GNU/Linux) iD8DBQE/9WAiXaXGVTD0i/8RAjuRAJ4ro+9ux1s7gHn0RPyGD/xgoNCFfACeOys6 A3nHSwc7BsqgWdF5Y1ttJb4= =Cv1I -----END PGP SIGNATURE----- --KrHCbChajFcK0yQE-- From demon@pro-linux.de Fri Jan 2 05:26:51 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 02 Jan 2004 05:27:05 -0800 (PST) Received: from pro-linux.de (4demon.com [217.160.186.4]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i02DQoTa006995 for ; Fri, 2 Jan 2004 05:26:51 -0800 Received: from pro-linux.de (p508B2522.dip.t-dialin.net [80.139.37.34]) by pro-linux.de (Postfix) with ESMTP id B05FE14009E; Fri, 2 Jan 2004 14:26:48 +0100 (CET) Message-ID: <3FF57ECE.5020107@pro-linux.de> Date: Fri, 02 Jan 2004 14:23:10 +0000 From: Mirko Lindner User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6b) Gecko/20031210 X-Accept-Language: en-us, en MIME-Version: 1.0 To: "Feldman, Scott" Cc: Jeff Garzik , krishnakumar@naturesoft.net, mlindner@syskonnect.de, netdev@oss.sgi.com, felix@allot.com Subject: Re: [PATCH]sk98lin ethtool support References: In-Reply-To: Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2200 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: demon@pro-linux.de Precedence: bulk X-list: netdev Feldman, Scott wrote: > On Tue, 30 Dec 2003, Mirko Lindner wrote: > > >>>Make sure you don't duplicate any ethtool functions. We don't need a >>>NIC-specific diag tool either ;-) ethtool is the preferred method >>>moving forward, as it's already shipping in most Linux distros. >> >>Yes, we need it ;) No kidding! This is not a tool for SW checks like >>media, link or driver version checks, but a tool for HW checks like >>register, PROM, MAC, PHY and some other chip and card checks. 
The >>ethtool is a great tool, but the intention of this tool is not the same. > > > If the tool reports the results of running the h/w checks, then you can > use ETHTOOL_TEST. The summary results of all the tests is reported as > PASS/FAIL. Not sure if your tool needs to do more... > > -scott > > > > > > Scott, thanks for this info, but the tool reports not only the status, but also the results of a test (Example: "Register 0xxxx=xxx", PROM info...). "Problem" 2: All tests are included in the DIAG tool and not in the driver. We have approx. 100 separate tests (over 1000 individual tests) and the driver is huge enough (Support for Genesis, Yukon, Yukon-Lite, Yukon-Plus and Yukon2 chipsets). Mirko From skraw@ithnet.com Fri Jan 2 07:00:22 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 02 Jan 2004 07:00:34 -0800 (PST) Received: from heather-ng.ithnet.com (mail3.ithnet.com [217.64.64.7]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i02F0KTa014253 for ; Fri, 2 Jan 2004 07:00:21 -0800 Received: (qmail 6723 invoked by uid 0); 2 Jan 2004 12:39:14 -0000 Received: from skraw@ithnet.com by heather-ng (Processed in 0.556945 secs); 02 Jan 2004 12:39:14 -0000 X-Virus-Status: No Received: from unknown (HELO ithnet.com) (217.64.64.14) by heather-ng.ithnet.com with SMTP; 2 Jan 2004 12:39:13 -0000 X-Sender-Authentication: net64 Date: Fri, 2 Jan 2004 13:39:13 +0100 From: Stephan von Krawczynski To: netdev@oss.sgi.com Subject: Re: 2.4 and ip fragmentation question (background info) Message-Id: <20040102133913.488cd537.skraw@ithnet.com> In-Reply-To: <20031231122325.77f19143.skraw@ithnet.com> References: <20031231122325.77f19143.skraw@ithnet.com> Organization: ith Kommunikationstechnik GmbH X-Mailer: Sylpheed version 0.9.8 (GTK+ 1.2.10; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2201 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com 
X-original-sender: skraw@ithnet.com Precedence: bulk X-list: netdev On Wed, 31 Dec 2003 12:23:25 +0100 Stephan von Krawczynski wrote: > Hello, > > is ip fragmentation thought to work with multiple fragmented packets all with > same ID field, same source and destination address? Or can one consider this > situation as generally unsolvable and broken by application? > > Regards, > Stephan As this question obviously sounded stupid enough not to be answered, I may point you to this code in 2.4 include/net/ip.h:

static inline void ip_select_ident(struct iphdr *iph, struct dst_entry *dst, struct sock *sk)
{
	if (iph->frag_off&__constant_htons(IP_DF)) {
		/* This is only to work around buggy Windows95/2000
		 * VJ compression implementations. If the ID field
		 * does not change, they drop every other packet in
		 * a TCP stream using header compression.
		 */
		iph->id = ((sk && sk->daddr) ? htons(sk->protinfo.af_inet.id++) : 0);
	} else
		__ip_select_ident(iph, dst);
}

As you all know this sets the ID field inside the ip-header. Interestingly it depends on the frag_off and sk->daddr fields. I ran into an application (formerly for 2.2 kernel) where the author (!=me) obviously was unaware of this dependency and initialised these fields after calling ip_select_ident. The outcome was that everything ran normally during low traffic, but when more packets were transferred it looked like an increasing number of packets got "0" as ID, because iph->frag_off was not initialised correctly and the skbs were of course not zeroed. Still this would have been no problem if these packets weren't fragmented. What I saw was that packets got corrupted during high load (because fragmentation obviously vomited on the high rate of "ID=0" packets), but all was perfectly well during low load. Should the author have read some doc where it is clearly stated that ip_select_ident needs a more or less completely initialised ip header to work as expected? (other way round see my original question...) 
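The ordering dependency described above can be reproduced in a small userspace model. This is only a sketch: `toy_iphdr`, `toy_select_ident`, `toy_next_id` and the two builder functions are hypothetical stand-ins, byte-order conversion is deliberately ignored, and the stale `frag_off` value simply simulates leftover contents of a reused, non-zeroed buffer.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_IP_DF 0x4000u /* "don't fragment" bit, byte order ignored here */

/* Hypothetical stand-in for the two header fields that matter. */
struct toy_iphdr {
    uint16_t id;
    uint16_t frag_off;
};

static uint16_t toy_next_id = 1; /* stand-in for the generic ID generator */

/*
 * Same shape as ip_select_ident(): the branch depends on frag_off, and
 * has_daddr plays the role of the (sk && sk->daddr) test.
 */
static void toy_select_ident(struct toy_iphdr *iph, int has_daddr,
                             uint16_t *sock_id)
{
    if (iph->frag_off & TOY_IP_DF)
        iph->id = has_daddr ? (*sock_id)++ : 0;
    else
        iph->id = toy_next_id++;
}

/* Wrong order: the ID is chosen while frag_off still holds stale data. */
static uint16_t toy_build_wrong_order(uint16_t stale_frag_off)
{
    struct toy_iphdr iph = { 0, stale_frag_off };
    uint16_t sock_id = 100;

    toy_select_ident(&iph, 0, &sock_id); /* daddr not filled in yet */
    iph.frag_off = 0;                    /* real value written too late */
    return iph.id;
}

/* Right order: finish the header first, then select the ID. */
static uint16_t toy_build_right_order(uint16_t stale_frag_off)
{
    struct toy_iphdr iph = { 0, stale_frag_off };
    uint16_t sock_id = 100;

    iph.frag_off = 0;                    /* DF clear: packet may be fragmented */
    toy_select_ident(&iph, 1, &sock_id);
    return iph.id;
}

/* A stale DF bit yields ID 0 in the wrong order, a real ID in the right one. */
void toy_demo(void)
{
    assert(toy_build_wrong_order(TOY_IP_DF) == 0);
    assert(toy_build_right_order(TOY_IP_DF) != 0);
}
```

Whether the wrong order misbehaves depends entirely on what happens to be in the buffer, which would be consistent with the symptom only showing up as traffic, and with it buffer reuse, increased.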
Regards, Stephan From Sidharth.Deshpande@fh-heidelberg.de Sat Jan 3 06:40:42 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 03 Jan 2004 06:40:55 -0800 (PST) Received: from proxy.fh-heidelberg.de (dnsfh.fh-heidelberg.de [193.197.74.49]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i03EefTa010720 for ; Sat, 3 Jan 2004 06:40:42 -0800 Received: from EXFBI01.dcs.fh-heidelberg.de (localhost [127.0.0.1]) by proxy.fh-heidelberg.de (Postfix) with ESMTP id 8C9B31EE5B for ; Sat, 3 Jan 2004 15:40:34 +0100 (CET) content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Subject: netstat -s X-MimeOLE: Produced By Microsoft Exchange V6.0.6249.0 Date: Sat, 3 Jan 2004 15:40:34 +0100 Message-ID: <842F6B6B3410144AB9937CFECFE446DE804AF5@EXFBI01.dcs.fh-heidelberg.de> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: netstat -s Thread-Index: AcPSB6Kw03MxMWcRQMKrf9who8sE8g== From: "Deshpande, Sidharth (FH)" To: Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id i03EefTa010720 X-archive-position: 2202 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Sidharth.Deshpande@fh-heidelberg.de Precedence: bulk X-list: netdev Hello Team, I am looking for an explanation of the output generated by 'netstat -s' on a Linux box. Would any of you spare some time and please let me know what it means, or at least direct me to a definitive source? 
Thank you Sidharth Deshpande From krishnakumar@naturesoft.net Sun Jan 4 01:58:20 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 04 Jan 2004 01:58:33 -0800 (PST) Received: from naturesoft.net ([203.145.184.221]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i049wITa029412 for ; Sun, 4 Jan 2004 01:58:19 -0800 Received: from gw3.laser5.co.jp ([211.5.140.198] helo=l5ac210.l5.laser5.co.jp) by naturesoft.net with asmtp (Exim 3.35 #1) id 1Ad4yy-0007Uj-00; Sun, 04 Jan 2004 15:25:41 +0530 Subject: [PATCH] r8169 ethtool support. From: "Krishnakumar. R" Reply-To: krishnakumar@naturesoft.net To: jgarzik@pobox.com Cc: Francois Romieu , netdev@oss.sgi.com Content-Type: text/plain Organization: Naturesoft Ltd Message-Id: <1073210391.3555.7.camel@l5ac210.l5.laser5.co.jp> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 (1.2.2-4) Date: 04 Jan 2004 18:59:51 +0900 Content-Transfer-Encoding: 7bit X-archive-position: 2203 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: krishnakumar@naturesoft.net Precedence: bulk X-list: netdev Hi, The following patch introduces ethtool support for the r8169 driver. Only the driver info operation will be supported. I don't have the hardware with me, hence I have done only a compilation test. The patch is against the latest 2.6.x experimental net driver queue, which is based on 2.6.0-bk2. If found okay, please do apply. Regards, KK. 
Diffstat output: ---------------- r8169.c | 15 +++++++++++++++ 1 files changed, 15 insertions(+) The patch --------- --- linux-2.6.0-bk2.netdrv.exp/drivers/net/r8169.orig.c 2004-01-04 18:47:10.000000000 +0900 +++ linux-2.6.0-bk2.netdrv.exp/drivers/net/r8169.c 2004-01-04 18:43:06.000000000 +0900 @@ -40,6 +40,7 @@ #include #include #include +#include #include #include @@ -382,6 +383,19 @@ return value; } +static void rtl8169_get_drvinfo (struct net_device *dev, struct ethtool_drvinfo *info) +{ + struct rtl8169_private *tp = dev->priv; + + strcpy (info->driver, RTL8169_DRIVER_NAME); + strcpy (info->version, RTL8169_VERSION ); + strcpy (info->bus_info, pci_name(tp->pci_dev)); +} + +static struct ethtool_ops rtl8169_ethtool_ops = { + .get_drvinfo = rtl8169_get_drvinfo, +}; + static void rtl8169_write_gmii_reg_bit(void *ioaddr, int reg, int bitnum, int bitval) { @@ -793,6 +807,7 @@ dev->open = rtl8169_open; dev->hard_start_xmit = rtl8169_start_xmit; dev->get_stats = rtl8169_get_stats; + dev->ethtool_ops = &rtl8169_ethtool_ops; dev->stop = rtl8169_close; dev->tx_timeout = rtl8169_tx_timeout; dev->set_multicast_list = rtl8169_set_rx_mode; -- Home Page: http://puggy.symonds.net/~krishnakumar/ From sekiya@wide.ad.jp Sun Jan 4 08:06:23 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 04 Jan 2004 08:06:36 -0800 (PST) Received: from yui.nc.u-tokyo.ac.jp (yui.nc.u-tokyo.ac.jp [130.69.251.116]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i04G6LTa021664 for ; Sun, 4 Jan 2004 08:06:22 -0800 Received: from anzu.nc.u-tokyo.ac.jp (anzu.nc.u-tokyo.ac.jp [130.69.251.114]) (authenticated bits=0) by yui.nc.u-tokyo.ac.jp (8.12.10/8.12.3/Debian-6.4) with ESMTP id i04G6EYl011355 for ; Mon, 5 Jan 2004 01:06:15 +0900 Date: Mon, 05 Jan 2004 01:06:09 +0900 Message-ID: From: Yuji Sekiya To: netdev@oss.sgi.com Subject: USAGI STABLE Release 5 User-Agent: Wanderlust/2.8.1 (Something) SEMI/1.14.3 (=?ISO-2022-JP?B?GyRCNW0lTkMrGyhC?=) FLIM/1.14.4 (=?ISO-2022-JP?B?GyRCM2A4Nj9ANVxBMBsoQg==?=) 
APEL/10.3 Emacs/20.7 (i386-vine-linux-gnu) MULE/4.1 (=?ISO-2022-JP?B?GyRCMCobKEI=?=) Organization: USAGI Project MIME-Version: 1.0 (generated by SEMI 1.14.3 - =?ISO-2022-JP?B?IhskQjVtGyhC?= =?ISO-2022-JP?B?GyRCJU5DKxsoQiI=?=) Content-Type: text/plain; charset=US-ASCII X-archive-position: 2204 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: sekiya@sfc.wide.ad.jp Precedence: bulk X-list: netdev Happy New Year! We are glad to announce the USAGI STABLE RELEASE 5, dated January 4th, 2004. This is the last major STABLE release based on the linux-2.4 kernel from the USAGI Project. Our primary development target has moved to the linux-2.6 kernel. Changes from the STABLE RELEASE 4.1 are: * based on linux-2.4.21, * IPsec transport/tunnel mode support, * Mobile IPv6 support, * fixed interaction between IPsec and Mobile IPv6, * Default Router Preference Support, and * Route Information Option Support. However, the IPsec and Mobile IPv6 implementations included in this STABLE release may not be developed further. The IPsec stack was written for the linux-2.4 kernel by the USAGI Project; we have since rewritten a NEW IPsec stack for the linux-2.6 kernel, and that stack is included in the linux-2.6 mainline kernel. The IPsec stack included in the linux-2.6 kernel is based on the "xfrm" architecture, and we will continue development based on the NEW IPsec stack. The Mobile IPv6 stack is in the same situation as IPsec. The Mobile IPv6 stack included in this release was developed by the HUT GO Project and is mainly written for the linux-2.4 kernel. USAGI Project and GO Project have now started a joint project to develop a new Mobile IPv6 stack based on the linux-2.6 kernel. You can get our complete kit, which includes the kernel tree, library and applications, from . We also provide separate patches against the main-line kernel and the tools . Many of our efforts are already in the mainline kernel tree. 
We will continue making reasonable size patches and trying to merge it into mainline kernel tree. We announce the latest information on our web pages. Please check our web site . We also manage the mailing lists for USAGI users. If you have questions, please join the mailing list. Comments and advises are also welcome on that mailing list. Please visit for further information. Thanks. About USAGI Project The USAGI Project is managed by volunteers and aims to provide better IPv6 environment on Linux freely. We are tightly collaborating with WIDE Project, KAME Project and TAHI Project, and trying to improve Linux kernel, IPv6 related libraries and IPv6 applications. Our snapshots are released every two weeks and stable release is released several times a year. Please check our web site http://www.linux-ipv6.org for the latest information. -- USAGI Project members From brad@mainstreetsoftworks.com Sun Jan 4 09:38:19 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 04 Jan 2004 09:38:34 -0800 (PST) Received: from nameserver1.mcve.com (nameserver1.brainwerkz.net [209.251.159.130]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i04Hc8Ta026827 for ; Sun, 4 Jan 2004 09:38:09 -0800 Received: from mainstreetsoftworks.com (ip68-105-173-45.ga.at.cox.net [68.105.173.45]) by nameserver1.mcve.com (Postfix) with ESMTP id 00D3C85B43; Sun, 4 Jan 2004 12:03:02 -0500 (EST) Message-ID: <3FF846C3.5070207@mainstreetsoftworks.com> Date: Sun, 04 Jan 2004 12:00:51 -0500 From: Brad House User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5) Gecko/20031121 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Francois Romieu Cc: netdev@oss.sgi.com, Jeff Garzik , brad_mssw@gentoo.org Subject: r8169 in netdev experimental References: <20031122183001.GA16993@gtf.org> <20031124000939.A456@electric-eye.fr.zoreil.com> <20031126004550.A25408@electric-eye.fr.zoreil.com> <20031127235143.A16767@electric-eye.fr.zoreil.com> <20031130014738.A2589@electric-eye.fr.zoreil.com> In-Reply-To: 
<20031130014738.A2589@electric-eye.fr.zoreil.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2205 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: brad@mainstreetsoftworks.com Precedence: bulk X-list: netdev Ok, sorry I dropped out of existence for a while. I just tried the 2.6.0-bk2 netdev experimental patches, and the r8169 module locks the system on loading. Funny thing is I had to unplug the power cable from the computer for a few seconds and plug it back in, because an immediate reset wouldn't let the old driver work :/ Let me know where to start debugging this, as I should have some time here. (Been busy getting AMD64 port into 'official' mode for Gentoo, so I haven't had time to look into this too much...) Thanks -Brad Francois Romieu wrote: > Hopefully last round of Brad/Realtek's merging. > > Patches apply in this order: > 1 - r8169-hw_start.patch > 2 - r8169-missing-tx-stats.patch > 3 - r8169-intr_mask.patch > > on top of: > > 2.6.0-test11 > + 2.6.0-test9-bk25-netdrvr-exp1 > + r8169-mac-phy-version > + r8169-init_one > + r8169-timer > > The unconditional calls to rtl8169_{rx/tx}_interrupt in rtl8169_interrupt() > are not integrated. That should not make a huge difference. > > -- > Ueimor > > > ------------------------------------------------------------------------ > > > Merge of changes from Realtek: > - register voodoo in rtl8169_hw_start(). 
> > > drivers/net/r8169.c | 6 ++++++ > 1 files changed, 6 insertions(+) > > diff -puN drivers/net/r8169.c~r8169-hw_start drivers/net/r8169.c > --- linux-2.6.0-test11/drivers/net/r8169.c~r8169-hw_start 2003-11-29 20:36:12.000000000 +0100 > +++ linux-2.6.0-test11-fr/drivers/net/r8169.c 2003-11-29 20:44:17.000000000 +0100 > @@ -1028,6 +1028,12 @@ rtl8169_hw_start(struct net_device *dev) > RTL_W32(TxConfig, > (TX_DMA_BURST << TxDMAShift) | (InterFrameGap << > TxInterFrameGapShift)); > + RTL_W16(CPlusCmd, RTL_R16(CPlusCmd)); > + > + if (tp->mac_version == RTL_GIGA_MAC_VER_D) { > + dprintk(KERN_INFO PFX "Set MAC Reg C+CR Offset 0xE0: bit-3 and bit-14 MUST be 1\n"); > + RTL_W16(CPlusCmd, RTL_R16(CPlusCmd) | (1 << 14) | (1 << 3)); > + } > > tp->cur_rx = 0; > > > _ > > > ------------------------------------------------------------------------ > > > Driver forgot to update the transmitted bytes counter. > Originally done in rtl8169_start_xmit() by Realtek. > > > drivers/net/r8169.c | 5 ++++- > 1 files changed, 4 insertions(+), 1 deletion(-) > > diff -puN drivers/net/r8169.c~r8169-missing-tx-stats drivers/net/r8169.c > --- linux-2.6.0-test11/drivers/net/r8169.c~r8169-missing-tx-stats 2003-11-29 22:34:10.000000000 +0100 > +++ linux-2.6.0-test11-fr/drivers/net/r8169.c 2003-11-30 00:26:09.000000000 +0100 > @@ -1303,10 +1303,13 @@ rtl8169_tx_interrupt(struct net_device * > int cur = dirty_tx % NUM_TX_DESC; > struct sk_buff *skb = tp->Tx_skbuff[cur]; > > + /* FIXME: is it really accurate for TxErr ? */ > + tp->stats.tx_bytes += skb->len >= ETH_ZLEN ? 
> + skb->len : ETH_ZLEN; > + tp->stats.tx_packets++; > rtl8169_unmap_tx_skb(tp->pci_dev, tp->Tx_skbuff + cur, > tp->TxDescArray + cur); > dev_kfree_skb_irq(skb); > - tp->stats.tx_packets++; > dirty_tx++; > tx_left--; > entry++; > > _ > > > ------------------------------------------------------------------------ > > drivers/net/r8169.c | 7 ++----- > 1 files changed, 2 insertions(+), 5 deletions(-) > > diff -puN drivers/net/r8169.c~r8169-intr_mask drivers/net/r8169.c > --- linux-2.6.0-test11/drivers/net/r8169.c~r8169-intr_mask 2003-11-30 01:16:48.000000000 +0100 > +++ linux-2.6.0-test11-fr/drivers/net/r8169.c 2003-11-30 01:18:22.000000000 +0100 > @@ -334,8 +334,7 @@ static void rtl8169_tx_timeout(struct ne > static struct net_device_stats *rtl8169_get_stats(struct net_device *netdev); > > static const u16 rtl8169_intr_mask = > - SYSErr | PCSTimeout | RxUnderrun | RxOverflow | RxFIFOOver | TxErr | TxOK | > - RxErr | RxOK; > + RxUnderrun | RxOverflow | RxFIFOOver | TxErr | TxOK | RxErr | RxOK; > static const unsigned int rtl8169_rx_config = > (RX_FIFO_THRESH << RxCfgFIFOShift) | (RX_DMA_BURST << RxCfgDMAShift); > > @@ -1445,9 +1444,7 @@ rtl8169_interrupt(int irq, void *dev_ins > RTL_W16(IntrStatus, > (status & RxFIFOOver) ? 
(status | RxOverflow) : status); > > - if ((status & > - (SYSErr | PCSTimeout | RxUnderrun | RxOverflow | RxFIFOOver > - | TxErr | TxOK | RxErr | RxOK)) == 0) > + if (!(status & rtl8169_intr_mask)) > break; > > // Rx interrupt > > _ From romieu@fr.zoreil.com Sun Jan 4 14:40:33 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 04 Jan 2004 14:40:46 -0800 (PST) Received: from fr.zoreil.com (electric-eye.fr.zoreil.com [213.41.134.224]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i04MeWTa007487 for ; Sun, 4 Jan 2004 14:40:33 -0800 Received: from electric-eye.fr.zoreil.com (localhost.localdomain [127.0.0.1]) by fr.zoreil.com (8.12.8/8.12.1) with ESMTP id i04McosW003611; Sun, 4 Jan 2004 23:38:50 +0100 Received: (from romieu@localhost) by electric-eye.fr.zoreil.com (8.12.8/8.12.1) id i04McnH2003610; Sun, 4 Jan 2004 23:38:49 +0100 Date: Sun, 4 Jan 2004 23:38:49 +0100 From: Francois Romieu To: Brad House Cc: netdev@oss.sgi.com, Jeff Garzik , brad_mssw@gentoo.org Subject: Re: r8169 in netdev experimental Message-ID: <20040104233849.A3214@electric-eye.fr.zoreil.com> References: <20031122183001.GA16993@gtf.org> <20031124000939.A456@electric-eye.fr.zoreil.com> <20031126004550.A25408@electric-eye.fr.zoreil.com> <20031127235143.A16767@electric-eye.fr.zoreil.com> <20031130014738.A2589@electric-eye.fr.zoreil.com> <3FF846C3.5070207@mainstreetsoftworks.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5.1i In-Reply-To: <3FF846C3.5070207@mainstreetsoftworks.com>; from brad@mainstreetsoftworks.com on Sun, Jan 04, 2004 at 12:00:51PM -0500 X-Organisation: Land of Sunshine Inc. X-archive-position: 2206 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: romieu@fr.zoreil.com Precedence: bulk X-list: netdev Brad House : [...] > Let me know where to start debugging this, as I should > have some time here. 
static u32 rtl8169_rx_fill(struct rtl8169_private *tp, struct net_device *dev, u32 start, u32 end) { u32 cur; for (cur = start; end - start > 0; cur++) { ^^^^^ This should read: for (cur = start; end - cur > 0; cur++) { Care to test 2.6.1-rc1-mm1 and simply change the offending line ? -- Ueimor From scott.feldman@intel.com Sun Jan 4 16:07:25 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 04 Jan 2004 16:07:38 -0800 (PST) Received: from hermes-pilot.fm.intel.com (fmr99.intel.com [192.55.52.32]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i0507OTa008998 for ; Sun, 4 Jan 2004 16:07:25 -0800 Received: from petasus.fm.intel.com (petasus.fm.intel.com [10.1.192.37]) by hermes-pilot.fm.intel.com (8.12.9-20030918-01/8.12.9/d: major-outer.mc,v 1.12 2003/12/18 18:58:11 root Exp $) with ESMTP id i0504Pmu031989; Mon, 5 Jan 2004 00:04:25 GMT Received: from fmsmsxvs041.fm.intel.com (fmsmsxvs041.fm.intel.com [132.233.42.126]) by petasus.fm.intel.com (8.12.9-20030918-01/8.12.9/d: major-inner.mc,v 1.7 2003/12/18 18:58:10 root Exp $) with SMTP id i0508ScU010246; Mon, 5 Jan 2004 00:08:32 GMT Received: from [134.134.3.50] ([134.134.3.50]) by fmsmsxvs041.fm.intel.com (SAVSMTP 3.1.2.35) with SMTP id M2004010416071211741 ; Sun, 04 Jan 2004 16:07:12 -0800 Date: Sun, 4 Jan 2004 16:43:48 -0800 (PST) From: "Feldman, Scott" X-X-Sender: scott.feldman@localhost.localdomain Reply-To: "Feldman, Scott" To: Jeff Garzik cc: netdev@oss.sgi.com, "Feldman, Scott" , Subject: [e1000 2.6-exp] back out CSA interrupt fix Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Scanned-By: MIMEDefang 2.31 (www . roaringpenguin . 
com / mimedefang) X-archive-position: 2207 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: scott.feldman@intel.com Precedence: bulk X-list: netdev * 8086:1019 82547 CSA-based LOMs lock up the system with this code, so let's revert back to what's in 2.6.0 until we can figure out why this is causing problems. --------- --- net-drivers-2.5-exp/drivers/net/e1000/e1000_main.c.orig 2004-01-04 15:58:59.000000000 -0800 +++ net-drivers-2.5-exp/drivers/net/e1000/e1000_main.c 2004-01-04 15:59:32.000000000 -0800 @@ -2097,26 +2097,10 @@ __netif_rx_schedule(netdev); } #else - /* Writing IMC and IMS is needed for 82547. - Due to Hub Link bus being occupied, an interrupt - de-assertion message is not able to be sent. - When an interrupt assertion message is generated later, - two messages are re-ordered and sent out. - That causes APIC to think 82547 is in de-assertion - state, while 82547 is in assertion state, resulting - in dead lock. Writing IMC forces 82547 into - de-assertion state. - */ - if(hw->mac_type == e1000_82547 || hw->mac_type == e1000_82547_rev_2) - e1000_irq_disable(adapter); - for(i = 0; i < E1000_MAX_INTR; i++) if(!e1000_clean_rx_irq(adapter) & !e1000_clean_tx_irq(adapter)) break; - - if(hw->mac_type == e1000_82547 || hw->mac_type == e1000_82547_rev_2) - e1000_irq_enable(adapter); #endif return IRQ_HANDLED; From torvalds@osdl.org Sun Jan 4 18:30:29 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 04 Jan 2004 18:30:43 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i052UTTa013877 for ; Sun, 4 Jan 2004 18:30:29 -0800 Received: from localhost (build.pdx.osdl.net [172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) with ESMTP id i052ULM15312; Sun, 4 Jan 2004 18:30:21 -0800 Date: Sun, 4 Jan 2004 18:30:21 -0800 (PST) From: Linus Torvalds To: Erik Hensema cc: netdev@oss.sgi.com, "David S. 
Miller" Subject: Re: 2.6.0: something is leaking memory In-Reply-To: Message-ID: References: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 2208 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: torvalds@osdl.org Precedence: bulk X-list: netdev On Sun, 4 Jan 2004, Erik Hensema wrote: > Linus Torvalds (torvalds@osdl.org) wrote: > > > > Can you do /proc/slabinfo too? > > Sure, this is of course my currently running system, 4 days, 9:53 > uptime. > > slabinfo - version: 2.0 > # name : tunables : slabdata > tcp6_sock 19729 19732 1024 4 1 : tunables 54 27 0 : slabdata 4933 4933 0 You've got 19 _megabytes_ allocated to "tcp6_sock", and they are all marked as "active". That's almost certainly the leaking bug. Everything else looks reasonably normal. > I do use IPv6. I've got three active tunnels and native IPv6 over > ethernet. Yeah, but there is no way you have 19 MB worth of sockets active for three tunnels. David? 
Linus From yoshfuji@linux-ipv6.org Sun Jan 4 20:52:40 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 04 Jan 2004 20:52:53 -0800 (PST) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.135.30]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i054qdTa020159 for ; Sun, 4 Jan 2004 20:52:40 -0800 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (Postfix) with ESMTP id ABE3033CA5; Mon, 5 Jan 2004 13:52:52 +0900 (JST) Date: Mon, 05 Jan 2004 13:52:52 +0900 (JST) Message-Id: <20040105.135252.07995935.yoshfuji@linux-ipv6.org> To: erik@hensema.net Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: (usagi-core 16947) Re: 2.6.0: something is leaking memory From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: References: Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 2209 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev In article (at Sun, 4 Jan 2004 21:31:26 +0000 (UTC)), Erik Hensema says: > > Can you do /proc/slabinfo too? > > Sure, this is of course my currently running system, 4 days, 9:53 > uptime. 
> > slabinfo - version: 2.0 > # name : tunables : slabdata : > tcp6_sock 19729 19732 1024 4 1 : tunables 54 27 0 : slabdata 4933 4933 0 : > > Clearly the memory leak isn't in the page cache, so the most likely source > > is network buffers, and most likely in iptables connection tracking or > > similar. If you actually _use_ IPv6, then that is also more likely to have > > leaks just due to less testing. > > I do use IPv6. I've got three active tunnels and native IPv6 over > ethernet. > > I've always had problems with nscd leaking filedescriptors, all > IPv6 connections to my LDAP server. This started after upgrading > suse 8.0 to 8.2 (I think the problem is in nss_ldap). > I'm restarting nscd using a cronjob every night now. Output of > netstat --inet6 -avpn is below. All sockets in CLOSE_WAIT are > leaked and will go away after a nscd restart. How about /proc/slabinfo just after restarting nss_ldap? > The server isn't very critical, but I do need it. I'm willing to > try some patches (or do an upgrade to -mm), but nothing to wild. > > netstat --inet6 -avpn > > Active Internet connections (servers and established) > Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name > tcp 0 0 :::22 :::* LISTEN 1208/sshd > tcp 0 0 :::119 :::* LISTEN 1364/innd > tcp 0 0 :::25 :::* LISTEN 1433/sendmail: acce > tcp 0 0 :::953 :::* LISTEN 1175/named > tcp 0 0 ::1:6010 :::* LISTEN 19900/sshd > tcp 0 0 ::1:6011 :::* LISTEN 20150/sshd > tcp 1 0 ::1:50565 2001:888:10a1::1:389 CLOSE_WAIT 26536/nscd > tcp 1 0 ::1:50224 2001:888:10a1::1:389 CLOSE_WAIT 26536/nscd > tcp 0 0 2001:888:10a1::1:389 ::1:55936 ESTABLISHED 1145/slapd > tcp 1 0 ::1:50343 2001:888:10a1::1:389 CLOSE_WAIT 26536/nscd > tcp 1 0 ::1:50988 2001:888:10a1::1:389 CLOSE_WAIT 26536/nscd : There're too many sockets in CLOSE_WAIT, but the number is very different from "tcp6_sock." And, what is happened when you use ipv4 in your nscd? 
-- Hideaki YOSHIFUJI @ USAGI Project GPG FP: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA From davem@pizda.ninka.net Sun Jan 4 20:54:03 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 04 Jan 2004 20:54:16 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i054s2Ta020386 for ; Sun, 4 Jan 2004 20:54:03 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id UAA04880; Sun, 4 Jan 2004 20:48:34 -0800 Date: Sun, 4 Jan 2004 20:48:34 -0800 From: "David S. Miller" To: Linus Torvalds Cc: erik@hensema.net, netdev@oss.sgi.com Subject: Re: 2.6.0: something is leaking memory Message-Id: <20040104204834.40b6ca51.davem@redhat.com> In-Reply-To: References: X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2210 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Sun, 4 Jan 2004 18:30:21 -0800 (PST) Linus Torvalds wrote: > You've got 19 _megabytes_ allocated to "tcp6_sock", and they are all > marked as "active". That's almost certainly the leaking bug. > > Everything else looks reasonably normal. ... > David? 
Fixed by changeset 1.1496.16.1 which is in 2.6.1-rc1 From madis@cyber.ee Mon Jan 5 04:59:17 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 05 Jan 2004 04:59:30 -0800 (PST) Received: from alien (pc24.host2.starman.ee [62.65.194.24]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i05CxFTa015577 for ; Mon, 5 Jan 2004 04:59:17 -0800 Received: from mzz (helo=localhost) by alien with local-esmtp (Exim 3.36 #1 (Debian)) id 1AdUJl-0000Go-00; Mon, 05 Jan 2004 14:58:49 +0200 Date: Mon, 5 Jan 2004 14:58:48 +0200 (EET) From: madis X-X-Sender: mzz@alien To: Carl-Daniel Hailfinger cc: Madis Janson , netdev@oss.sgi.com Subject: Re: forcedeth unknown events 0x21 In-Reply-To: <3FE6678B.5070006@gmx.net> Message-ID: References: <3FE6678B.5070006@gmx.net> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 2211 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: madis@cyber.ee Precedence: bulk X-list: netdev On Mon, 22 Dec 2003, Carl-Daniel Hailfinger wrote: > Madis Janson wrote: > > When trying forcedeth_2_6_patch_v19.txt, I got the following message > > repeating rapidly after ifup eth0: > > > > "eth0: received irq with unknown events 0x21. Please report" > > > > and it didn't work either (ifup just stalled). > > > > kernel: > > > > 2.6.0 release + patches: > > http://www.held.org.il/patches/patch-lirc-2.6.0-test9-oh.diff.bz2 > > http://www.hailfinger.org/carldani/linux/patches/forcedeth/forcedeth_2_6_patch_v19.txt > > Please try the attached patch on top of it and report back if it works. > The unknown events message disappeared, but the network still did not start working... ======================================================== + if (events & (NVREG_IRQ_RX_ERR)) { + dprintk(KERN_DEBUG "%s: received irq with events 0x%x. Probably RX fail.\n", + dev->name, events); A closing '}' was missing here.
-- mzz From mostrows@watson.ibm.com Mon Jan 5 05:08:00 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 05 Jan 2004 05:08:13 -0800 (PST) Received: from brick.watson.ibm.com (yktgi01e0-s4.watson.ibm.com [129.34.20.23]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i05D7xTa016274 for ; Mon, 5 Jan 2004 05:08:00 -0800 Received: by brick.watson.ibm.com (Postfix, from userid 9965) id E5641C024; Mon, 5 Jan 2004 08:07:58 -0500 (EST) Subject: Deadlock in sungem/ip_auto_config/linkwatch From: Michal Ostrowski To: netdev@oss.sgi.com Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="=-VstyQxFiTA9BMsLFyv50" Message-Id: <1073307882.2041.98320.camel@brick.watson.ibm.com> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.5 Date: Mon, 05 Jan 2004 08:07:58 -0500 X-archive-position: 2212 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mostrows@watson.ibm.com Precedence: bulk X-list: netdev --=-VstyQxFiTA9BMsLFyv50 Content-Type: text/plain Content-Transfer-Encoding: quoted-printable I believe I've found a potential deadlock condition. It occurs when we make the following sequence of calls: ip_auto_config ic_open_devs dev_change_flags dev_open gem_open flush_scheduled_work ic_open_devs grabs rtnl_sem with an rtnl_shlock() call. The sungem driver at some point calls gem_init_one, which calls netif_carrier_*, which in turn calls schedule_work (linkwatch_event). linkwatch_event in turn needs rtnl_sem. If we enter the call sequence above and linkwatch_event is still pending, we will deadlock since flush_scheduled_work will wait for completion of linkwatch_event, which is blocked since it cannot get rtnl_sem. In general when can one call flush_scheduled_work? It seems that one can't unless you know your callers aren't holding any locks. 
--=20 Michal Ostrowski --=-VstyQxFiTA9BMsLFyv50 Content-Type: application/pgp-signature; name=signature.asc Content-Description: This is a digitally signed message part -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.3 (GNU/Linux) iD8DBQA/+WDqDMDCqU5zPMARAmTwAJ9ujgnF4BI4AVvkhSQ5YgJXBx+sXACfXT+z rx8XL7PYFznObN8IXIZqAW8= =HBFR -----END PGP SIGNATURE----- --=-VstyQxFiTA9BMsLFyv50-- From amir.noam@intel.com Mon Jan 5 07:25:28 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 05 Jan 2004 07:25:51 -0800 (PST) Received: from hermes.iil.intel.com (hermes.iil.intel.com [192.198.152.99]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i05FPPTa021258 for ; Mon, 5 Jan 2004 07:25:27 -0800 Received: from petasus.iil.intel.com (petasus.iil.intel.com [143.185.77.3]) by hermes.iil.intel.com (8.12.9-20030918-01/8.12.9/d: large-outer.mc,v 1.6 2003/12/18 18:57:17 root Exp $) with ESMTP id i05FPBiI018818; Mon, 5 Jan 2004 15:25:11 GMT Received: from hasmsxvs01.iil.intel.com (hasmsxvs01.iil.intel.com [143.185.63.58]) by petasus.iil.intel.com (8.12.9-20030918-01/8.12.9/d: large-inner.mc,v 1.8 2003/12/18 18:57:16 root Exp $) with SMTP id i05FOkwq008552; Mon, 5 Jan 2004 15:25:11 GMT Received: from sun111.npdj.intel.com ([10.12.254.111]) by hasmsxvs01.iil.intel.com (SAVSMTP 3.1.2.35) with SMTP id M2004010517251024220 ; Mon, 05 Jan 2004 17:25:10 +0200 Received: from jrslxjul4.npdj.intel.com (jrslxjul4 [10.12.220.54]) by sun111.npdj.intel.com (8.12.10/8.12.9/MailSET/Hub) with ESMTP id i05FP8hb001043; Mon, 5 Jan 2004 17:25:09 +0200 (IST) From: Amir Noam To: "Jeff Garzik" , "Jay Vosburgh" Subject: [PATCH 0/3] [bonding 2.4] Using per-bond parameters Date: Mon, 5 Jan 2004 17:25:08 +0200 User-Agent: KMail/1.5.3 Cc: bonding-devel@lists.sourceforge.net, netdev@oss.sgi.com MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200401051725.08348.amir.noam@intel.com> X-Scanned-By: MIMEDefang 2.31 (www . 
roaringpenguin . com / mimedefang) X-archive-position: 2213 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: amir.noam@intel.com Precedence: bulk X-list: netdev The following patch set makes each bonding interface keep its own set of parameters, rather than use global values. This is the first step necessary to allow the configuration of different parameters to different bonding interfaces. The patches are against the netdev-2.4 tree (after Shmulik's 'update comment blocks' patch). -- Amir From amir.noam@intel.com Mon Jan 5 07:26:50 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 05 Jan 2004 07:27:04 -0800 (PST) Received: from hermes.iil.intel.com (hermes.iil.intel.com [192.198.152.99]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i05FQmTa021453 for ; Mon, 5 Jan 2004 07:26:49 -0800 Received: from petasus.iil.intel.com (petasus.iil.intel.com [143.185.77.3]) by hermes.iil.intel.com (8.12.9-20030918-01/8.12.9/d: large-outer.mc,v 1.6 2003/12/18 18:57:17 root Exp $) with ESMTP id i05FQYiI019025; Mon, 5 Jan 2004 15:26:34 GMT Received: from hasmsxvs01.iil.intel.com (hasmsxvs01.iil.intel.com [143.185.63.58]) by petasus.iil.intel.com (8.12.9-20030918-01/8.12.9/d: large-inner.mc,v 1.8 2003/12/18 18:57:16 root Exp $) with SMTP id i05FQYwi008750; Mon, 5 Jan 2004 15:26:34 GMT Received: from sun111.npdj.intel.com ([10.12.254.111]) by hasmsxvs01.iil.intel.com (SAVSMTP 3.1.2.35) with SMTP id M2004010517263324268 ; Mon, 05 Jan 2004 17:26:33 +0200 Received: from jrslxjul4.npdj.intel.com (jrslxjul4 [10.12.220.54]) by sun111.npdj.intel.com (8.12.10/8.12.9/MailSET/Hub) with ESMTP id i05FQXhb001058; Mon, 5 Jan 2004 17:26:33 +0200 (IST) From: Amir Noam To: "Jeff Garzik" , "Jay Vosburgh" Subject: [PATCH 1/3] [bonding 2.4] Save parameters in a per-bond data structure Date: Mon, 5 Jan 2004 17:26:26 +0200 User-Agent: KMail/1.5.3 Cc: bonding-devel@lists.sourceforge.net, netdev@oss.sgi.com MIME-Version: 1.0 
Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200401051726.33613.amir.noam@intel.com> X-Scanned-By: MIMEDefang 2.31 (www . roaringpenguin . com / mimedefang) X-archive-position: 2214 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: amir.noam@intel.com Precedence: bulk X-list: netdev - Save the bonding parameters in a per-bond data structure. - Move all handling of the insmod parameters to bond_check_params(). - Fix the handling of some warning messages regarding parameter use. diff -Nuarp a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c --- a/drivers/net/bonding/bond_main.c Mon Jan 5 17:05:01 2004 +++ b/drivers/net/bonding/bond_main.c Mon Jan 5 17:05:02 2004 @@ -509,7 +509,6 @@ /* monitor all links that often (in milliseconds). <=0 disables monitoring */ #define BOND_LINK_MON_INTERV 0 #define BOND_LINK_ARP_INTERV 0 -#define MAX_ARP_IP_TARGETS 16 static int max_bonds = BOND_DEFAULT_MAX_BONDS; static int miimon = BOND_LINK_MON_INTERV; @@ -520,7 +519,7 @@ static char *mode = NULL; static char *primary = NULL; static char *lacp_rate = NULL; static int arp_interval = BOND_LINK_ARP_INTERV; -static char *arp_ip_target[MAX_ARP_IP_TARGETS] = { NULL, }; +static char *arp_ip_target[BOND_MAX_ARP_TARGETS] = { NULL, }; MODULE_PARM(max_bonds, "i"); MODULE_PARM_DESC(max_bonds, "Max number of bonded devices"); @@ -540,7 +539,7 @@ MODULE_PARM(lacp_rate, "s"); MODULE_PARM_DESC(lacp_rate, "LACPDU tx rate to request from 802.3ad partner (slow/fast)"); MODULE_PARM(arp_interval, "i"); MODULE_PARM_DESC(arp_interval, "arp interval in milliseconds"); -MODULE_PARM(arp_ip_target, "1-" __MODULE_STRING(MAX_ARP_IP_TARGETS) "s"); +MODULE_PARM(arp_ip_target, "1-" __MODULE_STRING(BOND_MAX_ARP_TARGETS) "s"); MODULE_PARM_DESC(arp_ip_target, "arp targets in n.n.n.n form"); /*----------------------------- Global variables 
----------------------------*/ @@ -554,7 +553,7 @@ static LIST_HEAD(bond_dev_list); static struct proc_dir_entry *bond_proc_dir = NULL; #endif -static u32 arp_target[MAX_ARP_IP_TARGETS] = { 0, } ; +static u32 arp_target[BOND_MAX_ARP_TARGETS] = { 0, } ; static int arp_ip_count = 0; static u32 my_ip = 0; static int bond_mode = BOND_MODE_ROUNDROBIN; @@ -590,6 +589,10 @@ static struct bond_parm_tbl bond_mode_tb { NULL, -1}, }; +/*-------------------------- Forward declarations ---------------------------*/ + +static inline void bond_set_mode_ops(struct net_device *bond_dev, int mode); + /*---------------------------- General routines -----------------------------*/ static const char *bond_mode_name(void) @@ -2236,7 +2239,7 @@ static void bond_arp_send_all(struct sla { int i; - for (i = 0; (idev, my_ip, NULL, slave->dev->dev_addr, NULL); @@ -3755,13 +3758,47 @@ static int bond_accept_fastpath(struct n /*------------------------- Device initialization ---------------------------*/ /* + * set bond mode specific net device operations + */ +static inline void bond_set_mode_ops(struct net_device *bond_dev, int mode) +{ + switch (mode) { + case BOND_MODE_ROUNDROBIN: + bond_dev->hard_start_xmit = bond_xmit_roundrobin; + break; + case BOND_MODE_ACTIVEBACKUP: + bond_dev->hard_start_xmit = bond_xmit_activebackup; + break; + case BOND_MODE_XOR: + bond_dev->hard_start_xmit = bond_xmit_xor; + break; + case BOND_MODE_BROADCAST: + bond_dev->hard_start_xmit = bond_xmit_broadcast; + break; + case BOND_MODE_8023AD: + bond_dev->hard_start_xmit = bond_3ad_xmit_xor; + break; + case BOND_MODE_TLB: + case BOND_MODE_ALB: + bond_dev->hard_start_xmit = bond_alb_xmit; + bond_dev->set_mac_address = bond_alb_set_mac_address; + break; + default: + /* Should never happen, mode already checked */ + printk(KERN_ERR DRV_NAME + ": Error: Unknown bonding mode %d\n", + mode); + break; + } +} + +/* * Does not allocate but creates a /proc entry. * Allowed to fail. 
*/ -static int __init bond_init(struct net_device *bond_dev) +static int __init bond_init(struct net_device *bond_dev, struct bond_params *params) { struct bonding *bond = bond_dev->priv; - int count; dprintk("Begin bond_init for %s\n", bond_dev->name); @@ -3769,6 +3806,8 @@ static int __init bond_init(struct net_d rwlock_init(&bond->lock); rwlock_init(&bond->curr_slave_lock); + bond->params = *params; /* copy params struct */ + /* Initialize pointers */ bond->first_slave = NULL; bond->curr_active_slave = NULL; @@ -3785,33 +3824,7 @@ static int __init bond_init(struct net_d bond_dev->change_mtu = bond_change_mtu; bond_dev->set_mac_address = bond_set_mac_address; - switch (bond_mode) { - case BOND_MODE_ROUNDROBIN: - bond_dev->hard_start_xmit = bond_xmit_roundrobin; - break; - case BOND_MODE_ACTIVEBACKUP: - bond_dev->hard_start_xmit = bond_xmit_activebackup; - break; - case BOND_MODE_XOR: - bond_dev->hard_start_xmit = bond_xmit_xor; - break; - case BOND_MODE_BROADCAST: - bond_dev->hard_start_xmit = bond_xmit_broadcast; - break; - case BOND_MODE_8023AD: - bond_dev->hard_start_xmit = bond_3ad_xmit_xor; /* extern */ - break; - case BOND_MODE_TLB: - case BOND_MODE_ALB: - bond_dev->hard_start_xmit = bond_alb_xmit; /* extern */ - bond_dev->set_mac_address = bond_alb_set_mac_address; /* extern */ - break; - default: - printk(KERN_ERR DRV_NAME - ": Error: Unknown bonding mode %d\n", - bond_mode); - return -EINVAL; - } + bond_set_mode_ops(bond_dev, bond->params.mode); #ifdef CONFIG_NET_FASTROUTE bond_dev->accept_fastpath = bond_accept_fastpath; @@ -3821,27 +3834,6 @@ static int __init bond_init(struct net_d bond_dev->tx_queue_len = 0; bond_dev->flags |= IFF_MASTER|IFF_MULTICAST; - printk(KERN_INFO DRV_NAME ": %s registered with", bond_dev->name); - if (miimon) { - printk(" MII link monitoring set to %d ms", miimon); - updelay /= miimon; - downdelay /= miimon; - } else { - printk("out MII link monitoring"); - } - printk(", in %s mode.\n", bond_mode_name()); - - 
printk(KERN_INFO DRV_NAME ": %s registered with", bond_dev->name); - if (arp_interval > 0) { - printk(" ARP monitoring set to %d ms with %d target(s):", - arp_interval, arp_ip_count); - for (count=0 ; countmode = bond_mode; + params->miimon = miimon; + params->arp_interval = arp_interval; + params->updelay = updelay; + params->downdelay = downdelay; + params->use_carrier = use_carrier; + params->lacp_fast = lacp_fast; + params->primary[0] = 0; + + if (primary) { + strncpy(params->primary, primary, IFNAMSIZ); + params->primary[IFNAMSIZ - 1] = 0; + } + + memcpy(params->arp_targets, arp_target, sizeof(arp_target)); + return 0; } static int __init bonding_init(void) { + struct bond_params params; int i; int res; printk(KERN_INFO "%s", version); - res = bond_check_params(); + res = bond_check_params(&params); if (res) { return res; } @@ -4158,7 +4182,7 @@ static int __init bonding_init(void) * /proc files), but before register_netdevice(), because we * need to set function pointers. */ - res = bond_init(bond_dev); + res = bond_init(bond_dev, &params); if (res < 0) { free_netdev(bond_dev); goto out_err; diff -Nuarp a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h --- a/drivers/net/bonding/bonding.h Mon Jan 5 17:05:01 2004 +++ b/drivers/net/bonding/bonding.h Mon Jan 5 17:05:02 2004 @@ -36,11 +36,13 @@ #include "bond_3ad.h" #include "bond_alb.h" -#define DRV_VERSION "2.5.3" +#define DRV_VERSION "2.5.4" #define DRV_RELDATE "December 30, 2003" #define DRV_NAME "bonding" #define DRV_DESCRIPTION "Ethernet Channel Bonding Driver" +#define BOND_MAX_ARP_TARGETS 16 + #ifdef BONDING_DEBUG #define dprintk(fmt, args...) 
\ printk(KERN_DEBUG \ @@ -133,6 +135,18 @@ bond_for_each_slave_from(bond, pos, cnt, (bond)->first_slave) +struct bond_params { + int mode; + int miimon; + int arp_interval; + int use_carrier; + int updelay; + int downdelay; + int lacp_fast; + char primary[IFNAMSIZ]; + u32 arp_targets[BOND_MAX_ARP_TARGETS]; +}; + struct slave { struct net_device *dev; /* first - usefull for panic debug */ struct slave *next; @@ -181,6 +195,7 @@ struct bonding { u16 flags; struct ad_bond_info ad_info; struct alb_bond_info alb_info; + struct bond_params params; }; /** From amir.noam@intel.com Mon Jan 5 07:28:05 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 05 Jan 2004 07:28:19 -0800 (PST) Received: from hermes.iil.intel.com (hermes.iil.intel.com [192.198.152.99]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i05FS3Ta022198 for ; Mon, 5 Jan 2004 07:28:04 -0800 Received: from petasus.iil.intel.com (petasus.iil.intel.com [143.185.77.3]) by hermes.iil.intel.com (8.12.9-20030918-01/8.12.9/d: large-outer.mc,v 1.6 2003/12/18 18:57:17 root Exp $) with ESMTP id i05FRniI019159; Mon, 5 Jan 2004 15:27:49 GMT Received: from hasmsxvs01.iil.intel.com (hasmsxvs01.iil.intel.com [143.185.63.58]) by petasus.iil.intel.com (8.12.9-20030918-01/8.12.9/d: large-inner.mc,v 1.8 2003/12/18 18:57:16 root Exp $) with SMTP id i05FRFx4008811; Mon, 5 Jan 2004 15:27:48 GMT Received: from sun111.npdj.intel.com ([10.12.254.111]) by hasmsxvs01.iil.intel.com (SAVSMTP 3.1.2.35) with SMTP id M2004010517274724296 ; Mon, 05 Jan 2004 17:27:47 +0200 Received: from jrslxjul4.npdj.intel.com (jrslxjul4 [10.12.220.54]) by sun111.npdj.intel.com (8.12.10/8.12.9/MailSET/Hub) with ESMTP id i05FRlhb001081; Mon, 5 Jan 2004 17:27:48 +0200 (IST) From: Amir Noam To: "Jeff Garzik" , "Jay Vosburgh" Subject: [PATCH 2/3] [bonding 2.4] Use the per-bond value of the bond_mode parameter Date: Mon, 5 Jan 2004 17:27:46 +0200 User-Agent: KMail/1.5.3 Cc: bonding-devel@lists.sourceforge.net, netdev@oss.sgi.com MIME-Version: 1.0 Content-Type: 
text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200401051727.47839.amir.noam@intel.com> X-Scanned-By: MIMEDefang 2.31 (www . roaringpenguin . com / mimedefang) X-archive-position: 2215 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: amir.noam@intel.com Precedence: bulk X-list: netdev Change usage of the global 'bond_mode' parameter to the per-bond value. diff -Nuarp a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c --- a/drivers/net/bonding/bond_main.c Mon Jan 5 17:05:04 2004 +++ b/drivers/net/bonding/bond_main.c Mon Jan 5 17:05:06 2004 @@ -595,9 +595,9 @@ static inline void bond_set_mode_ops(str /*---------------------------- General routines -----------------------------*/ -static const char *bond_mode_name(void) +static const char *bond_mode_name(int mode) { - switch (bond_mode) { + switch (mode) { case BOND_MODE_ROUNDROBIN : return "load balancing (round-robin)"; case BOND_MODE_ACTIVEBACKUP : @@ -803,7 +803,7 @@ static struct dev_mc_list *bond_mc_list_ */ static void bond_set_promiscuity(struct bonding *bond, int inc) { - if (USES_PRIMARY(bond_mode)) { + if (USES_PRIMARY(bond->params.mode)) { /* write lock already acquired */ if (bond->curr_active_slave) { dev_set_promiscuity(bond->curr_active_slave->dev, inc); @@ -822,7 +822,7 @@ static void bond_set_promiscuity(struct */ static void bond_set_allmulti(struct bonding *bond, int inc) { - if (USES_PRIMARY(bond_mode)) { + if (USES_PRIMARY(bond->params.mode)) { /* write lock already acquired */ if (bond->curr_active_slave) { dev_set_allmulti(bond->curr_active_slave->dev, inc); @@ -842,7 +842,7 @@ static void bond_set_allmulti(struct bon */ static void bond_mc_add(struct bonding *bond, void *addr, int alen) { - if (USES_PRIMARY(bond_mode)) { + if (USES_PRIMARY(bond->params.mode)) { /* write lock already acquired */ if (bond->curr_active_slave) { 
dev_mc_add(bond->curr_active_slave->dev, addr, alen, 0); @@ -862,7 +862,7 @@ static void bond_mc_add(struct bonding * */ static void bond_mc_delete(struct bonding *bond, void *addr, int alen) { - if (USES_PRIMARY(bond_mode)) { + if (USES_PRIMARY(bond->params.mode)) { /* write lock already acquired */ if (bond->curr_active_slave) { dev_mc_delete(bond->curr_active_slave->dev, addr, alen, 0); @@ -922,13 +922,14 @@ static int bond_mc_list_copy(struct dev_ */ static void bond_mc_list_flush(struct net_device *bond_dev, struct net_device *slave_dev) { + struct bonding *bond = bond_dev->priv; struct dev_mc_list *dmi; for (dmi = bond_dev->mc_list; dmi; dmi = dmi->next) { dev_mc_delete(slave_dev, dmi->dmi_addr, dmi->dmi_addrlen, 0); } - if (bond_mode == BOND_MODE_8023AD) { + if (bond->params.mode == BOND_MODE_8023AD) { /* del lacpdu mc addr from mc list */ u8 lacpdu_multicast[ETH_ALEN] = MULTICAST_LACPDU_ADDR; @@ -947,7 +948,7 @@ static void bond_mc_swap(struct bonding { struct dev_mc_list *dmi; - if (!USES_PRIMARY(bond_mode)) { + if (!USES_PRIMARY(bond->params.mode)) { /* nothing to do - mc list is already up-to-date on * all slaves */ @@ -1064,7 +1065,7 @@ static void bond_change_active_slave(str if (new_active) { if (new_active->link == BOND_LINK_BACK) { - if (USES_PRIMARY(bond_mode)) { + if (USES_PRIMARY(bond->params.mode)) { printk(KERN_INFO DRV_NAME ": %s: making interface %s the new " "active one %d ms earlier.\n", @@ -1076,16 +1077,16 @@ static void bond_change_active_slave(str new_active->link = BOND_LINK_UP; new_active->jiffies = jiffies; - if (bond_mode == BOND_MODE_8023AD) { + if (bond->params.mode == BOND_MODE_8023AD) { bond_3ad_handle_link_change(new_active, BOND_LINK_UP); } - if ((bond_mode == BOND_MODE_TLB) || - (bond_mode == BOND_MODE_ALB)) { + if ((bond->params.mode == BOND_MODE_TLB) || + (bond->params.mode == BOND_MODE_ALB)) { bond_alb_handle_link_change(bond, new_active, BOND_LINK_UP); } } else { - if (USES_PRIMARY(bond_mode)) { + if 
(USES_PRIMARY(bond->params.mode)) { printk(KERN_INFO DRV_NAME ": %s: making interface %s the new " "active one.\n", @@ -1094,7 +1095,7 @@ static void bond_change_active_slave(str } } - if (bond_mode == BOND_MODE_ACTIVEBACKUP) { + if (bond->params.mode == BOND_MODE_ACTIVEBACKUP) { if (old_active) { bond_set_slave_inactive_flags(old_active); } @@ -1104,12 +1105,12 @@ static void bond_change_active_slave(str } } - if (USES_PRIMARY(bond_mode)) { + if (USES_PRIMARY(bond->params.mode)) { bond_mc_swap(bond, new_active, old_active); } - if ((bond_mode == BOND_MODE_TLB) || - (bond_mode == BOND_MODE_ALB)) { + if ((bond->params.mode == BOND_MODE_TLB) || + (bond->params.mode == BOND_MODE_ALB)) { bond_alb_handle_active_change(bond, new_active); } else { bond->curr_active_slave = new_active; @@ -1264,13 +1265,13 @@ static int bond_enslave(struct net_devic return -EINVAL; } - if ((bond_mode == BOND_MODE_8023AD) || - (bond_mode == BOND_MODE_TLB) || - (bond_mode == BOND_MODE_ALB)) { + if ((bond->params.mode == BOND_MODE_8023AD) || + (bond->params.mode == BOND_MODE_TLB) || + (bond->params.mode == BOND_MODE_ALB)) { printk(KERN_ERR DRV_NAME ": Error: to use %s mode, you must upgrade " "ifenslave.\n", - bond_mode_name()); + bond_mode_name(bond->params.mode)); return -EOPNOTSUPP; } } @@ -1326,8 +1327,8 @@ static int bond_enslave(struct net_devic new_slave->dev = slave_dev; - if ((bond_mode == BOND_MODE_TLB) || - (bond_mode == BOND_MODE_ALB)) { + if ((bond->params.mode == BOND_MODE_TLB) || + (bond->params.mode == BOND_MODE_ALB)) { /* bond_alb_init_slave() must be called before all other stages since * it might fail and we do not want to have to undo everything */ @@ -1342,7 +1343,7 @@ static int bond_enslave(struct net_devic * curr_active_slave, and that is taken care of later when calling * bond_change_active() */ - if (!USES_PRIMARY(bond_mode)) { + if (!USES_PRIMARY(bond->params.mode)) { /* set promiscuity level to new slave */ if (bond_dev->flags & IFF_PROMISC) { 
dev_set_promiscuity(slave_dev, 1); @@ -1359,7 +1360,7 @@ static int bond_enslave(struct net_devic } } - if (bond_mode == BOND_MODE_8023AD) { + if (bond->params.mode == BOND_MODE_8023AD) { /* add lacpdu mc addr to mc list */ u8 lacpdu_multicast[ETH_ALEN] = MULTICAST_LACPDU_ADDR; @@ -1432,7 +1433,7 @@ static int bond_enslave(struct net_devic "forced to 100Mbps, duplex forced to Full.\n", new_slave->dev->name); - if (bond_mode == BOND_MODE_8023AD) { + if (bond->params.mode == BOND_MODE_8023AD) { printk(KERN_WARNING "Operation of 802.3ad mode requires ETHTOOL " "support in base driver for proper aggregator " @@ -1440,14 +1441,14 @@ static int bond_enslave(struct net_devic } } - if (USES_PRIMARY(bond_mode) && primary) { + if (USES_PRIMARY(bond->params.mode) && primary) { /* if there is a primary slave, remember it */ if (strcmp(primary, new_slave->dev->name) == 0) { bond->primary_slave = new_slave; } } - switch (bond_mode) { + switch (bond->params.mode) { case BOND_MODE_ACTIVEBACKUP: /* if we're in active-backup mode, we need one and only one active * interface. The backup interfaces will have their NOARP flag set @@ -1637,7 +1638,7 @@ static int bond_release(struct net_devic } /* Inform AD package of unbinding of slave. 
*/ - if (bond_mode == BOND_MODE_8023AD) { + if (bond->params.mode == BOND_MODE_8023AD) { /* must be called before the slave is * detached from the list */ @@ -1666,8 +1667,8 @@ static int bond_release(struct net_devic bond_change_active_slave(bond, NULL); } - if ((bond_mode == BOND_MODE_TLB) || - (bond_mode == BOND_MODE_ALB)) { + if ((bond->params.mode == BOND_MODE_TLB) || + (bond->params.mode == BOND_MODE_ALB)) { /* Must be called only after the slave has been * detached from the list and the curr_active_slave * has been cleared (if our_slave == old_current), @@ -1693,7 +1694,7 @@ static int bond_release(struct net_devic * promisc and mc settings if it was the curr_active_slave, but that was * already taken care of above when we detached the slave */ - if (!USES_PRIMARY(bond_mode)) { + if (!USES_PRIMARY(bond->params.mode)) { /* unset promiscuity level from slave */ if (bond_dev->flags & IFF_PROMISC) { dev_set_promiscuity(slave_dev, -1); @@ -1765,15 +1766,15 @@ static int bond_release_all(struct net_d /* Inform AD package of unbinding of slave * before slave is detached from the list. 
*/ - if (bond_mode == BOND_MODE_8023AD) { + if (bond->params.mode == BOND_MODE_8023AD) { bond_3ad_unbind_slave(slave); } slave_dev = slave->dev; bond_detach_slave(bond, slave); - if ((bond_mode == BOND_MODE_TLB) || - (bond_mode == BOND_MODE_ALB)) { + if ((bond->params.mode == BOND_MODE_TLB) || + (bond->params.mode == BOND_MODE_ALB)) { /* must be called only after the slave * has been detached from the list */ @@ -1790,7 +1791,7 @@ static int bond_release_all(struct net_d * promisc and mc settings if it was the curr_active_slave, but that was * already taken care of above when we detached the slave */ - if (!USES_PRIMARY(bond_mode)) { + if (!USES_PRIMARY(bond->params.mode)) { /* unset promiscuity level from slave */ if (bond_dev->flags & IFF_PROMISC) { dev_set_promiscuity(slave_dev, -1); @@ -1864,6 +1865,10 @@ static int bond_ioctl_change_active(stru struct slave *new_active = NULL; int res = 0; + if (!USES_PRIMARY(bond->params.mode)) { + return -EINVAL; + } + /* Verify that master_dev is indeed the master of slave_dev */ if (!(slave_dev->flags & IFF_SLAVE) || (slave_dev->master != bond_dev)) { @@ -1952,7 +1957,7 @@ static int bond_info_query(struct net_de { struct bonding *bond = bond_dev->priv; - info->bond_mode = bond_mode; + info->bond_mode = bond->params.mode; info->miimon = miimon; read_lock_bh(&bond->lock); @@ -2054,7 +2059,7 @@ static void bond_mii_monitor(struct net_ "%d ms.\n", bond_dev->name, IS_UP(slave_dev) - ? ((bond_mode == BOND_MODE_ACTIVEBACKUP) + ? ((bond->params.mode == BOND_MODE_ACTIVEBACKUP) ? ((slave == oldcurrent) ? 
"active " : "backup ") : "") @@ -2076,8 +2081,8 @@ static void bond_mii_monitor(struct net_ /* in active/backup mode, we must * completely disable this interface */ - if ((bond_mode == BOND_MODE_ACTIVEBACKUP) || - (bond_mode == BOND_MODE_8023AD)) { + if ((bond->params.mode == BOND_MODE_ACTIVEBACKUP) || + (bond->params.mode == BOND_MODE_8023AD)) { bond_set_slave_inactive_flags(slave); } @@ -2089,12 +2094,12 @@ static void bond_mii_monitor(struct net_ slave_dev->name); /* notify ad that the link status has changed */ - if (bond_mode == BOND_MODE_8023AD) { + if (bond->params.mode == BOND_MODE_8023AD) { bond_3ad_handle_link_change(slave, BOND_LINK_DOWN); } - if ((bond_mode == BOND_MODE_TLB) || - (bond_mode == BOND_MODE_ALB)) { + if ((bond->params.mode == BOND_MODE_TLB) || + (bond->params.mode == BOND_MODE_ALB)) { bond_alb_handle_link_change(bond, slave, BOND_LINK_DOWN); } @@ -2157,10 +2162,10 @@ static void bond_mii_monitor(struct net_ slave->link = BOND_LINK_UP; slave->jiffies = jiffies; - if (bond_mode == BOND_MODE_8023AD) { + if (bond->params.mode == BOND_MODE_8023AD) { /* prevent it from being the active one */ slave->state = BOND_STATE_BACKUP; - } else if (bond_mode != BOND_MODE_ACTIVEBACKUP) { + } else if (bond->params.mode != BOND_MODE_ACTIVEBACKUP) { /* make it immediately active */ slave->state = BOND_STATE_ACTIVE; } else if (slave != bond->primary_slave) { @@ -2175,12 +2180,12 @@ static void bond_mii_monitor(struct net_ slave_dev->name); /* notify ad that the link status has changed */ - if (bond_mode == BOND_MODE_8023AD) { + if (bond->params.mode == BOND_MODE_8023AD) { bond_3ad_handle_link_change(slave, BOND_LINK_UP); } - if ((bond_mode == BOND_MODE_TLB) || - (bond_mode == BOND_MODE_ALB)) { + if ((bond->params.mode == BOND_MODE_TLB) || + (bond->params.mode == BOND_MODE_ALB)) { bond_alb_handle_link_change(bond, slave, BOND_LINK_UP); } @@ -2202,7 +2207,7 @@ static void bond_mii_monitor(struct net_ bond_update_speed_duplex(slave); - if (bond_mode == 
BOND_MODE_8023AD) { + if (bond->params.mode == BOND_MODE_8023AD) { if (old_speed != slave->speed) { bond_3ad_adapter_speed_changed(slave); } @@ -2665,17 +2670,19 @@ static void bond_info_seq_stop(struct se read_unlock(&dev_base_lock); } -static void bond_info_show_master(struct seq_file *seq, struct bonding *bond) +static void bond_info_show_master(struct seq_file *seq) { + struct bonding *bond = seq->private; struct slave *curr; read_lock(&bond->curr_slave_lock); curr = bond->curr_active_slave; read_unlock(&bond->curr_slave_lock); - seq_printf(seq, "Bonding Mode: %s\n", bond_mode_name()); + seq_printf(seq, "Bonding Mode: %s\n", + bond_mode_name(bond->params.mode)); - if (USES_PRIMARY(bond_mode)) { + if (USES_PRIMARY(bond->params.mode)) { if (curr) { seq_printf(seq, "Currently Active Slave: %s\n", @@ -2688,7 +2695,7 @@ static void bond_info_show_master(struct seq_printf(seq, "Up Delay (ms): %d\n", updelay * miimon); seq_printf(seq, "Down Delay (ms): %d\n", downdelay * miimon); - if (bond_mode == BOND_MODE_8023AD) { + if (bond->params.mode == BOND_MODE_8023AD) { struct ad_info ad_info; seq_puts(seq, "\n802.3ad info\n"); @@ -2720,6 +2727,8 @@ static void bond_info_show_master(struct static void bond_info_show_slave(struct seq_file *seq, const struct slave *slave) { + struct bonding *bond = seq->private; + seq_printf(seq, "\nSlave Interface: %s\n", slave->dev->name); seq_printf(seq, "MII Status: %s\n", (slave->link == BOND_LINK_UP) ? 
"up" : "down"); @@ -2737,7 +2746,7 @@ static void bond_info_show_slave(struct slave->perm_hwaddr[5]); } - if (bond_mode == BOND_MODE_8023AD) { + if (bond->params.mode == BOND_MODE_8023AD) { const struct aggregator *agg = SLAVE_AD_INFO(slave).port.aggregator; @@ -2754,7 +2763,7 @@ static int bond_info_seq_show(struct seq { if (v == SEQ_START_TOKEN) { seq_printf(seq, "%s\n", version); - bond_info_show_master(seq, seq->private); + bond_info_show_master(seq); } else { bond_info_show_slave(seq, v); } @@ -3030,14 +3039,14 @@ static int bond_open(struct net_device * bond->kill_timers = 0; - if ((bond_mode == BOND_MODE_TLB) || - (bond_mode == BOND_MODE_ALB)) { + if ((bond->params.mode == BOND_MODE_TLB) || + (bond->params.mode == BOND_MODE_ALB)) { struct timer_list *alb_timer = &(BOND_ALB_INFO(bond).alb_timer); /* bond_alb_initialize must be called before the timer * is started. */ - if (bond_alb_initialize(bond, (bond_mode == BOND_MODE_ALB))) { + if (bond_alb_initialize(bond, (bond->params.mode == BOND_MODE_ALB))) { /* something went wrong - fail the open operation */ return -1; } @@ -3061,7 +3070,7 @@ static int bond_open(struct net_device * init_timer(arp_timer); arp_timer->expires = jiffies + 1; arp_timer->data = (unsigned long)bond_dev; - if (bond_mode == BOND_MODE_ACTIVEBACKUP) { + if (bond->params.mode == BOND_MODE_ACTIVEBACKUP) { arp_timer->function = (void *)&bond_activebackup_arp_mon; } else { arp_timer->function = (void *)&bond_loadbalance_arp_mon; @@ -3069,7 +3078,7 @@ static int bond_open(struct net_device * add_timer(arp_timer); } - if (bond_mode == BOND_MODE_8023AD) { + if (bond->params.mode == BOND_MODE_8023AD) { struct timer_list *ad_timer = &(BOND_AD_INFO(bond).ad_timer); init_timer(ad_timer); ad_timer->expires = jiffies + 1; @@ -3092,7 +3101,7 @@ static int bond_close(struct net_device bond_mc_list_destroy(bond); - if (bond_mode == BOND_MODE_8023AD) { + if (bond->params.mode == BOND_MODE_8023AD) { /* Unregister the receive of LACPDUs */ 
bond_unregister_lacpdu(bond); } @@ -3114,7 +3123,7 @@ static int bond_close(struct net_device del_timer_sync(&bond->arp_timer); } - switch (bond_mode) { + switch (bond->params.mode) { case BOND_MODE_8023AD: del_timer_sync(&(BOND_AD_INFO(bond).ad_timer)); break; @@ -3129,8 +3138,8 @@ static int bond_close(struct net_device /* Release the bonded slaves */ bond_release_all(bond_dev); - if ((bond_mode == BOND_MODE_TLB) || - (bond_mode == BOND_MODE_ALB)) { + if ((bond->params.mode == BOND_MODE_TLB) || + (bond->params.mode == BOND_MODE_ALB)) { /* Must be called only after all * slaves have been released */ @@ -3310,11 +3319,7 @@ static int bond_do_ioctl(struct net_devi break; case BOND_CHANGE_ACTIVE_OLD: case SIOCBONDCHANGEACTIVE: - if (USES_PRIMARY(bond_mode)) { - res = bond_ioctl_change_active(bond_dev, slave_dev); - } else { - res = -EINVAL; - } + res = bond_ioctl_change_active(bond_dev, slave_dev); break; default: res = -EOPNOTSUPP; @@ -3919,7 +3924,7 @@ static int bond_check_params(struct bond if (bond_mode != BOND_MODE_8023AD) { printk(KERN_INFO DRV_NAME ": lacp_rate param is irrelevant in mode %s\n", - bond_mode_name()); + bond_mode_name(bond_mode)); } else { lacp_fast = bond_parse_parm(lacp_rate, bond_lacp_tbl); if (lacp_fast == -1) { @@ -4120,7 +4125,7 @@ static int bond_check_params(struct bond printk(KERN_WARNING DRV_NAME ": Warning: %s primary device specified but has no " "effect in %s mode\n", - primary, bond_mode_name()); + primary, bond_mode_name(bond_mode)); primary = NULL; } From amir.noam@intel.com Mon Jan 5 07:29:00 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 05 Jan 2004 07:29:14 -0800 (PST) Received: from hermes.iil.intel.com (hermes.iil.intel.com [192.198.152.99]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i05FSvTa022693 for ; Mon, 5 Jan 2004 07:28:58 -0800 Received: from petasus.iil.intel.com (petasus.iil.intel.com [143.185.77.3]) by hermes.iil.intel.com (8.12.9-20030918-01/8.12.9/d: large-outer.mc,v 1.6 2003/12/18 18:57:17 root Exp 
$) with ESMTP id i05FShiI019235; Mon, 5 Jan 2004 15:28:43 GMT Received: from hasmsxvs01.iil.intel.com (hasmsxvs01.iil.intel.com [143.185.63.58]) by petasus.iil.intel.com (8.12.9-20030918-01/8.12.9/d: large-inner.mc,v 1.8 2003/12/18 18:57:16 root Exp $) with SMTP id i05FShwi008925; Mon, 5 Jan 2004 15:28:43 GMT Received: from sun111.npdj.intel.com ([10.12.254.111]) by hasmsxvs01.iil.intel.com (SAVSMTP 3.1.2.35) with SMTP id M2004010517284124329 ; Mon, 05 Jan 2004 17:28:41 +0200 Received: from jrslxjul4.npdj.intel.com (jrslxjul4 [10.12.220.54]) by sun111.npdj.intel.com (8.12.10/8.12.9/MailSET/Hub) with ESMTP id i05FSghb001106; Mon, 5 Jan 2004 17:28:42 +0200 (IST) From: Amir Noam To: "Jeff Garzik" , "Jay Vosburgh" Subject: [PATCH 3/3] [bonding 2.4] Use the per-bond values of all remaining parameters Date: Mon, 5 Jan 2004 17:28:41 +0200 User-Agent: KMail/1.5.3 Cc: bonding-devel@lists.sourceforge.net, netdev@oss.sgi.com MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200401051728.42301.amir.noam@intel.com> X-Scanned-By: MIMEDefang 2.31 (www . roaringpenguin . com / mimedefang) X-archive-position: 2216 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: amir.noam@intel.com Precedence: bulk X-list: netdev Change usage of all the remaining global parameters to the per-bond values. diff -Nuarp a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c --- a/drivers/net/bonding/bond_main.c Mon Jan 5 17:05:08 2004 +++ b/drivers/net/bonding/bond_main.c Mon Jan 5 17:05:10 2004 @@ -458,6 +458,9 @@ * - Fixed: Releasing the original active slave causes mac address duplication. * - Add support for slaves that use ethtool_ops. * Set version to 2.5.3. + * + * 2004/01/05 - Amir Noam + * - Save bonding parameters per bond instead of using the global values.
*/ //#define BONDING_DEBUG 1 @@ -699,14 +702,14 @@ verify: * It'd be nice if there was a good way to tell if a driver supports * netif_carrier, but there really isn't. */ -static int bond_check_dev_link(struct net_device *slave_dev, int reporting) +static int bond_check_dev_link(struct bonding *bond, struct net_device *slave_dev, int reporting) { static int (* ioctl)(struct net_device *, struct ifreq *, int); struct ifreq ifr; struct mii_ioctl_data *mii; struct ethtool_value etool; - if (use_carrier) { + if (bond->params.use_carrier) { return netif_carrier_ok(slave_dev) ? BMSR_LSTATUS : 0; } @@ -994,7 +997,7 @@ static struct slave *bond_find_best_slav { struct slave *new_active, *old_active; struct slave *bestslave = NULL; - int mintime; + int mintime = bond->params.updelay; int i; new_active = old_active = bond->curr_active_slave; @@ -1007,15 +1010,13 @@ static struct slave *bond_find_best_slav } } - mintime = updelay; - /* first try the primary link; if arping, a link must tx/rx traffic * before it can be considered the curr_active_slave - also, we would skip * slaves between the curr_active_slave and primary_slave that may be up * and able to arp */ if ((bond->primary_slave) && - (!arp_interval) && + (!bond->params.arp_interval) && (IS_UP(bond->primary_slave->dev))) { new_active = bond->primary_slave; } @@ -1070,7 +1071,7 @@ static void bond_change_active_slave(str ": %s: making interface %s the new " "active one %d ms earlier.\n", bond->dev->name, new_active->dev->name, - (updelay - new_active->delay) * miimon); + (bond->params.updelay - new_active->delay) * bond->params.miimon); } new_active->delay = 0; @@ -1374,10 +1375,10 @@ static int bond_enslave(struct net_devic new_slave->delay = 0; new_slave->link_failure_count = 0; - if (miimon && !use_carrier) { - link_reporting = bond_check_dev_link(slave_dev, 1); + if (bond->params.miimon && !bond->params.use_carrier) { + link_reporting = bond_check_dev_link(bond, slave_dev, 1); - if ((link_reporting == -1) && 
!arp_interval) { + if ((link_reporting == -1) && !bond->params.arp_interval) { /* * miimon is set but a bonded network driver * does not support ETHTOOL/MII and @@ -1407,13 +1408,13 @@ static int bond_enslave(struct net_devic } /* check for initial state */ - if (!miimon || - (bond_check_dev_link(slave_dev, 0) == BMSR_LSTATUS)) { - if (updelay) { + if (!bond->params.miimon || + (bond_check_dev_link(bond, slave_dev, 0) == BMSR_LSTATUS)) { + if (bond->params.updelay) { dprintk("Initial state of slave_dev is " "BOND_LINK_BACK\n"); new_slave->link = BOND_LINK_BACK; - new_slave->delay = updelay; + new_slave->delay = bond->params.updelay; } else { dprintk("Initial state of slave_dev is " "BOND_LINK_UP\n"); @@ -1441,9 +1442,9 @@ static int bond_enslave(struct net_devic } } - if (USES_PRIMARY(bond->params.mode) && primary) { + if (USES_PRIMARY(bond->params.mode) && bond->params.primary[0]) { /* if there is a primary slave, remember it */ - if (strcmp(primary, new_slave->dev->name) == 0) { + if (strcmp(bond->params.primary, new_slave->dev->name) == 0) { bond->primary_slave = new_slave; } } @@ -1482,7 +1483,7 @@ static int bond_enslave(struct net_devic * can be called only after the mac address of the bond is set */ bond_3ad_initialize(bond, 1000/AD_TIMER_INTERVAL, - lacp_fast); + bond->params.lacp_fast); } else { SLAVE_AD_INFO(new_slave).id = SLAVE_AD_INFO(new_slave->prev).id + 1; @@ -1958,7 +1959,7 @@ static int bond_info_query(struct net_de struct bonding *bond = bond_dev->priv; info->bond_mode = bond->params.mode; - info->miimon = miimon; + info->miimon = bond->params.miimon; read_lock_bh(&bond->lock); info->num_slaves = bond->slave_cnt; @@ -2008,11 +2009,13 @@ static void bond_mii_monitor(struct net_ struct bonding *bond = bond_dev->priv; struct slave *slave, *oldcurrent; int do_failover = 0; - int delta_in_ticks = (miimon * HZ) / 1000; + int delta_in_ticks; int i; read_lock(&bond->lock); + delta_in_ticks = (bond->params.miimon * HZ) / 1000; + if (bond->kill_timers) { 
goto out; } @@ -2037,7 +2040,7 @@ static void bond_mii_monitor(struct net_ u16 old_speed = slave->speed; u8 old_duplex = slave->duplex; - link_state = bond_check_dev_link(slave_dev, 0); + link_state = bond_check_dev_link(bond, slave_dev, 0); switch (slave->link) { case BOND_LINK_UP: /* the link was up */ @@ -2046,13 +2049,13 @@ static void bond_mii_monitor(struct net_ break; } else { /* link going down */ slave->link = BOND_LINK_FAIL; - slave->delay = downdelay; + slave->delay = bond->params.downdelay; if (slave->link_failure_count < UINT_MAX) { slave->link_failure_count++; } - if (downdelay) { + if (bond->params.downdelay) { printk(KERN_INFO DRV_NAME ": %s: link status down for %s " "interface %s, disabling it in " @@ -2065,7 +2068,7 @@ static void bond_mii_monitor(struct net_ : "") : "idle ", slave_dev->name, - downdelay * miimon); + bond->params.downdelay * bond->params.miimon); } } /* no break ! fall through the BOND_LINK_FAIL test to @@ -2117,7 +2120,7 @@ static void bond_mii_monitor(struct net_ ": %s: link status up again after %d " "ms for interface %s.\n", bond_dev->name, - (downdelay - slave->delay) * miimon, + (bond->params.downdelay - slave->delay) * bond->params.miimon, slave_dev->name); } break; @@ -2127,9 +2130,9 @@ static void bond_mii_monitor(struct net_ break; } else { /* link going up */ slave->link = BOND_LINK_BACK; - slave->delay = updelay; + slave->delay = bond->params.updelay; - if (updelay) { + if (bond->params.updelay) { /* if updelay == 0, no need to advertise about a 0 ms delay */ printk(KERN_INFO DRV_NAME @@ -2138,7 +2141,7 @@ static void bond_mii_monitor(struct net_ "in %d ms.\n", bond_dev->name, slave_dev->name, - updelay * miimon); + bond->params.updelay * bond->params.miimon); } } /* no break ! 
fall through the BOND_LINK_BACK state in @@ -2153,7 +2156,7 @@ static void bond_mii_monitor(struct net_ ": %s: link status down again after %d " "ms for interface %s.\n", bond_dev->name, - (updelay - slave->delay) * miimon, + (bond->params.updelay - slave->delay) * bond->params.miimon, slave_dev->name); } else { /* link stays up */ @@ -2235,17 +2238,20 @@ static void bond_mii_monitor(struct net_ } re_arm: - mod_timer(&bond->mii_timer, jiffies + delta_in_ticks); + if (bond->params.miimon) { + mod_timer(&bond->mii_timer, jiffies + delta_in_ticks); + } out: read_unlock(&bond->lock); } -static void bond_arp_send_all(struct slave *slave) +static void bond_arp_send_all(struct bonding *bond, struct slave *slave) { int i; + u32 *targets = bond->params.arp_targets; - for (i = 0; (i < BOND_MAX_ARP_TARGETS) && arp_target[i]; i++) { - arp_send(ARPOP_REQUEST, ETH_P_ARP, arp_target[i], slave->dev, + for (i = 0; (i < BOND_MAX_ARP_TARGETS) && targets[i]; i++) { + arp_send(ARPOP_REQUEST, ETH_P_ARP, targets[i], slave->dev, my_ip, NULL, slave->dev->dev_addr, NULL); } @@ -2263,11 +2269,13 @@ static void bond_loadbalance_arp_mon(str struct bonding *bond = bond_dev->priv; struct slave *slave, *oldcurrent; int do_failover = 0; - int delta_in_ticks = (arp_interval * HZ) / 1000; + int delta_in_ticks; int i; read_lock(&bond->lock); + delta_in_ticks = (bond->params.arp_interval * HZ) / 1000; + if (bond->kill_timers) { goto out; } @@ -2352,7 +2360,7 @@ static void bond_loadbalance_arp_mon(str * to be unstable during low/no traffic periods */ if (IS_UP(slave->dev)) { - bond_arp_send_all(slave); + bond_arp_send_all(bond, slave); } } @@ -2372,7 +2380,9 @@ static void bond_loadbalance_arp_mon(str } re_arm: - mod_timer(&bond->arp_timer, jiffies + delta_in_ticks); + if (bond->params.arp_interval) { + mod_timer(&bond->arp_timer, jiffies + delta_in_ticks); + } out: read_unlock(&bond->lock); } @@ -2396,11 +2406,13 @@ static void bond_activebackup_arp_mon(st { struct bonding *bond = bond_dev->priv; 
struct slave *slave; - int delta_in_ticks = (arp_interval * HZ) / 1000; + int delta_in_ticks; int i; read_lock(&bond->lock); + delta_in_ticks = (bond->params.arp_interval * HZ) / 1000; + if (bond->kill_timers) { goto out; } @@ -2559,7 +2571,7 @@ static void bond_activebackup_arp_mon(st * rx traffic */ if (slave && my_ip) { - bond_arp_send_all(slave); + bond_arp_send_all(bond, slave); } } @@ -2580,7 +2592,7 @@ static void bond_activebackup_arp_mon(st if (IS_UP(slave->dev)) { slave->link = BOND_LINK_BACK; bond_set_slave_active_flags(slave); - bond_arp_send_all(slave); + bond_arp_send_all(bond, slave); slave->jiffies = jiffies; bond->current_arp_slave = slave; break; @@ -2612,7 +2624,9 @@ static void bond_activebackup_arp_mon(st } re_arm: - mod_timer(&bond->arp_timer, jiffies + delta_in_ticks); + if (bond->params.arp_interval) { + mod_timer(&bond->arp_timer, jiffies + delta_in_ticks); + } out: read_unlock(&bond->lock); } @@ -2683,22 +2697,27 @@ static void bond_info_show_master(struct bond_mode_name(bond->params.mode)); if (USES_PRIMARY(bond->params.mode)) { - if (curr) { - seq_printf(seq, - "Currently Active Slave: %s\n", - curr->dev->name); - } + seq_printf(seq, "Primary Slave: %s\n", + (bond->params.primary[0]) ? + bond->params.primary : "None"); + + seq_printf(seq, "Currently Active Slave: %s\n", + (curr) ? curr->dev->name : "None"); } seq_printf(seq, "MII Status: %s\n", (curr) ? 
"up" : "down"); - seq_printf(seq, "MII Polling Interval (ms): %d\n", miimon); - seq_printf(seq, "Up Delay (ms): %d\n", updelay * miimon); - seq_printf(seq, "Down Delay (ms): %d\n", downdelay * miimon); + seq_printf(seq, "MII Polling Interval (ms): %d\n", bond->params.miimon); + seq_printf(seq, "Up Delay (ms): %d\n", + bond->params.updelay * bond->params.miimon); + seq_printf(seq, "Down Delay (ms): %d\n", + bond->params.downdelay * bond->params.miimon); if (bond->params.mode == BOND_MODE_8023AD) { struct ad_info ad_info; seq_puts(seq, "\n802.3ad info\n"); + seq_printf(seq, "LACP rate: %s\n", + (bond->params.lacp_fast) ? "fast" : "slow"); if (bond_3ad_get_active_agg_info(bond, &ad_info)) { seq_printf(seq, "bond %s has no active aggregator\n", @@ -3058,7 +3077,7 @@ static int bond_open(struct net_device * add_timer(alb_timer); } - if (miimon) { /* link check interval, in milliseconds. */ + if (bond->params.miimon) { /* link check interval, in milliseconds. */ init_timer(mii_timer); mii_timer->expires = jiffies + 1; mii_timer->data = (unsigned long)bond_dev; @@ -3066,7 +3085,7 @@ static int bond_open(struct net_device * add_timer(mii_timer); } - if (arp_interval) { /* arp interval, in milliseconds. */ + if (bond->params.arp_interval) { /* arp interval, in milliseconds. */ init_timer(arp_timer); arp_timer->expires = jiffies + 1; arp_timer->data = (unsigned long)bond_dev; @@ -3115,11 +3134,11 @@ static int bond_close(struct net_device * because a running timer might be trying to hold it too */ - if (miimon) { /* link check interval, in milliseconds. */ + if (bond->params.miimon) { /* link check interval, in milliseconds. */ del_timer_sync(&bond->mii_timer); } - if (arp_interval) { /* arp interval, in milliseconds. */ + if (bond->params.arp_interval) { /* arp interval, in milliseconds. 
*/ del_timer_sync(&bond->arp_timer); } @@ -3602,7 +3621,7 @@ static int bond_xmit_activebackup(struct /* if we are sending arp packets, try to at least identify our own ip address */ - if (arp_interval && !my_ip && + if (bond->params.arp_interval && !my_ip && (skb->protocol == __constant_htons(ETH_P_ARP))) { char *the_ip = (char *)skb->data + sizeof(struct ethhdr) + From amir.noam@intel.com Mon Jan 5 07:30:01 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 05 Jan 2004 07:30:14 -0800 (PST) Received: from hermes.iil.intel.com (hermes.iil.intel.com [192.198.152.99]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i05FTxTa023161 for ; Mon, 5 Jan 2004 07:30:00 -0800 Received: from petasus.iil.intel.com (petasus.iil.intel.com [143.185.77.3]) by hermes.iil.intel.com (8.12.9-20030918-01/8.12.9/d: large-outer.mc,v 1.6 2003/12/18 18:57:17 root Exp $) with ESMTP id i05FTkiI019313; Mon, 5 Jan 2004 15:29:46 GMT Received: from hasmsxvs01.iil.intel.com (hasmsxvs01.iil.intel.com [143.185.63.58]) by petasus.iil.intel.com (8.12.9-20030918-01/8.12.9/d: large-inner.mc,v 1.8 2003/12/18 18:57:16 root Exp $) with SMTP id i05FTbwm008981; Mon, 5 Jan 2004 15:29:46 GMT Received: from sun111.npdj.intel.com ([10.12.254.111]) by hasmsxvs01.iil.intel.com (SAVSMTP 3.1.2.35) with SMTP id M2004010517294524378 ; Mon, 05 Jan 2004 17:29:45 +0200 Received: from jrslxjul4.npdj.intel.com (jrslxjul4 [10.12.220.54]) by sun111.npdj.intel.com (8.12.10/8.12.9/MailSET/Hub) with ESMTP id i05FTjhb001129; Mon, 5 Jan 2004 17:29:45 +0200 (IST) From: Amir Noam To: "Jeff Garzik" , "Jay Vosburgh" Subject: [PATCH 0/3] [bonding 2.6] Using per-bond parameters Date: Mon, 5 Jan 2004 17:29:45 +0200 User-Agent: KMail/1.5.3 Cc: bonding-devel@lists.sourceforge.net, netdev@oss.sgi.com MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200401051729.45524.amir.noam@intel.com> X-Scanned-By: MIMEDefang 2.31 (www . roaringpenguin . 
com / mimedefang) X-archive-position: 2217 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: amir.noam@intel.com Precedence: bulk X-list: netdev The following patch set makes each bonding interface keep its own set of parameters, rather than using global values. This is the first step toward allowing different bonding interfaces to be configured with different parameters. The patches are against the net-drivers-2.5-exp tree (after Shmulik's 'update comment blocks' patch). -- Amir From amir.noam@intel.com Mon Jan 5 07:30:12 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 05 Jan 2004 07:30:25 -0800 (PST) Received: from hermes.iil.intel.com (hermes.iil.intel.com [192.198.152.99]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i05FUATa023183 for ; Mon, 5 Jan 2004 07:30:11 -0800 Received: from petasus.iil.intel.com (petasus.iil.intel.com [143.185.77.3]) by hermes.iil.intel.com (8.12.9-20030918-01/8.12.9/d: large-outer.mc,v 1.6 2003/12/18 18:57:17 root Exp $) with ESMTP id i05FTsiI019317; Mon, 5 Jan 2004 15:29:54 GMT Received: from hasmsxvs01.iil.intel.com (hasmsxvs01.iil.intel.com [143.185.63.58]) by petasus.iil.intel.com (8.12.9-20030918-01/8.12.9/d: large-inner.mc,v 1.8 2003/12/18 18:57:16 root Exp $) with SMTP id i05FTrwi008988; Mon, 5 Jan 2004 15:29:53 GMT Received: from sun111.npdj.intel.com ([10.12.254.111]) by hasmsxvs01.iil.intel.com (SAVSMTP 3.1.2.35) with SMTP id M2004010517295224380 ; Mon, 05 Jan 2004 17:29:52 +0200 Received: from jrslxjul4.npdj.intel.com (jrslxjul4 [10.12.220.54]) by sun111.npdj.intel.com (8.12.10/8.12.9/MailSET/Hub) with ESMTP id i05FTqhb001132; Mon, 5 Jan 2004 17:29:52 +0200 (IST) From: Amir Noam To: "Jeff Garzik" , "Jay Vosburgh" Subject: [PATCH 1/3] [bonding 2.6] Save parameters in a per-bond data structure Date: Mon, 5 Jan 2004 17:29:51 +0200 User-Agent: KMail/1.5.3 Cc: bonding-devel@lists.sourceforge.net, netdev@oss.sgi.com MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200401051729.52769.amir.noam@intel.com> X-Scanned-By: MIMEDefang 2.31 (www . roaringpenguin . com / mimedefang) X-archive-position: 2218 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: amir.noam@intel.com Precedence: bulk X-list: netdev - Save the bonding parameters in a per-bond data structure. - Move all handling of the insmod parameters to bond_check_params(). - Fix the handling of some warning messages regarding parameter use. diff -Nuarp a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c --- a/drivers/net/bonding/bond_main.c Mon Jan 5 17:17:32 2004 +++ b/drivers/net/bonding/bond_main.c Mon Jan 5 17:17:33 2004 @@ -509,7 +509,6 @@ /* monitor all links that often (in milliseconds). <=0 disables monitoring */ #define BOND_LINK_MON_INTERV 0 #define BOND_LINK_ARP_INTERV 0 -#define MAX_ARP_IP_TARGETS 16 static int max_bonds = BOND_DEFAULT_MAX_BONDS; static int miimon = BOND_LINK_MON_INTERV; @@ -520,7 +519,7 @@ static char *mode = NULL; static char *primary = NULL; static char *lacp_rate = NULL; static int arp_interval = BOND_LINK_ARP_INTERV; -static char *arp_ip_target[MAX_ARP_IP_TARGETS] = { NULL, }; +static char *arp_ip_target[BOND_MAX_ARP_TARGETS] = { NULL, }; MODULE_PARM(max_bonds, "i"); MODULE_PARM_DESC(max_bonds, "Max number of bonded devices"); @@ -540,7 +539,7 @@ MODULE_PARM(lacp_rate, "s"); MODULE_PARM_DESC(lacp_rate, "LACPDU tx rate to request from 802.3ad partner (slow/fast)"); MODULE_PARM(arp_interval, "i"); MODULE_PARM_DESC(arp_interval, "arp interval in milliseconds"); -MODULE_PARM(arp_ip_target, "1-" __MODULE_STRING(MAX_ARP_IP_TARGETS) "s"); +MODULE_PARM(arp_ip_target, "1-" __MODULE_STRING(BOND_MAX_ARP_TARGETS) "s"); MODULE_PARM_DESC(arp_ip_target, "arp targets in n.n.n.n form"); /*----------------------------- Global variables 
----------------------------*/ @@ -554,7 +553,7 @@ static LIST_HEAD(bond_dev_list); static struct proc_dir_entry *bond_proc_dir = NULL; #endif -static u32 arp_target[MAX_ARP_IP_TARGETS] = { 0, } ; +static u32 arp_target[BOND_MAX_ARP_TARGETS] = { 0, } ; static int arp_ip_count = 0; static u32 my_ip = 0; static int bond_mode = BOND_MODE_ROUNDROBIN; @@ -590,6 +589,10 @@ static struct bond_parm_tbl bond_mode_tb { NULL, -1}, }; +/*-------------------------- Forward declarations ---------------------------*/ + +static inline void bond_set_mode_ops(struct net_device *bond_dev, int mode); + /*---------------------------- General routines -----------------------------*/ static const char *bond_mode_name(void) @@ -2236,7 +2239,7 @@ static void bond_arp_send_all(struct sla { int i; - for (i = 0; (i<MAX_ARP_IP_TARGETS) && arp_target[i]; i++) { + for (i = 0; (i < BOND_MAX_ARP_TARGETS) && arp_target[i]; i++) { arp_send(ARPOP_REQUEST, ETH_P_ARP, arp_target[i], slave->dev, my_ip, NULL, slave->dev->dev_addr, NULL); @@ -3754,13 +3757,47 @@ static int bond_accept_fastpath(struct n /*------------------------- Device initialization ---------------------------*/ /* + * set bond mode specific net device operations + */ +static inline void bond_set_mode_ops(struct net_device *bond_dev, int mode) +{ + switch (mode) { + case BOND_MODE_ROUNDROBIN: + bond_dev->hard_start_xmit = bond_xmit_roundrobin; + break; + case BOND_MODE_ACTIVEBACKUP: + bond_dev->hard_start_xmit = bond_xmit_activebackup; + break; + case BOND_MODE_XOR: + bond_dev->hard_start_xmit = bond_xmit_xor; + break; + case BOND_MODE_BROADCAST: + bond_dev->hard_start_xmit = bond_xmit_broadcast; + break; + case BOND_MODE_8023AD: + bond_dev->hard_start_xmit = bond_3ad_xmit_xor; + break; + case BOND_MODE_TLB: + case BOND_MODE_ALB: + bond_dev->hard_start_xmit = bond_alb_xmit; + bond_dev->set_mac_address = bond_alb_set_mac_address; + break; + default: + /* Should never happen, mode already checked */ + printk(KERN_ERR DRV_NAME + ": Error: Unknown bonding mode %d\n", + mode); + break; + } +} + +/* * Does not allocate but creates a /proc entry. * Allowed to fail.
*/ -static int __init bond_init(struct net_device *bond_dev) +static int __init bond_init(struct net_device *bond_dev, struct bond_params *params) { struct bonding *bond = bond_dev->priv; - int count; dprintk("Begin bond_init for %s\n", bond_dev->name); @@ -3768,6 +3805,8 @@ static int __init bond_init(struct net_d rwlock_init(&bond->lock); rwlock_init(&bond->curr_slave_lock); + bond->params = *params; /* copy params struct */ + /* Initialize pointers */ bond->first_slave = NULL; bond->curr_active_slave = NULL; @@ -3784,33 +3823,7 @@ static int __init bond_init(struct net_d bond_dev->change_mtu = bond_change_mtu; bond_dev->set_mac_address = bond_set_mac_address; - switch (bond_mode) { - case BOND_MODE_ROUNDROBIN: - bond_dev->hard_start_xmit = bond_xmit_roundrobin; - break; - case BOND_MODE_ACTIVEBACKUP: - bond_dev->hard_start_xmit = bond_xmit_activebackup; - break; - case BOND_MODE_XOR: - bond_dev->hard_start_xmit = bond_xmit_xor; - break; - case BOND_MODE_BROADCAST: - bond_dev->hard_start_xmit = bond_xmit_broadcast; - break; - case BOND_MODE_8023AD: - bond_dev->hard_start_xmit = bond_3ad_xmit_xor; /* extern */ - break; - case BOND_MODE_TLB: - case BOND_MODE_ALB: - bond_dev->hard_start_xmit = bond_alb_xmit; /* extern */ - bond_dev->set_mac_address = bond_alb_set_mac_address; /* extern */ - break; - default: - printk(KERN_ERR DRV_NAME - ": Error: Unknown bonding mode %d\n", - bond_mode); - return -EINVAL; - } + bond_set_mode_ops(bond_dev, bond->params.mode); bond_dev->destructor = free_netdev; #ifdef CONFIG_NET_FASTROUTE @@ -3821,27 +3834,6 @@ static int __init bond_init(struct net_d bond_dev->tx_queue_len = 0; bond_dev->flags |= IFF_MASTER|IFF_MULTICAST; - printk(KERN_INFO DRV_NAME ": %s registered with", bond_dev->name); - if (miimon) { - printk(" MII link monitoring set to %d ms", miimon); - updelay /= miimon; - downdelay /= miimon; - } else { - printk("out MII link monitoring"); - } - printk(", in %s mode.\n", bond_mode_name()); - - printk(KERN_INFO DRV_NAME ": 
%s registered with", bond_dev->name); - if (arp_interval > 0) { - printk(" ARP monitoring set to %d ms with %d target(s):", - arp_interval, arp_ip_count); - for (count=0 ; count [...] + params->mode = bond_mode; + params->miimon = miimon; + params->arp_interval = arp_interval; + params->updelay = updelay; + params->downdelay = downdelay; + params->use_carrier = use_carrier; + params->lacp_fast = lacp_fast; + params->primary[0] = 0; + + if (primary) { + strncpy(params->primary, primary, IFNAMSIZ); + params->primary[IFNAMSIZ - 1] = 0; + } + + memcpy(params->arp_targets, arp_target, sizeof(arp_target)); + return 0; } static int __init bonding_init(void) { + struct bond_params params; int i; int res; printk(KERN_INFO "%s", version); - res = bond_check_params(); + res = bond_check_params(&params); if (res) { return res; } @@ -4157,7 +4181,7 @@ static int __init bonding_init(void) * /proc files), but before register_netdevice(), because we * need to set function pointers. */ - res = bond_init(bond_dev); + res = bond_init(bond_dev, &params); if (res < 0) { free_netdev(bond_dev); goto out_err; diff -Nuarp a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h --- a/drivers/net/bonding/bonding.h Mon Jan 5 17:17:32 2004 +++ b/drivers/net/bonding/bonding.h Mon Jan 5 17:17:33 2004 @@ -36,11 +36,13 @@ #include "bond_3ad.h" #include "bond_alb.h" -#define DRV_VERSION "2.5.3" +#define DRV_VERSION "2.5.4" #define DRV_RELDATE "December 30, 2003" #define DRV_NAME "bonding" #define DRV_DESCRIPTION "Ethernet Channel Bonding Driver" +#define BOND_MAX_ARP_TARGETS 16 + #ifdef BONDING_DEBUG #define dprintk(fmt, args...) 
\ printk(KERN_DEBUG \ @@ -133,6 +135,18 @@ bond_for_each_slave_from(bond, pos, cnt, (bond)->first_slave) +struct bond_params { + int mode; + int miimon; + int arp_interval; + int use_carrier; + int updelay; + int downdelay; + int lacp_fast; + char primary[IFNAMSIZ]; + u32 arp_targets[BOND_MAX_ARP_TARGETS]; +}; + struct slave { struct net_device *dev; /* first - usefull for panic debug */ struct slave *next; @@ -181,6 +195,7 @@ struct bonding { u16 flags; struct ad_bond_info ad_info; struct alb_bond_info alb_info; + struct bond_params params; }; /** From amir.noam@intel.com Mon Jan 5 07:30:20 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 05 Jan 2004 07:30:33 -0800 (PST) Received: from hermes.iil.intel.com (hermes.iil.intel.com [192.198.152.99]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i05FUGTa023186 for ; Mon, 5 Jan 2004 07:30:18 -0800 Received: from petasus.iil.intel.com (petasus.iil.intel.com [143.185.77.3]) by hermes.iil.intel.com (8.12.9-20030918-01/8.12.9/d: large-outer.mc,v 1.6 2003/12/18 18:57:17 root Exp $) with ESMTP id i05FU2iI019364; Mon, 5 Jan 2004 15:30:02 GMT Received: from hasmsxvs01.iil.intel.com (hasmsxvs01.iil.intel.com [143.185.63.58]) by petasus.iil.intel.com (8.12.9-20030918-01/8.12.9/d: large-inner.mc,v 1.8 2003/12/18 18:57:16 root Exp $) with SMTP id i05FU1wi009034; Mon, 5 Jan 2004 15:30:02 GMT Received: from sun111.npdj.intel.com ([10.12.254.111]) by hasmsxvs01.iil.intel.com (SAVSMTP 3.1.2.35) with SMTP id M2004010517300024382 ; Mon, 05 Jan 2004 17:30:01 +0200 Received: from jrslxjul4.npdj.intel.com (jrslxjul4 [10.12.220.54]) by sun111.npdj.intel.com (8.12.10/8.12.9/MailSET/Hub) with ESMTP id i05FU1hb001173; Mon, 5 Jan 2004 17:30:01 +0200 (IST) From: Amir Noam To: "Jeff Garzik" , "Jay Vosburgh" Subject: [PATCH 2/3] [bonding 2.6] Use the per-bond value of the bond_mode parameter Date: Mon, 5 Jan 2004 17:30:00 +0200 User-Agent: KMail/1.5.3 Cc: bonding-devel@lists.sourceforge.net, netdev@oss.sgi.com MIME-Version: 1.0 Content-Type: 
text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200401051730.01365.amir.noam@intel.com> X-Scanned-By: MIMEDefang 2.31 (www . roaringpenguin . com / mimedefang) X-archive-position: 2219 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: amir.noam@intel.com Precedence: bulk X-list: netdev Change usage of the global 'bond_mode' parameter to the per-bond value. diff -Nuarp a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c --- a/drivers/net/bonding/bond_main.c Mon Jan 5 17:17:35 2004 +++ b/drivers/net/bonding/bond_main.c Mon Jan 5 17:17:37 2004 @@ -595,9 +595,9 @@ static inline void bond_set_mode_ops(str /*---------------------------- General routines -----------------------------*/ -static const char *bond_mode_name(void) +static const char *bond_mode_name(int mode) { - switch (bond_mode) { + switch (mode) { case BOND_MODE_ROUNDROBIN : return "load balancing (round-robin)"; case BOND_MODE_ACTIVEBACKUP : @@ -803,7 +803,7 @@ static struct dev_mc_list *bond_mc_list_ */ static void bond_set_promiscuity(struct bonding *bond, int inc) { - if (USES_PRIMARY(bond_mode)) { + if (USES_PRIMARY(bond->params.mode)) { /* write lock already acquired */ if (bond->curr_active_slave) { dev_set_promiscuity(bond->curr_active_slave->dev, inc); @@ -822,7 +822,7 @@ static void bond_set_promiscuity(struct */ static void bond_set_allmulti(struct bonding *bond, int inc) { - if (USES_PRIMARY(bond_mode)) { + if (USES_PRIMARY(bond->params.mode)) { /* write lock already acquired */ if (bond->curr_active_slave) { dev_set_allmulti(bond->curr_active_slave->dev, inc); @@ -842,7 +842,7 @@ static void bond_set_allmulti(struct bon */ static void bond_mc_add(struct bonding *bond, void *addr, int alen) { - if (USES_PRIMARY(bond_mode)) { + if (USES_PRIMARY(bond->params.mode)) { /* write lock already acquired */ if (bond->curr_active_slave) { 
dev_mc_add(bond->curr_active_slave->dev, addr, alen, 0); @@ -862,7 +862,7 @@ static void bond_mc_add(struct bonding * */ static void bond_mc_delete(struct bonding *bond, void *addr, int alen) { - if (USES_PRIMARY(bond_mode)) { + if (USES_PRIMARY(bond->params.mode)) { /* write lock already acquired */ if (bond->curr_active_slave) { dev_mc_delete(bond->curr_active_slave->dev, addr, alen, 0); @@ -922,13 +922,14 @@ static int bond_mc_list_copy(struct dev_ */ static void bond_mc_list_flush(struct net_device *bond_dev, struct net_device *slave_dev) { + struct bonding *bond = bond_dev->priv; struct dev_mc_list *dmi; for (dmi = bond_dev->mc_list; dmi; dmi = dmi->next) { dev_mc_delete(slave_dev, dmi->dmi_addr, dmi->dmi_addrlen, 0); } - if (bond_mode == BOND_MODE_8023AD) { + if (bond->params.mode == BOND_MODE_8023AD) { /* del lacpdu mc addr from mc list */ u8 lacpdu_multicast[ETH_ALEN] = MULTICAST_LACPDU_ADDR; @@ -947,7 +948,7 @@ static void bond_mc_swap(struct bonding { struct dev_mc_list *dmi; - if (!USES_PRIMARY(bond_mode)) { + if (!USES_PRIMARY(bond->params.mode)) { /* nothing to do - mc list is already up-to-date on * all slaves */ @@ -1064,7 +1065,7 @@ static void bond_change_active_slave(str if (new_active) { if (new_active->link == BOND_LINK_BACK) { - if (USES_PRIMARY(bond_mode)) { + if (USES_PRIMARY(bond->params.mode)) { printk(KERN_INFO DRV_NAME ": %s: making interface %s the new " "active one %d ms earlier.\n", @@ -1076,16 +1077,16 @@ static void bond_change_active_slave(str new_active->link = BOND_LINK_UP; new_active->jiffies = jiffies; - if (bond_mode == BOND_MODE_8023AD) { + if (bond->params.mode == BOND_MODE_8023AD) { bond_3ad_handle_link_change(new_active, BOND_LINK_UP); } - if ((bond_mode == BOND_MODE_TLB) || - (bond_mode == BOND_MODE_ALB)) { + if ((bond->params.mode == BOND_MODE_TLB) || + (bond->params.mode == BOND_MODE_ALB)) { bond_alb_handle_link_change(bond, new_active, BOND_LINK_UP); } } else { - if (USES_PRIMARY(bond_mode)) { + if 
(USES_PRIMARY(bond->params.mode)) { printk(KERN_INFO DRV_NAME ": %s: making interface %s the new " "active one.\n", @@ -1094,7 +1095,7 @@ static void bond_change_active_slave(str } } - if (bond_mode == BOND_MODE_ACTIVEBACKUP) { + if (bond->params.mode == BOND_MODE_ACTIVEBACKUP) { if (old_active) { bond_set_slave_inactive_flags(old_active); } @@ -1104,12 +1105,12 @@ static void bond_change_active_slave(str } } - if (USES_PRIMARY(bond_mode)) { + if (USES_PRIMARY(bond->params.mode)) { bond_mc_swap(bond, new_active, old_active); } - if ((bond_mode == BOND_MODE_TLB) || - (bond_mode == BOND_MODE_ALB)) { + if ((bond->params.mode == BOND_MODE_TLB) || + (bond->params.mode == BOND_MODE_ALB)) { bond_alb_handle_active_change(bond, new_active); } else { bond->curr_active_slave = new_active; @@ -1264,13 +1265,13 @@ static int bond_enslave(struct net_devic return -EINVAL; } - if ((bond_mode == BOND_MODE_8023AD) || - (bond_mode == BOND_MODE_TLB) || - (bond_mode == BOND_MODE_ALB)) { + if ((bond->params.mode == BOND_MODE_8023AD) || + (bond->params.mode == BOND_MODE_TLB) || + (bond->params.mode == BOND_MODE_ALB)) { printk(KERN_ERR DRV_NAME ": Error: to use %s mode, you must upgrade " "ifenslave.\n", - bond_mode_name()); + bond_mode_name(bond->params.mode)); return -EOPNOTSUPP; } } @@ -1326,8 +1327,8 @@ static int bond_enslave(struct net_devic new_slave->dev = slave_dev; - if ((bond_mode == BOND_MODE_TLB) || - (bond_mode == BOND_MODE_ALB)) { + if ((bond->params.mode == BOND_MODE_TLB) || + (bond->params.mode == BOND_MODE_ALB)) { /* bond_alb_init_slave() must be called before all other stages since * it might fail and we do not want to have to undo everything */ @@ -1342,7 +1343,7 @@ static int bond_enslave(struct net_devic * curr_active_slave, and that is taken care of later when calling * bond_change_active() */ - if (!USES_PRIMARY(bond_mode)) { + if (!USES_PRIMARY(bond->params.mode)) { /* set promiscuity level to new slave */ if (bond_dev->flags & IFF_PROMISC) { 
dev_set_promiscuity(slave_dev, 1); @@ -1359,7 +1360,7 @@ static int bond_enslave(struct net_devic } } - if (bond_mode == BOND_MODE_8023AD) { + if (bond->params.mode == BOND_MODE_8023AD) { /* add lacpdu mc addr to mc list */ u8 lacpdu_multicast[ETH_ALEN] = MULTICAST_LACPDU_ADDR; @@ -1432,7 +1433,7 @@ static int bond_enslave(struct net_devic "forced to 100Mbps, duplex forced to Full.\n", new_slave->dev->name); - if (bond_mode == BOND_MODE_8023AD) { + if (bond->params.mode == BOND_MODE_8023AD) { printk(KERN_WARNING "Operation of 802.3ad mode requires ETHTOOL " "support in base driver for proper aggregator " @@ -1440,14 +1441,14 @@ static int bond_enslave(struct net_devic } } - if (USES_PRIMARY(bond_mode) && primary) { + if (USES_PRIMARY(bond->params.mode) && primary) { /* if there is a primary slave, remember it */ if (strcmp(primary, new_slave->dev->name) == 0) { bond->primary_slave = new_slave; } } - switch (bond_mode) { + switch (bond->params.mode) { case BOND_MODE_ACTIVEBACKUP: /* if we're in active-backup mode, we need one and only one active * interface. The backup interfaces will have their NOARP flag set @@ -1637,7 +1638,7 @@ static int bond_release(struct net_devic } /* Inform AD package of unbinding of slave. 
*/ - if (bond_mode == BOND_MODE_8023AD) { + if (bond->params.mode == BOND_MODE_8023AD) { /* must be called before the slave is * detached from the list */ @@ -1666,8 +1667,8 @@ static int bond_release(struct net_devic bond_change_active_slave(bond, NULL); } - if ((bond_mode == BOND_MODE_TLB) || - (bond_mode == BOND_MODE_ALB)) { + if ((bond->params.mode == BOND_MODE_TLB) || + (bond->params.mode == BOND_MODE_ALB)) { /* Must be called only after the slave has been * detached from the list and the curr_active_slave * has been cleared (if our_slave == old_current), @@ -1693,7 +1694,7 @@ static int bond_release(struct net_devic * promisc and mc settings if it was the curr_active_slave, but that was * already taken care of above when we detached the slave */ - if (!USES_PRIMARY(bond_mode)) { + if (!USES_PRIMARY(bond->params.mode)) { /* unset promiscuity level from slave */ if (bond_dev->flags & IFF_PROMISC) { dev_set_promiscuity(slave_dev, -1); @@ -1765,15 +1766,15 @@ static int bond_release_all(struct net_d /* Inform AD package of unbinding of slave * before slave is detached from the list. 
*/ - if (bond_mode == BOND_MODE_8023AD) { + if (bond->params.mode == BOND_MODE_8023AD) { bond_3ad_unbind_slave(slave); } slave_dev = slave->dev; bond_detach_slave(bond, slave); - if ((bond_mode == BOND_MODE_TLB) || - (bond_mode == BOND_MODE_ALB)) { + if ((bond->params.mode == BOND_MODE_TLB) || + (bond->params.mode == BOND_MODE_ALB)) { /* must be called only after the slave * has been detached from the list */ @@ -1790,7 +1791,7 @@ static int bond_release_all(struct net_d * promisc and mc settings if it was the curr_active_slave, but that was * already taken care of above when we detached the slave */ - if (!USES_PRIMARY(bond_mode)) { + if (!USES_PRIMARY(bond->params.mode)) { /* unset promiscuity level from slave */ if (bond_dev->flags & IFF_PROMISC) { dev_set_promiscuity(slave_dev, -1); @@ -1864,6 +1865,10 @@ static int bond_ioctl_change_active(stru struct slave *new_active = NULL; int res = 0; + if (!USES_PRIMARY(bond->params.mode)) { + return -EINVAL; + } + /* Verify that master_dev is indeed the master of slave_dev */ if (!(slave_dev->flags & IFF_SLAVE) || (slave_dev->master != bond_dev)) { @@ -1952,7 +1957,7 @@ static int bond_info_query(struct net_de { struct bonding *bond = bond_dev->priv; - info->bond_mode = bond_mode; + info->bond_mode = bond->params.mode; info->miimon = miimon; read_lock_bh(&bond->lock); @@ -2054,7 +2059,7 @@ static void bond_mii_monitor(struct net_ "%d ms.\n", bond_dev->name, IS_UP(slave_dev) - ? ((bond_mode == BOND_MODE_ACTIVEBACKUP) + ? ((bond->params.mode == BOND_MODE_ACTIVEBACKUP) ? ((slave == oldcurrent) ? 
"active " : "backup ") : "") @@ -2076,8 +2081,8 @@ static void bond_mii_monitor(struct net_ /* in active/backup mode, we must * completely disable this interface */ - if ((bond_mode == BOND_MODE_ACTIVEBACKUP) || - (bond_mode == BOND_MODE_8023AD)) { + if ((bond->params.mode == BOND_MODE_ACTIVEBACKUP) || + (bond->params.mode == BOND_MODE_8023AD)) { bond_set_slave_inactive_flags(slave); } @@ -2089,12 +2094,12 @@ static void bond_mii_monitor(struct net_ slave_dev->name); /* notify ad that the link status has changed */ - if (bond_mode == BOND_MODE_8023AD) { + if (bond->params.mode == BOND_MODE_8023AD) { bond_3ad_handle_link_change(slave, BOND_LINK_DOWN); } - if ((bond_mode == BOND_MODE_TLB) || - (bond_mode == BOND_MODE_ALB)) { + if ((bond->params.mode == BOND_MODE_TLB) || + (bond->params.mode == BOND_MODE_ALB)) { bond_alb_handle_link_change(bond, slave, BOND_LINK_DOWN); } @@ -2157,10 +2162,10 @@ static void bond_mii_monitor(struct net_ slave->link = BOND_LINK_UP; slave->jiffies = jiffies; - if (bond_mode == BOND_MODE_8023AD) { + if (bond->params.mode == BOND_MODE_8023AD) { /* prevent it from being the active one */ slave->state = BOND_STATE_BACKUP; - } else if (bond_mode != BOND_MODE_ACTIVEBACKUP) { + } else if (bond->params.mode != BOND_MODE_ACTIVEBACKUP) { /* make it immediately active */ slave->state = BOND_STATE_ACTIVE; } else if (slave != bond->primary_slave) { @@ -2175,12 +2180,12 @@ static void bond_mii_monitor(struct net_ slave_dev->name); /* notify ad that the link status has changed */ - if (bond_mode == BOND_MODE_8023AD) { + if (bond->params.mode == BOND_MODE_8023AD) { bond_3ad_handle_link_change(slave, BOND_LINK_UP); } - if ((bond_mode == BOND_MODE_TLB) || - (bond_mode == BOND_MODE_ALB)) { + if ((bond->params.mode == BOND_MODE_TLB) || + (bond->params.mode == BOND_MODE_ALB)) { bond_alb_handle_link_change(bond, slave, BOND_LINK_UP); } @@ -2202,7 +2207,7 @@ static void bond_mii_monitor(struct net_ bond_update_speed_duplex(slave); - if (bond_mode == 
BOND_MODE_8023AD) { + if (bond->params.mode == BOND_MODE_8023AD) { if (old_speed != slave->speed) { bond_3ad_adapter_speed_changed(slave); } @@ -2665,17 +2670,19 @@ static void bond_info_seq_stop(struct se read_unlock(&dev_base_lock); } -static void bond_info_show_master(struct seq_file *seq, struct bonding *bond) +static void bond_info_show_master(struct seq_file *seq) { + struct bonding *bond = seq->private; struct slave *curr; read_lock(&bond->curr_slave_lock); curr = bond->curr_active_slave; read_unlock(&bond->curr_slave_lock); - seq_printf(seq, "Bonding Mode: %s\n", bond_mode_name()); + seq_printf(seq, "Bonding Mode: %s\n", + bond_mode_name(bond->params.mode)); - if (USES_PRIMARY(bond_mode)) { + if (USES_PRIMARY(bond->params.mode)) { if (curr) { seq_printf(seq, "Currently Active Slave: %s\n", @@ -2688,7 +2695,7 @@ static void bond_info_show_master(struct seq_printf(seq, "Up Delay (ms): %d\n", updelay * miimon); seq_printf(seq, "Down Delay (ms): %d\n", downdelay * miimon); - if (bond_mode == BOND_MODE_8023AD) { + if (bond->params.mode == BOND_MODE_8023AD) { struct ad_info ad_info; seq_puts(seq, "\n802.3ad info\n"); @@ -2720,6 +2727,8 @@ static void bond_info_show_master(struct static void bond_info_show_slave(struct seq_file *seq, const struct slave *slave) { + struct bonding *bond = seq->private; + seq_printf(seq, "\nSlave Interface: %s\n", slave->dev->name); seq_printf(seq, "MII Status: %s\n", (slave->link == BOND_LINK_UP) ? 
"up" : "down"); @@ -2737,7 +2746,7 @@ static void bond_info_show_slave(struct slave->perm_hwaddr[5]); } - if (bond_mode == BOND_MODE_8023AD) { + if (bond->params.mode == BOND_MODE_8023AD) { const struct aggregator *agg = SLAVE_AD_INFO(slave).port.aggregator; @@ -2754,7 +2763,7 @@ static int bond_info_seq_show(struct seq { if (v == SEQ_START_TOKEN) { seq_printf(seq, "%s\n", version); - bond_info_show_master(seq, seq->private); + bond_info_show_master(seq); } else { bond_info_show_slave(seq, v); } @@ -3029,14 +3038,14 @@ static int bond_open(struct net_device * bond->kill_timers = 0; - if ((bond_mode == BOND_MODE_TLB) || - (bond_mode == BOND_MODE_ALB)) { + if ((bond->params.mode == BOND_MODE_TLB) || + (bond->params.mode == BOND_MODE_ALB)) { struct timer_list *alb_timer = &(BOND_ALB_INFO(bond).alb_timer); /* bond_alb_initialize must be called before the timer * is started. */ - if (bond_alb_initialize(bond, (bond_mode == BOND_MODE_ALB))) { + if (bond_alb_initialize(bond, (bond->params.mode == BOND_MODE_ALB))) { /* something went wrong - fail the open operation */ return -1; } @@ -3060,7 +3069,7 @@ static int bond_open(struct net_device * init_timer(arp_timer); arp_timer->expires = jiffies + 1; arp_timer->data = (unsigned long)bond_dev; - if (bond_mode == BOND_MODE_ACTIVEBACKUP) { + if (bond->params.mode == BOND_MODE_ACTIVEBACKUP) { arp_timer->function = (void *)&bond_activebackup_arp_mon; } else { arp_timer->function = (void *)&bond_loadbalance_arp_mon; @@ -3068,7 +3077,7 @@ static int bond_open(struct net_device * add_timer(arp_timer); } - if (bond_mode == BOND_MODE_8023AD) { + if (bond->params.mode == BOND_MODE_8023AD) { struct timer_list *ad_timer = &(BOND_AD_INFO(bond).ad_timer); init_timer(ad_timer); ad_timer->expires = jiffies + 1; @@ -3091,7 +3100,7 @@ static int bond_close(struct net_device bond_mc_list_destroy(bond); - if (bond_mode == BOND_MODE_8023AD) { + if (bond->params.mode == BOND_MODE_8023AD) { /* Unregister the receive of LACPDUs */ 
bond_unregister_lacpdu(bond); } @@ -3113,7 +3122,7 @@ static int bond_close(struct net_device del_timer_sync(&bond->arp_timer); } - switch (bond_mode) { + switch (bond->params.mode) { case BOND_MODE_8023AD: del_timer_sync(&(BOND_AD_INFO(bond).ad_timer)); break; @@ -3128,8 +3137,8 @@ static int bond_close(struct net_device /* Release the bonded slaves */ bond_release_all(bond_dev); - if ((bond_mode == BOND_MODE_TLB) || - (bond_mode == BOND_MODE_ALB)) { + if ((bond->params.mode == BOND_MODE_TLB) || + (bond->params.mode == BOND_MODE_ALB)) { /* Must be called only after all * slaves have been released */ @@ -3309,11 +3318,7 @@ static int bond_do_ioctl(struct net_devi break; case BOND_CHANGE_ACTIVE_OLD: case SIOCBONDCHANGEACTIVE: - if (USES_PRIMARY(bond_mode)) { - res = bond_ioctl_change_active(bond_dev, slave_dev); - } else { - res = -EINVAL; - } + res = bond_ioctl_change_active(bond_dev, slave_dev); break; default: res = -EOPNOTSUPP; @@ -3918,7 +3923,7 @@ static int bond_check_params(struct bond if (bond_mode != BOND_MODE_8023AD) { printk(KERN_INFO DRV_NAME ": lacp_rate param is irrelevant in mode %s\n", - bond_mode_name()); + bond_mode_name(bond_mode)); } else { lacp_fast = bond_parse_parm(lacp_rate, bond_lacp_tbl); if (lacp_fast == -1) { @@ -4119,7 +4124,7 @@ static int bond_check_params(struct bond printk(KERN_WARNING DRV_NAME ": Warning: %s primary device specified but has no " "effect in %s mode\n", - primary, bond_mode_name()); + primary, bond_mode_name(bond_mode)); primary = NULL; } From amir.noam@intel.com Mon Jan 5 07:30:28 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 05 Jan 2004 07:30:42 -0800 (PST) Received: from hermes.iil.intel.com (hermes.iil.intel.com [192.198.152.99]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i05FUPTa023251 for ; Mon, 5 Jan 2004 07:30:26 -0800 Received: from petasus.iil.intel.com (petasus.iil.intel.com [143.185.77.3]) by hermes.iil.intel.com (8.12.9-20030918-01/8.12.9/d: large-outer.mc,v 1.6 2003/12/18 18:57:17 root Exp 
$) with ESMTP id i05FUBiI019422; Mon, 5 Jan 2004 15:30:11 GMT Received: from hasmsxvs01.iil.intel.com (hasmsxvs01.iil.intel.com [143.185.63.58]) by petasus.iil.intel.com (8.12.9-20030918-01/8.12.9/d: large-inner.mc,v 1.8 2003/12/18 18:57:16 root Exp $) with SMTP id i05FUBwi009055; Mon, 5 Jan 2004 15:30:11 GMT Received: from sun111.npdj.intel.com ([10.12.254.111]) by hasmsxvs01.iil.intel.com (SAVSMTP 3.1.2.35) with SMTP id M2004010517301024387 ; Mon, 05 Jan 2004 17:30:10 +0200 Received: from jrslxjul4.npdj.intel.com (jrslxjul4 [10.12.220.54]) by sun111.npdj.intel.com (8.12.10/8.12.9/MailSET/Hub) with ESMTP id i05FUAhb001176; Mon, 5 Jan 2004 17:30:10 +0200 (IST) From: Amir Noam To: "Jeff Garzik" , "Jay Vosburgh" Subject: [PATCH 3/3] [bonding 2.6] Use the per-bond values of all remaining parameters Date: Mon, 5 Jan 2004 17:30:09 +0200 User-Agent: KMail/1.5.3 Cc: bonding-devel@lists.sourceforge.net, netdev@oss.sgi.com MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200401051730.10816.amir.noam@intel.com> X-Scanned-By: MIMEDefang 2.31 (www . roaringpenguin . com / mimedefang) X-archive-position: 2220 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: amir.noam@intel.com Precedence: bulk X-list: netdev Change usage of all the remaining global parameters to the per-bond values. diff -Nuarp a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c --- a/drivers/net/bonding/bond_main.c Mon Jan 5 17:17:39 2004 +++ b/drivers/net/bonding/bond_main.c Mon Jan 5 17:17:40 2004 @@ -458,6 +458,9 @@ * - Fixed: Releasing the original active slave causes mac address duplication. * - Add support for slaves that use ethtool_ops. * Set version to 2.5.3. + * + * 2004/01/05 - Amir Noam + * - Save bonding parameters per bond instead of using the global values.
*/ //#define BONDING_DEBUG 1 @@ -699,14 +702,14 @@ verify: * It'd be nice if there was a good way to tell if a driver supports * netif_carrier, but there really isn't. */ -static int bond_check_dev_link(struct net_device *slave_dev, int reporting) +static int bond_check_dev_link(struct bonding *bond, struct net_device *slave_dev, int reporting) { static int (* ioctl)(struct net_device *, struct ifreq *, int); struct ifreq ifr; struct mii_ioctl_data *mii; struct ethtool_value etool; - if (use_carrier) { + if (bond->params.use_carrier) { return netif_carrier_ok(slave_dev) ? BMSR_LSTATUS : 0; } @@ -994,7 +997,7 @@ static struct slave *bond_find_best_slav { struct slave *new_active, *old_active; struct slave *bestslave = NULL; - int mintime; + int mintime = bond->params.updelay; int i; new_active = old_active = bond->curr_active_slave; @@ -1007,15 +1010,13 @@ static struct slave *bond_find_best_slav } } - mintime = updelay; - /* first try the primary link; if arping, a link must tx/rx traffic * before it can be considered the curr_active_slave - also, we would skip * slaves between the curr_active_slave and primary_slave that may be up * and able to arp */ if ((bond->primary_slave) && - (!arp_interval) && + (!bond->params.arp_interval) && (IS_UP(bond->primary_slave->dev))) { new_active = bond->primary_slave; } @@ -1070,7 +1071,7 @@ static void bond_change_active_slave(str ": %s: making interface %s the new " "active one %d ms earlier.\n", bond->dev->name, new_active->dev->name, - (updelay - new_active->delay) * miimon); + (bond->params.updelay - new_active->delay) * bond->params.miimon); } new_active->delay = 0; @@ -1374,10 +1375,10 @@ static int bond_enslave(struct net_devic new_slave->delay = 0; new_slave->link_failure_count = 0; - if (miimon && !use_carrier) { - link_reporting = bond_check_dev_link(slave_dev, 1); + if (bond->params.miimon && !bond->params.use_carrier) { + link_reporting = bond_check_dev_link(bond, slave_dev, 1); - if ((link_reporting == -1) && 
!arp_interval) { + if ((link_reporting == -1) && !bond->params.arp_interval) { /* * miimon is set but a bonded network driver * does not support ETHTOOL/MII and @@ -1407,13 +1408,13 @@ static int bond_enslave(struct net_devic } /* check for initial state */ - if (!miimon || - (bond_check_dev_link(slave_dev, 0) == BMSR_LSTATUS)) { - if (updelay) { + if (!bond->params.miimon || + (bond_check_dev_link(bond, slave_dev, 0) == BMSR_LSTATUS)) { + if (bond->params.updelay) { dprintk("Initial state of slave_dev is " "BOND_LINK_BACK\n"); new_slave->link = BOND_LINK_BACK; - new_slave->delay = updelay; + new_slave->delay = bond->params.updelay; } else { dprintk("Initial state of slave_dev is " "BOND_LINK_UP\n"); @@ -1441,9 +1442,9 @@ static int bond_enslave(struct net_devic } } - if (USES_PRIMARY(bond->params.mode) && primary) { + if (USES_PRIMARY(bond->params.mode) && bond->params.primary[0]) { /* if there is a primary slave, remember it */ - if (strcmp(primary, new_slave->dev->name) == 0) { + if (strcmp(bond->params.primary, new_slave->dev->name) == 0) { bond->primary_slave = new_slave; } } @@ -1482,7 +1483,7 @@ static int bond_enslave(struct net_devic * can be called only after the mac address of the bond is set */ bond_3ad_initialize(bond, 1000/AD_TIMER_INTERVAL, - lacp_fast); + bond->params.lacp_fast); } else { SLAVE_AD_INFO(new_slave).id = SLAVE_AD_INFO(new_slave->prev).id + 1; @@ -1958,7 +1959,7 @@ static int bond_info_query(struct net_de struct bonding *bond = bond_dev->priv; info->bond_mode = bond->params.mode; - info->miimon = miimon; + info->miimon = bond->params.miimon; read_lock_bh(&bond->lock); info->num_slaves = bond->slave_cnt; @@ -2008,11 +2009,13 @@ static void bond_mii_monitor(struct net_ struct bonding *bond = bond_dev->priv; struct slave *slave, *oldcurrent; int do_failover = 0; - int delta_in_ticks = (miimon * HZ) / 1000; + int delta_in_ticks; int i; read_lock(&bond->lock); + delta_in_ticks = (bond->params.miimon * HZ) / 1000; + if (bond->kill_timers) { 
goto out; } @@ -2037,7 +2040,7 @@ static void bond_mii_monitor(struct net_ u16 old_speed = slave->speed; u8 old_duplex = slave->duplex; - link_state = bond_check_dev_link(slave_dev, 0); + link_state = bond_check_dev_link(bond, slave_dev, 0); switch (slave->link) { case BOND_LINK_UP: /* the link was up */ @@ -2046,13 +2049,13 @@ static void bond_mii_monitor(struct net_ break; } else { /* link going down */ slave->link = BOND_LINK_FAIL; - slave->delay = downdelay; + slave->delay = bond->params.downdelay; if (slave->link_failure_count < UINT_MAX) { slave->link_failure_count++; } - if (downdelay) { + if (bond->params.downdelay) { printk(KERN_INFO DRV_NAME ": %s: link status down for %s " "interface %s, disabling it in " @@ -2065,7 +2068,7 @@ static void bond_mii_monitor(struct net_ : "") : "idle ", slave_dev->name, - downdelay * miimon); + bond->params.downdelay * bond->params.miimon); } } /* no break ! fall through the BOND_LINK_FAIL test to @@ -2117,7 +2120,7 @@ static void bond_mii_monitor(struct net_ ": %s: link status up again after %d " "ms for interface %s.\n", bond_dev->name, - (downdelay - slave->delay) * miimon, + (bond->params.downdelay - slave->delay) * bond->params.miimon, slave_dev->name); } break; @@ -2127,9 +2130,9 @@ static void bond_mii_monitor(struct net_ break; } else { /* link going up */ slave->link = BOND_LINK_BACK; - slave->delay = updelay; + slave->delay = bond->params.updelay; - if (updelay) { + if (bond->params.updelay) { /* if updelay == 0, no need to advertise about a 0 ms delay */ printk(KERN_INFO DRV_NAME @@ -2138,7 +2141,7 @@ static void bond_mii_monitor(struct net_ "in %d ms.\n", bond_dev->name, slave_dev->name, - updelay * miimon); + bond->params.updelay * bond->params.miimon); } } /* no break ! 
fall through the BOND_LINK_BACK state in @@ -2153,7 +2156,7 @@ static void bond_mii_monitor(struct net_ ": %s: link status down again after %d " "ms for interface %s.\n", bond_dev->name, - (updelay - slave->delay) * miimon, + (bond->params.updelay - slave->delay) * bond->params.miimon, slave_dev->name); } else { /* link stays up */ @@ -2235,17 +2238,20 @@ static void bond_mii_monitor(struct net_ } re_arm: - mod_timer(&bond->mii_timer, jiffies + delta_in_ticks); + if (bond->params.miimon) { + mod_timer(&bond->mii_timer, jiffies + delta_in_ticks); + } out: read_unlock(&bond->lock); } -static void bond_arp_send_all(struct slave *slave) +static void bond_arp_send_all(struct bonding *bond, struct slave *slave) { int i; + u32 *targets = bond->params.arp_targets; - for (i = 0; (i < BOND_MAX_ARP_TARGETS) && arp_target[i]; i++) { - arp_send(ARPOP_REQUEST, ETH_P_ARP, arp_target[i], slave->dev, + for (i = 0; (i < BOND_MAX_ARP_TARGETS) && targets[i]; i++) { + arp_send(ARPOP_REQUEST, ETH_P_ARP, targets[i], slave->dev, my_ip, NULL, slave->dev->dev_addr, NULL); } @@ -2263,11 +2269,13 @@ static void bond_loadbalance_arp_mon(str struct bonding *bond = bond_dev->priv; struct slave *slave, *oldcurrent; int do_failover = 0; - int delta_in_ticks = (arp_interval * HZ) / 1000; + int delta_in_ticks; int i; read_lock(&bond->lock); + delta_in_ticks = (bond->params.arp_interval * HZ) / 1000; + if (bond->kill_timers) { goto out; } @@ -2352,7 +2360,7 @@ static void bond_loadbalance_arp_mon(str * to be unstable during low/no traffic periods */ if (IS_UP(slave->dev)) { - bond_arp_send_all(slave); + bond_arp_send_all(bond, slave); } } @@ -2372,7 +2380,9 @@ static void bond_loadbalance_arp_mon(str } re_arm: - mod_timer(&bond->arp_timer, jiffies + delta_in_ticks); + if (bond->params.arp_interval) { + mod_timer(&bond->arp_timer, jiffies + delta_in_ticks); + } out: read_unlock(&bond->lock); } @@ -2396,11 +2406,13 @@ static void bond_activebackup_arp_mon(st { struct bonding *bond = bond_dev->priv; 
struct slave *slave; - int delta_in_ticks = (arp_interval * HZ) / 1000; + int delta_in_ticks; int i; read_lock(&bond->lock); + delta_in_ticks = (bond->params.arp_interval * HZ) / 1000; + if (bond->kill_timers) { goto out; } @@ -2559,7 +2571,7 @@ static void bond_activebackup_arp_mon(st * rx traffic */ if (slave && my_ip) { - bond_arp_send_all(slave); + bond_arp_send_all(bond, slave); } } @@ -2580,7 +2592,7 @@ static void bond_activebackup_arp_mon(st if (IS_UP(slave->dev)) { slave->link = BOND_LINK_BACK; bond_set_slave_active_flags(slave); - bond_arp_send_all(slave); + bond_arp_send_all(bond, slave); slave->jiffies = jiffies; bond->current_arp_slave = slave; break; @@ -2612,7 +2624,9 @@ static void bond_activebackup_arp_mon(st } re_arm: - mod_timer(&bond->arp_timer, jiffies + delta_in_ticks); + if (bond->params.arp_interval) { + mod_timer(&bond->arp_timer, jiffies + delta_in_ticks); + } out: read_unlock(&bond->lock); } @@ -2683,22 +2697,27 @@ static void bond_info_show_master(struct bond_mode_name(bond->params.mode)); if (USES_PRIMARY(bond->params.mode)) { - if (curr) { - seq_printf(seq, - "Currently Active Slave: %s\n", - curr->dev->name); - } + seq_printf(seq, "Primary Slave: %s\n", + (bond->params.primary[0]) ? + bond->params.primary : "None"); + + seq_printf(seq, "Currently Active Slave: %s\n", + (curr) ? curr->dev->name : "None"); } seq_printf(seq, "MII Status: %s\n", (curr) ? 
"up" : "down"); - seq_printf(seq, "MII Polling Interval (ms): %d\n", miimon); - seq_printf(seq, "Up Delay (ms): %d\n", updelay * miimon); - seq_printf(seq, "Down Delay (ms): %d\n", downdelay * miimon); + seq_printf(seq, "MII Polling Interval (ms): %d\n", bond->params.miimon); + seq_printf(seq, "Up Delay (ms): %d\n", + bond->params.updelay * bond->params.miimon); + seq_printf(seq, "Down Delay (ms): %d\n", + bond->params.downdelay * bond->params.miimon); if (bond->params.mode == BOND_MODE_8023AD) { struct ad_info ad_info; seq_puts(seq, "\n802.3ad info\n"); + seq_printf(seq, "LACP rate: %s\n", + (bond->params.lacp_fast) ? "fast" : "slow"); if (bond_3ad_get_active_agg_info(bond, &ad_info)) { seq_printf(seq, "bond %s has no active aggregator\n", @@ -3057,7 +3076,7 @@ static int bond_open(struct net_device * add_timer(alb_timer); } - if (miimon) { /* link check interval, in milliseconds. */ + if (bond->params.miimon) { /* link check interval, in milliseconds. */ init_timer(mii_timer); mii_timer->expires = jiffies + 1; mii_timer->data = (unsigned long)bond_dev; @@ -3065,7 +3084,7 @@ static int bond_open(struct net_device * add_timer(mii_timer); } - if (arp_interval) { /* arp interval, in milliseconds. */ + if (bond->params.arp_interval) { /* arp interval, in milliseconds. */ init_timer(arp_timer); arp_timer->expires = jiffies + 1; arp_timer->data = (unsigned long)bond_dev; @@ -3114,11 +3133,11 @@ static int bond_close(struct net_device * because a running timer might be trying to hold it too */ - if (miimon) { /* link check interval, in milliseconds. */ + if (bond->params.miimon) { /* link check interval, in milliseconds. */ del_timer_sync(&bond->mii_timer); } - if (arp_interval) { /* arp interval, in milliseconds. */ + if (bond->params.arp_interval) { /* arp interval, in milliseconds. 
*/ del_timer_sync(&bond->arp_timer); } @@ -3601,7 +3620,7 @@ static int bond_xmit_activebackup(struct /* if we are sending arp packets, try to at least identify our own ip address */ - if (arp_interval && !my_ip && + if (bond->params.arp_interval && !my_ip && (skb->protocol == __constant_htons(ETH_P_ARP))) { char *the_ip = (char *)skb->data + sizeof(struct ethhdr) + From srompf@isg.de Mon Jan 5 07:32:07 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 05 Jan 2004 07:32:20 -0800 (PST) Received: from mail.isg.de (rzfoobar.is-asp.com [217.11.194.155]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i05FVuTa024600 for ; Mon, 5 Jan 2004 07:31:57 -0800 Received: from barkeeper.frankfurter-softwarefabrik.de (barkeeper.frankfurter-softwarefabrik.de [192.168.6.182]) by mail.isg.de (Postfix) with ESMTP id 27A1EE49224; Mon, 5 Jan 2004 15:50:52 +0100 (CET) From: Stefan Rompf To: Michal Ostrowski , netdev@oss.sgi.com Subject: Re: Deadlock in sungem/ip_auto_config/linkwatch Date: Mon, 5 Jan 2004 15:50:50 +0100 User-Agent: KMail/1.5.94 References: <1073307882.2041.98320.camel@brick.watson.ibm.com> In-Reply-To: <1073307882.2041.98320.camel@brick.watson.ibm.com> MIME-Version: 1.0 Content-Disposition: inline Content-Type: text/plain; charset="iso-8859-15" Content-Transfer-Encoding: 7bit Message-Id: <200401051550.51063.srompf@isg.de> X-archive-position: 2221 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: srompf@isg.de Precedence: bulk X-list: netdev Am Montag, 05. Januar 2004 14:07 schrieb Michal Ostrowski: > ic_open_devs grabs rtnl_sem with an rtnl_shlock() call. > > The sungem driver at some point calls gem_init_one, which calls > netif_carrier_*, which in turn calls schedule_work (linkwatch_event). > > linkwatch_event in turn needs rtnl_sem. Good catch! The sungem driver shows clearly that we need some way to remove queued work without scheduling and waiting for other events. 
I will change the linkwatch code to use rtnl_shlock_nowait() and back off and
retry in case of failure this week. Call it a workaround, but it increases
overall system stability.

Btw, what is the planned difference between rtnl_shlock() and rtnl_exlock()?
Even though the latter is a null operation right now, I don't want to hold
more locks than needed in the linkwatch code.

Stefan

-- 
"doesn't work" is not a magic word to explain everything.

From mostrows@watson.ibm.com Mon Jan 5 08:19:28 2004
Received: with ECARTIS (v1.0.0; list netdev); Mon, 05 Jan 2004 08:19:41 -0800 (PST)
Received: from brick.watson.ibm.com (yktgi01e0-s4.watson.ibm.com [129.34.20.23]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i05GJQTa031239 for ; Mon, 5 Jan 2004 08:19:27 -0800
Received: by brick.watson.ibm.com (Postfix, from userid 9965) id 91C54BFA3; Mon, 5 Jan 2004 11:19:25 -0500 (EST)
Subject: Re: Deadlock in sungem/ip_auto_config/linkwatch
From: Michal Ostrowski
To: Stefan Rompf
Cc: netdev@oss.sgi.com
In-Reply-To: <200401051550.51063.srompf@isg.de>
References: <1073307882.2041.98320.camel@brick.watson.ibm.com> <200401051550.51063.srompf@isg.de>
Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="=-T7Vi+m2S37xHNyQWLC01"
Message-Id: <1073319565.2043.98923.camel@brick.watson.ibm.com>
Mime-Version: 1.0
X-Mailer: Ximian Evolution 1.4.5
Date: Mon, 05 Jan 2004 11:19:25 -0500
X-archive-position: 2222
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: mostrows@watson.ibm.com
Precedence: bulk
X-list: netdev

--=-T7Vi+m2S37xHNyQWLC01
Content-Type: text/plain
Content-Transfer-Encoding: quoted-printable

This can get pretty hairy.

Suppose the linkwatch code backs off in the case that rtnl_sem is held
legitimately by thread A. Meanwhile, thread B is doing a
flush_scheduled_work in order to wait for pending linkwatch events to
complete.
In the proposed solution this will result in incorrect behaviour
(flush_scheduled_work returns with the linkwatch work not really done).
(Admittedly I'm not sure if such a scenario really is feasible.)

My initial thought was to use a separate work-queue, un-entangled with
the global queue used for flush_scheduled_work. This would allow
linkwatch events to be synchronized against explicitly.

For this solution, though, I think it would be nice to not have to have a
thread per cpu for the linkwatch work queue.

On the other hand, ic_open_devs appears to be the only place where
rtnl_sem is held while going into a driver's open() function, and so
maybe the right rule is that rtnl_sem is not held when calling
dev->open().

-- 
Michal Ostrowski

On Mon, 2004-01-05 at 09:50, Stefan Rompf wrote:
> On Monday, 5 January 2004 14:07, Michal Ostrowski wrote:
>
> > ic_open_devs grabs rtnl_sem with an rtnl_shlock() call.
> >
> > The sungem driver at some point calls gem_init_one, which calls
> > netif_carrier_*, which in turn calls schedule_work (linkwatch_event).
> >
> > linkwatch_event in turn needs rtnl_sem.
>
> Good catch! The sungem driver shows clearly that we need some way to remove
> queued work without scheduling and waiting for other events.
>
> I will change the linkwatch code to use rtnl_shlock_nowait() and back off and
> retry in case of failure this week. Call it a workaround, but it increases
> overall system stability.
>
> Btw, what is the planned difference between rtnl_shlock() and rtnl_exlock()?
> Even though the latter is a null operation right now, I don't want to hold
> more locks than needed in the linkwatch code.
>
> Stefan
>

--=-T7Vi+m2S37xHNyQWLC01--

From srompf@isg.de Mon Jan 5 09:29:08 2004
Received: with ECARTIS (v1.0.0; list netdev); Mon, 05 Jan 2004 09:29:21 -0800 (PST)
Received: from mail.isg.de (rzfoobar.is-asp.com [217.11.194.155]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i05HSvTa003296 for ; Mon, 5 Jan 2004 09:28:58 -0800
Received: from barkeeper.frankfurter-softwarefabrik.de (barkeeper.frankfurter-softwarefabrik.de [192.168.6.182]) by mail.isg.de (Postfix) with ESMTP id 866F8126644C; Mon, 5 Jan 2004 17:50:57 +0100 (CET)
From: Stefan Rompf
To: Michal Ostrowski
Subject: Re: Deadlock in sungem/ip_auto_config/linkwatch
Date: Mon, 5 Jan 2004 17:50:55 +0100
User-Agent: KMail/1.5.94
Cc: netdev@oss.sgi.com
References: <1073307882.2041.98320.camel@brick.watson.ibm.com> <200401051550.51063.srompf@isg.de> <1073319565.2043.98923.camel@brick.watson.ibm.com>
In-Reply-To: <1073319565.2043.98923.camel@brick.watson.ibm.com>
MIME-Version: 1.0
Content-Type: multipart/signed; protocol="application/pkcs7-signature"; micalg=sha1; boundary="Boundary-02=_wXZ+/M91EO8KJ7L"; charset="iso-8859-15"
Content-Transfer-Encoding: 7bit
Message-Id: <200401051750.56233.srompf@isg.de>
X-archive-position: 2223
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: srompf@isg.de
Precedence: bulk
X-list: netdev

--Boundary-02=_wXZ+/M91EO8KJ7L
Content-Type: text/plain; charset="iso-8859-15"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline

On Monday, 5 January 2004 17:19, Michal Ostrowski wrote:

> Suppose the linkwatch code backs off in the case that rtnl_sem is held
> legitimately by thread A. Meanwhile, thread B is doing a
> flush_scheduled_work in order to wait for pending linkwatch events to
> complete.

This won't happen. If a pending linkwatch event needs to be scheduled
synchronously (e.g. when a device is unregistered), it is executed in the
context of the calling process, not inside the workqueue thread.

> My initial thought was to use a separate work-queue, un-entangled with
> the global queue used for flush_scheduled_work. This would allow
> linkwatch events to be synchronized against explicitly.

That's overkill, and if I understand flush_workqueue() right, it doesn't care
about work that is queued with delay, so it wouldn't even help. That's why I
thought about a function to unregister pending work.

Stefan

-- 
"doesn't work" is not a magic word to explain everything.

--Boundary-02=_wXZ+/M91EO8KJ7L--

From erik@hensema.net Mon Jan 5 10:14:49 2004
Received: with ECARTIS (v1.0.0; list netdev); Mon, 05 Jan 2004 10:15:04 -0800 (PST)
Received: from scrat.hensema.net (scrat.hensema.net [62.212.82.150]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i05IEmTa004756 for ; Mon, 5 Jan 2004 10:14:48 -0800
Received: from dexter.hensema.net (cc78409-a.hnglo1.ov.home.nl [212.120.97.185]) by scrat.hensema.net (8.12.7/8.12.7/SuSE Linux 0.6) with ESMTP id
i05IEhuB007560; Mon, 5 Jan 2004 19:14:43 +0100 Received: from bender.home.hensema.net (root@bender.ipv6.hensema.net [IPv6:2001:888:10a1:0:202:44ff:fe69:60f5]) by dexter.hensema.net (8.12.7/8.12.7) with ESMTP id i05IEhB5001230; Mon, 5 Jan 2004 19:14:43 +0100 Received: from bender.home.hensema.net (erik@localhost [127.0.0.1]) by bender.home.hensema.net (8.12.7/8.12.7) with ESMTP id i05IEhMY013681; Mon, 5 Jan 2004 19:14:43 +0100 Received: (from erik@localhost) by bender.home.hensema.net (8.12.7/8.12.7/Submit) id i05IEghW013680; Mon, 5 Jan 2004 19:14:42 +0100 Date: Mon, 5 Jan 2004 19:14:42 +0100 From: Erik Hensema To: "David S. Miller" Cc: Linus Torvalds , netdev@oss.sgi.com Subject: Re: 2.6.0: something is leaking memory Message-ID: <20040105181442.GA13674@bender.home.hensema.net> Reply-To: erik@hensema.net References: <20040104204834.40b6ca51.davem@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20040104204834.40b6ca51.davem@redhat.com> User-Agent: Mutt/1.4i X-archive-position: 2224 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: erik@hensema.net Precedence: bulk X-list: netdev On Sun, Jan 04, 2004 at 08:48:34PM -0800, David S. Miller wrote: > On Sun, 4 Jan 2004 18:30:21 -0800 (PST) > Linus Torvalds wrote: > > > You've got 19 _megabytes_ allocated to "tcp6_sock", and they are all > > marked as "active". That's almost certainly the leaking bug. > > > > Everything else looks reasonably normal. > ... > > David? > > Fixed by changeset 1.1496.16.1 which is in 2.6.1-rc1 Are you sure? tcp6_sock 1110 1136 1024 4 1 : tunables 54 27 0 : slabdata 284 284 0 dexter:~ # netstat --inet6 | wc -l 21 dexter:~ # uptime 19:13:22 up 4:36, 1 user, load average: 0.02, 0.08, 0.09 This is 2.6.1-rc1. I can't say anything definitive until after a few days of uptime, but it sure seems the leak is still there. 
-- Erik Hensema (erik@hensema.net) From davem@pizda.ninka.net Mon Jan 5 11:08:24 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 05 Jan 2004 11:08:37 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i05J8OTa006485 for ; Mon, 5 Jan 2004 11:08:24 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id LAA19381; Mon, 5 Jan 2004 11:02:48 -0800 Date: Mon, 5 Jan 2004 11:02:48 -0800 From: "David S. Miller" To: Stefan Rompf Cc: mostrows@watson.ibm.com, netdev@oss.sgi.com Subject: Re: Deadlock in sungem/ip_auto_config/linkwatch Message-Id: <20040105110248.04ed06b7.davem@redhat.com> In-Reply-To: <200401051550.51063.srompf@isg.de> References: <1073307882.2041.98320.camel@brick.watson.ibm.com> <200401051550.51063.srompf@isg.de> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2225 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Mon, 5 Jan 2004 15:50:50 +0100 Stefan Rompf wrote: > Btw, what is the planned difference between rtnl_shlock() and rtnl_exlock()? > Even though the later is a null operation right now, I don't want to hold > more locks than needed in the linkwatch code. The idea was originally to make the RTNL semaphore a read-write one, but I doubt we'll ever make that happen and the shlock bits will just disappear entirely. 
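
For readers following the linkwatch thread above, the "try the lock, back off and retry" idea Stefan describes can be sketched in userspace roughly as follows. This is only an illustration under stated assumptions, not the kernel patch: a C11 atomic_flag stands in for rtnl_sem, and fake_rtnl_trylock(), fake_rtnl_unlock(), linkwatch_try_run() and linkwatch_run_with_retry() are hypothetical names made up for this sketch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for rtnl_sem (assumption: the kernel uses a semaphore,
 * not an atomic flag; this only models "held" versus "free"). */
static atomic_flag fake_rtnl = ATOMIC_FLAG_INIT;

/* Hypothetical counter of link events that were actually handled. */
static int events_done;

/* Non-blocking acquire: true if we got the lock, false if it was held. */
static bool fake_rtnl_trylock(void)
{
	return !atomic_flag_test_and_set(&fake_rtnl);
}

static void fake_rtnl_unlock(void)
{
	atomic_flag_clear(&fake_rtnl);
}

/*
 * One attempt at the link-watch work.  Instead of sleeping on the lock
 * (which deadlocks if the holder is waiting for us), give up at once
 * and let the caller re-queue the work for later.
 */
static bool linkwatch_try_run(void)
{
	if (!fake_rtnl_trylock())
		return false;		/* busy: back off */

	events_done++;			/* real code would process the event list */
	fake_rtnl_unlock();
	return true;
}

/*
 * Retry up to max_tries times.  Returns the number of attempts used,
 * or -1 if the lock stayed busy.  A real implementation would
 * reschedule with a delay between attempts rather than loop.
 */
static int linkwatch_run_with_retry(int max_tries)
{
	assert(max_tries > 0);

	for (int i = 1; i <= max_tries; i++) {
		if (linkwatch_try_run())
			return i;
	}
	return -1;
}
```

When some other context holds the lock, linkwatch_run_with_retry() reports failure instead of blocking, which is the property that avoids the ic_open_devs deadlock discussed in the thread. Michal's concern still applies to this sketch: a caller that flushes the queue cannot tell that the backed-off work has not really run yet.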
From shemminger@osdl.org Mon Jan 5 11:11:15 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 05 Jan 2004 11:11:29 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i05JBETa006920 for ; Mon, 5 Jan 2004 11:11:15 -0800 Received: from dell_ss3.pdx.osdl.net (dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id i05JB3M26908; Mon, 5 Jan 2004 11:11:03 -0800 Date: Mon, 5 Jan 2004 11:10:59 -0800 From: Stephen Hemminger To: "David S. Miller" Cc: netdev@oss.sgi.com Subject: [PATCH] multiple eth0 mixed PCI/ISA init Message-Id: <20040105111059.4a451146.shemminger@osdl.org> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.7claws (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2226 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev This patch for 2.6 fixes the problem found by Zoltan Farkas with mixed PCI/ISA and a non-modular config. The problem is the old_netdev ISA probing isn't skipping "eth0" which already got assigned by the PCI initialization. 
diff -Nru a/drivers/net/Space.c b/drivers/net/Space.c --- a/drivers/net/Space.c Fri Dec 19 15:10:50 2003 +++ b/drivers/net/Space.c Fri Dec 19 15:10:50 2003 @@ -350,7 +350,7 @@ * Backwards compatibility - historically an I/O base of 1 was * used to indicate not to probe for this ethN interface */ - if (dev->base_addr == 1) { + if (__dev_get_by_name(dev->name) || dev->base_addr == 1) { free_netdev(dev); return -ENXIO; } From shemminger@osdl.org Mon Jan 5 12:17:06 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 05 Jan 2004 12:17:23 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i05KH6Ta010371 for ; Mon, 5 Jan 2004 12:17:06 -0800 Received: from dell_ss3.pdx.osdl.net (dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id i05KGsM11421; Mon, 5 Jan 2004 12:16:54 -0800 Date: Mon, 5 Jan 2004 12:16:54 -0800 From: Stephen Hemminger To: "David S. Miller" Cc: netdev@oss.sgi.com Subject: [PATCH] convert /proc/net/packet to seq_file Message-Id: <20040105121654.3cfbdfb6.shemminger@osdl.org> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.7claws (GTK+ 1.2.10; i686-pc-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2227 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Convert AF_PACKET's /proc interface to seq_file. No change in functionality, but does fix problem with refcounting on this proc interface. 
diff -Nru a/net/packet/af_packet.c b/net/packet/af_packet.c --- a/net/packet/af_packet.c Mon Jan 5 12:14:55 2004 +++ b/net/packet/af_packet.c Mon Jan 5 12:14:55 2004 @@ -64,6 +64,7 @@ #include #include #include +#include #include #include #include @@ -1767,61 +1768,86 @@ }; #ifdef CONFIG_PROC_FS -static int packet_read_proc(char *buffer, char **start, off_t offset, - int length, int *eof, void *data) +static inline struct sock *packet_seq_idx(loff_t off) { - off_t pos=0; - off_t begin=0; - int len=0; struct sock *s; struct hlist_node *node; - - len+= sprintf(buffer,"sk RefCnt Type Proto Iface R Rmem User Inode\n"); + sk_for_each(s, node, &packet_sklist) { + if (!off--) + return s; + } + return NULL; +} + +static void *packet_seq_start(struct seq_file *seq, loff_t *pos) +{ read_lock(&packet_sklist_lock); + return *pos ? packet_seq_idx(*pos - 1) : SEQ_START_TOKEN; +} - sk_for_each(s, node, &packet_sklist) { - struct packet_opt *po = pkt_sk(s); +static void *packet_seq_next(struct seq_file *seq, void *v, loff_t *pos) +{ + ++*pos; + return (v == SEQ_START_TOKEN) + ? 
sk_head(&packet_sklist) + : sk_next((struct sock*)v) ; +} - len+=sprintf(buffer+len,"%p %-6d %-4d %04x %-5d %1d %-6u %-6u %-6lu", - s, - atomic_read(&s->sk_refcnt), - s->sk_type, - ntohs(po->num), - po->ifindex, - po->running, - atomic_read(&s->sk_rmem_alloc), - sock_i_uid(s), - sock_i_ino(s) - ); +static void packet_seq_stop(struct seq_file *seq, void *v) +{ + read_unlock(&packet_sklist_lock); +} - buffer[len++]='\n'; - - pos=begin+len; - if(pos<offset) { - len=0; - begin=pos; - } - if(pos>offset+length) - goto done; +static int packet_seq_show(struct seq_file *seq, void *v) +{ + if (v == SEQ_START_TOKEN) + seq_puts(seq, "sk RefCnt Type Proto Iface R Rmem User Inode\n"); + else { + struct sock *s = v; + const struct packet_opt *po = pkt_sk(s); + + seq_printf(seq, + "%p %-6d %-4d %04x %-5d %1d %-6u %-6u %-6lu\n", + s, + atomic_read(&s->sk_refcnt), + s->sk_type, + ntohs(po->num), + po->ifindex, + po->running, + atomic_read(&s->sk_rmem_alloc), + sock_i_uid(s), + sock_i_ino(s) ); } - *eof = 1; -done: - read_unlock(&packet_sklist_lock); - *start=buffer+(offset-begin); - len-=(offset-begin); - if(len>length) - len=length; - if(len<0) - len=0; - return len; + return 0; } + +static struct seq_operations packet_seq_ops = { + .start = packet_seq_start, + .next = packet_seq_next, + .stop = packet_seq_stop, + .show = packet_seq_show, +}; + +static int packet_seq_open(struct inode *inode, struct file *file) +{ + return seq_open(file, &packet_seq_ops); +} + +static struct file_operations packet_seq_fops = { + .owner = THIS_MODULE, + .open = packet_seq_open, + .read = seq_read, + .llseek = seq_lseek, + .release = seq_release, +}; + #endif static void __exit packet_exit(void) { - remove_proc_entry("net/packet", 0); + proc_net_remove("packet"); unregister_netdevice_notifier(&packet_netdev_notifier); sock_unregister(PF_PACKET); return; @@ -1831,9 +1857,8 @@ { sock_register(&packet_family_ops); register_netdevice_notifier(&packet_netdev_notifier); -#ifdef CONFIG_PROC_FS - create_proc_read_entry("net/packet", 0, 0,
packet_read_proc, NULL); -#endif + proc_net_fops_create("packet", 0, &packet_seq_fops); + return 0; } From davem@pizda.ninka.net Mon Jan 5 13:00:33 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 05 Jan 2004 13:00:46 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i05L0VTa017002 for ; Mon, 5 Jan 2004 13:00:33 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id MAA19770; Mon, 5 Jan 2004 12:54:56 -0800 Date: Mon, 5 Jan 2004 12:54:56 -0800 From: "David S. Miller" To: Stephen Hemminger Cc: netdev@oss.sgi.com Subject: Re: [PATCH] convert /proc/net/packet to seq_file Message-Id: <20040105125456.05acad7b.davem@redhat.com> In-Reply-To: <20040105121654.3cfbdfb6.shemminger@osdl.org> References: <20040105121654.3cfbdfb6.shemminger@osdl.org> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2228 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Mon, 5 Jan 2004 12:16:54 -0800 Stephen Hemminger wrote: > Convert AF_PACKET's /proc interface to seq_file. No change in functionality, > but does fix problem with refcounting on this proc interface. Applied, thanks Stephen. From davem@pizda.ninka.net Mon Jan 5 13:02:33 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 05 Jan 2004 13:02:46 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i05L2XTa017414 for ; Mon, 5 Jan 2004 13:02:33 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id MAA19825; Mon, 5 Jan 2004 12:56:59 -0800 Date: Mon, 5 Jan 2004 12:56:59 -0800 From: "David S. 
Miller" To: Stephen Hemminger Cc: netdev@oss.sgi.com Subject: Re: [PATCH] multiple eth0 mixed PCI/ISA init Message-Id: <20040105125659.56c2f6b5.davem@redhat.com> In-Reply-To: <20040105111059.4a451146.shemminger@osdl.org> References: <20040105111059.4a451146.shemminger@osdl.org> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2229 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Mon, 5 Jan 2004 11:10:59 -0800 Stephen Hemminger wrote: > This patch for 2.6 fixes the problem found by Zoltan Farkas > with mixed PCI/ISA and a non-modular config. The problem is the old_netdev > ISA probing isn't skipping "eth0" which already got assigned by the PCI > initialization. Applied, thanks Stephen. From romieu@fr.zoreil.com Mon Jan 5 14:20:35 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 05 Jan 2004 14:20:49 -0800 (PST) Received: from fr.zoreil.com (electric-eye.fr.zoreil.com [213.41.134.224]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i05MKXTa020038 for ; Mon, 5 Jan 2004 14:20:34 -0800 Received: from electric-eye.fr.zoreil.com (localhost.localdomain [127.0.0.1]) by fr.zoreil.com (8.12.8/8.12.1) with ESMTP id i05MHgsW023074; Mon, 5 Jan 2004 23:17:42 +0100 Received: (from romieu@localhost) by electric-eye.fr.zoreil.com (8.12.8/8.12.1) id i05MHf78023073; Mon, 5 Jan 2004 23:17:41 +0100 Date: Mon, 5 Jan 2004 23:17:41 +0100 From: Francois Romieu To: akpm@osdl.org Cc: netdev@oss.sgi.com, Jeff Garzik , Brad House Subject: [patch] 2.6.1-rc1-mm1 - typo of death in the r8169 driver Message-ID: <20040105231741.A19514@electric-eye.fr.zoreil.com> References: <20031122183001.GA16993@gtf.org> <20031124000939.A456@electric-eye.fr.zoreil.com> <20031126004550.A25408@electric-eye.fr.zoreil.com> 
<20031127235143.A16767@electric-eye.fr.zoreil.com> <20031130014738.A2589@electric-eye.fr.zoreil.com> <3FF846C3.5070207@mainstreetsoftworks.com> <20040104233849.A3214@electric-eye.fr.zoreil.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="45Z9DzgjV8m4Oswq" Content-Disposition: inline User-Agent: Mutt/1.2.5.1i In-Reply-To: <20040104233849.A3214@electric-eye.fr.zoreil.com>; from romieu@fr.zoreil.com on Sun, Jan 04, 2004 at 11:38:49PM +0100 X-Organisation: Land of Sunshine Inc. X-archive-position: 2230 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: romieu@fr.zoreil.com Precedence: bulk X-list: netdev --45Z9DzgjV8m4Oswq Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Hi, silly bug in the r8169 driver. Please apply, thanks. -- Ueimor --45Z9DzgjV8m4Oswq Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="r8169-rx-fill-typo.patch" Oops... drivers/net/r8169.c | 2 +- 1 files changed, 1 insertion(+), 1 deletion(-) diff -puN drivers/net/r8169.c~r8169-rx-fill-typo drivers/net/r8169.c --- linux-2.6.1-rc1-mm1/drivers/net/r8169.c~r8169-rx-fill-typo 2004-01-05 22:55:24.000000000 +0100 +++ linux-2.6.1-rc1-mm1-fr/drivers/net/r8169.c 2004-01-05 23:03:39.000000000 +0100 @@ -1186,7 +1186,7 @@ static u32 rtl8169_rx_fill(struct rtl816 { u32 cur; - for (cur = start; end - start > 0; cur++) { + for (cur = start; end - cur > 0; cur++) { int ret, i = cur % NUM_RX_DESC; if (tp->Rx_skbuff[i]) _ --45Z9DzgjV8m4Oswq-- From romieu@fr.zoreil.com Mon Jan 5 14:40:37 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 05 Jan 2004 14:40:49 -0800 (PST) Received: from fr.zoreil.com (electric-eye.fr.zoreil.com [213.41.134.224]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i05MeZTa020774 for ; Mon, 5 Jan 2004 14:40:36 -0800 Received: from electric-eye.fr.zoreil.com (localhost.localdomain [127.0.0.1]) by fr.zoreil.com (8.12.8/8.12.1) with ESMTP id 
i05McOsW023388; Mon, 5 Jan 2004 23:38:24 +0100 Received: (from romieu@localhost) by electric-eye.fr.zoreil.com (8.12.8/8.12.1) id i05McO19023387; Mon, 5 Jan 2004 23:38:24 +0100 Date: Mon, 5 Jan 2004 23:38:23 +0100 From: Francois Romieu To: "Krishnakumar. R" Cc: jgarzik@pobox.com, netdev@oss.sgi.com Subject: Re: [PATCH] r8169 ethtool support. Message-ID: <20040105233823.B19514@electric-eye.fr.zoreil.com> References: <1073210391.3555.7.camel@l5ac210.l5.laser5.co.jp> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5.1i In-Reply-To: <1073210391.3555.7.camel@l5ac210.l5.laser5.co.jp>; from krishnakumar@naturesoft.net on Sun, Jan 04, 2004 at 06:59:51PM +0900 X-Organisation: Land of Sunshine Inc. X-archive-position: 2231 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: romieu@fr.zoreil.com Precedence: bulk X-list: netdev Krishnakumar. R : [minimal ethtool for r8169] Added to the pile. -- Ueimor From davem@pizda.ninka.net Mon Jan 5 19:59:42 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 05 Jan 2004 19:59:55 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i063xgTa007897 for ; Mon, 5 Jan 2004 19:59:42 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id TAA20708; Mon, 5 Jan 2004 19:54:03 -0800 Date: Mon, 5 Jan 2004 19:54:03 -0800 From: "David S. 
Miller" To: Jeff Garzik Cc: benh@kernel.crashing.org, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Problem with dev_kfree_skb_any() in 2.6.0 Message-Id: <20040105195403.65ac4e9e.davem@redhat.com> In-Reply-To: <20040102025807.GB3851@gtf.org> References: <1072567054.4112.14.camel@gaston> <20031227170755.4990419b.davem@redhat.com> <3FF0FA6A.8000904@pobox.com> <20031229205157.4c631f28.davem@redhat.com> <20031230051519.GA6916@gtf.org> <20031229220122.30078657.davem@redhat.com> <3FF11745.4060705@pobox.com> <20031229221345.31c8c763.davem@redhat.com> <3FF1B939.1090108@pobox.com> <20040101124218.258e8b73.davem@redhat.com> <20040102025807.GB3851@gtf.org> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2232 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Thu, 1 Jan 2004 21:58:07 -0500 Jeff Garzik wrote: > On Thu, Jan 01, 2004 at 12:42:18PM -0800, David S. Miller wrote: > > Though, is there any particular reason you don't like adding a > > "|| irqs_disabled()" check to the if statement instead? > > I prefer that solution better actually. > > Yep, in fact when I wrote the above message, I came across a couple when I > was pondering... > * the destructor runs in a more predictable context. > * given the problem that started this thread, the 'if' test is a > potentially problematic area. Why not eliminate all possibility that > this problem will occur again? The way I see this, dev_kfree_skb_any() is not used in any performance critical path, so at worst during device shutdown, reset, or power-down, TX queue packet freeing work could be delayed by up to one jiffie. Therefore I've put the "|| irqs_disabled()" version of the fix into my tree. 
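For readers following the thread, the effect of the "|| irqs_disabled()" variant can be sketched in user space as below. This is only an illustration: it assumes the original test in dev_kfree_skb_any() was in_irq() alone, and sim_in_irq, sim_irqs_disabled, pick_path_old() and pick_path_new() are hypothetical stand-ins, not kernel symbols.

```c
/*
 * Stand-ins for the kernel context queries in_irq() and irqs_disabled().
 * They are plain flags here so the sketch can run outside the kernel.
 */
static int sim_in_irq;
static int sim_irqs_disabled;

enum free_path { FREE_NOW, FREE_DEFERRED };

/* Assumed original test: defer the free only in hard-IRQ context. */
static enum free_path pick_path_old(void)
{
	return sim_in_irq ? FREE_DEFERRED : FREE_NOW;
}

/*
 * Test with the extra check: also defer when interrupts are disabled,
 * for example while a driver holds a lock taken with interrupts off.
 */
static enum free_path pick_path_new(void)
{
	return (sim_in_irq || sim_irqs_disabled) ? FREE_DEFERRED : FREE_NOW;
}
```

With sim_in_irq = 0 and sim_irqs_disabled = 1, the old test picks the immediate free while the new one defers it, which is the case this thread was about.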
Thanks for working this out with me Jeff :) From jgarzik@pobox.com Mon Jan 5 23:54:05 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 05 Jan 2004 23:54:18 -0800 (PST) Received: from www.linux.org.uk (IDENT:93@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i067s2Ta016121 for ; Mon, 5 Jan 2004 23:54:05 -0800 Received: from rdu74-153-143.nc.rr.com ([24.74.153.143]:42061 helo=pobox.com) by www.linux.org.uk with asmtp (TLSv1:AES256-SHA:256) (Exim 4.22) id 1Adm2L-00075l-P2; Tue, 06 Jan 2004 07:54:01 +0000 Message-ID: <3FFA6981.7030006@pobox.com> Date: Tue, 06 Jan 2004 02:53:37 -0500 From: Jeff Garzik User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030703 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Francois Romieu CC: akpm@osdl.org, netdev@oss.sgi.com, Brad House Subject: Re: [patch] 2.6.1-rc1-mm1 - typo of death in the r8169 driver References: <20031122183001.GA16993@gtf.org> <20031124000939.A456@electric-eye.fr.zoreil.com> <20031126004550.A25408@electric-eye.fr.zoreil.com> <20031127235143.A16767@electric-eye.fr.zoreil.com> <20031130014738.A2589@electric-eye.fr.zoreil.com> <3FF846C3.5070207@mainstreetsoftworks.com> <20040104233849.A3214@electric-eye.fr.zoreil.com> <20040105231741.A19514@electric-eye.fr.zoreil.com> In-Reply-To: <20040105231741.A19514@electric-eye.fr.zoreil.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2234 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev applied From jgarzik@pobox.com Mon Jan 5 23:53:56 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 05 Jan 2004 23:54:09 -0800 (PST) Received: from www.linux.org.uk (IDENT:93@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i067rtTa016116 for ; Mon, 5 Jan 2004 23:53:56 -0800 
Received: from rdu74-153-143.nc.rr.com ([24.74.153.143]:42060 helo=pobox.com) by www.linux.org.uk with asmtp (TLSv1:AES256-SHA:256) (Exim 4.22) id 1Adm2E-00075k-3g; Tue, 06 Jan 2004 07:53:54 +0000 Message-ID: <3FFA6978.3080509@pobox.com> Date: Tue, 06 Jan 2004 02:53:28 -0500 From: Jeff Garzik User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030703 X-Accept-Language: en-us, en MIME-Version: 1.0 To: "Feldman, Scott" CC: netdev@oss.sgi.com, cramerj@intel.com Subject: Re: [e1000 2.6-exp] back out CSA interrupt fix References: In-Reply-To: Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2233 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev applied From jgarzik@pobox.com Tue Jan 6 00:04:36 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 06 Jan 2004 00:04:54 -0800 (PST) Received: from www.linux.org.uk (IDENT:93@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i0684ZTa017071 for ; Tue, 6 Jan 2004 00:04:36 -0800 Received: from rdu74-153-143.nc.rr.com ([24.74.153.143]:42077 helo=pobox.com) by www.linux.org.uk with asmtp (TLSv1:AES256-SHA:256) (Exim 4.22) id 1AdmCY-0007DO-Ey; Tue, 06 Jan 2004 08:04:34 +0000 Message-ID: <3FFA6BFB.6020100@pobox.com> Date: Tue, 06 Jan 2004 03:04:11 -0500 From: Jeff Garzik User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030703 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Amir Noam CC: Jay Vosburgh , bonding-devel@lists.sourceforge.net, netdev@oss.sgi.com Subject: Re: [PATCH 1/3] [bonding 2.6] Save parameters in a per-bond data structure References: <200401051729.52769.amir.noam@intel.com> In-Reply-To: <200401051729.52769.amir.noam@intel.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2235 
X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev applied 1-3 to 2.6.x From jgarzik@pobox.com Tue Jan 6 00:06:59 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 06 Jan 2004 00:07:15 -0800 (PST) Received: from www.linux.org.uk (IDENT:93@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i0686wTa017590 for ; Tue, 6 Jan 2004 00:06:59 -0800 Received: from rdu74-153-143.nc.rr.com ([24.74.153.143]:42083 helo=pobox.com) by www.linux.org.uk with asmtp (TLSv1:AES256-SHA:256) (Exim 4.22) id 1AdmEr-0007FD-T9; Tue, 06 Jan 2004 08:06:58 +0000 Message-ID: <3FFA6C8B.1030203@pobox.com> Date: Tue, 06 Jan 2004 03:06:35 -0500 From: Jeff Garzik User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030703 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Amir Noam CC: Jay Vosburgh , bonding-devel@lists.sourceforge.net, netdev@oss.sgi.com Subject: Re: [PATCH 1/3] [bonding 2.4] Save parameters in a per-bond data structure References: <200401051726.33613.amir.noam@intel.com> In-Reply-To: <200401051726.33613.amir.noam@intel.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2236 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev applied 1-3 to 2.4.x From hibi665@oki.com Tue Jan 6 04:14:04 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 06 Jan 2004 04:14:29 -0800 (PST) Received: from iscan1.intra.oki.co.jp (okigate.oki.co.jp [202.226.91.194]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i06CE3Ta029826 for ; Tue, 6 Jan 2004 04:14:04 -0800 Received: from aoi.okilab.oki.co.jp (localhost.localdomain [127.0.0.1]) by iscan1.intra.oki.co.jp (8.9.3/8.9.3) with SMTP id VAA18491 for ; Tue, 6 Jan 2004 21:14:02 +0900 
Received: (qmail 18745 invoked from network); 6 Jan 2004 21:14:00 +0900 Received: from dhcp23233.okilab.oki.co.jp (HELO kiso) (172.24.23.233) by aoi.okilab.oki.co.jp with SMTP; 6 Jan 2004 21:14:00 +0900 Message-Id: <20040106211542.4fe9e531%hibi665@oki.com> MIME-Version: 1.0 Date: Tue, 06 Jan 2004 21:15:42 +0900 X-Mailer: Denshin 8 Go V32.1.4.3 X-My-Real-Login-Name: thibi; aoi From: Takashi Hibi To: netdev@oss.sgi.com Subject: MLD problems (again) X-archive-position: 2237 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hibi665@oki.com Precedence: bulk X-list: netdev Happy new year. Let me bring up the MLD problems again. As I pointed out before, there are two problems in the current MLD code. 1. In MLDv1 compatibility mode, the Older Version Querier Present timer expires prematurely. 2. After joining an SSM group using setsockopt(MCAST_JOIN_SOURCE_GROUP), the MLDv2 listener report isn't issued immediately. The cause of 1 is a wrong computation of the timeout value. The cause of 2 was harder to find. After tracing the code, I figured out that the mode of the ifmcaddr6 isn't set correctly after setsockopt. After the join, pmc->mca_sfmode should be MCAST_INCLUDE, but it remains MCAST_EXCLUDE. As a result the is_in() call returns false, and the MLDv2 packet isn't composed. I don't know the right way to fix it, since the code is complicated by a lot of flags. At least the following patch (diff from 2.6.0) works for me. 
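As a side note on problem 1, separate from the patch itself, the difference between the two timeout formulas can be worked through with a small user-space sketch. The values HZ = 1000, MLD_QRV_DEFAULT = 2 and a 125 second Query Interval are assumptions made for illustration, and the helper names are hypothetical.

```c
/* Assumed values for illustration only. */
#define HZ               1000UL  /* jiffies per second (config dependent) */
#define MLD_QRV_DEFAULT  2UL     /* assumed default Robustness Variable */
#define QUERY_INTERVAL_S 125UL   /* assumed default Query Interval, seconds */

/* Same conversion as in igmp6_event_query(): milliseconds to jiffies. */
static unsigned long ms_to_jiffies(unsigned long ms)
{
	return (ms * HZ) / 1000;
}

/* Old computation: (qrv + 1) * max_delay, all in jiffies. */
static unsigned long switchback_old(unsigned long qrv, unsigned long max_delay)
{
	return (qrv + 1) * max_delay;
}

/* Patched computation: QRV * Query Interval + max_delay, in jiffies. */
static unsigned long switchback_new(unsigned long max_delay)
{
	return MLD_QRV_DEFAULT * QUERY_INTERVAL_S * HZ + max_delay;
}
```

Under these assumptions, a query carrying a 10000 ms maximum response delay gives 30000 jiffies (30 s) with the old formula and 260000 jiffies (260 s) with the patched one, so the old timer fires well before the next query would normally arrive.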
Regards, Takashi Hibi --- mcast.c 2003-12-18 11:59:28.000000000 +0900 +++ new/mcast.c 2004-01-06 16:30:05.546174486 +0900 @@ -1050,7 +1050,7 @@ int igmp6_event_query(struct sk_buff *sk /* Translate milliseconds to jiffies */ max_delay = (ntohs(hdr->icmp6_maxdelay)*HZ)/1000; - switchback = (idev->mc_qrv + 1) * max_delay; + switchback = MLD_QRV_DEFAULT * 125*HZ + max_delay; idev->mc_v1_seen = jiffies + switchback; /* cancel the interface change timer */ @@ -1541,7 +1541,8 @@ static void mld_send_cr(struct inet6_dev type = MLD2_CHANGE_TO_EXCLUDE; else type = MLD2_CHANGE_TO_INCLUDE; - skb = add_grec(skb, pmc, type, 0, 0); + if (!skb) + skb = add_grec(skb, pmc, type, 0, 0); } spin_unlock_bh(&pmc->mca_lock); } @@ -1745,6 +1746,7 @@ static int ip6_mc_add1_src(struct ifmcad return -ENOBUFS; memset(psf, 0, sizeof(*psf)); psf->sf_addr = *psfsrc; + psf->sf_crcount = pmc->idev->mc_qrv; if (psf_prev) { psf_prev->sf_next = psf; } else @@ -1799,6 +1801,7 @@ int ip6_mc_add_src(struct inet6_dev *ide struct ifmcaddr6 *pmc; int isexclude; int i, err; + int first_join_src_grp; if (!idev) return -ENODEV; @@ -1816,6 +1819,8 @@ int ip6_mc_add_src(struct inet6_dev *ide sf_markstate(pmc); isexclude = pmc->mca_sfmode == MCAST_EXCLUDE; + first_join_src_grp = (!pmc->sources && sfmode == MCAST_INCLUDE && + sfcount == 1 && delta); if (!delta) pmc->mca_sfcount[sfmode]++; err = 0; @@ -1827,10 +1832,19 @@ int ip6_mc_add_src(struct inet6_dev *ide if (err) { int j; - pmc->mca_sfcount[sfmode]--; + if (!delta) + pmc->mca_sfcount[sfmode]--; for (j=0; jmca_sfcount[MCAST_EXCLUDE] != 0)) { + goto done; + } + if (first_join_src_grp) { + pmc->mca_sfmode = MCAST_INCLUDE; + if (sf_setstate(pmc)) + mld_ifc_event(idev); + goto done; + } + if (isexclude != (pmc->mca_sfcount[MCAST_EXCLUDE] != 0)) { struct inet6_dev *idev = pmc->idev; struct ip6_sf_list *psf; @@ -1848,6 +1862,7 @@ int ip6_mc_add_src(struct inet6_dev *ide mld_ifc_event(idev); } else if (sf_setstate(pmc)) mld_ifc_event(idev); +done: 
spin_unlock_bh(&pmc->mca_lock); read_unlock_bh(&idev->lock); return err; From cmadams@hiwaay.net Tue Jan 6 07:17:54 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 06 Jan 2004 07:18:07 -0800 (PST) Received: from mail.hiwaay.net (bee.hiwaay.net [216.180.54.11]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i06FHrTa007362 for ; Tue, 6 Jan 2004 07:17:53 -0800 Received: from bee.hiwaay.net (localhost [127.0.0.1]) by mail.hiwaay.net (8.12.10/8.12.10) with ESMTP id i06FHqu51538440 for ; Tue, 6 Jan 2004 09:17:52 -0600 (CST) Received: (from cmadams@localhost) by bee.hiwaay.net (8.12.10/8.12.10/DefSubmit) id i06FHq5Z1538608 for netdev@oss.sgi.com; Tue, 6 Jan 2004 09:17:52 -0600 (CST) Date: Tue, 6 Jan 2004 09:17:51 -0600 From: Chris Adams To: netdev@oss.sgi.com Subject: Problem with Compaq NC3131 dual eth card Message-ID: <20040106151751.GB1509622@hiwaay.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4i X-archive-position: 2238 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: cmadams@hiwaay.net Precedence: bulk X-list: netdev I have a Compaq NC3131 dual 10/100 ethernet PCI (64/32 bit in a 32 bit slot) card. The e100 module doesn't work, and the eepro100 module can be coerced to work but complains about it (this is with Fedora Core 1 and kernel-2.4.22-1.2135.nptl.athlon.rpm). The e100 module prints: ************************************************************************ Intel(R) PRO/100 Network Driver - version 2.3.18-k1 Copyright (c) 2003 Intel Corporation divert: allocating divert_blk for eth1 e100: selftest OK. e100: Invalid Ethernet address e100: Failed to initialize, instance #0 divert: freeing divert_blk for eth1 divert: allocating divert_blk for eth1 e100: selftest OK. 
e100: Invalid Ethernet address e100: Failed to initialize, instance #0 divert: freeing divert_blk for eth1 ************************************************************************ and unloads itself immediately. The eepro100 module prints: ************************************************************************ eepro100.c:v1.09j-t 9/29/99 Donald Becker http://www.scyld.com/network/eepro100.html eepro100.c: $Revision: 1.36 $ 2000/11/17 Modified by Andrey V. Savochkin and others divert: allocating divert_blk for eth1 eth1: Invalid EEPROM checksum 0xffc0, check settings before activating this device! eth1: OEM i82557/i82558 10/100 Ethernet, FF:FF:FF:FF:FF:FF, IRQ 11. Board assembly ffffff-255, Physical connectors present: RJ45 BNC AUI MII Primary interface chip unknown-15 PHY #31. Secondary interface chip i82555. General self-test: passed. Serial sub-system self-test: passed. Internal registers self-test: passed. ROM checksum self-test: passed (0x24c9f043). divert: allocating divert_blk for eth2 eth2: Invalid EEPROM checksum 0xffc0, check settings before activating this device! eth2: OEM i82557/i82558 10/100 Ethernet, FF:FF:FF:FF:FF:FF, IRQ 10. Board assembly ffffff-255, Physical connectors present: RJ45 BNC AUI MII Primary interface chip unknown-15 PHY #31. Secondary interface chip i82555. General self-test: passed. Serial sub-system self-test: passed. Internal registers self-test: passed. ROM checksum self-test: passed (0x24c9f043). ************************************************************************ It will load and let me configure the devices. If I force it to a different MAC address (reusing the MAC from eth0 and moving the wire to avoid ARP problems), it seems to work, but I shouldn't have to do that. Is my card defective or is this a compatibility problem? Is there a way to fix this? -- Chris Adams Systems and Network Administrator - HiWAAY Internet Services I don't speak for anybody but myself - that's enough trouble. 
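For context on the "Invalid Ethernet address" message: an address of FF:FF:FF:FF:FF:FF fails the usual unicast sanity test that drivers apply to a station address. The sketch below is modelled on that common test and is not taken from the e100 source; mac_is_valid() is a hypothetical name.

```c
#include <string.h>

/* Returns 1 if addr is usable as a unicast station address, else 0. */
static int mac_is_valid(const unsigned char addr[6])
{
	static const unsigned char all_zero[6] = { 0 };

	/* Low bit of the first octet marks multicast/broadcast addresses. */
	if (addr[0] & 1)
		return 0;

	/* 00:00:00:00:00:00 is not a usable address either. */
	return memcmp(addr, all_zero, 6) != 0;
}
```

An all-ones address is what one would expect to see if the EEPROM contents could not be read, which would also fit the checksum warning printed by eepro100, though that is only a guess from the logs above.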
From erik@hensema.net Tue Jan 6 07:59:42 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 06 Jan 2004 07:59:56 -0800 (PST) Received: from scrat.hensema.net (scrat.hensema.net [62.212.82.150]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i06FxeTa009012 for ; Tue, 6 Jan 2004 07:59:41 -0800 Received: from dexter.hensema.net (cc78409-a.hnglo1.ov.home.nl [212.120.97.185]) by scrat.hensema.net (8.12.7/8.12.7/SuSE Linux 0.6) with ESMTP id i06FxYng008547; Tue, 6 Jan 2004 16:59:34 +0100 Received: from bender.home.hensema.net (root@bender.ipv6.hensema.net [IPv6:2001:888:10a1:0:202:44ff:fe69:60f5]) by dexter.hensema.net (8.12.7/8.12.7) with ESMTP id i06FxYB5022646; Tue, 6 Jan 2004 16:59:34 +0100 Received: from bender.home.hensema.net (erik@localhost [127.0.0.1]) by bender.home.hensema.net (8.12.7/8.12.7) with ESMTP id i06FxY72003383; Tue, 6 Jan 2004 16:59:34 +0100 Received: (from erik@localhost) by bender.home.hensema.net (8.12.7/8.12.7/Submit) id i06FxYVT003382; Tue, 6 Jan 2004 16:59:34 +0100 Date: Tue, 6 Jan 2004 16:59:34 +0100 From: Erik Hensema To: "David S. Miller" Cc: Linus Torvalds , netdev@oss.sgi.com Subject: Re: 2.6.0: something is leaking memory Message-ID: <20040106155933.GA3373@bender.home.hensema.net> Reply-To: erik@hensema.net References: <20040104204834.40b6ca51.davem@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20040104204834.40b6ca51.davem@redhat.com> User-Agent: Mutt/1.4i X-archive-position: 2239 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: erik@hensema.net Precedence: bulk X-list: netdev On Sun, Jan 04, 2004 at 08:48:34PM -0800, David S. Miller wrote: > On Sun, 4 Jan 2004 18:30:21 -0800 (PST) > Linus Torvalds wrote: > > > You've got 19 _megabytes_ allocated to "tcp6_sock", and they are all > > marked as "active". That's almost certainly the leaking bug. > > > > Everything else looks reasonably normal. 
> ... > > David? > > Fixed by changeset 1.1496.16.1 which is in 2.6.1-rc1 The leak seems to be in 2.6.1-rc1 too and I can't find any IPv6 related fixes wrt memory in the long format changelog of 2.6.1-rc1. David: are you sure it was fixed in rc1? It doesn't seem to be in -rc2 either. This is after 26 hours uptime: tcp6_sock 6246 6248 1024 4 1 : tunables 54 27 0 : slabdata 1562 1562 0 -- Erik Hensema (erik@hensema.net) From jmorris@redhat.com Tue Jan 6 08:01:11 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 06 Jan 2004 08:01:25 -0800 (PST) Received: from thoron.boston.redhat.com (nat-pool-bos.redhat.com [66.187.230.200]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i06G1ATa009420 for ; Tue, 6 Jan 2004 08:01:10 -0800 Received: from thoron.boston.redhat.com (localhost.localdomain [127.0.0.1]) by thoron.boston.redhat.com (8.12.8/8.12.8) with ESMTP id i06G13iF007581; Tue, 6 Jan 2004 11:01:03 -0500 Received: from localhost (jmorris@localhost) by thoron.boston.redhat.com (8.12.8/8.12.8/Submit) with ESMTP id i06G130N007577; Tue, 6 Jan 2004 11:01:03 -0500 X-Authentication-Warning: thoron.boston.redhat.com: jmorris owned process doing -bs Date: Tue, 6 Jan 2004 11:01:03 -0500 (EST) From: James Morris X-X-Sender: jmorris@thoron.boston.redhat.com To: netdev@oss.sgi.com cc: netfilter-devel@lists.netfilter.org, "David S. Miller" , Stephen Smalley Subject: [RFC] IPv4 Netfilter hook priorities for SELinux Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 2240 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jmorris@redhat.com Precedence: bulk X-list: netdev SELinux needs to use some Netfilter hooks, and I'd like to propose the hook priorities below for the mainline kernel. As SELinux is a mandatory access control system, it needs to be able to look at packets before and after they may have been modified. Two priorities are thus required. 
The SELINUX_LAST priority is straightforward: this is after all mangling and NAT has occurred. The SELINUX_FIRST priority needs to be located before any packet modification hooks, although it is also potentially useful if located prior to conntrack so that SELinux has an opportunity to reject packets before they enter the conntrack code. Does anyone have any objections to the patch below (which I'd propose for 2.6.2), or other comments? - James -- James Morris diff -urN -X dontdiff linux-2.6.1-rc1-mm2.pending/include/linux/netfilter_ipv4.h linux-2.6.1-rc1-mm2.w1/include/linux/netfilter_ipv4.h --- linux-2.6.1-rc1-mm2.pending/include/linux/netfilter_ipv4.h 2003-09-27 20:50:51.000000000 -0400 +++ linux-2.6.1-rc1-mm2.w1/include/linux/netfilter_ipv4.h 2004-01-06 10:14:59.503138800 -0500 @@ -51,6 +51,7 @@ enum nf_ip_hook_priorities { NF_IP_PRI_FIRST = INT_MIN, + NF_IP_PRI_SELINUX_FIRST = -225, NF_IP_PRI_CONNTRACK = -200, NF_IP_PRI_BRIDGE_SABOTAGE_FORWARD = -175, NF_IP_PRI_MANGLE = -150, @@ -58,6 +59,7 @@ NF_IP_PRI_BRIDGE_SABOTAGE_LOCAL_OUT = -50, NF_IP_PRI_FILTER = 0, NF_IP_PRI_NAT_SRC = 100, + NF_IP_PRI_SELINUX_LAST = 225, NF_IP_PRI_LAST = INT_MAX, }; From shmulik.hen@intel.com Tue Jan 6 09:04:38 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 06 Jan 2004 09:04:51 -0800 (PST) Received: from caduceus.fm.intel.com (fmr02.intel.com [192.55.52.25]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i06H4bTa014309 for ; Tue, 6 Jan 2004 09:04:38 -0800 Received: from talaria.fm.intel.com (talaria.fm.intel.com [10.1.192.39]) by caduceus.fm.intel.com (8.12.9-20030918-01/8.12.9/d: major-outer.mc,v 1.12 2003/12/18 18:58:11 root Exp $) with ESMTP id i05FCB5B005806; Mon, 5 Jan 2004 15:12:11 GMT Received: from fmsmsxvs041.fm.intel.com (fmsmsxvs041.fm.intel.com [132.233.42.126]) by talaria.fm.intel.com (8.12.9-20030918-01/8.12.9/d: major-inner.mc,v 1.7 2003/12/18 18:58:10 root Exp $) with SMTP id i05FA309011452; Mon, 5 Jan 2004 15:10:03 GMT Received: from jrslxjul4.npdj.intel.com 
([10.12.220.54]) by fmsmsxvs041.fm.intel.com (SAVSMTP 3.1.2.35) with SMTP id M2004010507105104609 ; Mon, 05 Jan 2004 07:10:51 -0800 From: Shmuel Hen Organization: Intel Corporation Subject: Fwd: [bonding] trivial - Update comment blocks and version field Date: Mon, 5 Jan 2004 17:10:49 +0200 User-Agent: KMail/1.5.3 To: netdev@oss.sgi.com, bonding-devel@lists.sourceforge.net MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200401051710.50622.shmulik.hen@intel.com> X-Scanned-By: MIMEDefang 2.31 (www . roaringpenguin . com / mimedefang) X-archive-position: 2241 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shmulik.hen@intel.com Precedence: bulk X-list: netdev ---------- Forwarded Message ---------- Subject: [bonding] trivial - Update comment blocks and version field Date: Monday 05 January 2004 16:41 From: Shmuel Hen To: "Jeff Garzik" Cc: "Jay Vosburgh" , "Shmulik Hen" , "Amir Noam" , "Noam Marom" Update comment blocks, version field and copyright years to match all the recent changes that were accepted into 2.4/2.6. Applies on top of latest netdev BK tree. -- Shmulik. diff -Nuarp a/Documentation/networking/ifenslave.c b/Documentation/networking/ifenslave.c --- a/Documentation/networking/ifenslave.c Mon Jan 5 15:40:57 2004 +++ b/Documentation/networking/ifenslave.c Mon Jan 5 16:34:13 2004 @@ -89,13 +89,13 @@ * while it is running. It was already set during enslave. To * simplify things, it is now handeled separately. 
* - * - 2003/09/24 - Shmulik Hen + * - 2003/12/01 - Shmulik Hen * - Code cleanup and style changes * set version to 1.1.0 */ #define APP_VERSION "1.1.0" -#define APP_RELDATE "Septemer 24, 2003" +#define APP_RELDATE "December 1, 2003" #define APP_NAME "ifenslave" static char *version = diff -Nuarp a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c --- a/drivers/net/bonding/bond_3ad.c Mon Jan 5 15:40:57 2004 +++ b/drivers/net/bonding/bond_3ad.c Mon Jan 5 16:34:13 2004 @@ -1,5 +1,5 @@ /* - * Copyright(c) 1999 - 2003 Intel Corporation. All rights reserved. + * Copyright(c) 1999 - 2004 Intel Corporation. All rights reserved. * * This program is free software; you can redistribute it and/or modify it * under the terms of the GNU General Public License as published by the Free @@ -48,7 +48,7 @@ * problem on very high Tx traffic load where packets may get dropped * by the slave. * - * 2003/09/24 - Shmulik Hen + * 2003/12/01 - Shmulik Hen * - Code cleanup and style changes */ diff -Nuarp a/drivers/net/bonding/bond_3ad.h b/drivers/net/bonding/bond_3ad.h --- a/drivers/net/bonding/bond_3ad.h Mon Jan 5 15:40:57 2004 +++ b/drivers/net/bonding/bond_3ad.h Mon Jan 5 16:34:13 2004 @@ -1,5 +1,5 @@ /* - * Copyright(c) 1999 - 2003 Intel Corporation. All rights reserved. + * Copyright(c) 1999 - 2004 Intel Corporation. All rights reserved. * * This program is free software; you can redistribute it and/or modify it * under the terms of the GNU General Public License as published by the Free @@ -29,7 +29,7 @@ * - Renamed bond_3ad_link_status_changed() to * bond_3ad_handle_link_change() for compatibility with TLB. * - * 2003/09/24 - Shmulik Hen + * 2003/12/01 - Shmulik Hen * - Code cleanup and style changes */ diff -Nuarp a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c --- a/drivers/net/bonding/bond_alb.c Mon Jan 5 15:40:57 2004 +++ b/drivers/net/bonding/bond_alb.c Mon Jan 5 16:34:13 2004 @@ -1,5 +1,5 @@ /* - * Copyright(c) 1999 - 2003 Intel Corporation. 
All rights reserved. + * Copyright(c) 1999 - 2004 Intel Corporation. All rights reserved. * * This program is free software; you can redistribute it and/or modify it * under the terms of the GNU General Public License as published by the @@ -29,8 +29,11 @@ * - Add support for setting bond's MAC address with special * handling required for ALB/TLB. * - * 2003/09/24 - Shmulik Hen + * 2003/12/01 - Shmulik Hen * - Code cleanup and style changes + * + * 2003/12/30 - Amir Noam + * - Fixed: Cannot remove and re-enslave the original active slave. */ //#define BONDING_DEBUG 1 diff -Nuarp a/drivers/net/bonding/bond_alb.h b/drivers/net/bonding/bond_alb.h --- a/drivers/net/bonding/bond_alb.h Mon Jan 5 15:40:57 2004 +++ b/drivers/net/bonding/bond_alb.h Mon Jan 5 16:34:14 2004 @@ -1,5 +1,5 @@ /* - * Copyright(c) 1999 - 2003 Intel Corporation. All rights reserved. + * Copyright(c) 1999 - 2004 Intel Corporation. All rights reserved. * * This program is free software; you can redistribute it and/or modify it * under the terms of the GNU General Public License as published by the @@ -25,7 +25,7 @@ * - Add support for setting bond's MAC address with special * handling required for ALB/TLB. * - * 2003/09/24 - Shmulik Hen + * 2003/12/01 - Shmulik Hen * - Code cleanup and style changes */ diff -Nuarp a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c --- a/drivers/net/bonding/bond_main.c Mon Jan 5 15:40:57 2004 +++ b/drivers/net/bonding/bond_main.c Mon Jan 5 16:34:14 2004 @@ -452,6 +452,12 @@ * o Change struct member names and types. * o Chomp trailing spaces, remove empty lines, fix indentations. * o Re-organize code according to context. + * + * 2003/12/30 - Amir Noam + * - Fixed: Cannot remove and re-enslave the original active slave. + * - Fixed: Releasing the original active slave causes mac address duplication. + * - Add support for slaves that use ethtool_ops. + * Set version to 2.5.3. 
*/ //#define BONDING_DEBUG 1 diff -Nuarp a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h --- a/drivers/net/bonding/bonding.h Mon Jan 5 15:40:57 2004 +++ b/drivers/net/bonding/bonding.h Mon Jan 5 16:34:14 2004 @@ -23,7 +23,7 @@ * 2003/05/01 - Shmulik Hen * - Added support for Transmit load balancing mode. * - * 2003/09/24 - Shmulik Hen + * 2003/12/01 - Shmulik Hen * - Code cleanup and style changes */ @@ -36,8 +36,8 @@ #include "bond_3ad.h" #include "bond_alb.h" -#define DRV_VERSION "2.5.0" -#define DRV_RELDATE "December 1, 2003" +#define DRV_VERSION "2.5.3" +#define DRV_RELDATE "December 30, 2003" #define DRV_NAME "bonding" #define DRV_DESCRIPTION "Ethernet Channel Bonding Driver" diff -Nuarp a/include/linux/if_bonding.h b/include/linux/if_bonding.h --- a/include/linux/if_bonding.h Mon Jan 5 15:40:57 2004 +++ b/include/linux/if_bonding.h Mon Jan 5 16:34:14 2004 @@ -32,6 +32,9 @@ * 2003/05/01 - Amir Noam * - Added ABI version control to restore compatibility between * new/old ifenslave and new/old bonding. + * + * 2003/12/01 - Shmulik Hen + * - Code cleanup and style changes */ #ifndef _LINUX_IF_BONDING_H @@ -86,7 +89,7 @@ typedef struct ifbond { typedef struct ifslave { __s32 slave_id; /* Used as an IN param to the BOND_SLAVE_INFO_QUERY ioctl */ - __s8 slave_name[IFNAMSIZ]; + char slave_name[IFNAMSIZ]; __s8 link; __s8 state; __u32 link_failure_count; From davem@pizda.ninka.net Tue Jan 6 10:05:13 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 06 Jan 2004 10:05:28 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i06I5DTa016645 for ; Tue, 6 Jan 2004 10:05:13 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id JAA02701; Tue, 6 Jan 2004 09:59:09 -0800 Date: Tue, 6 Jan 2004 09:59:09 -0800 From: "David S. 
Miller"
To: erik@hensema.net
Cc: torvalds@osdl.org, netdev@oss.sgi.com, acme@conectiva.com.br
Subject: Re: 2.6.0: something is leaking memory
Message-Id: <20040106095909.7243b2ce.davem@redhat.com>
In-Reply-To: <20040106155933.GA3373@bender.home.hensema.net>
References: <20040104204834.40b6ca51.davem@redhat.com> <20040106155933.GA3373@bender.home.hensema.net>
X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-archive-position: 2242
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: davem@redhat.com
Precedence: bulk
X-list: netdev

On Tue, 6 Jan 2004 16:59:34 +0100
Erik Hensema wrote:

> David: are you sure it was fixed in rc1?
>
> It doesn't seem to be in -rc2 either.
>
> This is after 26 hours uptime:
>
> tcp6_sock 6246 6248 1024 4 1 : tunables 54 27 0 : slabdata 1562 1562 0

Someone mentioned a bug in the userland program you're
using that is opening these sockets? Something about leaving sockets
not closed.

(Arnaldo, we apparently still have a TCP ipv6 socket leak...)

From mroos@tartutest.cyber.ee Tue Jan 6 10:13:39 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 06 Jan 2004 10:13:53 -0800 (PST) Received: from tartutest.cyber.ee (tartutest.cyber.ee [193.40.6.70]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i06IDbTa017156 for ; Tue, 6 Jan 2004 10:13:38 -0800 Received: Message by Barricade tartutest.cyber.ee with ESMTP id i06IRnEw021613; Tue, 6 Jan 2004 20:27:49 +0200 Received: from mroos by rhn.tartu-labor with local (Exim 4.30) id 1Advho-0000Pq-3c; Tue, 06 Jan 2004 20:13:28 +0200 From: Meelis Roos To: cmadams@hiwaay.net, netdev@oss.sgi.com Subject: Re: Problem with Compaq NC3131 dual eth card In-Reply-To: <20040106151751.GB1509622@hiwaay.net> User-Agent: tin/1.7.4-20031226 ("Taransay") (UNIX) (Linux/2.4.25-pre4 (i686)) Message-Id: Date: Tue, 06 Jan 2004 20:13:28 +0200 X-archive-position: 2243 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mroos@linux.ee Precedence: bulk X-list: netdev CA> I have a Compaq NC3131 dual 10/100 ethernet PCI (64/32 bit in a 32 bit CA> slot) card. The e100 module doesn't work, and the eepro100 module can CA> be coerced to work but complains about it (this is with Fedora Core 1 CA> and kernel-2.4.22-1.2135.nptl.athlon.rpm). FWIW, I have a NC3134 (64/66 in 32/33 slot). Just tested it with 2.6.1-rc1 and 2.4.25-pre4, works fine and transfers fine. I tried only e100 since it just worked. 
-- Meelis Roos From acme@conectiva.com.br Tue Jan 6 10:53:14 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 06 Jan 2004 10:53:28 -0800 (PST) Received: from orion.netbank.com.br (orion.netbank.com.br [200.203.199.90]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i06IrCTa018239 for ; Tue, 6 Jan 2004 10:53:13 -0800 Received: from 200-103-242-110.ctame7042.dsl.brasiltelecom.net.br ([200.103.242.110] helo=oops.kerneljanitors.org) by orion.netbank.com.br with asmtp (Exim 3.33 #1) id 1AdwRH-0004eH-00; Tue, 06 Jan 2004 17:00:27 -0200 Received: by oops.kerneljanitors.org (Postfix, from userid 500) id D7A851966D; Tue, 6 Jan 2004 17:03:58 -0200 (BRDT) Date: Tue, 6 Jan 2004 17:03:58 -0200 From: Arnaldo Carvalho de Melo To: "David S. Miller" Cc: erik@hensema.net, torvalds@osdl.org, netdev@oss.sgi.com Subject: Re: 2.6.0: something is leaking memory Message-ID: <20040106190358.GV28868@conectiva.com.br> References: <20040104204834.40b6ca51.davem@redhat.com> <20040106155933.GA3373@bender.home.hensema.net> <20040106095909.7243b2ce.davem@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20040106095909.7243b2ce.davem@redhat.com> X-Url: http://advogato.org/person/acme Organization: Conectiva S.A. User-Agent: Mutt/1.5.5.1i X-archive-position: 2244 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: acme@conectiva.com.br Precedence: bulk X-list: netdev Em Tue, Jan 06, 2004 at 09:59:09AM -0800, David S. Miller escreveu: > On Tue, 6 Jan 2004 16:59:34 +0100 > Erik Hensema wrote: > > > David: are you sure it was fixed in rc1? > > > > It doesn't seem to be in -rc2 either. > > > > This is after 26 hours uptime: > > > > tcp6_sock 6246 6248 1024 4 1 : tunables 54 27 0 > > : slabdata 1562 1562 0 > > Someone mentioned about a bug in the userland program you're > using that is openning these sockets? Something about leaving sockets > not closed. 
> > (Arnaldo, we aparently still have a TCP ipv6 socket leak...) I'll take a look at this. From uucp@coruscant.gnumonks.org Tue Jan 6 11:16:34 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 06 Jan 2004 11:16:49 -0800 (PST) Received: from coruscant.gnumonks.org (mail@coruscant.franken.de [193.174.159.226]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i06JGWTa019311 for ; Tue, 6 Jan 2004 11:16:34 -0800 Received: from uucp by coruscant.gnumonks.org with local-bsmtp (Exim 4.20) id 1Adwgo-0001Fb-TA for netdev@oss.sgi.com; Tue, 06 Jan 2004 20:16:30 +0100 Received: from laforge by obroa-skai.gnumonks.org with local (Exim 3.36 #1) id 1Adwdb-0000Ob-00; Tue, 06 Jan 2004 20:13:11 +0100 Date: Tue, 6 Jan 2004 20:13:11 +0100 From: Harald Welte To: James Morris Cc: netdev@oss.sgi.com, netfilter-devel@lists.netfilter.org, "David S. Miller" , Stephen Smalley Subject: Re: [RFC] IPv4 Netfilter hook priorities for SELinux Message-ID: <20040106191311.GH934@obroa-skai.de.gnumonks.org> Mail-Followup-To: Harald Welte , James Morris , netdev@oss.sgi.com, netfilter-devel@lists.netfilter.org, "David S. Miller" , Stephen Smalley References: Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="U3s59FfKcByyGl+j" Content-Disposition: inline In-Reply-To: X-Operating-System: Linux obroa-skai.de.gnumonks.org 2.6.0-test11 X-Date: Today is Sweetmorn, the 6th day of Chaos in the YOLD 3170 User-Agent: Mutt/1.5.4i X-archive-position: 2245 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: laforge@netfilter.org Precedence: bulk X-list: netdev --U3s59FfKcByyGl+j Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Tue, Jan 06, 2004 at 11:01:03AM -0500, James Morris wrote: =20 > Does anyone have any objections to the patch below (which I'd propose for= =20 > 2.6.2), or other comments? 
Thanks James, I am perfectly fine with your patch. Feel free to put them into netfilter_arp.h and netfilter_ipv6.h, too. > - James --=20 - Harald Welte http://www.netfilter.org/ =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D "Fragmentation is like classful addressing -- an interesting early architectural error that shows how much experimentation was going on while IP was being designed." -- Paul Vixie --U3s59FfKcByyGl+j Content-Type: application/pgp-signature; name="signature.asc" Content-Description: Digital signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.3 (GNU/Linux) iD8DBQE/+wjGXaXGVTD0i/8RAjFRAJ9Mub15xR+99zCM+eZgCi04/P3XCACeLCmI B9EEK83mrXSiQN6i549VcP4= =vGvz -----END PGP SIGNATURE----- --U3s59FfKcByyGl+j-- From erik@hensema.net Tue Jan 6 11:36:33 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 06 Jan 2004 11:36:48 -0800 (PST) Received: from scrat.hensema.net (scrat.hensema.net [62.212.82.150]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i06JaWTa020184 for ; Tue, 6 Jan 2004 11:36:33 -0800 Received: from dexter.hensema.net (cc78409-a.hnglo1.ov.home.nl [212.120.97.185]) by scrat.hensema.net (8.12.7/8.12.7/SuSE Linux 0.6) with ESMTP id i06JaGng021881; Tue, 6 Jan 2004 20:36:16 +0100 Received: from bender.home.hensema.net (root@bender.ipv6.hensema.net [IPv6:2001:888:10a1:0:202:44ff:fe69:60f5]) by dexter.hensema.net (8.12.7/8.12.7) with ESMTP id i06JaGB5016752; Tue, 6 Jan 2004 20:36:16 +0100 Received: from bender.home.hensema.net (erik@localhost [127.0.0.1]) by bender.home.hensema.net (8.12.7/8.12.7) with ESMTP id i06JaG72004558; Tue, 6 Jan 2004 20:36:16 +0100 Received: (from erik@localhost) by bender.home.hensema.net (8.12.7/8.12.7/Submit) id i06JaFWv004557; Tue, 6 Jan 2004 20:36:15 +0100 Date: Tue, 6 Jan 2004 20:36:15 +0100 From: 
Erik Hensema
To: "David S. Miller"
Cc: torvalds@osdl.org, netdev@oss.sgi.com, acme@conectiva.com.br
Subject: Re: 2.6.0: something is leaking memory
Message-ID: <20040106193615.GA4544@bender.home.hensema.net>
Reply-To: erik@hensema.net
References: <20040104204834.40b6ca51.davem@redhat.com> <20040106155933.GA3373@bender.home.hensema.net> <20040106095909.7243b2ce.davem@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20040106095909.7243b2ce.davem@redhat.com>
User-Agent: Mutt/1.4i
X-archive-position: 2246
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: erik@hensema.net
Precedence: bulk
X-list: netdev

On Tue, Jan 06, 2004 at 09:59:09AM -0800, David S. Miller wrote:
> On Tue, 6 Jan 2004 16:59:34 +0100
> Erik Hensema wrote:
>
> > David: are you sure it was fixed in rc1?
> >
> > It doesn't seem to be in -rc2 either.
> >
> > This is after 26 hours uptime:
> >
> > tcp6_sock 6246 6248 1024 4 1 : tunables 54 27 0 : slabdata 1562 1562 0
>
> Someone mentioned a bug in the userland program you're
> using that is opening these sockets? Something about leaving sockets
> not closed.

That's correct. I suspect that this exposes the leak in the kernel more
prominently than on other systems.

The userland leak is in nscd: it doesn't properly close sockets to my LDAP
server. I don't think that part is a kernel problem, because nscd also leaks
in 2.4.x. The kernel-space leak, however, shows up only in 2.6, not in 2.4.

I'm restarting nscd from cron every night, which makes the CLOSE_WAIT
sockets go away. However, the kernel resources do not seem to be freed.

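For reference, the slab line quoted above can be turned into a byte count. What follows is only a minimal sketch in C, assuming the slabinfo 2.x column order (name, active_objs, num_objs, objsize, objperslab, pagesperslab); the struct and function names are made up for illustration and are not part of any kernel or libc interface.

```c
#include <stdio.h>

/* Illustrative container for the first six columns of a slabinfo line
 * (assumed layout: name active_objs num_objs objsize objperslab pagesperslab). */
struct slab_stat {
	char name[32];
	unsigned long active_objs;
	unsigned long num_objs;
	unsigned long objsize;
	unsigned long objperslab;
	unsigned long pagesperslab;
};

/* Parse one slabinfo data line; returns 0 on success, -1 if the line
 * does not carry the six expected leading fields. */
int parse_slabinfo_line(const char *line, struct slab_stat *s)
{
	int n = sscanf(line, "%31s %lu %lu %lu %lu %lu",
		       s->name, &s->active_objs, &s->num_objs,
		       &s->objsize, &s->objperslab, &s->pagesperslab);
	return n == 6 ? 0 : -1;
}

/* Memory held by the cache's objects, in bytes (allocated objects times object size). */
unsigned long slab_bytes(const struct slab_stat *s)
{
	return s->num_objs * s->objsize;
}
```

For the tcp6_sock line this works out to 6248 objects of 1024 bytes, i.e. 6397952 bytes or roughly 6.1 MiB after 26 hours. Assuming 4 KiB pages, that agrees with the slabdata columns: 1562 slabs of one page each.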
-- Erik Hensema (erik@hensema.net) From scott.feldman@intel.com Tue Jan 6 11:43:56 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 06 Jan 2004 11:44:09 -0800 (PST) Received: from petasus.ch.intel.com (petasus.ch.intel.com [143.182.124.5]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i06JhsTa020748 for ; Tue, 6 Jan 2004 11:43:55 -0800 Received: from azsmsxvs040.ch.intel.com (azsmsxvs040.ch.intel.com [10.2.248.11]) by petasus.ch.intel.com (8.12.9-20030918-01/8.12.9/d: small-solo.mc,v 1.6 2003/12/18 18:58:11 root Exp $) with SMTP id i06JgfhD022754; Tue, 6 Jan 2004 19:43:43 GMT Received: from azsmsx331-2.ch.intel.com ([10.2.161.41]) by azsmsxvs040.ch.intel.com (SAVSMTP 3.1.2.35) with SMTP id M2004010612434326281 ; Tue, 06 Jan 2004 12:43:43 -0700 Received: from rrsmsx401.amr.corp.intel.com ([10.14.9.74]) by azsmsx331-2.ch.intel.com with Microsoft SMTPSVC(5.0.2195.5329); Tue, 6 Jan 2004 12:43:43 -0700 Received: from orsmsx312.amr.corp.intel.com ([192.168.65.62]) by rrsmsx401.amr.corp.intel.com with Microsoft SMTPSVC(5.0.2195.5329); Tue, 6 Jan 2004 12:43:41 -0700 Received: from orsmsx402.amr.corp.intel.com ([192.168.65.208]) by orsmsx312.amr.corp.intel.com with Microsoft SMTPSVC(5.0.2195.5329); Tue, 6 Jan 2004 11:43:39 -0800 Content-Class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" X-MimeOLE: Produced By Microsoft Exchange V6.0.6487.1 Subject: RE: Problem with Compaq NC3131 dual eth card Date: Tue, 6 Jan 2004 11:43:39 -0800 Message-ID: X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: Problem with Compaq NC3131 dual eth card Thread-Index: AcPUaFn0UwQhJlULT5e6luJhURXDwAAIPwwg From: "Feldman, Scott" To: "Chris Adams" , X-OriginalArrivalTime: 06 Jan 2004 19:43:39.0668 (UTC) FILETIME=[62081540:01C3D48D] Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id i06JhsTa020748 X-archive-position: 2247 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com 
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: scott.feldman@intel.com
Precedence: bulk
X-list: netdev

> eth1: OEM i82557/i82558 10/100 Ethernet, FF:FF:FF:FF:FF:FF, IRQ 11.
[snip]
> eth2: Invalid EEPROM checksum 0xffc0, check settings before
> activating this device!

Your EEPROM image is invalid - looks like it's all 0xFFs - verify with
ethtool -e ethX. The first three words of the EEPROM should be the MAC
address. e100 errors out because it wants a valid MAC address.

> Is my card defective or is this a compatibility problem? Is
> there a way to fix this?

Get another NIC or reprogram the EEPROM in this one.

Ok, so you want to reprogram the EEPROM? This isn't going to be pretty:

1) get a good image from a like-NC3131-nic using ethtool -e eth<x>;
2) modify e100 to skip the check for valid MAC address and valid
   checksum so the driver will load;
3) run ethtool -E eth<x> magic 4660 offset <y> value <z>, where x is
   your interface, y is the byte offset, and z is the byte value to write.
   You can do this by hand, or write a script to parse the output
   from ethtool -e;
4) revert e100 back to the original.

Told you it wasn't pretty.

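Step 3 above could be automated along the following lines. This is only a rough sketch in C, not a tested tool: it assumes that each data row of the ethtool -e dump looks like "0x0010: 00 a0 45 78 ..." (an offset followed by hex bytes; the exact layout may differ between ethtool versions), and the function names parse_dump_row() and format_write_cmd() are hypothetical helpers invented for this example.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Parse one row of an EEPROM hex dump (assumed format: "0xOFFS: b0 b1 ...").
 * Stores the row offset and up to max byte values; returns the number of
 * bytes found, or 0 if the line is not a data row (e.g. a header line). */
int parse_dump_row(const char *line, unsigned long *offset,
		   unsigned char *bytes, int max)
{
	char *end;
	int n = 0;

	while (isspace((unsigned char)*line))
		line++;
	if (strncmp(line, "0x", 2) != 0)
		return 0;
	*offset = strtoul(line, &end, 16);
	if (*end == ':')
		end++;
	while (n < max) {
		const char *p = end;
		unsigned long v;

		while (isspace((unsigned char)*p))
			p++;
		if (!isxdigit((unsigned char)*p))
			break;
		v = strtoul(p, &end, 16);
		if (v > 0xff)
			break;
		bytes[n++] = (unsigned char)v;
	}
	return n;
}

/* Format the single-byte write command from step 3 for one byte. */
int format_write_cmd(char *buf, size_t len, const char *ifname,
		     unsigned int magic, unsigned long offset,
		     unsigned int value)
{
	return snprintf(buf, len,
			"ethtool -E %s magic %u offset %lu value %u",
			ifname, magic, offset, value);
}
```

A caller would read the good card's dump line by line, call parse_dump_row() on each line, and print one format_write_cmd() result per byte (row offset plus byte index), substituting the new MAC address for the first six bytes. The checksum word would still have to be correct for the unmodified driver to accept the image, which this sketch does not handle.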
Maybe someone could write a little script to # ethtool -e eth0 | up_eeprom eth1 Enter unique MAC address: 00:A0:45:78:90:02 -scott From jmorris@redhat.com Tue Jan 6 12:38:36 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 06 Jan 2004 12:38:50 -0800 (PST) Received: from thoron.boston.redhat.com (nat-pool-bos.redhat.com [66.187.230.200]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i06KcZTa024982 for ; Tue, 6 Jan 2004 12:38:35 -0800 Received: from thoron.boston.redhat.com (localhost.localdomain [127.0.0.1]) by thoron.boston.redhat.com (8.12.8/8.12.8) with ESMTP id i06K5AiF008507; Tue, 6 Jan 2004 15:05:10 -0500 Received: from localhost (jmorris@localhost) by thoron.boston.redhat.com (8.12.8/8.12.8/Submit) with ESMTP id i06K5A4M008503; Tue, 6 Jan 2004 15:05:10 -0500 X-Authentication-Warning: thoron.boston.redhat.com: jmorris owned process doing -bs Date: Tue, 6 Jan 2004 15:05:10 -0500 (EST) From: James Morris X-X-Sender: jmorris@thoron.boston.redhat.com To: Harald Welte cc: netdev@oss.sgi.com, , "David S. Miller" , Stephen Smalley Subject: Re: [RFC] IPv4 Netfilter hook priorities for SELinux In-Reply-To: <20040106191311.GH934@obroa-skai.de.gnumonks.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 2248 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jmorris@redhat.com Precedence: bulk X-list: netdev On Tue, 6 Jan 2004, Harald Welte wrote: > On Tue, Jan 06, 2004 at 11:01:03AM -0500, James Morris wrote: > > > Does anyone have any objections to the patch below (which I'd propose for > > 2.6.2), or other comments? > > Thanks James, I am perfectly fine with your patch. Feel free to put > them into netfilter_arp.h and netfilter_ipv6.h, too. Ok, here is the patch with support for IPv4 and IPv6. I've not added anything for ARP yet as SELinux does not have any ARP controls at this stage (and probably won't in the near future). Please apply. 
- James -- James Morris diff -urN -X dontdiff linux-2.6.1-rc1-mm2.pending/include/linux/netfilter_ipv4.h linux-2.6.1-rc1-mm2.w1/include/linux/netfilter_ipv4.h --- linux-2.6.1-rc1-mm2.pending/include/linux/netfilter_ipv4.h 2003-09-27 20:50:51.000000000 -0400 +++ linux-2.6.1-rc1-mm2.w1/include/linux/netfilter_ipv4.h 2004-01-06 10:14:59.000000000 -0500 @@ -51,6 +51,7 @@ enum nf_ip_hook_priorities { NF_IP_PRI_FIRST = INT_MIN, + NF_IP_PRI_SELINUX_FIRST = -225, NF_IP_PRI_CONNTRACK = -200, NF_IP_PRI_BRIDGE_SABOTAGE_FORWARD = -175, NF_IP_PRI_MANGLE = -150, @@ -58,6 +59,7 @@ NF_IP_PRI_BRIDGE_SABOTAGE_LOCAL_OUT = -50, NF_IP_PRI_FILTER = 0, NF_IP_PRI_NAT_SRC = 100, + NF_IP_PRI_SELINUX_LAST = 225, NF_IP_PRI_LAST = INT_MAX, }; diff -urN -X dontdiff linux-2.6.1-rc1-mm2.pending/include/linux/netfilter_ipv6.h linux-2.6.1-rc1-mm2.w1/include/linux/netfilter_ipv6.h --- linux-2.6.1-rc1-mm2.pending/include/linux/netfilter_ipv6.h 2003-09-27 20:50:51.000000000 -0400 +++ linux-2.6.1-rc1-mm2.w1/include/linux/netfilter_ipv6.h 2004-01-06 14:41:30.000000000 -0500 @@ -56,11 +56,13 @@ enum nf_ip6_hook_priorities { NF_IP6_PRI_FIRST = INT_MIN, + NF_IP6_PRI_SELINUX_FIRST = -225, NF_IP6_PRI_CONNTRACK = -200, NF_IP6_PRI_MANGLE = -150, NF_IP6_PRI_NAT_DST = -100, NF_IP6_PRI_FILTER = 0, NF_IP6_PRI_NAT_SRC = 100, + NF_IP6_PRI_SELINUX_LAST = 225, NF_IP6_PRI_LAST = INT_MAX, }; From dlstevens@us.ibm.com Tue Jan 6 13:45:00 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 06 Jan 2004 13:45:19 -0800 (PST) Received: from e34.co.us.ibm.com (e34.co.us.ibm.com [32.97.110.132]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i06LirTa027190 for ; Tue, 6 Jan 2004 13:45:00 -0800 Received: from westrelay04.boulder.ibm.com (westrelay04.boulder.ibm.com [9.17.193.32]) by e34.co.us.ibm.com (8.12.10/8.12.2) with ESMTP id i06Lij6t356310; Tue, 6 Jan 2004 16:44:45 -0500 Received: from d03nm121.boulder.ibm.com (d03av02.boulder.ibm.com [9.17.193.82]) by westrelay04.boulder.ibm.com (8.12.10/NCO/VER6.6) with ESMTP id 
i06Lih9J154224; Tue, 6 Jan 2004 14:44:44 -0700 Importance: Normal Sensitivity: Subject: Re: MLD problems (again) To: Takashi Hibi Cc: netdev@oss.sgi.com X-Mailer: Lotus Notes Release 5.0.4a July 24, 2000 Message-ID: From: David Stevens Date: Tue, 6 Jan 2004 13:44:40 -0800 X-MIMETrack: Serialize by Router on D03NM121/03/M/IBM(Release 6.0.2CF2HF133 | November 14, 2003) at 01/06/2004 14:44:44 MIME-Version: 1.0 Content-type: multipart/alternative; Boundary="0__=07BBE480DFE791EC8f9e8a93df938690918c07BBE480DFE791EC" Content-Disposition: inline X-archive-position: 2249 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dlstevens@us.ibm.com Precedence: bulk X-list: netdev --0__=07BBE480DFE791EC8f9e8a93df938690918c07BBE480DFE791EC Content-type: text/plain; charset=US-ASCII Takashi, I am looking at this-- I just haven't been in the office for the last two weeks. I haven't looked at it in detail, but I believe there are problems with your patch. Offhand: 1) you added a check "if (skb) skb = addgrec(skb...)" "skb == NULL" is a perfectly valid argument for addgrec(), and there aren't any circumstances I know of where you'd want to only add the record when no others had been added yet (which appears to be what this code is doing). So, this looks completely wrong to me. Either you want the record or not and it doesn't depend on whether you have other records in the report or not. 2) your patch adds a set of crcount within ip_mc_add1_src(), which if necessary would mean the other code is completely broken (which it isn't). I think this is likely either redundant or it may break other assumptions about crcount handling. There are potential races I avoided in this code, so I'm wary of why you put that in there, but I'm pretty sure it's at best unnecessary. 3) it is not correct to always automatically join a group if it's listed in a setsockopt(). 
I think the result of your patch to fix your problem would also automatically join groups in cases where it should return an error. General comment-- the code is set up with the assumption that an ordinary join has a filter of "EXCLUDE, empty-set". That's because the protocol makes that assumption, and reports for includes of new groups are CHANGE_TO_INCLUDE. It looks like you're trying to bypass that and set the filter directly to INCLUDE, which I expect would result in incorrect reports for some cases. Again, I have only looked at the patch context in your mail, so some of the things I think are problems probably aren't. I suspect, though, that the original cause of your problem is simply a missing "mld_ifc_event" call, though I'm not sure. Those are also carefully constructed to try to avoid multiple reports for complex transitions, when only one report will do, so some care is needed. I will look at this further and try to have an alternative patch by the end of the week. +-DLS Takashi Hibi @oss.sgi.com on 01/06/2004 04:15:42 AM Sent by: netdev-bounce@oss.sgi.com To: netdev@oss.sgi.com cc: Subject: MLD problems (again) A Happy new year. Let me discuss about the problems of MLD again. As I pointed out, there are two problems in current MLD code. 1. In MLDv1 compatibility mode, Older Version Querier Present timer expires prematurely. 2. After join SSM using setsockopt(MCAST_JOIN_SOURCE_GROUP), MLDv2 listener report isn't issued immediately. The cause of 1 is the wrong computation of timeout value. The cause of 2 is difficult to find. After tracing the code, I figured out that the mode of ifmcaddr6 isn't correctly set after setsockopt. After join by setsockopt, pmc->mca_sfmode should be MCAST_INCLUDE, but it remains MCAST_EXCLUDE. Eventually is_in() call returns false, and MLDv2 packet isn't composed. I don't know the right way to fix it, since the code is too complicated by a lot of flags. At least the following patch(diff from 2.6.0) works for me. 
Regards, Takashi Hibi --- mcast.c 2003-12-18 11:59:28.000000000 +0900 +++ new/mcast.c 2004-01-06 16:30:05.546174486 +0900 @@ -1050,7 +1050,7 @@ int igmp6_event_query(struct sk_buff *sk /* Translate milliseconds to jiffies */ max_delay = (ntohs(hdr->icmp6_maxdelay)*HZ)/1000; - switchback = (idev->mc_qrv + 1) * max_delay; + switchback = MLD_QRV_DEFAULT * 125*HZ + max_delay; idev->mc_v1_seen = jiffies + switchback; /* cancel the interface change timer */ @@ -1541,7 +1541,8 @@ static void mld_send_cr(struct inet6_dev type = MLD2_CHANGE_TO_EXCLUDE; else type = MLD2_CHANGE_TO_INCLUDE; - skb = add_grec(skb, pmc, type, 0, 0); + if (!skb) + skb = add_grec(skb, pmc, type, 0, 0); } spin_unlock_bh(&pmc->mca_lock); } @@ -1745,6 +1746,7 @@ static int ip6_mc_add1_src(struct ifmcad return -ENOBUFS; memset(psf, 0, sizeof(*psf)); psf->sf_addr = *psfsrc; + psf->sf_crcount = pmc->idev->mc_qrv; if (psf_prev) { psf_prev->sf_next = psf; } else @@ -1799,6 +1801,7 @@ int ip6_mc_add_src(struct inet6_dev *ide struct ifmcaddr6 *pmc; int isexclude; int i, err; + int first_join_src_grp; if (!idev) return -ENODEV; @@ -1816,6 +1819,8 @@ int ip6_mc_add_src(struct inet6_dev *ide sf_markstate(pmc); isexclude = pmc->mca_sfmode == MCAST_EXCLUDE; + first_join_src_grp = (!pmc->sources && sfmode == MCAST_INCLUDE && + sfcount == 1 && delta); if (!delta) pmc->mca_sfcount[sfmode]++; err = 0; @@ -1827,10 +1832,19 @@ int ip6_mc_add_src(struct inet6_dev *ide if (err) { int j; - pmc->mca_sfcount[sfmode]--; + if (!delta) + pmc->mca_sfcount[sfmode]--; for (j=0; jmca_sfcount[MCAST_EXCLUDE] != 0)) { + goto done; + } + if (first_join_src_grp) { + pmc->mca_sfmode = MCAST_INCLUDE; + if (sf_setstate(pmc)) + mld_ifc_event(idev); + goto done; + } + if (isexclude != (pmc->mca_sfcount[MCAST_EXCLUDE] != 0)) { struct inet6_dev *idev = pmc->idev; struct ip6_sf_list *psf; @@ -1848,6 +1862,7 @@ int ip6_mc_add_src(struct inet6_dev *ide mld_ifc_event(idev); } else if (sf_setstate(pmc)) mld_ifc_event(idev); +done: 
spin_unlock_bh(&pmc->mca_lock); read_unlock_bh(&idev->lock); return err; --0__=07BBE480DFE791EC8f9e8a93df938690918c07BBE480DFE791EC Content-type: text/html; charset=US-ASCII Content-Disposition: inline

Takashi,
I am looking at this-- I just haven't been in the office for the last
two weeks. I haven't looked at it in detail, but I believe there are problems
with your patch.

Offhand:
1) you added a check "if (!skb) skb = add_grec(skb...)"
"skb == NULL" is a perfectly valid argument for add_grec(), and
there aren't any circumstances I know of where you'd want to only add
the record when no others had been added yet (which appears to be
what this code is doing). So, this looks completely wrong to me. Either
you want the record or not and it doesn't depend on whether you have
other records in the report or not.

2) your patch adds a set of crcount within ip6_mc_add1_src(), which if
necessary would mean the other code is completely broken (which it
isn't). I think this is likely either redundant or it may break other assumptions
about crcount handling. There are potential races I avoided in this code,
so I'm wary of why you put that in there, but I'm pretty sure it's at best
unnecessary.

3) it is not correct to always automatically join a group if it's listed in a
setsockopt(). I think the result of your patch to fix your problem would
also automatically join groups in cases where it should return an error.
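To illustrate point 1 above, here is a minimal sketch in C. The add_record() function is a hypothetical stand-in for add_grec(); the only property it shares with the real function is that a NULL buffer is a valid argument and gets allocated on first use. The "guarded" flag mimics a caller that only adds a record while the buffer is still NULL.

```c
#include <stdlib.h>

#define MAX_RECS 8

/* Toy report buffer; stands in for the skb that carries group records. */
struct report {
	int nrecs;
	int types[MAX_RECS];
};

/* Hypothetical stand-in for add_grec(): a NULL report is valid input and
 * is allocated on first use; the (possibly new) report is returned. */
struct report *add_record(struct report *r, int type)
{
	if (!r) {
		r = calloc(1, sizeof(*r));
		if (!r)
			return NULL;
	}
	if (r->nrecs < MAX_RECS)
		r->types[r->nrecs++] = type;
	return r;
}

/* Build a report from n record types. With guarded != 0 the caller only
 * adds a record while the report is still NULL, as in "if (!skb) ...". */
struct report *build_report(const int *types, int n, int guarded)
{
	struct report *r = NULL;
	int i;

	for (i = 0; i < n; i++) {
		if (guarded && r)
			continue;	/* a record already exists: skipped */
		r = add_record(r, types[i]);
	}
	return r;
}
```

With three pending records, the unguarded loop produces a report with all three, while the guarded loop keeps only the first one and silently drops the rest. That is the behaviour objected to in point 1.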

General comment-- the code is set up with the assumption that an
ordinary join has a filter of "EXCLUDE, empty-set". That's because
the protocol makes that assumption, and reports for includes of new
groups are CHANGE_TO_INCLUDE. It looks like you're trying to
bypass that and set the filter directly to INCLUDE, which I expect would
result in incorrect reports for some cases.
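The dependence of the report on the assumed starting state can be sketched as follows. This is an illustration in C, not the kernel's logic: the enum and function names are invented, and the function only encodes the general IGMPv3/MLDv2 rule that a filter mode change is reported with a CHANGE_TO_* record while a source list change within the same mode is reported with ALLOW_NEW_SOURCES or BLOCK_OLD_SOURCES.

```c
/* Illustrative names only; not the identifiers used in mcast.c. */
enum filter_mode { MODE_INCLUDE, MODE_EXCLUDE };

enum rec_type {
	REC_NONE,
	REC_CHANGE_TO_INCLUDE,
	REC_CHANGE_TO_EXCLUDE,
	REC_ALLOW_NEW_SOURCES,
	REC_BLOCK_OLD_SOURCES
};

/* Return the first state-change record implied by a transition.
 * newly_allowed / newly_blocked count the sources that become allowed or
 * blocked. A real report may need both an ALLOW and a BLOCK record; this
 * sketch returns only the first one. */
enum rec_type state_change_record(enum filter_mode old_mode,
				  enum filter_mode new_mode,
				  int newly_allowed, int newly_blocked)
{
	if (old_mode != new_mode)
		return new_mode == MODE_INCLUDE ? REC_CHANGE_TO_INCLUDE
						: REC_CHANGE_TO_EXCLUDE;
	if (newly_allowed)
		return REC_ALLOW_NEW_SOURCES;
	if (newly_blocked)
		return REC_BLOCK_OLD_SOURCES;
	return REC_NONE;
}
```

If the state before a source-specific join is taken to be "EXCLUDE, empty-set", the join is a mode change and yields CHANGE_TO_INCLUDE. If the prior state is modelled as "INCLUDE, empty-set" instead, the same join yields ALLOW_NEW_SOURCES. Setting the mode to INCLUDE behind the back of this logic would hide the mode change from it.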

Again, I have only looked at the patch context in your mail, so some of
the things I think are problems probably aren't. I suspect, though, that the
original cause of your problem is simply a missing "mld_ifc_event" call,
though I'm not sure. Those are also carefully constructed to try to avoid
multiple reports for complex transitions, when only one report will do, so
some care is needed. I will look at this further and try to have an alternative
patch by the end of the week.

+-DLS

Sent by: netdev-bounce@oss.sgi.com

To: netdev@oss.sgi.com
cc:
Subject: MLD problems (again)



Happy New Year.

Let me raise the MLD problems again.
As I pointed out, there are two problems in the current MLD code.

1. In MLDv1 compatibility mode, the Older Version Querier Present timer
expires prematurely.

2. After joining an SSM group using setsockopt(MCAST_JOIN_SOURCE_GROUP),
an MLDv2 listener report isn't issued immediately.


The cause of 1 is a wrong computation of the timeout value.
The cause of 2 was harder to find. After tracing the code,
I figured out that the filter mode of ifmcaddr6 isn't set correctly after
setsockopt.

After a join by setsockopt, pmc->mca_sfmode should be MCAST_INCLUDE,
but it remains MCAST_EXCLUDE. As a result the is_in() call returns false,
and no MLDv2 packet is composed.
I don't know the right way to fix it, since the many flags make the code
hard to follow.
At least the following patch (diff against 2.6.0) works for me.

Regards,
Takashi Hibi

--- mcast.c 2003-12-18 11:59:28.000000000 +0900
+++ new/mcast.c 2004-01-06 16:30:05.546174486 +0900
@@ -1050,7 +1050,7 @@ int igmp6_event_query(struct sk_buff *sk

 		/* Translate milliseconds to jiffies */
 		max_delay = (ntohs(hdr->icmp6_maxdelay)*HZ)/1000;

-		switchback = (idev->mc_qrv + 1) * max_delay;
+		switchback = MLD_QRV_DEFAULT * 125*HZ + max_delay;
 		idev->mc_v1_seen = jiffies + switchback;
 		/* cancel the interface change timer */
@@ -1541,7 +1541,8 @@ static void mld_send_cr(struct inet6_dev
 				type = MLD2_CHANGE_TO_EXCLUDE;
 			else
 				type = MLD2_CHANGE_TO_INCLUDE;
-			skb = add_grec(skb, pmc, type, 0, 0);
+			if (!skb)
+				skb = add_grec(skb, pmc, type, 0, 0);
 		}
 		spin_unlock_bh(&pmc->mca_lock);
 	}
@@ -1745,6 +1746,7 @@ static int ip6_mc_add1_src(struct ifmcad
 			return -ENOBUFS;
 		memset(psf, 0, sizeof(*psf));
 		psf->sf_addr = *psfsrc;
+		psf->sf_crcount = pmc->idev->mc_qrv;
 		if (psf_prev) {
 			psf_prev->sf_next = psf;
 		} else
@@ -1799,6 +1801,7 @@ int ip6_mc_add_src(struct inet6_dev *ide
 	struct ifmcaddr6 *pmc;
 	int isexclude;
 	int i, err;

+	int	first_join_src_grp;
 	if (!idev)
 		return -ENODEV;
@@ -1816,6 +1819,8 @@ int ip6_mc_add_src(struct inet6_dev *ide
 	sf_markstate(pmc);
 	isexclude = pmc->mca_sfmode == MCAST_EXCLUDE;
+	first_join_src_grp = (!pmc->sources && sfmode == MCAST_INCLUDE &&
+		sfcount == 1 && delta);
 	if (!delta)
 		pmc->mca_sfcount[sfmode]++;
 	err = 0;
@@ -1827,10 +1832,19 @@ int ip6_mc_add_src(struct inet6_dev *ide
 	if (err) {
 		int j;

-		pmc->mca_sfcount[sfmode]--;
+		if (!delta)
+			pmc->mca_sfcount[sfmode]--;
 		for (j=0; j<i; j++)
 			(void) ip6_mc_del1_src(pmc, sfmode, &psfsrc[i]);
-	} else if (isexclude != (pmc->mca_sfcount[MCAST_EXCLUDE] != 0)) {
+		goto done;
+	}
+	if (first_join_src_grp) {
+		pmc->mca_sfmode = MCAST_INCLUDE;
+		if (sf_setstate(pmc))
+			mld_ifc_event(idev);
+		goto done;
+	}
+	if (isexclude != (pmc->mca_sfcount[MCAST_EXCLUDE] != 0)) {
 		struct inet6_dev *idev = pmc->idev;
 		struct ip6_sf_list *psf;

@@ -1848,6 +1862,7 @@ int ip6_mc_add_src(struct inet6_dev *ide
 		mld_ifc_event(idev);
 	} else if (sf_setstate(pmc))
 		mld_ifc_event(idev);
+done:
 	spin_unlock_bh(&pmc->mca_lock);
 	read_unlock_bh(&idev->lock);
 	return err;
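The effect of the first hunk (the switchback change in igmp6_event_query) can be checked with a little arithmetic. The sketch below is plain user-space C, not kernel code, and rests on stated assumptions: HZ is taken as 1000, MLD_QRV_DEFAULT as 2, and the MLDv1 query is assumed to carry the default maximum response delay of 10000 ms. The function names are invented for the example. The constant 125 in the patch corresponds to the default Query Interval of 125 seconds, so the new expression follows the MLDv2 definition of the Older Version Querier Present Timeout (Robustness Variable times Query Interval, plus one Query Response Interval).

```c
#define HZ 1000UL		/* assumption: timer ticks per second */
#define MLD_QRV_DEFAULT 2UL	/* assumption: default robustness variable */

/* Same conversion as in the patch context: milliseconds to jiffies. */
unsigned long ms_to_jiffies(unsigned long ms)
{
	return ms * HZ / 1000;
}

/* Timeout as computed before the patch: (qrv + 1) * max_delay. */
unsigned long switchback_before(unsigned long qrv, unsigned long max_delay)
{
	return (qrv + 1) * max_delay;
}

/* Timeout as computed by the patch: QRV * 125 s + max_delay. */
unsigned long switchback_after(unsigned long max_delay)
{
	return MLD_QRV_DEFAULT * 125 * HZ + max_delay;
}
```

Under these assumptions max_delay is 10000 jiffies, so the old expression gives 30000 jiffies (30 seconds) while the patched one gives 260000 jiffies (260 seconds). A 30 second timeout is shorter than the 125 second gap between general queries, which would explain the timer expiring prematurely.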




--0__=07BBE480DFE791EC8f9e8a93df938690918c07BBE480DFE791EC-- From romieu@fr.zoreil.com Tue Jan 6 15:28:38 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 06 Jan 2004 15:28:51 -0800 (PST) Received: from fr.zoreil.com (electric-eye.fr.zoreil.com [213.41.134.224]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i06NSaTa032053 for ; Tue, 6 Jan 2004 15:28:37 -0800 Received: from electric-eye.fr.zoreil.com (localhost.localdomain [127.0.0.1]) by fr.zoreil.com (8.12.8/8.12.1) with ESMTP id i06NRlsW007905; Wed, 7 Jan 2004 00:27:47 +0100 Received: (from romieu@localhost) by electric-eye.fr.zoreil.com (8.12.8/8.12.1) id i06NRkwp007904; Wed, 7 Jan 2004 00:27:46 +0100 Date: Wed, 7 Jan 2004 00:27:46 +0100 From: Francois Romieu To: akpm@osdl.org Cc: Jeff Garzik , netdev@oss.sgi.com, Brad House Subject: [patch] 2.6.1-rc1-mm1 - erroneous __devinitdata in the r8169 driver Message-ID: <20040107002746.A7314@electric-eye.fr.zoreil.com> References: <20031122183001.GA16993@gtf.org> <20031124000939.A456@electric-eye.fr.zoreil.com> <20031126004550.A25408@electric-eye.fr.zoreil.com> <20031127235143.A16767@electric-eye.fr.zoreil.com> <20031130014738.A2589@electric-eye.fr.zoreil.com> <3FF846C3.5070207@mainstreetsoftworks.com> <20040104233849.A3214@electric-eye.fr.zoreil.com> <20040105231741.A19514@electric-eye.fr.zoreil.com> <3FFA6981.7030006@pobox.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="3V7upXqbjpZ4EhLz" Content-Disposition: inline User-Agent: Mutt/1.2.5.1i In-Reply-To: <3FFA6981.7030006@pobox.com>; from jgarzik@pobox.com on Tue, Jan 06, 2004 at 02:53:37AM -0500 X-Organisation: Land of Sunshine Inc. X-archive-position: 2250 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: romieu@fr.zoreil.com Precedence: bulk X-list: netdev --3V7upXqbjpZ4EhLz Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Hi, more silly bug in the r8169 driver. 
Brad, I would really welcome you reporting that you can send/receive a few packets (more than 64 ?) before the driver panics horribly in rtl8169_rx_interrupt. I will not be reachable from tomorrow until late friday. -- Ueimor --3V7upXqbjpZ4EhLz Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="r8169-buggy-devinitdata.patch" Do not mark __devinitdata a data which is required when network device opens. drivers/net/r8169.c | 2 +- 1 files changed, 1 insertion(+), 1 deletion(-) diff -puN drivers/net/r8169.c~r8169-buggy-devinitdata drivers/net/r8169.c --- linux-2.6.1-rc1-mm1/drivers/net/r8169.c~r8169-buggy-devinitdata 2004-01-07 00:01:50.000000000 +0100 +++ linux-2.6.1-rc1-mm1-romieu/drivers/net/r8169.c 2004-01-07 00:03:47.000000000 +0100 @@ -129,7 +129,7 @@ const static struct { const char *name; u8 mac_version; u32 RxConfigMask; /* Clears the bits supported by this chip */ -} rtl_chip_info[] __devinitdata = { +} rtl_chip_info[] = { _R("RTL8169", RTL_GIGA_MAC_VER_B, 0xff7e1880), _R("RTL8169s/8110s", RTL_GIGA_MAC_VER_D, 0xff7e1880), _R("RTL8169s/8110s", RTL_GIGA_MAC_VER_E, 0xff7e1880) _ --3V7upXqbjpZ4EhLz-- From hibi665@oki.com Tue Jan 6 17:52:11 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 06 Jan 2004 17:52:32 -0800 (PST) Received: from iscan1.intra.oki.co.jp (okigate.oki.co.jp [202.226.91.194]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i071qATa005492 for ; Tue, 6 Jan 2004 17:52:11 -0800 Received: from aoi.okilab.oki.co.jp (localhost.localdomain [127.0.0.1]) by iscan1.intra.oki.co.jp (8.9.3/8.9.3) with SMTP id KAA19104 for ; Wed, 7 Jan 2004 10:52:05 +0900 Received: (qmail 16216 invoked from network); 7 Jan 2004 10:52:04 +0900 Received: from dhcp23233.okilab.oki.co.jp (HELO kiso) (172.24.23.233) by aoi.okilab.oki.co.jp with SMTP; 7 Jan 2004 10:52:04 +0900 Message-Id: <20040107105347.52d6dc1d%hibi665@oki.com> MIME-Version: 1.0 Date: Wed, 07 Jan 2004 10:53:47 +0900 X-Mailer: Denshin 8 Go V32.1.4.3 X-My-Real-Login-Name: 
thibi; aoi
From: Takashi Hibi
To: David Stevens
Cc: netdev@oss.sgi.com
In-Reply-To: (Your message of "Tue, 6 Jan 2004 13:44:40 -0800")
References:
Subject: Re: MLD problems (again)
X-archive-position: 2251
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: hibi665@oki.com
Precedence: bulk
X-list: netdev

Stevens,

>
> Takashi,
> I am looking at this-- I just haven't been in the office for the last
> two weeks. I haven't looked at it in detail, but I believe there are problems
> with your patch.
>
> Offhand:
> 1) you added a check "if (skb) skb = addgrec(skb...)"
> "skb == NULL" is a perfectly valid argument for addgrec(), and
> there aren't any circumstances I know of where you'd want to only add
> the record when no others had been added yet (which appears to be
> what this code is doing). So, this looks completely wrong to me. Either
> you want the record or not and it doesn't depend on whether you have
> other records in the report or not.
>

OK, I missed that there may be other records.

> 2) your patch adds a set of crcount within ip_mc_add1_src(), which if
> necessary would mean the other code is completely broken (which it
> isn't). I think this is likely either redundant or it may break other
> assumptions about crcount handling. There are potential races I avoided
> in this code, so I'm wary of why you put that in there, but I'm pretty
> sure it's at best unnecessary.
>

It may be unnecessary, I think.

> 3) it is not correct to always automatically join a group if it's listed in
> a setsockopt(). I think the result of your patch to fix your problem would
> also automatically join groups in cases where it should return an error.
>

I don't know what kind of error can occur.
I think the error check is done in other parts.

> General comment-- the code is set up with the assumption that an
> ordinary join has a filter of "EXCLUDE, empty-set". That's because
> the protocol makes that assumption, and reports for includes of new
> groups are CHANGE_TO_INCLUDE. It looks like you're trying to
> bypass that and set the filter directly to INCLUDE, which I expect would
> result in incorrect reports for some cases.

In some cases, both CHANGE_TO_INCLUDE and ALLOW_NEW_SOURCES can be included.
It is redundant. (That is why I added the code mentioned in 1).)
When joining SSM (without a previous join), is it OK to use CHANGE_TO_INCLUDE?
ALLOW_NEW_SOURCES seems the ordinary choice (and the FreeBSD implementation does so).
I couldn't judge from the MLDv2 draft which should be used (or whether both are OK).

> Again, I have only looked at the patch context in your mail, so some of
> the things I think are problems probably aren't. I suspect, though, that
> the original cause of your problem is simply a missing "mld_ifc_event" call,
> though I'm not sure. Those are also carefully constructed to try to avoid
> multiple reports for complex transitions, when only one report will do, so
> some care is needed. I will look at this further and try to have an
> alternative patch by the end of the week.

No, mld_ifc_event() is called when the problem occurs.
But because of the wrong mca_sfmode value, mld_send_cr() doesn't
send an MLD listener report.
In the following code in mld_send_cr(), every add_grec() call returns NULL.
Regards, Takashi Hibi /* change recs */ for (pmc=idev->mc_list; pmc; pmc=pmc->next) { spin_lock_bh(&pmc->mca_lock); if (pmc->mca_sfcount[MCAST_EXCLUDE]) { type = MLD2_BLOCK_OLD_SOURCES; dtype = MLD2_ALLOW_NEW_SOURCES; } else { type = MLD2_ALLOW_NEW_SOURCES; dtype = MLD2_BLOCK_OLD_SOURCES; } skb = add_grec(skb, pmc, type, 0, 0); skb = add_grec(skb, pmc, dtype, 0, 1); /* deleted sources */ /* filter mode changes */ if (pmc->mca_crcount) { pmc->mca_crcount--; if (pmc->mca_sfmode == MCAST_EXCLUDE) type = MLD2_CHANGE_TO_EXCLUDE; else type = MLD2_CHANGE_TO_INCLUDE; skb = add_grec(skb, pmc, type, 0, 0); } spin_unlock_bh(&pmc->mca_lock); } > > +-DLS > > > Takashi Hibi @oss.sgi.com on 01/06/2004 04:15:42 AM > > Sent by: netdev-bounce@oss.sgi.com > > > To: netdev@oss.sgi.com > cc: > Subject: MLD problems (again) > > > > A Happy new year. > > Let me discuss about the problems of MLD again. > As I pointed out, there are two problems in current MLD code. > > 1. In MLDv1 compatibility mode, Older Version Querier Present timer expires > prematurely. > 2. After join SSM using setsockopt(MCAST_JOIN_SOURCE_GROUP), > MLDv2 listener report isn't issued immediately. > > The cause of 1 is the wrong computation of timeout value. > The cause of 2 is difficult to find. After tracing the code, > I figured out that the mode of ifmcaddr6 isn't correctly set after > setsockopt. > > After join by setsockopt, pmc->mca_sfmode should be MCAST_INCLUDE, > but it remains MCAST_EXCLUDE. Eventually is_in() call returns false, > and MLDv2 packet isn't composed. > I don't know the right way to fix it, since the code is too complicated > by a lot of flags. > At least the following patch(diff from 2.6.0) works for me. 
> > Regards, > Takashi Hibi > > --- mcast.c 2003-12-18 11:59:28.000000000 +0900 > +++ new/mcast.c 2004-01-06 16:30:05.546174486 +0900 > @@ -1050,7 +1050,7 @@ int igmp6_event_query(struct sk_buff *sk > /* Translate milliseconds to jiffies */ > max_delay = (ntohs(hdr->icmp6_maxdelay)*HZ)/1000; > > - switchback = (idev->mc_qrv + 1) * max_delay; > + switchback = MLD_QRV_DEFAULT * 125*HZ + max_delay; > idev->mc_v1_seen = jiffies + switchback; > > /* cancel the interface change timer */ > @@ -1541,7 +1541,8 @@ static void mld_send_cr(struct inet6_dev > type = MLD2_CHANGE_TO_EXCLUDE; > else > type = MLD2_CHANGE_TO_INCLUDE; > - skb = add_grec(skb, pmc, type, 0, 0); > + if (!skb) > + skb = add_grec(skb, pmc, type, 0, 0); > } > spin_unlock_bh(&pmc->mca_lock); > } > @@ -1745,6 +1746,7 @@ static int ip6_mc_add1_src(struct ifmcad > return -ENOBUFS; > memset(psf, 0, sizeof(*psf)); > psf->sf_addr = *psfsrc; > + psf->sf_crcount = pmc->idev->mc_qrv; > if (psf_prev) { > psf_prev->sf_next = psf; > } else > @@ -1799,6 +1801,7 @@ int ip6_mc_add_src(struct inet6_dev *ide > struct ifmcaddr6 *pmc; > int isexclude; > int i, err; > + int first_join_src_grp; > > if (!idev) > return -ENODEV; > @@ -1816,6 +1819,8 @@ int ip6_mc_add_src(struct inet6_dev *ide > > sf_markstate(pmc); > isexclude = pmc->mca_sfmode == MCAST_EXCLUDE; > + first_join_src_grp = (!pmc->sources && sfmode == MCAST_INCLUDE && > + sfcount == 1 && delta); > if (!delta) > pmc->mca_sfcount[sfmode]++; > err = 0; > @@ -1827,10 +1832,19 @@ int ip6_mc_add_src(struct inet6_dev *ide > if (err) { > int j; > > - pmc->mca_sfcount[sfmode]--; > + if (!delta) > + pmc->mca_sfcount[sfmode]--; > for (j=0; j (void) ip6_mc_del1_src(pmc, sfmode, &psfsrc[i]); > - } else if (isexclude != (pmc->mca_sfcount[MCAST_EXCLUDE] != 0)) { > + goto done; > + } > + if (first_join_src_grp) { > + pmc->mca_sfmode = MCAST_INCLUDE; > + if (sf_setstate(pmc)) > + mld_ifc_event(idev); > + goto done; > + } > + if (isexclude != (pmc->mca_sfcount[MCAST_EXCLUDE] != 0)) { 
> struct inet6_dev *idev = pmc->idev; > struct ip6_sf_list *psf; > > @@ -1848,6 +1862,7 @@ int ip6_mc_add_src(struct inet6_dev *ide > mld_ifc_event(idev); > } else if (sf_setstate(pmc)) > mld_ifc_event(idev); > +done: > spin_unlock_bh(&pmc->mca_lock); > read_unlock_bh(&idev->lock); > return err; > > From dlstevens@us.ibm.com Tue Jan 6 19:47:32 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 06 Jan 2004 19:47:45 -0800 (PST) Received: from e31.co.us.ibm.com (e31.co.us.ibm.com [32.97.110.129]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i073lPTa006891 for ; Tue, 6 Jan 2004 19:47:31 -0800 Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.17.195.11]) by e31.co.us.ibm.com (8.12.10/8.12.2) with ESMTP id i073lD19485168; Tue, 6 Jan 2004 22:47:13 -0500 Received: from d03nm121.boulder.ibm.com (d03av02.boulder.ibm.com [9.17.193.82]) by westrelay02.boulder.ibm.com (8.12.10/NCO/VER6.6) with ESMTP id i073lD67063518; Tue, 6 Jan 2004 20:47:13 -0700 Importance: Normal Sensitivity: Subject: Re: MLD problems (again) To: Takashi Hibi Cc: netdev@oss.sgi.com X-Mailer: Lotus Notes Release 5.0.4a July 24, 2000 Message-ID: From: David Stevens Date: Tue, 6 Jan 2004 19:47:09 -0800 X-MIMETrack: Serialize by Router on D03NM121/03/M/IBM(Release 6.0.2CF2HF133 | November 14, 2003) at 01/06/2004 20:47:13 MIME-Version: 1.0 Content-type: multipart/alternative; Boundary="0__=07BBE487DF81945D8f9e8a93df938690918c07BBE487DF81945D" Content-Disposition: inline X-archive-position: 2252 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dlstevens@us.ibm.com Precedence: bulk X-list: netdev --0__=07BBE487DF81945D8f9e8a93df938690918c07BBE487DF81945D Content-type: text/plain; charset=US-ASCII >In some cases, both CHANGE_TO_INCLUDE and ALLOW_NEW_SOURCES can be included. >It is redundant. 
(That is why I added the code mentioned in 1) )
>When to join SSM (without previous join), is it OK to use CHANGE_TO_INCLUDE?
>ALLOW_NEW_SOURCES seems ordinary (and Implementation of FreeBSD is so).
>I couldn't judge which should be used (or both OK) from MLDv2 draft.

We ran into something similar during testing. The problem is, I didn't
have the option of rewriting all the existing multicast code. So some
cases in the full-state interface that ought to be atomic are really a
combination of a join and a filter mode change. The reason is because
I wanted to use the existing multicast join/leave code, and leave all
the existing join/leave callers alone. That can result in an
extra change record, since it's indistinguishable from the two-step series
of joining a group and changing the filter afterwards. But (at least
with the tests we did), the report was still correct-- just not optimal.

The easy solution was to write a duplicate version of mc_join_group and
mc_leave_group, with some minor changes to them, for use by the source
filter callers and use the nearly duplicated versions for the non-source
filter callers. I chose not to do that. There is a 1-tick delay before
generating the change reports and making that longer might have the
desired effect, if all the non-atomic changes complete before the report
is sent-- I didn't try that, but it's still not guaranteed to be
atomic in the change reports.

Alternatively, the source filters (if any) could've been added as an
argument for the join and leave functions, which would've meant changing
all of the existing calls, too. I didn't like either of those approaches,
so I treated a full-state filter with auto-join as the series of join and
filter change.

I think the approach you're taking may mess up other cases, though, and
since the report isn't incorrect, just not optimal, I planned to look at
this later to see if I can think of a way around it.

>No, mld_ifc_event() is called when the problem occurrs.
>But because of the wrong mca_sfmode value, mld_send_cr() doen't >send MLD listner report. If the new filter has different sources, and it should, then it still should result in a change report. But again, I need to look at it in more detail before suggesting a fix. When I got your initial report, I wrote a test program and reproduced the problem, and I did some initial looking at the code but had not yet found what's going on; I'll get back to it this week. +-DLS --0__=07BBE487DF81945D8f9e8a93df938690918c07BBE487DF81945D Content-type: text/html; charset=US-ASCII Content-Disposition: inline
--0__=07BBE487DF81945D8f9e8a93df938690918c07BBE487DF81945D-- From cmadams@hiwaay.net Tue Jan 6 20:18:48 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 06 Jan 2004 20:19:00 -0800 (PST) Received: from mail.hiwaay.net (bee.hiwaay.net [216.180.54.11]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i074IlTa007605 for ; Tue, 6 Jan 2004 20:18:47 -0800 Received: from bee.hiwaay.net (localhost [127.0.0.1]) by mail.hiwaay.net (8.12.10/8.12.10) with ESMTP id i074Iku51465550 for ; Tue, 6 Jan 2004 22:18:46 -0600 (CST) Received: (from cmadams@localhost) by bee.hiwaay.net (8.12.10/8.12.10/DefSubmit) id i074Ikgh1465627 for netdev@oss.sgi.com; Tue, 6 Jan 2004 22:18:46 -0600 (CST) Date: Tue, 6 Jan 2004 22:18:46 -0600 From: Chris Adams To: netdev@oss.sgi.com Subject: Re: Problem with Compaq NC3131 dual eth card Message-ID: <20040107041845.GA1446903@hiwaay.net> References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4i X-archive-position: 2253 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: cmadams@hiwaay.net Precedence: bulk X-list: netdev Once upon a time, Feldman, Scott said: > You're EEPROM image is invalid - looks like it's all 0xFFs - verify with > ethtool -e ethX. The first three words of the EEPROM should be the MAC > address. e100 errors out because it wants a valid MAC address. Yep, all 1s. > > Is my card defective or is this a compatibility problem? Is > > there a way to fix this? > > Get another nic or reprogram the EEPROM in this one. I looked at some other Intel NICs here and kind of winged it; I reprogrammed both EEPROMs, made up a couple of MACs, and it is working now. Thanks. Not bad for a $20 card picked up at a general surplus store here. It was still sealed in the box (Compaq labels on both box flaps, sealed static bag, etc.), so it must have come that way from Compaq. 
Maybe they had a bad batch and surplused them instead of fixing them? Now I might have to go back and pick up some more! :-) -- Chris Adams Systems and Network Administrator - HiWAAY Internet Services I don't speak for anybody but myself - that's enough trouble. From davem@pizda.ninka.net Tue Jan 6 21:44:41 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 06 Jan 2004 21:45:03 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i075ifTa011740 for ; Tue, 6 Jan 2004 21:44:41 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id VAA04099; Tue, 6 Jan 2004 21:36:42 -0800 Date: Tue, 6 Jan 2004 21:36:42 -0800 From: "David S. Miller" To: James Morris Cc: laforge@netfilter.org, netdev@oss.sgi.com, netfilter-devel@lists.netfilter.org, sds@epoch.ncsc.mil Subject: Re: [RFC] IPv4 Netfilter hook priorities for SELinux Message-Id: <20040106213642.3b30f4bc.davem@redhat.com> In-Reply-To: References: <20040106191311.GH934@obroa-skai.de.gnumonks.org> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2254 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Tue, 6 Jan 2004 15:05:10 -0500 (EST) James Morris wrote: > Ok, here is the patch with support for IPv4 and IPv6. I've not added > anything for ARP yet as SELinux does not have any ARP controls at this > stage (and probably won't in the near future). > > Please apply. Applied, thanks guys. 
From ak@suse.de Wed Jan 7 01:45:24 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 07 Jan 2004 01:45:37 -0800 (PST) Received: from Cantor.suse.de (ns.suse.de [195.135.220.2]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i079jDTa019901 for ; Wed, 7 Jan 2004 01:45:14 -0800 Received: from Hermes.suse.de (Hermes.suse.de [195.135.221.8]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by Cantor.suse.de (Postfix) with ESMTP id A3A0F19A211D; Wed, 7 Jan 2004 10:07:46 +0100 (CET) Date: Wed, 7 Jan 2004 10:07:43 +0100 From: Andi Kleen To: davem@redhat.com, netdev@oss.sgi.com Subject: [PATCH] Add 32bit emulation for cmsg SO_TIMESTAMP Message-Id: <20040107100743.1c0b18c2.ak@suse.de> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.10; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2255 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ak@suse.de Precedence: bulk X-list: netdev Some traceroute versions use the SO_TIMESTAMP cmsg for time measurement. This fixes them when running as 32bit on a 64bit x86-64 kernel. -Andi --- linux-2.6.1rc2-amd64/net/compat.c-o 2003-11-23 19:46:36.000000000 -0800 +++ linux-2.6.1rc2-amd64/net/compat.c 2004-01-07 00:21:43.741256711 -0800 @@ -215,15 +215,25 @@ int put_cmsg_compat(struct msghdr *kmsg, int level, int type, int len, void *data) { + struct compat_timeval ctv; struct compat_cmsghdr *cm = (struct compat_cmsghdr *) kmsg->msg_control; struct compat_cmsghdr cmhdr; - int cmlen = CMSG_COMPAT_LEN(len); + int cmlen; if(cm == NULL || kmsg->msg_controllen < sizeof(*cm)) { kmsg->msg_flags |= MSG_CTRUNC; return 0; /* XXX: return error? check spec. 
*/ } + if (level == SOL_SOCKET && type == SO_TIMESTAMP) { + struct timeval *tv = (struct timeval *)data; + ctv.tv_sec = tv->tv_sec; + ctv.tv_usec = tv->tv_usec; + data = &ctv; + len = sizeof(struct compat_timeval); + } + + cmlen = CMSG_COMPAT_LEN(len); if(kmsg->msg_controllen < cmlen) { kmsg->msg_flags |= MSG_CTRUNC; cmlen = kmsg->msg_controllen; From vnuorval@tcs.hut.fi Wed Jan 7 01:51:44 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 07 Jan 2004 01:51:57 -0800 (PST) Received: from neon.tcs.hut.fi (neon.tcs.hut.fi [130.233.215.20]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i079pXTa020436 for ; Wed, 7 Jan 2004 01:51:34 -0800 Received: from rhea.tcs.hut.fi (rhea.tcs.hut.fi [130.233.215.147]) by neon.tcs.hut.fi (Postfix) with ESMTP id 8DA3A8002F6; Wed, 7 Jan 2004 11:22:59 +0200 (EET) Received: from rhea.tcs.hut.fi (localhost [127.0.0.1]) by rhea.tcs.hut.fi (8.12.3/8.12.3/Debian-6.6) with ESMTP id i079Mx7Y027967; Wed, 7 Jan 2004 11:22:59 +0200 Received: from localhost (vnuorval@localhost) by rhea.tcs.hut.fi (8.12.3/8.12.3/Debian-6.6) with ESMTP id i079Mvf4027963; Wed, 7 Jan 2004 11:22:57 +0200 Date: Wed, 7 Jan 2004 11:22:57 +0200 (EET) From: Ville Nuorvala To: davem@redhat.com Cc: yoshfuji@linux-ipv6.org, pekkas@netcore.fi, netdev@oss.sgi.com Subject: [PATCH][RESEND] IPv6: Autoconfig link-local address on ip6-ip6 tunnel device Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 2256 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: vnuorval@tcs.hut.fi Precedence: bulk X-list: netdev Hi Dave, This patch, made against cset 1.1492, adds a link-local address to a ip6-ip6 tunnel interface when it is brougth up. It also changes the router solicitation behavior slightly by checking that rtr_solicits is > 0, before sending a router solicitation after a successful DAD probe on the link-local address. 
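The router-solicitation change described above amounts to one extra condition in the test made after a successful DAD probe. A minimal user-space sketch, assuming hypothetical names (`struct rs_cfg`, `should_send_rs`) that only stand in for the kernel fields the patch consults and are not part of the patch itself:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-interface settings the patch checks. */
struct rs_cfg {
    int forwarding;    /* non-zero when the interface acts as a router */
    int rtr_solicits;  /* configured number of router solicitations */
    bool is_loopback;  /* mirrors the IFF_LOOPBACK flag test */
    bool is_linklocal; /* the address that finished DAD is link-local */
};

/* Returns true when a router solicitation should follow a successful DAD. */
static bool should_send_rs(const struct rs_cfg *c)
{
    return c->forwarding == 0 &&
           c->rtr_solicits > 0 &&   /* the condition added by the patch */
           !c->is_loopback &&
           c->is_linklocal;
}
```

With `rtr_solicits` set to 0 the function returns false even for a non-forwarding host with a link-local address, which is the behavior change the description mentions.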
2.6.0 has been released, so would you apply this patch now? :) Thanks, Ville diff -Nru a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c --- a/net/ipv6/addrconf.c Wed Jan 7 11:07:42 2004 +++ b/net/ipv6/addrconf.c Wed Jan 7 11:07:42 2004 @@ -1809,6 +1809,54 @@ sit_route_add(dev); } +static inline int +ipv6_inherit_linklocal(struct inet6_dev *idev, struct net_device *link_dev) +{ + struct in6_addr lladdr; + + if (!ipv6_get_lladdr(link_dev, &lladdr)) { + addrconf_add_linklocal(idev, &lladdr); + return 0; + } + return -1; +} + +static void ip6_tnl_add_linklocal(struct inet6_dev *idev) +{ + struct net_device *link_dev; + + /* first try to inherit the link-local address from the link device */ + if (idev->dev->iflink && + (link_dev = __dev_get_by_index(idev->dev->iflink))) { + if (!ipv6_inherit_linklocal(idev, link_dev)) + return; + } + /* then try to inherit it from any device */ + for (link_dev = dev_base; link_dev; link_dev = link_dev->next) { + if (!ipv6_inherit_linklocal(idev, link_dev)) + return; + } + printk(KERN_DEBUG "init ip6-ip6: add_linklocal failed\n"); +} + +/* + * Autoconfigure tunnel with a link-local address so routing protocols, + * DHCPv6, MLD etc. 
can be run over the virtual link + */ + +static void addrconf_ip6_tnl_config(struct net_device *dev) +{ + struct inet6_dev *idev; + + ASSERT_RTNL(); + + if ((idev = addrconf_add_dev(dev)) == NULL) { + printk(KERN_DEBUG "init ip6-ip6: add_dev failed\n"); + return; + } + ip6_tnl_add_linklocal(idev); + addrconf_add_mroute(dev); +} int addrconf_notify(struct notifier_block *this, unsigned long event, void * data) @@ -1822,7 +1870,9 @@ case ARPHRD_SIT: addrconf_sit_config(dev); break; - + case ARPHRD_TUNNEL6: + addrconf_ip6_tnl_config(dev); + break; case ARPHRD_LOOPBACK: init_loopback(dev); break; @@ -2121,6 +2171,7 @@ */ if (ifp->idev->cnf.forwarding == 0 && + ifp->idev->cnf.rtr_solicits > 0 && (dev->flags&IFF_LOOPBACK) == 0 && (ipv6_addr_type(&ifp->addr) & IPV6_ADDR_LINKLOCAL)) { struct in6_addr all_routers; diff -Nru a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c --- a/net/ipv6/ip6_tunnel.c Wed Jan 7 11:07:42 2004 +++ b/net/ipv6/ip6_tunnel.c Wed Jan 7 11:07:42 2004 @@ -821,6 +821,8 @@ else dev->flags &= ~IFF_POINTOPOINT; + dev->iflink = p->link; + if (p->flags & IP6_TNL_F_CAP_XMIT) { struct rt6_info *rt = rt6_lookup(&p->raddr, &p->laddr, p->link, 0); @@ -829,8 +831,6 @@ return; if (rt->rt6i_dev) { - dev->iflink = rt->rt6i_dev->ifindex; - dev->hard_header_len = rt->rt6i_dev->hard_header_len + sizeof (struct ipv6hdr); @@ -1040,7 +1040,6 @@ dev->hard_header_len = LL_MAX_HEADER + sizeof (struct ipv6hdr); dev->mtu = ETH_DATA_LEN - sizeof (struct ipv6hdr); dev->flags |= IFF_NOARP; - dev->iflink = 0; dev->addr_len = sizeof(struct in6_addr); } -- Ville Nuorvala Research Assistant, Institute of Digital Communications, Helsinki University of Technology email: vnuorval@tcs.hut.fi, phone: +358 (0)9 451 5257 From g.liakhovetski@gmx.de Wed Jan 7 05:37:34 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 07 Jan 2004 05:37:52 -0800 (PST) Received: from mail.gmx.net (imap.gmx.net [213.165.64.20]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i07DbXTa001195 for ; Wed, 7 Jan 2004 
05:37:33 -0800 Received: (qmail 27485 invoked by uid 65534); 7 Jan 2004 13:37:25 -0000 Received: from dialin-145-254-137-059.arcor-ip.net (EHLO poirot.grange) (145.254.137.59) by mail.gmx.net (mp004) with SMTP; 07 Jan 2004 14:37:25 +0100 X-Authenticated: #20450766 Received: from lyakh (helo=localhost) by poirot.grange with local-esmtp (Exim 3.35 #1 (Debian)) id 1AeDmR-00008g-00; Wed, 07 Jan 2004 14:31:27 +0100 Date: Wed, 7 Jan 2004 14:31:27 +0100 (CET) From: Guennadi Liakhovetski To: nfs@lists.sourceforge.net cc: netdev@oss.sgi.com, Subject: 2.6.0 NFS-server low to 0 performance (fwd) Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 2257 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: g.liakhovetski@gmx.de Precedence: bulk X-list: netdev Sorry for cross-posting - wasn't sure which list would give the best chances. I am forwarding a problem-report, sent yesterday to lkml with a bit more data. Short: The NFS server on a PC with a 2.6.0 (release) kernel slows down to a crawl or stops completely. Searched archives - nothing fits exact enough. The server (PC1) is a 900MHz Duron with 384M RAM and a tulip 10/100 (LinkSys) network card (Linksys Network Everywhere Fast Ethernet 10/100 model NC100 (rev 17)). Clients: PC2 - Pentium 133MHz with 24M RAM and an onboard Lance 79C970 10mbps network card, a SA1100 platform (Tuxscreen / Shannon) with 16M RAM, PCMCIA Netgear 10/100mbps ne2000-compatible (pcnet_cs + 8390) card a PXA250 platform (Inphinity / Triton starter-kit) with 64M RAM, onboard SMC91C11xFD (smc91x driver) 10/100 chip In the tests below I was copying a 4M file from an NFS-mounted directory to a RAM-based fs (ramfs / tmpfs). 
Here are results:

server with 2.6.0 kernel:

PC2:2.6.0-test11        2m21s (*)
PC2:2.4.20              16.5s
SA1100:2.4.19-rmk7      never finishes (*)
PXA:2.4.21-rmk1-pxa1    as above
PXA:2.6.0-rmk1-pxa      as above

server: 2.4.21

PC2:2.6.0-test11        6s
PC2:2.4.20              5s
SA1100:2.4.19-rmk7      3.22s
PXA:2.4.21-rmk1-pxa1    7s
PXA:2.6.0-rmk2-pxa      1) 50s (**)
(***)                   2) 27s (**)

(*) Messages "NFS server not responding" / "NFS server OK", "mount version
older than kernel" on mount

(**) Messages "NFS server not responding" / "NFS server OK", "mount version
older than kernel" on mount, traffic shows as several peaks

(***) 2.6.0-rmk2-pxa corresponds to the 2.6.0-rmk2 kernel with a PXA-patch
forward-ported from diff-2.6.0-test2-rmk1-pxa1.

I bought the LinkSys card recently; before that I used an RTL (3c59x) card,
capable of only 10mbps. Here are today's results with this card:

server: 2.6.0

PC2: 2.6.0-test11       9s
PC2: 2.4.20             4s
SA1100: 2.4.19-rmk7     never finishes

server: 2.4.21

SA1100: 2.4.19-rmk7     6s

Then I tried PC2 as a server with different kernels

PC2: 2.6.0-test11

PC1: 2.6.0              10s
SA1100: 2.4.19-rmk7     never finishes

server PC2: 2.4.20

PC1: 2.6.0              6s
SA1100: 2.4.19-rmk7     7s

It is not just a problem of 2.6 with those specific network configurations:
ftp / http / tftp transfers work fine. E.g. wget of the same file on the
PXA with 2.6.0 from the PC1 with 2.4.21 over http takes about 2s. So, it is
2.6 + NFS. nfs-utils on both PC1 and PC2 is version 1.0.6 (Debian Sarge).

Is it fixed somewhere (2.6.1-rcx?), or what should I try / what further
information is required?

Sorry, I am not subscribed to any of these lists, so, please, CC.
Thanks Guennadi --- Guennadi Liakhovetski From davem@pizda.ninka.net Wed Jan 7 12:18:54 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 07 Jan 2004 12:19:11 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i07KIsTa019744 for ; Wed, 7 Jan 2004 12:18:54 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id MAA18970; Wed, 7 Jan 2004 12:12:58 -0800 Date: Wed, 7 Jan 2004 12:12:58 -0800 From: "David S. Miller" To: Andi Kleen Cc: netdev@oss.sgi.com Subject: Re: [PATCH] Add 32bit emulation for cmsg SO_TIMESTAMP Message-Id: <20040107121258.722a7f16.davem@redhat.com> In-Reply-To: <20040107100743.1c0b18c2.ak@suse.de> References: <20040107100743.1c0b18c2.ak@suse.de> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2258 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Wed, 7 Jan 2004 10:07:43 +0100 Andi Kleen wrote: > Some traceroute versions use the SO_TIMESTAMP cmsg for time measurement. > This fixes them when running as 32bit on a 64bit x86-64 kernel. Applied, thanks Andi. From davem@pizda.ninka.net Wed Jan 7 12:22:50 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 07 Jan 2004 12:23:15 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i07KMnTa022732 for ; Wed, 7 Jan 2004 12:22:50 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id MAA18999; Wed, 7 Jan 2004 12:15:51 -0800 Date: Wed, 7 Jan 2004 12:15:50 -0800 From: "David S. 
Miller" To: Ville Nuorvala Cc: yoshfuji@linux-ipv6.org, pekkas@netcore.fi, netdev@oss.sgi.com Subject: Re: [PATCH][RESEND] IPv6: Autoconfig link-local address on ip6-ip6 tunnel device Message-Id: <20040107121550.1dba9972.davem@redhat.com> In-Reply-To: References: X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2259 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Wed, 7 Jan 2004 11:22:57 +0200 (EET) Ville Nuorvala wrote: > This patch, made against cset 1.1492, adds a link-local address to a > ip6-ip6 tunnel interface when it is brougth up. > > It also changes the router solicitation behavior slightly by checking that > rtr_solicits is > 0, before sending a router solicitation after a > successful DAD probe on the link-local address. > > 2.6.0 has been released, so would you apply this patch now? :) Patch applied, thanks Ville. 
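
For readers skimming the thread, the order in which the applied patch looks for a link-local address to give the tunnel (the underlying link device first, then any device) can be sketched in user space. This is only an illustration: `struct dev` and `pick_lladdr_source` are hypothetical names, and the array stands in for the kernel's device list.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified device record; not the kernel's struct net_device. */
struct dev {
    int ifindex;      /* interface index */
    bool has_lladdr;  /* device already owns a link-local address */
};

/*
 * Picks the device whose link-local address the tunnel would inherit:
 * the device named by iflink first, then the first device that has one.
 * Returns NULL when no device qualifies.
 */
static const struct dev *pick_lladdr_source(const struct dev *devs, size_t n,
                                            int iflink)
{
    size_t i;

    if (iflink != 0) {
        for (i = 0; i < n; i++)
            if (devs[i].ifindex == iflink && devs[i].has_lladdr)
                return &devs[i];
    }
    /* Fall back to any device that can supply an address. */
    for (i = 0; i < n; i++)
        if (devs[i].has_lladdr)
            return &devs[i];
    return NULL;
}
```

A NULL result corresponds to the case where the patch only logs that adding the link-local address failed.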
From per@hedeland.org Wed Jan 7 12:58:38 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 07 Jan 2004 12:58:51 -0800 (PST) Received: from pluto.hedeland.org (as1-2-8.mal.s.bonet.se [194.236.4.19]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i07KwaTa023790 for ; Wed, 7 Jan 2004 12:58:37 -0800 Received: from pluto.hedeland.org (localhost [127.0.0.1]) by pluto.hedeland.org (8.12.10/8.12.10) with ESMTP id i07KwWI5054482 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Wed, 7 Jan 2004 21:58:32 +0100 (CET) Received: (from per@localhost) by pluto.hedeland.org (8.12.10/8.12.10/Submit) id i07KwWba054481; Wed, 7 Jan 2004 21:58:32 +0100 (CET) Date: Wed, 7 Jan 2004 21:58:32 +0100 (CET) From: Per Hedeland Message-Id: <200401072058.i07KwWba054481@pluto.hedeland.org> To: bonding-devel@lists.sourceforge.net, netdev@oss.sgi.com Subject: [PATCH] [bonding 2.4] Add balance-xor-ip bonding mode X-archive-position: 2260 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: per@hedeland.org Precedence: bulk X-list: netdev This patch adds a new bonding policy, similar to the previously existing balance-xor, but using the IP addresses rather than MAC addresses for IP packets, with fallback to MAC-based balance-xor for non-IP packets. The patch is against the netdev-2.4 tree (with Shmulik's 'update comment blocks' and Amir's 'using per-bond parameters' patches applied). --Per Hedeland per@hedeland.org diff -Nru a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt --- a/Documentation/networking/bonding.txt Wed Jan 7 16:31:56 2004 +++ b/Documentation/networking/bonding.txt Wed Jan 7 16:33:42 2004 @@ -368,6 +368,17 @@ fails it's hw address is swapped with the new curr_active_slave that was chosen. + balance-xor-ip or 7 + + XOR IP policy: Transmit based on [(source IP address + XOR'd with destination IP address) modula slave count]. + I.e. 
similar to balance-xor, but uses the IP addresses + rather than MAC addresses for IP packets, which provides + better load balancing in some cases (e.g. most traffic + sent to a default gateway). For non-IP packets, it will + fall back to MAC-based balance-xor. This mode provides + load balancing and fault tolerance. + primary A string (eth0, eth2, etc) to equate to a primary device. If this diff -Nru a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c --- a/drivers/net/bonding/bond_main.c Wed Jan 7 16:26:59 2004 +++ b/drivers/net/bonding/bond_main.c Wed Jan 7 16:41:28 2004 @@ -476,6 +476,7 @@ #include #include #include +#include #include #include #include @@ -585,6 +586,7 @@ { "balance-rr", BOND_MODE_ROUNDROBIN}, { "active-backup", BOND_MODE_ACTIVEBACKUP}, { "balance-xor", BOND_MODE_XOR}, +{ "balance-xor-ip", BOND_MODE_XOR_IP}, { "broadcast", BOND_MODE_BROADCAST}, { "802.3ad", BOND_MODE_8023AD}, { "balance-tlb", BOND_MODE_TLB}, @@ -607,6 +609,8 @@ return "fault-tolerance (active-backup)"; case BOND_MODE_XOR : return "load balancing (xor)"; + case BOND_MODE_XOR_IP: + return "load balancing (xor-ip)"; case BOND_MODE_BROADCAST : return "fault-tolerance (broadcast)"; case BOND_MODE_8023AD: @@ -3658,16 +3662,17 @@ /* * in XOR mode, we determine the output device by performing xor on - * the source and destination hw adresses. If this device is not + * the source and destination adresses. If this device is not * enabled, find the next slave following this xor slave. 
*/ -static int bond_xmit_xor(struct sk_buff *skb, struct net_device *bond_dev) +static int bond_xmit_xor(struct sk_buff *skb, struct net_device *bond_dev, int use_ip) { struct bonding *bond = bond_dev->priv; struct ethhdr *data = (struct ethhdr *)skb->data; struct slave *slave, *start_at; - int slave_no; + int slave_no = 0; int i; + __u32 u; read_lock(&bond->lock); @@ -3675,7 +3680,30 @@ goto free_out; } - slave_no = (data->h_dest[5]^bond_dev->dev_addr[5]) % bond->slave_cnt; + if (use_ip) { + switch (ntohs(skb->protocol)) { + case ETH_P_IP: + u = skb->nh.iph->saddr ^ skb->nh.iph->daddr; + u ^= (u >> 24) ^ (u >> 16) ^ (u >> 8); + slave_no = (u & 0xff) % bond->slave_cnt; + break; + case ETH_P_IPV6: + for (u = 0, i = 0; i < 4; i++) { + u ^= skb->nh.ipv6h->saddr.s6_addr32[i] ^ + skb->nh.ipv6h->daddr.s6_addr32[i]; + } + u ^= (u >> 24) ^ (u >> 16) ^ (u >> 8); + slave_no = (u & 0xff) % bond->slave_cnt; + break; + default: + use_ip = 0; + break; + } + } + + if (!use_ip) { + slave_no = (data->h_dest[5]^bond_dev->dev_addr[5]) % bond->slave_cnt; + } bond_for_each_slave(bond, slave, i) { slave_no--; @@ -3708,6 +3736,16 @@ goto out; } +static int bond_xmit_xor_mac(struct sk_buff *skb, struct net_device *bond_dev) +{ + return bond_xmit_xor(skb, bond_dev, 0); +} + +static int bond_xmit_xor_ip(struct sk_buff *skb, struct net_device *bond_dev) +{ + return bond_xmit_xor(skb, bond_dev, 1); +} + /* * in broadcast mode, we send everything to all usable interfaces. 
*/ @@ -3794,7 +3832,10 @@ bond_dev->hard_start_xmit = bond_xmit_activebackup; break; case BOND_MODE_XOR: - bond_dev->hard_start_xmit = bond_xmit_xor; + bond_dev->hard_start_xmit = bond_xmit_xor_mac; + break; + case BOND_MODE_XOR_IP: + bond_dev->hard_start_xmit = bond_xmit_xor_ip; break; case BOND_MODE_BROADCAST: bond_dev->hard_start_xmit = bond_xmit_broadcast; @@ -3915,8 +3956,7 @@ for (i = 0; tbl[i].modename; i++) { if ((isdigit(*mode_arg) && tbl[i].mode == simple_strtol(mode_arg, NULL, 0)) || - (strncmp(mode_arg, tbl[i].modename, - strlen(tbl[i].modename)) == 0)) { + (strcmp(mode_arg, tbl[i].modename) == 0)) { return tbl[i].mode; } } diff -Nru a/include/linux/if_bonding.h b/include/linux/if_bonding.h --- a/include/linux/if_bonding.h Wed Jan 7 16:24:48 2004 +++ b/include/linux/if_bonding.h Wed Jan 7 16:33:42 2004 @@ -67,6 +67,7 @@ #define BOND_MODE_8023AD 4 #define BOND_MODE_TLB 5 #define BOND_MODE_ALB 6 /* TLB + RLB (receive load balancing) */ +#define BOND_MODE_XOR_IP 7 /* each slave's link has 4 states */ #define BOND_LINK_UP 0 /* link is up and running */ From per@hedeland.org Wed Jan 7 13:01:44 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 07 Jan 2004 13:01:57 -0800 (PST) Received: from pluto.hedeland.org (as1-2-8.mal.s.bonet.se [194.236.4.19]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i07L1hTa024229 for ; Wed, 7 Jan 2004 13:01:44 -0800 Received: from pluto.hedeland.org (localhost [127.0.0.1]) by pluto.hedeland.org (8.12.10/8.12.10) with ESMTP id i07L1cI5054665 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Wed, 7 Jan 2004 22:01:38 +0100 (CET) Received: (from per@localhost) by pluto.hedeland.org (8.12.10/8.12.10/Submit) id i07L1cOw054664; Wed, 7 Jan 2004 22:01:38 +0100 (CET) Date: Wed, 7 Jan 2004 22:01:38 +0100 (CET) From: Per Hedeland Message-Id: <200401072101.i07L1cOw054664@pluto.hedeland.org> To: bonding-devel@lists.sourceforge.net, netdev@oss.sgi.com Subject: [PATCH] [bonding 2.6] Add balance-xor-ip bonding mode 
X-archive-position: 2261 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: per@hedeland.org Precedence: bulk X-list: netdev This patch adds a new bonding policy, similar to the previously existing balance-xor, but using the IP addresses rather than MAC addresses for IP packets, with fallback to MAC-based balance-xor for non-IP packets. The patch is against the net-drivers-2.5-exp tree (with Shmulik's 'update comment blocks' and Amir's 'using per-bond parameters' patches applied). --Per Hedeland per@hedeland.org diff -Nru a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt --- a/Documentation/networking/bonding.txt Wed Jan 7 16:18:11 2004 +++ b/Documentation/networking/bonding.txt Wed Jan 7 16:45:11 2004 @@ -368,6 +368,17 @@ fails it's hw address is swapped with the new curr_active_slave that was chosen. + balance-xor-ip or 7 + + XOR IP policy: Transmit based on [(source IP address + XOR'd with destination IP address) modula slave count]. + I.e. similar to balance-xor, but uses the IP addresses + rather than MAC addresses for IP packets, which provides + better load balancing in some cases (e.g. most traffic + sent to a default gateway). For non-IP packets, it will + fall back to MAC-based balance-xor. This mode provides + load balancing and fault tolerance. + primary A string (eth0, eth2, etc) to equate to a primary device. 
If this diff -Nru a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c --- a/drivers/net/bonding/bond_main.c Wed Jan 7 16:16:19 2004 +++ b/drivers/net/bonding/bond_main.c Wed Jan 7 16:45:11 2004 @@ -476,6 +476,7 @@ #include #include #include +#include #include #include #include @@ -585,6 +586,7 @@ { "balance-rr", BOND_MODE_ROUNDROBIN}, { "active-backup", BOND_MODE_ACTIVEBACKUP}, { "balance-xor", BOND_MODE_XOR}, +{ "balance-xor-ip", BOND_MODE_XOR_IP}, { "broadcast", BOND_MODE_BROADCAST}, { "802.3ad", BOND_MODE_8023AD}, { "balance-tlb", BOND_MODE_TLB}, @@ -607,6 +609,8 @@ return "fault-tolerance (active-backup)"; case BOND_MODE_XOR : return "load balancing (xor)"; + case BOND_MODE_XOR_IP: + return "load balancing (xor-ip)"; case BOND_MODE_BROADCAST : return "fault-tolerance (broadcast)"; case BOND_MODE_8023AD: @@ -3657,16 +3661,17 @@ /* * in XOR mode, we determine the output device by performing xor on - * the source and destination hw adresses. If this device is not + * the source and destination adresses. If this device is not * enabled, find the next slave following this xor slave. 
*/ -static int bond_xmit_xor(struct sk_buff *skb, struct net_device *bond_dev) +static int bond_xmit_xor(struct sk_buff *skb, struct net_device *bond_dev, int use_ip) { struct bonding *bond = bond_dev->priv; struct ethhdr *data = (struct ethhdr *)skb->data; struct slave *slave, *start_at; - int slave_no; + int slave_no = 0; int i; + __u32 u; read_lock(&bond->lock); @@ -3674,7 +3679,30 @@ goto free_out; } - slave_no = (data->h_dest[5]^bond_dev->dev_addr[5]) % bond->slave_cnt; + if (use_ip) { + switch (ntohs(skb->protocol)) { + case ETH_P_IP: + u = skb->nh.iph->saddr ^ skb->nh.iph->daddr; + u ^= (u >> 24) ^ (u >> 16) ^ (u >> 8); + slave_no = (u & 0xff) % bond->slave_cnt; + break; + case ETH_P_IPV6: + for (u = 0, i = 0; i < 4; i++) { + u ^= skb->nh.ipv6h->saddr.s6_addr32[i] ^ + skb->nh.ipv6h->daddr.s6_addr32[i]; + } + u ^= (u >> 24) ^ (u >> 16) ^ (u >> 8); + slave_no = (u & 0xff) % bond->slave_cnt; + break; + default: + use_ip = 0; + break; + } + } + + if (!use_ip) { + slave_no = (data->h_dest[5]^bond_dev->dev_addr[5]) % bond->slave_cnt; + } bond_for_each_slave(bond, slave, i) { slave_no--; @@ -3707,6 +3735,16 @@ goto out; } +static int bond_xmit_xor_mac(struct sk_buff *skb, struct net_device *bond_dev) +{ + return bond_xmit_xor(skb, bond_dev, 0); +} + +static int bond_xmit_xor_ip(struct sk_buff *skb, struct net_device *bond_dev) +{ + return bond_xmit_xor(skb, bond_dev, 1); +} + /* * in broadcast mode, we send everything to all usable interfaces. 
*/ @@ -3793,7 +3831,10 @@ bond_dev->hard_start_xmit = bond_xmit_activebackup; break; case BOND_MODE_XOR: - bond_dev->hard_start_xmit = bond_xmit_xor; + bond_dev->hard_start_xmit = bond_xmit_xor_mac; + break; + case BOND_MODE_XOR_IP: + bond_dev->hard_start_xmit = bond_xmit_xor_ip; break; case BOND_MODE_BROADCAST: bond_dev->hard_start_xmit = bond_xmit_broadcast; @@ -3914,8 +3955,7 @@ for (i = 0; tbl[i].modename; i++) { if ((isdigit(*mode_arg) && tbl[i].mode == simple_strtol(mode_arg, NULL, 0)) || - (strncmp(mode_arg, tbl[i].modename, - strlen(tbl[i].modename)) == 0)) { + (strcmp(mode_arg, tbl[i].modename) == 0)) { return tbl[i].mode; } } diff -Nru a/include/linux/if_bonding.h b/include/linux/if_bonding.h --- a/include/linux/if_bonding.h Wed Jan 7 16:21:37 2004 +++ b/include/linux/if_bonding.h Wed Jan 7 16:45:11 2004 @@ -67,6 +67,7 @@ #define BOND_MODE_8023AD 4 #define BOND_MODE_TLB 5 #define BOND_MODE_ALB 6 /* TLB + RLB (receive load balancing) */ +#define BOND_MODE_XOR_IP 7 /* each slave's link has 4 states */ #define BOND_LINK_UP 0 /* link is up and running */ From willy@w.ods.org Wed Jan 7 13:03:07 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 07 Jan 2004 13:03:20 -0800 (PST) Received: from willy.net1.nerim.net (willy.net1.nerim.net [62.212.114.60]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i07L35Ta024586 for ; Wed, 7 Jan 2004 13:03:06 -0800 Date: Wed, 7 Jan 2004 22:02:55 +0100 From: Willy Tarreau To: Stephan von Krawczynski Cc: linux-kernel , netdev@oss.sgi.com, linux-net@vger.kernel.org Subject: Re: Problem with 2.4.24 e1000 and keepalived Message-ID: <20040107210255.GA545@alpha.home.local> References: <20040107200556.0d553c40.skraw@ithnet.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20040107200556.0d553c40.skraw@ithnet.com> User-Agent: Mutt/1.4i X-archive-position: 2262 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com 
X-original-sender: willy@w.ods.org Precedence: bulk X-list: netdev Hi Stephan, On Wed, Jan 07, 2004 at 08:05:56PM +0100, Stephan von Krawczynski wrote: > Setup is a simple pair of routers with 2 nics each, all e1000. If you start a > vrrp setup with keepalived and interface state is down during keepalived > startup, then the failover does not work. If the nics are UP during startup > everything works well. Now the kernel part of the story: the exact same setup > works with tulip cards. > Is there a difference regarding UP/DOWN state handling/events in e1000 and > tulip. e100 and eepro100 show the same problem btw. I noticed the exact same problem about 1 year ago with the early 2.4 bonding code and eepro100. At that time, I attributed this to an as yet undiscovered bug in the bonding state machine, and could not investigate much since it was on a remote production machine. Someone went there and rebooted it and everything went OK. Before the reboot, the switch already detected an UP link, while the bonding code saw it down (using MII at that time, not ethtool). I recently read one report (here or on keepalived list) about someone who got the same problem with another eepro100. I wonder whether there would not be a bug either in the driver or in the chip itself. What I noticed is that if you load the driver while the cable is unplugged, and then plug it, the MII status says the link is still down. Unfortunately, the only e100 I have access to are in prod at a customer's and I really cannot make tests there.
Cheers, Willy From greearb@candelatech.com Wed Jan 7 18:45:15 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 07 Jan 2004 18:45:28 -0800 (PST) Received: from ns1.wanfear.com (ns1.wanfear.com [207.212.57.1]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i082jFTa004198 for ; Wed, 7 Jan 2004 18:45:15 -0800 Received: from candelatech.com (evrtwa1-ar2-4-35-049-074.evrtwa1.dsl-verizon.net [4.35.49.74]) (authenticated bits=0) by ns1.wanfear.com (8.12.10/8.12.8) with ESMTP id i082j5Eu021315; Wed, 7 Jan 2004 18:45:07 -0800 Message-ID: <3FFCC430.4060804@candelatech.com> Date: Wed, 07 Jan 2004 18:45:04 -0800 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5) Gecko/20031007 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Willy Tarreau CC: Stephan von Krawczynski , linux-kernel , netdev@oss.sgi.com, linux-net@vger.kernel.org Subject: Re: Problem with 2.4.24 e1000 and keepalived References: <20040107200556.0d553c40.skraw@ithnet.com> <20040107210255.GA545@alpha.home.local> In-Reply-To: <20040107210255.GA545@alpha.home.local> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2264 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev Content-Length: 2044 Lines: 53 Willy Tarreau wrote: > Hi Stephan, > > On Wed, Jan 07, 2004 at 08:05:56PM +0100, Stephan von Krawczynski wrote: > >>Setup is a simple pair of routers with 2 nics each, all e1000. If you start a >>vrrp setup with keepalived and interface state is down during keepalived >>startup, then the failover does not work. If the nics are UP during startup >>everything works well. Now the kernel part of the story: the exact same setup >>works with tulip cards. >>Is there a difference regarding UP/DOWN state handling/events in e1000 and >>tulip. 
e100 and eepro100 show the same problem btw. > > > I noticed the exact same problem about 1 year ago with the early 2.4 > bonding code and eepro100. At that time, I attributed this to an as yet > undiscovered bug in the bonding state machine, and could not investigate > much since it was on a remote production machine. Someone went there and > rebooted it and everything went OK. Before the reboot, the switch already > detected an UP link, while the bonding code saw it down (using MII at that > time, not ethtool). I recently read one report (here or on keepalived list) > about someone who got the same problem with another eepro100. I wonder > whether there would not be a bug either in the driver or in the chip itself. > > What I noticed is that if you load the driver while the cable is unplugged, > and then plug it, the MII status says the link is still down. Unfortunately, > the only e100 I have access to are in prod at a customer's and I really > cannot make tests there. You have to bring the interface 'UP' before it will detect link, with something like: ifconfig eth2 up Could that be the problem?
Ben > > Cheers, > Willy > > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > -- Ben Greear Candela Technologies Inc http://www.candelatech.com From willy@w.ods.org Wed Jan 7 21:20:17 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 07 Jan 2004 21:20:38 -0800 (PST) Received: from willy.net1.nerim.net (willy.net1.nerim.net [62.212.114.60]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i085KGTa010187 for ; Wed, 7 Jan 2004 21:20:17 -0800 Date: Thu, 8 Jan 2004 06:20:00 +0100 From: Willy Tarreau To: Ben Greear Cc: Willy Tarreau , Stephan von Krawczynski , linux-kernel , netdev@oss.sgi.com, linux-net@vger.kernel.org Subject: Re: Problem with 2.4.24 e1000 and keepalived Message-ID: <20040108052000.GA8829@alpha.home.local> References: <20040107200556.0d553c40.skraw@ithnet.com> <20040107210255.GA545@alpha.home.local> <3FFCC430.4060804@candelatech.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <3FFCC430.4060804@candelatech.com> User-Agent: Mutt/1.4i X-archive-position: 2265 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: willy@w.ods.org Precedence: bulk X-list: netdev Content-Length: 561 Lines: 16 Hi Ben, On Wed, Jan 07, 2004 at 06:45:04PM -0800, Ben Greear wrote: > You have to bring the interface 'UP' before it will detect link, > with something like: ifconfig eth2 up Don't you mean "after" instead of "before" here ? Because the case where it doesn't work is when everything is set up while the cable is unplugged, but conversely, if the system goes up with the cable plugged, setting the interface UP detects the link as UP and works. I believe that the problem is related to setting the interface UP with nothing plugged into it. 
Cheers, Willy From andi@muc.de Wed Jan 7 23:04:21 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 07 Jan 2004 23:04:35 -0800 (PST) Received: from averell.firstfloor.org (personne@pD9E56430.dip.t-dialin.net [217.229.100.48]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i0874JTa011645 for ; Wed, 7 Jan 2004 23:04:20 -0800 Received: from averell.firstfloor.org (localhost [127.0.0.1]) by averell.firstfloor.org (8.12.6/8.12.6/SuSE Linux 0.6) with ESMTP id i0874EBQ031786; Thu, 8 Jan 2004 08:04:14 +0100 Received: (from andi@localhost) by averell.firstfloor.org (8.12.6/8.12.6/Submit) id i0874D0c031785; Thu, 8 Jan 2004 08:04:13 +0100 Date: Thu, 8 Jan 2004 08:04:13 +0100 From: Andi Kleen To: netdev@oss.sgi.com Cc: davem@redhat.com Subject: [PATCH] Mark SIOCSIFNAME as compatible ioctl Message-ID: <20040108070413.GA31778@averell> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4i X-archive-position: 2266 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ak@muc.de Precedence: bulk X-list: netdev Content-Length: 678 Lines: 18 Mark SIOCSIFNAME as an ioctl that doesn't need 32bit conversion. Fixes nameif as 32bit executable. 
-Andi diff -X ../KDIFX -burpN linux-vanilla-2.6.1rc2/include/linux/compat_ioctl.h linux-2.6.1rc2-amd64/include/linux/compat_ioctl.h --- linux-vanilla-2.6.1rc2/include/linux/compat_ioctl.h 2004-01-07 02:36:31.000000000 -0800 +++ linux-2.6.1rc2-amd64/include/linux/compat_ioctl.h 2003-12-31 21:56:55.000000000 -0800 @@ -260,6 +260,7 @@ COMPATIBLE_IOCTL(SIOCATMARK) COMPATIBLE_IOCTL(SIOCSIFLINK) COMPATIBLE_IOCTL(SIOCSIFENCAP) COMPATIBLE_IOCTL(SIOCGIFENCAP) +COMPATIBLE_IOCTL(SIOCSIFNAME) COMPATIBLE_IOCTL(SIOCSIFBR) COMPATIBLE_IOCTL(SIOCGIFBR) COMPATIBLE_IOCTL(SIOCSARP) From greearb@candelatech.com Thu Jan 8 00:07:17 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 08 Jan 2004 00:07:40 -0800 (PST) Received: from ns1.wanfear.com (ns1.wanfear.com [207.212.57.1]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i0887HTa012814 for ; Thu, 8 Jan 2004 00:07:17 -0800 Received: from candelatech.com (evrtwa1-ar2-4-35-049-074.evrtwa1.dsl-verizon.net [4.35.49.74]) (authenticated bits=0) by ns1.wanfear.com (8.12.10/8.12.8) with ESMTP id i0887BEu030492; Thu, 8 Jan 2004 00:07:11 -0800 Message-ID: <3FFD0FAE.8050705@candelatech.com> Date: Thu, 08 Jan 2004 00:07:10 -0800 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5) Gecko/20031007 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Willy Tarreau CC: Stephan von Krawczynski , linux-kernel , netdev@oss.sgi.com, linux-net@vger.kernel.org Subject: Re: Problem with 2.4.24 e1000 and keepalived References: <20040107200556.0d553c40.skraw@ithnet.com> <20040107210255.GA545@alpha.home.local> <3FFCC430.4060804@candelatech.com> <20040108052000.GA8829@alpha.home.local> In-Reply-To: <20040108052000.GA8829@alpha.home.local> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 2267 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com 
Precedence: bulk X-list: netdev Content-Length: 1150 Lines: 38 Willy Tarreau wrote: > Hi Ben, > > On Wed, Jan 07, 2004 at 06:45:04PM -0800, Ben Greear wrote: > > >>You have to bring the interface 'UP' before it will detect link, >>with something like: ifconfig eth2 up > > > Don't you mean "after" instead of "before" here ? Because the case where > it doesn't work is when everything is set up while the cable is unplugged, > but conversely, if the system goes up with the cable plugged, setting the > interface UP detects the link as UP and works. I believe that the problem > is related to setting the interface UP with nothing plugged into it. No, I meant what I said: You have to tell many drivers to bring the interface up before they will attempt (or at least report) link negotiation. You do NOT have to give it an IP address or add any routes to it. But, I don't know about your particular program, I just suspect it is related to detecting link state. I think tg3 detects link when the interface is not UP, if you have some tg3 nics maybe you could try with them? 
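To make the distinction above concrete, here is a minimal user-space sketch (not taken from any message in this thread) that separates the administrative state of an interface from its reported link state. It assumes Linux, the standard SIOCGIFFLAGS ioctl, and a driver that keeps IFF_RUNNING in step with carrier, which may not hold for every driver discussed here. The names link_view, classify_flags and read_if_flags are made up for illustration.

```c
#define _DEFAULT_SOURCE         /* expose struct ifreq and IFF_* under strict -std= modes */
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <net/if.h>

/* Hypothetical three-way view of an interface, derived from its flags. */
enum link_view {
    ADMIN_DOWN,     /* not brought up: many drivers will not report link yet */
    UP_NO_LINK,     /* brought up, but no carrier reported */
    UP_WITH_LINK    /* brought up and carrier reported */
};

/* Classify a flags word as returned by SIOCGIFFLAGS. */
static enum link_view classify_flags(unsigned int flags)
{
    if (!(flags & IFF_UP))
        return ADMIN_DOWN;
    if (!(flags & IFF_RUNNING))
        return UP_NO_LINK;
    return UP_WITH_LINK;
}

/* Read the interface flags for ifname; returns 0 on success, -1 on error. */
static int read_if_flags(const char *ifname, unsigned int *flags)
{
    struct ifreq ifr;
    int fd;

    assert(ifname != NULL && flags != NULL);

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    memset(&ifr, 0, sizeof(ifr));
    strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);    /* stays NUL-terminated */

    if (ioctl(fd, SIOCGIFFLAGS, &ifr) < 0) {
        close(fd);
        return -1;
    }
    close(fd);

    /* ifr_flags is a short; widen it without sign extension. */
    *flags = (unsigned short)ifr.ifr_flags;
    return 0;
}
```

Under those assumptions, the failing case described in this thread would show up as UP_NO_LINK persisting after the cable is plugged in, while ADMIN_DOWN would point at the interface never having been brought up.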
Ben > > Cheers, > Willy > -- Ben Greear Candela Technologies Inc http://www.candelatech.com From skraw@ithnet.com Thu Jan 8 00:15:03 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 08 Jan 2004 00:15:16 -0800 (PST) Received: from ithnet.com (mail3.ithnet.com [217.64.64.7]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i088F2Ta013257 for ; Thu, 8 Jan 2004 00:15:03 -0800 Received: (qmail 10514 invoked by uid 0); 8 Jan 2004 08:14:41 -0000 Received: from skraw@ithnet.com by heather-ng (Processed in 0.65375 secs); 08 Jan 2004 08:14:41 -0000 X-Virus-Status: No Received: from unknown (HELO ithnet.com) (217.64.64.14) by heather-ng.ithnet.com with SMTP; 8 Jan 2004 08:14:40 -0000 X-Sender-Authentication: net64 Date: Thu, 8 Jan 2004 09:14:41 +0100 From: Stephan von Krawczynski To: Ben Greear Cc: willy@w.ods.org, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, linux-net@vger.kernel.org Subject: Re: Problem with 2.4.24 e1000 and keepalived Message-Id: <20040108091441.3ff81b53.skraw@ithnet.com> In-Reply-To: <3FFCC430.4060804@candelatech.com> References: <20040107200556.0d553c40.skraw@ithnet.com> <20040107210255.GA545@alpha.home.local> <3FFCC430.4060804@candelatech.com> Organization: ith Kommunikationstechnik GmbH X-Mailer: Sylpheed version 0.9.8 (GTK+ 1.2.10; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2268 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: skraw@ithnet.com Precedence: bulk X-list: netdev Content-Length: 1020 Lines: 42 On Wed, 07 Jan 2004 18:45:04 -0800 Ben Greear wrote: > Willy Tarreau wrote: > > Hi Stephan, > > [...] > > What I noticed is that if you load the driver while the cable is unplugged, > > and then plug it, the MII status says the link is still down. > > Unfortunately, the only e100 I have access to are in prod at a customer's > > and I really cannot make tests there. 
> > You have to bring the interface 'UP' before it will detect link, > with something like: ifconfig eth2 up > > Could that be the problem? > > Ben Hi Ben, the situation is like this (exactly this works flawlessly with tulip): - unplug all interfaces from the switches - reboot box - plug in _one_ interface - log into the box (yes, network works flawlessly) - start keepalived - now plug in rest of the interfaces - watch keepalived do _nothing_ (seems no UP event shows up) in comparison to: - let all interfaces plugged in - reboot box - log in - start keepalived - watch it work as expected Regards, Stephan From willy@w.ods.org Thu Jan 8 00:46:20 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 08 Jan 2004 00:46:35 -0800 (PST) Received: from willy.net1.nerim.net (willy.net1.nerim.net [62.212.114.60]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i088kITa016724 for ; Thu, 8 Jan 2004 00:46:19 -0800 Date: Thu, 8 Jan 2004 09:46:05 +0100 From: Willy Tarreau To: Ben Greear Cc: Willy Tarreau , Stephan von Krawczynski , linux-kernel , netdev@oss.sgi.com, linux-net@vger.kernel.org Subject: Re: Problem with 2.4.24 e1000 and keepalived Message-ID: <20040108084605.GA9050@alpha.home.local> References: <20040107200556.0d553c40.skraw@ithnet.com> <20040107210255.GA545@alpha.home.local> <3FFCC430.4060804@candelatech.com> <20040108052000.GA8829@alpha.home.local> <3FFD0FAE.8050705@candelatech.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <3FFD0FAE.8050705@candelatech.com> User-Agent: Mutt/1.4i X-archive-position: 2269 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: willy@w.ods.org Precedence: bulk X-list: netdev Content-Length: 741 Lines: 20 On Thu, Jan 08, 2004 at 12:07:10AM -0800, Ben Greear wrote: > No, I meant what I said: You have to tell many drivers to bring the > interface > up before they will attempt (or at least report) link 
negotiation. > You do NOT have to give it an IP address or add any routes to it. ah, OK. No, anyway, it is just a matter of wrongly detecting link state after the link has been plugged while the interface was already UP, no matter if an IP was set or not. > But, I don't know about your particular program, I just suspect it > is related to detecting link state. I think tg3 detects link when > the interface is not UP, if you have some tg3 nics maybe you could > try with them? As far as I have tested, tg3 are fine WRT this. Willy From willy@w.ods.org Thu Jan 8 00:48:09 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 08 Jan 2004 00:48:22 -0800 (PST) Received: from willy.net1.nerim.net (willy.net1.nerim.net [62.212.114.60]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i088m7Ta017074 for ; Thu, 8 Jan 2004 00:48:08 -0800 Date: Thu, 8 Jan 2004 09:47:58 +0100 From: Willy Tarreau To: Stephan von Krawczynski Cc: Ben Greear , linux-kernel@vger.kernel.org, netdev@oss.sgi.com, linux-net@vger.kernel.org Subject: Re: Problem with 2.4.24 e1000 and keepalived Message-ID: <20040108084758.GB9050@alpha.home.local> References: <20040107200556.0d553c40.skraw@ithnet.com> <20040107210255.GA545@alpha.home.local> <3FFCC430.4060804@candelatech.com> <20040108091441.3ff81b53.skraw@ithnet.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20040108091441.3ff81b53.skraw@ithnet.com> User-Agent: Mutt/1.4i X-archive-position: 2270 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: willy@w.ods.org Precedence: bulk X-list: netdev Content-Length: 520 Lines: 16 On Thu, Jan 08, 2004 at 09:14:41AM +0100, Stephan von Krawczynski wrote: > the situation is like this (exactly this works flawlessly with tulip): > > - unplug all interfaces from the switches > - reboot box > - plug in _one_ interface > - log into the box (yes, network works flawlessly) > - start keepalived > - 
now plug in rest of the interfaces > - watch keepalived do _nothing_ (seems no UP event shows up) I agree with this description, and would add : - mii-diag ethX or ethtool ethX report link down Willy From amir.noam@intel.com Thu Jan 8 07:34:20 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 08 Jan 2004 07:34:33 -0800 (PST) Received: from hermes.iil.intel.com (hermes.iil.intel.com [192.198.152.99]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i08FYFTa005086 for ; Thu, 8 Jan 2004 07:34:17 -0800 Received: from petasus.iil.intel.com (petasus.iil.intel.com [143.185.77.3]) by hermes.iil.intel.com (8.12.9-20030918-01/8.12.9/d: large-outer.mc,v 1.6 2003/12/18 18:57:17 root Exp $) with ESMTP id i08FXoek007175; Thu, 8 Jan 2004 15:33:50 GMT Received: from hasmsxvs01.iil.intel.com (hasmsxvs01.iil.intel.com [143.185.63.58]) by petasus.iil.intel.com (8.12.9-20030918-01/8.12.9/d: large-inner.mc,v 1.8 2003/12/18 18:57:16 root Exp $) with SMTP id i08FXoxV029495; Thu, 8 Jan 2004 15:33:50 GMT Received: from sun111.npdj.intel.com ([10.12.254.111]) by hasmsxvs01.iil.intel.com (SAVSMTP 3.1.2.35) with SMTP id M2004010817334918087 ; Thu, 08 Jan 2004 17:33:49 +0200 Received: from jrslxjul4.npdj.intel.com (jrslxjul4 [10.12.220.54]) by sun111.npdj.intel.com (8.12.10/8.12.9/MailSET/Hub) with ESMTP id i08FXihb002518; Thu, 8 Jan 2004 17:33:46 +0200 (IST) From: Amir Noam To: "Per Hedeland" Subject: Re: [Bonding-devel] [PATCH] [bonding 2.4] Add balance-xor-ip bonding mode Date: Thu, 8 Jan 2004 17:33:44 +0200 User-Agent: KMail/1.5.3 References: In-Reply-To: Cc: , MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200401081733.44744.amir.noam@intel.com> X-Scanned-By: MIMEDefang 2.31 (www . roaringpenguin . 
com / mimedefang) X-archive-position: 2271 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: amir.noam@intel.com Precedence: bulk X-list: netdev Content-Length: 1113 Lines: 39 On Wednesday 07 January 2004 10:58 pm, Per Hedeland wrote: > struct bonding *bond = bond_dev->priv; > struct ethhdr *data = (struct ethhdr *)skb->data; > struct slave *slave, *start_at; > - int slave_no; > + int slave_no = 0; > int i; > + __u32 u; > > read_lock(&bond->lock); > Please use u32 instead of __u32. > +static int bond_xmit_xor_mac(struct sk_buff *skb, struct net_device *bond_dev) > +{ > + return bond_xmit_xor(skb, bond_dev, 0); > +} > + > +static int bond_xmit_xor_ip(struct sk_buff *skb, struct net_device *bond_dev) > +{ > + return bond_xmit_xor(skb, bond_dev, 1); > +} > + hmm... I don't like this. The reason we give different tx function pointers to dev->hard_start_xmit in different bonding mode is to make the tx path as fast as possible. Otherwise we might as well use a single tx function that chooses its exact operation based on the bonding mode. It might be better to have some code duplication if it results in faster tx, but I'm not sure what's the optimal solution in this case. 
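The trade-off can be sketched in plain C, outside the kernel. This is only an illustration of the "pick the function once, keep the per-packet path branch-free" idea, not a proposed replacement for the patch. struct pkt, select_fn, select_by_mac, select_by_ip and select_for_mode are hypothetical names, and the packet is reduced to the few fields the two hashes look at.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields the two XOR policies read. */
struct pkt {
    uint8_t  dst_mac_last;  /* last byte of the destination MAC */
    uint8_t  dev_mac_last;  /* last byte of the bond device's MAC */
    uint32_t saddr;         /* IPv4 source address */
    uint32_t daddr;         /* IPv4 destination address */
};

typedef unsigned int (*select_fn)(const struct pkt *p, unsigned int slave_cnt);

/* MAC-based choice, mirroring the original balance-xor expression. */
static unsigned int select_by_mac(const struct pkt *p, unsigned int slave_cnt)
{
    assert(slave_cnt > 0);
    return (unsigned int)(p->dst_mac_last ^ p->dev_mac_last) % slave_cnt;
}

/* IP-based choice, mirroring the IPv4 case of the patch. */
static unsigned int select_by_ip(const struct pkt *p, unsigned int slave_cnt)
{
    uint32_t u = p->saddr ^ p->daddr;

    assert(slave_cnt > 0);
    u ^= (u >> 24) ^ (u >> 16) ^ (u >> 8);
    return (u & 0xff) % slave_cnt;
}

enum mode { MODE_XOR, MODE_XOR_IP };

/* Decide once, at setup time; the per-packet path then makes a single
 * indirect call and never tests a use_ip style flag. */
static select_fn select_for_mode(enum mode m)
{
    return m == MODE_XOR_IP ? select_by_ip : select_by_mac;
}
```

The cost of this shape is that the slave-walking code after the hash would have to be shared some other way (for example an inline helper) or duplicated, which is the duplication weighed above.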
-- Amir From tony.cureington@hp.com Thu Jan 8 07:36:54 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 08 Jan 2004 07:37:07 -0800 (PST) Received: from zcamail03.zca.compaq.com (zcamail03.zca.compaq.com [161.114.32.103]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i08FasTa005369 for ; Thu, 8 Jan 2004 07:36:54 -0800 Received: from cacexg12.americas.cpqcorp.net (cacexg12.americas.cpqcorp.net [16.92.1.46]) by zcamail03.zca.compaq.com (Postfix) with ESMTP id B53D3B9D0; Thu, 8 Jan 2004 07:36:48 -0800 (PST) Received: from txnexc01.americas.cpqcorp.net ([16.74.7.21]) by cacexg12.americas.cpqcorp.net with Microsoft SMTPSVC(6.0.3790.0); Thu, 8 Jan 2004 07:36:46 -0800 Content-Class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" X-MimeOLE: Produced By Microsoft Exchange V6.0.6487.1 Subject: RE: [Bonding-devel] [PATCH] [bonding 2.6] Add balance-xor-ip bonding mode Date: Thu, 8 Jan 2004 09:36:44 -0600 Message-ID: <72A87F7160C0994D8C5A36E2FDC227F502B3E9BE@txnexc01.americas.cpqcorp.net> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: [Bonding-devel] [PATCH] [bonding 2.6] Add balance-xor-ip bonding mode Thread-Index: AcPVYdqSPbokhakcSIif9UDWkwRTywAG92qw From: "Cureington, Tony" To: "Per Hedeland" , , X-OriginalArrivalTime: 08 Jan 2004 15:36:46.0253 (UTC) FILETIME=[395FD5D0:01C3D5FD] Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id i08FasTa005369 X-archive-position: 2272 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tony.cureington@hp.com Precedence: bulk X-list: netdev Content-Length: 6717 Lines: 204 I'm curious of the reasoning behind "u ^= (u >> 24) ^ (u >> 16) ^ (u >> 8);", what advantages does it have over using the xor'd addresses just before this line? Maybe someone loaded decaf on me this morning? :-/ Thanks! 
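One way to read that line, offered as a minimal user-space sketch and not as the patch author's stated rationale: the fold XORs all four bytes of the value into its low byte, so the "& 0xff" that follows depends on every octet of both addresses and no longer on host byte order. Without it, a little-endian host looking at network-order addresses would take the low byte from the first octet, which is usually identical for hosts on the same network. The helper names fold_low_byte, le_view, pick_unfolded and pick_folded are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* XOR all four bytes of x into one byte: the quoted line plus "& 0xff". */
static uint32_t fold_low_byte(uint32_t x)
{
    x ^= (x >> 24) ^ (x >> 16) ^ (x >> 8);
    return x & 0xff;
}

/* How a little-endian host sees the network-order address a.b.c.d
 * when it is loaded as a 32-bit word. */
static uint32_t le_view(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return (uint32_t)a | ((uint32_t)b << 8) |
           ((uint32_t)c << 16) | ((uint32_t)d << 24);
}

/* Low byte of the plain XOR, without the fold. */
static unsigned int pick_unfolded(uint32_t saddr, uint32_t daddr,
                                  unsigned int slave_cnt)
{
    assert(slave_cnt > 0);
    return ((saddr ^ daddr) & 0xff) % slave_cnt;
}

/* Same, but with every address byte folded into the result first. */
static unsigned int pick_folded(uint32_t saddr, uint32_t daddr,
                                unsigned int slave_cnt)
{
    assert(slave_cnt > 0);
    return fold_low_byte(saddr ^ daddr) % slave_cnt;
}
```

With source 10.0.0.1 and two slaves, destinations 10.0.0.2 and 10.0.0.3 both map to slave 0 in the unfolded variant on a little-endian host, while the folded variant spreads them over slaves 1 and 0.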
> -----Original Message----- > From: bonding-devel-admin@lists.sourceforge.net > [mailto:bonding-devel-admin@lists.sourceforge.net]On Behalf Of Per > Hedeland > Sent: Wednesday, January 07, 2004 3:02 PM > To: bonding-devel@lists.sourceforge.net; netdev@oss.sgi.com > Subject: [Bonding-devel] [PATCH] [bonding 2.6] Add balance-xor-ip > bonding mode > > > This patch adds a new bonding policy, similar to the > previously existing > balance-xor, but using the IP addresses rather than MAC > addresses for IP > packets, with fallback to MAC-based balance-xor for non-IP packets. > > The patch is against the net-drivers-2.5-exp tree (with Shmulik's > 'update comment blocks' and Amir's 'using per-bond parameters' patches > applied). > > --Per Hedeland > per@hedeland.org > > > diff -Nru a/Documentation/networking/bonding.txt > b/Documentation/networking/bonding.txt > --- a/Documentation/networking/bonding.txt Wed Jan 7 16:18:11 2004 > +++ b/Documentation/networking/bonding.txt Wed Jan 7 16:45:11 2004 > @@ -368,6 +368,17 @@ > fails it's hw address is swapped with the new > curr_active_slave > that was chosen. > > + balance-xor-ip or 7 > + > + XOR IP policy: Transmit based on [(source IP address > + XOR'd with destination IP address) modula slave count]. > + I.e. similar to balance-xor, but uses the IP addresses > + rather than MAC addresses for IP packets, which provides > + better load balancing in some cases (e.g. most traffic > + sent to a default gateway). For non-IP packets, it will > + fall back to MAC-based balance-xor. This mode provides > + load balancing and fault tolerance. > + > primary > > A string (eth0, eth2, etc) to equate to a primary > device. 
If this > diff -Nru a/drivers/net/bonding/bond_main.c > b/drivers/net/bonding/bond_main.c > --- a/drivers/net/bonding/bond_main.c Wed Jan 7 16:16:19 2004 > +++ b/drivers/net/bonding/bond_main.c Wed Jan 7 16:45:11 2004 > @@ -476,6 +476,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @@ -585,6 +586,7 @@ > { "balance-rr", BOND_MODE_ROUNDROBIN}, > { "active-backup", BOND_MODE_ACTIVEBACKUP}, > { "balance-xor", BOND_MODE_XOR}, > +{ "balance-xor-ip", BOND_MODE_XOR_IP}, > { "broadcast", BOND_MODE_BROADCAST}, > { "802.3ad", BOND_MODE_8023AD}, > { "balance-tlb", BOND_MODE_TLB}, > @@ -607,6 +609,8 @@ > return "fault-tolerance (active-backup)"; > case BOND_MODE_XOR : > return "load balancing (xor)"; > + case BOND_MODE_XOR_IP: > + return "load balancing (xor-ip)"; > case BOND_MODE_BROADCAST : > return "fault-tolerance (broadcast)"; > case BOND_MODE_8023AD: > @@ -3657,16 +3661,17 @@ > > /* > * in XOR mode, we determine the output device by performing xor on > - * the source and destination hw adresses. If this device is not > + * the source and destination adresses. If this device is not > * enabled, find the next slave following this xor slave. 
> */ > -static int bond_xmit_xor(struct sk_buff *skb, struct > net_device *bond_dev) > +static int bond_xmit_xor(struct sk_buff *skb, struct > net_device *bond_dev, int use_ip) > { > struct bonding *bond = bond_dev->priv; > struct ethhdr *data = (struct ethhdr *)skb->data; > struct slave *slave, *start_at; > - int slave_no; > + int slave_no = 0; > int i; > + __u32 u; > > read_lock(&bond->lock); > > @@ -3674,7 +3679,30 @@ > goto free_out; > } > > - slave_no = (data->h_dest[5]^bond_dev->dev_addr[5]) % > bond->slave_cnt; > + if (use_ip) { > + switch (ntohs(skb->protocol)) { > + case ETH_P_IP: > + u = skb->nh.iph->saddr ^ skb->nh.iph->daddr; > + u ^= (u >> 24) ^ (u >> 16) ^ (u >> 8); > + slave_no = (u & 0xff) % bond->slave_cnt; > + break; > + case ETH_P_IPV6: > + for (u = 0, i = 0; i < 4; i++) { > + u ^= skb->nh.ipv6h->saddr.s6_addr32[i] ^ > + > skb->nh.ipv6h->daddr.s6_addr32[i]; > + } > + u ^= (u >> 24) ^ (u >> 16) ^ (u >> 8); > + slave_no = (u & 0xff) % bond->slave_cnt; > + break; > + default: > + use_ip = 0; > + break; > + } > + } > + > + if (!use_ip) { > + slave_no = > (data->h_dest[5]^bond_dev->dev_addr[5]) % bond->slave_cnt; > + } > > bond_for_each_slave(bond, slave, i) { > slave_no--; > @@ -3707,6 +3735,16 @@ > goto out; > } > > +static int bond_xmit_xor_mac(struct sk_buff *skb, struct > net_device *bond_dev) > +{ > + return bond_xmit_xor(skb, bond_dev, 0); > +} > + > +static int bond_xmit_xor_ip(struct sk_buff *skb, struct > net_device *bond_dev) > +{ > + return bond_xmit_xor(skb, bond_dev, 1); > +} > + > /* > * in broadcast mode, we send everything to all usable interfaces. 
> */ > @@ -3793,7 +3831,10 @@ > bond_dev->hard_start_xmit = bond_xmit_activebackup; > break; > case BOND_MODE_XOR: > - bond_dev->hard_start_xmit = bond_xmit_xor; > + bond_dev->hard_start_xmit = bond_xmit_xor_mac; > + break; > + case BOND_MODE_XOR_IP: > + bond_dev->hard_start_xmit = bond_xmit_xor_ip; > break; > case BOND_MODE_BROADCAST: > bond_dev->hard_start_xmit = bond_xmit_broadcast; > @@ -3914,8 +3955,7 @@ > for (i = 0; tbl[i].modename; i++) { > if ((isdigit(*mode_arg) && > tbl[i].mode == simple_strtol(mode_arg, NULL, 0)) || > - (strncmp(mode_arg, tbl[i].modename, > - strlen(tbl[i].modename)) == 0)) { > + (strcmp(mode_arg, tbl[i].modename) == 0)) { > return tbl[i].mode; > } > } > diff -Nru a/include/linux/if_bonding.h b/include/linux/if_bonding.h > --- a/include/linux/if_bonding.h Wed Jan 7 16:21:37 2004 > +++ b/include/linux/if_bonding.h Wed Jan 7 16:45:11 2004 > @@ -67,6 +67,7 @@ > #define BOND_MODE_8023AD 4 > #define BOND_MODE_TLB 5 > #define BOND_MODE_ALB 6 /* TLB + RLB (receive > load balancing) */ > +#define BOND_MODE_XOR_IP 7 > > /* each slave's link has 4 states */ > #define BOND_LINK_UP 0 /* link is up and running */ > > > ------------------------------------------------------- > This SF.net email is sponsored by: Perforce Software. > Perforce is the Fast Software Configuration Management System offering > advanced branching capabilities and atomic changes on 50+ platforms. > Free Eval! 
http://www.perforce.com/perforce/loadprog.html > _______________________________________________ > Bonding-devel mailing list > Bonding-devel@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/bonding-devel > From amir.noam@intel.com Thu Jan 8 08:20:42 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 08 Jan 2004 08:20:57 -0800 (PST) Received: from hermes.iil.intel.com (hermes.iil.intel.com [192.198.152.99]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i08GKcTa010497 for ; Thu, 8 Jan 2004 08:20:40 -0800 Received: from petasus.iil.intel.com (petasus.iil.intel.com [143.185.77.3]) by hermes.iil.intel.com (8.12.9-20030918-01/8.12.9/d: large-outer.mc,v 1.6 2003/12/18 18:57:17 root Exp $) with ESMTP id i08GJxek011628; Thu, 8 Jan 2004 16:19:59 GMT Received: from hasmsxvs01.iil.intel.com (hasmsxvs01.iil.intel.com [143.185.63.58]) by petasus.iil.intel.com (8.12.9-20030918-01/8.12.9/d: large-inner.mc,v 1.8 2003/12/18 18:57:16 root Exp $) with SMTP id i08GJqxZ000523; Thu, 8 Jan 2004 16:19:59 GMT Received: from sun111.npdj.intel.com ([10.12.254.111]) by hasmsxvs01.iil.intel.com (SAVSMTP 3.1.2.35) with SMTP id M2004010818195818982 ; Thu, 08 Jan 2004 18:19:58 +0200 Received: from jrslxjul4.npdj.intel.com (jrslxjul4 [10.12.220.54]) by sun111.npdj.intel.com (8.12.10/8.12.9/MailSET/Hub) with ESMTP id i08GJshb003823; Thu, 8 Jan 2004 18:19:55 +0200 (IST) From: Amir Noam To: "Jeff Garzik" , "Jay Vosburgh" Subject: [bonding] Add basic support for dynamic configuration of bond interfaces Date: Thu, 8 Jan 2004 18:19:54 +0200 User-Agent: KMail/1.5.3 Cc: , MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200401081819.54484.amir.noam@intel.com> X-Scanned-By: MIMEDefang 2.31 (www . roaringpenguin . 
com / mimedefang) X-archive-position: 2273 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: amir.noam@intel.com Precedence: bulk X-list: netdev Content-Length: 471 Lines: 15 The following patch sets provide basic support for future bonding operations (specifically for dynamic configuration of bonding interfaces). This is done by adding two new bonding ioctls: one for deviceless commands (an ioctl hook) and one for device oriented commands. Like ethtool, the first u32 value in the data structure will indicate the exact sub-command to be executed. The sets are against the latest netdev-2.4 and net-drivers-2.5-exp trees. -- Amir From amir.noam@intel.com Thu Jan 8 08:21:44 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 08 Jan 2004 08:21:57 -0800 (PST) Received: from hermes.iil.intel.com (hermes.iil.intel.com [192.198.152.99]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i08GLeTa010554 for ; Thu, 8 Jan 2004 08:21:43 -0800 Received: from petasus.iil.intel.com (petasus.iil.intel.com [143.185.77.3]) by hermes.iil.intel.com (8.12.9-20030918-01/8.12.9/d: large-outer.mc,v 1.6 2003/12/18 18:57:17 root Exp $) with ESMTP id i08GLQek011940; Thu, 8 Jan 2004 16:21:26 GMT Received: from hasmsxvs01.iil.intel.com (hasmsxvs01.iil.intel.com [143.185.63.58]) by petasus.iil.intel.com (8.12.9-20030918-01/8.12.9/d: large-inner.mc,v 1.8 2003/12/18 18:57:16 root Exp $) with SMTP id i08GLJxX000750; Thu, 8 Jan 2004 16:21:25 GMT Received: from sun111.npdj.intel.com ([10.12.254.111]) by hasmsxvs01.iil.intel.com (SAVSMTP 3.1.2.35) with SMTP id M2004010818212519064 ; Thu, 08 Jan 2004 18:21:25 +0200 Received: from jrslxjul4.npdj.intel.com (jrslxjul4 [10.12.220.54]) by sun111.npdj.intel.com (8.12.10/8.12.9/MailSET/Hub) with ESMTP id i08GLOhb003850; Thu, 8 Jan 2004 18:21:25 +0200 (IST) From: Amir Noam To: "Jeff Garzik" , "Jay Vosburgh" Subject: [PATCH 1/4] [bonding 2.4] Add bonding ioctl hook Date: Thu, 8 Jan 2004 
18:21:23 +0200 User-Agent: KMail/1.5.3 Cc: , MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200401081821.24881.amir.noam@intel.com> X-Scanned-By: MIMEDefang 2.31 (www . roaringpenguin . com / mimedefang) X-archive-position: 2274 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: amir.noam@intel.com Precedence: bulk X-list: netdev Content-Length: 3218 Lines: 102 Add two bonding ioctls: SIOCBONDING: ioctl hook to handle commands not directed at a specific bond interface. SIOCBONDDEVICE: ioctl to handle commands for a bond interface. This ioctl can also handle all existing commands, so we can regard them as obsolete in the future. All future bonding operations will be a sub-command of one of these ioctls. diff -Nuarp a/include/linux/sockios.h b/include/linux/sockios.h --- a/include/linux/sockios.h Tue Jan 6 20:40:06 2004 +++ b/include/linux/sockios.h Tue Jan 6 20:40:08 2004 @@ -115,7 +115,9 @@ #define SIOCBONDSLAVEINFOQUERY 0x8993 /* rtn info about slave state */ #define SIOCBONDINFOQUERY 0x8994 /* rtn info about bond state */ #define SIOCBONDCHANGEACTIVE 0x8995 /* update to a new active slave */ - +#define SIOCBONDING 0x8996 /* deviceless bonding commands */ +#define SIOCBONDDEVICE 0x8997 /* device oriented bonding commands */ + /* Device private ioctl calls */ /* diff -Nuarp a/net/core/dev.c b/net/core/dev.c --- a/net/core/dev.c Tue Jan 6 20:40:06 2004 +++ b/net/core/dev.c Tue Jan 6 20:40:08 2004 @@ -2199,6 +2199,7 @@ static int dev_ifsioc(struct ifreq *ifr, cmd == SIOCBONDSLAVEINFOQUERY || cmd == SIOCBONDINFOQUERY || cmd == SIOCBONDCHANGEACTIVE || + cmd == SIOCBONDDEVICE || cmd == SIOCGMIIPHY || cmd == SIOCGMIIREG || cmd == SIOCSMIIREG || @@ -2358,6 +2359,7 @@ int dev_ioctl(unsigned int cmd, void *ar case SIOCBONDSLAVEINFOQUERY: case SIOCBONDINFOQUERY: case SIOCBONDCHANGEACTIVE: + case SIOCBONDDEVICE: if 
(!capable(CAP_NET_ADMIN)) return -EPERM; dev_load(ifr.ifr_name); diff -Nuarp a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c --- a/net/ipv4/af_inet.c Tue Jan 6 20:40:06 2004 +++ b/net/ipv4/af_inet.c Tue Jan 6 20:40:08 2004 @@ -147,6 +147,16 @@ int (*br_ioctl_hook)(unsigned long); int (*vlan_ioctl_hook)(unsigned long arg); #endif +static DECLARE_MUTEX(bond_ioctl_mutex); +int (*bond_ioctl_hook)(unsigned long arg); + +void bond_ioctl_set(int (*hook)(unsigned long)) +{ + down(&bond_ioctl_mutex); + bond_ioctl_hook = hook; + up(&bond_ioctl_mutex); +} + /* The inetsw table contains everything that inet_create needs to * build a new socket. */ @@ -924,6 +934,21 @@ int inet_ioctl(struct socket *sock, unsi #endif return -ENOPKG; + case SIOCBONDING: + err = -ENOPKG; + +#ifdef CONFIG_KMOD + if (bond_ioctl_hook == NULL) + request_module("bonding"); +#endif + + down(&bond_ioctl_mutex); + if (bond_ioctl_hook != NULL) + err = bond_ioctl_hook(arg); + up(&bond_ioctl_mutex); + + return err; + default: if ((cmd >= SIOCDEVPRIVATE) && (cmd <= (SIOCDEVPRIVATE + 15))) diff -Nuarp a/net/netsyms.c b/net/netsyms.c --- a/net/netsyms.c Tue Jan 6 20:40:06 2004 +++ b/net/netsyms.c Tue Jan 6 20:40:08 2004 @@ -296,6 +296,9 @@ extern int (*dlci_ioctl_hook)(unsigned i EXPORT_SYMBOL(dlci_ioctl_hook); #endif +extern void bond_ioctl_set(int (*hook)(unsigned long)); +EXPORT_SYMBOL(bond_ioctl_set); + #if defined (CONFIG_IPV6_MODULE) || defined (CONFIG_KHTTPD) || defined (CONFIG_KHTTPD_MODULE) || defined (CONFIG_IP_SCTP_MODULE) /* inet functions common to v4 and v6 */ From amir.noam@intel.com Thu Jan 8 08:23:26 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 08 Jan 2004 08:23:41 -0800 (PST) Received: from hermes.iil.intel.com (hermes.iil.intel.com [192.198.152.99]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i08GNMTa011219 for ; Thu, 8 Jan 2004 08:23:24 -0800 Received: from petasus.iil.intel.com (petasus.iil.intel.com [143.185.77.3]) by hermes.iil.intel.com (8.12.9-20030918-01/8.12.9/d: 
large-outer.mc,v 1.6 2003/12/18 18:57:17 root Exp $) with ESMTP id i08GN7ek012261; Thu, 8 Jan 2004 16:23:07 GMT Received: from hasmsxvs01.iil.intel.com (hasmsxvs01.iil.intel.com [143.185.63.58]) by petasus.iil.intel.com (8.12.9-20030918-01/8.12.9/d: large-inner.mc,v 1.8 2003/12/18 18:57:16 root Exp $) with SMTP id i08GN7xV000973; Thu, 8 Jan 2004 16:23:07 GMT Received: from sun111.npdj.intel.com ([10.12.254.111]) by hasmsxvs01.iil.intel.com (SAVSMTP 3.1.2.35) with SMTP id M2004010818230719136 ; Thu, 08 Jan 2004 18:23:07 +0200 Received: from jrslxjul4.npdj.intel.com (jrslxjul4 [10.12.220.54]) by sun111.npdj.intel.com (8.12.10/8.12.9/MailSET/Hub) with ESMTP id i08GN6hb003877; Thu, 8 Jan 2004 18:23:07 +0200 (IST) From: Amir Noam To: "Jeff Garzik" , "Jay Vosburgh" Subject: [PATCH 2/4] [bonding 2.4] Reduce usage of the global Date: Thu, 8 Jan 2004 18:23:05 +0200 User-Agent: KMail/1.5.3 Cc: , MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200401081823.06694.amir.noam@intel.com> X-Scanned-By: MIMEDefang 2.31 (www . roaringpenguin . com / mimedefang) X-archive-position: 2275 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: amir.noam@intel.com Precedence: bulk X-list: netdev Content-Length: 6428 Lines: 190 - Reduce usage of the global values of the ABI version received from the application. Instead, pass it as a function argument where needed. - Save a new slave's original HW address regardless of ABI version. - Move the clearing of the bond's address and some references to the bond's params structure so they are protected by the relevant locks. 
diff -Nuarp a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c --- a/drivers/net/bonding/bond_main.c Thu Jan 8 18:03:19 2004 +++ b/drivers/net/bonding/bond_main.c Thu Jan 8 18:03:21 2004 @@ -561,6 +561,11 @@ static int arp_ip_count = 0; static u32 my_ip = 0; static int bond_mode = BOND_MODE_ROUNDROBIN; static int lacp_fast = 0; + +/* The global abi_ver vars are only for providing backward compatibility with + * versions that locked bonding into using only the first abi_ver it has seen + * from userspace. + */ static int app_abi_ver = 0; static int orig_app_abi_ver = -1; /* This is used to save the first ABI version * we receive from the application. Once set, @@ -1207,7 +1212,7 @@ static int bond_sethwaddr(struct net_dev } /* enslave device to bond device */ -static int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev) +static int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev, int abi_ver) { struct bonding *bond = bond_dev->priv; struct slave *new_slave = NULL; @@ -1234,7 +1239,7 @@ static int bond_enslave(struct net_devic return -EBUSY; } - if (app_abi_ver >= 1) { + if (abi_ver >= 1) { /* The application is using an ABI, which requires the * slave interface to be closed. 
*/ @@ -1289,13 +1294,12 @@ static int bond_enslave(struct net_devic */ new_slave->original_flags = slave_dev->flags; - if (app_abi_ver >= 1) { - /* save slave's original ("permanent") mac address for - * modes that needs it, and for restoring it upon release, - * and then set it to the master's address - */ - memcpy(new_slave->perm_hwaddr, slave_dev->dev_addr, ETH_ALEN); + /* save slave's original ("permanent") mac address for restoring it + * upon release + */ + memcpy(new_slave->perm_hwaddr, slave_dev->dev_addr, ETH_ALEN); + if (abi_ver >= 1) { /* set slave to master's mac address * The application already set the master's * mac address to that of the first slave @@ -1319,7 +1323,7 @@ static int bond_enslave(struct net_devic res = netdev_set_master(slave_dev, bond_dev); if (res) { dprintk("Error %d calling netdev_set_master\n", res); - if (app_abi_ver < 1) { + if (abi_ver < 1) { goto err_free; } else { goto err_close; @@ -1520,7 +1524,7 @@ static int bond_enslave(struct net_devic write_unlock_bh(&bond->lock); - if (app_abi_ver < 1) { + if (abi_ver < 1) { /* * !!! This is to support old versions of ifenslave. 
* We can remove this in 2.5 because our ifenslave takes @@ -1689,6 +1693,14 @@ static int bond_release(struct net_devic } } + if (bond->slave_cnt == 0) { + /* if the last slave was removed, zero the mac address + * of the master so it will be set by the application + * to the mac address of the first slave + */ + memset(bond_dev->dev_addr, 0, bond_dev->addr_len); + } + write_unlock_bh(&bond->lock); /* If the mode USES_PRIMARY, then we should only remove its @@ -1715,12 +1727,10 @@ static int bond_release(struct net_devic /* close slave before restoring its mac address */ dev_close(slave_dev); - if (app_abi_ver >= 1) { - /* restore original ("permanent") mac address */ - memcpy(addr.sa_data, slave->perm_hwaddr, ETH_ALEN); - addr.sa_family = slave_dev->type; - slave_dev->set_mac_address(slave_dev, &addr); - } + /* restore original ("permanent") mac address */ + memcpy(addr.sa_data, slave->perm_hwaddr, ETH_ALEN); + addr.sa_family = slave_dev->type; + slave_dev->set_mac_address(slave_dev, &addr); /* restore the original state of the * IFF_NOARP flag that might have been @@ -1732,14 +1742,6 @@ static int bond_release(struct net_devic kfree(slave); - /* if the last slave was removed, zero the mac address - * of the master so it will be set by the application - * to the mac address of the first slave - */ - if (bond->slave_cnt == 0) { - memset(bond_dev->dev_addr, 0, bond_dev->addr_len); - } - return 0; /* deletion OK */ } @@ -1812,12 +1814,10 @@ static int bond_release_all(struct net_d /* close slave before restoring its mac address */ dev_close(slave_dev); - if (app_abi_ver >= 1) { - /* restore original ("permanent") mac address*/ - memcpy(addr.sa_data, slave->perm_hwaddr, ETH_ALEN); - addr.sa_family = slave_dev->type; - slave_dev->set_mac_address(slave_dev, &addr); - } + /* restore original ("permanent") mac address*/ + memcpy(addr.sa_data, slave->perm_hwaddr, ETH_ALEN); + addr.sa_family = slave_dev->type; + slave_dev->set_mac_address(slave_dev, &addr); /* restore the 
original state of the IFF_NOARP flag that might have * been set by bond_set_slave_inactive_flags() @@ -1958,10 +1958,9 @@ static int bond_info_query(struct net_de { struct bonding *bond = bond_dev->priv; + read_lock_bh(&bond->lock); info->bond_mode = bond->params.mode; info->miimon = bond->params.miimon; - - read_lock_bh(&bond->lock); info->num_slaves = bond->slave_cnt; read_unlock_bh(&bond->lock); @@ -2754,16 +2753,14 @@ static void bond_info_show_slave(struct seq_printf(seq, "Link Failure Count: %d\n", slave->link_failure_count); - if (app_abi_ver >= 1) { - seq_printf(seq, - "Permanent HW addr: %02x:%02x:%02x:%02x:%02x:%02x\n", - slave->perm_hwaddr[0], - slave->perm_hwaddr[1], - slave->perm_hwaddr[2], - slave->perm_hwaddr[3], - slave->perm_hwaddr[4], - slave->perm_hwaddr[5]); - } + seq_printf(seq, + "Permanent HW addr: %02x:%02x:%02x:%02x:%02x:%02x\n", + slave->perm_hwaddr[0], + slave->perm_hwaddr[1], + slave->perm_hwaddr[2], + slave->perm_hwaddr[3], + slave->perm_hwaddr[4], + slave->perm_hwaddr[5]); if (bond->params.mode == BOND_MODE_8023AD) { const struct aggregator *agg @@ -3326,7 +3323,7 @@ static int bond_do_ioctl(struct net_devi switch (cmd) { case BOND_ENSLAVE_OLD: case SIOCBONDENSLAVE: - res = bond_enslave(bond_dev, slave_dev); + res = bond_enslave(bond_dev, slave_dev, app_abi_ver); break; case BOND_RELEASE_OLD: case SIOCBONDRELEASE: From amir.noam@intel.com Thu Jan 8 08:24:34 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 08 Jan 2004 08:24:47 -0800 (PST) Received: from hermes.iil.intel.com (hermes.iil.intel.com [192.198.152.99]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i08GOVTa011329 for ; Thu, 8 Jan 2004 08:24:33 -0800 Received: from petasus.iil.intel.com (petasus.iil.intel.com [143.185.77.3]) by hermes.iil.intel.com (8.12.9-20030918-01/8.12.9/d: large-outer.mc,v 1.6 2003/12/18 18:57:17 root Exp $) with ESMTP id i08GOHek012540; Thu, 8 Jan 2004 16:24:17 GMT Received: from hasmsxvs01.iil.intel.com (hasmsxvs01.iil.intel.com [143.185.63.58]) by 
petasus.iil.intel.com (8.12.9-20030918-01/8.12.9/d: large-inner.mc,v 1.8 2003/12/18 18:57:16 root Exp $) with SMTP id i08GNpxf001065; Thu, 8 Jan 2004 16:24:17 GMT Received: from sun111.npdj.intel.com ([10.12.254.111]) by hasmsxvs01.iil.intel.com (SAVSMTP 3.1.2.35) with SMTP id M2004010818241619202 ; Thu, 08 Jan 2004 18:24:16 +0200 Received: from jrslxjul4.npdj.intel.com (jrslxjul4 [10.12.220.54]) by sun111.npdj.intel.com (8.12.10/8.12.9/MailSET/Hub) with ESMTP id i08GOFhb003904; Thu, 8 Jan 2004 18:24:16 +0200 (IST) From: Amir Noam To: "Jeff Garzik" , "Jay Vosburgh" Subject: [PATCH 3/4] [bonding 2.4] Add support for the bond_hook in bonding Date: Thu, 8 Jan 2004 18:24:14 +0200 User-Agent: KMail/1.5.3 Cc: , MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200401081824.15833.amir.noam@intel.com> X-Scanned-By: MIMEDefang 2.31 (www . roaringpenguin . com / mimedefang) X-archive-position: 2276 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: amir.noam@intel.com Precedence: bulk X-list: netdev Content-Length: 3246 Lines: 126 Add support for the bond_hook in bonding, and use it to export some parameters to the calling app. These parameters will be needed later by the application for dynamic configuration of bonding interfaces. 
diff -Nuarp a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c --- a/drivers/net/bonding/bond_main.c Thu Jan 8 18:03:23 2004 +++ b/drivers/net/bonding/bond_main.c Thu Jan 8 18:03:25 2004 @@ -599,6 +599,7 @@ static struct bond_parm_tbl bond_mode_tb /*-------------------------- Forward declarations ---------------------------*/ +extern void bond_ioctl_set(int (*hook)(unsigned long)); static inline void bond_set_mode_ops(struct net_device *bond_dev, int mode); /*---------------------------- General routines -----------------------------*/ @@ -3899,6 +3900,57 @@ static void bond_free_all(void) #endif } +static int bond_ioctl_deviceless(unsigned long arg) +{ + void *addr = (void *)arg; + u32 cmd; + + if (!capable(CAP_NET_ADMIN)) { + return -EPERM; + } + + if (get_user(cmd, (u32 *)addr)) { + return -EFAULT; + } + + switch (cmd) { + case BOND_CMD_DRV_INFO: { + struct bond_ioctl_drv_info drvinfo; + + if (copy_from_user(&drvinfo, addr, sizeof(drvinfo))) { + return -EFAULT; + } + + /* This is for backward compatibility only. + * Unconditionaly set both global abi_ver vars so we can block + * old ioctls in bond_do_ioctl(). 
+ */ + orig_app_abi_ver = drvinfo.abi_ver; + app_abi_ver = drvinfo.abi_ver; + + drvinfo.abi_ver = BOND_ABI_VERSION; + drvinfo.num_prms = 0; + drvinfo.num_arp_targets = BOND_MAX_ARP_TARGETS; + + if (copy_to_user(addr, &drvinfo, sizeof(drvinfo))) { + return -EFAULT; + } + + break; + } + + /* TODO: implement dynamic add/del of bond interfaces + case BOND_CMD_ADD_BOND: + case BOND_CMD_DEL_BOND: + */ + + default: + return -EOPNOTSUPP; + } + + return 0; +} + /*------------------------- Module initialization ---------------------------*/ /* @@ -4220,6 +4272,8 @@ static int __init bonding_init(void) } rtnl_unlock(); + + bond_ioctl_set(bond_ioctl_deviceless); register_netdevice_notifier(&bond_netdev_notifier); return 0; @@ -4236,6 +4290,7 @@ out_err: static void __exit bonding_exit(void) { unregister_netdevice_notifier(&bond_netdev_notifier); + bond_ioctl_set(NULL); rtnl_lock(); bond_free_all(); diff -Nuarp a/include/linux/if_bonding.h b/include/linux/if_bonding.h --- a/include/linux/if_bonding.h Thu Jan 8 18:03:23 2004 +++ b/include/linux/if_bonding.h Thu Jan 8 18:03:25 2004 @@ -103,6 +103,30 @@ struct ad_info { __u8 partner_system[ETH_ALEN]; }; + +/* + * The following are the available command codes for the SIOCBONDING and + * SIOCBONDDEVICE ioctls. The command codes are the first u32 value of the + * passed struct. The second u32 value is the ABI version of the application. + */ +#define BOND_CMD_DRV_INFO 0x00000001 +#define BOND_CMD_ADD_BOND 0x00000002 +#define BOND_CMD_DEL_BOND 0x00000003 + +/** + * bond_ioctl_drv_info + * + * Used by the %BOND_CMD_DRV_INFO command to retrieve some parameters of the + * bonding driver. 
+ */ +struct bond_ioctl_drv_info { + __u32 cmd; + __u32 abi_ver; + __u32 num_prms; + __u32 num_arp_targets; + char reserved[32]; +}; + #endif /* _LINUX_IF_BONDING_H */ /* From amir.noam@intel.com Thu Jan 8 08:26:13 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 08 Jan 2004 08:26:26 -0800 (PST) Received: from hermes.iil.intel.com (hermes.iil.intel.com [192.198.152.99]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i08GQATa011978 for ; Thu, 8 Jan 2004 08:26:12 -0800 Received: from petasus.iil.intel.com (petasus.iil.intel.com [143.185.77.3]) by hermes.iil.intel.com (8.12.9-20030918-01/8.12.9/d: large-outer.mc,v 1.6 2003/12/18 18:57:17 root Exp $) with ESMTP id i08GPuek012721; Thu, 8 Jan 2004 16:25:56 GMT Received: from hasmsxvs01.iil.intel.com (hasmsxvs01.iil.intel.com [143.185.63.58]) by petasus.iil.intel.com (8.12.9-20030918-01/8.12.9/d: large-inner.mc,v 1.8 2003/12/18 18:57:16 root Exp $) with SMTP id i08GPtxV001284; Thu, 8 Jan 2004 16:25:55 GMT Received: from sun111.npdj.intel.com ([10.12.254.111]) by hasmsxvs01.iil.intel.com (SAVSMTP 3.1.2.35) with SMTP id M2004010818255519237 ; Thu, 08 Jan 2004 18:25:55 +0200 Received: from jrslxjul4.npdj.intel.com (jrslxjul4 [10.12.220.54]) by sun111.npdj.intel.com (8.12.10/8.12.9/MailSET/Hub) with ESMTP id i08GPshb003949; Thu, 8 Jan 2004 18:25:55 +0200 (IST) From: Amir Noam To: "Jeff Garzik" , "Jay Vosburgh" Subject: [PATCH 4/4] [bonding 2.4] Support old commands over new bonding ioctl Date: Thu, 8 Jan 2004 18:25:53 +0200 User-Agent: KMail/1.5.3 Cc: , MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200401081825.54932.amir.noam@intel.com> X-Scanned-By: MIMEDefang 2.31 (www . roaringpenguin . 
com / mimedefang) X-archive-position: 2277 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: amir.noam@intel.com Precedence: bulk X-list: netdev Content-Length: 4415 Lines: 177 Support old commands (enslave, release, change-active) over the new bonding ioctl (SIOCBONDDEVICE). diff -Nuarp a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c --- a/drivers/net/bonding/bond_main.c Thu Jan 8 18:03:27 2004 +++ b/drivers/net/bonding/bond_main.c Thu Jan 8 18:03:29 2004 @@ -1955,6 +1955,86 @@ static int bond_ethtool_ioctl(struct net } } +static int bond_ioctl_slave_dev(struct bonding *bond, int cmd, void *addr) +{ + struct bond_ioctl_cmd bond_cmd; + struct net_device *slave_dev; + int prev_abi_ver = app_abi_ver; + int prev_orig_abi_ver = orig_app_abi_ver; + int res = 0; + + if (copy_from_user(&bond_cmd, addr, sizeof(bond_cmd))) { + return -EFAULT; + } + + bond_cmd.ifname[IFNAMSIZ - 1] = 0; + + slave_dev = dev_get_by_name(bond_cmd.ifname); + if (!slave_dev) { + return -ENODEV; + } + + /* This is for backward compatibility only. + * Unconditionaly set both global abi_ver vars so we can block + * old ioctls in bond_do_ioctl(). + */ + orig_app_abi_ver = bond_cmd.abi_ver; + app_abi_ver = bond_cmd.abi_ver; + + switch (cmd) { + case BOND_CMD_ENSLAVE: + res = bond_enslave(bond->dev, slave_dev, bond_cmd.abi_ver); + break; + + case BOND_CMD_RELEASE: + res = bond_release(bond->dev, slave_dev); + break; + + case BOND_CMD_CHANGE_ACTIVE: + res = bond_ioctl_change_active(bond->dev, slave_dev); + break; + + default: + res = -EOPNOTSUPP; + break; + } + + dev_put(slave_dev); + + if (res < 0) { + /* The ioctl failed, so there's no point in changing the + * orig_app_abi_ver. We'll restore it's value just in case + * we've changed it earlier in this function. 
+ */ + app_abi_ver = prev_abi_ver; + orig_app_abi_ver = prev_orig_abi_ver; + } + + return res; +} + +static int bond_ioctl_device(struct bonding *bond, void *addr) +{ + u32 cmd; + + if (get_user(cmd, (u32 *) addr)) { + return -EFAULT; + } + + switch (cmd) { + case BOND_CMD_ENSLAVE: + case BOND_CMD_RELEASE: + case BOND_CMD_CHANGE_ACTIVE: + /* these ioctl cmds receive a slave name as an arg */ + return bond_ioctl_slave_dev(bond, cmd, addr); + + default: + return -EOPNOTSUPP; + } + + return 0; +} + static int bond_info_query(struct net_device *bond_dev, struct ifbond *info) { struct bonding *bond = bond_dev->priv; @@ -3214,6 +3294,7 @@ static struct net_device_stats *bond_get static int bond_do_ioctl(struct net_device *bond_dev, struct ifreq *ifr, int cmd) { + struct bonding *bond = bond_dev->priv; struct net_device *slave_dev = NULL; struct ifbond *u_binfo = NULL, k_binfo; struct ifslave *u_sinfo = NULL, k_sinfo; @@ -3224,6 +3305,10 @@ static int bond_do_ioctl(struct net_devi dprintk("bond_ioctl: master=%s, cmd=%d\n", bond_dev->name, cmd); + if (!capable(CAP_NET_ADMIN)) { + return -EPERM; + } + switch (cmd) { case SIOCETHTOOL: return bond_ethtool_ioctl(bond_dev, ifr); @@ -3245,7 +3330,6 @@ static int bond_do_ioctl(struct net_devi } if (mii->reg_num == 1) { - struct bonding *bond = bond_dev->priv; mii->val_out = 0; read_lock_bh(&bond->lock); read_lock(&bond->curr_slave_lock); @@ -3289,13 +3373,19 @@ static int bond_do_ioctl(struct net_devi } return res; + case SIOCBONDDEVICE: + return bond_ioctl_device(bond, ifr->ifr_data); + default: /* Go on */ break; } - if (!capable(CAP_NET_ADMIN)) { - return -EPERM; + if (orig_app_abi_ver > 2) { + /* Refuse to support old ioctls if the app has already + * declared it is new enough for SIOCBONDDEVICE commands. 
+ */ + return -EOPNOTSUPP; } if (orig_app_abi_ver == -1) { diff -Nuarp a/include/linux/if_bonding.h b/include/linux/if_bonding.h --- a/include/linux/if_bonding.h Thu Jan 8 18:03:27 2004 +++ b/include/linux/if_bonding.h Thu Jan 8 18:03:29 2004 @@ -112,6 +112,9 @@ struct ad_info { #define BOND_CMD_DRV_INFO 0x00000001 #define BOND_CMD_ADD_BOND 0x00000002 #define BOND_CMD_DEL_BOND 0x00000003 +#define BOND_CMD_ENSLAVE 0x00000004 +#define BOND_CMD_RELEASE 0x00000005 +#define BOND_CMD_CHANGE_ACTIVE 0x00000006 /** * bond_ioctl_drv_info @@ -127,6 +130,19 @@ struct bond_ioctl_drv_info { char reserved[32]; }; +/** + * bond_ioctl_cmd + * + * %BOND_CMD_ENSLAVE, %BOND_CMD_RELEASE and %BOND_CMD_CHANGE_ACTIVE pass the + * name of the slave to work on in @ifname. + */ +struct bond_ioctl_cmd { + __u32 cmd; + __u32 abi_ver; + __u32 num_prms; + char ifname[IFNAMSIZ]; +}; + #endif /* _LINUX_IF_BONDING_H */ /* From amir.noam@intel.com Thu Jan 8 08:29:13 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 08 Jan 2004 08:29:26 -0800 (PST) Received: from hermes.iil.intel.com (hermes.iil.intel.com [192.198.152.99]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i08GT9Ta012484 for ; Thu, 8 Jan 2004 08:29:11 -0800 Received: from petasus.iil.intel.com (petasus.iil.intel.com [143.185.77.3]) by hermes.iil.intel.com (8.12.9-20030918-01/8.12.9/d: large-outer.mc,v 1.6 2003/12/18 18:57:17 root Exp $) with ESMTP id i08GStek012962; Thu, 8 Jan 2004 16:28:55 GMT Received: from hasmsxvs01.iil.intel.com (hasmsxvs01.iil.intel.com [143.185.63.58]) by petasus.iil.intel.com (8.12.9-20030918-01/8.12.9/d: large-inner.mc,v 1.8 2003/12/18 18:57:16 root Exp $) with SMTP id i08GStxV001443; Thu, 8 Jan 2004 16:28:55 GMT Received: from sun111.npdj.intel.com ([10.12.254.111]) by hasmsxvs01.iil.intel.com (SAVSMTP 3.1.2.35) with SMTP id M2004010818285519276 ; Thu, 08 Jan 2004 18:28:55 +0200 Received: from jrslxjul4.npdj.intel.com (jrslxjul4 [10.12.220.54]) by sun111.npdj.intel.com (8.12.10/8.12.9/MailSET/Hub) with 
ESMTP id i08GSthb004000; Thu, 8 Jan 2004 18:28:55 +0200 (IST) From: Amir Noam To: "Jeff Garzik" , "Jay Vosburgh" Subject: [PATCH 2/4] [bonding 2.6] Reduce usage of the global value of abi_ver Date: Thu, 8 Jan 2004 18:28:53 +0200 User-Agent: KMail/1.5.3 Cc: , MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200401081828.54991.amir.noam@intel.com> X-Scanned-By: MIMEDefang 2.31 (www . roaringpenguin . com / mimedefang) X-archive-position: 2278 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: amir.noam@intel.com Precedence: bulk X-list: netdev Content-Length: 6428 Lines: 190 - Reduce usage of the global values of the ABI version received from the application. Instead, pass it as a function argument where needed. - Save a new slave's original HW address regardless of ABI version. - Move the clearing of the bond's address and some references to the bond's params structure so they are protected by the relevant locks. diff -Nuarp a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c --- a/drivers/net/bonding/bond_main.c Thu Jan 8 18:06:45 2004 +++ b/drivers/net/bonding/bond_main.c Thu Jan 8 18:06:46 2004 @@ -561,6 +561,11 @@ static int arp_ip_count = 0; static u32 my_ip = 0; static int bond_mode = BOND_MODE_ROUNDROBIN; static int lacp_fast = 0; + +/* The global abi_ver vars are only for providing backward compatibility with + * versions that locked bonding into using only the first abi_ver it has seen + * from userspace. + */ static int app_abi_ver = 0; static int orig_app_abi_ver = -1; /* This is used to save the first ABI version * we receive from the application.
Once set, @@ -1207,7 +1212,7 @@ static int bond_sethwaddr(struct net_dev } /* enslave device to bond device */ -static int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev) +static int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev, int abi_ver) { struct bonding *bond = bond_dev->priv; struct slave *new_slave = NULL; @@ -1234,7 +1239,7 @@ static int bond_enslave(struct net_devic return -EBUSY; } - if (app_abi_ver >= 1) { + if (abi_ver >= 1) { /* The application is using an ABI, which requires the * slave interface to be closed. */ @@ -1289,13 +1294,12 @@ static int bond_enslave(struct net_devic */ new_slave->original_flags = slave_dev->flags; - if (app_abi_ver >= 1) { - /* save slave's original ("permanent") mac address for - * modes that needs it, and for restoring it upon release, - * and then set it to the master's address - */ - memcpy(new_slave->perm_hwaddr, slave_dev->dev_addr, ETH_ALEN); + /* save slave's original ("permanent") mac address for restoring it + * upon release + */ + memcpy(new_slave->perm_hwaddr, slave_dev->dev_addr, ETH_ALEN); + if (abi_ver >= 1) { /* set slave to master's mac address * The application already set the master's * mac address to that of the first slave @@ -1319,7 +1323,7 @@ static int bond_enslave(struct net_devic res = netdev_set_master(slave_dev, bond_dev); if (res) { dprintk("Error %d calling netdev_set_master\n", res); - if (app_abi_ver < 1) { + if (abi_ver < 1) { goto err_free; } else { goto err_close; @@ -1520,7 +1524,7 @@ static int bond_enslave(struct net_devic write_unlock_bh(&bond->lock); - if (app_abi_ver < 1) { + if (abi_ver < 1) { /* * !!! This is to support old versions of ifenslave. 
* We can remove this in 2.5 because our ifenslave takes @@ -1689,6 +1693,14 @@ static int bond_release(struct net_devic } } + if (bond->slave_cnt == 0) { + /* if the last slave was removed, zero the mac address + * of the master so it will be set by the application + * to the mac address of the first slave + */ + memset(bond_dev->dev_addr, 0, bond_dev->addr_len); + } + write_unlock_bh(&bond->lock); /* If the mode USES_PRIMARY, then we should only remove its @@ -1715,12 +1727,10 @@ static int bond_release(struct net_devic /* close slave before restoring its mac address */ dev_close(slave_dev); - if (app_abi_ver >= 1) { - /* restore original ("permanent") mac address */ - memcpy(addr.sa_data, slave->perm_hwaddr, ETH_ALEN); - addr.sa_family = slave_dev->type; - slave_dev->set_mac_address(slave_dev, &addr); - } + /* restore original ("permanent") mac address */ + memcpy(addr.sa_data, slave->perm_hwaddr, ETH_ALEN); + addr.sa_family = slave_dev->type; + slave_dev->set_mac_address(slave_dev, &addr); /* restore the original state of the * IFF_NOARP flag that might have been @@ -1732,14 +1742,6 @@ static int bond_release(struct net_devic kfree(slave); - /* if the last slave was removed, zero the mac address - * of the master so it will be set by the application - * to the mac address of the first slave - */ - if (bond->slave_cnt == 0) { - memset(bond_dev->dev_addr, 0, bond_dev->addr_len); - } - return 0; /* deletion OK */ } @@ -1812,12 +1814,10 @@ static int bond_release_all(struct net_d /* close slave before restoring its mac address */ dev_close(slave_dev); - if (app_abi_ver >= 1) { - /* restore original ("permanent") mac address*/ - memcpy(addr.sa_data, slave->perm_hwaddr, ETH_ALEN); - addr.sa_family = slave_dev->type; - slave_dev->set_mac_address(slave_dev, &addr); - } + /* restore original ("permanent") mac address*/ + memcpy(addr.sa_data, slave->perm_hwaddr, ETH_ALEN); + addr.sa_family = slave_dev->type; + slave_dev->set_mac_address(slave_dev, &addr); /* restore the 
original state of the IFF_NOARP flag that might have * been set by bond_set_slave_inactive_flags() @@ -1958,10 +1958,9 @@ static int bond_info_query(struct net_de { struct bonding *bond = bond_dev->priv; + read_lock_bh(&bond->lock); info->bond_mode = bond->params.mode; info->miimon = bond->params.miimon; - - read_lock_bh(&bond->lock); info->num_slaves = bond->slave_cnt; read_unlock_bh(&bond->lock); @@ -2754,16 +2753,14 @@ static void bond_info_show_slave(struct seq_printf(seq, "Link Failure Count: %d\n", slave->link_failure_count); - if (app_abi_ver >= 1) { - seq_printf(seq, - "Permanent HW addr: %02x:%02x:%02x:%02x:%02x:%02x\n", - slave->perm_hwaddr[0], - slave->perm_hwaddr[1], - slave->perm_hwaddr[2], - slave->perm_hwaddr[3], - slave->perm_hwaddr[4], - slave->perm_hwaddr[5]); - } + seq_printf(seq, + "Permanent HW addr: %02x:%02x:%02x:%02x:%02x:%02x\n", + slave->perm_hwaddr[0], + slave->perm_hwaddr[1], + slave->perm_hwaddr[2], + slave->perm_hwaddr[3], + slave->perm_hwaddr[4], + slave->perm_hwaddr[5]); if (bond->params.mode == BOND_MODE_8023AD) { const struct aggregator *agg @@ -3325,7 +3322,7 @@ static int bond_do_ioctl(struct net_devi switch (cmd) { case BOND_ENSLAVE_OLD: case SIOCBONDENSLAVE: - res = bond_enslave(bond_dev, slave_dev); + res = bond_enslave(bond_dev, slave_dev, app_abi_ver); break; case BOND_RELEASE_OLD: case SIOCBONDRELEASE: From amir.noam@intel.com Thu Jan 8 08:30:12 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 08 Jan 2004 08:30:27 -0800 (PST) Received: from hermes.iil.intel.com (hermes.iil.intel.com [192.198.152.99]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i08GU9Ta012543 for ; Thu, 8 Jan 2004 08:30:11 -0800 Received: from petasus.iil.intel.com (petasus.iil.intel.com [143.185.77.3]) by hermes.iil.intel.com (8.12.9-20030918-01/8.12.9/d: large-outer.mc,v 1.6 2003/12/18 18:57:17 root Exp $) with ESMTP id i08GTtek013040; Thu, 8 Jan 2004 16:29:55 GMT Received: from hasmsxvs01.iil.intel.com (hasmsxvs01.iil.intel.com [143.185.63.58]) by 
petasus.iil.intel.com (8.12.9-20030918-01/8.12.9/d: large-inner.mc,v 1.8 2003/12/18 18:57:16 root Exp $) with SMTP id i08GTsxV001508; Thu, 8 Jan 2004 16:29:55 GMT Received: from sun111.npdj.intel.com ([10.12.254.111]) by hasmsxvs01.iil.intel.com (SAVSMTP 3.1.2.35) with SMTP id M2004010818295419288 ; Thu, 08 Jan 2004 18:29:54 +0200 Received: from jrslxjul4.npdj.intel.com (jrslxjul4 [10.12.220.54]) by sun111.npdj.intel.com (8.12.10/8.12.9/MailSET/Hub) with ESMTP id i08GTshb004033; Thu, 8 Jan 2004 18:29:54 +0200 (IST) From: Amir Noam To: "Jeff Garzik" , "Jay Vosburgh" Subject: [PATCH 3/4] [bonding 2.6] Add support for the bond_hook in bonding Date: Thu, 8 Jan 2004 18:29:53 +0200 User-Agent: KMail/1.5.3 Cc: , MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200401081829.54372.amir.noam@intel.com> X-Scanned-By: MIMEDefang 2.31 (www . roaringpenguin . com / mimedefang) X-archive-position: 2279 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: amir.noam@intel.com Precedence: bulk X-list: netdev Content-Length: 3246 Lines: 126 Add support for the bond_hook in bonding, and use it to export some parameters to the calling app. These parameters will be needed later by the application for dynamic configuration of bonding interfaces. 
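The command convention used by the hook (the command code is the first u32 of the struct passed in, the ABI version the second) can be modelled in plain userspace C. The sketch below is only an illustration, not the driver code: memcpy() stands in for get_user()/copy_from_user()/copy_to_user(), the DEMO_* values are assumptions, and demo_ioctl_deviceless is a made-up name.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Command codes and struct layout as given in the if_bonding.h hunk of this patch. */
#define BOND_CMD_DRV_INFO 0x00000001
#define BOND_CMD_ADD_BOND 0x00000002
#define BOND_CMD_DEL_BOND 0x00000003

struct bond_ioctl_drv_info {
	uint32_t cmd;
	uint32_t abi_ver;
	uint32_t num_prms;
	uint32_t num_arp_targets;
	char reserved[32];
};

/* Assumed values, for illustration only; the real ones live in the driver. */
#define DEMO_ABI_VERSION     3
#define DEMO_MAX_ARP_TARGETS 16

/* Userspace model of a deviceless handler: peek at the first u32 to find
 * the command, then copy the full command-specific struct in and out.
 */
static int demo_ioctl_deviceless(void *arg)
{
	struct bond_ioctl_drv_info info;
	uint32_t cmd;

	assert(arg != NULL);
	memcpy(&cmd, arg, sizeof(cmd));	/* stands in for get_user() */

	switch (cmd) {
	case BOND_CMD_DRV_INFO:
		memcpy(&info, arg, sizeof(info));	/* copy_from_user() */
		info.abi_ver = DEMO_ABI_VERSION;	/* report the driver's ABI */
		info.num_prms = 0;
		info.num_arp_targets = DEMO_MAX_ARP_TARGETS;
		memcpy(arg, &info, sizeof(info));	/* copy_to_user() */
		return 0;
	default:
		/* ADD_BOND/DEL_BOND are not implemented in this model */
		return -EOPNOTSUPP;
	}
}
```

A caller fills in cmd and its own abi_ver, passes the struct, and reads the driver's answer back from the same buffer.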
diff -Nuarp a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c --- a/drivers/net/bonding/bond_main.c Thu Jan 8 18:06:49 2004 +++ b/drivers/net/bonding/bond_main.c Thu Jan 8 18:06:50 2004 @@ -599,6 +599,7 @@ static struct bond_parm_tbl bond_mode_tb /*-------------------------- Forward declarations ---------------------------*/ +extern void bond_ioctl_set(int (*hook)(unsigned long)); static inline void bond_set_mode_ops(struct net_device *bond_dev, int mode); /*---------------------------- General routines -----------------------------*/ @@ -3898,6 +3899,57 @@ static void bond_free_all(void) #endif } +static int bond_ioctl_deviceless(unsigned long arg) +{ + void *addr = (void *)arg; + u32 cmd; + + if (!capable(CAP_NET_ADMIN)) { + return -EPERM; + } + + if (get_user(cmd, (u32 *)addr)) { + return -EFAULT; + } + + switch (cmd) { + case BOND_CMD_DRV_INFO: { + struct bond_ioctl_drv_info drvinfo; + + if (copy_from_user(&drvinfo, addr, sizeof(drvinfo))) { + return -EFAULT; + } + + /* This is for backward compatibility only. + * Unconditionaly set both global abi_ver vars so we can block + * old ioctls in bond_do_ioctl(). 
+ */ + orig_app_abi_ver = drvinfo.abi_ver; + app_abi_ver = drvinfo.abi_ver; + + drvinfo.abi_ver = BOND_ABI_VERSION; + drvinfo.num_prms = 0; + drvinfo.num_arp_targets = BOND_MAX_ARP_TARGETS; + + if (copy_to_user(addr, &drvinfo, sizeof(drvinfo))) { + return -EFAULT; + } + + break; + } + + /* TODO: implement dynamic add/del of bond interfaces + case BOND_CMD_ADD_BOND: + case BOND_CMD_DEL_BOND: + */ + + default: + return -EOPNOTSUPP; + } + + return 0; +} + /*------------------------- Module initialization ---------------------------*/ /* @@ -4219,6 +4271,8 @@ static int __init bonding_init(void) } rtnl_unlock(); + + bond_ioctl_set(bond_ioctl_deviceless); register_netdevice_notifier(&bond_netdev_notifier); return 0; @@ -4235,6 +4289,7 @@ out_err: static void __exit bonding_exit(void) { unregister_netdevice_notifier(&bond_netdev_notifier); + bond_ioctl_set(NULL); rtnl_lock(); bond_free_all(); diff -Nuarp a/include/linux/if_bonding.h b/include/linux/if_bonding.h --- a/include/linux/if_bonding.h Thu Jan 8 18:06:49 2004 +++ b/include/linux/if_bonding.h Thu Jan 8 18:06:50 2004 @@ -103,6 +103,30 @@ struct ad_info { __u8 partner_system[ETH_ALEN]; }; + +/* + * The following are the available command codes for the SIOCBONDING and + * SIOCBONDDEVICE ioctls. The command codes are the first u32 value of the + * passed struct. The second u32 value is the ABI version of the application. + */ +#define BOND_CMD_DRV_INFO 0x00000001 +#define BOND_CMD_ADD_BOND 0x00000002 +#define BOND_CMD_DEL_BOND 0x00000003 + +/** + * bond_ioctl_drv_info + * + * Used by the %BOND_CMD_DRV_INFO command to retrieve some parameters of the + * bonding driver. 
+ */ +struct bond_ioctl_drv_info { + __u32 cmd; + __u32 abi_ver; + __u32 num_prms; + __u32 num_arp_targets; + char reserved[32]; +}; + #endif /* _LINUX_IF_BONDING_H */ /* From amir.noam@intel.com Thu Jan 8 08:31:16 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 08 Jan 2004 08:31:32 -0800 (PST) Received: from hermes.iil.intel.com (hermes.iil.intel.com [192.198.152.99]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i08GVDTa012934 for ; Thu, 8 Jan 2004 08:31:15 -0800 Received: from petasus.iil.intel.com (petasus.iil.intel.com [143.185.77.3]) by hermes.iil.intel.com (8.12.9-20030918-01/8.12.9/d: large-outer.mc,v 1.6 2003/12/18 18:57:17 root Exp $) with ESMTP id i08GUwek013210; Thu, 8 Jan 2004 16:30:58 GMT Received: from hasmsxvs01.iil.intel.com (hasmsxvs01.iil.intel.com [143.185.63.58]) by petasus.iil.intel.com (8.12.9-20030918-01/8.12.9/d: large-inner.mc,v 1.8 2003/12/18 18:57:16 root Exp $) with SMTP id i08GUwxX001629; Thu, 8 Jan 2004 16:30:58 GMT Received: from sun111.npdj.intel.com ([10.12.254.111]) by hasmsxvs01.iil.intel.com (SAVSMTP 3.1.2.35) with SMTP id M2004010818305719342 ; Thu, 08 Jan 2004 18:30:57 +0200 Received: from jrslxjul4.npdj.intel.com (jrslxjul4 [10.12.220.54]) by sun111.npdj.intel.com (8.12.10/8.12.9/MailSET/Hub) with ESMTP id i08GUthb004074; Thu, 8 Jan 2004 18:30:56 +0200 (IST) From: Amir Noam To: "Jeff Garzik" , "Jay Vosburgh" Subject: [PATCH 4/4] [bonding 2.6] Support old commands over new bonding ioctl Date: Thu, 8 Jan 2004 18:30:55 +0200 User-Agent: KMail/1.5.3 Cc: , MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200401081830.55891.amir.noam@intel.com> X-Scanned-By: MIMEDefang 2.31 (www . roaringpenguin . 
com / mimedefang) X-archive-position: 2280 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: amir.noam@intel.com Precedence: bulk X-list: netdev Content-Length: 4415 Lines: 177 Support old commands (enslave, release, change-active) over the new bonding ioctl (SIOCBONDDEVICE). diff -Nuarp a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c --- a/drivers/net/bonding/bond_main.c Thu Jan 8 18:06:53 2004 +++ b/drivers/net/bonding/bond_main.c Thu Jan 8 18:06:54 2004 @@ -1955,6 +1955,86 @@ static int bond_ethtool_ioctl(struct net } } +static int bond_ioctl_slave_dev(struct bonding *bond, int cmd, void *addr) +{ + struct bond_ioctl_cmd bond_cmd; + struct net_device *slave_dev; + int prev_abi_ver = app_abi_ver; + int prev_orig_abi_ver = orig_app_abi_ver; + int res = 0; + + if (copy_from_user(&bond_cmd, addr, sizeof(bond_cmd))) { + return -EFAULT; + } + + bond_cmd.ifname[IFNAMSIZ - 1] = 0; + + slave_dev = dev_get_by_name(bond_cmd.ifname); + if (!slave_dev) { + return -ENODEV; + } + + /* This is for backward compatibility only. + * Unconditionaly set both global abi_ver vars so we can block + * old ioctls in bond_do_ioctl(). + */ + orig_app_abi_ver = bond_cmd.abi_ver; + app_abi_ver = bond_cmd.abi_ver; + + switch (cmd) { + case BOND_CMD_ENSLAVE: + res = bond_enslave(bond->dev, slave_dev, bond_cmd.abi_ver); + break; + + case BOND_CMD_RELEASE: + res = bond_release(bond->dev, slave_dev); + break; + + case BOND_CMD_CHANGE_ACTIVE: + res = bond_ioctl_change_active(bond->dev, slave_dev); + break; + + default: + res = -EOPNOTSUPP; + break; + } + + dev_put(slave_dev); + + if (res < 0) { + /* The ioctl failed, so there's no point in changing the + * orig_app_abi_ver. We'll restore it's value just in case + * we've changed it earlier in this function. 
+ */ + app_abi_ver = prev_abi_ver; + orig_app_abi_ver = prev_orig_abi_ver; + } + + return res; +} + +static int bond_ioctl_device(struct bonding *bond, void *addr) +{ + u32 cmd; + + if (get_user(cmd, (u32 *) addr)) { + return -EFAULT; + } + + switch (cmd) { + case BOND_CMD_ENSLAVE: + case BOND_CMD_RELEASE: + case BOND_CMD_CHANGE_ACTIVE: + /* these ioctl cmds receive a slave name as an arg */ + return bond_ioctl_slave_dev(bond, cmd, addr); + + default: + return -EOPNOTSUPP; + } + + return 0; +} + static int bond_info_query(struct net_device *bond_dev, struct ifbond *info) { struct bonding *bond = bond_dev->priv; @@ -3213,6 +3293,7 @@ static struct net_device_stats *bond_get static int bond_do_ioctl(struct net_device *bond_dev, struct ifreq *ifr, int cmd) { + struct bonding *bond = bond_dev->priv; struct net_device *slave_dev = NULL; struct ifbond *u_binfo = NULL, k_binfo; struct ifslave *u_sinfo = NULL, k_sinfo; @@ -3223,6 +3304,10 @@ static int bond_do_ioctl(struct net_devi dprintk("bond_ioctl: master=%s, cmd=%d\n", bond_dev->name, cmd); + if (!capable(CAP_NET_ADMIN)) { + return -EPERM; + } + switch (cmd) { case SIOCETHTOOL: return bond_ethtool_ioctl(bond_dev, ifr); @@ -3244,7 +3329,6 @@ static int bond_do_ioctl(struct net_devi } if (mii->reg_num == 1) { - struct bonding *bond = bond_dev->priv; mii->val_out = 0; read_lock_bh(&bond->lock); read_lock(&bond->curr_slave_lock); @@ -3288,13 +3372,19 @@ static int bond_do_ioctl(struct net_devi } return res; + case SIOCBONDDEVICE: + return bond_ioctl_device(bond, ifr->ifr_data); + default: /* Go on */ break; } - if (!capable(CAP_NET_ADMIN)) { - return -EPERM; + if (orig_app_abi_ver > 2) { + /* Refuse to support old ioctls if the app has already + * declared it is new enough for SIOCBONDDEVICE commands. 
+ */ + return -EOPNOTSUPP; } if (orig_app_abi_ver == -1) { diff -Nuarp a/include/linux/if_bonding.h b/include/linux/if_bonding.h --- a/include/linux/if_bonding.h Thu Jan 8 18:06:53 2004 +++ b/include/linux/if_bonding.h Thu Jan 8 18:06:54 2004 @@ -112,6 +112,9 @@ struct ad_info { #define BOND_CMD_DRV_INFO 0x00000001 #define BOND_CMD_ADD_BOND 0x00000002 #define BOND_CMD_DEL_BOND 0x00000003 +#define BOND_CMD_ENSLAVE 0x00000004 +#define BOND_CMD_RELEASE 0x00000005 +#define BOND_CMD_CHANGE_ACTIVE 0x00000006 /** * bond_ioctl_drv_info @@ -127,6 +130,19 @@ struct bond_ioctl_drv_info { char reserved[32]; }; +/** + * bond_ioctl_cmd + * + * %BOND_CMD_ENSLAVE, %BOND_CMD_RELEASE and %BOND_CMD_CHANGE_ACTIVE pass the + * name of the slave to work on in @ifname. + */ +struct bond_ioctl_cmd { + __u32 cmd; + __u32 abi_ver; + __u32 num_prms; + char ifname[IFNAMSIZ]; +}; + #endif /* _LINUX_IF_BONDING_H */ /* From per@hedeland.org Thu Jan 8 08:44:06 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 08 Jan 2004 08:44:19 -0800 (PST) Received: from pluto.hedeland.org (as1-2-8.mal.s.bonet.se [194.236.4.19]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i08Gi4Ta014007 for ; Thu, 8 Jan 2004 08:44:05 -0800 Received: from pluto.hedeland.org (localhost [127.0.0.1]) by pluto.hedeland.org (8.12.10/8.12.10) with ESMTP id i08GhwI5059383 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Thu, 8 Jan 2004 17:43:58 +0100 (CET) Received: (from per@localhost) by pluto.hedeland.org (8.12.10/8.12.10/Submit) id i08GhwRP059382; Thu, 8 Jan 2004 17:43:58 +0100 (CET) Date: Thu, 8 Jan 2004 17:43:58 +0100 (CET) From: Per Hedeland Message-Id: <200401081643.i08GhwRP059382@pluto.hedeland.org> To: amir.noam@intel.com Cc: bonding-devel@lists.sourceforge.net, netdev@oss.sgi.com Subject: Re: [Bonding-devel] [PATCH] [bonding 2.4] Add balance-xor-ip bonding mode In-Reply-To: <200401081733.44744.amir.noam@intel.com> X-archive-position: 2281 X-ecartis-version: Ecartis v1.0.0 Sender: 
netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: per@hedeland.org Precedence: bulk X-list: netdev Content-Length: 875 Lines: 24 Amir Noam wrote: >Please use u32 instead of __u32. OK. >hmm... > >I don't like this. The reason we give different tx function pointers >to dev->hard_start_xmit in different bonding mode is to make the tx >path as fast as possible. Otherwise we might as well use a single tx >function that chooses its exact operation based on the bonding mode. > >It might be better to have some code duplication if it results in >faster tx, but I'm not sure what's the optimal solution in this case. Well, I don't really have an opinion since I don't have a good idea about the cost of a function call relative to "everything else" that is happening here. I don't see a way to do "limited" duplication without using function calls though, but I'm quite happy to make it two entirely separate functions for MAC vs IP. Please advise. --Per Hedeland per@hedeland.org From chas@cmf.nrl.navy.mil Thu Jan 8 08:58:52 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 08 Jan 2004 08:59:05 -0800 (PST) Received: from ginger.cmf.nrl.navy.mil (ginger.cmf.nrl.navy.mil [134.207.10.161]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i08GwpTa014932 for ; Thu, 8 Jan 2004 08:58:52 -0800 Received: from cmf.nrl.navy.mil (thirdoffive.cmf.nrl.navy.mil [134.207.10.180]) by ginger.cmf.nrl.navy.mil (8.12.7/8.12.7) with ESMTP id i08GwdRr017374; Thu, 8 Jan 2004 11:58:39 -0500 (EST) Message-Id: <200401081658.i08GwdRr017374@ginger.cmf.nrl.navy.mil> To: davem@redhat.com cc: netdev@oss.sgi.com, ajz@cambridgebroadband.com Subject: [PATCH][ATM]: br2684 incorrectly handles frames recvd with FCS (by Alex Zeffertt ) Reply-To: chas3@users.sourceforge.net Date: Thu, 08 Jan 2004 11:58:40 -0500 From: chas williams (contractor) X-Spam-Score: () hits=0.5 X-Virus-Scanned: NAI Completed X-Scanned-By: MIMEDefang 2.30 (www . roaringpenguin . 
com / mimedefang) X-archive-position: 2282 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: chas@cmf.nrl.navy.mil Precedence: bulk X-list: netdev Content-Length: 1076 Lines: 31 please apply to 2.6 (and 2.4 as well!) thanks # This is a BitKeeper generated patch for the following project: # Project Name: Linux kernel tree # This patch format is intended for GNU patch command version 2.5 or higher. # This patch includes the following deltas: # ChangeSet 1.1492 -> 1.1493 # net/atm/br2684.c 1.9 -> 1.10 # # The following is the BitKeeper ChangeSet Log # -------------------------------------------- # 04/01/07 chas@relax.cmf.nrl.navy.mil 1.1493 # [ATM]: br2684 incorrectly handles frames recvd with FCS (by Alex Zeffertt ) # -------------------------------------------- # diff -Nru a/net/atm/br2684.c b/net/atm/br2684.c --- a/net/atm/br2684.c Wed Jan 7 13:23:54 2004 +++ b/net/atm/br2684.c Wed Jan 7 13:23:54 2004 @@ -437,6 +437,10 @@ dev_kfree_skb(skb); return; } + + /* Strip FCS if present */ + if (skb->len > 7 && skb->data[7] == 0x01) + __skb_trim(skb, skb->len - 4); } else { plen = PADLEN + ETH_HLEN; /* pad, dstmac,srcmac, ethtype */ /* first 2 chars should be 0 */ From per@hedeland.org Thu Jan 8 08:59:29 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 08 Jan 2004 08:59:41 -0800 (PST) Received: from pluto.hedeland.org (as1-2-8.mal.s.bonet.se [194.236.4.19]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i08GxRTa014960 for ; Thu, 8 Jan 2004 08:59:28 -0800 Received: from pluto.hedeland.org (localhost [127.0.0.1]) by pluto.hedeland.org (8.12.10/8.12.10) with ESMTP id i08GwXI5059632 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Thu, 8 Jan 2004 17:58:33 +0100 (CET) Received: (from per@localhost) by pluto.hedeland.org (8.12.10/8.12.10/Submit) id i08GwXZV059631; Thu, 8 Jan 2004 17:58:33 +0100 (CET) Date: Thu, 8 Jan 2004 17:58:33 +0100 (CET) From: Per Hedeland Message-Id: 
<200401081658.i08GwXZV059631@pluto.hedeland.org> To: bonding-devel@lists.sourceforge.net, netdev@oss.sgi.com, tony.cureington@hp.com Subject: RE: [Bonding-devel] [PATCH] [bonding 2.6] Add balance-xor-ip bonding mode In-Reply-To: <72A87F7160C0994D8C5A36E2FDC227F502B3E9BE@txnexc01.americas.cpqcorp.net> X-archive-position: 2283 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: per@hedeland.org Precedence: bulk X-list: netdev Content-Length: 711 Lines: 13 "Cureington, Tony" wrote: > >I'm curious of the reasoning behind "u ^= (u >> 24) ^ (u >> 16) ^ (u >> 8);", what advantages does it have over using the xor'd addresses just before this line? Maybe someone loaded decaf on me this morning? :-/ The idea is to take all octets of the addresses into account (similar logic is already used in bond_alb.c btw). E.g. if the number of slaves is a power of 2 (2 or 4 is probably very common), a full_address % num_slaves operation will effectively only use one octet (happens to be the first one on x86, which is probably the worst choice, but of course that could be compensated for). Or am I missing something? 
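To make the folding argument concrete, here is a minimal C sketch (not taken from the patch; the helper names plain_index and folded_index are made up for illustration). It assumes the XORed source/destination addresses are already held in a 32-bit value and compares a plain modulo with the folded variant:

```c
#include <assert.h>
#include <stdint.h>

/* Plain variant: with a power-of-2 slave count, only the low bits
 * (i.e. a single octet) of u influence the result.
 */
static unsigned int plain_index(uint32_t u, unsigned int num_slaves)
{
	assert(num_slaves > 0);
	return u % num_slaves;
}

/* Folded variant: XOR the three upper octets into the low octet first,
 * so every octet of u influences the result.
 */
static unsigned int folded_index(uint32_t u, unsigned int num_slaves)
{
	assert(num_slaves > 0);
	u ^= (u >> 24) ^ (u >> 16) ^ (u >> 8);
	return u % num_slaves;
}
```

With 4 slaves, the values 0x0100000a and 0x0200000a, which differ only in their top octet, both map to slave 2 under the plain modulo, but to slaves 3 and 0 once the octets are folded together.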
--Per Hedeland per@hedeland.org From amir.noam@intel.com Thu Jan 8 13:28:04 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 08 Jan 2004 13:28:23 -0800 (PST) Received: from hermes.iil.intel.com (hermes.iil.intel.com [192.198.152.99]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i08LS0Ta027487 for ; Thu, 8 Jan 2004 13:28:04 -0800 Received: from petasus.iil.intel.com (petasus.iil.intel.com [143.185.77.3]) by hermes.iil.intel.com (8.12.9-20030918-01/8.12.9/d: large-outer.mc,v 1.6 2003/12/18 18:57:17 root Exp $) with ESMTP id i08GRSek012840; Thu, 8 Jan 2004 16:27:28 GMT Received: from hasmsxvs01.iil.intel.com (hasmsxvs01.iil.intel.com [143.185.63.58]) by petasus.iil.intel.com (8.12.9-20030918-01/8.12.9/d: large-inner.mc,v 1.8 2003/12/18 18:57:16 root Exp $) with SMTP id i08GRRxV001366; Thu, 8 Jan 2004 16:27:27 GMT Received: from sun111.npdj.intel.com ([10.12.254.111]) by hasmsxvs01.iil.intel.com (SAVSMTP 3.1.2.35) with SMTP id M2004010818272719257 ; Thu, 08 Jan 2004 18:27:27 +0200 Received: from jrslxjul4.npdj.intel.com (jrslxjul4 [10.12.220.54]) by sun111.npdj.intel.com (8.12.10/8.12.9/MailSET/Hub) with ESMTP id i08GRQhb003985; Thu, 8 Jan 2004 18:27:27 +0200 (IST) From: Amir Noam To: "Jeff Garzik" , "Jay Vosburgh" Subject: [PATCH 1/4] [bonding 2.6] Add bonding ioctl hook Date: Thu, 8 Jan 2004 18:27:24 +0200 User-Agent: KMail/1.5.3 Cc: , MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200401081827.26718.amir.noam@intel.com> X-Scanned-By: MIMEDefang 2.31 (www . roaringpenguin . com / mimedefang) X-archive-position: 2284 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: amir.noam@intel.com Precedence: bulk X-list: netdev Content-Length: 2706 Lines: 86 Add two bonding ioctls: SIOCBONDING: ioctl hook to handle commands not directed at a specific bond interface. 
SIOCBONDDEVICE: ioctl to handle commands for a bond interface. This ioctl can also handle all existing commands, so we can regard them as obsolete in the future. All future bonding operations will be a sub-command of one of these ioctls. diff -Nuarp a/include/linux/sockios.h b/include/linux/sockios.h --- a/include/linux/sockios.h Thu Jan 8 18:06:41 2004 +++ b/include/linux/sockios.h Thu Jan 8 18:06:42 2004 @@ -115,7 +115,9 @@ #define SIOCBONDSLAVEINFOQUERY 0x8993 /* rtn info about slave state */ #define SIOCBONDINFOQUERY 0x8994 /* rtn info about bond state */ #define SIOCBONDCHANGEACTIVE 0x8995 /* update to a new active slave */ - +#define SIOCBONDING 0x8996 /* deviceless bonding commands */ +#define SIOCBONDDEVICE 0x8997 /* device oriented bonding commands */ + /* Device private ioctl calls */ /* diff -Nuarp a/net/core/dev.c b/net/core/dev.c --- a/net/core/dev.c Thu Jan 8 18:06:41 2004 +++ b/net/core/dev.c Thu Jan 8 18:06:42 2004 @@ -2408,6 +2408,7 @@ static int dev_ifsioc(struct ifreq *ifr, cmd == SIOCBONDSLAVEINFOQUERY || cmd == SIOCBONDINFOQUERY || cmd == SIOCBONDCHANGEACTIVE || + cmd == SIOCBONDDEVICE || cmd == SIOCGMIIPHY || cmd == SIOCGMIIREG || cmd == SIOCSMIIREG || @@ -2565,6 +2566,7 @@ int dev_ioctl(unsigned int cmd, void *ar case SIOCBONDSLAVEINFOQUERY: case SIOCBONDINFOQUERY: case SIOCBONDCHANGEACTIVE: + case SIOCBONDDEVICE: if (!capable(CAP_NET_ADMIN)) return -EPERM; dev_load(ifr.ifr_name); diff -Nuarp a/net/socket.c b/net/socket.c --- a/net/socket.c Thu Jan 8 18:06:41 2004 +++ b/net/socket.c Thu Jan 8 18:06:42 2004 @@ -754,6 +754,17 @@ void dlci_ioctl_set(int (*hook)(unsigned } EXPORT_SYMBOL(dlci_ioctl_set); +static DECLARE_MUTEX(bond_ioctl_mutex); +static int (*bond_ioctl_hook)(unsigned long arg); + +void bond_ioctl_set(int (*hook)(unsigned long)) +{ + down(&bond_ioctl_mutex); + bond_ioctl_hook = hook; + up(&bond_ioctl_mutex); +} +EXPORT_SYMBOL(bond_ioctl_set); + /* * With an ioctl, arg may well be a user mode pointer, but we don't know * what to do 
with it - that's up to the protocol still. @@ -826,6 +837,17 @@ static int sock_ioctl(struct inode *inod up(&dlci_ioctl_mutex); } break; + case SIOCBONDING: + err = -ENOPKG; + if (!bond_ioctl_hook) + request_module("bonding"); + + down(&bond_ioctl_mutex); + if (bond_ioctl_hook) { + err = bond_ioctl_hook(arg); + } + up(&bond_ioctl_mutex); + break; default: err = sock->ops->ioctl(sock, cmd, arg); break; From shemminger@osdl.org Thu Jan 8 13:39:13 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 08 Jan 2004 13:39:27 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i08LdCTa028019 for ; Thu, 8 Jan 2004 13:39:13 -0800 Received: from linux.local (build.pdx.osdl.net [172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id i08Ld1o03897; Thu, 8 Jan 2004 13:39:01 -0800 Date: Fri, 9 Jan 2004 13:39:50 -0800 From: Stephen Hemminger To: "David S. Miller" Cc: netdev@oss.sgi.com Subject: [PATCH] (2/17) ipv4/ipv6 - size_t for send/recvmsg Message-Id: <20040109133950.52e423fd@linux.local> X-Mailer: Sylpheed version 0.9.8claws (GTK+ 1.2.10; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2286 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Content-Length: 10472 Lines: 314 Convert sendmsg/recvmsg from size as int to size as size_t. Remove comment in UDP that addresses this very issue. 
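For readers wondering what the removed UDP comment was warning about, here is a small standalone C sketch (illustrative only, not part of the patch; check_len, naive_total and checked_total are hypothetical names). It assumes a 0xFFFF datagram limit as in the udp_sendmsg() check, and shows that with size_t only the upper bound needs testing, while an unchecked "len += header size" can wrap around:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_DGRAM_LEN 0xFFFF

/* With an unsigned size_t there is no negative case left to test,
 * so only the upper bound is checked.
 */
static int check_len(size_t len)
{
	return len > MAX_DGRAM_LEN ? -EMSGSIZE : 0;
}

/* Unchecked addition: wraps modulo SIZE_MAX + 1 for huge payloads. */
static size_t naive_total(size_t payload, size_t hdr)
{
	return payload + hdr;
}

/* Validate the payload length before adding the header size. */
static int checked_total(size_t payload, size_t hdr, size_t *total)
{
	assert(total != NULL);
	if (check_len(payload) < 0)
		return -EMSGSIZE;
	if (hdr > SIZE_MAX - payload)	/* defensive: the sum would wrap */
		return -EMSGSIZE;
	*total = payload + hdr;
	return 0;
}
```

For example, naive_total(SIZE_MAX - 10, 20) wraps to 9, which would slip past a later 0xFFFF check, whereas checked_total() rejects the same input with -EMSGSIZE.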
diff -Nru a/include/net/inet_common.h b/include/net/inet_common.h --- a/include/net/inet_common.h Fri Jan 9 13:38:06 2004 +++ b/include/net/inet_common.h Fri Jan 9 13:38:06 2004 @@ -23,11 +23,11 @@ extern int inet_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *ubuf, - int size, int flags); + size_t size, int flags); extern int inet_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg, - int size); + size_t size); extern int inet_shutdown(struct socket *sock, int how); extern unsigned int inet_poll(struct file * file, struct socket *sock, struct poll_table_struct *wait); extern int inet_setsockopt(struct socket *sock, int level, diff -Nru a/include/net/tcp.h b/include/net/tcp.h --- a/include/net/tcp.h Fri Jan 9 13:38:06 2004 +++ b/include/net/tcp.h Fri Jan 9 13:38:06 2004 @@ -752,7 +752,7 @@ extern int tcp_v4_tw_remember_stamp(struct tcp_tw_bucket *tw); extern int tcp_sendmsg(struct kiocb *iocb, struct sock *sk, - struct msghdr *msg, int size); + struct msghdr *msg, size_t size); extern ssize_t tcp_sendpage(struct socket *sock, struct page *page, int offset, size_t size, int flags); extern int tcp_ioctl(struct sock *sk, @@ -846,7 +846,7 @@ extern void tcp_set_keepalive(struct sock *sk, int val); extern int tcp_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg, - int len, int nonblock, + size_t len, int nonblock, int flags, int *addr_len); extern int tcp_listen_start(struct sock *sk); diff -Nru a/include/net/udp.h b/include/net/udp.h --- a/include/net/udp.h Fri Jan 9 13:38:06 2004 +++ b/include/net/udp.h Fri Jan 9 13:38:06 2004 @@ -68,7 +68,7 @@ struct sockaddr *usin, int addr_len); extern int udp_sendmsg(struct kiocb *iocb, struct sock *sk, - struct msghdr *msg, int len); + struct msghdr *msg, size_t len); extern int udp_rcv(struct sk_buff *skb); extern int udp_ioctl(struct sock *sk, int cmd, unsigned long arg); diff -Nru a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c --- a/net/ipv4/af_inet.c Fri Jan 9 13:38:06 2004 +++ 
b/net/ipv4/af_inet.c Fri Jan 9 13:38:06 2004 @@ -731,7 +731,7 @@ int inet_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg, - int size, int flags) + size_t size, int flags) { struct sock *sk = sock->sk; int addr_len = 0; @@ -746,7 +746,7 @@ int inet_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg, - int size) + size_t size) { struct sock *sk = sock->sk; diff -Nru a/net/ipv4/raw.c b/net/ipv4/raw.c --- a/net/ipv4/raw.c Fri Jan 9 13:38:06 2004 +++ b/net/ipv4/raw.c Fri Jan 9 13:38:06 2004 @@ -324,7 +324,7 @@ } static int raw_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg, - int len) + size_t len) { struct inet_opt *inet = inet_sk(sk); struct ipcm_cookie ipc; @@ -335,17 +335,6 @@ u8 tos; int err; - /* This check is ONLY to check for arithmetic overflow - on integer(!) len. Not more! Real check will be made - in ip_build_xmit --ANK - - BTW socket.c -> af_*.c -> ... make multiple - invalid conversions size_t -> int. We MUST repair it f.e. 
- by replacing all of them with size_t and revise all - the places sort of len += sizeof(struct iphdr) - If len was ULONG_MAX-10 it would be cathastrophe --ANK - */ - err = -EMSGSIZE; if (len < 0 || len > 0xFFFF) goto out; @@ -523,10 +512,10 @@ */ int raw_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg, - int len, int noblock, int flags, int *addr_len) + size_t len, int noblock, int flags, int *addr_len) { struct inet_opt *inet = inet_sk(sk); - int copied = 0; + size_t copied = 0; int err = -EOPNOTSUPP; struct sockaddr_in *sin = (struct sockaddr_in *)msg->msg_name; struct sk_buff *skb; diff -Nru a/net/ipv4/tcp.c b/net/ipv4/tcp.c --- a/net/ipv4/tcp.c Fri Jan 9 13:38:06 2004 +++ b/net/ipv4/tcp.c Fri Jan 9 13:38:06 2004 @@ -1029,7 +1029,7 @@ } int tcp_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg, - int size) + size_t size) { struct iovec *iov; struct tcp_opt *tp = tcp_sk(sk); @@ -1498,7 +1498,7 @@ */ int tcp_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg, - int len, int nonblock, int flags, int *addr_len) + size_t len, int nonblock, int flags, int *addr_len) { struct tcp_opt *tp = tcp_sk(sk); int copied = 0; diff -Nru a/net/ipv4/udp.c b/net/ipv4/udp.c --- a/net/ipv4/udp.c Fri Jan 9 13:38:06 2004 +++ b/net/ipv4/udp.c Fri Jan 9 13:38:06 2004 @@ -478,7 +478,7 @@ } int udp_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg, - int len) + size_t len) { struct inet_opt *inet = inet_sk(sk); struct udp_opt *up = udp_sk(sk); @@ -493,18 +493,7 @@ int err; int corkreq = up->corkflag || msg->msg_flags&MSG_MORE; - /* This check is ONLY to check for arithmetic overflow - on integer(!) len. Not more! Real check will be made - in ip_append_* --ANK - - BTW socket.c -> af_*.c -> ... make multiple - invalid conversions size_t -> int. We MUST repair it f.e. 
- by replacing all of them with size_t and revise all - the places sort of len += sizeof(struct iphdr) - If len was ULONG_MAX-10 it would be cathastrophe --ANK - */ - - if (len < 0 || len > 0xFFFF) + if (len > 0xFFFF) return -EMSGSIZE; /* @@ -782,7 +771,7 @@ */ int udp_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg, - int len, int noblock, int flags, int *addr_len) + size_t len, int noblock, int flags, int *addr_len) { struct inet_opt *inet = inet_sk(sk); struct sockaddr_in *sin = (struct sockaddr_in *)msg->msg_name; diff -Nru a/net/ipv6/raw.c b/net/ipv6/raw.c --- a/net/ipv6/raw.c Fri Jan 9 13:38:06 2004 +++ b/net/ipv6/raw.c Fri Jan 9 13:38:06 2004 @@ -345,13 +345,15 @@ * we return it, otherwise we block. */ -static int rawv6_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg, int len, +static int rawv6_recvmsg(struct kiocb *iocb, struct sock *sk, + struct msghdr *msg, size_t len, int noblock, int flags, int *addr_len) { struct ipv6_pinfo *np = inet6_sk(sk); struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)msg->msg_name; struct sk_buff *skb; - int copied, err; + size_t copied; + int err; if (flags & MSG_OOB) return -EOPNOTSUPP; @@ -527,7 +529,8 @@ IP6_INC_STATS(Ip6OutDiscards); return err; } -static int rawv6_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg, int len) +static int rawv6_sendmsg(struct kiocb *iocb, struct sock *sk, + struct msghdr *msg, size_t len) { struct ipv6_txoptions opt_space; struct sockaddr_in6 * sin6 = (struct sockaddr_in6 *) msg->msg_name; diff -Nru a/net/ipv6/udp.c b/net/ipv6/udp.c --- a/net/ipv6/udp.c Fri Jan 9 13:38:06 2004 +++ b/net/ipv6/udp.c Fri Jan 9 13:38:06 2004 @@ -366,12 +366,14 @@ * return it, otherwise we block. 
*/ -static int udpv6_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg, int len, +static int udpv6_recvmsg(struct kiocb *iocb, struct sock *sk, + struct msghdr *msg, size_t len, int noblock, int flags, int *addr_len) { struct ipv6_pinfo *np = inet6_sk(sk); struct sk_buff *skb; - int copied, err; + size_t copied; + int err; if (addr_len) *addr_len=sizeof(struct sockaddr_in6); @@ -774,7 +776,8 @@ return err; } -static int udpv6_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg, int len) +static int udpv6_sendmsg(struct kiocb *iocb, struct sock *sk, + struct msghdr *msg, size_t len) { struct ipv6_txoptions opt_space; struct udp_opt *up = udp_sk(sk); @@ -841,7 +844,7 @@ /* Rough check on arithmetic overflow, better check is made in ip6_build_xmit */ - if (len < 0 || len > INT_MAX - sizeof(struct udphdr)) + if (len > INT_MAX - sizeof(struct udphdr)) return -EMSGSIZE; if (up->pending) { diff -Nru a/net/sctp/socket.c b/net/sctp/socket.c --- a/net/sctp/socket.c Fri Jan 9 13:38:06 2004 +++ b/net/sctp/socket.c Fri Jan 9 13:38:06 2004 @@ -90,7 +90,7 @@ static inline void sctp_set_owner_w(struct sctp_chunk *chunk); static void sctp_wfree(struct sk_buff *skb); static int sctp_wait_for_sndbuf(struct sctp_association *, long *timeo_p, - int msg_len); + size_t msg_len); static int sctp_wait_for_packet(struct sock * sk, int *err, long *timeo_p); static int sctp_wait_for_connect(struct sctp_association *, long *timeo_p); static int sctp_wait_for_accept(struct sock *sk, long timeo); @@ -943,7 +943,7 @@ SCTP_STATIC int sctp_msghdr_parse(const struct msghdr *, sctp_cmsgs_t *); SCTP_STATIC int sctp_sendmsg(struct kiocb *iocb, struct sock *sk, - struct msghdr *msg, int msg_len) + struct msghdr *msg, size_t msg_len) { struct sctp_opt *sp; struct sctp_endpoint *ep; @@ -965,7 +965,7 @@ struct list_head *pos; int msg_flags = msg->msg_flags; - SCTP_DEBUG_PRINTK("sctp_sendmsg(sk: %p, msg: %p, msg_len: %d)\n", + SCTP_DEBUG_PRINTK("sctp_sendmsg(sk: %p, msg: %p, 
msg_len: %u)\n", sk, msg, msg_len); err = 0; @@ -1021,7 +1021,7 @@ associd = sinfo->sinfo_assoc_id; } - SCTP_DEBUG_PRINTK("msg_len: %d, sinfo_flags: 0x%x\n", + SCTP_DEBUG_PRINTK("msg_len: %u, sinfo_flags: 0x%x\n", msg_len, sinfo_flags); /* MSG_EOF or MSG_ABORT cannot be set on a TCP-style socket. */ @@ -1377,7 +1377,7 @@ static struct sk_buff *sctp_skb_recv_datagram(struct sock *, int, int, int *); SCTP_STATIC int sctp_recvmsg(struct kiocb *iocb, struct sock *sk, - struct msghdr *msg, int len, int noblock, + struct msghdr *msg, size_t len, int noblock, int flags, int *addr_len) { struct sctp_ulpevent *event = NULL; @@ -4157,14 +4157,14 @@ /* Helper function to wait for space in the sndbuf. */ static int sctp_wait_for_sndbuf(struct sctp_association *asoc, long *timeo_p, - int msg_len) + size_t msg_len) { struct sock *sk = asoc->base.sk; int err = 0; long current_timeo = *timeo_p; DEFINE_WAIT(wait); - SCTP_DEBUG_PRINTK("wait_for_sndbuf: asoc=%p, timeo=%ld, msg_len=%d\n", + SCTP_DEBUG_PRINTK("wait_for_sndbuf: asoc=%p, timeo=%ld, msg_len=%u\n", asoc, (long)(*timeo_p), msg_len); /* Increment the association's refcnt. */ From shemminger@osdl.org Thu Jan 8 13:39:12 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 08 Jan 2004 13:39:26 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i08LdCTa028018 for ; Thu, 8 Jan 2004 13:39:12 -0800 Received: from linux.local (build.pdx.osdl.net [172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id i08Ld0o03884; Thu, 8 Jan 2004 13:39:00 -0800 Date: Fri, 9 Jan 2004 13:26:58 -0800 From: Stephen Hemminger To: "David S. 
Miller" Cc: netdev@oss.sgi.com Subject: [PATCH] (1/17) protocol sendmsg/revmsg prototype Message-Id: <20040109132658.01aa28dd@linux.local> X-Mailer: Sylpheed version 0.9.8claws (GTK+ 1.2.10; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2285 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Content-Length: 6285 Lines: 156

When a sendmsg (or recvmsg) system call is processed, the total size of the message is calculated as unsigned (size_t), but it is then passed through as an int in the protocol switch table. This leads to possible problems when a protocol only checks size > pmtu, because size could be negative and still pass that check. The protocols (mostly) work around this today by checking for less than zero or by recasting, but the right thing is to change the protocol switch to pass the length through as size_t. This doesn't change the user ABI for sendmsg/recvmsg. The first patch changes the prototype, which causes all the protocols to generate warnings; those are cleared up in the next sixteen patches. This was found by Chris Wright, who raised it as a potential DoS attack point.
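To make the failure mode above concrete, here is a small user-space sketch, not kernel code. The names demo_check_int, demo_check_size_t and DEMO_PMTU are made up for illustration and do not appear in the patch; the value -10 stands in for whatever an oversized size_t might turn into once it has been squeezed into an int.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PMTU 1500	/* hypothetical path MTU, for illustration only */

/* Old style: the length arrives as int and only the upper bound is checked. */
static int demo_check_int(int len)
{
	if (len > DEMO_PMTU)
		return -1;	/* too big: rejected */
	return 0;		/* accepted, including any negative len */
}

/* New style: the same test, but the length stays unsigned end to end. */
static int demo_check_size_t(size_t len)
{
	if (len > DEMO_PMTU)
		return -1;	/* huge values can no longer hide as negatives */
	return 0;
}

/* Self-check: call this from main() or a test harness. */
static void demo_selfcheck(void)
{
	/* A negative int slips past an upper-bound-only check. */
	assert(demo_check_int(-10) == 0);
	/* The same bit pattern, kept as size_t, is a huge value and is rejected. */
	assert(demo_check_size_t((size_t)-10) == -1);
	/* Ordinary lengths behave identically in both versions. */
	assert(demo_check_int(100) == 0);
	assert(demo_check_size_t(100) == 0);
}
```

The point of the sketch is only that the single comparison becomes sufficient once the parameter is unsigned, which is presumably why the "len < 0" halves of the checks can be dropped in the per-protocol patches.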
diff -Nru a/include/linux/net.h b/include/linux/net.h --- a/include/linux/net.h Mon Dec 8 16:19:31 2003 +++ b/include/linux/net.h Mon Dec 8 16:19:31 2003 @@ -120,9 +120,9 @@ int (*getsockopt)(struct socket *sock, int level, int optname, char __user *optval, int __user *optlen); int (*sendmsg) (struct kiocb *iocb, struct socket *sock, - struct msghdr *m, int total_len); + struct msghdr *m, size_t total_len); int (*recvmsg) (struct kiocb *iocb, struct socket *sock, - struct msghdr *m, int total_len, + struct msghdr *m, size_t total_len, int flags); int (*mmap) (struct file *file, struct socket *sock, struct vm_area_struct * vma); @@ -151,13 +151,13 @@ struct socket **res); extern void sock_release(struct socket *sock); extern int sock_sendmsg(struct socket *sock, struct msghdr *msg, - int len); + size_t len); extern int sock_recvmsg(struct socket *sock, struct msghdr *msg, - int size, int flags); + size_t size, int flags); extern int sock_readv_writev(int type, struct inode *inode, struct file *file, const struct iovec *iov, long count, - long size); + size_t size); extern int sock_map_fd(struct socket *sock); extern struct socket *sockfd_lookup(int fd, int *err); #define sockfd_put(sock) fput(sock->file) @@ -216,9 +216,9 @@ char *optval, int optlen), (sock, level, optname, optval, optlen)) \ SOCKCALL_WRAP(name, getsockopt, (struct socket *sock, int level, int optname, \ char *optval, int *optlen), (sock, level, optname, optval, optlen)) \ -SOCKCALL_WRAP(name, sendmsg, (struct kiocb *iocb, struct socket *sock, struct msghdr *m, int len), \ +SOCKCALL_WRAP(name, sendmsg, (struct kiocb *iocb, struct socket *sock, struct msghdr *m, size_t len), \ (iocb, sock, m, len)) \ -SOCKCALL_WRAP(name, recvmsg, (struct kiocb *iocb, struct socket *sock, struct msghdr *m, int len, int flags), \ +SOCKCALL_WRAP(name, recvmsg, (struct kiocb *iocb, struct socket *sock, struct msghdr *m, size_t len, int flags), \ (iocb, sock, m, len, flags)) \ SOCKCALL_WRAP(name, mmap, (struct file *file, 
struct socket *sock, struct vm_area_struct *vma), \ (file, sock, vma)) \ diff -Nru a/include/net/sock.h b/include/net/sock.h --- a/include/net/sock.h Mon Dec 8 16:19:31 2003 +++ b/include/net/sock.h Mon Dec 8 16:19:31 2003 @@ -418,10 +418,10 @@ int optname, char *optval, int *option); int (*sendmsg)(struct kiocb *iocb, struct sock *sk, - struct msghdr *msg, int len); + struct msghdr *msg, size_t len); int (*recvmsg)(struct kiocb *iocb, struct sock *sk, struct msghdr *msg, - int len, int noblock, int flags, + size_t len, int noblock, int flags, int *addr_len); int (*sendpage)(struct sock *sk, struct page *page, int offset, size_t size, int flags); @@ -624,9 +624,9 @@ extern int sock_no_setsockopt(struct socket *, int, int, char *, int); extern int sock_no_sendmsg(struct kiocb *, struct socket *, - struct msghdr *, int); + struct msghdr *, size_t); extern int sock_no_recvmsg(struct kiocb *, struct socket *, - struct msghdr *, int, int); + struct msghdr *, size_t, int); extern int sock_no_mmap(struct file *file, struct socket *sock, struct vm_area_struct *vma); diff -Nru a/net/core/sock.c b/net/core/sock.c --- a/net/core/sock.c Mon Dec 8 16:19:31 2003 +++ b/net/core/sock.c Mon Dec 8 16:19:31 2003 @@ -966,13 +966,13 @@ } int sock_no_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *m, - int flags) + size_t len) { return -EOPNOTSUPP; } int sock_no_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *m, - int len, int flags) + size_t len, int flags) { return -EOPNOTSUPP; } diff -Nru a/net/socket.c b/net/socket.c --- a/net/socket.c Mon Dec 8 16:19:31 2003 +++ b/net/socket.c Mon Dec 8 16:19:31 2003 @@ -523,7 +523,8 @@ sock->file=NULL; } -static inline int __sock_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg, int size) +static inline int __sock_sendmsg(struct kiocb *iocb, struct socket *sock, + struct msghdr *msg, size_t size) { struct sock_iocb *si = kiocb_to_siocb(iocb); int err; @@ -540,7 +541,7 @@ return 
sock->ops->sendmsg(iocb, sock, msg, size); } -int sock_sendmsg(struct socket *sock, struct msghdr *msg, int size) +int sock_sendmsg(struct socket *sock, struct msghdr *msg, size_t size) { struct kiocb iocb; int ret; @@ -553,7 +554,8 @@ } -static inline int __sock_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg, int size, int flags) +static inline int __sock_recvmsg(struct kiocb *iocb, struct socket *sock, + struct msghdr *msg, size_t size, int flags) { int err; struct sock_iocb *si = kiocb_to_siocb(iocb); @@ -571,7 +573,8 @@ return sock->ops->recvmsg(iocb, sock, msg, size, flags); } -int sock_recvmsg(struct socket *sock, struct msghdr *msg, int size, int flags) +int sock_recvmsg(struct socket *sock, struct msghdr *msg, + size_t size, int flags) { struct kiocb iocb; int ret; @@ -668,7 +671,7 @@ } int sock_readv_writev(int type, struct inode * inode, struct file * file, - const struct iovec * iov, long count, long size) + const struct iovec * iov, long count, size_t size) { struct msghdr msg; struct socket *sock; From shemminger@osdl.org Thu Jan 8 13:39:14 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 08 Jan 2004 13:39:27 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i08LdDTa028020 for ; Thu, 8 Jan 2004 13:39:13 -0800 Received: from linux.local (build.pdx.osdl.net [172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id i08Ld1o03902; Thu, 8 Jan 2004 13:39:01 -0800 Date: Fri, 9 Jan 2004 13:40:00 -0800 From: Stephen Hemminger To: "David S. 
Miller" Cc: netdev@oss.sgi.com, Maxim Krasnyansky Subject: [PATCH] (3/17) bluetooth -- size_t for send/recvmsg Message-Id: <20040109134000.7201c3af@linux.local> X-Mailer: Sylpheed version 0.9.8claws (GTK+ 1.2.10; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2287 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Content-Length: 4695 Lines: 136 Convert bluetooth sendmsg/recvmsg from size as int to size_t. Add check in HCI that sendmsg < max allowed frame size. diff -Nru a/include/net/bluetooth/bluetooth.h b/include/net/bluetooth/bluetooth.h --- a/include/net/bluetooth/bluetooth.h Mon Dec 8 16:19:37 2003 +++ b/include/net/bluetooth/bluetooth.h Mon Dec 8 16:19:37 2003 @@ -129,7 +129,7 @@ struct sock *bt_sock_alloc(struct socket *sock, int proto, int pi_size, int prio); void bt_sock_link(struct bt_sock_list *l, struct sock *s); void bt_sock_unlink(struct bt_sock_list *l, struct sock *s); -int bt_sock_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg, int len, int flags); +int bt_sock_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg, size_t len, int flags); uint bt_sock_poll(struct file * file, struct socket *sock, poll_table *wait); int bt_sock_wait_state(struct sock *sk, int state, unsigned long timeo); diff -Nru a/net/bluetooth/af_bluetooth.c b/net/bluetooth/af_bluetooth.c --- a/net/bluetooth/af_bluetooth.c Mon Dec 8 16:19:37 2003 +++ b/net/bluetooth/af_bluetooth.c Mon Dec 8 16:19:37 2003 @@ -201,12 +201,13 @@ } int bt_sock_recvmsg(struct kiocb *iocb, struct socket *sock, - struct msghdr *msg, int len, int flags) + struct msghdr *msg, size_t len, int flags) { int noblock = flags & MSG_DONTWAIT; struct sock *sk = sock->sk; struct sk_buff *skb; - int copied, err; + size_t copied; + int err; BT_DBG("sock %p sk %p len %d", 
sock, sk, len); diff -Nru a/net/bluetooth/hci_sock.c b/net/bluetooth/hci_sock.c --- a/net/bluetooth/hci_sock.c Mon Dec 8 16:19:37 2003 +++ b/net/bluetooth/hci_sock.c Mon Dec 8 16:19:37 2003 @@ -319,7 +319,8 @@ put_cmsg(msg, SOL_HCI, HCI_CMSG_TSTAMP, sizeof(skb->stamp), &skb->stamp); } -static int hci_sock_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg, int len, int flags) +static int hci_sock_recvmsg(struct kiocb *iocb, struct socket *sock, + struct msghdr *msg, size_t len, int flags) { int noblock = flags & MSG_DONTWAIT; struct sock *sk = sock->sk; @@ -355,7 +356,8 @@ return err ? : copied; } -static int hci_sock_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg, int len) +static int hci_sock_sendmsg(struct kiocb *iocb, struct socket *sock, + struct msghdr *msg, size_t len) { struct sock *sk = sock->sk; struct hci_dev *hdev; @@ -370,9 +372,9 @@ if (msg->msg_flags & ~(MSG_DONTWAIT|MSG_NOSIGNAL|MSG_ERRQUEUE)) return -EINVAL; - if (len < 4) + if (len < 4 || len > HCI_MAX_FRAME_SIZE) return -EINVAL; - + lock_sock(sk); if (!(hdev = hci_pi(sk)->hdev)) { diff -Nru a/net/bluetooth/l2cap.c b/net/bluetooth/l2cap.c --- a/net/bluetooth/l2cap.c Mon Dec 8 16:19:37 2003 +++ b/net/bluetooth/l2cap.c Mon Dec 8 16:19:37 2003 @@ -706,7 +706,8 @@ return err; } -static int l2cap_sock_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg, int len) +static int l2cap_sock_sendmsg(struct kiocb *iocb, struct socket *sock, + struct msghdr *msg, size_t len) { struct sock *sk = sock->sk; int err = 0; diff -Nru a/net/bluetooth/rfcomm/sock.c b/net/bluetooth/rfcomm/sock.c --- a/net/bluetooth/rfcomm/sock.c Mon Dec 8 16:19:37 2003 +++ b/net/bluetooth/rfcomm/sock.c Mon Dec 8 16:19:37 2003 @@ -482,12 +482,12 @@ } static int rfcomm_sock_sendmsg(struct kiocb *iocb, struct socket *sock, - struct msghdr *msg, int len) + struct msghdr *msg, size_t len) { struct sock *sk = sock->sk; struct rfcomm_dlc *d = rfcomm_pi(sk)->dlc; struct sk_buff *skb; - int 
err, size; + int err; int sent = 0; if (msg->msg_flags & MSG_OOB) @@ -501,7 +501,7 @@ lock_sock(sk); while (len) { - size = min_t(uint, len, d->mtu); + size_t size = min(len, d->mtu); skb = sock_alloc_send_skb(sk, size + RFCOMM_SKB_RESERVE, msg->msg_flags & MSG_DONTWAIT, &err); @@ -556,10 +556,11 @@ } static int rfcomm_sock_recvmsg(struct kiocb *iocb, struct socket *sock, - struct msghdr *msg, int size, int flags) + struct msghdr *msg, size_t size, int flags) { struct sock *sk = sock->sk; - int target, err = 0, copied = 0; + int err = 0; + size_t target, copied = 0; long timeo; if (flags & MSG_OOB) diff -Nru a/net/bluetooth/sco.c b/net/bluetooth/sco.c --- a/net/bluetooth/sco.c Mon Dec 8 16:19:37 2003 +++ b/net/bluetooth/sco.c Mon Dec 8 16:19:37 2003 @@ -630,7 +630,8 @@ return 0; } -static int sco_sock_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg, int len) +static int sco_sock_sendmsg(struct kiocb *iocb, struct socket *sock, + struct msghdr *msg, size_t len) { struct sock *sk = sock->sk; int err = 0; From shemminger@osdl.org Thu Jan 8 13:41:20 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 08 Jan 2004 13:41:33 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i08LfJTa029044 for ; Thu, 8 Jan 2004 13:41:19 -0800 Received: from linux.local (build.pdx.osdl.net [172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id i08Lf8o04216; Thu, 8 Jan 2004 13:41:08 -0800 Date: Fri, 9 Jan 2004 13:42:15 -0800 From: Stephen Hemminger To: "David S. 
Miller" Cc: netdev@oss.sgi.com, Michal Ostrowski Subject: [PATCH] (4/17) pppoe -- size_t in send/recvmsg Message-Id: <20040109134215.14fbcf8a@linux.local> X-Mailer: Sylpheed version 0.9.8claws (GTK+ 1.2.10; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2288 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Content-Length: 760 Lines: 25 Convert PPPoe for size_t in send/recv msg diff -Nru a/drivers/net/pppoe.c b/drivers/net/pppoe.c --- a/drivers/net/pppoe.c Mon Dec 8 16:19:40 2003 +++ b/drivers/net/pppoe.c Mon Dec 8 16:19:40 2003 @@ -775,8 +775,8 @@ } -static int pppoe_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *m, - int total_len) +static int pppoe_sendmsg(struct kiocb *iocb, struct socket *sock, + struct msghdr *m, size_t total_len) { struct sk_buff *skb = NULL; struct sock *sk = sock->sk; @@ -939,7 +939,7 @@ }; static int pppoe_recvmsg(struct kiocb *iocb, struct socket *sock, - struct msghdr *m, int total_len, int flags) + struct msghdr *m, size_t total_len, int flags) { struct sock *sk = sock->sk; struct sk_buff *skb = NULL; From shemminger@osdl.org Thu Jan 8 13:43:40 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 08 Jan 2004 13:43:53 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i08LheTa029410 for ; Thu, 8 Jan 2004 13:43:40 -0800 Received: from linux.local (build.pdx.osdl.net [172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id i08LhSo04595; Thu, 8 Jan 2004 13:43:29 -0800 Date: Fri, 9 Jan 2004 13:44:36 -0800 From: Stephen Hemminger To: "David S. 
Miller" Cc: netdev@oss.sgi.com, acme@conectiva.com.br Subject: [PATCH] (5/17) ddp -- size_t for send/recvmsg Message-Id: <20040109134436.6aa8fe1f@linux.local> X-Mailer: Sylpheed version 0.9.8claws (GTK+ 1.2.10; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2289 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Content-Length: 695 Lines: 23 Change send/recvmsg to match prototype change in sock.h diff -Nru a/net/appletalk/ddp.c b/net/appletalk/ddp.c --- a/net/appletalk/ddp.c Mon Dec 8 16:19:43 2003 +++ b/net/appletalk/ddp.c Mon Dec 8 16:19:43 2003 @@ -1552,7 +1552,7 @@ } static int atalk_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg, - int len) + size_t len) { struct sock *sk = sock->sk; struct atalk_sock *at = at_sk(sk); @@ -1712,7 +1712,7 @@ } static int atalk_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg, - int size, int flags) + size_t size, int flags) { struct sock *sk = sock->sk; struct sockaddr_at *sat = (struct sockaddr_at *)msg->msg_name; From shemminger@osdl.org Thu Jan 8 13:47:29 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 08 Jan 2004 13:47:42 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i08LlSTa029857 for ; Thu, 8 Jan 2004 13:47:28 -0800 Received: from linux.local (build.pdx.osdl.net [172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id i08Ll5o05168; Thu, 8 Jan 2004 13:47:05 -0800 Date: Fri, 9 Jan 2004 13:48:12 -0800 From: Stephen Hemminger To: "David S. 
Miller" , Chas Williams Cc: netdev@oss.sgi.com Subject: [PATCH[ (6/15) atm -- size_t in send/recvmsg Message-Id: <20040109134812.6fa874c8@linux.local> X-Mailer: Sylpheed version 0.9.8claws (GTK+ 1.2.10; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2291 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Content-Length: 1368 Lines: 38 Convert ATM for unsigned length send/recvmsg to match new prototype. diff -Nru a/net/atm/common.c b/net/atm/common.c --- a/net/atm/common.c Mon Dec 8 16:19:46 2003 +++ b/net/atm/common.c Mon Dec 8 16:19:46 2003 @@ -463,7 +463,7 @@ int vcc_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg, - int size, int flags) + size_t size, int flags) { struct sock *sk = sock->sk; struct atm_vcc *vcc; @@ -503,7 +503,7 @@ int vcc_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *m, - int total_len) + size_t total_len) { struct sock *sk = sock->sk; DEFINE_WAIT(wait); diff -Nru a/net/atm/common.h b/net/atm/common.h --- a/net/atm/common.h Mon Dec 8 16:19:46 2003 +++ b/net/atm/common.h Mon Dec 8 16:19:46 2003 @@ -14,9 +14,9 @@ int vcc_release(struct socket *sock); int vcc_connect(struct socket *sock, int itf, short vpi, int vci); int vcc_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg, - int size, int flags); + size_t size, int flags); int vcc_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *m, - int total_len); + size_t total_len); unsigned int vcc_poll(struct file *file, struct socket *sock, poll_table *wait); int vcc_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg); int vcc_setsockopt(struct socket *sock, int level, int optname, char *optval, From shemminger@osdl.org Thu Jan 8 13:47:27 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 08 Jan 2004 13:47:40 
-0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i08LlQTa029854 for ; Thu, 8 Jan 2004 13:47:27 -0800 Received: from linux.local (build.pdx.osdl.net [172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id i08Ll8o05200; Thu, 8 Jan 2004 13:47:08 -0800 Date: Fri, 9 Jan 2004 13:48:15 -0800 From: Stephen Hemminger To: "David S. Miller" , Ralf Baechle Cc: netdev@oss.sgi.com Subject: [PATCH] (7/15) ax25 - size_t in send/recvmsg Message-Id: <20040109134815.09ef4df0@linux.local> X-Mailer: Sylpheed version 0.9.8claws (GTK+ 1.2.10; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2290 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Content-Length: 1254 Lines: 46 Convert Amateur Radio X.25 to unsigned size for send/receive. Also enforce an MTU here, since there is no fragmentation logic. 
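As a rough illustration of the MTU enforcement described above, the following user-space sketch rejects an oversized message up front instead of trying to send it. struct demo_dev and demo_send_check are hypothetical stand-ins, not the AX.25 structures; only the -EMSGSIZE convention is taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a network device; only the MTU matters here. */
struct demo_dev {
	unsigned int mtu;
};

/*
 * With no fragmentation logic, a message larger than the device MTU
 * cannot be sent at all, so refuse it before doing any other work.
 */
static int demo_send_check(const struct demo_dev *dev, size_t len)
{
	if (len > dev->mtu)
		return -EMSGSIZE;
	return 0;
}

/* Self-check: call this from main() or a test harness. */
static void demo_mtu_selfcheck(void)
{
	struct demo_dev dev = { .mtu = 256 };

	assert(demo_send_check(&dev, 256) == 0);		/* exactly MTU: fine */
	assert(demo_send_check(&dev, 257) == -EMSGSIZE);	/* one byte over */
	assert(demo_send_check(&dev, (size_t)-1) == -EMSGSIZE);	/* absurdly large */
}
```

Doing the check before any allocation means a bogus length costs the caller an error return and nothing more.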
diff -Nru a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c --- a/net/ax25/af_ax25.c Mon Dec 8 16:19:49 2003 +++ b/net/ax25/af_ax25.c Mon Dec 8 16:19:49 2003 @@ -1415,7 +1415,7 @@ } static int ax25_sendmsg(struct kiocb *iocb, struct socket *sock, - struct msghdr *msg, int len) + struct msghdr *msg, size_t len) { struct sockaddr_ax25 *usax = (struct sockaddr_ax25 *)msg->msg_name; struct sock *sk = sock->sk; @@ -1424,7 +1424,8 @@ ax25_digi dtmp, *dp; unsigned char *asmptr; ax25_cb *ax25; - int lv, size, err, addr_len = msg->msg_namelen; + size_t size; + int lv, err, addr_len = msg->msg_namelen; if (msg->msg_flags & ~(MSG_DONTWAIT|MSG_EOR)) { return -EINVAL; @@ -1449,6 +1450,11 @@ goto out; } + if (len > ax25->ax25_dev->dev->mtu) { + err = -EMSGSIZE; + goto out; + } + if (usax != NULL) { if (usax->sax25_family != AF_AX25) { err = -EINVAL; @@ -1594,7 +1600,7 @@ } static int ax25_recvmsg(struct kiocb *iocb, struct socket *sock, - struct msghdr *msg, int size, int flags) + struct msghdr *msg, size_t size, int flags) { struct sock *sk = sock->sk; struct sk_buff *skb; From acme@conectiva.com.br Thu Jan 8 13:54:49 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 08 Jan 2004 13:55:01 -0800 (PST) Received: from orion.netbank.com.br (orion.netbank.com.br [200.203.199.90]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i08LslTa031032 for ; Thu, 8 Jan 2004 13:54:48 -0800 Received: from 4-203.ctame700-6.telepar.net.br ([200.140.237.203] helo=oops.kerneljanitors.org) by orion.netbank.com.br with asmtp (Exim 3.33 #1) id 1AeiEN-0000eu-00; Thu, 08 Jan 2004 20:02:20 -0200 Received: by oops.kerneljanitors.org (Postfix, from userid 500) id B6C191966D; Thu, 8 Jan 2004 20:05:30 -0200 (BRDT) Date: Thu, 8 Jan 2004 20:05:30 -0200 From: Arnaldo Carvalho de Melo To: Stephen Hemminger Cc: "David S. 
Miller" , netdev@oss.sgi.com Subject: Re: [PATCH] (5/17) ddp -- size_t for send/recvmsg Message-ID: <20040108220530.GB3062@conectiva.com.br> References: <20040109134436.6aa8fe1f@linux.local> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20040109134436.6aa8fe1f@linux.local> X-Url: http://advogato.org/person/acme Organization: Conectiva S.A. User-Agent: Mutt/1.5.5.1i X-archive-position: 2292 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: acme@conectiva.com.br Precedence: bulk X-list: netdev Content-Length: 363 Lines: 11 Em Fri, Jan 09, 2004 at 01:44:36PM -0800, Stephen Hemminger escreveu: > Change send/recvmsg to match prototype change in sock.h > > diff -Nru a/net/appletalk/ddp.c b/net/appletalk/ddp.c > --- a/net/appletalk/ddp.c Mon Dec 8 16:19:43 2003 > +++ b/net/appletalk/ddp.c Mon Dec 8 16:19:43 2003 > @@ -1552,7 +1552,7 @@ I'm OK with this, thanks Stephen. - Arnaldo From shemminger@osdl.org Thu Jan 8 14:02:56 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 08 Jan 2004 14:03:10 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i08M2uTa031576 for ; Thu, 8 Jan 2004 14:02:56 -0800 Received: from linux.local (build.pdx.osdl.net [172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id i08M2fo08259; Thu, 8 Jan 2004 14:02:41 -0800 Date: Fri, 9 Jan 2004 14:03:11 -0800 From: Stephen Hemminger To: "David S. 
Miller" , acme@conectiva.com.br Cc: netdev@oss.sgi.com Subject: [PATCH] (13/17) llc -- size_t for send/recvmsg Message-Id: <20040109140311.66c2fa57@linux.local> X-Mailer: Sylpheed version 0.9.8claws (GTK+ 1.2.10; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2300 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Content-Length: 1365 Lines: 40 Convert LLC to unsigned for send/recv msg. diff -Nru a/net/llc/af_llc.c b/net/llc/af_llc.c --- a/net/llc/af_llc.c Mon Dec 8 16:20:07 2003 +++ b/net/llc/af_llc.c Mon Dec 8 16:20:07 2003 @@ -671,12 +671,13 @@ * Returns non-negative upon success, negative otherwise. */ static int llc_ui_recvmsg(struct kiocb *iocb, struct socket *sock, - struct msghdr *msg, int size, int flags) + struct msghdr *msg, size_t size, int flags) { struct sock *sk = sock->sk; struct sockaddr_llc *uaddr = (struct sockaddr_llc *)msg->msg_name; struct sk_buff *skb; - int rc = -ENOMEM, copied = 0, timeout; + size_t copied = 0; + int rc = -ENOMEM, timeout; int noblock = flags & MSG_DONTWAIT; dprintk("%s: receiving in %02X from %02X\n", __FUNCTION__, @@ -725,7 +726,7 @@ * Returns non-negative upon success, negative otherwise. 
*/ static int llc_ui_sendmsg(struct kiocb *iocb, struct socket *sock, - struct msghdr *msg, int len) + struct msghdr *msg, size_t len) { struct sock *sk = sock->sk; struct llc_opt *llc = llc_sk(sk); @@ -734,7 +735,8 @@ int noblock = flags & MSG_DONTWAIT; struct net_device *dev; struct sk_buff *skb; - int rc = -EINVAL, size = 0, copied = 0, hdrlen; + size_t size = 0; + int rc = -EINVAL, copied = 0, hdrlen; dprintk("%s: sending from %02X to %02X\n", __FUNCTION__, llc->laddr.lsap, llc->daddr.lsap); From shemminger@osdl.org Thu Jan 8 14:02:54 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 08 Jan 2004 14:03:10 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i08M2sTa031570 for ; Thu, 8 Jan 2004 14:02:54 -0800 Received: from linux.local (build.pdx.osdl.net [172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id i08M2fo08239; Thu, 8 Jan 2004 14:02:41 -0800 Date: Fri, 9 Jan 2004 14:02:58 -0800 From: Stephen Hemminger To: "David S. Miller" , Ralf Baechle Cc: netdev@oss.sgi.com Subject: [PATCH] (14/17) netrom -- size_t for send/recvmsg Message-Id: <20040109140258.34aa7c73@linux.local> X-Mailer: Sylpheed version 0.9.8claws (GTK+ 1.2.10; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2299 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Content-Length: 817 Lines: 28 Convert Netrom to unsigned length for send/receive msg. 
diff -Nru a/net/netrom/af_netrom.c b/net/netrom/af_netrom.c --- a/net/netrom/af_netrom.c Mon Dec 8 16:20:10 2003 +++ b/net/netrom/af_netrom.c Mon Dec 8 16:20:10 2003 @@ -1003,7 +1003,7 @@ } static int nr_sendmsg(struct kiocb *iocb, struct socket *sock, - struct msghdr *msg, int len) + struct msghdr *msg, size_t len) { struct sock *sk = sock->sk; nr_cb *nr = nr_sk(sk); @@ -1112,11 +1112,11 @@ } static int nr_recvmsg(struct kiocb *iocb, struct socket *sock, - struct msghdr *msg, int size, int flags) + struct msghdr *msg, size_t size, int flags) { struct sock *sk = sock->sk; struct sockaddr_ax25 *sax = (struct sockaddr_ax25 *)msg->msg_name; - int copied; + size_t copied; struct sk_buff *skb; int er; From shemminger@osdl.org Thu Jan 8 14:02:51 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 08 Jan 2004 14:03:08 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i08M2pTa031564 for ; Thu, 8 Jan 2004 14:02:51 -0800 Received: from linux.local (build.pdx.osdl.net [172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id i08M2bo08194; Thu, 8 Jan 2004 14:02:37 -0800 Date: Fri, 9 Jan 2004 14:03:36 -0800 From: Stephen Hemminger To: "David S. Miller" , Steven Whitehouse Cc: netdev@oss.sgi.com Subject: [PATCH] (9/17) decnet -- size_t in send/recvmsg Message-Id: <20040109140336.45c393c5@linux.local> X-Mailer: Sylpheed version 0.9.8claws (GTK+ 1.2.10; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2297 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Content-Length: 1582 Lines: 56 Send/recv msg now take an unsigned total message length.
diff -Nru a/net/decnet/af_decnet.c b/net/decnet/af_decnet.c --- a/net/decnet/af_decnet.c Mon Dec 8 16:19:55 2003 +++ b/net/decnet/af_decnet.c Mon Dec 8 16:19:55 2003 @@ -1659,13 +1659,13 @@ static int dn_recvmsg(struct kiocb *iocb, struct socket *sock, - struct msghdr *msg, int size, int flags) + struct msghdr *msg, size_t size, int flags) { struct sock *sk = sock->sk; struct dn_scp *scp = DN_SK(sk); struct sk_buff_head *queue = &sk->sk_receive_queue; - int target = size > 1 ? 1 : 0; - int copied = 0; + size_t target = size > 1 ? 1 : 0; + size_t copied = 0; int rv = 0; struct sk_buff *skb, *nskb; struct dn_skb_cb *cb = NULL; @@ -1746,7 +1746,7 @@ } for(skb = queue->next; skb != (struct sk_buff *)queue; skb = nskb) { - int chunk = skb->len; + unsigned int chunk = skb->len; cb = DN_SKB_CB(skb); if ((chunk + copied) > size) @@ -1888,20 +1888,20 @@ } static int dn_sendmsg(struct kiocb *iocb, struct socket *sock, - struct msghdr *msg, int size) + struct msghdr *msg, size_t size) { struct sock *sk = sock->sk; struct dn_scp *scp = DN_SK(sk); - int mss; + size_t mss; struct sk_buff_head *queue = &scp->data_xmit_queue; int flags = msg->msg_flags; int err = 0; - int sent = 0; + size_t sent = 0; int addr_len = msg->msg_namelen; struct sockaddr_dn *addr = (struct sockaddr_dn *)msg->msg_name; struct sk_buff *skb = NULL; struct dn_skb_cb *cb; - int len; + size_t len; unsigned char fctype; long timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT); From shemminger@osdl.org Thu Jan 8 14:02:55 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 08 Jan 2004 14:03:09 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i08M2sTa031571 for ; Thu, 8 Jan 2004 14:02:55 -0800 Received: from linux.local (build.pdx.osdl.net [172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id i08M2eo08231; Thu, 8 Jan 2004 14:02:40 -0800 Date: Fri, 9 Jan 2004 14:02:23 -0800 From: Stephen Hemminger To: "David S. 
Miller" , Arnaldo Carvalho de Melo Cc: netdev@oss.sgi.com Subject: [PATCH] (8/17) ipx -- size_t for send/recvmsg Message-Id: <20040109140223.3926cbeb@linux.local> X-Mailer: Sylpheed version 0.9.8claws (GTK+ 1.2.10; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2298 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Content-Length: 1575 Lines: 54 Convert IPX to unsigned for send/receive message length. Enforce MTU limited by header pktsize. diff -Nru a/net/ipx/af_ipx.c b/net/ipx/af_ipx.c --- a/net/ipx/af_ipx.c Mon Dec 8 16:19:52 2003 +++ b/net/ipx/af_ipx.c Mon Dec 8 16:19:52 2003 @@ -1683,7 +1683,7 @@ } static int ipx_sendmsg(struct kiocb *iocb, struct socket *sock, - struct msghdr *msg, int len) + struct msghdr *msg, size_t len) { struct sock *sk = sock->sk; struct ipx_opt *ipxs = ipx_sk(sk); @@ -1698,6 +1698,10 @@ if (flags & ~MSG_DONTWAIT) goto out; + /* Max possible packet size limited by 16 bit pktsize in header */ + if (len >= 65535 - sizeof(struct ipxhdr)) + goto out; + if (usipx) { if (!ipxs->port) { struct sockaddr_ipx uaddr; @@ -1744,7 +1748,7 @@ static int ipx_recvmsg(struct kiocb *iocb, struct socket *sock, - struct msghdr *msg, int size, int flags) + struct msghdr *msg, size_t size, int flags) { struct sock *sk = sock->sk; struct ipx_opt *ipxs = ipx_sk(sk); diff -Nru a/net/ipx/ipx_route.c b/net/ipx/ipx_route.c --- a/net/ipx/ipx_route.c Mon Dec 8 16:19:52 2003 +++ b/net/ipx/ipx_route.c Mon Dec 8 16:19:52 2003 @@ -169,13 +169,13 @@ * Route an outgoing frame from a socket. 
*/ int ipxrtr_route_packet(struct sock *sk, struct sockaddr_ipx *usipx, - struct iovec *iov, int len, int noblock) + struct iovec *iov, size_t len, int noblock) { struct sk_buff *skb; struct ipx_opt *ipxs = ipx_sk(sk); struct ipx_interface *intrfc; struct ipxhdr *ipx; - int size; + size_t size; int ipx_offset; struct ipx_route *rt = NULL; int rc; From shemminger@osdl.org Thu Jan 8 14:02:52 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 08 Jan 2004 14:03:08 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i08M2qTa031565 for ; Thu, 8 Jan 2004 14:02:52 -0800 Received: from linux.local (build.pdx.osdl.net [172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id i08M2eo08235; Thu, 8 Jan 2004 14:02:40 -0800 Date: Fri, 9 Jan 2004 14:02:36 -0800 From: Stephen Hemminger To: "David S. Miller" Cc: netdev@oss.sgi.com Subject: [PATCH] (10/17) netlink/packet -- size_t for send/recvmsg Message-Id: <20040109140236.35fb993a@linux.local> X-Mailer: Sylpheed version 0.9.8claws (GTK+ 1.2.10; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2296 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Content-Length: 2752 Lines: 93 Convert "maintenance protocols" af_key, packet, netlink over to use unsigned length for send/receive message.
diff -Nru a/net/key/af_key.c b/net/key/af_key.c --- a/net/key/af_key.c Mon Dec 8 16:19:58 2003 +++ b/net/key/af_key.c Mon Dec 8 16:19:58 2003 @@ -2655,7 +2655,7 @@ } static int pfkey_sendmsg(struct kiocb *kiocb, - struct socket *sock, struct msghdr *msg, int len) + struct socket *sock, struct msghdr *msg, size_t len) { struct sock *sk = sock->sk; struct sk_buff *skb = NULL; @@ -2697,7 +2697,7 @@ } static int pfkey_recvmsg(struct kiocb *kiocb, - struct socket *sock, struct msghdr *msg, int len, + struct socket *sock, struct msghdr *msg, size_t len, int flags) { struct sock *sk = sock->sk; diff -Nru a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c --- a/net/netlink/af_netlink.c Mon Dec 8 16:19:58 2003 +++ b/net/netlink/af_netlink.c Mon Dec 8 16:19:58 2003 @@ -601,7 +601,7 @@ } static int netlink_sendmsg(struct kiocb *kiocb, struct socket *sock, - struct msghdr *msg, int len) + struct msghdr *msg, size_t len) { struct sock_iocb *siocb = kiocb_to_siocb(kiocb); struct sock *sk = sock->sk; @@ -641,7 +641,7 @@ } err = -EMSGSIZE; - if ((unsigned)len > sk->sk_sndbuf - 32) + if (len > sk->sk_sndbuf - 32) goto out; err = -ENOBUFS; skb = alloc_skb(len, GFP_KERNEL); @@ -683,7 +683,7 @@ } static int netlink_recvmsg(struct kiocb *kiocb, struct socket *sock, - struct msghdr *msg, int len, + struct msghdr *msg, size_t len, int flags) { struct sock_iocb *siocb = kiocb_to_siocb(kiocb); @@ -691,7 +691,7 @@ struct sock *sk = sock->sk; struct netlink_opt *nlk = nlk_sk(sk); int noblock = flags&MSG_DONTWAIT; - int copied; + size_t copied; struct sk_buff *skb; int err; diff -Nru a/net/packet/af_packet.c b/net/packet/af_packet.c --- a/net/packet/af_packet.c Mon Dec 8 16:19:58 2003 +++ b/net/packet/af_packet.c Mon Dec 8 16:19:58 2003 @@ -279,7 +279,7 @@ */ static int packet_sendmsg_spkt(struct kiocb *iocb, struct socket *sock, - struct msghdr *msg, int len) + struct msghdr *msg, size_t len) { struct sock *sk = sock->sk; struct sockaddr_pkt *saddr=(struct sockaddr_pkt *)msg->msg_name; @@ 
-651,7 +651,7 @@ static int packet_sendmsg(struct kiocb *iocb, struct socket *sock, - struct msghdr *msg, int len) + struct msghdr *msg, size_t len) { struct sock *sk = sock->sk; struct sockaddr_ll *saddr=(struct sockaddr_ll *)msg->msg_name; @@ -999,7 +999,7 @@ */ static int packet_recvmsg(struct kiocb *iocb, struct socket *sock, - struct msghdr *msg, int len, int flags) + struct msghdr *msg, size_t len, int flags) { struct sock *sk = sock->sk; struct sk_buff *skb; From shemminger@osdl.org Thu Jan 8 14:02:50 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 08 Jan 2004 14:03:06 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i08M2oTa031562 for ; Thu, 8 Jan 2004 14:02:50 -0800 Received: from linux.local (build.pdx.osdl.net [172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id i08M2do08223; Thu, 8 Jan 2004 14:02:39 -0800 Date: Fri, 9 Jan 2004 14:01:16 -0800 From: Stephen Hemminger To: "David S. Miller" Cc: netdev@oss.sgi.com Subject: [PATCH] (16/17) af_unix -- size_t for send/recvmsg Message-Id: <20040109140116.639569b6@linux.local> X-Mailer: Sylpheed version 0.9.8claws (GTK+ 1.2.10; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2294 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Content-Length: 1490 Lines: 50 Convert Unix domain to handle unsigned for length in send/recv msg. 
diff -Nru a/net/unix/af_unix.c b/net/unix/af_unix.c --- a/net/unix/af_unix.c Mon Dec 8 16:20:16 2003 +++ b/net/unix/af_unix.c Mon Dec 8 16:20:16 2003 @@ -1176,7 +1176,7 @@ */ static int unix_dgram_sendmsg(struct kiocb *kiocb, struct socket *sock, - struct msghdr *msg, int len) + struct msghdr *msg, size_t len) { struct sock_iocb *siocb = kiocb_to_siocb(kiocb); struct sock *sk = sock->sk; @@ -1217,7 +1217,7 @@ goto out; err = -EMSGSIZE; - if ((unsigned)len > sk->sk_sndbuf - 32) + if (len > sk->sk_sndbuf - 32) goto out; skb = sock_alloc_send_skb(sk, len, msg->msg_flags&MSG_DONTWAIT, &err); @@ -1324,7 +1324,7 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock, - struct msghdr *msg, int len) + struct msghdr *msg, size_t len) { struct sock_iocb *siocb = kiocb_to_siocb(kiocb); struct sock *sk = sock->sk; @@ -1447,7 +1447,7 @@ } static int unix_dgram_recvmsg(struct kiocb *iocb, struct socket *sock, - struct msghdr *msg, int size, + struct msghdr *msg, size_t size, int flags) { struct sock_iocb *siocb = kiocb_to_siocb(iocb); @@ -1555,7 +1555,7 @@ static int unix_stream_recvmsg(struct kiocb *iocb, struct socket *sock, - struct msghdr *msg, int size, + struct msghdr *msg, size_t size, int flags) { struct sock_iocb *siocb = kiocb_to_siocb(iocb); From shemminger@osdl.org Thu Jan 8 14:02:51 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 08 Jan 2004 14:03:06 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i08M2oTa031563 for ; Thu, 8 Jan 2004 14:02:51 -0800 Received: from linux.local (build.pdx.osdl.net [172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id i08M2do08227; Thu, 8 Jan 2004 14:02:39 -0800 Date: Fri, 9 Jan 2004 14:02:08 -0800 From: Stephen Hemminger To: "David S. 
Miller" Cc: netdev@oss.sgi.com Subject: [PATCH] (17/17) x25 -- size_t for send/recvmsg Message-Id: <20040109140208.7dd9a0fe@linux.local> X-Mailer: Sylpheed version 0.9.8claws (GTK+ 1.2.10; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2295 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Content-Length: 1170 Lines: 41 Convert X.25 to handle unsigned for len in send/receive msg. diff -Nru a/net/x25/af_x25.c b/net/x25/af_x25.c --- a/net/x25/af_x25.c Mon Dec 8 16:20:19 2003 +++ b/net/x25/af_x25.c Mon Dec 8 16:20:19 2003 @@ -910,7 +910,7 @@ } static int x25_sendmsg(struct kiocb *iocb, struct socket *sock, - struct msghdr *msg, int len) + struct msghdr *msg, size_t len) { struct sock *sk = sock->sk; struct x25_opt *x25 = x25_sk(sk); @@ -919,7 +919,8 @@ struct sk_buff *skb; unsigned char *asmptr; int noblock = msg->msg_flags & MSG_DONTWAIT; - int size, qbit = 0, rc = -EINVAL; + size_t size; + int qbit = 0, rc = -EINVAL; if (msg->msg_flags & ~(MSG_DONTWAIT | MSG_OOB | MSG_EOR)) goto out; @@ -1085,13 +1086,14 @@ static int x25_recvmsg(struct kiocb *iocb, struct socket *sock, - struct msghdr *msg, int size, + struct msghdr *msg, size_t size, int flags) { struct sock *sk = sock->sk; struct x25_opt *x25 = x25_sk(sk); struct sockaddr_x25 *sx25 = (struct sockaddr_x25 *)msg->msg_name; - int copied, qbit; + size_t copied; + int qbit; struct sk_buff *skb; unsigned char *asmptr; int rc = -ENOTCONN; From shemminger@osdl.org Thu Jan 8 14:02:48 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 08 Jan 2004 14:03:05 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i08M2mTa031561 for ; Thu, 8 Jan 2004 14:02:48 -0800 Received: from linux.local (build.pdx.osdl.net [172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) 
with SMTP id i08M2bo08190; Thu, 8 Jan 2004 14:02:37 -0800 Date: Fri, 9 Jan 2004 14:03:23 -0800 From: Stephen Hemminger To: "David S. Miller" Cc: netdev@oss.sgi.com Subject: [PATCH] (11/17) econet -- size_t for send/recvmsg Message-Id: <20040109140323.7d48f9db@linux.local> X-Mailer: Sylpheed version 0.9.8claws (GTK+ 1.2.10; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2293 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Content-Length: 1046 Lines: 40 Convert econet over to unsigned length for send/recv msg. Add a bounds check based on device mtu - header. diff -Nru a/net/econet/af_econet.c b/net/econet/af_econet.c --- a/net/econet/af_econet.c Mon Dec 8 16:20:01 2003 +++ b/net/econet/af_econet.c Mon Dec 8 16:20:01 2003 @@ -113,11 +113,12 @@ */ static int econet_recvmsg(struct kiocb *iocb, struct socket *sock, - struct msghdr *msg, int len, int flags) + struct msghdr *msg, size_t len, int flags) { struct sock *sk = sock->sk; struct sk_buff *skb; - int copied, err; + size_t copied; + int err; msg->msg_namelen = sizeof(struct sockaddr_ec); @@ -246,7 +247,7 @@ */ static int econet_sendmsg(struct kiocb *iocb, struct socket *sock, - struct msghdr *msg, int len) + struct msghdr *msg, size_t len) { struct sock *sk = sock->sk; struct sockaddr_ec *saddr=(struct sockaddr_ec *)msg->msg_name; @@ -307,6 +308,9 @@ if (dev == NULL) return -ENETDOWN; } + + if (len + 15 > dev->mtu) + return -EMSGSIZE; if (dev->type == ARPHRD_ECONET) { From shemminger@osdl.org Thu Jan 8 14:03:36 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 08 Jan 2004 14:03:49 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i08M3YTa031898 for ; Thu, 8 Jan 2004 14:03:35 -0800 Received: from linux.local (build.pdx.osdl.net 
[172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id i08M2ao08186; Thu, 8 Jan 2004 14:02:36 -0800 Date: Fri, 9 Jan 2004 14:03:17 -0800 From: Stephen Hemminger To: "David S. Miller" , Jean Tourrilhes Cc: netdev@oss.sgi.com Subject: [PATCH] (12/17) irda -- size_t for send/recvmsg Message-Id: <20040109140317.0a8a30a8@linux.local> X-Mailer: Sylpheed version 0.9.8claws (GTK+ 1.2.10; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2301 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Content-Length: 1919 Lines: 63 Convert IRDA to unsigned length for send/recv msg. diff -Nru a/net/irda/af_irda.c b/net/irda/af_irda.c --- a/net/irda/af_irda.c Mon Dec 8 16:20:04 2003 +++ b/net/irda/af_irda.c Mon Dec 8 16:20:04 2003 @@ -1257,7 +1257,7 @@ * fragment the message if necessary */ static int irda_sendmsg(struct kiocb *iocb, struct socket *sock, - struct msghdr *msg, int len) + struct msghdr *msg, size_t len) { struct sock *sk = sock->sk; struct irda_sock *self; @@ -1329,12 +1329,13 @@ * after being read, regardless of how much the user actually read */ static int irda_recvmsg_dgram(struct kiocb *iocb, struct socket *sock, - struct msghdr *msg, int size, int flags) + struct msghdr *msg, size_t size, int flags) { struct sock *sk = sock->sk; struct irda_sock *self = irda_sk(sk); struct sk_buff *skb; - int copied, err; + size_t copied; + int err; IRDA_DEBUG(4, "%s()\n", __FUNCTION__); @@ -1379,12 +1380,12 @@ * Function irda_recvmsg_stream (iocb, sock, msg, size, flags) */ static int irda_recvmsg_stream(struct kiocb *iocb, struct socket *sock, - struct msghdr *msg, int size, int flags) + struct msghdr *msg, size_t size, int flags) { struct sock *sk = sock->sk; struct irda_sock *self = irda_sk(sk); int noblock = flags & MSG_DONTWAIT; - int copied = 0; + size_t copied = 0; 
int target = 1; DECLARE_WAITQUEUE(waitq, current); @@ -1505,7 +1506,7 @@ * */ static int irda_sendmsg_dgram(struct kiocb *iocb, struct socket *sock, - struct msghdr *msg, int len) + struct msghdr *msg, size_t len) { struct sock *sk = sock->sk; struct irda_sock *self; @@ -1571,7 +1572,7 @@ */ #ifdef CONFIG_IRDA_ULTRA static int irda_sendmsg_ultra(struct kiocb *iocb, struct socket *sock, - struct msghdr *msg, int len) + struct msghdr *msg, size_t len) { struct sock *sk = sock->sk; struct irda_sock *self; From acme@conectiva.com.br Thu Jan 8 14:07:01 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 08 Jan 2004 14:07:15 -0800 (PST) Received: from orion.netbank.com.br (orion.netbank.com.br [200.203.199.90]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i08M70Ta002299 for ; Thu, 8 Jan 2004 14:07:01 -0800 Received: from 4-203.ctame700-6.telepar.net.br ([200.140.237.203] helo=oops.kerneljanitors.org) by orion.netbank.com.br with asmtp (Exim 3.33 #1) id 1AeiQD-0000fI-00; Thu, 08 Jan 2004 20:14:34 -0200 Received: by oops.kerneljanitors.org (Postfix, from userid 500) id B0C1C1966D; Thu, 8 Jan 2004 20:17:51 -0200 (BRDT) Date: Thu, 8 Jan 2004 20:17:51 -0200 From: Arnaldo Carvalho de Melo To: Stephen Hemminger Cc: "David S. Miller" , netdev@oss.sgi.com Subject: Re: [PATCH] (8/17) ipx -- size_t for send/recvmsg Message-ID: <20040108221751.GC3062@conectiva.com.br> References: <20040109140223.3926cbeb@linux.local> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20040109140223.3926cbeb@linux.local> X-Url: http://advogato.org/person/acme Organization: Conectiva S.A. 
User-Agent: Mutt/1.5.5.1i X-archive-position: 2302 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: acme@conectiva.com.br Precedence: bulk X-list: netdev Content-Length: 192 Lines: 5 Em Fri, Jan 09, 2004 at 02:02:23PM -0800, Stephen Hemminger escreveu: > Convert IPX to unsigned for send/receive message length. > Enforce MTU limited by header pktsize. ACK, thanks Stephen. From acme@conectiva.com.br Thu Jan 8 14:07:14 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 08 Jan 2004 14:07:26 -0800 (PST) Received: from orion.netbank.com.br (orion.netbank.com.br [200.203.199.90]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i08M7CTa002307 for ; Thu, 8 Jan 2004 14:07:13 -0800 Received: from 4-203.ctame700-6.telepar.net.br ([200.140.237.203] helo=oops.kerneljanitors.org) by orion.netbank.com.br with asmtp (Exim 3.33 #1) id 1AeiQQ-0000fO-00; Thu, 08 Jan 2004 20:14:46 -0200 Received: by oops.kerneljanitors.org (Postfix, from userid 500) id C9F1C1966D; Thu, 8 Jan 2004 20:18:04 -0200 (BRDT) Date: Thu, 8 Jan 2004 20:18:04 -0200 From: Arnaldo Carvalho de Melo To: Stephen Hemminger Cc: "David S. Miller" , netdev@oss.sgi.com Subject: Re: [PATCH] (13/17) llc -- size_t for send/recvmsg Message-ID: <20040108221804.GD3062@conectiva.com.br> References: <20040109140311.66c2fa57@linux.local> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20040109140311.66c2fa57@linux.local> X-Url: http://advogato.org/person/acme Organization: Conectiva S.A. User-Agent: Mutt/1.5.5.1i X-archive-position: 2303 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: acme@conectiva.com.br Precedence: bulk X-list: netdev Content-Length: 138 Lines: 5 Em Fri, Jan 09, 2004 at 02:03:11PM -0800, Stephen Hemminger escreveu: > Convert LLC to unsigned for send/recv msg. ACK, thanks Stephen. 
From jt@bougret.hpl.hp.com Thu Jan 8 15:18:34 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 08 Jan 2004 15:18:47 -0800 (PST) Received: from palrel11.hp.com (palrel11.hp.com [156.153.255.246]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i08NIOTa005875 for ; Thu, 8 Jan 2004 15:18:24 -0800 Received: from tomil.hpl.hp.com (tomil.hpl.hp.com [15.0.152.100]) by palrel11.hp.com (Postfix) with ESMTP id 8317F1C0060D; Thu, 8 Jan 2004 14:56:57 -0800 (PST) Received: from bougret.hpl.hp.com (bougret.hpl.hp.com [15.4.92.227]) by tomil.hpl.hp.com (8.9.3 (PHNE_28810+JAGae91741+JAGae92668)/8.9.3 HPLabs Timeshare Server) with ESMTP id OAA07537; Thu, 8 Jan 2004 14:56:57 -0800 (PST) Received: from jt by bougret.hpl.hp.com with local (Exim 3.35 #1 (Debian)) id 1Aej5F-0005xQ-00; Thu, 08 Jan 2004 14:56:57 -0800 Date: Thu, 8 Jan 2004 14:56:57 -0800 To: Stephen Hemminger Cc: "David S. Miller" , netdev@oss.sgi.com Subject: Re: [PATCH] (12/17) irda -- size_t for send/recvmsg Message-ID: <20040108225657.GB22438@bougret.hpl.hp.com> Reply-To: jt@hpl.hp.com References: <20040109140317.0a8a30a8@linux.local> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20040109140317.0a8a30a8@linux.local> User-Agent: Mutt/1.3.28i Organisation: HP Labs Palo Alto Address: HP Labs, 1U-17, 1501 Page Mill road, Palo Alto, CA 94304, USA. E-mail: jt@hpl.hp.com From: Jean Tourrilhes X-archive-position: 2304 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jt@bougret.hpl.hp.com Precedence: bulk X-list: netdev Content-Length: 211 Lines: 7 On Fri, Jan 09, 2004 at 02:03:17PM -0800, Stephen Hemminger wrote: > Convert IRDA to unsigned length for send/recv msg. Ok, I've added that on my web page and will remind Dave if he doesn't pick it up. 
Jean From davem@pizda.ninka.net Thu Jan 8 15:44:24 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 08 Jan 2004 15:44:36 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i08NiNTa006702 for ; Thu, 8 Jan 2004 15:44:23 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id PAA03186; Thu, 8 Jan 2004 15:36:17 -0800 Date: Thu, 8 Jan 2004 15:36:17 -0800 From: "David S. Miller" To: jt@hpl.hp.com Cc: jt@bougret.hpl.hp.com, shemminger@osdl.org, netdev@oss.sgi.com Subject: Re: [PATCH] (12/17) irda -- size_t for send/recvmsg Message-Id: <20040108153617.58a75a75.davem@redhat.com> In-Reply-To: <20040108225657.GB22438@bougret.hpl.hp.com> References: <20040109140317.0a8a30a8@linux.local> <20040108225657.GB22438@bougret.hpl.hp.com> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2305 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 387 Lines: 11 On Thu, 8 Jan 2004 14:56:57 -0800 Jean Tourrilhes wrote: > On Fri, Jan 09, 2004 at 02:03:17PM -0800, Stephen Hemminger wrote: > > Convert IRDA to unsigned length for send/recv msg. > > Ok, I've added that on my web page and will remind Dave if he > doesn't pick it up. Don't worry, I'll queue all of these up for 2.6.2, it's too late for 2.6.1 at this point. 
From jt@bougret.hpl.hp.com Thu Jan 8 15:47:56 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 08 Jan 2004 15:48:09 -0800 (PST) Received: from palrel13.hp.com (palrel13.hp.com [156.153.255.238]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i08NluTa007119 for ; Thu, 8 Jan 2004 15:47:56 -0800 Received: from tomil.hpl.hp.com (tomil.hpl.hp.com [15.0.152.100]) by palrel13.hp.com (Postfix) with ESMTP id 248531C007DE; Thu, 8 Jan 2004 15:47:56 -0800 (PST) Received: from bougret.hpl.hp.com (bougret.hpl.hp.com [15.4.92.227]) by tomil.hpl.hp.com (8.9.3 (PHNE_28810+JAGae91741+JAGae92668)/8.9.3 HPLabs Timeshare Server) with ESMTP id PAA09318; Thu, 8 Jan 2004 15:47:55 -0800 (PST) Received: from jt by bougret.hpl.hp.com with local (Exim 3.35 #1 (Debian)) id 1AejsZ-00064I-00; Thu, 08 Jan 2004 15:47:55 -0800 Date: Thu, 8 Jan 2004 15:47:55 -0800 To: "David S. Miller" Cc: shemminger@osdl.org, netdev@oss.sgi.com Subject: Re: [PATCH] (12/17) irda -- size_t for send/recvmsg Message-ID: <20040108234755.GA23320@bougret.hpl.hp.com> Reply-To: jt@hpl.hp.com References: <20040109140317.0a8a30a8@linux.local> <20040108225657.GB22438@bougret.hpl.hp.com> <20040108153617.58a75a75.davem@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20040108153617.58a75a75.davem@redhat.com> User-Agent: Mutt/1.3.28i Organisation: HP Labs Palo Alto Address: HP Labs, 1U-17, 1501 Page Mill road, Palo Alto, CA 94304, USA. E-mail: jt@hpl.hp.com From: Jean Tourrilhes X-archive-position: 2306 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jt@bougret.hpl.hp.com Precedence: bulk X-list: netdev Content-Length: 573 Lines: 18 On Thu, Jan 08, 2004 at 03:36:17PM -0800, David S. Miller wrote: > On Thu, 8 Jan 2004 14:56:57 -0800 > Jean Tourrilhes wrote: > > > On Fri, Jan 09, 2004 at 02:03:17PM -0800, Stephen Hemminger wrote: > > > Convert IRDA to unsigned length for send/recv msg. 
> > > > Ok, I've added that on my web page and will remind Dave if he > > doesn't pick it up. > > Don't worry, I'll queue all of these up for 2.6.2, it's too late for 2.6.1 at this > point. Perfect. 2.6.2 is what I was aiming for, it's not like we are in a hurry. Thanks a lot ! Jean From shemminger@osdl.org Thu Jan 8 16:52:51 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 08 Jan 2004 16:53:04 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i090qoTa011580 for ; Thu, 8 Jan 2004 16:52:50 -0800 Received: from linux.local (build.pdx.osdl.net [172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id i090qgo07604; Thu, 8 Jan 2004 16:52:42 -0800 Date: Thu, 8 Jan 2004 16:53:43 -0800 From: Stephen Hemminger To: Jeff Garzik Cc: netdev@oss.sgi.com Subject: [PATCH] bugfixes for dgrs.c Message-Id: <20040108165343.7ed94da9@linux.local> X-Mailer: Sylpheed version 0.9.8claws (GTK+ 1.2.10; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2307 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Content-Length: 12334 Lines: 517 Update the RightSwitch dgrs.c driver for net-drivers-2.5-exp (2.6.1-rc3) to resolve some of the outstanding cruft there. Al may have a better/newer patch. * Don't copy net_device structure on slave device! This won't work because of state variables in structure. * make sure and request_regions before doing i/o to the card * use cpu_relax rather than barrier while spin waiting * don't use dev->init to do the probing work because hard to get unwind correct * Use new pci/eisa probing model, don't search the bus directly Beneficial side effect, don't need to keep on device list anymore. 
* Be more careful about releasing resources in error paths Compiled, and the module loads and unloads, but I don't have this hardware. diff -Nru a/drivers/net/dgrs.c b/drivers/net/dgrs.c --- a/drivers/net/dgrs.c Thu Jan 8 16:44:05 2004 +++ b/drivers/net/dgrs.c Thu Jan 8 16:44:05 2004 @@ -121,11 +121,22 @@ #include "dgrs_asstruct.h" #include "dgrs_bcomm.h" +#ifdef CONFIG_PCI static struct pci_device_id dgrs_pci_tbl[] = { { SE6_PCI_VENDOR_ID, SE6_PCI_DEVICE_ID, PCI_ANY_ID, PCI_ANY_ID, }, { } /* Terminating entry */ }; MODULE_DEVICE_TABLE(pci, dgrs_pci_tbl); +#endif + +#ifdef CONFIG_EISA +static struct eisa_device_id dgrs_eisa_tbl[] = { + { "DBI0A01" }, + { } +}; +MODULE_DEVICE_TABLE(eisa, dgrs_eisa_tbl); +#endif + MODULE_LICENSE("GPL"); @@ -179,11 +190,6 @@ static int dgrs_nicmode; /* - * Chain of device structures - */ -static struct net_device *dgrs_root_dev; - -/* * Private per-board data structure (dev->priv) */ typedef struct @@ -191,7 +197,6 @@ /* * Stuff for generic ethercard I/F */ - struct net_device *next_dev; struct net_device_stats stats; /* @@ -1187,7 +1192,7 @@ priv->intrcnt = 0; for (i = jiffies + 2*HZ + HZ/2; time_after(i, jiffies); ) { - barrier(); /* gcc 2.95 needs this */ + cpu_relax(); if (priv->intrcnt >= 2) break; } @@ -1200,16 +1205,6 @@ } /* - * Register the /proc/ioports information... - */ - if (!request_region(dev->base_addr, 256, "RightSwitch")) { - printk(KERN_ERR "%s: io 0x%3lX, which is busy.\n", dev->name, - dev->base_addr); - rc = -EBUSY; - goto err_free_irq; - } - - /* * Entry points...
*/ dev->open = &dgrs_open; @@ -1242,22 +1237,23 @@ return (0); } -static int __init +static struct net_device * __init dgrs_found_device( int io, ulong mem, int irq, ulong plxreg, - ulong plxdma + ulong plxdma, + struct device *pdev ) { - DGRS_PRIV *priv; - struct net_device *dev, *aux; - int i, ret; + DGRS_PRIV *priv; + struct net_device *dev; + int i, ret = -ENOMEM; dev = alloc_etherdev(sizeof(DGRS_PRIV)); if (!dev) - return -ENOMEM; + goto err0; priv = (DGRS_PRIV *)dev->priv; @@ -1272,19 +1268,19 @@ priv->chan = 1; priv->devtbl[0] = dev; - dev->init = dgrs_probe1; SET_MODULE_OWNER(dev); - - if (register_netdev(dev) != 0) { - free_netdev(dev); - return -EIO; - } - - priv->next_dev = dgrs_root_dev; - dgrs_root_dev = dev; + SET_NETDEV_DEV(dev, pdev); + + ret = dgrs_probe1(dev); + if (ret) + goto err1; + + ret = register_netdev(dev); + if (ret) + goto err2; if ( !dgrs_nicmode ) - return (0); /* Switch mode, we are done */ + return dev; /* Switch mode, we are done */ /* * Operating card as N separate NICs @@ -1302,8 +1298,7 @@ if (!devN) goto fail; - /* Make it an exact copy of dev[0]... */ - *devN = *dev; + /* Don't copy the network device structure! */ /* copy the priv structure of dev[0] */ privN = (DGRS_PRIV *)devN->priv; @@ -1316,123 +1311,212 @@ devN->irq = 0; /* ... and base MAC address off address of 1st port */ devN->dev_addr[5] += i; - /* ... 
choose a new name */ - strncpy(devN->name, "eth%d", IFNAMSIZ); - devN->init = dgrs_initclone; + + ret = dgrs_initclone(devN); + if (ret) + goto fail; + SET_MODULE_OWNER(devN); + SET_NETDEV_DEV(dev, pdev); - ret = -EIO; - if (register_netdev(devN)) { + ret = register_netdev(devN); + if (ret) { free_netdev(devN); goto fail; } privN->chan = i+1; priv->devtbl[i] = devN; - privN->next_dev = dgrs_root_dev; - dgrs_root_dev = devN; } - return 0; -fail: aux = priv->next_dev; - while (dgrs_root_dev != aux) { - struct net_device *d = dgrs_root_dev; - - dgrs_root_dev = ((DGRS_PRIV *)d->priv)->next_dev; + return dev; + + fail: + while (i >= 0) { + struct net_device *d = priv->devtbl[i--]; unregister_netdev(d); free_netdev(d); } - return ret; + + err2: + free_irq(dev->irq, dev); + err1: + free_netdev(dev); + err0: + return ERR_PTR(ret); } -/* - * Scan for all boards - */ -static int is2iv[8] __initdata = { 0, 3, 5, 7, 10, 11, 12, 15 }; +static void __devexit dgrs_remove(struct net_device *dev) +{ + DGRS_PRIV *priv = dev->priv; + int i; + + unregister_netdev(dev); + + for (i = 1; i < priv->nports; ++i) { + struct net_device *d = priv->devtbl[i]; + if (d) { + unregister_netdev(d); + free_netdev(d); + } + } + + proc_reset(priv->devtbl[0], 1); -static int __init dgrs_scan(void) + if (priv->vmem) + iounmap(priv->vmem); + if (priv->vplxdma) + iounmap((uchar *) priv->vplxdma); + + if (dev->irq) + free_irq(dev->irq, dev); + + for (i = 1; i < priv->nports; ++i) { + if (priv->devtbl[i]) + unregister_netdev(priv->devtbl[i]); + } +} + +#ifdef CONFIG_PCI +static int __init dgrs_pci_probe(struct pci_dev *pdev, + const struct pci_device_id *ent) { - int cards_found = 0; + struct net_device *dev; + int err; uint io; uint mem; uint irq; uint plxreg; uint plxdma; - struct pci_dev *pdev = NULL; /* - * First, check for PCI boards - */ - while ((pdev = pci_find_device(SE6_PCI_VENDOR_ID, SE6_PCI_DEVICE_ID, pdev)) != NULL) - { - /* - * Get and check the bus-master and latency values. 
- * Some PCI BIOSes fail to set the master-enable bit, - * and the latency timer must be set to the maximum - * value to avoid data corruption that occurs when the - * timer expires during a transfer. Yes, it's a bug. - */ - if (pci_enable_device(pdev)) - continue; - pci_set_master(pdev); - - plxreg = pci_resource_start (pdev, 0); - io = pci_resource_start (pdev, 1); - mem = pci_resource_start (pdev, 2); - pci_read_config_dword(pdev, 0x30, &plxdma); - irq = pdev->irq; - plxdma &= ~15; + * Get and check the bus-master and latency values. + * Some PCI BIOSes fail to set the master-enable bit, + * and the latency timer must be set to the maximum + * value to avoid data corruption that occurs when the + * timer expires during a transfer. Yes, it's a bug. + */ + err = pci_enable_device(pdev); + if (err) + return err; + err = pci_request_regions(pdev, "RightSwitch"); + if (err) + return err; + + pci_set_master(pdev); + + plxreg = pci_resource_start (pdev, 0); + io = pci_resource_start (pdev, 1); + mem = pci_resource_start (pdev, 2); + pci_read_config_dword(pdev, 0x30, &plxdma); + irq = pdev->irq; + plxdma &= ~15; + + /* + * On some BIOSES, the PLX "expansion rom" (used for DMA) + * address comes up as "0". This is probably because + * the BIOS doesn't see a valid 55 AA ROM signature at + * the "ROM" start and zeroes the address. To get + * around this problem the SE-6 is configured to ask + * for 4 MB of space for the dual port memory. 
We then + * must set its range back to 2 MB, and use the upper + * half for DMA register access + */ + OUTL(io + PLX_SPACE0_RANGE, 0xFFE00000L); + if (plxdma == 0) + plxdma = mem + (2048L * 1024L); + pci_write_config_dword(pdev, 0x30, plxdma + 1); + pci_read_config_dword(pdev, 0x30, &plxdma); + plxdma &= ~15; + + dev = dgrs_found_device(io, mem, irq, plxreg, plxdma, &pdev->dev); + if (IS_ERR(dev)) { + pci_release_regions(pdev); + return PTR_ERR(dev); + } - /* - * On some BIOSES, the PLX "expansion rom" (used for DMA) - * address comes up as "0". This is probably because - * the BIOS doesn't see a valid 55 AA ROM signature at - * the "ROM" start and zeroes the address. To get - * around this problem the SE-6 is configured to ask - * for 4 MB of space for the dual port memory. We then - * must set its range back to 2 MB, and use the upper - * half for DMA register access - */ - OUTL(io + PLX_SPACE0_RANGE, 0xFFE00000L); - if (plxdma == 0) - plxdma = mem + (2048L * 1024L); - pci_write_config_dword(pdev, 0x30, plxdma + 1); - pci_read_config_dword(pdev, 0x30, &plxdma); - plxdma &= ~15; + pci_set_drvdata(pdev, dev); + return 0; +} - dgrs_found_device(io, mem, irq, plxreg, plxdma); +static void __devexit dgrs_pci_remove(struct pci_dev *pdev) +{ + struct net_device *dev = pci_get_drvdata(pdev); - cards_found++; - } + dgrs_remove(dev); + pci_release_regions(pdev); + free_netdev(dev); +} - /* - * Second, check for EISA boards - */ - if (EISA_bus) - { - for (io = 0x1000; io < 0x9000; io += 0x1000) - { - if (inb(io+ES4H_MANUFmsb) != 0x10 - || inb(io+ES4H_MANUFlsb) != 0x49 - || inb(io+ES4H_PRODUCT) != ES4H_PRODUCT_CODE) - continue; +static struct pci_driver dgrs_pci_driver = { + .name = "dgrs", + .id_table = dgrs_pci_tbl, + .probe = dgrs_pci_probe, + .remove = __devexit_p(dgrs_pci_remove), +}; +#endif + + +#ifdef CONFIG_EISA +static int is2iv[8] __initdata = { 0, 3, 5, 7, 10, 11, 12, 15 }; - if ( ! 
(inb(io+ES4H_EC) & ES4H_EC_ENABLE) ) - continue; /* Not EISA configured */ +static int __init dgrs_eisa_probe (struct device *gendev) +{ + struct net_device *dev; + struct eisa_device *edev = to_eisa_device(gendev); + uint io = edev->base_addr; + uint mem; + uint irq; + int rc = -ENODEV; /* Not EISA configured */ + + if (!request_region(io, 256, "RightSwitch")) { + printk(KERN_ERR "%s: io 0x%3lX, which is busy.\n", dev->name, + dev->base_addr); + return -EBUSY; + } - mem = (inb(io+ES4H_AS_31_24) << 24) - + (inb(io+ES4H_AS_23_16) << 16); + if ( ! (inb(io+ES4H_EC) & ES4H_EC_ENABLE) ) + goto err_out; - irq = is2iv[ inb(io+ES4H_IS) & ES4H_IS_INTMASK ]; + mem = (inb(io+ES4H_AS_31_24) << 24) + + (inb(io+ES4H_AS_23_16) << 16); - dgrs_found_device(io, mem, irq, 0L, 0L); + irq = is2iv[ inb(io+ES4H_IS) & ES4H_IS_INTMASK ]; - ++cards_found; - } + dev = dgrs_found_device(io, mem, irq, 0L, 0L, gendev); + if (IS_ERR(dev)) { + rc = PTR_ERR(dev); + goto err_out; } - return cards_found; + gendev->driver_data = dev; + return 0; + err_out: + release_region(io, 256); + return rc; +} + +static int __devexit dgrs_eisa_remove(struct device *gendev) +{ + struct net_device *dev = gendev->driver_data; + + dgrs_remove(dev); + + release_region(dev->base_addr, 256); + + free_netdev(dev); + return 0; } +static struct eisa_driver dgrs_eisa_driver = { + .id_table = dgrs_eisa_tbl, + .driver = { + .name = "dgrs", + .probe = dgrs_eisa_probe, + .remove = __devexit_p(dgrs_eisa_remove), + } +}; +#endif + /* * Variables that can be overriden from module command line */ @@ -1459,8 +1543,8 @@ static int __init dgrs_init_module (void) { - int cards_found; int i; + int eisacount = 0, pcicount = 0; /* * Command line variable overrides @@ -1501,38 +1585,27 @@ /* * Find and configure all the cards */ - dgrs_root_dev = NULL; - cards_found = dgrs_scan(); - - return cards_found ? 
0 : -ENODEV; +#ifdef CONFIG_EISA + eisacount = eisa_driver_register(&dgrs_eisa_driver); + if (eisacount < 0) + return eisacount; +#endif +#ifdef CONFIG_PCI + pcicount = pci_register_driver(&dgrs_pci_driver); + if (pcicount < 0) + return pcicount; +#endif + return (eisacount + pcicount) == 0 ? -ENODEV : 0; } static void __exit dgrs_cleanup_module (void) { - while (dgrs_root_dev) - { - struct net_device *next_dev; - DGRS_PRIV *priv; - - priv = (DGRS_PRIV *) dgrs_root_dev->priv; - next_dev = priv->next_dev; - unregister_netdev(dgrs_root_dev); - - proc_reset(priv->devtbl[0], 1); - - if (priv->vmem) - iounmap(priv->vmem); - if (priv->vplxdma) - iounmap((uchar *) priv->vplxdma); - - release_region(dgrs_root_dev->base_addr, 256); - - if (dgrs_root_dev->irq) - free_irq(dgrs_root_dev->irq, dgrs_root_dev); - - free_netdev(dgrs_root_dev); - dgrs_root_dev = next_dev; - } +#ifdef CONFIG_EISA + eisa_driver_unregister (&dgrs_eisa_driver); +#endif +#ifdef CONFIG_PCI + pci_unregister_driver (&dgrs_pci_driver); +#endif } module_init(dgrs_init_module); From willy@w.ods.org Thu Jan 8 16:54:34 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 08 Jan 2004 16:54:47 -0800 (PST) Received: from willy.net1.nerim.net (willy.net1.nerim.net [62.212.114.60]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i090sWTa011747 for ; Thu, 8 Jan 2004 16:54:33 -0800 Date: Fri, 9 Jan 2004 01:54:23 +0100 From: Willy Tarreau To: Nathaniel M Nelson Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Possible weird TCP/IP bug Message-ID: <20040109005423.GD545@alpha.home.local> References: <3FFDDE33.1070006@chartermi.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <3FFDDE33.1070006@chartermi.net> User-Agent: Mutt/1.4i X-archive-position: 2308 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: willy@w.ods.org Precedence: bulk X-list: netdev Content-Length: 917 
Lines: 18 You should post this with more information to netdev@oss.sgi.com. Please describe your setup a bit. Where do the packets originate? Are they forwarded by the firewall or emitted by it? In the latter case, are they generated by a local process or are they replies? etc... Willy On Thu, Jan 08, 2004 at 05:48:19PM -0500, Nathaniel M Nelson wrote: > I have 3 machines running 2.4.22 (Slackware 9.1) and only one of them, > which happens to be my firewall, sends out TCP sequence numbers starting > with "0". This does not seem right to me. If this is not a bug, I > apoligize...if anyone thinks it might be, please tell me if you need > more in-depth info. > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ From nmn@chartermi.net Thu Jan 8 20:22:17 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 08 Jan 2004 20:22:30 -0800 (PST) Received: from proxy3-baycity.chartermi.net (proxy3baymi.bay.mi.charter.com [24.247.24.41]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i094MGTa019639 for ; Thu, 8 Jan 2004 20:22:16 -0800 Received: from chartermi.net (24.231.146.33.gha.mi.chartermi.net [24.231.146.33] (may be forged)) by proxy3-baycity.chartermi.net (8.11.7p1+Sun/8.11.6) with ESMTP id i094M5A09755 for ; Thu, 8 Jan 2004 23:22:05 -0500 (EST) Message-ID: <3FFE2B00.2030607@chartermi.net> Date: Thu, 08 Jan 2004 23:16:00 -0500 From: Nathaniel M Nelson User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6b) Gecko/20031205 Thunderbird/0.4 X-Accept-Language: en-us, en MIME-Version: 1.0 To: netdev@oss.sgi.com Subject: Possible weird TCP bug Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Charter-MailScanner-Information: X-Charter-MailScanner: X-archive-position: 2309 X-ecartis-version: Ecartis v1.0.0 Sender:
netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: nmn@chartermi.net Precedence: bulk X-list: netdev Content-Length: 1880 Lines: 34 I have encountered a strange issue with one of my linux machines (which happens to be my firewall/masquerading box)....it seems that the TCP sequence numbers that it generates for output start with "0". This goes for any packet that originates from the firewall itself or any packets that are forwarded to that machine. This does not seem right to me...any other linux box that I hook up to the WAN looks like they generate a normal sequence number. This particular system is running a Tyan Thunder/LE-T 2518GN motherboard which is a Dual Socket 370 board. It has two Intel 82559 LAN controllers. Let me know if anyone needs more specs. It was running the 2.4.22 kernel and now runs the 2.4.24 kernel and both have the same tcp sequence problem. Below is a sample SYN packet going out to google.com. It has a sequence # of "0". 0000 00 02 7d 66 a4 54 00 e0 81 23 14 78 08 00 45 00 ..}f.T.. .#.x..E. 0010 00 3c 9a 41 40 00 3f 06 f4 1f 18 e7 92 21 d8 ef .<.A@.?. .....!.. 0020 29 63 89 37 00 50 e5 4b 22 e0 00 00 00 00 a0 02 )c.7.P.K "....... 0030 16 d0 36 6a 00 00 02 04 05 b4 04 02 08 0a 03 1d ..6j.... ........ 0040 b8 a1 00 00 00 00 01 03 03 00 ........ .. Then after I get the SYN,ACK back, the firewall will send out the next ACK with the sequence number correctly incremented by 1. 0000 00 02 7d 66 a4 54 00 e0 81 23 14 78 08 00 45 00 ..}f.T.. .#.x..E. 0010 00 28 9a 42 40 00 3f 06 f4 32 18 e7 92 21 d8 ef .(.B@.?. .2...!.. 0020 29 63 89 37 00 50 e5 4b 22 e1 db f2 5c c5 50 10 )c.7.P.K "...\.P. 0030 16 d0 21 3d 00 00 ..!=. So of course the sequence is "1" in that packet. Both sequence numbers seem a little low though... and not very cryptic. If this is not a bug I apoligize in advance. (Please CC replies to nmn@chartermi.net as I am not subscribed.) 
From michal@logix.cz Fri Jan 9 00:51:13 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 09 Jan 2004 00:51:26 -0800 (PST) Received: from maxipes.logix.cz (maxipes.logix.cz [81.0.234.97]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i098pATa028999 for ; Fri, 9 Jan 2004 00:51:13 -0800 Received: from logix.cz (styx.suse.cz [213.210.157.162]) (using TLSv1 with cipher RC4-MD5 (128/128 bits)) (Client CN "Michal Ludvig", Issuer "Personal Freemail RSA 2000.8.30" (verified OK)) by maxipes.logix.cz (Postfix) with ESMTP id 71147299F3; Fri, 9 Jan 2004 09:51:08 +0100 (CET) Message-ID: <3FFE6B72.9030808@logix.cz> Date: Fri, 09 Jan 2004 09:50:58 +0100 From: Michal Ludvig User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.5) Gecko/20030925 X-Accept-Language: cs, cz, en MIME-Version: 1.0 To: "David S. Miller" Cc: netdev@oss.sgi.com Subject: [PATCH] sha2-256 truncation Content-Type: multipart/mixed; boundary="------------090003070806070002080905" X-archive-position: 2310 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: michal@logix.cz Precedence: bulk X-list: netdev Content-Length: 1233 Lines: 40 This is a multi-part message in MIME format. --------------090003070806070002080905 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Hi, the attached trivial patch corrects the truncation size of computed hashes that are used in IPsec ESP/AH packets for SHA2-256. All other hash algorithms use 96 bits, and SuperFreeS/WAN and FreeBSD use 96 bits for SHA2-256 as well. Only the native Linux sha2-256 used 128 bits, which led to incompatibility with other IPsec implementations. Please apply, thanks! Michal Ludvig -- * A mouse is a device used to point at the xterm you want to type in.
* Personal homepage - http://www.logix.cz/michal --------------090003070806070002080905 Content-Type: text/plain; name="kernel-sha256.diff" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="kernel-sha256.diff" --- linux-2.6.0/net/xfrm/xfrm_algo.c 2004-01-08 01:29:52.067261651 +0100 +++ linux-2.6.0.orig/net/xfrm/xfrm_algo.c 2004-01-08 01:28:38.668690081 +0100 @@ -85,7 +85,7 @@ static struct xfrm_algo_desc aalg_list[] .uinfo = { .auth = { - .icv_truncbits = 96, + .icv_truncbits = 128, .icv_fullbits = 256, } }, --------------090003070806070002080905-- From andi@muc.de Fri Jan 9 01:09:16 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 09 Jan 2004 01:09:37 -0800 (PST) Received: from averell.firstfloor.org (niemand@pD95263A3.dip.t-dialin.net [217.82.99.163]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i0999ETa029721 for ; Fri, 9 Jan 2004 01:09:15 -0800 Received: from averell.firstfloor.org (localhost [127.0.0.1]) by averell.firstfloor.org (8.12.6/8.12.6/SuSE Linux 0.6) with ESMTP id i0999BpL001815; Fri, 9 Jan 2004 10:09:11 +0100 Received: (from andi@localhost) by averell.firstfloor.org (8.12.6/8.12.6/Submit) id i09998YV001814; Fri, 9 Jan 2004 10:09:08 +0100 Date: Fri, 9 Jan 2004 10:09:08 +0100 From: Andi Kleen To: jgarzik@pobox.com Cc: netdev@oss.sgi.com, mikep@linuxtr.net, linux-tr@linuxtr.net Subject: [PATCH] Mark IBM TR driver as not 64 bit clean Message-ID: <20040109090908.GA1772@averell> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4i X-archive-position: 2311 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ak@muc.de Precedence: bulk X-list: netdev Content-Length: 764 Lines: 18 This driver doesn't seem to be 64bit clean judging from the warnings on x86-64. Mark it as !64BIT. 
-Andi diff -burpN -X ../KDIFX linux-vanilla/drivers/net/pcmcia/Kconfig linux-2.6.1-amd64/drivers/net/pcmcia/Kconfig --- linux-vanilla/drivers/net/pcmcia/Kconfig 2003-09-28 10:55:00.000000000 +0200 +++ linux-2.6.1-amd64/drivers/net/pcmcia/Kconfig 2004-01-01 06:56:50.000000000 +0100 @@ -119,7 +119,7 @@ config ARCNET_COM20020_CS config PCMCIA_IBMTR tristate "IBM PCMCIA tokenring adapter support" - depends on NET_PCMCIA && IBMTR!=y && TR && PCMCIA + depends on NET_PCMCIA && IBMTR!=y && TR && PCMCIA && !64BIT help Say Y here if you intend to attach this type of Token Ring PCMCIA card to your computer. You then also need to say Y to "Token Ring From davem@pizda.ninka.net Fri Jan 9 02:01:38 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 09 Jan 2004 02:01:51 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i09A1bTa031701 for ; Fri, 9 Jan 2004 02:01:38 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id BAA04474; Fri, 9 Jan 2004 01:55:27 -0800 Date: Fri, 9 Jan 2004 01:55:27 -0800 From: "David S. Miller" To: Stephen Hemminger Cc: netdev@oss.sgi.com Subject: Re: [PATCH] (1/17) protocol sendmsg/revmsg prototype Message-Id: <20040109015527.1131a9ab.davem@redhat.com> In-Reply-To: <20040109132658.01aa28dd@linux.local> References: <20040109132658.01aa28dd@linux.local> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2312 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 275 Lines: 9 On Fri, 9 Jan 2004 13:26:58 -0800 Stephen Hemminger wrote: > The first patch changes the prototype, and causes all the protocols to > generate warnings that are cleared up in the next sixteen. 
All added to my 2.6.2-preX pending tree. Thanks Stephen. From davem@pizda.ninka.net Fri Jan 9 02:07:04 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 09 Jan 2004 02:07:17 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i09A74Ta032160 for ; Fri, 9 Jan 2004 02:07:04 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id CAA04513; Fri, 9 Jan 2004 02:00:43 -0800 Date: Fri, 9 Jan 2004 02:00:43 -0800 From: "David S. Miller" To: chas3@users.sourceforge.net Cc: chas@cmf.nrl.navy.mil, netdev@oss.sgi.com, ajz@cambridgebroadband.com Subject: Re: [PATCH][ATM]: br2684 incorrectly handles frames recvd with FCS (by Alex Zeffertt ) Message-Id: <20040109020043.223dc1b1.davem@redhat.com> In-Reply-To: <200401081658.i08GwdRr017374@ginger.cmf.nrl.navy.mil> References: <200401081658.i08GwdRr017374@ginger.cmf.nrl.navy.mil> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2313 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 164 Lines: 6 On Thu, 08 Jan 2004 11:58:40 -0500 chas williams (contractor) wrote: > please apply to 2.6 (and 2.4 as well!) Applied, thanks a lot Chas. From davem@pizda.ninka.net Fri Jan 9 02:11:08 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 09 Jan 2004 02:11:21 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i09AB8Ta001049 for ; Fri, 9 Jan 2004 02:11:08 -0800 Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id CAA04592; Fri, 9 Jan 2004 02:04:56 -0800 Date: Fri, 9 Jan 2004 02:04:56 -0800 From: "David S. 
Miller" To: Andi Kleen Cc: netdev@oss.sgi.com Subject: Re: [PATCH] Mark SIOCSIFNAME as compatible ioctl Message-Id: <20040109020456.045b447e.davem@redhat.com> In-Reply-To: <20040108070413.GA31778@averell> References: <20040108070413.GA31778@averell> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 2314 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 293 Lines: 10 On Thu, 8 Jan 2004 08:04:13 +0100 Andi Kleen wrote: > Mark SIOCSIFNAME as an ioctl that doesn't need 32bit conversion. > > Fixes nameif as 32bit executable. How can we mark it compatible? It needs the stuff dev_ifname32() in fs/compat_ioctl.c does for SIOCGIFNAME doesn't it? From michal@logix.cz Fri Jan 9 02:51:02 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 09 Jan 2004 02:51:15 -0800 (PST) Received: from maxipes.logix.cz (maxipes.logix.cz [81.0.234.97]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i09AooTa005787 for ; Fri, 9 Jan 2004 02:50:51 -0800 Received: from logix.cz (styx.suse.cz [213.210.157.162]) (using TLSv1 with cipher RC4-MD5 (128/128 bits)) (Client CN "Michal Ludvig", Issuer "Personal Freemail RSA 2000.8.30" (verified OK)) by maxipes.logix.cz (Postfix) with ESMTP id 8ACC029A1C; Fri, 9 Jan 2004 11:12:46 +0100 (CET) Message-ID: <3FFE7E98.6060201@logix.cz> Date: Fri, 09 Jan 2004 11:12:40 +0100 From: Michal Ludvig User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.5) Gecko/20030925 X-Accept-Language: cs, cz, en MIME-Version: 1.0 To: "David S. 
Miller" Cc: netdev@oss.sgi.com Subject: Re: [PATCH] sha2-256 truncation References: <3FFE6B72.9030808@logix.cz> In-Reply-To: <3FFE6B72.9030808@logix.cz> Content-Type: multipart/mixed; boundary="------------080108030503020203040007" X-archive-position: 2315 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: michal@logix.cz Precedence: bulk X-list: netdev Content-Length: 1324 Lines: 42 This is a multi-part message in MIME format. --------------080108030503020203040007 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Michal Ludvig told me that: > the attached trivial patch corrects the truncation size of computed > hashes that are used in IPsec ESP/AH packets for SHA2-256. All other > hash algorithms use 96 bits, and SuperFreeS/WAN and FreeBSD use 96 bits > for SHA2-256 as well. Only the native Linux sha2-256 used 128 bits, which > led to incompatibility with other IPsec implementations. Oops, sorry. I sent a reversed patch originally. Please use this one instead. Michal Ludvig -- * A mouse is a device used to point at the xterm you want to type in.
* Personal homepage - http://www.logix.cz/michal --------------080108030503020203040007 Content-Type: text/plain; name="kernel-sha256.diff" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="kernel-sha256.diff" --- linux-2.6.0/net/xfrm/xfrm_algo.c 2004-01-08 01:29:52.067261651 +0100 +++ linux-2.6.0.orig/net/xfrm/xfrm_algo.c 2004-01-08 01:28:38.668690081 +0100 @@ -85,7 +85,7 @@ static struct xfrm_algo_desc aalg_list[] .uinfo = { .auth = { - .icv_truncbits = 128, + .icv_truncbits = 96, .icv_fullbits = 256, } }, --------------080108030503020203040007-- From Holger.Kiehl@dwd.de Fri Jan 9 07:01:16 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 09 Jan 2004 07:01:34 -0800 (PST) Received: from dwdmx2.dwd.de (dwdmx2.dwd.de [141.38.3.197]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i09F1ETa017191 for ; Fri, 9 Jan 2004 07:01:15 -0800 Received: (qmail 6968996 invoked from network); 9 Jan 2004 15:01:12 -0000 Received: from mhofsv1.dwd.de (141.38.32.42) by dwdmx2.dwd.de with SMTP; 9 Jan 2004 15:01:12 -0000 Received: from praktifix.dwd.de by mhofsv1.dwd.de with ESMTP for netdev@oss.sgi.com; Fri, 9 Jan 2004 16:01:12 +0100 Date: Fri, 9 Jan 2004 15:01:12 +0000 (GMT) From: Holger Kiehl X-X-Sender: kiehl@praktifix.dwd.de To: netdev@oss.sgi.com Subject: Re: Problems with ipv4 multicast implementation in 2.4? (fwd) In-Reply-To: Message-Id: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=iso-8859-1 Content-Transfer-Encoding: 8BIT X-archive-position: 2317 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Holger.Kiehl@dwd.de Precedence: bulk X-list: netdev Content-Length: 9093 Lines: 166 Hello The problem has been resolved by Rafal Malek from tellique, here his reasoning: Each network device supports sending and receiving of data packets by default. 
But in the case of one-way satellite communication (used by your TelliCast service) the device is not able to send data - it only receives. So, if the DVB device (configured as a network interface) wants to send data (e.g. broadcasts for its own subnet) the data cannot be delivered and the packets' information is cached; the DVB driver does _not_ free the buffer of the packets which should be sent. Both the driver from linuxtv.org and the open source driver for the pent@value card have this error. Simply inserting a dev_kfree_skb(skb) in the dvb_net_tx() (linuxtv.org) / PentaVal_start_xmit() (pent@value) function solved this problem. Rafal Malek verified this for the linuxtv.org driver and I did it for the pent@value driver. Here are the relevant values for a system with a pent@value card and nearly 23 days uptime: ip_dst_cache 270 270 256 18 18 1 : 252 126 skbuff_head_cache 795 795 256 53 53 1 : 252 126 size-2048 452 482 2048 238 241 1 : 60 30 Thank you to all the people who helped to find this error! Unfortunately there is still another leak: the dentry_cache value does not go down. But this problem does not belong to this list. Holger On Fri, 14 Nov 2003, Holger Kiehl wrote: > Sorry for cross posting this message, but it has been pointed out that > linux-kernel is the wrong list for this question. > > Hello > > We have about 25 systems that receive data via a pci DVB card from satellite. > The data is received through multiple multicast streams by some closed > source software. On all systems we notice that the free memory decreases > until in most cases the systems are no longer reachable via network. They > then constantly print out: dst cache overflow. But I also have noticed that > some systems lock up hard; I assume this is because we just increase > the ip_dst_cache in /proc/sys/net/ipv4/route/max_size to some very > large value. > > I also know that the German Telekom and Eumetsat have the same problems > and always have to reboot their systems.
I also have reports from Austria > and expect many more systems in Europe are effected. > > To get more information I have setup 3 systems with different kernels and > hardware and noticed that over the time ip_dst_cache and skbuff_head_cache > in /proc/slabinfo always increase. They never go down. Also one or more of > the of the size-x values always increase depending on the kernel and DVB > card being used. Here some more slabinfo details and hardware being used: > > System1 : PIII 450MHz, 256MB ram, Kernel 2.4.23-pre9, pent@value DVB card > System2 : PII 350MHz, 384MB ram, Kernel 2.4.21, pent@value DVB card > System3 : P4 2.4GHz with HT enabled, 1 GB ram (high mem enabled), > Kernel 2.4.23-rc1 and libata patch, Nova-S DVB card > > Now the slabinfo data every 24 hours: > > System1: > > ip_dst_cache 647 672 160 27 28 1 > ip_dst_cache 7444 7464 160 311 311 1 > ip_dst_cache 14339 14352 160 598 598 1 > ip_dst_cache 21106 21120 160 880 880 1 > ip_dst_cache 28101 28104 160 1171 1171 1 > > skbuff_head_cache 796 1008 160 41 42 1 > skbuff_head_cache 7588 7824 160 326 326 1 > skbuff_head_cache 14482 14688 160 612 612 1 > skbuff_head_cache 21258 21480 160 895 895 1 > skbuff_head_cache 28255 28416 160 1184 1184 1 > > size-2048 685 968 2048 343 484 1 > size-2048 7483 7676 2048 3742 3838 1 > size-2048 14376 14398 2048 7188 7199 1 > size-2048 21146 21216 2048 10573 10608 1 > size-2048 28142 28292 2048 14071 14146 1 > > System2: > > ip_dst_cache 9 48 160 1 2 1 > ip_dst_cache 7437 7464 160 311 311 1 > ip_dst_cache 15161 15168 160 632 632 1 > ip_dst_cache 18831 18840 160 785 785 1 > > skbuff_head_cache 14 24 160 1 1 1 > skbuff_head_cache 11482 12168 160 500 507 1 > skbuff_head_cache 23312 23904 160 996 996 1 > skbuff_head_cache 28900 29640 160 1235 1235 1 > > size-128 611 660 128 21 22 1 > size-128 11987 12210 128 402 407 1 > size-128 23800 23970 128 798 799 1 > size-128 29445 29670 128 983 989 1 > > > Slabinfo for every 12 hours and CONFIG_DEBUG_SLAB set: > > System3: > > 
ip_dst_cache 576 576 160 24 24 1 : 576 576 24 0 0 : 252 126 : 1946 48 1426 0
> ip_dst_cache 17760 17760 160 740 740 1 : 17760 17760 740 0 0 : 252 126 : 46553 1480 29557 0
> ip_dst_cache 35376 35376 160 1474 1474 1 : 35376 36403 1474 0 0 : 252 126 : 94140 3014 60309 0
> ip_dst_cache 51624 51624 160 2151 2151 1 : 51624 53444 2151 0 0 : 252 126 : 138864 4431 89547 0
>
> skbuff_head_cache 1311 1311 168 57 57 1 : 1311 79557 57 0 0 : 252 126 : 82108 735 81114 621
> skbuff_head_cache 18492 18492 168 804 804 1 : 18492 3300792 804 0 0 : 252 126 : 3320868 27658 3303434 26050
> skbuff_head_cache 36133 36133 168 1571 1571 1 : 36133 6652585 1583 12 0 : 252 126 : 6684139 55715 6649977 52420
> skbuff_head_cache 52371 52371 168 2277 2277 1 : 52371 9913620 2294 17 0 : 252 126 : 9957116 82923 9907545 78097
>
> size-8192 540 540 8192 540 540 2 : 540 3196 540 0 0 : 0 0 : 0 0 0 0
> size-8192 17736 17738 8192 17736 17738 2 : 17738 23194 17738 0 0 : 0 0 : 0 0 0 0
> size-8192 35367 35367 8192 35367 35367 2 : 35367 43715 35374 7 0 : 0 0 : 0 0 0 0
> size-8192 51596 51598 8192 51596 51598 2 : 51598 62824 51611 13 0 : 0 0 : 0 0 0 0
>
> size-2048 452 512 2048 240 256 1 : 512 75002 256 0 0 : 60 30 : 140293 2995 140145 2485
> size-2048 454 514 2048 238 257 1 : 514 3029044 257 0 0 : 60 30 : 5130850 101465 5130703 100953
> size-2048 456 486 2048 241 243 1 : 530 6113873 593 350 0 : 60 30 : 10457205 204975 10457530 203655
> size-2048 454 484 2048 239 242 1 : 542 9104228 1042 800 0 : 60 30 : 15398297 305608 15399447 303014
>
> size-128 2016 2268 136 78 81 1 : 2268 9125 81 0 0 : 252 126 : 23644 195 22128 56
> size-128 19096 19096 136 682 682 1 : 19096 26457 682 0 0 : 252 126 : 131136 1401 113018 58
> size-128 36708 36708 136 1311 1311 1 : 36708 59707 1317 6 0 : 252 126 : 255889 2833 220918 144
> size-128 52920 52920 136 1890 1890 1 : 52920 81855 1911 21 0 : 252 126 : 370264 4135 319786 153
>
> size-64 7844 7844 72 148 148 1 : 7844 7931 148 0 0 : 252 126 : 15660 253 9102 0
> size-64 18497 18497 72 349 349 1 : 18497 18584 349 0 0 : 252 126 : 110763 655 93784 0
> size-64 24963 24963 72 471 471 1 : 24963 32458 471 0 0 : 252 126 : 209402 1008 186275 0
> size-64 34503 34503 72 651 651 1 : 34503 48900 651 0 0 : 252 126 : 305026 1613 272674 0
>
> There is much more data available, the full slabinfo was taken every
> hour for each system. Additionally with the help of Jörn Engel I managed
> to setup System1 with gcov kernel patch and have all data available on
> an hourly basis until the system has reached "dst cache overflow". I have
> tried very hard to evaluate this data myself, but find that the linux
> network code is way beyond my c programming knowledge.
>
> Another thing noticed is that as the memory usage increases the systems
> become slower, when you log in on them and work there.
>
> Has anyone any suggestion of what else I can do to narrow down the problem?
>
> What I am also not sure if it is correct to assume the bug in the ipv4
> multicast implementation, or can it still be a driver problem? But I assume
> two completely different drivers make this very unlikely.
>
> Please, can someone help me to find the bug. I am willing to do any tests
> or provide more information.
>
> Thanks,
> Holger
>
> PS: Please cc me, since I am not on the list.
>
>
> --

From ak@suse.de Fri Jan 9 08:28:46 2004
Received: with ECARTIS (v1.0.0; list netdev); Fri, 09 Jan 2004 08:28:59 -0800 (PST)
Received: from Cantor.suse.de (ns.suse.de [195.135.220.2]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i09GSZTa027847 for ; Fri, 9 Jan 2004 08:28:35 -0800
Received: from Hermes.suse.de (Hermes.suse.de [195.135.221.8]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by Cantor.suse.de (Postfix) with ESMTP id B12C519B779C; Fri, 9 Jan 2004 16:56:28 +0100 (CET)
Date: Fri, 9 Jan 2004 16:56:27 +0100
From: Andi Kleen
To: "David S. Miller"
Cc: ak@muc.de, netdev@oss.sgi.com
Subject: Re: [PATCH] Mark SIOCSIFNAME as compatible ioctl
Message-Id: <20040109165627.2e0845af.ak@suse.de>
In-Reply-To: <20040109020456.045b447e.davem@redhat.com>
References: <20040108070413.GA31778@averell> <20040109020456.045b447e.davem@redhat.com>
X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.10; i686-pc-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-archive-position: 2318
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: ak@suse.de
Precedence: bulk
X-list: netdev
Content-Length: 793
Lines: 33

On Fri, 9 Jan 2004 02:04:56 -0800
"David S. Miller" wrote:

> On Thu, 8 Jan 2004 08:04:13 +0100
> Andi Kleen wrote:
>
> > Mark SIOCSIFNAME as an ioctl that doesn't need 32bit conversion.
> >
> > Fixes nameif as 32bit executable.
>
> How can we mark it compatible? It needs the stuff dev_ifname32() in
> fs/compat_ioctl.c does for SIOCGIFNAME doesn't it?

It takes two strings. This should be compatible:

struct ifreq
{
	union
	{
		char ifrn_name[IFNAMSIZ];	/* if name, e.g. "en0" */
	} ifr_ifrn;

	union {
		...
		char ifru_newname[IFNAMSIZ];
		...
	} ifr_ifru;
};

-Andi

P.S.: Maybe it would be time to update the "en0" comment in if.h too ;-)
I bet that comes from VAX/BSD.
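[Archive note] Andi's point, that a request carrying only two fixed-size strings needs no 32-bit translation, can be illustrated with a small layout check. This is a minimal sketch and not the real header: `struct demo_ifreq`, `DEMO_IFNAMSIZ` and the helper functions are hypothetical names, the name size of 16 is an assumption, and only one pointer member stands in for the other members elided with "..." in the mail.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_IFNAMSIZ 16	/* assumption: same size as IFNAMSIZ */

/* Hypothetical, simplified stand-in for struct ifreq. */
struct demo_ifreq {
	union {
		char ifrn_name[DEMO_IFNAMSIZ];		/* current name */
	} ifr_ifrn;
	union {
		char ifru_newname[DEMO_IFNAMSIZ];	/* requested name */
		void *ifru_data;	/* pointer member: size depends on the ABI */
	} ifr_ifru;
};

/* Offset of the old name; a char array at the start of the struct. */
size_t demo_oldname_offset(void)
{
	return offsetof(struct demo_ifreq, ifr_ifrn);
}

/* Offset of the new name; it follows a fixed-size char array, so it
 * does not move when pointers change from 4 to 8 bytes. */
size_t demo_newname_offset(void)
{
	return offsetof(struct demo_ifreq, ifr_ifru);
}

/* Fill a rename request the way a nameif-like tool might. */
void demo_fill_rename(struct demo_ifreq *req, const char *from, const char *to)
{
	assert(req && from && to);
	memset(req, 0, sizeof(*req));
	strncpy(req->ifr_ifrn.ifrn_name, from, DEMO_IFNAMSIZ - 1);
	strncpy(req->ifr_ifru.ifru_newname, to, DEMO_IFNAMSIZ - 1);
}
```

Both strings sit at offsets that depend only on the array size, which is why the same bytes mean the same thing to a 32-bit caller and a 64-bit kernel. The pointer member is there only to show that ABI-dependent members elsewhere in the union do not shift the two names.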

From krkumar@us.ibm.com Fri Jan 9 17:48:03 2004
Received: with ECARTIS (v1.0.0; list netdev); Fri, 09 Jan 2004 17:48:16 -0800 (PST)
Received: from e2.ny.us.ibm.com (e2.ny.us.ibm.com [32.97.182.102]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i0A1luTa015969 for ; Fri, 9 Jan 2004 17:48:03 -0800
Received: from northrelay04.pok.ibm.com (northrelay04.pok.ibm.com [9.56.224.206]) by e2.ny.us.ibm.com (8.12.10/8.12.2) with ESMTP id i0A1lod6386116; Fri, 9 Jan 2004 20:47:50 -0500
Received: from linux-udp15191261uds.beaverton.ibm.com (d01av02.pok.ibm.com [9.56.224.216]) by northrelay04.pok.ibm.com (8.12.10/NCO/VER6.6) with ESMTP id i0A1lmqb107198; Fri, 9 Jan 2004 20:47:49 -0500
Date: Fri, 9 Jan 2004 17:40:03 -0800 (PST)
From: Krishna Kumar
X-X-Sender: krkumar@linux-udp14999547uds
To: "David S. Miller"
cc: netdev@oss.sgi.com, KK ,
Subject: [PATCH] Bugs in xfrm
In-Reply-To: <20031108223049.36651f8d.davem@redhat.com>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-archive-position: 2319
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: krkumar@us.ibm.com
Precedence: bulk
X-list: netdev
Content-Length: 5781
Lines: 171

Hi Dave,

The following look to be bugs in xfrm related code.

1. In xfrm_lookup, a couple of bugs :
	- the found or allocated xfrm_states are not passed correctly to
	  xfrm_bundle_create (and to the subsequent frees in case of create
	  failing) if the first xfrm_tmpl_resolve failed and the second one
	  succeeded.
	- error handling is wrong.
2. In pfkey_get(), the xfrm_state is dereferenced after it is dropped,
   which could lead to dereferencing freed memory.
3. ipcomp_tunnel_create and xfrm_state_construct don't set x->km.state to
   XFRM_STATE_DEAD. This can lead to the BUG_TRAP in __xfrm_state_destroy
   when xfrm_state_put() finds this is the last reference. This was
   reported as one of the problems of [Bug 1754] some time back.
4. I am not sure of this one. I think dst_free() cannot be used when a
   bundle of dst's are allocated and have to be freed. We need a new
   dst_bundle_free() routine to free all linked dst's ?? Change is in
   __xfrm[46]_bundle_create & xfrm_lookup(), the lookup one I am not very
   sure. Why are we doing a dst_put(), shouldn't this be a free of all
   dst's off the dst->child list since the routine marking 'dead' cleared
   all entries off the dst->next list ?

These changes compile cleanly, but I couldn't test since these are
corner cases. Please let me know if this can be applied. I am sending
as one patch file for now instead of multiple files as they all small.

Thanks,

- KK

diff -ruN linux-2.6.0-rc2-bk6.org/include/net/dst.h linux-2.6.0-rc2-bk6/include/net/dst.h
--- linux-2.6.0-rc2-bk6.org/include/net/dst.h	2004-01-09 17:08:18.000000000 -0800
+++ linux-2.6.0-rc2-bk6/include/net/dst.h	2004-01-09 17:08:55.000000000 -0800
@@ -168,6 +168,7 @@

 extern void * dst_alloc(struct dst_ops * ops);
 extern void __dst_free(struct dst_entry * dst);
+extern void dst_bundle_free(struct dst_entry * dst);
 extern struct dst_entry *dst_destroy(struct dst_entry * dst);

 static inline void dst_free(struct dst_entry * dst)
diff -ruN linux-2.6.0-rc2-bk6.org/net/ipv4/ipcomp.c linux-2.6.0-rc2-bk6/net/ipv4/ipcomp.c
--- linux-2.6.0-rc2-bk6.org/net/ipv4/ipcomp.c	2004-01-05 13:43:50.000000000 -0800
+++ linux-2.6.0-rc2-bk6/net/ipv4/ipcomp.c	2004-01-09 13:00:22.000000000 -0800
@@ -294,6 +294,7 @@
 	return t;

 error:
+	t->km.state = XFRM_STATE_DEAD;
 	xfrm_state_put(t);
 	t = NULL;
 	goto out;
diff -ruN linux-2.6.0-rc2-bk6.org/net/ipv4/xfrm4_policy.c linux-2.6.0-rc2-bk6/net/ipv4/xfrm4_policy.c
--- linux-2.6.0-rc2-bk6.org/net/ipv4/xfrm4_policy.c	2004-01-09 15:02:48.000000000 -0800
+++ linux-2.6.0-rc2-bk6/net/ipv4/xfrm4_policy.c	2004-01-09 16:11:57.000000000 -0800
@@ -162,7 +162,7 @@

 error:
 	if (dst)
-		dst_free(dst);
+		dst_bundle_free(dst);
 	return err;
 }

diff -ruN linux-2.6.0-rc2-bk6.org/net/ipv6/xfrm6_policy.c linux-2.6.0-rc2-bk6/net/ipv6/xfrm6_policy.c
--- linux-2.6.0-rc2-bk6.org/net/ipv6/xfrm6_policy.c	2004-01-09 16:43:45.000000000 -0800
+++ linux-2.6.0-rc2-bk6/net/ipv6/xfrm6_policy.c	2004-01-09 16:44:03.000000000 -0800
@@ -184,7 +184,7 @@

 error:
 	if (dst)
-		dst_free(dst);
+		dst_bundle_free(dst);
 	return err;
 }

diff -ruN linux-2.6.0-rc2-bk6.org/net/key/af_key.c linux-2.6.0-rc2-bk6/net/key/af_key.c
--- linux-2.6.0-rc2-bk6.org/net/key/af_key.c	2004-01-05 13:45:47.000000000 -0800
+++ linux-2.6.0-rc2-bk6/net/key/af_key.c	2004-01-09 12:41:30.000000000 -0800
@@ -1283,6 +1283,7 @@

 static int pfkey_get(struct sock *sk, struct sk_buff *skb, struct sadb_msg *hdr, void **ext_hdrs)
 {
+	__u8 proto;
 	struct sk_buff *out_skb;
 	struct sadb_msg *out_hdr;
 	struct xfrm_state *x;
@@ -1297,6 +1298,7 @@
 		return -ESRCH;

 	out_skb = pfkey_xfrm_state2msg(x, 1, 3);
+	proto = x->id.proto;
 	xfrm_state_put(x);
 	if (IS_ERR(out_skb))
 		return PTR_ERR(out_skb);
@@ -1304,7 +1306,7 @@
 	out_hdr = (struct sadb_msg *) out_skb->data;
 	out_hdr->sadb_msg_version = hdr->sadb_msg_version;
 	out_hdr->sadb_msg_type = SADB_DUMP;
-	out_hdr->sadb_msg_satype = pfkey_proto2satype(x->id.proto);
+	out_hdr->sadb_msg_satype = pfkey_proto2satype(proto);
 	out_hdr->sadb_msg_errno = 0;
 	out_hdr->sadb_msg_reserved = 0;
 	out_hdr->sadb_msg_seq = hdr->sadb_msg_seq;
diff -ruN linux-2.6.0-rc2-bk6.org/net/xfrm/xfrm_policy.c linux-2.6.0-rc2-bk6/net/xfrm/xfrm_policy.c
--- linux-2.6.0-rc2-bk6.org/net/xfrm/xfrm_policy.c	2004-01-09 12:42:53.000000000 -0800
+++ linux-2.6.0-rc2-bk6/net/xfrm/xfrm_policy.c	2004-01-09 17:31:05.000000000 -0800
@@ -694,6 +694,16 @@

 static int stale_bundle(struct dst_entry *dst);

+void dst_bundle_free(struct dst_entry *dst)
+{
+	struct dst_entry *next;
+
+	while (dst) {
+		next = dst->child;
+		dst_free(dst);
+	}
+}
+
 /* Main function: finds/creates a bundle for given flow.
  *
  * At the moment we eat a raw IP route. Mostly to speed up lookups
@@ -799,9 +809,16 @@
 				goto restart;
 			}
 		}
-		if (err)
+		if (err < 0)
 			goto error;
-	} else if (nx == 0) {
+		/*
+		 * Save number of xfrm_state's found/created for both
+		 * the nx == 0 check below as well as to pass the
+		 * right value to xfrm_bundle_create().
+		 */
+		nx = err;
+	}
+	if (nx == 0) {
 		/* Flow passes not transformed. */
 		xfrm_pol_put(policy);
 		return 0;
@@ -827,8 +844,8 @@
 			write_unlock_bh(&policy->lock);
 			xfrm_pol_put(policy);
-			if (dst)
-				dst_free(dst);
+			if (dst) /* 'dead' freed dst->next list only */
+				dst_bundle_free(dst);
 			goto restart;
 		}
 		dst->next = policy->bundles;
diff -ruN linux-2.6.0-rc2-bk6.org/net/xfrm/xfrm_user.c linux-2.6.0-rc2-bk6/net/xfrm/xfrm_user.c
--- linux-2.6.0-rc2-bk6.org/net/xfrm/xfrm_user.c	2004-01-09 12:57:42.000000000 -0800
+++ linux-2.6.0-rc2-bk6/net/xfrm/xfrm_user.c	2004-01-09 12:59:00.000000000 -0800
@@ -241,6 +241,7 @@
 	return x;

 error:
+	x->km.state = XFRM_STATE_DEAD;
 	xfrm_state_put(x);
 error_no_put:
 	*errp = err;

From davem@pizda.ninka.net Fri Jan 9 20:54:29 2004
Received: with ECARTIS (v1.0.0; list netdev); Fri, 09 Jan 2004 20:54:47 -0800 (PST)
Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i0A4sSTa022993 for ; Fri, 9 Jan 2004 20:54:29 -0800
Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id UAA07194; Fri, 9 Jan 2004 20:48:10 -0800
Date: Fri, 9 Jan 2004 20:48:08 -0800
From: "David S. Miller"
To: Krishna Kumar
Cc: netdev@oss.sgi.com, krkumar@us.ibm.com, kumarkr@us.ibm.com
Subject: Re: [PATCH] Bugs in xfrm
Message-Id: <20040109204808.3db77be6.davem@redhat.com>
In-Reply-To:
References: <20031108223049.36651f8d.davem@redhat.com>
X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-archive-position: 2320
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: davem@redhat.com
Precedence: bulk
X-list: netdev
Content-Length: 944
Lines: 28

On Fri, 9 Jan 2004 17:40:03 -0800 (PST)
Krishna Kumar wrote:

> These changes compile cleanly, but I couldn't test since these are
> corner cases. Please let me know if this can be applied. I am sending
> as one patch file for now instead of multiple files as they all small.

Maybe you should actually try to test these changes before I think
about applying them, for example:

> +void dst_bundle_free(struct dst_entry *dst)
> +{
> +	struct dst_entry *next;
> +
> +	while (dst) {
> +		next = dst->child;
> +		dst_free(dst);
> +	}
> +}

Explain to me how that won't loop forever if given a non-NULL dst?

Next, this dst_bundle_free() thing is totally not needed as far as I
can tell. When dst_free() is made, the top-level of the bundle's dst
gets added to the garbage collection list, the garbage collection properly
walks the children to process the whole bundle.
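
[Archive note] The quoted loop never advances: `next` is read but `dst` is never reassigned, so any non-NULL argument spins forever (and is handed to the free routine repeatedly). A minimal userspace sketch of a chain walk that does terminate follows. `struct demo_dst`, `demo_chain_alloc` and `demo_chain_free` are hypothetical stand-ins using plain `malloc`/`free`, not kernel code, and per the reply above the kernel does not need such a helper at all because the dst garbage collector already walks the children.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct dst_entry: only the child link. */
struct demo_dst {
	struct demo_dst *child;
};

/* Free every node on the child chain; returns how many were freed. */
size_t demo_chain_free(struct demo_dst *dst)
{
	size_t freed = 0;

	while (dst) {
		/* Read the link before the node is released. */
		struct demo_dst *next = dst->child;

		free(dst);
		dst = next;	/* the step missing from the quoted loop */
		freed++;
	}
	return freed;
}

/* Build a chain of n nodes linked through ->child; NULL on failure. */
struct demo_dst *demo_chain_alloc(size_t n)
{
	struct demo_dst *head = NULL;

	while (n--) {
		struct demo_dst *node = malloc(sizeof(*node));

		if (!node) {
			demo_chain_free(head);
			return NULL;
		}
		node->child = head;
		head = node;
	}
	return head;
}
```

Saving the link first and then advancing is the usual shape for any free-while-walking loop; dropping the `dst = next` assignment reproduces the behaviour questioned in the reply.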

Please redo this patch and please test it this time :)

From davem@pizda.ninka.net Fri Jan 9 20:57:39 2004
Received: with ECARTIS (v1.0.0; list netdev); Fri, 09 Jan 2004 20:57:52 -0800 (PST)
Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i0A4vdTa023398 for ; Fri, 9 Jan 2004 20:57:39 -0800
Received: (from davem@localhost) by pizda.ninka.net (8.9.3/8.9.3) id UAA07225; Fri, 9 Jan 2004 20:51:19 -0800
Date: Fri, 9 Jan 2004 20:51:19 -0800
From: "David S. Miller"
To: Michal Ludvig
Cc: netdev@oss.sgi.com
Subject: Re: [PATCH] sha2-256 truncation
Message-Id: <20040109205119.71482788.davem@redhat.com>
In-Reply-To: <3FFE7E98.6060201@logix.cz>
References: <3FFE6B72.9030808@logix.cz> <3FFE7E98.6060201@logix.cz>
X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.6; sparc-unknown-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-archive-position: 2321
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: davem@redhat.com
Precedence: bulk
X-list: netdev
Content-Length: 584
Lines: 15

On Fri, 09 Jan 2004 11:12:40 +0100
Michal Ludvig wrote:

> Michal Ludvig told me that:
>
> > the attached trivial patch corrects the truncation size of computed
> > hashes that are used in IPsec ESP/AH packets for SHA2-256. All other
> > hash algorithms use 96 bits as well as does SuperFreeS/WAN and FreeBSD
> > also for SHA2-256. Only the native Linux sha2-256 used 128 bits what led
> > to incompatibility with other IPsec implementations.
>
> Oops, sorry. I sent a reversed patch originally. Please use this one
> instead.

Patch applied, thanks Michal.

From rask@sygehus.dk Sat Jan 10 04:14:33 2004
Received: with ECARTIS (v1.0.0; list netdev); Sat, 10 Jan 2004 04:14:50 -0800 (PST)
Received: from 0x50a17250.albnxx15.adsl-dhcp.tele.dk (0x50a144d5.albnxx15.adsl-dhcp.tele.dk [80.161.68.213]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i0ACEMTa006942 for ; Sat, 10 Jan 2004 04:14:23 -0800
Received: by 0x50a17250.albnxx15.adsl-dhcp.tele.dk (Postfix, from userid 500) id C5BFF19308; Sat, 10 Jan 2004 00:33:07 +0100 (CET)
Date: Sat, 10 Jan 2004 00:33:07 +0100
From: Rask Ingemann Lambertsen
To: netdev@oss.sgi.com, linux-net@vger.kernel.org
Cc: jgarzik@pobox.com
Subject: [PATCH] [EXPERIMENTAL] 2.6: de2104x.c jumbo frames, take two
Message-ID: <20040110003304.A3788@sygehus.dk>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5.1i
X-archive-position: 2324
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: rask@sygehus.dk
Precedence: bulk
X-list: netdev
Content-Length: 996
Lines: 26

Hi.

Attached is a patch which enables jumbo frames on DS2104x based boards.
The maximum supported MTU is 4066 bytes (but see below). Room for a VLAN
tag is included. The maximum supported MTU is about twice as large as in
my first jumbo frame patch because both buffers of RX and TX descriptors
are used (just like the winbond-840 driver on TX). The media tables have
been adjusted to disable the RX watchdog and TX jabber timers.

Tested with a DS21041 and an MTU of 3544 on BNC media with an NE3200
board at the other end without problems.

Known bugs and problems:

1. Although the DS21143 manual says that the RX frame descriptor size
   field is 14 bits wide, the DS21041 seems to use only 11 bits. This
   means that frames larger than 1518+2048 bytes can not be told apart
   from frames larger than 1518 bytes but smaller than 2048 bytes. As a
   result, setting the MTU larger than 3544 will cause problems.
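
[Archive note] The 3544 limit follows from the 11-bit length field wrapping at 2048. Assuming a 14-byte Ethernet header, a 4-byte VLAN tag and a 4-byte CRC (consistent with "room for a VLAN tag is included", but an inference rather than something the mail states), an MTU of 3544 gives a 3566-byte frame, which is exactly 1518 + 2048. The sketch below models the wrap and the kind of recovery the attached patch performs when the chip flags a frame as too long. All `demo_` names are hypothetical, and the "too long" flag is passed as a plain boolean instead of a real status bit.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_LEN_BITS	11u
#define DEMO_LEN_WRAP	(1u << DEMO_LEN_BITS)	/* 2048 */
#define DEMO_LEN_MASK	(DEMO_LEN_WRAP - 1u)	/* 0x7ff */
#define DEMO_STD_MAX	1518u	/* largest standard frame, CRC included */

/* What an 11-bit length field reports for a frame of wire_len bytes. */
uint32_t demo_reported_len(uint32_t wire_len)
{
	return wire_len & DEMO_LEN_MASK;
}

/* Undo one wrap when the frame was flagged too long but the reported
 * length still fits a standard frame; otherwise trust the field. */
uint32_t demo_recover_len(uint32_t reported, bool too_long)
{
	if (too_long && reported <= DEMO_STD_MAX)
		return reported + DEMO_LEN_WRAP;
	return reported;
}
```

Under these assumptions a 3566-byte frame reports as 1518 and is recovered correctly, while a 3600-byte frame reports as 1552, the same value a genuine (and also over-long) 1552-byte frame would produce, so the two cannot be told apart.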

Please report success or failure with this patch.

-- 
Regards,
Rask Ingemann Lambertsen

From rask@sygehus.dk Sat Jan 10 05:47:41 2004
Received: with ECARTIS (v1.0.0; list netdev); Sat, 10 Jan 2004 05:47:56 -0800 (PST)
Received: from 0x50a144d5.albnxx15.adsl-dhcp.tele.dk (0x50a144d5.albnxx15.adsl-dhcp.tele.dk [80.161.68.213]) by oss.sgi.com (8.12.10/8.12.9) with SMTP id i0ADldTa011772 for ; Sat, 10 Jan 2004 05:47:40 -0800
Received: by 0x50a17250.albnxx15.adsl-dhcp.tele.dk (Postfix, from userid 500) id B324819E7B; Sat, 10 Jan 2004 14:47:38 +0100 (CET)
Date: Sat, 10 Jan 2004 14:47:38 +0100
From: Rask Ingemann Lambertsen
To: netdev@oss.sgi.com, linux-net@vger.kernel.org
Subject: Re: [PATCH] [EXPERIMENTAL] 2.6: de2104x.c jumbo frames, take two
Message-ID: <20040110144738.A1385@sygehus.dk>
References: <20040110003304.A3788@sygehus.dk>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="zhXaljGHf11kAtnf"
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
User-Agent: Mutt/1.2.5.1i
In-Reply-To: <20040110003304.A3788@sygehus.dk>; from rask@sygehus.dk on Sat, Jan 10, 2004 at 12:33:07AM +0100
X-archive-position: 2325
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: rask@sygehus.dk
Precedence: bulk
X-list: netdev
Content-Length: 6944
Lines: 211

--zhXaljGHf11kAtnf
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Sat, Jan 10, 2004 at 12:33:07AM +0100, Rask Ingemann Lambertsen wrote:
> Hi.
>
> Attached is a patch which enables jumbo frames on DS2104x based boards.

And then I forgot to include the patch. :-(

Regards,
Rask Ingemann Lambertsen

--zhXaljGHf11kAtnf
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: attachment; filename="de2104x-mtu.patch"
Content-Transfer-Encoding: 8bit

--- linux-2.6.0/drivers/net/tulip/de2104x.c-før-mtu	Sun Dec 21 14:31:04 2003
+++ linux-2.6.0/drivers/net/tulip/de2104x.c	Mon Jan 5 02:01:15 2004
@@ -19,7 +19,6 @@
 	  like dl2k.c/sundance.c
 	* Constants (module parms?) for Rx work limit
 	* Complete reset on PciErr
-	* Jumbo frames / dev->change_mtu
 	* Adjust Rx FIFO threshold and Max Rx DMA burst on Rx FIFO error
 	* Adjust Tx FIFO threshold and Max Tx DMA burst on Tx FIFO error
 	* Implement Tx software interrupt mitigation via
@@ -36,6 +35,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -67,11 +67,15 @@ static int debug = -1;
 MODULE_PARM (debug, "i");
 MODULE_PARM_DESC (debug, "de2104x bitmapped message enable number");

+/* Each descriptor has two buffers. */
+#define DESC_BUF_SZ_MAX	2044
+#define PKT_BUF_SZ_MAX	4088	/* Maximum Rx buffer size. */
+
 /* Set the copy breakpoint for the copy-only-tiny-buffer Rx structure. */
 #if defined(__alpha__) || defined(__arm__) || defined(__hppa__) \
 	|| defined(__sparc_) || defined(__ia64__) \
 	|| defined(__sh__) || defined(__mips__)
-static int rx_copybreak = 1518;
+static int rx_copybreak = PKT_BUF_SZ_MAX;
 #else
 static int rx_copybreak = 100;
 #endif
@@ -356,12 +360,12 @@ static const char * const media_name[DE_
  * TP AUTO(unused), BNC(unused), AUI, TP, TP FD*/
 static u16 t21040_csr13[] = { 0, 0, 0x8F09, 0x8F01, 0x8F01, };
 static u16 t21040_csr14[] = { 0, 0, 0x0705, 0xFFFF, 0xFFFD, };
-static u16 t21040_csr15[] = { 0, 0, 0x0006, 0x0000, 0x0000, };
+static u16 t21040_csr15[] = { 0, 0, 0x0017, 0x0011, 0x0011, };

 /* 21041 transceiver register settings: TP AUTO, BNC, AUI, TP, TP FD*/
 static u16 t21041_csr13[] = { 0xEF01, 0xEF09, 0xEF09, 0xEF01, 0xEF09, };
 static u16 t21041_csr14[] = { 0xFFFF, 0xF7FD, 0xF7FD, 0x6F3F, 0x6F3D, };
-static u16 t21041_csr15[] = { 0x0008, 0x0006, 0x000E, 0x0008, 0x0008, };
+static u16 t21041_csr15[] = { 0x0019, 0x0017, 0x001F, 0x0019, 0x0019, };


 static inline unsigned long
@@ -374,6 +378,12 @@ msec_to_jiffies(unsigned long ms)
 #define dr32(reg)		readl(de->regs + (reg))
 #define dw32(reg,val)		writel((val), de->regs + (reg))

+static inline u32 opts2_sizes (uint total_size)
+{
+	if (total_size <= DESC_BUF_SZ_MAX)
+		return (total_size);
+	return (((total_size - DESC_BUF_SZ_MAX) << 11) | DESC_BUF_SZ_MAX);
+}

 static void de_rx_err_acct (struct de_private *de, unsigned rx_tail,
 			    u32 status, u32 len)
@@ -422,7 +432,9 @@ static void de_rx (struct de_private *de
 		if (status & DescOwn)
 			break;

-		len = ((status >> 16) & 0x7ff) - 4;
+		len = ((status >> 16) & 0x3fff) - 4;
+		if ((len <= 1514) && (status & RxErrLong))
+			len += 2048;
 		mapping = de->rx_skb[rx_tail].mapping;

 		if (unlikely(drop)) {
@@ -430,7 +442,7 @@ static void de_rx (struct de_private *de
 			goto rx_next;
 		}

-		if (unlikely((status & 0x38008300) != 0x0300)) {
+		if (unlikely((status & 0x38004b53) != 0x0300)) {
 			de_rx_err_acct(de, rx_tail, status, len);
 			goto rx_next;
 		}
@@ -483,11 +495,13 @@ static void de_rx (struct de_private *de
 rx_next:
 		de->rx_ring[rx_tail].opts1 = cpu_to_le32(DescOwn);
 		if (rx_tail == (DE_RX_RING_SIZE - 1))
-			de->rx_ring[rx_tail].opts2 =
-				cpu_to_le32(RingEnd | de->rx_buf_sz);
+			de->rx_ring[rx_tail].opts2 = cpu_to_le32(
+				RingEnd | opts2_sizes (de->rx_buf_sz));
 		else
-			de->rx_ring[rx_tail].opts2 = cpu_to_le32(de->rx_buf_sz);
+			de->rx_ring[rx_tail].opts2 = cpu_to_le32(
+				opts2_sizes (de->rx_buf_sz));
 		de->rx_ring[rx_tail].addr1 = cpu_to_le32(mapping);
+		de->rx_ring[rx_tail].addr2 = cpu_to_le32(mapping + DESC_BUF_SZ_MAX);
 		rx_tail = NEXT_RX(rx_tail);
 	}

@@ -632,9 +646,10 @@ static int de_start_xmit (struct sk_buff
 		flags |= RingEnd;
 	if (!tx_free || (tx_free == (DE_TX_RING_SIZE / 2)))
 		flags |= TxSwInt;
-	flags |= len;
+	flags |= opts2_sizes (len);
 	txd->opts2 = cpu_to_le32(flags);
 	txd->addr1 = cpu_to_le32(mapping);
+	txd->addr2 = cpu_to_le32(mapping + DESC_BUF_SZ_MAX);

 	de->tx_skb[entry].skb = skb;
 	de->tx_skb[entry].mapping = mapping;
@@ -770,6 +785,7 @@ static void __de_set_rx_mode (struct net
 		dummy_txd->opts2 = (entry == (DE_TX_RING_SIZE - 1)) ?
 				   cpu_to_le32(RingEnd) : 0;
 		dummy_txd->addr1 = 0;
+		dummy_txd->addr2 = 0;

 		/* Must set DescOwned later to avoid race with chip */

@@ -788,6 +804,7 @@ static void __de_set_rx_mode (struct net
 	else
 		txd->opts2 = cpu_to_le32(SetupFrame | sizeof (de->setup_frame));
 	txd->addr1 = cpu_to_le32(mapping);
+	txd->addr2 = cpu_to_le32(0);
 	wmb();

 	txd->opts1 = cpu_to_le32(DescOwn);
@@ -1277,7 +1294,9 @@ static int de_init_hw (struct de_private
 static int de_refill_rx (struct de_private *de)
 {
 	unsigned i;
+	u32 opts2;

+	opts2 = opts2_sizes (de->rx_buf_sz);
 	for (i = 0; i < DE_RX_RING_SIZE; i++) {
 		struct sk_buff *skb;

@@ -1294,11 +1313,11 @@ static int de_refill_rx (struct de_priva
 		de->rx_ring[i].opts1 = cpu_to_le32(DescOwn);
 		if (i == (DE_RX_RING_SIZE - 1))
 			de->rx_ring[i].opts2 =
-				cpu_to_le32(RingEnd | de->rx_buf_sz);
+				cpu_to_le32(RingEnd | opts2);
 		else
-			de->rx_ring[i].opts2 = cpu_to_le32(de->rx_buf_sz);
+			de->rx_ring[i].opts2 = cpu_to_le32(opts2);
 		de->rx_ring[i].addr1 = cpu_to_le32(de->rx_skb[i].mapping);
-		de->rx_ring[i].addr2 = 0;
+		de->rx_ring[i].addr2 = cpu_to_le32(de->rx_skb[i].mapping + DESC_BUF_SZ_MAX);
 	}

 	return 0;
@@ -1386,6 +1405,8 @@ static int de_open (struct net_device *d
 		printk(KERN_DEBUG "%s: enabling interface\n", dev->name);

 	de->rx_buf_sz = (dev->mtu <= 1500 ? PKT_BUF_SZ : dev->mtu + 32);
+	if (de->rx_buf_sz > PKT_BUF_SZ_MAX)
+		de->rx_buf_sz = PKT_BUF_SZ_MAX;

 	rc = de_alloc_rings(de);
 	if (rc) {
@@ -1476,6 +1497,18 @@ static void de_tx_timeout (struct net_de
 	netif_wake_queue(dev);
 }

+static int de_change_mtu (struct net_device *dev, int mtu)
+{
+	if (netif_running (dev))
+		return (-EBUSY);
+
+	if (mtu < 0 || mtu > PKT_BUF_SZ_MAX - VLAN_ETH_HLEN - 4)
+		return (-EINVAL);
+
+	dev->mtu = mtu;
+	return (0);
+}
+
 static void __de_get_regs(struct de_private *de, u8 *buf)
 {
 	int i;
@@ -1980,6 +2013,7 @@ static int __devinit de_init_one (struct
 	dev->ethtool_ops = &de_ethtool_ops;
 	dev->tx_timeout = de_tx_timeout;
 	dev->watchdog_timeo = TX_TIMEOUT;
+	dev->change_mtu = de_change_mtu;

 	dev->irq = pdev->irq;

--zhXaljGHf11kAtnf--

From kumarkr@us.ibm.com Sat Jan 10 10:34:34 2004
Recei