From arjanv@redhat.com Wed Sep 1 00:03:29 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Sep 2004 00:03:37 -0700 (PDT) Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i8173TkB000493 for ; Wed, 1 Sep 2004 00:03:29 -0700 Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.12.10/8.12.10) with ESMTP id i81738S0014472; Wed, 1 Sep 2004 03:03:13 -0400 Received: from [172.31.3.35] (arjanv.cipe.redhat.com [10.0.2.48]) by int-mx1.corp.redhat.com (8.11.6/8.11.6) with ESMTP id i81732322346; Wed, 1 Sep 2004 03:03:02 -0400 Subject: Re: [Announce] Update on ipw2100, ipw2200, and support for Intel PRO/Wireless 2915ABG From: Arjan van de Ven Reply-To: arjanv@redhat.com To: James Ketrenos Cc: Linux kernel mailing list , Netdev In-Reply-To: <4134E8EA.9080605@linux.intel.com> References: <4134E8EA.9080605@linux.intel.com> Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="=-X7szBpNK5mljMglKnPP8" Organization: Red Hat UK Message-Id: <1094022177.2801.1.camel@laptop.fenrus.com> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.6 (1.4.6-2) Date: Wed, 01 Sep 2004 09:02:57 +0200 X-archive-position: 8295 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: arjanv@redhat.com Precedence: bulk X-list: netdev --=-X7szBpNK5mljMglKnPP8 Content-Type: text/plain Content-Transfer-Encoding: quoted-printable On Tue, 2004-08-31 at 23:08, James Ketrenos wrote: > It's been a while since I've updated lkml and netdev on the progress of > the ipw projects. Given the recent announcement by Intel for the > introduction of Intel PRO/Wireless 2915 ABG Network Connection miniPCI > adapter, I thought now was a good time... you guys seem to be doing a really great job on these drivers; thanks! 
--=-X7szBpNK5mljMglKnPP8 Content-Type: application/pgp-signature; name=signature.asc Content-Description: This is a digitally signed message part -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (GNU/Linux) iD8DBQBBNXQgxULwo51rQBIRAvHCAJwOJrJP6DLAJuGhAuK9d1qoOUQJZgCghNlA Msq9xpw2hH+gug8KYZg9s0c= =kpDZ -----END PGP SIGNATURE----- --=-X7szBpNK5mljMglKnPP8-- From ricklind@us.ibm.com Wed Sep 1 01:33:53 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Sep 2004 01:33:59 -0700 (PDT) Received: from e2.ny.us.ibm.com (e2.ny.us.ibm.com [32.97.182.102]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i818XlqM005862 for ; Wed, 1 Sep 2004 01:33:53 -0700 Received: from northrelay04.pok.ibm.com (northrelay04.pok.ibm.com [9.56.224.206]) by e2.ny.us.ibm.com (8.12.10/8.12.9) with ESMTP id i818XOKK349602; Wed, 1 Sep 2004 04:33:24 -0400 Received: from owlet.beaverton.ibm.com (d01av02.pok.ibm.com [9.56.224.216]) by northrelay04.pok.ibm.com (8.12.10/NCO/VER6.6) with ESMTP id i818YXrm160062; Wed, 1 Sep 2004 04:34:34 -0400 Received: from owlet.beaverton.ibm.com (rick@localhost) by owlet.beaverton.ibm.com (8.11.6/8.11.6) with ESMTP id i818XPZ04210; Wed, 1 Sep 2004 01:33:26 -0700 Message-Id: <200409010833.i818XPZ04210@owlet.beaverton.ibm.com> To: Nivedita Singhvi cc: Andrew Morton , netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: Fw: Re: 2.6.9-rc1-mm2 In-reply-to: Your message of "Tue, 31 Aug 2004 17:49:10 PDT." <41351C86.7000704@us.ibm.com> Date: Wed, 01 Sep 2004 01:33:25 -0700 From: Rick Lindsley X-archive-position: 8296 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ricklind@us.ibm.com Precedence: bulk X-list: netdev Thanks for pointing out the specific config options. Granted a more recent config is warranted .. the one I'm using is 2.6.0-based. But considering I ran make oldconfig on this and chose the defaults in each and every case, should I end up with a config that doesn't compile? 
Is there still a config issue here, especially considering that both rc1 and rc1-mm1 compiled fine using this method? Or is make oldconfig only going to help for a version or two back? Rick From akpm@osdl.org Wed Sep 1 01:43:51 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Sep 2004 01:43:57 -0700 (PDT) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i818hpT5006334 for ; Wed, 1 Sep 2004 01:43:51 -0700 Received: from bix (build.pdx.osdl.net [172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id i818h7114811; Wed, 1 Sep 2004 01:43:07 -0700 Date: Wed, 1 Sep 2004 01:41:18 -0700 From: Andrew Morton To: Rick Lindsley Cc: niv@us.ibm.com, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: Fw: Re: 2.6.9-rc1-mm2 Message-Id: <20040901014118.45204bcb.akpm@osdl.org> In-Reply-To: <200409010833.i818XPZ04210@owlet.beaverton.ibm.com> References: <41351C86.7000704@us.ibm.com> <200409010833.i818XPZ04210@owlet.beaverton.ibm.com> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.10; i386-redhat-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 8297 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: akpm@osdl.org Precedence: bulk X-list: netdev Rick Lindsley wrote: > > But considering I ran make oldconfig on this and chose > the defaults in each and every case, should I end up with a config that > doesn't compile? No, you shouldn't. This indicates a Kconfig bug. 
From yoshfuji@linux-ipv6.org Wed Sep 1 02:01:51 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Sep 2004 02:01:56 -0700 (PDT) Received: from yue.st-paulia.net ([203.178.140.15]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i8191pPF007324 for ; Wed, 1 Sep 2004 02:01:51 -0700 Received: from localhost (localhost [127.0.0.1]) by yue.st-paulia.net (Postfix) with ESMTP id B532333CE6; Wed, 1 Sep 2004 18:02:37 +0900 (JST) Date: Wed, 01 Sep 2004 18:02:36 +0900 (JST) Message-Id: <20040901.180236.102852119.yoshfuji@linux-ipv6.org> To: thomasz@hostmaster.org Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: 2.6.9-rc1 oops From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: <1094028647.16908.42.camel@hostmaster.org> References: <1093945177.16908.14.camel@hostmaster.org> <1094028647.16908.42.camel@hostmaster.org> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 8298 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev In article <1094028647.16908.42.camel@hostmaster.org> (at Wed, 01 Sep 2004 10:50:47 +0200), Thomas Zehetbauer says: > I have now created a bug report for this issue: > http://bugzilla.kernel.org/show_bug.cgi?id=3323 (Please use netdev...) I think this is already fixed in the current bk tree. 
--yoshfuji From zhikui.chen@rus.uni-stuttgart.de Wed Sep 1 03:23:58 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Sep 2004 03:24:07 -0700 (PDT) Received: from uni-stuttgart.de (mbox.rus.uni-stuttgart.de [129.69.1.9]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i81ANuWY011252 for ; Wed, 1 Sep 2004 03:23:57 -0700 Received: from [129.69.30.152] (HELO rus.uni-stuttgart.de) by uni-stuttgart.de (CommuniGate Pro SMTP 4.0.3) with ESMTP id 8883643; Wed, 01 Sep 2004 12:23:47 +0200 Message-ID: <4135A32A.4030901@rus.uni-stuttgart.de> Date: Wed, 01 Sep 2004 12:23:38 +0200 From: Zhikui Chen User-Agent: Mozilla Thunderbird 0.5 (Windows/20040207) X-Accept-Language: en-us, en MIME-Version: 1.0 To: hadi@cyberus.ca CC: dccp@ietf.org, netdev@oss.sgi.com, acme@conectiva.com.br Subject: Re: HELP for dccp implementation. References: <412CC269.8080907@rus.uni-stuttgart.de> <1093454747.1034.85.camel@jzny.localdomain> In-Reply-To: <1093454747.1034.85.camel@jzny.localdomain> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 8299 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: zhikui.chen@rus.uni-stuttgart.de Precedence: bulk X-list: netdev Hi all, If I assign a value such as 0xee9fbc00 to sk in dccp_rcv (before the lookup call) and comment out the lookup call, I get an error report from the bh_lock_sock(sk) call inside dccp_rcv. The report is: spin_is_locked on uninitialized spinlock ee9fbc00, and spin_lock (:ee9fbc00) already locked by /73. Do you know the reason? Thanks, Best regards, Zhikui Hi there, >Could you please work with >Arnaldo Carvalho de Melo since he is >already working on this - this way we could have a coherent >implementation. >He is quite knowledgeable on the internals of Linux and you could bring >in the protocol expertise. 
> >cheers, >jamal > >On Wed, 2004-08-25 at 12:46, Zhikui Chen wrote: > > >>Hi, dear all >> >>I could not assign the __sk_head(head) value to sk in lookup_listen. >> >>I have written partial code for receiving the request packet at the server, >>modeled on the kernel TCP code and very close to it. >> >>Can anyone tell me the reason or give any hints? Thanks in advance. >> >>The details follow: >> >>To receive a request packet, the server goes through the following steps: >>1. Initialize dccp sock, >>2. dccp bind >>3. get_port >>4. hash >>5. accept and wait for a packet >>6. call dccp_rcv to get the packet (I have checked that dccp_rcv got the >>request packet). >>7. get the sk value by calling dccp_lookup >>8 .... >> >>My problem is still in getting the sk value. The following is my debug output: >> >>Aug 25 09:28:38 localhost kernel: DCCP: Hash tables configured >>(established 262144 bind 65536) >> >>dccp_init_sock: >>dccp_sock_init_common: >>allocated cctp successfully >>allocated pkt vectors successfully >>dccp_bind. >>New dccp_get_port start.65536 >>New dccp_get_port start.else:start >>db not found. >>bind hash add:sk:ee9fbc00,node:0,snum:7000 >>New dccp_get_port start.OK. sk:ee9fbc00,node:0 >>New dccp_get_port start.65536 >>New dccp_get_port start.else:start >>hlist_empty(&db->owners) not empty. >>New dccp_get_port start.OK. sk:ee9fbc00,node:ee5a0444 >>__dccp_v4_hash, list:c04eb670,num:7000,c0558780 >>__dccp_v4_hash, list:c04eb670,sk:ee9fbc00 >>dccp_accept start.7000,sk->sk_family=2,sk->sk_state=1,sk:ee9fbc00 >>dccp_accept 1 ..flags=2 >>dccp_accept 2 .. >>dccp_accept 3 ..timeo=2147483647,sk:ee9fbc00 >>wait_for_incoming_connection: >>dccp wait for connect start!sk:ee9fbc00 >>dccp wait for connect start!..sk:ee9fbc00 >>dccp_rcv start.ee9d3580 >>dccp_rcv: sk->sk_state=0, type=0,dh->dport=22555 >>dccp_v4_lookup >>__dccp_v4_lookup. >>dccp_v4_lookup_connection. >>hash 13291 >>dccp_v4_lookup_connection. head:f7619f58,node:eeaf5834,sk:ee9d3580 >>dccp_v4_lookup_connection. 
head:f7619f58,node:,sk:0 >>dccp_bhash_size: 65536,ntohs(dport):7000 >>first of head is not empty >>dccp_v4_lookup_listen: head: c04eb670,c0558780,__sk_head(head):ee9fbc00 >>dccp_v4_lookup_listen:sk: 0 >>dccp_rcv: unable to find socket() >> >>At print out, dccp_bind did not call get_port and inet_sk(sk) is >>assigned a port number which is 7000 from application. >> >>For printing __sk_head(head):ee9fbc00, I let sk = NULL in the >>dccp_v4_lookup_listen. >>HASH_TABLE = 32 or 128 I have the same result. >>And the source code is enclosed. >> >>Best regards, >> >>Zhikui >> >>------------------------------------------------------------------------ >> >>struct dccp_hashinfo __cacheline_aligned dccp_hashinfo = { >> .__dccp_lhash_lock = RW_LOCK_UNLOCKED, >> .__dccp_lhash_users = ATOMIC_INIT(0), >> .__dccp_lhash_wait >> = __WAIT_QUEUE_HEAD_INITIALIZER(dccp_hashinfo.__dccp_lhash_wait), >> .__dccp_portalloc_lock = SPIN_LOCK_UNLOCKED >>}; >> >> >> >> >>struct sockaddr_dccp { >> struct sockaddr_in in; >> __u32 service; >>}; >> >>static __inline__ int dccp_hashfn(__u32 laddr, __u16 lport, >> __u32 faddr, __u16 fport) >>{ >> int h = (laddr ^ lport) ^ (faddr ^ fport); >> h ^= h >> 16; >> h ^= h >> 8; >> return h & (dccp_ehash_size - 1);; >>} >> >>static __inline__ int dccp_sk_hashfn(struct sock *sk) >>{ >> struct inet_opt *inet = inet_sk(sk); >> __u32 laddr = inet->rcv_saddr; >> __u16 lport = inet->num; >> __u32 faddr = inet->daddr; >> __u16 fport = inet->dport; >> >> return dccp_hashfn(laddr, lport, faddr, fport); >>} >> >>kmem_cache_t *dccp_bucket_cachep; >> >>struct dccp_bind_bucket *dccp_bucket_create(struct dccp_bind_hashbucket *head, >> unsigned short snum) >>{ >> struct dccp_bind_bucket *db = kmem_cache_alloc(dccp_bucket_cachep, >> SLAB_ATOMIC); >> if (db) { >> db->port = snum; >> db->fastreuse = 0; >> INIT_HLIST_HEAD(&db->owners); >> hlist_add_head(&db->node, &head->chain); >> } >> return db; >>} >> >>void dccp_bucket_destroy(struct dccp_bind_bucket *db) >>{ >> if 
(hlist_empty(&db->owners)) { >> __hlist_del(&db->node); >> kmem_cache_free(dccp_bucket_cachep, db); >> } >>} >> >>/******************************************************************************/ >> >>static int parse_uaddr(struct sockaddr *uaddr, int addr_len, struct sockaddr_in **iaddr, struct sockaddr_dccp **dccp_addr){ >> if(addr_len < sizeof(struct sockaddr_in)) return -1; >> if(addr_len >= sizeof(struct sockaddr_dccp)){ >> *dccp_addr = (struct sockaddr_dccp *)uaddr; >> *iaddr = &((*dccp_addr)->in); >> }else{ >> *dccp_addr = NULL; >> *iaddr = (struct sockaddr_in *)uaddr; >> } >> return 0; >>} >> >>/******************************************************************************/ >>/* refer to net/ipv4/af_inet.c:inet_bind() */ >>static int dccp_bind(struct sock *sk, struct sockaddr *uaddr, int addr_len){ >> printk("dccp_bind.\n"); >> struct sockaddr_in *iaddr; >> struct sockaddr_dccp *dccp_addr; >> struct inet_opt *inet = inet_sk(sk); >> int addr_type; >> int err; >> unsigned short port; >> >> if(parse_uaddr(uaddr, addr_len, &iaddr, &dccp_addr)) return -EINVAL; >> >> addr_type = inet_addr_type(iaddr->sin_addr.s_addr); >> if( inet->freebind == 0 >> && iaddr->sin_addr.s_addr != INADDR_ANY && addr_type != RTN_LOCAL >> && addr_type != RTN_MULTICAST && addr_type != RTN_BROADCAST) >> return -EADDRNOTAVAIL; >> >> port = ntohs(iaddr->sin_port); >> if(port && port < PROT_SOCK && !capable(CAP_NET_BIND_SERVICE)) >> return -EACCES; >> >> lock_sock(sk); >> >> if(sk->sk_state != DCCP_STATE_CLOSED) ERR(-EISCONN); >> >> if(inet->num) ERR(-EINVAL); >> >> inet->rcv_saddr = inet->saddr = iaddr->sin_addr.s_addr; >> if(addr_type == RTN_MULTICAST || addr_type == RTN_BROADCAST) >> inet->saddr = 0; >> >> if(dccp_addr) dccp_sk(sk)->service = dccp_addr->service; >> else dccp_sk(sk)->service = 0; >>/*Note if we comment sk_port->getport() function calling, we should assign a local listen port number for building a listen hash and adding hash to node.*/ >>/* >> if(sk->sk_prot->get_port(sk, 
port) != 0){ >> inet->saddr = inet->rcv_saddr = 0; >> ERR(-EADDRINUSE); >> } >>*/ >> if(inet->rcv_saddr) sk->sk_userlocks |= SOCK_BINDADDR_LOCK; >> if(port) sk->sk_userlocks |= SOCK_BINDPORT_LOCK; >> inet->num = port;/*added 24.08.04, Note if we comment sk_port->getport() function calling, we should assign a local listen port number for building a listen hash and adding hash to node.*/ >> inet->dport = inet->daddr = 0; >> sk_dst_reset(sk); >> err = 0; >>out: >> release_sock(sk); >> return err; >>} >> >>void dccp_bind_hash(struct sock *sk, struct dccp_bind_bucket *db, >> unsigned short snum) >>{ >> inet_sk(sk)->num = snum; >> sk_add_bind_node(sk, &db->owners); >> dccp_sk(sk)->bind_hash = db; >>} >> >>static inline int dccp_bind_conflict(struct sock *sk, struct dccp_bind_bucket *db) >>{ >> printk("dccp_bind_conflict is called.\n"); >> const u32 sk_rcv_saddr = dccp_v4_rcv_saddr(sk); >> struct sock *sk2; >> struct hlist_node *node; >> int reuse = sk->sk_reuse; >> >> sk_for_each_bound(sk2, node, &db->owners) { >> if (sk != sk2 && >> !dccp_v6_ipv6only(sk2) && >> (!sk->sk_bound_dev_if || >> !sk2->sk_bound_dev_if || >> sk->sk_bound_dev_if == sk2->sk_bound_dev_if)) { >> if (!reuse || !sk2->sk_reuse || >> sk2->sk_state == DCCP_STATE_LISTEN) { >> const u32 sk2_rcv_saddr = dccp_v4_rcv_saddr(sk2); >> if (!sk2_rcv_saddr || !sk_rcv_saddr || >> sk2_rcv_saddr == sk_rcv_saddr) >> break; >> } >> } >> } >> return node != NULL; >>} >> >>/* Obtain a reference to a local port for the given sock, >> * if snum is zero it means select any available local port. 
>> */ >>static int dccp_get_port(struct sock *sk, unsigned short snum) >>{ >> printk("New dccp_get_port start.%d, inet_sk(sk)->num=%d\n",dccp_bhash_size,inet_sk(sk)->num); >> struct dccp_bind_hashbucket *head; >> >> struct hlist_node *node; >> struct dccp_bind_bucket *db; >> int ret; >> >> if(inet_sk(sk)->num !=snum) >> snum=inet_sk(sk)->num; >> local_bh_disable(); >> if (!snum) { >> int low = sysctl_local_port_range[0]; >> int high = sysctl_local_port_range[1]; >> int remaining = (high - low) + 1; >> int rover; >> >> spin_lock(&dccp_portalloc_lock); >> rover = dccp_port_rover; >> do { >> printk("New dccp_get_port start.rover:%d\n",rover); >> rover++; >> if (rover < low || rover > high) >> rover = low; >> head = &dccp_bhash[dccp_bhashfn(rover)]; >> spin_lock(&head->lock); >> db_for_each(db, node, &head->chain) >> if (db->port == rover) >> goto next; >> break; >> next: >> spin_unlock(&head->lock); >> } while (--remaining > 0); >> dccp_port_rover = rover; >> spin_unlock(&dccp_portalloc_lock); >> >> /* Exhausted local port range during search? */ >> ret = 1; >> if (remaining <= 0) >> goto fail; >> >> /* OK, here is the one we will use. HEAD is >> * non-NULL and we hold it's mutex. 
>> */ >> printk("New dccp_get_port start.if:OK\n"); >> snum = rover; >> } else { >> printk("New dccp_get_port start.else:start\n"); >> head = &dccp_bhash[dccp_bhashfn(snum)]; >> spin_lock(&head->lock); >> db_for_each(db, node, &head->chain) >> if (db->port == snum) >> goto db_found; >> } >> db = NULL; >> goto db_not_found; >>db_found: >> if (!hlist_empty(&db->owners)) { >> printk("hlist_empty(&db->owners) not empty.\n"); >> if (sk->sk_reuse > 1) >> goto success; >> if (db->fastreuse > 0 && >> sk->sk_reuse && sk->sk_state != DCCP_STATE_LISTEN) { >> goto success; >> } else { >> ret = 1; >> if (dccp_bind_conflict(sk, db)) >> goto fail_unlock; >> } >> } >>db_not_found: >> printk("db not found.\n"); >> ret = 1; >> if (!db && (db = dccp_bucket_create(head, snum)) == NULL) >> goto fail_unlock; >> if (hlist_empty(&db->owners)) { >> if (sk->sk_reuse && sk->sk_state != DCCP_STATE_LISTEN) >> db->fastreuse = 1; >> else >> db->fastreuse = 0; >> } else if (db->fastreuse && >> (!sk->sk_reuse || sk->sk_state == DCCP_STATE_LISTEN)) >> db->fastreuse = 0; >>success: >> if (!dccp_sk(sk)->bind_hash){ >> dccp_bind_hash(sk, db, snum); >> printk("bind hash add:sk:%x,node:%x,snum:%d\n",sk,node,snum); >> } >> BUG_TRAP(dccp_sk(sk)->bind_hash == db); >> ret = 0; >> >>fail_unlock: >> spin_unlock(&head->lock); >>fail: >> local_bh_enable(); >> printk("New dccp_get_port start.OK. sk:%x,node:%x\n",sk,node); >> return ret; >>} >>/*****************************************************************************/ >>static int wait_for_incoming_connection(struct sock *sk, long timeo) >>{ >> printk("wait_for_incoming_connection: \n"); >> DECLARE_WAITQUEUE(wait, current); >> int err; >> struct dccp_opt *tp = dccp_sk(sk); >> >> /* >> * True wake-one mechanism for incoming connections: only >> * one process gets woken up, not the 'whole herd'. >> * Since we do not 'race & poll' for established sockets >> * anymore, the common case will execute the loop only once. 
>> * >> * Subtle issue: "add_wait_queue_exclusive()" will be added >> * after any current non-exclusive waiters, and we know that >> * it will always _stay_ after any new non-exclusive waiters >> * because all non-exclusive waiters are added at the >> * beginning of the wait-queue. As such, it's ok to "drop" >> * our exclusiveness temporarily when we get woken up without >> * having to remove and re-insert us on the wait queue. >> */ >> add_wait_queue_exclusive(sk->sk_sleep, &wait); >> printk("dccp wait for connect start!sk:%x\n",sk); >> for (;;) { >> current->state = TASK_INTERRUPTIBLE; >> release_sock(sk); >> printk("dccp wait for connect start!..sk:%x\n",sk); >> if (tp->accept_queue == NULL){ >> timeo = schedule_timeout(timeo); >> } >> printk("dccp wait for connect start .1!sk_state=%d, sk_family=%d\n",sk->sk_state,sk->sk_family); >> lock_sock(sk); >> err = 0; >> if (tp->accept_queue){ >> break; >> } >> err = -EINVAL; >> printk("dccp wait for connect start .1!sk_state=%d, sk_family=%d\n",sk->sk_state,sk->sk_family); >> if (sk->sk_state != DCCP_STATE_LISTEN){ >> printk("dccp wait for connect start .01!sk_state=%d\n",sk->sk_state); >> break; >> } >> err = sock_intr_errno(timeo); >> printk("dccp wait for connect start .2!\n"); >> if (signal_pending(current)){ >> break; >> } >> err = -EAGAIN; >> if (!timeo) >> break; >> } >> printk("dccp wait for connect end!\n"); >> current->state = TASK_RUNNING; >> remove_wait_queue(sk->sk_sleep, &wait); >> printk("dccp wait for connect end ok err=%d\n",err); >> return err; >>} >> >>struct sock *dccp_accept(struct sock *sk, int flags, int *err){ >> struct dccp_opt *tp = dccp_sk(sk); >> int error; >> struct sock *newsk = NULL; >> >> lock_sock(sk); >> >> printk("dccp_accept start.%d,sk->sk_family=%d,sk->sk_state=%d,sk:%x\n",inet_sk(sk)->num,sk->sk_family,sk->sk_state,sk); >> /* this socket must be listening */ >> error = -EINVAL; >> printk("dccp_accept 1 ..flags=%d\n",flags); >> if(sk->sk_state != DCCP_STATE_LISTEN) >> goto out; >> 
printk("dccp_accept 2 ..\n"); >> >> /* Find already established connection */ >> if(!tp->accept_queue){ >> long timeo = sock_rcvtimeo(sk, flags & O_NONBLOCK); >> printk("dccp_accept 3 ..timeo=%d,sk:%x\n",timeo,sk); >> >> error = -EAGAIN; >> if(!timeo) >> goto out; >> >> error = wait_for_incoming_connection(sk, timeo); >>// error = wait_for_connection(sk, timeo); >> printk("dccp_accept 4 ..\n"); >> //sleep(1000); >> if(error) goto out; >> BUG_TRAP(tp->accept_queue); >> } >> printk("dccp_accept 5 ..\n"); >> newsk = tp->accept_queue; >> tp->accept_queue = sk_next(newsk);//newsk->sk_bind_next; >> if(tp->accept_queue == NULL) tp->accept_queue_tail = NULL; >> BUG_TRAP(sk->sk_ack_backlog); >> sk->sk_ack_backlog -- ; /* since we are removing one */ >> dccp_sk(newsk)->flag_hashandle = 1; >>#if 0 >> /* remove from accept queue, will be referenced by socket */ >> sock_put(newsk); /* removed from the queue */ >> sock_hold(newsk); >>#endif >> >> error = 0; >>out: >> printk("dccp_accept 6 ..err=%d\n",err); >> release_sock(sk); >> *err = error; >> return newsk; >>} >> >>void dccp_listen_wlock(void) >>{ >> write_lock(&dccp_lhash_lock); >> >> if (atomic_read(&dccp_lhash_users)) { >> DEFINE_WAIT(wait); >> >> for (;;) { >> prepare_to_wait_exclusive(&dccp_lhash_wait, >> &wait, TASK_UNINTERRUPTIBLE); >> if (!atomic_read(&dccp_lhash_users)) >> break; >> write_unlock_bh(&dccp_lhash_lock); >> schedule(); >> write_lock_bh(&dccp_lhash_lock); >> } >> >> finish_wait(&dccp_lhash_wait, &wait); >> } >>} >> >>static __inline__ void __dccp_v4_hash(struct sock *sk, const int listen_possible) >>{ >> struct hlist_head *list; >> rwlock_t *lock; >> >> BUG_TRAP(sk_unhashed(sk)); >> if (listen_possible && sk->sk_state == DCCP_STATE_LISTEN) { >> list = &dccp_listening_hash[dccp_sk_listen_hashfn(sk)]; >> >> printk("__dccp_v4_hash, list:%x,num:%d,%x\n",list,inet_sk(sk)->num,&dccp_hash[inet_sk(sk)->num & (DCCP_HTABLE_SIZE - 1)]); >> lock = &dccp_lhash_lock; >> dccp_listen_wlock(); >> } else { >> list = 
&dccp_ehash[(sk->sk_hashent = dccp_sk_hashfn(sk))].chain; >> lock = &dccp_ehash[sk->sk_hashent].lock; >> write_lock(lock); >> } >> __sk_add_node(sk, list); >> sock_prot_inc_use(sk->sk_prot); >> write_unlock(lock); >> if (listen_possible && sk->sk_state == DCCP_STATE_LISTEN) >> wake_up(&dccp_lhash_wait); >> printk("__dccp_v4_hash, list:%x,sk:%x\n",list,sk); >>} >> >>static void dccp_v4_hash(struct sock *sk) >>{ >> if (sk->sk_state != DCCP_STATE_CLOSED) { >> local_bh_disable(); >> __dccp_v4_hash(sk, 1); >> local_bh_enable(); >> } >>} >> >>void dccp_unhash(struct sock *sk) >>{ >> rwlock_t *lock; >> >> if (sk_unhashed(sk)) >> goto ende; >> >> if (sk->sk_state == DCCP_STATE_LISTEN) { >> local_bh_disable(); >> dccp_listen_wlock(); >> lock = &dccp_lhash_lock; >> } else { >> struct dccp_ehash_bucket *head = &dccp_ehash[sk->sk_hashent]; >> lock = &head->lock; >> write_lock_bh(&head->lock); >> } >> >> if (__sk_del_node_init(sk)) >> sock_prot_dec_use(sk->sk_prot); >> write_unlock_bh(lock); >> >> ende: >> if (sk->sk_state == DCCP_STATE_LISTEN) >> wake_up(&dccp_lhash_wait); >>} >> >>/*****************************************************************************/ >>static struct sock *__dccp_v4_lookup_listen(struct hlist_head *head, u32 daddr, >> unsigned short hnum, int dif) >>{ >> struct sock *result = NULL, *sk; >> struct hlist_node *node; >> int score, hiscore; >> >> printk("__dccp_v4_lookup_listen: sk:%x,node:%x,head:%x,sk_state:%d\n",sk,node,head,sk->sk_state); >> hiscore=-1; >> sk_for_each(sk, node, head) { >> struct inet_opt *inet = inet_sk(sk); >> >> if (inet->num == hnum && !ipv6_only_sock(sk)) { >> __u32 rcv_saddr = inet->rcv_saddr; >> >> score = (sk->sk_family == PF_INET ? 
1 : 0); >> if (rcv_saddr) { >> if (rcv_saddr != daddr) >> continue; >> score+=2; >> } >> if (sk->sk_bound_dev_if) { >> if (sk->sk_bound_dev_if != dif) >> continue; >> score+=2; >> } >> if (score == 5) >> return sk; >> if (score > hiscore) { >> hiscore = score; >> result = sk; >> } >> } >> } >> printk("dccp_v4_lookup_listen:sk:%x,result:%x\n",sk,result); >> return result; >>} >> >>/* Optimize the common listener case. */ >>inline struct sock *dccp_v4_lookup_listen(u32 daddr, u16 hnum,int dif) >>{ >> struct sock *sk = NULL; >> struct hlist_head *head; >> >> read_lock(&dccp_lhash_lock); >> printk("dccp_bhash_size: %d,ntohs(dport):%d\n",dccp_bhash_size,hnum); >> head = &dccp_listening_hash[dccp_lhashfn(hnum)]; >> >> if(head->first) >> printk("first of head is not empty\n"); >> >> printk("dccp_v4_lookup_listen: head: %x,%x,__sk_head(head):%x\n",head,&dccp_hash[hnum & (DCCP_HTABLE_SIZE - 1)],__sk_head(head)); >> >> if (!hlist_empty(head)) { >> struct inet_opt *inet = inet_sk((sk = __sk_head(head))); >> printk("dccp_v4_lookup_listen:sk: %x\n",sk); >> >> if (inet->num == hnum && !sk->sk_node.next && >> (!inet->rcv_saddr || inet->rcv_saddr == daddr) && >> (sk->sk_family == PF_INET || !ipv6_only_sock(sk)) && >> !sk->sk_bound_dev_if) >> goto sherry_cache; >> sk = __dccp_v4_lookup_listen(head, daddr, hnum, dif); >> } >> else >> printk("hlist_empty(head) is empty.\n"); >> if (sk) { >>sherry_cache: >> sock_hold(sk); >> } >> printk("dccp_v4_lookup_listen:sk: %x\n",sk); >> read_unlock(&dccp_lhash_lock); >> return sk; >>} >> >> >>/*****************************************************************************/ >> >>static inline struct sock *dccp_v4_lookup_connection(u32 saddr, u16 sport, u32 daddr, u16 hnum, int dif){ >> printk("dccp_v4_lookup_connection.\n"); >> struct dccp_ehash_bucket *head; >> DCCP_V4_ADDR_COOKIE(acookie, saddr, daddr) >> __u32 ports = DCCP_COMBINED_PORTS(sport, hnum); >> struct sock *sk; >> struct hlist_node *node; >> int hash = dccp_hashfn(daddr, hnum, saddr, 
sport); >> printk("hash %d\n",hash); >> head = &dccp_ehash[hash]; >> printk("dccp_v4_lookup_connection. head:%x,node:%x,sk:%x\n",head,node,sk); >> read_lock(&head->lock); >> sk_for_each(sk, node, &head->chain) { >> if (DCCP_IPV4_MATCH(sk, acookie, saddr, daddr, ports, dif)) >> goto hit; >> } >> >> sk_for_each(sk, node, &(head + dccp_ehash_size)->chain) { >> if (DCCP_IPV4_DW_MATCH(sk, acookie, saddr, daddr, ports, dif)) >> goto hit; >> } >> sk = NULL; >>out: >> read_unlock(&head->lock); >> printk("dccp_v4_lookup_connection. head:%x,node:,sk:%x\n",head,sk); >> return sk; >>hit: >> sock_hold(sk); >> goto out; >>} >> >>/*****************************************************************************/ >> >>static inline struct sock *__dccp_v4_lookup(u32 saddr, u16 sport, u32 daddr, u16 dport,int dif){ >> printk("__dccp_v4_lookup.\n"); >> >> struct sock *sk = dccp_v4_lookup_connection(saddr, sport, daddr, ntohs(dport), dif); >> return sk ? : dccp_v4_lookup_listen(daddr, ntohs(dport),dif); >>} >> >>inline struct sock *dccp_v4_lookup(u32 saddr, u16 sport, u32 daddr, >> u16 dport, int dif) >>{ >> printk("dccp_v4_lookup\n"); >> struct sock *sk; >> >> local_bh_disable(); >> sk = __dccp_v4_lookup(saddr, sport, daddr, dport, dif); >> local_bh_enable(); >> >> return sk; >>} >> >>Best regards, >> >>Zhikui >> >> >> >> >> > > > > From vatsa@in.ibm.com Wed Sep 1 04:34:48 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Sep 2004 04:35:01 -0700 (PDT) Received: from e6.ny.us.ibm.com (e6.ny.us.ibm.com [32.97.182.106]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i81BYlS7017442 for ; Wed, 1 Sep 2004 04:34:48 -0700 Received: from northrelay04.pok.ibm.com (northrelay04.pok.ibm.com [9.56.224.206]) by e6.ny.us.ibm.com (8.12.10/8.12.9) with ESMTP id i81BYOnt158880; Wed, 1 Sep 2004 07:34:24 -0400 Received: from snowy.in.ibm.com (d01av02.pok.ibm.com [9.56.224.216]) by northrelay04.pok.ibm.com (8.12.10/NCO/VER6.6) with ESMTP id i81BZSGY128368; Wed, 1 Sep 2004 07:35:32 -0400 Received: by 
snowy.in.ibm.com (Postfix, from userid 502) id 981C424E32; Wed, 1 Sep 2004 17:06:41 +0530 (IST) Date: Wed, 1 Sep 2004 17:06:41 +0530 From: Srivatsa Vaddagiri To: Andi Kleen Cc: davem@redhat.com, netdev@oss.sgi.com, linux-kernel@vger.kernel.org, Dipankar , paulmck@us.ibm.com Subject: Re: [RFC] Use RCU for tcp_ehash lookup Message-ID: <20040901113641.GA3918@in.ibm.com> Reply-To: vatsa@in.ibm.com References: <20040831125941.GA5534@in.ibm.com> <20040831135419.GA17642@wotan.suse.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20040831135419.GA17642@wotan.suse.de> User-Agent: Mutt/1.4.1i X-archive-position: 8300 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: vatsa@in.ibm.com Precedence: bulk X-list: netdev On Tue, Aug 31, 2004 at 03:54:20PM +0200, Andi Kleen wrote: > I bet also when you just do rdtsc timing for the TCP receive > path the cycle numbers will be way down (excluding the copy). I got cycle numbers for the lookup routine (with CONFIG_PREEMPT turned off). They were taken on a 900MHz 8way Intel P3 SMP box. The results are as below:

-------------------------------------------------------------------------------
                            | 2.6.8.1              | 2.6.8.1 + my patch
-------------------------------------------------------------------------------
Average cycles              |                      |
spent in                    |                      |
__tcp_v4_lookup_established | 2970.65              | 668.227
                            | (~3.3 microseconds)  | (~0.74 microseconds)
-------------------------------------------------------------------------------

This represents an improvement of 77.5%! > > And it should also fix the performance problems with > cat /proc/net/tcp on ppc64/ia64 for large hash tables because the rw locks > are gone. But spinlocks are in! Would that still improve the performance compared to rw locks? (See my earlier note, where I explained that the lookup done for /proc/net/tcp is _not_ lock-free yet). 
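The arithmetic behind the cycle table above can be checked with a few lines of C. This is only a minimal sketch: it assumes the cycle counter ticks at the nominal 900 MHz of the test box, and the helper names (cycles_to_us, percent_reduction, speedup) are illustrative and do not come from the patch.

```c
#include <assert.h>

/* Convert a cycle count to microseconds for a clock running at `mhz` MHz.
 * One microsecond is `mhz` cycles, so the conversion is a plain division.
 * Assumption: the counter ticks at the nominal CPU frequency. */
double cycles_to_us(double cycles, double mhz)
{
	assert(mhz > 0.0);
	return cycles / mhz;
}

/* Percentage by which the cost drops when going from `before` to `after`. */
double percent_reduction(double before, double after)
{
	assert(before > 0.0);
	return 100.0 * (before - after) / before;
}

/* The same comparison expressed as a ratio (before / after). */
double speedup(double before, double after)
{
	assert(after > 0.0);
	return before / after;
}
```

With the posted numbers, cycles_to_us(2970.65, 900.0) gives about 3.30 and cycles_to_us(668.227, 900.0) about 0.74, matching the microsecond figures in the table. percent_reduction(2970.65, 668.227) gives about 77.5, so the 77.5% figure is the drop in cycles per lookup; the equivalent ratio from speedup() is roughly 4.4.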
> I haven't studied it in detail (yet), just two minor style > comments: [snip] > Can you rewrite that without goto? [snip] > And that too. I have avoided the goto's in the updated patch below. Thanks!! --- linux-2.6.8.1-vatsa/include/net/sock.h | 22 +++++++++-- linux-2.6.8.1-vatsa/include/net/tcp.h | 24 +++++++++--- linux-2.6.8.1-vatsa/net/core/sock.c | 11 +++++ linux-2.6.8.1-vatsa/net/ipv4/tcp.c | 2 - linux-2.6.8.1-vatsa/net/ipv4/tcp_diag.c | 11 +++-- linux-2.6.8.1-vatsa/net/ipv4/tcp_ipv4.c | 50 ++++++++++++++++----------- linux-2.6.8.1-vatsa/net/ipv4/tcp_minisocks.c | 47 ++++++++++++++++++++----- linux-2.6.8.1-vatsa/net/ipv6/tcp_ipv6.c | 22 +++++++---- 8 files changed, 135 insertions(+), 54 deletions(-) diff -puN include/net/sock.h~tcp_ehash include/net/sock.h --- linux-2.6.8.1/include/net/sock.h~tcp_ehash 2004-08-25 18:06:42.000000000 +0530 +++ linux-2.6.8.1-vatsa/include/net/sock.h 2004-09-01 10:09:43.000000000 +0530 @@ -50,6 +50,7 @@ #include #include +#include #include #include @@ -178,6 +179,7 @@ struct sock_common { * @sk_error_report - callback to indicate errors (e.g. %MSG_ERRQUEUE) * @sk_backlog_rcv - callback to process the backlog * @sk_destruct - called at sock freeing time, i.e. 
when all refcnt == 0 + * @sk_rcu - RCU callback structure */ struct sock { /* @@ -266,6 +268,7 @@ struct sock { int (*sk_backlog_rcv)(struct sock *sk, struct sk_buff *skb); void (*sk_destruct)(struct sock *sk); + struct rcu_head sk_rcu; }; /* @@ -350,7 +353,7 @@ static __inline__ int sk_del_node_init(s static __inline__ void __sk_add_node(struct sock *sk, struct hlist_head *list) { - hlist_add_head(&sk->sk_node, list); + hlist_add_head_rcu(&sk->sk_node, list); } static __inline__ void sk_add_node(struct sock *sk, struct hlist_head *list) @@ -371,7 +374,7 @@ static __inline__ void sk_add_bind_node( } #define sk_for_each(__sk, node, list) \ - hlist_for_each_entry(__sk, node, list, sk_node) + hlist_for_each_entry_rcu(__sk, node, list, sk_node) #define sk_for_each_from(__sk, node) \ if (__sk && ({ node = &(__sk)->sk_node; 1; })) \ hlist_for_each_entry_from(__sk, node, sk_node) @@ -703,6 +706,7 @@ extern void FASTCALL(release_sock(struct extern struct sock * sk_alloc(int family, int priority, int zero_it, kmem_cache_t *slab); extern void sk_free(struct sock *sk); +extern void sk_free_rcu(struct rcu_head *head); extern struct sk_buff *sock_wmalloc(struct sock *sk, unsigned long size, int force, @@ -888,8 +892,18 @@ static inline void sk_filter_charge(stru /* Ungrab socket and destroy it, if it was the last reference. */ static inline void sock_put(struct sock *sk) { - if (atomic_dec_and_test(&sk->sk_refcnt)) - sk_free(sk); + while (atomic_dec_and_test(&sk->sk_refcnt)) { + /* Restore ref count and schedule callback. + * If we don't restore ref count, then the callback can be + * scheduled by more than one CPU. + */ + atomic_inc(&sk->sk_refcnt); + + if (atomic_read(&sk->sk_refcnt) == 1) { + call_rcu(&sk->sk_rcu, sk_free_rcu); + break; + } + } } /* Detach socket from process context. 
diff -puN include/net/tcp.h~tcp_ehash include/net/tcp.h --- linux-2.6.8.1/include/net/tcp.h~tcp_ehash 2004-08-25 18:06:42.000000000 +0530 +++ linux-2.6.8.1-vatsa/include/net/tcp.h 2004-09-01 10:13:40.000000000 +0530 @@ -30,6 +30,7 @@ #include #include #include +#include #include #include #include @@ -44,7 +45,7 @@ * for the rest. I'll experiment with dynamic table growth later. */ struct tcp_ehash_bucket { - rwlock_t lock; + spinlock_t lock; struct hlist_head chain; } __attribute__((__aligned__(8))); @@ -222,12 +223,13 @@ struct tcp_tw_bucket { struct in6_addr tw_v6_rcv_saddr; int tw_v6_ipv6only; #endif + struct rcu_head tw_rcu; }; static __inline__ void tw_add_node(struct tcp_tw_bucket *tw, struct hlist_head *list) { - hlist_add_head(&tw->tw_node, list); + hlist_add_head_rcu(&tw->tw_node, list); } static __inline__ void tw_add_bind_node(struct tcp_tw_bucket *tw, @@ -305,14 +307,22 @@ static inline int tcp_v6_ipv6only(const #endif extern kmem_cache_t *tcp_timewait_cachep; +extern void tcp_tw_free(struct rcu_head *head); static inline void tcp_tw_put(struct tcp_tw_bucket *tw) { - if (atomic_dec_and_test(&tw->tw_refcnt)) { -#ifdef INET_REFCNT_DEBUG - printk(KERN_DEBUG "tw_bucket %p released\n", tw); -#endif - kmem_cache_free(tcp_timewait_cachep, tw); + while (atomic_dec_and_test(&tw->tw_refcnt)) { + /* Restore ref count and schedule callback. + * If we don't restore ref count, then the callback can be + * scheduled by more than one CPU. 
+ */ + + atomic_inc(&tw->tw_refcnt); + + if (atomic_read(&tw->tw_refcnt) == 1) { + call_rcu(&tw->tw_rcu, tcp_tw_free); + break; + } } } diff -puN net/core/sock.c~tcp_ehash net/core/sock.c --- linux-2.6.8.1/net/core/sock.c~tcp_ehash 2004-08-25 18:06:42.000000000 +0530 +++ linux-2.6.8.1-vatsa/net/core/sock.c 2004-08-26 16:53:14.000000000 +0530 @@ -657,6 +657,16 @@ void sk_free(struct sock *sk) module_put(owner); } +/* RCU callback to free a socket */ + +void sk_free_rcu(struct rcu_head *head) +{ + struct sock *sk = container_of(head, struct sock, sk_rcu); + + if (atomic_dec_and_test(&sk->sk_refcnt)) + sk_free(sk); +} + void __init sk_init(void) { sk_cachep = kmem_cache_create("sock", sizeof(struct sock), 0, @@ -1347,6 +1357,7 @@ EXPORT_SYMBOL(__lock_sock); EXPORT_SYMBOL(__release_sock); EXPORT_SYMBOL(sk_alloc); EXPORT_SYMBOL(sk_free); +EXPORT_SYMBOL(sk_free_rcu); EXPORT_SYMBOL(sk_send_sigurg); EXPORT_SYMBOL(sock_alloc_send_pskb); EXPORT_SYMBOL(sock_alloc_send_skb); diff -puN net/ipv4/tcp_ipv4.c~tcp_ehash net/ipv4/tcp_ipv4.c --- linux-2.6.8.1/net/ipv4/tcp_ipv4.c~tcp_ehash 2004-08-25 18:06:42.000000000 +0530 +++ linux-2.6.8.1-vatsa/net/ipv4/tcp_ipv4.c 2004-08-25 18:07:27.000000000 +0530 @@ -351,7 +351,8 @@ void tcp_listen_wlock(void) static __inline__ void __tcp_v4_hash(struct sock *sk, const int listen_possible) { struct hlist_head *list; - rwlock_t *lock; + rwlock_t *lock = NULL; + spinlock_t *slock = NULL; BUG_TRAP(sk_unhashed(sk)); if (listen_possible && sk->sk_state == TCP_LISTEN) { @@ -360,14 +361,16 @@ static __inline__ void __tcp_v4_hash(str tcp_listen_wlock(); } else { list = &tcp_ehash[(sk->sk_hashent = tcp_sk_hashfn(sk))].chain; - lock = &tcp_ehash[sk->sk_hashent].lock; - write_lock(lock); + slock = &tcp_ehash[sk->sk_hashent].lock; + spin_lock(slock); } __sk_add_node(sk, list); sock_prot_inc_use(sk->sk_prot); - write_unlock(lock); - if (listen_possible && sk->sk_state == TCP_LISTEN) + if (listen_possible && sk->sk_state == TCP_LISTEN) { + write_unlock(lock); 
wake_up(&tcp_lhash_wait); + } else + spin_unlock(slock); } static void tcp_v4_hash(struct sock *sk) @@ -381,7 +384,8 @@ static void tcp_v4_hash(struct sock *sk) void tcp_unhash(struct sock *sk) { - rwlock_t *lock; + rwlock_t *lock = NULL; + spinlock_t *slock = NULL; if (sk_unhashed(sk)) goto ende; @@ -392,17 +396,20 @@ void tcp_unhash(struct sock *sk) lock = &tcp_lhash_lock; } else { struct tcp_ehash_bucket *head = &tcp_ehash[sk->sk_hashent]; - lock = &head->lock; - write_lock_bh(&head->lock); + slock = &head->lock; + spin_lock_bh(&head->lock); } if (__sk_del_node_init(sk)) sock_prot_dec_use(sk->sk_prot); - write_unlock_bh(lock); + if (sk->sk_state != TCP_LISTEN) + spin_unlock_bh(slock); + else { + write_unlock_bh(lock); ende: - if (sk->sk_state == TCP_LISTEN) wake_up(&tcp_lhash_wait); + } } /* Don't inline this cruft. Here are some nice properties to @@ -494,7 +501,7 @@ static inline struct sock *__tcp_v4_look */ int hash = tcp_hashfn(daddr, hnum, saddr, sport); head = &tcp_ehash[hash]; - read_lock(&head->lock); + rcu_read_lock(); sk_for_each(sk, node, &head->chain) { if (TCP_IPV4_MATCH(sk, acookie, saddr, daddr, ports, dif)) goto hit; /* You sunk my battleship! */ @@ -507,7 +514,7 @@ static inline struct sock *__tcp_v4_look } sk = NULL; out: - read_unlock(&head->lock); + rcu_read_unlock(); return sk; hit: sock_hold(sk); @@ -559,7 +566,7 @@ static int __tcp_v4_check_established(st struct hlist_node *node; struct tcp_tw_bucket *tw; - write_lock(&head->lock); + spin_lock(&head->lock); /* Check TIME-WAIT sockets first. 
*/ sk_for_each(sk2, node, &(head + tcp_ehash_size)->chain) { @@ -614,7 +621,7 @@ unique: BUG_TRAP(sk_unhashed(sk)); __sk_add_node(sk, &head->chain); sock_prot_inc_use(sk->sk_prot); - write_unlock(&head->lock); + spin_unlock(&head->lock); if (twp) { *twp = tw; @@ -630,7 +637,7 @@ unique: return 0; not_unique: - write_unlock(&head->lock); + spin_unlock(&head->lock); return -EADDRNOTAVAIL; } @@ -2228,7 +2235,10 @@ static void *established_get_first(struc struct hlist_node *node; struct tcp_tw_bucket *tw; - read_lock(&tcp_ehash[st->bucket].lock); + /* Take the spinlock. Otherwise a dancing socket + * (__tcp_tw_hashdance) may be reported twice! + */ + spin_lock(&tcp_ehash[st->bucket].lock); sk_for_each(sk, node, &tcp_ehash[st->bucket].chain) { if (sk->sk_family != st->family) { continue; @@ -2245,7 +2255,7 @@ static void *established_get_first(struc rc = tw; goto out; } - read_unlock(&tcp_ehash[st->bucket].lock); + spin_unlock(&tcp_ehash[st->bucket].lock); st->state = TCP_SEQ_STATE_ESTABLISHED; } out: @@ -2272,10 +2282,10 @@ get_tw: cur = tw; goto out; } - read_unlock(&tcp_ehash[st->bucket].lock); + spin_unlock(&tcp_ehash[st->bucket].lock); st->state = TCP_SEQ_STATE_ESTABLISHED; if (++st->bucket < tcp_ehash_size) { - read_lock(&tcp_ehash[st->bucket].lock); + spin_lock(&tcp_ehash[st->bucket].lock); sk = sk_head(&tcp_ehash[st->bucket].chain); } else { cur = NULL; @@ -2385,7 +2395,7 @@ static void tcp_seq_stop(struct seq_file case TCP_SEQ_STATE_TIME_WAIT: case TCP_SEQ_STATE_ESTABLISHED: if (v) - read_unlock(&tcp_ehash[st->bucket].lock); + spin_unlock(&tcp_ehash[st->bucket].lock); local_bh_enable(); break; } diff -puN net/ipv4/tcp.c~tcp_ehash net/ipv4/tcp.c --- linux-2.6.8.1/net/ipv4/tcp.c~tcp_ehash 2004-08-25 18:06:42.000000000 +0530 +++ linux-2.6.8.1-vatsa/net/ipv4/tcp.c 2004-08-25 18:07:27.000000000 +0530 @@ -2258,7 +2258,7 @@ void __init tcp_init(void) if (!tcp_ehash) panic("Failed to allocate TCP established hash table\n"); for (i = 0; i < (tcp_ehash_size << 1); i++) { 
- tcp_ehash[i].lock = RW_LOCK_UNLOCKED; + tcp_ehash[i].lock = SPIN_LOCK_UNLOCKED; INIT_HLIST_HEAD(&tcp_ehash[i].chain); } diff -puN net/ipv4/tcp_diag.c~tcp_ehash net/ipv4/tcp_diag.c --- linux-2.6.8.1/net/ipv4/tcp_diag.c~tcp_ehash 2004-08-25 18:06:42.000000000 +0530 +++ linux-2.6.8.1-vatsa/net/ipv4/tcp_diag.c 2004-08-25 18:07:27.000000000 +0530 @@ -522,7 +522,10 @@ skip_listen_ht: if (i > s_i) s_num = 0; - read_lock_bh(&head->lock); + /* Take the spinlock. Otherwise a dancing socket + * (__tcp_tw_hashdance) may be reported twice! + */ + spin_lock_bh(&head->lock); num = 0; sk_for_each(sk, node, &head->chain) { @@ -542,7 +545,7 @@ skip_listen_ht: if (tcpdiag_fill(skb, sk, r->tcpdiag_ext, NETLINK_CB(cb->skb).pid, cb->nlh->nlmsg_seq) <= 0) { - read_unlock_bh(&head->lock); + spin_unlock_bh(&head->lock); goto done; } ++num; @@ -568,13 +571,13 @@ skip_listen_ht: if (tcpdiag_fill(skb, sk, r->tcpdiag_ext, NETLINK_CB(cb->skb).pid, cb->nlh->nlmsg_seq) <= 0) { - read_unlock_bh(&head->lock); + spin_unlock_bh(&head->lock); goto done; } ++num; } } - read_unlock_bh(&head->lock); + spin_unlock_bh(&head->lock); } done: diff -puN net/ipv4/tcp_minisocks.c~tcp_ehash net/ipv4/tcp_minisocks.c --- linux-2.6.8.1/net/ipv4/tcp_minisocks.c~tcp_ehash 2004-08-25 18:06:42.000000000 +0530 +++ linux-2.6.8.1-vatsa/net/ipv4/tcp_minisocks.c 2004-08-26 16:58:08.000000000 +0530 @@ -64,14 +64,14 @@ static void tcp_timewait_kill(struct tcp /* Unlink from established hashes. */ ehead = &tcp_ehash[tw->tw_hashent]; - write_lock(&ehead->lock); + spin_lock(&ehead->lock); if (hlist_unhashed(&tw->tw_node)) { - write_unlock(&ehead->lock); + spin_unlock(&ehead->lock); return; } __hlist_del(&tw->tw_node); sk_node_init(&tw->tw_node); - write_unlock(&ehead->lock); + spin_unlock(&ehead->lock); /* Disassociate with bind bucket. 
*/ bhead = &tcp_bhash[tcp_bhashfn(tw->tw_num)]; @@ -308,17 +308,28 @@ static void __tcp_tw_hashdance(struct so tw_add_bind_node(tw, &tw->tw_tb->owners); spin_unlock(&bhead->lock); - write_lock(&ehead->lock); + spin_lock(&ehead->lock); - /* Step 2: Remove SK from established hash. */ - if (__sk_del_node_init(sk)) - sock_prot_dec_use(sk->sk_prot); + /* + * We have to be carefull here since there could be racing + * (lock-free) lookups happening on other CPUs. If we remove SK first + * and then add TW, then there is a tiny window where this socket is + * in neither the established half nor in the TIMEWAIT half of the ehash + * table. Lookups occuring in that window can drop packets! + * Hence we first add TW and then remove SK, with a barrier in between. + */ - /* Step 3: Hash TW into TIMEWAIT half of established hash table. */ + /* Step 2: Hash TW into TIMEWAIT half of established hash table. */ tw_add_node(tw, &(ehead + tcp_ehash_size)->chain); atomic_inc(&tw->tw_refcnt); - write_unlock(&ehead->lock); + smp_wmb(); + + /* Step 3: Remove SK from established hash. 
*/ + if (__sk_del_node_init(sk)) + sock_prot_dec_use(sk->sk_prot); + + spin_unlock(&ehead->lock); } /* @@ -1069,11 +1080,29 @@ int tcp_child_process(struct sock *paren return ret; } +/* RCU callback to free a timewait bucket */ + +void tcp_tw_free(struct rcu_head *head) +{ + struct tcp_tw_bucket *tw = + container_of(head, struct tcp_tw_bucket, tw_rcu); + + if (atomic_dec_and_test(&tw->tw_refcnt)) { +#ifdef INET_REFCNT_DEBUG + printk(KERN_DEBUG "tw_bucket %p released\n", tw); +#endif + kmem_cache_free(tcp_timewait_cachep, tw); + } +} + + + EXPORT_SYMBOL(tcp_check_req); EXPORT_SYMBOL(tcp_child_process); EXPORT_SYMBOL(tcp_create_openreq_child); EXPORT_SYMBOL(tcp_timewait_state_process); EXPORT_SYMBOL(tcp_tw_deschedule); +EXPORT_SYMBOL(tcp_tw_free); #ifdef CONFIG_SYSCTL EXPORT_SYMBOL(sysctl_tcp_tw_recycle); diff -puN net/ipv6/tcp_ipv6.c~tcp_ehash net/ipv6/tcp_ipv6.c --- linux-2.6.8.1/net/ipv6/tcp_ipv6.c~tcp_ehash 2004-08-25 18:06:42.000000000 +0530 +++ linux-2.6.8.1-vatsa/net/ipv6/tcp_ipv6.c 2004-08-25 18:07:27.000000000 +0530 @@ -210,7 +210,8 @@ fail: static __inline__ void __tcp_v6_hash(struct sock *sk) { struct hlist_head *list; - rwlock_t *lock; + rwlock_t *lock = NULL; + spinlock_t *slock = NULL; BUG_TRAP(sk_unhashed(sk)); @@ -221,13 +222,16 @@ static __inline__ void __tcp_v6_hash(str } else { sk->sk_hashent = tcp_v6_sk_hashfn(sk); list = &tcp_ehash[sk->sk_hashent].chain; - lock = &tcp_ehash[sk->sk_hashent].lock; - write_lock(lock); + slock = &tcp_ehash[sk->sk_hashent].lock; + spin_lock(slock); } __sk_add_node(sk, list); sock_prot_inc_use(sk->sk_prot); - write_unlock(lock); + if (sk->sk_state == TCP_LISTEN) + write_unlock(lock); + else + spin_unlock(slock); } @@ -307,7 +311,7 @@ static inline struct sock *__tcp_v6_look */ hash = tcp_v6_hashfn(daddr, hnum, saddr, sport); head = &tcp_ehash[hash]; - read_lock(&head->lock); + rcu_read_lock(); sk_for_each(sk, node, &head->chain) { /* For IPV6 do the cheaper port and family tests first. 
*/ if(TCP_IPV6_MATCH(sk, saddr, daddr, ports, dif)) @@ -326,12 +330,12 @@ static inline struct sock *__tcp_v6_look goto hit; } } - read_unlock(&head->lock); + rcu_read_unlock(); return NULL; hit: sock_hold(sk); - read_unlock(&head->lock); + rcu_read_unlock(); return sk; } @@ -452,7 +456,7 @@ static int tcp_v6_check_established(stru struct hlist_node *node; struct tcp_tw_bucket *tw; - write_lock_bh(&head->lock); + spin_lock_bh(&head->lock); /* Check TIME-WAIT sockets first. */ sk_for_each(sk2, node, &(head + tcp_ehash_size)->chain) { @@ -491,7 +495,7 @@ unique: __sk_add_node(sk, &head->chain); sk->sk_hashent = hash; sock_prot_inc_use(sk->sk_prot); - write_unlock_bh(&head->lock); + spin_unlock_bh(&head->lock); if (tw) { /* Silly. Should hash-dance instead... */ _ -- Thanks and Regards, Srivatsa Vaddagiri, Linux Technology Center, IBM Software Labs, Bangalore, INDIA - 560017 From DanE@aiinet.com Wed Sep 1 07:13:35 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Sep 2004 07:13:42 -0700 (PDT) Received: from aiexchange.ai.aiinet.com (ai181-26.aiinet.com [205.245.181.26]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i81EDZc4021183 for ; Wed, 1 Sep 2004 07:13:35 -0700 X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="----_=_NextPart_001_01C4902D.D3F1B09D" Subject: RE: [Bridge] BCP status Date: Wed, 1 Sep 2004 10:13:17 -0400 Message-ID: X-MS-Has-Attach: yes X-MS-TNEF-Correlator: Thread-Topic: [Bridge] BCP status Thread-Index: AcSQBeEZdLXyu5JNTQGZbEJfr4S2kAAIN/oA From: "Eble, Dan" To: "Petter Larsen" Cc: , X-archive-position: 8301 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: DanE@aiinet.com Precedence: bulk X-list: netdev This is a multi-part message in MIME format. 
------_=_NextPart_001_01C4902D.D3F1B09D
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

> -----Original Message-----
> From: Petter Larsen [mailto:petter.larsen@morecom.no]
> Subject: RE: [Bridge] BCP status
>
> Ok. The patches that you sent to the mailing list in March,
> are they the latest from your side? You mentioned that you had some
> plug-ins for HDLC ppp-layer drivers.
>
> Are you working on this now, or are you finished?

Nothing is ever finished; however, I have not worked on it for months.

Kernel BCP patch:
http://marc.theaimsgroup.com/?l=linux-ppp&m=107814944215772&w=2
No changes since then, AFAICT.

PPP daemon BCP patch:
http://marc.theaimsgroup.com/?l=linux-ppp&m=107772784930462&w=2
No changes since then, AFAICT.

PPP daemon "wanppp" plugin:
http://marc.theaimsgroup.com/?l=linux-ppp&m=105526128502635&w=2
It looks current except for a correction of spelling
(s/compliled/compiled/) and the addition of prototypes for
wanppp_cleanup() and wanppp_close().

Kernel generic HDLC to generic PPP layer:
http://marc.theaimsgroup.com/?l=linux-net&m=105525978800615&w=2
This has changed slightly. I have attached our latest
drivers/net/wan/hdlc_ppp.c.

--
Dan Eble _____ . Software Engineer | _ |/| Applied Innovation Inc.
| |_| | | http://www.aiinet.com/ |__/|_|_| ------_=_NextPart_001_01C4902D.D3F1B09D Content-Type: application/octet-stream; name="hdlc_ppp.c" Content-Transfer-Encoding: base64 Content-Description: hdlc_ppp.c Content-Disposition: attachment; filename="hdlc_ppp.c" LyoKICogR2VuZXJpYyBIRExDIHN1cHBvcnQgcm91dGluZXMgZm9yIExpbnV4CiAqIFBvaW50LXRv LXBvaW50IHByb3RvY29sIHN1cHBvcnQKICoKICogQ29weXJpZ2h0IChDKSAxOTk5IC0gMjAwMyBL cnp5c3p0b2YgSGFsYXNhIDxraGNAcG0ud2F3LnBsPgogKgogKiBUaGlzIHByb2dyYW0gaXMgZnJl ZSBzb2Z0d2FyZTsgeW91IGNhbiByZWRpc3RyaWJ1dGUgaXQgYW5kL29yIG1vZGlmeSBpdAogKiB1 bmRlciB0aGUgdGVybXMgb2YgdmVyc2lvbiAyIG9mIHRoZSBHTlUgR2VuZXJhbCBQdWJsaWMgTGlj ZW5zZQogKiBhcyBwdWJsaXNoZWQgYnkgdGhlIEZyZWUgU29mdHdhcmUgRm91bmRhdGlvbi4KICov CgojaW5jbHVkZSA8bGludXgvY29uZmlnLmg+CiNpbmNsdWRlIDxsaW51eC9tb2R1bGUuaD4KI2lu Y2x1ZGUgPGxpbnV4L2tlcm5lbC5oPgojaW5jbHVkZSA8bGludXgvc2xhYi5oPgojaW5jbHVkZSA8 bGludXgvcG9sbC5oPgojaW5jbHVkZSA8bGludXgvZXJybm8uaD4KI2luY2x1ZGUgPGxpbnV4L2lm X2FycC5oPgojaW5jbHVkZSA8bGludXgvaW5pdC5oPgojaW5jbHVkZSA8bGludXgvc2tidWZmLmg+ CiNpbmNsdWRlIDxsaW51eC9wa3Rfc2NoZWQuaD4KI2luY2x1ZGUgPGxpbnV4L2luZXRkZXZpY2Uu aD4KI2luY2x1ZGUgPGxpbnV4L2xhcGIuaD4KI2luY2x1ZGUgPGxpbnV4L3J0bmV0bGluay5oPgoj aW5jbHVkZSA8bGludXgvaGRsYy5oPgojaW5jbHVkZSA8bGludXgvcHBwX2RlZnMuaD4KI2luY2x1 ZGUgPGxpbnV4L2lmX3BwcC5oPgojaW5jbHVkZSA8bGludXgvcHBwX2NoYW5uZWwuaD4KCi8qKioq KioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioq KioqKioqKioqKioqKgogKiBDb25zdGFudHMgYW5kIFN0cnVjdHVyZXMKICoqKioqKioqKioqKioq KioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioq KiovCgovKiBUaGUgUFBQIGhlYWRlciBpbmNsdWRlcyB0aGUgSERMQyBhZGRyZXNzICYgY29udHJv bCBieXRlcywKICogc28gZG8gbm90IGNvdW50IHRoZW0gaW4gTVRVIGFkanVzdG1lbnRzLgogKi8K I2RlZmluZSBQUFBfT1ZFUkhFQUQJKFBQUF9IRFJMRU4gLSAyKQoKLyoqKioqKioqKioqKioqKioq KioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioq CiAqIFByb3RvdHlwZXMKICoqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioq 
KioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKiovCgpzdGF0aWMgaW50IGhkbGNfcHBwX3Jl Z2lzdGVyKGhkbGNfZGV2aWNlICpoZGxjKTsKc3RhdGljIHZvaWQgaGRsY19wcHBfdW5yZWdpc3Rl cihoZGxjX2RldmljZSAqaGRsYyk7CnN0YXRpYyB2b2lkIGhkbGNfcHBwX25ldGlmX3J4KHN0cnVj dCBza19idWZmICpza2IpOwpzdGF0aWMgaW50IGhkbGNfZ2VucHBwX3N0YXJ0X3htaXQoc3RydWN0 IHBwcF9jaGFubmVsICosIHN0cnVjdCBza19idWZmICopOwpzdGF0aWMgaW50IGhkbGNfZ2VucHBw X2lvY3RsKHN0cnVjdCBwcHBfY2hhbm5lbCosIHVuc2lnbmVkIGludCwgdW5zaWduZWQgbG9uZyk7 CgovKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioq KioqKioqKioqKioqKioqKioqKioKICogVmFyaWFibGVzCiAqKioqKioqKioqKioqKioqKioqKioq KioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqLwoKc3Rh dGljIHN0cnVjdCBwcHBfY2hhbm5lbF9vcHMgaGRsY19nZW5wcHBfb3BzID0KewoJLnN0YXJ0X3ht aXQgPQloZGxjX2dlbnBwcF9zdGFydF94bWl0LAoJLmlvY3RsID0JaGRsY19nZW5wcHBfaW9jdGwK fTsKCi8qKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioq KioqKioqKioqKioqKioqKioqKioqKgogKiBJbmxpbmUgRnVuY3Rpb25zCiAqKioqKioqKioqKioq KioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioq KioqLwoKLyoqIEdldCB0aGUgUFBQIGNoYW5uZWwgb3duZWQgYnkgYW4gSERMQyBkZXZpY2UuICov CnN0YXRpYyBfX2lubGluZV9fIHN0cnVjdCBwcHBfY2hhbm5lbCogaGRsY190b19jaGFuKGhkbGNf ZGV2aWNlICpoZGxjKQp7CglyZXR1cm4gKHN0cnVjdCBwcHBfY2hhbm5lbCopJmhkbGMtPnN0YXRl LnBwcC5jaGFuOwp9CgovKiogR2V0IGFuIEhETEMgZGV2aWNlIGZyb20gaXRzIFBQUCBjaGFubmVs LiAqLwpzdGF0aWMgX19pbmxpbmVfXyBoZGxjX2RldmljZSogY2hhbl90b19oZGxjKHN0cnVjdCBw cHBfY2hhbm5lbCAqY2hhbikKewoJcmV0dXJuIChoZGxjX2RldmljZSopY2hhbi0+cHJpdmF0ZTsK fQoKLyoqIENoZWNrIFVQLCBSVU5OSU5HLCBhbmQgY2FycmllciBhbGwgYXQgb25jZS4gKi8Kc3Rh dGljIF9faW5saW5lX18gaW50IG5ldGlmX2dvb2RfdG9fZ28oc3RydWN0IG5ldF9kZXZpY2UgKmRl dikKewoJcmV0dXJuIChkZXYtPmZsYWdzICYgSUZGX1VQKSAmJgoJCW5ldGlmX3J1bm5pbmcoZGV2 KSAmJgoJCW5ldGlmX2NhcnJpZXJfb2soZGV2KTsKfQoKLyoqKioqKioqKioqKioqKioqKioqKioq KioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqCiAqIEZ1 
bmN0aW9ucwogKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioq KioqKioqKioqKioqKioqKioqKioqKioqKi8KCi8qKgogKiBJbml0aWFsaXplIGFuZCByZWdpc3Rl ciBhIFBQUCBjaGFubmVsIHdpdGggdGhlIGdlbmVyaWMgUFBQIGxheWVyLgogKgogKiBoZGxjX3Bw cF9yZWdpc3RlcigpIGlzIGNhbGxlZCBmcm9tIHByb2Nlc3MgY29udGV4dCB3aGlsZSB0aGUKICog aW50ZXJmYWNlIGlzIGRvd24uCiAqLwpzdGF0aWMgaW50IGhkbGNfcHBwX3JlZ2lzdGVyKGhkbGNf ZGV2aWNlICpoZGxjKQp7CglzdHJ1Y3QgbmV0X2RldmljZSAqY29uc3QgZGV2ID0gaGRsY190b19k ZXYoaGRsYyk7CglzdHJ1Y3QgcHBwX2NoYW5uZWwgKmNvbnN0IGNoYW4gPSBoZGxjX3RvX2NoYW4o aGRsYyk7CglpbnQgb2xkX210dSwgbmV3X210dTsKCWludCBlcnI7CgoJLyoKCSAqIFNhdmUgdGhl IG9sZCBNVFUgYW5kIHVzZSBvbmUgbW9yZSBhZGVxdWF0ZSBmb3IgUFBQLgoJICovCglvbGRfbXR1 ID0gZGV2LT5tdHU7CgoJLyogVHJ5IGFuIE1UVSBmb3IgYnJpZGdpbmcgRXRoZXJuZXQgZnJhbWVz LiAqLwoJbmV3X210dSA9IFBQUF9PVkVSSEVBRCArQkNQXzgwMl8zX0hEUkxFTiArIEVUSF9GUkFN RV9MRU4gKyBFVEhfRkNTX0xFTjsKCWlmIChkZXYtPm10dSA8IG5ld19tdHUpCgkJZGV2LT5jaGFu Z2VfbXR1KGRldiwgbmV3X210dSk7CgoJLyogSWYgdGhhdCBNVFUgZGlkbid0IHdvcmssIHRyeSBp dCB3aXRob3V0IHJvb20gZm9yIHRoZSBFdGhlcm5ldCBGQ1MsCgkgKiB3aGljaCBpcyBvcHRpb25h bCBmb3IgQkNQIGVuY2Fwc3VsYXRpb24uICovCgluZXdfbXR1IC09IEVUSF9GQ1NfTEVOOwoJaWYg KGRldi0+bXR1IDwgbmV3X210dSkKCQlkZXYtPmNoYW5nZV9tdHUoZGV2LCBuZXdfbXR1KTsKCgkv KiBJZiAqdGhhdCogZGlkbid0IHdvcmssIHRyeSB0aGUgZGVmYXVsdCBQUFAgTVRVLiAqLwoJbmV3 X210dSA9IFBQUF9PVkVSSEVBRCArIFBQUF9NVFU7CglpZiAoZGV2LT5tdHUgPCBuZXdfbXR1KQoJ CWRldi0+Y2hhbmdlX210dShkZXYsIG5ld19tdHUpOwoKCS8qIE1UVSBzaG91bGQgbm90IGJlIGNo YW5nZWQgd2hpbGUgUFBQIG1vZGUgaXMgaW4gZWZmZWN0LiAqLwoJaGRsYy0+bXR1X2xvY2tlZCA9 IDE7CgoJLyogcmVzZXQgdGhlIFBQUCBzdGF0ZSAqLwoJbWVtc2V0KCZoZGxjLT5zdGF0ZS5wcHAs IDAsIHNpemVvZihoZGxjLT5zdGF0ZS5wcHApKTsKCWNoYW4tPnByaXZhdGUgPSBoZGxjOwoJY2hh bi0+b3BzID0gJmhkbGNfZ2VucHBwX29wczsKCgkvKiBUaGUgY2hhbm5lbCBNVFUgaXMgdGhlIHVu ZGVybHlpbmcgSERMQyBNVFUgcmVkdWNlZCBieSB0aGUKCSAqIG92ZXJoZWFkIFBQUCByZXF1aXJl cy4KCSAqLwoJY2hhbi0+bXR1ID0gZGV2LT5tdHUgLSBQUFBfT1ZFUkhFQUQ7CgoJLyogY2hhbi0+ 
aGRybGVuIGlzIHRoZSBoZWFkcm9vbSB0aGUgZ2VuZXJpYyBQUFAgbGF5ZXIgd2lsbAoJICogcmVz ZXJ2ZSBvbiBidWZmZXJzIGl0IHNlbmRzIHRvIHRoaXMgZHJpdmVyLiAgV2UgbmVlZCBlbm91Z2gK CSAqIGZvciB0aGUgSERMQyBhZGRyZXNzIGFuZCBjb250cm9sIGJ5dGVzLiAqLwoJY2hhbi0+aGRy bGVuID0gMjsKCiAJLyogVGhlIHBwcF9jaGFubmVsIG9iamVjdCBtdXN0IGV4aXN0IGZyb20gdGhl IHRpbWUgdGhhdAoJICogcHBwX3JlZ2lzdGVyX2NoYW5uZWwoKSBpcyBjYWxsZWQgdW50aWwgYWZ0 ZXIgdGhlIGNhbGwgdG8KCSAqIHBwcF91bnJlZ2lzdGVyX2NoYW5uZWwoKSByZXR1cm5zLgoJICov CgllcnIgPSBwcHBfcmVnaXN0ZXJfY2hhbm5lbChjaGFuKTsKCWlmICghZXJyKQoJewoJCWhkbGMt Pm9wZW4gPSBOVUxMOwoJCWhkbGMtPnN0b3AgPSBOVUxMOwoJCWhkbGMtPnByb3RvX2RldGFjaCA9 IGhkbGNfcHBwX3VucmVnaXN0ZXI7CgkJaGRsYy0+bmV0aWZfcnggPSBoZGxjX3BwcF9uZXRpZl9y eDsKCQloZGxjLT50eXBlX3RyYW5zID0gTlVMTDsJLyogZm9yY2UgdXNlIG9mIG5ldGlmX3J4KCkg Ki8KCQloZGxjLT5wcm90byA9IElGX1BST1RPX1BQUDsKCgkJZGV2LT5oYXJkX3N0YXJ0X3htaXQg PSBoZGxjLT54bWl0OwoJCWRldi0+aGFyZF9oZWFkZXIgPSBOVUxMOwoJCWRldi0+dHlwZSA9IEFS UEhSRF9QUFA7CgkJZGV2LT5oYXJkX2hlYWRlcl9sZW4gPSAyOwoJCWRldi0+YWRkcl9sZW4gPSAx OwoJCWRldi0+YnJvYWRjYXN0WzBdID0gUFBQX0FMTFNUQVRJT05TOwoJCWRldi0+ZGV2X2FkZHJb MF0gPSAwOwkJLyogbm90IGltcG9ydGFudCBmb3IgUFBQICovCgoJCWhkbGMtPnN0YXRlLnBwcC5z ZXR0aW5ncy5jaGFubmVsID0KCQkJcHBwX2NoYW5uZWxfaW5kZXgoaGRsY190b19jaGFuKGhkbGMp KTsKCgkJZ290byBTdWNjZXNzOwoJfQoKCS8qIHJlc3RvcmUgb2xkIE1UVSAqLwoJaGRsYy0+bXR1 X2xvY2tlZCA9IDA7CglpZiAoZGV2LT5tdHUgIT0gb2xkX210dSkKCQlkZXYtPmNoYW5nZV9tdHUo ZGV2LCBvbGRfbXR1KTsKCiBTdWNjZXNzOgoJcmV0dXJuIGVycjsKfQoKCgovKioKICogQ2xvc2Ug dGhlIGNoYW5uZWwgdG8gdGhlIGdlbmVyaWMgUFBQIGxheWVyLgogKgogKiBoZGxjX3BwcF91bnJl Z2lzdGVyKCkgaXMgY2FsbGVkIGZyb20gcHJvY2VzcyBjb250ZXh0IHdoaWxlIHRoZQogKiBpbnRl cmZhY2UgaXMgZG93bi4KICovCnN0YXRpYyB2b2lkIGhkbGNfcHBwX3VucmVnaXN0ZXIoaGRsY19k ZXZpY2UgKmhkbGMpCnsKCXN0cnVjdCBwcHBfY2hhbm5lbCAqY29uc3QgY2hhbiA9IGhkbGNfdG9f Y2hhbihoZGxjKTsKCXN0cnVjdCBuZXRfZGV2aWNlICpjb25zdCBkZXYgPSBoZGxjX3RvX2Rldiho ZGxjKTsKCgkvKiBObyB0aHJlYWQgbWF5IGJlIGluIGEgY2FsbCB0byBhbnkgb2YgcHBwX2lucHV0 
KCksCgkgKiBwcHBfaW5wdXRfZXJyb3IoKSwgcHBwX291dHB1dF93YWtldXAoKSwgcHBwX2NoYW5u ZWxfaW5kZXgoKQoJICogb3IgcHBwX3VuaXRfbnVtYmVyKCkgZm9yIGEgY2hhbm5lbCBhdCB0aGUg dGltZSB0aGF0CgkgKiBwcHBfdW5yZWdpc3Rlcl9jaGFubmVsKCkgaXMgY2FsbGVkIGZvciB0aGF0 IGNoYW5uZWwuCgkgKi8KCS8qIEJ5IHRoZSB0aW1lIGEgY2FsbCB0byBwcHBfdW5yZWdpc3Rlcl9j aGFubmVsKCkgcmV0dXJucywgbm8KCSAqIHRocmVhZCB3aWxsIGJlIGV4ZWN1dGluZyBpbiBhIGNh bGwgZnJvbSB0aGUgZ2VuZXJpYyBsYXllcgoJICogdG8gdGhhdCBjaGFubmVsJ3Mgc3RhcnRfeG1p dCgpIG9yIGlvY3RsKCkgZnVuY3Rpb24sIGFuZCB0aGUKCSAqIGdlbmVyaWMgbGF5ZXIgd2lsbCBu b3QgY2FsbCBlaXRoZXIgb2YgdGhvc2UgZnVuY3Rpb25zCgkgKiBzdWJzZXF1ZW50bHkuCgkgKi8K CXBwcF91bnJlZ2lzdGVyX2NoYW5uZWwoY2hhbik7CgoJaGRsYy0+bXR1X2xvY2tlZCA9IDA7Cglo ZGxjLT5zdGF0ZS5wcHAuc2V0dGluZ3MuY2hhbm5lbCA9IC0xOwoJZGV2LT5oYXJkX2hlYWRlcl9s ZW4gPSAxNjsKfQoKCgovKioKICogUmVjZWl2ZSBhIGJ1ZmZlciBmcm9tIHRoZSBoYXJkd2FyZSwg c3RyaXAgdGhlIFBQUCBoZWFkZXIsIGFuZCBwYXNzCiAqIHRoZSByZXN0IHRvIHRoZSBnZW5lcmlj IFBQUCBsYXllci4KICovCnN0YXRpYyB2b2lkIGhkbGNfcHBwX25ldGlmX3J4KHN0cnVjdCBza19i dWZmICpza2IpCnsKCXN0cnVjdCBwcHBfY2hhbm5lbCAqY29uc3QgY2hhbiA9IGhkbGNfdG9fY2hh bihkZXZfdG9faGRsYyhza2ItPmRldikpOwoJdW5zaWduZWQgY2hhciAqcDsKCgkvKiBzdHJpcCBh ZGRyZXNzL2NvbnRyb2wgZmllbGQgaWYgcHJlc2VudCAqLwoJcCA9IHNrYi0+ZGF0YTsKCWlmIChw WzBdID09IFBQUF9BTExTVEFUSU9OUyAmJiBwWzFdID09IFBQUF9VSSkgewoJCS8qIGNob3Agb2Zm IGFkZHJlc3MvY29udHJvbCAqLwoJCWlmIChza2ItPmxlbiA8IDMpCgkJCWdvdG8gZXJyOwoJCXAg PSBza2JfcHVsbChza2IsIDIpOwoJfQoKCS8qIGRlY29tcHJlc3MgcHJvdG9jb2wgZmllbGQgaWYg Y29tcHJlc3NlZCAqLwoJaWYgKHBbMF0gJiAxKSB7CgkJLyogcHJvdG9jb2wgaXMgY29tcHJlc3Nl ZCAqLwoJCXNrYl9wdXNoKHNrYiwgMSlbMF0gPSAwOwoJfSBlbHNlIGlmIChza2ItPmxlbiA8IDIp CgkJZ290byBlcnI7CgoJLyogcGFzcyB0byBnZW5lcmljIGxheWVyICovCglwcHBfaW5wdXQoY2hh biwgc2tiKTsKCXJldHVybjsKCiBlcnI6CglrZnJlZV9za2Ioc2tiKTsKCXBwcF9pbnB1dF9lcnJv cihjaGFuLCAwKTsKfQoKCgovKioKICogU2VuZCBhIHBhY2tldCAob3IgbXVsdGlsaW5rIGZyYWdt ZW50KSBvbiB0aGlzIGNoYW5uZWwuCiAqIFJldHVybnMgMSBpZiBpdCB3YXMgYWNjZXB0ZWQsIDAg 
dG8gcXVldWUgaXQgZm9yIGxhdGVyLgogKgogKiBUaGUgZ2VuZXJpYyBsYXllciB3aWxsIG5vdCBj YWxsIHRoZSBzdGFydF94bWl0KCkgZnVuY3Rpb24gZm9yIGEKICogY2hhbm5lbCB3aGlsZSBhbnkg dGhyZWFkIGlzIGFscmVhZHkgZXhlY3V0aW5nIGluIHRoYXQgZnVuY3Rpb24gZm9yCiAqIHRoYXQg Y2hhbm5lbC4KICoKICogVGhlIGdlbmVyaWMgbGF5ZXIgbWF5IGNhbGwgdGhlIGNoYW5uZWwgc3Rh cnRfeG1pdCgpIGZ1bmN0aW9uIGF0CiAqIHNvZnRpcnEvQkggbGV2ZWwgYnV0IHdpbGwgbm90IGNh bGwgaXQgYXQgaW50ZXJydXB0IGxldmVsLiAgVGh1cyB0aGUKICogc3RhcnRfeG1pdCgpIGZ1bmN0 aW9uIG1heSBub3QgYmxvY2suCiAqLwpzdGF0aWMgaW50IGhkbGNfZ2VucHBwX3N0YXJ0X3htaXQo c3RydWN0IHBwcF9jaGFubmVsICpjaGFuLAoJCQkJICBzdHJ1Y3Qgc2tfYnVmZiAqc2tiKQp7Cglo ZGxjX2RldmljZSAqY29uc3QgaGRsYyA9IGNoYW5fdG9faGRsYyhjaGFuKTsKCXN0cnVjdCBuZXRf ZGV2aWNlICpjb25zdCBkZXYgPSBoZGxjX3RvX2RldihoZGxjKTsKCWludCBwcm90bzsKCXVuc2ln bmVkIGNoYXIgKmRhdGE7CglpbnQgaXNsY3A7CgoJaWYgKCFuZXRpZl9nb29kX3RvX2dvKGRldikp IHsKCQkvKiogQHRvZG8gSW5zdGVhZCwgcmV0dXJuIDAgdG8gbWFrZSBnZW5lcmljIGxheWVyCgkJ ICogcXVldWUgdGhlIHBhY2tldC4gIFRoYXQgd2lsbCByZXF1aXJlIGNhbGxpbmcKCQkgKiBwcHBf b3V0cHV0X3dha2V1cCgpIGF0IGFuIGFwcHJvcHJpYXRlIHRpbWUuICovCgkJa2ZyZWVfc2tiKHNr Yik7CgkJKytoZGxjLT5zdGF0cy50eF9kcm9wcGVkOwoJCXJldHVybiAxOwoJfQoKCWRhdGEgID0g c2tiLT5kYXRhOwoJcHJvdG8gPSAoZGF0YVswXSA8PCA4KSArIGRhdGFbMV07CgoJLyogTENQIHBh Y2tldHMgd2l0aCBjb2RlcyBiZXR3ZWVuIDEgKGNvbmZpZ3VyZS1yZXF1ZXN0KQoJICogYW5kIDcg KGNvZGUtcmVqZWN0KSBtdXN0IGJlIHNlbnQgYXMgdGhvdWdoIG5vIG9wdGlvbnMKCSAqIGhhdmUg YmVlbiBuZWdvdGlhdGVkLgoJICovCglpc2xjcCA9IHByb3RvID09IFBQUF9MQ1AgJiYgMSA8PSBk YXRhWzJdICYmIGRhdGFbMl0gPD0gNzsKCgkvKiBjb21wcmVzcyBwcm90b2NvbCBmaWVsZCBpZiBv cHRpb24gZW5hYmxlZCAqLwoJaWYgKGRhdGFbMF0gPT0gMCAmJiAoaGRsYy0+c3RhdGUucHBwLmZs YWdzICYgU0NfQ09NUF9QUk9UKSAmJiAhaXNsY3ApCgkJc2tiX3B1bGwoc2tiLDEpOwoKCS8qIHBy ZXBlbmQgYWRkcmVzcy9jb250cm9sIGZpZWxkcyBpZiBuZWNlc3NhcnkgKi8KCWlmICgoaGRsYy0+ c3RhdGUucHBwLmZsYWdzICYgU0NfQ09NUF9BQykgPT0gMCB8fCBpc2xjcCkgewoJCWlmIChza2Jf aGVhZHJvb20oc2tiKSA8IDIpIHsKCQkJc3RydWN0IHNrX2J1ZmYgKm5wa3QgPSBkZXZfYWxsb2Nf 
c2tiKHNrYi0+bGVuICsgMik7CgkJCWlmIChucGt0ID09IE5VTEwpIHsKCQkJCWtmcmVlX3NrYihz a2IpOwoJCQkJKytoZGxjLT5zdGF0cy50eF9kcm9wcGVkOwoJCQkJcmV0dXJuIDE7CgkJCX0KCQkJ c2tiX3Jlc2VydmUobnBrdCwyKTsKCQkJbWVtY3B5KHNrYl9wdXQobnBrdCxza2ItPmxlbiksIHNr Yi0+ZGF0YSwgc2tiLT5sZW4pOwoJCQlrZnJlZV9za2Ioc2tiKTsKCQkJc2tiID0gbnBrdDsKCQl9 CgkJc2tiX3B1c2goc2tiLDIpOwoJCXNrYi0+ZGF0YVswXSA9IFBQUF9BTExTVEFUSU9OUzsKCQlz a2ItPmRhdGFbMV0gPSBQUFBfVUk7Cgl9CgoJc2tiLT5kZXYgPSBkZXY7Cglza2ItPm5oLnJhdyA9 IHNrYi0+ZGF0YTsKCglkZXZfcXVldWVfeG1pdChza2IpOwoJcmV0dXJuIDE7Cn0KCgoKLyoqCiAq IEhhbmRsZSBhbiBpb2N0bCBjYWxsIHRoYXQgaGFzIGNvbWUgaW4gdmlhIC9kZXYvcHBwLgogKgog KiBUaGUgZ2VuZXJpYyBsYXllciB3aWxsIG9ubHkgY2FsbCB0aGUgY2hhbm5lbCBpb2N0bCgpIGZ1 bmN0aW9uIGluCiAqIHByb2Nlc3MgY29udGV4dC4KICoKICogVGhlIGdlbmVyaWMgbGF5ZXIgd2ls bCBub3QgY2FsbCB0aGUgaW9jdGwoKSBmdW5jdGlvbiBmb3IgYSBjaGFubmVsCiAqIHdoaWxlIGFu eSB0aHJlYWQgaXMgYWxyZWFkeSBleGVjdXRpbmcgaW4gdGhhdCBmdW5jdGlvbiBmb3IgdGhhdAog KiBjaGFubmVsLgogKi8Kc3RhdGljIGludCBoZGxjX2dlbnBwcF9pb2N0bChzdHJ1Y3QgcHBwX2No YW5uZWwgKmNoYW4sCgkJCSAgICAgdW5zaWduZWQgaW50IGNtZCwgdW5zaWduZWQgbG9uZyBhcmcp CnsKCWhkbGNfZGV2aWNlICpjb25zdCBoZGxjID0gY2hhbl90b19oZGxjKGNoYW4pOwoJc3RydWN0 IG5ldF9kZXZpY2UgKmNvbnN0IGRldiA9IGhkbGNfdG9fZGV2KGhkbGMpOwoJaW50IHZhbDsKCWlu dCBlcnI7CgoJZXJyID0gLUVGQVVMVDsKCXN3aXRjaCAoY21kKSB7CgljYXNlIFBQUElPQ0dNUlU6 CgkJaWYgKHB1dF91c2VyKGRldi0+bXR1IC0gUFBQX09WRVJIRUFELCAoaW50ICopIGFyZykpCgkJ CWJyZWFrOwoJCWVyciA9IDA7CgkJYnJlYWs7CgoJY2FzZSBQUFBJT0NTTVJVOgoJCWlmIChnZXRf dXNlcih2YWwsIChpbnQgKikgYXJnKSkKCQkJYnJlYWs7CgoJCWlmICh2YWwgPiBkZXYtPm10dSAt IFBQUF9PVkVSSEVBRCkKCQkJZXJyID0gLUVJTlZBTDsKCQllbHNlCgkJCWVyciA9IDA7CgkJYnJl YWs7CgoJY2FzZSBQUFBJT0NTRkxBR1M6CgkJaWYgKGdldF91c2VyKHZhbCwgKGludCAqKSBhcmcp KQoJCQlicmVhazsKCQl2YWwgJj0gU0NfTUFTSzsJLyoga2VlcCB0aGUgYml0cyB0aGF0IGFyZSBh bGxvd2VkIHRvIGJlIHNldCAqLwoJCWhkbGMtPnN0YXRlLnBwcC5mbGFncyAmPSB+U0NfTUFTSzsK CQloZGxjLT5zdGF0ZS5wcHAuZmxhZ3MgfD0gdmFsOwoJCWVyciA9IDA7CgkJYnJlYWs7CgoJZGVm 
YXVsdDoKCQllcnIgPSAtRU5PVFRZOwoJfQoJcmV0dXJuIGVycjsKfQoKCgppbnQgaGRsY19wcHBf aW9jdGwoaGRsY19kZXZpY2UgKmhkbGMsIHN0cnVjdCBpZnJlcSAqaWZyKQp7CglzdHJ1Y3QgbmV0 X2RldmljZSAqZGV2ID0gaGRsY190b19kZXYoaGRsYyk7CglpbnQgZXJyOwoKCXN3aXRjaCAoaWZy LT5pZnJfc2V0dGluZ3MudHlwZSkgewoJY2FzZSBJRl9HRVRfUFJPVE86CgkJaWZyLT5pZnJfc2V0 dGluZ3MudHlwZSA9IElGX1BST1RPX1BQUDsKCQlpZiAoaWZyLT5pZnJfc2V0dGluZ3Muc2l6ZSA8 IHNpemVvZihwcHBfcHJvdG8pKSB7CgkJCWlmci0+aWZyX3NldHRpbmdzLnNpemUgPSBzaXplb2Yo cHBwX3Byb3RvKTsKCQkJcmV0dXJuIC1FTk9CVUZTOwoJCX0KCQlpZiAoY29weV90b191c2VyKGlm ci0+aWZyX3NldHRpbmdzLmlmc19pZnN1LnBwcCwKCQkJCSAmaGRsYy0+c3RhdGUucHBwLnNldHRp bmdzLCBzaXplb2YocHBwX3Byb3RvKSkpCgkJCXJldHVybiAtRUZBVUxUOwoJCXJldHVybiAwOwoK CWNhc2UgSUZfUFJPVE9fUFBQOgoJCWlmKCFjYXBhYmxlKENBUF9ORVRfQURNSU4pKQoJCQlyZXR1 cm4gLUVQRVJNOwoKCQlpZihkZXYtPmZsYWdzICYgSUZGX1VQKQoJCQlyZXR1cm4gLUVCVVNZOwoK CQkvKiBubyBzZXR0YWJsZSBwYXJhbWV0ZXJzICovCgoJCWVyciA9IGhkbGMtPmF0dGFjaChoZGxj LCBFTkNPRElOR19OUlosUEFSSVRZX0NSQzE2X1BSMV9DQ0lUVCk7CgkJaWYgKCFlcnIpIHsKCQkJ aGRsY19wcm90b19kZXRhY2goaGRsYyk7CgkJCWVyciA9IGhkbGNfcHBwX3JlZ2lzdGVyKGhkbGMp OwoJCX0KCgkJcmV0dXJuIGVycjsKCX0KCglyZXR1cm4gLUVJTlZBTDsKfQo= ------_=_NextPart_001_01C4902D.D3F1B09D-- From davem@davemloft.net Wed Sep 1 13:37:54 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Sep 2004 13:38:01 -0700 (PDT) Received: from smtp109.mail.sc5.yahoo.com (smtp109.mail.sc5.yahoo.com [66.163.170.7]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id i81Kbs8b005999 for ; Wed, 1 Sep 2004 13:37:54 -0700 Received: from unknown (HELO cheetah.davemloft.net) (davem?330@63.197.226.105 with login) by smtp109.mail.sc5.yahoo.com with SMTP; 1 Sep 2004 20:37:46 -0000 Date: Wed, 1 Sep 2004 13:37:09 -0700 From: "David S. Miller" To: Zhikui Chen Cc: hadi@cyberus.ca, dccp@ietf.org, netdev@oss.sgi.com, acme@conectiva.com.br Subject: Re: HELP for dccp implementation. 
Message-Id: <20040901133709.3637d63d.davem@davemloft.net> In-Reply-To: <4135A32A.4030901@rus.uni-stuttgart.de> References: <412CC269.8080907@rus.uni-stuttgart.de> <1093454747.1034.85.camel@jzny.localdomain> <4135A32A.4030901@rus.uni-stuttgart.de> Organization: DaveM Loft Enterprises X-Mailer: Sylpheed version 0.9.12 (GTK+ 1.2.10; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 8304 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 594 Lines: 14 On Wed, 01 Sep 2004 12:23:38 +0200 Zhikui Chen wrote: > If I assign a value such as 0x ee9fbc00 to sk in dccp_rcv (before lookup > calling), and comment lookup calling, I get an error report from > bh_lock_sock(sk) calling inside dccp_rcv, which error report is > spin_is_locked on uninitialized spinlock ee9fbc00, and spin_lock > (:ee9fbc00) already locked by /73. > > Do you know its reason? Thanks, Zhikui, are you working together with Arnaldo using his code base, like we suggested to you? Or are you working still on your own code? 
From janitor@sternwelten.at Wed Sep 1 13:49:25 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Sep 2004 13:49:30 -0700 (PDT) Received: from baikonur.stro.at (baikonur.stro.at [213.239.196.228]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i81KnOW9006538 for ; Wed, 1 Sep 2004 13:49:25 -0700 Received: from localhost (localhost [127.0.0.1]) by baikonur.stro.at (Postfix) with ESMTP id 087725C065; Wed, 1 Sep 2004 22:49:14 +0200 (CEST) Received: from baikonur.stro.at ([127.0.0.1]) by localhost (baikonur [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 19512-07; Wed, 1 Sep 2004 22:49:13 +0200 (CEST) Received: from sputnik (M830P021.adsl.highway.telekom.at [62.47.135.181]) by baikonur.stro.at (Postfix) with ESMTP id A22D75C008; Wed, 1 Sep 2004 22:49:13 +0200 (CEST) Received: from localhost ([127.0.0.1] helo=localhost.localdomain) by sputnik with esmtp (Exim 4.34) id 1C2c2d-00077g-Sz; Wed, 01 Sep 2004 22:49:15 +0200 Subject: [patch 1/1] remove old ifdefs dmascc To: netdev@oss.sgi.com Cc: jgarzik@pobox.com, janitor@sternwelten.at From: janitor@sternwelten.at Date: Wed, 01 Sep 2004 22:49:15 +0200 Message-ID: X-Virus-Scanned: by Amavis (ClamAV) at stro.at X-archive-position: 8305 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: janitor@sternwelten.at Precedence: bulk X-list: netdev Content-Length: 832 Lines: 28 Patches to remove some old ifdefs. 
remove most of the #include kill compat cruft like #define ahd_pci_set_dma_mask pci_set_dma_mask Signed-off-by: Maximilian Attems --- linux-2.6.9-rc1-bk7-max/drivers/net/hamradio/dmascc.c | 1 - 1 files changed, 1 deletion(-) diff -puN drivers/net/hamradio/dmascc.c~remove-old-ifdefs-dmascc drivers/net/hamradio/dmascc.c --- linux-2.6.9-rc1-bk7/drivers/net/hamradio/dmascc.c~remove-old-ifdefs-dmascc 2004-08-31 17:42:11.000000000 +0200 +++ linux-2.6.9-rc1-bk7-max/drivers/net/hamradio/dmascc.c 2004-08-31 17:42:11.000000000 +0200 @@ -37,7 +37,6 @@ #include #include #include -#include #include #include #include _ From janitor@sternwelten.at Wed Sep 1 14:03:08 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Sep 2004 14:03:13 -0700 (PDT) Received: from baikonur.stro.at (baikonur.stro.at [213.239.196.228]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i81L372Z007121 for ; Wed, 1 Sep 2004 14:03:08 -0700 Received: from localhost (localhost [127.0.0.1]) by baikonur.stro.at (Postfix) with ESMTP id D225B5C065; Wed, 1 Sep 2004 23:02:57 +0200 (CEST) Received: from baikonur.stro.at ([127.0.0.1]) by localhost (baikonur [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 19160-06; Wed, 1 Sep 2004 23:02:57 +0200 (CEST) Received: from sputnik (M830P021.adsl.highway.telekom.at [62.47.135.181]) by baikonur.stro.at (Postfix) with ESMTP id 3B3025C008; Wed, 1 Sep 2004 23:02:57 +0200 (CEST) Received: from localhost ([127.0.0.1] helo=localhost.localdomain) by sputnik with esmtp (Exim 4.34) id 1C2cFv-0007nx-Fp; Wed, 01 Sep 2004 23:02:59 +0200 Subject: [patch 05/16] net/e100: replace schedule_timeout() with msleep() To: netdev@oss.sgi.com Cc: jgarzik@pobox.com, janitor@sternwelten.at From: janitor@sternwelten.at Date: Wed, 01 Sep 2004 23:02:59 +0200 Message-ID: X-Virus-Scanned: by Amavis (ClamAV) at stro.at X-archive-position: 8310 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: janitor@sternwelten.at 
Precedence: bulk X-list: netdev Content-Length: 2783 Lines: 93 I would appreciate any comments from the janitor@sternweltens list. This is one (of many) cases where I made a decision about replacing set_current_state(TASK_INTERRUPTIBLE); schedule_timeout(some_time); with msleep(jiffies_to_msecs(some_time)); msleep() is not exactly the same as the previous code, but I only did this replacement where I thought long delays were *desired*. If this is not the case here, then just disregard this patch. Thanks, Nish Description: Replace schedule_timeout() with msleep() to guarantee the task delays for the desired time. Signed-off-by: Maximilian Attems --- linux-2.6.9-rc1-bk7-max/drivers/net/e100.c | 15 +++++---------- 1 files changed, 5 insertions(+), 10 deletions(-) diff -puN drivers/net/e100.c~msleep-drivers_net_e100 drivers/net/e100.c --- linux-2.6.9-rc1-bk7/drivers/net/e100.c~msleep-drivers_net_e100 2004-09-01 19:35:28.000000000 +0200 +++ linux-2.6.9-rc1-bk7-max/drivers/net/e100.c 2004-09-01 19:35:28.000000000 +0200 @@ -623,8 +623,7 @@ static int e100_self_test(struct nic *ni writel(selftest | dma_addr, &nic->csr->port); e100_write_flush(nic); /* Wait 10 msec for self-test to complete */ - set_current_state(TASK_UNINTERRUPTIBLE); - schedule_timeout(HZ / 100 + 1); + msleep(10); /* Interrupts are enabled after self-test */ e100_disable_irq(nic); @@ -672,8 +671,7 @@ static void e100_eeprom_write(struct nic e100_write_flush(nic); udelay(4); } /* Wait 10 msec for cmd to complete */ - set_current_state(TASK_UNINTERRUPTIBLE); - schedule_timeout(HZ / 100 + 1); + msleep(10); /* Chip deselect */ writeb(0, &nic->csr->eeprom_ctrl_lo); @@ -1758,8 +1756,7 @@ static int e100_loopback_test(struct nic memset(skb->data, 0xFF, ETH_DATA_LEN); e100_xmit_frame(skb, nic->netdev); - set_current_state(TASK_UNINTERRUPTIBLE); - schedule_timeout(HZ / 100 + 1); + msleep(10); if(memcmp(nic->rx_to_clean->skb->data + sizeof(struct rfd), skb->data, ETH_DATA_LEN)) @@ -1845,8 +1842,7 @@ static void 
e100_get_regs(struct net_dev mdio_read(netdev, nic->mii.phy_id, i); memset(nic->mem->dump_buf, 0, sizeof(nic->mem->dump_buf)); e100_exec_cb(nic, NULL, e100_dump); - set_current_state(TASK_UNINTERRUPTIBLE); - schedule_timeout(HZ / 100 + 1); + msleep(10); memcpy(&buff[2 + E100_PHY_REGS], nic->mem->dump_buf, sizeof(nic->mem->dump_buf)); } @@ -2020,8 +2016,7 @@ static int e100_phys_id(struct net_devic if(!data || data > (u32)(MAX_SCHEDULE_TIMEOUT / HZ)) data = (u32)(MAX_SCHEDULE_TIMEOUT / HZ); mod_timer(&nic->blink_timer, jiffies); - set_current_state(TASK_INTERRUPTIBLE); - schedule_timeout(data * HZ); + msleep(data * 1000); del_timer_sync(&nic->blink_timer); mdio_write(netdev, nic->mii.phy_id, MII_LED_CONTROL, 0); _ From janitor@sternwelten.at Wed Sep 1 14:02:46 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Sep 2004 14:02:52 -0700 (PDT) Received: from baikonur.stro.at (baikonur.stro.at [213.239.196.228]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i81L2j0l007089 for ; Wed, 1 Sep 2004 14:02:46 -0700 Received: from localhost (localhost [127.0.0.1]) by baikonur.stro.at (Postfix) with ESMTP id B08D95C065; Wed, 1 Sep 2004 23:02:35 +0200 (CEST) Received: from baikonur.stro.at ([127.0.0.1]) by localhost (baikonur [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 19160-04; Wed, 1 Sep 2004 23:02:35 +0200 (CEST) Received: from sputnik (M830P021.adsl.highway.telekom.at [62.47.135.181]) by baikonur.stro.at (Postfix) with ESMTP id 3C3265C008; Wed, 1 Sep 2004 23:02:35 +0200 (CEST) Received: from localhost ([127.0.0.1] helo=localhost.localdomain) by sputnik with esmtp (Exim 4.34) id 1C2cFZ-0007l7-Gh; Wed, 01 Sep 2004 23:02:37 +0200 Subject: [patch 01/16] __FUNCTION__ string concatenation To: netdev@oss.sgi.com Cc: jgarzik@pobox.com, janitor@sternwelten.at From: janitor@sternwelten.at Date: Wed, 01 Sep 2004 23:02:37 +0200 Message-ID: X-Virus-Scanned: by Amavis (ClamAV) at stro.at X-archive-position: 8306 X-ecartis-version: Ecartis v1.0.0 Sender: 
netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: janitor@sternwelten.at Precedence: bulk X-list: netdev Content-Length: 1608 Lines: 43 I've replaced the __FUNCTION__ string concatenation with the %s placeholder and a printf parameter in drivers/net/wireless/prism54/islpci_mgt.h, as suggested in the TODO list. I don't have the hardware to do a run-time check. It should not pose any problems though. # This is a BitKeeper generated diff -Nru style patch. # # ChangeSet # 2004/07/04 22:08:37+02:00 drizzd@aon.at # __FUNCTION__ string concatenation is deprecated # # drivers/net/wireless/prism54/islpci_mgt.h # 2004/07/03 17:18:20+02:00 drizzd@aon.at +1 -1 # __FUNCTION__ string concatenation is deprecated # Signed-off-by: Maximilian Attems --- linux-2.6.9-rc1-bk7-max/drivers/net/wireless/prism54/islpci_mgt.h | 2 +- 1 files changed, 1 insertion(+), 1 deletion(-) diff -puN drivers/net/wireless/prism54/islpci_mgt.h~printk-net_wireless_prism54_islpci_mgt.h drivers/net/wireless/prism54/islpci_mgt.h --- linux-2.6.9-rc1-bk7/drivers/net/wireless/prism54/islpci_mgt.h~printk-net_wireless_prism54_islpci_mgt.h 2004-09-01 19:34:23.000000000 +0200 +++ linux-2.6.9-rc1-bk7-max/drivers/net/wireless/prism54/islpci_mgt.h 2004-09-01 19:34:23.000000000 +0200 @@ -31,7 +31,7 @@ #define K_DEBUG(f, m, args...) do { if(f & m) printk(KERN_DEBUG args); } while(0) #define DEBUG(f, args...) 
K_DEBUG(f, pc_debug, args) -#define TRACE(devname) K_DEBUG(SHOW_TRACING, VERBOSE, "%s: -> " __FUNCTION__ "()\n", devname) +#define TRACE(devname) K_DEBUG(SHOW_TRACING, VERBOSE, "%s: -> %s()\n", devname, __FUNCTION__) extern int pc_debug; #define init_wds 0 /* help compiler optimize away dead code */ _ From janitor@sternwelten.at Wed Sep 1 14:02:51 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Sep 2004 14:02:57 -0700 (PDT) Received: from baikonur.stro.at (baikonur.stro.at [213.239.196.228]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i81L2pwo007095 for ; Wed, 1 Sep 2004 14:02:51 -0700 Received: from localhost (localhost [127.0.0.1]) by baikonur.stro.at (Postfix) with ESMTP id 2E21D5C066; Wed, 1 Sep 2004 23:02:41 +0200 (CEST) Received: from baikonur.stro.at ([127.0.0.1]) by localhost (baikonur [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 26133-04; Wed, 1 Sep 2004 23:02:40 +0200 (CEST) Received: from sputnik (M830P021.adsl.highway.telekom.at [62.47.135.181]) by baikonur.stro.at (Postfix) with ESMTP id AEE925C008; Wed, 1 Sep 2004 23:02:40 +0200 (CEST) Received: from localhost ([127.0.0.1] helo=localhost.localdomain) by sputnik with esmtp (Exim 4.34) id 1C2cFf-0007lp-03; Wed, 01 Sep 2004 23:02:43 +0200 Subject: [patch 02/16] net/3c505: replace schedule_timeout() with msleep() To: netdev@oss.sgi.com Cc: jgarzik@pobox.com, janitor@sternwelten.at From: janitor@sternwelten.at Date: Wed, 01 Sep 2004 23:02:42 +0200 Message-ID: X-Virus-Scanned: by Amavis (ClamAV) at stro.at X-archive-position: 8307 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: janitor@sternwelten.at Precedence: bulk X-list: netdev Content-Length: 1440 Lines: 51 I would appreciate any comments from the janitor@sternweltens list. Thanks, Nish Description: Uses msleep() instead of schedule_timeout() so the task is guaranteed to delay the desired time. 
Signed-off-by: Nishanth Aravamudan Signed-off-by: Maximilian Attems --- linux-2.6.9-rc1-bk7-max/drivers/net/3c505.c | 6 ++---- 1 files changed, 2 insertions(+), 4 deletions(-) diff -puN drivers/net/3c505.c~msleep-drivers_net_3c505 drivers/net/3c505.c --- linux-2.6.9-rc1-bk7/drivers/net/3c505.c~msleep-drivers_net_3c505 2004-09-01 19:35:25.000000000 +0200 +++ linux-2.6.9-rc1-bk7-max/drivers/net/3c505.c 2004-09-01 19:35:25.000000000 +0200 @@ -1327,8 +1327,7 @@ static int __init elp_sense(struct net_d if (orig_HSR & DIR) { /* If HCR.DIR is up, we pull it down. HSR.DIR should follow. */ outb(0, dev->base_addr + PORT_CONTROL); - set_current_state(TASK_UNINTERRUPTIBLE); - schedule_timeout(30*HZ/100); + msleep(300); if (inb_status(addr) & DIR) { if (elp_debug > 0) printk(notfound_msg, 2); @@ -1337,8 +1336,7 @@ static int __init elp_sense(struct net_d } else { /* If HCR.DIR is down, we pull it up. HSR.DIR should follow. */ outb(DIR, dev->base_addr + PORT_CONTROL); - set_current_state(TASK_UNINTERRUPTIBLE); - schedule_timeout(30*HZ/100); + msleep(300); if (!(inb_status(addr) & DIR)) { if (elp_debug > 0) printk(notfound_msg, 3); _ From janitor@sternwelten.at Wed Sep 1 14:02:57 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Sep 2004 14:03:01 -0700 (PDT) Received: from baikonur.stro.at (baikonur.stro.at [213.239.196.228]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i81L2ukq007103 for ; Wed, 1 Sep 2004 14:02:57 -0700 Received: from localhost (localhost [127.0.0.1]) by baikonur.stro.at (Postfix) with ESMTP id ABB7B5C066; Wed, 1 Sep 2004 23:02:46 +0200 (CEST) Received: from baikonur.stro.at ([127.0.0.1]) by localhost (baikonur [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 19160-05; Wed, 1 Sep 2004 23:02:46 +0200 (CEST) Received: from sputnik (M830P021.adsl.highway.telekom.at [62.47.135.181]) by baikonur.stro.at (Postfix) with ESMTP id 3FC3C5C008; Wed, 1 Sep 2004 23:02:46 +0200 (CEST) Received: from localhost ([127.0.0.1] helo=localhost.localdomain) by sputnik 
with esmtp (Exim 4.34) id 1C2cFk-0007mX-Fq; Wed, 01 Sep 2004 23:02:48 +0200 Subject: [patch 03/16] net/appletalk: replace schedule_timeout() with msleep() To: netdev@oss.sgi.com Cc: jgarzik@pobox.com, janitor@sternwelten.at From: janitor@sternwelten.at Date: Wed, 01 Sep 2004 23:02:48 +0200 Message-ID: X-Virus-Scanned: by Amavis (ClamAV) at stro.at X-archive-position: 8308 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: janitor@sternwelten.at Precedence: bulk X-list: netdev Content-Length: 1299 Lines: 51 I would appreciate any comments from the janitor@sternweltens list. Thanks, Nish Description: Uses msleep() instead of schedule_timeout() so the task is guaranteed to delay the desired time. Signed-off-by: Nishanth Aravamudan Signed-off-by: Maximilian Attems --- linux-2.6.9-rc1-bk7-max/drivers/net/appletalk/ltpc.c | 6 ++---- 1 files changed, 2 insertions(+), 4 deletions(-) diff -puN drivers/net/appletalk/ltpc.c~msleep-drivers_net_appletalk_ltpc drivers/net/appletalk/ltpc.c --- linux-2.6.9-rc1-bk7/drivers/net/appletalk/ltpc.c~msleep-drivers_net_appletalk_ltpc 2004-09-01 19:35:26.000000000 +0200 +++ linux-2.6.9-rc1-bk7-max/drivers/net/appletalk/ltpc.c 2004-09-01 19:35:26.000000000 +0200 @@ -1109,8 +1109,7 @@ struct net_device * __init ltpc_probe(vo inb_p(io+1); inb_p(io+3); - set_current_state(TASK_UNINTERRUPTIBLE); - schedule_timeout(2*HZ/100); + msleep(20); inb_p(io+0); inb_p(io+2); @@ -1120,8 +1119,7 @@ struct net_device * __init ltpc_probe(vo inb_p(io+5); /* enable dma */ inb_p(io+6); /* tri-state interrupt line */ - set_current_state(TASK_UNINTERRUPTIBLE); - schedule_timeout(HZ); + msleep(1000); /* now, figure out which dma channel we're using, unless it's already been specified */ _ From janitor@sternwelten.at Wed Sep 1 14:03:02 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Sep 2004 14:03:07 -0700 (PDT) Received: from baikonur.stro.at (baikonur.stro.at [213.239.196.228]) by 
oss.sgi.com (8.13.0/8.13.0) with ESMTP id i81L32RC007113 for ; Wed, 1 Sep 2004 14:03:02 -0700 Received: from localhost (localhost [127.0.0.1]) by baikonur.stro.at (Postfix) with ESMTP id 376FC5C066; Wed, 1 Sep 2004 23:02:52 +0200 (CEST) Received: from baikonur.stro.at ([127.0.0.1]) by localhost (baikonur [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 26133-05; Wed, 1 Sep 2004 23:02:51 +0200 (CEST) Received: from sputnik (M830P021.adsl.highway.telekom.at [62.47.135.181]) by baikonur.stro.at (Postfix) with ESMTP id BDA2B5C008; Wed, 1 Sep 2004 23:02:51 +0200 (CEST) Received: from localhost ([127.0.0.1] helo=localhost.localdomain) by sputnik with esmtp (Exim 4.34) id 1C2cFp-0007nF-Vx; Wed, 01 Sep 2004 23:02:54 +0200 Subject: [patch 04/16] net/cs89x0: replace schedule_timeout() with msleep() To: netdev@oss.sgi.com Cc: jgarzik@pobox.com, janitor@sternwelten.at From: janitor@sternwelten.at Date: Wed, 01 Sep 2004 23:02:53 +0200 Message-ID: X-Virus-Scanned: by Amavis (ClamAV) at stro.at X-archive-position: 8309 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: janitor@sternwelten.at Precedence: bulk X-list: netdev Content-Length: 1375 Lines: 52 I would appreciate any comments from the janitor@sternweltens list. This is one (of many) cases where I made a decision about replacing set_current_state(TASK_INTERRUPTIBLE); schedule_timeout(some_time); with msleep(jiffies_to_msecs(some_time)); msleep() is not exactly the same as the previous code, but I only did this replacement where I thought long delays were *desired*. If this is not the case here, then just disregard this patch. Thanks, Nish Description: Replace schedule_timeout() with msleep() to guarantee the task delays for the desired time. 
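A note on the arithmetic behind these conversions, as a rough user-space sketch rather than kernel code. The open-coded form "ms * HZ / 1000" truncates, so the number of jiffies actually requested depends on the tick rate and can come out shorter than the comment next to it says. The two helper names below are made up for illustration and are not kernel API; the tick rate is passed as a parameter (hz) only so the effect can be shown for different values.

```c
/*
 * Illustration only: hypothetical helpers, not part of any kernel API.
 * They mimic the integer arithmetic of converting a delay in
 * milliseconds into timer ticks (jiffies) for a given tick rate.
 */

/* Truncating conversion, as in the open-coded "ms * HZ / 1000". */
static unsigned long ms_to_jiffies_trunc(unsigned long ms, unsigned long hz)
{
	return ms * hz / 1000;
}

/* Rounding-up conversion: never asks for less time than requested. */
static unsigned long ms_to_jiffies_roundup(unsigned long ms, unsigned long hz)
{
	return (ms * hz + 999) / 1000;
}
```

With hz = 100, a 30 ms request gives 3 ticks either way, but a 1 ms request truncates to 0 ticks, which is no delay at all; rounding up gives 1 tick. With hz = 1000 both forms agree. Taking the delay in milliseconds, as msleep() does, keeps this arithmetic out of the driver.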
Signed-off-by: Maximilian Attems --- linux-2.6.9-rc1-bk7-max/drivers/net/cs89x0.c | 3 +-- 1 files changed, 1 insertion(+), 2 deletions(-) diff -puN drivers/net/cs89x0.c~msleep-drivers_net_cs89x0 drivers/net/cs89x0.c --- linux-2.6.9-rc1-bk7/drivers/net/cs89x0.c~msleep-drivers_net_cs89x0 2004-09-01 19:35:27.000000000 +0200 +++ linux-2.6.9-rc1-bk7-max/drivers/net/cs89x0.c 2004-09-01 19:35:27.000000000 +0200 @@ -891,8 +891,7 @@ void __init reset_chip(struct net_devic writereg(dev, PP_SelfCTL, readreg(dev, PP_SelfCTL) | POWER_ON_RESET); /* wait 30 ms */ - current->state = TASK_INTERRUPTIBLE; - schedule_timeout(30*HZ/1000); + msleep(30); if (lp->chip_type != CS8900) { /* Hardware problem requires PNP registers to be reconfigured after a reset */ _ From janitor@sternwelten.at Wed Sep 1 14:03:14 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Sep 2004 14:03:18 -0700 (PDT) Received: from baikonur.stro.at (baikonur.stro.at [213.239.196.228]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i81L3DOU007127 for ; Wed, 1 Sep 2004 14:03:13 -0700 Received: from localhost (localhost [127.0.0.1]) by baikonur.stro.at (Postfix) with ESMTP id 2D9985C067; Wed, 1 Sep 2004 23:03:03 +0200 (CEST) Received: from baikonur.stro.at ([127.0.0.1]) by localhost (baikonur [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 26133-06; Wed, 1 Sep 2004 23:03:02 +0200 (CEST) Received: from sputnik (M830P021.adsl.highway.telekom.at [62.47.135.181]) by baikonur.stro.at (Postfix) with ESMTP id AA7EE5C008; Wed, 1 Sep 2004 23:03:02 +0200 (CEST) Received: from localhost ([127.0.0.1] helo=localhost.localdomain) by sputnik with esmtp (Exim 4.34) id 1C2cG0-0007of-Vj; Wed, 01 Sep 2004 23:03:05 +0200 Subject: [patch 06/16] net/e1000_osdep: replace schedule_timeout() with msleep() To: netdev@oss.sgi.com Cc: jgarzik@pobox.com, janitor@sternwelten.at From: janitor@sternwelten.at Date: Wed, 01 Sep 2004 23:03:04 +0200 Message-ID: X-Virus-Scanned: by Amavis (ClamAV) at stro.at X-archive-position: 8311 
X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: janitor@sternwelten.at Precedence: bulk X-list: netdev Content-Length: 1764 Lines: 59 On Tue, Jul 27, 2004 at 04:00:52AM +0100, Matthew Wilcox wrote: > On Mon, Jul 26, 2004 at 05:00:01PM -0700, Nishanth Aravamudan wrote: > > I would appreciate any comments from the janitor@sternweltens list. > > > > > > > > Description: Replace schedule_timeout() with msleep() to guarantee the > > task delays for the desired time. > > } else { \ > > - set_current_state(TASK_UNINTERRUPTIBLE); \ > > - schedule_timeout((x * HZ)/1000 + 2); \ > > + msleep(x); \ > > } } while(0) > > Looks much better than the previous code. It's actually possible to do > better, though. Simply change to: > > #define msec_delay(x) msleep(x) > > If msleep() ends up scheduling, the bad attempt to sleep will be caught > by schedule(). There's no need to do the check in the driver. Thanks for the tip and here is this change: Signed-off-by: Maximilian Attems --- linux-2.6.9-rc1-bk7-max/drivers/net/e1000/e1000_osdep.h | 8 +------- 1 files changed, 1 insertion(+), 7 deletions(-) diff -puN drivers/net/e1000/e1000_osdep.h~msleep-drivers_net_e1000_osdep drivers/net/e1000/e1000_osdep.h --- linux-2.6.9-rc1-bk7/drivers/net/e1000/e1000_osdep.h~msleep-drivers_net_e1000_osdep 2004-09-01 19:35:28.000000000 +0200 +++ linux-2.6.9-rc1-bk7-max/drivers/net/e1000/e1000_osdep.h 2004-09-01 19:35:28.000000000 +0200 @@ -42,13 +42,7 @@ #include #ifndef msec_delay -#define msec_delay(x) do { if(in_interrupt()) { \ - /* Don't mdelay in interrupt context! 
*/ \ - BUG(); \ - } else { \ - set_current_state(TASK_UNINTERRUPTIBLE); \ - schedule_timeout((x * HZ)/1000 + 2); \ - } } while(0) +#define msec_delay(x) msleep(x) #endif #define PCI_COMMAND_REGISTER PCI_COMMAND _ From janitor@sternwelten.at Wed Sep 1 14:03:19 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Sep 2004 14:03:23 -0700 (PDT) Received: from baikonur.stro.at (baikonur.stro.at [213.239.196.228]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i81L3Ivw007137 for ; Wed, 1 Sep 2004 14:03:19 -0700 Received: from localhost (localhost [127.0.0.1]) by baikonur.stro.at (Postfix) with ESMTP id 9FFA55C068; Wed, 1 Sep 2004 23:03:08 +0200 (CEST) Received: from baikonur.stro.at ([127.0.0.1]) by localhost (baikonur [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 19160-07; Wed, 1 Sep 2004 23:03:08 +0200 (CEST) Received: from sputnik (M830P021.adsl.highway.telekom.at [62.47.135.181]) by baikonur.stro.at (Postfix) with ESMTP id 3C4ED5C008; Wed, 1 Sep 2004 23:03:08 +0200 (CEST) Received: from localhost ([127.0.0.1] helo=localhost.localdomain) by sputnik with esmtp (Exim 4.34) id 1C2cG6-0007pN-FM; Wed, 01 Sep 2004 23:03:10 +0200 Subject: [patch 07/16] net/ewrk3: replace schedule_timeout() with msleep() To: netdev@oss.sgi.com Cc: jgarzik@pobox.com, janitor@sternwelten.at From: janitor@sternwelten.at Date: Wed, 01 Sep 2004 23:03:10 +0200 Message-ID: X-Virus-Scanned: by Amavis (ClamAV) at stro.at X-archive-position: 8312 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: janitor@sternwelten.at Precedence: bulk X-list: netdev Content-Length: 938 Lines: 37 I would appreciate any comments from the janitor@sternweltens list. Description: Replace schedule_timeout() with msleep() to guarantee the task delays for the desired time. 
Signed-off-by: Maximilian Attems --- linux-2.6.9-rc1-bk7-max/drivers/net/ewrk3.c | 3 +-- 1 files changed, 1 insertion(+), 2 deletions(-) diff -puN drivers/net/ewrk3.c~msleep-drivers_net_ewrk3 drivers/net/ewrk3.c --- linux-2.6.9-rc1-bk7/drivers/net/ewrk3.c~msleep-drivers_net_ewrk3 2004-09-01 19:35:29.000000000 +0200 +++ linux-2.6.9-rc1-bk7-max/drivers/net/ewrk3.c 2004-09-01 19:35:29.000000000 +0200 @@ -1681,8 +1681,7 @@ static int ewrk3_ethtool_ioctl(struct ne /* Wait a little while */ spin_unlock_irqrestore(&lp->hw_lock, flags); - set_current_state(TASK_UNINTERRUPTIBLE); - schedule_timeout(HZ>>2); + msleep(250); spin_lock_irqsave(&lp->hw_lock, flags); /* Exit if we got a signal */ _ From janitor@sternwelten.at Wed Sep 1 14:03:25 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Sep 2004 14:03:29 -0700 (PDT) Received: from baikonur.stro.at (baikonur.stro.at [213.239.196.228]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i81L3ObA007145 for ; Wed, 1 Sep 2004 14:03:24 -0700 Received: from localhost (localhost [127.0.0.1]) by baikonur.stro.at (Postfix) with ESMTP id 67CE25C066; Wed, 1 Sep 2004 23:03:14 +0200 (CEST) Received: from baikonur.stro.at ([127.0.0.1]) by localhost (baikonur [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 26133-07; Wed, 1 Sep 2004 23:03:14 +0200 (CEST) Received: from sputnik (M830P021.adsl.highway.telekom.at [62.47.135.181]) by baikonur.stro.at (Postfix) with ESMTP id B790E5C008; Wed, 1 Sep 2004 23:03:13 +0200 (CEST) Received: from localhost ([127.0.0.1] helo=localhost.localdomain) by sputnik with esmtp (Exim 4.34) id 1C2cGB-0007q6-VQ; Wed, 01 Sep 2004 23:03:16 +0200 Subject: [patch 08/16] net/gt96100eth: replace schedule_timeout() with msleep() To: netdev@oss.sgi.com Cc: jgarzik@pobox.com, janitor@sternwelten.at From: janitor@sternwelten.at Date: Wed, 01 Sep 2004 23:03:15 +0200 Message-ID: X-Virus-Scanned: by Amavis (ClamAV) at stro.at X-archive-position: 8313 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com 
Errors-to: netdev-bounce@oss.sgi.com X-original-sender: janitor@sternwelten.at Precedence: bulk X-list: netdev Content-Length: 3299 Lines: 109 I would appreciate any comments from the janitor@sternweltens list. This is one (of many) cases where I made a decision about replacing set_current_state(TASK_INTERRUPTIBLE); schedule_timeout(some_time); with msleep(jiffies_to_msecs(some_time)); msleep() is not exactly the same as the previous code, but I only did this replacement where I thought long delays were *desired*. If this is not the case here, then just disregard this patch. Thanks, Nish PS. In this patch, the last delay is a bit confusing. It, in code, delayed for 1 msec, but the comment said 20 msecs, does anyone know which it should be? Description: Replace gt96100_delay() with msleep() to guarantee the task delays for the desired time. Remove the definition of gt96100_delay(). Signed-off-by: Nishanth Aravamudan Signed-off-by: Maximilian Attems --- linux-2.6.9-rc1-bk7-max/drivers/net/gt96100eth.c | 19 ++++--------------- 1 files changed, 4 insertions(+), 15 deletions(-) diff -puN drivers/net/gt96100eth.c~msleep-drivers_net_gt96100eth drivers/net/gt96100eth.c --- linux-2.6.9-rc1-bk7/drivers/net/gt96100eth.c~msleep-drivers_net_gt96100eth 2004-09-01 19:35:29.000000000 +0200 +++ linux-2.6.9-rc1-bk7-max/drivers/net/gt96100eth.c 2004-09-01 19:35:29.000000000 +0200 @@ -59,7 +59,6 @@ // prototypes static void* dmaalloc(size_t size, dma_addr_t *dma_handle); static void dmafree(size_t size, void *vaddr); -static void gt96100_delay(int msec); static int gt96100_add_hash_entry(struct net_device *dev, unsigned char* addr); static void read_mib_counters(struct gt96100_private *gp); @@ -183,16 +182,6 @@ static void dmafree(size_t size, void *v free_pages((unsigned long)vaddr, get_order(size)); } -static void gt96100_delay(int ms) -{ - if (in_interrupt()) - return; - else { - current->state = TASK_INTERRUPTIBLE; - schedule_timeout(ms*HZ/1000); - } -} - static int 
parse_mac_addr(struct net_device *dev, char* macstr) { @@ -238,7 +227,7 @@ read_MII(int phy_addr, u32 reg) // wait for last operation to complete while (GT96100_READ(GT96100_ETH_SMI_REG) & smirBusy) { // snooze for 1 msec and check again - gt96100_delay(1); + msleep(1); if (--timedout == 0) { printk(KERN_ERR "%s: busy timeout!!\n", __FUNCTION__); @@ -252,7 +241,7 @@ read_MII(int phy_addr, u32 reg) // wait for read to complete while (!((smir = GT96100_READ(GT96100_ETH_SMI_REG)) & smirReadValid)) { // snooze for 1 msec and check again - gt96100_delay(1); + msleep(1); if (--timedout == 0) { printk(KERN_ERR "%s: timeout!!\n", __FUNCTION__); @@ -304,7 +293,7 @@ write_MII(int phy_addr, u32 reg, u16 dat // wait for last operation to complete while (GT96100_READ(GT96100_ETH_SMI_REG) & smirBusy) { // snooze for 1 msec and check again - gt96100_delay(1); + msleep(1); if (--timedout == 0) { printk(KERN_ERR "%s: busy timeout!!\n", __FUNCTION__); @@ -528,7 +517,7 @@ abort(struct net_device *dev, u32 abort_ // wait for abort to complete while (GT96100ETH_READ(gp, GT96100_ETH_SDMA_COMM) & abort_bits) { // snooze for 20 msec and check again - gt96100_delay(1); + msleep(20); // was gt96100_delay(1) -> should it be 20 or 1? 
if (--timedout == 0) { err("%s: timeout!!\n", __FUNCTION__); _ From janitor@sternwelten.at Wed Sep 1 14:03:30 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Sep 2004 14:03:35 -0700 (PDT) Received: from baikonur.stro.at (baikonur.stro.at [213.239.196.228]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i81L3UFA007155 for ; Wed, 1 Sep 2004 14:03:30 -0700 Received: from localhost (localhost [127.0.0.1]) by baikonur.stro.at (Postfix) with ESMTP id 16B385C065; Wed, 1 Sep 2004 23:03:20 +0200 (CEST) Received: from baikonur.stro.at ([127.0.0.1]) by localhost (baikonur [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 19160-08; Wed, 1 Sep 2004 23:03:19 +0200 (CEST) Received: from sputnik (M830P021.adsl.highway.telekom.at [62.47.135.181]) by baikonur.stro.at (Postfix) with ESMTP id 407E75C008; Wed, 1 Sep 2004 23:03:19 +0200 (CEST) Received: from localhost ([127.0.0.1] helo=localhost.localdomain) by sputnik with esmtp (Exim 4.34) id 1C2cGH-0007qo-Fw; Wed, 01 Sep 2004 23:03:21 +0200 Subject: [patch 09/16] ixgb/ixgb_osdep: replace schedule_timeout() with msleep() To: netdev@oss.sgi.com Cc: jgarzik@pobox.com, janitor@sternwelten.at From: janitor@sternwelten.at Date: Wed, 01 Sep 2004 23:03:21 +0200 Message-ID: X-Virus-Scanned: by Amavis (ClamAV) at stro.at X-archive-position: 8314 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: janitor@sternwelten.at Precedence: bulk X-list: netdev Content-Length: 1197 Lines: 44 I would appreciate any comments from the janitor@sternweltens list. Description: Replace schedule_timeout() with msleep() to guarantee the task delays for the desired time. Redefine msec_delay(x) to directly call msleep(x). 
Signed-off-by: Nishanth Aravamudan Signed-off-by: Maximilian Attems --- linux-2.6.9-rc1-bk7-max/drivers/net/ixgb/ixgb_osdep.h | 8 +------- 1 files changed, 1 insertion(+), 7 deletions(-) diff -puN drivers/net/ixgb/ixgb_osdep.h~msleep-drivers_net_ixgb_osdep drivers/net/ixgb/ixgb_osdep.h --- linux-2.6.9-rc1-bk7/drivers/net/ixgb/ixgb_osdep.h~msleep-drivers_net_ixgb_osdep 2004-09-01 19:35:34.000000000 +0200 +++ linux-2.6.9-rc1-bk7-max/drivers/net/ixgb/ixgb_osdep.h 2004-09-01 19:35:34.000000000 +0200 @@ -41,13 +41,7 @@ #include #ifndef msec_delay -#define msec_delay(x) do { if(in_interrupt()) { \ - /* Don't mdelay in interrupt context! */ \ - BUG(); \ - } else { \ - set_current_state(TASK_UNINTERRUPTIBLE); \ - schedule_timeout((x * HZ)/1000 + 2); \ - } } while(0) +#define msec_delay(x) msleep(x) #endif #define PCI_COMMAND_REGISTER PCI_COMMAND _ From janitor@sternwelten.at Wed Sep 1 14:03:35 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Sep 2004 14:03:39 -0700 (PDT) Received: from baikonur.stro.at (baikonur.stro.at [213.239.196.228]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i81L3Z0C007170 for ; Wed, 1 Sep 2004 14:03:35 -0700 Received: from localhost (localhost [127.0.0.1]) by baikonur.stro.at (Postfix) with ESMTP id 209A35C065; Wed, 1 Sep 2004 23:03:25 +0200 (CEST) Received: from baikonur.stro.at ([127.0.0.1]) by localhost (baikonur [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 26133-08; Wed, 1 Sep 2004 23:03:24 +0200 (CEST) Received: from sputnik (M830P021.adsl.highway.telekom.at [62.47.135.181]) by baikonur.stro.at (Postfix) with ESMTP id A468B5C008; Wed, 1 Sep 2004 23:03:24 +0200 (CEST) Received: from localhost ([127.0.0.1] helo=localhost.localdomain) by sputnik with esmtp (Exim 4.34) id 1C2cGN-0007rW-0B; Wed, 01 Sep 2004 23:03:27 +0200 Subject: [patch 10/16] net/mac89x0: replace schedule_timeout() with msleep() To: netdev@oss.sgi.com Cc: jgarzik@pobox.com, janitor@sternwelten.at From: janitor@sternwelten.at Date: Wed, 01 Sep 2004 23:03:26 
+0200 Message-ID: X-Virus-Scanned: by Amavis (ClamAV) at stro.at X-archive-position: 8315 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: janitor@sternwelten.at Precedence: bulk X-list: netdev Content-Length: 1387 Lines: 53 I would appreciate any comments from the janitor@sternweltens list. This is one (of many) cases where I made a decision about replacing set_current_state(TASK_INTERRUPTIBLE); schedule_timeout(some_time); with msleep(jiffies_to_msecs(some_time)); msleep() is not exactly the same as the previous code, but I only did this replacement where I thought long delays were *desired*. If this is not the case here, then just disregard this patch. Thanks, Nish Description: Replace schedule_timeout() with msleep() to guarantee the task delays for the desired time. Signed-off-by: Nishanth Aravamudan Signed-off-by: Maximilian Attems --- linux-2.6.9-rc1-bk7-max/drivers/net/mac89x0.c | 3 +-- 1 files changed, 1 insertion(+), 2 deletions(-) diff -puN drivers/net/mac89x0.c~msleep-drivers_net_max89x0 drivers/net/mac89x0.c --- linux-2.6.9-rc1-bk7/drivers/net/mac89x0.c~msleep-drivers_net_max89x0 2004-09-01 19:35:34.000000000 +0200 +++ linux-2.6.9-rc1-bk7-max/drivers/net/mac89x0.c 2004-09-01 19:35:34.000000000 +0200 @@ -308,8 +308,7 @@ void __init reset_chip(struct net_device writereg(dev, PP_SelfCTL, readreg(dev, PP_SelfCTL) | POWER_ON_RESET); /* wait 30 ms */ - current->state = TASK_INTERRUPTIBLE; - schedule_timeout(30*HZ/1000); + msleep(30); /* Wait until the chip is reset */ reset_start_time = jiffies; _ From janitor@sternwelten.at Wed Sep 1 14:03:41 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Sep 2004 14:03:46 -0700 (PDT) Received: from baikonur.stro.at (baikonur.stro.at [213.239.196.228]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i81L3ej7007189 for ; Wed, 1 Sep 2004 14:03:41 -0700 Received: from localhost (localhost [127.0.0.1]) by baikonur.stro.at (Postfix) with ESMTP id 
8A3A25C065; Wed, 1 Sep 2004 23:03:30 +0200 (CEST) Received: from baikonur.stro.at ([127.0.0.1]) by localhost (baikonur [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 19160-09; Wed, 1 Sep 2004 23:03:30 +0200 (CEST) Received: from sputnik (M830P021.adsl.highway.telekom.at [62.47.135.181]) by baikonur.stro.at (Postfix) with ESMTP id 2119F5C008; Wed, 1 Sep 2004 23:03:30 +0200 (CEST) Received: from localhost ([127.0.0.1] helo=localhost.localdomain) by sputnik with esmtp (Exim 4.34) id 1C2cGS-0007sE-Ej; Wed, 01 Sep 2004 23:03:32 +0200 Subject: [patch 11/16] net/ni65: replace schedule_timeout() with msleep() To: netdev@oss.sgi.com Cc: jgarzik@pobox.com, janitor@sternwelten.at From: janitor@sternwelten.at Date: Wed, 01 Sep 2004 23:03:32 +0200 Message-ID: X-Virus-Scanned: by Amavis (ClamAV) at stro.at X-archive-position: 8316 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: janitor@sternwelten.at Precedence: bulk X-list: netdev Content-Length: 1021 Lines: 38 I would appreciate any comments from the janitor@sternweltens list. Description: Replace schedule_timeout() with msleep() to guarantee the task delays for the desired time. 
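The patch below replaces a tick count written in terms of HZ with a literal millisecond value. A small user-space sketch of that conversion, using a hypothetical helper name and passing hz as a parameter instead of the kernel's compile-time HZ:

```c
#include <assert.h>

/* Convert a tick count to milliseconds for a given tick rate (ticks per second). */
static unsigned long ticks_to_ms(unsigned long ticks, unsigned long hz)
{
	assert(hz > 0);
	return (ticks * 1000) / hz;
}
```

For HZ/50 this gives 20 ms at tick rates of 100, 250, and 1000, which matches the msleep(20) in the diff.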
Signed-off-by: Nishanth Aravamudan Signed-off-by: Maximilian Attems --- linux-2.6.9-rc1-bk7-max/drivers/net/ni65.c | 3 +-- 1 files changed, 1 insertion(+), 2 deletions(-) diff -puN drivers/net/ni65.c~msleep-drivers_net_ni65 drivers/net/ni65.c --- linux-2.6.9-rc1-bk7/drivers/net/ni65.c~msleep-drivers_net_ni65 2004-09-01 19:35:35.000000000 +0200 +++ linux-2.6.9-rc1-bk7-max/drivers/net/ni65.c 2004-09-01 19:35:35.000000000 +0200 @@ -526,8 +526,7 @@ static int __init ni65_probe1(struct net ni65_init_lance(p,dev->dev_addr,0,0); irq_mask = probe_irq_on(); writereg(CSR0_INIT|CSR0_INEA,CSR0); /* trigger interrupt */ - set_current_state(TASK_UNINTERRUPTIBLE); - schedule_timeout(HZ/50); + msleep(20); dev->irq = probe_irq_off(irq_mask); if(!dev->irq) { _ From janitor@sternwelten.at Wed Sep 1 14:03:46 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Sep 2004 14:03:52 -0700 (PDT) Received: from baikonur.stro.at (baikonur.stro.at [213.239.196.228]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i81L3kMm007301 for ; Wed, 1 Sep 2004 14:03:46 -0700 Received: from localhost (localhost [127.0.0.1]) by baikonur.stro.at (Postfix) with ESMTP id 0C0875C065; Wed, 1 Sep 2004 23:03:36 +0200 (CEST) Received: from baikonur.stro.at ([127.0.0.1]) by localhost (baikonur [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 26133-09; Wed, 1 Sep 2004 23:03:35 +0200 (CEST) Received: from sputnik (M830P021.adsl.highway.telekom.at [62.47.135.181]) by baikonur.stro.at (Postfix) with ESMTP id 96E9C5C008; Wed, 1 Sep 2004 23:03:35 +0200 (CEST) Received: from localhost ([127.0.0.1] helo=localhost.localdomain) by sputnik with esmtp (Exim 4.34) id 1C2cGX-0007sw-WC; Wed, 01 Sep 2004 23:03:38 +0200 Subject: [patch 12/16] net/ns83820: replace schedule_timeout() with msleep() To: netdev@oss.sgi.com Cc: jgarzik@pobox.com, janitor@sternwelten.at From: janitor@sternwelten.at Date: Wed, 01 Sep 2004 23:03:37 +0200 Message-ID: X-Virus-Scanned: by Amavis (ClamAV) at stro.at X-archive-position: 8317 
X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: janitor@sternwelten.at Precedence: bulk X-list: netdev Content-Length: 1220 Lines: 45 On Tue, Jul 27, 2004 at 02:34:08PM -0400, Benjamin LaHaise wrote: > The commit message doesn't seem correspond to the actual patch. What > are you trying to "fix"? Thanks for catching this - it was another typo. Please find the corrected patch below. -Nish Description: Uses msleep() instead of schedule_timeout() to guarantee the task delays the desired time. Signed-off-by: Nishanth Aravamudan Signed-off-by: Maximilian Attems --- linux-2.6.9-rc1-bk7-max/drivers/net/ns83820.c | 3 +-- 1 files changed, 1 insertion(+), 2 deletions(-) diff -puN drivers/net/ns83820.c~msleep-drivers_net_ns83820 drivers/net/ns83820.c --- linux-2.6.9-rc1-bk7/drivers/net/ns83820.c~msleep-drivers_net_ns83820 2004-09-01 19:35:35.000000000 +0200 +++ linux-2.6.9-rc1-bk7-max/drivers/net/ns83820.c 2004-09-01 19:35:35.000000000 +0200 @@ -1960,8 +1960,7 @@ static int __devinit ns83820_init_one(st if (reset_phy) { printk(KERN_INFO "%s: resetting phy\n", ndev->name); writel(dev->CFG_cache | CFG_PHY_RST, dev->base + CFG); - set_current_state(TASK_UNINTERRUPTIBLE); - schedule_timeout((HZ+99)/100); + msleep(10); writel(dev->CFG_cache, dev->base + CFG); } _ From janitor@sternwelten.at Wed Sep 1 14:03:52 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Sep 2004 14:04:00 -0700 (PDT) Received: from baikonur.stro.at (baikonur.stro.at [213.239.196.228]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i81L3qlf007420 for ; Wed, 1 Sep 2004 14:03:52 -0700 Received: from localhost (localhost [127.0.0.1]) by baikonur.stro.at (Postfix) with ESMTP id ED3A75C065; Wed, 1 Sep 2004 23:03:41 +0200 (CEST) Received: from baikonur.stro.at ([127.0.0.1]) by localhost (baikonur [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 19160-10; Wed, 1 Sep 2004 23:03:41 +0200 (CEST) Received: from sputnik 
(M830P021.adsl.highway.telekom.at [62.47.135.181]) by baikonur.stro.at (Postfix) with ESMTP id 1A0BB5C008; Wed, 1 Sep 2004 23:03:41 +0200 (CEST) Received: from localhost ([127.0.0.1] helo=localhost.localdomain) by sputnik with esmtp (Exim 4.34) id 1C2cGd-0007te-EC; Wed, 01 Sep 2004 23:03:43 +0200 Subject: [patch 13/16] net/s2io.c: replace schedule_timeout() with msleep() To: netdev@oss.sgi.com Cc: jgarzik@pobox.com, janitor@sternwelten.at From: janitor@sternwelten.at Date: Wed, 01 Sep 2004 23:03:43 +0200 Message-ID: X-Virus-Scanned: by Amavis (ClamAV) at stro.at X-archive-position: 8318 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: janitor@sternwelten.at Precedence: bulk X-list: netdev Content-Length: 4651 Lines: 176 I would appreciate any comments from the janitor@sternweltens list. Description: Use msleep() instead of schedule_timeout() to guarantee the task delays for the desired time. Signed-off-by: Nishanth Aravamudan Signed-off-by: Maximilian Attems --- linux-2.6.9-rc1-bk7-max/drivers/net/s2io.c | 45 +++++++++-------------------- 1 files changed, 15 insertions(+), 30 deletions(-) diff -puN drivers/net/s2io.c~msleep-drivers_net_s2io drivers/net/s2io.c --- linux-2.6.9-rc1-bk7/drivers/net/s2io.c~msleep-drivers_net_s2io 2004-09-01 19:35:36.000000000 +0200 +++ linux-2.6.9-rc1-bk7-max/drivers/net/s2io.c 2004-09-01 19:35:36.000000000 +0200 @@ -555,8 +555,7 @@ static int initNic(struct s2io_nic *nic) val64 = 0; writeq(val64, &bar0->sw_reset); val64 = readq(&bar0->sw_reset); - set_current_state(TASK_UNINTERRUPTIBLE); - schedule_timeout(HZ / 2); + msleep(500); /* Enable Receiving broadcasts */ val64 = readq(&bar0->mac_cfg); @@ -803,8 +802,7 @@ static int initNic(struct s2io_nic *nic) dev->name); return -1; } - set_current_state(TASK_UNINTERRUPTIBLE); - schedule_timeout(HZ / 20); + msleep(50); time++; } @@ -838,8 +836,7 @@ static int initNic(struct s2io_nic *nic) return -1; } time++; - 
set_current_state(TASK_UNINTERRUPTIBLE); - schedule_timeout(HZ / 20); + msleep(50); } /* Initializing proper values as Pause threshold into all @@ -1182,8 +1179,7 @@ static int startNic(struct s2io_nic *nic writeq(val64, &bar0->mc_rldram_mrs); val64 = readq(&bar0->mc_rldram_mrs); - set_current_state(TASK_UNINTERRUPTIBLE); - schedule_timeout(HZ / 10); /* Delay by around 100 ms. */ + msleep(100); /* Enabling ECC Protection. */ val64 = readq(&bar0->adapter_control); @@ -1891,8 +1887,7 @@ int waitForCmdComplete(nic_t * sp) ret = SUCCESS; break; } - set_current_state(TASK_UNINTERRUPTIBLE); - schedule_timeout(HZ / 20); + msleep(50); if (cnt++ > 10) break; } @@ -1931,15 +1926,13 @@ void s2io_reset(nic_t * sp) * As of now I'am just giving a 250ms delay and hoping that the * PCI write to sw_reset register is done by this time. */ - set_current_state(TASK_UNINTERRUPTIBLE); - schedule_timeout(HZ / 4); + msleep(250); /* Restore the PCI state saved during initializarion. */ pci_restore_state(sp->pdev, sp->config_space); s2io_init_pci(sp); - set_current_state(TASK_UNINTERRUPTIBLE); - schedule_timeout(HZ / 4); + msleep(250); /* SXE-002: Configure link and activity LED to turn it off */ subid = sp->pdev->subsystem_device; @@ -2157,8 +2150,7 @@ int s2io_close(struct net_device *dev) /* If the device tasklet is running, wait till its done before killing it */ while (atomic_read(&(sp->tasklet_status))) { - set_current_state(TASK_UNINTERRUPTIBLE); - schedule_timeout(HZ / 10); + msleep(100); } tasklet_kill(&sp->task); @@ -2169,8 +2161,7 @@ int s2io_close(struct net_device *dev) break; } - set_current_state(TASK_UNINTERRUPTIBLE); - schedule_timeout(HZ / 20); + msleep(50); cnt++; if (cnt == 10) { DBG_PRINT(ERR_DBG, @@ -2943,8 +2934,7 @@ static u32 readEeprom(nic_t * sp, int of data = I2C_CONTROL_GET_DATA(val64); break; } - set_current_state(TASK_UNINTERRUPTIBLE); - schedule_timeout(HZ / 20); + msleep(50); exit_cnt++; } @@ -2983,8 +2973,7 @@ static int writeEeprom(nic_t * sp, int o ret = 
0; break; } - set_current_state(TASK_UNINTERRUPTIBLE); - schedule_timeout(HZ / 20); + msleep(50); exit_cnt++; } @@ -3256,8 +3245,7 @@ static int s2io_bistTest(nic_t * sp, uin ret = 0; break; } - set_current_state(TASK_UNINTERRUPTIBLE); - schedule_timeout(HZ / 10); + msleep(100); cnt++; } @@ -3356,8 +3344,7 @@ static int s2io_rldramTest(nic_t * sp, u val64 = readq(&bar0->mc_rldram_test_ctrl); if (val64 & MC_RLDRAM_TEST_DONE) break; - set_current_state(TASK_UNINTERRUPTIBLE); - schedule_timeout(HZ / 5); + msleep(200); } if (cnt == 5) @@ -3373,8 +3360,7 @@ static int s2io_rldramTest(nic_t * sp, u val64 = readq(&bar0->mc_rldram_test_ctrl); if (val64 & MC_RLDRAM_TEST_DONE) break; - set_current_state(TASK_UNINTERRUPTIBLE); - schedule_timeout(HZ / 2); + msleep(500); } if (cnt == 5) @@ -3711,8 +3697,7 @@ static void s2io_set_link(unsigned long /* Allow a small delay for the NICs self initiated * cleanup to complete. */ - set_current_state(TASK_UNINTERRUPTIBLE); - schedule_timeout(HZ / 10); + msleep(100); val64 = readq(&bar0->adapter_status); if (verify_xena_quiescence(val64, nic->device_enabled_once)) { _ From janitor@sternwelten.at Wed Sep 1 14:03:58 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Sep 2004 14:04:06 -0700 (PDT) Received: from baikonur.stro.at (baikonur.stro.at [213.239.196.228]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i81L3vCR007517 for ; Wed, 1 Sep 2004 14:03:58 -0700 Received: from localhost (localhost [127.0.0.1]) by baikonur.stro.at (Postfix) with ESMTP id 122495C066; Wed, 1 Sep 2004 23:03:47 +0200 (CEST) Received: from baikonur.stro.at ([127.0.0.1]) by localhost (baikonur [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 26133-10; Wed, 1 Sep 2004 23:03:46 +0200 (CEST) Received: from sputnik (M830P021.adsl.highway.telekom.at [62.47.135.181]) by baikonur.stro.at (Postfix) with ESMTP id 930D85C008; Wed, 1 Sep 2004 23:03:46 +0200 (CEST) Received: from localhost ([127.0.0.1] helo=localhost.localdomain) by sputnik with esmtp (Exim 4.34) id 
1C2cGi-0007uM-UH; Wed, 01 Sep 2004 23:03:49 +0200 Subject: [patch 14/16] net/ibmtr: replace schedule_timeout() with msleep() To: netdev@oss.sgi.com Cc: jgarzik@pobox.com, janitor@sternwelten.at From: janitor@sternwelten.at Date: Wed, 01 Sep 2004 23:03:48 +0200 Message-ID: X-Virus-Scanned: by Amavis (ClamAV) at stro.at X-archive-position: 8319 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: janitor@sternwelten.at Precedence: bulk X-list: netdev Content-Length: 1505 Lines: 50 On Tue, Jul 27, 2004 at 01:43:32PM -0700, Nishanth Aravamudan wrote: > I would appreciate any comments from the janitor@sternweltens list. > > > > Description: Use msleep() instead of schedule_timeout() to guarantee > the task delays for the desired time. > > Signed-off-by: Nishanth Aravamudan The previous patch introduced extraneous whitespace, sorry. Thanks, Domen. Signed-off-by: Maximilian Attems --- linux-2.6.9-rc1-bk7-max/drivers/net/tokenring/ibmtr.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff -puN drivers/net/tokenring/ibmtr.c~msleep-drivers_net_tokenring_ibmtr drivers/net/tokenring/ibmtr.c --- linux-2.6.9-rc1-bk7/drivers/net/tokenring/ibmtr.c~msleep-drivers_net_tokenring_ibmtr 2004-09-01 19:35:37.000000000 +0200 +++ linux-2.6.9-rc1-bk7-max/drivers/net/tokenring/ibmtr.c 2004-09-01 19:35:37.000000000 +0200 @@ -108,6 +108,7 @@ in the event that chatty debug messages #define IBMTR_DEBUG_MESSAGES 0 #include +#include #ifdef PCMCIA /* required for ibmtr_cs.c to build */ #undef MODULE /* yes, really */ @@ -858,8 +859,7 @@ static int tok_init_card(struct net_devi writeb(~INT_ENABLE, ti->mmio + ACA_OFFSET + ACA_RESET + ISRP_EVEN); outb(0, PIOaddr + ADAPTRESET); - current->state=TASK_UNINTERRUPTIBLE; - schedule_timeout(TR_RST_TIME); /* wait 50ms */ + msleep(jiffies_to_msecs(TR_RST_TIME)); outb(0, PIOaddr + ADAPTRESETREL); #ifdef ENABLE_PAGING _ From janitor@sternwelten.at Wed Sep 1 14:04:04 2004 
Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Sep 2004 14:04:12 -0700 (PDT) Received: from baikonur.stro.at (baikonur.stro.at [213.239.196.228]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i81L42Bn007682 for ; Wed, 1 Sep 2004 14:04:03 -0700 Received: from localhost (localhost [127.0.0.1]) by baikonur.stro.at (Postfix) with ESMTP id A9D0C5C067; Wed, 1 Sep 2004 23:03:52 +0200 (CEST) Received: from baikonur.stro.at ([127.0.0.1]) by localhost (baikonur [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 01102-01; Wed, 1 Sep 2004 23:03:52 +0200 (CEST) Received: from sputnik (M830P021.adsl.highway.telekom.at [62.47.135.181]) by baikonur.stro.at (Postfix) with ESMTP id 262A35C008; Wed, 1 Sep 2004 23:03:52 +0200 (CEST) Received: from localhost ([127.0.0.1] helo=localhost.localdomain) by sputnik with esmtp (Exim 4.34) id 1C2cGo-0007v4-E1; Wed, 01 Sep 2004 23:03:54 +0200 Subject: [patch 15/16] tulip/de2104x: replace schedule_timeout() with msleep() To: netdev@oss.sgi.com Cc: jgarzik@pobox.com, janitor@sternwelten.at From: janitor@sternwelten.at Date: Wed, 01 Sep 2004 23:03:54 +0200 Message-ID: X-Virus-Scanned: by Amavis (ClamAV) at stro.at X-archive-position: 8320 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: janitor@sternwelten.at Precedence: bulk X-list: netdev Content-Length: 997 Lines: 38 I would appreciate any comments from the janitor@sternweltens list. Description: Use msleep() instead of schedule_timeout() to guarantee the task delays for the desired time. 
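As a user-space analogy only (this is not the kernel implementation), the idea of "guaranteeing the task delays for the desired time" can be sketched with POSIX nanosleep(): if the sleep returns early, sleep again for whatever time remains. The function names sleep_full_ms and monotonic_ms are hypothetical helpers introduced for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

/* Sleep for at least ms milliseconds, resuming after signal interruptions. */
static int sleep_full_ms(long ms)
{
	struct timespec req = { ms / 1000, (ms % 1000) * 1000000L };
	struct timespec rem;

	while (nanosleep(&req, &rem) == -1) {
		if (errno != EINTR)
			return -1;	/* real error, give up */
		req = rem;		/* interrupted: sleep the remainder */
	}
	return 0;
}

/* Monotonic clock reading in milliseconds, for measuring elapsed time. */
static long monotonic_ms(void)
{
	struct timespec ts;
	int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

	assert(rc == 0);
	(void)rc;
	return (long)ts.tv_sec * 1000L + ts.tv_nsec / 1000000L;
}
```

A caller can take monotonic_ms() before and after sleep_full_ms(10) to confirm that the full delay elapsed, whereas a single sleep call that is woken early gives no such assurance.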
Signed-off-by: Nishanth Aravamudan Signed-off-by: Maximilian Attems --- linux-2.6.9-rc1-bk7-max/drivers/net/tulip/de2104x.c | 3 +-- 1 files changed, 1 insertion(+), 2 deletions(-) diff -puN drivers/net/tulip/de2104x.c~msleep-drivers_net_tulip_de2104x drivers/net/tulip/de2104x.c --- linux-2.6.9-rc1-bk7/drivers/net/tulip/de2104x.c~msleep-drivers_net_tulip_de2104x 2004-09-01 19:35:38.000000000 +0200 +++ linux-2.6.9-rc1-bk7-max/drivers/net/tulip/de2104x.c 2004-09-01 19:35:38.000000000 +0200 @@ -1208,8 +1208,7 @@ static void de_adapter_wake (struct de_p pci_write_config_dword(de->pdev, PCIPM, pmctl); /* de4x5.c delays, so we do too */ - current->state = TASK_UNINTERRUPTIBLE; - schedule_timeout(msecs_to_jiffies(10)); + msleep(10); } } _ From janitor@sternwelten.at Wed Sep 1 14:04:10 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Sep 2004 14:04:17 -0700 (PDT) Received: from baikonur.stro.at (baikonur.stro.at [213.239.196.228]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i81L48SU007823 for ; Wed, 1 Sep 2004 14:04:09 -0700 Received: from localhost (localhost [127.0.0.1]) by baikonur.stro.at (Postfix) with ESMTP id 85B5F5C008; Wed, 1 Sep 2004 23:03:58 +0200 (CEST) Received: from baikonur.stro.at ([127.0.0.1]) by localhost (baikonur [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 01102-02; Wed, 1 Sep 2004 23:03:58 +0200 (CEST) Received: from sputnik (M830P021.adsl.highway.telekom.at [62.47.135.181]) by baikonur.stro.at (Postfix) with ESMTP id A00B65C068; Wed, 1 Sep 2004 23:03:57 +0200 (CEST) Received: from localhost ([127.0.0.1] helo=localhost.localdomain) by sputnik with esmtp (Exim 4.34) id 1C2cGt-0007vm-Tz; Wed, 01 Sep 2004 23:04:00 +0200 Subject: [patch 16/16] parport/ieee1284_ops: replace schedule_timeout() with msleep() To: netdev@oss.sgi.com Cc: jgarzik@pobox.com, janitor@sternwelten.at From: janitor@sternwelten.at Date: Wed, 01 Sep 2004 23:03:59 +0200 Message-ID: X-Virus-Scanned: by Amavis (ClamAV) at stro.at X-archive-position: 8321 
X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: janitor@sternwelten.at Precedence: bulk X-list: netdev Content-Length: 1477 Lines: 53 I would appreciate any comments from the janitor@sternweltens list. This is one (of many) cases where I made a decision about replacing set_current_state(TASK_INTERRUPTIBLE); schedule_timeout(some_time); with msleep(jiffies_to_msecs(some_time)); msleep() is not exactly the same as the previous code, but I only did this replacement where I thought long delays were *desired*. If this is not the case here, then just disregard this patch. Thanks, Nish Description: Uses msleep() instead of schedule_timeout() to guarantee the task delays the desired time. Signed-off-by: Nishanth Aravamudan Signed-off-by: Maximilian Attems --- linux-2.6.9-rc1-bk7-max/drivers/parport/ieee1284_ops.c | 3 +-- 1 files changed, 1 insertion(+), 2 deletions(-) diff -puN drivers/parport/ieee1284_ops.c~msleep-drivers_parport_ieee1284_ops drivers/parport/ieee1284_ops.c --- linux-2.6.9-rc1-bk7/drivers/parport/ieee1284_ops.c~msleep-drivers_parport_ieee1284_ops 2004-09-01 19:35:41.000000000 +0200 +++ linux-2.6.9-rc1-bk7-max/drivers/parport/ieee1284_ops.c 2004-09-01 19:35:41.000000000 +0200 @@ -542,8 +542,7 @@ size_t parport_ieee1284_ecp_read_data (s /* Yield the port for a while. 
*/ if (count && dev->port->irq != PARPORT_IRQ_NONE) { parport_release (dev); - __set_current_state (TASK_INTERRUPTIBLE); - schedule_timeout ((HZ + 24) / 25); + msleep(40); parport_claim_or_block (dev); } else _ From janitor@sternwelten.at Wed Sep 1 14:05:34 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Sep 2004 14:05:42 -0700 (PDT) Received: from baikonur.stro.at (baikonur.stro.at [213.239.196.228]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i81L5Vfm009733 for ; Wed, 1 Sep 2004 14:05:32 -0700 Received: from localhost (localhost [127.0.0.1]) by baikonur.stro.at (Postfix) with ESMTP id D4AB75C065; Wed, 1 Sep 2004 23:05:21 +0200 (CEST) Received: from baikonur.stro.at ([127.0.0.1]) by localhost (baikonur [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 01102-06; Wed, 1 Sep 2004 23:05:21 +0200 (CEST) Received: from sputnik (M830P021.adsl.highway.telekom.at [62.47.135.181]) by baikonur.stro.at (Postfix) with ESMTP id 6BB3A5C008; Wed, 1 Sep 2004 23:05:21 +0200 (CEST) Received: from localhost ([127.0.0.1] helo=localhost.localdomain) by sputnik with esmtp (Exim 4.34) id 1C2cIF-0007yy-Lb; Wed, 01 Sep 2004 23:05:23 +0200 Subject: [patch 1/8] irda/act200l-sir: replace schedule_timeout() with msleep() To: netdev@oss.sgi.com Cc: jgarzik@pobox.com, jt@hpl.hp.com, janitor@sternwelten.at From: janitor@sternwelten.at Date: Wed, 01 Sep 2004 23:05:23 +0200 Message-ID: X-Virus-Scanned: by Amavis (ClamAV) at stro.at X-archive-position: 8322 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: janitor@sternwelten.at Precedence: bulk X-list: netdev Content-Length: 1050 Lines: 38 I would appreciate any comments from the janitor@sternweltens list. Description: Replace schedule_timeout() with msleep() to guarantee the task delays for the desired time. 
Signed-off-by: Nishanth Aravamudan Signed-off-by: Maximilian Attems --- linux-2.6.9-rc1-bk7-max/drivers/net/irda/act200l-sir.c | 3 +-- 1 files changed, 1 insertion(+), 2 deletions(-) diff -puN drivers/net/irda/act200l-sir.c~msleep-drivers_net_irda_act200l-sir drivers/net/irda/act200l-sir.c --- linux-2.6.9-rc1-bk7/drivers/net/irda/act200l-sir.c~msleep-drivers_net_irda_act200l-sir 2004-09-01 19:35:31.000000000 +0200 +++ linux-2.6.9-rc1-bk7-max/drivers/net/irda/act200l-sir.c 2004-09-01 19:35:31.000000000 +0200 @@ -177,8 +177,7 @@ static int act200l_change_speed(struct s /* Write control bytes */ sirdev_raw_write(dev, control, 3); - set_current_state(TASK_UNINTERRUPTIBLE); - schedule_timeout(msecs_to_jiffies(5)); + msleep(5); /* Go back to normal mode */ sirdev_set_dtr_rts(dev, TRUE, TRUE); _ From janitor@sternwelten.at Wed Sep 1 14:05:38 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Sep 2004 14:05:44 -0700 (PDT) Received: from baikonur.stro.at (baikonur.stro.at [213.239.196.228]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i81L5cWq009866 for ; Wed, 1 Sep 2004 14:05:38 -0700 Received: from localhost (localhost [127.0.0.1]) by baikonur.stro.at (Postfix) with ESMTP id AA92C5C066; Wed, 1 Sep 2004 23:05:27 +0200 (CEST) Received: from baikonur.stro.at ([127.0.0.1]) by localhost (baikonur [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 04663-06; Wed, 1 Sep 2004 23:05:27 +0200 (CEST) Received: from sputnik (M830P021.adsl.highway.telekom.at [62.47.135.181]) by baikonur.stro.at (Postfix) with ESMTP id E40355C008; Wed, 1 Sep 2004 23:05:26 +0200 (CEST) Received: from localhost ([127.0.0.1] helo=localhost.localdomain) by sputnik with esmtp (Exim 4.34) id 1C2cIL-0007zg-5D; Wed, 01 Sep 2004 23:05:29 +0200 Subject: [patch 2/8] irda/irtty-sir: replace schedule_timeout() with msleep() To: netdev@oss.sgi.com Cc: jgarzik@pobox.com, jt@hpl.hp.com, janitor@sternwelten.at From: janitor@sternwelten.at Date: Wed, 01 Sep 2004 23:05:28 +0200 Message-ID: X-Virus-Scanned: by 
Amavis (ClamAV) at stro.at X-archive-position: 8323 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: janitor@sternwelten.at Precedence: bulk X-list: netdev Content-Length: 1258 Lines: 49 I would appreciate any comments from the janitor@sternweltens list. Description: Replace schedule_timeout() with msleep() to guarantee the task delays for the desired time. Signed-off-by: Nishanth Aravamudan Signed-off-by: Maximilian Attems --- linux-2.6.9-rc1-bk7-max/drivers/net/irda/irtty-sir.c | 7 +++---- 1 files changed, 3 insertions(+), 4 deletions(-) diff -puN drivers/net/irda/irtty-sir.c~msleep-drivers_net_irda_irtty-sir drivers/net/irda/irtty-sir.c --- linux-2.6.9-rc1-bk7/drivers/net/irda/irtty-sir.c~msleep-drivers_net_irda_irtty-sir 2004-09-01 19:35:31.000000000 +0200 +++ linux-2.6.9-rc1-bk7-max/drivers/net/irda/irtty-sir.c 2004-09-01 19:35:31.000000000 +0200 @@ -32,6 +32,7 @@ #include #include #include +#include #include #include @@ -96,10 +97,8 @@ static void irtty_wait_until_sent(struct tty->driver->wait_until_sent(tty, msecs_to_jiffies(100)); unlock_kernel(); } - else { - set_task_state(current, TASK_UNINTERRUPTIBLE); - schedule_timeout(msecs_to_jiffies(USBSERIAL_TX_DONE_DELAY)); - } + else + msleep(USBSERIAL_TX_DONE_DELAY); } /* _ From janitor@sternwelten.at Wed Sep 1 14:05:51 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Sep 2004 14:05:56 -0700 (PDT) Received: from baikonur.stro.at (baikonur.stro.at [213.239.196.228]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i81L5mv8010060 for ; Wed, 1 Sep 2004 14:05:48 -0700 Received: from localhost (localhost [127.0.0.1]) by baikonur.stro.at (Postfix) with ESMTP id 5E4395C068; Wed, 1 Sep 2004 23:05:38 +0200 (CEST) Received: from baikonur.stro.at ([127.0.0.1]) by localhost (baikonur [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 04663-07; Wed, 1 Sep 2004 23:05:38 +0200 (CEST) Received: from sputnik (M830P021.adsl.highway.telekom.at 
[62.47.135.181]) by baikonur.stro.at (Postfix) with ESMTP id E49035C008; Wed, 1 Sep 2004 23:05:37 +0200 (CEST) Received: from localhost ([127.0.0.1] helo=localhost.localdomain) by sputnik with esmtp (Exim 4.34) id 1C2cIW-000816-6g; Wed, 01 Sep 2004 23:05:40 +0200 Subject: [patch 4/8] irda/sir_dev: replace schedule_timeout() with msleep() To: netdev@oss.sgi.com Cc: jgarzik@pobox.com, jt@hpl.hp.com, janitor@sternwelten.at From: janitor@sternwelten.at Date: Wed, 01 Sep 2004 23:05:39 +0200 Message-ID: X-Virus-Scanned: by Amavis (ClamAV) at stro.at X-archive-position: 8325 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: janitor@sternwelten.at Precedence: bulk X-list: netdev Content-Length: 1317 Lines: 46 I would appreciate any comments from the janitor@sternweltens list. Description: Replace schedule_timeout() with msleep() to guarantee the task delays for the desired time. Signed-off-by: Nishanth Aravamudan Signed-off-by: Maximilian Attems --- linux-2.6.9-rc1-bk7-max/drivers/net/irda/sir_dev.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff -puN drivers/net/irda/sir_dev.c~msleep-drivers_net_irda_sir_dev drivers/net/irda/sir_dev.c --- linux-2.6.9-rc1-bk7/drivers/net/irda/sir_dev.c~msleep-drivers_net_irda_sir_dev 2004-09-01 19:35:32.000000000 +0200 +++ linux-2.6.9-rc1-bk7-max/drivers/net/irda/sir_dev.c 2004-09-01 19:35:32.000000000 +0200 @@ -15,6 +15,7 @@ #include #include #include +#include #include #include @@ -73,8 +74,7 @@ int sirdev_raw_write(struct sir_dev *dev spin_lock_irqsave(&dev->tx_lock, flags); /* serialize with other tx operations */ while (dev->tx_buff.len > 0) { /* wait until tx idle */ spin_unlock_irqrestore(&dev->tx_lock, flags); - set_current_state(TASK_UNINTERRUPTIBLE); - schedule_timeout(msecs_to_jiffies(10)); + msleep(10); spin_lock_irqsave(&dev->tx_lock, flags); } _ From janitor@sternwelten.at Wed Sep 1 14:05:46 2004 Received: with ECARTIS (v1.0.0; list 
netdev); Wed, 01 Sep 2004 14:05:53 -0700 (PDT) Received: from baikonur.stro.at (baikonur.stro.at [213.239.196.228]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i81L5gFg009961 for ; Wed, 1 Sep 2004 14:05:43 -0700 Received: from localhost (localhost [127.0.0.1]) by baikonur.stro.at (Postfix) with ESMTP id DE3005C067; Wed, 1 Sep 2004 23:05:32 +0200 (CEST) Received: from baikonur.stro.at ([127.0.0.1]) by localhost (baikonur [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 01102-07; Wed, 1 Sep 2004 23:05:32 +0200 (CEST) Received: from sputnik (M830P021.adsl.highway.telekom.at [62.47.135.181]) by baikonur.stro.at (Postfix) with ESMTP id 5E9A05C008; Wed, 1 Sep 2004 23:05:32 +0200 (CEST) Received: from localhost ([127.0.0.1] helo=localhost.localdomain) by sputnik with esmtp (Exim 4.34) id 1C2cIQ-00080O-LF; Wed, 01 Sep 2004 23:05:34 +0200 Subject: [patch 3/8] irda/ma600-sir: replace schedule_timeout() with msleep() To: netdev@oss.sgi.com Cc: jgarzik@pobox.com, jt@hpl.hp.com, janitor@sternwelten.at From: janitor@sternwelten.at Date: Wed, 01 Sep 2004 23:05:34 +0200 Message-ID: X-Virus-Scanned: by Amavis (ClamAV) at stro.at X-archive-position: 8324 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: janitor@sternwelten.at Precedence: bulk X-list: netdev Content-Length: 1945 Lines: 64 I would appreciate any comments from the janitor@sternweltens list. Description: Replace schedule_timeout() with msleep() to guarantee the task delays for the desired time. 
Signed-off-by: Nishanth Aravamudan Signed-off-by: Maximilian Attems --- linux-2.6.9-rc1-bk7-max/drivers/net/irda/ma600-sir.c | 12 ++++-------- 1 files changed, 4 insertions(+), 8 deletions(-) diff -puN drivers/net/irda/ma600-sir.c~msleep-drivers_net_irda_ma600-sir drivers/net/irda/ma600-sir.c --- linux-2.6.9-rc1-bk7/drivers/net/irda/ma600-sir.c~msleep-drivers_net_irda_ma600-sir 2004-09-01 19:35:32.000000000 +0200 +++ linux-2.6.9-rc1-bk7-max/drivers/net/irda/ma600-sir.c 2004-09-01 19:35:32.000000000 +0200 @@ -191,8 +191,7 @@ static int ma600_change_speed(struct sir sirdev_raw_write(dev, &byte, sizeof(byte)); /* Wait at least 10ms: fake wait_until_sent - 10 bits at 9600 baud*/ - set_current_state(TASK_UNINTERRUPTIBLE); - schedule_timeout(msecs_to_jiffies(15)); /* old ma600 uses 15ms */ + msleep(15); /* old ma600 uses 15ms */ #if 1 /* read-back of the control byte. ma600 is the first dongle driver @@ -215,8 +214,7 @@ static int ma600_change_speed(struct sir sirdev_set_dtr_rts(dev, TRUE, TRUE); /* Wait at least 10ms */ - set_current_state(TASK_UNINTERRUPTIBLE); - schedule_timeout(msecs_to_jiffies(10)); + msleep(10); /* dongle is now switched to the new speed */ dev->speed = speed; @@ -245,13 +243,11 @@ int ma600_reset(struct sir_dev *dev) /* Reset the dongle : set DTR low for 10 ms */ sirdev_set_dtr_rts(dev, FALSE, TRUE); - set_current_state(TASK_UNINTERRUPTIBLE); - schedule_timeout(msecs_to_jiffies(10)); + msleep(10); /* Go back to normal mode */ sirdev_set_dtr_rts(dev, TRUE, TRUE); - set_current_state(TASK_UNINTERRUPTIBLE); - schedule_timeout(msecs_to_jiffies(10)); + msleep(10); dev->speed = 9600; /* That's the dongle-default */ _ From janitor@sternwelten.at Wed Sep 1 14:05:56 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Sep 2004 14:06:03 -0700 (PDT) Received: from baikonur.stro.at (baikonur.stro.at [213.239.196.228]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i81L5spH010172 for ; Wed, 1 Sep 2004 14:05:54 -0700 Received: from localhost (localhost 
[127.0.0.1]) by baikonur.stro.at (Postfix) with ESMTP id CC3AB5C06A; Wed, 1 Sep 2004 23:05:43 +0200 (CEST) Received: from baikonur.stro.at ([127.0.0.1]) by localhost (baikonur [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 01102-08; Wed, 1 Sep 2004 23:05:43 +0200 (CEST) Received: from sputnik (M830P021.adsl.highway.telekom.at [62.47.135.181]) by baikonur.stro.at (Postfix) with ESMTP id 63B025C069; Wed, 1 Sep 2004 23:05:43 +0200 (CEST) Received: from localhost ([127.0.0.1] helo=localhost.localdomain) by sputnik with esmtp (Exim 4.34) id 1C2cIb-00081o-MO; Wed, 01 Sep 2004 23:05:45 +0200 Subject: [patch 5/8] irda/tekram-sir: replace schedule_timeout() with msleep() To: netdev@oss.sgi.com Cc: jgarzik@pobox.com, jt@hpl.hp.com, janitor@sternwelten.at From: janitor@sternwelten.at Date: Wed, 01 Sep 2004 23:05:45 +0200 Message-ID: X-Virus-Scanned: by Amavis (ClamAV) at stro.at X-archive-position: 8326 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: janitor@sternwelten.at Precedence: bulk X-list: netdev Content-Length: 1039 Lines: 38 I would appreciate any comments from the janitor@sternweltens list. Description: Replace schedule_timeout() with msleep() to guarantee the task delays for the desired time. 
Signed-off-by: Nishanth Aravamudan Signed-off-by: Maximilian Attems --- linux-2.6.9-rc1-bk7-max/drivers/net/irda/tekram-sir.c | 3 +-- 1 files changed, 1 insertion(+), 2 deletions(-) diff -puN drivers/net/irda/tekram-sir.c~msleep-drivers_net_irda_tekram-sir drivers/net/irda/tekram-sir.c --- linux-2.6.9-rc1-bk7/drivers/net/irda/tekram-sir.c~msleep-drivers_net_irda_tekram-sir 2004-09-01 19:35:32.000000000 +0200 +++ linux-2.6.9-rc1-bk7-max/drivers/net/irda/tekram-sir.c 2004-09-01 19:35:32.000000000 +0200 @@ -210,8 +210,7 @@ static int tekram_reset(struct sir_dev * sirdev_set_dtr_rts(dev, FALSE, TRUE); /* Should sleep 1 ms */ - set_current_state(TASK_UNINTERRUPTIBLE); - schedule_timeout(msecs_to_jiffies(1)); + msleep(1); /* Set DTR, Set RTS */ sirdev_set_dtr_rts(dev, TRUE, TRUE); _ From janitor@sternwelten.at Wed Sep 1 14:06:02 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Sep 2004 14:06:08 -0700 (PDT) Received: from baikonur.stro.at (baikonur.stro.at [213.239.196.228]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i81L5xwp010292 for ; Wed, 1 Sep 2004 14:06:00 -0700 Received: from localhost (localhost [127.0.0.1]) by baikonur.stro.at (Postfix) with ESMTP id 7876A5C06B; Wed, 1 Sep 2004 23:05:49 +0200 (CEST) Received: from baikonur.stro.at ([127.0.0.1]) by localhost (baikonur [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 04663-09; Wed, 1 Sep 2004 23:05:49 +0200 (CEST) Received: from sputnik (M830P021.adsl.highway.telekom.at [62.47.135.181]) by baikonur.stro.at (Postfix) with ESMTP id DF0FE5C008; Wed, 1 Sep 2004 23:05:48 +0200 (CEST) Received: from localhost ([127.0.0.1] helo=localhost.localdomain) by sputnik with esmtp (Exim 4.34) id 1C2cIh-00082W-6F; Wed, 01 Sep 2004 23:05:51 +0200 Subject: [patch 6/8] wireless/airo: replace schedule_timeout() with msleep() To: netdev@oss.sgi.com Cc: jgarzik@pobox.com, jt@hpl.hp.com, janitor@sternwelten.at From: janitor@sternwelten.at Date: Wed, 01 Sep 2004 23:05:50 +0200 Message-ID: X-Virus-Scanned: by Amavis (ClamAV) 
at stro.at X-archive-position: 8327 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: janitor@sternwelten.at Precedence: bulk X-list: netdev Content-Length: 2419 Lines: 82 I would appreciate any comments from the janitor@sternweltens list. Description: Use msleep() instead of schedule_timeout() to guarantee the task delays for the desired time. Signed-off-by: Nishanth Aravamudan Signed-off-by: Maximilian Attems --- linux-2.6.9-rc1-bk7-max/drivers/net/wireless/airo.c | 18 ++++++------------ 1 files changed, 6 insertions(+), 12 deletions(-) diff -puN drivers/net/wireless/airo.c~msleep-drivers_net_wireless_airo drivers/net/wireless/airo.c --- linux-2.6.9-rc1-bk7/drivers/net/wireless/airo.c~msleep-drivers_net_wireless_airo 2004-09-01 19:35:39.000000000 +0200 +++ linux-2.6.9-rc1-bk7-max/drivers/net/wireless/airo.c 2004-09-01 19:35:39.000000000 +0200 @@ -2670,11 +2670,9 @@ int reset_card( struct net_device *dev , return -1; waitbusy (ai); OUT4500(ai,COMMAND,CMD_SOFTRESET); - set_current_state (TASK_UNINTERRUPTIBLE); - schedule_timeout (HZ/5); + msleep(200); waitbusy (ai); - set_current_state (TASK_UNINTERRUPTIBLE); - schedule_timeout (HZ/5); + msleep(200); if (lock) up(&ai->sem); return 0; @@ -7436,8 +7434,7 @@ int cmdreset(struct airo_info *ai) { OUT4500(ai,COMMAND,CMD_SOFTRESET); - set_current_state (TASK_UNINTERRUPTIBLE); - schedule_timeout (HZ); /* WAS 600 12/7/00 */ + msleep(1000); /* WAS 600 12/7/00 */ if(!waitbusy (ai)){ printk(KERN_INFO "Waitbusy hang AFTER RESET\n"); @@ -7464,8 +7461,7 @@ int setflashmode (struct airo_info *ai) OUT4500(ai, SWS3, FLASH_COMMAND); OUT4500(ai, COMMAND,0); } - set_current_state (TASK_UNINTERRUPTIBLE); - schedule_timeout (HZ/2); /* 500ms delay */ + msleep(500); if(!waitbusy(ai)) { clear_bit (FLAG_FLASHING, &ai->flags); @@ -7575,8 +7571,7 @@ int flashputbuf(struct airo_info *ai){ int flashrestart(struct airo_info *ai,struct net_device *dev){ int i,status; - 
set_current_state (TASK_UNINTERRUPTIBLE); - schedule_timeout (HZ); /* Added 12/7/00 */ + msleep(1000); /* Added 12/7/00 */ clear_bit (FLAG_FLASHING, &ai->flags); if (test_bit(FLAG_MPI, &ai->flags)) { status = mpi_init_descriptors(ai); @@ -7591,8 +7586,7 @@ int flashrestart(struct airo_info *ai,st ( ai, 2312, i >= MAX_FIDS / 2 ); } - set_current_state (TASK_UNINTERRUPTIBLE); - schedule_timeout (HZ); /* Added 12/7/00 */ + msleep(1000); /* Added 12/7/00 */ return status; } #endif /* CISCO_EXT */ _ From janitor@sternwelten.at Wed Sep 1 14:06:08 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Sep 2004 14:06:14 -0700 (PDT) Received: from baikonur.stro.at (baikonur.stro.at [213.239.196.228]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i81L65at010414 for ; Wed, 1 Sep 2004 14:06:05 -0700 Received: from localhost (localhost [127.0.0.1]) by baikonur.stro.at (Postfix) with ESMTP id EE3955C069; Wed, 1 Sep 2004 23:05:54 +0200 (CEST) Received: from baikonur.stro.at ([127.0.0.1]) by localhost (baikonur [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 04663-10; Wed, 1 Sep 2004 23:05:54 +0200 (CEST) Received: from sputnik (M830P021.adsl.highway.telekom.at [62.47.135.181]) by baikonur.stro.at (Postfix) with ESMTP id 6B25F5C008; Wed, 1 Sep 2004 23:05:54 +0200 (CEST) Received: from localhost ([127.0.0.1] helo=localhost.localdomain) by sputnik with esmtp (Exim 4.34) id 1C2cIm-00083E-M9; Wed, 01 Sep 2004 23:05:56 +0200 Subject: [patch 7/8] wireless/airport: replace schedule_timeout() with msleep() To: netdev@oss.sgi.com Cc: jgarzik@pobox.com, jt@hpl.hp.com, janitor@sternwelten.at From: janitor@sternwelten.at Date: Wed, 01 Sep 2004 23:05:56 +0200 Message-ID: X-Virus-Scanned: by Amavis (ClamAV) at stro.at X-archive-position: 8328 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: janitor@sternwelten.at Precedence: bulk X-list: netdev Content-Length: 2171 Lines: 72 I would appreciate any comments from 
the janitor@sternweltens list. Description: Use msleep() instead of schedule_timeout() to guarantee the task delays for the desired time. Signed-off-by: Nishanth Aravamudan Signed-off-by: Maximilian Attems --- linux-2.6.9-rc1-bk7-max/drivers/net/wireless/airport.c | 15 +++++---------- 1 files changed, 5 insertions(+), 10 deletions(-) diff -puN drivers/net/wireless/airport.c~msleep-drivers_net_wireless_airport drivers/net/wireless/airport.c --- linux-2.6.9-rc1-bk7/drivers/net/wireless/airport.c~msleep-drivers_net_wireless_airport 2004-09-01 19:35:41.000000000 +0200 +++ linux-2.6.9-rc1-bk7-max/drivers/net/wireless/airport.c 2004-09-01 19:35:41.000000000 +0200 @@ -94,8 +94,7 @@ airport_resume(struct macio_dev *mdev) printk(KERN_DEBUG "%s: Airport waking up\n", dev->name); pmac_call_feature(PMAC_FTR_AIRPORT_ENABLE, macio_get_of_node(mdev), 0, 1); - set_current_state(TASK_UNINTERRUPTIBLE); - schedule_timeout(HZ/5); + msleep(200); enable_irq(dev->irq); @@ -147,8 +146,7 @@ airport_detach(struct macio_dev *mdev) macio_release_resource(mdev, 0); pmac_call_feature(PMAC_FTR_AIRPORT_ENABLE, macio_get_of_node(mdev), 0, 0); - set_current_state(TASK_UNINTERRUPTIBLE); - schedule_timeout(HZ); + msleep(1000); macio_set_drvdata(mdev, NULL); free_netdev(dev); @@ -174,11 +172,9 @@ static int airport_hard_reset(struct ori disable_irq(dev->irq); pmac_call_feature(PMAC_FTR_AIRPORT_ENABLE, macio_get_of_node(card->mdev), 0, 0); - set_current_state(TASK_UNINTERRUPTIBLE); - schedule_timeout(HZ); + msleep(1000); pmac_call_feature(PMAC_FTR_AIRPORT_ENABLE, macio_get_of_node(card->mdev), 0, 1); - set_current_state(TASK_UNINTERRUPTIBLE); - schedule_timeout(HZ); + msleep(1000); enable_irq(dev->irq); schedule_timeout(HZ); @@ -240,8 +236,7 @@ airport_attach(struct macio_dev *mdev, c /* Power up card */ pmac_call_feature(PMAC_FTR_AIRPORT_ENABLE, macio_get_of_node(mdev), 0, 1); - set_current_state(TASK_UNINTERRUPTIBLE); - schedule_timeout(HZ); + msleep(1000); /* Reset it before we get the interrupt */ 
hermes_init(hw); _ From janitor@sternwelten.at Wed Sep 1 14:06:13 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Sep 2004 14:06:20 -0700 (PDT) Received: from baikonur.stro.at (baikonur.stro.at [213.239.196.228]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i81L6Alo010538 for ; Wed, 1 Sep 2004 14:06:11 -0700 Received: from localhost (localhost [127.0.0.1]) by baikonur.stro.at (Postfix) with ESMTP id 968A55C06C; Wed, 1 Sep 2004 23:06:00 +0200 (CEST) Received: from baikonur.stro.at ([127.0.0.1]) by localhost (baikonur [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 15352-01; Wed, 1 Sep 2004 23:06:00 +0200 (CEST) Received: from sputnik (M830P021.adsl.highway.telekom.at [62.47.135.181]) by baikonur.stro.at (Postfix) with ESMTP id 16C7C5C008; Wed, 1 Sep 2004 23:06:00 +0200 (CEST) Received: from localhost ([127.0.0.1] helo=localhost.localdomain) by sputnik with esmtp (Exim 4.34) id 1C2cIs-00083w-5t; Wed, 01 Sep 2004 23:06:02 +0200 Subject: [patch 8/8] prism54/islpci_dev: replace schedule_timeout() with msleep() To: netdev@oss.sgi.com Cc: jgarzik@pobox.com, jt@hpl.hp.com, janitor@sternwelten.at From: janitor@sternwelten.at Date: Wed, 01 Sep 2004 23:06:01 +0200 Message-ID: X-Virus-Scanned: by Amavis (ClamAV) at stro.at X-archive-position: 8329 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: janitor@sternwelten.at Precedence: bulk X-list: netdev Content-Length: 1055 Lines: 38 I would appreciate any comments from the janitor@sternweltens list. Description: Use msleep() instead of schedule_timeout() to guarantee the task delays for the desired time. 
Signed-off-by: Nishanth Aravamudan Signed-off-by: Maximilian Attems --- linux-2.6.9-rc1-bk7-max/drivers/net/wireless/prism54/islpci_dev.c | 3 +-- 1 files changed, 1 insertion(+), 2 deletions(-) diff -puN drivers/net/wireless/prism54/islpci_dev.c~msleep-drivers_net_wireless_prism54_islpci_dev drivers/net/wireless/prism54/islpci_dev.c --- linux-2.6.9-rc1-bk7/drivers/net/wireless/prism54/islpci_dev.c~msleep-drivers_net_wireless_prism54_islpci_dev 2004-09-01 19:35:41.000000000 +0200 +++ linux-2.6.9-rc1-bk7-max/drivers/net/wireless/prism54/islpci_dev.c 2004-09-01 19:35:41.000000000 +0200 @@ -436,8 +436,7 @@ prism54_bring_down(islpci_private *priv) wmb(); /* wait a while for the device to reset */ - set_current_state(TASK_UNINTERRUPTIBLE); - schedule_timeout(50*HZ/1000); + msleep(50); return 0; } _ From jt@bougret.hpl.hp.com Wed Sep 1 14:09:39 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Sep 2004 14:09:44 -0700 (PDT) Received: from palrel12.hp.com (palrel12.hp.com [156.153.255.237]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i81L9c39014539 for ; Wed, 1 Sep 2004 14:09:39 -0700 Received: from tomil.hpl.hp.com (tomil.hpl.hp.com [15.0.152.100]) by palrel12.hp.com (Postfix) with ESMTP id 5A0224000A2; Wed, 1 Sep 2004 14:09:30 -0700 (PDT) Received: from bougret.hpl.hp.com (bougret.hpl.hp.com [15.4.92.227]) by tomil.hpl.hp.com (8.9.3 (PHNE_29774)/8.9.3 HPLabs Timeshare Server) with ESMTP id OAA12874; Wed, 1 Sep 2004 14:12:01 -0700 (PDT) Received: from jt by bougret.hpl.hp.com with local (Exim 3.35 #1 (Debian)) id 1C2cMD-000363-00; Wed, 01 Sep 2004 14:09:29 -0700 Date: Wed, 1 Sep 2004 14:09:29 -0700 To: janitor@sternwelten.at Cc: netdev@oss.sgi.com, jgarzik@pobox.com Subject: Re: [patch 1/8] irda/act200l-sir: replace schedule_timeout() with msleep() Message-ID: <20040901210929.GA11442@bougret.hpl.hp.com> Reply-To: jt@hpl.hp.com References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.3.28i 
Organisation: HP Labs Palo Alto Address: HP Labs, 1U-17, 1501 Page Mill road, Palo Alto, CA 94304, USA. E-mail: jt@hpl.hp.com From: Jean Tourrilhes X-archive-position: 8330 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jt@bougret.hpl.hp.com Precedence: bulk X-list: netdev Content-Length: 309 Lines: 14 On Wed, Sep 01, 2004 at 11:05:23PM +0200, janitor@sternwelten.at wrote: > > > > > > > I would appreciate any comments from the janitor@sternweltens list. I already commented that I don't like the confusing msleep() API and I prefer the more explicit schedule_timeout(). But that's only me... Jean From max@stro.at Wed Sep 1 14:40:13 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Sep 2004 14:40:17 -0700 (PDT) Received: from baikonur.stro.at (baikonur.stro.at [213.239.196.228]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i81LeCas015306 for ; Wed, 1 Sep 2004 14:40:13 -0700 Received: from localhost (localhost [127.0.0.1]) by baikonur.stro.at (Postfix) with ESMTP id 59E035C065; Wed, 1 Sep 2004 23:40:02 +0200 (CEST) Received: from baikonur.stro.at ([127.0.0.1]) by localhost (baikonur [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 26747-03; Wed, 1 Sep 2004 23:40:01 +0200 (CEST) Received: from sputnik (M830P021.adsl.highway.telekom.at [62.47.135.181]) by baikonur.stro.at (Postfix) with ESMTP id C027F5C008; Wed, 1 Sep 2004 23:40:01 +0200 (CEST) Received: from max by sputnik with local (Exim 4.34) id 1C2cpn-0000e6-Sl; Wed, 01 Sep 2004 23:40:03 +0200 Date: Wed, 1 Sep 2004 23:40:03 +0200 From: maximilian attems To: jt@hpl.hp.com Cc: netdev@oss.sgi.com, jgarzik@pobox.com, kj Subject: Re: [patch 1/8] irda/act200l-sir: replace schedule_timeout() with msleep() Message-ID: <20040901214003.GC7467@stro.at> Mail-Followup-To: jt@hpl.hp.com, netdev@oss.sgi.com, jgarzik@pobox.com, kj References: <20040901210929.GA11442@bougret.hpl.hp.com> Mime-Version: 1.0 Content-Type: text/plain; 
charset=us-ascii Content-Disposition: inline In-Reply-To: <20040901210929.GA11442@bougret.hpl.hp.com> User-Agent: Mutt/1.5.6+20040722i X-Virus-Scanned: by Amavis (ClamAV) at stro.at X-archive-position: 8331 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: janitor@sternwelten.at Precedence: bulk X-list: netdev Content-Length: 666 Lines: 22 On Wed, 01 Sep 2004, Jean Tourrilhes wrote: > On Wed, Sep 01, 2004 at 11:05:23PM +0200, janitor@sternwelten.at wrote: > > I would appreciate any comments from the janitor@sternweltens list. oops, mangled some text there, sorry for this silly email. > > I already commented that I don't like the confusing msleep() > API and I prefer the more explicit schedule_timeout(). > But that's only me... > > Jean hmm, we still have archs where HZ < 100. i find msleep's msec units a lot more readable than schedule_timeout((HZ + 99) / 100); the schedule_timeout(HZ/100) gets safely converted with msleep.
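The tick arithmetic discussed above can be sketched in plain userspace C. This is only an illustration, not kernel code: the function names below are hypothetical, and `hz` is passed as a parameter to stand in for the kernel's compile-time HZ so that several values can be compared.

```c
#include <assert.h>

/* Hypothetical helpers for illustration only; none of these exist in the
 * kernel. hz stands in for the kernel's HZ (timer ticks per second). */

/* Mirrors the open-coded HZ/100: integer division truncates toward zero. */
unsigned long ticks_truncating(unsigned long hz)
{
	return hz / 100;
}

/* Mirrors (HZ + 99) / 100: rounds 10 ms up to a whole number of ticks. */
unsigned long ticks_rounding_up(unsigned long hz)
{
	return (hz + 99) / 100;
}

/* Generic round-up conversion from milliseconds to ticks. */
unsigned long ms_to_ticks_rounding_up(unsigned long ms, unsigned long hz)
{
	return (ms * hz + 999) / 1000;
}

/* Worked values for a few tick rates. */
void check_examples(void)
{
	assert(ticks_truncating(64) == 0);    /* 10 ms collapses to 0 ticks */
	assert(ticks_rounding_up(64) == 1);   /* rounded up to 1 tick */
	assert(ticks_truncating(1000) == 10); /* exact when hz divides evenly */
	assert(ticks_rounding_up(1000) == 10);
	assert(ms_to_ticks_rounding_up(10, 64) == 1);
}
```

With a tick rate below 100 the truncating form asks for zero ticks, which no longer expresses a 10 ms minimum, while both round-up forms ask for at least one tick. A millisecond-based interface lets the caller state the delay once and leave the rounding to a single conversion.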
-- maks kernel janitor http://janitor.kernelnewbies.org/ From jt@bougret.hpl.hp.com Wed Sep 1 14:48:24 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Sep 2004 14:48:30 -0700 (PDT) Received: from palrel13.hp.com (palrel13.hp.com [156.153.255.238]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i81LmOaM015754 for ; Wed, 1 Sep 2004 14:48:24 -0700 Received: from tomil.hpl.hp.com (tomil.hpl.hp.com [15.0.152.100]) by palrel13.hp.com (Postfix) with ESMTP id 7FE3B1C0851D; Wed, 1 Sep 2004 14:48:16 -0700 (PDT) Received: from bougret.hpl.hp.com (bougret.hpl.hp.com [15.4.92.227]) by tomil.hpl.hp.com (8.9.3 (PHNE_29774)/8.9.3 HPLabs Timeshare Server) with ESMTP id OAA13748; Wed, 1 Sep 2004 14:50:47 -0700 (PDT) Received: from jt by bougret.hpl.hp.com with local (Exim 3.35 #1 (Debian)) id 1C2cxj-0003Oy-00; Wed, 01 Sep 2004 14:48:15 -0700 Date: Wed, 1 Sep 2004 14:48:15 -0700 To: netdev@oss.sgi.com, jgarzik@pobox.com, kj Subject: Re: [patch 1/8] irda/act200l-sir: replace schedule_timeout() with msleep() Message-ID: <20040901214815.GA13071@bougret.hpl.hp.com> Reply-To: jt@hpl.hp.com References: <20040901210929.GA11442@bougret.hpl.hp.com> <20040901214003.GC7467@stro.at> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20040901214003.GC7467@stro.at> User-Agent: Mutt/1.3.28i Organisation: HP Labs Palo Alto Address: HP Labs, 1U-17, 1501 Page Mill road, Palo Alto, CA 94304, USA. E-mail: jt@hpl.hp.com From: Jean Tourrilhes X-archive-position: 8332 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jt@bougret.hpl.hp.com Precedence: bulk X-list: netdev Content-Length: 926 Lines: 27 On Wed, Sep 01, 2004 at 11:40:03PM +0200, maximilian attems wrote: > On Wed, 01 Sep 2004, Jean Tourrilhes wrote: > > > On Wed, Sep 01, 2004 at 11:05:23PM +0200, janitor@sternwelten.at wrote: > > > I would appreciate any comments from the janitor@sternweltens list. 
> uups mangled some text there sorry for this silly email. > > > > I already commented that I don't like the confusing msleep() > > API and I prefer the more explicit schedule_timeout(). > > But that's only me... > > > > Jean > > hmm we have still archs were HZ < 100. > i find msleep use msecs units a lot more readable than > schedule_timeout((HZ + 99) / 100); > > the schedule_timeout(HZ/100) gets safely converted with msleep. I don't have complain about converting the (HZ + 99) / 100 expressions to something saner. My beef is the fact that msleep hide the fact that a schedule might happen. This is important in the IrDA code. > maks Jean From nacc@us.ibm.com Wed Sep 1 15:07:57 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Sep 2004 15:08:03 -0700 (PDT) Received: from e32.co.us.ibm.com (e32.co.us.ibm.com [32.97.110.130]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i81M7oDP016384 for ; Wed, 1 Sep 2004 15:07:57 -0700 Received: from westrelay04.boulder.ibm.com (westrelay04.boulder.ibm.com [9.17.193.32]) by e32.co.us.ibm.com (8.12.10/8.12.9) with ESMTP id i81M3UsB481516; Wed, 1 Sep 2004 18:03:30 -0400 Received: from arkanoid.beaverton.ibm.com (d03av02.boulder.ibm.com [9.17.195.168]) by westrelay04.boulder.ibm.com (8.12.10/NCO/VER6.6) with ESMTP id i81M3ShU204832; Wed, 1 Sep 2004 16:03:29 -0600 Received: from arkanoid.beaverton.ibm.com (arkanoid [127.0.0.1]) by arkanoid.beaverton.ibm.com (8.13.1/8.13.1/Debian-6) with ESMTP id i81M3R6a004197; Wed, 1 Sep 2004 22:03:27 GMT Received: (from aravamud@localhost) by arkanoid.beaverton.ibm.com (8.13.1/8.13.1/Debian-6) id i81M3RWP004194; Wed, 1 Sep 2004 22:03:27 GMT X-Authentication-Warning: arkanoid.beaverton.ibm.com: aravamud set sender to nacc@us.ibm.com using -f Date: Wed, 1 Sep 2004 22:03:26 +0000 From: Nishanth Aravamudan To: jt@hpl.hp.com Cc: netdev@oss.sgi.com, jgarzik@pobox.com, kj Subject: Re: [Kernel-janitors] Re: [patch 1/8] irda/act200l-sir: replace schedule_timeout() with msleep() Message-ID: 
<20040901220326.GB2516@us.ibm.com> References: <20040901210929.GA11442@bougret.hpl.hp.com> <20040901214003.GC7467@stro.at> <20040901214815.GA13071@bougret.hpl.hp.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20040901214815.GA13071@bougret.hpl.hp.com> X-Operating-System: Linux 2.6.73 (i686) User-Agent: Mutt/1.5.6+20040803i X-archive-position: 8333 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: nacc@us.ibm.com Precedence: bulk X-list: netdev Content-Length: 1508 Lines: 34 On Wed, Sep 01, 2004 at 02:48:15PM -0700, Jean Tourrilhes wrote: > On Wed, Sep 01, 2004 at 11:40:03PM +0200, maximilian attems wrote: > > On Wed, 01 Sep 2004, Jean Tourrilhes wrote: > > > > > On Wed, Sep 01, 2004 at 11:05:23PM +0200, janitor@sternwelten.at wrote: > > > > I would appreciate any comments from the janitor@sternweltens list. > > uups mangled some text there sorry for this silly email. > > > > > > I already commented that I don't like the confusing msleep() > > > API and I prefer the more explicit schedule_timeout(). > > > But that's only me... > > > > > > Jean > > > > hmm we have still archs were HZ < 100. > > i find msleep use msecs units a lot more readable than > > schedule_timeout((HZ + 99) / 100); > > > > the schedule_timeout(HZ/100) gets safely converted with msleep. > > I don't have complain about converting the (HZ + 99) / 100 > expressions to something saner. My beef is the fact that msleep hide > the fact that a schedule might happen. This is important in the IrDA > code. It *is* important for developers to realize that invoking msleep() may involve giving up the CPU (ie. eventually calling schedule()); however, I think my previous point, that the name itself (the "sleep" part, I mean) is a fair and clear indication of this behavior, is valid. 
In those cases where a busy-wait is desired, then mdelay() should be used, as indicated by "delay". I think with this in mind & with a quick glance at the source, if need be, the naming is quite safe. -Nish From max@stro.at Wed Sep 1 15:58:55 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Sep 2004 15:59:06 -0700 (PDT) Received: from baikonur.stro.at (baikonur.stro.at [213.239.196.228]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i81Mwsp0017551 for ; Wed, 1 Sep 2004 15:58:55 -0700 Received: from localhost (localhost [127.0.0.1]) by baikonur.stro.at (Postfix) with ESMTP id DFA3E5C008; Thu, 2 Sep 2004 00:58:43 +0200 (CEST) Received: from baikonur.stro.at ([127.0.0.1]) by localhost (baikonur [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 01686-10; Thu, 2 Sep 2004 00:58:43 +0200 (CEST) Received: from sputnik (M830P021.adsl.highway.telekom.at [62.47.135.181]) by baikonur.stro.at (Postfix) with ESMTP id 3AAC15C065; Thu, 2 Sep 2004 00:58:43 +0200 (CEST) Received: from max by sputnik with local (Exim 4.34) id 1C2e3u-0001e1-TX; Thu, 02 Sep 2004 00:58:43 +0200 Date: Thu, 2 Sep 2004 00:58:42 +0200 From: maximilian attems To: jt@hpl.hp.com Cc: netdev@oss.sgi.com, jgarzik@pobox.com, kj Subject: Re: [Kernel-janitors] Re: [patch 1/8] irda/act200l-sir: replace schedule_timeout() with msleep() Message-ID: <20040901225841.GF7467@stro.at> Mail-Followup-To: jt@hpl.hp.com, netdev@oss.sgi.com, jgarzik@pobox.com, kj References: <20040901210929.GA11442@bougret.hpl.hp.com> <20040901214003.GC7467@stro.at> <20040901214815.GA13071@bougret.hpl.hp.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20040901214815.GA13071@bougret.hpl.hp.com> User-Agent: Mutt/1.5.6+20040722i X-Virus-Scanned: by Amavis (ClamAV) at stro.at X-archive-position: 8334 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: janitor@sternwelten.at Precedence: bulk X-list: netdev 
Content-Length: 737 Lines: 26 On Wed, 01 Sep 2004, Jean Tourrilhes wrote: > On Wed, Sep 01, 2004 at 11:40:03PM +0200, maximilian attems wrote: > > On Wed, 01 Sep 2004, Jean Tourrilhes wrote: .. > > > > hmm we have still archs were HZ < 100. > > i find msleep use msecs units a lot more readable than > > schedule_timeout((HZ + 99) / 100); > > > > the schedule_timeout(HZ/100) gets safely converted with msleep. > > I don't have complain about converting the (HZ + 99) / 100 > expressions to something saner. My beef is the fact that msleep hide > the fact that a schedule might happen. This is important in the IrDA > code. sorry, my wording was confusing: (HZ + 99) / 100 is correct, just like msleep(10) -- maks kernel janitor http://janitor.kernelnewbies.org/ From acme@conectiva.com.br Wed Sep 1 19:46:25 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Sep 2004 19:46:31 -0700 (PDT) Received: from perninha.conectiva.com.br (perninha.conectiva.com.br [200.140.247.100]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i822kOCt024942 for ; Wed, 1 Sep 2004 19:46:25 -0700 Received: by perninha.conectiva.com.br (Postfix, from userid 568) id 5FC4147429; Wed, 1 Sep 2004 23:46:14 -0300 (BRT) Received: from burns.conectiva (burns.conectiva [10.0.0.4]) by perninha.conectiva.com.br (Postfix) with SMTP id B313F4796E for ; Wed, 1 Sep 2004 23:46:13 -0300 (BRT) Received: (qmail 31263 invoked by uid 0); 2 Sep 2004 03:43:59 -0000 Received: from mapi8.distro.conectiva (HELO oops.kerneljanitors.org) (10.0.16.10) by burns.conectiva with SMTP; 2 Sep 2004 03:43:59 -0000 Received: by oops.kerneljanitors.org (Postfix, from userid 500) id CA0F21464C; Wed, 1 Sep 2004 23:48:05 -0300 (BRT) Date: Wed, 1 Sep 2004 23:48:05 -0300 From: Arnaldo Carvalho de Melo To: "David S. Miller" Cc: Zhikui Chen , hadi@cyberus.ca, dccp@ietf.org, netdev@oss.sgi.com Subject: Re: HELP for dccp implementation.
Message-ID: <20040902024805.GA23844@conectiva.com.br> References: <412CC269.8080907@rus.uni-stuttgart.de> <1093454747.1034.85.camel@jzny.localdomain> <4135A32A.4030901@rus.uni-stuttgart.de> <20040901133709.3637d63d.davem@davemloft.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20040901133709.3637d63d.davem@davemloft.net> X-Url: http://advogato.org/person/acme User-Agent: Mutt/1.5.5.1i X-Bogosity: No, tests=bogofilter, spamicity=0.051111, version=0.16.3 X-archive-position: 8335 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: acme@conectiva.com.br Precedence: bulk X-list: netdev Content-Length: 882 Lines: 23 On Wed, Sep 01, 2004 at 01:37:09PM -0700, David S. Miller wrote: > On Wed, 01 Sep 2004 12:23:38 +0200 > Zhikui Chen wrote: > > > If I assign a value such as 0x ee9fbc00 to sk in dccp_rcv (before lookup > > calling), and comment lookkup calling, I get a error report from > > bh_lock_sock(sk) calling inside dccp_rcv, which error report is > > spin_is_locked on uninitialized spinlock ee9fbc00, and spin_lock > > (:ee9fbc00) already locked by /73. > > > > Do you know its reason? Thanks, > > Zhikui, are you working together with Arnaldo using his > code base, like we suggested to you? Or are you working > still on your own code? Dave, I've been quite busy lately with some other projects and haven't been able to collaborate with Zhikui; I hope to change this situation soon.
Regards, - Arnaldo From davem@davemloft.net Wed Sep 1 22:22:22 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Sep 2004 22:22:29 -0700 (PDT) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i825MM5j031276 for ; Wed, 1 Sep 2004 22:22:22 -0700 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1C2k2A-0005yV-00; Wed, 01 Sep 2004 22:21:18 -0700 Date: Wed, 1 Sep 2004 22:21:18 -0700 From: "David S. Miller" To: Herbert Xu Cc: shemminger@osdl.org, netdev@oss.sgi.com Subject: Re: neigh_create/inetdev_destroy race? Message-Id: <20040901222118.0ce4bcc6.davem@davemloft.net> In-Reply-To: <20040831104139.GA2124@gondor.apana.org.au> References: <20040814005411.GA18350@gondor.apana.org.au> <20040814012513.GA721@gondor.apana.org.au> <20040814013030.GA2042@gondor.apana.org.au> <20040814050848.GA11874@gondor.apana.org.au> <20040814062703.GA4806@gondor.apana.org.au> <20040815191450.77532d5d.davem@redhat.com> <20040816105131.GA11299@gondor.apana.org.au> <20040828234201.79556f6e.davem@davemloft.net> <20040829065031.GA786@gondor.apana.org.au> <20040830230820.7514985d.davem@davemloft.net> <20040831104139.GA2124@gondor.apana.org.au> X-Mailer: Sylpheed version 0.9.12 (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 8336 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 2207 Lines: 81 On Tue, 31 Aug 2004 20:41:39 +1000 Herbert Xu wrote: > > I think we can clear this by putting neigh_parms_release() into an > > RCU handler. 
It can't be in in_dev_rcu_put. > > Yes that should go a long way in resolving this problem. So here's the first step. No rcu_read_lock()'s are needed since the tbl->lock needs to be held as a write when traversing these things anyways for other reasons. Can you work on the next bit you mentioned, making sure the corresponding idev is still alive when we add a neighbour with its neigh_parms to the hash table? Thanks. ===== include/net/neighbour.h 1.8 vs edited ===== --- 1.8/include/net/neighbour.h 2004-08-16 14:10:51 -07:00 +++ edited/include/net/neighbour.h 2004-09-01 21:57:37 -07:00 @@ -46,6 +46,7 @@ #include #include #include +#include #include #include @@ -65,6 +66,8 @@ void *priv; void *sysctl_table; + + struct rcu_head rcu_head; int base_reachable_time; int retrans_time; ===== net/core/neighbour.c 1.28 vs edited ===== --- 1.28/net/core/neighbour.c 2004-04-29 16:26:35 -07:00 +++ edited/net/core/neighbour.c 2004-09-01 22:00:59 -07:00 @@ -1120,6 +1120,7 @@ if (p) { memcpy(p, &tbl->parms, sizeof(*p)); p->tbl = tbl; + INIT_RCU_HEAD(&p->rcu_head); p->reachable_time = neigh_rand_reach_time(p->base_reachable_time); if (dev && dev->neigh_setup && dev->neigh_setup(dev, p)) { @@ -1135,6 +1136,14 @@ return p; } +static void neigh_rcu_free_parms(struct rcu_head *head) +{ + struct neigh_parms *parms = + container_of(head, struct neigh_parms, rcu_head); + + kfree(parms); +} + void neigh_parms_release(struct neigh_table *tbl, struct neigh_parms *parms) { struct neigh_parms **p; @@ -1146,7 +1155,7 @@ if (*p == parms) { *p = parms->next; write_unlock_bh(&tbl->lock); - kfree(parms); + call_rcu(&parms->rcu_head, neigh_rcu_free_parms); return; } } @@ -1159,6 +1168,7 @@ { unsigned long now = jiffies; + INIT_RCU_HEAD(&tbl->parms.rcu_head); tbl->parms.reachable_time = neigh_rand_reach_time(tbl->parms.base_reachable_time); From davem@redhat.com Wed Sep 1 22:27:33 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Sep 2004 22:27:38 -0700 (PDT) Received: from mx1.redhat.com 
(mx1.redhat.com [66.187.233.31]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i825RWEM031668 for ; Wed, 1 Sep 2004 22:27:32 -0700 Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.12.10/8.12.10) with ESMTP id i825RES0025351; Thu, 2 Sep 2004 01:27:19 -0400 Received: from devserv.devel.redhat.com (devserv.devel.redhat.com [172.16.58.1]) by int-mx1.corp.redhat.com (8.11.6/8.11.6) with ESMTP id i825R9313941; Thu, 2 Sep 2004 01:27:09 -0400 Received: from cheetah.davemloft.net (localhost.localdomain [127.0.0.1]) by devserv.devel.redhat.com (8.12.11/8.12.10) with SMTP id i825R1hU011844; Thu, 2 Sep 2004 01:27:01 -0400 Date: Wed, 1 Sep 2004 22:26:24 -0700 From: "David S. Miller" To: Andi Kleen Cc: netdev@oss.sgi.com Subject: Re: [PATCH] Fix CONFIG_COMPAT with !CONFIG_NET Message-Id: <20040901222624.31205ef5.davem@redhat.com> In-Reply-To: References: X-Mailer: Sylpheed version 0.9.12 (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 8337 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 394 Lines: 12 On Tue, 31 Aug 2004 12:06:30 +0200 Andi Kleen wrote: > Fix compilation with CONFIG_COMPAT set and CONFIG_NET disabled. I like this patch, but... > -static int ret_einval(unsigned int fd, unsigned int cmd, unsigned long arg) > +static __attribute__((used)) int > +ret_einval(unsigned int fd, unsigned int cmd, unsigned long arg) Ahem... use something in linux/compiler.h ok? 
:) From davem@redhat.com Wed Sep 1 22:34:09 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Sep 2004 22:34:14 -0700 (PDT) Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i825Y85k032101 for ; Wed, 1 Sep 2004 22:34:09 -0700 Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.12.10/8.12.10) with ESMTP id i825XqS0026489; Thu, 2 Sep 2004 01:33:52 -0400 Received: from devserv.devel.redhat.com (devserv.devel.redhat.com [172.16.58.1]) by int-mx1.corp.redhat.com (8.11.6/8.11.6) with ESMTP id i825Xl314921; Thu, 2 Sep 2004 01:33:47 -0400 Received: from cheetah.davemloft.net (localhost.localdomain [127.0.0.1]) by devserv.devel.redhat.com (8.12.11/8.12.10) with SMTP id i825XdiV013008; Thu, 2 Sep 2004 01:33:39 -0400 Date: Wed, 1 Sep 2004 22:33:01 -0700 From: "David S. Miller" To: Andi Kleen Cc: kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com, akepner@sgi.com Subject: Re: [PATCH] Extend lock less TX to real devices Message-Id: <20040901223301.1a8d97a8.davem@redhat.com> In-Reply-To: References: X-Mailer: Sylpheed version 0.9.12 (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 8338 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 1126 Lines: 32 On Tue, 31 Aug 2004 14:38:20 +0200 Andi Kleen wrote: > This patch extends the recently added NETIF_F_LLTX to real devices. Well, it does a lot of other things too. > I added support for trylocking instead of spinning like sch_generic > does - for that the driver has to return -1, then the packet is requeued. 
> The check for a local device deadlock is lost for this case, > but that doesn't seem to be a big loss (I've never seen this printk > ever get triggered) It is triggerable if you misconfigure your system. I'm totally against this change, because previously at least the user would find out in their logs. With your change the system explodes looping with no explanation why. > The patch looks bigger than it really is because i moved some code > around and converted the macros into inlines. .. > I also did an additional micro optimization: And for this reason you need to split this patch up. I would recommend: patch 1) Change macros into inlines patch 2) local_bh_disable() preemption count optimization patch 3) support for F_LLTX on real devices patch 4) locking changes Thanks Andi. From davem@redhat.com Wed Sep 1 22:42:12 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Sep 2004 22:42:18 -0700 (PDT) Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i825gCc2032568 for ; Wed, 1 Sep 2004 22:42:12 -0700 Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.12.10/8.12.10) with ESMTP id i825fsS0028432; Thu, 2 Sep 2004 01:41:54 -0400 Received: from devserv.devel.redhat.com (devserv.devel.redhat.com [172.16.58.1]) by int-mx1.corp.redhat.com (8.11.6/8.11.6) with ESMTP id i825fs316499; Thu, 2 Sep 2004 01:41:54 -0400 Received: from cheetah.davemloft.net (localhost.localdomain [127.0.0.1]) by devserv.devel.redhat.com (8.12.11/8.12.10) with SMTP id i825fjLG015242; Thu, 2 Sep 2004 01:41:46 -0400 Date: Wed, 1 Sep 2004 22:41:08 -0700 From: "David S. 
Miller" To: vatsa@in.ibm.com Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org, dipankar@in.ibm.com, paulmck@us.ibm.com Subject: Re: [RFC] Use RCU for tcp_ehash lookup Message-Id: <20040901224108.3b2d692d.davem@redhat.com> In-Reply-To: <20040831125941.GA5534@in.ibm.com> References: <20040831125941.GA5534@in.ibm.com> X-Mailer: Sylpheed version 0.9.12 (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 8339 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 3453 Lines: 71 On Tue, 31 Aug 2004 18:29:41 +0530 Srivatsa Vaddagiri wrote: > Some notes on the patch: > > - Although readprofile shows improvement in tick count for > __tcp_v4_lookup_established, I haven't come across any benchmarks that is > benefited noticeably by the lock-free lookup. I have tried httperf, netperf > and simple file transfer tests so far. > > This could possibly be because the hash table size on the machines I was > testing was high (tcp_ehash_size = 128K), leading to low contention rate on > the hash bucket locks. Also because of the fact that lookup could happen in > parallel to socket input packet processing. > > I would be interested to know if anyone has seen high-rate of lock contention > for hash bucket lock. Such workloads would benefit from the lock-free lookup. The reason you don't see any improvement is that the ehash table is pretty write heavy. I'm not totally against your patch, I just don't think that the TCP established hash table qualifies as "read heavy" as per what RCU is truly effective for. 
> - I presume that one of the reasons for keeping the hash table so big is to > keep lock contention low (& to reduce the size of hash chains). If the lookup > is made lock-free, then could the size of the hash table be reduced (without > adversely impacting performance)? It's large so that the hash itself is effective, not for locking reasons. > - Biggest problem I had converting over to RCU was the refcount race between > sock_put and sock_hold. sock_put might see the refcount go to zero and decide > to free the object, while on some other CPU, sock_get's are pending against > the same object. The patch handles the race by deciding to free the object > only from the RCU callback. That's exactly what I was concerned about when I saw that you had attempted this change. It is incredibly important for state changes and updates to be seen as atomic by the packet input processing engine. It would be illegal for a cpu running TCP input to see a socket in two tables at the same time (for example, in the main established area and in the second half for TIME_WAIT buckets). If the visibility of the socket is wrong, sockets could be erroneously be reset during the transition from established to TIME_WAIT state. Beware! > - Socket table lookups that happens thr', say /proc/net/tcp or tcpdiag_dump, is > not lock-free yet. This is because of movement of socket performed in > __tcp_tw_hashdance, between established half to time-wait half. > There is a window during this movement, when the same socket is present > on both time-wait half as well as established half. I felt that it is not > good to have /proc/net/tcp report two instances of the same socket. Hence > I resorted to have /proc/net/tcp and tcpdiag_dump doing the lookup using > a spinlock. /proc/net/tcp should simply not be used by people, we have the netlink interface to get socket listings which actually scales. Leaving /proc/net/tcp readable on servers with real users is a DoS waiting to happen. 
> Note that __tcp_v4_lookup_established should not be affected by the above > movement because I found it scans the established half first and _then_ the > time wait half. So even if the same socket is present in both established half > and time wait half, __tcp_v4_lookup_established will lookup only one of them > (& not both). I hope this is true. From davem@davemloft.net Wed Sep 1 22:44:00 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Sep 2004 22:44:06 -0700 (PDT) Received: from smtp110.mail.sc5.yahoo.com (smtp110.mail.sc5.yahoo.com [66.163.170.8]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id i825hxhA000393 for ; Wed, 1 Sep 2004 22:44:00 -0700 Received: from unknown (HELO cheetah.davemloft.net) (davem?330@63.197.226.105 with login) by smtp110.mail.sc5.yahoo.com with SMTP; 2 Sep 2004 05:43:51 -0000 Date: Wed, 1 Sep 2004 22:43:06 -0700 From: "David S. Miller" To: Andi Kleen Cc: vatsa@in.ibm.com, davem@redhat.com, netdev@oss.sgi.com, linux-kernel@vger.kernel.org, dipankar@in.ibm.com, paulmck@us.ibm.com Subject: Re: [RFC] Use RCU for tcp_ehash lookup Message-Id: <20040901224306.7dd80458.davem@davemloft.net> In-Reply-To: <20040831135419.GA17642@wotan.suse.de> References: <20040831125941.GA5534@in.ibm.com> <20040831135419.GA17642@wotan.suse.de> Organization: DaveM Loft Enterprises X-Mailer: Sylpheed version 0.9.12 (GTK+ 1.2.10; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 8340 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 760 Lines: 19 On Tue, 31 Aug 2004 15:54:20 +0200 Andi Kleen wrote: > And it should also fix the performance problems with > cat /proc/net/tcp on ppc64/ia64 for large hash tables because the rw locks > are gone. Time to convert netstat et al. over the netlink too. 
> > - I presume that one of the reasons for keeping the hash table so big is to > > keep lock contention low (& to reduce the size of hash chains). If the lookup > > is made lock-free, then could the size of the hash table be reduced (without > > adversely impacting performance)? > > Definitely worth trying IMHO. The current hash tables are far > too big. I would do that as followon patches though. The hashes are big to make the hash effective, not to help the locking contention. From davem@davemloft.net Wed Sep 1 22:46:39 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Sep 2004 22:46:44 -0700 (PDT) Received: from smtp108.mail.sc5.yahoo.com (smtp108.mail.sc5.yahoo.com [66.163.170.6]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id i825kdsQ000774 for ; Wed, 1 Sep 2004 22:46:39 -0700 Received: from unknown (HELO cheetah.davemloft.net) (davem?330@63.197.226.105 with login) by smtp108.mail.sc5.yahoo.com with SMTP; 2 Sep 2004 05:46:31 -0000 Date: Wed, 1 Sep 2004 22:45:46 -0700 From: "David S. 
Miller" To: vatsa@in.ibm.com Cc: ak@suse.de, davem@redhat.com, netdev@oss.sgi.com, linux-kernel@vger.kernel.org, dipankar@in.ibm.com, paulmck@us.ibm.com Subject: Re: [RFC] Use RCU for tcp_ehash lookup Message-Id: <20040901224546.03765c8d.davem@davemloft.net> In-Reply-To: <20040901113641.GA3918@in.ibm.com> References: <20040831125941.GA5534@in.ibm.com> <20040831135419.GA17642@wotan.suse.de> <20040901113641.GA3918@in.ibm.com> Organization: DaveM Loft Enterprises X-Mailer: Sylpheed version 0.9.12 (GTK+ 1.2.10; sparc-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 8341 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev Content-Length: 1484 Lines: 34 On Wed, 1 Sep 2004 17:06:41 +0530 Srivatsa Vaddagiri wrote: > On Tue, Aug 31, 2004 at 03:54:20PM +0200, Andi Kleen wrote: > > I bet also when you just do rdtsc timing for the TCP receive > > path the cycle numbers will be way down (excluding the copy). > > I got cycle numbers for the lookup routine (with CONFIG_PREEMPT turned off). > They were taken on a 900MHz 8way Intel P3 SMP box. The results are as below: > > > ------------------------------------------------------------------------------- > | 2.6.8.1 | 2.6.8.1 + my patch > ------------------------------------------------------------------------------- > Average cycles | | > spent in | | > __tcp_v4_lookup_established | 2970.65 | 668.227 > | (~3.3 micro-seconds) | (~0.74 microseconds) > ------------------------------------------------------------------------------- > > This repesents improvement by a factor of 77.5%! And yet none of your benchmarks show noticable improvements, which means that this micro-measurement is totally unimportant in the grand scheme of things as far as we know. 
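For reference, the figures in the quoted table can be re-derived with a few lines of C. This is only a sketch: the helper names are hypothetical and the 900 MHz clock is taken from the quoted description of the test box. It suggests the "77.5%" is the reduction in cycles, which corresponds to a speedup of roughly 4.4x for this one function.

```c
#include <assert.h>

/* Hypothetical helpers for checking the quoted numbers; not kernel code. */

/* Convert a cycle count to microseconds for a clock rate given in MHz. */
double cycles_to_usecs(double cycles, double mhz)
{
	assert(mhz > 0.0);
	return cycles / mhz;
}

/* Fractional reduction from 'before' to 'after' (0.775 means 77.5% fewer). */
double reduction(double before, double after)
{
	assert(before > 0.0);
	return (before - after) / before;
}

/* Ratio of the old cost to the new cost. */
double speedup(double before, double after)
{
	assert(after > 0.0);
	return before / after;
}
```

With the table's values, cycles_to_usecs(2970.65, 900) is about 3.3 and cycles_to_usecs(668.227, 900) is about 0.74, matching the quoted microsecond figures; reduction(2970.65, 668.227) is about 0.775 and speedup(2970.65, 668.227) is about 4.45.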
I'm not adding in a patch that merely provides some micro-measurement improvement that someone can do a shamans dance over. :) If we're going to add this new level of complexity to the TCP code we need to see some real usage performance improvement, not just something that shows up when we put a microscope on a single function. From margitsw@t-online.de Thu Sep 2 00:02:33 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Sep 2004 00:02:45 -0700 (PDT) Received: from mailout07.sul.t-online.com (mailout07.sul.t-online.com [194.25.134.83]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i8272W1O002697 for ; Thu, 2 Sep 2004 00:02:33 -0700 Received: from fwd00.aul.t-online.de by mailout07.sul.t-online.com with smtp id 1C2lbz-0003TC-04; Thu, 02 Sep 2004 09:02:23 +0200 Received: from roglap.local (ZZMK7oZb8e9+xkugoAKd-ri8Nbyr6YNpR3taY7+k50x-7OKsWGqCcl@[217.224.24.102]) by fwd00.sul.t-online.com with esmtp id 1C2lbn-2FEFIe0; Thu, 2 Sep 2004 09:02:11 +0200 From: margitsw@t-online.de (Margit Schubert-While) To: netdev@oss.sgi.com Subject: Re: [patch 8/8] prism54/islpci_dev: replace schedule_timeout() with msleep() Date: Thu, 2 Sep 2004 08:50:38 +0200 User-Agent: KMail/1.5.4 Cc: janitor@sternwelten.at MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200409020850.38177.margitsw@t-online.de> X-ID: ZZMK7oZb8e9+xkugoAKd-ri8Nbyr6YNpR3taY7+k50x-7OKsWGqCcl X-TOI-MSGID: 750c7a0a-8598-42e0-be04-32caec994a4a X-archive-position: 8342 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: margitsw@t-online.de Precedence: bulk X-list: netdev Content-Length: 306 Lines: 10 I agree with Jean and add the following : You are assuming HZ = 1000. In 2.4, HZ = 100 (And in 2.6, HZ is not necessarily = 1000). The prism54 code base is identical between 2.4/2.6 and is maintained as such in the project. 
Therefore, I look forward to your implementation of msleep() in 2.4 ;-) Margit From max@stro.at Thu Sep 2 01:24:44 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Sep 2004 01:24:49 -0700 (PDT) Received: from baikonur.stro.at (baikonur.stro.at [213.239.196.228]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i828OhMH008902 for ; Thu, 2 Sep 2004 01:24:44 -0700 Received: from localhost (localhost [127.0.0.1]) by baikonur.stro.at (Postfix) with ESMTP id 569FE5C065; Thu, 2 Sep 2004 10:24:33 +0200 (CEST) Received: from baikonur.stro.at ([127.0.0.1]) by localhost (baikonur [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 11776-07; Thu, 2 Sep 2004 10:24:32 +0200 (CEST) Received: from sputnik (M965P030.adsl.highway.telekom.at [62.47.152.158]) by baikonur.stro.at (Postfix) with ESMTP id B5D775C008; Thu, 2 Sep 2004 10:24:32 +0200 (CEST) Received: from max by sputnik with local (Exim 4.34) id 1C2mtV-00012b-A6; Thu, 02 Sep 2004 10:24:33 +0200 Date: Thu, 2 Sep 2004 10:24:33 +0200 From: maximilian attems To: Margit Schubert-While Cc: netdev@oss.sgi.com, kj Subject: Re: [patch 8/8] prism54/islpci_dev: replace schedule_timeout() with msleep() Message-ID: <20040902082432.GA1876@stro.at> Mail-Followup-To: Margit Schubert-While , netdev@oss.sgi.com, kj References: <200409020850.38177.margitsw@t-online.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <200409020850.38177.margitsw@t-online.de> User-Agent: Mutt/1.5.6+20040722i X-Virus-Scanned: by Amavis (ClamAV) at stro.at X-archive-position: 8343 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: janitor@sternwelten.at Precedence: bulk X-list: netdev Content-Length: 724 Lines: 24 On Thu, 02 Sep 2004, Margit Schubert-While wrote: > I agree with Jean and add the following : > You are assuming HZ = 1000. > In 2.4, HZ = 100 (And in 2.6, HZ is not necessarily = 1000). 
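The HZ point quoted above can be illustrated with a small userspace sketch in C. The helpers below are hypothetical and are not the kernel's msecs_to_jiffies(); they only assume that one tick lasts 1000/HZ milliseconds and that a millisecond delay should round up to whole ticks.

```c
#include <assert.h>

/*
 * Hypothetical helper (not the kernel implementation): convert a delay
 * in milliseconds to timer ticks at tick rate 'hz', rounding up so the
 * delay is never shorter than requested.
 */
unsigned long ms_to_ticks(unsigned int msecs, unsigned int hz)
{
	assert(hz > 0);
	return ((unsigned long)msecs * hz + 999) / 1000;
}

/* How long a fixed tick count lasts, in milliseconds, at tick rate 'hz'. */
unsigned long ticks_to_ms(unsigned long ticks, unsigned int hz)
{
	assert(hz > 0);
	return ticks * 1000 / hz;
}
```

Under these assumptions a hard-coded timeout of 50 ticks lasts ticks_to_ms(50, 1000) = 50 ms with HZ = 1000 but ticks_to_ms(50, 100) = 500 ms with HZ = 100, whereas a 50 ms request maps to ms_to_ticks(50, 1000) = 50 ticks or ms_to_ticks(50, 100) = 5 ticks.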
shure, but this is an argument for msleep as it uses msecs as unit, which don't depend on your arch. (well physics says different for arch/relativistic :) > The prism54 code base is identical between 2.4/2.6 and is > maintained as such in the project. > Therefore, I look forward to your implementation of msleep() > in 2.4 ;-) 2.4 is closed for such stuff, you'll know that better than me, it shouldn't hinder 2.6 in it's progression. one day you will need to branch. -- maks kernel janitor http://janitor.kernelnewbies.org/ From margitsw@t-online.de Thu Sep 2 02:47:59 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Sep 2004 02:48:04 -0700 (PDT) Received: from mailout10.sul.t-online.com (mailout10.sul.t-online.com [194.25.134.21]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i829lwLE013269 for ; Thu, 2 Sep 2004 02:47:59 -0700 Received: from fwd09.aul.t-online.de by mailout10.sul.t-online.com with smtp id 1C2oC6-0000xj-01; Thu, 02 Sep 2004 11:47:50 +0200 Received: from roglap.local (SUHUzvZ-8eXCHB3xcEwbuSp0Mp4mYoutEzX8sHfg8jiJjYa1rVMWw+@[217.255.115.148]) by fwd09.sul.t-online.com with esmtp id 1C2oBr-0hwnce0; Thu, 2 Sep 2004 11:47:35 +0200 From: margitsw@t-online.de (Margit Schubert-While) To: netdev@oss.sgi.com Subject: Re: [patch 8/8] prism54/islpci_dev: replace schedule_timeout() with msleep() Date: Thu, 2 Sep 2004 11:35:57 +0200 User-Agent: KMail/1.5.4 Cc: janitor@sternwelten.at MIME-Version: 1.0 Content-Disposition: inline Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <200409021135.57632.margitsw@t-online.de> X-ID: SUHUzvZ-8eXCHB3xcEwbuSp0Mp4mYoutEzX8sHfg8jiJjYa1rVMWw+ X-TOI-MSGID: 5ef338f1-34c6-4bd0-8378-af70bc838370 X-archive-position: 8344 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: margitsw@t-online.de Precedence: bulk X-list: netdev Content-Length: 237 Lines: 8 On Thu, 02 Sep 2004, Maximilian scribeth: > it shouldn't hinder 
2.6 in it's progression. I consider this a regression. As schedule_timeout is used elesewhere in the prism54 code, we are using a consistent and documented method. Margit From max@stro.at Thu Sep 2 03:03:32 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Sep 2004 03:03:38 -0700 (PDT) Received: from baikonur.stro.at (baikonur.stro.at [213.239.196.228]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i82A3WFo014123 for ; Thu, 2 Sep 2004 03:03:32 -0700 Received: from localhost (localhost [127.0.0.1]) by baikonur.stro.at (Postfix) with ESMTP id E87C65C065; Thu, 2 Sep 2004 12:03:21 +0200 (CEST) Received: from baikonur.stro.at ([127.0.0.1]) by localhost (baikonur [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 09880-07; Thu, 2 Sep 2004 12:03:21 +0200 (CEST) Received: from sputnik (M965P030.adsl.highway.telekom.at [62.47.152.158]) by baikonur.stro.at (Postfix) with ESMTP id 71EDB5C008; Thu, 2 Sep 2004 12:03:21 +0200 (CEST) Received: from max by sputnik with local (Exim 4.34) id 1C2oR8-0001sT-5r; Thu, 02 Sep 2004 12:03:22 +0200 Date: Thu, 2 Sep 2004 12:03:22 +0200 From: maximilian attems To: Margit Schubert-While Cc: netdev@oss.sgi.com, kj Subject: Re: [patch 8/8] prism54/islpci_dev: replace schedule_timeout() with msleep() Message-ID: <20040902100322.GD1876@stro.at> Mail-Followup-To: Margit Schubert-While , netdev@oss.sgi.com, kj References: <200409021135.57632.margitsw@t-online.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <200409021135.57632.margitsw@t-online.de> User-Agent: Mutt/1.5.6+20040722i X-Virus-Scanned: by Amavis (ClamAV) at stro.at X-archive-position: 8345 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: janitor@sternwelten.at Precedence: bulk X-list: netdev Content-Length: 460 Lines: 16 On Thu, 02 Sep 2004, Margit Schubert-While wrote: > On Thu, 02 Sep 2004, Maximilian scribeth: > > it shouldn't hinder 2.6 in it's 
progression. > I consider this a regression. > As schedule_timeout is used elesewhere in the prism54 code, > we are using a consistent and documented method. you didn't answer to the unit argument in favour of msleep. shure msleep is also consistent and documented. -- maks kernel janitor http://janitor.kernelnewbies.org/ From margitsw@t-online.de Thu Sep 2 03:33:38 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Sep 2004 03:33:44 -0700 (PDT) Received: from mailout05.sul.t-online.com (mailout05.sul.t-online.com [194.25.134.82]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i82AXcn7016072 for ; Thu, 2 Sep 2004 03:33:38 -0700 Received: from fwd01.aul.t-online.de by mailout05.sul.t-online.com with smtp id 1C2ouH-0003kY-05; Thu, 02 Sep 2004 12:33:29 +0200 Received: from roglap.local (Tnz5qYZF8e9457pYQieTTeEIqQdc7lGj+UHe6PaeG+V07c5IS8Ns4h@[217.224.28.77]) by fwd01.sul.t-online.com with esmtp id 1C2ouE-1a48OG0; Thu, 2 Sep 2004 12:33:26 +0200 From: margitsw@t-online.de (Margit Schubert-While) To: netdev@oss.sgi.com Subject: Re: [patch 8/8] prism54/islpci_dev: replace schedule_timeout() with msleep() Date: Thu, 2 Sep 2004 12:21:42 +0200 User-Agent: KMail/1.5.4 Cc: janitor@sternwelten.at MIME-Version: 1.0 Content-Disposition: inline Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <200409021221.42287.margitsw@t-online.de> X-ID: Tnz5qYZF8e9457pYQieTTeEIqQdc7lGj+UHe6PaeG+V07c5IS8Ns4h X-TOI-MSGID: 68c3e6c7-4a05-4969-9a8a-cb9a86876b9b X-archive-position: 8346 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: margitsw@t-online.de Precedence: bulk X-list: netdev On Thu, 02 Sep 2004, Maximilian scribeth: > you didn't answer to the unit argument in favour of msleep. 
Don't need to, here's msleep -

void msleep(unsigned int msecs)
{
	unsigned long timeout = msecs_to_jiffies(msecs);

	while (timeout) {
		set_current_state(TASK_UNINTERRUPTIBLE);
		timeout = schedule_timeout(timeout);
	}
}

In other words, with the subtle exception of the while loop, it reconstitutes the original code. (Although m_to_j doesn't even exactly do that) (And note because of the while loop, this may not be what the author intended)

> shure msleep is also consistent and documented.

grep -r Documentation = Nix

Margit

From margitsw@t-online.de Thu Sep 2 04:14:15 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Sep 2004 04:14:23 -0700 (PDT) Received: from mailout09.sul.t-online.com (mailout09.sul.t-online.com [194.25.134.84]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i82BEEAT021128 for ; Thu, 2 Sep 2004 04:14:15 -0700 Received: from fwd08.aul.t-online.de by mailout09.sul.t-online.com with smtp id 1C2pXZ-0001mb-05; Thu, 02 Sep 2004 13:14:05 +0200 Received: from roglap.local (ZYwoviZSreF-pdqRDHwkJ74gIjgQ8CI3ZTOcIys3IeGfg8e2b7Hmcy@[80.128.222.2]) by fwd08.sul.t-online.com with esmtp id 1C2pXR-0VU6C00; Thu, 2 Sep 2004 13:13:57 +0200 From: margitsw@t-online.de (Margit Schubert-While) To: netdev@oss.sgi.com Subject: Re: [patch 01/16] __FUNCTION__ string concatenation Date: Thu, 2 Sep 2004 13:02:22 +0200 User-Agent: KMail/1.5.4 Cc: janitor@sternwelten.at MIME-Version: 1.0 Content-Disposition: inline Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <200409021302.22823.margitsw@t-online.de> X-ID: ZYwoviZSreF-pdqRDHwkJ74gIjgQ8CI3ZTOcIys3IeGfg8e2b7Hmcy X-TOI-MSGID: 9470ae78-2f8a-4cc5-a808-7cfeabfe9e97 X-archive-position: 8347 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: margitsw@t-online.de Precedence: bulk X-list: netdev

On Thu, 02 Sep 2004, Maximilian scribeth:
> drivers/net/wireless/prism65/islpci_mgt.h, as suggested

Woo, we got another project
? (prism65) > I don't have the hardware to do a run-time check. > It should not pose any problems though. Oh, I think we can safely say that, especially as TRACE is nowhere referenced ;-) (And therefore the patch is unnecessary) Margit From herbert@gondor.apana.org.au Thu Sep 2 06:06:33 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Sep 2004 06:06:41 -0700 (PDT) Received: from arnor.apana.org.au (mail@arnor.apana.org.au [203.14.152.115]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i82D6WjR004015 for ; Thu, 2 Sep 2004 06:06:33 -0700 Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1C2rI2-0002ob-00; Thu, 02 Sep 2004 23:06:10 +1000 Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1C2rHx-0008TX-00; Thu, 02 Sep 2004 23:06:05 +1000 Date: Thu, 2 Sep 2004 23:06:05 +1000 To: "David S. Miller" Cc: shemminger@osdl.org, netdev@oss.sgi.com Subject: Re: neigh_create/inetdev_destroy race? Message-ID: <20040902130605.GA32570@gondor.apana.org.au> References: <20040814013030.GA2042@gondor.apana.org.au> <20040814050848.GA11874@gondor.apana.org.au> <20040814062703.GA4806@gondor.apana.org.au> <20040815191450.77532d5d.davem@redhat.com> <20040816105131.GA11299@gondor.apana.org.au> <20040828234201.79556f6e.davem@davemloft.net> <20040829065031.GA786@gondor.apana.org.au> <20040830230820.7514985d.davem@davemloft.net> <20040831104139.GA2124@gondor.apana.org.au> <20040901222118.0ce4bcc6.davem@davemloft.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20040901222118.0ce4bcc6.davem@davemloft.net> User-Agent: Mutt/1.5.6+20040722i From: Herbert Xu X-archive-position: 8348 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev On Wed, Sep 01, 2004 at 10:21:18PM -0700, David S. 
Miller wrote: > > So here's the first step. No rcu_read_lock()'s are needed > since the tbl->lock needs to be held as a write when > traversing these things anyways for other reasons. Thanks. > Can you work on the next bit you mentioned, making > sure the corresponding idev is still alive when we add > a neighbour with its neigh_parms to the hash table? Sure. Actually I prefer to do it by ref counting neigh_parms directly. I'll send you a patch soon. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt From khc@pm.waw.pl Thu Sep 2 07:06:43 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Sep 2004 07:06:55 -0700 (PDT) Received: from inx.pm.waw.pl (IDENT:1YwU6gXiFp5E39BXnhw2yBkLOvILUoL1@inx.pm.waw.pl [195.116.170.20]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i82E6gHA015644 for ; Thu, 2 Sep 2004 07:06:43 -0700 Received: from defiant.pm.waw.pl (pk124.warszawa.cvx.ppp.tpnet.pl [213.76.106.124]) by inx.pm.waw.pl (Postfix) with ESMTP id 9EA61E0E2; Thu, 2 Sep 2004 16:05:08 +0200 (CEST) Received: by defiant.pm.waw.pl (Postfix, from userid 500) id 40E49302D6; Thu, 2 Sep 2004 14:27:57 +0200 (CEST) To: Jeff Garzik Cc: Subject: [PATCH][REPOST] 21143 Tulip problems with 10BaseT References: From: Krzysztof Halasa Date: Thu, 02 Sep 2004 14:27:57 +0200 In-Reply-To: (Krzysztof Halasa's message of "Fri, 28 May 2004 14:21:57 +0200") Message-ID: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-archive-position: 8349 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: khc@pm.waw.pl Precedence: bulk X-list: netdev Hi, Looking at my kernel tree I noticed this patch has not been applied. I'm currently unable to test it, but it was required in the past and I don't think the situation has changed. Please apply. Thanks. 
--- linux-2.6/drivers/net/tulip/21142.c 19 Mar 2004 15:01:15 -0000 +++ linux-2.6/drivers/net/tulip/21142.c 2 Sep 2004 12:26:31 -0000 @@ -149,11 +149,13 @@ else if (negotiated & 0x0080) dev->if_port = 3; else if (negotiated & 0x0040) dev->if_port = 4; else if (negotiated & 0x0020) dev->if_port = 0; - else { + else if ((csr12 & 2) == 0 && (tp->sym_advertise & 0x0180)) + dev->if_port = 3; + else if ((csr12 & 4) == 0 && (tp->sym_advertise & 0x0060)) + dev->if_port = 0; + else tp->nwayset = 0; - if ((csr12 & 2) == 0 && (tp->sym_advertise & 0x0180)) - dev->if_port = 3; - } + tp->full_duplex = (tulip_media_cap[dev->if_port] & MediaAlwaysFD) ? 1:0; if (tulip_debug > 1) { My original mail: From: Krzysztof Halasa Subject: [PATCH] 21143 Tulip problems with 10BaseT To: Jeff Garzik Cc: Date: Fri, 28 May 2004 14:21:57 +0200 Current 2.4 and 2.6 kernels have problems with Tulip 21143 on 10BaseT link (with not-NWay-capable 10 Mbps peer) - tulip_debug shows: Linux Tulip driver version 1.1.13 (May 11, 2002) tulip0: EEPROM default media type Autosense. tulip0: Index #0 - Media 10baseT (#0) described by a 21142 Serial PHY (2) block. tulip0: Index #1 - Media 10baseT-FDX (#4) described by a 21142 Serial PHY (2) block. tulip0: Index #2 - Media 100baseTx (#3) described by a 21143 SYM PHY (4) block. tulip0: Index #3 - Media 100baseTx-FDX (#5) described by a 21143 SYM PHY (4) block. eth2: Digital DS21143 Tulip rev 48 at 0xe400, 00:C0:CA:13:48:10, IRQ 9. eth2: Restarting 21143 autonegotiation, csr14=0003ffff. eth2: tulip_up(), irq==9. eth2: Restarting 21143 autonegotiation, csr14=0003ffff. eth2: Done tulip_up(), CSR0 ffa08000, CSR5 f0760000 CSR6 b2422202. eth2: interrupt csr5=0xf0670004 new csr5=0xf0660000. eth2: exiting interrupt, csr5=0xf0660000. eth2: interrupt csr5=0xf0670004 new csr5=0xf0660000. eth2: exiting interrupt, csr5=0xf0660000. eth2: interrupt csr5=0xf0670004 new csr5=0xf0660000. eth2: exiting interrupt, csr5=0xf0660000. eth2: interrupt csr5=0xf0668010 new csr5=0xf0660000. 
eth2: 21143 link status interrupt 000050ca, CSR5 f0668010, fffbffff. ^^^^^^^^ = got NLP but no FLP (no NWay) eth2: Autonegotiation failed, using 10baseT, link beat status 50ca. ... while we know the peer is using NLP and therefore is 10BaseT-HD capable. eth2: 21143 non-MII 10baseT transceiver control 08af/00a5. eth2: Setting CSR15 to 08af0008/00a50008. eth2: Using media type 10baseT, CSR12 is c6. eth2: Setting CSR6 82420000/b2422002 CSR12 000010c6. eth2: exiting interrupt, csr5=0xf0660000. eth2: 21143 negotiation status 000010c6, 10baseT. eth2: 21143 negotiation failed, status 000010c6. eth2: Testing new 21143 media 100baseTx. eth2: interrupt csr5=0xf0008102 new csr5=0xf0000000. eth2: The transmitter stopped. CSR5 is f0008102, CSR6 b2420000, new CSR6 83860000. eth2: interrupt csr5=0xf0670004 new csr5=0xf0660000. eth2: exiting interrupt, csr5=0xf0660000. eth2: 21143 negotiation status 000000c6, 100baseTx. eth2: No 21143 100baseTx link beat, 000000c6, trying NWay. eth2: Restarting 21143 autonegotiation, csr14=0003ffff. eth2: interrupt csr5=0xf0008102 new csr5=0xf0000000. eth2: The transmitter stopped. CSR5 is f0008102, CSR6 b2420200, new CSR6 82420200. eth2: interrupt csr5=0xf0670004 new csr5=0xf0660000. eth2: exiting interrupt, csr5=0xf0660000. eth2: interrupt csr5=0xf0668010 new csr5=0xf0660000. eth2: 21143 link status interrupt 000050ca, CSR5 f0668010, fffbffff. eth2: Autonegotiation failed, using 10baseT, link beat status 50ca. After the attached patch is applied: eth2: Restarting 21143 autonegotiation, csr14=0003ffff. eth2: tulip_up(), irq==9. eth2: Restarting 21143 autonegotiation, csr14=0003ffff. eth2: Done tulip_up(), CSR0 ffa08000, CSR5 f0360000 CSR6 b2422202. eth2: interrupt csr5=0xf0670004 new csr5=0xf0660000. eth2: exiting interrupt, csr5=0xf0660000. eth2: interrupt csr5=0xf0670004 new csr5=0xf0660000. eth2: exiting interrupt, csr5=0xf0660000. eth2: interrupt csr5=0xf0670004 new csr5=0xf0660000. eth2: exiting interrupt, csr5=0xf0660000. 
eth2: interrupt csr5=0xf0668010 new csr5=0xf0660000. eth2: 21143 link status interrupt 000050ca, CSR5 f0668010, fffbffff. ^^^^^^^^ The same here - 0x50CA = got NLP, negotiation ok (seems like NLP only sets this bit as well), no 100 Mbps link status (whatever that means). eth2: Switching to 10baseT based on link negotiation 01e0 & 0000 = 0000. Correct, we haven't received FLP so no "capability bits" are present. eth2: 21143 non-MII 10baseT transceiver control 08af/00a5. eth2: Setting CSR15 to 08af0008/00a50008. eth2: Using media type 10baseT, CSR12 is c6. eth2: Setting CSR6 82420000/b2422002 CSR12 000010c6. eth2: exiting interrupt, csr5=0xf0660000. eth2: 21143 negotiation status 000010c6, 10baseT. ^^^^ Not sure about it (no 10 nor 100 Mbps link pulse?). But the chip does it this way. Possibly the bit should be cleared before read? eth2: Using NWay-set 10baseT media, csr12 000010c6. eth2: interrupt csr5=0xf0668010 new csr5=0xf0660000. eth2: 21143 link status interrupt 000050ca, CSR5 f0668010, fff8ffff. eth2: 21143 10baseT link beat good. eth2: exiting interrupt, csr5=0xf0660000. eth2: 21143 negotiation status 000050ca, 10baseT. eth2: Using NWay-set 10baseT media, csr12 000050ca. eth2: 21143 negotiation status 000050ca, 10baseT. eth2: Using NWay-set 10baseT media, csr12 000050ca. 
The chip on this card (Micronet SP2500K - Taiwan I think) is DEC 21143-PC: 00:08.0 Class 0200: 1011:0019 (rev 30) Subsystem: 1113:1207 Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- SERR- ; Thu, 2 Sep 2004 08:28:21 -0700 Received: from localhost (localhost [127.0.0.1]) by baikonur.stro.at (Postfix) with ESMTP id 3C4615C065; Thu, 2 Sep 2004 17:28:08 +0200 (CEST) Received: from baikonur.stro.at ([127.0.0.1]) by localhost (baikonur [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 06176-05; Thu, 2 Sep 2004 17:28:07 +0200 (CEST) Received: from sputnik (M802P011.adsl.highway.telekom.at [62.47.132.43]) by baikonur.stro.at (Postfix) with ESMTP id C08005C008; Thu, 2 Sep 2004 17:28:07 +0200 (CEST) Received: from max by sputnik with local (Exim 4.34) id 1C2tVQ-00016b-71; Thu, 02 Sep 2004 17:28:08 +0200 Date: Thu, 2 Sep 2004 17:28:08 +0200 From: maximilian attems To: Margit Schubert-While Cc: netdev@oss.sgi.com Subject: Re: [patch 01/16] __FUNCTION__ string concatenation Message-ID: <20040902152807.GA1894@stro.at> Mail-Followup-To: Margit Schubert-While , netdev@oss.sgi.com References: <200409021302.22823.margitsw@t-online.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <200409021302.22823.margitsw@t-online.de> User-Agent: Mutt/1.5.6+20040722i X-Virus-Scanned: by Amavis (ClamAV) at stro.at X-archive-position: 8350 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: janitor@sternwelten.at Precedence: bulk X-list: netdev On Thu, 02 Sep 2004, Margit Schubert-While wrote: > On Thu, 02 Sep 2004, Maximilian scribeth: > > I don't have the hardware to do a run-time check. > > It should not pose any problems though. 
>
> Oh, I think we can safely say that, especially as TRACE
> is nowhere referenced ;-)
> (And therefore the patch is unnecessary)

yup, nice hint. it was used:
drivers/net/wireless/prism54/islpci_hotplug.c: /* TRACE(DRV_NAME); */

will resend a patch that removes both.

--
maks

From andre.correa@pobox.com Thu Sep 2 08:40:39 2004
Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Sep 2004 08:40:46 -0700 (PDT)
Received: from sasl.smtp.pobox.com (puzzle.sasl.smtp.pobox.com [207.8.226.4]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i82FedSp023470 for ; Thu, 2 Sep 2004 08:40:39 -0700
Received: from localhost.localdomain (localhost [127.0.0.1]) by puzzle.pobox.com (Postfix) with ESMTP id 9373F138F05; Thu, 2 Sep 2004 11:38:24 -0400 (EDT)
Received: from pobox.com (unknown [200.150.240.34]) by puzzle.pobox.com (Postfix) with ESMTP id D8FDE138EF7; Thu, 2 Sep 2004 11:38:21 -0400 (EDT)
Message-ID: <41373ED5.5070606@pobox.com>
Date: Thu, 02 Sep 2004 12:40:05 -0300
From: Andre Correa
User-Agent: Mozilla/5.0 (X11; U; Linux i586; en-US; rv:1.6b) Gecko/20031205 Thunderbird/0.4
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: netdev@oss.sgi.com, andre.correa@pobox.com
Subject: Bad: scheduling while atomic! in 2.6.8.1
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-archive-position: 8351
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: andre.correa@pobox.com
Precedence: bulk
X-list: netdev

Hi,

I set up a Linux box as a firewall with 4 NICs (3C905) on a Dell with 2.6.8.1 and iptables 1.2.11. 3 NICs have several IP addresses and the 4th has 4 VLANs associated. This box is plugged into Cisco switches.

Everything was fine, firewalling OK, until I plugged in the 4th NIC. When traffic starts to flow the box logs a _LOT_ of errors to syslog:

Sep 1 03:58:48 fw01 kernel: bad: scheduling while atomic!
Sep 1 03:58:48 fw01 kernel: [] schedule+0x3c/0x428
Sep 1 03:58:48 fw01 kernel: [] sys_socketcall+0x150/0x1f4
Sep 1 03:58:48 fw01 kernel: [] work_resched+0x5/0x16
Sep 1 03:58:48 fw01 kernel: bad: scheduling while atomic!
Sep 1 03:58:48 fw01 kernel: [] schedule+0x3c/0x428
Sep 1 03:58:48 fw01 kernel: [] __kfree_skb+0xd3/0xd8
Sep 1 03:58:48 fw01 kernel: [] schedule_timeout+0x14/0xb0
Sep 1 03:58:48 fw01 kernel: [] unix_wait_for_peer+0xac/0xc8
Sep 1 03:58:48 fw01 kernel: [] autoremove_wake_function+0x0/0x40
Sep 1 03:58:48 fw01 kernel: [] autoremove_wake_function+0x0/0x40
Sep 1 03:58:48 fw01 kernel: [] unix_dgram_sendmsg+0x39b/0x4b0
Sep 1 03:58:48 fw01 kernel: [] sock_aio_write+0x101/0x10c
Sep 1 03:58:48 fw01 kernel: [] do_sync_write+0x7a/0xac
Sep 1 03:58:48 fw01 kernel: [] kfree_skbmem+0x17/0x1c
Sep 1 03:58:48 fw01 kernel: [] __kfree_skb+0xd3/0xd8
Sep 1 03:58:48 fw01 kernel: [] vfs_write+0xb5/0xd4
Sep 1 03:58:48 fw01 kernel: [] sys_write+0x40/0x6c
Sep 1 03:58:48 fw01 kernel: [] syscall_call+0x7/0xb
Sep 1 03:58:48 fw01 kernel: bad: scheduling while atomic!
Sep 1 03:58:48 fw01 kernel: [] schedule+0x3c/0x428
Sep 1 03:58:49 fw01 kernel: [] sys_socketcall+0x150/0x1f4
Sep 1 03:58:49 fw01 kernel: [] work_resched+0x5/0x16
Sep 1 03:58:49 fw01 kernel: bad: scheduling while atomic!
Sep 1 03:58:49 fw01 kernel: [] schedule+0x3c/0x428
Sep 1 03:58:49 fw01 kernel: [] __kfree_skb+0xd3/0xd8
Sep 1 03:58:49 fw01 kernel: [] schedule_timeout+0x14/0xb0
Sep 1 03:58:49 fw01 kernel: [] unix_wait_for_peer+0xac/0xc8
Sep 1 03:58:49 fw01 kernel: [] autoremove_wake_function+0x0/0x40
Sep 1 03:58:49 fw01 kernel: [] autoremove_wake_function+0x0/0x40
Sep 1 03:58:49 fw01 kernel: [] unix_dgram_sendmsg+0x39b/0x4b0
Sep 1 03:58:49 fw01 kernel: [] sock_aio_write+0x101/0x10c
Sep 1 03:58:49 fw01 kernel: [] do_sync_write+0x7a/0xac
Sep 1 03:58:49 fw01 kernel: [] kfree_skbmem+0x17/0x1c
Sep 1 03:58:49 fw01 kernel: [] __kfree_skb+0xd3/0xd8
Sep 1 03:58:49 fw01 kernel: [] vfs_write+0xb5/0xd4
Sep 1 03:58:49 fw01 kernel: [] sys_write+0x40/0x6c
Sep 1 03:58:49 fw01 kernel: [] syscall_call+0x7/0xb

I got more than 110Mb of it in ~2 hours of tests. Shutting down the interface doesn't stop it; only a reboot with the cable unplugged takes the machine back to its normal state.

I've tested the NIC, cable, PCI slot, switch port and switch, and even changed the box itself, but nothing helped. When I take the VLAN down on the Cisco switch, no errors are logged. If I go back to 2.6.7 + VLAN, no errors either, all OK. It seems to be related to VLAN on 2.6.8.1 only.

Searching the kernel source I found that the message comes from kernel/sched.c, but that doesn't tell me much.

    /*
     * Test if we are atomic. Since do_exit() needs to call into
     * schedule() atomically, we ignore that path for now.
     * Otherwise, whine if we are scheduling when we should not be.
     */
    if (likely(!(current->state & (TASK_DEAD | TASK_ZOMBIE)))) {
        if (unlikely(in_atomic())) {
            printk(KERN_ERR "bad: scheduling while atomic!\n");
            dump_stack();
        }
    }

Can anybody help with this?! Does it look like a bug or what? Any help is appreciated.
tks Andre From vatsa@in.ibm.com Thu Sep 2 08:46:38 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Sep 2004 08:46:45 -0700 (PDT) Received: from e6.ny.us.ibm.com (e6.ny.us.ibm.com [32.97.182.106]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i82FkcoB023856 for ; Thu, 2 Sep 2004 08:46:38 -0700 Received: from northrelay02.pok.ibm.com (northrelay02.pok.ibm.com [9.56.224.150]) by e6.ny.us.ibm.com (8.12.10/8.12.9) with ESMTP id i82FkNnt734040; Thu, 2 Sep 2004 11:46:23 -0400 Received: from snowy.in.ibm.com (d01av02.pok.ibm.com [9.56.224.216]) by northrelay02.pok.ibm.com (8.12.10/NCO/VER6.6) with SMTP id i82FlQND083086; Thu, 2 Sep 2004 11:47:30 -0400 Received: by snowy.in.ibm.com (Postfix, from userid 502) id 7AA9B24E33; Thu, 2 Sep 2004 21:18:37 +0530 (IST) Date: Thu, 2 Sep 2004 21:18:37 +0530 From: Srivatsa Vaddagiri To: davem@redhat.com Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org, dipankar@in.ibm.com, paulmck@us.ibm.com Subject: Fw: Re: [RFC] Use RCU for tcp_ehash lookup Message-ID: <20040902154837.GA5435@in.ibm.com> Reply-To: vatsa@in.ibm.com Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.1i X-archive-position: 8352 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: vatsa@in.ibm.com Precedence: bulk X-list: netdev Resending. Looks like my earlier mail didn't make it. ----- Forwarded message from Srivatsa Vaddagiri ----- Date: Thu, 2 Sep 2004 19:34:44 +0530 From: Srivatsa Vaddagiri To: "David S. Miller" Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org, dipankar@in.ibm.com, paulmck@us.ibm.com Subject: Re: [RFC] Use RCU for tcp_ehash lookup Reply-To: vatsa@in.ibm.com On Wed, Sep 01, 2004 at 10:41:08PM -0700, David S. Miller wrote: > The reason you don't see any improvement is that the ehash table is > pretty write heavy. In my simple one-file-transfer-test-at-a-time, it should have been read-mostly. 
The fact that lookups are not serialized wrt input packet processing may have masked the benefits of lock-free lookup. However, if I have multiple file transfer sessions in progress (one per cpu maybe), then the benefit of reduced time spent looking up a socket could perhaps be passed on to threads doing network input.

> I'm not totally against your patch, I just don't think that the TCP established
> hash table qualifies as "read heavy" as per what RCU is truly effective for.

IMHO the benefits of lock-free lookup will be seen only in such scenarios, i.e. where read_lock ended up having to spin-wait for an update to finish. In the lock-free case, there is no such wait.

> That's exactly what I was concerned about when I saw that you had attempted
> this change. It is incredibly important for state changes and updates to
> be seen as atomic by the packet input processing engine. It would be illegal
> for a cpu running TCP input to see a socket in two tables at the same time
> (for example, in the main established area and in the second half for TIME_WAIT
> buckets).
>
> If the visibility of the socket is wrong, sockets could erroneously
> be reset during the transition from established to TIME_WAIT state.
> Beware!

This is precisely the reason why I changed the order of movement in __tcp_tw_hashdance. Earlier, it was removing the socket from the established half and _then_ adding it to the time-wait half. This would have led to a window where the socket is neither in the established half nor in the time-wait half. A packet arriving in this window (& doing lock-free lookup) would have been dropped. Hence I reversed the order of movement to add to the time-wait half first, before removing from the established half.

> > Note that __tcp_v4_lookup_established should not be affected by the above
> > movement because I found it scans the established half first and _then_ the
> > time wait half.
So even if the same socket is present in both established half > > and time wait half, __tcp_v4_lookup_established will lookup only one of them > > (& not both). > > I hope this is true. AFAICS it is true! If __tcp_v4_lookup_established finds it in the established half, it does no further lookup in the time-wait half. -- Thanks and Regards, Srivatsa Vaddagiri, Linux Technology Center, IBM Software Labs, Bangalore, INDIA - 560017 ----- End forwarded message ----- -- Thanks and Regards, Srivatsa Vaddagiri, Linux Technology Center, IBM Software Labs, Bangalore, INDIA - 560017 From nacc@us.ibm.com Thu Sep 2 09:10:19 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Sep 2004 09:10:24 -0700 (PDT) Received: from e35.co.us.ibm.com (e35.co.us.ibm.com [32.97.110.133]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i82GAD75024762 for ; Thu, 2 Sep 2004 09:10:19 -0700 Received: from westrelay04.boulder.ibm.com (westrelay04.boulder.ibm.com [9.17.193.32]) by e35.co.us.ibm.com (8.12.10/8.12.9) with ESMTP id i82G9p06025910; Thu, 2 Sep 2004 12:09:51 -0400 Received: from arkanoid.beaverton.ibm.com (d03av02.boulder.ibm.com [9.17.195.168]) by westrelay04.boulder.ibm.com (8.12.10/NCO/VER6.6) with ESMTP id i82G9oZC037834; Thu, 2 Sep 2004 10:09:50 -0600 Received: from arkanoid.beaverton.ibm.com (arkanoid [127.0.0.1]) by arkanoid.beaverton.ibm.com (8.13.1/8.13.1/Debian-6) with ESMTP id i82G9n0t002058; Thu, 2 Sep 2004 16:09:49 GMT Received: (from aravamud@localhost) by arkanoid.beaverton.ibm.com (8.13.1/8.13.1/Debian-6) id i82G9nU6002055; Thu, 2 Sep 2004 16:09:49 GMT X-Authentication-Warning: arkanoid.beaverton.ibm.com: aravamud set sender to nacc@us.ibm.com using -f Date: Thu, 2 Sep 2004 16:09:49 +0000 From: Nishanth Aravamudan To: Margit Schubert-While , netdev@oss.sgi.com, kj Subject: Re: [patch 8/8] prism54/islpci_dev: replace schedule_timeout() with msleep() Message-ID: <20040902160948.GA1944@us.ibm.com> References: <200409020850.38177.margitsw@t-online.de> 
<20040902082432.GA1876@stro.at> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20040902082432.GA1876@stro.at> X-Operating-System: Linux 2.6.73 (i686) User-Agent: Mutt/1.5.6+20040803i X-archive-position: 8353 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: nacc@us.ibm.com Precedence: bulk X-list: netdev On Thu, Sep 02, 2004 at 10:24:33AM +0200, maximilian attems wrote: > On Thu, 02 Sep 2004, Margit Schubert-While wrote: > > > I agree with Jean and add the following : > > You are assuming HZ = 1000. > > In 2.4, HZ = 100 (And in 2.6, HZ is not necessarily = 1000). As the original author of the patches, I feel I should interject . . . Why/Where do you see an assumption about the value of HZ? The conversion of the parameter to schedule_timeout() from jiffies to msecs is the only place I can see where that might appear to be the case. But, upon closer examination, there is no such assumption: 1000 = the number of milliseconds in a second. HZ = the number of jiffies in a second (regardless of architecture) In the original code for prism54/islpci_dev.c: set_current_state(TASK_UNINTERRUPTIBLE); schedule_timeout(50*HZ/1000); Thus, to convert (50*HZ/1000) from jiffies to msecs, multiply by 1000 and divide by HZ, or: 50*HZ/1000 jiffies * 1000/HZ msecs/jiffie = 50 msecs. And thus, in the patched code, the above becomes: msleep(50); Does that clear things up? 
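The unit conversion above can be checked with a small userspace sketch. The helper names below are made up for illustration and are not kernel APIs, and HZ is passed in as a parameter rather than taken from the kernel configuration:

```c
#include <assert.h>

/*
 * Hypothetical helper mirroring the driver's 50*HZ/1000 expression:
 * convert milliseconds to jiffies for a given tick rate, truncating
 * the same way integer division does in the original code.
 */
unsigned long ms_to_jiffies_trunc(unsigned long ms, unsigned long hz)
{
    return ms * hz / 1000;
}

/*
 * Hypothetical inverse, as in the argument above:
 * jiffies * 1000/HZ gives milliseconds.
 */
unsigned long jiffies_to_ms(unsigned long j, unsigned long hz)
{
    assert(hz > 0);
    return j * 1000 / hz;
}
```

With hz = 100 the 50 ms request becomes 5 jiffies, and with hz = 1000 it becomes 50 jiffies; both convert back to 50 ms. With a tick rate that does not divide evenly, such as hz = 250, truncation gives 12 jiffies, which is 48 ms. The requested time is the same in every case; only the tick count depends on HZ, which is why passing milliseconds to msleep() carries no assumption about HZ.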
-Nish From nacc@us.ibm.com Thu Sep 2 09:23:31 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Sep 2004 09:23:37 -0700 (PDT) Received: from e31.co.us.ibm.com (e31.co.us.ibm.com [32.97.110.129]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i82GNOG1025237 for ; Thu, 2 Sep 2004 09:23:31 -0700 Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.17.195.11]) by e31.co.us.ibm.com (8.12.10/8.12.9) with ESMTP id i82GN3gM506480; Thu, 2 Sep 2004 12:23:04 -0400 Received: from arkanoid.beaverton.ibm.com (d03av02.boulder.ibm.com [9.17.195.168]) by westrelay02.boulder.ibm.com (8.12.10/NCO/VER6.6) with ESMTP id i82GN2l9402682; Thu, 2 Sep 2004 10:23:03 -0600 Received: from arkanoid.beaverton.ibm.com (arkanoid [127.0.0.1]) by arkanoid.beaverton.ibm.com (8.13.1/8.13.1/Debian-6) with ESMTP id i82GN2Ww002105; Thu, 2 Sep 2004 16:23:02 GMT Received: (from aravamud@localhost) by arkanoid.beaverton.ibm.com (8.13.1/8.13.1/Debian-6) id i82GN1Bd002102; Thu, 2 Sep 2004 16:23:01 GMT X-Authentication-Warning: arkanoid.beaverton.ibm.com: aravamud set sender to nacc@us.ibm.com using -f Date: Thu, 2 Sep 2004 16:23:01 +0000 From: Nishanth Aravamudan To: Margit Schubert-While , netdev@oss.sgi.com, kj Subject: Re: [Kernel-janitors] Re: [patch 8/8] prism54/islpci_dev: replace schedule_timeout() with msleep() Message-ID: <20040902162301.GB1944@us.ibm.com> References: <200409021135.57632.margitsw@t-online.de> <20040902100322.GD1876@stro.at> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20040902100322.GD1876@stro.at> X-Operating-System: Linux 2.6.73 (i686) User-Agent: Mutt/1.5.6+20040803i X-archive-position: 8354 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: nacc@us.ibm.com Precedence: bulk X-list: netdev On Thu, Sep 02, 2004 at 12:03:22PM +0200, maximilian attems wrote: > On Thu, 02 Sep 2004, Margit Schubert-While wrote: > > > On Thu, 02 
Sep 2004, Maximilian scribeth:
> > > it shouldn't hinder 2.6 in its progression.
> > I consider this a regression.
> > As schedule_timeout is used elsewhere in the prism54 code,
> > we are using a consistent and documented method.

A grep of drivers/net/wireless/prism54 for schedule_timeout showed three occurrences (in 2.6.9-rc1-bk7):

islpci_dev.c: schedule_timeout(50*HZ/1000);
islpci_dev.c: remaining = schedule_timeout(HZ);
islpci_mgt.c: timeleft = schedule_timeout(wait_cycle_jiffies);

The first is removed by my patch.

The second & third are potentially bugs as there is no set_current_state() preceding the call to schedule_timeout(). As per the source:

/**
 * schedule_timeout - sleep until timeout
 * @timeout: timeout value in jiffies
 *
 * Make the current task sleep until @timeout jiffies have
 * elapsed. The routine will return immediately unless
 * the current task state has been set (see set_current_state()).

Therefore, in the current code, the schedule_timeout() call does not have the desired effect (the same information is available in kernel-hacking.ps). Both of these calls should probably be fixed, but I'm not sure if you wish to sleep in TASK_INTERRUPTIBLE or TASK_UNINTERRUPTIBLE. Keep in mind that msleep_interruptible() is also (hopefully) being pushed to the kernel soon.

As to consistency or documentation . . . I have no evidence to suggest that msleep() is inconsistent. And I don't think there is any need for more documentation than the source in this case:

/**
 * msleep - sleep safely even with waitqueue interruptions
 * @msecs: Time in milliseconds to sleep for
 */

Hope this helps clear things up.
-Nish From paulmck@us.ibm.com Thu Sep 2 09:36:07 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Sep 2004 09:36:13 -0700 (PDT) Received: from linux.local (bi01p1.co.us.ibm.com [32.97.110.142]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i82Ga0j7025740 for ; Thu, 2 Sep 2004 09:36:06 -0700 Received: by linux.local (Postfix on SuSE Linux 7.3 (i386), from userid 500) id 4A40E148B98; Thu, 2 Sep 2004 09:31:50 -0700 (PDT) Date: Thu, 2 Sep 2004 09:31:50 -0700 From: "Paul E. McKenney" To: "David S. Miller" Cc: vatsa@in.ibm.com, netdev@oss.sgi.com, linux-kernel@vger.kernel.org, dipankar@in.ibm.com Subject: Re: [RFC] Use RCU for tcp_ehash lookup Message-ID: <20040902163149.GB1258@us.ibm.com> Reply-To: paulmck@us.ibm.com References: <20040831125941.GA5534@in.ibm.com> <20040901224108.3b2d692d.davem@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20040901224108.3b2d692d.davem@redhat.com> User-Agent: Mutt/1.4.1i X-archive-position: 8355 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: paulmck@us.ibm.com Precedence: bulk X-list: netdev On Wed, Sep 01, 2004 at 10:41:08PM -0700, David S. Miller wrote: > On Tue, 31 Aug 2004 18:29:41 +0530 > Srivatsa Vaddagiri wrote: > > - Biggest problem I had converting over to RCU was the refcount race between > > sock_put and sock_hold. sock_put might see the refcount go to zero and decide > > to free the object, while on some other CPU, sock_get's are pending against > > the same object. The patch handles the race by deciding to free the object > > only from the RCU callback. > > That's exactly what I was concerned about when I saw that you had attempted > this change. It is incredibly important for state changes and updates to > be seen as atomic by the packet input processing engine. 
It would be illegal
> for a cpu running TCP input to see a socket in two tables at the same time
> (for example, in the main established area and in the second half for TIME_WAIT
> buckets).
>
> If the visibility of the socket is wrong, sockets could erroneously
> be reset during the transition from established to TIME_WAIT state.
> Beware!

If the usage is too write-intensive, then RCU will certainly be less likely to work well. But there is nothing quite like actually trying it to see how it works. ;-)

That aside, it -is- possible to make such state changes appear atomic, even when moving elements from one list to another. One way of doing this is to atomically replace the element with a "tombstone" element. Normal pointer writes suffice. The "tombstone" is set up so that searches for the outgoing element will stall (e.g., spin or sleep, depending on the environment). The element is moved to its destination list. At this point, searches for the element in the old list will still stall, while searches for the element in the new list will succeed. The tombstone is then marked so that CPUs stalled on it resume, but report failure to find the element in the old list.

Of course, this approach makes writes more expensive than they otherwise would be, so, again, RCU is best for read-intensive uses. ;-)

The fact that this data structure is not very read-intensive is due to the fact that short-lived TCP connections are quite common, right? Or am I missing the finer points of this data structure's workings?
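The tombstone sequence described above can be walked through with a minimal single-threaded sketch. Everything here is an assumption made for illustration: the struct layout, the function names and the three-valued lookup result are hypothetical, and the sketch leaves out memory barriers, grace periods and the actual spinning or sleeping. A stalled search is simply reported as LOOKUP_STALL so the three phases can be observed.

```c
#include <assert.h>
#include <stddef.h>

enum lookup_result { LOOKUP_HIT, LOOKUP_MISS, LOOKUP_STALL };
enum tomb_state { TOMB_NONE, TOMB_MOVING, TOMB_MOVED };

struct node {
    int key;
    enum tomb_state tomb;   /* TOMB_NONE marks a real element */
    struct node *next;
};

/* Search one list; a tombstone in the "moving" state makes the caller stall. */
enum lookup_result lookup(const struct node *head, int key)
{
    for (const struct node *n = head; n != NULL; n = n->next) {
        if (n->key != key)
            continue;
        if (n->tomb == TOMB_NONE)
            return LOOKUP_HIT;
        return n->tomb == TOMB_MOVING ? LOOKUP_STALL : LOOKUP_MISS;
    }
    return LOOKUP_MISS;
}

/* Step 1: put the tombstone in the element's place with plain pointer writes. */
void tomb_replace(struct node **link, struct node *elem, struct node *tomb)
{
    assert(link != NULL && *link == elem);
    tomb->key = elem->key;
    tomb->tomb = TOMB_MOVING;
    tomb->next = elem->next;
    *link = tomb;           /* searches of the old list now hit the tombstone */
}

/* Step 2: add the element at the head of its destination list. */
void list_push(struct node **head, struct node *elem)
{
    elem->next = *head;
    *head = elem;
}

/* Step 3: release stalled searchers; they report "not in the old list". */
void tomb_finish(struct node *tomb)
{
    tomb->tomb = TOMB_MOVED;
}
```

Calling tomb_replace(), list_push() and tomb_finish() in that order on a key gives the three phases from the description: first the old list stalls and the new list misses, then the old list stalls and the new list hits, and finally the old list misses and the new list hits. At no point does a search see the key in both lists or in neither without being told to wait.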
Thanx, Paul From margitsw@t-online.de Thu Sep 2 10:42:10 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Sep 2004 10:42:14 -0700 (PDT) Received: from mailout06.sul.t-online.com (mailout06.sul.t-online.com [194.25.134.19]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i82Hg9YQ027077 for ; Thu, 2 Sep 2004 10:42:09 -0700 Received: from fwd03.aul.t-online.de by mailout06.sul.t-online.com with smtp id 1C2vay-0002NZ-00; Thu, 02 Sep 2004 19:42:00 +0200 Received: from roglap.local (rShJCvZ1oekYZs039dSpPc+qIpI7GJcpa8ZytgJ7WvedNZzrj1mEZV@[217.255.125.194]) by fwd03.sul.t-online.com with esmtp id 1C2var-0j45wm0; Thu, 2 Sep 2004 19:41:53 +0200 From: margitsw@t-online.de (Margit Schubert-While) To: netdev@oss.sgi.com Subject: Re: [patch 8/8] prism54/islpci_dev: replace schedule_timeout() with msleep() Date: Thu, 2 Sep 2004 19:30:18 +0200 User-Agent: KMail/1.5.4 Cc: janitor@sternwelten.at MIME-Version: 1.0 Content-Disposition: inline Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <200409021930.18414.margitsw@t-online.de> X-ID: rShJCvZ1oekYZs039dSpPc+qIpI7GJcpa8ZytgJ7WvedNZzrj1mEZV X-TOI-MSGID: e1e2d3ca-1c83-40a9-a21f-7e2330d4634a X-archive-position: 8356 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: margitsw@t-online.de Precedence: bulk X-list: netdev On Thu, 02 Sep 2004, Nishanth scribeth: > A grep of drivers/net/wireless/prism54 for schedule_timeout showed three > occurrences (in 2.6.9-rc1-bk7): > islpci_dev.c: schedule_timeout(50*HZ/1000); > islpci_dev.c: remaining = schedule_timeout(HZ); > islpci_mgt.c: timeleft = schedule_timeout(wait_cycle_jiffies); > The first is removed by my patch. > The second & third are potentially bugs as there is no > set_current_state() preceding the call to schedule_timeout(). As per the > source: Nope, look a few lines above: DEFINE_WAIT() and prepare_to_wait(). 
Margit From margitsw@t-online.de Thu Sep 2 11:04:11 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Sep 2004 11:04:21 -0700 (PDT) Received: from mailout11.sul.t-online.com (mailout11.sul.t-online.com [194.25.134.85]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i82I4ANW027860 for ; Thu, 2 Sep 2004 11:04:11 -0700 Received: from fwd03.aul.t-online.de by mailout11.sul.t-online.com with smtp id 1C2vwF-0006Qv-02; Thu, 02 Sep 2004 20:03:59 +0200 Received: from roglap.local (VgPdmiZvoeP7taiB8A7bbXy2sknkngKrtbSxwLtFFwgu92JIs-MOkm@[217.255.125.194]) by fwd03.sul.t-online.com with esmtp id 1C2vwE-024Pq40; Thu, 2 Sep 2004 20:03:58 +0200 From: margitsw@t-online.de (Margit Schubert-While) To: netdev@oss.sgi.com Subject: Re: [patch 8/8] prism54/islpci_dev: replace schedule_timeout() with msleep() Date: Thu, 2 Sep 2004 19:52:18 +0200 User-Agent: KMail/1.5.4 Cc: janitor@sternwelten.at, nacc@us.ibm.com MIME-Version: 1.0 Content-Disposition: inline Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <200409021952.18280.margitsw@t-online.de> X-ID: VgPdmiZvoeP7taiB8A7bbXy2sknkngKrtbSxwLtFFwgu92JIs-MOkm X-TOI-MSGID: 9117e798-6f49-4eaf-b79c-44d2921846f0 X-archive-position: 8357 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: margitsw@t-online.de Precedence: bulk X-list: netdev On Thu, 02 Sep 2004, Nishanth scribeth: > Keep in mind that msleep_interruptible() is also > (hopefully) being pushed to the kernel soon I think you need this for your current patch set ;-) eg. In e100, where you replace an interruptible timeout: > @@ -2020,8 +2016,7 @@ I don't think that's correct. 
Margit From nacc@us.ibm.com Thu Sep 2 11:27:43 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Sep 2004 11:27:49 -0700 (PDT) Received: from e33.co.us.ibm.com (e33.co.us.ibm.com [32.97.110.131]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i82IRbEr028661 for ; Thu, 2 Sep 2004 11:27:43 -0700 Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.17.195.11]) by e33.co.us.ibm.com (8.12.10/8.12.9) with ESMTP id i82IRMDD385518; Thu, 2 Sep 2004 14:27:22 -0400 Received: from arkanoid.beaverton.ibm.com (d03av02.boulder.ibm.com [9.17.195.168]) by westrelay02.boulder.ibm.com (8.12.10/NCO/VER6.6) with ESMTP id i82IRLl9300046; Thu, 2 Sep 2004 12:27:21 -0600 Received: from arkanoid.beaverton.ibm.com (arkanoid [127.0.0.1]) by arkanoid.beaverton.ibm.com (8.13.1/8.13.1/Debian-6) with ESMTP id i82IRKjg002786; Thu, 2 Sep 2004 18:27:20 GMT Received: (from aravamud@localhost) by arkanoid.beaverton.ibm.com (8.13.1/8.13.1/Debian-6) id i82IRJaf002783; Thu, 2 Sep 2004 18:27:19 GMT X-Authentication-Warning: arkanoid.beaverton.ibm.com: aravamud set sender to nacc@us.ibm.com using -f Date: Thu, 2 Sep 2004 18:27:19 +0000 From: Nishanth Aravamudan To: Margit Schubert-While Cc: netdev@oss.sgi.com, janitor@sternwelten.at Subject: Re: [patch 8/8] prism54/islpci_dev: replace schedule_timeout() with msleep() Message-ID: <20040902182719.GD1944@us.ibm.com> References: <200409021952.18280.margitsw@t-online.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <200409021952.18280.margitsw@t-online.de> X-Operating-System: Linux 2.6.73 (i686) User-Agent: Mutt/1.5.6+20040803i X-archive-position: 8358 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: nacc@us.ibm.com Precedence: bulk X-list: netdev On Thu, Sep 02, 2004 at 07:52:18PM +0200, Margit Schubert-While wrote: > On Thu, 02 Sep 2004, Nishanth scribeth: > > Keep in mind that msleep_interruptible() 
is also
> > (hopefully) being pushed to the kernel soon
>
> I think you need this for your current patch set ;-)
> eg. In e100, where you replace an interruptible timeout:
> > @@ -2020,8 +2016,7 @@
>
> I don't think that's correct.

My reasoning behind changing some of the TASK_INTERRUPTIBLE'd schedule_timeout()s to msleep()s was that LDD somewhat incorrectly advised device driver authors to use an INTERRUPTIBLE timeout for longer delays, when, in fact, they should probably use an UNINTERRUPTIBLE one. Only if signals are explicitly expected to occur is INTERRUPTIBLE necessary (in general). [By long delays, I mean those measurable in msecs]

I am not an expert on the E100, so perhaps this was an error on my part. But this is also why I have a header on my patch submission regarding exactly this issue. If someone could verify (none of the maintainers I sent the original patch to replied with any problems for this patch) that there is or is not an issue, I'd appreciate it.

-Nish

From kaber@trash.net Thu Sep 2 11:36:52 2004
Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Sep 2004 11:36:58 -0700 (PDT)
Received: from gw.localnet ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i82IapdD029088 for ; Thu, 2 Sep 2004 11:36:52 -0700
Received: from [172.16.1.123] (helo=trash.net ident=kaber) by gw.localnet with esmtp (Exim 3.36 #1 (Debian)) id 1C2wWi-0004h7-00; Thu, 02 Sep 2004 20:41:40 +0200
Message-ID: <4137681D.3000902@trash.net>
Date: Thu, 02 Sep 2004 20:36:13 +0200
From: Patrick McHardy
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5
X-Accept-Language: en
MIME-Version: 1.0
To: "David S.
Miller" CC: yoshfuji@linux-ipv6.org, Herbert Xu , netdev@oss.sgi.com Subject: [PATCH 2.6]: Fix suboptimal fragment sizing for last fragment Content-Type: multipart/mixed; boundary="------------050003050700070306040006" X-archive-position: 8359 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev This is a multi-part message in MIME format. --------------050003050700070306040006 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Yoshifuji's recent fragment patch prevents unnecessary fragmentation when the data can be kept in a single packet, but only for the first packet. When fragmenting, all fragments are still truncated to multiples of 8 and we might end up creating an unnecessary fragment. This dump shows the problem (MTU 1499): 172.16.1.123.32771 > 172.16.195.3.4135: udp 2937 (frag 7066:1472@0+) 172.16.1.123 > 172.16.195.3: udp (frag 7066:1472@1472+) 172.16.1.123 > 172.16.195.3: udp (frag 7066:1@2944) This patch always builds mtu sized fragments and truncates the previous fragment to a multiple of 8 bytes when allocating a new one. With the patch the dump looks like this: 172.16.1.123.32772 > 172.16.195.3.4135: udp 2937 (frag 49641:1472@0+) 172.16.1.123 > 172.16.195.3: udp (frag 49641:1473@1472) Regards Patrick --------------050003050700070306040006 Content-Type: text/x-patch; name="frag.diff" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="frag.diff" # This is a BitKeeper generated diff -Nru style patch. 
# # ChangeSet # 2004/09/02 17:35:32+02:00 kaber@coreworks.de # [IPV4/IPV6]: Fix suboptimal fragment sizing for last fragment # # Signed-off-by: Patrick McHardy # # net/ipv6/ip6_output.c # 2004/09/02 17:35:14+02:00 kaber@coreworks.de +13 -22 # [IPV4/IPV6]: Fix suboptimal fragment sizing for last fragment # # net/ipv4/ip_output.c # 2004/09/02 17:35:14+02:00 kaber@coreworks.de +20 -49 # [IPV4/IPV6]: Fix suboptimal fragment sizing for last fragment # diff -Nru a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c --- a/net/ipv4/ip_output.c 2004-09-02 17:39:01 +02:00 +++ b/net/ipv4/ip_output.c 2004-09-02 17:39:01 +02:00 @@ -735,10 +735,10 @@ int hh_len; int exthdrlen; int mtu; - int copy = 0; + int copy; int err; int offset = 0; - unsigned int maxfraglen, fragheaderlen, fraggap = 0; + unsigned int maxfraglen, fragheaderlen; int csummode = CHECKSUM_NONE; if (flags&MSG_PROBE) @@ -781,6 +781,7 @@ hh_len = LL_RESERVED_SPACE(rt->u.dst.dev); fragheaderlen = sizeof(struct iphdr) + (opt ? opt->optlen : 0); + maxfraglen = ((mtu - fragheaderlen) & ~7) + fragheaderlen; if (inet->cork.length + length > 0xFFFF - fragheaderlen) { ip_local_error(sk, EMSGSIZE, rt->rt_dst, inet->dport, mtu-exthdrlen); @@ -788,26 +789,11 @@ } /* - * Let's try using as much space as possible to avoid generating - * additional unnecessary small fragment of length - * (mtu-fragheaderlen)%8 if mtu-fragheaderlen is not 0 modulo 8. - * -- yoshfuji - */ - if (fragheaderlen + inet->cork.length + length <= mtu) - maxfraglen = mtu; - else - maxfraglen = ((mtu - fragheaderlen) & ~7) + fragheaderlen; - - if (fragheaderlen + inet->cork.length <= mtu && - fragheaderlen + inet->cork.length + length > mtu) - fraggap = 1; - - /* * transhdrlen > 0 means that this is the first fragment and we wish * it won't be fragmented in the future. 
*/ if (transhdrlen && - length + fragheaderlen <= maxfraglen && + length + fragheaderlen <= mtu && rt->u.dst.dev->features&(NETIF_F_IP_CSUM|NETIF_F_NO_CSUM|NETIF_F_HW_CSUM) && !exthdrlen) csummode = CHECKSUM_HW; @@ -821,34 +807,33 @@ * adding appropriate IP header. */ - if ((skb = skb_peek_tail(&sk->sk_write_queue)) == NULL) { - fraggap = 0; + if ((skb = skb_peek_tail(&sk->sk_write_queue)) == NULL) goto alloc_new_skb; - } while (length > 0) { - if ((copy = maxfraglen - skb->len) <= 0) { + if ((copy = mtu - skb->len) <= 0) { char *data; unsigned int datalen; unsigned int fraglen; + unsigned int fraggap; unsigned int alloclen; struct sk_buff *skb_prev; - BUG_TRAP(fraggap || copy == 0); + BUG_TRAP(copy == 0); alloc_new_skb: skb_prev = skb; + fraggap = 0; + if (skb_prev) + fraggap = mtu - maxfraglen; - if (fraggap) - fraggap = -copy; - - datalen = maxfraglen - fragheaderlen; + datalen = mtu - fragheaderlen; if (datalen > length + fraggap) datalen = length + fraggap; fraglen = datalen + fragheaderlen; if ((flags & MSG_MORE) && !(rt->u.dst.dev->features&NETIF_F_SG)) - alloclen = maxfraglen; + alloclen = mtu; else alloclen = datalen + fragheaderlen; @@ -913,7 +898,6 @@ length -= datalen - fraggap; transhdrlen = 0; exthdrlen = 0; - fraggap = 0; csummode = CHECKSUM_NONE; /* @@ -1006,7 +990,7 @@ int mtu; int len; int err; - unsigned int maxfraglen, fragheaderlen, fraggap = 0; + unsigned int maxfraglen, fragheaderlen, fraggap; if (inet->hdrincl) return -EPERM; @@ -1028,27 +1012,13 @@ mtu = inet->cork.fragsize; fragheaderlen = sizeof(struct iphdr) + (opt ? opt->optlen : 0); + maxfraglen = ((mtu - fragheaderlen) & ~7) + fragheaderlen; if (inet->cork.length + size > 0xFFFF - fragheaderlen) { ip_local_error(sk, EMSGSIZE, rt->rt_dst, inet->dport, mtu); return -EMSGSIZE; } - /* - * Let's try using as much space as possible to avoid generating - * additional unnecessary small fragment of length - * (mtu-fragheaderlen)%8 if mtu-fragheaderlen is not 0 modulo 8. 
- * -- yoshfuji - */ - if (fragheaderlen + inet->cork.length + size <= mtu) - maxfraglen = mtu; - else - maxfraglen = ((mtu - fragheaderlen) & ~7) + fragheaderlen; - - if (fragheaderlen + inet->cork.length <= mtu && - fragheaderlen + inet->cork.length + size > mtu) - fraggap = 1; - if ((skb = skb_peek_tail(&sk->sk_write_queue)) == NULL) return -EINVAL; @@ -1056,17 +1026,18 @@ while (size > 0) { int i; - if ((len = maxfraglen - skb->len) <= 0) { + if ((len = mtu - skb->len) <= 0) { struct sk_buff *skb_prev; char *data; struct iphdr *iph; int alloclen; - BUG_TRAP(fraggap || len == 0); + BUG_TRAP(len == 0); skb_prev = skb; - if (fraggap) - fraggap = -len; + fraggap = 0; + if (skb_prev) + fraggap = mtu - maxfraglen; alloclen = fragheaderlen + hh_len + fraggap + 15; skb = sock_wmalloc(sk, alloclen, 1, sk->sk_allocation); diff -Nru a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c --- a/net/ipv6/ip6_output.c 2004-09-02 17:39:01 +02:00 +++ b/net/ipv6/ip6_output.c 2004-09-02 17:39:01 +02:00 @@ -814,11 +814,11 @@ struct inet_opt *inet = inet_sk(sk); struct ipv6_pinfo *np = inet6_sk(sk); struct sk_buff *skb; - unsigned int maxfraglen, fragheaderlen, fraggap = 0; + unsigned int maxfraglen, fragheaderlen; int exthdrlen; int hh_len; int mtu; - int copy = 0; + int copy; int err; int offset = 0; int csummode = CHECKSUM_NONE; @@ -867,6 +867,7 @@ hh_len = LL_RESERVED_SPACE(rt->u.dst.dev); fragheaderlen = sizeof(struct ipv6hdr) + (opt ? opt->opt_nflen : 0); + maxfraglen = ((mtu - fragheaderlen) & ~7) + fragheaderlen - sizeof(struct frag_hdr); if (mtu <= sizeof(struct ipv6hdr) + IPV6_MAXPLEN) { if (inet->cork.length + length > sizeof(struct ipv6hdr) + IPV6_MAXPLEN - fragheaderlen) { @@ -883,46 +884,37 @@ * * Note that we may need to "move" the data from the tail of * of the buffer to the new fragment when we split - * the message at the first time. + * the message. * * FIXME: It may be fragmented into multiple chunks * at once if non-fragmentable extension headers * are too large. 
* --yoshfuji */ - if (fragheaderlen + inet->cork.length + length <= mtu) - maxfraglen = mtu; - else - maxfraglen = ((mtu - fragheaderlen) & ~7) + fragheaderlen - - sizeof(struct frag_hdr); - - if (fragheaderlen + inet->cork.length <= mtu && - fragheaderlen + inet->cork.length + length > mtu) - fraggap = 1; inet->cork.length += length; - if ((skb = skb_peek_tail(&sk->sk_write_queue)) == NULL) { - fraggap = 0; + if ((skb = skb_peek_tail(&sk->sk_write_queue)) == NULL) goto alloc_new_skb; - } while (length > 0) { - if ((copy = maxfraglen - skb->len) <= 0) { + if ((copy = mtu - skb->len) <= 0) { char *data; unsigned int datalen; unsigned int fraglen; + unsigned int fraggap; unsigned int alloclen; struct sk_buff *skb_prev; - BUG_TRAP(fraggap || copy == 0); + BUG_TRAP(copy == 0); alloc_new_skb: skb_prev = skb; /* There's no room in the current skb */ - if (fraggap) - fraggap = -copy; + fraggap = 0; + if (skb_prev) + fraggap = mtu - maxfraglen; - datalen = maxfraglen - fragheaderlen; + datalen = mtu - fragheaderlen; if (datalen > length + fraggap) datalen = length + fraggap; @@ -930,7 +922,7 @@ fraglen = datalen + fragheaderlen; if ((flags & MSG_MORE) && !(rt->u.dst.dev->features&NETIF_F_SG)) - alloclen = maxfraglen; + alloclen = mtu; else alloclen = datalen + fragheaderlen; @@ -1005,7 +997,6 @@ length -= datalen - fraggap; transhdrlen = 0; exthdrlen = 0; - fraggap = 0; csummode = CHECKSUM_NONE; /* --------------050003050700070306040006-- From yoshfuji@linux-ipv6.org Thu Sep 2 12:47:38 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Sep 2004 12:47:44 -0700 (PDT) Received: from yue.st-paulia.net ([203.178.140.15]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i82Jlbg2001233 for ; Thu, 2 Sep 2004 12:47:38 -0700 Received: from localhost (localhost [127.0.0.1]) by yue.st-paulia.net (Postfix) with ESMTP id 1976B33CE6; Fri, 3 Sep 2004 04:48:25 +0900 (JST) Date: Fri, 03 Sep 2004 04:48:23 +0900 (JST) Message-Id: <20040903.044823.82214059.yoshfuji@linux-ipv6.org> To: 
kaber@trash.net Cc: davem@redhat.com, herbert@debian.org, netdev@oss.sgi.com, yoshfuji@linux-ipv6.org Subject: Re: [PATCH 2.6]: Fix suboptimal fragment sizing for last fragment From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: <4137681D.3000902@trash.net> References: <4137681D.3000902@trash.net> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 8360 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev In article <4137681D.3000902@trash.net> (at Thu, 02 Sep 2004 20:36:13 +0200), Patrick McHardy says: > Yoshifuji's recent fragment patch prevents unnecessary fragmentation > when the data can be kept in a single packet, but only for the first > packet. When fragmenting, all fragments are still truncated to > multiples of 8 and we might end up creating an unnecessary fragment. > > This dump shows the problem (MTU 1499): > > 172.16.1.123.32771 > 172.16.195.3.4135: udp 2937 (frag 7066:1472@0+) > 172.16.1.123 > 172.16.195.3: udp (frag 7066:1472@1472+) > 172.16.1.123 > 172.16.195.3: udp (frag 7066:1@2944) > > This patch always builds mtu sized fragments and truncates the previous > fragment to a multiple of 8 bytes when allocating a new one. With the > patch the dump looks like this: > > > 172.16.1.123.32772 > 172.16.195.3.4135: udp 2937 (frag 49641:1472@0+) > 172.16.1.123 > 172.16.195.3: udp (frag 49641:1473@1472) Let me clarify. 
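The arithmetic behind the two dumps quoted above can be sketched in C. This is only an illustration, assuming a 20-byte IPv4 header without options. max_frag_len() mirrors the maxfraglen expression from the patch; split_payload() and its last_uses_mtu flag are hypothetical helpers written for this note, not kernel functions. The "old" mode also ignores the single-packet special case that the earlier patch already handled.

```c
#include <stddef.h>

/*
 * Largest fragment length (header included) whose data part is a
 * multiple of 8 bytes, as non-final IP fragments require.
 * Same expression as maxfraglen in the patch.
 */
static unsigned int max_frag_len(unsigned int mtu, unsigned int fragheaderlen)
{
	return ((mtu - fragheaderlen) & ~7u) + fragheaderlen;
}

/*
 * Hypothetical helper: split `payload` bytes of IP payload into
 * fragments and store each fragment's data size in out[].
 * Returns the number of fragments written (at most cap).
 *
 * last_uses_mtu == 0: every fragment is capped at the 8-byte aligned size.
 * last_uses_mtu != 0: the final fragment may use the full
 *                     mtu - fragheaderlen bytes.
 */
static size_t split_payload(unsigned int payload, unsigned int mtu,
			    unsigned int fragheaderlen, int last_uses_mtu,
			    unsigned int *out, size_t cap)
{
	unsigned int aligned = max_frag_len(mtu, fragheaderlen) - fragheaderlen;
	unsigned int full = mtu - fragheaderlen;
	size_t n = 0;

	while (payload > 0 && n < cap) {
		unsigned int take;

		if (last_uses_mtu && payload <= full)
			take = payload;		/* rest fits in one MTU-sized packet */
		else
			take = payload < aligned ? payload : aligned;

		out[n++] = take;
		payload -= take;
	}
	return n;
}
```

With mtu = 1499 and fragheaderlen = 20, max_frag_len() gives 1492, so a non-final fragment carries 1472 bytes while a final one could carry up to 1479. The dump's 2937 bytes of UDP data plus the 8-byte UDP header make 2945 bytes of IP payload: the aligned-only split yields 1472 + 1472 + 1, and letting the last fragment use the full MTU yields 1472 + 1473.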
Are you sending payload of 2945 bytes (= udp payload of 2937 bytes)? Good point. I'll check this patch today. (Let me sleep for now...) Anyway, please update the comment instead of removing completely. Thanks. --yoshfuji From vkondra@mail.ru Thu Sep 2 13:25:51 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Sep 2004 13:26:37 -0700 (PDT) Received: from mx1.mail.ru (mx1.mail.ru [194.67.23.121]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i82KPolW003606 for ; Thu, 2 Sep 2004 13:25:51 -0700 Received: from [212.179.200.204] (port=14794 helo=[192.168.10.2]) by mx1.mail.ru with esmtp id 1C2y91-00093I-00; Fri, 03 Sep 2004 00:25:21 +0400 From: Vladimir Kondratiev To: netdev@oss.sgi.com Subject: Re: [RFC] acx100 inclusion in mainline; generic 802.11 stack Date: Thu, 2 Sep 2004 23:24:35 +0300 User-Agent: KMail/1.7 Cc: Jeff Garzik , Denis Vlasenko , Jean Tourrilhes , Jouni Malinen , acx100-devel@lists.sourceforge.net, prism54-devel@prism54.org References: <200408312111.02438.vda@port.imtp.ilyichevsk.odessa.ua> <4134C1A7.50600@pobox.com> In-Reply-To: <4134C1A7.50600@pobox.com> MIME-Version: 1.0 Content-Type: multipart/signed; boundary="nextPart3883509.4mOIAh7QTs"; protocol="application/pgp-signature"; micalg=pgp-sha1 Content-Transfer-Encoding: 7bit Message-Id: <200409022324.43117.vkondra@mail.ru> X-Spam: Probable Spam X-archive-position: 8361 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: vkondra@mail.ru Precedence: bulk X-list: netdev --nextPart3883509.4mOIAh7QTs Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Content-Disposition: inline Jeff, On Tuesday 31 August 2004 21:21, Jeff Garzik wrote: JG> Denis Vlasenko wrote: JG> > I think 'senior' network guys are in position to decide upon which JG> > of currently available 802.11 stacks we should continue to work. JG> > (Atheros has one, said to be derived from BSD, is there any others?) 
JG> JG> JG> Already have. Start with the code in wireless-2.6 -- HostAP -- and use JG> DaveM's 802.11 stack template as a model for actually integrating 802.11 JG> very tightly with the rest of the net stack. JG> JG> http://www.kernel.org/pub/linux/kernel/people/jgarzik/patchkits/2.6/davem-p8 0211.tar.bz2 Is this stack the main one that is going to be used? I.e. if I am working on driver for next generation .11 card - should I try to use it, request/submitt missing features etc.? Or should I use wireless extensions? --nextPart3883509.4mOIAh7QTs Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.6 (GNU/Linux) iD8DBQBBN4GKqxdj7mhC6o0RAsfgAJ9Nz7PdFdRc+2ywEfCEYcS+qutHsQCcCOKM JlRXzBD/qeiPNnWtwViO/VQ= =HvC8 -----END PGP SIGNATURE----- --nextPart3883509.4mOIAh7QTs-- From jgarzik@pobox.com Thu Sep 2 13:34:18 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Sep 2004 13:35:04 -0700 (PDT) Received: from www.linux.org.uk (IDENT:93@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i82KYH72003984 for ; Thu, 2 Sep 2004 13:34:18 -0700 Received: from rdu74-155-169.nc.rr.com ([24.74.155.169] helo=[10.10.10.88]) by www.linux.org.uk with asmtp (TLSv1:AES256-SHA:256) (Exim 4.33) id 1C2yHC-0004JW-Dp; Thu, 02 Sep 2004 21:33:46 +0100 Message-ID: <4137839B.4000303@pobox.com> Date: Thu, 02 Sep 2004 16:33:31 -0400 From: Jeff Garzik User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.2) Gecko/20040803 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Vladimir Kondratiev CC: netdev@oss.sgi.com, Denis Vlasenko , Jean Tourrilhes , Jouni Malinen , acx100-devel@lists.sourceforge.net, prism54-devel@prism54.org, "David S.
Miller" Subject: Re: [RFC] acx100 inclusion in mainline; generic 802.11 stack References: <200408312111.02438.vda@port.imtp.ilyichevsk.odessa.ua> <4134C1A7.50600@pobox.com> <200409022324.43117.vkondra@mail.ru> In-Reply-To: <200409022324.43117.vkondra@mail.ru> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 8362 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Vladimir Kondratiev wrote: > Jeff, > > On Tuesday 31 August 2004 21:21, Jeff Garzik wrote: > JG> Denis Vlasenko wrote: > JG> > I think 'senior' network guys are in position to decide upon which > JG> > of currently available 802.11 stacks we should continue to work. > JG> > (Atheros has one, said to be derived from BSD, is there any others?) > JG> > JG> > JG> Already have. Start with the code in wireless-2.6 -- HostAP -- and use > JG> DaveM's 802.11 stack template as a model for actually integrating 802.11 > JG> very tightly with the rest of the net stack. > JG> > JG> > http://www.kernel.org/pub/linux/kernel/people/jgarzik/patchkits/2.6/davem-p8 > 0211.tar.bz2 > > Is this stack the main one that is going to be used? I.e. if I am working on > driver for next generation .11 card - should I try to use it, request/submitt > missing features etc.? Or should I use wireless extensions? DaveM's code is a template for how a wireless stack would look when properly and fully integrated into the net core. Although JeanT and I disagree about this, I am less interested in backwards compatibility than I am about making wireless a "first class citizen" in the kernel. As I have proven with kcompat (http://sf.net/projects/gkernel/) you can be backwards compatible while still evolving the current kernel driver API to meet current design needs. 
Jeff From ak@suse.de Thu Sep 2 14:20:37 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Sep 2004 14:20:55 -0700 (PDT) Received: from Cantor.suse.de (cantor.suse.de [195.135.220.2]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i82LKauk004895 for ; Thu, 2 Sep 2004 14:20:37 -0700 Received: from hermes.suse.de (hermes-ext.suse.de [195.135.221.8]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by Cantor.suse.de (Postfix) with ESMTP id 85706B6C4D8; Thu, 2 Sep 2004 23:19:50 +0200 (CEST) Date: Thu, 2 Sep 2004 23:19:50 +0200 From: Andi Kleen To: Srivatsa Vaddagiri Cc: Andi Kleen , davem@redhat.com, netdev@oss.sgi.com, linux-kernel@vger.kernel.org, Dipankar , paulmck@us.ibm.com Subject: Re: [RFC] Use RCU for tcp_ehash lookup Message-ID: <20040902211950.GH16175@wotan.suse.de> References: <20040831125941.GA5534@in.ibm.com> <20040831135419.GA17642@wotan.suse.de> <20040901113641.GA3918@in.ibm.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20040901113641.GA3918@in.ibm.com> X-archive-position: 8363 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ak@suse.de Precedence: bulk X-list: netdev On Wed, Sep 01, 2004 at 05:06:41PM +0530, Srivatsa Vaddagiri wrote: > | 2.6.8.1 | 2.6.8.1 + my patch > ------------------------------------------------------------------------------- > Average cycles | | > spent in | | > __tcp_v4_lookup_established | 2970.65 | 668.227 > | (~3.3 micro-seconds) | (~0.74 microseconds) > ------------------------------------------------------------------------------- > > This repesents improvement by a factor of 77.5%! Nice. > > > > > > And it should also fix the performance problems with > > cat /proc/net/tcp on ppc64/ia64 for large hash tables because the rw locks > > are gone. > > But spinlocks are in! Would that still improve the performance compared to rw > locks? 
(See me earlier note where I have explained that lookup done for > /proc/net/tcp is _not_ lock-free yet). Yes, spinlocks are much faster than rwlocks. -Andi From davem@redhat.com Thu Sep 2 14:45:52 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Sep 2004 14:45:58 -0700 (PDT) Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i82Ljpn3005512 for ; Thu, 2 Sep 2004 14:45:52 -0700 Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.12.10/8.12.10) with ESMTP id i82LjZS0001754; Thu, 2 Sep 2004 17:45:40 -0400 Received: from devserv.devel.redhat.com (devserv.devel.redhat.com [172.16.58.1]) by int-mx1.corp.redhat.com (8.11.6/8.11.6) with ESMTP id i82LjZ315553; Thu, 2 Sep 2004 17:45:35 -0400 Received: from cheetah.davemloft.net (localhost.localdomain [127.0.0.1]) by devserv.devel.redhat.com (8.12.11/8.12.10) with SMTP id i82LjRBl012123; Thu, 2 Sep 2004 17:45:27 -0400 Date: Thu, 2 Sep 2004 14:44:36 -0700 From: "David S. 
Miller" To: Patrick McHardy Cc: yoshfuji@linux-ipv6.org, herbert@debian.org, netdev@oss.sgi.com Subject: Re: [PATCH 2.6]: Fix suboptimal fragment sizing for last fragment Message-Id: <20040902144436.2c8c1337.davem@redhat.com> In-Reply-To: <4137681D.3000902@trash.net> References: <4137681D.3000902@trash.net> X-Mailer: Sylpheed version 0.9.12 (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 8364 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Thu, 02 Sep 2004 20:36:13 +0200 Patrick McHardy wrote: > This patch always builds mtu sized fragments and truncates the previous > fragment to a multiple of 8 bytes when allocating a new one. With the > patch the dump looks like this: Looks great Patrick, applied. I see only one remaining possible improvement. If the fraggap area is paged data, we probably should try use page frags in the new SKB if this split occurs in ip_append_page(). From herbert@gondor.apana.org.au Thu Sep 2 15:04:19 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Sep 2004 15:04:28 -0700 (PDT) Received: from arnor.apana.org.au (mail@arnor.apana.org.au [203.14.152.115]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i82M4HDs006026 for ; Thu, 2 Sep 2004 15:04:18 -0700 Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1C2zgK-0007pN-00; Fri, 03 Sep 2004 08:03:48 +1000 Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1C2zgG-0000vj-00; Fri, 03 Sep 2004 08:03:44 +1000 Date: Fri, 3 Sep 2004 08:03:44 +1000 To: "David S. 
Miller" Cc: Patrick McHardy , yoshfuji@linux-ipv6.org, netdev@oss.sgi.com Subject: Re: [PATCH 2.6]: Fix suboptimal fragment sizing for last fragment Message-ID: <20040902220343.GA3250@gondor.apana.org.au> References: <4137681D.3000902@trash.net> <20040902144436.2c8c1337.davem@redhat.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="fdj2RfSjLxBAspz7" Content-Disposition: inline In-Reply-To: <20040902144436.2c8c1337.davem@redhat.com> User-Agent: Mutt/1.5.6+20040722i From: Herbert Xu X-archive-position: 8365 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev --fdj2RfSjLxBAspz7 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline On Thu, Sep 02, 2004 at 02:44:36PM -0700, David S. Miller wrote: > > I see only one remaining possible improvement. If the fraggap > area is paged data, we probably should try use page frags > in the new SKB if this split occurs in ip_append_page(). Yes. That could also tie into another optimisation. But it's an ugly one :) The sk->csum values are added up at the end of the processing so when we move bytes to and fro the total csum value stays the same even if the fragment csum values change. Therefore we can get rid of the csum adjustments except for the parity bit in the getfrag call. Is this an acceptable optimisation to you guys? Another thing we can do is to not always fill up the frags in the middle and then move bytes off them. As it is if you do a send that spans multiple packets each fragment will be filled up to the full and then chopped off when the next one is started. And to finish it off, here is a really trivial patch to shave off 27 bytes from the source code :) It does nothing else, well unless your compiler decides to compile csum_block_sub out-of-line. 
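Why the replacement changes nothing can be checked in user space. The sketch below re-creates the one's-complement helpers as plain C, modelled on the kernel's inline checksum helpers; it is an assumption that they match the tree in question exactly, and the uint32_t types are a stand-in for the kernel's own. The only difference between the two subtract routines is the byte swap applied for odd offsets, so with the literal offset 0 used at these call sites they return the same value.

```c
#include <stdint.h>

/* One's-complement add of two 32-bit partial checksums (end-around carry). */
static uint32_t csum_add(uint32_t csum, uint32_t addend)
{
	csum += addend;
	return csum + (csum < addend);
}

/* One's-complement subtract: add the bitwise complement of the addend. */
static uint32_t csum_sub(uint32_t csum, uint32_t addend)
{
	return csum_add(csum, ~addend);
}

/*
 * Subtract the checksum of a block that started `offset` bytes into the
 * buffer. An odd offset means the block's bytes sit in the opposite
 * lanes of the 16-bit sum, so its checksum is byte-swapped first.
 */
static uint32_t csum_block_sub(uint32_t csum, uint32_t csum2, int offset)
{
	if (offset & 1)
		csum2 = ((csum2 & 0xFF00FF) << 8) + ((csum2 >> 8) & 0xFF00FF);
	return csum_sub(csum, csum2);
}
```

For any pair of values, csum_block_sub(a, b, 0) and csum_sub(a, b) agree; the two only diverge when the offset is odd, which never happens with a constant 0.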
Signed-off-by: Herbert Xu Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt --fdj2RfSjLxBAspz7 Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename=p ===== net/ipv4/ip_output.c 1.65 vs edited ===== --- 1.65/net/ipv4/ip_output.c 2004-09-02 15:07:28 +10:00 +++ edited/net/ipv4/ip_output.c 2004-09-03 07:53:54 +10:00 @@ -896,8 +896,8 @@ skb->csum = skb_copy_and_csum_bits( skb_prev, maxfraglen, data + transhdrlen, fraggap, 0); - skb_prev->csum = csum_block_sub( - skb_prev->csum, skb->csum, 0); + skb_prev->csum = csum_sub(skb_prev->csum, + skb->csum); data += fraggap; skb_trim(skb_prev, maxfraglen); } @@ -1094,8 +1094,8 @@ skb->csum = skb_copy_and_csum_bits( skb_prev, maxfraglen, data, fraggap, 0); - skb_prev->csum = csum_block_sub( - skb_prev->csum, skb->csum, 0); + skb_prev->csum = csum_sub(skb_prev->csum, + skb->csum); skb_trim(skb_prev, maxfraglen); } ===== net/ipv6/ip6_output.c 1.70 vs edited ===== --- 1.70/net/ipv6/ip6_output.c 2004-09-02 15:07:29 +10:00 +++ edited/net/ipv6/ip6_output.c 2004-09-03 07:54:41 +10:00 @@ -985,8 +985,8 @@ skb->csum = skb_copy_and_csum_bits( skb_prev, maxfraglen, data + transhdrlen, fraggap, 0); - skb_prev->csum = csum_block_sub( - skb_prev->csum, skb->csum, 0); + skb_prev->csum = csum_sub(skb_prev->csum, + skb->csum); data += fraggap; skb_trim(skb_prev, maxfraglen); } --fdj2RfSjLxBAspz7-- From keizerflipje@home.nl Thu Sep 2 15:05:56 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Sep 2004 15:06:03 -0700 (PDT) Received: from smtpq3.home.nl (smtpq3.home.nl [213.51.128.198]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i82M5tmo006283 for ; Thu, 2 Sep 2004 15:05:56 -0700 Received: from [213.51.128.135] (port=55609 helo=smtp4.home.nl) by smtpq3.home.nl with esmtp (Exim 4.30) id 1C2ziE-0008C4-D0; Fri, 03 Sep 2004 00:05:46 +0200 Received: from 
cp232498-a.gelen1.lb.home.nl ([217.120.68.81]:51665 helo=[10.0.0.200]) by smtp4.home.nl with esmtp (Exim 4.30) id 1C2ziB-0002zZ-Ev; Fri, 03 Sep 2004 00:05:43 +0200 Message-ID: <41379937.5060301@home.nl> Date: Fri, 03 Sep 2004 00:05:43 +0200 From: Pascal de Bruijn User-Agent: Mozilla Thunderbird 0.7.3 (Windows/20040803) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Francois Romieu CC: netdev@oss.sgi.com Subject: Re: r8169 1.6LK lockup References: <4137418D.2040706@home.nl> <20040902174437.GA12068@electric-eye.fr.zoreil.com> In-Reply-To: <20040902174437.GA12068@electric-eye.fr.zoreil.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-AtHome-MailScanner-Information: Neem contact op met support@home.nl voor meer informatie X-AtHome-MailScanner: Found to be clean X-archive-position: 8366 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: keizerflipje@home.nl Precedence: bulk X-list: netdev Right, I've patched my kernel, and now everything works like a breeze... I enabled all features: ethtool -K eth0 rx on tx on sg on tso on Then I uploaded 10GB and downloaded about 2GB (in an alternating style). No more lockups... Good work! 
Thanks, Pascal de Bruijn Francois Romieu wrote: >Pascal de Bruijn : >[francois messed the r8169 update] > >How does the system perform if you apply the patches below as well: >http://www.fr.zoreil.com/linux/kernel/2.6.x/2.6.9-rc1-mm1/r8169-130.patch >http://www.fr.zoreil.com/linux/kernel/2.6.x/2.6.9-rc1-mm1/r8169-140.patch > >Please Cc: netdev@oss.sgi.com > >-- >Ueimor > > > From davem@redhat.com Thu Sep 2 15:10:00 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Sep 2004 15:10:06 -0700 (PDT) Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i82M9x4x006811 for ; Thu, 2 Sep 2004 15:10:00 -0700 Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.12.10/8.12.10) with ESMTP id i82M9US0007366; Thu, 2 Sep 2004 18:09:42 -0400 Received: from devserv.devel.redhat.com (devserv.devel.redhat.com [172.16.58.1]) by int-mx1.corp.redhat.com (8.11.6/8.11.6) with ESMTP id i82M9T321944; Thu, 2 Sep 2004 18:09:29 -0400 Received: from cheetah.davemloft.net (localhost.localdomain [127.0.0.1]) by devserv.devel.redhat.com (8.12.11/8.12.10) with SMTP id i82M9Ljj022112; Thu, 2 Sep 2004 18:09:21 -0400 Date: Thu, 2 Sep 2004 15:08:30 -0700 From: "David S. 
Miller" To: Herbert Xu Cc: kaber@trash.net, yoshfuji@linux-ipv6.org, netdev@oss.sgi.com Subject: Re: [PATCH 2.6]: Fix suboptimal fragment sizing for last fragment Message-Id: <20040902150830.6585cfc5.davem@redhat.com> In-Reply-To: <20040902220343.GA3250@gondor.apana.org.au> References: <4137681D.3000902@trash.net> <20040902144436.2c8c1337.davem@redhat.com> <20040902220343.GA3250@gondor.apana.org.au> X-Mailer: Sylpheed version 0.9.12 (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 8367 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev On Fri, 3 Sep 2004 08:03:44 +1000 Herbert Xu wrote: > That could also tie into another optimisation. But it's > an ugly one :) The sk->csum values are added up at the end of > the processing so when we move bytes to and fro the total csum > value stays the same even if the fragment csum values change. > > Therefore we can get rid of the csum adjustments except for the > parity bit in the getfrag call. > > Is this an acceptable optimisation to you guys? I have no problems with this. > Another thing we can do is to not always fill up the frags in the middle > and then move bytes off them. As it is if you do a send that spans > multiple packets each fragment will be filled up to the full and then > chopped off when the next one is started. Please elaborate. > And to finish it off, here is a really trivial patch to shave off 27 > bytes from the source code :) It does nothing else, well unless your > compiler decides to compile csum_block_sub out-of-line. 
Applied :-) From romieu@fr.zoreil.com Thu Sep 2 16:06:21 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Sep 2004 16:06:27 -0700 (PDT) Received: from fr.zoreil.com (electric-eye.fr.zoreil.com [213.41.134.224]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i82N6KBe007957 for ; Thu, 2 Sep 2004 16:06:21 -0700 Received: from electric-eye.fr.zoreil.com (localhost.localdomain [127.0.0.1]) by fr.zoreil.com (8.12.10/8.12.1) with ESMTP id i82N2fvr016876; Fri, 3 Sep 2004 01:02:41 +0200 Received: (from romieu@localhost) by electric-eye.fr.zoreil.com (8.12.10/8.12.10/Submit) id i82N2eqD016875; Fri, 3 Sep 2004 01:02:40 +0200 Date: Fri, 3 Sep 2004 01:02:40 +0200 From: Francois Romieu To: Pascal de Bruijn Cc: netdev@oss.sgi.com Subject: Re: r8169 1.6LK lockup Message-ID: <20040902230240.GA16747@electric-eye.fr.zoreil.com> References: <4137418D.2040706@home.nl> <20040902174437.GA12068@electric-eye.fr.zoreil.com> <41379937.5060301@home.nl> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <41379937.5060301@home.nl> User-Agent: Mutt/1.4.1i X-Organisation: Land of Sunshine Inc. X-archive-position: 8368 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: romieu@fr.zoreil.com Precedence: bulk X-list: netdev Pascal de Bruijn : [fully featured patched r8169 driver] > > Then I uploaded 10GB and downloaded about 2GB (in an alternating style). > > No more lockups... > > Good work! It appears to be better but still not rock solid. Grrr... 
-- Ueimor From yoshfuji@linux-ipv6.org Thu Sep 2 18:40:04 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Sep 2004 18:40:10 -0700 (PDT) Received: from yue.st-paulia.net ([203.178.140.15]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i831e3Wr014315 for ; Thu, 2 Sep 2004 18:40:03 -0700 Received: from localhost (localhost [127.0.0.1]) by yue.st-paulia.net (Postfix) with ESMTP id 4CB4633CE6; Fri, 3 Sep 2004 10:40:51 +0900 (JST) Date: Fri, 03 Sep 2004 10:40:50 +0900 (JST) Message-Id: <20040903.104050.29603454.yoshfuji@linux-ipv6.org> To: davem@redhat.com, herbert@gondor.apana.org.au Cc: kaber@trash.net, netdev@oss.sgi.com, yoshfuji@linux-ipv6.org Subject: Re: [PATCH 2.6]: Fix suboptimal fragment sizing for last fragment From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: <20040902220343.GA3250@gondor.apana.org.au> References: <4137681D.3000902@trash.net> <20040902144436.2c8c1337.davem@redhat.com> <20040902220343.GA3250@gondor.apana.org.au> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 8369 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev In article <20040902220343.GA3250@gondor.apana.org.au> (at Fri, 3 Sep 2004 08:03:44 +1000), Herbert Xu says: > Another thing we can do is to not always fill up the frags in the middle > and then move bytes off them. 
As it is if you do a send that spans > multiple packets each fragment will be filled up to the full and then > chopped off when the next one is started. I think I had similar impression when I saw the Patrick's patch. Here's the optimization: if we know the remaining data exceeds the mtu, we do not need to fill up full of it. skb_prev: mtu +----------+--+-+ | | | | +----------+--+-+ ^ ^ skb_prev->len | maxfraglen appending data: +--------+ | | +--------+ ---------> length In this case, we know we need more fragment(s). So, let's fill up to maxfraglen (instead of mtu) to avoid needless copy in the next loop. Signed-off-by: Hideaki YOSHIFUJI ===== net/ipv4/ip_output.c 1.67 vs edited ===== --- 1.67/net/ipv4/ip_output.c 2004-09-03 06:50:20 +09:00 +++ edited/net/ipv4/ip_output.c 2004-09-03 10:15:53 +09:00 @@ -811,26 +811,33 @@ goto alloc_new_skb; while (length > 0) { - if ((copy = mtu - skb->len) <= 0) { + /* Check if the remaining data fits into current packet. */ + copy = mtu - skb->len; + if (copy < length) + copy = maxfraglen - skb->len; + if (copy <= 0) { char *data; unsigned int datalen; unsigned int fraglen; unsigned int fraggap; unsigned int alloclen; struct sk_buff *skb_prev; - BUG_TRAP(copy == 0); - alloc_new_skb: skb_prev = skb; - fraggap = 0; if (skb_prev) - fraggap = mtu - maxfraglen; - - datalen = mtu - fragheaderlen; - if (datalen > length + fraggap) - datalen = length + fraggap; + fraggap = skb_prev->len - maxfraglen; + else + fraggap = 0; + /* + * If remaining data exceeds the mtu, + * we know we need more fragment(s). + */ + datalen = length + fraggap; + if (datalen > mtu - fragheaderlen) + datalen = maxfraglen - fragheaderlen; fraglen = datalen + fragheaderlen; + if ((flags & MSG_MORE) && !(rt->u.dst.dev->features&NETIF_F_SG)) alloclen = mtu; @@ -1026,18 +1033,22 @@ while (size > 0) { int i; - if ((len = mtu - skb->len) <= 0) { + + /* Check if the remaining data fits into current packet. 
*/ + len = mtu - skb->len; + if (len > size) + len = maxfraglen - skb->len; + if (len <= 0) { struct sk_buff *skb_prev; char *data; struct iphdr *iph; int alloclen; - BUG_TRAP(len == 0); - skb_prev = skb; - fraggap = 0; if (skb_prev) - fraggap = mtu - maxfraglen; + fraggap = skb_prev->len - maxfraglen; + else + fraggap = 0; alloclen = fragheaderlen + hh_len + fraggap + 15; skb = sock_wmalloc(sk, alloclen, 1, sk->sk_allocation); ===== net/ipv6/ip6_output.c 1.72 vs edited ===== --- 1.72/net/ipv6/ip6_output.c 2004-09-03 06:50:20 +09:00 +++ edited/net/ipv6/ip6_output.c 2004-09-03 10:24:57 +09:00 @@ -898,26 +898,34 @@ goto alloc_new_skb; while (length > 0) { - if ((copy = mtu - skb->len) <= 0) { + /* Check if the remaining data fits into current packet. */ + copy = mtu - skb->len; + if (copy < length) + copy = maxfraglen - skb->len; + + if (copy <= 0) { char *data; unsigned int datalen; unsigned int fraglen; unsigned int fraggap; unsigned int alloclen; struct sk_buff *skb_prev; - BUG_TRAP(copy == 0); alloc_new_skb: skb_prev = skb; /* There's no room in the current skb */ - fraggap = 0; if (skb_prev) - fraggap = mtu - maxfraglen; - - datalen = mtu - fragheaderlen; + fraggap = skb_prev->len - maxfraglen; + else + fraggap = 0; - if (datalen > length + fraggap) - datalen = length + fraggap; + /* + * If remaining data exceeds the mtu, + * we know we need more fragment(s). 
+ */ + datalen = length + fraggap; + if (datalen > mtu - fragheaderlen) + datalen = maxfraglen - fragheaderlen; fraglen = datalen + fragheaderlen; if ((flags & MSG_MORE) && -- Hideaki YOSHIFUJI @ USAGI Project GPG FP: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA From vatsa@in.ibm.com Thu Sep 2 21:04:53 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Sep 2004 21:05:05 -0700 (PDT) Received: from e1.ny.us.ibm.com (e1.ny.us.ibm.com [32.97.182.101]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i8344qbw020552 for ; Thu, 2 Sep 2004 21:04:53 -0700 Received: from northrelay02.pok.ibm.com (northrelay02.pok.ibm.com [9.56.224.150]) by e1.ny.us.ibm.com (8.12.10/NS PXFA) with ESMTP id i8344dCR013368; Fri, 3 Sep 2004 00:04:39 -0400 Received: from snowy.in.ibm.com (d01av02.pok.ibm.com [9.56.224.216]) by northrelay02.pok.ibm.com (8.12.10/NCO/VER6.6) with ESMTP id i8345mND146320; Fri, 3 Sep 2004 00:05:51 -0400 Received: by snowy.in.ibm.com (Postfix, from userid 502) id ED4CD24E32; Thu, 2 Sep 2004 19:34:44 +0530 (IST) Date: Thu, 2 Sep 2004 19:34:44 +0530 From: Srivatsa Vaddagiri To: "David S. Miller" Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org, dipankar@in.ibm.com, paulmck@us.ibm.com Subject: Re: [RFC] Use RCU for tcp_ehash lookup Message-ID: <20040902140444.GA4808@in.ibm.com> Reply-To: vatsa@in.ibm.com References: <20040831125941.GA5534@in.ibm.com> <20040901224108.3b2d692d.davem@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20040901224108.3b2d692d.davem@redhat.com> User-Agent: Mutt/1.4.1i X-archive-position: 8370 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: vatsa@in.ibm.com Precedence: bulk X-list: netdev On Wed, Sep 01, 2004 at 10:41:08PM -0700, David S. Miller wrote: > The reason you don't see any improvement is that the ehash table is > pretty write heavy. 
In my simple one-file-transfer-at-a-time test, it should have been
read-mostly. The fact that lookups are not serialized wrt input packet
processing may have overshadowed the benefits of lock-free lookup.
However, if I have multiple file transfer sessions in progress (one per
CPU, maybe), then the benefit of reduced time spent looking up a socket
could be passed on to threads doing network input.

> I'm not totally against your patch, I just don't think that the TCP established
> hash table qualifies as "read heavy" as per what RCU is truly effective for.

IMHO the benefits of lock-free lookup will be seen only in such
scenarios, i.e. where read_lock ends up having to spin-wait for an
update to finish. In the lock-free case, there is no such wait.

> That's exactly what I was concerned about when I saw that you had attempted
> this change. It is incredibly important for state changes and updates to
> be seen as atomic by the packet input processing engine. It would be illegal
> for a cpu running TCP input to see a socket in two tables at the same time
> (for example, in the main established area and in the second half for TIME_WAIT
> buckets).
>
> If the visibility of the socket is wrong, sockets could be erroneously
> be reset during the transition from established to TIME_WAIT state.
> Beware!

This is precisely the reason why I changed the order of movement in
__tcp_tw_hashdance. Earlier, it removed the socket from the established
half and _then_ added it to the time-wait half. This would have led to a
window where the socket is in neither the established half nor the
time-wait half. A packet arriving in this window (& doing lock-free
lookup) would have been dropped. Hence I reversed the order of movement:
add to the time-wait half first, then remove from the established half.

> > Note that __tcp_v4_lookup_established should not be affected by the above
> > movement because I found it scans the established half first and _then_ the
> > time wait half.
> > So even if the same socket is present in both established half
> > and time wait half, __tcp_v4_lookup_established will lookup only one of them
> > (& not both).
>
> I hope this is true.

AFAICS it is true! If __tcp_v4_lookup_established finds it in the
established half, it does no further lookup in the time-wait half.

-- 
Thanks and Regards,
Srivatsa Vaddagiri,
Linux Technology Center,
IBM Software Labs,
Bangalore, INDIA - 560017

From max@stro.at Thu Sep 2 23:47:02 2004
Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Sep 2004 23:47:08 -0700 (PDT)
Received: from baikonur.stro.at (baikonur.stro.at [213.239.196.228])
	by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i836l1Yp024411
	for ; Thu, 2 Sep 2004 23:47:02 -0700
Received: from localhost (localhost [127.0.0.1])
	by baikonur.stro.at (Postfix) with ESMTP id 271885C065;
	Fri, 3 Sep 2004 08:46:51 +0200 (CEST)
Received: from baikonur.stro.at ([127.0.0.1])
	by localhost (baikonur [127.0.0.1]) (amavisd-new, port 10024)
	with ESMTP id 28481-06; Fri, 3 Sep 2004 08:46:50 +0200 (CEST)
Received: from sputnik (M914P027.adsl.highway.telekom.at [62.47.146.59])
	by baikonur.stro.at (Postfix) with ESMTP id 8A0AA5C008;
	Fri, 3 Sep 2004 08:46:50 +0200 (CEST)
Received: from max by sputnik with local (Exim 4.34)
	id 1C37qV-0001Hl-Le; Fri, 03 Sep 2004 08:46:51 +0200
Date: Fri, 3 Sep 2004 08:46:51 +0200
From: maximilian attems
To: netdev@oss.sgi.com, jgarzik@pobox.com
Subject: [patch] prism54 remove unused macro
Message-ID: <20040903064651.GB1856@stro.at>
Mail-Followup-To: netdev@oss.sgi.com, jgarzik@pobox.com
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.6+20040722i
X-Virus-Scanned: by Amavis (ClamAV) at stro.at
X-archive-position: 8371
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: janitor@sternwelten.at
Precedence: bulk
X-list: netdev

the TRACE macro was noticed because of its string concatenation.
this patch removes it as it's unused. Signed-off-by: Maximilian Attems --- linux-2.6.9-rc1-bk7-orig/drivers/net/wireless/prism54/islpci_mgt.h 2004-08-14 12:54:51.000000000 +0200 +++ linux-2.6.9-rc1-bk7/drivers/net/wireless/prism54/islpci_mgt.h 2004-09-03 08:35:30.000000000 +0200 @@ -31,8 +31,6 @@ #define K_DEBUG(f, m, args...) do { if(f & m) printk(KERN_DEBUG args); } while(0) #define DEBUG(f, args...) K_DEBUG(f, pc_debug, args) -#define TRACE(devname) K_DEBUG(SHOW_TRACING, VERBOSE, "%s: -> " __FUNCTION__ "()\n", devname) - extern int pc_debug; #define init_wds 0 /* help compiler optimize away dead code */ --- linux-2.6.9-rc1-bk7-orig/drivers/net/wireless/prism54/islpci_hotplug.c 2004-08-14 12:55:32.000000000 +0200 +++ linux-2.6.9-rc1-bk7/drivers/net/wireless/prism54/islpci_hotplug.c 2004-09-03 08:35:45.000000000 +0200 @@ -107,8 +107,6 @@ prism54_probe(struct pci_dev *pdev, cons islpci_private *priv; int rvalue; - /* TRACE(DRV_NAME); */ - /* Enable the pci device */ if (pci_enable_device(pdev)) { From laforge@netfilter.org Fri Sep 3 00:02:53 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Sep 2004 00:03:00 -0700 (PDT) Received: from coruscant.gnumonks.org (mail@coruscant.franken.de [193.174.159.226]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i8372q8R025104 for ; Fri, 3 Sep 2004 00:02:52 -0700 Received: from [192.168.200.2] (helo=sunbeam.gnumonks.org) by coruscant.gnumonks.org with esmtp (TLSv1:RC4-SHA:128) (Exim 4.20) id 1C385p-0001iI-J5; Fri, 03 Sep 2004 09:02:42 +0200 Received: from laforge by sunbeam.gnumonks.org with local (Exim 4.34) id 1C385i-0000R6-LU; Fri, 03 Sep 2004 09:02:34 +0200 Date: Fri, 3 Sep 2004 09:02:34 +0200 From: Harald Welte To: David Miller Cc: Netfilter Development Mailinglist , netdev@oss.sgi.com Subject: [PATCH 2.6] 2/2: Fix NAT helper locking Message-ID: <20040903070234.GQ26263@sunbeam.de.gnumonks.org> Mail-Followup-To: Harald Welte , David Miller , Netfilter Development Mailinglist , netdev@oss.sgi.com Mime-Version: 1.0 
Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="CRjAHycgiaTQGSqU" Content-Disposition: inline User-Agent: Mutt/1.5.6+20040818i X-archive-position: 8372 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: laforge@netfilter.org Precedence: bulk X-list: netdev --CRjAHycgiaTQGSqU Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Hi Dave! This is the second of a two part patch. This part fixes the locking in NAT helpers. Please apply, Thanks. # This is a BitKeeper generated diff -Nru style patch. # # ChangeSet # 2004/08/08 01:40:28+02:00 kaber@coreworks.de=20 # [NETFILTER]: Fix deadlock condition in conntrack/nat-helpers # =20 # There is a possible deadlock condition with conntrack/nat-helpers: # =20 # CPU1: # conntrack-helper:help: lock(private_lock) # ip_conntrack_expect_related: write_lock(ip_conntrack_lock) # =20 # CPU2: # nat-core:do_bindings: read_lock(ip_conntrack_lock) # nat-helper:help: lock(private_lock) # =20 # The lock in the nat-helper is unneccessary because the expectation # is never changed and is protected by ip_conntrack_lock. 
# =20 # Signed-off-by: Patrick McHardy # Signed-off-by: Harald Welte #=20 # net/ipv4/netfilter/ip_nat_irc.c # 2004/08/08 01:40:10+02:00 kaber@coreworks.de +1 -17 # [NETFILTER]: Fix deadlock condition in conntrack/nat-helpers #=20 # net/ipv4/netfilter/ip_nat_ftp.c # 2004/08/08 01:40:10+02:00 kaber@coreworks.de +1 -19 # [NETFILTER]: Fix deadlock condition in conntrack/nat-helpers #=20 # net/ipv4/netfilter/ip_conntrack_irc.c # 2004/08/08 01:40:10+02:00 kaber@coreworks.de +3 -4 # [NETFILTER]: Fix deadlock condition in conntrack/nat-helpers #=20 # net/ipv4/netfilter/ip_conntrack_ftp.c # 2004/08/08 01:40:10+02:00 kaber@coreworks.de +1 -2 # [NETFILTER]: Fix deadlock condition in conntrack/nat-helpers #=20 # include/linux/netfilter_ipv4/ip_conntrack_irc.h # 2004/08/08 01:40:10+02:00 kaber@coreworks.de +0 -5 # [NETFILTER]: Fix deadlock condition in conntrack/nat-helpers #=20 # include/linux/netfilter_ipv4/ip_conntrack_ftp.h # 2004/08/08 01:40:10+02:00 kaber@coreworks.de +0 -5 # [NETFILTER]: Fix deadlock condition in conntrack/nat-helpers #=20 diff -Nru a/include/linux/netfilter_ipv4/ip_conntrack_ftp.h b/include/linux= /netfilter_ipv4/ip_conntrack_ftp.h --- a/include/linux/netfilter_ipv4/ip_conntrack_ftp.h 2004-08-08 01:41:23 += 02:00 +++ b/include/linux/netfilter_ipv4/ip_conntrack_ftp.h 2004-08-08 01:41:23 += 02:00 @@ -4,11 +4,6 @@ =20 #ifdef __KERNEL__ =20 -#include - -/* Protects ftp part of conntracks */ -DECLARE_LOCK_EXTERN(ip_ftp_lock); - #define FTP_PORT 21 =20 #endif /* __KERNEL__ */ diff -Nru a/include/linux/netfilter_ipv4/ip_conntrack_irc.h b/include/linux= /netfilter_ipv4/ip_conntrack_irc.h --- a/include/linux/netfilter_ipv4/ip_conntrack_irc.h 2004-08-08 01:41:23 += 02:00 +++ b/include/linux/netfilter_ipv4/ip_conntrack_irc.h 2004-08-08 01:41:23 += 02:00 @@ -33,12 +33,7 @@ =20 #ifdef __KERNEL__ =20 -#include - #define IRC_PORT 6667 - -/* Protects irc part of conntracks */ -DECLARE_LOCK_EXTERN(ip_irc_lock); =20 #endif /* __KERNEL__ */ =20 diff -Nru 
a/net/ipv4/netfilter/ip_conntrack_ftp.c b/net/ipv4/netfilter/ip_c= onntrack_ftp.c --- a/net/ipv4/netfilter/ip_conntrack_ftp.c 2004-08-08 01:41:23 +02:00 +++ b/net/ipv4/netfilter/ip_conntrack_ftp.c 2004-08-08 01:41:23 +02:00 @@ -27,7 +27,7 @@ /* This is slow, but it's simple. --RR */ static char ftp_buffer[65536]; =20 -DECLARE_LOCK(ip_ftp_lock); +static DECLARE_LOCK(ip_ftp_lock); struct module *ip_conntrack_ftp =3D THIS_MODULE; =20 #define MAX_PORTS 8 @@ -455,7 +455,6 @@ } =20 PROVIDES_CONNTRACK(ftp); -EXPORT_SYMBOL(ip_ftp_lock); =20 module_init(init); module_exit(fini); diff -Nru a/net/ipv4/netfilter/ip_conntrack_irc.c b/net/ipv4/netfilter/ip_c= onntrack_irc.c --- a/net/ipv4/netfilter/ip_conntrack_irc.c 2004-08-08 01:41:23 +02:00 +++ b/net/ipv4/netfilter/ip_conntrack_irc.c 2004-08-08 01:41:23 +02:00 @@ -40,6 +40,7 @@ static unsigned int dcc_timeout =3D 300; /* This is slow, but it's simple. --RR */ static char irc_buffer[65536]; +static DECLARE_LOCK(irc_buffer_lock); =20 MODULE_AUTHOR("Harald Welte "); MODULE_DESCRIPTION("IRC (DCC) connection tracking helper"); @@ -54,7 +55,6 @@ static char *dccprotos[] =3D { "SEND ", "CHAT ", "MOVE ", "TSEND ", "SCHAT= " }; #define MINMATCHLEN 5 =20 -DECLARE_LOCK(ip_irc_lock); struct module *ip_conntrack_irc =3D THIS_MODULE; =20 #if 0 @@ -134,7 +134,7 @@ if (dataoff >=3D skb->len) return NF_ACCEPT; =20 - LOCK_BH(&ip_irc_lock); + LOCK_BH(&irc_buffer_lock); skb_copy_bits(skb, dataoff, irc_buffer, skb->len - dataoff); =20 data =3D irc_buffer; @@ -227,7 +227,7 @@ } /* while data < ... 
*/ =20 out: - UNLOCK_BH(&ip_irc_lock); + UNLOCK_BH(&irc_buffer_lock); return NF_ACCEPT; } =20 @@ -302,7 +302,6 @@ } =20 PROVIDES_CONNTRACK(irc); -EXPORT_SYMBOL(ip_irc_lock); =20 module_init(init); module_exit(fini); diff -Nru a/net/ipv4/netfilter/ip_nat_ftp.c b/net/ipv4/netfilter/ip_nat_ftp= =2Ec --- a/net/ipv4/netfilter/ip_nat_ftp.c 2004-08-08 01:41:23 +02:00 +++ b/net/ipv4/netfilter/ip_nat_ftp.c 2004-08-08 01:41:23 +02:00 @@ -35,8 +35,6 @@ =20 MODULE_PARM(ports, "1-" __MODULE_STRING(MAX_PORTS) "i"); =20 -DECLARE_LOCK_EXTERN(ip_ftp_lock); - /* FIXME: Time out? --RR */ =20 static unsigned int @@ -59,8 +57,6 @@ DEBUGP("nat_expected: We have a connection!\n"); exp_ftp_info =3D &ct->master->help.exp_ftp_info; =20 - LOCK_BH(&ip_ftp_lock); - if (exp_ftp_info->ftptype =3D=3D IP_CT_FTP_PORT || exp_ftp_info->ftptype =3D=3D IP_CT_FTP_EPRT) { /* PORT command: make connection go to the client. */ @@ -75,7 +71,6 @@ DEBUGP("nat_expected: PASV cmd. %u.%u.%u.%u->%u.%u.%u.%u\n", NIPQUAD(newsrcip), NIPQUAD(newdstip)); } - UNLOCK_BH(&ip_ftp_lock); =20 if (HOOK2MANIP(hooknum) =3D=3D IP_NAT_MANIP_SRC) newip =3D newsrcip; @@ -111,8 +106,6 @@ { char buffer[sizeof("nnn,nnn,nnn,nnn,nnn,nnn")]; =20 - MUST_BE_LOCKED(&ip_ftp_lock); - sprintf(buffer, "%u,%u,%u,%u,%u,%u", NIPQUAD(newip), port>>8, port&0xFF); =20 @@ -134,8 +127,6 @@ { char buffer[sizeof("|1|255.255.255.255|65535|")]; =20 - MUST_BE_LOCKED(&ip_ftp_lock); - sprintf(buffer, "|1|%u.%u.%u.%u|%u|", NIPQUAD(newip), port); =20 DEBUGP("calling ip_nat_mangle_tcp_packet\n"); @@ -156,8 +147,6 @@ { char buffer[sizeof("|||65535|")]; =20 - MUST_BE_LOCKED(&ip_ftp_lock); - sprintf(buffer, "|||%u|", port); =20 DEBUGP("calling ip_nat_mangle_tcp_packet\n"); @@ -189,7 +178,6 @@ u_int16_t port; struct ip_conntrack_tuple newtuple; =20 - MUST_BE_LOCKED(&ip_ftp_lock); DEBUGP("FTP_NAT: seq %u + %u in %u\n", expect->seq, exp_ftp_info->len, ntohl(tcph->seq)); @@ -268,15 +256,12 @@ } =20 datalen =3D (*pskb)->len - iph->ihl * 4 - tcph->doff * 4; - 
LOCK_BH(&ip_ftp_lock); /* If it's in the right range... */ if (between(exp->seq + exp_ftp_info->len, ntohl(tcph->seq), ntohl(tcph->seq) + datalen)) { - if (!ftp_data_fixup(exp_ftp_info, ct, pskb, ctinfo, exp)) { - UNLOCK_BH(&ip_ftp_lock); + if (!ftp_data_fixup(exp_ftp_info, ct, pskb, ctinfo, exp)) return NF_DROP; - } } else { /* Half a match? This means a partial retransmisison. It's a cracker being funky. */ @@ -286,11 +271,8 @@ ntohl(tcph->seq), ntohl(tcph->seq) + datalen); } - UNLOCK_BH(&ip_ftp_lock); return NF_DROP; } - UNLOCK_BH(&ip_ftp_lock); - return NF_ACCEPT; } =20 diff -Nru a/net/ipv4/netfilter/ip_nat_irc.c b/net/ipv4/netfilter/ip_nat_irc= =2Ec --- a/net/ipv4/netfilter/ip_nat_irc.c 2004-08-08 01:41:23 +02:00 +++ b/net/ipv4/netfilter/ip_nat_irc.c 2004-08-08 01:41:23 +02:00 @@ -44,9 +44,6 @@ MODULE_PARM(ports, "1-" __MODULE_STRING(MAX_PORTS) "i"); MODULE_PARM_DESC(ports, "port numbers of IRC servers"); =20 -/* protects irc part of conntracks */ -DECLARE_LOCK_EXTERN(ip_irc_lock); - /* FIXME: Time out? --RR */ =20 static unsigned int @@ -102,8 +99,6 @@ /* "4294967296 65635 " */ char buffer[18]; =20 - MUST_BE_LOCKED(&ip_irc_lock); - DEBUGP("IRC_NAT: info (seq %u + %u) in %u\n", expect->seq, exp_irc_info->len, ntohl(tcph->seq)); @@ -111,11 +106,6 @@ newip =3D ct->tuplehash[IP_CT_DIR_REPLY].tuple.dst.ip; =20 /* Alter conntrack's expectations. 
*/ - - /* We can read expect here without conntrack lock, since it's - only set in ip_conntrack_irc, with ip_irc_lock held - writable */ - t =3D expect->tuple; t.dst.ip =3D newip; for (port =3D exp_irc_info->port; port !=3D 0; port++) { @@ -185,15 +175,12 @@ DEBUGP("got beyond not touching\n"); =20 datalen =3D (*pskb)->len - iph->ihl * 4 - tcph->doff * 4; - LOCK_BH(&ip_irc_lock); /* Check whether the whole IP/address pattern is carried in the payload */ if (between(exp->seq + exp_irc_info->len, ntohl(tcph->seq), ntohl(tcph->seq) + datalen)) { - if (!irc_data_fixup(exp_irc_info, ct, pskb, ctinfo, exp)) { - UNLOCK_BH(&ip_irc_lock); + if (!irc_data_fixup(exp_irc_info, ct, pskb, ctinfo, exp)) return NF_DROP; - } } else {=20 /* Half a match? This means a partial retransmisison. It's a cracker being funky. */ @@ -204,11 +191,8 @@ ntohl(tcph->seq), ntohl(tcph->seq) + datalen); } - UNLOCK_BH(&ip_irc_lock); return NF_DROP; } - UNLOCK_BH(&ip_irc_lock); - return NF_ACCEPT; } =20 --=20 - Harald Welte http://www.netfilter.org/ =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D "Fragmentation is like classful addressing -- an interesting early architectural error that shows how much experimentation was going on while IP was being designed." 
-- Paul Vixie --CRjAHycgiaTQGSqU Content-Type: application/pgp-signature; name="signature.asc" Content-Description: Digital signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) iD8DBQFBOBcKXaXGVTD0i/8RAt9bAJ9mgosYQaErcJ6v/8dofPmfdB+HPwCcC+hu q/G+Xbg03/ej1aGUj2zhWSE= =+hPN -----END PGP SIGNATURE----- --CRjAHycgiaTQGSqU-- From laforge@netfilter.org Fri Sep 3 00:04:15 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Sep 2004 00:04:21 -0700 (PDT) Received: from coruscant.gnumonks.org (mail@coruscant.franken.de [193.174.159.226]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i8374ElJ025288 for ; Fri, 3 Sep 2004 00:04:14 -0700 Received: from [192.168.200.2] (helo=sunbeam.gnumonks.org) by coruscant.gnumonks.org with esmtp (TLSv1:RC4-SHA:128) (Exim 4.20) id 1C387A-0001kd-FJ; Fri, 03 Sep 2004 09:04:05 +0200 Received: from laforge by sunbeam.gnumonks.org with local (Exim 4.34) id 1C3873-0000RP-M4; Fri, 03 Sep 2004 09:03:57 +0200 Date: Fri, 3 Sep 2004 09:03:57 +0200 From: Harald Welte To: David Miller Cc: Netfilter Development Mailinglist , netdev@oss.sgi.com Subject: [PATCH 2.4] 1/2: Rename NAT helper structures Message-ID: <20040903070357.GR26263@sunbeam.de.gnumonks.org> Mail-Followup-To: Harald Welte , David Miller , Netfilter Development Mailinglist , netdev@oss.sgi.com Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="bxF9Dep5HzwGj9mC" Content-Disposition: inline User-Agent: Mutt/1.5.6+20040818i X-archive-position: 8373 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: laforge@netfilter.org Precedence: bulk X-list: netdev --bxF9Dep5HzwGj9mC Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Hi Dave! This is the first of a two part patch. 
Part one fixes confusing naming of some NAT helper data structures (ct_ are part of ip_conntrack, exp_ are part of ip_conntrack_expect). This patch is required to make the second apply, which fixes NAT helper locking. Please apply, thanks. # This is a BitKeeper generated diff -Nru style patch. # # ChangeSet # 2004/08/08 12:26:16+02:00 kaber@coreworks.de=20 # [NETFILTER]: Fix confusing naming in NAT-helpers # =20 # Signed-off-by: Patrick McHardy # Signed-off-by: Harald Welte #=20 # net/ipv4/netfilter/ip_nat_irc.c # 2004/08/08 12:26:12+02:00 kaber@coreworks.de +9 -9 # [NETFILTER]: Fix confusing naming in NAT-helpers #=20 # net/ipv4/netfilter/ip_nat_ftp.c # 2004/08/08 12:26:12+02:00 kaber@coreworks.de +12 -12 # [NETFILTER]: Fix confusing naming in NAT-helpers #=20 diff -Nru a/net/ipv4/netfilter/ip_nat_ftp.c b/net/ipv4/netfilter/ip_nat_ftp= =2Ec --- a/net/ipv4/netfilter/ip_nat_ftp.c 2004-08-08 12:49:36 +02:00 +++ b/net/ipv4/netfilter/ip_nat_ftp.c 2004-08-08 12:49:36 +02:00 @@ -166,7 +166,7 @@ [IP_CT_FTP_EPSV] mangle_epsv_packet }; =20 -static int ftp_data_fixup(const struct ip_ct_ftp_expect *ct_ftp_info, +static int ftp_data_fixup(const struct ip_ct_ftp_expect *exp_ftp_info, struct ip_conntrack *ct, struct sk_buff **pskb, enum ip_conntrack_info ctinfo, @@ -180,13 +180,13 @@ =20 MUST_BE_LOCKED(&ip_ftp_lock); DEBUGP("FTP_NAT: seq %u + %u in %u\n", - expect->seq, ct_ftp_info->len, + expect->seq, exp_ftp_info->len, ntohl(tcph->seq)); =20 /* Change address inside packet to match way we're mapping this connection. */ - if (ct_ftp_info->ftptype =3D=3D IP_CT_FTP_PASV - || ct_ftp_info->ftptype =3D=3D IP_CT_FTP_EPSV) { + if (exp_ftp_info->ftptype =3D=3D IP_CT_FTP_PASV + || exp_ftp_info->ftptype =3D=3D IP_CT_FTP_EPSV) { /* PASV/EPSV response: must be where client thinks server is */ newip =3D ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.dst.ip; @@ -208,7 +208,7 @@ newtuple.src.u.tcp.port =3D expect->tuple.src.u.tcp.port; =20 /* Try to get same port: if not, try to change it. 
*/ - for (port =3D ct_ftp_info->port; port !=3D 0; port++) { + for (port =3D exp_ftp_info->port; port !=3D 0; port++) { newtuple.dst.u.tcp.port =3D htons(port); =20 if (ip_conntrack_change_expect(expect, &newtuple) =3D=3D 0) @@ -217,9 +217,9 @@ if (port =3D=3D 0) return 0; =20 - if (!mangle[ct_ftp_info->ftptype](pskb, newip, port, + if (!mangle[exp_ftp_info->ftptype](pskb, newip, port, expect->seq - ntohl(tcph->seq), - ct_ftp_info->len, ct, ctinfo)) + exp_ftp_info->len, ct, ctinfo)) return 0; =20 return 1; @@ -236,12 +236,12 @@ struct tcphdr *tcph =3D (void *)iph + iph->ihl*4; unsigned int datalen; int dir; - struct ip_ct_ftp_expect *ct_ftp_info; + struct ip_ct_ftp_expect *exp_ftp_info; =20 if (!exp) DEBUGP("ip_nat_ftp: no exp!!"); =20 - ct_ftp_info =3D &exp->help.exp_ftp_info; + exp_ftp_info =3D &exp->help.exp_ftp_info; =20 /* Only mangle things once: original direction in POST_ROUTING and reply direction on PRE_ROUTING. */ @@ -259,10 +259,10 @@ datalen =3D (*pskb)->len - iph->ihl * 4 - tcph->doff * 4; LOCK_BH(&ip_ftp_lock); /* If it's in the right range... */ - if (between(exp->seq + ct_ftp_info->len, + if (between(exp->seq + exp_ftp_info->len, ntohl(tcph->seq), ntohl(tcph->seq) + datalen)) { - if (!ftp_data_fixup(ct_ftp_info, ct, pskb, ctinfo, exp)) { + if (!ftp_data_fixup(exp_ftp_info, ct, pskb, ctinfo, exp)) { UNLOCK_BH(&ip_ftp_lock); return NF_DROP; } @@ -271,7 +271,7 @@ It's a cracker being funky. 
*/ if (net_ratelimit()) { printk("FTP_NAT: partial packet %u/%u in %u/%u\n", - exp->seq, ct_ftp_info->len, + exp->seq, exp_ftp_info->len, ntohl(tcph->seq), ntohl(tcph->seq) + datalen); } diff -Nru a/net/ipv4/netfilter/ip_nat_irc.c b/net/ipv4/netfilter/ip_nat_irc= =2Ec --- a/net/ipv4/netfilter/ip_nat_irc.c 2004-08-08 12:49:36 +02:00 +++ b/net/ipv4/netfilter/ip_nat_irc.c 2004-08-08 12:49:36 +02:00 @@ -89,7 +89,7 @@ return ip_nat_setup_info(ct, &mr, hooknum); } =20 -static int irc_data_fixup(const struct ip_ct_irc_expect *ct_irc_info, +static int irc_data_fixup(const struct ip_ct_irc_expect *exp_irc_info, struct ip_conntrack *ct, struct sk_buff **pskb, enum ip_conntrack_info ctinfo, @@ -107,7 +107,7 @@ MUST_BE_LOCKED(&ip_irc_lock); =20 DEBUGP("IRC_NAT: info (seq %u + %u) in %u\n", - expect->seq, ct_irc_info->len, + expect->seq, exp_irc_info->len, ntohl(tcph->seq)); =20 newip =3D ct->tuplehash[IP_CT_DIR_REPLY].tuple.dst.ip; @@ -120,7 +120,7 @@ =20 t =3D expect->tuple; t.dst.ip =3D newip; - for (port =3D ct_irc_info->port; port !=3D 0; port++) { + for (port =3D exp_irc_info->port; port !=3D 0; port++) { t.dst.u.tcp.port =3D htons(port); if (ip_conntrack_change_expect(expect, &t) =3D=3D 0) { DEBUGP("using port %d", port); @@ -150,7 +150,7 @@ =20 return ip_nat_mangle_tcp_packet(pskb, ct, ctinfo,=20 expect->seq - ntohl(tcph->seq), - ct_irc_info->len, buffer,=20 + exp_irc_info->len, buffer,=20 strlen(buffer)); } =20 @@ -165,12 +165,12 @@ struct tcphdr *tcph =3D (void *) iph + iph->ihl * 4; unsigned int datalen; int dir; - struct ip_ct_irc_expect *ct_irc_info; + struct ip_ct_irc_expect *exp_irc_info; =20 if (!exp) DEBUGP("ip_nat_irc: no exp!!"); =09 - ct_irc_info =3D &exp->help.exp_irc_info; + exp_irc_info =3D &exp->help.exp_irc_info; =20 /* Only mangle things once: original direction in POST_ROUTING and reply direction on PRE_ROUTING. 
*/ @@ -189,10 +189,10 @@ datalen =3D (*pskb)->len - iph->ihl * 4 - tcph->doff * 4; LOCK_BH(&ip_irc_lock); /* Check wether the whole IP/address pattern is carried in the payload */ - if (between(exp->seq + ct_irc_info->len, + if (between(exp->seq + exp_irc_info->len, ntohl(tcph->seq), ntohl(tcph->seq) + datalen)) { - if (!irc_data_fixup(ct_irc_info, ct, pskb, ctinfo, exp)) { + if (!irc_data_fixup(exp_irc_info, ct, pskb, ctinfo, exp)) { UNLOCK_BH(&ip_irc_lock); return NF_DROP; } @@ -202,7 +202,7 @@ if (net_ratelimit()) { printk ("IRC_NAT: partial packet %u/%u in %u/%u\n", - exp->seq, ct_irc_info->len, + exp->seq, exp_irc_info->len, ntohl(tcph->seq), ntohl(tcph->seq) + datalen); } --=20 - Harald Welte http://www.netfilter.org/ =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D "Fragmentation is like classful addressing -- an interesting early architectural error that shows how much experimentation was going on while IP was being designed." 
-- Paul Vixie --bxF9Dep5HzwGj9mC Content-Type: application/pgp-signature; name="signature.asc" Content-Description: Digital signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) iD8DBQFBOBddXaXGVTD0i/8RAjD2AJ0d2eu3tiW8CgGBQmfLfMQ6fMsXjQCff2qV iiENgsMIUj+DlK6U5GumqEc= =F4Nx -----END PGP SIGNATURE----- --bxF9Dep5HzwGj9mC-- From laforge@netfilter.org Fri Sep 3 00:05:12 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Sep 2004 00:05:21 -0700 (PDT) Received: from coruscant.gnumonks.org (mail@coruscant.franken.de [193.174.159.226]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i8375BQr025559 for ; Fri, 3 Sep 2004 00:05:12 -0700 Received: from [192.168.200.2] (helo=sunbeam.gnumonks.org) by coruscant.gnumonks.org with esmtp (TLSv1:RC4-SHA:128) (Exim 4.20) id 1C3885-0001kx-Km; Fri, 03 Sep 2004 09:05:02 +0200 Received: from laforge by sunbeam.gnumonks.org with local (Exim 4.34) id 1C387z-0000Rd-1u; Fri, 03 Sep 2004 09:04:55 +0200 Date: Fri, 3 Sep 2004 09:04:55 +0200 From: Harald Welte To: David Miller Cc: Netfilter Development Mailinglist , netdev@oss.sgi.com Subject: [PATCH 2.4] 2/2: Fix NAT helper locking Message-ID: <20040903070455.GS26263@sunbeam.de.gnumonks.org> Mail-Followup-To: Harald Welte , David Miller , Netfilter Development Mailinglist , netdev@oss.sgi.com Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="qgEfXXHyyarqcYJd" Content-Disposition: inline User-Agent: Mutt/1.5.6+20040818i X-archive-position: 8374 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: laforge@netfilter.org Precedence: bulk X-list: netdev --qgEfXXHyyarqcYJd Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Hi Dave! This is the second of a two part patch. This part fixes the locking in NAT helpers. Please apply, Thanks. 
# This is a BitKeeper generated diff -Nru style patch. # # ChangeSet # 2004/08/08 12:49:20+02:00 kaber@coreworks.de=20 # [NETFILTER]: Fix deadlock condition in conntrack/nat-helpers # =20 # There is a possible deadlock condition with conntrack/nat-helpers: # =20 # CPU1: # conntrack-helper:help: lock(private_lock) # ip_conntrack_expect_related: write_lock(ip_conntrack_lock) # =20 # CPU2: # nat-core:do_bindings: read_lock(ip_conntrack_lock) # nat-helper:help: lock(private_lock) # =20 # The lock in the nat-helper is unneccessary because the expectation # is never changed and is protected by ip_conntrack_lock. # =20 # Signed-off-by: Patrick McHardy # Signed-off-by: Harald Welte #=20 # net/ipv4/netfilter/ip_nat_irc.c # 2004/08/08 12:49:15+02:00 kaber@coreworks.de +1 -17 # [NETFILTER]: Fix deadlock condition in conntrack/nat-helpers #=20 # net/ipv4/netfilter/ip_nat_ftp.c # 2004/08/08 12:49:15+02:00 kaber@coreworks.de +1 -19 # [NETFILTER]: Fix deadlock condition in conntrack/nat-helpers #=20 # net/ipv4/netfilter/ip_conntrack_irc.c # 2004/08/08 12:49:15+02:00 kaber@coreworks.de +0 -8 # [NETFILTER]: Fix deadlock condition in conntrack/nat-helpers #=20 # net/ipv4/netfilter/ip_conntrack_ftp.c # 2004/08/08 12:49:15+02:00 kaber@coreworks.de +3 -6 # [NETFILTER]: Fix deadlock condition in conntrack/nat-helpers #=20 # include/linux/netfilter_ipv4/ip_conntrack_irc.h # 2004/08/08 12:49:15+02:00 kaber@coreworks.de +0 -5 # [NETFILTER]: Fix deadlock condition in conntrack/nat-helpers #=20 # include/linux/netfilter_ipv4/ip_conntrack_ftp.h # 2004/08/08 12:49:15+02:00 kaber@coreworks.de +0 -5 # [NETFILTER]: Fix deadlock condition in conntrack/nat-helpers #=20 diff -Nru a/include/linux/netfilter_ipv4/ip_conntrack_ftp.h b/include/linux= /netfilter_ipv4/ip_conntrack_ftp.h --- a/include/linux/netfilter_ipv4/ip_conntrack_ftp.h 2004-08-08 12:49:45 += 02:00 +++ b/include/linux/netfilter_ipv4/ip_conntrack_ftp.h 2004-08-08 12:49:45 += 02:00 @@ -4,11 +4,6 @@ =20 #ifdef __KERNEL__ =20 -#include - 
-/* Protects ftp part of conntracks */ -DECLARE_LOCK_EXTERN(ip_ftp_lock); - #define FTP_PORT 21 =20 #endif /* __KERNEL__ */ diff -Nru a/include/linux/netfilter_ipv4/ip_conntrack_irc.h b/include/linux= /netfilter_ipv4/ip_conntrack_irc.h --- a/include/linux/netfilter_ipv4/ip_conntrack_irc.h 2004-08-08 12:49:45 += 02:00 +++ b/include/linux/netfilter_ipv4/ip_conntrack_irc.h 2004-08-08 12:49:45 += 02:00 @@ -33,17 +33,12 @@ =20 #ifdef __KERNEL__ =20 -#include - #define IRC_PORT 6667 =20 struct dccproto { char* match; int matchlen; }; - -/* Protects irc part of conntracks */ -DECLARE_LOCK_EXTERN(ip_irc_lock); =20 #endif /* __KERNEL__ */ =20 diff -Nru a/net/ipv4/netfilter/ip_conntrack_ftp.c b/net/ipv4/netfilter/ip_c= onntrack_ftp.c --- a/net/ipv4/netfilter/ip_conntrack_ftp.c 2004-08-08 12:49:45 +02:00 +++ b/net/ipv4/netfilter/ip_conntrack_ftp.c 2004-08-08 12:49:45 +02:00 @@ -11,7 +11,7 @@ #include #include =20 -DECLARE_LOCK(ip_ftp_lock); +static DECLARE_LOCK(ip_ftp_lock); struct module *ip_conntrack_ftp =3D THIS_MODULE; =20 #define MAX_PORTS 8 @@ -338,7 +338,6 @@ memset(&expect, 0, sizeof(expect)); =20 /* Update the ftp info */ - LOCK_BH(&ip_ftp_lock); if (htonl((array[0] << 24) | (array[1] << 16) | (array[2] << 8) | array[3= ]) =3D=3D ct->tuplehash[dir].tuple.src.ip) { exp->seq =3D ntohl(tcph->seq) + matchoff; @@ -358,7 +357,8 @@ for reporting this potential problem (DMZ machines opening holes to internal networks, or the packet filter itself). 
*/ - if (!loose) goto out; + if (!loose) + return NF_ACCEPT; } =20 exp->tuple =3D ((struct ip_conntrack_tuple) @@ -376,9 +376,6 @@ =20 /* Ignore failure; should only happen with NAT */ ip_conntrack_expect_related(ct, &expect); - out: - UNLOCK_BH(&ip_ftp_lock); - return NF_ACCEPT; } =20 diff -Nru a/net/ipv4/netfilter/ip_conntrack_irc.c b/net/ipv4/netfilter/ip_c= onntrack_irc.c --- a/net/ipv4/netfilter/ip_conntrack_irc.c 2004-08-08 12:49:45 +02:00 +++ b/net/ipv4/netfilter/ip_conntrack_irc.c 2004-08-08 12:49:45 +02:00 @@ -29,7 +29,6 @@ #include #include =20 -#include #include #include =20 @@ -61,7 +60,6 @@ }; #define MINMATCHLEN 5 =20 -DECLARE_LOCK(ip_irc_lock); struct module *ip_conntrack_irc =3D THIS_MODULE; =20 #if 0 @@ -208,8 +206,6 @@ =09 memset(&expect, 0, sizeof(expect)); =20 - LOCK_BH(&ip_irc_lock); - /* save position of address in dcc string, * neccessary for NAT */ DEBUGP("tcph->seq =3D %u\n", tcph->seq); @@ -236,8 +232,6 @@ ntohs(exp->tuple.dst.u.tcp.port)); =20 ip_conntrack_expect_related(ct, &expect); - UNLOCK_BH(&ip_irc_lock); - return NF_ACCEPT; } /* for .. NUM_DCCPROTO */ } /* while data < ... */ @@ -314,8 +308,6 @@ ip_conntrack_helper_unregister(&irc_helpers[i]); } } - -EXPORT_SYMBOL(ip_irc_lock); =20 module_init(init); module_exit(fini); diff -Nru a/net/ipv4/netfilter/ip_nat_ftp.c b/net/ipv4/netfilter/ip_nat_ftp= =2Ec --- a/net/ipv4/netfilter/ip_nat_ftp.c 2004-08-08 12:49:45 +02:00 +++ b/net/ipv4/netfilter/ip_nat_ftp.c 2004-08-08 12:49:45 +02:00 @@ -24,8 +24,6 @@ MODULE_PARM(ports, "1-" __MODULE_STRING(MAX_PORTS) "i"); #endif =20 -DECLARE_LOCK_EXTERN(ip_ftp_lock); - /* FIXME: Time out? --RR */ =20 static unsigned int @@ -48,8 +46,6 @@ DEBUGP("nat_expected: We have a connection!\n"); exp_ftp_info =3D &ct->master->help.exp_ftp_info; =20 - LOCK_BH(&ip_ftp_lock); - if (exp_ftp_info->ftptype =3D=3D IP_CT_FTP_PORT || exp_ftp_info->ftptype =3D=3D IP_CT_FTP_EPRT) { /* PORT command: make connection go to the client. 
*/ @@ -64,7 +60,6 @@ DEBUGP("nat_expected: PASV cmd. %u.%u.%u.%u->%u.%u.%u.%u\n", NIPQUAD(newsrcip), NIPQUAD(newdstip)); } - UNLOCK_BH(&ip_ftp_lock); =20 if (HOOK2MANIP(hooknum) =3D=3D IP_NAT_MANIP_SRC) newip =3D newsrcip; @@ -100,8 +95,6 @@ { char buffer[sizeof("nnn,nnn,nnn,nnn,nnn,nnn")]; =20 - MUST_BE_LOCKED(&ip_ftp_lock); - sprintf(buffer, "%u,%u,%u,%u,%u,%u", NIPQUAD(newip), port>>8, port&0xFF); =20 @@ -123,8 +116,6 @@ { char buffer[sizeof("|1|255.255.255.255|65535|")]; =20 - MUST_BE_LOCKED(&ip_ftp_lock); - sprintf(buffer, "|1|%u.%u.%u.%u|%u|", NIPQUAD(newip), port); =20 DEBUGP("calling ip_nat_mangle_tcp_packet\n"); @@ -145,8 +136,6 @@ { char buffer[sizeof("|||65535|")]; =20 - MUST_BE_LOCKED(&ip_ftp_lock); - sprintf(buffer, "|||%u|", port); =20 DEBUGP("calling ip_nat_mangle_tcp_packet\n"); @@ -178,7 +167,6 @@ u_int16_t port; struct ip_conntrack_tuple newtuple; =20 - MUST_BE_LOCKED(&ip_ftp_lock); DEBUGP("FTP_NAT: seq %u + %u in %u\n", expect->seq, exp_ftp_info->len, ntohl(tcph->seq)); @@ -257,15 +245,12 @@ } =20 datalen =3D (*pskb)->len - iph->ihl * 4 - tcph->doff * 4; - LOCK_BH(&ip_ftp_lock); /* If it's in the right range... */ if (between(exp->seq + exp_ftp_info->len, ntohl(tcph->seq), ntohl(tcph->seq) + datalen)) { - if (!ftp_data_fixup(exp_ftp_info, ct, pskb, ctinfo, exp)) { - UNLOCK_BH(&ip_ftp_lock); + if (!ftp_data_fixup(exp_ftp_info, ct, pskb, ctinfo, exp)) return NF_DROP; - } } else { /* Half a match? This means a partial retransmisison. It's a cracker being funky. 
*/ @@ -275,11 +260,8 @@ ntohl(tcph->seq), ntohl(tcph->seq) + datalen); } - UNLOCK_BH(&ip_ftp_lock); return NF_DROP; } - UNLOCK_BH(&ip_ftp_lock); - return NF_ACCEPT; } diff -Nru a/net/ipv4/netfilter/ip_nat_irc.c b/net/ipv4/netfilter/ip_nat_irc.c --- a/net/ipv4/netfilter/ip_nat_irc.c 2004-08-08 12:49:45 +02:00 +++ b/net/ipv4/netfilter/ip_nat_irc.c 2004-08-08 12:49:45 +02:00 @@ -46,9 +46,6 @@ MODULE_PARM_DESC(ports, "port numbers of IRC servers"); #endif -/* protects irc part of conntracks */ -DECLARE_LOCK_EXTERN(ip_irc_lock); - /* FIXME: Time out? --RR */ static unsigned int @@ -104,8 +101,6 @@ /* "4294967296 65635 " */ char buffer[18]; - MUST_BE_LOCKED(&ip_irc_lock); - DEBUGP("IRC_NAT: info (seq %u + %u) in %u\n", expect->seq, exp_irc_info->len, ntohl(tcph->seq)); @@ -113,11 +108,6 @@ newip = ct->tuplehash[IP_CT_DIR_REPLY].tuple.dst.ip; /* Alter conntrack's expectations. */ - - /* We can read expect here without conntrack lock, since it's - only set in ip_conntrack_irc, with ip_irc_lock held - writable */ - t = expect->tuple; t.dst.ip = newip; for (port = exp_irc_info->port; port != 0; port++) { @@ -187,15 +177,12 @@ DEBUGP("got beyond not touching\n"); datalen = (*pskb)->len - iph->ihl * 4 - tcph->doff * 4; - LOCK_BH(&ip_irc_lock); /* Check wether the whole IP/address pattern is carried in the payload */ if (between(exp->seq + exp_irc_info->len, ntohl(tcph->seq), ntohl(tcph->seq) + datalen)) { - if (!irc_data_fixup(exp_irc_info, ct, pskb, ctinfo, exp)) { - UNLOCK_BH(&ip_irc_lock); + if (!irc_data_fixup(exp_irc_info, ct, pskb, ctinfo, exp)) return NF_DROP; - } } else { /* Half a match? This means a partial retransmisison. It's a cracker being funky.
*/ @@ -206,11 +193,8 @@ ntohl(tcph->seq), ntohl(tcph->seq) + datalen); } - UNLOCK_BH(&ip_irc_lock); return NF_DROP; } - UNLOCK_BH(&ip_irc_lock); - return NF_ACCEPT; } -- - Harald Welte http://www.netfilter.org/ ============================================================================ "Fragmentation is like classful addressing -- an interesting early architectural error that shows how much experimentation was going on while IP was being designed." -- Paul Vixie --qgEfXXHyyarqcYJd Content-Type: application/pgp-signature; name="signature.asc" Content-Description: Digital signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) iD8DBQFBOBeXXaXGVTD0i/8RAjEuAJ41bNu5bI3O7R6UJQskFf5wGN5G9QCfbGDk Joq5HfUvLv0Fokn9mwXPG50= =aVT1 -----END PGP SIGNATURE----- --qgEfXXHyyarqcYJd-- From laforge@netfilter.org Fri Sep 3 00:06:03 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Sep 2004 00:06:13 -0700 (PDT) Received: from coruscant.gnumonks.org (mail@coruscant.franken.de [193.174.159.226]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i83762W2025812 for ; Fri, 3 Sep 2004 00:06:02 -0700 Received: from [192.168.200.2] (helo=sunbeam.gnumonks.org) by coruscant.gnumonks.org with esmtp (TLSv1:RC4-SHA:128) (Exim 4.20) id 1C388u-0001lP-Oc; Fri, 03 Sep 2004 09:05:53 +0200 Received: from laforge by sunbeam.gnumonks.org with local (Exim 4.34) id 1C388p-0000Rs-Jh; Fri, 03 Sep 2004 09:05:47 +0200 Date: Fri, 3 Sep 2004 09:05:47 +0200 From: Harald Welte To: Netfilter Development Mailinglist Cc: netdev@oss.sgi.com Subject: [PATCH 2.6] 1/2: Rename NAT helper structures Message-ID: <20040903070547.GT26263@sunbeam.de.gnumonks.org> Mail-Followup-To: Harald Welte , Netfilter Development Mailinglist , netdev@oss.sgi.com Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1;
protocol="application/pgp-signature"; boundary="Ca23f2aBZR6YDKM9" Content-Disposition: inline User-Agent: Mutt/1.5.6+20040818i X-archive-position: 8375 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: laforge@netfilter.org Precedence: bulk X-list: netdev --Ca23f2aBZR6YDKM9 Content-Type: multipart/mixed; boundary="ngiTnHdmUEG79yp6" Content-Disposition: inline --ngiTnHdmUEG79yp6 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable I forgot to Cc the lists on this part of the patchset; here is the forwarded message: -- - Harald Welte http://www.netfilter.org/ ============================================================================ "Fragmentation is like classful addressing -- an interesting early architectural error that shows how much experimentation was going on while IP was being designed." -- Paul Vixie --ngiTnHdmUEG79yp6 Content-Type: message/rfc822 Content-Disposition: inline Date: Fri, 3 Sep 2004 09:00:17 +0200 From: Harald Welte To: David Miller Subject: [PATCH 2.6] 1/2: Rename NAT helper structures Message-ID: <20040903070017.GP26263@sunbeam.de.gnumonks.org> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="6eUvXotnMb6+obQB" Content-Disposition: inline User-Agent: Mutt/1.5.6+20040818i --6eUvXotnMb6+obQB Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Hi Dave! This is the first of a two-part patch. Part one fixes confusing naming of some NAT helper data structures (ct_ are part of ip_conntrack, exp_ are part of ip_conntrack_expect). This patch is required to make the second apply, which fixes NAT helper locking.
# This is a BitKeeper generated diff -Nru style patch. # # ChangeSet # 2004/08/07 23:30:12+02:00 kaber@coreworks.de # [NETFILTER]: Fix confusing naming in NAT-helpers # # Signed-off-by: Patrick McHardy # Signed-off-by: Harald Welte # # net/ipv4/netfilter/ip_nat_irc.c # 2004/08/07 23:29:48+02:00 kaber@coreworks.de +9 -9 # [NETFILTER]: Fix confusing naming in NAT-helpers # # net/ipv4/netfilter/ip_nat_ftp.c # 2004/08/07 23:29:48+02:00 kaber@coreworks.de +12 -12 # [NETFILTER]: Fix confusing naming in NAT-helpers # diff -Nru a/net/ipv4/netfilter/ip_nat_ftp.c b/net/ipv4/netfilter/ip_nat_ftp.c --- a/net/ipv4/netfilter/ip_nat_ftp.c 2004-08-08 01:41:06 +02:00 +++ b/net/ipv4/netfilter/ip_nat_ftp.c 2004-08-08 01:41:06 +02:00 @@ -177,7 +177,7 @@ [IP_CT_FTP_EPSV] = mangle_epsv_packet }; -static int ftp_data_fixup(const struct ip_ct_ftp_expect *ct_ftp_info, +static int ftp_data_fixup(const struct ip_ct_ftp_expect *exp_ftp_info, struct ip_conntrack *ct, struct sk_buff **pskb, enum ip_conntrack_info ctinfo, @@ -191,13 +191,13 @@ MUST_BE_LOCKED(&ip_ftp_lock); DEBUGP("FTP_NAT: seq %u + %u in %u\n", - expect->seq, ct_ftp_info->len, + expect->seq, exp_ftp_info->len, ntohl(tcph->seq)); /* Change address inside packet to match way we're mapping this connection. */ - if (ct_ftp_info->ftptype == IP_CT_FTP_PASV - || ct_ftp_info->ftptype == IP_CT_FTP_EPSV) { + if (exp_ftp_info->ftptype == IP_CT_FTP_PASV + || exp_ftp_info->ftptype == IP_CT_FTP_EPSV) { /* PASV/EPSV response: must be where client thinks server is */ newip = ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.dst.ip; @@ -219,7 +219,7 @@ newtuple.src.u.tcp.port = expect->tuple.src.u.tcp.port; /* Try to get same port: if not, try to change it.
*/ - for (port = ct_ftp_info->port; port != 0; port++) { + for (port = exp_ftp_info->port; port != 0; port++) { newtuple.dst.u.tcp.port = htons(port); if (ip_conntrack_change_expect(expect, &newtuple) == 0) @@ -228,9 +228,9 @@ if (port == 0) return 0; - if (!mangle[ct_ftp_info->ftptype](pskb, newip, port, + if (!mangle[exp_ftp_info->ftptype](pskb, newip, port, expect->seq - ntohl(tcph->seq), - ct_ftp_info->len, ct, ctinfo)) + exp_ftp_info->len, ct, ctinfo)) return 0; return 1; @@ -247,12 +247,12 @@ struct tcphdr *tcph = (void *)iph + iph->ihl*4; unsigned int datalen; int dir; - struct ip_ct_ftp_expect *ct_ftp_info; + struct ip_ct_ftp_expect *exp_ftp_info; if (!exp) DEBUGP("ip_nat_ftp: no exp!!"); - ct_ftp_info = &exp->help.exp_ftp_info; + exp_ftp_info = &exp->help.exp_ftp_info; /* Only mangle things once: original direction in POST_ROUTING and reply direction on PRE_ROUTING. */ @@ -270,10 +270,10 @@ datalen = (*pskb)->len - iph->ihl * 4 - tcph->doff * 4; LOCK_BH(&ip_ftp_lock); /* If it's in the right range... */ - if (between(exp->seq + ct_ftp_info->len, + if (between(exp->seq + exp_ftp_info->len, ntohl(tcph->seq), ntohl(tcph->seq) + datalen)) { - if (!ftp_data_fixup(ct_ftp_info, ct, pskb, ctinfo, exp)) { + if (!ftp_data_fixup(exp_ftp_info, ct, pskb, ctinfo, exp)) { UNLOCK_BH(&ip_ftp_lock); return NF_DROP; } @@ -282,7 +282,7 @@ It's a cracker being funky.
*/ if (net_ratelimit()) { printk("FTP_NAT: partial packet %u/%u in %u/%u\n", - exp->seq, ct_ftp_info->len, + exp->seq, exp_ftp_info->len, ntohl(tcph->seq), ntohl(tcph->seq) + datalen); } diff -Nru a/net/ipv4/netfilter/ip_nat_irc.c b/net/ipv4/netfilter/ip_nat_irc.c --- a/net/ipv4/netfilter/ip_nat_irc.c 2004-08-08 01:41:06 +02:00 +++ b/net/ipv4/netfilter/ip_nat_irc.c 2004-08-08 01:41:06 +02:00 @@ -87,7 +87,7 @@ return ip_nat_setup_info(ct, &mr, hooknum); } -static int irc_data_fixup(const struct ip_ct_irc_expect *ct_irc_info, +static int irc_data_fixup(const struct ip_ct_irc_expect *exp_irc_info, struct ip_conntrack *ct, struct sk_buff **pskb, enum ip_conntrack_info ctinfo, @@ -105,7 +105,7 @@ MUST_BE_LOCKED(&ip_irc_lock); DEBUGP("IRC_NAT: info (seq %u + %u) in %u\n", - expect->seq, ct_irc_info->len, + expect->seq, exp_irc_info->len, ntohl(tcph->seq)); newip = ct->tuplehash[IP_CT_DIR_REPLY].tuple.dst.ip; @@ -118,7 +118,7 @@ t = expect->tuple; t.dst.ip = newip; - for (port = ct_irc_info->port; port != 0; port++) { + for (port = exp_irc_info->port; port != 0; port++) { t.dst.u.tcp.port = htons(port); if (ip_conntrack_change_expect(expect, &t) == 0) { DEBUGP("using port %d", port); @@ -148,7 +148,7 @@ return ip_nat_mangle_tcp_packet(pskb, ct, ctinfo, expect->seq - ntohl(tcph->seq), - ct_irc_info->len, buffer, + exp_irc_info->len, buffer, strlen(buffer)); } @@ -163,12 +163,12 @@ struct tcphdr *tcph = (void *) iph + iph->ihl * 4; unsigned int datalen; int dir; - struct ip_ct_irc_expect *ct_irc_info; + struct ip_ct_irc_expect *exp_irc_info; if (!exp) DEBUGP("ip_nat_irc: no exp!!"); - ct_irc_info = &exp->help.exp_irc_info; + exp_irc_info = &exp->help.exp_irc_info; /* Only mangle things once: original direction in POST_ROUTING and reply direction on PRE_ROUTING.
*/ @@ -187,10 +187,10 @@ datalen = (*pskb)->len - iph->ihl * 4 - tcph->doff * 4; LOCK_BH(&ip_irc_lock); /* Check whether the whole IP/address pattern is carried in the payload */ - if (between(exp->seq + ct_irc_info->len, + if (between(exp->seq + exp_irc_info->len, ntohl(tcph->seq), ntohl(tcph->seq) + datalen)) { - if (!irc_data_fixup(ct_irc_info, ct, pskb, ctinfo, exp)) { + if (!irc_data_fixup(exp_irc_info, ct, pskb, ctinfo, exp)) { UNLOCK_BH(&ip_irc_lock); return NF_DROP; } @@ -200,7 +200,7 @@ if (net_ratelimit()) { printk ("IRC_NAT: partial packet %u/%u in %u/%u\n", - exp->seq, ct_irc_info->len, + exp->seq, exp_irc_info->len, ntohl(tcph->seq), ntohl(tcph->seq) + datalen); } -- - Harald Welte http://www.netfilter.org/ ============================================================================ "Fragmentation is like classful addressing -- an interesting early architectural error that shows how much experimentation was going on while IP was being designed."
-- Paul Vixie --6eUvXotnMb6+obQB Content-Type: application/pgp-signature; name="signature.asc" Content-Description: Digital signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) iD8DBQFBOBaBXaXGVTD0i/8RAotRAJ90N4o+BXwF6PAX/6OQANliZbjCggCgquzv tIf1zDcRfMVMkcFaTJTcVpI= =zNZp -----END PGP SIGNATURE----- --6eUvXotnMb6+obQB-- --ngiTnHdmUEG79yp6-- --Ca23f2aBZR6YDKM9 Content-Type: application/pgp-signature; name="signature.asc" Content-Description: Digital signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) iD8DBQFBOBfLXaXGVTD0i/8RAvWeAJ9cvcIu6ttmq5c5FpcZDQywBqlc8QCeJn44 c0xfFnKWBmgfVAoQS8qkUrM= =VeXZ -----END PGP SIGNATURE----- --Ca23f2aBZR6YDKM9-- From davem@davemloft.net Fri Sep 3 00:26:13 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Sep 2004 00:26:25 -0700 (PDT) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i837QCP7030073 for ; Fri, 3 Sep 2004 00:26:13 -0700 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1C38RJ-0008SX-00; Fri, 03 Sep 2004 00:24:53 -0700 Date: Fri, 3 Sep 2004 00:24:53 -0700 From: "David S. 
Miller" To: Harald Welte Cc: netfilter-devel@lists.netfilter.org, netdev@oss.sgi.com Subject: Re: [PATCH 2.6] 2/2: Fix NAT helper locking Message-Id: <20040903002453.77beee16.davem@davemloft.net> In-Reply-To: <20040903070234.GQ26263@sunbeam.de.gnumonks.org> References: <20040903070234.GQ26263@sunbeam.de.gnumonks.org> X-Mailer: Sylpheed version 0.9.12 (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 8376 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Fri, 3 Sep 2004 09:02:34 +0200 Harald Welte wrote: > This is the second of a two part patch. > > This part fixes the locking in NAT helpers. > > Please apply, Thanks. Applied both patches, but the first had serious offsets and the second had to be applied by hand due to rejects against the current 2.6.x sources. Ummm... wow what ancient tree did you patch against Harald? The tree you patched against didn't even have the skb_header_pointer() changes in it, that's caveman era :-) That's what caused the rejects. Anyways, I merged it all in cleanly. Thanks. From davem@davemloft.net Fri Sep 3 00:28:30 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Sep 2004 00:28:37 -0700 (PDT) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i837SUaF030315 for ; Fri, 3 Sep 2004 00:28:30 -0700 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1C38TY-0008TQ-00; Fri, 03 Sep 2004 00:27:12 -0700 Date: Fri, 3 Sep 2004 00:27:12 -0700 From: "David S. 
Miller" To: Harald Welte Cc: netfilter-devel@lists.netfilter.org, netdev@oss.sgi.com Subject: Re: [PATCH 2.4] 2/2: Fix NAT helper locking Message-Id: <20040903002712.4fdf8194.davem@davemloft.net> In-Reply-To: <20040903070455.GS26263@sunbeam.de.gnumonks.org> References: <20040903070455.GS26263@sunbeam.de.gnumonks.org> X-Mailer: Sylpheed version 0.9.12 (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 8377 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Fri, 3 Sep 2004 09:04:55 +0200 Harald Welte wrote: > This is the second of a two part patch. > > This part fixes the locking in NAT helpers. These 2.4.x variants of the NAT locking fixes are applied as well. Thanks. From herbert@gondor.apana.org.au Fri Sep 3 06:37:04 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Sep 2004 06:37:13 -0700 (PDT) Received: from arnor.apana.org.au (mail@arnor.apana.org.au [203.14.152.115]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i83Db1Yk015955 for ; Fri, 3 Sep 2004 06:37:03 -0700 Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1C3EEu-0005sv-00; Fri, 03 Sep 2004 23:36:28 +1000 Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1C3EEp-000630-00; Fri, 03 Sep 2004 23:36:23 +1000 Date: Fri, 3 Sep 2004 23:36:23 +1000 To: "David S. Miller" Cc: shemminger@osdl.org, netdev@oss.sgi.com Subject: Re: neigh_create/inetdev_destroy race? 
Message-ID: <20040903133623.GA23179@gondor.apana.org.au> References: <20040814050848.GA11874@gondor.apana.org.au> <20040814062703.GA4806@gondor.apana.org.au> <20040815191450.77532d5d.davem@redhat.com> <20040816105131.GA11299@gondor.apana.org.au> <20040828234201.79556f6e.davem@davemloft.net> <20040829065031.GA786@gondor.apana.org.au> <20040830230820.7514985d.davem@davemloft.net> <20040831104139.GA2124@gondor.apana.org.au> <20040901222118.0ce4bcc6.davem@davemloft.net> <20040902130605.GA32570@gondor.apana.org.au> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="huq684BweRXVnRxX" Content-Disposition: inline In-Reply-To: <20040902130605.GA32570@gondor.apana.org.au> User-Agent: Mutt/1.5.6+20040722i From: Herbert Xu X-archive-position: 8378 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev --huq684BweRXVnRxX Content-Type: text/plain; charset=us-ascii Content-Disposition: inline On Thu, Sep 02, 2004 at 11:06:05PM +1000, herbert wrote: > > > Can you work on the next bit you mentioned, making > > sure the corresponding idev is still alive when we add > > a neighbour with its neigh_parms to the hash table? > > Sure. Actually I prefer to do it by ref counting neigh_parms directly. > I'll send you a patch soon. Here is the patch. I've added a refcnt on neigh_parms as well as a dead flag. The latter is checked under the tbl_lock before adding a neigh entry to the hash table. The non-trivial bit of the patch is the first chunk of net/core/neighbour.c. I removed that line because not doing so would mean that I have to drop the reference to the parms right there. That would've led to race conditions since many places dereference neigh->parms without holding locks. It's also unnecessary to reset n->parms since we're no longer in a hurry to see it go due to the new ref counting.
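The refcnt-plus-dead-flag scheme described above can be sketched in userspace C. This is only an illustration under stated assumptions, not the kernel code: C11 atomics and a pthread mutex stand in for atomic_t and the table lock, the RCU-deferred drop of the owner's reference is left out, and every name here (struct parms, parms_alloc, parms_clone, parms_put, parms_release, entry_insert, parms_freed) is hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct neigh_parms. */
struct parms {
	atomic_int refcnt;	/* one count per holder */
	bool dead;		/* set under tbl_lock when the owner releases */
};

/* Stand-in for the table lock; counter exists only to observe frees. */
static pthread_mutex_t tbl_lock = PTHREAD_MUTEX_INITIALIZER;
static int parms_freed;

static struct parms *parms_alloc(void)
{
	struct parms *p = calloc(1, sizeof(*p));

	if (p)
		atomic_init(&p->refcnt, 1);	/* the owner's reference */
	return p;
}

/* Take an extra reference, as an entry would when it adopts the parms. */
static struct parms *parms_clone(struct parms *p)
{
	atomic_fetch_add(&p->refcnt, 1);
	return p;
}

/* Drop one reference; the last holder frees the object. */
static void parms_put(struct parms *p)
{
	int old = atomic_fetch_sub(&p->refcnt, 1);

	assert(old > 0);
	if (old == 1) {
		free(p);
		parms_freed++;
	}
}

/* Owner teardown: mark dead under the lock, then drop the owner's ref.
 * (The patch defers this drop through call_rcu; omitted here.) */
static void parms_release(struct parms *p)
{
	pthread_mutex_lock(&tbl_lock);
	p->dead = true;
	pthread_mutex_unlock(&tbl_lock);
	parms_put(p);
}

/* Insertion check: refuse entries whose parms were already released.
 * Returns 0 if the entry may be added, -1 otherwise. */
static int entry_insert(struct parms *p)
{
	int rc = 0;

	pthread_mutex_lock(&tbl_lock);
	if (p->dead)
		rc = -1;
	pthread_mutex_unlock(&tbl_lock);
	return rc;
}
```

In this sketch an entry that cloned the parms before release keeps them valid until its own parms_put, while entry_insert starts failing as soon as parms_release has run. That is the reason for keeping both a count and a flag: the count decides when the memory goes away, the flag decides whether new users may still attach.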
You'll also notice that I've put all dereferences of dev->*_ptr under the rcu_read_lock(). Without this we may get a neigh_parms that's already been released. Incidentally a lot of these places were racy even before the RCU change. For example, in the IPv6 case neigh->parms may be set to a value that's just been released. Finally in order to make sure that all stale entries are purged as quickly as possible I've added neigh_ifdown/arp_ifdown calls after every neigh_parms_release call. In many cases we now have multiple calls to neigh_ifdown in the shutdown path. I didn't remove the earlier calls because there may be hidden dependencies for them to be there. Once the respective maintainers have looked at them we can probably remove most of them. Signed-off-by: Herbert Xu Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt --huq684BweRXVnRxX Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename=p ===== drivers/s390/net/qeth_main.c 1.13 vs edited ===== --- 1.13/drivers/s390/net/qeth_main.c 2004-08-27 17:02:36 +10:00 +++ edited/drivers/s390/net/qeth_main.c 2004-09-03 23:16:30 +10:00 @@ -6710,19 +6710,28 @@ qeth_arp_constructor(struct neighbour *neigh) { struct net_device *dev = neigh->dev; - struct in_device *in_dev = in_dev_get(dev); + struct in_device *in_dev; + struct neigh_parms *parms; - if (in_dev == NULL) - return -EINVAL; if (!qeth_verify_dev(dev)) { - in_dev_put(in_dev); return qeth_old_arp_constructor(neigh); } + rcu_read_lock(); + in_dev = __in_dev_get(dev); + if (in_dev == NULL) { + rcu_read_unlock(); + return -EINVAL; + } + + parms = in_dev->arp_parms; + if (parms) { + __neigh_parms_put(neigh->parms); + neigh->parms = neigh_parms_clone(parms); + } + rcu_read_unlock(); + neigh->type = inet_addr_type(*(u32 *) neigh->primary_key); - if (in_dev->arp_parms) - neigh->parms = in_dev->arp_parms; - 
in_dev_put(in_dev); neigh->nud_state = NUD_NOARP; neigh->ops = arp_direct_ops; neigh->output = neigh->ops->queue_xmit; ===== include/net/neighbour.h 1.9 vs edited ===== --- 1.9/include/net/neighbour.h 2004-09-02 15:03:13 +10:00 +++ edited/include/net/neighbour.h 2004-09-03 23:15:51 +10:00 @@ -67,6 +67,8 @@ void *sysctl_table; + int dead; + atomic_t refcnt; struct rcu_head rcu_head; int base_reachable_time; @@ -199,6 +201,7 @@ extern struct neigh_parms *neigh_parms_alloc(struct net_device *dev, struct neigh_table *tbl); extern void neigh_parms_release(struct neigh_table *tbl, struct neigh_parms *parms); +extern void neigh_parms_destroy(struct neigh_parms *parms); extern unsigned long neigh_rand_reach_time(unsigned long base); extern void pneigh_enqueue(struct neigh_table *tbl, struct neigh_parms *p, @@ -219,6 +222,23 @@ char *p_name, proc_handler *proc_handler); extern void neigh_sysctl_unregister(struct neigh_parms *p); + +static inline void __neigh_parms_put(struct neigh_parms *parms) +{ + atomic_dec(&parms->refcnt); +} + +static inline void neigh_parms_put(struct neigh_parms *parms) +{ + if (atomic_dec_and_test(&parms->refcnt)) + neigh_parms_destroy(parms); +} + +static inline struct neigh_parms *neigh_parms_clone(struct neigh_parms *parms) +{ + atomic_inc(&parms->refcnt); + return parms; +} /* * Neighbour references ===== net/atm/clip.c 1.36 vs edited ===== --- 1.36/net/atm/clip.c 2004-08-16 12:05:46 +10:00 +++ edited/net/atm/clip.c 2004-09-03 23:16:55 +10:00 @@ -26,6 +26,7 @@ #include #include #include +#include #include /* for struct rtable and routing */ #include /* icmp_send */ #include /* for HZ */ @@ -311,13 +312,27 @@ { struct atmarp_entry *entry = NEIGH2ENTRY(neigh); struct net_device *dev = neigh->dev; - struct in_device *in_dev = dev->ip_ptr; + struct in_device *in_dev; + struct neigh_parms *parms; DPRINTK("clip_constructor (neigh %p, entry %p)\n",neigh,entry); - if (!in_dev) return -EINVAL; neigh->type = inet_addr_type(entry->ip); if (neigh->type != 
RTN_UNICAST) return -EINVAL; - if (in_dev->arp_parms) neigh->parms = in_dev->arp_parms; + + rcu_read_lock(); + in_dev = __in_dev_get(dev); + if (!in_dev) { + rcu_read_unlock(); + return -EINVAL; + } + + parms = in_dev->arp_parms; + if (parms) { + __neigh_parms_put(neigh->parms); + neigh->parms = neigh_parms_clone(parms); + } + rcu_read_unlock(); + neigh->ops = &clip_neigh_ops; neigh->output = neigh->nud_state & NUD_VALID ? neigh->ops->connected_output : neigh->ops->output; ===== net/core/neighbour.c 1.29 vs edited ===== --- 1.29/net/core/neighbour.c 2004-09-02 15:03:13 +10:00 +++ edited/net/core/neighbour.c 2004-09-03 23:17:17 +10:00 @@ -227,7 +227,6 @@ we must kill timers etc. and move it to safe state. */ - n->parms = &tbl->parms; skb_queue_purge(&n->arp_queue); n->output = neigh_blackhole; if (n->nud_state & NUD_VALID) @@ -273,7 +272,7 @@ n->updated = n->used = now; n->nud_state = NUD_NONE; n->output = neigh_blackhole; - n->parms = &tbl->parms; + n->parms = neigh_parms_clone(&tbl->parms); init_timer(&n->timer); n->timer.function = neigh_timer_handler; n->timer.data = (unsigned long)n; @@ -340,12 +339,16 @@ hash_val = tbl->hash(pkey, dev); write_lock_bh(&tbl->lock); + if (n->parms->dead) { + rc = ERR_PTR(-EINVAL); + goto out_tbl_unlock; + } + for (n1 = tbl->hash_buckets[hash_val]; n1; n1 = n1->next) { if (dev == n1->dev && !memcmp(n1->primary_key, pkey, key_len)) { neigh_hold(n1); - write_unlock_bh(&tbl->lock); rc = n1; - goto out_neigh_release; + goto out_tbl_unlock; } } @@ -358,6 +361,8 @@ rc = n; out: return rc; +out_tbl_unlock: + write_unlock_bh(&tbl->lock); out_neigh_release: neigh_release(n); goto out; @@ -494,6 +499,7 @@ skb_queue_purge(&neigh->arp_queue); dev_put(neigh->dev); + neigh_parms_put(neigh->parms); NEIGH_PRINTK2("neigh %p is destroyed.\n", neigh); @@ -1120,6 +1126,7 @@ if (p) { memcpy(p, &tbl->parms, sizeof(*p)); p->tbl = tbl; + atomic_set(&p->refcnt, 1); INIT_RCU_HEAD(&p->rcu_head); p->reachable_time = 
neigh_rand_reach_time(p->base_reachable_time); @@ -1141,7 +1148,7 @@ struct neigh_parms *parms = container_of(head, struct neigh_parms, rcu_head); - kfree(parms); + neigh_parms_put(parms); } void neigh_parms_release(struct neigh_table *tbl, struct neigh_parms *parms) @@ -1154,6 +1161,7 @@ for (p = &tbl->parms.next; *p; p = &(*p)->next) { if (*p == parms) { *p = parms->next; + parms->dead = 1; write_unlock_bh(&tbl->lock); call_rcu(&parms->rcu_head, neigh_rcu_free_parms); return; @@ -1163,11 +1171,17 @@ NEIGH_PRINTK1("neigh_parms_release: not found\n"); } +void neigh_parms_destroy(struct neigh_parms *parms) +{ + kfree(parms); +} + void neigh_table_init(struct neigh_table *tbl) { unsigned long now = jiffies; + atomic_set(&tbl->parms.refcnt, 1); INIT_RCU_HEAD(&tbl->parms.rcu_head); tbl->parms.reachable_time = neigh_rand_reach_time(tbl->parms.base_reachable_time); ===== net/decnet/dn_dev.c 1.24 vs edited ===== --- 1.24/net/decnet/dn_dev.c 2004-08-19 07:36:05 +10:00 +++ edited/net/decnet/dn_dev.c 2004-09-03 22:19:04 +10:00 @@ -1215,6 +1215,7 @@ dev->dn_ptr = NULL; neigh_parms_release(&dn_neigh_table, dn_db->neigh_parms); + neigh_ifdown(&dn_neigh_table, dev); if (dn_db->router) neigh_release(dn_db->router); ===== net/decnet/dn_neigh.c 1.10 vs edited ===== --- 1.10/net/decnet/dn_neigh.c 2004-01-26 16:13:52 +11:00 +++ edited/net/decnet/dn_neigh.c 2004-09-03 23:17:26 +10:00 @@ -35,6 +35,7 @@ #include #include #include +#include #include #include #include @@ -134,13 +135,22 @@ { struct net_device *dev = neigh->dev; struct dn_neigh *dn = (struct dn_neigh *)neigh; - struct dn_dev *dn_db = (struct dn_dev *)dev->dn_ptr; + struct dn_dev *dn_db; + struct neigh_parms *parms; - if (dn_db == NULL) + rcu_read_lock(); + dn_db = dev->dn_ptr; + if (dn_db == NULL) { + rcu_read_unlock(); return -EINVAL; + } - if (dn_db->neigh_parms) - neigh->parms = dn_db->neigh_parms; + parms = dn_db->neigh_parms; + if (parms) { + __neigh_parms_put(neigh->parms); + neigh->parms = neigh_parms_clone(parms); 
+ } + rcu_read_unlock(); if (dn_db->use_long) neigh->ops = &dn_long_ops; ===== net/ipv4/arp.c 1.43 vs edited ===== --- 1.43/net/ipv4/arp.c 2004-09-02 11:12:37 +10:00 +++ edited/net/ipv4/arp.c 2004-09-03 23:17:42 +10:00 @@ -96,6 +96,7 @@ #include #include #include +#include #ifdef CONFIG_SYSCTL #include #endif @@ -237,16 +238,24 @@ { u32 addr = *(u32*)neigh->primary_key; struct net_device *dev = neigh->dev; - struct in_device *in_dev = in_dev_get(dev); - - if (in_dev == NULL) - return -EINVAL; + struct in_device *in_dev; + struct neigh_parms *parms; neigh->type = inet_addr_type(addr); - if (in_dev->arp_parms) - neigh->parms = in_dev->arp_parms; - in_dev_put(in_dev); + rcu_read_lock(); + in_dev = __in_dev_get(dev); + if (in_dev == NULL) { + rcu_read_unlock(); + return -EINVAL; + } + + parms = in_dev->arp_parms; + if (parms) { + __neigh_parms_put(neigh->parms); + neigh->parms = neigh_parms_clone(parms); + } + rcu_read_unlock(); if (dev->hard_header == NULL) { neigh->nud_state = NUD_NOARP; ===== net/ipv4/devinet.c 1.36 vs edited ===== --- 1.36/net/ipv4/devinet.c 2004-08-16 16:11:59 +10:00 +++ edited/net/ipv4/devinet.c 2004-09-03 22:19:04 +10:00 @@ -184,6 +184,7 @@ static void inetdev_destroy(struct in_device *in_dev) { struct in_ifaddr *ifa; + struct net_device *dev; ASSERT_RTNL(); @@ -200,12 +201,15 @@ devinet_sysctl_unregister(&in_dev->cnf); #endif - in_dev->dev->ip_ptr = NULL; + dev = in_dev->dev; + dev->ip_ptr = NULL; #ifdef CONFIG_SYSCTL neigh_sysctl_unregister(in_dev->arp_parms); #endif neigh_parms_release(&arp_tbl, in_dev->arp_parms); + arp_ifdown(dev); + call_rcu(&in_dev->rcu_head, in_dev_rcu_put); } ===== net/ipv6/addrconf.c 1.106 vs edited ===== --- 1.106/net/ipv6/addrconf.c 2004-08-17 12:25:06 +10:00 +++ edited/net/ipv6/addrconf.c 2004-09-03 23:04:03 +10:00 @@ -2072,6 +2072,7 @@ neigh_sysctl_unregister(idev->nd_parms); #endif neigh_parms_release(&nd_tbl, idev->nd_parms); + neigh_ifdown(&nd_tbl, dev); in6_dev_put(idev); } return 0; ===== net/ipv6/ndisc.c 1.87 
vs edited ===== --- 1.87/net/ipv6/ndisc.c 2004-08-08 16:43:41 +10:00 +++ edited/net/ipv6/ndisc.c 2004-09-03 23:17:56 +10:00 @@ -58,6 +58,7 @@ #include #include #include +#include #ifdef CONFIG_SYSCTL #include #endif @@ -284,14 +285,23 @@ { struct in6_addr *addr = (struct in6_addr*)&neigh->primary_key; struct net_device *dev = neigh->dev; - struct inet6_dev *in6_dev = in6_dev_get(dev); + struct inet6_dev *in6_dev; + struct neigh_parms *parms; int is_multicast = ipv6_addr_is_multicast(addr); - if (in6_dev == NULL) + rcu_read_lock(); + in6_dev = in6_dev_get(dev); + if (in6_dev == NULL) { + rcu_read_unlock(); return -EINVAL; + } - if (in6_dev->nd_parms) - neigh->parms = in6_dev->nd_parms; + parms = in6_dev->nd_parms; + if (parms) { + __neigh_parms_put(neigh->parms); + neigh->parms = neigh_parms_clone(parms); + } + rcu_read_unlock(); neigh->type = is_multicast ? RTN_MULTICAST : RTN_UNICAST; if (dev->hard_header == NULL) { --huq684BweRXVnRxX-- From stigge@antcom.de Fri Sep 3 07:07:25 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Sep 2004 07:07:30 -0700 (PDT) Received: from stigge.org (pD9E7EEC6.dip.t-dialin.net [217.231.238.198]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i83E7NSE017083 for ; Fri, 3 Sep 2004 07:07:25 -0700 Received: (qmail 20881 invoked from network); 3 Sep 2004 14:07:07 -0000 Received: from unknown (HELO atari.stigge.org) (192.168.1.99) by sbo.stigge.org with SMTP; 3 Sep 2004 14:07:07 -0000 Received: from localhost (localhost [127.0.0.1]) by atari.stigge.org (Postfix) with ESMTP id D788910034F50; Fri, 3 Sep 2004 16:07:06 +0200 (CEST) Subject: Re: Debian #240812 - tg3 problems with NFS From: Roland Stigge To: "David S. 
Miller" Cc: Christoph Hellwig , netdev@oss.sgi.com, 240812@bugs.debian.org In-Reply-To: <20040819102941.70c94182.davem@redhat.com> References: <20040819161933.GA27114@lst.de> <20040819170619.GA28226@lst.de> <20040819102941.70c94182.davem@redhat.com> Content-Type: text/plain Organization: Antcom Message-Id: <1094220426.4139.5.camel@atari.stigge.org> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.6 Date: Fri, 03 Sep 2004 16:07:06 +0200 Content-Transfer-Encoding: 7bit X-archive-position: 8379 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: stigge@antcom.de Precedence: bulk X-list: netdev On Thu, 2004-08-19 at 19:29, David S. Miller wrote: > Maybe it's checksum offload related. Try: > > ethtool -K $(DEVICE_NAME) rx off tx off sg off tso off > > Does that make the problem go away? No. From kaber@trash.net Fri Sep 3 08:54:11 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Sep 2004 08:54:17 -0700 (PDT) Received: from gw.localnet ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i83FsAOE024381 for ; Fri, 3 Sep 2004 08:54:11 -0700 Received: from [172.16.1.123] (helo=trash.net ident=kaber) by gw.localnet with esmtp (Exim 3.36 #1 (Debian)) id 1C3GSf-0007b8-00; Fri, 03 Sep 2004 17:58:49 +0200 Message-ID: <41389371.2030009@trash.net> Date: Fri, 03 Sep 2004 17:53:21 +0200 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5 X-Accept-Language: en MIME-Version: 1.0 To: "David S. 
Miller" CC: Harald Welte , netfilter-devel@lists.netfilter.org, netdev@oss.sgi.com Subject: Re: [PATCH 2.6] 2/2: Fix NAT helper locking References: <20040903070234.GQ26263@sunbeam.de.gnumonks.org> <20040903002453.77beee16.davem@davemloft.net> In-Reply-To: <20040903002453.77beee16.davem@davemloft.net> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 8380 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev David S. Miller wrote: >Applied both patches, but the first had serious offsets >and the second had to be applied by hand due to rejects >against the current 2.6.x sources. > >Ummm... wow what ancient tree did you patch against >Harald? The tree you patched against didn't even have the >skb_header_pointer() changes in it, that's caveman >era :-) That's what caused the rejects. > > My fault, I put the patches into patch-o-matic about two or three weeks ago and didn't expect they wouldn't apply anymore so quickly :) > >Anyways, I merged it all in cleanly. > Thanks. Patrick From shemminger@osdl.org Fri Sep 3 09:01:24 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Sep 2004 09:01:30 -0700 (PDT) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i83G1Npv024802 for ; Fri, 3 Sep 2004 09:01:24 -0700 Received: from dell_ss3.pdx.osdl.net (dell_ss3.pdx.osdl.net [172.20.1.60]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id i83G0r120723; Fri, 3 Sep 2004 09:00:53 -0700 Date: Fri, 3 Sep 2004 09:00:53 -0700 From: Stephen Hemminger To: Herbert Xu Cc: "David S. Miller" , netdev@oss.sgi.com Subject: Re: neigh_create/inetdev_destroy race? 
Message-Id: <20040903090053.22c67bb9@dell_ss3.pdx.osdl.net> In-Reply-To: <20040903133623.GA23179@gondor.apana.org.au> References: <20040814050848.GA11874@gondor.apana.org.au> <20040814062703.GA4806@gondor.apana.org.au> <20040815191450.77532d5d.davem@redhat.com> <20040816105131.GA11299@gondor.apana.org.au> <20040828234201.79556f6e.davem@davemloft.net> <20040829065031.GA786@gondor.apana.org.au> <20040830230820.7514985d.davem@davemloft.net> <20040831104139.GA2124@gondor.apana.org.au> <20040901222118.0ce4bcc6.davem@davemloft.net> <20040902130605.GA32570@gondor.apana.org.au> <20040903133623.GA23179@gondor.apana.org.au> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.10claws (GTK+ 1.2.10; i386-redhat-linux-gnu) X-Face: &@E+xe?c%:&e4D{>f1O<&U>2qwRREG5!}7R4;D<"NO^UI2mJ[eEOA2*3>(`Th.yP,VDPo9$ /`~cw![cmj~~jWe?AHY7D1S+\}5brN0k*NE?pPh_'_d>6;XGG[\KDRViCfumZT3@[ Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 8381 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev On Fri, 3 Sep 2004 23:36:23 +1000 Herbert Xu wrote: > On Thu, Sep 02, 2004 at 11:06:05PM +1000, herbert wrote: > > > > > Can you work on the next bit you mentioned, making > > > sure the corresponding idev is still alive when we add > > > a neighbour with its neigh_parms to the hash table? > > > > Sure. Actually I prefer to do it by ref counting neigh_parms directly. > > I'll send you a patch soon. > > Here is the patch. > > I've added a refcnt on neigh_parms as well as a dead flag. The latter > is checked under the tbl_lock before adding a neigh entry to the hash > table. > > The non-trivial bit of the patch is the first chunk of net/core/neighbour.c. > I removed that line because not doing so would mean that I have to drop > the reference to the parms right there. 
That would've led to race > conditions since many places dereference neigh->parms without holding > locks. It's also unnecessary to reset n->parms since we're no longer > in a hurry to see it go due to the new ref counting. > > You'll also notice that I've put all dereferences of dev->*_ptr under > the rcu_read_lock(). Without this we may get a neigh_parms that's > already been released. I haven't looked at the exact code in detail, but don't you also need to use rcu_dereference() to make sure you get the smp_read_barrier_depends() on Alpha? From davem@davemloft.net Fri Sep 3 09:19:54 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Sep 2004 09:20:00 -0700 (PDT) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i83GJsow025997 for ; Fri, 3 Sep 2004 09:19:54 -0700 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1C3GlW-0001oW-00; Fri, 03 Sep 2004 09:18:18 -0700 Date: Fri, 3 Sep 2004 09:18:17 -0700 From: "David S. Miller" To: Herbert Xu Cc: shemminger@osdl.org, netdev@oss.sgi.com Subject: Re: neigh_create/inetdev_destroy race? 
Message-Id: <20040903091817.7b97d090.davem@davemloft.net> In-Reply-To: <20040903133623.GA23179@gondor.apana.org.au> References: <20040814050848.GA11874@gondor.apana.org.au> <20040814062703.GA4806@gondor.apana.org.au> <20040815191450.77532d5d.davem@redhat.com> <20040816105131.GA11299@gondor.apana.org.au> <20040828234201.79556f6e.davem@davemloft.net> <20040829065031.GA786@gondor.apana.org.au> <20040830230820.7514985d.davem@davemloft.net> <20040831104139.GA2124@gondor.apana.org.au> <20040901222118.0ce4bcc6.davem@davemloft.net> <20040902130605.GA32570@gondor.apana.org.au> <20040903133623.GA23179@gondor.apana.org.au> X-Mailer: Sylpheed version 0.9.12 (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 8382 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Fri, 3 Sep 2004 23:36:23 +1000 Herbert Xu wrote: > Here is the patch. Looks great. Yes, I see how the existing cases were racey pre-RCU, it is similar to the sysctl stuff and that area was truly horrible before Stephen and myself redid how generic device destruction works. Patch applied, thanks Herbert. 
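For readers following this thread, the scheme Herbert describes (a reference count on the parameter block plus a dead flag that is checked under the table lock before a new entry is added) can be sketched in user-space C roughly as follows. This is only an illustration under stated assumptions: struct parms, struct entry, tbl_lock and every function name below are hypothetical stand-ins, not the kernel's neigh_parms API, and a small spinlock built on C11 atomic_flag takes the place of both the kernel's tbl_lock and the RCU read-side protection discussed above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a shared parameter block (NOT struct neigh_parms). */
struct parms {
	int  refcnt;	/* protected by tbl_lock in this sketch */
	bool dead;	/* set once the owner starts tearing the block down */
};

/* Hypothetical stand-in for a table entry that points at a parameter block. */
struct entry {
	struct parms *parms;
};

/* Toy spinlock standing in for the table lock. */
static atomic_flag tbl_lock = ATOMIC_FLAG_INIT;

static void tbl_lock_acquire(void)
{
	while (atomic_flag_test_and_set_explicit(&tbl_lock, memory_order_acquire))
		;	/* spin until the flag was previously clear */
}

static void tbl_lock_release(void)
{
	atomic_flag_clear_explicit(&tbl_lock, memory_order_release);
}

/* Allocate a block; the initial reference belongs to the owner. */
static struct parms *parms_alloc(void)
{
	struct parms *p = calloc(1, sizeof(*p));

	if (p)
		p->refcnt = 1;
	return p;
}

/* Drop one reference; free the block on the last put. Returns true if freed. */
static bool parms_put(struct parms *p)
{
	bool last;

	tbl_lock_acquire();
	last = (--p->refcnt == 0);
	tbl_lock_release();

	if (last)
		free(p);
	return last;
}

/* Owner teardown: mark the block dead, then drop the owner's reference. */
static bool parms_release(struct parms *p)
{
	tbl_lock_acquire();
	p->dead = true;
	tbl_lock_release();
	return parms_put(p);
}

/*
 * Create an entry that shares p. The caller must guarantee that p is still
 * valid here (in the kernel discussion that is the job of rcu_read_lock()).
 * Returns NULL if p is already dead or if allocation fails.
 */
static struct entry *entry_create(struct parms *p)
{
	struct entry *e = malloc(sizeof(*e));

	if (!e)
		return NULL;

	tbl_lock_acquire();
	if (p->dead) {		/* checked under the lock, before linking */
		tbl_lock_release();
		free(e);
		return NULL;
	}
	p->refcnt++;		/* the entry now pins the block */
	e->parms = p;
	tbl_lock_release();
	return e;
}

/* Destroy an entry and drop its reference. Returns true if the block was freed. */
static bool entry_destroy(struct entry *e)
{
	bool freed = parms_put(e->parms);

	free(e);
	return freed;
}
```

With this shape, an entry created before teardown keeps the block alive until the entry itself is destroyed, while any creation attempted after teardown fails cleanly instead of attaching to a block that is about to disappear.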
From jeffpc@optonline.net Fri Sep 3 10:07:11 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Sep 2004 10:07:18 -0700 (PDT) Received: from mta8.srv.hcvlny.cv.net (mta8.srv.hcvlny.cv.net [167.206.5.75]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i83H7BYf027180 for ; Fri, 3 Sep 2004 10:07:11 -0700 Received: from [10.0.0.15] (ool-44c218a8.dyn.optonline.net [68.194.24.168]) by mta8.srv.hcvlny.cv.net (iPlanet Messaging Server 5.2 HotFix 1.25 (built Mar 3 2004)) with ESMTP id <0I3H001KI67Q2C@mta8.srv.hcvlny.cv.net> for netdev@oss.sgi.com; Fri, 03 Sep 2004 13:07:02 -0400 (EDT) Date: Fri, 03 Sep 2004 13:06:55 -0400 From: josef Jeff Sipek Subject: [PATCH/RFC 2.6] NET: 64-bit network statistics To: linux-kernel@vger.kernel.org Cc: netdev@oss.sgi.com Message-id: <200409031307.01240.jeffpc@optonline.net> MIME-version: 1.0 Content-type: Text/Plain; charset=us-ascii Content-transfer-encoding: 7BIT Content-disposition: inline User-Agent: KMail/1.6.2 X-archive-position: 8384 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jeffpc@optonline.net Precedence: bulk X-list: netdev -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 I've created a patch that monitors changes to the network statistics variables and keeps internal 64-bit counter. I decided to split it into two parts (patches are to follow in next emails): 1) generic variable monitoring system (watch64) The watch64 system allows the programmer to specify the approximate interval at which he wants his variables checked. If he tries to specify shorter interval than the minimum a default value of HZ/10 is used. To minimize locking, RCU and seqlock are used. On 64-bit systems, all is optimized away. 2) network statistics specific patch (64network) Upon registration of a network device, all the statistics variables are registered with watch64. 
Additionally, a new proc file is created /proc/net/dev64 displays the 64-bit values as supposed to /proc/net/dev which is left to display the original 32-bit variables for backward compatibility. The sysfs interface (/sys/class/net//statistics/*) displays the 64-bit values only. On 64-bit systems, all is optimized away through watch64. Josef "Jeff" Sipek. - -- *NOTE: This message is ROT-13 encrypted twice for extra protection* -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) iD8DBQFBOKSzwFP0+seVj/4RAkz7AJ0Ut21nPMkHGKv1dXK17yoA5hQ1+ACglpMq IHh+tYW3innmwjlA7EU2x78= =LnHg -----END PGP SIGNATURE----- From Rezwanul_Kabir@Dell.com Fri Sep 3 10:06:18 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Sep 2004 10:06:22 -0700 (PDT) Received: from ausc60pc101.us.dell.com (ausc60pc101.us.dell.com [143.166.85.206]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i83H6H0F027153 for ; Fri, 3 Sep 2004 10:06:18 -0700 Received: from ausx2kcpc115.aus.amer.dell.com (10.166.84.69) by ausc60pc101.us.dell.com with ESMTP; 03 Sep 2004 12:06:05 -0500 X-Ironport-AV: i="3.84,129,1091422800"; d="scan'208"; a="84464312:sNHT20663774" X-MimeOLE: Produced By Microsoft Exchange V6.0.6527.0 content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Subject: ioctl() to get MAC address from EEPROM Date: Fri, 3 Sep 2004 12:06:02 -0500 Message-ID: <06226F23984D7A49A694576CF06603F908BBCC@ausx2kmpc106.aus.amer.dell.com> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: ioctl() to get MAC address from EEPROM Thread-Index: AcSR2EpqqtgahXMFRfmmyUt8DI/YOw== From: To: X-OriginalArrivalTime: 03 Sep 2004 17:06:03.0956 (UTC) FILETIME=[4B8AAF40:01C491D8] Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id i83H6H0F027153 X-archive-position: 8383 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Rezwanul_Kabir@Dell.com 
Precedence: bulk X-list: netdev Hi, There seems to be no standard way to retrieve the MAC address of a NIC as stored in its EEPROM (the ETHTOOL_GEEPROM ioctl can be used for this, but there is a need for a more direct, standard interface). This is sometimes necessary because the MAC address in the EEPROM may differ from the one associated with the software interface (i.e. dev_addr in struct net_device). For example, in some modes of channel bonding, the MAC address of the active NIC is duplicated on the rest of the members of the bond/team. How can the "permanent" MAC address be fetched in this case? Is there any plan to include such a command in the ethtool ioctls? Is there a better way to do this? Any suggestions would be appreciated. Thanks, --rez From jeffpc@optonline.net Fri Sep 3 10:19:35 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Sep 2004 10:19:40 -0700 (PDT) Received: from mta4.srv.hcvlny.cv.net (mta4.srv.hcvlny.cv.net [167.206.5.70]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i83HJYt9028073 for ; Fri, 3 Sep 2004 10:19:35 -0700 Received: from [10.0.0.15] (ool-44c218a8.dyn.optonline.net [68.194.24.168]) by mta4.srv.hcvlny.cv.net (iPlanet Messaging Server 5.2 HotFix 1.25 (built Mar 3 2004)) with ESMTP id <0I3H00DMR6SDS8@mta4.srv.hcvlny.cv.net> for netdev@oss.sgi.com; Fri, 03 Sep 2004 13:19:25 -0400 (EDT) Date: Fri, 03 Sep 2004 13:19:24 -0400 From: "Josef 'Jeff' Sipek" Subject: [PATCH 2.6] watch64: generic variable monitoring system In-reply-to: <200409031307.01240.jeffpc@optonline.net> To: linux-kernel@vger.kernel.org Cc: netdev@oss.sgi.com Message-id: <200409031319.24863.jeffpc@optonline.net> MIME-version: 1.0 Content-type: text/plain; charset=iso-8859-1 Content-transfer-encoding: 7BIT Content-disposition: inline User-Agent: KMail/1.6.2 References: <200409031307.01240.jeffpc@optonline.net> X-archive-position: 8385 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jeffpc@optonline.net 
Precedence: bulk X-list: netdev The watch64 system lets the programmer specify the approximate interval at which registered variables are checked. An interval of 0 selects the default, WATCH64_INTERVAL (HZ/10); an interval shorter than the minimum, WATCH64_MINIMUM (HZ/20), is raised to that minimum. To minimize locking, RCU and seqlocks are used. On 64-bit systems, everything is optimized away. The following patch can also be pulled from http://jeffpc.bkbits.net/watch64-2.6 Josef "Jeff" Sipek. Signed-off-by: Josef "Jeff" Sipek diff -Nru a/Documentation/00-INDEX b/Documentation/00-INDEX --- a/Documentation/00-INDEX 2004-09-03 12:21:17 -04:00 +++ b/Documentation/00-INDEX 2004-09-03 12:21:17 -04:00 @@ -250,6 +250,8 @@ - directory with info regarding video/TV/radio cards and linux. vm/ - directory with info on the Linux vm code. +watch64.txt + - watch64 API description watchdog/ - how to auto-reboot Linux if it has "fallen and can't get up". ;-) x86_64/ diff -Nru a/Documentation/watch64.txt b/Documentation/watch64.txt --- /dev/null Wed Dec 31 16:00:00 196900 +++ b/Documentation/watch64.txt 2004-09-03 12:21:17 -04:00 @@ -0,0 +1,35 @@ +int watch64_register(unsigned long* ptr, unsigned int interval); + + - Registers *ptr to be monitored every interval jiffies. 
+ - If interval==0, WATCH64_INTERVAL will be used (HZ/10 by default) + +int watch64_unregister(unsigned long* ptr, struct watch64* st); + + - Unregister *ptr + - st is optional pointer to the struct containing the registration + information + - if st==NULL, it will be looked up automatically + +struct watch64* watch64_find(unsigned long* ptr); + + - Return struct with registration information of *ptr + +int watch64_disable(unsigned long* ptr, struct watch64* st); + + - Disable *ptr from being monitored, without removing it from the list + - st is optional (see watch64_unregister for more information) + +int watch64_enable(unsigned long* ptr, struct watch64* st); + + - Enable *ptr from being monitored (opposite of watch64_disable) + - st is optional (see watch64_unregister for more information) + +int watch64_toggle(unsigned long* ptr, struct watch64* st); + + - Toggle the enable/disable status + - st is optional (see watch64_unregister for more information) + +inline u_int64_t watch64_getval(unsigned long* ptr, struct watch64* st); + + - Return the whole 64-bit counter + - st is optional (see watch64_unregister for more information) diff -Nru a/include/linux/watch64.h b/include/linux/watch64.h --- /dev/null Wed Dec 31 16:00:00 196900 +++ b/include/linux/watch64.h 2004-09-03 12:21:17 -04:00 @@ -0,0 +1,63 @@ +/* + * inclue/linux/watch64.h + * + * Copyright (C) 2003 Josef "Jeff" Sipek + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. 
+ * + */ + +#ifndef _LINUX_64WATCH_H +#define _LINUX_64WATCH_H + +#include +#include +#include +#include +#include + +#define WATCH64_INTERVAL (HZ/10) +#define WATCH64_MINIMUM (HZ/20) +#define WATCH64_MAGIC 0x573634 + +#if (BITS_PER_LONG == 64) + +struct watch64 { +}; + +#else + +struct watch64 { + struct list_head list; + unsigned long *ptr; + unsigned long oldval; + u_int64_t total; + unsigned int interval; + int active; + seqlock_t lock; + struct rcu_head rcuhead; +}; + +#endif /* (BITS_PER_LONG == 64) */ + +/* + * Prototypes + */ + +void watch64_init(void); +void watch64_run(unsigned long var); +int watch64_register(unsigned long* ptr, unsigned int interval); +int watch64_unregister(unsigned long* ptr, struct watch64* st); +void watch64_rcufree(struct rcu_head* p); +struct watch64* watch64_find(unsigned long* ptr); +inline struct watch64* __watch64_find(unsigned long* ptr); +int watch64_disable(unsigned long* ptr, struct watch64* st); +inline int __watch64_disable(unsigned long* ptr, struct watch64* st); +int watch64_enable(unsigned long* ptr, struct watch64* st); +inline int __watch64_enable(unsigned long* ptr, struct watch64* st); +int watch64_toggle(unsigned long* ptr, struct watch64* st); +inline u_int64_t watch64_getval(unsigned long* ptr, struct watch64* st); + +#endif /* _LINUX_WATCH64_H */ diff -Nru a/kernel/Makefile b/kernel/Makefile --- a/kernel/Makefile 2004-09-03 12:21:17 -04:00 +++ b/kernel/Makefile 2004-09-03 12:21:17 -04:00 @@ -7,7 +7,7 @@ sysctl.o capability.o ptrace.o timer.o user.o \ signal.o sys.o kmod.o workqueue.o pid.o \ rcupdate.o intermodule.o extable.o params.o posix-timers.o \ - kthread.o + kthread.o watch64.o obj-$(CONFIG_FUTEX) += futex.o obj-$(CONFIG_GENERIC_ISA_DMA) += dma.o diff -Nru a/kernel/watch64.c b/kernel/watch64.c --- /dev/null Wed Dec 31 16:00:00 196900 +++ b/kernel/watch64.c 2004-09-03 12:21:17 -04:00 @@ -0,0 +1,392 @@ +/* + * kernel/watch64.c + * + * Copyright (C) 2003 Josef "Jeff" Sipek + * + * This program is free 
software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +/* + * Watch64 global variables + */ + +spinlock_t watch64_biglock = SPIN_LOCK_UNLOCKED; +LIST_HEAD(watch64_head); +struct timer_list watch64_timer; +int watch64_setup; + +#if (BITS_PER_LONG == 64) + +void watch64_init(void) +{ +} + +void watch64_run(unsigned long var) +{ +} + +int watch64_register(unsigned long* ptr, unsigned int interval) +{ + return 0; +} + +int watch64_unregister(unsigned long* ptr, struct watch64* st) +{ + return 0; +} + +void watch64_rcufree(void* p) +{ +} + +struct watch64* watch64_find(unsigned long* ptr) +{ + return NULL; +} + +struct watch64* __watch64_find(unsigned long* ptr) +{ + return NULL; +} + +int watch64_disable(unsigned long* ptr, struct watch64* st) +{ + return 0; +} + +int __watch64_disable(unsigned long* ptr, struct watch64* st) +{ + return 0; +} + +int watch64_enable(unsigned long* ptr, struct watch64* st) +{ + return 0; +} + +int __watch64_enable(unsigned long* ptr, struct watch64* st) +{ + return 0; +} + +int watch64_toggle(unsigned long* ptr, struct watch64* st) +{ + return 0; +} + +inline u_int64_t watch64_getval(unsigned long* ptr, struct watch64* st) +{ + return (u_int64_t) *ptr; +} + +#else + +/* + * Initiate watch64 system + */ + +void watch64_init(void) +{ + spin_lock(&watch64_biglock); + + if (watch64_setup==WATCH64_MAGIC) { + spin_unlock(&watch64_biglock); + return; + } + + printk(KERN_WARNING "watch64: 2003/08/22 Josef 'Jeff' Sipek \n"); + printk(KERN_WARNING "watch64: Enabling Watch64 extensions..."); + + init_timer(&watch64_timer); + watch64_timer.function = watch64_run; + watch64_timer.data = (unsigned long) NULL; + watch64_timer.expires = jiffies + WATCH64_MINIMUM; + add_timer(&watch64_timer); + + printk("done.\n"); 
+ + watch64_setup = WATCH64_MAGIC; + + spin_unlock(&watch64_biglock); +} + +/* + * Go through the list of registered variables and check them for changes + */ + +void watch64_run(unsigned long var) +{ + struct list_head* entry; + struct watch64* watch_struct; + unsigned long tmp; + + rcu_read_lock(); + list_for_each_rcu(entry, &watch64_head) { + watch_struct = list_entry(entry, struct watch64, list); + if (*watch_struct->ptr != watch_struct->oldval) { + tmp = *watch_struct->ptr; + if (tmp > watch_struct->oldval) { + write_seqlock(&watch_struct->lock); + watch_struct->total += tmp - watch_struct->oldval; + write_sequnlock(&watch_struct->lock); + } else if (tmp < watch_struct->oldval) { + write_seqlock(&watch_struct->lock); + watch_struct->total += ((u_int64_t) 1<<BITS_PER_LONG) - watch_struct->oldval + tmp; + write_sequnlock(&watch_struct->lock); + } + watch_struct->oldval = tmp; + } + } + rcu_read_unlock(); + + mod_timer(&watch64_timer, jiffies + WATCH64_MINIMUM); +} + +/* + * Register a new variable with watch64 + */ + +int watch64_register(unsigned long* ptr, unsigned int interval) +{ + struct watch64* temp; + + temp = (struct watch64*) kmalloc(sizeof(struct watch64),GFP_ATOMIC); + + if (!temp) + return -ENOMEM; + + if (watch64_setup!=WATCH64_MAGIC) + watch64_init(); + + temp->ptr = ptr; + temp->oldval = 0; + temp->total = 0; + if (interval==0) + temp->interval = WATCH64_INTERVAL; + else if (interval<WATCH64_MINIMUM) { + temp->interval = WATCH64_MINIMUM; + printk("watch64: attempted to add new watch with interval below %d jiffies",WATCH64_MINIMUM); + } else + temp->interval = interval; + + temp->active = 0; + + seqlock_init(&temp->lock); + + list_add_rcu(&temp->list, &watch64_head); + + return 0; +} + +/* + * Unregister a variable with watch64 + */ + +int watch64_unregister(unsigned long* ptr, struct watch64* st) +{ + rcu_read_lock(); + if (!st) + st = __watch64_find(ptr); + + if (!st) + return -EINVAL; + + __watch64_disable(ptr, st); + list_del_rcu(&st->list); + + call_rcu(&st->rcuhead, watch64_rcufree); + 
rcu_read_unlock(); + + return 0; +} + +/* + * Free memory via RCU + */ + +void watch64_rcufree(struct rcu_head* p) +{ + kfree(container_of(p, struct watch64, rcuhead)); +} + +/* + * Find watch64 structure with RCU lock + */ + +struct watch64* watch64_find(unsigned long* ptr) +{ + struct watch64* tmp; + + rcu_read_lock(); + tmp = __watch64_find(ptr); + rcu_read_unlock(); + + return tmp; +} + +/* + * Find watch64 structure without RCU lock + */ + +inline struct watch64* __watch64_find(unsigned long* ptr) +{ + struct list_head* tmp; + struct watch64* watch64_struct; + + list_for_each_rcu(tmp, &watch64_head) { + watch64_struct = list_entry(tmp, struct watch64, list); + if (watch64_struct->ptr==ptr) + return watch64_struct; + } + + return NULL; +} + +/* + * Disable a variable watch with RCU lock + */ + +int watch64_disable(unsigned long* ptr, struct watch64* st) +{ + int tmp; + + rcu_read_lock(); + tmp = __watch64_disable(ptr,st); + rcu_read_unlock(); + + return tmp; +} + +/* + * Disable a variable watch without RCU lock + */ + +inline int __watch64_disable(unsigned long* ptr, struct watch64* st) +{ + if (!st) + st = watch64_find(ptr); + + if (!st) + return -EINVAL; + + st->active = 0; + + return 0; +} + +/* + * Enable a variable watch with RCU lock + */ + +int watch64_enable(unsigned long* ptr, struct watch64* st) +{ + int tmp; + + rcu_read_lock(); + tmp = __watch64_enable(ptr,st); + rcu_read_unlock(); + + return tmp; +} + +/* + * Enable a variable watch without RCU lock + */ + +inline int __watch64_enable(unsigned long* ptr, struct watch64* st) +{ + if (!st) + st = __watch64_find(ptr); + + if (!st) + return -EINVAL; + + st->oldval = *ptr; + write_seqlock(&st->lock); + st->total = (u_int64_t) st->oldval; + write_sequnlock(&st->lock); + st->active = 1; + + return 0; +} + +/* + * Toggle a variable watch + */ + +int watch64_toggle(unsigned long* ptr, struct watch64* st) +{ + rcu_read_lock(); + if (!st) + st = __watch64_find(ptr); + + if (!st) { + rcu_read_unlock(); + 
return -EINVAL; + } + + if (st->active) + __watch64_disable(ptr,st); + else + __watch64_enable(ptr,st); + rcu_read_unlock(); + + return 0; +} + +/* + * Return the total 64-bit value + */ + +inline u_int64_t watch64_getval(unsigned long* ptr, struct watch64* st) +{ + unsigned int seq; + u_int64_t total; + + rcu_read_lock(); + if (!st) + st = __watch64_find(ptr); + + if (!st) { + rcu_read_unlock(); + return *ptr; + } + + do { + seq = read_seqbegin(&st->lock); + total = st->total; + } while (read_seqretry(&st->lock, seq)); + rcu_read_unlock(); + + return total; +} + +#endif /* (BITS_PER_LONG == 64) */ + +/* + * Export all the necessary symbols + */ + +EXPORT_SYMBOL(watch64_register); +EXPORT_SYMBOL(watch64_unregister); +EXPORT_SYMBOL(watch64_find); +EXPORT_SYMBOL(watch64_disable); +EXPORT_SYMBOL(watch64_enable); +EXPORT_SYMBOL(watch64_toggle); +EXPORT_SYMBOL(watch64_getval); From jeffpc@optonline.net Fri Sep 3 10:22:40 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Sep 2004 10:22:49 -0700 (PDT) Received: from mta10.srv.hcvlny.cv.net (mta10.srv.hcvlny.cv.net [167.206.5.85]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i83HMdXi028435 for ; Fri, 3 Sep 2004 10:22:40 -0700 Received: from [10.0.0.15] (ool-44c218a8.dyn.optonline.net [68.194.24.168]) by mta10.srv.hcvlny.cv.net (iPlanet Messaging Server 5.2 HotFix 1.25 (built Mar 3 2004)) with ESMTP id <0I3H00IO36XIC9@mta10.srv.hcvlny.cv.net> for netdev@oss.sgi.com; Fri, 03 Sep 2004 13:22:30 -0400 (EDT) Date: Fri, 03 Sep 2004 13:22:29 -0400 From: "Josef 'Jeff' Sipek" Subject: [PATCH 2.6] 64network: 64-bit network statistics In-reply-to: <200409031307.01240.jeffpc@optonline.net> To: linux-kernel@vger.kernel.org Cc: netdev@oss.sgi.com Message-id: <200409031322.29981.jeffpc@optonline.net> MIME-version: 1.0 Content-type: text/plain; charset=iso-8859-1 Content-transfer-encoding: 7BIT Content-disposition: inline User-Agent: KMail/1.6.2 References: <200409031307.01240.jeffpc@optonline.net> X-archive-position: 8386 
X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jeffpc@optonline.net Precedence: bulk X-list: netdev Upon registration of a network device, all of its statistics variables are registered with watch64. Additionally, a new proc file, /proc/net/dev64, displays the 64-bit values, as opposed to /proc/net/dev, which is left to display the original 32-bit variables for backward compatibility. The sysfs interface (/sys/class/net//statistics/*) displays the 64-bit values only. On 64-bit systems, all of this is optimized away through watch64. Requires: watch64 The following patch can also be pulled from http://jeffpc.bkbits.net/64network-2.6 (includes watch64) Josef "Jeff" Sipek Signed-off-by: Josef "Jeff" Sipek diff -Nru a/include/linux/netdevice.h b/include/linux/netdevice.h --- a/include/linux/netdevice.h 2004-09-03 12:22:08 -04:00 +++ b/include/linux/netdevice.h 2004-09-03 12:22:08 -04:00 @@ -14,6 +14,7 @@ * Alan Cox, * Bjorn Ekwall. * Pekka Riikonen + * Josef "Jeff" Sipek * * This program is free software; you can redistribute it and/or * modify it under the terms of the GNU General Public License @@ -945,6 +946,10 @@ #ifdef CONFIG_SYSCTL extern char *net_sysctl_strdup(const char *s); #endif + +/* Register/unregister all the members of struct net_device_stats with watch64 */ +inline void net_register_stats64(struct net_device_stats* stats); +inline void net_unregister_stats64(struct net_device_stats* stats); #endif /* __KERNEL__ */ diff -Nru a/net/core/dev.c b/net/core/dev.c --- a/net/core/dev.c 2004-09-03 12:22:08 -04:00 +++ b/net/core/dev.c 2004-09-03 12:22:08 -04:00 @@ -18,6 +18,7 @@ * Alexey Kuznetsov * Adam Sulmicki * Pekka Riikonen + * Josef "Jeff" Sipek * * Changes: * D.J. 
Barrow : Fixed bug where dev->refcnt gets set @@ -70,6 +71,7 @@ * indefinitely on dev->refcnt * J Hadi Salim : - Backlog queue sampling * - netif_rx() feedback + * Josef "Jeff" Sipek : Added watch64 calls for network statistics */ #include @@ -108,6 +110,7 @@ #include #include #include +#include #ifdef CONFIG_NET_RADIO #include /* Note : will define WIRELESS_EXT */ #include @@ -2110,6 +2113,49 @@ seq_printf(seq, "%6s: No statistics available.\n", dev->name); } +static void dev_seq_printf_stats64(struct seq_file *seq, struct net_device *dev) +{ + if (dev->get_stats) { + struct net_device_stats *stats = dev->get_stats(dev); + + seq_printf(seq, "%6s:%8llu %7llu %4llu %4llu %4llu %5llu %10llu %9llu " + "%8llu %7llu %4llu %4llu %4llu %5llu %7llu %10llu\n", + dev->name, watch64_getval(&stats->rx_bytes,NULL), + watch64_getval(&stats->rx_packets,NULL), + watch64_getval(&stats->rx_errors,NULL), + watch64_getval(&stats->rx_dropped,NULL) + + watch64_getval(&stats->rx_missed_errors,NULL), + watch64_getval(&stats->rx_fifo_errors,NULL), + watch64_getval(&stats->rx_length_errors,NULL) + + watch64_getval(&stats->rx_over_errors,NULL) + + watch64_getval(&stats->rx_crc_errors,NULL) + + watch64_getval(&stats->rx_frame_errors,NULL), + watch64_getval(&stats->rx_compressed,NULL), + watch64_getval(&stats->multicast,NULL), + watch64_getval(&stats->tx_bytes,NULL), + watch64_getval(&stats->tx_packets,NULL), + watch64_getval(&stats->tx_errors,NULL), + watch64_getval(&stats->tx_dropped,NULL), + watch64_getval(&stats->tx_fifo_errors,NULL), + watch64_getval(&stats->collisions,NULL), + watch64_getval(&stats->tx_carrier_errors,NULL) + + watch64_getval(&stats->tx_aborted_errors,NULL) + + watch64_getval(&stats->tx_window_errors,NULL) + + watch64_getval(&stats->tx_heartbeat_errors,NULL), + watch64_getval(&stats->tx_compressed,NULL)); + } else + seq_printf(seq, "%6s: No statistics available.\n", dev->name); +} + +static void dev_seq_show_header(struct seq_file *seq) +{ + seq_puts(seq, "Inter-| Receive 
" + " | Transmit\n" + " face |bytes packets errs drop fifo frame " + "compressed multicast|bytes packets errs " + "drop fifo colls carrier compressed\n"); +} + /* * Called from the PROCfs module. This now uses the new arbitrary sized * /proc/net interface to create /proc/net/dev @@ -2117,16 +2163,21 @@ static int dev_seq_show(struct seq_file *seq, void *v) { if (v == SEQ_START_TOKEN) - seq_puts(seq, "Inter-| Receive " - " | Transmit\n" - " face |bytes packets errs drop fifo frame " - "compressed multicast|bytes packets errs " - "drop fifo colls carrier compressed\n"); + dev_seq_show_header(seq); else dev_seq_printf_stats(seq, v); return 0; } +static int dev_seq_show64(struct seq_file *seq, void *v) +{ + if (v == SEQ_START_TOKEN) + dev_seq_show_header(seq); + else + dev_seq_printf_stats64(seq, v); + return 0; +} + static struct netif_rx_stats *softnet_get_online(loff_t *pos) { struct netif_rx_stats *rc = NULL; @@ -2179,11 +2230,23 @@ .show = dev_seq_show, }; +static struct seq_operations dev_seq_ops64 = { + .start = dev_seq_start, + .next = dev_seq_next, + .stop = dev_seq_stop, + .show = dev_seq_show64, +}; + static int dev_seq_open(struct inode *inode, struct file *file) { return seq_open(file, &dev_seq_ops); } +static int dev_seq_open64(struct inode *inode, struct file *file) +{ + return seq_open(file, &dev_seq_ops64); +} + static struct file_operations dev_seq_fops = { .owner = THIS_MODULE, .open = dev_seq_open, @@ -2192,6 +2255,14 @@ .release = seq_release, }; +static struct file_operations dev_seq_fops64 = { + .owner = THIS_MODULE, + .open = dev_seq_open64, + .read = seq_read, + .llseek = seq_lseek, + .release = seq_release, +}; + static struct seq_operations softnet_seq_ops = { .start = softnet_seq_start, .next = softnet_seq_next, @@ -2224,8 +2295,10 @@ if (!proc_net_fops_create("dev", S_IRUGO, &dev_seq_fops)) goto out; - if (!proc_net_fops_create("softnet_stat", S_IRUGO, &softnet_seq_fops)) + if (!proc_net_fops_create("dev64", S_IRUGO, &dev_seq_fops64)) goto 
out_dev; + if (!proc_net_fops_create("softnet_stat", S_IRUGO, &softnet_seq_fops)) + goto out_dev64; if (wireless_proc_init()) goto out_softnet; rc = 0; @@ -2233,6 +2306,8 @@ return rc; out_softnet: proc_net_remove("softnet_stat"); +out_dev64: + proc_net_remove("dev64"); out_dev: proc_net_remove("dev"); goto out; @@ -2910,6 +2985,9 @@ * device is present. */ + if (dev->get_stats) + net_register_stats64(dev->get_stats(dev)); + set_bit(__LINK_STATE_PRESENT, &dev->state); dev->next = NULL; @@ -2922,7 +3000,7 @@ dev_hold(dev); dev->reg_state = NETREG_REGISTERING; write_unlock_bh(&dev_base_lock); - + /* Notify protocols, that a new device appeared. */ notifier_call_chain(&netdev_chain, NETDEV_REGISTER, dev); @@ -3145,6 +3223,9 @@ /* If device is running, close it first. */ if (dev->flags & IFF_UP) dev_close(dev); + + if (dev->get_stats) + net_unregister_stats64(dev->get_stats(dev)); /* And unlink it from device chain. */ for (dp = &dev_base; (d = *dp) != NULL; dp = &d->next) { @@ -3246,6 +3327,98 @@ } #endif /* CONFIG_HOTPLUG_CPU */ +/* + * Register all the members of the net_device_stats structure + * + */ + +inline void net_register_stats64(struct net_device_stats* stats) +{ + if (!stats) + return; + + watch64_register(&stats->tx_packets,0); + watch64_enable (&stats->tx_packets,NULL); + watch64_register(&stats->rx_packets,0); + watch64_enable (&stats->rx_packets,NULL); + watch64_register(&stats->tx_bytes,0); + watch64_enable (&stats->tx_bytes,NULL); + watch64_register(&stats->rx_bytes,0); + watch64_enable (&stats->rx_bytes,NULL); + watch64_register(&stats->tx_errors,0); + watch64_enable (&stats->tx_errors,NULL); + watch64_register(&stats->rx_errors,0); + watch64_enable (&stats->rx_errors,NULL); + watch64_register(&stats->tx_dropped,0); + watch64_enable (&stats->tx_dropped,NULL); + watch64_register(&stats->rx_dropped,0); + watch64_enable (&stats->rx_dropped,NULL); + watch64_register(&stats->multicast,0); + watch64_enable (&stats->multicast,NULL); + 
watch64_register(&stats->collisions,0); + watch64_enable (&stats->collisions,NULL); + watch64_register(&stats->rx_length_errors,0); + watch64_enable (&stats->rx_length_errors,NULL); + watch64_register(&stats->rx_over_errors,0); + watch64_enable (&stats->rx_over_errors,NULL); + watch64_register(&stats->rx_crc_errors,0); + watch64_enable (&stats->rx_crc_errors,NULL); + watch64_register(&stats->rx_frame_errors,0); + watch64_enable (&stats->rx_frame_errors,NULL); + watch64_register(&stats->rx_fifo_errors,0); + watch64_enable (&stats->rx_fifo_errors,NULL); + watch64_register(&stats->rx_missed_errors,0); + watch64_enable (&stats->rx_missed_errors,NULL); + watch64_register(&stats->tx_aborted_errors,0); + watch64_enable (&stats->tx_aborted_errors,NULL); + watch64_register(&stats->tx_carrier_errors,0); + watch64_enable (&stats->tx_carrier_errors,NULL); + watch64_register(&stats->tx_fifo_errors,0); + watch64_enable (&stats->tx_fifo_errors,NULL); + watch64_register(&stats->tx_heartbeat_errors,0); + watch64_enable (&stats->tx_heartbeat_errors,NULL); + watch64_register(&stats->tx_window_errors,0); + watch64_enable (&stats->tx_window_errors,NULL); + watch64_register(&stats->rx_compressed,0); + watch64_enable (&stats->rx_compressed,NULL); + watch64_register(&stats->tx_compressed,0); + watch64_enable (&stats->tx_compressed,NULL); +} + +/* + * Unregister all the members of the net_device_stats structure + * + */ + +inline void net_unregister_stats64(struct net_device_stats* stats) +{ + if (!stats) + return; + + watch64_unregister(&stats->tx_packets,0); + watch64_unregister(&stats->rx_packets,0); + watch64_unregister(&stats->tx_bytes,0); + watch64_unregister(&stats->rx_bytes,0); + watch64_unregister(&stats->tx_errors,0); + watch64_unregister(&stats->rx_errors,0); + watch64_unregister(&stats->tx_dropped,0); + watch64_unregister(&stats->rx_dropped,0); + watch64_unregister(&stats->multicast,0); + watch64_unregister(&stats->collisions,0); + 
watch64_unregister(&stats->rx_length_errors,0); + watch64_unregister(&stats->rx_over_errors,0); + watch64_unregister(&stats->rx_crc_errors,0); + watch64_unregister(&stats->rx_frame_errors,0); + watch64_unregister(&stats->rx_fifo_errors,0); + watch64_unregister(&stats->rx_missed_errors,0); + watch64_unregister(&stats->tx_aborted_errors,0); + watch64_unregister(&stats->tx_carrier_errors,0); + watch64_unregister(&stats->tx_fifo_errors,0); + watch64_unregister(&stats->tx_heartbeat_errors,0); + watch64_unregister(&stats->tx_window_errors,0); + watch64_unregister(&stats->rx_compressed,0); + watch64_unregister(&stats->tx_compressed,0); +} /* * Initialize the DEV module. At boot time this walks the device list and diff -Nru a/net/core/net-sysfs.c b/net/core/net-sysfs.c --- a/net/core/net-sysfs.c 2004-09-03 12:22:08 -04:00 +++ b/net/core/net-sysfs.c 2004-09-03 12:22:08 -04:00 @@ -16,6 +16,7 @@ #include #include #include +#include #define to_class_dev(obj) container_of(obj,struct class_device,kobj) #define to_net_dev(class) container_of(class, struct net_device, class_dev) @@ -23,6 +24,7 @@ static const char fmt_hex[] = "%#x\n"; static const char fmt_dec[] = "%d\n"; static const char fmt_ulong[] = "%lu\n"; +static const char fmt_ullong[] = "%llu\n"; static inline int dev_isalive(const struct net_device *dev) { @@ -204,8 +206,8 @@ read_lock(&dev_base_lock); if (dev_isalive(dev) && dev->get_stats && (stats = (*dev->get_stats)(dev))) - ret = sprintf(buf, fmt_ulong, - *(unsigned long *)(((u8 *) stats) + offset)); + ret = sprintf(buf, fmt_ullong, + watch64_getval((unsigned long *)(((u8 *) stats) + offset),NULL)); read_unlock(&dev_base_lock); return ret; From jeffpc@optonline.net Fri Sep 3 10:25:26 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Sep 2004 10:25:31 -0700 (PDT) Received: from mta6.srv.hcvlny.cv.net (mta6.srv.hcvlny.cv.net [167.206.5.72]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i83HPPCq028777 for ; Fri, 3 Sep 2004 10:25:26 -0700 Received: from 
[10.0.0.15] (ool-44c218a8.dyn.optonline.net [68.194.24.168]) by mta6.srv.hcvlny.cv.net (iPlanet Messaging Server 5.2 HotFix 1.25 (built Mar 3 2004)) with ESMTP id <0I3H00FI771066@mta6.srv.hcvlny.cv.net> for netdev@oss.sgi.com; Fri, 03 Sep 2004 13:24:37 -0400 (EDT) Date: Fri, 03 Sep 2004 13:24:28 -0400 From: Jeff Sipek Subject: Re: [PATCH/RFC 2.6] NET: 64-bit network statistics In-reply-to: <200409031307.01240.jeffpc@optonline.net> To: linux-kernel@vger.kernel.org Cc: netdev@oss.sgi.com Message-id: <200409031324.36252.jeffpc@optonline.net> MIME-version: 1.0 Content-type: Text/Plain; charset=iso-8859-1 Content-transfer-encoding: 7BIT Content-disposition: inline User-Agent: KMail/1.6.2 References: <200409031307.01240.jeffpc@optonline.net> X-archive-position: 8387 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jeffpc@optonline.net Precedence: bulk X-list: netdev -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Friday 03 September 2004 13:06, josef Jeff Sipek wrote: > I've created a patch that monitors changes to the network statistics > variables and keeps internal 64-bit counter. I decided to split it into two > parts (patches are to follow in next emails): > 1) generic variable monitoring system (watch64) > 2) network statistics specific patch (64network) Btw, both of these patches apply cleanly against 2.6.9-rc1-bk10. Jeff. 
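The idea in the quoted text, keeping a 64-bit total by periodically sampling a narrower counter, can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the watch64 code itself: struct wide_counter, wide_counter_init() and wide_counter_sample() are hypothetical names, uint32_t stands in for a 32-bit unsigned long, and the counter is assumed to wrap at most once between two samples.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for one watched variable. */
struct wide_counter {
	uint32_t oldval;	/* last sampled value of the narrow counter */
	uint64_t total;		/* accumulated 64-bit value */
};

/* Start tracking from the counter's current value. */
static void wide_counter_init(struct wide_counter *wc, uint32_t now)
{
	assert(wc != NULL);
	wc->oldval = now;
	wc->total = now;
}

/*
 * Fold a new sample into the 64-bit total. Unsigned subtraction is
 * modulo 2^32, so the delta is correct even if the narrow counter
 * wrapped, provided it wrapped at most once since the last sample.
 */
static uint64_t wide_counter_sample(struct wide_counter *wc, uint32_t now)
{
	assert(wc != NULL);
	wc->total += (uint32_t)(now - wc->oldval);
	wc->oldval = now;
	return wc->total;
}
```

The unsigned subtraction makes the wrap case fall out without a separate branch. The sampling interval has to be short enough that no more than one wrap can happen between two samples.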
- -- bad pun of the week: the formula 1 control computer suffered from a race condition -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) iD8DBQFBOKjQwFP0+seVj/4RApg/AKDEFSTVOMSvVh9zVU65o/P6ZcfBxgCffeId QddOVsR+uHdkV2D4/U8QVO4= =jQIT -----END PGP SIGNATURE----- From vkondra@mail.ru Fri Sep 3 11:04:47 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Sep 2004 11:05:32 -0700 (PDT) Received: from mx2.mail.ru (mx2.mail.ru [194.67.23.122]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i83I4k8w029921 for ; Fri, 3 Sep 2004 11:04:47 -0700 Received: from [212.179.200.204] (port=25901 helo=[192.168.10.2]) by mx2.mail.ru with esmtp id 1C3IQ2-000Jtq-00; Fri, 03 Sep 2004 22:04:16 +0400 From: Vladimir Kondratiev To: netdev@oss.sgi.com Subject: Re: [RFC] acx100 inclusion in mainline; generic 802.11 stack Date: Fri, 3 Sep 2004 20:37:54 +0300 User-Agent: KMail/1.7 Cc: Jeff Garzik , Denis Vlasenko , Jean Tourrilhes , Jouni Malinen , acx100-devel@lists.sourceforge.net, prism54-devel@prism54.org, "David S. Miller" References: <200408312111.02438.vda@port.imtp.ilyichevsk.odessa.ua> <200409022324.43117.vkondra@mail.ru> <4137839B.4000303@pobox.com> In-Reply-To: <4137839B.4000303@pobox.com> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200409032039.28201.vkondra@mail.ru> X-Spam: Probable Spam X-archive-position: 8388 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: vkondra@mail.ru Precedence: bulk X-list: netdev Is anyone working on this stack? I asked Dave, he is not working on it. Or is this code dead? 
On Thursday 02 September 2004 23:33, Jeff Garzik wrote: JG> Vladimir Kondratiev wrote: JG> > Jeff, JG> > JG> > On Tuesday 31 August 2004 21:21, Jeff Garzik wrote: JG> > JG> Denis Vlasenko wrote: JG> > JG> > I think 'senior' network guys are in position to decide upon which JG> > JG> > of currently available 802.11 stacks we should continue to work. JG> > JG> > (Atheros has one, said to be derived from BSD, is there any others?) JG> > JG> JG> > JG> JG> > JG> Already have. Start with the code in wireless-2.6 -- HostAP -- and use JG> > JG> DaveM's 802.11 stack template as a model for actually integrating 802.11 JG> > JG> very tightly with the rest of the net stack. JG> > JG> JG> > JG> JG> > http://www.kernel.org/pub/linux/kernel/people/jgarzik/patchkits/2.6/davem-p8 JG> > 0211.tar.bz2 JG> > JG> > Is this stack the main one that is going to be used? I.e. if I am working on JG> > driver for next generation .11 card - should I try to use it, request/submitt JG> > missing features etc.? Or should I use wireless extensions? JG> JG> DaveM's code is a template for how a wireless stack would look when JG> properly and fully integrated into the net core. JG> JG> Although JeanT and I disagree about this, I am less interested in JG> backwards compatibility than I am about making wireless a "first class JG> citizen" in the kernel. As I have proven with kcompat JG> (http://sf.net/projects/gkernel/) you can be backwards compatible while JG> still evolving the current kernel driver API to meet current design needs. 
JG> JG> Jeff JG> JG> JG> JG> From yoshfuji@linux-ipv6.org Fri Sep 3 12:06:41 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Sep 2004 12:06:48 -0700 (PDT) Received: from yue.st-paulia.net ([203.178.140.15]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i83J6fWW001538 for ; Fri, 3 Sep 2004 12:06:41 -0700 Received: from localhost (localhost [127.0.0.1]) by yue.st-paulia.net (Postfix) with ESMTP id D6C2533CE6; Sat, 4 Sep 2004 04:07:29 +0900 (JST) Date: Sat, 04 Sep 2004 04:07:27 +0900 (JST) Message-Id: <20040904.040727.72671952.yoshfuji@linux-ipv6.org> To: jeffpc@optonline.net Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org, yoshfuji@linux-ipv6.org Subject: Re: [PATCH 2.6] watch64: generic variable monitoring system From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: <200409031319.24863.jeffpc@optonline.net> References: <200409031307.01240.jeffpc@optonline.net> <200409031319.24863.jeffpc@optonline.net> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 8389 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev In article <200409031319.24863.jeffpc@optonline.net> (at Fri, 03 Sep 2004 13:19:24 -0400), "Josef 'Jeff' Sipek" says: > The watch64 system allows the programmer to specify the approximate interval > at which he wants his variables checked. If he tries to specify shorter > interval than the minimum a default value of HZ/10 is used. 
To minimize > locking, RCU and seqlock are used. On 64-bit systems, all is optimized away. I agree with the basic principle; it is very similar to mine. However, it is too complicated isn't it? I would do per-"table" registration (instead of per-variable one); watch64_getval() seems very ugly to me... -- Hideaki YOSHIFUJI @ USAGI Project GPG FP: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA From jeffpc@optonline.net Fri Sep 3 13:24:34 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Sep 2004 13:24:40 -0700 (PDT) Received: from mta6.srv.hcvlny.cv.net (mta6.srv.hcvlny.cv.net [167.206.5.72]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i83KOWIQ006212 for ; Fri, 3 Sep 2004 13:24:34 -0700 Received: from [10.0.0.15] (ool-44c218a8.dyn.optonline.net [68.194.24.168]) by mta6.srv.hcvlny.cv.net (iPlanet Messaging Server 5.2 HotFix 1.25 (built Mar 3 2004)) with ESMTP id <0I3H00EK9FCNLU@mta6.srv.hcvlny.cv.net> for netdev@oss.sgi.com; Fri, 03 Sep 2004 16:24:23 -0400 (EDT) Date: Fri, 03 Sep 2004 16:24:15 -0400 From: Jeff Sipek Subject: Re: [PATCH 2.6] watch64: generic variable monitoring system In-reply-to: <20040904.040727.72671952.yoshfuji@linux-ipv6.org> To: YOSHIFUJI Hideaki / =?utf-8?q?=E5=90=89=E8=97=A4=E8=8B=B1=E6=98=8E?= Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org Message-id: <200409031624.22665.jeffpc@optonline.net> MIME-version: 1.0 Content-type: Text/Plain; charset=utf-8 Content-disposition: inline User-Agent: KMail/1.6.2 References: <200409031307.01240.jeffpc@optonline.net> <200409031319.24863.jeffpc@optonline.net> <20040904.040727.72671952.yoshfuji@linux-ipv6.org> Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id i83KOWIQ006212 X-archive-position: 8390 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jeffpc@optonline.net Precedence: bulk X-list: netdev -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Friday 03 September 
2004 15:07, YOSHIFUJI Hideaki / 吉藤英明 wrote: > I agree with the basic principle; it is very similar to mine. Yes, I saw a patch on lkml a while ago (possibly yours?) that used a workqueue (IIRC.) > However, it is too complicated isn't it? I considered removing the programmer's ability to request a certain interval, and instead having all the variables checked every WATCH64_INTERVAL. > I would do per-"table" registration (instead of per-variable one); I considered that option, but then decided to make the watch64 system generic enough so that it could be used from anywhere in the kernel. Is my idea of having a kernel-wide subsystem like this too heavy-weight? > watch64_getval() seems very ugly to me... How so? Is it the multiplicity of "if (!st)"? Jeff. - -- bad pun of the week: the formula 1 control computer suffered from a race condition -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) iD8DBQFBONLzwFP0+seVj/4RAvYsAKCdVy9EzivcGtwa9CDiuvy/nwWuJwCglQ4L iIf4QXC7PA+YwQs3905sRv0= =NkA4 -----END PGP SIGNATURE----- From jgarzik@pobox.com Fri Sep 3 13:30:23 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Sep 2004 13:31:07 -0700 (PDT) Received: from www.linux.org.uk (IDENT:93@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i83KUMva006625 for ; Fri, 3 Sep 2004 13:30:23 -0700 Received: from rdu74-155-169.nc.rr.com ([24.74.155.169] helo=[10.10.10.88]) by www.linux.org.uk with asmtp (TLSv1:AES256-SHA:256) (Exim 4.33) id 1C3Kgy-0002Ry-Ss; Fri, 03 Sep 2004 21:29:53 +0100 Message-ID: <4138D431.8040206@pobox.com> Date: Fri, 03 Sep 2004 16:29:37 -0400 From: Jeff Garzik User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.2) Gecko/20040803 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Vladimir Kondratiev CC: netdev@oss.sgi.com, Denis Vlasenko , Jean Tourrilhes , Jouni Malinen , acx100-devel@lists.sourceforge.net, prism54-devel@prism54.org, "David S. 
Miller" Subject: Re: [RFC] acx100 inclusion in mainline; generic 802.11 stack References: <200408312111.02438.vda@port.imtp.ilyichevsk.odessa.ua> <200409022324.43117.vkondra@mail.ru> <4137839B.4000303@pobox.com> <200409032039.28201.vkondra@mail.ru> In-Reply-To: <200409032039.28201.vkondra@mail.ru> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 8391 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Vladimir Kondratiev wrote: > Is anyone working on this stack? I asked Dave, he is not working on it. > Or is this code dead? Nobody is actively working on that stack AFAIK. Jeff From jeffpc@optonline.net Fri Sep 3 13:40:55 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Sep 2004 13:41:01 -0700 (PDT) Received: from mta8.srv.hcvlny.cv.net (mta8.srv.hcvlny.cv.net [167.206.5.75]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i83KetC2007208 for ; Fri, 3 Sep 2004 13:40:55 -0700 Received: from [10.0.0.15] (ool-44c218a8.dyn.optonline.net [68.194.24.168]) by mta8.srv.hcvlny.cv.net (iPlanet Messaging Server 5.2 HotFix 1.25 (built Mar 3 2004)) with ESMTP id <0I3H00DA2G3JO9@mta8.srv.hcvlny.cv.net> for netdev@oss.sgi.com; Fri, 03 Sep 2004 16:40:32 -0400 (EDT) Date: Fri, 03 Sep 2004 16:40:30 -0400 From: "Josef 'Jeff' Sipek" Subject: Re: [PATCH 2.6] watch64: generic variable monitoring system In-reply-to: <200409031618.47521.jeffpc@optonline.net> To: Stephen Hemminger Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org Message-id: <200409031640.30731.jeffpc@optonline.net> MIME-version: 1.0 Content-type: text/plain; charset=iso-8859-1 Content-transfer-encoding: 7BIT Content-disposition: inline User-Agent: KMail/1.6.2 References: <200409031307.01240.jeffpc@optonline.net> <20040903121657.355a6a8b@dell_ss3.pdx.osdl.net> <200409031618.47521.jeffpc@optonline.net> X-archive-position: 8392 
X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jeffpc@optonline.net Precedence: bulk X-list: netdev The following fixes the previously submitted watch64 patch to follow the CodingStyle guidelines. The BK repo is up to date as well. Jeff. Signed-off-by: Josef "Jeff" Sipek --- 1.7/kernel/watch64.c 2004-07-14 16:41:26 -04:00 +++ edited/watch64.c 2004-09-03 16:12:39 -04:00 @@ -110,7 +110,8 @@ return; } - printk(KERN_WARNING "watch64: 2003/08/22 Josef 'Jeff' Sipek \n"); + printk(KERN_WARNING "watch64: 2003/08/22 Josef 'Jeff' Sipek " + "\n"); printk(KERN_WARNING "watch64: Enabling Watch64 extensions..."); init_timer(&watch64_timer); @@ -139,19 +140,21 @@ rcu_read_lock(); list_for_each_rcu(entry, &watch64_head) { watch_struct = list_entry(entry, struct watch64, list); - if (*watch_struct->ptr != watch_struct->oldval) { - tmp = *watch_struct->ptr; - if (tmp > watch_struct->oldval) { - write_seqlock(&watch_struct->lock); - watch_struct->total += tmp - watch_struct->oldval; - write_sequnlock(&watch_struct->lock); - } else if (tmp < watch_struct->oldval) { - write_seqlock(&watch_struct->lock); - watch_struct->total += ((u_int64_t) 1<<BITS_PER_LONG) - watch_struct->oldval + tmp; - write_sequnlock(&watch_struct->lock); - } - watch_struct->oldval = tmp; + if (*watch_struct->ptr == watch_struct->oldval) + continue; + + tmp = *watch_struct->ptr; + if (tmp > watch_struct->oldval) { + write_seqlock(&watch_struct->lock); + watch_struct->total += tmp - watch_struct->oldval; + write_sequnlock(&watch_struct->lock); + } else if (tmp < watch_struct->oldval) { + write_seqlock(&watch_struct->lock); + watch_struct->total += ((u_int64_t) 1<<BITS_PER_LONG) - + watch_struct->oldval + tmp; + write_sequnlock(&watch_struct->lock); } + watch_struct->oldval = tmp; } rcu_read_unlock(); @@ -181,7 +184,8 @@ temp->interval = WATCH64_INTERVAL; else if (interval<WATCH64_MINIMUM) { temp->interval = WATCH64_MINIMUM; - printk("watch64: attempted to add new watch with interval below %d jiffies",WATCH64_MINIMUM); + printk("watch64: 
attempted to add new watch with " + "interval below %d jiffies",WATCH64_MINIMUM); } else temp->interval = interval; From davej@redhat.com Fri Sep 3 13:53:14 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Sep 2004 13:53:19 -0700 (PDT) Received: from delerium.codemonkey.org.uk (delerium.kernelslacker.org [81.187.208.145]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i83KrBOr008042 for ; Fri, 3 Sep 2004 13:53:14 -0700 Received: from delerium.codemonkey.org.uk (localhost.localdomain [127.0.0.1]) by delerium.codemonkey.org.uk (8.13.1/8.13.1) with ESMTP id i83Kq6OV021868; Fri, 3 Sep 2004 21:52:06 +0100 Received: (from davej@localhost) by delerium.codemonkey.org.uk (8.13.1/8.13.1/Submit) id i83Kq6lu021867; Fri, 3 Sep 2004 21:52:06 +0100 X-Authentication-Warning: delerium.codemonkey.org.uk: davej set sender to davej@redhat.com using -f Date: Fri, 3 Sep 2004 21:52:06 +0100 From: Dave Jones To: "David S. Miller" Cc: netdev@oss.sgi.com Subject: Too late check in af_packet.c Message-ID: <20040903205206.GT26419@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.1i X-archive-position: 8393 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davej@redhat.com Precedence: bulk X-list: netdev Using the automated source checker at coverity.com, they picked up on some code in packet_release() where a NULL check was done after dereferencing. Patch below. 
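The bug class is easy to reproduce outside the kernel. A minimal sketch, assuming hypothetical struct outer and struct inner types in place of the socket structures (nothing here is taken from af_packet.c): the dependent pointer is derived only after the NULL check, so the check can actually take effect.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the socket structures; not the kernel's types. */
struct inner {
	int refs;
};

struct outer {
	struct inner *in;
};

/*
 * Returns 0 and does nothing when there is no inner object; otherwise
 * drops one reference and returns 1. Nothing is read through 'in'
 * until it is known to be non-NULL.
 */
static int outer_release(struct outer *o)
{
	struct inner *in;
	int *refs;

	assert(o != NULL);
	in = o->in;
	if (!in)
		return 0;

	refs = &in->refs;	/* derived from 'in' only after the check */
	(*refs)--;
	return 1;
}
```

Declaring the dependent variable without an initializer and assigning it after the check keeps the declaration block intact while removing the early dereference.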
Signed-off-by: Dave Jones Dave --- linux-2.6.8/net/packet/af_packet.c~ 2004-09-03 21:48:14.653433072 +0100 +++ linux-2.6.8/net/packet/af_packet.c 2004-09-03 21:49:23.652943552 +0100 @@ -785,11 +785,13 @@ static int packet_release(struct socket *sock) { struct sock *sk = sock->sk; - struct packet_opt *po = pkt_sk(sk); + struct packet_opt *po; if (!sk) return 0; + po = pkt_sk(sk); + write_lock_bh(&packet_sklist_lock); sk_del_node_init(sk); write_unlock_bh(&packet_sklist_lock); From jeffpc@optonline.net Fri Sep 3 14:45:07 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Sep 2004 14:45:14 -0700 (PDT) Received: from mta9.srv.hcvlny.cv.net (mta9.srv.hcvlny.cv.net [167.206.5.42]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i83Lj62F009118 for ; Fri, 3 Sep 2004 14:45:07 -0700 Received: from [10.0.0.15] (ool-44c218a8.dyn.optonline.net [68.194.24.168]) by mta9.srv.hcvlny.cv.net (iPlanet Messaging Server 5.2 HotFix 1.25 (built Mar 3 2004)) with ESMTP id <0I3H00JKEJ29E3@mta9.srv.hcvlny.cv.net> for netdev@oss.sgi.com; Fri, 03 Sep 2004 17:44:34 -0400 (EDT) Date: Fri, 03 Sep 2004 17:44:24 -0400 From: "Josef 'Jeff' Sipek" Subject: Re: [PATCH 2.6] watch64: generic variable monitoring system In-reply-to: <20040903121657.355a6a8b@dell_ss3.pdx.osdl.net> To: Stephen Hemminger Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com Message-id: <200409031744.32970.jeffpc@optonline.net> MIME-version: 1.0 Content-type: multipart/mixed; boundary="Boundary_(ID_Gu5A8t3yfXnZW0k3kEIBKg)" Content-disposition: inline User-Agent: KMail/1.6.2 References: <200409031307.01240.jeffpc@optonline.net> <200409031319.24863.jeffpc@optonline.net> <20040903121657.355a6a8b@dell_ss3.pdx.osdl.net> X-archive-position: 8394 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jeffpc@optonline.net Precedence: bulk X-list: netdev --Boundary_(ID_Gu5A8t3yfXnZW0k3kEIBKg) Content-type: Text/Plain; charset=iso-8859-1 Content-transfer-encoding: 
7BIT Content-disposition: inline -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Friday 03 September 2004 15:16, Stephen Hemminger wrote: > - Code doesn't match the kernel style (read Documentation/CodingStyle) Sorry about the white space, KMail apparently likes to butcher the text. These are the same patches with the little cleanup update. Jeff. - -- Reality is merely an illusion, albeit a very persistent one. - Albert Einstein -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) iD4DBQFBOOW+wFP0+seVj/4RAgSiAJj54qcqdEx66lbMW9ik0XviupTNAKC82an1 R0pGX0pTBZ78NWrZpxJm+w== =EesC -----END PGP SIGNATURE----- --Boundary_(ID_Gu5A8t3yfXnZW0k3kEIBKg) Content-type: text/x-diff; charset=iso-8859-1; name=watch64-patch Content-transfer-encoding: 7BIT Content-disposition: attachment; filename=watch64-patch diff -Nru a/Documentation/00-INDEX b/Documentation/00-INDEX --- a/Documentation/00-INDEX 2004-09-03 17:41:06 -04:00 +++ b/Documentation/00-INDEX 2004-09-03 17:41:06 -04:00 @@ -250,6 +250,8 @@ - directory with info regarding video/TV/radio cards and linux. vm/ - directory with info on the Linux vm code. +watch64.txt + - watch64 API description watchdog/ - how to auto-reboot Linux if it has "fallen and can't get up". ;-) x86_64/ diff -Nru a/Documentation/watch64.txt b/Documentation/watch64.txt --- /dev/null Wed Dec 31 16:00:00 196900 +++ b/Documentation/watch64.txt 2004-09-03 17:41:06 -04:00 @@ -0,0 +1,35 @@ +int watch64_register(unsigned long* ptr, unsigned int interval); + + - Registers *ptr to be monitored every interval jiffies. 
+ - If interval==0, WATCH64_INTERVAL will be used (HZ/10 by default) + +int watch64_unregister(unsigned long* ptr, struct watch64* st); + + - Unregister *ptr + - st is an optional pointer to the struct containing the registration + information + - if st==NULL, it will be looked up automatically + +struct watch64* watch64_find(unsigned long* ptr); + + - Return struct with registration information of *ptr + +int watch64_disable(unsigned long* ptr, struct watch64* st); + + - Stop monitoring *ptr, without removing it from the list + - st is optional (see watch64_unregister for more information) + +int watch64_enable(unsigned long* ptr, struct watch64* st); + + - Start monitoring *ptr (opposite of watch64_disable) + - st is optional (see watch64_unregister for more information) + +int watch64_toggle(unsigned long* ptr, struct watch64* st); + + - Toggle the enable/disable status + - st is optional (see watch64_unregister for more information) + +inline u_int64_t watch64_getval(unsigned long* ptr, struct watch64* st); + + - Return the whole 64-bit counter + - st is optional (see watch64_unregister for more information) diff -Nru a/include/linux/watch64.h b/include/linux/watch64.h --- /dev/null Wed Dec 31 16:00:00 196900 +++ b/include/linux/watch64.h 2004-09-03 17:41:06 -04:00 @@ -0,0 +1,63 @@ +/* + * include/linux/watch64.h + * + * Copyright (C) 2003 Josef "Jeff" Sipek + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. 
+ * + */ + +#ifndef _LINUX_64WATCH_H +#define _LINUX_64WATCH_H + +#include +#include +#include +#include +#include + +#define WATCH64_INTERVAL (HZ/10) +#define WATCH64_MINIMUM (HZ/20) +#define WATCH64_MAGIC 0x573634 + +#if (BITS_PER_LONG == 64) + +struct watch64 { +}; + +#else + +struct watch64 { + struct list_head list; + unsigned long *ptr; + unsigned long oldval; + u_int64_t total; + unsigned int interval; + int active; + seqlock_t lock; + struct rcu_head rcuhead; +}; + +#endif /* (BITS_PER_LONG == 64) */ + +/* + * Prototypes + */ + +void watch64_init(void); +void watch64_run(unsigned long var); +int watch64_register(unsigned long* ptr, unsigned int interval); +int watch64_unregister(unsigned long* ptr, struct watch64* st); +void watch64_rcufree(struct rcu_head* p); +struct watch64* watch64_find(unsigned long* ptr); +inline struct watch64* __watch64_find(unsigned long* ptr); +int watch64_disable(unsigned long* ptr, struct watch64* st); +inline int __watch64_disable(unsigned long* ptr, struct watch64* st); +int watch64_enable(unsigned long* ptr, struct watch64* st); +inline int __watch64_enable(unsigned long* ptr, struct watch64* st); +int watch64_toggle(unsigned long* ptr, struct watch64* st); +inline u_int64_t watch64_getval(unsigned long* ptr, struct watch64* st); + +#endif /* _LINUX_WATCH64_H */ diff -Nru a/kernel/Makefile b/kernel/Makefile --- a/kernel/Makefile 2004-09-03 17:41:06 -04:00 +++ b/kernel/Makefile 2004-09-03 17:41:06 -04:00 @@ -7,7 +7,7 @@ sysctl.o capability.o ptrace.o timer.o user.o \ signal.o sys.o kmod.o workqueue.o pid.o \ rcupdate.o intermodule.o extable.o params.o posix-timers.o \ - kthread.o + kthread.o watch64.o obj-$(CONFIG_FUTEX) += futex.o obj-$(CONFIG_GENERIC_ISA_DMA) += dma.o diff -Nru a/kernel/watch64.c b/kernel/watch64.c --- /dev/null Wed Dec 31 16:00:00 196900 +++ b/kernel/watch64.c 2004-09-03 17:41:06 -04:00 @@ -0,0 +1,396 @@ +/* + * kernel/watch64.c + * + * Copyright (C) 2003 Josef "Jeff" Sipek + * + * This program is free 
software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +/* + * Watch64 global variables + */ + +spinlock_t watch64_biglock = SPIN_LOCK_UNLOCKED; +LIST_HEAD(watch64_head); +struct timer_list watch64_timer; +int watch64_setup; + +#if (BITS_PER_LONG == 64) + +void watch64_init(void) +{ +} + +void watch64_run(unsigned long var) +{ +} + +int watch64_register(unsigned long* ptr, unsigned int interval) +{ + return 0; +} + +int watch64_unregister(unsigned long* ptr, struct watch64* st) +{ + return 0; +} + +void watch64_rcufree(void* p) +{ +} + +struct watch64* watch64_find(unsigned long* ptr) +{ + return NULL; +} + +struct watch64* __watch64_find(unsigned long* ptr) +{ + return NULL; +} + +int watch64_disable(unsigned long* ptr, struct watch64* st) +{ + return 0; +} + +int __watch64_disable(unsigned long* ptr, struct watch64* st) +{ + return 0; +} + +int watch64_enable(unsigned long* ptr, struct watch64* st) +{ + return 0; +} + +int __watch64_enable(unsigned long* ptr, struct watch64* st) +{ + return 0; +} + +int watch64_toggle(unsigned long* ptr, struct watch64* st) +{ + return 0; +} + +inline u_int64_t watch64_getval(unsigned long* ptr, struct watch64* st) +{ + return (u_int64_t) *ptr; +} + +#else + +/* + * Initiate watch64 system + */ + +void watch64_init(void) +{ + spin_lock(&watch64_biglock); + + if (watch64_setup==WATCH64_MAGIC) { + spin_unlock(&watch64_biglock); + return; + } + + printk(KERN_WARNING "watch64: 2003/08/22 Josef 'Jeff' Sipek " + "\n"); + printk(KERN_WARNING "watch64: Enabling Watch64 extensions..."); + + init_timer(&watch64_timer); + watch64_timer.function = watch64_run; + watch64_timer.data = (unsigned long) NULL; + watch64_timer.expires = jiffies + WATCH64_MINIMUM; + add_timer(&watch64_timer); + + 
printk("done.\n"); + + watch64_setup = WATCH64_MAGIC; + + spin_unlock(&watch64_biglock); +} + +/* + * Go through the list of registered variables and check them for changes + */ + +void watch64_run(unsigned long var) +{ + struct list_head* entry; + struct watch64* watch_struct; + unsigned long tmp; + + rcu_read_lock(); + list_for_each_rcu(entry, &watch64_head) { + watch_struct = list_entry(entry, struct watch64, list); + if (*watch_struct->ptr == watch_struct->oldval) + continue; + + tmp = *watch_struct->ptr; + if (tmp > watch_struct->oldval) { + write_seqlock(&watch_struct->lock); + watch_struct->total += tmp - watch_struct->oldval; + write_sequnlock(&watch_struct->lock); + } else if (tmp < watch_struct->oldval) { + write_seqlock(&watch_struct->lock); + watch_struct->total += ((u_int64_t) 1<<BITS_PER_LONG) - + watch_struct->oldval + tmp; + write_sequnlock(&watch_struct->lock); + } + watch_struct->oldval = tmp; + } + rcu_read_unlock(); + + mod_timer(&watch64_timer, jiffies + WATCH64_MINIMUM); +} + +/* + * Register a new variable with watch64 + */ + +int watch64_register(unsigned long* ptr, unsigned int interval) +{ + struct watch64* temp; + + temp = (struct watch64*) kmalloc(sizeof(struct watch64),GFP_ATOMIC); + + if (!temp) + return -ENOMEM; + + if (watch64_setup!=WATCH64_MAGIC) + watch64_init(); + + temp->ptr = ptr; + temp->oldval = 0; + temp->total = 0; + if (interval==0) + temp->interval = WATCH64_INTERVAL; + else if (interval<WATCH64_MINIMUM) { + temp->interval = WATCH64_MINIMUM; + printk("watch64: attempted to add new watch with " + "interval below %d jiffies",WATCH64_MINIMUM); + } else + temp->interval = interval; + + temp->active = 0; + + seqlock_init(&temp->lock); + + list_add_rcu(&temp->list, &watch64_head); + + return 0; +} + +/* + * Unregister a variable with watch64 + */ + +int watch64_unregister(unsigned long* ptr, struct watch64* st) +{ + rcu_read_lock(); + if (!st) + st = __watch64_find(ptr); + + if (!st) + return -EINVAL; + + __watch64_disable(ptr, st); + list_del_rcu(&st->list); + + call_rcu(&st->rcuhead, 
watch64_rcufree); + rcu_read_unlock(); + + return 0; +} + +/* + * Free memory via RCU + */ + +void watch64_rcufree(struct rcu_head* p) +{ + kfree(container_of(p, struct watch64, rcuhead)); +} + +/* + * Find watch64 structure with RCU lock + */ + +struct watch64* watch64_find(unsigned long* ptr) +{ + struct watch64* tmp; + + rcu_read_lock(); + tmp = __watch64_find(ptr); + rcu_read_unlock(); + + return tmp; +} + +/* + * Find watch64 structure without RCU lock + */ + +inline struct watch64* __watch64_find(unsigned long* ptr) +{ + struct list_head* tmp; + struct watch64* watch64_struct; + + list_for_each_rcu(tmp, &watch64_head) { + watch64_struct = list_entry(tmp, struct watch64, list); + if (watch64_struct->ptr==ptr) + return watch64_struct; + } + + return NULL; +} + +/* + * Disable a variable watch with RCU lock + */ + +int watch64_disable(unsigned long* ptr, struct watch64* st) +{ + int tmp; + + rcu_read_lock(); + tmp = __watch64_disable(ptr,st); + rcu_read_unlock(); + + return tmp; +} + +/* + * Disable a variable watch without RCU lock + */ + +inline int __watch64_disable(unsigned long* ptr, struct watch64* st) +{ + if (!st) + st = watch64_find(ptr); + + if (!st) + return -EINVAL; + + st->active = 0; + + return 0; +} + +/* + * Enable a variable watch with RCU lock + */ + +int watch64_enable(unsigned long* ptr, struct watch64* st) +{ + int tmp; + + rcu_read_lock(); + tmp = __watch64_enable(ptr,st); + rcu_read_unlock(); + + return tmp; +} + +/* + * Enable a variable watch without RCU lock + */ + +inline int __watch64_enable(unsigned long* ptr, struct watch64* st) +{ + if (!st) + st = __watch64_find(ptr); + + if (!st) + return -EINVAL; + + st->oldval = *ptr; + write_seqlock(&st->lock); + st->total = (u_int64_t) st->oldval; + write_sequnlock(&st->lock); + st->active = 1; + + return 0; +} + +/* + * Toggle a variable watch + */ + +int watch64_toggle(unsigned long* ptr, struct watch64* st) +{ + rcu_read_lock(); + if (!st) + st = __watch64_find(ptr); + + if (!st) { + 
rcu_read_unlock(); + return -EINVAL; + } + + if (st->active) + __watch64_disable(ptr,st); + else + __watch64_enable(ptr,st); + rcu_read_unlock(); + + return 0; +} + +/* + * Return the total 64-bit value + */ + +inline u_int64_t watch64_getval(unsigned long* ptr, struct watch64* st) +{ + unsigned int seq; + u_int64_t total; + + rcu_read_lock(); + if (!st) + st = __watch64_find(ptr); + + if (!st) { + rcu_read_unlock(); + return *ptr; + } + + do { + seq = read_seqbegin(&st->lock); + total = st->total; + } while (read_seqretry(&st->lock, seq)); + rcu_read_unlock(); + + return total; +} + +#endif /* (BITS_PER_LONG == 64) */ + +/* + * Export all the necessary symbols + */ + +EXPORT_SYMBOL(watch64_register); +EXPORT_SYMBOL(watch64_unregister); +EXPORT_SYMBOL(watch64_find); +EXPORT_SYMBOL(watch64_disable); +EXPORT_SYMBOL(watch64_enable); +EXPORT_SYMBOL(watch64_toggle); +EXPORT_SYMBOL(watch64_getval); --Boundary_(ID_Gu5A8t3yfXnZW0k3kEIBKg) Content-type: text/x-diff; charset=iso-8859-1; name=64network-patch Content-transfer-encoding: 7BIT Content-disposition: attachment; filename=64network-patch diff -Nru a/include/linux/netdevice.h b/include/linux/netdevice.h --- a/include/linux/netdevice.h 2004-09-03 12:22:08 -04:00 +++ b/include/linux/netdevice.h 2004-09-03 12:22:08 -04:00 @@ -14,6 +14,7 @@ * Alan Cox, * Bjorn Ekwall. 
* Pekka Riikonen + * Josef "Jeff" Sipek * * This program is free software; you can redistribute it and/or * modify it under the terms of the GNU General Public License @@ -945,6 +946,10 @@ #ifdef CONFIG_SYSCTL extern char *net_sysctl_strdup(const char *s); #endif + +/* * Register/unregister all the members of struct net_device_stats with watch64 */ +inline void net_register_stats64(struct net_device_stats* stats); +inline void net_unregister_stats64(struct net_device_stats* stats); #endif /* __KERNEL__ */ diff -Nru a/net/core/dev.c b/net/core/dev.c --- a/net/core/dev.c 2004-09-03 12:22:08 -04:00 +++ b/net/core/dev.c 2004-09-03 12:22:08 -04:00 @@ -18,6 +18,7 @@ * Alexey Kuznetsov * Adam Sulmicki * Pekka Riikonen + * Josef "Jeff" Sipek * * Changes: * D.J. Barrow : Fixed bug where dev->refcnt gets set @@ -70,6 +71,7 @@ * indefinitely on dev->refcnt * J Hadi Salim : - Backlog queue sampling * - netif_rx() feedback + * Josef "Jeff" Sipek : Added watch64 calls for network statistics */ #include @@ -108,6 +110,7 @@ #include #include #include +#include #ifdef CONFIG_NET_RADIO #include /* Note : will define WIRELESS_EXT */ #include @@ -2110,6 +2113,49 @@ seq_printf(seq, "%6s: No statistics available.\n", dev->name); } +static void dev_seq_printf_stats64(struct seq_file *seq, struct net_device *dev) +{ + if (dev->get_stats) { + struct net_device_stats *stats = dev->get_stats(dev); + + seq_printf(seq, "%6s:%8llu %7llu %4llu %4llu %4llu %5llu %10llu %9llu " + "%8llu %7llu %4llu %4llu %4llu %5llu %7llu %10llu\n", + dev->name, watch64_getval(&stats->rx_bytes,NULL), + watch64_getval(&stats->rx_packets,NULL), + watch64_getval(&stats->rx_errors,NULL), + watch64_getval(&stats->rx_dropped,NULL) + + watch64_getval(&stats->rx_missed_errors,NULL), + watch64_getval(&stats->rx_fifo_errors,NULL), + watch64_getval(&stats->rx_length_errors,NULL) + + watch64_getval(&stats->rx_over_errors,NULL) + + watch64_getval(&stats->rx_crc_errors,NULL) + + watch64_getval(&stats->rx_frame_errors,NULL), + 
watch64_getval(&stats->rx_compressed,NULL), + watch64_getval(&stats->multicast,NULL), + watch64_getval(&stats->tx_bytes,NULL), + watch64_getval(&stats->tx_packets,NULL), + watch64_getval(&stats->tx_errors,NULL), + watch64_getval(&stats->tx_dropped,NULL), + watch64_getval(&stats->tx_fifo_errors,NULL), + watch64_getval(&stats->collisions,NULL), + watch64_getval(&stats->tx_carrier_errors,NULL) + + watch64_getval(&stats->tx_aborted_errors,NULL) + + watch64_getval(&stats->tx_window_errors,NULL) + + watch64_getval(&stats->tx_heartbeat_errors,NULL), + watch64_getval(&stats->tx_compressed,NULL)); + } else + seq_printf(seq, "%6s: No statistics available.\n", dev->name); +} + +static void dev_seq_show_header(struct seq_file *seq) +{ + seq_puts(seq, "Inter-| Receive " + " | Transmit\n" + " face |bytes packets errs drop fifo frame " + "compressed multicast|bytes packets errs " + "drop fifo colls carrier compressed\n"); +} + /* * Called from the PROCfs module. This now uses the new arbitrary sized * /proc/net interface to create /proc/net/dev @@ -2117,16 +2163,21 @@ static int dev_seq_show(struct seq_file *seq, void *v) { if (v == SEQ_START_TOKEN) - seq_puts(seq, "Inter-| Receive " - " | Transmit\n" - " face |bytes packets errs drop fifo frame " - "compressed multicast|bytes packets errs " - "drop fifo colls carrier compressed\n"); + dev_seq_show_header(seq); else dev_seq_printf_stats(seq, v); return 0; } +static int dev_seq_show64(struct seq_file *seq, void *v) +{ + if (v == SEQ_START_TOKEN) + dev_seq_show_header(seq); + else + dev_seq_printf_stats64(seq, v); + return 0; +} + static struct netif_rx_stats *softnet_get_online(loff_t *pos) { struct netif_rx_stats *rc = NULL; @@ -2179,11 +2230,23 @@ .show = dev_seq_show, }; +static struct seq_operations dev_seq_ops64 = { + .start = dev_seq_start, + .next = dev_seq_next, + .stop = dev_seq_stop, + .show = dev_seq_show64, +}; + static int dev_seq_open(struct inode *inode, struct file *file) { return seq_open(file, &dev_seq_ops); } 
+static int dev_seq_open64(struct inode *inode, struct file *file) +{ + return seq_open(file, &dev_seq_ops64); +} + static struct file_operations dev_seq_fops = { .owner = THIS_MODULE, .open = dev_seq_open, @@ -2192,6 +2255,14 @@ .release = seq_release, }; +static struct file_operations dev_seq_fops64 = { + .owner = THIS_MODULE, + .open = dev_seq_open64, + .read = seq_read, + .llseek = seq_lseek, + .release = seq_release, +}; + static struct seq_operations softnet_seq_ops = { .start = softnet_seq_start, .next = softnet_seq_next, @@ -2224,8 +2295,10 @@ if (!proc_net_fops_create("dev", S_IRUGO, &dev_seq_fops)) goto out; - if (!proc_net_fops_create("softnet_stat", S_IRUGO, &softnet_seq_fops)) + if (!proc_net_fops_create("dev64", S_IRUGO, &dev_seq_fops64)) goto out_dev; + if (!proc_net_fops_create("softnet_stat", S_IRUGO, &softnet_seq_fops)) + goto out_dev64; if (wireless_proc_init()) goto out_softnet; rc = 0; @@ -2233,6 +2306,8 @@ return rc; out_softnet: proc_net_remove("softnet_stat"); +out_dev64: + proc_net_remove("dev64"); out_dev: proc_net_remove("dev"); goto out; @@ -2910,6 +2985,9 @@ * device is present. */ + if (dev->get_stats) + net_register_stats64(dev->get_stats(dev)); + set_bit(__LINK_STATE_PRESENT, &dev->state); dev->next = NULL; @@ -2922,7 +3000,7 @@ dev_hold(dev); dev->reg_state = NETREG_REGISTERING; write_unlock_bh(&dev_base_lock); - + /* Notify protocols, that a new device appeared. */ notifier_call_chain(&netdev_chain, NETDEV_REGISTER, dev); @@ -3145,6 +3223,9 @@ /* If device is running, close it first. */ if (dev->flags & IFF_UP) dev_close(dev); + + if (dev->get_stats) + net_unregister_stats64(dev->get_stats(dev)); /* And unlink it from device chain. 
*/ for (dp = &dev_base; (d = *dp) != NULL; dp = &d->next) { @@ -3246,6 +3327,98 @@ } #endif /* CONFIG_HOTPLUG_CPU */ +/* + * Register all the members of the net_device_stats structure + * + */ + +inline void net_register_stats64(struct net_device_stats* stats) +{ + if (!stats) + return; + + watch64_register(&stats->tx_packets,0); + watch64_enable (&stats->tx_packets,NULL); + watch64_register(&stats->rx_packets,0); + watch64_enable (&stats->rx_packets,NULL); + watch64_register(&stats->tx_bytes,0); + watch64_enable (&stats->tx_bytes,NULL); + watch64_register(&stats->rx_bytes,0); + watch64_enable (&stats->rx_bytes,NULL); + watch64_register(&stats->tx_errors,0); + watch64_enable (&stats->tx_errors,NULL); + watch64_register(&stats->rx_errors,0); + watch64_enable (&stats->rx_errors,NULL); + watch64_register(&stats->tx_dropped,0); + watch64_enable (&stats->tx_dropped,NULL); + watch64_register(&stats->rx_dropped,0); + watch64_enable (&stats->rx_dropped,NULL); + watch64_register(&stats->multicast,0); + watch64_enable (&stats->multicast,NULL); + watch64_register(&stats->collisions,0); + watch64_enable (&stats->collisions,NULL); + watch64_register(&stats->rx_length_errors,0); + watch64_enable (&stats->rx_length_errors,NULL); + watch64_register(&stats->rx_over_errors,0); + watch64_enable (&stats->rx_over_errors,NULL); + watch64_register(&stats->rx_crc_errors,0); + watch64_enable (&stats->rx_crc_errors,NULL); + watch64_register(&stats->rx_frame_errors,0); + watch64_enable (&stats->rx_frame_errors,NULL); + watch64_register(&stats->rx_fifo_errors,0); + watch64_enable (&stats->rx_fifo_errors,NULL); + watch64_register(&stats->rx_missed_errors,0); + watch64_enable (&stats->rx_missed_errors,NULL); + watch64_register(&stats->tx_aborted_errors,0); + watch64_enable (&stats->tx_aborted_errors,NULL); + watch64_register(&stats->tx_carrier_errors,0); + watch64_enable (&stats->tx_carrier_errors,NULL); + watch64_register(&stats->tx_fifo_errors,0); + watch64_enable 
(&stats->tx_fifo_errors,NULL); + watch64_register(&stats->tx_heartbeat_errors,0); + watch64_enable (&stats->tx_heartbeat_errors,NULL); + watch64_register(&stats->tx_window_errors,0); + watch64_enable (&stats->tx_window_errors,NULL); + watch64_register(&stats->rx_compressed,0); + watch64_enable (&stats->rx_compressed,NULL); + watch64_register(&stats->tx_compressed,0); + watch64_enable (&stats->tx_compressed,NULL); +} + +/* + * Unregister all the members of the net_device_stats structure + * + */ + +inline void net_unregister_stats64(struct net_device_stats* stats) +{ + if (!stats) + return; + + watch64_unregister(&stats->tx_packets,0); + watch64_unregister(&stats->rx_packets,0); + watch64_unregister(&stats->tx_bytes,0); + watch64_unregister(&stats->rx_bytes,0); + watch64_unregister(&stats->tx_errors,0); + watch64_unregister(&stats->rx_errors,0); + watch64_unregister(&stats->tx_dropped,0); + watch64_unregister(&stats->rx_dropped,0); + watch64_unregister(&stats->multicast,0); + watch64_unregister(&stats->collisions,0); + watch64_unregister(&stats->rx_length_errors,0); + watch64_unregister(&stats->rx_over_errors,0); + watch64_unregister(&stats->rx_crc_errors,0); + watch64_unregister(&stats->rx_frame_errors,0); + watch64_unregister(&stats->rx_fifo_errors,0); + watch64_unregister(&stats->rx_missed_errors,0); + watch64_unregister(&stats->tx_aborted_errors,0); + watch64_unregister(&stats->tx_carrier_errors,0); + watch64_unregister(&stats->tx_fifo_errors,0); + watch64_unregister(&stats->tx_heartbeat_errors,0); + watch64_unregister(&stats->tx_window_errors,0); + watch64_unregister(&stats->rx_compressed,0); + watch64_unregister(&stats->tx_compressed,0); +} /* * Initialize the DEV module. 
At boot time this walks the device list and diff -Nru a/net/core/net-sysfs.c b/net/core/net-sysfs.c --- a/net/core/net-sysfs.c 2004-09-03 12:22:08 -04:00 +++ b/net/core/net-sysfs.c 2004-09-03 12:22:08 -04:00 @@ -16,6 +16,7 @@ #include #include #include +#include #define to_class_dev(obj) container_of(obj,struct class_device,kobj) #define to_net_dev(class) container_of(class, struct net_device, class_dev) @@ -23,6 +24,7 @@ static const char fmt_hex[] = "%#x\n"; static const char fmt_dec[] = "%d\n"; static const char fmt_ulong[] = "%lu\n"; +static const char fmt_ullong[] = "%llu\n"; static inline int dev_isalive(const struct net_device *dev) { @@ -204,8 +206,8 @@ read_lock(&dev_base_lock); if (dev_isalive(dev) && dev->get_stats && (stats = (*dev->get_stats)(dev))) - ret = sprintf(buf, fmt_ulong, - *(unsigned long *)(((u8 *) stats) + offset)); + ret = sprintf(buf, fmt_ullong, + watch64_getval((unsigned long *)(((u8 *) stats) + offset),NULL)); read_unlock(&dev_base_lock); return ret; --Boundary_(ID_Gu5A8t3yfXnZW0k3kEIBKg)-- From afleming@freescale.com Fri Sep 3 15:18:39 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Sep 2004 15:18:44 -0700 (PDT) Received: from motgate8.mot.com (motgate8.mot.com [129.188.136.8]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i83MIdDg009752 for ; Fri, 3 Sep 2004 15:18:39 -0700 Received: from az33exr04.mot.com (pobox4.mot.com [10.64.251.243]) by motgate8.mot.com (Motorola/Motgate8) with ESMTP id i83MJYF7003825 for ; Fri, 3 Sep 2004 15:19:34 -0700 (MST) Received: from [10.82.17.240] ([10.82.17.240]) by az33exr04.mot.com (Motorola/az33exr04) with ESMTP id i83KHo3r026899 for ; Fri, 3 Sep 2004 15:18:16 -0500 Mime-Version: 1.0 (Apple Message framework v618) Content-Transfer-Encoding: 7bit Message-Id: <29D06014-FDF7-11D8-942E-000393C30512@freescale.com> Content-Type: text/plain; charset=US-ASCII; format=flowed To: netdev@oss.sgi.com From: Andy Fleming Subject: Using schedule_work for interrupt handling Date: Fri, 3 Sep 2004 17:18:20 
 -0500
X-Mailer: Apple Mail (2.618)
X-archive-position: 8395
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: afleming@freescale.com
Precedence: bulk
X-list: netdev

So I've done another bk pull (just a few minutes ago), and my driver is
still broken. If this isn't the place to ask, please direct me to the
appropriate list -- I'm getting desperate!

So what I'm doing in my driver is handling the PHY link change
interrupt by disabling and clearing the interrupt I receive, then
calling schedule_work() to invoke my actual handler outside of
interrupt time. That function calls the various PHY configuration
functions, configures the controller state appropriately, and then
enables interrupts before returning.

The entire interrupt function is:

static irqreturn_t phy_interrupt(int irq, void *dev_id, struct pt_regs *regs)
{
	struct net_device *dev = (struct net_device *) dev_id;
	struct gfar_private *priv = netdev_priv(dev);

	/* Clear the interrupt */
	mii_clear_phy_interrupt(priv->mii_info);

	/* Disable PHY interrupts */
	mii_configure_phy_interrupt(priv->mii_info, MII_INTERRUPT_DISABLED);

	/* Schedule the phy change */
	schedule_work(&priv->tq);

	return IRQ_HANDLED;
}

In the gfar_startup() function (called from gfar_open()), I initialize
the work queue:

	INIT_WORK(&priv->tq, gfar_phy_change, dev);

And later, I request the irq:

	if (priv->einfo->flags & GFAR_HAS_PHY_INTR) {
		if (request_irq(priv->einfo->interruptPHY, phy_interrupt,
				SA_SHIRQ, "phy_interrupt", mii_info->dev) < 0) {
			printk(KERN_ERR "%s: Can't get IRQ %d (PHY)\n",
				mii_info->dev->name, priv->einfo->interruptPHY);
		} else {
			mii_configure_phy_interrupt(priv->mii_info,
					MII_INTERRUPT_ENABLED);
			return;
		}
	}

(If requesting the interrupt fails, it uses a timer instead.)

Is this right? Am I missing a crucial step? I ask because this no
longer works. The driver never successfully invokes gfar_phy_change(),
and therefore never brings up the interface.
I have tried MANY, MANY things to get this working for the last 2+
weeks, and nothing has succeeded. I can detail what I have done so
far, if people think it will help, but for now I'm just seeing if
anyone notices a flaw in my code (code which, I might add, works in
2.6.8.1).

Is this just what I get for not using a "stable" kernel? If that's the
case, then should I be submitting a bug to someone, so they know there
may be a problem...somewhere?

Thanks for any help,

Andy Fleming
PowerPC Software Enablement
Freescale Semiconductor, Inc.

From herbert@gondor.apana.org.au Fri Sep 3 16:50:06 2004
Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Sep 2004 16:50:14 -0700 (PDT)
Received: from arnor.apana.org.au (mail@arnor.apana.org.au [203.14.152.115])
	by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i83No40i014551
	for ; Fri, 3 Sep 2004 16:50:05 -0700
Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail)
	by arnor.apana.org.au with esmtp (Exim 3.35 #1 (Debian))
	id 1C3NoP-0002T3-00; Sat, 04 Sep 2004 09:49:45 +1000
Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian))
	id 1C3NoM-0001am-00; Sat, 04 Sep 2004 09:49:42 +1000
Date: Sat, 4 Sep 2004 09:49:41 +1000
To: Stephen Hemminger
Cc: "David S. Miller" , netdev@oss.sgi.com
Subject: Re: neigh_create/inetdev_destroy race?
Message-ID: <20040903234941.GA26247@gondor.apana.org.au>
References: <20040815191450.77532d5d.davem@redhat.com>
	<20040816105131.GA11299@gondor.apana.org.au>
	<20040828234201.79556f6e.davem@davemloft.net>
	<20040829065031.GA786@gondor.apana.org.au>
	<20040830230820.7514985d.davem@davemloft.net>
	<20040831104139.GA2124@gondor.apana.org.au>
	<20040901222118.0ce4bcc6.davem@davemloft.net>
	<20040902130605.GA32570@gondor.apana.org.au>
	<20040903133623.GA23179@gondor.apana.org.au>
	<20040903090053.22c67bb9@dell_ss3.pdx.osdl.net>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="DocE+STaALJfprDB"
Content-Disposition: inline
In-Reply-To: <20040903090053.22c67bb9@dell_ss3.pdx.osdl.net>
User-Agent: Mutt/1.5.6+20040722i
From: Herbert Xu
X-archive-position: 8396
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: herbert@gondor.apana.org.au
Precedence: bulk
X-list: netdev

--DocE+STaALJfprDB
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Fri, Sep 03, 2004 at 09:00:53AM -0700, Stephen Hemminger wrote:
>
> > You'll also notice that I've put all dereferences of dev->*_ptr under
> > the rcu_read_lock(). Without this we may get a neigh_parms that's
> > already been released.
>
> I haven't looked at the exact code in detail, but don't you need
> use rcu_dereference() as well to make sure and get the smp_read_barrier_depends
> on Alpha.

Not really because we're not depending on *dev->neigh_parms to be set
to NULL on shutdown. In fact *dev->neigh_parms never gets set to NULL
at all. If it did we'd have trouble cleaning up dead entries from the
hash table. So there is no data-dependent read here whose order must
be preserved when *dev is destroyed.

But hang on a second, I had forgotten about the creation path. Indeed
that is buggy without a barrier for every path except IPv6.
Without the barrier, we may be reading NULL pointers from parms which
may result in stale neigh entries lingering around. Or worse we may
read complete garbage that was there before the memset on *dev was
done. Fortunately the last bit probably can only be triggered if
you're stepping through gdb :)

So here is a patch to make sure that there is a barrier between the
reading of dev->*_ptr and *dev->neigh_parms.

With these barriers in place, it's clear that *dev->neigh_parms can no
longer be NULL since once the parms are allocated, that pointer is
never reset to NULL again. Therefore I've also removed the parms check
in these paths. They were bogus to begin with since if they ever
triggered then we'll have dead neigh entries stuck in the hash table.

Unfortunately I couldn't arrange for this to happen with DECnet due to
the dn_db->parms.up() call that's sandwiched between the assignment of
dev->dn_ptr and dn_db->neigh_parms. So I've kept the parms check there
but it will now fail instead of continuing. I've also added an
smp_wmb() there so that at least we won't be reading garbage from
dn_db->neigh_parms.

DECnet is also buggy since there is no locking at all in the
destruction path. It either needs locking or RCU like IPv4.
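
The ordering described above can be sketched in userspace as a
publish/read pair. This is only an illustration under stated
assumptions, not the kernel code: struct parms, struct dev_private,
dev_ptr, publish_dev() and read_parms() are hypothetical names, and the
C11 release store and acquire load merely stand in for smp_wmb() and
rcu_dereference() respectively.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for neigh_parms and the per-protocol device data. */
struct parms { int reachable_time; };
struct dev_private { struct parms *neigh_parms; };

/* Plays the role of dev->*_ptr: NULL until the private data is published. */
static _Atomic(struct dev_private *) dev_ptr;

/*
 * Writer: fill in the structure first, then publish the pointer.
 * The release store stands in for smp_wmb() followed by the assignment.
 */
static void publish_dev(struct dev_private *db, struct parms *p)
{
	db->neigh_parms = p;
	atomic_store_explicit(&dev_ptr, db, memory_order_release);
}

/*
 * Reader: load the pointer once, with ordering, before touching fields.
 * The acquire load stands in for rcu_dereference().
 */
static struct parms *read_parms(void)
{
	struct dev_private *db =
		atomic_load_explicit(&dev_ptr, memory_order_acquire);

	if (db == NULL)
		return NULL;		/* nothing published yet */
	return db->neigh_parms;		/* set before the pointer became visible */
}
```

A reader that sees a non-NULL dev_ptr is then guaranteed to see the
neigh_parms value stored before publication. Note that this is the
ideal case where neigh_parms is assigned before the pointer is
published; as described above, DECnet assigns dn_db->neigh_parms after
dev->dn_ptr, which is why the parms check is kept there.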
Signed-off-by: Herbert Xu Thanks a lot, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt --DocE+STaALJfprDB Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename=p ===== drivers/s390/net/qeth_main.c 1.14 vs edited ===== --- 1.14/drivers/s390/net/qeth_main.c 2004-09-04 08:22:06 +10:00 +++ edited/drivers/s390/net/qeth_main.c 2004-09-04 09:17:17 +10:00 @@ -6718,17 +6718,15 @@ } rcu_read_lock(); - in_dev = __in_dev_get(dev); + in_dev = rcu_dereference(__in_dev_get(dev)); if (in_dev == NULL) { rcu_read_unlock(); return -EINVAL; } parms = in_dev->arp_parms; - if (parms) { - __neigh_parms_put(neigh->parms); - neigh->parms = neigh_parms_clone(parms); - } + __neigh_parms_put(neigh->parms); + neigh->parms = neigh_parms_clone(parms); rcu_read_unlock(); neigh->type = inet_addr_type(*(u32 *) neigh->primary_key); ===== net/atm/clip.c 1.37 vs edited ===== --- 1.37/net/atm/clip.c 2004-09-04 08:22:07 +10:00 +++ edited/net/atm/clip.c 2004-09-04 09:18:22 +10:00 @@ -320,17 +320,15 @@ if (neigh->type != RTN_UNICAST) return -EINVAL; rcu_read_lock(); - in_dev = __in_dev_get(dev); + in_dev = rcu_dereference(__in_dev_get(dev)); if (!in_dev) { rcu_read_unlock(); return -EINVAL; } parms = in_dev->arp_parms; - if (parms) { - __neigh_parms_put(neigh->parms); - neigh->parms = neigh_parms_clone(parms); - } + __neigh_parms_put(neigh->parms); + neigh->parms = neigh_parms_clone(parms); rcu_read_unlock(); neigh->ops = &clip_neigh_ops; ===== net/decnet/dn_dev.c 1.25 vs edited ===== --- 1.25/net/decnet/dn_dev.c 2004-09-04 08:22:07 +10:00 +++ edited/net/decnet/dn_dev.c 2004-09-04 09:36:51 +10:00 @@ -41,6 +41,7 @@ #include #include #include +#include #include #include #include @@ -1108,6 +1109,7 @@ memset(dn_db, 0, sizeof(struct dn_dev)); memcpy(&dn_db->parms, p, sizeof(struct dn_dev_parms)); + smp_wmb(); dev->dn_ptr = dn_db; dn_db->dev = dev; 
init_timer(&dn_db->timer); ===== net/decnet/dn_neigh.c 1.11 vs edited ===== --- 1.11/net/decnet/dn_neigh.c 2004-09-04 08:22:07 +10:00 +++ edited/net/decnet/dn_neigh.c 2004-09-04 09:33:12 +10:00 @@ -139,17 +139,20 @@ struct neigh_parms *parms; rcu_read_lock(); - dn_db = dev->dn_ptr; + dn_db = rcu_dereference(dev->dn_ptr); if (dn_db == NULL) { rcu_read_unlock(); return -EINVAL; } parms = dn_db->neigh_parms; - if (parms) { - __neigh_parms_put(neigh->parms); - neigh->parms = neigh_parms_clone(parms); + if (!parms) { + rcu_read_unlock(); + return -EINVAL; } + + __neigh_parms_put(neigh->parms); + neigh->parms = neigh_parms_clone(parms); rcu_read_unlock(); if (dn_db->use_long) ===== net/ipv4/arp.c 1.45 vs edited ===== --- 1.45/net/ipv4/arp.c 2004-09-04 08:22:07 +10:00 +++ edited/net/ipv4/arp.c 2004-09-04 09:17:46 +10:00 @@ -244,17 +244,15 @@ neigh->type = inet_addr_type(addr); rcu_read_lock(); - in_dev = __in_dev_get(dev); + in_dev = rcu_dereference(__in_dev_get(dev)); if (in_dev == NULL) { rcu_read_unlock(); return -EINVAL; } parms = in_dev->arp_parms; - if (parms) { - __neigh_parms_put(neigh->parms); - neigh->parms = neigh_parms_clone(parms); - } + __neigh_parms_put(neigh->parms); + neigh->parms = neigh_parms_clone(parms); rcu_read_unlock(); if (dev->hard_header == NULL) { ===== net/ipv6/ndisc.c 1.88 vs edited ===== --- 1.88/net/ipv6/ndisc.c 2004-09-04 08:22:07 +10:00 +++ edited/net/ipv6/ndisc.c 2004-09-04 09:28:12 +10:00 @@ -297,10 +297,8 @@ } parms = in6_dev->nd_parms; - if (parms) { - __neigh_parms_put(neigh->parms); - neigh->parms = neigh_parms_clone(parms); - } + __neigh_parms_put(neigh->parms); + neigh->parms = neigh_parms_clone(parms); rcu_read_unlock(); neigh->type = is_multicast ? 
RTN_MULTICAST : RTN_UNICAST;

--DocE+STaALJfprDB--

From jgarzik@infradead.org Fri Sep 3 19:56:52 2004
Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Sep 2004 19:56:57 -0700 (PDT)
Received: from canuck.infradead.org (canuck.infradead.org [205.233.218.70])
	by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i842upX2017313
	for ; Fri, 3 Sep 2004 19:56:52 -0700
Received: from jgarzik by canuck.infradead.org with local (Exim 4.33 #1 (Red Hat Linux))
	id 1C3QjI-00074m-PP; Fri, 03 Sep 2004 22:56:40 -0400
Date: Fri, 3 Sep 2004 22:56:40 -0400
From: Jeff Garzik
To: akpm@osdl.org, torvalds@osdl.org
Cc: netdev@oss.sgi.com
Subject: [BK PATCHES] 2.6.x net driver fixes
Message-ID: <20040904025640.GA27001@canuck.infradead.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.4.1i
X-archive-position: 8397
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: jgarzik@pobox.com
Precedence: bulk
X-list: netdev

Please do a

	bk pull bk://gkernel.bkbits.net/net-drivers-2.6

This will update the following files:

 drivers/net/3c527.c | 13 ++----
 drivers/net/8139cp.c | 2 -
 drivers/net/forcedeth.c | 78 +++++++++++++++++++++++++++--------------
 drivers/net/r8169.c | 2 -
 drivers/net/wireless/airo.c | 1
 drivers/net/wireless/wavelan.c | 11 +++--
 6 files changed, 66 insertions(+), 41 deletions(-)

through these ChangeSets:

(04/09/03 1.2021)
   [PATCH] fix media detection for nForce 2 nics

   attached is a patch that polls the media setting for non GigE nForce
   nics: Without polling, media changes are not autodetected. This is
   fatal, because the nic initialization is asynchronous, thus
   "modprobe;ifup" resulted in a dead network connection. The attached
   patch fixes that problem.

   It's a repost of a patch I sent around three weeks ago: you objected
   that I rely on the nic irq instead of a software timer. I've
   documented why this is ok.
(04/09/03 1.2020) [PATCH] airo build fix drivers/net/wireless/airo.c: In function `issuecommand': drivers/net/wireless/airo.c:3812: warning: implicit declaration of function `kernel_locked' *** Warning: "kernel_locked" [drivers/net/wireless/airo.ko] undefined! Signed-off-by: Andrew Morton (04/09/03 1.2019) [PATCH] wavelan uninitalised var. This seems a little odd, printing out the value of a variable we haven't read yet. Signed-off-by: Dave Jones (04/09/03 1.2018) [PATCH] 3c527 possible oops. If the alloc_skb() fails, we dereference it in the skb_reserve() call. Move the skb_reserve() call to after the NULL check. Also clean up some CodingStyle violations whilst in the vicinity. Signed-off-by: Dave Jones (04/08/31 1.1860.2.1) [netdrvr 8139cp,r8169] fix dma_addr_t sizeof test diff -Nru a/drivers/net/3c527.c b/drivers/net/3c527.c --- a/drivers/net/3c527.c 2004-09-03 22:20:03 -04:00 +++ b/drivers/net/3c527.c 2004-09-03 22:20:03 -04:00 @@ -751,18 +751,15 @@ rx_base=lp->rx_chain; - for(i=0; irx_ring[i].skb=alloc_skb(1532, GFP_KERNEL); - skb_reserve(lp->rx_ring[i].skb, 18); - - if(lp->rx_ring[i].skb==NULL) - { - for(;i>=0;i--) + if (lp->rx_ring[i].skb==NULL) { + for (;i>=0;i--) kfree_skb(lp->rx_ring[i].skb); return -ENOBUFS; } - + skb_reserve(lp->rx_ring[i].skb, 18); + p=isa_bus_to_virt(lp->base+rx_base); p->control=0; diff -Nru a/drivers/net/8139cp.c b/drivers/net/8139cp.c --- a/drivers/net/8139cp.c 2004-09-03 22:20:03 -04:00 +++ b/drivers/net/8139cp.c 2004-09-03 22:20:03 -04:00 @@ -1698,7 +1698,7 @@ } /* Configure DMA attributes. 
*/ - if ((sizeof(dma_addr_t) > 32) && + if ((sizeof(dma_addr_t) > 4) && !pci_set_consistent_dma_mask(pdev, 0xffffffffffffffffULL) && !pci_set_dma_mask(pdev, 0xffffffffffffffffULL)) { pci_using_dac = 1; diff -Nru a/drivers/net/forcedeth.c b/drivers/net/forcedeth.c --- a/drivers/net/forcedeth.c 2004-09-03 22:20:03 -04:00 +++ b/drivers/net/forcedeth.c 2004-09-03 22:20:03 -04:00 @@ -75,6 +75,7 @@ * added CK804/MCP04 device IDs, code fixes * for registers, link status and other minor fixes. * 0.28: 21 Jun 2004: Big cleanup, making driver mostly endian safe + * 0.29: 31 Aug 2004: Add backup timer for link change notification. * * Known bugs: * We suspect that on some hardware no TX done interrupts are generated. @@ -86,7 +87,7 @@ * DEV_NEED_TIMERIRQ will not harm you on sane hardware, only generating a few * superfluous timer interrupts from the nic. */ -#define FORCEDETH_VERSION "0.28" +#define FORCEDETH_VERSION "0.29" #define DRV_NAME "forcedeth" #include @@ -120,10 +121,11 @@ * Hardware access: */ -#define DEV_NEED_LASTPACKET1 0x0001 -#define DEV_IRQMASK_1 0x0002 -#define DEV_IRQMASK_2 0x0004 -#define DEV_NEED_TIMERIRQ 0x0008 +#define DEV_NEED_LASTPACKET1 0x0001 /* set LASTPACKET1 in tx flags */ +#define DEV_IRQMASK_1 0x0002 /* use NVREG_IRQMASK_WANTED_1 for irq mask */ +#define DEV_IRQMASK_2 0x0004 /* use NVREG_IRQMASK_WANTED_2 for irq mask */ +#define DEV_NEED_TIMERIRQ 0x0008 /* set the timer irq flag in the irq mask */ +#define DEV_NEED_LINKTIMER 0x0010 /* poll link settings. Relies on the timer irq */ enum { NvRegIrqStatus = 0x000, @@ -367,6 +369,7 @@ #define OOM_REFILL (1+HZ/20) #define POLL_WAIT (1+HZ/100) +#define LINK_TIMEOUT (3*HZ) #define DESC_VER_1 0x0 #define DESC_VER_2 0x02100 @@ -446,6 +449,11 @@ struct timer_list oom_kick; struct timer_list nic_poll; + /* media detection workaround. + * Locking: Within irq hander or disable_irq+spin_lock(&np->lock); + */ + int need_linktimer; + unsigned long link_timeout; /* * tx specific fields. 
*/ @@ -1384,6 +1392,25 @@ return retval; } +static void nv_linkchange(struct net_device *dev) +{ + if (nv_update_linkspeed(dev)) { + if (netif_carrier_ok(dev)) { + nv_stop_rx(dev); + } else { + netif_carrier_on(dev); + printk(KERN_INFO "%s: link up.\n", dev->name); + } + nv_start_rx(dev); + } else { + if (netif_carrier_ok(dev)) { + netif_carrier_off(dev); + printk(KERN_INFO "%s: link down.\n", dev->name); + nv_stop_rx(dev); + } + } +} + static void nv_link_irq(struct net_device *dev) { u8 *base = get_hwbase(dev); @@ -1391,25 +1418,10 @@ miistat = readl(base + NvRegMIIStatus); writel(NVREG_MIISTAT_MASK, base + NvRegMIIStatus); - dprintk(KERN_DEBUG "%s: link change notification, status 0x%x.\n", dev->name, miistat); + dprintk(KERN_INFO "%s: link change irq, status 0x%x.\n", dev->name, miistat); - if (miistat & (NVREG_MIISTAT_LINKCHANGE)) { - if (nv_update_linkspeed(dev)) { - if (netif_carrier_ok(dev)) { - nv_stop_rx(dev); - } else { - netif_carrier_on(dev); - printk(KERN_INFO "%s: link up.\n", dev->name); - } - nv_start_rx(dev); - } else { - if (netif_carrier_ok(dev)) { - netif_carrier_off(dev); - printk(KERN_INFO "%s: link down.\n", dev->name); - nv_stop_rx(dev); - } - } - } + if (miistat & (NVREG_MIISTAT_LINKCHANGE)) + nv_linkchange(dev); dprintk(KERN_DEBUG "%s: link change notification done.\n", dev->name); } @@ -1452,6 +1464,12 @@ nv_link_irq(dev); spin_unlock(&np->lock); } + if (np->need_linktimer && time_after(jiffies, np->link_timeout)) { + spin_lock(&np->lock); + nv_linkchange(dev); + spin_unlock(&np->lock); + np->link_timeout = jiffies + LINK_TIMEOUT; + } if (events & (NVREG_IRQ_TX_ERR)) { dprintk(KERN_DEBUG "%s: received irq with events 0x%x. 
Probably TX fail.\n", dev->name, events); @@ -1816,6 +1834,14 @@ np->irqmask = NVREG_IRQMASK_WANTED_2; if (id->driver_data & DEV_NEED_TIMERIRQ) np->irqmask |= NVREG_IRQ_TIMER; + if (id->driver_data & DEV_NEED_LINKTIMER) { + dprintk(KERN_INFO "%s: link timer on.\n", pci_name(pci_dev)); + np->need_linktimer = 1; + np->link_timeout = jiffies + LINK_TIMEOUT; + } else { + dprintk(KERN_INFO "%s: link timer off.\n", pci_name(pci_dev)); + np->need_linktimer = 0; + } /* find a suitable phy */ for (i = 1; i < 32; i++) { @@ -1909,21 +1935,21 @@ .device = PCI_DEVICE_ID_NVIDIA_NVENET_1, .subvendor = PCI_ANY_ID, .subdevice = PCI_ANY_ID, - .driver_data = DEV_IRQMASK_1|DEV_NEED_TIMERIRQ, + .driver_data = DEV_IRQMASK_1|DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, }, { /* nForce2 Ethernet Controller */ .vendor = PCI_VENDOR_ID_NVIDIA, .device = PCI_DEVICE_ID_NVIDIA_NVENET_2, .subvendor = PCI_ANY_ID, .subdevice = PCI_ANY_ID, - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ, + .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, }, { /* nForce3 Ethernet Controller */ .vendor = PCI_VENDOR_ID_NVIDIA, .device = PCI_DEVICE_ID_NVIDIA_NVENET_3, .subvendor = PCI_ANY_ID, .subdevice = PCI_ANY_ID, - .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ, + .driver_data = DEV_NEED_LASTPACKET1|DEV_IRQMASK_2|DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER, }, { /* nForce3 Ethernet Controller */ .vendor = PCI_VENDOR_ID_NVIDIA, diff -Nru a/drivers/net/r8169.c b/drivers/net/r8169.c --- a/drivers/net/r8169.c 2004-09-03 22:20:03 -04:00 +++ b/drivers/net/r8169.c 2004-09-03 22:20:03 -04:00 @@ -983,7 +983,7 @@ tp->cp_cmd = PCIMulRW | RxChkSum; - if ((sizeof(dma_addr_t) > 32) && + if ((sizeof(dma_addr_t) > 4) && !pci_set_dma_mask(pdev, DMA_64BIT_MASK)) tp->cp_cmd |= PCIDAC; else { diff -Nru a/drivers/net/wireless/airo.c b/drivers/net/wireless/airo.c --- a/drivers/net/wireless/airo.c 2004-09-03 22:20:03 -04:00 +++ b/drivers/net/wireless/airo.c 2004-09-03 
22:20:03 -04:00 @@ -25,6 +25,7 @@ #include #include #include +#include #include #include diff -Nru a/drivers/net/wireless/wavelan.c b/drivers/net/wireless/wavelan.c --- a/drivers/net/wireless/wavelan.c 2004-09-03 22:20:03 -04:00 +++ b/drivers/net/wireless/wavelan.c 2004-09-03 22:20:03 -04:00 @@ -3822,17 +3822,18 @@ if ((hasr & HASR_MMC_INTR) && (lp->hacr & HACR_MMC_INT_ENABLE)) { u8 dce_status; -#ifdef DEBUG_INTERRUPT_ERROR - printk(KERN_INFO - "%s: wavelan_interrupt(): unexpected mmc interrupt: status 0x%04x.\n", - dev->name, dce_status); -#endif /* * Interrupt from the modem management controller. * This will clear it -- ignored for now. */ mmc_read(ioaddr, mmroff(0, mmr_dce_status), &dce_status, sizeof(dce_status)); + +#ifdef DEBUG_INTERRUPT_ERROR + printk(KERN_INFO + "%s: wavelan_interrupt(): unexpected mmc interrupt: status 0x%04x.\n", + dev->name, dce_status); +#endif } /* Check if not controller interrupt */ From hadi@cyberus.ca Sat Sep 4 06:20:31 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 04 Sep 2004 06:20:36 -0700 (PDT) Received: from lotus.znyx.com (znx208-2-156-007.znyx.com [208.2.156.7]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i84DKVQn007432 for ; Sat, 4 Sep 2004 06:20:31 -0700 Received: from localhost ([208.2.156.2]) by lotus.znyx.com (Lotus Domino Release 5.0.11) with ESMTP id 2004090406215222:29254 ; Sat, 4 Sep 2004 06:21:52 -0700 Subject: Re: [PATCH 2.6] watch64: generic variable monitoring system From: jamal Reply-To: hadi@cyberus.ca To: "Josef 'Jeff' Sipek" Cc: Stephen Hemminger , linux-kernel@vger.kernel.org, netdev@oss.sgi.com In-Reply-To: <200409031744.32970.jeffpc@optonline.net> References: <200409031307.01240.jeffpc@optonline.net> <200409031319.24863.jeffpc@optonline.net> <20040903121657.355a6a8b@dell_ss3.pdx.osdl.net> <200409031744.32970.jeffpc@optonline.net> Organization: jamalopolis Message-Id: <1094303999.1633.116.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 04 Sep 2004 09:19:59 -0400 
X-MIMETrack: Itemize by SMTP Server on Lotus/Znyx(Release 5.0.11 |July 24, 2002) at 09/04/2004 06:21:52 AM, Serialize by Router on Lotus/Znyx(Release 5.0.11 |July 24, 2002) at 09/04/2004 06:21:55 AM, Serialize complete at 09/04/2004 06:21:55 AM Content-Transfer-Encoding: 7bit Content-Type: text/plain X-archive-position: 8398 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev I have a feeling this was discussed somewhere(other than netdev) and i missed it. Why isnt this watch64 being done in user space? cheers, jamal On Fri, 2004-09-03 at 17:44, Josef 'Jeff' Sipek wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On Friday 03 September 2004 15:16, Stephen Hemminger wrote: > > - Code doesn't match the kernel style (read Documentation/CodingStyle) > > Sorry about the white space, KMail apparently likes to butcher the text. These > are the same patches with the little cleanup update. > > Jeff. > > - -- > Reality is merely an illusion, albeit a very persistent one. > - Albert Einstein > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.2.5 (GNU/Linux) > > iD4DBQFBOOW+wFP0+seVj/4RAgSiAJj54qcqdEx66lbMW9ik0XviupTNAKC82an1 > R0pGX0pTBZ78NWrZpxJm+w== > =EesC > -----END PGP SIGNATURE----- From ak@muc.de Sat Sep 4 06:28:20 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 04 Sep 2004 06:28:24 -0700 (PDT) Received: from colin2.muc.de (qmailr@colin2.muc.de [193.149.48.15]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id i84DSJ1o007817 for ; Sat, 4 Sep 2004 06:28:19 -0700 Received: (qmail 42471 invoked by uid 3709); 4 Sep 2004 13:28:09 -0000 Date: 4 Sep 2004 15:28:09 +0200 Date: Sat, 4 Sep 2004 15:28:09 +0200 From: Andi Kleen To: "David S. 
Miller" Cc: kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com, akepner@sgi.com Subject: Re: [PATCH] Extend lock less TX to real devices Message-ID: <20040904132809.GB33964@muc.de> References: <20040901223301.1a8d97a8.davem@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20040901223301.1a8d97a8.davem@redhat.com> User-Agent: Mutt/1.4.1i X-archive-position: 8399 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ak@muc.de Precedence: bulk X-list: netdev On Wed, Sep 01, 2004 at 10:33:01PM -0700, David S. Miller wrote: > On Tue, 31 Aug 2004 14:38:20 +0200 > Andi Kleen wrote: > > > This patch extends the recently added NETIF_F_LLTX to real devices. > > Well, it does a lot of other things too. Not really, it all works to the same goal. > > > I added support for trylocking instead of spinning like sch_generic > > does - for that the driver has to return -1, then the packet is requeued. > > The check for a local device deadlock is lost for this case, > > but that doesn't seem to be a big loss (I've never seen this printk > > ever get triggered) > > It is triggerable if you misconfigure your system. Really? The only reason I can see for it is a buggy driver. > I'm totally against this change, because previously at There is no change, except for drivers that set LLTX and these get different semantics anyways because they have to handle this on their own. In case the driver has bugs I guess it would be better to add the printk directly below the try_lock in the LLTX driver. > least the user would find out in their logs. With your > change the system explodes looping with no explanation why. Hmm, I guess if you're really worried about this class of driver bugs being common adding some real error handling for it (like bailing out and disabling the device) would be the far better option. 
> > > The patch looks bigger than it really is because i moved some code > > around and converted the macros into inlines. > .. > > I also did an additional micro optimization: > > And for this reason you need to split this patch up. > I would recommend: > > patch 1) Change macros into inlines > patch 2) local_bh_disable() preemption count optimization > patch 3) support for F_LLTX on real devices > patch 4) locking changes At least (3) and (4) are the same thing. I can drop the inlines, it was only for making the code clearer and less ugly but is not essential for the optimizations. -Andi From ak@suse.de Sat Sep 4 06:54:55 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 04 Sep 2004 06:55:00 -0700 (PDT) Received: from Cantor.suse.de (cantor.suse.de [195.135.220.2]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i84Dsrud008610 for ; Sat, 4 Sep 2004 06:54:54 -0700 Received: from hermes.suse.de (hermes-ext.suse.de [195.135.221.8]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by Cantor.suse.de (Postfix) with ESMTP id C6BCEB7E597; Sat, 4 Sep 2004 15:54:39 +0200 (CEST) Date: Sat, 4 Sep 2004 15:54:39 +0200 From: Andi Kleen To: davem@redhat.com, netdev@oss.sgi.com Subject: [PATCH] Do less atomic count changes in dev_queue_xmit Message-ID: <20040904135439.GA23934@wotan.suse.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-archive-position: 8400 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ak@suse.de Precedence: bulk X-list: netdev Do a single local_bh_disable and a single local_bh_enable instead of changing the atomic count all the time in dev_queue_xmit. Should mostly benefit preemptible kernels, but others see some small improvements too. 
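[Editorial sketch, not part of the original mail.] The saving described above can be illustrated with a toy user-space model of the q->enqueue path. It counts only the explicit disable/enable calls visible in the diff that follows. The struct and every toy_* name are hypothetical, rcu_read_lock() is assumed to touch the count as on a preemptible kernel, and whatever plain spin_lock() does internally is ignored.

```c
/* Toy model only: counts explicit disable/enable calls, not real kernel behaviour. */
struct toy_cpu {
    int disable_depth;  /* nesting depth of "softirqs/preemption disabled" */
    int count_updates;  /* how many times the depth was modified */
};

void toy_disable(struct toy_cpu *c) { c->disable_depth++; c->count_updates++; }
void toy_enable(struct toy_cpu *c)  { c->disable_depth--; c->count_updates++; }

/* Old q->enqueue path: the RCU section and the _bh lock each touch the count. */
int toy_old_enqueue_path(struct toy_cpu *c)
{
    c->count_updates = 0;
    toy_disable(c);  /* rcu_read_lock()                  */
    toy_disable(c);  /* spin_lock_bh(&dev->queue_lock)   */
    toy_enable(c);   /* spin_unlock_bh(&dev->queue_lock) */
    toy_enable(c);   /* rcu_read_unlock()                */
    return c->count_updates;
}

/* New q->enqueue path: one outer disable/enable pair, plain lock inside. */
int toy_new_enqueue_path(struct toy_cpu *c)
{
    c->count_updates = 0;
    toy_disable(c);  /* local_bh_disable()                              */
                     /* spin_lock()/spin_unlock(): no update in the toy */
    toy_enable(c);   /* local_bh_enable() at the out: label             */
    return c->count_updates;
}
```

In this model the old path modifies the count four times per packet and the new path twice. The real gain depends on kernel configuration.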
diff -u linux-2.6.8/net/core/dev.c-o linux-2.6.8/net/core/dev.c --- linux-2.6.8/net/core/dev.c-o 2004-09-04 13:10:47.000000000 +0000 +++ linux-2.6.8/net/core/dev.c 2004-09-04 13:47:16.765722813 +0000 @@ -1249,14 +1249,14 @@ return 0; } -#define HARD_TX_LOCK_BH(dev, cpu) { \ +#define HARD_TX_LOCK(dev, cpu) { \ if ((dev->features & NETIF_F_LLTX) == 0) { \ spin_lock_bh(&dev->xmit_lock); \ dev->xmit_lock_owner = cpu; \ } \ } -#define HARD_TX_UNLOCK_BH(dev) { \ +#define HARD_TX_UNLOCK(dev) { \ if ((dev->features & NETIF_F_LLTX) == 0) { \ dev->xmit_lock_owner = -1; \ spin_unlock_bh(&dev->xmit_lock); \ @@ -1313,7 +1313,12 @@ if (skb_checksum_help(&skb, 0)) goto out_kfree_skb; - rcu_read_lock(); + + /* Disable soft irqs for various locks below. Also + * stops preemption for RCU. + */ + local_bh_disable(); + /* Updates of qdisc are serialized by queue_lock. * The struct Qdisc which is pointed to by qdisc is now a * rcu structure - it may be accessed without acquiring @@ -1332,18 +1337,16 @@ #endif if (q->enqueue) { /* Grab device queue */ - spin_lock_bh(&dev->queue_lock); + spin_lock(&dev->queue_lock); rc = q->enqueue(skb, q); qdisc_run(dev); - spin_unlock_bh(&dev->queue_lock); - rcu_read_unlock(); + spin_unlock(&dev->queue_lock); rc = rc == NET_XMIT_BYPASS ? NET_XMIT_SUCCESS : rc; goto out; } - rcu_read_unlock(); /* The device has no queue. Common case for software devices: loopback, all the sorts of tunnels... 
@@ -1358,12 +1361,11 @@ Either shot noqueue qdisc, it is even simpler 8) */ if (dev->flags & IFF_UP) { - int cpu = get_cpu(); + int cpu = smp_processor_id(); /* ok because BHs are off */ if (dev->xmit_lock_owner != cpu) { - HARD_TX_LOCK_BH(dev, cpu); - put_cpu(); + HARD_TX_LOCK(dev, cpu); if (!netif_queue_stopped(dev)) { if (netdev_nit) @@ -1371,17 +1373,16 @@ rc = 0; if (!dev->hard_start_xmit(skb, dev)) { - HARD_TX_UNLOCK_BH(dev); + HARD_TX_UNLOCK(dev); goto out; } } - HARD_TX_UNLOCK_BH(dev); + HARD_TX_UNLOCK(dev); if (net_ratelimit()) printk(KERN_CRIT "Virtual device %s asks to " "queue packet!\n", dev->name); goto out_enetdown; } else { - put_cpu(); /* Recursion is detected! It is possible, * unfortunately */ if (net_ratelimit()) @@ -1394,6 +1395,7 @@ out_kfree_skb: kfree_skb(skb); out: + local_bh_enable(); return rc; } From hadi@cyberus.ca Sat Sep 4 07:12:20 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 04 Sep 2004 07:12:25 -0700 (PDT) Received: from lotus.znyx.com (znx208-2-156-007.znyx.com [208.2.156.7]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i84ECK2R009182 for ; Sat, 4 Sep 2004 07:12:20 -0700 Received: from localhost ([208.2.156.2]) by lotus.znyx.com (Lotus Domino Release 5.0.11) with ESMTP id 2004090407134204:29279 ; Sat, 4 Sep 2004 07:13:42 -0700 Subject: Re: [PATCH] Extend lock less TX to real devices From: jamal Reply-To: hadi@cyberus.ca To: Andi Kleen Cc: "David S. 
Miller" , Alexey , netdev@oss.sgi.com, akepner@sgi.com In-Reply-To: <20040904132809.GB33964@muc.de> References: <20040901223301.1a8d97a8.davem@redhat.com> <20040904132809.GB33964@muc.de> Organization: jamalopolis Message-Id: <1094307106.1634.147.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 04 Sep 2004 10:11:46 -0400 X-MIMETrack: Itemize by SMTP Server on Lotus/Znyx(Release 5.0.11 |July 24, 2002) at 09/04/2004 07:13:42 AM, Serialize by Router on Lotus/Znyx(Release 5.0.11 |July 24, 2002) at 09/04/2004 07:13:44 AM, Serialize complete at 09/04/2004 07:13:44 AM Content-Transfer-Encoding: 7bit Content-Type: text/plain X-archive-position: 8401 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Sat, 2004-09-04 at 09:28, Andi Kleen wrote: > On Wed, Sep 01, 2004 at 10:33:01PM -0700, David S. Miller wrote: > > On Tue, 31 Aug 2004 14:38:20 +0200 > > Andi Kleen wrote: > > > > > This patch extends the recently added NETIF_F_LLTX to real devices. > > > > Well, it does a lot of other things too. > > Not really, it all works to the same goal. Must be my sleep deprivation - what is LLTX again? > > least the user would find out in their logs. With your > > change the system explodes looping with no explanation why. > > Hmm, I guess if you're really worried about this class > of driver bugs being common adding some real error handling > for it (like bailing out and disabling the device) would > be the far better option. Actually that message is pretty useful. I have seen at least a handful of badly written drivers do that. Was also very useful for me when I was doing the tc extensions. I was able to catch a few bugs - so not just driver related. 
> > patch 1) Change macros into inlines > > patch 2) local_bh_disable() preemption count optimization > > patch 3) support for F_LLTX on real devices > > patch 4) locking changes > > At least (3) and (4) are the same thing. I can drop the > inlines, it was only for making the code clearer and less ugly > but is not essential for the optimizations. do you guys mind if i test these patches/patch out first before final inclusion? Next weekend i will have the chance. cheers, jamal From ak@suse.de Sat Sep 4 07:27:08 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 04 Sep 2004 07:27:13 -0700 (PDT) Received: from Cantor.suse.de (cantor.suse.de [195.135.220.2]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i84ER76U009692 for ; Sat, 4 Sep 2004 07:27:07 -0700 Received: from hermes.suse.de (hermes-ext.suse.de [195.135.221.8]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by Cantor.suse.de (Postfix) with ESMTP id 0D877B7DB65; Sat, 4 Sep 2004 16:24:14 +0200 (CEST) Date: Sat, 4 Sep 2004 16:24:04 +0200 From: Andi Kleen To: jamal Cc: Andi Kleen , "David S. Miller" , Alexey , netdev@oss.sgi.com, akepner@sgi.com Subject: Re: [PATCH] Extend lock less TX to real devices Message-ID: <20040904142404.GA6850@wotan.suse.de> References: <20040901223301.1a8d97a8.davem@redhat.com> <20040904132809.GB33964@muc.de> <1094307106.1634.147.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1094307106.1634.147.camel@jzny.localdomain> X-archive-position: 8402 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ak@suse.de Precedence: bulk X-list: netdev On Sat, Sep 04, 2004 at 10:11:46AM -0400, jamal wrote: > On Sat, 2004-09-04 at 09:28, Andi Kleen wrote: > > On Wed, Sep 01, 2004 at 10:33:01PM -0700, David S. 
Miller wrote: > > > On Tue, 31 Aug 2004 14:38:20 +0200 > > > Andi Kleen wrote: > > > > > > > This patch extends the recently added NETIF_F_LLTX to real devices. > > > > > > Well, it does a lot of other things too. > > > > Not really, it all works to the same goal. > > Must be my sleep deprivation - what is LLTX again? NETIF_F_LLTX - a new flag that tells the stack that the driver doesn't want the stack to take the xmit lock for it. > > > > > least the user would find out in their logs. With your > > > change the system explodes looping with no explanation why. > > > > Hmm, I guess if you're really worried about this class > > of driver bugs being common adding some real error handling > > for it (like bailing out and disabling the device) would > > be the far better option. > > Actually that message is pretty useful. > I have seen at least a handful of badly written drivers do that. They will still print that, no problem. > > > > patch 1) Change macros into inlines > > > patch 2) local_bh_disable() preemption count optimization > > > patch 3) support for F_LLTX on real devices > > > patch 4) locking changes > > > > At least (3) and (4) are the same thing. I can drop the > > inlines, it was only for making the code clearer and less ugly > > but is not essential for the optimizations. > > do you guys mind if i test these patches/patch out first before final > inclusion? Next weekend i will have the chance. You can do that, but they won't do much unless your driver sets NETIF_F_LLTX. 
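[Editorial sketch, not part of the original mail.] The trylock-and-requeue idea discussed in this thread (an LLTX driver takes its own lock, and returns -1 when it cannot, so the stack requeues the packet) might look roughly like this in user-space C. This is a toy model under stated assumptions: every toy_* name is hypothetical, a C11 atomic_flag stands in for the driver's private spinlock, and none of it is the actual patch.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define TOY_TX_OK      0
#define TOY_TX_LOCKED (-1)  /* "driver has to return -1, then the packet is requeued" */

struct toy_dev {
    atomic_flag tx_busy;  /* stand-in for a driver-private TX lock */
    int sent;             /* packets handed to the "hardware" */
    int requeued;         /* packets the stack had to put back */
};

void toy_dev_init(struct toy_dev *dev)
{
    atomic_flag_clear(&dev->tx_busy);
    dev->sent = 0;
    dev->requeued = 0;
}

/* Try to take the driver lock without spinning; true on success. */
bool toy_tx_trylock(struct toy_dev *dev)
{
    return !atomic_flag_test_and_set(&dev->tx_busy);
}

void toy_tx_unlock(struct toy_dev *dev)
{
    atomic_flag_clear(&dev->tx_busy);
}

/* Driver side: no stack-held xmit lock, so the driver locks for itself. */
int toy_hard_start_xmit(struct toy_dev *dev)
{
    if (!toy_tx_trylock(dev))
        return TOY_TX_LOCKED;
    dev->sent++;
    toy_tx_unlock(dev);
    return TOY_TX_OK;
}

/* Stack side: requeue instead of spinning when the driver reports contention. */
int toy_queue_xmit(struct toy_dev *dev)
{
    int rc = toy_hard_start_xmit(dev);

    if (rc == TOY_TX_LOCKED)
        dev->requeued++;
    return rc;
}
```

A driver with a bug in its own locking would simply keep returning the "locked" code. That is the case where a diagnostic printk next to the trylock, as suggested earlier in the thread, would help.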
-Andi From herbert@gondor.apana.org.au Sat Sep 4 12:40:41 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 04 Sep 2004 12:40:49 -0700 (PDT) Received: from arnor.apana.org.au (mail@arnor.apana.org.au [203.14.152.115]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i84Jeeam022183 for ; Sat, 4 Sep 2004 12:40:41 -0700 Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1C3gNk-0000EX-00; Sun, 05 Sep 2004 05:39:28 +1000 Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1C3gNb-0006DO-00; Sun, 05 Sep 2004 05:39:19 +1000 From: Herbert Xu To: ak@muc.de (Andi Kleen) Subject: Re: [PATCH] Extend lock less TX to real devices Cc: davem@redhat.com, kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com, akepner@sgi.com Organization: Core In-Reply-To: <20040904132809.GB33964@muc.de> X-Newsgroups: apana.lists.os.linux.netdev User-Agent: tin/1.7.4-20040225 ("Benbecula") (UNIX) (Linux/2.4.26-1-686-smp (i686)) Message-Id: Date: Sun, 05 Sep 2004 05:39:19 +1000 X-archive-position: 8403 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev Andi Kleen wrote: > >> > I added support for trylocking instead of spinning like sch_generic >> > does - for that the driver has to return -1, then the packet is requeued. >> > The check for a local device deadlock is lost for this case, >> > but that doesn't seem to be a big loss (I've never seen this printk >> > ever get triggered) >> >> It is triggerable if you misconfigure your system. > > Really? The only reason I can see for it is a buggy driver. Is this the dead loop message? If so it can happen with tunnels. 
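[Editorial sketch, not part of the original mail.] The check being discussed compares dev->xmit_lock_owner with the current CPU, as seen in the dev_queue_xmit() hunk earlier in this archive. A toy user-space model of how a tunnel-like device that transmits back into itself trips that check could look like the following. All toy_* names and the return value are hypothetical, and real locking is left out entirely.

```c
#define TOY_DEAD_LOOP (-1)  /* hypothetical error code for the toy model */

struct toy_netdev {
    int xmit_lock_owner;  /* -1 when nobody is transmitting, else a CPU id */
    int loops_detected;   /* how often the recursion check fired */
    int (*hard_start_xmit)(struct toy_netdev *dev, int cpu);
};

/* Models the no-queue path: refuse to re-enter on the CPU that owns the lock. */
int toy_dev_queue_xmit(struct toy_netdev *dev, int cpu)
{
    int rc;

    if (dev->xmit_lock_owner == cpu) {
        /* Same CPU is already inside this device's xmit: recursion. */
        dev->loops_detected++;
        return TOY_DEAD_LOOP;
    }
    dev->xmit_lock_owner = cpu;
    rc = dev->hard_start_xmit(dev, cpu);
    dev->xmit_lock_owner = -1;
    return rc;
}

/* A misconfigured tunnel whose output is routed back into itself. */
int toy_tunnel_xmit(struct toy_netdev *dev, int cpu)
{
    return toy_dev_queue_xmit(dev, cpu);
}

/* A well-behaved device that just "sends" the packet. */
int toy_plain_xmit(struct toy_netdev *dev, int cpu)
{
    (void)dev;
    (void)cpu;
    return 0;
}
```

In the toy the loop is caught on the first re-entry. In this model it is triggered by the routing setup of the tunnel device, not by a bug in a hardware driver.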
-- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt From han@mijncomputer.nl Sat Sep 4 13:21:20 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 04 Sep 2004 13:21:26 -0700 (PDT) Received: from boetes.org (cc15467-a.groni1.gr.home.nl [217.120.147.78]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id i84KLH2b023278 for ; Sat, 4 Sep 2004 13:21:18 -0700 Received: (qmail 1941 invoked by uid 1000); 4 Sep 2004 20:21:29 -0000 Date: Sat, 4 Sep 2004 22:21:07 +0200 From: Han Boetes To: Francois Romieu Cc: netdev@oss.sgi.com Subject: Re: [Fwd: rtl8169 driver from realtek] Message-ID: <20040904202129.GJ2387@boetes.org> References: <4139ED5B.2030002@pobox.com> <20040904182404.GB16875@electric-eye.fr.zoreil.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20040904182404.GB16875@electric-eye.fr.zoreil.com> User-Agent: Mutt/1.5.6i X-archive-position: 8404 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: han@mijncomputer.nl Precedence: bulk X-list: netdev Francois Romieu wrote: > Hello M. Boetes, Bonjour :) > Jeff was kind enough to forward your message to me. Very kind indeed. > Can you provide: > - the kernel/compiler version (vendor/hand built/modular kernel); One thing: this is with the driver from realtek. After this run I'll send you another reply with the normal kernel driver. So you are about to get a very similar mail from me. 
Linux marsupilami 2.6.9-rc1 #3 Thu Sep 2 21:39:26 CEST 2004 i686 unknown unknown GNU/Linux, a hand build static kernel, only one module ( nvidia driver ) Reading specs from /usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.3/specs Configured with: ../gcc-3.3.3/configure --prefix=/usr --enable-languages=c,c++,objc --enable-threads=posix --enable-__cxa_atexit --enable-clocale=gnu --enable-shared --disable-nls Thread model: posix gcc version 3.3.3 (CRUX) > - the complete dmesg after boot once the r8169 is ifconfig'ed up Linux version 2.6.9-rc1 (han@bereboot) (gcc version 3.3.3 (CRUX)) #3 Thu Sep 2 21:39:26 CEST 2004 BIOS-provided physical RAM map: BIOS-e820: 0000000000000000 - 000000000009fc00 (usable) BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved) BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved) BIOS-e820: 0000000000100000 - 000000002fff0000 (usable) BIOS-e820: 000000002fff0000 - 000000002fff8000 (ACPI data) BIOS-e820: 000000002fff8000 - 0000000030000000 (ACPI NVS) BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved) BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved) BIOS-e820: 00000000fff80000 - 0000000100000000 (reserved) 767MB LOWMEM available. On node 0 totalpages: 196592 DMA zone: 4096 pages, LIFO batch:1 Normal zone: 192496 pages, LIFO batch:16 HighMem zone: 0 pages, LIFO batch:1 DMI 2.3 present. Built 1 zonelists Kernel command line: BOOT_IMAGE=269rc1 ro root=305 acpi=off hdc=ide-cd hdd=ide-cd ide_setup: hdc=ide-cd ide_setup: hdd=ide-cd Initializing CPU#0 PID hash table entries: 4096 (order 12: 32768 bytes) Detected 1467.420 MHz processor. Using tsc for high-res timesource Console: colour dummy device 80x25 Dentry cache hash table entries: 131072 (order: 7, 524288 bytes) Inode-cache hash table entries: 65536 (order: 6, 262144 bytes) Memory: 774920k/786368k available (2321k kernel code, 10696k reserved, 710k data, 396k init, 0k highmem) Checking if this processor honours the WP bit even in supervisor mode... Ok. Calibrating delay loop... 
2891.77 BogoMIPS Mount-cache hash table entries: 512 (order: 0, 4096 bytes) CPU: After generic identify, caps: 0383fbff c1c3fbff 00000000 00000000 CPU: After vendor identify, caps: 0383fbff c1c3fbff 00000000 00000000 CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line) CPU: L2 Cache: 256K (64 bytes/line) CPU: After all inits, caps: 0383fbff c1c3fbff 00000000 00000020 Intel machine check architecture supported. Intel machine check reporting enabled on CPU#0. CPU: AMD Athlon(tm) XP 1700+ stepping 02 Enabling fast FPU save and restore... done. Enabling unmasked SIMD FPU exception support... done. Checking 'hlt' instruction... OK. ACPI: IRQ9 SCI: Edge set to Level Trigger. NET: Registered protocol family 16 PCI: PCI BIOS revision 2.10 entry at 0xfdaf1, last bus=1 PCI: Using configuration type 1 mtrr: v2.0 (20020519) ACPI: Subsystem revision 20040715 ACPI: Interpreter disabled. usbcore: registered new driver usbfs usbcore: registered new driver hub PCI: Probing PCI hardware PCI: Probing PCI hardware (bus 00) PCI: Using IRQ router default [1106/3147] at 0000:00:11.0 PCI: IRQ 0 for device 0000:00:11.1 doesn't match PIRQ mask - try pci=usepirqmask PCI: Hardcoded IRQ 14 for device 0000:00:11.1 get_random_bytes called before random driver initialization vesafb: framebuffer at 0xd0000000, mapped to 0xf0807000, size 3072k vesafb: mode is 1024x768x16, linelength=2048, pages=1 vesafb: protected mode interface info at c000:e4d0 vesafb: scrolling: redraw vesafb: Truecolor: size=0:5:6:5, shift=0:11:5:0 fb0: VESA VGA frame buffer device Machine check exception polling timer started. 
apm: BIOS version 1.2 Flags 0x03 (Driver version 1.16ac) apm: driver version: No APM present Console: switching to colour frame buffer device 128x48 Real Time Clock Driver v1.12 Linux agpgart interface v0.100 (c) Dave Jones agpgart: Detected VIA KT266/KY266x/KT333 chipset agpgart: Maximum main memory to use for agp memory: 690M agpgart: AGP aperture is 128M @ 0xe0000000 Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ sharing disabled RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 blocksize loop: loaded (max 8 devices) 8139too Fast Ethernet driver 0.9.27 eth0: RealTek RTL8139 at 0xf0b88f00, 00:e0:4c:67:52:80, IRQ 10 eth0: Identified 8139 chip type 'RTL-8139B' Universal TUN/TAP device driver 1.5 (C)1999-2002 Maxim Krasnyansky eth1: Identified chip type is 'RTL8169s/8110s'. eth1: RTL8169s/8110s Gigabit Ethernet driver 2.2 at 0xe400, 00:08:a1:3c:34:7a, IRQ 12 eth1: Auto-negotiation Enabled. eth1: 1000Mbps Full-duplex operation. Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2 ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx VP_IDE: IDE controller at PCI slot 0000:00:11.1 PCI: Hardcoded IRQ 14 for device 0000:00:11.1 VP_IDE: chipset revision 6 VP_IDE: not 100% native mode: will probe irqs later VP_IDE: VIA vt8233a (rev 00) IDE UDMA133 controller on pci0000:00:11.1 ide0: BM-DMA at 0xfc00-0xfc07, BIOS settings: hda:DMA, hdb:DMA ide1: BM-DMA at 0xfc08-0xfc0f, BIOS settings: hdc:DMA, hdd:pio hda: ST340823A, ATA DISK drive hdb: BDV 108A DVDROM, ATAPI CD/DVD-ROM drive Using anticipatory io scheduler ide0 at 0x1f0-0x1f7,0x3f6 on irq 14 hdc: LITE-ON LTR-40125S, ATAPI CD/DVD-ROM drive ide1 at 0x170-0x177,0x376 on irq 15 hda: max request size: 128KiB hda: 78165360 sectors (40020 MB) w/512KiB Cache, CHS=65535/16/63, UDMA(100) hda: cache flushes supported hda: hda1 hda2 < hda5 hda6 hda7 hda8 hda9 hda10 > hdb: ATAPI DVD-ROM drive, 512kB Cache, UDMA(33) Uniform CD-ROM driver Revision: 3.20 hdc: ATAPI 48X CD-ROM CD-R/RW drive, 
2048kB Cache, UDMA(33) ide-floppy driver 0.99.newide USB Universal Host Controller Interface driver v2.2 uhci_hcd 0000:00:11.2: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller uhci_hcd 0000:00:11.2: irq 5, io base 0000d800 uhci_hcd 0000:00:11.2: new USB bus registered, assigned bus number 1 hub 1-0:1.0: USB hub found hub 1-0:1.0: 2 ports detected uhci_hcd 0000:00:11.3: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (#2) uhci_hcd 0000:00:11.3: irq 5, io base 0000dc00 uhci_hcd 0000:00:11.3: new USB bus registered, assigned bus number 2 hub 2-0:1.0: USB hub found hub 2-0:1.0: 2 ports detected usbcore: registered new driver usbhid drivers/usb/input/hid-core.c: v2.0:USB HID core driver mice: PS/2 mouse device common for all mice serio: i8042 AUX port at 0x60,0x64 irq 12 serio: i8042 KBD port at 0x60,0x64 irq 1 input: AT Translated Set 2 keyboard on isa0060/serio0 Advanced Linux Sound Architecture Driver Version 1.0.4 (Mon May 17 14:31:44 2004 UTC). AC'97 0 analog subsections not ready ALSA device list: #0: Sound Blaster Live! (rev.7) at 0xe000, irq 10 NET: Registered protocol family 2 IP: routing cache hash table of 8192 buckets, 64Kbytes TCP: Hash tables configured (established 262144 bind 65536) NET: Registered protocol family 1 NET: Registered protocol family 17 usb 1-1: new low speed USB device using address 2 UDF-fs: No VRS found HID Mouse 0xc024 forced to 2 ms polling VFS: Mounted root (jfs filesystem) readonly. Freeing unused kernel memory: 396k freed input: USB HID v1.10 Mouse [B16_b_02 USB-PS/2 Optical Mouse] on usb-0000:00:11.2-1 Adding 257000k swap on /dev/hda6. 
Priority:-1 extents:1 > - the content of the /proc/interrupts file once the r8169 is ifconfig'ed up CPU0 0: 365527 XT-PIC timer 1: 1830 XT-PIC i8042 2: 0 XT-PIC cascade 5: 2310 XT-PIC uhci_hcd, uhci_hcd 8: 1 XT-PIC rtc 10: 0 XT-PIC EMU10K1 12: 2819 XT-PIC eth1 14: 3207 XT-PIC ide0 15: 20 XT-PIC ide1 NMI: 0 ERR: 0 > - a short description of the "does not work" I'm terribly sorry I didn't provide it. This is the description of what goes wrong with the realtek-driver: One single transfer goes ok as far as I can tell. But once I start doing multiple things, all over nfs, like listening to an mp3 and sending over files the connection gets the hiccups. > - the brand of the 8169 cards (if any) This is a real GIGABIT ETHERNET CARD from 19 Euro at my favourite provider. :-) > If you have some version of the Realtek's driver prior to the 2.2 version > which are not available on their site any more, I'll welcome these. No I don't have any other drivers from them. > > Please let me know which driver you think is the best. > > None :o) > -> in tree driver does not work for you > -> From a quick look, Realtek's driver does not include a single barrier > instruction and the failure paths in rtl8169_open() are broken. I would > not be surprised that some fixes are missing which are available in the > kernel tree. No NAPI nor ethtool. Jumbo frames support and other > goodies are there though. Okay, I'll have to isolate the differences. > If you can stay with us until an in kernel, modified, r8169 driver works > with your setup, it should be possible to get the best of both worlds. Ok, now let's try the normal kernel driver with a clear head. # Han -- (I hate large) \||/ A man is already halfway in love with any (sigs) Oo. | @___oo woman who listens to him. 
-- Brendan Francis /\ /\ / (__,,,,| ) /^\) ^\/ _) ) /^\/ _) ) _ / / _) /\ )/\/ || | )_) < > |(,,) )__) || / \)___)\ | \____( )___) )___ \______(_______;;; __;;; From uucp@ganesha.gnumonks.org Sat Sep 4 13:38:12 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 04 Sep 2004 13:38:18 -0700 (PDT) Received: from ganesha.gnumonks.org (Debian-exim@ganesha.gnumonks.org [213.95.27.120]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i84KcB77023776 for ; Sat, 4 Sep 2004 13:38:12 -0700 Received: from uucp by ganesha.gnumonks.org with local-bsmtp (Exim 4.30) id 1C3hIL-0001yt-6q for netdev@oss.sgi.com; Sat, 04 Sep 2004 22:37:57 +0200 Received: from laforge by obroa-skai.gnumonks.org with local (Exim 3.36 #1) id 1C3gmk-0004cM-00; Sat, 04 Sep 2004 22:05:18 +0200 Date: Sat, 4 Sep 2004 22:05:18 +0200 From: Harald Welte To: "David S. Miller" Cc: Harald Welte , netfilter-devel@lists.netfilter.org, netdev@oss.sgi.com Subject: Re: [PATCH 2.6] 2/2: Fix NAT helper locking Message-ID: <20040904200518.GC11414@obroa-skai.de.gnumonks.org> Mail-Followup-To: Harald Welte , "David S. 
Miller" , netfilter-devel@lists.netfilter.org, netdev@oss.sgi.com References: <20040903070234.GQ26263@sunbeam.de.gnumonks.org> <20040903002453.77beee16.davem@davemloft.net> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="adJ1OR3c6QgCpb/j" Content-Disposition: inline In-Reply-To: <20040903002453.77beee16.davem@davemloft.net> X-Operating-System: Linux obroa-skai.de.gnumonks.org 2.6.8-rc2-nfp0722-tcpwin X-Date: Today is Pungenday, the 23rd day of Chaos in the YOLD 3070 User-Agent: Mutt/1.5.6+20040722i X-archive-position: 8405 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: laforge@netfilter.org Precedence: bulk X-list: netdev --adJ1OR3c6QgCpb/j Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Fri, Sep 03, 2004 at 12:24:53AM -0700, David S. Miller wrote: =20 > Ummm... wow what ancient tree did you patch against > Harald? The tree you patched against didn't even have the > skb_header_pointer() changes in it, that's caveman > era :-) That's what caused the rejects. I tried to apply the patch against 2.6.9-rc1 before sending it to you. It had offsets, but applied cleanly > Thanks. --=20 - Harald Welte http://www.netfilter.org/ =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D "Fragmentation is like classful addressing -- an interesting early architectural error that shows how much experimentation was going on while IP was being designed." 
-- Paul Vixie --adJ1OR3c6QgCpb/j Content-Type: application/pgp-signature; name="signature.asc" Content-Description: Digital signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (GNU/Linux) iD8DBQFBOh/+XaXGVTD0i/8RAgXLAJ4tcjBuw7+0CoQGe1FK9Dc1/2LB+ACbBtxK 8sWfs0Pl9hdUDYNq+oDE+KA= =k3Xl -----END PGP SIGNATURE----- --adJ1OR3c6QgCpb/j-- From han@mijncomputer.nl Sat Sep 4 14:18:52 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 04 Sep 2004 14:18:57 -0700 (PDT) Received: from boetes.org (cc15467-a.groni1.gr.home.nl [217.120.147.78]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id i84LIoPT024864 for ; Sat, 4 Sep 2004 14:18:51 -0700 Received: (qmail 21712 invoked by uid 1000); 4 Sep 2004 21:19:02 -0000 Date: Sat, 4 Sep 2004 23:18:40 +0159 From: Han Boetes To: Francois Romieu Cc: netdev@oss.sgi.com Subject: Re: [Fwd: rtl8169 driver from realtek] Message-ID: <20040904211902.GK2387@boetes.org> References: <4139ED5B.2030002@pobox.com> <20040904182404.GB16875@electric-eye.fr.zoreil.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20040904182404.GB16875@electric-eye.fr.zoreil.com> User-Agent: Mutt/1.5.6i X-archive-position: 8406 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: han@mijncomputer.nl Precedence: bulk X-list: netdev Here we go again. This is with the normal kernel-driver. It works fine. Obviously I did something else wrong and drew the wrong conclusion. Excuse me for that. Heavy stress-tests work fine with this driver! 
Francois Romieu wrote: > - the complete dmesg after boot once the r8169 is ifconfig'ed up Linux version 2.6.9-rc1 (han@marsupilami) (gcc version 3.3.3 (CRUX)) #1 Sat Sep 4 22:43:12 CEST 2004 BIOS-provided physical RAM map: BIOS-e820: 0000000000000000 - 000000000009fc00 (usable) BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved) BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved) BIOS-e820: 0000000000100000 - 000000002fff0000 (usable) BIOS-e820: 000000002fff0000 - 000000002fff8000 (ACPI data) BIOS-e820: 000000002fff8000 - 0000000030000000 (ACPI NVS) BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved) BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved) BIOS-e820: 00000000fff80000 - 0000000100000000 (reserved) 767MB LOWMEM available. On node 0 totalpages: 196592 DMA zone: 4096 pages, LIFO batch:1 Normal zone: 192496 pages, LIFO batch:16 HighMem zone: 0 pages, LIFO batch:1 DMI 2.3 present. Built 1 zonelists Kernel command line: BOOT_IMAGE=269rc1 ro root=305 acpi=off hdc=ide-cd hdd=ide-cd ide_setup: hdc=ide-cd ide_setup: hdd=ide-cd Initializing CPU#0 PID hash table entries: 4096 (order 12: 32768 bytes) Detected 1467.420 MHz processor. Using tsc for high-res timesource Console: colour dummy device 80x25 Dentry cache hash table entries: 131072 (order: 7, 524288 bytes) Inode-cache hash table entries: 65536 (order: 6, 262144 bytes) Memory: 774912k/786368k available (2324k kernel code, 10704k reserved, 714k data, 396k init, 0k highmem) Checking if this processor honours the WP bit even in supervisor mode... Ok. Calibrating delay loop... 
2891.77 BogoMIPS Mount-cache hash table entries: 512 (order: 0, 4096 bytes) CPU: After generic identify, caps: 0383fbff c1c3fbff 00000000 00000000 CPU: After vendor identify, caps: 0383fbff c1c3fbff 00000000 00000000 CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line) CPU: L2 Cache: 256K (64 bytes/line) CPU: After all inits, caps: 0383fbff c1c3fbff 00000000 00000020 Intel machine check architecture supported. Intel machine check reporting enabled on CPU#0. CPU: AMD Athlon(tm) XP 1700+ stepping 02 Enabling fast FPU save and restore... done. Enabling unmasked SIMD FPU exception support... done. Checking 'hlt' instruction... OK. ACPI: IRQ9 SCI: Edge set to Level Trigger. NET: Registered protocol family 16 PCI: PCI BIOS revision 2.10 entry at 0xfdaf1, last bus=1 PCI: Using configuration type 1 mtrr: v2.0 (20020519) ACPI: Subsystem revision 20040715 ACPI: Interpreter disabled. usbcore: registered new driver usbfs usbcore: registered new driver hub PCI: Probing PCI hardware PCI: Probing PCI hardware (bus 00) PCI: Using IRQ router default [1106/3147] at 0000:00:11.0 PCI: IRQ 0 for device 0000:00:11.1 doesn't match PIRQ mask - try pci=usepirqmask PCI: Hardcoded IRQ 14 for device 0000:00:11.1 get_random_bytes called before random driver initialization vesafb: framebuffer at 0xd0000000, mapped to 0xf0807000, size 3072k vesafb: mode is 1024x768x16, linelength=2048, pages=1 vesafb: protected mode interface info at c000:e4d0 vesafb: scrolling: redraw vesafb: Truecolor: size=0:5:6:5, shift=0:11:5:0 fb0: VESA VGA frame buffer device Machine check exception polling timer started. 
apm: BIOS version 1.2 Flags 0x03 (Driver version 1.16ac) apm: driver version: No APM present Console: switching to colour frame buffer device 128x48 Real Time Clock Driver v1.12 Linux agpgart interface v0.100 (c) Dave Jones agpgart: Detected VIA KT266/KY266x/KT333 chipset agpgart: Maximum main memory to use for agp memory: 690M agpgart: AGP aperture is 128M @ 0xe0000000 Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ sharing disabled RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 blocksize loop: loaded (max 8 devices) 8139too Fast Ethernet driver 0.9.27 eth0: RealTek RTL8139 at 0xf0b88f00, 00:e0:4c:67:52:80, IRQ 10 eth0: Identified 8139 chip type 'RTL-8139B' Universal TUN/TAP device driver 1.5 (C)1999-2002 Maxim Krasnyansky r8169 Gigabit Ethernet driver 1.2 loaded eth1: Identified chip type is 'RTL8169s/8110s'. eth1: RTL8169 at 0xf0b8ae00, 00:08:a1:3c:34:7a, IRQ 12 Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2 ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx VP_IDE: IDE controller at PCI slot 0000:00:11.1 PCI: Hardcoded IRQ 14 for device 0000:00:11.1 VP_IDE: chipset revision 6 VP_IDE: not 100% native mode: will probe irqs later VP_IDE: VIA vt8233a (rev 00) IDE UDMA133 controller on pci0000:00:11.1 ide0: BM-DMA at 0xfc00-0xfc07, BIOS settings: hda:DMA, hdb:DMA ide1: BM-DMA at 0xfc08-0xfc0f, BIOS settings: hdc:DMA, hdd:pio hda: ST340823A, ATA DISK drive hdb: BDV 108A DVDROM, ATAPI CD/DVD-ROM drive Using anticipatory io scheduler ide0 at 0x1f0-0x1f7,0x3f6 on irq 14 hdc: LITE-ON LTR-40125S, ATAPI CD/DVD-ROM drive ide1 at 0x170-0x177,0x376 on irq 15 hda: max request size: 128KiB hda: 78165360 sectors (40020 MB) w/512KiB Cache, CHS=65535/16/63, UDMA(100) hda: cache flushes supported hda: hda1 hda2 < hda5 hda6 hda7 hda8 hda9 hda10 > hdb: ATAPI DVD-ROM drive, 512kB Cache, UDMA(33) Uniform CD-ROM driver Revision: 3.20 hdc: ATAPI 48X CD-ROM CD-R/RW drive, 2048kB Cache, UDMA(33) ide-floppy driver 0.99.newide USB 
Universal Host Controller Interface driver v2.2 uhci_hcd 0000:00:11.2: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller uhci_hcd 0000:00:11.2: irq 5, io base 0000d800 uhci_hcd 0000:00:11.2: new USB bus registered, assigned bus number 1 hub 1-0:1.0: USB hub found hub 1-0:1.0: 2 ports detected uhci_hcd 0000:00:11.3: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (#2) uhci_hcd 0000:00:11.3: irq 5, io base 0000dc00 uhci_hcd 0000:00:11.3: new USB bus registered, assigned bus number 2 hub 2-0:1.0: USB hub found hub 2-0:1.0: 2 ports detected usbcore: registered new driver usbhid drivers/usb/input/hid-core.c: v2.0:USB HID core driver mice: PS/2 mouse device common for all mice serio: i8042 AUX port at 0x60,0x64 irq 12 serio: i8042 KBD port at 0x60,0x64 irq 1 input: AT Translated Set 2 keyboard on isa0060/serio0 Advanced Linux Sound Architecture Driver Version 1.0.4 (Mon May 17 14:31:44 2004 UTC). ALSA device list: #0: Sound Blaster Live! (rev.7) at 0xe000, irq 10 NET: Registered protocol family 2 IP: routing cache hash table of 8192 buckets, 64Kbytes TCP: Hash tables configured (established 262144 bind 65536) NET: Registered protocol family 1 NET: Registered protocol family 17 usb 1-1: new low speed USB device using address 2 UDF-fs: No VRS found VFS: Mounted root (jfs filesystem) readonly. Freeing unused kernel memory: 396k freed HID Mouse 0xc024 forced to 2 ms polling input: USB HID v1.10 Mouse [B16_b_02 USB-PS/2 Optical Mouse] on usb-0000:00:11.2-1 Adding 257000k swap on /dev/hda6. 
Priority:-1 extents:1 > - the content of the /proc/interrupts file once the r8169 is ifconfig'ed up CPU0 0: 1012596 XT-PIC timer 1: 2271 XT-PIC i8042 2: 0 XT-PIC cascade 5: 35354 XT-PIC uhci_hcd, uhci_hcd 8: 1 XT-PIC rtc 10: 8922 XT-PIC EMU10K1 11: 68450 XT-PIC nvidia 12: 203034 XT-PIC eth1 14: 7347 XT-PIC ide0 15: 20 XT-PIC ide1 NMI: 0 ERR: 0 > If you can stay with us until an in kernel, modified, r8169 driver > works with your setup, it should be possible to get the best of both > worlds. Please let me know if you want to test new patches based on stuff you found in the realtek driver. # Han -- A random bookmark: http://crypto.yashy.com/nmap.php From romieu@fr.zoreil.com Sat Sep 4 14:46:34 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 04 Sep 2004 14:46:39 -0700 (PDT) Received: from fr.zoreil.com (electric-eye.fr.zoreil.com [213.41.134.224]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i84LkWso025634 for ; Sat, 4 Sep 2004 14:46:33 -0700 Received: from electric-eye.fr.zoreil.com (localhost.localdomain [127.0.0.1]) by fr.zoreil.com (8.12.10/8.12.1) with ESMTP id i84LiLvr019952; Sat, 4 Sep 2004 23:44:21 +0200 Received: (from romieu@localhost) by electric-eye.fr.zoreil.com (8.12.10/8.12.10/Submit) id i84LiLBQ019951; Sat, 4 Sep 2004 23:44:21 +0200 Date: Sat, 4 Sep 2004 23:44:21 +0200 From: Francois Romieu To: Han Boetes Cc: netdev@oss.sgi.com Subject: Re: [Fwd: rtl8169 driver from realtek] Message-ID: <20040904214243.GA19728@electric-eye.fr.zoreil.com> References: <4139ED5B.2030002@pobox.com> <20040904182404.GB16875@electric-eye.fr.zoreil.com> <20040904211902.GK2387@boetes.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20040904211902.GK2387@boetes.org> User-Agent: Mutt/1.4.1i X-Organisation: Land of Sunshine Inc. 
X-archive-position: 8407 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: romieu@fr.zoreil.com Precedence: bulk X-list: netdev Han Boetes : > > Here we go again. This is with the normal kernel-driver. It works > fine. Obviously I did something else wrong and drew the wrong > conclusion. Excuse me for that. Success reports are welcome too :o) -- Ueimor From alex@towebs.com Sat Sep 4 15:34:10 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 04 Sep 2004 15:34:17 -0700 (PDT) Received: from gate-one.toservers.com (customer.iplannetworks.net [200.69.210.29] (may be forged)) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i84MY9wS026615; Sat, 4 Sep 2004 15:34:10 -0700 Received: from 127.0.0.1 (localhost [127.0.0.1]) by dummy.domain.name (Postfix) with SMTP id 0F1695F9EB; Sat, 4 Sep 2004 22:32:13 +0000 (UTC) Received: from [10.10.1.38] (alexis [10.10.1.38]) by gate-one.toservers.com (Postfix) with ESMTP id 804305F9C7; Sat, 4 Sep 2004 22:32:12 +0000 (UTC) Message-ID: <413A1566.6050807@towebs.com> Date: Sat, 04 Sep 2004 19:20:06 +0000 From: Alex User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.2) Gecko/20040819 X-Accept-Language: en-us, en MIME-Version: 1.0 To: netdev@oss.sgi.com, majordomo@oss.sgi.com Subject: Host AP driver for Intersil Prism2/2.5/3 X-Enigmail-Version: 0.85.0.0 X-Enigmail-Supports: pgp-inline, pgp-mime Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 8408 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: alex@towebs.com Precedence: bulk X-list: netdev This driver is not built into the Linux kernel.
http://www.inlab.csp.it/tools/wireless/hostqs/ From listmoved@calle.in-berlin.de Sat Sep 4 19:15:47 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 04 Sep 2004 19:15:52 -0700 (PDT) Received: from gnu.in-berlin.de (gnu.in-berlin.de [192.109.42.4]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i852FjMV001172 for ; Sat, 4 Sep 2004 19:15:46 -0700 X-Envelope-From: listmoved@calle.in-berlin.de X-Envelope-To: Received: from calle.in-berlin.de (calle.in-berlin.de [217.197.81.210]) by gnu.in-berlin.de (8.12.11/8.12.11/Debian-4) with ESMTP id i852FZlx011951 for ; Sun, 5 Sep 2004 04:15:35 +0200 Received: by calle.in-berlin.de (Smail3.2.0.115) id ; Sun, 5 Sep 2004 04:15:35 +0200 (CEST) Message-Id: Date: Sun, 5 Sep 2004 04:15:35 +0200 (CEST) To: netdev@oss.sgi.com From: calle@calle.in-berlin.de Subject: Re: Server Error (majordomo@calle.in-berlin.de) Precedence: bulk X-Scanned-By: MIMEDefang 2.43 X-archive-position: 8409 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: calle@calle.in-berlin.de Precedence: bulk X-list: netdev Hello. The mailing list linux-avmb1@calle.in-berlin.de has moved because it was overrun with spam.
Visit https://mlists.in-berlin.de/mailman/listinfo/linux-avmb1 calle From mcgrof@studorgs.rutgers.edu Sat Sep 4 23:46:07 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 04 Sep 2004 23:46:12 -0700 (PDT) Received: from ruslug.rutgers.edu (studorgs.rutgers.edu [128.6.24.131]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i856k7J6008868 for ; Sat, 4 Sep 2004 23:46:07 -0700 Received: by ruslug.rutgers.edu (Postfix, from userid 503) id 505F6F99C1; Sun, 5 Sep 2004 02:45:58 -0400 (EDT) Date: Sun, 5 Sep 2004 02:45:58 -0400 To: Jeff Garzik Cc: prism54-devel@prism54.org, Netdev Subject: [PATCH 0/4] prism54: remove some module_params, WE17 and initial WPA work Message-ID: <20040905064558.GT31207@ruslug.rutgers.edu> Mail-Followup-To: Jeff Garzik , prism54-devel@prism54.org, Netdev Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="ctP54qlpMx3WjD+/" Content-Disposition: inline User-Agent: Mutt/1.3.28i X-Operating-System: 2.4.18-1-686 Organization: Rutgers University Student Linux Users Group From: mcgrof@studorgs.rutgers.edu (Luis R. Rodriguez) X-archive-position: 8410 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mcgrof@studorgs.rutgers.edu Precedence: bulk X-list: netdev --ctP54qlpMx3WjD+/ Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable [ Note: I'm resending these as Jeff's HD died ] Jeff, The following patches go on top of Margit's recent patches which have not yet been integrated by you. These patches remove some unnecessary module_params, add WE17 support and add our initial WPA work -- striving to get prism54 wpa_supplicant to work with prism54. Please note that these patches are for *both* 2.6 and 2.4.
Thanks, Luis -- GnuPG Key fingerprint = 113F B290 C6D2 0251 4D84 A34A 6ADD 4937 E20A 525E --ctP54qlpMx3WjD+/ Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (GNU/Linux) iD8DBQFBOrYmat1JN+IKUl4RAprmAKCoB+ioyNxT3fGzKg+e5EWPCorzKwCeJC9C DHiJKtT7d+2+wU7JFuR9Jok= =yZXR -----END PGP SIGNATURE----- --ctP54qlpMx3WjD+/-- From mcgrof@studorgs.rutgers.edu Sat Sep 4 23:49:43 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 04 Sep 2004 23:49:48 -0700 (PDT) Received: from ruslug.rutgers.edu (studorgs.rutgers.edu [128.6.24.131]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i856nhms009204 for ; Sat, 4 Sep 2004 23:49:43 -0700 Received: by ruslug.rutgers.edu (Postfix, from userid 503) id 73114F99C1; Sun, 5 Sep 2004 02:49:34 -0400 (EDT) Date: Sun, 5 Sep 2004 02:49:34 -0400 To: Jeff Garzik , prism54-devel@prism54.org, Netdev Subject: Re: [PATCH 0/4] prism54: remove some module_params, WE17 and initial WPA work Message-ID: <20040905064934.GV31207@ruslug.rutgers.edu> Mail-Followup-To: Jeff Garzik , prism54-devel@prism54.org, Netdev References: <20040905064558.GT31207@ruslug.rutgers.edu> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20040905064558.GT31207@ruslug.rutgers.edu> User-Agent: Mutt/1.3.28i X-Operating-System: 2.4.18-1-686 Organization: Rutgers University Student Linux Users Group From: mcgrof@studorgs.rutgers.edu (Luis R. Rodriguez) X-archive-position: 8411 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mcgrof@studorgs.rutgers.edu Precedence: bulk X-list: netdev nevermind, I'm going to coordinate with Margit so we send her and my patches together to make life easier for you. Luis On Sun, Sep 05, 2004 at 02:45:58AM -0400, Luis R.
Rodriguez wrote: > > [ Note: I'm resending these as Jeff's HD died ] > > Jeff, > > The following patches go on top of Margit's recent patches which have not > yet been integrated by you. These patches remove some unnecessary > module_params, add WE17 support and add our initial WPA work -- > striving to get prism54 wpa_supplicant to work with prism54. > > Please note that these patches are for *both* 2.6 and 2.4. > > Thanks, > > Luis > > -- > GnuPG Key fingerprint = 113F B290 C6D2 0251 4D84 A34A 6ADD 4937 E20A 525E -- GnuPG Key fingerprint = 113F B290 C6D2 0251 4D84 A34A 6ADD 4937 E20A 525E From margitsw@t-online.de Sun Sep 5 02:28:04 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Sep 2004 02:28:11 -0700 (PDT) Received: from mailout08.sul.t-online.com (mailout08.sul.t-online.com [194.25.134.20]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id i859S24b015399 for ; Sun, 5 Sep 2004 02:28:03 -0700 Received: from fwd05.aul.t-online.de by mailout08.sul.t-online.com with smtp id 1C3tJ4-0004LB-02; Sun, 05 Sep 2004 11:27:30 +0200 Received: from roglap.local (ZwnRR8Z68eRo4AfVV-pfOw7R64FAxblgxc62jXHZyNUEFIjgZRTbZS@[217.224.19.205]) by fwd05.sul.t-online.com with esmtp id 1C3tIw-0qpJXE0; Sun, 5 Sep 2004 11:27:22 +0200 From: margitsw@t-online.de (Margit Schubert-While) To: jgarzik@pobox.com Subject: [PATCH Linux-2.4.28-pre2] prism54 Update to 2.6 status Date: Sun, 5 Sep 2004 11:15:49 +0200 User-Agent: KMail/1.5.4 Cc: netdev@oss.sgi.com, prism54-devel@prism54.org, marcelo.tosatti@cyclades.com MIME-Version: 1.0 Content-Type: Multipart/Mixed; boundary="Boundary-00=_FltOBefVy+CGV5h" Message-Id: <200409051115.50083.margitsw@t-online.de> X-ID: ZwnRR8Z68eRo4AfVV-pfOw7R64FAxblgxc62jXHZyNUEFIjgZRTbZS X-TOI-MSGID: e065345e-8edf-45e4-9b72-74b221704805 X-archive-position: 8412 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: margitsw@t-online.de Precedence: bulk X-list: netdev
--Boundary-00=_FltOBefVy+CGV5h Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Content-Disposition: inline 2004-09-05 Margit Schubert-While * This rollup patch brings 2.4 into line with current * 2.6 status. * The code base is then IDENTICAL for both 2.4 and 2.6 with * the exception of an extra compatibility header for 2.4. * There is no point in doing a split out for this patch * as all changes have already been individually posted * against 2.6 and passed through Jeff. * As of now, we can post one patch for both kernels. * I have included Marcelo in the CC so that he is informed. * Jeff, after this patch, the upcoming resend of patches * apply equally to 2.4 and 2.6 unless explicitly stated. Margit --Boundary-00=_FltOBefVy+CGV5h Content-Type: text/x-diff; charset="us-ascii"; name="01_rollup.patch" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="01_rollup.patch" diff -Naur linux-2.4.28-pre2/drivers/net/wireless/prism54/isl_38xx.c linux-2.4.28-pre2msw/drivers/net/wireless/prism54/isl_38xx.c --- linux-2.4.28-pre2/drivers/net/wireless/prism54/isl_38xx.c 2004-09-05 10:08:56.000000000 +0200 +++ linux-2.4.28-pre2msw/drivers/net/wireless/prism54/isl_38xx.c 2004-09-05 10:36:25.000000000 +0200 @@ -18,8 +18,6 @@ * */ -#define __KERNEL_SYSCALLS__ - #include #include #include diff -Naur linux-2.4.28-pre2/drivers/net/wireless/prism54/isl_ioctl.c linux-2.4.28-pre2msw/drivers/net/wireless/prism54/isl_ioctl.c --- linux-2.4.28-pre2/drivers/net/wireless/prism54/isl_ioctl.c 2004-09-05 10:08:56.000000000 +0200 +++ linux-2.4.28-pre2msw/drivers/net/wireless/prism54/isl_ioctl.c 2004-09-05 10:36:25.000000000 +0200 @@ -436,7 +436,7 @@ { struct iw_range *range = (struct iw_range *) extra; islpci_private *priv = netdev_priv(ndev); - char *data; + u8 *data; int i, m, rvalue; struct obj_frequencies *freq; union oid_res_t r; @@ -513,8 +513,7 @@ i = 0; while ((i < IW_MAX_BITRATES) && (*data != 0)) { /* the result must be in bps. 
The card gives us 500Kbps */ - range->bitrate[i] = (__s32) (*data >> 1); - range->bitrate[i] *= 1000000; + range->bitrate[i] = *data * 500000; i++; data++; } @@ -820,9 +819,11 @@ return mgt_set_request(priv, DOT11_OID_PROFILES, 0, &profile); } - if ((ret = - mgt_get_request(priv, DOT11_OID_SUPPORTEDRATES, 0, NULL, &r))) + ret = mgt_get_request(priv, DOT11_OID_SUPPORTEDRATES, 0, NULL, &r); + if (ret) { + kfree(r.ptr); return ret; + } rate = (u32) (vwrq->value / 500000); data = r.ptr; @@ -840,6 +841,7 @@ } if (!data[i]) { + kfree(r.ptr); return -EINVAL; } @@ -888,8 +890,11 @@ vwrq->value = r.u * 500000; /* request the device for the enabled rates */ - if ((rvalue = mgt_get_request(priv, DOT11_OID_RATES, 0, NULL, &r))) + rvalue = mgt_get_request(priv, DOT11_OID_RATES, 0, NULL, &r); + if (rvalue) { + kfree(r.ptr); return rvalue; + } data = r.ptr; vwrq->fixed = (data[0] != 0) && (data[1] == 0); kfree(r.ptr); @@ -1942,7 +1947,7 @@ { islpci_private *priv = netdev_priv(ndev); struct islpci_mgmtframe *response = NULL; - int ret = -EIO, response_op = PIMFOR_OP_ERROR; + int ret = -EIO; printk("%s: get_oid 0x%08X\n", ndev->name, priv->priv_oid); data->length = 0; @@ -1952,9 +1957,7 @@ islpci_mgt_transaction(priv->ndev, PIMFOR_OP_GET, priv->priv_oid, extra, 256, &response); - response_op = response->header->operation; printk("%s: ret: %i\n", ndev->name, ret); - printk("%s: response_op: %i\n", ndev->name, response_op); if (ret || !response || response->header->operation == PIMFOR_OP_ERROR) { if (response) { @@ -1991,16 +1994,20 @@ priv->priv_oid, extra, data->length, &response); printk("%s: ret: %i\n", ndev->name, ret); + if (ret || !response + || response->header->operation == PIMFOR_OP_ERROR) { + if (response) { + islpci_mgt_release(response); + } + printk("%s: EIO\n", ndev->name); + ret = -EIO; + } if (!ret) { response_op = response->header->operation; printk("%s: response_op: %i\n", ndev->name, response_op); islpci_mgt_release(response); } - if (ret || response_op == 
PIMFOR_OP_ERROR) { - printk("%s: EIO\n", ndev->name); - ret = -EIO; - } } return (ret ? ret : -EINPROGRESS); diff -Naur linux-2.4.28-pre2/drivers/net/wireless/prism54/islpci_dev.c linux-2.4.28-pre2msw/drivers/net/wireless/prism54/islpci_dev.c --- linux-2.4.28-pre2/drivers/net/wireless/prism54/islpci_dev.c 2004-09-05 10:08:56.000000000 +0200 +++ linux-2.4.28-pre2msw/drivers/net/wireless/prism54/islpci_dev.c 2004-09-05 10:36:25.000000000 +0200 @@ -39,6 +39,7 @@ #include "oid_mgt.h" #define ISL3877_IMAGE_FILE "isl3877" +#define ISL3886_IMAGE_FILE "isl3886" #define ISL3890_IMAGE_FILE "isl3890" static int prism54_bring_down(islpci_private *); @@ -82,7 +83,7 @@ mdelay(50); { - const struct firmware *fw_entry = 0; + const struct firmware *fw_entry = NULL; long fw_len; const u32 *fw_ptr; @@ -185,6 +186,9 @@ void *device = priv->device_base; int powerstate = ISL38XX_PSM_POWERSAVE_STATE; + /* lock the interrupt handler */ + spin_lock(&priv->slock); + /* received an interrupt request on a shared IRQ line * first check whether the device is in sleep mode */ reg = readl(device + ISL38XX_CTRL_STAT_REG); @@ -194,14 +198,10 @@ #if VERBOSE > SHOW_ERROR_MESSAGES DEBUG(SHOW_TRACING, "Assuming someone else called the IRQ\n"); #endif + spin_unlock(&priv->slock); return IRQ_NONE; } - if (islpci_get_state(priv) != PRV_STATE_SLEEP) - powerstate = ISL38XX_PSM_ACTIVE_STATE; - - /* lock the interrupt handler */ - spin_lock(&priv->slock); /* check whether there is any source of interrupt on the device */ reg = readl(device + ISL38XX_INT_IDENT_REG); @@ -212,6 +212,9 @@ reg &= ISL38XX_INT_SOURCES; if (reg != 0) { + if (islpci_get_state(priv) != PRV_STATE_SLEEP) + powerstate = ISL38XX_PSM_ACTIVE_STATE; + /* reset the request bits in the Identification register */ isl38xx_w32_flush(device, reg, ISL38XX_INT_ACK_REG); @@ -339,6 +342,12 @@ isl38xx_handle_wakeup(priv->control_block, &powerstate, priv->device_base); } + } else { +#if VERBOSE > SHOW_ERROR_MESSAGES + DEBUG(SHOW_TRACING, "Assuming 
someone else called the IRQ\n"); +#endif + spin_unlock(&priv->slock); + return IRQ_NONE; } /* sleep -> ready */ @@ -716,7 +725,7 @@ if (priv->device_base) iounmap(priv->device_base); - priv->device_base = 0; + priv->device_base = NULL; /* free consistent DMA area... */ if (priv->driver_mem_address) @@ -725,10 +734,10 @@ priv->device_host_address); /* clear some dangling pointers */ - priv->driver_mem_address = 0; + priv->driver_mem_address = NULL; priv->device_host_address = 0; priv->device_psm_buffer = 0; - priv->control_block = 0; + priv->control_block = NULL; /* clean up mgmt rx buffers */ for (counter = 0; counter < ISL38XX_CB_MGMT_QSIZE; counter++) { @@ -754,7 +763,7 @@ if (priv->data_low_rx[counter]) dev_kfree_skb(priv->data_low_rx[counter]); - priv->data_low_rx[counter] = 0; + priv->data_low_rx[counter] = NULL; } /* Free the acces control list and the WPA list */ @@ -856,14 +865,14 @@ /* select the firmware file depending on the device id */ switch (pdev->device) { - case PCIDEVICE_ISL3890: - case PCIDEVICE_3COM6001: - strcpy(priv->firmware, ISL3890_IMAGE_FILE); - break; - case PCIDEVICE_ISL3877: + case 0x3877: strcpy(priv->firmware, ISL3877_IMAGE_FILE); break; + case 0x3886: + strcpy(priv->firmware, ISL3886_IMAGE_FILE); + break; + default: strcpy(priv->firmware, ISL3890_IMAGE_FILE); break; @@ -880,9 +889,9 @@ do_islpci_free_memory: islpci_free_memory(priv); do_free_netdev: - pci_set_drvdata(pdev, 0); + pci_set_drvdata(pdev, NULL); free_netdev(ndev); - priv = 0; + priv = NULL; return NULL; } diff -Naur linux-2.4.28-pre2/drivers/net/wireless/prism54/islpci_hotplug.c linux-2.4.28-pre2msw/drivers/net/wireless/prism54/islpci_hotplug.c --- linux-2.4.28-pre2/drivers/net/wireless/prism54/islpci_hotplug.c 2004-09-05 10:08:56.000000000 +0200 +++ linux-2.4.28-pre2msw/drivers/net/wireless/prism54/islpci_hotplug.c 2004-09-05 10:36:25.000000000 +0200 @@ -44,102 +44,30 @@ * If you have an update for this please contact prism54-devel@prism54.org * The latest list can be 
found at http://prism54.org/supported_cards.php */ static const struct pci_device_id prism54_id_tbl[] = { - /* 3COM 3CRWE154G72 Wireless LAN adapter */ - { - PCIVENDOR_3COM, PCIDEVICE_3COM6001, - PCIVENDOR_3COM, PCIDEVICE_3COM6001, - 0, 0, 0 - }, - - /* D-Link Air Plus Xtreme G A1 - DWL-g650 A1 */ - { - PCIVENDOR_INTERSIL, PCIDEVICE_ISL3890, - PCIVENDOR_DLINK, 0x3202UL, - 0, 0, 0 - }, - - /* I-O Data WN-G54/CB - WN-G54/CB */ - { - PCIVENDOR_INTERSIL, PCIDEVICE_ISL3890, - PCIVENDOR_IODATA, 0xd019UL, - 0, 0, 0 - }, - - /* Netgear WG511 */ - { - PCIVENDOR_INTERSIL, PCIDEVICE_ISL3890, - PCIVENDOR_NETGEAR, 0x4800UL, - 0, 0, 0 - }, - - /* Tekram Technology clones, Allnet, Netcomm, Zyxel */ - { - PCIVENDOR_INTERSIL, PCIDEVICE_ISL3890, - PCIVENDOR_TTL, 0x1605UL, - 0, 0, 0 - }, - - /* SMC2802W */ - { - PCIVENDOR_INTERSIL, PCIDEVICE_ISL3890, - PCIVENDOR_SMC, 0x2802UL, - 0, 0, 0 - }, - - /* SMC2835W */ - { - PCIVENDOR_INTERSIL, PCIDEVICE_ISL3890, - PCIVENDOR_SMC, 0x2835UL, - 0, 0, 0 - }, - - /* Corega CG-WLCB54GT */ - { - PCIVENDOR_INTERSIL, PCIDEVICE_ISL3890, - PCIVENDOR_ATI, 0xc104UL, - 0, 0, 0 - }, - - /* I4 Z-Com XG-600 */ - { - PCIVENDOR_INTERSIL, PCIDEVICE_ISL3890, - PCIVENDOR_I4, 0x0014UL, - 0, 0, 0 - }, - - /* I4 Z-Com XG-900 and clones Macer, Ovislink, Planex, Peabird, */ - /* Sitecom, Xterasys */ - { - PCIVENDOR_INTERSIL, PCIDEVICE_ISL3890, - PCIVENDOR_I4, 0x0020UL, - 0, 0, 0 - }, - - /* SMC 2802W V2 */ + /* Intersil PRISM Duette/Prism GT Wireless LAN adapter */ { - PCIVENDOR_INTERSIL, PCIDEVICE_ISL3890, - PCIVENDOR_ACCTON, 0xee03UL, + 0x1260, 0x3890, + PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0 }, - /* SMC 2835W V2 */ + /* 3COM 3CRWE154G72 Wireless LAN adapter */ { - PCIVENDOR_INTERSIL, PCIDEVICE_ISL3890, - PCIVENDOR_SMC, 0xa835UL, + 0x10b7, 0x6001, + PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0 }, /* Intersil PRISM Indigo Wireless LAN adapter */ { - PCIVENDOR_INTERSIL, PCIDEVICE_ISL3877, + 0x1260, 0x3877, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0 }, - /* Intersil PRISM Duette/Prism GT Wireless LAN 
adapter */ - /* Default */ + /* Intersil PRISM Javelin/Xbow Wireless LAN adapter */ { - PCIVENDOR_INTERSIL, PCIDEVICE_ISL3890, + 0x1260, 0x3886, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0 }, @@ -166,85 +94,6 @@ /* .enable_wake ; we don't support this yet */ }; -static void -prism54_get_card_model(struct net_device *ndev) -{ - islpci_private *priv; - char *modelp; - int notwork = 0; - - priv = netdev_priv(ndev); - switch (priv->pdev->subsystem_device) { - case PCIDEVICE_ISL3877: - modelp = "PRISM Indigo"; - break; - case PCIDEVICE_ISL3886: - modelp = "PRISM Javelin / Xbow"; - break; - case PCIDEVICE_3COM6001: - modelp = "3COM 3CRWE154G72"; - break; - case 0x3202UL: - modelp = "D-Link DWL-g650 A1"; - break; - case 0xd019UL: - modelp = "WN-G54/CB"; - break; - case 0x4800UL: - modelp = "Netgear WG511"; - break; - case 0x2802UL: - modelp = "SMC2802W"; - break; - case 0xee03UL: - modelp = "SMC2802W V2"; - notwork = 1; - break; - case 0x2835UL: - modelp = "SMC2835W"; - break; - case 0xa835UL: - modelp = "SMC2835W V2"; - notwork = 1; - break; - case 0xc104UL: - modelp = "CG-WLCB54GT"; - break; - case 0x1605UL: - modelp = "Tekram Technology clone"; - break; - /* Let's leave this one out for now since it seems bogus/wrong - * Even if the manufacturer did use 0x0000UL it may not be correct - * by their part, therefore deserving no name ;) */ - /* case 0x0000UL: - * modelp = "SparkLAN WL-850F"; - * break;*/ - - /* We have two reported for the one below :( */ - case 0x0014UL: - modelp = "I4 Z-Com XG-600 and clones"; - break; - case 0x0020UL: - modelp = "I4 Z-Com XG-900 and clones"; - break; -/* Default it */ -/* - case PCIDEVICE_ISL3890: - modelp = "PRISM Duette/GT"; - break; -*/ - default: - modelp = "PRISM Duette/GT"; - } - printk(KERN_DEBUG "%s: %s driver detected card model: %s\n", - ndev->name, DRV_NAME, modelp); - if ( notwork ) { - printk(KERN_DEBUG "%s: %s Warning - This may not work\n", - ndev->name, DRV_NAME); - } - return; -} - 
/****************************************************************************** Module initialization functions ******************************************************************************/ @@ -354,17 +203,14 @@ /* firmware upload is triggered in islpci_open */ - /* Pretty card model discovery output */ - prism54_get_card_model(ndev); - return 0; do_unregister_netdev: unregister_netdev(ndev); islpci_free_memory(priv); - pci_set_drvdata(pdev, 0); + pci_set_drvdata(pdev, NULL); free_netdev(ndev); - priv = 0; + priv = NULL; do_pci_release_regions: pci_release_regions(pdev); do_pci_disable_device: @@ -380,7 +226,7 @@ prism54_remove(struct pci_dev *pdev) { struct net_device *ndev = pci_get_drvdata(pdev); - islpci_private *priv = ndev ? netdev_priv(ndev) : 0; + islpci_private *priv = ndev ? netdev_priv(ndev) : NULL; BUG_ON(!priv); if (!__in_cleanup_module) { @@ -408,9 +254,9 @@ /* free the PCI memory and unmap the remapped page */ islpci_free_memory(priv); - pci_set_drvdata(pdev, 0); + pci_set_drvdata(pdev, NULL); free_netdev(ndev); - priv = 0; + priv = NULL; pci_release_regions(pdev); @@ -421,7 +267,7 @@ prism54_suspend(struct pci_dev *pdev, u32 state) { struct net_device *ndev = pci_get_drvdata(pdev); - islpci_private *priv = ndev ? netdev_priv(ndev) : 0; + islpci_private *priv = ndev ? netdev_priv(ndev) : NULL; BUG_ON(!priv); printk(KERN_NOTICE "%s: got suspend request (state %d)\n", @@ -446,7 +292,7 @@ prism54_resume(struct pci_dev *pdev) { struct net_device *ndev = pci_get_drvdata(pdev); - islpci_private *priv = ndev ? netdev_priv(ndev) : 0; + islpci_private *priv = ndev ? 
netdev_priv(ndev) : NULL; BUG_ON(!priv); printk(KERN_NOTICE "%s: got resume request\n", ndev->name); diff -Naur linux-2.4.28-pre2/drivers/net/wireless/prism54/islpci_mgt.h linux-2.4.28-pre2msw/drivers/net/wireless/prism54/islpci_mgt.h --- linux-2.4.28-pre2/drivers/net/wireless/prism54/islpci_mgt.h 2004-09-05 10:08:56.000000000 +0200 +++ linux-2.4.28-pre2msw/drivers/net/wireless/prism54/islpci_mgt.h 2004-09-05 10:36:25.000000000 +0200 @@ -38,21 +38,6 @@ /* General driver definitions */ -#define PCIVENDOR_INTERSIL 0x1260UL -#define PCIVENDOR_3COM 0x10b7UL -#define PCIVENDOR_DLINK 0x1186UL -#define PCIVENDOR_I4 0x17cfUL -#define PCIVENDOR_IODATA 0x10fcUL -#define PCIVENDOR_NETGEAR 0x1385UL -#define PCIVENDOR_SMC 0x10b8UL -#define PCIVENDOR_ACCTON 0x1113UL -#define PCIVENDOR_ATI 0x1259UL -#define PCIVENDOR_TTL 0x16a5UL - -#define PCIDEVICE_ISL3877 0x3877UL -#define PCIDEVICE_ISL3886 0x3886UL -#define PCIDEVICE_ISL3890 0x3890UL -#define PCIDEVICE_3COM6001 0x6001UL #define PCIDEVICE_LATENCY_TIMER_MIN 0x40 #define PCIDEVICE_LATENCY_TIMER_VAL 0x50 diff -Naur linux-2.4.28-pre2/drivers/net/wireless/prism54/oid_mgt.c linux-2.4.28-pre2msw/drivers/net/wireless/prism54/oid_mgt.c --- linux-2.4.28-pre2/drivers/net/wireless/prism54/oid_mgt.c 2004-09-05 10:08:56.000000000 +0200 +++ linux-2.4.28-pre2msw/drivers/net/wireless/prism54/oid_mgt.c 2004-09-05 10:36:25.000000000 +0200 @@ -219,7 +219,7 @@ OID_UNKNOWN(OID_INL_MEMORY, 0xFF020002), OID_U32_C(OID_INL_MODE, 0xFF020003), OID_UNKNOWN(OID_INL_COMPONENT_NR, 0xFF020004), - OID_UNKNOWN(OID_INL_VERSION, 0xFF020005), + OID_STRUCT(OID_INL_VERSION, 0xFF020005, u8[8], OID_TYPE_RAW), OID_UNKNOWN(OID_INL_INTERFACE_ID, 0xFF020006), OID_UNKNOWN(OID_INL_COMPONENT_ID, 0xFF020007), OID_U32_C(OID_INL_CONFIG, 0xFF020008), @@ -481,6 +481,8 @@ BUG_ON(OID_NUM_LAST <= n); BUG_ON(extra > isl_oid[n].range); + res->ptr = NULL; + if (!priv->mib) /* memory has been freed */ return -1; @@ -613,7 +615,9 @@ DOT11_OID_DEFKEYID, DOT11_OID_DOT1XENABLE, 
OID_INL_DOT11D_CONFORMANCE, + /* Do not initialize this - fw < 1.0.4.3 rejects it OID_INL_OUTPUTPOWER, + */ }; /* update the MAC addr. */ diff -Naur linux-2.4.28-pre2/drivers/net/wireless/prism54/prismcompat24.h linux-2.4.28-pre2msw/drivers/net/wireless/prism54/prismcompat24.h --- linux-2.4.28-pre2/drivers/net/wireless/prism54/prismcompat24.