From mint-bounce@lists.fishpool.fi  Thu Mar 17 09:15:14 2005
X-Original-To: fnaumann@mail.boerde.de
Delivered-To: fnaumann@mail.boerde.de
Message-ID: <42393BD4.9000503@highlandsun.com>
Date: Thu, 17 Mar 2005 00:12:04 -0800
From: Howard Chu <hyc@highlandsun.com>
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8a6) Gecko/20050111
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: "Evan K. Langlois" <Evan@CoolRunningConcepts.com>
Cc: mint@fishpool.com
Subject: Re: [MiNT] State of the Union
References: <20050314031912.r4p1ykehn44googk@coolrunningconcepts.com>	 <Pine.NEB.4.62.0503151021460.421@wh58-508.st.uni-magdeburg.de>	 <1110945609.9126.54.camel@taro.coolrunningconcepts.com>	 <Pine.NEB.4.62.0503160911130.421@wh58-508.st.uni-magdeburg.de>	
In-Reply-To: <1111038731.11997.161.camel@taro.coolrunningconcepts.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-ecartis-version: Ecartis v1.0.0
Sender: mint-bounce@lists.fishpool.fi
Errors-To: mint-bounce@lists.fishpool.fi
X-original-sender: hyc@highlandsun.com
Precedence: bulk
List-help: <mailto:ecartis@lists.fishpool.fi?Subject=help>
List-unsubscribe: <mailto:mint-request@lists.fishpool.fi?Subject=unsubscribe>
List-Id: <mint.lists.fishpool.fi>
X-List-ID: <mint.lists.fishpool.fi>
X-Virus-Scanned: by amavisd-new at relay.boerde.de
X-Spam-Checker-Version: SpamAssassin 2.63 (2004-01-11) on relay.boerde.de
X-Spam-Status: No, hits=-1.0 tagged_above=-50.5 required=7.0 tests=BAYES_00
X-Spam-Level: 

Evan K. Langlois wrote:
>>Don't be stupid. It is possible to design an event mechanism that 
>>efficiently handles all classes of application load, as I have 
>>described. Not doing so, and forcing all applications to work only one 
>>way, is ridiculous.

> And yet, so many people have accepted this massive limitation.  There 
> are so many ridiculous and stupid software designers out there.   It's 
> really too bad that everyone else sucks so bad huh?

It's ironic that you're being sarcastic, when it's so true.

>>> Also, under Linux the trip across the user/kernel barrier is nothing.  

>>Nonsense. I have the benchmarks that prove this statement to be false.

> I have benchmarks that prove the moon is made of cheese too!  It's 
> proven!

Don't be an ass. If you compile the current CVS HEAD (or 2.3alpha) of 
OpenLDAP on a Linux 2.6 kernel you can verify it for yourself. Run any 
load test you like with HAVE_EPOLL defined, and run the same test with 
HAVE_EPOLL undefined.

> One of the things that sets Linux apart from most other 
> kernels is that the OS call interface has been extensively optimized and 
> is orders of magnitude faster than most commercial OSs.  This is about 
> as accepted a statement as you can get when it comes to Linux.

The majority of the world's computer-using population accepts that 
Windows is "the best desktop OS" and that Word is "the best word 
processor" too. Just because a statement is widely accepted doesn't make 
it true. And just because Linux might do it better than some other OS 
doesn't mean it isn't still expensive.

The fact is that using epoll vs select slows down slapd by 5-10% under 
heavy load, and the only difference is the event calls themselves.
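
For readers who want to see what such a build-time switch amounts to, 
here is a minimal sketch. It is not the slapd source: only the 
HAVE_EPOLL macro name comes from the text above, and ev_init(), 
ev_add(), ev_wait() and demo() are hypothetical names chosen for 
illustration. The application logic is identical in both builds; only 
the event calls differ.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>
#ifdef HAVE_EPOLL
#include <sys/epoll.h>
static int epfd = -1;              /* kernel-side interest set */
#else
#include <sys/select.h>
static fd_set interest;            /* user-space interest set */
static int maxfd = -1;
#endif

/* Hypothetical wrapper: create or reset the interest set. */
static int ev_init(void)
{
#ifdef HAVE_EPOLL
    epfd = epoll_create(16);
    return epfd < 0 ? -1 : 0;
#else
    FD_ZERO(&interest);
    maxfd = -1;
    return 0;
#endif
}

/* Hypothetical wrapper: start watching fd for readability. */
static int ev_add(int fd)
{
#ifdef HAVE_EPOLL
    struct epoll_event ev = { .events = EPOLLIN };
    ev.data.fd = fd;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);  /* one system call */
#else
    if (fd < 0 || fd >= FD_SETSIZE)
        return -1;
    FD_SET(fd, &interest);                           /* plain memory write */
    if (fd > maxfd)
        maxfd = fd;
    return 0;
#endif
}

/* Returns the number of ready descriptors, 0 on timeout, -1 on error. */
static int ev_wait(int timeout_ms)
{
#ifdef HAVE_EPOLL
    struct epoll_event out[16];
    return epoll_wait(epfd, out, 16, timeout_ms);
#else
    fd_set rd = interest;          /* select() overwrites its argument */
    struct timeval tv = { timeout_ms / 1000, (timeout_ms % 1000) * 1000 };
    return select(maxfd + 1, &rd, NULL, NULL, &tv);
#endif
}

/* Usage: watch the read end of a pipe under either build. */
static void demo(void)
{
    int p[2];
    int rc = pipe(p);
    assert(rc == 0);
    rc = ev_init();
    assert(rc == 0);
    rc = ev_add(p[0]);
    assert(rc == 0);
    rc = ev_wait(10);              /* nothing written yet: times out */
    assert(rc == 0);
    ssize_t n = write(p[1], "x", 1);
    assert(n == 1);
    rc = ev_wait(1000);            /* read end is now readable */
    assert(rc == 1);
    close(p[0]);
    close(p[1]);
}
```

Build it once as is and once with -DHAVE_EPOLL on Linux. Note that in 
the epoll build every change to the interest set is a system call, 
while in the select build it is a write to the process's own memory.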

> Funny.  I read almost the same thing as I just said above, but in a 
> paper detailing kqueue, and I had never read the paper before in my 
> life!  Odd that me and apparently quite a few other people have such 
> similar ideas, and yet, we're all horribly wrong.

Not horribly wrong, just using tunnel vision.

Funny that the folks responding from the linux-kernel mailing list 
agreed with me.

>>the event structure at any time. If an event happens on a descriptor, 
>>and the kernel checks the event pointer, and sees the MASK bit set, it 
>>ignores the event. Like I said, this is similar to the sigprocmask concept.
>>
>>Thus, you can change things in the shared memory region and they will 
>>take effect directly without requiring a system call.

> I have no idea why you would want to temporarily ignore incoming events, 
> but that is about the only thing you will gain in efficiency.

And since you don't understand the requirement, you're ill-qualified to 
judge its validity.
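
To make the quoted masking scheme concrete, here is a user-space 
simulation of it. It is only a sketch under stated assumptions: the 
struct layout, the flag names and the functions ev_mask(), ev_unmask() 
and kernel_deliver() are all hypothetical and are not the equeue 
interface. kernel_deliver() stands in for the kernel code that runs 
when something happens on the descriptor.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag bits; not taken from any real equeue header. */
enum { EV_MASK = 1u << 0, EV_PENDING = 1u << 1 };

/* Hypothetical event record living in memory shared by the
 * application and the (simulated) kernel. */
struct shared_event {
    atomic_uint flags;
    int fd;
};

/* Application side: masking is an ordinary memory write, so it
 * takes effect without any system call. */
static void ev_mask(struct shared_event *ev)
{
    atomic_fetch_or(&ev->flags, (unsigned)EV_MASK);
}

static void ev_unmask(struct shared_event *ev)
{
    atomic_fetch_and(&ev->flags, ~(unsigned)EV_MASK);
}

/* Stand-in for the kernel side. Returns true if the event was
 * reported, false if it was ignored because the MASK bit is set. */
static bool kernel_deliver(struct shared_event *ev)
{
    if (atomic_load(&ev->flags) & EV_MASK)
        return false;              /* masked: ignore the event */
    atomic_fetch_or(&ev->flags, (unsigned)EV_PENDING);
    return true;
}

/* Usage: an event arriving while masked is dropped, one arriving
 * after unmasking is reported. */
static void demo(void)
{
    struct shared_event ev = { .fd = 3 };
    atomic_init(&ev.flags, 0u);

    ev_mask(&ev);
    bool reported = kernel_deliver(&ev);
    assert(!reported);

    ev_unmask(&ev);
    reported = kernel_deliver(&ev);
    assert(reported);
}
```

The parallel with sigprocmask is that delivery is suppressed while the 
bit is set. The difference is that here setting and clearing the bit 
costs no trip across the user/kernel boundary.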

> Not what I said.  There is a way that the interfaces were designed, and 
> if you attempt to use them differently, they won't scale well, and you 
> will have performance issues like you describe.   I'm pretty sure the 
> problems you're having are best solved in a way other than blaming the 
> OS design.

Without understanding the requirements you're not qualified to offer an 
opinion one way or the other.

> Actually I was doing a project that involved deep packet inspection and 
> manipulation at gigabit speeds, operating from an ip-less bridge.  Not 
> really a client/server app though so you're right.

Not very compelling, unless you're going to tell me that you 
accomplished this using hardware running at only 8MHz, in which case 
I'll be duly impressed. Any fool can run something on a gigahertz CPU 
and get reasonable throughput. Absolute numbers aren't the quality 
metric, it's a matter of efficiency and that's only measured in 
percentages. I've tuned multiprocessor TCP stacks to get 99% of 
available bandwidth when the vendor's implementation only delivered 25%. 
Not even the original OS authors were able to approach those levels 
(Alliant and Thinking Machines, fyi) and they had full source code and 
hardware specs; I had a disassembler and assembler.

> slapd is well known for its usefulness, but not its stellar 
> performance.  The performance of slapd is the one thing I see most 
> criticized from what is otherwise a really remarkable program.

Times have changed. Between OpenLDAP 2.0 and 2.1 the work I did sped up 
slapd by a factor of 200. Between 2.1 and 2.2 I sped it up another 50%. 
The code we inherited from the UMich LDAP project was pretty piss-poor, 
but today's slapd totally outperforms everything else on the market.

A lot of people are most familiar with RedHat's distro, and all the way 
up to RedHat 9 they only bundled OpenLDAP 2.0.27. The poor reputation 
that OpenLDAP performance carries today is largely due to that. We have 
customers flocking to OpenLDAP now, away from Sun and Netscape, because 
they see the performance difference on their own actual loads.

>>Don't be like Sun and forget about signals. epoll() certainly is an 
>>improvement for some class of applications on Linux, but without a 
>>totally integrated kitchen-sink approach like kqueue/equeue you're going 
>>to go through all this churn and still have an unsolved event management 
>>problem.

> 2 sides to every coin.   There are some things that signals are great 
> for, but signals are often somewhat limited as well.  One could very 
> well argue that signals are SUPPOSED to be handled outside the regular 
> flow of a program.   Many people want to dispatch signals from their 
> event loops, and this is quite possible with epoll() or even select() - 
> the signal handler sets a flag and returns, but the system call is 
> interrupted anyway and you see the error code, and dispatch your code to 
> handle what the signal told you.   It works, sure, but why not have a 
> message queue or some file handle for signals if you are only going to 
> look at them in your event loop anyway?  Signals are supposed to work 
> outside the scope of program event loops.  They can interrupt anything 
> and everything (just about).
> 
> While I'm sure the idea of integrating signal handling with the event 
> loop is an exciting prospect for some, for others they'll say "what 
> for?"  .  One just handles the same thing at a different point in the 
> code, shifting slightly more or less to userspace with associated trade 
> offs.  I'm not going to say one way is better or worse, but there isn't 
> any loss in not having signals integrated with epoll.

You're not seeing the big picture. You seem to like programming with 
threads and yet you don't see the problem with leaving signals out of 
the equation. The fact that signals can interrupt everything means you 
can't just route them through a message queue, etc. And the fact that 
you can't control which thread receives a signal means that you can't 
rely on your event listener seeing the flag that your signal handler 
sets until some indeterminate time later.

Just to bring this conversation back home - trying to handle signals 
along with AES events is a royal pain. Something like kqueue will be a 
huge improvement. equeue is even more efficient.

> Now, having said all that, and being a big fan of Linux and not liking 
> BSD nearly as much, I will say that I do like the idea of using kqueue 
> for AES events in MiNT, although I'd like to see some more discussion of 
> the details of the implementation plan.   I don't really see your 
> equeue() as a good match for the needs of MiNT or the needs of the AES.

Obviously your vision isn't that good.

-- 
   -- Howard Chu
   Chief Architect, Symas Corp.       Director, Highland Sun
   http://www.symas.com               http://highlandsun.com/hyc
   Symas: Premier OpenSource Development and Support


