Niagara and network stacks, TCP and talloc: LCA Presentations Day 1 Morning

Overnight interlude: I spent all evening installing WordPress 2.0, and fixing up a few old posts for XHTML compliance. The new WYSIWYG editor is neat, but will lose chunks of unparsable markup (i.e. missed quotes and brackets). It’ll prolly be fine to use for new posts, but for the moment I’m sticking with writing straight HTML.

The whole AJAX interface thing is cool. I’m looking forward to the PHP5 talk this afternoon.

Of course, once I had that done, I decided to grab a new theme. This one’s pretty cool, although the whole lens thing is a bit weird…

And I’m now appearing on Planet 2006, although because I use the excerpt in all my posts to produce a clarification (or declarification) in Chinese Kung Fu Novel Chapter Synopsis Style, my posts end up being quite short on the site, while long on my page.

This morning’s keynote by David Miller was interesting. He maintains the Linux networking stack, and is also the sole maintainer of the Sparc64 port. So he actually gave three presentations: an overview of the recent changes in the Linux networking stack, a presentation about the Linux port to the new Sun Niagara CPU line, and a brief talk about how to actually deal with kernel maintainers. Lack of wireless there meant I didn’t get my laptop out, so I don’t have much more to say about it.

Well, I’ll talk about the new Sun chip, known as Niagara, UltraSPARC T1 or CoolThreads depending on whose marketing department you ask. It’s an 8-core CPU. Each core runs four threads in round-robin fashion, issuing from the ones that are ready to run and skipping any that are waiting on main memory, the FPU or otherwise. This means that any task which can actually _use_ 32 threads for integer-only code will be able to run fast. Kernel compiles are a prime example (looking forward to the kernels-per-second numbers for comparison to the 128 CPU PowerPC G5 box talked about at LCA05). This would also be very nice for video encoding, I suspect. Mind you, the small Sun Fire T1000 Server (shipping March 2006) lists at US$3495, so I doubt I’ll have an array of these to play with anytime soon… Imagine a Beowulf cluster of these things. ^_^
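
To convince myself why skipping stalled threads matters, here’s a toy model of a single core. It is only a sketch and not how the T1 pipeline really works: the function name, the stall pattern and the parameters `stall_every` and `stall_len` are all made up for illustration.

```c
#include <assert.h>

#define THREADS_PER_CORE 4

/* Toy model of one core: each cycle, issue one instruction from the next
 * ready thread in round-robin order, skipping threads that are stalled.
 * Made-up stall pattern: after every `stall_every` instructions a thread
 * issues, it is unready for the following `stall_len` cycles.
 * Returns the number of cycles in which the core issued an instruction. */
int busy_cycles(int nthreads, int cycles, int stall_every, int stall_len)
{
    int ready_at[THREADS_PER_CORE] = {0}; /* cycle at which each thread is ready */
    int issued[THREADS_PER_CORE] = {0};   /* instructions issued per thread */
    int next = 0;                         /* where the round-robin scan starts */
    int busy = 0;

    assert(nthreads >= 1 && nthreads <= THREADS_PER_CORE);
    assert(stall_every >= 1);

    for (int c = 0; c < cycles; c++) {
        for (int k = 0; k < nthreads; k++) {
            int t = (next + k) % nthreads;
            if (ready_at[t] <= c) {
                busy++;
                if (++issued[t] % stall_every == 0)
                    ready_at[t] = c + 1 + stall_len; /* thread now stalls */
                next = (t + 1) % nthreads;
                break;
            }
        }
    }
    return busy;
}
```

With these made-up numbers, `busy_cycles(1, 100, 1, 3)` comes out at 25 busy cycles, while `busy_cycles(4, 100, 1, 3)` comes out at 100: four threads are enough to hide a three-cycle stall completely.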

Morning tea interlude: Posters have gone up. There’s Thousand Parsec, WorldForge and FAI. I’ve looked at WorldForge and Thousand Parsec before, at LCA05, but if I have time tonight (Ha!) I might see where they’re up to these days. FAI on the other hand I’ve only been vaguely aware of, since I never seem to deploy more than one box at a time… But now it’s in my blog, so I’ll be able to find the link when I do want it.

Congestion Advancements with Ian McDonald. A technically-oriented delve into the new congestion control algorithm module structure for TCP, as touched upon by Dave Miller.

Ian presented the work done recently to generalise and modularise the congestion control algorithms for TCP in the Linux kernel, which had originally been kind of ad-hoc and wide-ranging in what they touched. The interface they use is fairly simple (if you know TCP backwards, that is ^_^) and they turn out to be per-socket switchable. This will allow much easier use of different algorithms, which are optimised for various combinations of high and low bandwidth, high and low latency, and timeout vs loss vs congestion vs drop situations.
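
Here’s a minimal sketch of what per-socket switching looks like from userspace, assuming the `TCP_CONGESTION` socket option documented in the Linux tcp(7) man page is the knob in question (that’s my assumption; the talk didn’t name it in my notes). The two helper names are mine.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>   /* TCP_CONGESTION */
#include <string.h>
#include <sys/socket.h>

/* Ask the kernel to use the named congestion control algorithm on this
 * one socket. Returns 0 on success, -1 with errno set on failure (for
 * example if that algorithm is not available on the running kernel). */
int set_congestion_control(int fd, const char *name)
{
    return setsockopt(fd, IPPROTO_TCP, TCP_CONGESTION,
                      name, (socklen_t)strlen(name));
}

/* Read back the name of the algorithm this socket currently uses.
 * buf is zeroed first and the last byte is never handed to the kernel,
 * so the result is always NUL-terminated. */
int get_congestion_control(int fd, char *buf, size_t buflen)
{
    socklen_t len;

    assert(buflen > 1);
    memset(buf, 0, buflen);
    len = (socklen_t)(buflen - 1);
    return getsockopt(fd, IPPROTO_TCP, TCP_CONGESTION, buf, &len);
}
```

Calling something like `set_congestion_control(fd, "reno")` on a freshly created TCP socket should then switch just that socket. Which names are accepted depends on what the running kernel has built in or loaded.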

He then presented his current research project, which is a TCP-like protocol (I think… Or was it a congestion-control algorithm?) called TCP-Nice, which is designed to back off from congestion so that the rest of the network functions as if it wasn’t there, while it uses all the left-over bandwidth… I like this, I’d love to see BitTorrent ported to use it. Then I could give free TCP-Nice traffic, and lower my TCP quotas significantly. ^_^ A vast improvement over my previous Second-Class Traffic plan.

He then presented a further, already live use of the modularised congestion control code in Linux: DCCP. This is a session-based, congestion-controlled (like TCP) but unreliable (like UDP) protocol, mainly intended for multimedia traffic, where you want as much as possible to get through and to back off (somewhat) under congestion, without doing retransmits and re-ordering, since retransmitting live data is a pain.

It’s in the final call for the RFC, and he’s already gotten it working. It’s in the 2.6.14 Linux kernel, with a NAT fix to come in 2.6.16. However, they still haven’t gotten the perfect congestion control algorithm for multimedia streams… The TCP-like CCID2 isn’t very good, the smoothed and slower-falling version CCID3/TFRC didn’t help much, and the latest attempt, MFRC, is currently too aggressive, and needs tuning to avoid killing other traffic under congestion conditions. But it’s getting there, and shows a lot of promise.

Netem: A last-five-minutes gem… It introduces loss, delay, reordering and duplication of packets on an intermediate box. It can currently only work on output queues.

Finally for the morning, Rusty Russell presented Talloc. Talloc was touched upon by Tridge in his “non-junk” code tour at LCA05, but he didn’t spend too much time on it, looking mainly instead at tdb and ldb…

Basically, talloc is a hierarchical pool allocator, which gives destructors, pools and hierarchy to your memory allocation calls. This means that managing your memory usage in C becomes sensible. It’s mainly been driven by Samba, which in fact produces huge whacks of memory allocation… Rusty showed a graph of it, I’ve no idea where to find it. (There was also a URL to the program to make such graphs, I missed that too. >_<) Anyway, it’s pretty impressive.
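
To make the “hierarchy plus destructors” idea concrete, here’s a toy sketch in plain C. This is NOT talloc’s real API or implementation: the `h_alloc`, `h_set_destructor` and `h_free` names, the chunk layout and the demo destructor are all invented for illustration, and the real library does a great deal more.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy hierarchical allocator (not talloc): every allocation has a hidden
 * header linking it into a tree, so freeing a parent frees its subtree. */
struct chunk {
    struct chunk *parent, *child, *next; /* tree links */
    void (*destructor)(void *ptr);       /* optional, run before freeing */
};

#define PAYLOAD(c) ((void *)((c) + 1))
#define CHUNK(p)   ((struct chunk *)(p) - 1)

/* Allocate `size` bytes as a child of `parent` (NULL for a new root). */
void *h_alloc(void *parent, size_t size)
{
    struct chunk *c = calloc(1, sizeof(*c) + size);
    if (!c)
        return NULL;
    if (parent) {
        c->parent = CHUNK(parent);
        c->next = c->parent->child;
        c->parent->child = c;
    }
    return PAYLOAD(c);
}

void h_set_destructor(void *ptr, void (*fn)(void *))
{
    CHUNK(ptr)->destructor = fn;
}

/* Run this chunk's destructor, then free all of its children, then it. */
static void free_tree(struct chunk *c)
{
    if (c->destructor)
        c->destructor(PAYLOAD(c));
    while (c->child) {
        struct chunk *kid = c->child;
        c->child = kid->next;
        free_tree(kid);
    }
    free(c);
}

/* Free `ptr` and everything allocated beneath it. */
void h_free(void *ptr)
{
    struct chunk *c = CHUNK(ptr);
    if (c->parent) { /* unlink from the parent's child list first */
        struct chunk **pp = &c->parent->child;
        while (*pp != c)
            pp = &(*pp)->next;
        *pp = c->next;
    }
    free_tree(c);
}

/* Demo destructor: just counts how many chunks have been torn down. */
int destroyed_count;
void count_destroyed(void *ptr) { (void)ptr; destroyed_count++; }
```

The point is the calling pattern: allocate a context with `h_alloc(NULL, ...)`, hang everything related off it, and a single `h_free()` on the context tears down the whole tree and runs each destructor along the way, so there’s no hunting for the matching free of every little allocation.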

Andrew Bartlett pointed out last week that he’s using talloc in Samba 4 to trivially wrap the krb5-allocated blobs coming out of the Kerberos libraries. This basically gets him free destructors, solving the nasty lifetime problems that Kerberos’s allocation and free activities otherwise bring.

nfsim uses talloc to simulate kmalloc, providing simple and easy kernel memory leak detection in the netfilter modules being tested. Also has a very neat graphical live talloc allocation tree display. I think that is really neat!

Now to find myself a project to use talloc on… That’s also what I said last year about tdb, as it happens. I actually have one for the latter… I want to unbone FreeRADIUS’s IP Pool module, specifically so I don’t have to kill FreeRADIUS to make changes to the pools. I just didn’t get it done in the last 9 months. Gah.

In the more general programming talk at the beginning of the talloc presentation, Rusty suggested that interfaces should be hard to misuse first, easy to use second. He also suggested the following list of tools as being of great importance:

  • distcc
  • ccache
  • ccontrol – This one’s new to me. In fact, I’m still not clear what it does…
  • Mercurial – Source control tool. I’ve not tried it, but Alan DeKok from FreeRADIUS uses it for his own development, and then breaks up the patches for shoving into CVS for the rest of us…
