Conman Laboratories

Better living through software …

Fixing the bugs in Linux 2.0 AppleTalk

Mark Grosberg

While getting one of my Linux 2.0 file servers to serve files via AppleTalk, I discovered a bug in the Linux AppleTalk kernel code:

Restarting the user-mode daemons (atalkd and afpd) required rebooting the machine.

Because I don't want to reboot my machine every time I change the configuration of the daemons, I developed a fix in the Linux kernel. For educational purposes, I describe that fix here.

I do not run modular kernels because of the security risks involved, so I had no simple fix that could be applied from outside the kernel (such as unloading and reloading the AppleTalk module).

Step 1: Determining the problem.

When debugging something like the kernel, it is usually helpful to litter things with printk() calls to get an idea of the flow of control.

I traced the error back in the user mode code to a bind system call returning an EADDRINUSE error. So my first step in isolating the problem was to put a distinctive printk() in each place where the kernel code for AppleTalk could return such an error.
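The tagging idea can be sketched in userspace as follows. This is only an illustration of the technique, not the kernel code: fprintf() stands in for printk(), and model_bind(), its two checks, and the TRACE_SITE macro are hypothetical names invented for the example.

```c
#include <errno.h>
#include <stdio.h>

/* Userspace stand-in for printk(); in the kernel you would call
 * printk() directly with a distinctive message at each site. */
#define TRACE_SITE(tag) \
    fprintf(stderr, "model_bind: EADDRINUSE at %s (line %d)\n", (tag), __LINE__)

/* Hypothetical bind path with two places that can fail with
 * EADDRINUSE. The two flags are illustrative inputs, not kernel state. */
int model_bind(int port_taken, int addr_taken)
{
    if (port_taken) {
        TRACE_SITE("port check");
        return -EADDRINUSE;
    }
    if (addr_taken) {
        TRACE_SITE("address check");
        return -EADDRINUSE;
    }
    return 0;
}
```

Because every return site prints a different tag, a single failing run is enough to tell which check produced the error.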

After performing this experiment, I realized that the following code was where the error was being returned:

if (atalk_find_socket(addr)!=NULL)
  return -EADDRINUSE;

Apparently, the code was trying to bind a socket to an address that another socket in the system already held.
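To make the check concrete, here is a small userspace model of a socket list with a lookup and a bind that refuses duplicates. It is a sketch under assumptions: the structure layout, field names, and the model_ functions are invented for illustration and are not the kernel's definitions.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (illustrative only). */
struct model_addr {
    unsigned short net;
    unsigned char  node;
    unsigned char  port;
};

struct model_sock {
    struct model_addr  addr;
    struct model_sock *next;
};

static struct model_sock *model_socket_list = NULL;

/* Walk the list looking for a socket bound to exactly this address. */
struct model_sock *model_find_socket(const struct model_addr *addr)
{
    struct model_sock *s;

    for (s = model_socket_list; s != NULL; s = s->next) {
        if (s->addr.net == addr->net &&
            s->addr.node == addr->node &&
            s->addr.port == addr->port)
            return s;
    }
    return NULL;
}

/* Refuse the bind if the address is taken, otherwise link at the head. */
int model_bind(struct model_sock *sk, const struct model_addr *addr)
{
    if (model_find_socket(addr) != NULL)
        return -EADDRINUSE;
    sk->addr = *addr;
    sk->next = model_socket_list;
    model_socket_list = sk;
    return 0;
}
```

In this model, two binds to network 255, node 21, socket 6 collide, while a bind to network 0, node 0, socket 6 does not, because the full address is compared.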

Step 2: A few wrong turns …

So, my initial guess was that some sockets were being left over even after the AppleTalk daemons exited. The immediate solution to this problem was to allow user-mode code to clear out the socket table. I knew that IPv4 forwarding could be controlled via the /proc file system, so I wrote some skeletal code to handle writes to a file /proc/sys/net/appletalk/ddp_reset.

This required two things:

  1. Add an entry to the sysctl table for AppleTalk.
  2. Define a handler procedure.

The body of the handler did the following:

while (atalk_socket_list != NULL)
  atalk_destroy_socket(atalk_socket_list);

I derived this code by studying atalk_find_socket(), which showed me that the socket list is implemented as a singly-linked list.

I compiled the kernel, rebooted, and tried to run my reset code. It did clear out the socket table (which, as I confirmed via printk(), was empty afterwards), but the problem of restarting the AppleTalk daemons still existed.
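The loop only terminates if destroying a socket also unlinks it from the head of the list. The userspace sketch below models that behavior; it assumes the destroy routine removes the socket from the list before freeing it, and all model_ names are hypothetical stand-ins rather than kernel functions.

```c
#include <stddef.h>
#include <stdlib.h>

struct model_sock {
    int                port;
    struct model_sock *next;
};

static struct model_sock *model_socket_list = NULL;

/* Add a heap-allocated socket at the head of the list.
 * Returns 0 on success, -1 if allocation fails. */
int model_add_socket(int port)
{
    struct model_sock *s = malloc(sizeof(*s));

    if (s == NULL)
        return -1;
    s->port = port;
    s->next = model_socket_list;
    model_socket_list = s;
    return 0;
}

/* Unlink sk from the list (if present) and free it. */
void model_destroy_socket(struct model_sock *sk)
{
    struct model_sock **pp;

    for (pp = &model_socket_list; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == sk) {
            *pp = sk->next;
            break;
        }
    }
    free(sk);
}

/* Same shape as the handler body: keep destroying the head until the
 * list is empty. Returns the number of sockets destroyed. */
int model_reset_sockets(void)
{
    int n = 0;

    while (model_socket_list != NULL) {
        model_destroy_socket(model_socket_list);
        n++;
    }
    return n;
}
```

If the destroy routine did not unlink the socket, the while loop would spin forever on the same head pointer, so the pattern depends on that side effect.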

To figure out what was going on I put some logging in the loop of atalk_find_socket(). This allowed me to see that during the first (and error-free) startup of atalkd, the network and node addresses of the socket being bound were zero.

During a failed attempt at starting atalkd, however, the node was 21 and the network was 255. This caused a duplicate socket name to be present because atalkd binds to socket 6 (the Zone Information Protocol) twice. The first time, it assumes the address is network 0, node 0; it then assigns the interface a different network and node and re-binds socket 6 under that address.

The problem is that the interface's address is not cleared back to zero when atalkd exits, so a subsequent atalkd start-up finds the old address still in place.
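The failure sequence can be reproduced in a small userspace model. This is a sketch, not the kernel logic: it assumes that a bind to network 0 is resolved to the interface's current address (which is how the logged addresses above read), and every model_ name, the fixed-size table, and the single interface are simplifications made for the example.

```c
#include <errno.h>
#include <string.h>

#define MODEL_MAX_SOCKS 8
#define ZIP_SOCKET      6

struct model_addr {
    unsigned short net;
    unsigned char  node;
    unsigned char  port;
};

/* One interface and a small socket table stand in for kernel state. */
static struct model_addr model_iface;   /* port field unused */
static struct model_addr model_socks[MODEL_MAX_SOCKS];
static int model_nsocks = 0;

/* ASSUMPTION: network 0 means "use the interface's current address". */
int model_bind(unsigned short net, unsigned char node, unsigned char port)
{
    int i;

    if (net == 0) {
        net  = model_iface.net;
        node = model_iface.node;
    }
    for (i = 0; i < model_nsocks; i++) {
        if (model_socks[i].net == net &&
            model_socks[i].node == node &&
            model_socks[i].port == port)
            return -EADDRINUSE;
    }
    if (model_nsocks == MODEL_MAX_SOCKS)
        return -ENOMEM;
    model_socks[model_nsocks].net  = net;
    model_socks[model_nsocks].node = node;
    model_socks[model_nsocks].port = port;
    model_nsocks++;
    return 0;
}

/* Emptying the socket table only. */
void model_clear_sockets(void)
{
    model_nsocks = 0;
}

/* Zeroing the interface address as well. */
void model_clear_iface(void)
{
    memset(&model_iface, 0, sizeof(model_iface));
}

/* Mimic atalkd start-up: bind ZIP on network 0, configure the
 * interface, then bind ZIP again under the new address. */
int model_start_atalkd(void)
{
    int err = model_bind(0, 0, ZIP_SOCKET);

    if (err)
        return err;
    model_iface.net  = 255;
    model_iface.node = 21;
    return model_bind(255, 21, ZIP_SOCKET);
}
```

In this model the first start succeeds. After clearing only the socket table, the second start fails with -EADDRINUSE, because both binds now resolve to network 255, node 21. Clearing the interface address too lets the start succeed again.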

Making it work

After determining the problem, it was a trivial matter to write some kernel code to clean the interface list when ddp_reset is written to. The relevant code is as follows:

struct atalk_iface *iface;

for(iface = atalk_iface_list; iface != NULL; iface = iface->next)
{
  iface->status         = 0;
  iface->address.s_net  = 0;
  iface->address.s_node = 0;
}

This code must run with interrupts disabled (that is, under the protection of cli()). It simply walks the interface list (which is usually very short) and resets a few bytes to zero.

I then added a line to the shell script that brings up the AppleTalk servers so that it resets the DDP stack in the kernel before starting the daemons.

This allows them to be restarted at any time without resorting to a modular kernel (which I consider a security risk) or rebooting a critical server.

My new SYSCTL table for Linux AppleTalk
