Intro – running a cluster
So I had assembled this computing cluster thing based on execnet for a CS project. The slave nodes were dozens of student lab computers. Those are running firewalls blocking all ingress traffic, not just from the big bad internet but also from the lab networks — which contain only trusted machines and are already protected by a firewall restricting flow to and from larger networks!
Thing is, I need access to the SSH daemons running on those nodes. And I do not (and should not) have root on those machines…
Solving this problem on OSI Layer 8 proved to be an inefficient approach. The best way to fight bureaucracy is to not take it seriously. A bureaucracy (almost by definition) is not creative and cannot keep up with workarounds. To admit the existence of creative, necessary workarounds to achieve legitimate ends is to admit defeat, and that is why workarounds usually get ignored. It’s ignore and counter-ignore, which is a bit sad since there is (or should be) a common goal: education, in this case. But I digress. The solution proved to be to ignore layer 8 and solve the problem on OSI layer 4. That certainly saved a lot of keystrokes!
To punch through the firewall, I have each slave node initiate a reverse tunnel (ssh -R) to a central node, using pubkey authentication. This way the otherwise unreachable SSH daemons on the slave nodes are proxied on the master node and are reachable via a socket on the master node’s loopback interface. Each slave node picks its reverse tunnel endpoint port based on its IP. They connect automatically, and with some clever but too-kludgy-to-show scripting I can dynamically update my running clusterfied program with newly joining slave nodes to export compute jobs to.
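As a rough illustration of the idea, here is a minimal sketch of how a node could derive its port and build the tunnel command. The mapping from IP to port (base 20000 plus the last two octets) is my assumption for the example, not the scheme actually used, and it can collide when third octets differ by a multiple of 16. The function names and the host central.example.org are hypothetical too.

```shell
#!/bin/sh
# Hypothetical mapping from a node's IPv4 address to a tunnel port.
# The body runs in a subshell so the IFS change does not leak out.
port_for_ip() (
    IFS=.
    set -- $1                                  # split into four octets
    echo $(( 20000 + ($3 % 16) * 256 + $4 ))   # range 20000..24095
)

# Print the reverse-tunnel command for a node.
#   $1 = the node's own IP, $2 = account@central-node
# -N asks for no remote command; -R forwards a port on the central
# node back to the SSH daemon (port 22) on the node itself.
tunnel_command() {
    printf 'ssh -N -R %s:localhost:22 %s\n' "$(port_for_ip "$1")" "$2"
}

# Example: tunnel_command 10.0.3.17 chroot@central.example.org
```

With a tunnel like that in place, the node’s SSH daemon would be reachable from the central node with something like ssh -p 20785 localhost.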
Obviously, the slave nodes need to log in to the central node via SSH, and they all do so using a single account on the central node. But I don’t fully trust the slave nodes. I want them to be able to initiate reverse tunnels, and not do much else. Here’s how I approached this problem, with the server running OpenSSH 5.3p1.
Chroot
Chrooting comes to mind. With a chroot, you can limit the filesystem locations the user can access to some remote uninteresting branch of the tree.
From man 5 sshd_config:
ChrootDirectory Specifies a path to chroot(2) to after authentication. This path, and all its components, must be root-owned directories that are not writable by any other user or group. After the chroot, sshd(8) changes the working directory to the user's home directory. The path may contain the following tokens that are expanded at runtime once the connecting user has been authenticated: %% is replaced by a literal '%', %h is replaced by the home directory of the user being authenticated, and %u is replaced by the username of that user.
I set up a group for that and use the following ‘Match’ stanza in the central node’s sshd_config:
Match Group chrootage
ChrootDirectory %h
%h is the matched user’s homedir.
I set up a user named ‘chroot’ with a not-so-regular home directory. More on that home directory later on. This user needs a valid shell, even though it will never run that shell (I do not want it to run one). Without a valid shell the user will not be able to log in.
ForceCommand
From man 5 sshd_config:
ForceCommand Forces the execution of the command specified by ForceCommand, ignoring any command supplied by the client and ~/.ssh/rc if present. The command is invoked by using the user's login shell with the -c option. This applies to shell, command, or subsystem execution. It is most useful inside a Match block. The command originally supplied by the client is available in the SSH_ORIGINAL_COMMAND environment variable. Specifying a command of ``internal-sftp'' will force the use of an in-process sftp server that requires no support files when used with ChrootDirectory.
Well, that ought to suit my purposes. Ideally, this command would be a rather boring and useless one that just sits there waiting for input. I could copy cat and the libs it links to over to the chroot directory. I might as well make it the login shell for the chroot user. But the user would then still be able to use sftp. And if I wanted to disable sftp, I would have to do so server-wide! The other users and I use sftp extensively, so I’d then need to run a second SSH daemon without sftp subsystems especially for this purpose. Hmmm.
Another approach is to limit the user to the internal sftp subsystem. An added benefit is that I do not need to maintain any binaries and libs in the chroot. With this, the user logging in with SSH is presented with a boring interpreter in which he cannot launch any programs: it’s the sftp subsystem. Port forwarding, which is needed for the reverse tunnel, will work. SFTP will work, obviously. Let’s do it.
sshd_config:
Subsystem sftp internal-sftp
Match Group chrootage
ChrootDirectory %h
ForceCommand internal-sftp
Make SFTP boring — chrooted home directory contents
So what do we need inside the chroot directory? No binaries or device nodes, that is one of the advantages of using the internal-sftp subsystem.
We do need an authorized_keys file readable by the chrooted user, since this user won’t be able to log in using pubkey auth otherwise. We can’t use AuthorizedKeysFile inside a Match stanza, so authorized_keys will have to reside in the default location: the .ssh subdirectory of the user’s homedir. I will use authorized_keys to restrict the hosts this user can log in from to the domain of the slave nodes, so obviously the user should not be able to add keys or lift limitations; authorized_keys will need to be read-only for him. This is what I ended up with.
# ls -Rla /path/to/chrooted_homedir
/path/to/chrooted_homedir/:
total 16
drwxr-xr-x 3 root root 178 May 8 16:36 .
drwxrwxrwt 9 root sys 4887 May 8 16:36 ..
dr-xr-x--- 2 root chrootage 189 May 8 16:22 .ssh

/path/to/chrooted_homedir/.ssh:
total 12
dr-xr-x--- 2 root chrootage 189 May 8 16:22 .
drwxr-xr-x 3 root root 178 May 8 16:36 ..
-rw-r----- 1 root chrootage 1221 May 8 14:25 authorized_keys
This is all the user can view over sftp. It is minimal and extremely boring. Note that there is no place the user can write to. Even if he had some way of launching programs (he hasn’t, due to ForceCommand), there is no place to upload them to and launch them from in the first place.
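For reference, a layout like the one in the listing could be produced with something along these lines. This is only a sketch: the helper name build_chroot_home is made up for the example, and the ownership step is skipped unless the script runs as root and the group already exists (I use getent for that check, which is an assumption about the platform).

```shell
#!/bin/sh
# Sketch: build the minimal chroot home directory described above.
# Usage (hypothetical helper): build_chroot_home DIR [GROUP]
build_chroot_home() {
    dir=$1
    group=${2:-chrootage}

    # Create the tree and the key file before tightening permissions.
    mkdir -p "$dir/.ssh" || return 1
    touch "$dir/.ssh/authorized_keys" || return 1

    # Only root can hand ownership to root:GROUP; skip this step otherwise.
    if [ "$(id -u)" -eq 0 ] && getent group "$group" >/dev/null 2>&1; then
        chown root:root "$dir"
        chown "root:$group" "$dir/.ssh" "$dir/.ssh/authorized_keys"
    fi

    chmod 640 "$dir/.ssh/authorized_keys"   # -rw-r-----
    chmod 550 "$dir/.ssh"                   # dr-xr-x---
    chmod 755 "$dir"                        # drwxr-xr-x
}
```

Run as root with the real home directory path, this gives the root-owned, non-writable tree that ChrootDirectory insists on. The public keys still have to be put into authorized_keys by root afterwards.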
On to authorized_keys:
no-pty,from="*.science.uu.nl" ssh-rsa AAAAB3and-the-rest-of-the-pubkey
no-pty serves to prevent the allocation of a terminal. I don’t think it offers any security beyond the measures already in place, but I like to think this saves resources.
Tunneling risks
I need TCP forwarding to get the reverse tunnels to work. That means that using the account, users can set up arbitrary TCP tunnels, using the master node as a proxy… Less nice. But wait – can’t I use PermitOpen inside Match to limit the creation of arbitrary tunnels? Yes. But the reverse tunnel is subjected to the same restrictions. Therefore, I would need a PermitOpen for every slave node and I’d need to know the source port of the incoming client connection in advance. There’s no wildcarding in PermitOpen, so this is infeasible.
‘Match’ directive precedence
The order appears to be significant. That’s also what the manpage seems to hint at where it says
Match: Introduces a conditional block. If all of the criteria on the Match line are satisfied, the keywords on the following lines override those set in the global section of the config file, until either another Match line or the end of the file.
It’s not as clear as it could be what happens to keywords in the next Match block. What if the next Match block matches as well? Nothing! The later Match blocks don’t matter! In my tests, incoming connections appeared to be matched only once. The Match blocks have no relation to each other, only to the main config.
That implies that you need to put the most specific match directives — the ones with the most patterns — on top, and that each Match block needs to hold any and all keywords you want to specify beyond the main config. It’s not CSS ;-)
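To make that concrete, here is a sketch of how two Match blocks could be ordered in sshd_config. The address range, the extra keywords and their values are hypothetical and only there to illustrate the layout: the most specific block comes first, and each block spells out every keyword it relies on instead of counting on a later block.

```
# Most specific block first: matches on both user and source address.
# (192.0.2.0/24 is a placeholder range.)
Match User chroot Address 192.0.2.0/24
    ChrootDirectory %h
    ForceCommand internal-sftp
    AllowTcpForwarding yes
    X11Forwarding no

# Less specific block: repeats every keyword it needs.
Match Group chrootage
    ChrootDirectory %h
    ForceCommand internal-sftp
    AllowTcpForwarding no
    X11Forwarding no
```

Written this way, the outcome does not depend on whether sshd stops at the first matching block or keeps evaluating the later ones.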

