My Sysadmin Toolbox
After seeing lots of these at Linux.com recently, I thought I’d try to come up with my own list. I used to be a sysadmin (I’m now a programmer), and I’ve long felt that you really need a good set of tools (and to know how to use them) in order to be most productive.
I spend the vast majority of my time at a command line. Zsh ensures I make best use of my time. If you’ve used bash, you might think you know what completion is—press the tab key and it fills out file names for you. But zsh takes it to a whole new level. Not only does it complete file names, but also users, hostnames, option flags, environment variables, PIDs and more. On top of that, it does it in a context-sensitive manner. So if you type in “chown ” and press TAB, it starts completing usernames. Type in a space and another TAB and it starts completing file names again.
On top of that, it allows partial completion. If I type in “/u/l/e/r/” and press TAB, it gets expanded to “/usr/local/etc/rc.d/”. This is phenomenally useful.
But it’s not just completion that zsh is good at. It’s also good at globbing. That’s turning wildcards into filenames. In addition to the usual forms of globbing, zsh can glob recursively. So if I want to look for “foobar” in all my files (but not directories), I can do:
% grep foobar **/*(.)
The “**/*” is the recursive glob, and the “(.)” limits it to files and not directories. You can also limit by user, by timestamp and a few other things.
This is only scratching the surface of zsh. Suffice it to say that if you make heavy use of the command line, investing some time in learning zsh will make you vastly more productive.
This was mentioned on a few of the other lists as well. GNU Screen on the face of it doesn’t do anything. You just end up with another command line when you first run it. But the beauty of it is that if you get disconnected, you can just log back in, run “screen -d -r” and pick up exactly where you left off. For me, this is ideal, given the flakiness of my home wireless network. But you might want to use it so you can shut down your PC at night and pick up where you left off in the morning.
On top of that, screen lets you run multiple command lines at once inside it, log the output and cut’n’paste between them. Think of it as a safety harness for your work.
Rsync is one of the closer things to magic that’s around. It’s a simple file copying utility. But the clever bit is that it only copies the things that have changed. This doesn’t sound like much until you’ve edited several files in a collection which is 200MB and needs to be on another box. When rsync tells you it’s finished and only transferred 10KB instead of 200MB, you’ll really come to appreciate it.
If you’re still using tar/gzip or zip to create an archive to ship to another computer, stop wasting your time and disk space. Learn to use rsync and your life will be far more pleasant.
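A minimal sketch of that kind of copy, wrapped in a small helper. The function name sync_tree and the host name in the usage note are made up for illustration; the rsync flags themselves (-a, -z, --stats) are standard.

```shell
# sync_tree SRC DEST: mirror the contents of SRC into DEST,
# sending only the data that has changed since the last run.
#   -a       recurse and preserve permissions, times and symlinks
#   -z       compress data in transit (mostly useful over a network)
#   --stats  print a summary, including how much was actually transferred
sync_tree() {
    rsync -az --stats "$1"/ "$2"/
}
```

Something like “sync_tree ~/project otherbox:project” (otherbox being a hypothetical remote host reachable over ssh) would then ship only the differences on each run, and the --stats summary shows how little was sent.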
Thankfully, ssh is pretty ubiquitous these days. It seems to have mostly worked in its mission to eliminate telnet. But it has a few tricks that are worth knowing about.
First, the agent. One of the nice things about ssh is that it doesn’t have to rely on sending passwords around. Instead, you can use public key authentication. However, even typing in your passphrase can get pretty tedious after a while for every connection. Enter the ssh-agent. Just stick “eval `ssh-agent`” in your startup scripts and then run “ssh-add” once. After that, you don’t get asked for your passphrase any more. The only caveat is that now you really need to lock your screen when you walk away from it.
Next are the tunnels. Ssh is able to create network “tunnels” in and out of otherwise secure locations. This is very handy for creating ad-hoc networks. For example, I’m allowed to ssh into my work, but not to anything else. Yet, I can use RDP to connect to my workstation by running this command:
% ssh -L3389:myworkstation:3389 firewall.mywork.com
% rdesktop localhost
That says: listen on port 3389, and any connections that come in, forward them on to myworkstation port 3389 from the other side of the ssh connection to firewall.mywork.com.
If you’re on Windows, check out PuTTY. It’s got all the features, but wrapped up in a nice GUI.
Lsof (List Open Files) is one of the first diagnostic tools that I reach for when I need to understand something. The purpose is simple: it tells you what files (and network connections) a process has open. If you’re wondering where a process is logging to, this might be able to tell you. Conversely, it can also tell you which processes have a particular file open (usually a lock file).
On Solaris, the pfiles command is similar.
On a related note, FreeBSD also has the very, very useful sockstat command, which lists all open sockets and what processes hold them open. The useful bit is that it does this without needing to be root (unlike Linux’s
These are the second diagnostic tool that I reach for when something’s not right. Unix operating systems have a very clear distinction between userland and kernel, and these tools show all the points where a program crosses between the two (makes a system call). If you really want to know how a program is interacting with its environment, these tools will tell you. They’re good for answering questions like:
- What files is this process opening and closing?
- What connections to the network are being made?
- What’s been read in by this program?
A recent addition to my toolbox. It’s like “tail -f”, except that it looks at more than one file at once. It also does highlighting of search terms. Dead handy.
Most of the stuff I do these days involves the web. Curl is a fantastic little tool for inspecting the web from the command line. It covers all the protocols you need, and can dump out any information about the transaction. Want to issue a PUT request to an SSL server, verifying the certificate and specifying basic auth? It’s got you covered.
Everybody needs a good editor. Vim isn’t the only choice, but it’s pretty likely to be available wherever you go. And once you’ve started learning how to use it properly, you won’t go back. In particular, I can’t live without “^N” for doing completion inside the file you’re editing.
If you’re still using su, then you need help. Sudo allows you to dole out root access on a much more granular level and you get proper logging of who did what. If you haven’t looked at the manual recently, then check out “sudo -e” for editing a file as another user. It ensures that you get your regular editor (vim or Emacs) instead of the incredibly unhelpful ed that root probably defaults to.
Everybody needs version control. If you’re editing files, stick them in subversion. You won’t regret it. Particularly when you need to see what those files looked like 6 months ago.
Every now and again, you need to deal with mailbox files. Mutt is a great choice for that, thanks to its mini language for filtering mail. Need to delete all mail over 10 days old sent by cron@somehost? Not a problem. Even if you don’t use it on a regular basis, it’s worth getting familiar with.
Yes, this is a programmer’s tool. But it’s worth knowing a tiny amount about if you’re a sysadmin as well. What for? It lets you see why something dumped core. If you find a core file, then do “file core” to see what program left it behind and then “gdb /path/to/program core”. When you’re inside gdb, type in “where” and it will (most of the time) give you a stack trace, showing what it was doing at the time of the crash. This is normally a big help in trying to figure out what went wrong.
You can also use gdb to find out what a running program is doing by specifying a PID instead of a core file.
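If you only want the stack trace and not an interactive session, the same steps can be scripted. This is a rough sketch, assuming a gdb recent enough to support the -batch and -ex options; the function names are made up for illustration.

```shell
# core_backtrace PROGRAM CORE: print a stack trace from a core file and exit.
#   -batch     run the given commands, then quit without prompting
#   -ex where  the command to run (same as typing "where" at the gdb prompt)
core_backtrace() {
    gdb -batch -ex where "$1" "$2"
}

# pid_backtrace PID: attach to a running process, print its stack, detach.
# Attaching usually requires being the process owner or root.
pid_backtrace() {
    gdb -batch -ex where -p "$1"
}
```

For example, “core_backtrace /path/to/program core > trace.txt” leaves a trace you can attach to a bug report.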
If you perform the same series of actions more than a couple of times, you should consider investing some time in automating the process. Shell scripts are handy, but can only go so far. Learning one of these languages will give you a really powerful ability to write your own sysadmin toolbox.
Documentation. Everybody hates doing it. Why not make it as easy as possible? A wiki is the answer to that, and mediawiki is one of the better pieces of wiki software out there. It’s pretty simple to get going (although it does depend on MySQL).
Remember: getting it documented is first priority. Once the information is in the wiki, it can be restructured later. So long as the information is there, it will be searchable (and hence useful).
Hmmm, that’s a bit more than the 10 they wanted. But it’s a large portion of my regular toolkit. Hopefully there’s something useful for other people in there as well…
Normally, subversion uses a builtin diff command to show you your changes. This works pretty well most of the time, except that you can’t tell it to ignore whitespace-only changes. How to do this isn’t spelled out 100% clearly in the FAQ, so here’s what I found out:
- Make a shell script ~/bin/svn_diff. It should look like this:
#!/bin/sh
exec diff -bu "$@"
- Make it executable:
chmod +x ~/bin/svn_diff
- Put these lines in
[helpers]
diff-cmd = /home/dom/bin/svn_diff
Naturally, you can adjust the diff flags to taste in the
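To see what the -b flag in that wrapper buys you, here is a small self-contained check using two throwaway files (the file names are arbitrary). It compares diff’s exit status with and without -b on a whitespace-only change.

```shell
# Two files that differ only in the amount of whitespace.
tmp=$(mktemp -d)
printf 'int  x = 1;\n' > "$tmp/old.c"
printf 'int x = 1;\n'  > "$tmp/new.c"

# A plain unified diff reports a difference (exit status 1)...
plain=0
diff -u "$tmp/old.c" "$tmp/new.c" >/dev/null || plain=$?

# ...while -b ignores changes in the amount of whitespace (exit status 0).
ignoring=0
diff -bu "$tmp/old.c" "$tmp/new.c" >/dev/null || ignoring=$?

echo "plain=$plain ignoring=$ignoring"
rm -rf "$tmp"
```

Swapping -b for -w in the wrapper would ignore whitespace entirely rather than just changes in its amount.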
I’ve just released a new version of subatom, which now gets linking correct if you have viewcvs or similar installed. Unfortunately, I’ve now realised that the number of command line arguments is such that I’m going to have to start adding a configuration file to it. My crontab lines are looking a bit unwieldy.
Nonetheless, if you want to generate atom feeds of your subversion repository, give it a try. It seems to work quite well.
I’ve just noticed in the pugs::hack document a useful tip for informing subversion about the charset of a file.
svn propset svn:mime-type "text/plain; charset=UTF-8" myfile
A quick browse through the subversion list archives shows that it appears to be “not forbidden” even if not 100% officially sanctioned. It’ll certainly work for its intended purpose of displaying character sets correctly when browsing a repository via HTTP.
As noted in my talk earlier, SVK is a really handy tool for building remote operation on top of subversion. If you haven’t tried it yet, you need to. I’ve just tried it and been absolutely blown away by its mirroring abilities.
I committed a change on a local copy of a mirror of XML-Genx. I then said “svk push” and saw the changes go back from the mirror to my master repository. I ran “svn update” in a working copy attached to the master repository and there were the changes. This rocks. Everybody needs SVK!
Subversion for CVS Users
Today I gave a little presentation on Subversion for CVS Users at work. We’re switching to subversion for new projects rather than having a “big bang” repository conversion approach.
Why not RSS? Well, basically I spotted XML::Atom first and it was dead easy to use. And Atom just happens to be new and shiny right now.
I tried running subclipse a little while ago when I first started looking at Eclipse. I couldn’t make it work. It would just hang on commit. However, this morning, I ran the automatic update in Eclipse and noticed that a new version of subclipse was available (0.9.27). I upgraded, and tried it out.
And it works, beautifully! Full subversion integration inside Eclipse. It’s lovely! Although by now I’ve gotten used to how the CVS integration works, so it feels a little clunky to have to use the SVN Browsing perspective instead of simply saying “new project; checkout from cvs.” But that’s a minor quibble.
subversion crash, redux
After much fiddling, upgrading, dumping, restoring, I’ve realised that I don’t have a clue what the problem in subversion is. What’s really irritating is that when I compile it with debugging, the problem goes away. So, I’ve accepted that as a solution for a moment, disconcerting as it may be. At least I’ll be able to checkout without crashing.
Normally I run subversion under apache as I like being able to get at my stuff from anywhere. But recently, some upgrade has broken it. I’ve now started seeing broken checkouts. This is most disconcerting. For now, I’ve switched to accessing it over the filesystem which seems to work ok. But an update on that mailing list post…
I’ve managed to get a stack trace. I switched to gdb 5.3 instead of the system default (6). And that managed to get me this stack trace:
#0 0x0807a015 in core_output_filter ()
#1 0x285c3c0d in logio_out_filter () from /usr/local/libexec/apache2/mod_logio.so
#2 0x0805d253 in chunk_filter ()
#3 0x080744c0 in ap_content_length_filter ()
#4 0x08061757 in ap_byterange_filter ()
#5 0x285d130e in expires_filter () from /usr/local/libexec/apache2/mod_expires.so
#6 0x2820d35d in apr_brigade_write () from /usr/local/lib/apache2/libaprutil-0.so.9
#7 0x2820d9f2 in apr_brigade_vprintf () from /usr/local/lib/apache2/libaprutil-0.so.9
#8 0x289c42b7 in send_xml () from /usr/local/libexec/apache2/mod_dav_svn.so
#9 0x289c5049 in upd_change_xxx_prop () from /usr/local/libexec/apache2/mod_dav_svn.so
#10 0x289da9da in change_file_prop () from /usr/local/lib/libsvn_repos-1.so.0
#11 0x289dacae in delta_proplists () from /usr/local/lib/libsvn_repos-1.so.0
#12 0x289db73a in update_entry () from /usr/local/lib/libsvn_repos-1.so.0
#13 0x289db19c in delta_dirs () from /usr/local/lib/libsvn_repos-1.so.0
#14 0x289db872 in update_entry () from /usr/local/lib/libsvn_repos-1.so.0
#15 0x289db19c in delta_dirs () from /usr/local/lib/libsvn_repos-1.so.0
#16 0x289db872 in update_entry () from /usr/local/lib/libsvn_repos-1.so.0
#17 0x289db19c in delta_dirs () from /usr/local/lib/libsvn_repos-1.so.0
#18 0x289db872 in update_entry () from /usr/local/lib/libsvn_repos-1.so.0
#19 0x289db19c in delta_dirs () from /usr/local/lib/libsvn_repos-1.so.0
#20 0x289db872 in update_entry () from /usr/local/lib/libsvn_repos-1.so.0
#21 0x289db19c in delta_dirs () from /usr/local/lib/libsvn_repos-1.so.0
#22 0x289dc31e in svn_repos_finish_report () from /usr/local/lib/libsvn_repos-1.so.0
#23 0x289c5e25 in dav_svn__update_report () from /usr/local/libexec/apache2/mod_dav_svn.so
#24 0x289c79b9 in dav_svn_deliver_report () from /usr/local/libexec/apache2/mod_dav_svn.so
#25 0x28615b37 in dav_method_report () from /usr/local/libexec/apache2/mod_dav.so
#26 0x2861719d in dav_handler () from /usr/local/libexec/apache2/mod_dav.so
#27 0x08065275 in ap_run_handler ()
#28 0x080656cb in ap_invoke_handler ()
#29 0x08062679 in ap_process_request ()
#30 0x0805d468 in ap_process_http_connection ()
#31 0x0806f3b5 in ap_run_process_connection ()
#32 0x0806357a in child_main ()
#33 0x08063778 in make_child ()
#34 0x08063880 in startup_children ()
#35 0x08064032 in ap_mpm_run ()
#36 0x0806a835 in main ()
#37 0x0805cef6 in _start ()
Now I need to start building debug versions of apache and subversion in order to start making some sense of that. It’s my suspicion that subversion is sending bad buckets to apache somehow.
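With a trace this long, a quick tally of which shared objects the frames come from can help decide what to rebuild with debugging first. The sketch below assumes the trace has been saved one frame per line, as gdb prints it, in a file of your choosing; the function name frames_by_object is made up for illustration.

```shell
# frames_by_object FILE: count gdb backtrace frames per shared object,
# most frequent first. Frames without a "from ..." suffix are attributed
# to the main executable.
frames_by_object() {
    awk '/^#[0-9]+/ {
             obj = "(main executable)"
             for (i = 1; i < NF; i++)
                 if ($i == "from") obj = $(i + 1)
             count[obj]++
         }
         END { for (o in count) print count[o], o }' "$1" | sort -rn
}
```

Run as “frames_by_object backtrace.txt”, where backtrace.txt is a hypothetical file holding the saved trace.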