Geoff Collyer
Geoff Collyer, along with Henry Spencer, co-authored C News, a news server package and a replacement for B News. Collyer later gave C News a new index facility called NOV (News Overview). This indexing system is still used today in the form of the NNTP XOVER command.
Currently, Collyer works as a Member of Technical Staff at Bell Labs in Murray Hill. This is his second period of employment at Bell Labs; the first ran from 1994 through 2001, initially under AT&T. Collyer has been involved with Plan 9 and is still working on the project.
Interview (3/26/2007) with Geoff Collyer:
1. What benefits did Usenet provide for your professional or academic life?
In the early years, Usenet was a source of technical information, notably bug fixes and ideas. It likely increased my visibility, at least to the Usenet community, for a time. Developing C News led to getting work in the US and ultimately helped me to get permanent residency there.
2. How did you and Henry Spencer become involved with co-developing C News?
Henry and I worked about a block apart initially (me at the computing center and him in the Zoology department of the University of Toronto). We met when I started work at U of T in 1981 (Henry interviewed me for a junior position in Zoology before I went to work for the computing center). We had similar views about programming (minimalist) and collaborated on some minor work (e.g., fixes and extensions to ed(1)).
Henry was running V7 Unix on a PDP-11/44 for Zoology’s computing needs (largely typesetting) and I was responsible (along with a varying collection of co-workers) for an ever-growing collection of Unix machines: initially a PDP-11/70 running PWB/UNIX (essentially V6 Unix with newer utilities and a buggy C compiler) split between the Statistics department and the computing center, which offered a typesetting service; a VAX-11/750 running 4.1BSD for undergraduate CS and Engineering use; and soon an IBM 3033 running Amdahl’s UTS under VM/370.
At that time, Ethernet was still new and we didn’t have any, Canada was still not on the ARPAnet, and we’d started running a locally-developed network over fiber (Hubnet) and SLIP over serial lines run between machines. It was clear that UUCP would be a useful thing to have on the PDP-11, but it had to be backported to PWB and the bugs in the C compiler had to be worked around. I eventually got that going, so we set up a UUCP connection between utzoo and utcsstat, the 11/70. Henry was a proponent of Usenet, so I backported B News 2.something, probably 2.6, and worked around the compiler bugs and got utcsstat onto Usenet via utzoo.
Almost immediately it became clear that the B News code was in sorry shape. It was obvious that many hands of varying ability had hacked on it without much coordination and the result was slow, unreliable and just plain buggy. Before long, one of my co-workers at the computing center, having analyzed our process accounting data, came to me and pointed out that programs running under the “usenet” userid were consuming something like a third of our CPU time. The machine also slowed down noticeably during the processing of incoming news, and this became more noticeable as volume grew.
I think Henry started what would become C News first. He was using expire pretty exhaustively, archiving news to tape and generally giving expire a work-out. Expire had memory leaks and other bugs and eventually just stopped working with some new release (B 2.9, perhaps). Henry looked at what expire actually did and concluded that it couldn’t be that hard to write from scratch, so he wrote a replacement that became the earliest C News expire.
A little later, I got fed up with rnews losing news when disk partitions filled up and generally eating up the PDP-11’s resources, and started thinking about writing a replacement for rnews. I took a stab at modifying B rnews to not fork to process each article, but just couldn’t make it work; assumptions were too deeply embedded and the code was a mess. The idea of an rnews that could process a batch without forking (excluding control messages) still seemed like a good one. In the fall of 1985, my boss, Ian Darwin, thought it was a worthwhile project, so I started writing (also from scratch) what became relaynews on a Monday and by Friday (working eight-hour days) had a barely-usable relaynews, which Ian took home and exercised on his shiny new Dual Systems Unix System III 68000 machine.
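To illustrate the idea of handling a whole batch in one process, here is a minimal Python sketch, not the actual relaynews code (which was written in C). It assumes the common uncompressed batch layout in which each article is preceded by a `#! rnews <byte count>` line; the function name `split_batch` is hypothetical.

```python
import io

def split_batch(stream):
    """Yield each article (as bytes) from an rnews-style batch.

    Assumed layout: a header line "#! rnews <size>" followed by exactly
    <size> bytes of article text, repeated until end of input.
    Everything happens in one process; nothing is forked per article.
    """
    while True:
        header = stream.readline()
        if not header:
            return  # clean end of batch
        parts = header.split()
        if len(parts) != 3 or parts[0] != b"#!" or parts[1] != b"rnews":
            raise ValueError(f"bad batch header: {header!r}")
        size = int(parts[2])
        article = stream.read(size)
        if len(article) != size:
            raise ValueError("truncated article in batch")
        yield article

# Usage: two tiny 12-byte "articles" in one batch.
batch = io.BytesIO(b"#! rnews 12\nSubject: a\n\n#! rnews 12\nSubject: b\n\n")
articles = list(split_batch(batch))
```

A real transport would file each article and update the history file inside the loop; the sketch only shows that the batch can be walked sequentially within a single process.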
Henry and I were in frequent communication and pretty quickly realised that with relaynews generating the history file, and the news readers not caring greatly about the format of some fields, we had room to extend the format in useful ways that expire could exploit for greater speed. I think it was at about this time that we adopted the name ‘C News’ and began to flesh out a full news transport. We had no interest in readers, which seemed to be constantly changing, and rn (and later nn) seemed good enough.
3. How did you collaborate while developing C News?
I think we talked some about this in ‘#ifdef Considered Harmful’, but perhaps not in detail. We divided up the work; I took relaynews (the hard part) and Henry took most of the other commands. We both contributed to a small library of common routines. Later, Henry took an interest in packaging and so prepared the releases. We wrote and updated manual pages for all of this as we worked.
We worked separately, exchanged mail and code and manual pages, and had dinner once or twice a week. Occasionally we wrote longer documents.
4. Was the development of C News an official project or a personal pursuit?
Aside from the above-mentioned blessing of my boss at the time, who did have a personal interest in the results, C News was purely a personal pursuit while we were both at U of T. We were fortunate to not have slave-driving bosses watching over our shoulders constantly, and some of our C News work was directly applicable to our jobs as system programmers and administrators, but we also put in quite a bit of our own time.
Once I moved to Boston, to work for Barry Shein at Software Tool & Die, my work was driven more by specific objectives than just trying to finish off the ever-growing to-do list, but I was also able to work on C News full-time and made much faster progress than I had been able to while working on it in my spare time. I didn’t ever finish my own C News to-do list, but I think all the important things got done.
5. What were your main technical challenges when developing C News?
The lack of dbm in System III/V was a nuisance. Various other people filled the gap with dbz, sdbm, and the like. Henry eventually adopted dbz, after overhauling it thoroughly.
The gratuitous incompatibilities (e.g., df(1) output) within ‘consider it standard’ System V, across releases and across vendor ports, were another nuisance. This was part of the larger problem of pre-POSIX lack of standardisation of some fairly basic things across Unixes.
We considered doing the sort of multiple-input relaynews that INN does, but the portability nightmares lying therein were daunting. In addition, the Unix C libraries were missing some things like fselect, a version of select that takes stdio buffering into account. Not that select is wonderful, but doing the equivalent with inexpensive processes sharing memory (as one would in Plan 9) wasn’t happening: Unix processes were getting fatter and the relevant system calls were not only not yet standardised, but some of them weren’t even being considered for standardisation yet.
Adapting to the scale of UUNET’s news operation took a little while and produced the batch-file exploder and a much faster newsgroup matcher. Newsgroup matching had been acceptably fast for a large ordinary Usenet site, with perhaps a dozen outgoing feeds and moderately complex newsgroup patterns in the sys file. UUNET had hundreds or thousands of outgoing feeds and since they went to paying customers rather than peers, the newsgroup patterns ran on for hundreds or thousands of characters in order to select exactly the groups that the customer wanted that day. In that situation, newsgroup matching consumed a majority of the CPU time. Barry Shein and I brainstormed and came up with a rather nice algorithm and implementation that isn’t just much faster but has better time complexity.
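For readers unfamiliar with sys-file newsgroup patterns, the following Python sketch shows a naive matcher of the kind whose cost grows with pattern length. It is not the faster algorithm mentioned above, which the interview does not describe. The semantics are simplified assumptions: patterns are comma-separated, `all` matches any single name component, a pattern matches any group it is a prefix of, `!` negates, the longest matching pattern decides, and a negated pattern wins a tie. The names `pattern_depth` and `feed_wants` are hypothetical.

```python
def pattern_depth(pattern_words, group_words):
    """Return how many components matched if the pattern matches a
    prefix of the group name, else 0."""
    if len(pattern_words) > len(group_words):
        return 0
    for p, g in zip(pattern_words, group_words):
        if p != "all" and p != g:
            return 0
    return len(pattern_words)

def feed_wants(patterns, group):
    """Decide whether a feed with this pattern list wants the group.

    Every pattern is tried against every group, so the work per
    article is proportional to the length of the pattern list.
    """
    group_words = group.split(".")
    best_depth, wanted = 0, False
    for raw in patterns.split(","):
        raw = raw.strip()
        negated = raw.startswith("!")
        words = raw.lstrip("!").split(".")
        depth = pattern_depth(words, group_words)
        # Longest match decides; on a tie, assume negation wins.
        if depth > best_depth or (depth == best_depth and depth > 0 and negated):
            best_depth, wanted = depth, not negated
    return wanted

# Usage with a short, hypothetical pattern list.
feed = "comp,!comp.sys.sun,comp.sys.sun.hardware,all.announce"
results = {g: feed_wants(feed, g) for g in
           ["comp.lang.c", "comp.sys.sun.misc",
            "comp.sys.sun.hardware", "news.announce", "rec.humor"]}
```

With hundreds or thousands of feeds, each carrying a pattern list thousands of characters long, this per-article scan is repeated for every feed, which suggests why matching came to dominate CPU time at that scale.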
8. Were you involved with other Usenet development or maintenance projects after C News?
No, by the time I finished my work for Software Tool & Die, I was pretty burned out on Usenet software and content.
9. Where are you currently employed? What is your current role? What are you currently working on?
I’ve been back at Bell Labs in Murray Hill as a Member of Technical Staff since July 2006 (my first employment there was from 1994 through 2001, first for AT&T, then Lucent). I’m doing a mix of Plan 9 support, development and assisting with research projects. I’m replacing old Plan 9 machines with new Plan 9 machines. We just retired two of the old optical jukeboxes, including the one from the original Plan 9 file server, bootes.
10. Do you have any historically significant documents or photos relating to Usenet that you would like to share?
No photos. I have filing cabinet drawers full of news-related paper, but I’m not sure that any of it would be of interest to anyone else.