I have been working in the areas of bioinformatics and molecular
evolution since 1985. My main work has been on methods and software
for DNA and protein sequence alignment. This grew from a need in the
late 1980s to make multiple alignments of sets of related sequences,
in order to carry out further analyses such as phylogenetic
reconstruction. Up to about 1987, if you wanted to take a set of
sequences and make a multiple alignment, you had to do this using
sheets of paper and coloured pens, or, if you were well off, a
word-processing program and a computer. This was tedious and
error-prone, so there was a clear need to automate this process. It
turned out to be surprisingly difficult to write computer programs to
do this, especially considering that humans could do it easily, even
if painfully slowly.
My first work in this area was as a post-doc in Paul Sharp’s
laboratory at Trinity College, Dublin, Ireland. I wrote a series of
programs, called Clustal, that could carry out reasonably accurate
multiple alignments quickly and simply, even on an
old IBM PC with an 8088 processor and very little memory. This program
quickly grew in popularity and has evolved through a long series of
jumps and steps to become the world-wide standard software for this
analysis. Because we started it off on such a miserably weak computer,
the great increase in computer power during the 1990s more than
compensated for the great explosion in the numbers and lengths of
sequences that most people needed to align. The first papers
describing these programs were published in 1988 and 1989. Later, I
moved to the EMBL in Heidelberg, Germany, where I published a new,
improved version of the program with my colleagues Alan Bleasby and
Rainer Fuchs. This was ClustalV (the V stood for 5 as this replaced
the earlier programs which were called Clustal1 to Clustal4) and it
was available for IBM PC, Mac, Unix, and VAX/VMS computers.
In 1994, with my colleagues Julie Thompson (now in Strasbourg,
France) and Toby Gibson (still at EMBL, Heidelberg), I produced
ClustalW, which is now the main software for carrying out multiple
alignments, along with ClustalX, which is essentially ClustalW with a
nice graphical interface and many extra visual features. ClustalX was
also made in collaboration with Francois Jeanmougin and Frederic
Plewniak, also in Strasbourg. These programs are used hundreds of
times every day to produce multiple alignments, and the papers that
describe them get cited about 50 times a week. There are dozens of WWW
servers that use these programs and they are resold in many commercial
packages.
More recently, I have been working on a new method called T-Coffee,
which was invented by a Ph.D. student in my group at the EMBL, Cedric
Notredame. T-Coffee is slower than ClustalW, but it produces more
accurate alignments and allows us to mix different types of data,
such as protein structures and ESTs. I now teach
biochemistry at University College Cork, Ireland, and my group works
not only on multiple alignments but also on bacterial genomics and the
application of multivariate analysis to microarray data.
Des Higgins, Ph.D.
Department of Biochemistry
University College Cork
Cork, Ireland