Abstract
Science and engineering rely on the accumulation
and dissemination of knowledge to make discoveries
and create new designs. Discovery-driven genome
research rests on knowledge passed on via gene
annotations. In response to the deluge of sequencing
data, standard annotation practice employs automated
procedures that rely on majority rules. We
argue this hinders progress through the generation
and propagation of errors, leading investigators into
blind alleys. More subtly, this inductive process discourages
the discovery of novelty, which remains
essential in biological research and reflects the nature
of biology itself. Annotation systems, rather than
being repositories of facts, should be tools that support
multiple modes of inference. By combining
deduction, induction and abduction, investigators can
generate hypotheses when accurate knowledge is
extracted from model databases. A key stance is to
depart from the ‘sequence tells the structure tells the
function’ fallacy, placing function first. We illustrate
our approach with examples of critical or unexpected
pathways, using MicroScope to demonstrate how
tools can be implemented following the principles we
advocate. We end with a challenge to the reader.