Mr Grumpy and the King's New Clothes

This is not another book review although the subject does suggest references to the work of Roger Hargreaves  (Mr Grumpy)...

.. and Hans Christian Andersen (The King's New Clothes).
That is actually a farmer celebrating getting some rain!

What this is actually going to be covering is the topic of Taxonomy and particularly the use of DNA sequencing in that Dark Art.  The reason I am writing this is because a number of people whose knowledge and views I respect have a rather different perspective on these topics to me, and it behooves me to try and get my ideas straightened out.  Trying to write about them is a good way of doing that.

The view of others is that DNA sequencing enables researchers to examine more clearly how the process of mutation has led to new species developing and the relationships between species.  The value of that knowledge isn't questioned.

In some cases these studies are based upon an assumption that mutation rates are constant over time (which seems intuitively to me to be unlikely).  In other cases it has been possible to obtain samples of DNA from various time periods (I can't remember how they were acquired, or the time period involved), which provide a temporal backdrop against which changes in the study organism can be compared.  This seemed a lot more believable.  There is some more coverage of this topic in the discussion of dogs and wolves below.

However the fact that a DNA based approach is useful for such studies doesn't dispel all my problems.  My basic interest is based on geography: what species occurs where?  Part of this is the potential for such knowledge to be used to control the depredations of extractive industries and urban developers on the environment.

Mr Grumpy has his Day

Mr Grumpy might say that it is all very well knowing that species A and B shared a common ancestor, Xn million years ago, but what value is that when a coal mine or new slum has obliterated the remaining habitat of both because, like the 3 old ladies locked in a lavatory, nobody knew they were there?  So why does DNA analysis affect this?

Primarily it does so by preventing people from 'correctly' identifying what they see.  For example, they get an excellent book on orchids and identify the flowers they have seen but then discover that those names don't appear in other authoritative sources because the forces in academia have changed the way orchids are viewed.  A lot of folk then either give up their interest in botany completely or don't report their findings because they are unsure of the name of what they've seen.

I have used orchids as an example as they are about the most DNA-sequencing-trashed taxon of which I am aware.  (I remember attending a lecture where an authority listed a number of things to look at in identifying an orchid - basic structure of the plant, type of insect which fertilises them, habitat etc - and then said don't worry about all of that old-fashioned stuff, we've used DNA and it's all wrong.)

It is particularly annoying when one set of taxonomists descend the mountain proclaiming to have found the truth graven on tablets of stone ...

... only to find that a short while later the tablets are stated (sometimes by new prophets or sometimes by the old ones, recanting) to be Fake News.  This may mean:
  1. the former view is reinstated, or
  2. another completely different view is proclaimed; or (most likely)
  3. a mixture of old and new views appears.
Fake News is bad enough when it deals with trivia like US Government policy or the revolving door which is Australian Prime Ministership, but when it gets to Natural History things are really crook.

An interesting divergence from the "DNA is King" approach used to be the Birds Australia "Atlas code".  This stayed the same for each species regardless of how the fashionistas reorganised the taxonomic sequence.  Although it was a bit odd at times (for example Spotted Pardalote is 565 while Striated Pardalote is 976) it does mean that to get data about a species the code is always the same.  In contrast the taxonomic codes and names of some species have changed greatly over time.

This leads into a second problem with the DNA approaches in that they make data extraction difficult if not impossible.  Trying to extract data for the blue "wrens" of Australia by any means other than the Atlas code has become difficult recently as "Fairywren" is not the same as "Fairy-wren".  (It is unfortunate that BirdLife Australia appear to have abandoned publishing the list of Atlas codes - presumably because it was heresy in the true faith of DNA.)  There are ways around this, but they are like unto a haemorrhoid!
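One of the ways around it can be sketched in a few lines of Python.  This is only an illustration of the idea, not anything Birdlife Australia publishes: the function names and the sample records are made up, and the only real values are the two pardalote codes quoted above.  Names are stripped of hyphens, spaces and capital letters before matching, and the stable code is then used as the key.

```python
import re

def normalise_name(name: str) -> str:
    """Lower-case a common name and drop spaces, hyphens and apostrophes,
    so that spelling variants such as Fairywren / Fairy-wren collide."""
    return re.sub(r"[\s\-']", "", name.lower())

# The two codes quoted in the text; the lookup table itself is hypothetical.
ATLAS_CODES = {
    normalise_name("Spotted Pardalote"): 565,
    normalise_name("Striated Pardalote"): 976,
}

def atlas_code(name: str):
    """Return the stable code for a name, or None if it is not in the table."""
    return ATLAS_CODES.get(normalise_name(name))

# Made-up sighting records spelt three different ways.
records = [("Superb Fairy-wren", 3), ("Superb Fairywren", 2), ("superb fairy wren", 1)]

totals = {}
for name, count in records:
    key = normalise_name(name)
    totals[key] = totals.get(key, 0) + count

print(totals)
print(atlas_code("spotted-pardalote"))
```

This copes with punctuation and capitals only.  It does nothing for a species that has been split, lumped or renamed outright, which is where a stable code earns its keep.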

The King's New Clothes

In the matter of clothing, some readers of this may recall the work of Ray Davies in the 1960s.  Once a new technique is developed everyone wants to use it and applications for grants etc have to mention it to show they are modern and fashionable in their outlook.  That is unfortunate but just a reality.

It is also evident that in the case of DNA sequencing there are many new terms in use, to the extent that authors writing for their colleagues produce material that is unreadable, let alone understandable, by non-experts.  That makes it rather hard to evaluate what is going on, and to decide whether it is correct and/or helpful for the analysis being undertaken.

In traditional work analysts talk about the statistical significance of their work and give some statistics (typically one or more of standard deviation, variance, t-test value or z statistic) to support that statement.  I can handle that.  Unfortunately that tends not to be done with DNA sequencing but other measures, which I can't discover due to the depths and aromatic attributes of jargon, must be used.

A couple of anecdotes illustrate my problems.  If scientists can't get matters consistent (let's settle for consistency, rather than the higher bar of "correct") for such obvious taxa as apes and canids what hope is there for the exotica of orchids and birds?

Chimpanzees and humans

A popular comment about DNA analysis is along the lines of "Chimpanzees and humans share 98.8% of their DNA."  On poking this statement a bit I am unsure whether it is actually chimpanzees or their close relative the Bonobo who are our closest relatives.  There are also some nuances regarding what form of DNA is being considered and how the % is calculated.

I recall - this is an anecdote so don't expect chapter and verse - reading a scientific article that addressed the question of what element of the DNA makes a chimp a chimp and not a human.  They concluded that there was not a clear answer to that question.  An article in Discover magazine - and I have no idea whether that is reputable or not - suggests it is to do with the number of neurons which develop but concludes 
"..Genes may have something to do with that quantity, and thus with the complexity of the quality that emerges. Yet no gene or genome can ever tell us what sorts of qualities those will be."

Dogs and Wolves

An interesting question is how long ago did dogs and wolves diverge from their common ancestor?

I first noted this as a topic of interest when someone published an article greatly changing the date at which dogs and wolves diverged and a couple of years later recanted and brought the estimate back to where it had been.  Somehow or another I exchanged emails with another author with knowledge of this and they commented that the confidence interval for both estimates probably included zero!

I suspect this may have something to do with a story from PBS.  This includes  
"It was thought until very recently that dogs were wild until about 12,000 years ago. But DNA analysis published in 1997 suggests a date of about 130,000 years ago for the transformation of wolves to dogs."
I'm unsure of the date of this article (possibly 1999) and suspect it refers to the first change suggested in the previous paragraph.

A more recent "popular science" article is in Live Science.  It says
"Their common ancestor was a prehistoric wolf that lived in Europe or Asia anywhere between 9,000 to 34,000 years ago, according to various studies."
It is a bit hard to establish the parameters of a traditional confidence interval in this as the extremes quoted could be single estimates.  However, this is a popular article, and rigour is not to be expected.  The novel research they refer to is a paper by Skoglund et al (congratulations to the authors for arranging free access, not the usual $10 a page) which seems to be in a serious, peer-reviewed journal.  An early statement is:
"While molecular estimates of the time of origin of the dog lineage are contingent on principally unknown mutation rates and generation times, the most recent genomic estimates of the divergence between wolves and dogs date to 11,000 to 16,000 years ago." (emphasis added)
The article is not in totally simple language (fair enough: it is in a technical journal) but there did appear to be enough English in there for me to understand what the authors did and to have confidence that their conclusions are reasonable.  I was particularly interested in:
"We find that calibration using the most commonly assumed mutation rate of 1 × 10⁻⁸ per generation and a 3-year gray wolf generation time [5,15,18,19,27] would imply that the Taimyr wolf diverged from the Chinese wolf 10,000–14,000 years ago (Figure 3), which is incompatible with its calibrated direct radiocarbon date of ∼35,000 years BP. Instead, the mutation rate must be substantially slower in order to be compatible with the age of the Taimyr individual, and we find that the Taimyr divergence can be accommodated by a mutation rate of 0.4 × 10⁻⁸ per generation."
The key points for me in this are 
  1. the degree of difference between the "received" and "required" mutation rates (1 × 10⁻⁸ versus 0.4 × 10⁻⁸ per generation); 
  2. the importance of generation time, and
  3. the use of independent data (the radiocarbon date of the bone) to resolve the issue.
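To see why the first point matters so much, here is a rough sketch in Python.  It assumes a strict "molecular clock", under which the estimated time is simply inversely proportional to the mutation rate for a fixed amount of observed genetic difference.  That is my simplification for illustration, not the method used in the paper, and the function name is made up.  The rates and dates are the ones in the quote.

```python
def rescale_divergence_time(years: float, assumed_rate: float, revised_rate: float) -> float:
    """Rescale a divergence date when the per-generation mutation rate changes.

    Under a strict clock the same observed genetic difference takes
    proportionally longer to build up at a slower rate, so the
    time scales as assumed_rate / revised_rate.
    """
    return years * assumed_rate / revised_rate

ASSUMED_RATE = 1e-8    # "most commonly assumed" rate in the quote
REVISED_RATE = 0.4e-8  # slower rate the authors found was needed

for years in (10_000, 14_000):
    revised = rescale_divergence_time(years, ASSUMED_RATE, REVISED_RATE)
    print(f"{years:>6} years at the assumed rate -> about {round(revised)} years at the slower rate")
```

Under this simplification, dropping the rate to 0.4 of its assumed value multiplies every date by 2.5.  A change in generation time would feed through in the same proportional way, which is the second point.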
Where I get to with all of this is that there is almost a lottery with dates ranging over a period of about 120,000 years (from 9,000 years BP to 130,000 years BP).  There is also material in Skoglund's article about the way in which dogs and various species of wolves split apart which appears to throw previous views about dogs splitting off from Grey Wolf into the toilet.

Bringing it all together

The key points I hope I have made above are:
  • There are some important uses of information about life forms that are made much more difficult by the constant changes of opinions about taxonomy; and
  • Some of the changes appear to be made without being given "full consideration" and/or lack rigour in the underlying analysis.
Where this gets us is illustrated by an analogy with car making.  People buy vehicles from David Brown and Lamborghini.  However in both cases they need to have a good idea of the use to which the equipment is to be put.
  • Getting from London to Glasgow, or Milan to Naples, go for the GT sportscar; but
  • Ploughing a field, make sure you get a tractor.
I suggest that DNA sequencing is like the rather fragile sportscar: very modern and good looking, but falls to bits when the going gets tough.  For the average citizen scientist what is needed is something robust and long lasting that gets a job done.  The problem is that taxonomy appears to be driven by a mob of opinionated speed freaks (think Jeremy Clarkson with a PhD) who cannot see the utility of a tractor.
