ANPS WW database: some metrics after completing 'catch-up'

This post should be seen as of "special interest".  In this case the risk is to your boredom filter which may erupt at a string of statistics.  I justify this assault on your time budget on the grounds that:

  • knowledge about the structure of the data can be useful in interpreting it; and
  • there is a social policy interest in recording the effort put in by groups such as the WWs.
As a step towards the first of these points, what follows includes some analysis of aspects of the broad measures I will offer.

The first item is the total number of records created.  A record is defined as a recording of a taxon (species or subspecies) at a location on a day (or for older records a series of days - see below for more on this).  

A second crucial definition is that of a visit: a combination of a date and a site.  Some days we visit more than one site and some sites have been visited more than once.  With luck I'll get back to visits following the first parenthesis.
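
For anyone who likes to see definitions as data, here is a minimal Python sketch of the two ideas, assuming each record is held as a (taxon, site, date) row.  The field names and the sample rows are invented for illustration and are not taken from the WW database.

```python
from collections import namedtuple

# Hypothetical record layout: one taxon at one site on one date.
Record = namedtuple("Record", ["taxon", "site", "date"])

# Made-up sample rows, not real WW data.
records = [
    Record("Acacia dealbata", "Site A", "2013-08-07"),
    Record("Eucalyptus sp.", "Site A", "2013-08-07"),
    Record("Acacia dealbata", "Site A", "2013-08-21"),
    Record("Diuris sp.", "Site B", "2013-08-21"),
]


def visits(rows):
    """Return the distinct (date, site) pairs, i.e. the visits."""
    return {(r.date, r.site) for r in rows}


print(len(records), "records from", len(visits(records)), "visits")
```

In this toy sample Site A is visited twice and two sites are visited on the same day, which are the two situations described above.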

As at 29 August 2013 the database contains 21,218 records.  Of these 1,779 (8.3%) include the designation 'sp.', indicating that there was uncertainty about some element of the name at a finer level of detail than genus.  In some cases the uncertainty may be at the species level, in others at the subspecies/variety level.  I have not distinguished these, thinking that for the purposes for which the data will be used it is enough to know that there was "doubt" about the detail at the time the observation was recorded.

Parenthesis 1: species with uncertainty
The proportion of taxa rated as 'uncertain' differs between visits.  The highest proportion designated in this way is 34.48% for a site off the Western Distributor Rd close to the Coast.  There are also 8 visits with no "uncertain" taxa.  The distribution of visits according to proportion of uncertain taxa is summarised below.
That looks pretty close to a normal distribution to me, which is almost certainly a good thing.
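
The per-visit proportions can be worked out along the following lines.  This is only a sketch under stated assumptions: the rows are made up, the test for 'sp.' assumes the designation appears as a separate word in the name, and the 5-point bands are a hypothetical choice rather than the ones used in my summary.

```python
from collections import defaultdict

# Made-up (visit, taxon) rows; a visit is a (date, site) pair.
rows = [
    (("2013-08-07", "Site A"), "Acacia dealbata"),
    (("2013-08-07", "Site A"), "Eucalyptus sp."),
    (("2013-08-07", "Site A"), "Carex appressa"),
    (("2013-08-07", "Site A"), "Acacia rubida"),
    (("2013-08-21", "Site B"), "Acacia dealbata"),
    (("2013-08-21", "Site B"), "Carex appressa"),
]


def is_uncertain(taxon):
    """True when the name carries 'sp.' as a separate word (an assumption)."""
    return "sp." in taxon.split()


def percent_uncertain_by_visit(rows):
    """Map each visit to the percentage of its records flagged as uncertain."""
    totals = defaultdict(int)
    uncertain = defaultdict(int)
    for visit, taxon in rows:
        totals[visit] += 1
        if is_uncertain(taxon):
            uncertain[visit] += 1
    return {v: 100.0 * uncertain[v] / totals[v] for v in totals}


def band(pct, width=5):
    """Label the band containing pct, using hypothetical 5-point bands."""
    low = int(pct // width) * width
    return f"{low}-{low + width}%"


for visit, pct in percent_uncertain_by_visit(rows).items():
    print(visit, f"{pct:.2f}%", band(pct))
```

Counting the visits that fall in each band then gives a distribution of the kind summarised above.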

I have attempted to find a pattern in the incidence of the uncertain records.  Thus far it has not been totally successful, other than noting that the sites which are out of our normal territory to the North East and East - i.e. the sandstone sites - are over-represented in the >20% category.  Hypotheses (to use a big word for ideas) I have looked at, and rejected, include:

  • distance from Canberra (while this catches the sandstone sites, some more distant sites with Grassy box woodland have quite low uncertain scores);
  • first date of visit or last date of visit (no correlation); and
  • month of last visit (the values for months vary a little but are grouped around the average value of 8.3%).

There might be an explanation lurking in family level data.  Of the top 10 families (as measured by total number of records) most have a small % uncertain.  However Poaceae (grasses), Orchidaceae (orchids) and Cyperaceae (sedges) all have a large number of records and a high % uncertain.  I shall seek advice on this.

Back to business

Having mentioned 'Family' in the previous paragraph I will note that we have recorded taxa in 73 families.  The number of records per family varies considerably, from 2,753 in Asteraceae and 2,124 in cow food - sorry, Poaceae - to 11 families (none of which ring any sort of mental bell with me) with but a single record.

There are also 376 genera represented by 954 species.  The genera with the greatest number of species are:

Genus            Number of species
Acacia           37
Eucalyptus       37
Pomaderris       18
Olearia          16
Leptospermum     15
Persoonia        13
Austrodanthonia  13
Pultenaea        12
Carex            11
Leucopogon       10
Diuris           10
Senecio          10
Brachyscome      10
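
A count like this can be produced from a list of species names along the following lines.  It is a rough sketch rather than the query I actually ran: the names are invented, and it assumes the genus is simply the first word of each name.

```python
from collections import Counter

# Made-up species names; the duplicate shows that repeats are counted once.
species_names = [
    "Acacia dealbata",
    "Acacia rubida",
    "Acacia dealbata",
    "Carex appressa",
]


def species_per_genus(names):
    """Count distinct species names under each genus (first word of the name)."""
    return Counter(name.split()[0] for name in set(names))


# List genera from most to fewest species.
for genus, count in species_per_genus(species_names).most_common():
    print(genus, count)
```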

We have been to 206 sites, the locations of which are shown in this Google Earth screenshot. (It is currently a mystery to me why the sites are shown with a variety of icons, but I tend to start with a theory that it is something weird with the software!)
It is rather difficult to establish how many visits have been made to each and every site, not least because some of the weblists contain more than one site (as I have defined them for this exercise).  However, from some early head-scratching I recall that Brooks Hill reserve, just west of Bungendore, had been graced with the presence of the WWers on 10 occasions, spread over 12 years.  In contrast, 115 sites have only been visited on one occasion (so far at least).
