Friday, 23 February 2018

Weather for the weekend

In case you wonder why I am so fixated on rain, here is an image of Whiskers Creek this morning.
The leaves are off some willows growing along the Creek. In a normal year they'd shed their leaves in late March.  Drought stress is getting them off now, as happened last year.

A post on a weather related site of which I am a member  showed that we were on the cusp of some serious rainfall next Sunday.  This has led me to snuffle around various forecasts to see what they say. My first call was to Wunderground which had zip for Carwoola on Saturday and 7.8mm on Sunday.

I then went to BoM which as the 'headline forecast' for Canberra had 2-8mm on Saturday and 1-5mm on Sunday. I then went to the BoM rainfall forecast for Southern Tablelands which showed the Carwoola area as no rain Saturday and then this....
....for Sunday. Very similar to the pattern in Andrew's post. We - almost directly below the 'a' at the end of Canberra - appear to be on the boundary between the 15+ and 25+ ranges - a lot more than the forecast in the headline! 

Weatherzone has 10-20mm on Saturday and 1-5mm on Sunday! Time and Date offer 0 on Saturday and 13.2mm on Sunday; TWC doesn't offer amounts.

Pick a number.

By Saturday morning the headline BoM forecast was 1-4 Saturday and 10-25 Sunday: presumably mainly overnight.  There were lenticular clouds over the airport.

Thursday, 22 February 2018

Census Imputation by type of area

I was looking back through some of the posts I have done about the 2016 Census and found a comment about being able to analyse imputation rates by Urban-Rural nature of area once that classification is included in Table Builder.  That data release has now happened so here is the analysis I have done.  As is often the case, investigating that topic led to a number of other interesting topics so this post has got rather long but hopefully is still coherent.

By way of background, nearly everyone who completes a form provides an age.  Thus imputed age is an indicator of non-response.  In many cases this is where:
  1. it is believed the dwelling is occupied but 
  2. a completed questionnaire has not been received.  
In the past both situations would be identified by a Collector who delivered and collected the questionnaires, and that is still the case for rural areas and small towns.  In Major Urban Areas for the 2016 Census the first situation would be based upon Australia Post having an entry on their mailing list and the second by an ABS Field Officer following up when an on-line form (or hard copy replacement) was not completed.

Aspects of Imputation

There are two aspects to imputation.  The first, and simplest, I will term dwelling imputation: no form was received for an occupied private dwelling and the number of males and females in the dwelling needed to be imputed - in effect all person records are imputed.  The second I will term person imputation: this also includes cases where the number of males and females is known but at least some persons within the dwelling have not provided the basic demographic information (age, marital status and place of usual residence).

My previous posts (here and here) have all covered Person imputation, and I will return to that below, but before that here are some observations about Dwelling imputation.

Dwelling Imputation

A variable of particular interest is where the dwelling is rated as an "Other non-classifiable household", since this excludes them from the crucial General Profile table G32.  This category includes:
  1. households which the ABS Field Officer determined were occupied on Census night but where the ABS Field Officer could not make contact; 
  2. households that contained only persons aged under 15 years; or 
  3. households which could not be classified elsewhere in this classification because there was insufficient information on the Census form. 
The Dwelling imputation flag would be set in subset 1. This category seems to provide a very high proportion of the dwellings categorised as "Other non-classifiable household" - ONCH - as shown below.
% ONCH dwellings imputed
NSW 98.3
QPRC 98.9
Carwoola 90.0
In the case of Carwoola there are 3 (thus probably only 1, due to rounding) ONCH dwellings which were not imputed. So it would seem fair to say that ONCH ~ Non-responding dwelling.

The % of total dwellings imputed is also interesting.
% total dwellings imputed
NSW 4.41
QPRC 5.21
Carwoola 5.09
I shall return to this comparison below.

Person Imputation

My starting point for this element is that most people who complete information in a Census record will show their age as it isn't a sensitive item (unlike, say, income).  Thus age imputation could be seen as a reasonable proxy for a person not completing a form.  To check on this I created a table for Australia cross classifying Country of Birth of the Person x Age Imputed.  In summary:
  • 94% of person records for which age was imputed showed Not Stated for Birthplace; 
  • 75% of records which showed Not Stated for Birthplace had age imputed; and
  • where a Birthplace was stated (at the 1 digit level) the age imputation rates varied from 0.2% to 0.4%.
Those rates support use of age imputation as an indication of overall non response.

Later, for reasons discussed in comparing Dwelling and Person Imputation, I looked at tables comparing imputation of age and sex for NSW, Queanbeyan and Carwoola. The proportions of age-imputed records for which sex was also imputed were respectively 93.3%, 96.1% and 96.3%.  Again this shows support for age imputation as a simple proxy for Dwelling imputation.

The first chart is looking at age imputation rates by type of area for New South Wales as a whole.
As might be expected those with No usual address have the highest imputation rate.  The Rural Balance is higher than any of the more urban types of area, with Major Urban being the best performed.

I then looked at the Queanbeyan Palerang Regional Council (QPRC) area and found a similar pattern apart from the Migratory etc and No usual address categories, which were both nil.
While the ranking of the types of area is similar, the difference between Rural and Urban areas is even more pronounced.

I then investigated which areas were in which type of area. These are commented on below. A summary of the definition of the types of area is shown in the Census Dictionary.  A key factor is the definition of an Urban Centre as "a cluster of contiguous SA1s with an aggregate population exceeding 1,000 persons contained within SA1s that are of 'urban character'."
  • Major Urban: The urban area of Queanbeyan is part of the Urban Centre of Canberra-Queanbeyan, which with a combined population of over 100,000 is a Major Urban Area.  (I am intrigued as to how the areas can be combined, given the farmland along Canberra and Pialligo Avenues, but for now just accept it.)  The Major Urban area includes the State Suburbs of Queanbeyan; Queanbeyan East; Queanbeyan West; Greenleigh; Karabar; and Jerrabomberra.
  • Other Urban: This is Urban Centres that are not Major.  In the QPRC area this is the urban parts of the State Suburbs of Braidwood, Bungendore and (very surprisingly given the merger of Queanbeyan and Canberra noted above) Googong. There will be more detail about these areas below.
  • Bounded Locality: The only Locality in the QPRC area is Captains Flat (population 450).  (On driving through the village Nerriga has some aspects of a locality but the population of 73 is too small, so it is part of the Rural Balance.)
  • Rural Balance: The rest of the area, including the entire Stoney Creek Gazette catchment.
For the three Other Urban areas the State Suburbs contain both Urban and Rural elements.

       Braidwood  Bungendore  Googong
Urban       1267        3323     1522
Rural        380         861     1163
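To put the Urban/Rural split in the table above on a common footing, the urban share of each State Suburb can be worked out. This is only a minimal sketch: the counts are copied from the table, and the function name `urban_share` is my own invention for illustration.

```python
# Counts copied from the table above (Urban and Rural elements of each State Suburb).
counts = {
    "Braidwood": {"urban": 1267, "rural": 380},
    "Bungendore": {"urban": 3323, "rural": 861},
    "Googong": {"urban": 1522, "rural": 1163},
}


def urban_share(urban, rural):
    """Return the urban count as a percentage of the suburb total."""
    return 100.0 * urban / (urban + rural)


# Urban share for each suburb, as a percentage.
shares = {name: urban_share(c["urban"], c["rural"]) for name, c in counts.items()}

for name, share in shares.items():
    print(f"{name}: {share:.1f}% urban")
```

On these figures Braidwood and Bungendore are a little over three quarters urban, while Googong is much closer to an even split.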

The imputation rates for the six elements shown in that table are "interesting":
Yet again there is a marked difference between the Rural and Urban elements.

Comparison of Dwelling and Person Imputation rates

It is unfortunate that it is not possible in Table Builder to compare dwelling and person variables in a single table.  This table compares the imputation rates for dwellings and persons.
In each case the % of persons for which age was imputed is higher than the % of dwellings which were imputed.  Possibly this reflects the fact that some persons for which age was imputed did provide a questionnaire.  However the ratio of the two rates is far above the proportion of age imputed persons for which age and sex were both imputed.  This causes me to consider that too many people are imputed when a dwelling is not contacted: that conclusion is supported by findings in the CIAP report which showed over-imputation to be a significant component of gross overcount.


The above is simply a statement of fact in that these are the results of the Census.  What follows is somewhat more normative as it is based upon the author's inferences and opinions.  There are three elements to be considered:
  • What has happened to cause the difference between Rural and Urban elements?
  • How does this affect the "fitness for use" of the data?
  • What are the implications for the ABS and users of the data?

What has happened?

I have described in other blogposts including this one my observations of the Collector for the area in which I live.  (As they don't actually collect forms any more they are now called Field Officers by ABS.  However I am a demon for tradition and will stick with Collectors.)

I also contacted some neighbours (in different Collectors' workloads) to try to get a feel for what happened in a somewhat larger area.  It seemed to me that our area was similar to many others, but not all, with Collectors not actually visiting the house but just leaving the materials in the letter boxes.

To some extent this is a continuation of problems evident for rural-residential areas in the 1996 Census (when I was the Director of the Field operation).  The Collectors for such areas felt hard done by as: 
  1. they had a large number of dwellings in their workload relative to 'pure' rural areas; but 
  2. had more issues of access (long drives, locked gates, territorial dogs) relative to the urban areas.
However in 1996 their pay was linked (in part) to the number of forms they physically delivered to their supervisor so they had a fair incentive to maximise response rates.  I don't know the pay system for 2016, but the ABS' ambition was to maximise on-line response so the incentive of handing in a hard copy form didn't exist.  Thus some rural-residential (and possibly some pure rural) Collectors appeared to have overcome the problems in point 2 above by simply dropping the forms off in letter boxes.

In the more urban areas the access problems don't arise so the Collectors would be more likely to walk the 10 metres to the house and make contact. (There are other issues such as security buildings but those may be overcome by the mail out approach and it is generally able to be determined which residences are occupied etc.)

In cases where the Collector has (in effect) enumerated letter boxes rather than dwellings the effect can vary between three circumstances (more detail is at this post):
  1. An occupied dwelling exists but has no letter box: unless the occupants contact ABS to get a form (or complete the form somewhere else, which was our situation) neither the dwelling nor the residents thereof are counted;
  2. A letter box exists but there isn't a permanently occupied dwelling:
    1. If material is removed from the box - an occupied dwelling is incorrectly recorded and people are incorrectly imputed;
    2. If material is not removed from the box - an unoccupied dwelling is incorrectly recorded but no people are imputed.
From my knowledge of our area it is more likely that there are dwellings with no letterboxes as people have mail boxes in town.  Over the collection period it is more likely that even where there isn't a dwelling (either occupied or vacant) the owner will clear the crud from the mailbox.

Impact on fitness for use

Taking the three situations of letter box problems into the data it would seem that the likely outcomes would be:
  • an understatement of the number of dwellings - situation 1 being greater than situation 2.1 plus 2.2; and 
  • an unduly high imputation rate as a consequence of 2.1.
That appears to be the case.

It is also the case that this will distort the comparison between rural and more urban areas.  This arises because the quality of the Australia Post address lists in Major Urban Areas has been thoroughly assessed in deciding to use those lists as the basis for delivery and thus there is not likely to be a problem in those areas. 

I have suggested to my Council that a key aspect of their use of Census data is to compile estimates of the expected number of dwellings by State Suburb based on the number of dwellings in 2011 plus development applications for dwellings in the period up to 2016.  If this is greater than the number of dwellings recorded in 2016 further investigation is needed.
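The check suggested to Council can be sketched in a few lines. This is a rough illustration only: the suburb names and all the numbers are hypothetical, the function names are my own, and the `tolerance` parameter is an assumption added to allow for small differences in the published counts.

```python
def expected_dwellings(dwellings_2011, approved_new_dwellings):
    """Expected 2016 count: 2011 dwellings plus dwellings approved up to 2016."""
    return dwellings_2011 + approved_new_dwellings


def needs_investigation(dwellings_2011, approved_new_dwellings, dwellings_2016, tolerance=0):
    """True if the expected count exceeds the recorded 2016 count by more than tolerance."""
    shortfall = expected_dwellings(dwellings_2011, approved_new_dwellings) - dwellings_2016
    return shortfall > tolerance


# Hypothetical figures: (2011 dwellings, approved new dwellings, recorded 2016 dwellings).
suburbs = {
    "Suburb A": (500, 40, 530),
    "Suburb B": (300, 10, 315),
}

# Suburbs where the recorded 2016 count falls short of the expected count.
flagged = [
    name
    for name, (d2011, approved, d2016) in suburbs.items()
    if needs_investigation(d2011, approved, d2016)
]
print(flagged)
```

With these made-up numbers only Suburb A is flagged, since 500 + 40 is more than the 530 recorded.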

If, as seems likely, the number of people has been over-imputed this may to some extent counterbalance the understatement of dwellings for applications where the total number of persons is all that is required.  However any analysis of characteristics other than age and sex is going to be flawed due to the unduly high not stated rate.  It is also probable that the characteristics of people in 'missing' dwellings will be different to those in reporting dwellings.  I have no idea how significant those problems are because I have no information about:

  • the nature of the persons in missed dwellings; nor
  • the analyses undertaken by people looking at small area Census data.


It is hard to come to a hard conclusion when the discussion finishes with a statement about the author's lack of knowledge.  However it seems clear to me that:
  • there is scope for ABS to improve the quality of Census data, in particular for Rural areas, by closely monitoring the delivery process; and
  • users of small area data need to pay heed to the number of imputed records, both dwelling and person, in the area of interest.

Wednesday, 21 February 2018

COG does the ponds of Gungahlin

After the leader survived Horse Park Drive (tailback from the Federal Highway to Anthony Rolfe Avenue) and another jam caused by road works on Gundaroo Rd he made it to the appointed spot.  25 Members gathered on the shores of Yerrabi Pond with the intention of heading to the Western end.  Departure was slightly delayed by the appearance of 5 Superb Parrots but we began our shore patrol about 8:45.  Had we waited a little longer I might have got a better photo.
Counting of Eurasian Coots began as soon as we could see water! Cutting to the chase, by the time we turned for home the total was up to 470.  Bill Graham advised that at peak he had assessed 800 Coots here, but he agreed there were many fewer today.  A Little Pied Cormorant posed nicely.
On the shore we spotted a Red-rumped Parrot being fed by an adult female which counted as Feeding Young (which presumably maps to the COG DY category).
My photo skills were improving, but still not as good as Sandra's image on the checklist. A little further along some members were busy observing a young Pacific Koel.
Sandra and Ian have posted excellent photos on the checklist.  A few metres back from this a similar ruckus was audible, which was not surprising as there was a second Koel.  Both Koels (1) were attended by Red Wattlebirds (2).
The breeding activity was completed by Magpie Lark and Australian Reed Warbler Feeding Young.

With a good array of other water birds and bush birds we recorded 41 species at this site.  It would have been 42 had we spotted the Red-capped Robins recently reported from here on the far side of the Bridge, but we didn't.
It was unfortunate that there was a lot of garbage around.  I think the total collected by Tina was 4 bagsful.
So it was braving the foul traffic along Gundaroo Rd and on to Gungahlin Pond.  Thanks to members' cooperation (and the departure of a few observers for other duties) we had condensed to 6 cars which fitted nicely in the parking spots on Gorman Cr.
Again we started counting Coot as soon as we got in sight of the water, ending with an estimate of 350.  The biggest flock of Coots were playing an interesting game: trying to bum food from some people on the bank while avoiding their mutt which seemed quite frisky (but not unduly keen on a poultry lunch).
Also 20 Black Swans, 20 Little Pied Cormorant and 30 Little Black Cormorant, one of which did a fly-by carrying a stick which was interpreted as Nest Building.

The main business here was the activity on the 3 small Islands in the middle of the pond.  The most obvious were an estimated 80 Australian White Ibis of which several were ON (Occupied Nests).  A few Straw-necked Ibis were hanging around in the trees and the addition of a flight of 19 gave a total of 23 for this species.

The excitement of the day however was the Leader thinking that he saw a Royal Spoonbill on a nest containing two fluffballs.  After quite a few minutes peering by everyone with a telescope or long lens it was confirmed that a pair of Royal Spoonbills were raising chicks in that nest.  I think that is the third ACT breeding event for that species.  Unfortunately my digiscoping skills fritzed completely on this: hopefully someone else got a better shot which they can add to the checklist.

Allowing for other species seen we recorded 23 species for this site.  Combining sites the total species for the day was 45.

Sunday, 18 February 2018

A look up page for 2016 Census posts

The idea behind this post is to give an easy way of identifying the posts I have done about the 2016 Census and indicating what they are about.  This will be updated as the impulse strikes me to work on other topics.
  • 10 Collection phase Dwelling count; age profiles, collection methods 170627  My observations in Carwoola Talks about the collection process in our area.

  • 21 Data quality Imputation rates 170712 Imputation 1 A brief look at rates
  • 22 Data quality Imputation rates, collection errors, CIAP 170712 Imputation 2 The importance of collection methods in determining imputation rates.
  • 23 Data quality Imputation rates, collection errors 180221 Census imputation by type of area; Comparison of urban and rural rates of imputation 
  • 29 Data quality Output media, Data consistency 180217 Dwelling counts Identifies and resolves some issues with different sources of Census data.

Sunset 17 February 2018

The clouds were well positioned last evening.  Take a snap to the West ...

... then a snap to the East.
Let's do the evening again.

Saturday, 17 February 2018

Traps -or at least pitfalls - for old players

Reader advisory:  Please note that this is a tale about an error - so read the whole post before trying to emulate this!  As a consequence of my discoveries some of my earlier posts about the 2016 Census have been massaged a little to reflect my new understanding.

When the first tranche of 2016 Census data was released in June 2017 I seized upon the General Profile for Carwoola with glee and did a basic comparison of people and dwellings between 2016 and 2011 Census results.  There was no drama with the people comparison, but the comparison of data from the Profile with information for 2011 Table Builder results suggested a fairly serious undercount of dwellings in 2016.  This is that comparison (it's an image not a table).
I developed various conspiracy theories about how this arose and in February 2018 decided to document the issues with the poor count of dwellings.  As I greatly prefer using Table Builder (TB) to wading through a squillion worksheets of profile data I created a TB table of the dwellings data to check I was doing the right thing.  A significant difference appeared for the 2016 Census data.

Following an exchange of emails with the ABS contact (thanks Harry for your great service) I have realised where my error(s) lay.  Here is the correct data (again an image):
Rather than a decline in number of dwellings there is the expected increase.  How has this error arisen?

Here is an image of the tables pretty much as they appear in TB and the General Profile G32. (I have removed some 'empty' lines from the Profile to fit the image on to a single page.)
The similarity of the two sets of labels is clearer when the repetitious stuff is removed from the TB version.  I have also emphasised a few crucial words in the Profile version.

First Issue

In many cases the data label in the two sets is identical (eg Separate house; Improvised home, tent, sleepers out) so I believed the data items were the same.  This was an error, as in the TB series both those items included numbers of unoccupied dwellings while they were in the separate item "Unoccupied private dwellings" in the Profile.

Second Issue

The second footnote to the Profile table says the data excludes "Other non-classifiable households".  Exactly what was included in that element was difficult to track down but eventually I found the following under Household Composition HHCD in the Glossary component of the Census Dictionary: 
"The 'Other not classifiable' category includes those households which the ABS Field Officer determined were occupied on Census night but where the ABS Field Officer could not make contact; households that contained only persons aged under 15 years; or households which could not be classified elsewhere in this classification because there was insufficient information on the Census form." 
I have emphasised some words there as it emerged when looking (through TB) at the "Other not classifiable households" they all contained Imputed persons and no other persons.  Thus in the case of Carwoola this item is a measure of Dwelling non-response.


Dwellings in Table Builder          530
Unoccupied dwellings                -41
Occupied dwellings                  489
Visitor only households              -3
Other non-classified households     -27
Profile definition                  459
Profile occupied dwellings
I am relaxed about regarding the residual difference of 7 dwellings as being the cumulative impact of the random perturbations to preserve privacy.
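The arithmetic in the reconciliation above is easy to check. A minimal sketch using only the figures quoted there (the variable names are my own):

```python
# Figures from the Table Builder reconciliation for Carwoola.
tb_total_dwellings = 530
unoccupied = 41
visitor_only = 3
other_non_classifiable = 27

# Step 1: remove unoccupied dwellings to get occupied dwellings.
occupied = tb_total_dwellings - unoccupied

# Step 2: remove the household types that the Profile table excludes.
profile_definition = occupied - visitor_only - other_non_classifiable

print(occupied, profile_definition)
```

The two intermediate results are the Occupied Dwellings and Profile definition lines of the reconciliation.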


I realise that in Census it isn't always possible to satisfy everyone and that many trade-offs have to be made.  In particular if product A for year 20XX is changed to be compatible with product B for that year it immediately raises an incompatibility with product B for year 20XX-5.  Further, making the changes suggested below for Table G32 and 2 entries in the Census Dictionary might require other changes to render other tables and classifications consistent in style. (That might not be a bad thing, apart from the workload involved.)

Further it isn't possible to idiot proof things (because, as a Canadian friend said "They keep making bigger idiots." Perhaps I should put my hand up here?).  However it seems to me that there are a few things that could be done to avoid words having different meanings between platforms.

In writing what follows I am assuming that the Profiles series will continue into the future, as being used by many folk,  even though TB is IMHO far more useful for anyone with on-line access.
  • I suggest that the profile table G32 be adjusted to include an extra column for "All private dwellings" which includes occupied private dwellings, unoccupied private dwellings and other non-classifiable households.  This should thus be comparable with the TB results for STRD.  Also retaining the column as currently designed will allow consistency between years and a direct relationship between the number of occupied dwellings and the people therein.
  • It would be helpful for Census Dictionary classifications to be supplemented as below:
    • STRD have additional text added to indicate that "not stated includes households which the ABS Field Officer determined were occupied on Census night but where the ABS Field Officer could not make contact" - ie the text from HHCD and possibly a second item to indicate that it included dwellings where the person who filled in the form on-line did not indicate the structure of the dwelling.
    • HHCD have additional text summarising the explanation of "not classifiable" currently in the Glossary.

Thursday, 15 February 2018

Looking down on Foxlow Lagoon

I decided to do one of my monthly visits to Foxlow Lagoon today as it often adds a few species of waterbirds to the list for the Gazette area.

The view as I turned on to Captains Flat Rd was not exactly promising.  Hazy murk was the phrase that sprang to mind.
 My usual practice is to park at a gate and peer down from the car.  As today had a total fire ban I was unsure whether the grass would be too high to park on.  It didn't matter as the Council has graded the road and I didn't fancy blasting the Pajero over that heap of crud.
So I parked on the side of the road and looked down on the Lagoon, which was still apparently well endowed with water.
However when one observes that it barely covered a Black Swan's feet some rain is needed to prevent the water vanishing completely.
Yesterday I spent some time unsuccessfully searching water bodies in Belconnen for some reported Red-necked Avocets.  Today was more successful with two of them by the red arrows.  The yellow arrows point at 4 of the 11 White-faced Herons I recorded. Sorry about the quality of the image but it was hazy and the birds were 500m away.
Two of the Herons flew in towards the end of my stay.  They came in at a fair height, gliding like very large raptors and side-slipping like a paraglider!

That image also shows a few of the ducks and other fowl on the margins of the Lagoon.  Suddenly all the fowl took off.  I immediately scanned for a raptor but couldn't spot one.  Then I noticed a couple of hundred sheep galloping along - perhaps a fox?  Nope, a ute full of Kelpies escorting the mutton.  I hoped they were going to come up to the road as I'd like a chat with those folk, but they went in the opposite direction.

The fowl all landed back on the water.  Click on this image and they will be visible, with a yellow dot as I counted them on the screen.  There are 84 in the image, which I think was between 25 and 30% of the flock.  My total estimate, as reported to eBird, across several species was just over 300 birds so that hangs together fairly well.
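The "hangs together" claim can be checked by scaling the screen count up by the assumed share of the flock in the image. A small sketch, using only the 84 birds and the 25-30% range mentioned above (the variable names are my own):

```python
counted_in_image = 84

# The image was thought to hold between 25% and 30% of the flock.
low_percent, high_percent = 25, 30

# A smaller share in the image implies a larger flock, and vice versa.
flock_max = counted_in_image * 100 / low_percent
flock_min = counted_in_image * 100 / high_percent

print(f"Implied flock size: {flock_min:.0f} to {flock_max:.0f} birds")
```

An estimate of just over 300 sits inside that implied range.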