10.48 Location

20180723 From our review of the data so far we can begin to make some observations about the character variables. The first is location. Several locations were reported in the exploration of the dataset above. We can confirm the number of locations by counting the distinct values of the variable in the original dataset using base::unique().

# How many locations are represented in the dataset.

ds$location %>% 
  unique() %>%
  length()
## [1] 49
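
One caveat worth keeping in mind is that unique() treats a missing value as a value in its own right, so any NA in the variable would be included in the count. The following is a minimal sketch in base R, assuming a small made-up vector (the name location and its values are illustrative and are not taken from the dataset).

```r
# Hypothetical toy vector standing in for ds$location (not the real data).
location <- c("Adelaide", "Albany", NA, "Adelaide", "Canberra")

# unique() keeps NA as a distinct value, so it is counted here.
n_with_na <- length(unique(location))

# Dropping missing values first counts only the observed locations.
n_without_na <- length(unique(location[!is.na(location)]))

print(c(with_na = n_with_na, without_na = n_without_na))
```

If the two counts agree for a variable then it contains no missing values.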

We may not know in general what other locations we will come across in related datasets, and we already have 49 distinct locations. We will therefore retain this variable as a character data type rather than, for example, converting it to a factor with a fixed set of levels.
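
To illustrate why a character variable can be the more forgiving choice when new locations may turn up, here is a small sketch in base R. It assumes a made-up vector of locations (not the real data) and compares what happens when a previously unseen location is assigned to a factor and to a character vector.

```r
# Hypothetical toy vector standing in for ds$location (not the real data).
location <- c("Adelaide", "Albany", "Adelaide", "Canberra")

# A factor fixes its set of levels when it is created.
loc_factor <- factor(location)

# Assigning a location outside those levels produces NA (R also warns
# about an invalid factor level, suppressed here to keep the output tidy).
suppressWarnings(loc_factor[2] <- "Hobart")
print(is.na(loc_factor[2]))

# A character vector simply stores the new location.
loc_char <- location
loc_char[2] <- "Hobart"
print(loc_char[2])
```

The factor silently loses the new value unless its levels are extended first, whereas the character vector needs no such bookkeeping.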

Here is a list of locations and their frequencies in the dataset.

ds$location %>%
  table()
## .
##         Adelaide           Albany           Albury     AliceSprings 
##             3924             4739             4740             4740 
##    BadgerysCreek         Ballarat          Bendigo         Brisbane 
##             4690             4738             4731             4893 
##           Cairns         Canberra            Cobar     CoffsHarbour 
##             4740             5136             4709             4709 
##         Dartmoor           Darwin        GoldCoast           Hobart 
##             4709             4893             4740             4893 
##        Katherine       Launceston        Melbourne MelbourneAirport 
##             3278             4740             4893             4709 
##          Mildura            Moree     MountGambier      MountGinini 
##             4709             4709             4739             4728 
##        Newcastle             Nhil        NorahHead    NorfolkIsland 
##             4030             3278             4704             4709 
##        Nuriootpa       PearceRAAF          Penrith            Perth 
##             4708             4708             4739             4892 
##     PerthAirport         Portland         Richmond             Sale 
##             4708             4709             4709             4709 
##       SalmonGums           Sydney    SydneyAirport       Townsville 
##             4662             5044             4709             4740 
##      Tuggeranong            Uluru       WaggaWagga          Walpole 
##             4739             3278             4709             4705 
##         Watsonia      Williamtown      Witchcliffe       Wollongong 
....
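
The table is listed alphabetically, which makes it harder to spot locations with noticeably fewer observations (Katherine, Nhil, and Uluru each show 3278 in the portion visible above). One possible approach, sketched here in base R on a made-up vector rather than the real data, is to sort the frequencies.

```r
# Hypothetical toy vector standing in for ds$location (not the real data).
location <- c("Uluru", "Sydney", "Sydney", "Canberra", "Sydney", "Canberra")

# table() counts each value; sort() orders the counts, smallest first.
freq <- sort(table(location))
print(freq)

# The location with the fewest observations appears first.
print(names(freq)[1])
```

Passing decreasing=TRUE to sort() would instead list the most frequently observed locations first.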


Copyright © 1995-2022 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0