10.33 Location

20180723 From our review of the data so far we can start to make some observations about the character variables. The first is location. Several locations were reported in the exploration of the dataset above. We can confirm the number of locations by using base::unique() to extract the distinct values of the variable in the original dataset and then counting them with base::length().

# How many locations are represented in the dataset?

ds$location %>% 
  unique() %>%
  length()
## [1] 49

In general we may not know what other locations we will come across in related datasets, and with 49 distinct locations we already have quite a collection. We will therefore retain this variable as a character data type.
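Before relying on that decision we might confirm how the variable is actually stored and whether any locations are missing. The following is a minimal sketch only. The small ds built here is a hypothetical stand-in so that the example runs on its own, and with the real dataset that step would be skipped.

```r
library(magrittr)

# Hypothetical stand-in for ds. The real dataset is loaded earlier.

ds <- data.frame(location=c("Sydney", "Uluru", NA, "Albany"),
                 stringsAsFactors=FALSE)

# Confirm the variable is stored as character.

ds$location %>% class()

# Count any missing values of location.

ds$location %>%
  is.na() %>%
  sum()
```

If base::class() were to report a factor instead, the variable could be converted back with base::as.character().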

Here is a list of locations and their frequencies in the dataset.

ds$location %>%
  table()
## .
##         Adelaide           Albany           Albury     AliceSprings 
##             3832             3679             3680             3680 
##    BadgerysCreek         Ballarat          Bendigo         Brisbane 
##             3632             3680             3671             3833 
##           Cairns         Canberra            Cobar     CoffsHarbour 
##             3680             4076             3649             3649 
##         Dartmoor           Darwin        GoldCoast           Hobart 
##             3649             3833             3680             3833 
##        Katherine       Launceston        Melbourne MelbourneAirport 
##             2218             3680             3833             3649 
##          Mildura            Moree     MountGambier      MountGinini 
##             3649             3649             3679             3680 
##        Newcastle             Nhil        NorahHead    NorfolkIsland 
##             3680             2218             3644             3649 
##        Nuriootpa       PearceRAAF          Penrith            Perth 
##             3648             3648             3679             3832 
##     PerthAirport         Portland         Richmond             Sale 
##             3648             3649             3649             3649 
##       SalmonGums           Sydney    SydneyAirport       Townsville 
##             3602             3984             3649             3680 
##      Tuggeranong            Uluru       WaggaWagga          Walpole 
##             3679             2218             3649             3645 
##         Watsonia      Williamtown      Witchcliffe       Wollongong 
##             3649             3649             3648             3680 
##          Woomera 
##             3649
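The table above is ordered alphabetically, which makes it hard to notice that a few locations (Katherine, Nhil, and Uluru, each with 2218 observations) have noticeably fewer observations than the rest. As a minimal sketch, we might sort the frequencies with base::sort() so that the least observed locations come first. The small ds built here is a hypothetical stand-in so that the example runs on its own, and with the real dataset that step would be skipped.

```r
library(magrittr)

# Hypothetical stand-in for ds. The real dataset is loaded earlier.

ds <- data.frame(location=c("Sydney", "Sydney", "Sydney",
                            "Uluru", "Albany", "Albany"),
                 stringsAsFactors=FALSE)

# Frequencies sorted from the least to the most observed location.

freq <- ds$location %>%
  table() %>%
  sort()

# The locations with the fewest observations are listed first.

freq %>% head(3)
```

Passing decreasing=TRUE to base::sort() would instead list the most frequently observed locations first.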


Copyright © 1995-2021 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0.