Data Science Desktop Survival Guide
by Graham Williams

Read Strings from a File

20180604 Sometimes we want to load a file as plain text, with each line read as one string and the result returned as a character vector. We achieve this using the function base::readLines(). In the following example we access the file weather.csv that is provided by the rattle package, locating it with base::system.file().

library(glue)         # Format strings: glue().
library(magrittr)     # Pipelines: %>% and %T>%.

dsname <- "weather"   # Dataset name.
ftype  <- "csv"       # Source dataset file type.
fname <- glue("{dsname}.{ftype}")

fname %T>%
  print() %>%
  system.file(ftype, ., package="rattle") %>%
  readLines() ->
ds
## weather.csv
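
The same call can be tried without the rattle package installed. The following is a minimal sketch, assuming a small temporary file written with base::writeLines(); the file contents and the names tmp, txt, and header are made up for illustration and are not part of the original example.

```r
# Write three lines to a temporary file (illustrative data only).
tmp <- tempfile(fileext = ".csv")
writeLines(c("\"Date\",\"WindDir\"",
             "2007-11-01,\"NW\"",
             "2007-11-02,\"ENE\""),
           tmp)

# Read the file back, one string per line.
txt <- readLines(tmp)

length(txt)   # Number of lines read.
class(txt)    # The result is a character vector.

# Read only the first line, for example to inspect the header.
header <- readLines(tmp, n = 1)
```

The n= argument limits how many lines are read, which can be handy for peeking at a large file before loading all of it.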

Show the first few lines using utils::head():

head(ds)
## [1] "\"Date\",\"Location\",\"MinTemp\",\"MaxTemp\",\"Rainfall\",\"Evapor...
## [2] "2007-11-01,\"Canberra\",8,24.3,0,3.4,6.3,\"NW\",30,\"SW\",\"NW\",6,...
## [3] "2007-11-02,\"Canberra\",14,26.9,3.6,4.4,9.7,\"ENE\",39,\"E\",\"W\",...
## [4] "2007-11-03,\"Canberra\",13.7,23.4,3.6,5.8,3.3,\"NW\",85,\"N\",\"NNE...
## [5] "2007-11-04,\"Canberra\",13.3,15.5,39.8,7.2,9.1,\"NW\",54,\"WNW\",\"...
## [6] "2007-11-05,\"Canberra\",7.6,16.1,2.8,5.6,10.6,\"SSE\",50,\"SSE\",\"...
....

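Find those strings that contain a specific pattern using base::grep(). By default it returns the indices of the matching strings; with value=TRUE it returns the matching strings themselves.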

grep("ENE", ds)
##  [1]   3  10  23  26  28  36  37  42  43  49  50  54  68  69  71  76  86...
## [19]  97 101 103 106 108 109 110 118 129 132 133 135 138 145 160 171 176...
## [37] 222 278 303 304 310 323 341 348 351 357 365

grep("ENE", ds, value=TRUE)
##  [1] "2007-11-02,\"Canberra\",14,26.9,3.6,4.4,9.7,\"ENE\",39,\"E\",\"W\"...
##  [2] "2007-11-09,\"Canberra\",8.8,19.5,0,4,4.1,\"S\",48,\"E\",\"ENE\",19...
##  [3] "2007-11-22,\"Canberra\",16.4,19.4,0.4,9.2,0,\"E\",26,\"ENE\",\"E\"...
##  [4] "2007-11-25,\"Canberra\",15.4,28.4,0,4.4,8.1,\"ENE\",33,\"SSE\",\"N...
##  [5] "2007-11-27,\"Canberra\",13.3,22.2,0.2,6.6,2.3,\"ENE\",39,\"E\",\"E...
##  [6] "2007-12-05,\"Canberra\",14.5,21.8,0,8.4,9.8,\"ENE\",43,\"ESE\",\"E...
....
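
Because each line is a single string, the pattern is matched anywhere in the line rather than in one particular column. The following is a minimal sketch on a few made-up strings (not taken from weather.csv) comparing the index and value forms of base::grep() with base::grepl(), and showing one way to avoid a short pattern such as NE also matching ENE. The names txt, idx, hit, has, and ne are illustrative.

```r
# Illustrative strings only; not taken from weather.csv.
txt <- c("2007-11-01,\"NW\",30",
         "2007-11-02,\"ENE\",39",
         "2007-11-03,\"NE\",85",
         "2007-11-04,\"ENE\",54")

idx <- grep("ENE", txt)                 # Indices of the matching strings.
hit <- grep("ENE", txt, value = TRUE)   # The matching strings themselves.
has <- grepl("ENE", txt)                # One logical value per string.

# A bare "NE" would also match "ENE". Including the surrounding quotes
# and using fixed = TRUE matches the literal text "NE" only.
ne <- grep("\"NE\"", txt, fixed = TRUE)
```

Here idx and has describe the same matches in two forms: idx is convenient for subsetting by position, while has can be used directly as a logical filter.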


Copyright © 2000-2020 Togaware Pty Ltd. Creative Commons ShareAlike V4.