Data Science Desktop Survival Guide
by Graham Williams
Dollar to Numeric Conversion
20200813 Dollar amounts ingested from other applications are often formatted as strings with a $ prefix and commas as thousands separators. Use readr::parse_number() to convert such strings to numeric values, which then allows us to perform calculations on the amounts.
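As a minimal sketch of the conversion on its own, outside of any pipeline, the following applies readr::parse_number() to a small character vector. The vector amounts and its values are hypothetical, made up for illustration, and the default locale (comma as the grouping mark, period as the decimal mark) is assumed.

```r
library(readr)

# Hypothetical sample values, for illustration only.
amounts <- c("$81,838.00", "$7,568.23", "$1,000")

# parse_number() drops the "$" prefix and the grouping commas,
# returning a numeric (double) vector.
values <- parse_number(amounts)

print(values)

# Arithmetic is now possible on the converted amounts.
print(sum(values))
```

The original strings remain untouched in amounts, so the two vectors can be compared side by side to check the conversion.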
In the example below we use readr::parse_number() within a pipeline. It begins by ingesting the data from a CSV file using readr::read_csv(). We then normalise the variable names using janitor::clean_names(). A tee pipe (the magrittr operator %T>%) is then used to send the data along two pipes. The first pipe (between the curly brackets) prints a sample from the column to be converted, for information.
It is the other pipe that continues along the pipeline to perform the actual conversion using dplyr::mutate(). Here readr::parse_number() is applied to the appropriate column to effect the transformation of the dollar amount from string to numeric.
Another tee pipe is then used to print the result, for information and confirmation, whilst also assigning the dataset into a variable.
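library(readr)     # read_csv(), parse_number()
library(dplyr)     # mutate(), pull()
library(janitor)   # clean_names()
library(magrittr)  # %>% and the tee pipe %T>%

IFILE <- "data/dollars.csv"

IFILE %>%
  read_csv() %>%
  clean_names(numerals="right") %T>%
  {pull(., income) %>% head(20) %>% print(); cat("\n")} %>%
  mutate(income=parse_number(income)) %T>%
  {pull(., income) %>% head(20) %>% print(); cat("\n")} ->
dollars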
##
## [1] "$81,838.00" "$72,099.00" "$154,676.74" "$27,743.82" "$7,568.23"
## [6] "$33,144.40" "$43,391.17" "$59,906.65" "$126,888.91" "$52,466.49"
## [11] "$291,416.11" "$24,155.31" "$143,254.86" "$120,554.81" "$34,919.16"
## [16] "$67,176.79" "$9,608.48" "$12,475.84" "$32,963.39" "$31,534.97"
##
## [1] 81838.00 72099.00 154676.74 27743.82 7568.23 33144.40 43391.17
## [8] 59906.65 126888.91 52466.49 291416.11 24155.31 143254.86 120554.81
## [15] 34919.16 67176.79 9608.48 12475.84 32963.39 31534.97
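To experiment with the tee pipe without the CSV file to hand, the same pattern can be sketched on a small in-memory dataset. The tibble sample_data and the result variable converted are hypothetical names introduced here for illustration, and the three income values are simply copied from the sample output above.

```r
library(dplyr)     # tibble(), mutate(), pull()
library(magrittr)  # %>% and the tee pipe %T>%
library(readr)     # parse_number()

# Hypothetical stand-in for the data ingested from the CSV file.
sample_data <- tibble(income = c("$81,838.00", "$72,099.00", "$154,676.74"))

sample_data %T>%
  # Side branch: print the strings, then pass sample_data on unchanged.
  {pull(., income) %>% print()} %>%
  # Main branch: convert the dollar strings to numeric.
  mutate(income = parse_number(income)) %T>%
  # Side branch: print the converted values for confirmation.
  {pull(., income) %>% print()} ->
converted
```

The tee pipe passes its left-hand side through unchanged, ignoring whatever the bracketed expression returns, which is why the printing steps do not disturb the data flowing along the main pipeline.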