Data Science Desktop Survival Guide
by Graham Williams

Regexp Character Classes

20180608 A character class is a collection of characters that are in some way grouped together. We enclose the characters to be grouped within square brackets []. The pattern then matches any single character from the set. For example, the character class [0-9] matches any one of the digits from 0 to 9.

  Character Class                      Description
1 [0-9]                                Digits
2 [a-z]                                Lower-case letters
3 [A-Z]                                Upper-case letters
4 [a-zA-Z]                             Alphabetic characters
5 [^a-zA-Z]                            Non-alphabetic characters
6 [a-zA-Z0-9]                          Alphanumeric characters
7 [\n\t\r\f\v]                         Space characters
8 [!,:;`\)}@-]$*+.?[^{|(\\#%&~_/<=>']  Punctuation characters

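s <- c("abc12", "@#$", "345", "ABcd")
# Elements containing at least one digit.
grep(pattern="[0-9]+", s, value=TRUE)
## [1] "abc12" "345"
# Elements containing at least one upper-case letter.
grep(pattern="[A-Z]+", s, value=TRUE)
## [1] "ABcd"
# Elements containing at least one character other than @, # or $.
grep(pattern="[^@#$]+", s, value=TRUE)
## [1] "abc12" "345"   "ABcd"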
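
Character classes are also useful beyond filtering with grep(). The following is a minimal sketch, reusing the sample vector s from above, of a negated class used with gsub() to strip characters and an anchored class used with grepl() to test whole strings. The variable names digits_only and is_alnum are chosen here purely for illustration.

```r
# Same sample vector as in the examples above.
s <- c("abc12", "@#$", "345", "ABcd")

# Negated class: delete every character that is not a digit.
digits_only <- gsub(pattern="[^0-9]", replacement="", s)
print(digits_only)
## [1] "12"  ""    "345" ""

# Anchored class: TRUE only where the whole string is alphanumeric.
is_alnum <- grepl(pattern="^[a-zA-Z0-9]+$", s)
print(is_alnum)
## [1]  TRUE FALSE  TRUE  TRUE
```

The anchors ^ and $ force the class to account for the whole string, whereas the unanchored grep() calls above only require a match somewhere within it.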

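R also supports POSIX character classes. Each is written as a name between [: and :], such as [:alpha:], and is used inside a bracket expression, which gives the double square brackets seen in patterns like [[:alpha:]].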

grep(pattern="[[:alpha:]]", s, value=TRUE)
## [1] "abc12" "ABcd"
grep(pattern="[[:upper:]]", s, value=TRUE)
## [1] "ABcd"
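
A few further POSIX classes can be sketched in the same way, again using the sample vector s. The R documentation for regular expressions notes that ranges such as [a-z] can depend on the locale, so the named classes may be the more portable choice. The variable names below are illustrative only.

```r
s <- c("abc12", "@#$", "345", "ABcd")

# Elements containing at least one digit.
has_digit <- grep(pattern="[[:digit:]]", s, value=TRUE)
print(has_digit)
## [1] "abc12" "345"

# Elements containing at least one punctuation character.
has_punct <- grep(pattern="[[:punct:]]", s, value=TRUE)
print(has_punct)
## [1] "@#$"

# A POSIX class can be negated inside the outer brackets:
# elements containing at least one non-alphanumeric character.
has_non_alnum <- grep(pattern="[^[:alnum:]]", s, value=TRUE)
print(has_non_alnum)
## [1] "@#$"
```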


Copyright © 2000-2020 Togaware Pty Ltd. Creative Commons ShareAlike V4.