Regexp Character Classes

		Data Science Desktop Survival Guide by Graham Williams



20180608 A character class is a set of characters grouped together by enclosing them within square brackets []. The pattern then matches any one character from the set. For example, the character class [0-9] matches any single digit from 0 to 9.

	Character Class	Description
1	[0-9]	Digits
2	[a-z]	Lower-case letters
3	[A-Z]	Upper-case letters
4	[a-zA-Z]	Alphabetic characters
5	[^a-zA-Z]	Non-alphabetic characters
6	[a-zA-Z0-9]	Alphanumeric characters
7	[ \n\t\r\f\v]	Space characters
8	[!,:;`\)}@-]$*+.?[^{|(\\#%&~_/=']	Punctuation characters

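# Sample strings used throughout the examples.
s <- c("abc12", "@#$", "345", "ABcd")
# Return the elements that contain one or more digits.
grep(pattern="[0-9]+", s, value=TRUE)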

## [1] "abc12" "345"

grep(pattern="[A-Z]+", s, value=TRUE)

## [1] "ABcd"

# A leading ^ negates the class: keep elements with a character other than @, # or $.
grep(pattern="[^@#$]+", s, value=TRUE)

## [1] "abc12" "345"   "ABcd"
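
The other classes from the table can be exercised in the same way. The following is a minimal sketch using only base R functions (grepl and gsub); the vector x and the result names are made up for illustration and are not part of the original examples.

```r
# Hypothetical sample strings, chosen for illustration only.
x <- c("abc12", "@#$", "345", "ABcd", "a b")

# TRUE where the string contains at least one non-alphabetic character.
has_nonalpha <- grepl("[^a-zA-Z]", x)

# TRUE where the whole string is alphanumeric; ^ and $ outside the
# brackets anchor the match to the start and end of the string.
all_alnum <- grepl("^[a-zA-Z0-9]+$", x)

# Remove every character that is not a digit.
digits_only <- gsub("[^0-9]", "", x)

print(has_nonalpha)
print(all_alnum)
print(digits_only)
```

Note that ^ has two roles here: as the first character inside the brackets it negates the class, whereas outside the brackets it anchors the pattern to the start of the string.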

R also supports POSIX character classes. Each is written in the form [:name:] and must appear inside a bracket expression, which gives the double square brackets seen in, for example, [[:alpha:]].

grep(pattern="[[:alpha:]]", s, value=TRUE)

## [1] "abc12" "ABcd"

grep(pattern="[[:upper:]]", s, value=TRUE)

## [1] "ABcd"
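
Several of the explicit classes in the table have POSIX counterparts. The sketch below assumes the same sample vector s as the examples above (redefined so the block runs on its own) and uses [[:digit:]], [[:punct:]] and [[:alnum:]]; the result names are illustrative only.

```r
# Same sample vector as in the examples above.
s <- c("abc12", "@#$", "345", "ABcd")

# [[:digit:]] plays the role of [0-9].
with_digit <- grep(pattern="[[:digit:]]", s, value=TRUE)

# [[:punct:]] matches punctuation characters.
with_punct <- grep(pattern="[[:punct:]]", s, value=TRUE)

# A POSIX class can be negated like any other item in a bracket
# expression: keep strings made up entirely of non-alphanumerics.
no_alnum <- grep(pattern="^[^[:alnum:]]+$", s, value=TRUE)

print(with_digit)
print(with_punct)
print(no_alnum)
```

The POSIX forms are generally the more portable choice, since what a range such as [a-z] covers can depend on the locale.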

Copyright © 2000-2020 Togaware Pty Ltd. Creative Commons ShareAlike V4.