Letters, numbers, and underscore are all considered word characters (see Recipe 2.6).
Note that the first word boundary token appears after the optional, opening parenthesis.
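To illustrate the placement of the word boundaries, here is a minimal Python sketch. The pattern is an assumption reconstructed from the description above (three captured digit groups, optional parentheses, and a hyphen, dot, or space as separator); the function name `find_phone_numbers` and the sample text are hypothetical.

```python
import re

# No ^ and $ anchors here, so the pattern can match inside longer text.
# The first \b sits after the optional "(" because "(" is not a word
# character; the final \b stops matches inside longer runs of digits.
PHONE_IN_TEXT = re.compile(
    r"\(?\b([0-9]{3})\)?[-. ]?([0-9]{3})[-. ]?([0-9]{4})\b"
)

def find_phone_numbers(text):
    """Return every substring of text that looks like a phone number."""
    return [m.group(0) for m in PHONE_IN_TEXT.finditer(text)]

sample = ("Call (123) 456-7890 or 987.654.3210; "
          "order id 12345678901234 is not a phone number.")
print(find_phone_numbers(sample))
```

The 14-digit order id is skipped because no word boundary exists between two adjacent digits.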
Standard capturing groups require the regular expression engine to keep track of backreferences, so it’s more efficient to use noncapturing groups whenever the text matched by a group does not need to be referenced later.
Another reason to use a noncapturing group here is to allow you to keep using the same replacement string as in the previous examples.
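The effect on group numbering can be seen in a short Python sketch. Both patterns are assumptions built from the description in this recipe (an optional leading "1" in front of the three digit groups), and the replacement string is assumed to be the `($1) $2-$3` style used for formatting, written in Python syntax.

```python
import re

# Replacement string written for a pattern with exactly three digit groups.
REPLACEMENT = r"(\1) \2-\3"

# Leading "1" inside a noncapturing group: digit groups stay numbered 1 to 3.
NONCAPTURING = re.compile(
    r"^(?:\+?1[-. ]?)?\(?([0-9]{3})\)?[-. ]?([0-9]{3})[-. ]?([0-9]{4})$"
)

# Same pattern with a capturing group: digit groups shift to 2 to 4.
CAPTURING = re.compile(
    r"^(\+?1[-. ]?)?\(?([0-9]{3})\)?[-. ]?([0-9]{3})[-. ]?([0-9]{4})$"
)

subject = "+1 123.456.7890"
print(NONCAPTURING.groups, NONCAPTURING.sub(REPLACEMENT, subject))
print(CAPTURING.groups, CAPTURING.sub(REPLACEMENT, subject))
```

With the capturing variant, the replacement string would have to be rewritten to use groups 2 through 4; the noncapturing variant needs no change.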
By using capturing groups to remember each set of digits, the same regular expression can be used to replace the subject text with precisely the format you want. Two other types of tokens used in this regular expression are character classes and quantifiers.
The supported formats are 1234567890, 123-456-7890, 123.456.7890, 123 456 7890, (123) 456 7890, and all related combinations.
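As a rough sketch of validating and reformatting in one step, the following Python code uses a pattern assumed from the description above: three captured digit groups built from character classes and `{n}` quantifiers, with optional parentheses and separators. The function name `format_phone` and the output layout `(123) 456-7890` are illustrative choices, not something this recipe prescribes.

```python
import re

PHONE = re.compile(r"^\(?([0-9]{3})\)?[-. ]?([0-9]{3})[-. ]?([0-9]{4})$")

def format_phone(subject):
    """Return subject as (123) 456-7890, or None if it is not a match."""
    # fullmatch is used so that a trailing newline is not silently accepted.
    match = PHONE.fullmatch(subject)
    if match is None:
        return None
    # The three capturing groups hold the three sets of digits.
    return match.expand(r"(\1) \2-\3")

for candidate in ["1234567890", "123-456-7890", "123.456.7890",
                  "(123) 456 7890", "(123)456-7890", "12345678901"]:
    print(repr(candidate), "->", format_phone(candidate))
```

Every accepted layout is normalized to the same output, and anything else yields `None`.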
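If you want to limit matches to valid phone numbers according to the North American Numbering Plan, here are the basic rules: area codes start with a digit from 2 to 9, followed by a digit from 0 to 8, and then any third digit; the second group of three digits (the central office or exchange code) starts with a digit from 2 to 9, followed by any two digits; the final four digits have no restrictions.
Beyond the basic rules just listed, there are a variety of reserved, unassigned, and restricted phone numbers.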
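Basic numbering-plan rules of this kind can be expressed by tightening the character classes. The sketch below assumes that the area code has the form `[2-9][0-8][0-9]`, that the exchange code has the form `[2-9][0-9][0-9]`, and that the last four digits are unrestricted; the name `is_nanp_number` is hypothetical, and no attempt is made to exclude reserved or unassigned numbers.

```python
import re

NANP = re.compile(
    r"^\(?([2-9][0-8][0-9])\)?[-. ]?"   # area code: 2-9, then 0-8, then any digit
    r"([2-9][0-9]{2})[-. ]?"            # exchange code: 2-9, then any two digits
    r"([0-9]{4})$"                      # last four digits: unrestricted
)

def is_nanp_number(subject):
    """Return True if subject satisfies the basic rules assumed above."""
    return NANP.fullmatch(subject) is not None

for candidate in ["212-555-0123", "(415) 555 0199", "123-456-7890",
                  "290-555-0123", "212-155-0123"]:
    print(candidate, is_nanp_number(candidate))
```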
Unless you have very specific needs that require you to filter out as many phone numbers as possible, don’t go overboard trying to eliminate unused numbers.
The added noncapturing group as a whole is also optional, but because the “1” is required within the group, a preceding plus sign or following separator is not allowed unless the leading “1” is present.
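A few sample inputs make this behavior concrete. The pattern below is an assumption reconstructed from the description (an optional plus sign, a required "1", and an optional separator, all inside an optional noncapturing group); the helper name `accepts` is hypothetical.

```python
import re

WITH_COUNTRY_CODE = re.compile(
    r"^(?:\+?1[-. ]?)?"                                   # optional +1 prefix
    r"\(?([0-9]{3})\)?[-. ]?([0-9]{3})[-. ]?([0-9]{4})$"  # the ten digits
)

def accepts(subject):
    """Return True if the whole subject matches the pattern."""
    return WITH_COUNTRY_CODE.fullmatch(subject) is not None

for candidate in ["+1 (123) 456-7890", "1-123-456-7890", "123-456-7890",
                  "+123-456-7890", "-123-456-7890"]:
    print(repr(candidate), accepts(candidate))
```

The last two inputs are rejected because a plus sign or separator appears without the "1" that the group requires.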
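To allow matching phone numbers that omit the local area code, enclose the first group of digits together with its surrounding parentheses and following separator in an optional, noncapturing group. Because the area code is then no longer required, a replacement string that always wraps the first group in parentheses can leave you with an empty set of parentheses.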
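One way to handle a missing area code is to build the output in code rather than with a fixed replacement string. The following Python sketch assumes a pattern reconstructed from the description above; the function name `format_local_or_full` and both output layouts are illustrative choices.

```python
import re

# The area code, its parentheses, and its separator form one optional,
# noncapturing group; the digits inside are still captured as group 1.
OPTIONAL_AREA = re.compile(
    r"^(?:\(?([0-9]{3})\)?[-. ]?)?([0-9]{3})[-. ]?([0-9]{4})$"
)

def format_local_or_full(subject):
    """Format with an area code when present, without one otherwise."""
    match = OPTIONAL_AREA.fullmatch(subject)
    if match is None:
        return None
    area, exchange, line = match.groups()
    if area is None:  # group 1 did not take part in the match
        return f"{exchange}-{line}"
    return f"({area}) {exchange}-{line}"

# A fixed replacement string inserts an empty group 1 (Python 3.5 or later).
print(OPTIONAL_AREA.sub(r"(\1) \2-\3", "555-1212"))
print(format_local_or_full("555-1212"))
print(format_local_or_full("(123) 555-1212"))
```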