(JavaScript) Regular Expression with Capture Groups
Note: Chilkat uses PCRE2. See PCRE2 Regular Expressions
Also see: PCRE2 Performance
This example applies the following PCRE2 regular expression:
Name:\s+(\w+)\s+(\w+),\s+Email:\s+(\S+)
to this string:
Name: John Smith, Email: john.smith@example.com
The Chilkat sample code appears at the end of this page, after the explanation of the pattern.
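As a quick cross-check outside Chilkat, the same pattern can be tried with JavaScript's built-in RegExp. This is only a sketch: the built-in engine is not PCRE2, but this pattern uses only constructs (\s, \w, \S, + and capture groups) that behave the same way in both for this ASCII input. The variable names are illustrative and are not part of the Chilkat API.

```javascript
// Sketch using the built-in RegExp engine (not Chilkat, not PCRE2).
const pattern = /Name:\s+(\w+)\s+(\w+),\s+Email:\s+(\S+)/;
const subject = "Name: John Smith, Email: john.smith@example.com";

const m = pattern.exec(subject);
if (m === null) {
    console.log("No match");
} else {
    // m[0] is the entire match; m[1] through m[3] are the capture groups.
    for (let g = 0; g < m.length; g++) {
        console.log(g + ": " + m[g]);
    }
}
```

A regex literal needs single backslashes, whereas a pattern held in a string literal needs each backslash doubled.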
Regex Components Explained
| Part | Meaning | Matched Text |
|------|---------|--------------|
| "Name:" | Matches the literal text "Name:" | "Name:" |
| "\s+" | Matches one or more whitespace characters (spaces, tabs, etc.) | (space) |
| "(\w+)" | Capture Group 1: One or more word characters ("a-zA-Z0-9_") | "John" |
| "\s+" | More whitespace | (space) |
| "(\w+)" | Capture Group 2: Another word (the last name) | "Smith" |
| "," | A literal comma | "," |
| "\s+" | Whitespace again | (space) |
| "Email:" | Matches the literal "Email:" | "Email:" |
| "\s+" | Whitespace | (space) |
| "(\S+)" | Capture Group 3: One or more non-whitespace characters | "john.smith@example.com" |
Matches for the Example String
String:
"Name: John Smith, Email: john.smith@example.com"
Regex Match Groups:
| Group | Captured Value |
|-------|----------------|
| Group 1 | "John" |
| Group 2 | "Smith" |
| Group 3 | "john.smith@example.com" |
Notes on Character Classes
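\w matches [a-zA-Z0-9_], so it does not match punctuation such as a period. This is why (\w+) would stop at "john" if it were applied to the email address.
\S matches any non-whitespace character, so (\S+) captures the entire email address, including the period and the @ sign.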
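The difference between the two classes can be seen by applying each to the email address on its own. The snippet below is a minimal sketch that uses JavaScript's built-in RegExp rather than Chilkat; for ASCII input these two classes behave as described above. The variable names are illustrative.

```javascript
// Sketch with the built-in RegExp engine (not Chilkat, not PCRE2).
const email = "john.smith@example.com";

// \w+ stops at the first character outside [a-zA-Z0-9_], here the period.
const wordPrefix = email.match(/^\w+/)[0];

// \S+ continues until whitespace or the end of the string.
const nonSpaceRun = email.match(/^\S+/)[0];

console.log(wordPrefix);   // john
console.log(nonSpaceRun);  // john.smith@example.com
```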
Note: This example requires Chilkat v11.1.0 or greater. For more information, see https://www.chilkatsoft.com/chilkat_pcre2.asp
var success = false;
var subject = "Name: John Smith, Email: john.smith@example.com";
var pattern = "Name:\\s+(\\w+)\\s+(\\w+),\\s+Email:\\s+(\\S+)";
var sb = new CkStringBuilder();
sb.Append(subject);
var json = new CkJsonObject();
json.EmitCompact = false;
var timeoutMs = 2000;
var numMatches = sb.RegexMatch(pattern,json,timeoutMs);
if (numMatches < 0) {
    // Probably an error in the regular expression.
    // Suggestion: Use AI to help create and/or diagnose regular expressions.
    console.log(sb.LastErrorText);
    return;
}
// Examine the matches:
console.log(json.Emit());
// This is the JSON with the match information.
// See the JSON parsing code below to get the matched capture group values.
// Important: Capture group 0 always contains the entire match — that is, the portion of the input string that matches the full regular expression.
// {
//   "match": [
//     {
//       "group": [
//         {
//           "cap": "Name: John Smith, Email: john.smith@example.com",
//           "idx": 0,
//           "len": 47
//         },
//         {
//           "cap": "John",
//           "idx": 6,
//           "len": 4
//         },
//         {
//           "cap": "Smith",
//           "idx": 11,
//           "len": 5
//         },
//         {
//           "cap": "john.smith@example.com",
//           "idx": 25,
//           "len": 22
//         }
//       ]
//     }
//   ]
// }
var cap;
var i = 0;
var matchCount = json.SizeOfArray("match");
while (i < matchCount) {
    console.log("Match " + (i+1) + ":");
    json.I = i;
    var j = 0;
    var numCaptureGroups = json.SizeOfArray("match[i].group");
    while (j < numCaptureGroups) {
        json.J = j;
        cap = json.StringOf("match[i].group[j].cap");
        console.log(j + ": " + cap);
        j = j+1;
    }
    i = i+1;
}
// Output (group 0 is the entire match, followed by capture groups 1 through 3):
// Match 1:
// 0: Name: John Smith, Email: john.smith@example.com
// 1: John
// 2: Smith
// 3: john.smith@example.com
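The match JSON can also be read with plain JSON.parse once it is available as text (in the sample above, that text is what json.Emit() returns). The sketch below hard-codes a copy of the JSON shown earlier so that it runs without Chilkat. It assumes that "idx" and "len" are an offset and a length into the subject string, which is consistent with the sample values for this ASCII input but is not confirmed here for non-ASCII text. The names jsonText, parsed, captures and offsetsConsistent are illustrative.

```javascript
// Sketch: read the match JSON with JSON.parse instead of CkJsonObject.
// The object below copies the structure and values shown in the sample JSON.
const subject = "Name: John Smith, Email: john.smith@example.com";
const jsonText = JSON.stringify({
    match: [
        {
            group: [
                { cap: "Name: John Smith, Email: john.smith@example.com", idx: 0, len: 47 },
                { cap: "John", idx: 6, len: 4 },
                { cap: "Smith", idx: 11, len: 5 },
                { cap: "john.smith@example.com", idx: 25, len: 22 }
            ]
        }
    ]
});

const parsed = JSON.parse(jsonText);

// For each match, keep only groups 1..n (group 0 is the entire match).
const captures = parsed.match.map(function (oneMatch) {
    return oneMatch.group.slice(1).map(function (g) { return g.cap; });
});

// Assumption: idx/len locate each capture inside the subject string.
const offsetsConsistent = parsed.match[0].group.every(function (g) {
    return subject.substring(g.idx, g.idx + g.len) === g.cap;
});

console.log(captures[0].join(" | "));  // John | Smith | john.smith@example.com
console.log(offsetsConsistent);        // true
```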