Capture a substring between two characters?
I am trying to write a regex pattern which will capture a substring between two characters. The string is
default_checks/my_checks/VLG6.3: Unsupported system function call
I need to capture VLG6.3. It is between a slash / and a colon :.
I have tried these ideas:
my $rule = $line =~ /\/(.*):/;
my $rule = $line =~ /\/(.+?):/;
my $rule = $line =~ /\/(\w+):/;
But none of them are working. In the best case I get my_checks/VLG6.3.
4 Answers
You are interested in a non-empty string meeting the following conditions: it is preceded by a /, it contains neither / nor :, and it is followed by a :.
So the intuitive regex, without any capturing group, is (?<=\/)[^\/:]+(?=:) (a positive lookbehind, the actual content, and a positive lookahead).
Using such a regex, you can match the line against it with =~ and then read the matched text from $&.
And the example script can look like below:
use strict;
use warnings;
my $line = 'default_checks/my_checks/VLG6.3: Unsupported system function call';
print "Source: $line\n";
if ($line =~ /(?<=\/)[^\/:]+(?=:)/) {
    print "Rule: $&\n";
} else {
    print "No match.\n";
}
Aside from the issue with assigning a list to a scalar, which ikegami has helpfully pointed out, the regex pattern can use some fixing.
The repeater * in regex is greedy. It gobbles up as many characters as it can while still allowing a match. You need to let another repeater do the gobbling up front so that it leaves just enough for the repeater whose match you actually want.
my ($rule) = $line =~ /.*\/(.*):/;
Alternatively, in this case you can use a negated character class instead of matching any character.
my ($rule) = $line =~ /\/([^\/]*):/;
Both of the above will end up with $rule assigned 'VLG6.3'.
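To make the difference concrete, here is a small sketch that runs the question's first attempt next to the two patterns above on the sample line. The variable names ($first_slash, $last_slash, $no_slash) are only illustrative.

```perl
use strict;
use warnings;

my $line = 'default_checks/my_checks/VLG6.3: Unsupported system function call';

# Greedy capture anchored at the first slash: it runs on to the last colon.
my ($first_slash) = $line =~ /\/(.*):/;

# A leading greedy .* consumes up to the last slash that still allows a match.
my ($last_slash) = $line =~ /.*\/(.*):/;

# A negated class cannot cross a slash, so only the final path segment fits.
my ($no_slash) = $line =~ /\/([^\/]*):/;

print "first slash: $first_slash\n";
print "leading .*:  $last_slash\n";
print "negated set: $no_slash\n";
```

The first pattern reproduces the my_checks/VLG6.3 result from the question, while the other two yield VLG6.3.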
The reason you are getting 1 is that you are evaluating the match in scalar context. For the match to return the captures, it needs to be evaluated in list context.
You need to evaluate the =~ in list context. Unlike the scalar assignment operator you used, the list assignment operator evaluates its operands in list context. You can cause the list assignment operator to be used by replacing my $rule with my ($rule).
my ($rule) = $line =~ /\/(.*):/;
See Why are there parentheses around scalar when assigning the return value of regex match in this Perl snippet?.
Furthermore, the match operator will grab more than desired. You can address that by replacing
/\/(.*):/
with
/\/([^\/]*):/
I would write that as follows:
m{/([^/]*):}
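As a quick check of the two points in this answer, the sketch below evaluates the same match once in scalar context and once in list context. It assumes the sample line from the question; $matched is just an illustrative name.

```perl
use strict;
use warnings;

my $line = 'default_checks/my_checks/VLG6.3: Unsupported system function call';

# Scalar assignment: the match runs in scalar context and yields true or false.
my $matched = $line =~ m{/([^/]*):};

# List assignment: the match runs in list context and yields the captures.
my ($rule) = $line =~ m{/([^/]*):};

print "scalar context: $matched\n";   # 1 on success
print "list context:   $rule\n";
```

Using m{...} as the delimiter means the slashes inside the pattern need no escaping.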
Thank you for the quick answer! Unfortunately, this does not fully work. When I print $rule, I get "my_checks/VLG6.3", and I just need what is after the last "/" ...
– kalonkadour
Jun 30 at 0:40
Change .* to [^/]*
– Barmar
Jun 30 at 1:00
I have updated my answer to account for that problem too.
– ikegami
Jun 30 at 1:34
To capture a string between two characters, capture everything that is not the two characters.
my $line = 'default_checks/my_checks/VLG6.3: Unsupported system function call';
my ( $rule ) = $line =~ /\/([^\/:]*):/;
print "$rule\n";
PS: Capturing content between two strings involves skipping earlier occurrences of the starting string.
my $line = 'begin not this begin or this begin wanted end not this end or this end';
my ( $rule ) = $line =~ m{ (?: begin .* )? begin (.*?) end }msx;
print "$rule\n";
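For comparison, here is a sketch of what happens on the same sample string when the optional greedy prefix is left out. The names $naive and $skipped are illustrative only.

```perl
use strict;
use warnings;

my $line = 'begin not this begin or this begin wanted end not this end or this end';

# Without the optional prefix, the match starts at the first "begin".
my ($naive) = $line =~ m{ begin (.*?) end }msx;

# With the prefix, the greedy .* backtracks from the end of the string to the
# last "begin" that is still followed by an "end".
my ($skipped) = $line =~ m{ (?: begin .* )? begin (.*?) end }msx;

print "naive:   [$naive]\n";
print "skipped: [$skipped]\n";
```

Because of the /x flag, the spaces in the pattern are ignored, so the captures keep the spaces that surround the words in the input.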
That worked great, thank you very much !
– kalonkadour
22 hours ago