Capture a substring between two characters?





I am trying to write a regex pattern which will capture a substring between two characters. The string is


default_checks/my_checks/VLG6.3: Unsupported system function call



I need to capture VLG6.3. It is between a slash / and a colon :.





I have tried these ideas


my $rule = $line =~ /\/(.*):/;


my $rule = $line =~ /\/(.+?):/ ;


my $rule = $line =~ /\/(\w+):/ ;



But none of them are working. In the best case I get my_checks/VLG6.3






4 Answers



You are interested in a non-empty string that is preceded by a slash /, contains neither a slash / nor a colon, and is followed by a colon.



So the intuitive regex, without any capturing group, is
(?<=\/)[^\/:]+(?=:) (a positive lookbehind, the actual content,
and a positive lookahead).



Using such a regex, you can match it against the source string with =~ and then read the matched text from $&.



And the example script can look like below:


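use strict;
use warnings;

my $line = 'default_checks/my_checks/VLG6.3: Unsupported system function call';
print "Source: $line\n";
# Lookbehind requires a preceding slash, lookahead requires a following colon;
# neither is part of the matched text, so $& holds only the rule name.
if ($line =~ /(?<=\/)[^\/:]+(?=:)/) {
    print "Rule: $&\n";
} else {
    print "No match.\n";
}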





That worked great, thank you very much !
– kalonkadour
22 hours ago



Aside from the issue with assigning a list to a scalar, which ikegami has helpfully pointed out, the regex pattern can use some fixing.



The repeater * in regex is greedy: it consumes as many characters as it can while still letting the overall match succeed. Put another greedy repeater in front so that it consumes everything up to the last slash and leaves only the part you want for the capturing group.


my ($rule) = $line =~ /.*\/(.*):/;



Alternatively, in this case you can use a negated character class that excludes the slash instead of matching any character.


my ($rule) = $line =~ /\/([^\/]*):/;



Both of the above assign 'VLG6.3' to $rule.
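
To see why the patterns from the question fall short, here is a minimal sketch that runs them side by side with the two fixes above. The variable names are only illustrative, and every match is done in list context so that the capture is returned.

```perl
use strict;
use warnings;

my $line = 'default_checks/my_checks/VLG6.3: Unsupported system function call';

# The match starts at the leftmost slash, so both the greedy and the
# lazy version capture from the first slash up to the colon.
my ($greedy) = $line =~ /\/(.*):/;
my ($lazy)   = $line =~ /\/(.+?):/;

# A leading greedy .* consumes everything up to the last slash.
my ($skip_first) = $line =~ /.*\/(.*):/;

# A negated class keeps the capture from containing a slash at all.
my ($no_slash) = $line =~ /\/([^\/]*):/;

print "greedy:     $greedy\n";       # my_checks/VLG6.3
print "lazy:       $lazy\n";         # my_checks/VLG6.3
print "skip_first: $skip_first\n";   # VLG6.3
print "no_slash:   $no_slash\n";     # VLG6.3
```

Making the quantifier lazy does not help here because laziness only affects where the match ends, not where it begins.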





The reason you are getting 1 is because you are evaluating the match in scalar context. For the match to return the captures, it needs to be evaluated in list context.





You need to evaluate the match in list context by evaluating the =~ in list context. Unlike the scalar assignment operator you used, the list assignment operator evaluates its operands in list context. You can cause the list assignment operator to be used by replacing my $rule with my ($rule).




my ($rule) = $line =~ /\/(.*):/;
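
The difference between the two contexts can be checked directly. This is a small sketch using the pattern from the question; the variable names $in_scalar and $in_list are only illustrative.

```perl
use strict;
use warnings;

my $line = 'default_checks/my_checks/VLG6.3: Unsupported system function call';

# Scalar assignment: the match only reports success or failure.
my $in_scalar = $line =~ /\/(.*):/;

# List assignment: the match returns the captured groups.
my ($in_list) = $line =~ /\/(.*):/;

print "scalar context: $in_scalar\n";   # 1
print "list context:   $in_list\n";     # my_checks/VLG6.3
```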



See Why are there parentheses around scalar when assigning the return value of regex match in this Perl snippet?.



Furthermore, the match operator will grab more than desired. You can address that by replacing


/\/(.*):/



with


/\/([^\/]*):/



I would write that as follows:


m{/([^/]*):}





Thank you for the quick answer! But unfortunately, this does not fully work. When I print $rule, I get "my_checks/VLG6.3", and I just need what is after the last "/" ...
– kalonkadour
Jun 30 at 0:40






Change .* to [^/]*
– Barmar
Jun 30 at 1:00







I have updated my answer to account for that problem too.
– ikegami
Jun 30 at 1:34



To capture a string between two characters, capture everything that is not the two characters.


my $line = 'default_checks/my_checks/VLG6.3: Unsupported system function call';
my ( $rule ) = $line =~ /\/([^\/:]*):/;
print "$rule\n";



PS: Capturing content between two strings involves skipping earlier occurrences of the starting string.


my $line = 'begin not this begin or this begin wanted end not this end or this end';
my ( $rule ) = $line =~ m{ (?: begin .* )? begin (.*?) end }msx;
print "$rule\n";
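
The same idea can be wrapped in a small helper that takes the two delimiters as arguments. This is only a sketch: capture_between is a hypothetical name, not a built-in, and quotemeta is used so that the delimiters are treated as literal text rather than as regex syntax.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the text between the last $start that is
# still followed by an $end, and that $end. Returns undef on no match.
sub capture_between {
    my ($text, $start, $end) = @_;

    # Escape the delimiters so they match literally, even under /x.
    my ($s, $e) = (quotemeta $start, quotemeta $end);

    # The optional greedy group skips earlier occurrences of $start.
    my ($found) = $text =~ m{ (?: $s .* )? $s (.*?) $e }msx;
    return $found;
}

my $line = 'default_checks/my_checks/VLG6.3: Unsupported system function call';
print capture_between($line, '/', ':'), "\n";   # VLG6.3
```

Note that the captured text keeps any surrounding whitespace, so for the begin/end sample above the result is ' wanted ' with a space on each side.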





