Select all substrings that are contained in a string


Select all substrings that are contained in a string



I have a string that contains multiple substrings that are separated by a delimiter character.
substr1#substr2#substr3...#substrN


substr1#substr2#substr3...#substrN



I want to query for all the values in a column that are also in this string.



What I have so far is:


SELECT substring_col
FROM table
WHERE id IN SUBSTR(str_to_search,INSTR(str_to_search,substring_col),INSTR(str_to_search,'#',INSTR(str_to_search,substring_col))-1)



However, this only returns the first substring that is found. How can I make it return all substrings that are found?





Could you share an example? This sounds easy but somehow I want to confirm if I am getting what you are asking.
– trollster
Jun 29 at 14:33





Probably duplicate of stackoverflow.com/questions/51045592/extract-nth-substring/…
– wolφi
Jun 29 at 15:50




3 Answers
3



If I understood you correctly, splitting that long delimited string might be what you're looking for.



Here's how:


SQL> with test as (select 'substr1#substr2#substr3#substrN' col from dual)
2 select regexp_substr(col, '[^#]+', 1, level) subs
3 from test
4 connect by level <= regexp_count(col, '#') + 1;

SUBS
--------------------------------------------------------------------------------
substr1
substr2
substr3
substrN

SQL>



It means that your query might look like this:


SELECT substring_col
FROM table
WHERE id IN (SELECT regexp_substr('substr1#substr2#substr3#substrN', '[^#]+', 1, level) subs
FROM dual
CONNECT BY level <= regexp_count('substr1#substr2#substr3#substrN', '#') + 1
);



The delimited string is probably a parameter; I guess you can rewrite the above code in that manner.



I'm afraid that regex of the form '[^#]+' does not handle NULL elements. Unfortunately it's the most common answer for questions on parsing delimited strings. For proof and details see post: https://stackoverflow.com/a/31464699/2543416. Using it, the data set with a NULL element 2 gives the following result set:


'[^#]+'


SUBS
-----------
substr1
substr3
substrN
<NULL here>

SQL>



Instead use this form built on Littlefoot's answer (Note element 2 is NULL):


with test as (select 'substr1##substr3#substrN' col from dual)
select regexp_substr(col, '(.*?)(#|$)', 1, level, NULL, 1) subs
from test
connect by regexp_substr(col, '(.*?)(#|$)', 1, level) is not null;

SUBS
-----------
substr1

substr3
substrN

SQL>



Here the 2nd element's NULL is preserved and the remaining values are in the correct position.



For your case, you may not care about the position of the value, just that it is in the list. But, for re usability (and for accuracy), you could turn this into a function where you pass in the string, the delimiter, and the value you are after and have it return it's position. Non-zero means it is in the list and also you have it's position if that is ever needed. Just a thought.



Awful data format, and there are a lot of reasons why you should change it.



But, sometimes we are stuck with other people's really, really bad formats. One method is to use like:


like


where '#' || listcol || '#' like '%#' || id || '#%'





@Boneist . . . I'm so used to doing this with commas that I messed up. Thank you.
– Gordon Linoff
Jun 30 at 2:58






By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.

Comments

Popular posts from this blog

paramiko-expect timeout is happening after executing the command

Export result set on Dbeaver to CSV

Opening a url is failing in Swift