I need to extract some data from malformed XML stored in an Oracle database. The XPath expression would look like this:
//image/type/text(). A regular expression that works in a similar fashion would be
<image>.*?<type>(.+?)<\/type> (with appropriate flags for multiline matching).
Since Oracle does not support match groups in any form for
REGEXP_SUBSTR, I am unsure how to extract a set (with potentially n > 1 members) of match groups from an Oracle CLOB column. Any ideas?
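AFAIK you can't extract a set directly with Oracle's regex functions, but as a workaround you can loop over the string,
calling REGEXP_SUBSTR with an increasing occurrence number and saving each result to a collection (or whatever you need), something like this:

    ...
    fOccurence := 1;  -- occurrences are numbered from 1, not 0
    loop
      -- 'i' = case-insensitive, 'n' = '.' also matches newlines ('g' is not a valid Oracle flag)
      fSubstr := regexp_substr(fSourceStr, '<image>.*?<type>(.+?)<\/type>', 1, fOccurence, 'in');
      exit when fSubstr is null;
      fOccurence := fOccurence + 1;
      fResultStr := fResultStr || fSubstr;
    end loop;
    ...

Note that each fSubstr holds the whole matched text (from <image> through </type>), not just the (.+?) group.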
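On Oracle 11g and later, REGEXP_SUBSTR accepts a sixth argument that selects a capture group, so the same loop can collect only the <type> values. The block below is a minimal sketch under that version assumption. The inline sample document and the names t_types, l_source, l_match and l_occ are made up for illustration, and it assumes each extracted value fits in a VARCHAR2(4000).

```sql
DECLARE
  -- Hypothetical collection type holding one entry per captured <type> value.
  TYPE t_types IS TABLE OF VARCHAR2(4000);
  l_types  t_types := t_types();
  -- Illustrative sample data; in practice this would be selected from the CLOB column.
  l_source CLOB := '<image><type>png</type></image>'
                || '<image><name>x</name><type>jpeg</type></image>';
  l_match  VARCHAR2(4000);
  l_occ    PLS_INTEGER := 1;  -- occurrences are numbered from 1
BEGIN
  LOOP
    -- 'i' = case-insensitive, 'n' = '.' also matches newlines.
    -- The final argument (1) returns capture group 1 instead of the whole match
    -- (assumes Oracle 11g or later).
    l_match := REGEXP_SUBSTR(l_source, '<image>.*?<type>(.+?)</type>',
                             1, l_occ, 'in', 1);
    EXIT WHEN l_match IS NULL;  -- no further occurrence found
    l_types.EXTEND;
    l_types(l_types.COUNT) := l_match;
    l_occ := l_occ + 1;
  END LOOP;

  -- Print the collected values, one per line.
  FOR i IN 1 .. l_types.COUNT LOOP
    DBMS_OUTPUT.PUT_LINE(l_types(i));
  END LOOP;
END;
/
```

In SQL*Plus, run SET SERVEROUTPUT ON first to see the printed values. Keep in mind that the pattern can pair an <image> that has no <type> of its own with the <type> of a later image, so it is only as reliable as the data is regular.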