How to extract multiline text between two different XML tags

Question:

For example, suppose we have some XML like this:

<parent>
    <child>SomeText</child>sometext<otherChild>sometext</otherChild>
    <child>SomeText2</child>somtext2<otherChild>sometext2</otherChild>
</parent>

Which regex can be applied to extract the content after </child> and before the next <child>?
The first match should be sometext<otherChild>sometext</otherChild>, and the second match should be somtext2<otherChild>sometext2</otherChild>.

I already tried a regex like this, but it only works for the first match:

String textToParse = ...;
Pattern pattern = Pattern.compile("(?<=</child>)(.*?)(?=<child>)", Pattern.DOTALL);

final Matcher matcher = pattern.matcher(textToParse);
if (matcher.find()) {
    LOGGER.info(matcher.group());
}

Question comments:

What about an ideone demo?

Answers:

Answer 1:

This should work:

Pattern pattern = Pattern.compile("(?<=</child>)(.*?)(?=<child>|</parent>)", Pattern.DOTALL);

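The |</parent> alternative is needed because the last match has no following <child> tag, so the original lookahead never succeeds there.

Also, call matcher.find() and matcher.group() again to get each subsequent match, for example by replacing the if with a while loop.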
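
The two points in this answer (the extra </parent> alternative and the repeated find() calls) can be combined into a minimal sketch. The class name BetweenChildTags and the method extract are hypothetical names chosen for illustration, and the sample input is the XML from the question. Because Pattern.DOTALL lets the captured text run across line breaks, each result is trimmed here; that trimming is an assumption about the desired output, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BetweenChildTags {

    // Text after </child> up to the next <child> or the closing </parent>.
    private static final Pattern BETWEEN = Pattern.compile(
            "(?<=</child>)(.*?)(?=<child>|</parent>)", Pattern.DOTALL);

    // Hypothetical helper: collects every match instead of only the first one.
    public static List<String> extract(String xml) {
        List<String> results = new ArrayList<>();
        Matcher matcher = BETWEEN.matcher(xml);
        while (matcher.find()) {
            // trim() drops the newline and indentation captured before the next tag
            results.add(matcher.group(1).trim());
        }
        return results;
    }

    public static void main(String[] args) {
        String xml = "<parent>\n"
                + "    <child>SomeText</child>sometext<otherChild>sometext</otherChild>\n"
                + "    <child>SomeText2</child>somtext2<otherChild>sometext2</otherChild>\n"
                + "</parent>";
        for (String s : extract(xml)) {
            System.out.println(s);
        }
    }
}
```

Note that the pattern is case-sensitive, which is why <otherChild> and </otherChild> are not mistaken for the <child> delimiters. For input that is not this regular, an XML parser is likely a more robust choice than a regex.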

Source:

https://stackoverflow.com/questions/47748001/how-to-extract-text-between-two-different-xml-tags-multiline
