Find correct regex for XML tag value – java

Question:

I would like to find the correct regex to get the value of an XML tag and replace it with X characters.
The tag looks like this:

<number>1234I0000ABC0001</number>

I created a regex like this:

.*number>([A-Z0-9 _]*[A-Z0-9][A-Z0-9 _]*)</

but it does not work as well as I want. I would like to get the value with the regex, replace all of its characters with X, and write the changed value back into the tag.

Question comments:

    
Don’t use regular expressions to parse XML – use an XML parser instead. Java has extensive support for parsing XML in the standard library.
    
I know, but it requires parsing a big XML file into a DOM Document, which is not a good solution for me. It takes too much time.
– allocer
Please check whether this link helps you: stackoverflow.com/questions/13241615/…
@allocer It might take a long time, but unlike regular expressions, it will give you the right answer.
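
For readers weighing the parser advice against the DOM cost mentioned above: the Java standard library also ships a streaming API, StAX (javax.xml.stream), which reads events one at a time instead of building a tree. Nobody in the thread proposed this, so treat the following as a minimal sketch under assumptions. The class name StaxNumberMasker and the method mask are hypothetical, and the sketch assumes every piece of text inside a number element should be replaced with X characters.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Arrays;
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper, not part of the original thread.
class StaxNumberMasker {

    // Copies the XML event by event and masks text found inside <number> elements.
    static String mask(String xml) {
        try {
            XMLEventReader reader =
                    XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
            StringWriter out = new StringWriter();
            XMLEventWriter writer = XMLOutputFactory.newInstance().createXMLEventWriter(out);
            XMLEventFactory events = XMLEventFactory.newInstance();

            boolean insideNumber = false;
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isStartElement()
                        && "number".equals(event.asStartElement().getName().getLocalPart())) {
                    insideNumber = true;
                } else if (event.isEndElement()
                        && "number".equals(event.asEndElement().getName().getLocalPart())) {
                    insideNumber = false;
                } else if (insideNumber && event.isCharacters()) {
                    // Replace this chunk of text with the same number of X characters.
                    char[] xs = new char[event.asCharacters().getData().length()];
                    Arrays.fill(xs, 'X');
                    event = events.createCharacters(new String(xs));
                }
                writer.add(event);
            }
            writer.close();
            return out.toString();
        } catch (XMLStreamException e) {
            // Malformed input ends up here; wrapped to keep the sketch's signature simple.
            throw new IllegalArgumentException("Invalid XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(mask("<root><number>1234I0000ABC0001</number></root>"));
    }
}
```

For a large file, the StringReader and StringWriter would be swapped for file streams so that the document is never held in memory as a whole. The writer may also emit an XML declaration at the start of the output; its exact form depends on the StAX implementation.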

Answers:

Answer 1:

It is not a good idea to parse XML with regex. But if you insist, you can use

<number>([\s\S]*?)<\/number>

This captures the value as group 1, which you can then replace with whatever you like. For a detailed explanation, see the live demo on regex101.
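
To connect this pattern to the original goal of replacing every character of the value with X, here is a minimal Java sketch. The class name NumberMasker and the method mask are hypothetical, and the sketch assumes the start tag is written exactly as `<number>`, with no attributes or extra whitespace.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original answer.
class NumberMasker {

    // Same pattern as in the answer; the "\/" escape is not needed in Java, so it is dropped.
    private static final Pattern NUMBER = Pattern.compile("<number>([\\s\\S]*?)</number>");

    static String mask(String xml) {
        Matcher m = NUMBER.matcher(xml);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Build a run of X characters as long as the captured value (group 1).
            char[] xs = new char[m.group(1).length()];
            Arrays.fill(xs, 'X');
            String replacement = "<number>" + new String(xs) + "</number>";
            // quoteReplacement guards against '$' and '\' being treated specially.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // prints <number>XXXXXXXXXXXXXXXX</number>
        System.out.println(mask("<number>1234I0000ABC0001</number>"));
    }
}
```

The appendReplacement/appendTail loop is used because the replacement depends on the length of each match, which a fixed replacement string passed to replaceAll cannot express.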

Answer comments:

    
Any solution using regular expressions will have bugs. This solution for example will give you false matches on commented-out <number> elements, and it will fail to match valid <number> elements containing whitespace or namespace declarations in the start tag, or comments in the value, and it will fail completely if your XML document uses external entities or character references. Users will not thank you (and will flood StackOverflow with questions) if they send you valid documents that you handle incorrectly. We get fed up with this, which is why we advise you not to do it.
    
Yes, your explanation is correct and legitimate, which is why I warned about it in the very first line. But since @allocer specifically asked about regex, I think this shows him one way to do it, and he will learn the pitfalls too. At the end of the day, he may know best what suits him.
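
Two of the pitfalls named in the first comment can be reproduced in a few lines. This is only an illustrative sketch: the class name RegexPitfalls and the sample strings are made up for the demonstration, and the pattern is the one from Answer 1 without the unnecessary "\/" escape.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original thread.
class RegexPitfalls {

    static final Pattern NUMBER = Pattern.compile("<number>([\\s\\S]*?)</number>");

    // True if the pattern finds a <number> element anywhere in the input.
    static boolean found(String xml) {
        return NUMBER.matcher(xml).find();
    }

    public static void main(String[] args) {
        // A commented-out element is still matched (false positive).
        System.out.println(found("<!-- <number>1234</number> -->"));      // true
        // Valid start tags with an attribute or trailing whitespace are missed.
        System.out.println(found("<number type=\"id\">1234</number>"));  // false
        System.out.println(found("<number >1234</number>"));              // false
    }
}
```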

Answer 2:

You might look at something like this:

(<.+>)(.+)(</.+>)

or

<number>(.+?)</number>

I have to note that it is not actually a number 🙂

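With the second pattern, the value is group(1). With the first pattern, the value is group(2), because group(1) there is the opening tag.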
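
The two patterns number their groups differently, and the greedy one behaves differently when a line holds more than one element. A small sketch makes this visible. The class name GroupDemo, the helper group, and the sample line are assumptions made for the illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class GroupDemo {

    static final Pattern GENERIC = Pattern.compile("(<.+>)(.+)(</.+>)");
    static final Pattern LAZY = Pattern.compile("<number>(.+?)</number>");

    // Returns the requested group of the first match, or null if nothing matches.
    static String group(Pattern pattern, String input, int index) {
        Matcher m = pattern.matcher(input);
        return m.find() ? m.group(index) : null;
    }

    public static void main(String[] args) {
        String line = "<id>7</id><number>1234I0000ABC0001</number>";
        // The greedy first group swallows everything up to the last opening tag.
        System.out.println(group(GENERIC, line, 1)); // <id>7</id><number>
        System.out.println(group(GENERIC, line, 2)); // 1234I0000ABC0001
        // The lazy pattern is tied to the tag name and keeps the value in group 1.
        System.out.println(group(LAZY, line, 1));    // 1234I0000ABC0001
    }
}
```

Note that `.` does not match line terminators by default, so neither pattern finds a value that spans several lines unless Pattern.DOTALL is set.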

Original source:

https://stackoverflow.com/questions/47749150/find-correct-regex-for-xml-tag-value-java
