Difference between revisions of "How can I parse XML with regular expressions"
m (moved How to parse XML with regular expressions to How can I parse XML with regular expressions: better)
(One intermediate revision by the same user not shown)
Latest revision as of 07:38, 16 May 2013
In more technical detail, all formal languages can be placed somewhere on the Chomsky hierarchy, where some classes of languages are more expressive than others. In particular, context-free languages are strictly more expressive than regular languages. Ignoring some technical details, this means that a regular expression can never recognize a context-free language that is not itself regular (this can be proven with the pumping lemma for regular languages). XML, with its arbitrarily deep element nesting, is such a context-free language, while regular expressions describe only regular languages, so regular expressions can provably never fully parse XML.
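The nesting argument can be made concrete with a small sketch. It assumes a hypothetical toy language made only of `<a>` and `</a>` tags, and the names `DEPTH_2`, `regex_accepts` and `counter_accepts` are invented for this illustration. The pattern below is a true regular expression written out for nesting depth 2; a deeper document needs a longer pattern, and no finite pattern covers every depth. A recognizer with a counter (a degenerate stack) has no such limit.

```python
import re

# Toy language (an assumption for this sketch): only <a> and </a> tags.
# A true regular expression has to spell out each nesting level by hand;
# this one accepts depth 1 and depth 2 and nothing deeper.
DEPTH_2 = re.compile(r"(?:<a>(?:<a></a>)*</a>)*")


def regex_accepts(text):
    """Return True if the fixed-depth pattern matches the whole text."""
    return DEPTH_2.fullmatch(text) is not None


def counter_accepts(text):
    """Return True if the tags are balanced at any nesting depth."""
    depth = 0
    i = 0
    while i < len(text):
        if text.startswith("<a>", i):
            depth += 1
            i += 3
        elif text.startswith("</a>", i):
            depth -= 1
            if depth < 0:
                # A closing tag appeared with no matching opening tag.
                return False
            i += 4
        else:
            # Anything else is outside the toy language.
            return False
    return depth == 0


shallow = "<a><a></a></a>"
deep = "<a><a><a></a></a></a>"
print(regex_accepts(shallow), regex_accepts(deep))
print(counter_accepts(shallow), counter_accepts(deep))
```

The fixed pattern accepts the depth-2 document and rejects the depth-3 one, although both are balanced; the counter-based recognizer accepts both.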
That said, regexes can be useful as tools in the parsing process (e.g. to find the next occurrence of a non-quote-character).
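As a rough sketch of that division of labour, the regular expression below only finds the next token, while an explicit stack does the actual nesting check. It assumes a deliberately tiny subset of XML (no attributes, comments, CDATA sections, entities or self-closing tags), and `TOKEN` and `is_well_nested` are hypothetical names chosen for this example, not part of any library.

```python
import re

# One token is either a tag such as <name> or </name>, or a run of
# character data containing no "<". Attributes, comments and the rest
# of XML are deliberately ignored in this sketch.
TOKEN = re.compile(r"<(/?)([A-Za-z_][\w.-]*)\s*>|([^<]+)")


def is_well_nested(text):
    """Check tag nesting with a stack; the regex only tokenizes."""
    stack = []
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            # Not a token of the toy subset, e.g. an unterminated tag.
            return False
        closing, name, _chars = match.groups()
        if name is not None:
            if closing:
                # A closing tag must match the most recently opened one.
                if not stack or stack.pop() != name:
                    return False
            else:
                stack.append(name)
        pos = match.end()
    # Every opened element must have been closed.
    return not stack


print(is_well_nested("<a><b>hi</b></a>"))
print(is_well_nested("<a><b>hi</a></b>"))
```

The first call reports the document as well nested and the second does not, because `</a>` arrives while `b` is still open. For real documents a proper XML parser remains the right tool.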
See also
- Xml grep – what you should be using instead.