Handling Nested CDATA With Builder
21 September 2010

As noted by our associates at Atomic Object, XML doesn't allow nested <![CDATA[…]]> sections: the first ]]> closes the section, no matter how many were opened before it. While rewriting some of our code, I developed the following Builder workaround, which lets our application export valid XML by breaking a nested CDATA section into distinct chunks wherever ]]> appears in the text. When the output is read back in via our Nokogiri-based parser, the chunks are concatenated automagically, and the end result is clean, valid XML.

Fix code:
module Builder
  class XmlMarkup < XmlBase

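    # Split every "]]>" in the text across two CDATA sections so that it
    # never appears intact inside one. Uses gsub (not gsub!) so the
    # caller's string is left untouched.
    def cdata_with_escaping!(text)
      cdata_without_escaping!(text.gsub("]]>", "]]]]><![CDATA[>"))
    end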
    alias_method_chain 'cdata!', 'escaping'

  end
end
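
The same splitting trick can be sketched without Builder or ActiveSupport. This is a minimal illustration only: wrap_cdata and cdata_payload are hypothetical helper names, not part of Builder's API, and the regex in cdata_payload is just a stand-in for what a real parser such as Nokogiri does when it concatenates adjacent CDATA sections.

```ruby
# Hypothetical helpers for illustration; not part of Builder or Nokogiri.
CDATA_OPEN  = "<![CDATA[".freeze
CDATA_CLOSE = "]]>".freeze

# Wrap text in a CDATA section. Any "]]>" inside the text is split so that
# "]]" ends one section and ">" begins the next.
def wrap_cdata(text)
  safe = text.gsub(CDATA_CLOSE, "]]" + CDATA_CLOSE + CDATA_OPEN + ">")
  CDATA_OPEN + safe + CDATA_CLOSE
end

# Stand-in for a real XML parser: collect the payload of every CDATA
# section and join the pieces back together.
def cdata_payload(xml)
  xml.scan(/<!\[CDATA\[(.*?)\]\]>/m).flatten.join
end

original = "Foo <![CDATA[bar]]> sna"
wrapped  = wrap_cdata(original)

puts wrapped
puts cdata_payload(wrapped) == original
```

Text without any "]]>" passes through unchanged and ends up in a single CDATA section, so the helper is safe to apply to every value.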

Sample output:

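>> xml = Builder::XmlMarkup.new
>> xml.cdata!("Foo <![CDATA[bar]]> sna")
>> xml.target!
=> "<![CDATA[Foo <![CDATA[bar]]]]><![CDATA[> sna]]>"  # valid XML!
>> xml = Builder::XmlMarkup.new
>> xml.cdata_without_escaping!("Foo <![CDATA[bar]]> sna")
>> xml.target!
=> "<![CDATA[Foo <![CDATA[bar]]> sna]]>" # invalid XML!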


Sample parsing with Nokogiri:


>> doc = Nokogiri::XML("<baz><![CDATA[Foo <![CDATA[bar]]]]><![CDATA[> sna]]></baz>")
=> #<Nokogiri::XML::Document ...>
>> doc.css('baz').first.content
=> "Foo <![CDATA[bar]]> sna"

Want to talk about this a bit more? Send a tweet to @cgansen or email me at cgansen@gmail.com.