What is the regex for these complex metadata which contain elements, ids, XML and base64? | Hashnode

Gustavo Benedito Costa

Deaf-born, computer science student and language lover

Sep 29, 2018

What is the regex for these complex metadata which contain elements, ids, XML and base64?

Hi,

I would like to find and remove the Gravit metadata elements and their base64 payloads (<![CDATA[...==]]>) in order to optimise SVG files. Here are the two kinds of element:

<gravitDesigner:gravitElementRef xmlns:gravitDesigner="ns.gravit.io" xlink:href="#...."/>

<gravitDesigner:gravitGraphicSource xmlns:gravitDesigner="ns.gravit.io" id="..." version="1">
  <![CDATA[...==]]>
</gravitDesigner:gravitGraphicSource>

#regex #bash #programming

Responses(3)

Peter Scheler

JS enthusiast

Oct 1, 2018

Try this:

/<gravitDesigner(?:[^<]*?\/>|(?:.|\s)*?<\/gravitDesigner.*?>)/gm
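For anyone who wants to try it from a script, here is a minimal sketch of applying a pattern like this with String.prototype.replace. It is a slight variant of the regex above, not the exact same one: [\s\S] stands in for (?:.|\s), and the m flag is dropped because no ^ or $ anchors are used. The helper name stripGravitMetadata and the sample SVG are made up for illustration, and the sketch assumes attribute values never contain a literal ">".

```javascript
// Variant of the pattern above (an assumption, not the original answer's exact regex):
// - first alternative: self-closing <gravitDesigner:... /> elements
// - second alternative: everything up to the matching </gravitDesigner:...> close tag
const GRAVIT_METADATA = /<gravitDesigner:(?:[^<>]*\/>|[\s\S]*?<\/gravitDesigner:[^>]*>)/g;

// Hypothetical helper: takes SVG source text, returns it without Gravit metadata.
function stripGravitMetadata(svgText) {
  return svgText.replace(GRAVIT_METADATA, "");
}

// Made-up sample shaped like the elements in the question.
const sample = [
  '<svg xmlns="http://www.w3.org/2000/svg">',
  '<gravitDesigner:gravitElementRef xmlns:gravitDesigner="ns.gravit.io" xlink:href="#a1"/>',
  '<gravitDesigner:gravitGraphicSource xmlns:gravitDesigner="ns.gravit.io" id="b2" version="1">',
  '  <![CDATA[QUJDRA==]]>',
  '</gravitDesigner:gravitGraphicSource>',
  '<rect width="10" height="10"/>',
  '</svg>',
].join("\n");

console.log(stripGravitMetadata(sample));
```

The removed elements leave empty lines behind, so a second pass to collapse blank lines may be worth adding if file size matters.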

Matt Strom

Software Engineer, TypeScript ninja

Sep 29, 2018

I think XSLT (Extensible Stylesheet Language Transformations) might be the tool you need.

It's been a while since I last used XSLT, but essentially you write a template that matches the CDATA content and outputs a comment in its place. Something like this (not guaranteed to work, I haven't run it myself):

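<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:gravitDesigner="ns.gravit.io" version="1.0">
  <xsl:output method="xml" indent="yes"/>

  <!-- Identity template: copy every node and attribute through unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Replace the CDATA payload (seen by XSLT as a text node) with a comment -->
  <xsl:template match="gravitDesigner:gravitGraphicSource/text()">
    <xsl:comment>Data removed</xsl:comment>
  </xsl:template>
</xsl:stylesheet>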
