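Purpose: I have an XML document with many mixed-content CDATA elements that I need to edit programmatically. Annoyingly, because the CDATA elements have other/mixed content, the default ",cdata" tag does not work properly (as per the XML spec). If you have questions about the specifics, let me know.
Issue: In the simplified example below, I tag the element that contains the CDATA as ",innerxml" so that I can handle the prefix/suffix myself. Unmarshalling works as expected, but when marshalling (encoding), the special characters are escaped. Why does the EncodeElement method escape special characters when the ",innerxml" tag explicitly says not to? When I read the documentation for this method, it referred me to the xml.Marshal method, which says the following: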
a field with tag ",innerxml" is written verbatim, not subject to the usual marshaling procedure.
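A minimal sketch of what that sentence describes, assuming the struct is passed straight to xml.Marshal with no custom MarshalXML method. The `note` type and `render` helper are hypothetical names used only for illustration and are not part of the original program:

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// note is a hypothetical type used only for this illustration.
type note struct {
	Date string `xml:"date,attr"`
	Text string `xml:",innerxml"`
}

// render marshals a note with the default struct encoder, so the
// ",innerxml" field is copied into the output without escaping.
func render(n note) (string, error) {
	out, err := xml.Marshal(n)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	s, err := render(note{Date: "today", Text: "<![CDATA[today is < yesterday]]>"})
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(s) // prints <note date="today"><![CDATA[today is < yesterday]]></note>
}
```

In this sketch the "<" inside the CDATA section survives untouched, which is the behaviour I expected from the ",innerxml" tag.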
Example:
Here is the code (also available at https://go.dev/play/p/MH_ONAVaG_1):
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

var xmlFile string = `<?xml version="1.0" encoding="UTF-8"?>
<statusdb>
<status date="today">
<![CDATA[today is < yesterday]]>
</status>
<status date="yesterday">
<![CDATA[PM,
1. there are issues with the marshaller
2. i don't know how to solve them]]>
</status>
</statusdb>`

type statusDB struct {
	Status []*status `xml:"status"`
}

type status struct {
	Text string `xml:",innerxml"`
	Date string `xml:"date,attr"`
}

type statusMarshaller status

func main() {
	var projectStatus statusDB
	err := xml.Unmarshal([]byte(xmlFile), &projectStatus)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("In Go: \"" + projectStatus.Status[0].Text + "\"")
	fmt.Println("In Go: \"" + projectStatus.Status[1].Text + "\"")
	x, err := xml.MarshalIndent(projectStatus, "", " ")
	if err != nil {
		fmt.Println(err)
		return
	}
	//why this is not printing properly
	fmt.Printf("%s\n", x)
}

func (tagElement *status) UnmarshalXML(d *xml.Decoder, se xml.StartElement) error {
	temp := statusMarshaller{}
	d.DecodeElement(&temp, &se)
	temp.Text = strings.TrimSpace(temp.Text)
	temp.Text = strings.TrimPrefix(temp.Text, "<![CDATA[")
	temp.Text = strings.TrimSuffix(temp.Text, "]]>")
	*tagElement = status(temp)
	return nil
}

func (tagElement status) MarshalXML(d *xml.Encoder, se xml.StartElement) error {
	tagElement.Text = "<![CDATA[" + tagElement.Text + "]]>"
	temp, _ := xml.Marshal(statusMarshaller(tagElement))
	return d.EncodeElement(temp, se)
}
This code returns the following:
In Go: "today is < yesterday"
In Go: "PM,
1. there are issues with the marshaller
2. i don't know how to solve them"
<statusDB>
<status><statusMarshaller date="today"><![CDATA[today is < yesterday]]></statusMarshaller></status>
<status><statusMarshaller date="yesterday"><![CDATA[PM,
 1. there are issues with the marshaller
 2. i don't know how to solve them]]></statusMarshaller></status>
</statusDB>
Program exited.
Conclusion: Could you explain why the XML package does this, and what the possible workarounds are?
Thanks!