How can I remove all diacritics from a given UTF-8 encoded string using Go? For example, transform the string "žůžo" => "zuzo". Is there a standard way?
You can use the libraries described in "Text normalization in Go".
Here is an application of those libraries:
// Example derived from: http://blog.golang.org/normalization
package main

import (
	"fmt"
	"unicode"

	"golang.org/x/text/runes"
	"golang.org/x/text/transform"
	"golang.org/x/text/unicode/norm"
)

func main() {
	// Decompose (NFD), drop nonspacing marks (Mn), then recompose (NFC).
	// runes.Remove replaces the deprecated transform.RemoveFunc.
	t := transform.Chain(norm.NFD, runes.Remove(runes.In(unicode.Mn)), norm.NFC)
	result, _, err := transform.String(t, "žůžo")
	if err != nil {
		panic(err)
	}
	fmt.Println(result) // zuzo
}
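
To see what the middle step of the chain does on its own, here is a minimal standard-library sketch. It assumes the input is already decomposed (which is what norm.NFD produces), and stripMarks is a hypothetical helper name, not part of any package. It does no normalization itself, so a precomposed "ž" passes through unchanged, which is why the NFD step is needed in the full solution.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// stripMarks (hypothetical helper) drops every nonspacing mark (category Mn).
// Assumption: s is already in decomposed form (NFD); nothing is normalized here.
func stripMarks(s string) string {
	return strings.Map(func(r rune) rune {
		if unicode.Is(unicode.Mn, r) {
			return -1 // a negative value tells strings.Map to drop the rune
		}
		return r
	}, s)
}

func main() {
	// "žůžo" written out in decomposed form: base letter + combining mark.
	decomposed := "z\u030Cu\u030Az\u030Co"
	fmt.Println(stripMarks(decomposed))

	// Precomposed ž (U+017E) is a single letter, not letter + mark, so it survives.
	fmt.Println(stripMarks("\u017Eu"))
}
```

Note that letters such as "ł" or "ø" have no canonical decomposition, so neither this sketch nor the NFD-based chain will turn them into "l" or "o".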