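I'm new to Go and am trying to debug some networking problems in production, in an application whose issues I suspect are related to DNS resolution. Long story short, after successfully connecting to and interacting with a Redis cluster, the connection fails and the application starts logging a large number of dial: tcp :0: connection refused errors. The :0: in this message is why I suspect DNS: as far as I can tell, it indicates an attempt to connect to a zero-value remote address, and I can't see why that would happen unless the hostname lookup went wrong.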

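Anyway, if anyone has insight into that particular problem I'd love to hear it, but the main question here is about an issue I ran into while trying to debug the program: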

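To see what is actually happening during hostname resolution, I passed the dialer a custom resolver whose Dial function returns wrapped connections from the dialer itself. The connection wrapper essentially just logs the bytes read and written on the local connection and nothing else. Here is a simple reproducible example: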

package main

import (
    "context"
    "fmt"
    "net"
    "time"
)

type ConnWrapper struct {
    net.Conn
}

func (c *ConnWrapper) Write(bytes []byte) (int, error) {
    n, err := c.Conn.Write(bytes)
    if err != nil {
        fmt.Printf("Failed to write bytes %s: %v\n", string(bytes), err)
    } else {
        fmt.Printf("Successfully wrote %d of bytes %x\n", n, bytes[:n])
    }
    return n, err
}

func (c *ConnWrapper) Read(bytes []byte) (int, error) {
    n, err := c.Conn.Read(bytes)
    if err != nil {
        fmt.Printf("Failed to read bytes: %v\n", err)
    } else {
        fmt.Printf("Successfully read %d of bytes %x\n", n, bytes[:n])
    }
    return n, err
}

func (c *ConnWrapper) SetDeadline(t time.Time) error {
    fmt.Printf("Setting deadline %v %s %s\n", t, c.LocalAddr(), c.RemoteAddr())
    return c.Conn.SetDeadline(t)
}

func (c *ConnWrapper) SetReadDeadline(t time.Time) error {
    fmt.Printf("Setting read deadline %v %s %s\n", t, c.LocalAddr(), c.RemoteAddr())
    return c.Conn.SetReadDeadline(t)
}

func (c *ConnWrapper) SetWriteDeadline(t time.Time) error {
    fmt.Printf("Setting write deadline %v %s %s\n", t, c.LocalAddr(), c.RemoteAddr())
    return c.Conn.SetWriteDeadline(t)
}

func main() {
    var d *net.Dialer
    d = &net.Dialer{
        Timeout: time.Duration(30) * time.Minute,
        Resolver: &net.Resolver{
            PreferGo: true,
            Dial: func(ctx context.Context, network, address string) (net.Conn, error) {
                fmt.Printf("Redis dialing (from resolver) %s %s\n", network, address)
                conn, err := d.DialContext(ctx, network, address)

                if err != nil {
                    fmt.Printf("Redis resolver failed %v\n", err)
                }

                // When I return c2, the reads time out
                c2 := &ConnWrapper{conn}
                return c2, err

                // When I return conn, everything is fine
                //return conn, err
            },
        },
    }
    conn, err := d.DialContext(context.Background(), "tcp", "redis-node-0:6379")
    if err != nil {
        fmt.Printf("Redis dial failed %v\n", err)
    } else {
        fmt.Printf("Successfully dialed %s\n", conn.RemoteAddr())
    }
}

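What I don't understand is that when I return the native connection, everything works as expected. When I return the wrapped connection, the connection hangs and times out while resolving the DNS name (specifically in the conn.Read call). I'm puzzled as to why this happens, but I also have very little practical experience with Go, so I'm wondering whether there is a simple explanation that I'm just not seeing. I tried asking Google and ChatGPT, to no avail, so I'm hoping someone in the community can give me some pointers.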

Running the program above produces the following output on my system (using Go 1.21):

Redis dialing (from resolver) udp 1.1.1.1:53
Redis dialing (from resolver) udp 1.1.1.1:53
Setting deadline 2023-10-15 10:43:44.031475 -0400 AST m=+5.001741085 192.168.68.61:62862 1.1.1.1:53
Setting deadline 2023-10-15 10:43:44.031494 -0400 AST m=+5.001759751 192.168.68.61:57358 1.1.1.1:53
Successfully wrote 49 of bytes 002f53c8010000010000000000010c72656469732d6e6f64652d30056c6f63616c00001c000100002904d0000000000000
Successfully wrote 49 of bytes 002f29c8010000010000000000010c72656469732d6e6f64652d30056c6f63616c000001000100002904d0000000000000
Successfully read 2 of bytes 002f
Successfully read 2 of bytes 002f
Failed to read bytes: read udp 192.168.68.61:57358->1.1.1.1:53: i/o timeout
Redis dialing (from resolver) udp 1.0.0.1:53
Failed to read bytes: read udp 192.168.68.61:62862->1.1.1.1:53: i/o timeout
Redis dialing (from resolver) udp 1.0.0.1:53
Setting deadline 2023-10-15 10:43:49.033465 -0400 AST m=+10.003676876 192.168.68.61:64641 1.0.0.1:53
Setting deadline 2023-10-15 10:43:49.033817 -0400 AST m=+10.004027543 192.168.68.61:59264 1.0.0.1:53
Successfully wrote 49 of bytes 002f6612010000010000000000010c72656469732d6e6f64652d30056c6f63616c00001c000100002904d0000000000000
Successfully wrote 49 of bytes 002fd26c010000010000000000010c72656469732d6e6f64652d30056c6f63616c000001000100002904d0000000000000
Successfully read 2 of bytes 002f
Failed to read bytes: read udp 192.168.68.61:59264->1.0.0.1:53: i/o timeout
Redis dialing (from resolver) udp 1.1.1.1:53
Setting deadline 2023-10-15 10:43:54.036073 -0400 AST m=+15.006228210 192.168.68.61:56757 1.1.1.1:53
Failed to read bytes: read udp 192.168.68.61:64641->1.0.0.1:53: i/o timeout
Redis dialing (from resolver) udp 1.1.1.1:53
Successfully wrote 49 of bytes 002f4f6d010000010000000000010c72656469732d6e6f64652d30056c6f63616c00001c000100002904d0000000000000
Setting deadline 2023-10-15 10:43:54.036487 -0400 AST m=+15.006642126 192.168.68.61:58296 1.1.1.1:53
Successfully wrote 49 of bytes 002f404c010000010000000000010c72656469732d6e6f64652d30056c6f63616c000001000100002904d0000000000000
Successfully read 2 of bytes 002f
Successfully read 2 of bytes 002f
Failed to read bytes: read udp 192.168.68.61:58296->1.1.1.1:53: i/o timeout
Failed to read bytes: read udp 192.168.68.61:56757->1.1.1.1:53: i/o timeout

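When I update the program to simply return the built-in conn instance, I get the following. (Note that I'm running this in an environment where the Redis host is not reachable, so "no such host" is actually the expected behavior here. The point is that the DNS lookup actually completes.)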

Redis dialing (from resolver) udp 1.1.1.1:53
Redis dialing (from resolver) udp 1.1.1.1:53
Redis dialing (from resolver) udp 1.1.1.1:53
Redis dialing (from resolver) udp 1.1.1.1:53
Redis dial failed dial tcp: lookup redis-node-0 on 1.1.1.1:53: no such host

Recommended answer

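The resolver distinguishes between net.PacketConn and net.Conn (documentation). A wrapper that only embeds net.Conn hides the PacketConn methods, so the resolver treats the UDP socket as a stream connection. If Dial returns a packet connection, return a wrapper that is itself a packet connection.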
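To make the distinction concrete, here is a minimal sketch of the type assertion involved. The names plainWrapper, isPacketConn, and dialLoopbackUDP are made up for illustration, and the loopback listener exists only so there is a local address to dial; this is not the resolver's actual code, just the same kind of check.

```go
package main

import (
	"fmt"
	"net"
)

// plainWrapper is a hypothetical stand-in for the question's ConnWrapper.
// It embeds the net.Conn interface, so only net.Conn's methods are promoted.
type plainWrapper struct {
	net.Conn
}

// isPacketConn reports whether c also implements net.PacketConn.
func isPacketConn(c net.Conn) bool {
	_, ok := c.(net.PacketConn)
	return ok
}

// dialLoopbackUDP dials a throwaway UDP listener on the loopback interface.
// The returned cleanup function closes both sockets.
func dialLoopbackUDP() (net.Conn, func(), error) {
	ln, err := net.ListenPacket("udp", "127.0.0.1:0")
	if err != nil {
		return nil, nil, err
	}
	conn, err := net.Dial("udp", ln.LocalAddr().String())
	if err != nil {
		ln.Close()
		return nil, nil, err
	}
	cleanup := func() {
		conn.Close()
		ln.Close()
	}
	return conn, cleanup, nil
}

func main() {
	conn, cleanup, err := dialLoopbackUDP()
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	defer cleanup()

	// The dialed value is a *net.UDPConn, which has ReadFrom and WriteTo.
	fmt.Println("raw UDP conn is a PacketConn:", isPacketConn(conn))
	// The wrapper exposes only the net.Conn method set, so the assertion fails.
	fmt.Println("wrapped conn is a PacketConn:", isPacketConn(&plainWrapper{conn}))
}
```

The first line prints true and the second prints false, which is the situation the question's ConnWrapper creates for UDP connections.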

The packet connection wrapper looks like this:

type PacketConnWrapper struct {
    ConnWrapper
    pc net.PacketConn
}

func (c PacketConnWrapper) ReadFrom(bytes []byte) (int, net.Addr, error) {
    n, addr, err := c.pc.ReadFrom(bytes)
    if err != nil {
        fmt.Printf("Failed to read from bytes: %v\n", err)
    } else {
        fmt.Printf("Successfully read %d bytes %x from %s\n", n, bytes[:n], addr)
    }
    return n, addr, err
}

func (c PacketConnWrapper) WriteTo(bytes []byte, addr net.Addr) (int, error) {
    n, err := c.pc.WriteTo(bytes, addr)
    if err != nil {
        fmt.Printf("Failed to write bytes %x to %v: %s\n", bytes, addr, err)
    } else {
        fmt.Printf("Successfully wrote bytes %x to %s\n", bytes, addr)
    }
    return n, err
}

Use the packet wrapper in the Dial function like this:

d = &net.Dialer{
    Timeout: time.Duration(30) * time.Minute,
    Resolver: &net.Resolver{
        PreferGo: true,
        Dial: func(ctx context.Context, network, address string) (net.Conn, error) {
            fmt.Printf("Redis dialing (from resolver) %s %s\n", network, address)
            conn, err := d.DialContext(ctx, network, address)

            if err != nil {
                fmt.Printf("Redis resolver failed %v\n", err)
                return conn, err
            }

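            // Return pointers: if Read, Write, and the deadline methods on
            // ConnWrapper have pointer receivers (as in the question), only
            // the pointer types satisfy net.Conn.
            if pc, ok := conn.(net.PacketConn); ok {
                return &PacketConnWrapper{ConnWrapper{conn}, pc}, nil
            }

            return &ConnWrapper{conn}, nil
        },
    },
}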

https://go.dev/play/p/1KOr95FbDKD
