如何在日常工作中遇到的问题学到些什么? 这是我开启杂记-FAQ的初衷.
其次就是不时去看一下, 当时理解或者求索是否正确, 常看常新.

不积跬步无以至千里.

pandas==2.1.1读取excel的默认引擎问题

1
2
data_frame = pd.DataFrame(pd.read_excel(io.BytesIO(yesterday_xls_data), sheet_name="Sheet1",dtype={}))
# ImportError: Missing optional dependency 'openpyxl'. Use pip or conda to install openpyxl
  • FAQ:

pandas在未指定引擎, 如pd.ExcelWriter(xls_path, mode='a', engine='openpyxl'), 默认指定 openpyxl作为其默认引擎.

当然也可以指定xlsxwriter作为引擎, 因为XlsxWriteropenpyxl能够操作更多excel特性, 例如:合并单元格以及设置颜色等等

  • Origin:

在给AWS Lambda python 3.11增加了pandas 2.1.1层后, 报此错.

原因就在于本地环境已安装, 这一点请后续做docker还是其他service-less的层注意.

bcrypt==4.0.1AWS Lambda中做层失败问题

参考

在本地使用pip install安装包, 这些包会下载并编译适用与本地机器架构.但是, 包含已编译代码(如Numpy, bcrypt)的python包并不总是与Lambda运行时的虚拟系统兼容.

1
2
#requirements.txt
bcrypt==4.0.1
1
2
3
4
5
# 创建lambda python 3.11 的layer层
# 指定x86 --platform manylinux2014_x86_64进行编译
# 指定arm --platform manylinux2014_aarch64
pip install --platform manylinux2014_x86_64 --target=./python -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple \ 
--implementation cp --python-version 3.11  --only-binary=:all: --upgrade

使用Docker搭建python 3.11开发测试环境

本次主要因为AWS Lambda python会在2023/11月份舍弃python3.7, 进而升级到python 3.9, 3.10. 3.11.

为了将已有lambda的工程适配到python 3.11, 主要有pandas, pymysql等适配.

  • Dockerfile
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
# 使用python 3.11.6 的镜像
FROM python:3.11.6

# 安装所需的工具和依赖项
# debian 14.4 的python 3.11 的source位于/etc/apt/sources.list.d/debian.sources文件中
# cp /etc/apt/sources.list.d/debian.sources /etc/apt/sources.list.d/debian.sources.bak && \
RUN sed -i 's/deb.debian.org/mirrors.aliyun.com/g' /etc/apt/sources.list.d/debian.sources && \
    apt-get update && \
    apt-get install -y openssh-server vim zip

# 创建特权分离目录
# 设置 SSH 密码(请根据需要更改)
# 允许 root 用户远程登录, 密码为root123
# 使用密码进行身份验证
RUN mkdir /run/sshd && \
    echo 'root:root123' | chpasswd && \
    sed -i 's/#PermitRootLogin prohibit-password/PermitRootLogin yes/' /etc/ssh/sshd_config && \  
    sed -i 's/#PasswordAuthentication yes/PasswordAuthentication yes/' /etc/ssh/sshd_config

# 暴露SSH端口
EXPOSE 22

# 启动 SSH 服务
CMD ["/usr/sbin/sshd", "-D"]
  • build and run
1
2
sudo docker build --rm -t python3.11-debian:v1 .
sudo docker run --name debian-py311-ssh --restart always -p 2222:22 -dti python3.11-debian:v1
  • 关于debian 14.4基础镜像的ap source源配置

/etc/apt/sources.list.d/debian.sources文件格式, 通过sed s/../g进行全局替换

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
## debian.sources
Types: deb
# http://snapshot.debian.org/archive/debian/20231009T000000Z
URIs: http://deb.debian.org/debian
Suites: bookworm bookworm-updates
Components: main
Signed-By: /usr/share/keyrings/debian-archive-keyring.gpg

Types: deb
# http://snapshot.debian.org/archive/debian-security/20231009T000000Z
URIs: http://deb.debian.org/debian-security
Suites: bookworm-security
Components: main
Signed-By: /usr/share/keyrings/debian-archive-keyring.gpg

fatal error: concurrent map read and map write

githu issue

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
[GIN] 2023/11/08 - 01:15:52 | 200 |   67.748541ms |    172.31.31.36 | POST     "/v3/tokyo-higher/exercise/log/submit"
fatal error: concurrent map writes

goroutine 5123434 [running]:
runtime.throw({0xff5a02, 0x15})
        D:/services/go-v1.18.5/src/runtime/panic.go:992 +0x5c fp=0x6724d64 sp=0x6724d50 pc=0x48b94
runtime.mapassign_faststr(0xf63c58, 0x814a420, {0x38c8c20, 0xb})
        D:/services/go-v1.18.5/src/runtime/map_faststr.go:212 +0x3f8 fp=0x6724d98 sp=0x6724d64 pc=0x22dc4
net/http.(*Request).ParseMultipartForm(0x6f72680, 0x2000000)
        D:/services/go-v1.18.5/src/net/http/request.go:1327 +0x200 fp=0x6724e1c sp=0x6724d98 pc=0x2e42ac
github.com/gin-gonic/gin/binding.formMultipartBinding.Bind({}, 0x6f72680, {0xe2d368, 0x7e57d00})
        D:/AwesomeGo/pkg/mod/github.com/gin-gonic/gin@v1.7.7/binding/form.go:55 +0x2c fp=0x6724e40 sp=0x6724e1c pc=0x520d14
github.com/gin-gonic/gin/binding.(*formMultipartBinding).Bind(0x1c01970, 0x6f72680, {0xe2d368, 0x7e57d00})
        <autogenerated>:1 +0x44 fp=0x6724e58 sp=0x6724e40 pc=0x525d28
github.com/gin-gonic/gin.(*Context).ShouldBindWith(...)
        D:/AwesomeGo/pkg/mod/github.com/gin-gonic/gin@v1.7.7/context.go:706
github.com/gin-gonic/gin.(*Context).MustBindWith(0x6bac120, {0xe2d368, 0x7e57d00}, {0x1350ef0, 0x1c01970})
        D:/AwesomeGo/pkg/mod/github.com/gin-gonic/gin@v1.7.7/context.go:649 +0x40 fp=0x6724e7c sp=0x6724e58 pc=0x52f8c8
github.com/gin-gonic/gin.(*Context).Bind(0x6bac120, {0xe2d368, 0x7e57d00})
        D:/AwesomeGo/pkg/mod/github.com/gin-gonic/gin@v1.7.7/context.go:607 +0x248 fp=0x6724ea4 sp=0x6724e7c pc=0x52f570
elst-api/controllers/integral.PushIntegralOfEikenTokyo.func1()
        D:/AwesomeGo/src/elst-api/controllers/integral/eiken.go:200 +0x58 fp=0x6724fec sp=0x6724ea4 pc=0xd5e530
runtime.goexit()
        D:/services/go-v1.18.5/src/runtime/asm_arm.s:824 +0x4 fp=0x6724fec sp=0x6724fec pc=0x7af9c
created by elst-api/controllers/integral.PushIntegralOfEikenTokyo
        D:/AwesomeGo/src/elst-api/controllers/integral/eiken.go:197 +0x58

sence:

1
router handler_chains: api.POST("test/goroutine-chain", handlerFun0, handlerFunc1)
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
// first handled chain
func (s *contest) handlerFun0(c *gin.Context) {
	g := app.Gin{C: c}
	var form constants.ScoreLog
	if err := g.ShouldBind(&form); err != nil {
		g.Error("form param_missing")
		logging.Error(err.Error())
		return
	}
}

// second handled chain
func handlerFunc1(c *gin.Context) {
	go func() {
		var form FormIntegral
                // 第二次读取的时候,
		if err := c.ShouldBind(&form); err != nil {
			logging.Error(err.Error())
			return
		}
                // 主营业务
                ...     
        }
}

github.com/gin-gonic/gin@v1.7.7/context.go:705

1
2
3
func (c *Context) ShouldBindWith(obj interface{}, b binding.Binding) error {
	return b.Bind(c.Request, obj)
}

github.com/gin-gonic/gin@v1.7.7/binding/form.go:54

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
func (formMultipartBinding) Bind(req *http.Request, obj interface{}) error {
	if err := req.ParseMultipartForm(defaultMemory); err != nil {
		return err
	}
	if err := mappingByPtr(obj, (*multipartRequest)(req), "form"); err != nil {
		return err
	}

	return validate(obj)
}

go1.18/src/net/http/request.go:1299

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
func (r *Request) ParseMultipartForm(maxMemory int64) error {
	if r.MultipartForm == multipartByReader {
		return errors.New("http: multipart handled by MultipartReader")
	}
	var parseFormErr error
	if r.Form == nil {
		// Let errors in ParseForm fall through, and just
		// return it at the end.
		parseFormErr = r.ParseForm()
	}
	if r.MultipartForm != nil {
		return nil
	}

	mr, err := r.multipartReader(false)
	if err != nil {
		return err
	}

	f, err := mr.ReadForm(maxMemory)
	if err != nil {
		return err
	}

        // Point: 1 
	if r.PostForm == nil {
		r.PostForm = make(url.Values)
	}
        // The reason: fatal error: concurrent map writes
        // here will write the r.PostForm
        // the type of r.PostForm is map[string][]string in `go1.18/src/net/url/url.go:865`, which is unsafe concurrent map
	for k, v := range f.Value {
		r.Form[k] = append(r.Form[k], v...)
		// r.PostForm should also be populated. See Issue 9305.
		r.PostForm[k] = append(r.PostForm[k], v...)
	}

	r.MultipartForm = f

	return parseFormErr
}

go1.18/src/net/url/url.go:865

1
2
// no safe cocurrent 
type Values map[string][]string

To fix:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
// second handled chain
func NewPushIntegralOfWrhandlerFunc1iteContest(c *gin.Context) {
	var form FormIntegral
	// the scond bind the PostForm cannot be used in goroutine and must be executed sequentially
	if err := c.ShouldBind(&form); err != nil {
			logging.Error(err.Error())
			return
	}
	// 将bind操作移出来, 仍按照链式线性执行
	// 而其他业务放在goroutine处理
	// shouldbind在高并发时会触发map的concurrent map read and map write错误, 导致服务重启
	go func() {
                // 主营业务
                ...     
    }
}

[mysql] closing bad idle connection: EOF

参考

solution: use ConnMaxLifeTime.

大量数据库连接需要维护这个ConnMaxLifeTime, 设置连接的有效时长. 定期关闭连接池连接, 超期关闭

建议时5分钟. 这个数据怎么来的??

  • 源码位置 go-sql-driver/mysql@v1.6.0/packets.go:93
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
// Perform a stale connection check. We only perform this check for
// the first query on a connection that has been checked out of the
// connection pool: a fresh connection from the pool is more likely
// to be stale, and it has not performed any previous writes that
// could cause data corruption, so it's safe to return ErrBadConn
// if the check fails.
if mc.reset {
	mc.reset = false
	conn := mc.netConn
	if mc.rawConn != nil {
		conn = mc.rawConn
	}
	var err error
	// If this connection has a ReadTimeout which we've been setting on
	// reads, reset it to its default value before we attempt a non-blocking
	// read, otherwise the scheduler will just time us out before we can read
	// 引发错误源头 --1
	if mc.cfg.ReadTimeout != 0 {
		// SetReadDeadline sets the deadline for future Read calls
		// and any currently-blocked Read call.
		// A zero value for t means Read will not time out.
		err = conn.SetReadDeadline(time.Time{})
	}
	// 引发错误源头 --2
	if err == nil && mc.cfg.CheckConnLiveness {
		err = connCheck(conn)
	}
	if err != nil {
		errLog.Print("closing bad idle connection: ", err)
		mc.Close()
		return driver.ErrBadConn
	}
}

// SetConnMaxLifetime sets the maximum amount of time a connection may be reused.
//
// Expired connections may be closed lazily before reuse.
//
// If d <= 0, connections are reused forever.
func (db *DB) SetConnMaxLifetime(d time.Duration) {
    if d < 0 {
        d = 0
    }
    db.mu.Lock()
    // wake cleaner up when lifetime is shortened.
    if d > 0 && d < db.maxLifetime && db.cleanerCh != nil {
        select {
        case db.cleanerCh <- struct{}{}:
        default:
        }
    }
    db.maxLifetime = d
    db.startCleanerLocked()
    db.mu.Unlock()
}

IP的漂移和AWS Lambda缓存问题[综合]

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
Traceback (most recent call last):
  File "/opt/python/mysql/connector/network.py", line 459, in switch_to_ssl
    self.sock, server_hostname=self.server_host)
  File "/var/lang/lib/python3.7/ssl.py", line 423, in wrap_socket
    session=session
  File "/var/lang/lib/python3.7/ssl.py", line 870, in _create
    self.do_handshake()
  File "/var/lang/lib/python3.7/ssl.py", line 1139, in do_handshake
    self._sslobj.do_handshake()
ConnectionResetError: [Errno 104] Connection reset by peer

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/python/sqlalchemy/engine/base.py", line 3240, in _wrap_pool_connect
    return fn()
  File "/opt/python/sqlalchemy/pool/base.py", line 310, in connect
    return _ConnectionFairy._checkout(self)
  File "/opt/python/sqlalchemy/pool/base.py", line 868, in _checkout
    fairy = _ConnectionRecord.checkout(pool)
  File "/opt/python/sqlalchemy/pool/base.py", line 481, in checkout
    rec._checkin_failed(err, _fairy_was_created=False)
  File "/opt/python/sqlalchemy/util/langhelpers.py", line 72, in __exit__
    with_traceback=exc_tb,
  File "/opt/python/sqlalchemy/util/compat.py", line 207, in raise_
    raise exception
  File "/opt/python/sqlalchemy/pool/base.py", line 478, in checkout
    dbapi_connection = rec.get_connection()
  File "/opt/python/sqlalchemy/pool/base.py", line 636, in get_connection
    self.__connect()
  File "/opt/python/sqlalchemy/pool/base.py", line 666, in __connect
    pool.logger.debug("Error on connect(): %s", e)
  File "/opt/python/sqlalchemy/util/langhelpers.py", line 72, in __exit__
    with_traceback=exc_tb,
  File "/opt/python/sqlalchemy/util/compat.py", line 207, in raise_
    raise exception
  File "/opt/python/sqlalchemy/pool/base.py", line 661, in __connect
    self.dbapi_connection = connection = pool._invoke_creator(self)
  File "/opt/python/sqlalchemy/engine/create.py", line 590, in connect
    return dialect.connect(*cargs, **cparams)
  File "/opt/python/sqlalchemy/engine/default.py
  • 错误
1
2
Exec the watch primary homework mission failed, err (2013, 'Lost connection to MySQL server during query')
Exec the watch homework mission failed, err (2006, "MySQL server has gone away (TimeoutError(110, 'Connection timed out'))")
  • 错误原因:
  1. lambda通过域名连接mysql, mysql的解析IP存在网络漂移
  2. lambda存在缓存, 单个服务会一直使用缓存的漂移IP, 导致后面不断报执行错误
  • 解决 在lambda的执行脚本中, 每次都重新解析连接数据库, 即使是存在网络漂移, 也能后正确连接执行