• 欢迎访问DBA的辛酸事儿,推荐使用最新版火狐浏览器和Chrome浏览器访问本网站
  • 欢迎大家关注博主公众号:DBA的辛酸事儿
  • 博文中若有错误的地方,请大家指正,大家的指正是我前进的动力

MySQL多线程复制报错之slave_pending_jobs_size_max

MySQL SEian.G 3年前 (2021-07-06) 1511次浏览 已收录 0个评论

近期遇到一个主从复制报错的问题,具体的报错详情如下所示:

mysql >show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: xxxxx
                  Master_User: dba_repl
                  Master_Port: 4306
                Connect_Retry: 60
              Master_Log_File: mysqlbin.001589
          Read_Master_Log_Pos: 353490645
               Relay_Log_File: slave-relay-bin.004764
                Relay_Log_Pos: 912732994
        Relay_Master_Log_File: mysqlbin.001588
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 1864
                   Last_Error: Cannot schedule event Rows_query, relay-log name /data/mysql_4306/log/slave-relay-bin.004764, position 912733141 to Worker thread because its size 21520792 exceeds 18777088 of slave_pending_jobs_size_max.
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 912732823
              Relay_Log_Space: 1427234531
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 1864
               Last_SQL_Error: Cannot schedule event Rows_query, relay-log name /data/mysql_4306/log/slave-relay-bin.004764, position 912733141 to Worker thread because its size 21520792 exceeds 18777088 of slave_pending_jobs_size_max.
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 255
                  Master_UUID: 548723ca-1f7f-11e9-b3ab-005056b748c5
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State:
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp: 210514 16:02:41
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set: 548723ca-1f7f-11e9-b3ab-005056b748c5:532252-1257990582
            Executed_Gtid_Set: 548723ca-1f7f-11e9-b3ab-005056b748c5:1-1257510314,
5965cc1e-42b8-11e8-9759-005056b7d9af:1-2805913654
                Auto_Position: 1
         Replicate_Rewrite_DB:
                 Channel_Name:
           Master_TLS_Version:
1 row in set (0.00 sec)

从报错信息看,和参数slave_pending_jobs_size_max有关,那么这个参数具体是干什么的呢?我们今天就来了解一下;

下面是官方文档对该参数的介绍:

For multithreaded replicas, this variable sets the maximum amount of memory (in bytes) available to worker queues holding events not yet applied. Setting this variable has no effect on replicas for which multithreading is not enabled. Setting this variable has no immediate effect. The state of the variable applies on all subsequent START SLAVE commands.

The minimum possible value for this variable is 1024; the default is 16MB. The maximum possible value is 18446744073709551615 (16 exabytes). Values that are not exact multiples of 1024 are rounded down to the next-highest multiple of 1024 prior to being stored.

The value of this variable is a soft limit and can be set to match the normal workload. If an unusually large event exceeds this size, the transaction is held until all the worker threads have empty queues, and then processed. All subsequent transactions are held until the large transaction has been completed.

大概的意思是:

对于多线程复制,slave_pending_jobs_size_max变量设置用于保存尚未应用的event的工作队列可用的最大内存量(以字节为单位)。设置此变量对未启用多线程处理的复制没有影响。设置此变量不会立即生效。必须要停掉复制之后,重新start slave。

此变量的最小值为1024;默认值为16MB。最大可能值为18446744073709551615(16 EB)。

此变量的值是软限制,可以设置为与正常工作负载匹配。如果异常大的事件超过此大小,事务将被保留,直到所有工作线程都有空队列,然后进行处理。如果内存富余,或者延迟较大时,可以适当调大;注意这个值要比主库的max_allowed_packet大!

划重点

该参数在多线程复制中起作用,

当worker线程正在处理的event的总大小超过slave_pending_jobs_size_max变量的大小时,将发生此等待操作。此时可有在主库看到线程的状态为: Waiting for Slave Workers to free pending events

当event大小降至该限制以下时,协调器将恢复调度。仅当slave_parallel_workers设置为大于0时,才会出现此状态。

那么对该参数了解之后,具体的解决方法也就有了:

1、查看主库max_allowed_packet的大小

mysql> show variables like '%max_allowed_packet%';
+--------------------------+------------+
| Variable_name            | Value      |
+--------------------------+------------+
| max_allowed_packet       | 67108864   |
| slave_max_allowed_packet | 1073741824 |
+--------------------------+------------+
2 rows in set (0.00 sec)

2、设置从库slave_pending_jobs_size_max的大小,注意,需要大于主库max_allowed_packet的大小

mysql>stop slave;
Query OK, 0 rows affected (0.01 sec)

mysql>set global slave_pending_jobs_size_max=1073741824;
Query OK, 0 rows affected (0.00 sec)

mysql>start slave;
Query OK, 0 rows affected (0.02 sec)

MySQL多线程复制报错之slave_pending_jobs_size_max


如果您觉得本站对你有帮助,那么可以收藏和推荐本站,帮助本站更好地发展,在此谢过各位网友的支持。
转载请注明原文链接:MySQL多线程复制报错之slave_pending_jobs_size_max
喜欢 (4)
SEian.G
关于作者:
用心去记录工作,用心去感受生活,用心去学着成长;座右铭:苦练七十二变,笑对八十一难
发表我的评论
取消评论

表情 贴图 加粗 删除线 居中 斜体 签到

Hi,您需要填写昵称和邮箱!

  • 昵称 (必填)
  • 邮箱 (必填)
  • 网址