super_read_only mode breaks replication with some queries

Bug #1441259 reported by Arturas Moskvinas
40
This bug affects 6 people
Affects Status Importance Assigned to Milestone
Percona Server moved to https://jira.percona.com/projects/PS
Status tracked in 5.7
5.5
Invalid
Undecided
Unassigned
5.6
Fix Released
High
Yura Sorokin
5.7
Invalid
High
Yura Sorokin

Bug Description

We're using super_read_only mode on slaves to avoid accidental data changes/corruptions if modification query is executed on replica directly. It was working fine with replication and yesterday replication stopped with such error:
Last_SQL_Error: Error 'The MySQL server is running with the --read-only (super) option so it cannot execute this statement' on query. Default database: 'somerandomdb'. Query:
'UPDATE tbl INNER JOIN tbl_postfix on tbl.id_ = tbl_postfix.tbl_id SET is_visible = 1 WHERE hex(tbl_postfix.postfix_id) LIKE 'yadada%' AND tbl.is_visible = 0'

Table definitions:
CREATE TABLE `tbl` (
  `created` datetime NOT NULL,
  `updated` datetime NOT NULL,
  `id_` binary(16) NOT NULL,
  `updated_by` varchar(191) DEFAULT NULL,
  `deleted` datetime DEFAULT NULL,
  `type_` varchar(191) NOT NULL,
  `parent_id` binary(16) DEFAULT NULL,
  `user_type` varchar(191) NOT NULL,
  `device_type` varchar(191) DEFAULT NULL,
  `issue_type` varchar(191) DEFAULT NULL,
  `is_visible` tinyint(1) DEFAULT NULL,
  `user_status` varchar(191) DEFAULT 'signed-in',
  PRIMARY KEY (`id_`),
  KEY `parent_id` (`parent_id`),
  KEY `ix_tbl_deleted` (`deleted`),
  CONSTRAINT `tbl_ibfk_1` FOREIGN KEY (`parent_id`) REFERENCES `tbl` (`id_`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

CREATE TABLE `tbl_postfix` (
  `tbl_id` binary(16) NOT NULL,
  `postfix_id` binary(16) NOT NULL,
  `ordinal` int(11) DEFAULT NULL,
  PRIMARY KEY (`tbl_id`,`postfix_id`),
  KEY `ix_tbl_postfix_tbl_id` (`tbl_id`),
  KEY `ix_tbl_postfix_postfix_id` (`postfix_id`),
  CONSTRAINT `tbl_postfix_ibfk_1` FOREIGN KEY (`tbl_id`) REFERENCES `tbl` (`id_`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

Percona server version we use: 5.6.21-70.1-log, platform - ubuntu.

tags: added: super-read-only
Changed in percona-server:
assignee: nobody → Muhammad Irfan (muhammad-irfan)
Revision history for this message
Muhammad Irfan (muhammad-irfan) wrote :

I did tried to reproduce it with no luck so far. Can you please provide my.cnf file from both servers i.e. master and slave.
Further, is the slave stops only on join query with update statement ? Also, provide slave server error log to check further.

Changed in percona-server:
assignee: Muhammad Irfan (muhammad-irfan) → nobody
status: New → Incomplete
Revision history for this message
Arturas Moskvinas (arturas-w) wrote :

I'll try to repeat on actual data set, I had to resync slaves and I didn't have time to save all logs and data. It will take me several days to prepare everything, I though it will be easy to repeat issue, but yes - only update with joins failed, everything else worked fine.

Revision history for this message
Arturas Moskvinas (arturas-w) wrote :

Ok, I'm able to reproduce slave error immediately, trying to carve smaller test case.

Revision history for this message
Arturas Moskvinas (arturas-w) wrote :

Muhammad - I was able to reproduce issue with my provided examples in bug description. I created tbl and tbl_postfix tables on master server, set super_read_only mode on on its' slave and executed:
'UPDATE tbl INNER JOIN tbl_postfix on tbl.id_ = tbl_postfix.tbl_id SET is_visible = 1 WHERE hex(tbl_postfix.postfix_id) LIKE 'yadada%' AND tbl.is_visible = 0'

Which immediately broke replication on slave:
               Last_SQL_Errno: 1290
               Last_SQL_Error: Error 'The MySQL server is running with the --read-only (super) option so it cannot execute this statement' on query. Default database: 'test'. Query: 'UPDATE tbl INNER JOIN tbl_postfix on tbl.id_ = tbl_postfix.tbl_id SET is_visible = 1 WHERE hex(tbl_postfix.postfix_id) LIKE 'yadada%' AND tbl.is_visible = 0'

Mysql version which we use: 5.6.21-70.1-698.precise

Changed in percona-server:
status: Incomplete → New
Revision history for this message
Muhammad Irfan (muhammad-irfan) wrote :

Now I am able to reproduce the same with below settings.

super_read_only=1
read_only=1

slave1 [localhost] {msandbox} ((none)) > SHOW SLAVE STATUS\G
*************************** 1. row ***************************
                   Last_Errno: 1290
                   Last_Error: Error 'The MySQL server is running with the --read-only (super) option so it cannot execute this statement' on query. Default database: 'test'. Query: 'UPDATE tbl INNER JOIN tbl_postfix on tbl.id_ = tbl_postfix.tbl_id SET is_visible = 1 WHERE hex(tbl_postfix.postfix_id) LIKE 'yadada%' AND tbl.is_visible = 0'

master [localhost] {msandbox} ((none)) > SHOW GLOBAL VARIABLES LIKE "%version%";
+-------------------------+--------------------------------------------------+
| Variable_name | Value |
+-------------------------+--------------------------------------------------+
| innodb_version | 5.6.21-rel70.1 |
| protocol_version | 10 |
| slave_type_conversions | |
| version | 5.6.21-70.1-log |
| version_comment | Percona Server (GPL), Release 70.1, Revision 698 |
| version_compile_machine | x86_64 |
| version_compile_os | Linux |
+-------------------------+--------------------------------------------------+

Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :

Muhammad, is it upstream? (i.e. WebScaleSQL)

Revision history for this message
Arturas Moskvinas (arturas-w) wrote :

Any news on this bug?

Revision history for this message
Rick Pizzi (pizzi) wrote :

This bug is nasty.

It hits when the following cases are met:

- UPDATE with JOIN
- BINLOG_FORMAT = STATEMENT

Very easy to reproduce:

use rick;

CREATE TABLE `t1` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `val` int(11) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=latin1;
Query OK, 1 row affected (0.00 sec)

CREATE TABLE `t2` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `val` int(11) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=latin1;
Query OK, 1 row affected (0.00 sec)

mysql> insert into t1 values (1, 1);
Query OK, 1 row affected (0.00 sec)

mysql> insert into t2 values (1, 2);
Query OK, 1 row affected (0.00 sec)

mysql> SET SESSION binlog_format = 'STATEMENT';
Query OK, 0 rows affected (0.00 sec)

mysql> update rick.t1 left join rick.t2 using(id) set t1.val = 19;
Query OK, 0 rows affected, 1 warning (0.00 sec)
Rows matched: 1 Changed: 0 Warnings: 0

=> replication breaks on slave with the following error:

Error 'The MySQL server is running with the --read-only (super) option so it cannot execute this statement' on query. Default database: 'rick'. Query: 'update rick.t1 left join rick.t2 using(id) set t1.val = 19'

This makes the option "super_read_only" unusable in production.

Revision history for this message
Rick Pizzi (pizzi) wrote :

This is still present in latest version - 5.6.25-73.1-log

Revision history for this message
monty solomon (monty+launchpad) wrote :

The failure reported in bug 1500638 happened again

               Last_SQL_Error: Error 'The MySQL server is running with the --read-only (super) option so it cannot execute this statement' on query. Default database: 'BroadcastApi'. Query: 'UPDATE channels c, accounts a SET c.active = false WHERE c.active = true and a.deleted = true and c.accountGuid = a.accountGuid'

     Last_SQL_Error_Timestamp: 150929 10:10:25

tags: added: i63567
Revision history for this message
Yura Sorokin (yura-sorokin) wrote :

The problem was introduced in commit 78e1f5d "Cherry picking patch for BUG#37051" (https://github.com/percona/percona-server/commit/78e1f5d5232a471e21234efb55a683161b72d40a)
Original BUG#37051 "Replication rules not evaluated correctly"
(https://bugs.mysql.com/bug.php?id=37051).

In "sql_parse.cc" in "mysql_execute_command()", "case SQLCOM_UPDATE_MULTI:" branch
*****************************************
#ifdef HAVE_REPLICATION
    /* Check slave filtering rules */
    if (unlikely(thd->slave_thread && !have_table_map_for_update))
    {
      if (all_tables_not_ok(thd, all_tables))
      {
        if (res!= 0)
        {
          res= 0; /* don't care of prev failure */
          thd->clear_error(); /* filters are of highest prior */
        }
        /* we warn the slave SQL thread */
        my_error(ER_SLAVE_IGNORED_TABLE, MYF(0));
        break;
      }
      if (res)
        break;
    }
    else
    {
#endif /* HAVE_REPLICATION */
      if (res)
        break;
      bool enforce_ro = true;
      if (!opt_super_readonly)
        enforce_ro = !(thd->security_ctx->master_access & SUPER_ACL);
      if (opt_readonly &&
   enforce_ro &&
   some_non_temp_table_to_be_updated(thd, all_tables))
      {
        my_error(ER_OPTION_PREVENTS_STATEMENT, MYF(0),
                 opt_super_readonly ? "--read-only (super)" : "--read-only");
      }
#ifdef HAVE_REPLICATION
    } /* unlikely */
#endif
*****************************************

"!have_table_map_for_update" additional condition was put in the wrong place. It should have been added inside the "slave" ("if (unlikely(thd->slave_thread))") branch.

Revision history for this message
Yura Sorokin (yura-sorokin) wrote :

Although the original bug was fixed in 5.5 (or even 5.1) and therefore a new bug was introduced in the same version, this problem did not reveal itself until "super_read_only" functionality was introduced (in 5.6).

Despite the fact that the execution was going through the wrong path ("else" block of the "if (unlikely(thd->slave_thread))"), the error was never reported.

Here is this "else" block from 5.5
*****************************************
    else
    {
#endif /* HAVE_REPLICATION */
      if (res)
        break;
      if (opt_readonly &&
   !(thd->security_ctx->master_access & SUPER_ACL) &&
   some_non_temp_table_to_be_updated(thd, all_tables))
      {
 my_error(ER_OPTION_PREVENTS_STATEMENT, MYF(0), "--read-only");
 break;
      }
#ifdef HAVE_REPLICATION
    } /* unlikely */
#endif

*****************************************
Because for slave threads "thd->security_ctx->master_access" was always set in "Security_context::skip_grants()" and therefore always included "SUPER_ACL" flag, "my_error()" was never called.

Revision history for this message
Yura Sorokin (yura-sorokin) wrote :

Original MySQL Server 5.5 and 5.6 still go through the wrong branch, but for the same reason (no "super_read_only" mode) it's not a problem there.

In MySQL Server 5.7 "case SQLCOM_UPDATE_MULTI:" code branch was seriously reworked and the problem does not appear this this version.

Revision history for this message
Yura Sorokin (yura-sorokin) wrote :

Percona Server 5.7 does not have this problem either.

Revision history for this message
Yura Sorokin (yura-sorokin) wrote :
Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :

Removing 5.7 milestone and "Fix Released" status as 5.7 was not affected by the bug and the patch only upmerged the testcase.

Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PS-891

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.