If you're setting up MySQL Cluster on Oracle Linux or another Linux such as Red Hat, CentOS or Fedora, you might have come across some problems getting the nodes to communicate. There are two ways you might bump into problems with network connectivity: the iptables firewall, and SELinux. These security mechanisms can prevent your management, data, and API nodes from communicating with each other, and the symptoms vary depending on which mechanism is getting in the way.
Let's have a look at what you're likely to see.
Data nodes stuck at "starting"
The first thing you might notice is that your data nodes get stuck in the "starting" state. Running show in the management client gives something like this:
[ndbd(NDB)]     2 node(s)
id=2    @192.168.56.211  (mysql-5.6.11 ndb-7.3.2, Nodegroup: 0, Master)
id=3    @192.168.56.212  (mysql-5.6.11 ndb-7.3.2, starting, Nodegroup: 0)
...and it never moves away from starting.
Looking in the management node's log (possibly named something like ndb_1_cluster.log), you might see something like this, repeated many times:
[MgmtSrvr] INFO -- Node 3: Initial start, waiting for 2 to connect, nodes [ all: 2 and 3 connected: 3 no-wait: ]
This sort of thing appears too:
[MgmtSrvr] INFO -- Node 2: Waiting 60 sec for non partitioned start, nodes [ all: 2 and 3 connected: 2 missing: 3 no-wait: ]
Just to make matters worse, the first data node might start, but the second gets stuck at this point:
[ndbd] INFO -- Start phase 0 completed
What might confuse you is that it's obvious that both data nodes have connected to the management node, but they're not talking to each other. The reason for this is grounded in how the various nodes communicate with data nodes.
Data Node Communication
Data nodes communicate with the management node on port 1186. This port is registered with IANA, and is often open on systems that have MySQL installed. Once a data node launches and has been given a node ID, it gets a unique dynamically allocated port on its host, and opens a socket so other hosts can communicate with it. This port is assigned by the operating system, and as a result cannot be predicted by the data node, but is usually in the range 49152–65535.
This is where the problem lies: firewalls routinely block incoming traffic on those high ports, so while each data node can open a connection to the management node, they can't open connections to other data nodes if those nodes are listening on sockets that are blocked by the firewall. MySQL Cluster won't declare the cluster started until all data nodes have connected (unless you use --nowait-nodes, and in general you shouldn't), so they get stuck in "starting" until they can talk to other data nodes.
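If you want to confirm which dynamic port a data node is actually listening on, you can check on the data node host itself. This is just a quick sanity check, not something the cluster requires; ndbmtd is the multi-threaded data node binary, and the exact output depends on your system (run as root so the process names show up):

ss -tlnp | grep -E 'ndbd|ndbmtd'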
iptables and ServerPort
You can assign fixed ports to data nodes by using the ServerPort option in your management node's config.ini file, like this:
[ndbd]
Hostname=datahost1
ServerPort=50501

[ndbd]
Hostname=datahost2
ServerPort=50502
Make sure to use ports that are unique to that host, and are unlikely to be used by other services.
Once you've done that, open those ports in the data node hosts' firewalls. To match the example shown above, open port 50501 on datahost1 and port 50502 on datahost2. For example, the following command, when run as root on datahost1, opens its socket:
iptables -I INPUT -p tcp --dport 50501 -s 192.168.56.0/24 -j ACCEPT
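The equivalent command, run as root on datahost2, opens the port assigned to the second data node:

iptables -I INPUT -p tcp --dport 50502 -s 192.168.56.0/24 -j ACCEPT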
Once you've verified that this works, save your firewall rules with the following command (run as root) on each data node:
service iptables save
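One quick way to double-check that the rule is in place after saving is to list the INPUT chain; the exact output varies by system, but you should see the new ACCEPT entries for your ServerPort values:

iptables -L INPUT -n --line-numbers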
Stuck Management Nodes
Of course, it's also possible that your management node can't listen on port 1186; this is particularly likely on systems where you've installed MySQL Cluster by copying the binaries from a compressed archive (tar.gz or similar). If so, you might need to open that port too by using a similar technique.
For example, if you've configured two management servers, and port 1186 is not open on one of them, you'll see something like this when running show in the management client that cannot see the other:
ERROR Message: The cluster configuration is not yet confirmed by all defined management servers. This management server is still waiting for node 2 to connect.
Could not get configuration
*  4012: Failed to get configuration
*        The cluster configuration is not yet confirmed by all defined management servers. This management server is still waiting for node 2 to connect.
If one management node has port 1186 open and the other closed, the one that can communicate with the other displays this:
[ndb_mgmd(MGM)] 2 node(s)
id=1 (not connected, accepting connect from mgmhost1)
id=2    @192.168.56.216  (mysql-5.6.11 ndb-7.3.2)
iptables and PortNumber
In this situation, the one that appears to be working is actually the one with the blocked port; it can see the other, because the other's port is available. Again, a quick blast of iptables will fix the problem:
iptables -I INPUT -p tcp --dport 1186 -s 192.168.56.0/24 -j ACCEPT
You can, of course, change the management node's port. The relevant option is PortNumber, although bear in mind that 1186 is a registered port, so changing it requires changing not only your iptables configuration, but also your SELinux policy and any application connect strings you use.
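If you did decide to move it, the option lives in the management node's section of config.ini. This is just an illustrative sketch; port 1187 is a made-up example, and mgmhost1 is the hostname used earlier:

[ndb_mgmd]
Hostname=mgmhost1
PortNumber=1187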
MySQL Server Node failing to connect
When you install MySQL from RPM—either MySQL Server or MySQL Cluster—the installer adds SELinux configuration to the machine. The default configuration allows MySQL's server process mysqld to talk to ports 3306 and 1186. These are the ports of the mysqld process and the MySQL Cluster management node respectively.
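You can see the default for yourself by listing the ports SELinux associates with the mysqld port type; this assumes the semanage tool is installed (on these distributions it's provided by the policycoreutils-python package). Run as root:

semanage port -l | grep mysqld_port_t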
However, the default SELinux configuration doesn't know anything about MySQL acting as an API node in a MySQL Cluster, so doesn't configure any permissions that allow the mysqld process to access the data nodes' dynamic ports.
This typically manifests itself as the SQL node appearing to connect, but show showing this:
[mysqld(API)]   1 node(s)
id=5 (not connected, accepting connect from any host)
The MySQL Error log shows this:
[Warning] NDB : Tables not available after 30 seconds. Consider increasing --ndb-wait-setup value
The cluster log shows this:
[MgmtSrvr] INFO  -- Nodeid 5 allocated for API at 192.168.56.215
[MgmtSrvr] INFO  -- Node 5: mysqld --server-id=1
[MgmtSrvr] INFO  -- Node 3: Node 5 Connected
[MgmtSrvr] INFO  -- Node 4: Node 5 Connected
[MgmtSrvr] INFO  -- Node 4: Node 5: API mysql-5.6.11 ndb-7.3.2
[MgmtSrvr] INFO  -- Node 3: Node 5: API mysql-5.6.11 ndb-7.3.2
[MgmtSrvr] ALERT -- Node 4: Node 5 Disconnected
[MgmtSrvr] INFO  -- Node 4: Communication to Node 5 closed
[MgmtSrvr] ALERT -- Node 3: Node 5 Disconnected
Attempts to create a clustered table result in a vague error:
ERROR 157 (HY000): Could not connect to storage engine
Running show warnings gives a more bizarre result:
mysql> show warnings;
+---------+------+----------------------------------------------------------------------------------+
| Level   | Code | Message                                                                          |
+---------+------+----------------------------------------------------------------------------------+
| Warning | 1296 | Got error 4009 'Cluster Failure' from NDB. Could not acquire global schema lock  |
| Warning | 1296 | Got error 4009 'Cluster Failure' from NDB                                        |
| Error   | 157  | Could not connect to storage engine                                              |
| Error   | 1499 | Too many partitions (including subpartitions) were defined                       |
+---------+------+----------------------------------------------------------------------------------+
4 rows in set (0.00 sec)
The MySQL Error log adds lines like this:
[Warning] NDB: Could not acquire global schema lock (4009)Cluster Failure
SELinux and MySQL Cluster SQL Nodes
It all looks pretty disastrous, but you can fix it by tweaking SELinux a bit. I've written about SELinux and MySQL in the past, and the same advice in that article applies here. You could, for example, disable SELinux entirely. But, being a more intelligent reader, you're more likely to want to know how to configure it.
Once you've set ServerPort as shown above, you can easily let SELinux in on that particular secret and permit the mysqld process to access the ports you've assigned by running the following command as root on the MySQL host:
semanage port -a -t mysqld_port_t -p tcp 50501-50524
This permits mysqld to use a range of 24 ports, starting with the two assigned in the configuration example shown earlier. Having done this, your MySQL Server instance can talk to data nodes on their fixed ports.
Other Connections
A lot of what's said in this post also applies to other connections. If you've got a Java application that uses ClusterJPA to talk to your cluster, you'd better make sure it can see the management nodes and that SELinux isn't preventing it from accessing the data nodes.
Similarly, if you've changed your management node's port from the default, ensure you've configured the connection string correctly, with that port, in all your applications. This includes MySQL Server applications, where you'll need to include the revised port number in the ndb-connectstring option.
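As a sketch, assuming you'd moved both management nodes to the made-up port 1187 from the earlier example, and using placeholder hostnames, the relevant lines in the SQL node's my.cnf might look like this:

[mysqld]
ndbcluster
ndb-connectstring=mgmhost1:1187,mgmhost2:1187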
You don't need to tell your applications what hosts or ports your data nodes are at. That's the joy of a centralized configuration, isn't it?