Understanding how to manage MySQL Cluster requires a knowledge of four essential processes. In the next few sections of this chapter, we cover the roles played by these processes in a cluster, how to use them, and what startup options are available for each of them:
mysqld is the traditional MySQL server process. To be used with MySQL Cluster, mysqld needs to be built with support for the NDB Cluster storage engine, as it is in the precompiled -max binaries available from http://dev.mysql.com/downloads/. If you build MySQL from source, you must invoke configure with the --with-ndbcluster option to enable NDB Cluster storage engine support.
If the mysqld binary has been built with Cluster support, the NDB Cluster storage engine is still disabled by default. You can use either of two possible options to enable this engine:
- Use --ndbcluster as a startup option on the command line when starting mysqld.
- Insert a line containing ndbcluster in the [mysqld] section of your my.cnf file.
An easy way to verify that your server is running with the NDB Cluster storage engine enabled is to issue the SHOW ENGINES statement in the MySQL Monitor (mysql). You should see the value YES as the Support value in the row for NDBCLUSTER. If you see NO in this row or if there is no such row displayed in the output, you are not running an NDB-enabled version of MySQL. If you see DISABLED in this row, you need to enable it in either one of the two ways just described.
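The check just described can also be scripted. The following is a minimal sketch rather than an official tool: it assumes the tab-separated output that the mysql client produces in batch mode, with the engine name in the first column and the Support value in the second. The function name ndb_support_status and the MISSING keyword are invented for this illustration.

```shell
# Hypothetical helper (not part of MySQL): reads SHOW ENGINES output
# (tab-separated, one engine per line) on standard input and prints the
# Support value found in the NDBCLUSTER row.
# Prints MISSING if the output contains no such row.
ndb_support_status() {
    awk -F '\t' '
        tolower($1) == "ndbcluster" { print $2; found = 1 }
        END { if (!found) print "MISSING" }
    '
}
```

It could be fed from the client like this:
shell> mysql --batch -e "SHOW ENGINES" | ndb_support_status
A result of YES means the engine is enabled. DISABLED, NO, and MISSING correspond to the other cases described above.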
To read cluster configuration data, the MySQL server requires at a minimum three pieces of information:
- The MySQL server's own cluster node ID
- The hostname or IP address for the management server (MGM node)
- The number of the TCP/IP port on which it can connect to the management server
Node IDs can be allocated dynamically, so it is not strictly necessary to specify them explicitly.
The mysqld parameter ndb-connectstring is used to specify the connectstring either on the command line when starting mysqld or in my.cnf. The connectstring contains the hostname or IP address where the management server can be found, as well as the TCP/IP port it uses.
In the following example, ndb_mgmd.mysql.com is the host where the management server resides, and the management server listens for cluster messages on port 1186:
shell> mysqld --ndb-connectstring=ndb_mgmd.mysql.com:1186
See Section 15.4.4.2, “The Cluster connectstring”, for more information on connectstrings.
Given this information, the MySQL server will be a full participant in the cluster. (We sometimes refer to a mysqld process running in this manner as an SQL node.) It will be fully aware of all cluster data nodes as well as their status, and will establish connections to all data nodes. In this case, it is able to use any data node as a transaction coordinator and to read and update node data.
You can see in the mysql client whether a MySQL server is connected to the cluster using SHOW PROCESSLIST. If the MySQL server is connected to the cluster, and you have the PROCESS privilege, then the first row of the output is as shown here:
mysql> SHOW PROCESSLIST \G
*************************** 1. row ***************************
     Id: 1
   User: system user
   Host:
     db:
Command: Daemon
   Time: 1
  State: Waiting for event from ndbcluster
   Info: NULL
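A script can look for the same State value. The sketch below is only an illustration: the function name sql_node_connected is invented, and it simply searches the \G output for the State text shown above, so it is only as reliable as that text is stable between versions.

```shell
# Hypothetical helper (not part of MySQL): reads the output of
# SHOW PROCESSLIST \G on standard input and returns exit status 0 if any row
# reports the State text that indicates a connection to the cluster.
sql_node_connected() {
    grep -q 'State: Waiting for event from ndbcluster'
}
```

For example:
shell> mysql -e "SHOW PROCESSLIST\G" | sql_node_connected && echo connected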
ndbd is the process that is used to handle all the data in tables using the NDB Cluster storage engine. This is the process that empowers a data node to accomplish distributed transaction handling, node recovery, checkpointing to disk, online backup, and related tasks.
In a MySQL Cluster, a set of ndbd processes cooperate in handling data. These processes can execute on the same computer (host) or on different computers. The correspondence between data nodes and Cluster hosts is completely configurable.
ndbd generates a set of log files which are placed in the directory specified by DataDir in the config.ini configuration file. These log files are listed below. Note that node_id represents the node's unique identifier. For example, ndb_2_error.log is the error log generated by the data node whose node ID is 2.
- ndb_node_id_error.log is a file containing records of all crashes which the referenced ndbd process has encountered. Each record in this file contains a brief error string and a reference to a trace file for this crash. A typical entry in this file might appear as shown here:
  Date/Time: Saturday 30 July 2004 - 00:20:01
  Type of error: error
  Message: Internal program error (failed ndbrequire)
  Fault ID: 2341
  Problem data: DbtupFixAlloc.cpp
  Object of reference: DBTUP (Line: 173)
  ProgramName: NDB Kernel
  ProcessID: 14909
  TraceFile: ndb_2_trace.log.2
  ***EOM***
  Note: It is very important to be aware that the last entry in the error log file is not necessarily the newest one (nor is it likely to be). Entries in the error log are not listed in chronological order; rather, they correspond to the order of the trace files as determined in the ndb_node_id_trace.log.next file (see below). Error log entries are thus overwritten in a cyclical and not sequential fashion.
- ndb_node_id_trace.log.trace_id is a trace file describing exactly what happened just before the error occurred. This information is useful for analysis by the MySQL Cluster development team.
  It is possible to configure the number of these trace files that will be created before old files are overwritten. trace_id is a number which is incremented for each successive trace file.
- ndb_node_id_trace.log.next is the file that keeps track of the next trace file number to be assigned.
- ndb_node_id_out.log is a file containing any data output by the ndbd process. This file is created only if ndbd is started as a daemon.
- ndb_node_id.pid is a file containing the process ID of the ndbd process when started as a daemon. It also functions as a lock file to avoid the starting of nodes with the same identifier.
- ndb_node_id_signal.log is a file used only in debug versions of ndbd, where it is possible to trace all incoming, outgoing, and internal messages with their data in the ndbd process.
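Because error log entries are not in chronological order, it can help to list each entry's trace file next to its timestamp. The following sketch is an illustration only: the function name list_error_entries is invented, and it assumes that each field of an entry (Date/Time, TraceFile, and so on) begins on its own line, as in the sample entry above.

```shell
# Hypothetical helper (not part of MySQL Cluster): reads an
# ndb_<node_id>_error.log file on standard input and prints one line per
# entry: the trace file name, a tab, then the entry's Date/Time value.
# Assumption: every field of an entry starts on its own line.
list_error_entries() {
    awk '
        /^Date\/Time:/ { sub(/^Date\/Time:[ \t]*/, ""); when = $0 }
        /^TraceFile:/  { sub(/^TraceFile:[ \t]*/, ""); print $0 "\t" when }
    '
}
```

For example:
shell> list_error_entries < ndb_2_error.log
The output is in file order, not time order, so the Date/Time column still has to be read to find the most recent crash.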
It is recommended not to use a directory mounted through NFS because in some environments this can cause problems whereby the lock on the .pid file remains in effect even after the process has terminated.
To start ndbd, it may also be necessary to specify the hostname of the management server and the port on which it is listening. Optionally, one may also specify the node ID that the process is to use.
shell> ndbd --connect-string="nodeid=2;host=ndb_mgmd.mysql.com:1186"
See Section 15.4.4.2, “The Cluster connectstring”, for additional information about this issue. Section 15.6.5, “Command Options for MySQL Cluster Processes”, describes other options for ndbd.
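When several nodes are started from scripts, the connectstring can be assembled from its parts. The sketch below is illustrative only: the function name make_connectstring is invented, it produces only the two connectstring forms that appear in this chapter, and it assumes 1186 as the port when none is given.

```shell
# Hypothetical helper (not part of MySQL Cluster): prints a connectstring.
#   $1  management server hostname or IP address (required)
#   $2  management server port (optional; 1186 is assumed if omitted)
#   $3  node ID (optional)
make_connectstring() {
    host=$1
    port=${2:-1186}
    node_id=${3:-}
    if [ -n "$node_id" ]; then
        # Form with an explicit node ID, as in the ndbd example above.
        printf 'nodeid=%s;host=%s:%s\n' "$node_id" "$host" "$port"
    else
        # Plain host:port form, as in the mysqld example earlier.
        printf '%s:%s\n' "$host" "$port"
    fi
}
```

For example:
shell> ndbd --connect-string="$(make_connectstring ndb_mgmd.mysql.com 1186 2)"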
When ndbd starts, it actually initiates two processes. The first of these is called the “angel process”; its only job is to discover when the execution process has been completed, and then to restart the ndbd process if it is configured to do so. Thus, if you attempt to kill ndbd via the Unix kill command, it is necessary to kill both processes, beginning with the angel process. The preferred method of terminating an ndbd process is to use the management client and stop the process from there.
The execution process uses one thread for reading, writing, and scanning data, as well as all other activities. This thread is implemented asynchronously so that it can easily handle thousands of concurrent activities. In addition, a watch-dog thread supervises the execution thread to make sure that it does not hang in an endless loop. A pool of threads handles file I/O, with each thread able to handle one open file. Threads can also be used for transporter connections by the transporters in the ndbd process. In a system performing a large number of operations, including updates, the ndbd process can consume up to 2 CPUs if permitted to do so. For a machine with many CPUs it is recommended to use several ndbd processes which belong to different node groups.
The management server is the process that reads the cluster configuration file and distributes this information to all nodes in the cluster that request it. It also maintains a log of cluster activities. Management clients can connect to the management server and check the cluster's status.
It is not strictly necessary to specify a connectstring when starting the management server. However, if you are using more than one management server, a connectstring should be provided and each node in the cluster should specify its node ID explicitly.
See Section 15.4.4.2, “The Cluster connectstring”, for information about using connectstrings. Section 15.6.5, “Command Options for MySQL Cluster Processes”, describes other options for ndb_mgmd.
The following files are created or used by ndb_mgmd in its starting directory, and are placed in the DataDir as specified in the config.ini configuration file. In the list that follows, node_id is the unique node identifier.
- config.ini is the configuration file for the cluster as a whole. This file is created by the user and read by the management server. Section 15.4, “MySQL Cluster Configuration”, discusses how to set up this file.
- ndb_node_id_cluster.log is the cluster events log file. Examples of such events include checkpoint startup and completion, node startup events, node failures, and levels of memory usage. A complete listing of cluster events with descriptions may be found in Section 15.7, “Management of MySQL Cluster”.
  When the size of the cluster log reaches one million bytes, the file is renamed to ndb_node_id_cluster.log.seq_id, where seq_id is the sequence number of the cluster log file. (For example: If files with the sequence numbers 1, 2, and 3 already exist, the next log file is named using the number 4.)
- ndb_node_id_out.log is the file used for stdout and stderr when running the management server as a daemon.
- ndb_node_id.pid is the process ID file used when running the management server as a daemon.
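The naming rule for rotated cluster logs can be expressed as a short script, which may be handy when archiving these files. This is a sketch of the rule as described above, not the management server's own code: the function name next_cluster_log_name is invented, and it assumes that numbering starts at 1 when no rotated file exists yet.

```shell
# Hypothetical helper (not part of MySQL Cluster): prints the name that the
# next rotated cluster log would get, based on the rotated files already
# present in a directory.
#   $1  directory holding the cluster log files (the DataDir)
#   $2  node ID of the management server
next_cluster_log_name() {
    dir=$1
    node_id=$2
    max=0
    for f in "$dir"/ndb_"$node_id"_cluster.log.*; do
        [ -e "$f" ] || continue            # the pattern matched nothing
        seq=${f##*.}                       # text after the last dot
        case $seq in
            ''|*[!0-9]*) continue ;;       # ignore non-numeric suffixes
        esac
        if [ "$seq" -gt "$max" ]; then
            max=$seq
        fi
    done
    # Assumption: the first rotated file is numbered 1.
    echo "ndb_${node_id}_cluster.log.$((max + 1))"
}
```

With files numbered 1, 2, and 3 present for node 1, the function prints ndb_1_cluster.log.4, matching the example in the list above.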
The management client process is actually not needed to run the cluster. Its value lies in providing a set of commands for checking the cluster's status, starting backups, and performing other administrative functions. The management client accesses the management server using a C API. Advanced users can also employ this API for programming dedicated management processes to perform tasks similar to those performed by ndb_mgm.
To start the management client, it is necessary to supply the hostname and port number of the management server:
shell> ndb_mgm [host_name [port_num]]
For example:
shell> ndb_mgm ndb_mgmd.mysql.com 1186
The default hostname and port number are localhost and 1186, respectively.
Additional information about using ndb_mgm can be found in Section 15.6.5.4, “Command Options for ndb_mgm”, and Section 15.7.2, “Commands in the Management Client”.
All MySQL Cluster executables (except for mysqld) take the options described in this section. Users of earlier MySQL Cluster versions should note that some of these options have been changed from those in MySQL 4.1 Cluster to make them consistent with one another as well as with mysqld. You can use the --help option to view a list of supported options.
The following sections describe options specific to individual NDB programs.
- --help, --usage, -?
  Prints a short list with descriptions of the available command options.
- --connect-string=connect_string, -c connect_string
  connect_string sets the connectstring to the management server as a command option.
  shell> ndbd --connect-string="nodeid=2;host=ndb_mgmd.mysql.com:1186"
- --debug[=options]
  This option can only be used for versions compiled with debugging enabled. It is used to enable output from debug calls in the same manner as for the mysqld process.
- --execute=command, -e command
  Can be used to send a command to a Cluster executable from the system shell. For example, either of the following:
  shell> ndb_mgm -e show
  or
  shell> ndb_mgm --execute="SHOW"
  is equivalent to
  NDB> SHOW;
  This is analogous to how the --execute or -e option works with the mysql command-line client. See Section 4.3.1, “Using Options on the Command Line”.
- --version, -V
  Prints the version number of the ndbd process. The version number is the MySQL Cluster version number. The version number is relevant because not all versions can be used together, and the MySQL Cluster startup process verifies that the versions of the binaries being used can co-exist in the same cluster. This is also important when performing an online (rolling) software upgrade or downgrade of MySQL Cluster. (See Section 15.5.1, “Performing a Rolling Restart of the Cluster”.)
- --ndb-connectstring=connect_string
  When using the NDB Cluster storage engine, this option specifies the management server that distributes cluster configuration data.
- --ndbcluster
  The NDB Cluster storage engine is necessary for using MySQL Cluster. If a mysqld binary includes support for the NDB Cluster storage engine, the engine is disabled by default. Use the --ndbcluster option to enable it. Use --skip-ndbcluster to explicitly disable the engine.
For options common to NDB programs, see Section 15.6.5, “Command Options for MySQL Cluster Processes”.
- --daemon, -d
  Instructs ndbd to execute as a daemon process. This is the default behavior. --nodaemon can be used to not start the process as a daemon.
- --initial
  Instructs ndbd to perform an initial start. An initial start erases any files created for recovery purposes by earlier instances of ndbd. It also re-creates recovery log files. Note that on some operating systems this process can take a substantial amount of time.
  An --initial start is to be used only the very first time that the ndbd process is started because it removes all files from the Cluster filesystem and re-creates all REDO log files. The exceptions to this rule are:
  - When performing a software upgrade which has changed the contents of any files.
  - When restarting the node with a new version of ndbd.
  - As a measure of last resort when for some reason the node restart or system restart repeatedly fails. In this case, be aware that this node can no longer be used to restore data due to the destruction of the datafiles.
  This option does not affect any backup files that have already been created by the affected node.
- --initial-start
  This option is used when performing a partial initial start of the cluster. Each node should be started with this option, as well as --nowait-nodes.
  For example, suppose you have a 4-node cluster whose data nodes have the IDs 2, 3, 4, and 5, and you wish to perform a partial initial start using only nodes 2, 4, and 5 (that is, omitting node 3):
  shell> ndbd --ndb-nodeid=2 --nowait-nodes=3 --initial-start
  shell> ndbd --ndb-nodeid=4 --nowait-nodes=3 --initial-start
  shell> ndbd --ndb-nodeid=5 --nowait-nodes=3 --initial-start
  This option was added in MySQL 5.0.21.
- --nowait-nodes=node_id_1[, node_id_2[, ...]]
  This option takes a list of data nodes for which the cluster will not wait before starting.
  This can be used to start the cluster in a partitioned state. For example, to start the cluster with only half of the data nodes running in a 4-node cluster whose data nodes have the IDs 2, 3, 4, and 5, you can start each ndbd process with --nowait-nodes=3,5. In this case, the cluster starts as soon as nodes 2 and 4 connect, and does not wait StartPartitionedTimeout milliseconds for nodes 3 and 5 to connect as it would otherwise.
  If you wanted to start up the same cluster as in the previous example without one ndbd (say, for example, that the host machine for node 3 has suffered a hardware failure), then start nodes 2, 4, and 5 with --nowait-nodes=3. Then the cluster will start as soon as nodes 2, 4, and 5 connect and will not wait for node 3 to start.
  This option was added in MySQL 5.0.21.
- --nodaemon
  Instructs ndbd not to start as a daemon process. This is useful when ndbd is being debugged and you want output to be redirected to the screen.
- --nostart
  Instructs ndbd not to start automatically. When this option is used, ndbd connects to the management server, obtains configuration data from it, and initializes communication objects. However, it does not actually start the execution engine until specifically requested to do so by the management server. This can be accomplished by issuing the proper command to the management client.
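For a partial initial start as described under --initial-start and --nowait-nodes, every participating node must be given the same list of nodes to skip. The sketch below only prints the command lines, one per participating node, so that they can be reviewed and then run on the appropriate hosts. The function name partial_start_commands is invented for this illustration, and it assumes that each node's ID is passed with the --ndb-nodeid option.

```shell
# Hypothetical helper (not part of MySQL Cluster): prints, without running
# them, the ndbd commands for a partial initial start.
#   $1  all data node IDs, separated by spaces        (for example "2 3 4 5")
#   $2  node IDs to leave out, separated by commas    (for example "3" or "3,5")
partial_start_commands() {
    all_nodes=$1
    skip_nodes=$2
    for id in $all_nodes; do
        # Omit any node that appears in the comma-separated skip list.
        case ",$skip_nodes," in
            *",$id,"*) continue ;;
        esac
        echo "ndbd --ndb-nodeid=$id --nowait-nodes=$skip_nodes --initial-start"
    done
}
```

Calling partial_start_commands "2 3 4 5" "3" prints one command each for nodes 2, 4, and 5. Printing rather than executing is deliberate, because each command has to be issued on the host where that data node runs.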
For options common to NDB programs, see Section 15.6.5, “Command Options for MySQL Cluster Processes”.
- --config-file=filename, -f filename
  Instructs the management server as to which file it should use for its configuration file. This option must be specified. The filename defaults to config.ini.
  Note: This option also can be given as -c file_name, but this shortcut is obsolete and should not be used in new installations.
- --daemon, -d
  Instructs ndb_mgmd to start as a daemon process. This is the default behavior.
- --nodaemon
  Instructs ndb_mgmd not to start as a daemon process.
For options common to NDB programs, see Section 15.6.5, “Command Options for MySQL Cluster Processes”.
- --try-reconnect=number
  If the connection to the management server is broken, the node tries to reconnect to it every 5 seconds until it succeeds. By using this option, it is possible to limit the number of attempts to number before giving up and reporting an error instead.