Mysql cluster & replication.
A typical MySQL cluster setup involves 3 components in at least this configuration:
- 1 management (ndb_mgmd) node.
- Management nodes contain the cluster configuration.
- A management node is only needed to connect new storage and query nodes to the cluster and do some arbitration.
- Existing storage and query nodes continue to operate normally if the management node goes down.
- Therefore, it's relatively safe to have only 1 management node running on a very low spec machine (configuring 2 management nodes is possible but is slightly more complex and less dynamic).
- Interfacing with a management node is done via an ndb_mgm utility.
- Management nodes are configured using config.ini.
- My setup here involves 1 management node.
- 2 storage (ndbd) nodes.
- You do not interface directly with those nodes, instead you go through SQL nodes, described next.
- It is possible to have more storage nodes than SQL nodes.
- It is possible to host storage nodes on the same machines as SQL nodes.
- It is possible, although not recommended, to host storage nodes on the same machines as management nodes.
- Storage nodes will split up the data between themselves automatically. For example, if you want to store each row on 2 machines for redundancy (NoOfReplicas=2) and you have 6 storage nodes, your data is going to be split up into 3 distinct non-intersecting chunks, called node groups.
- Given a correctly formulated query, it is possible to make MySQL scan all 3 chunks in parallel, thus returning the result set quicker.
- Node groups are formed implicitly, meaning you cannot assign a storage node to a specific node group. What you can do, however, is manipulate the IDs of the nodes in such a way that the servers you want will get assigned to the node groups you want. The nodes having consecutive IDs get assigned to the same node group until there are NoOfReplicas nodes in a node group, at which point a node group starts.
- Storage nodes are configured using /etc/my.cnf. They are also affected by settings in config.ini on the management node.
- My setup here involves 4 storage nodes.
- 2 query (SQL) nodes.
- SQL nodes are regular mysqld processes that access data in the cluster. You guessed it right – the data sits in storage nodes, and SQL nodes just serve as gateways to them.
- Your application will connect to these SQL node IPs and will have no knowledge of storage nodes.
- It is possible to have more SQL nodes than storage nodes.
- It is possible to host SQL nodes on the same machines as storage nodes.
- It is possible, although not recommended, to host SQL nodes on the same machines as management nodes.
- SQL nodes are configured using /etc/my.cnf. They are also affected by settings in config.ini on the management node.
- My setup here involves 4 SQL nodes.