Demystifying Cassandra’s broadcast_address
When configuring Apache Cassandra to work in a new environment or with a new application or service, we sometimes find ourselves asking, “What’s the difference between broadcast_address and broadcast_rpc_address again?”
The difference is that broadcast_address relates to gossip and node-to-node communication, whereas broadcast_rpc_address is associated with client connections. Read on for more details.
The Cassandra configuration file has a few interdependent properties related to communication, which can take a bit of concentration to make sense of. Here is a (hopefully) easy to understand explanation.
Node to node communication (i.e. gossip)
Cassandra will bind to the listen_address or listen_interface and listen on the storage_port or ssl_storage_port for gossip. In most cases, these properties may be omitted, in which case Cassandra binds to the IP address that the hostname resolves to (Cassandra uses InetAddress.getLocalHost()). Note: setting listen_address to “localhost” results in Cassandra binding to the loopback interface (not recommended, as it only works as a 1-node cluster).
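To see what the default binding described above is likely to resolve to on a given machine, a quick check along these lines can help. This is only a rough Python analogue of Java's InetAddress.getLocalHost(), not what Cassandra itself runs; the function names are made up for illustration, and the JVM may choose a different address on hosts with several interfaces.

```python
import ipaddress
import socket


def resolve_local_host():
    """Resolve this machine's hostname to an IPv4 address, or None on failure.

    Rough analogue of InetAddress.getLocalHost(); an approximation only.
    """
    try:
        return socket.gethostbyname(socket.gethostname())
    except OSError:
        # The hostname has no resolvable entry (e.g. missing from /etc/hosts).
        return None


def is_loopback(address):
    """Return True if the address is a loopback address such as 127.0.0.1."""
    return ipaddress.ip_address(address).is_loopback


if __name__ == "__main__":
    addr = resolve_local_host()
    if addr is None:
        print("hostname does not resolve; set listen_address explicitly")
    elif is_loopback(addr):
        print(f"hostname resolves to loopback ({addr}); other nodes cannot reach it")
    else:
        print(f"default bind address would likely be {addr}")
```

If the hostname resolves to a loopback address, omitting listen_address has the same effect as setting it to “localhost”.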
broadcast_address is reported to nodes for peer discovery. Topologies that span separate networks need this set to a public address. If this property is omitted, the listen_address will be broadcast to nodes. Note: Nodes can be configured to gossip via the local network and use public addresses for nodes outside their local network by setting prefer_local=true in cassandra-rackdc.properties and using certain endpoint_snitch values (such as GossipingPropertyFileSnitch or Ec2MultiRegionSnitch).
Client to node communication
rpc_address, rpc_interface and broadcast_rpc_address
By “client” I mean Cassandra drivers and cqlsh. The drivers may use the Thrift transport or the Native transport (CQL binary protocol). Cqlsh uses the Native transport.
Cassandra will bind to the rpc_address or rpc_interface and listen on rpc_port and native_transport_port for client connections. If these properties are omitted, Cassandra will bind to the hostname’s IP address (which would then need to be specified to a locally running cqlsh, because cqlsh with no host argument is equivalent to cqlsh <loopback address>).
broadcast_rpc_address is a property available in Cassandra 2.1 and above. It is reported to clients during cluster discovery and as cluster metadata. It is useful for clients outside the cluster’s local network. This property is typically either:
- the public address if most clients are outside the cluster’s local network
- the local network address if most clients are in the cluster’s local network
If this property is omitted, rpc_address will be reported to clients.
Note 1: If there is a mix of clients inside and outside the local network, use an AddressTranslator policy to compensate for unreachable addresses (only available for the Java and Python drivers at the time of writing; here is a Java example).
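As a rough idea of what such a policy does, here is a minimal standalone sketch in Python. It does not import any driver: the class name and the addresses are hypothetical, and the translate(addr) method only mirrors the shape I understand the Python driver's address translator hook to have, so check the driver documentation before wiring anything like this into a real Cluster.

```python
class StaticAddressTranslator:
    """Map addresses reported by the cluster to addresses this client can reach.

    Hypothetical illustration; not part of any Cassandra driver.
    """

    def __init__(self, address_map):
        # Copy the mapping so later changes by the caller do not affect us.
        self._address_map = dict(address_map)

    def translate(self, addr):
        # Addresses without an entry are returned unchanged.
        return self._address_map.get(addr, addr)


# A client outside the local network maps a private address to a public one.
# Both addresses are made-up examples.
translator = StaticAddressTranslator({"10.0.0.5": "203.0.113.5"})
```

With this mapping, translator.translate("10.0.0.5") returns "203.0.113.5", while an address that is not in the map comes back untouched. Clients inside the local network would simply not install the translator.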
Note 2: rpc_address may be set to 0.0.0.0. In this case, Cassandra binds to all available interfaces, including loopback, which is used by cqlsh when no host is specified. But 0.0.0.0 is not routable, so Cassandra will use a different property to determine the address to broadcast to clients:
- for Cassandra 2.1 and later: broadcast_rpc_address must be set and will be reported to clients.
- for Cassandra prior to 2.1: broadcast_address (or listen_address if omitted) will be reported to clients.
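The fallback rules above can be sketched as a small Python function. This is a model of the behaviour described in this post, not Cassandra's actual code; the function name and the (major, minor) version tuple are assumptions made for illustration, and it expects rpc_address to be already resolved (that is, not omitted).

```python
UNSPECIFIED = "0.0.0.0"


def client_advertised_address(version, rpc_address,
                              broadcast_rpc_address=None,
                              broadcast_address=None,
                              listen_address=None):
    """Return the address reported to clients under the rules described above.

    version is a (major, minor) tuple such as (2, 1).
    """
    if version >= (2, 1):
        # 2.1 and later: broadcast_rpc_address wins when it is set.
        if broadcast_rpc_address:
            return broadcast_rpc_address
        if rpc_address == UNSPECIFIED:
            raise ValueError(
                "broadcast_rpc_address must be set when rpc_address is 0.0.0.0")
        return rpc_address
    # Before 2.1 there is no broadcast_rpc_address property.
    if rpc_address != UNSPECIFIED:
        return rpc_address
    # 0.0.0.0 is not routable, so fall back to the gossip-side addresses.
    return broadcast_address or listen_address
```

For example, client_advertised_address((2, 0), "0.0.0.0", listen_address="10.0.0.5") returns "10.0.0.5", whereas the same call with version (2, 1) raises an error because broadcast_rpc_address is missing.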
Summary:
Cassandra Version | Purpose | Properties | Typical Setting |
---|---|---|---|
All | gossip | listen_address or listen_interface (with storage_port or ssl_storage_port) | Omit to bind to InetAddress.getLocalHost() |
All | peer discovery (within the cluster) | broadcast_address else listen_address | public address |
All | client requests (CQL and Thrift) | rpc_address or rpc_interface (with rpc_port and native_transport_port) | Omit to bind to InetAddress.getLocalHost() |
2.1 and later | cluster discovery / metadata (to the client) | broadcast_rpc_address else rpc_address | Omit to broadcast InetAddress.getLocalHost() |
2.0 and prior | cluster discovery / metadata (to the client) | rpc_address or broadcast_address if 0.0.0.0 | Omit to broadcast InetAddress.getLocalHost() |