Sunday, 11 February 2018

Demystifying Cassandra’s broadcast_address

When configuring Apache Cassandra to work in a new environment or with a new application or service, we sometimes find ourselves asking “What’s the difference between broadcast_address and broadcast_rpc_address again?”
The short answer: broadcast_address relates to gossip and node-to-node communication, whereas broadcast_rpc_address is associated with client connections. Read on for more details.
The Cassandra configuration file has a few interdependent properties related to communication, which can take a bit of concentration to make sense of. Here is a (hopefully) easy to understand explanation.

Node to node communication (i.e. gossip)

Cassandra will bind to the listen_address or listen_interface and listen on the storage_port or ssl_storage_port for gossip. In most cases, these properties may be omitted, resulting in Cassandra binding to the hostname’s IP address (Cassandra uses InetAddress.getLocalHost()). Note: setting listen_address to “localhost” results in Cassandra binding to the loopback interface (not recommended, as it only works for a 1-node cluster).
broadcast_address is reported to nodes for peer discovery. Topologies that span separate networks need this set to a public address. If this property is omitted, the listen_address will be broadcast to nodes. Note: nodes can be configured to gossip via the local network and use public addresses for nodes outside their local network by setting prefer_local=true in cassandra-rackdc.properties and using certain endpoint snitches (such as GossipingPropertyFileSnitch or Ec2MultiRegionSnitch).
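The fallback chain described above can be sketched in a few lines of Python. This is only a model of the rules as stated here, not Cassandra's own code: the function name gossip_addresses and the example addresses are made up for illustration, and the hostname lookup merely approximates what InetAddress.getLocalHost() does on the JVM.

```python
import socket
from typing import Optional


def gossip_addresses(listen_address: Optional[str] = None,
                     broadcast_address: Optional[str] = None) -> dict:
    """Model which address a node binds to and which it advertises to peers.

    Hypothetical helper for illustration; not part of Cassandra.
    """
    # listen_address omitted: fall back to the hostname's IP address
    # (an approximation of InetAddress.getLocalHost()).
    bind = listen_address or socket.gethostbyname(socket.gethostname())
    # broadcast_address omitted: the listen address is advertised instead.
    advertised = broadcast_address or bind
    return {"bind": bind, "advertised": advertised}


# Single-network cluster: peers are told the same address the node binds to.
print(gossip_addresses(listen_address="10.0.0.5"))

# Multi-network cluster: bind privately, advertise a (made-up) public address.
print(gossip_addresses(listen_address="10.0.0.5",
                       broadcast_address="203.0.113.7"))
```

The model leaves out listen_interface and the prefer_local behaviour of the snitches, which add a second, private route on top of the advertised address.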

Client to node communication

rpc_address, rpc_interface and broadcast_rpc_address

By “client” I mean Cassandra drivers and cqlsh. The drivers may use the Thrift transport or the Native transport (CQL binary protocol). Cqlsh uses the Native transport.
Cassandra will bind to the rpc_address or rpc_interface and listen on rpc_port and native_transport_port for client connections. If these properties are omitted, Cassandra will bind to the hostname’s IP address. In that case the address must be passed explicitly to a locally running cqlsh, because cqlsh with no host argument is equivalent to cqlsh <loopback address>.
broadcast_rpc_address is a property available in Cassandra 2.1 and above. It is reported to clients during cluster discovery and as cluster metadata. It is useful for clients outside the cluster’s local network. This property is typically either:
  • the public address if most clients are outside the cluster’s local network
  • the local network address if most clients are in the cluster’s local network
If this property is omitted, rpc_address will be reported to clients.
Note 1: If there is a mix of clients inside and outside the local network, use an AddressTranslator policy to compensate for unreachable addresses (only available for the Java and Python drivers at the time of writing).
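To make the idea of address translation concrete, here is a minimal, driver-independent sketch in Python. The class name StaticAddressTranslator and the addresses are hypothetical. The single translate(addr) method is modelled on the hook that, to my knowledge, the Python driver's AddressTranslator policy exposes; treat that as an assumption and check the driver documentation before wiring anything like this into a real Cluster.

```python
class StaticAddressTranslator:
    """Map addresses a client cannot reach to addresses it can.

    Hypothetical stand-in for a driver-side address translation policy.
    """

    def __init__(self, address_map):
        # Keys are addresses reported by the cluster (broadcast_rpc_address);
        # values are the addresses this particular client should dial.
        self.address_map = dict(address_map)

    def translate(self, addr):
        # Addresses without a mapping pass through unchanged.
        return self.address_map.get(addr, addr)


# A client inside the local network, talking to a cluster that reports
# public addresses (both addresses are made up for this example).
translator = StaticAddressTranslator({"203.0.113.7": "10.0.0.5"})
print(translator.translate("203.0.113.7"))   # mapped to the private address
print(translator.translate("198.51.100.9"))  # no mapping, returned as-is
```

A static map is the simplest possible choice; a real deployment might instead derive the mapping from DNS or cloud metadata.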
Note 2: rpc_address may be set to 0.0.0.0. In this case, Cassandra binds to all available interfaces, including loopback, which is used by cqlsh when no host is specified. But 0.0.0.0 is not routable, so Cassandra will use a different property to determine the address to broadcast to clients:
  • for Cassandra 2.1 and later: broadcast_rpc_address must be set and will be reported to clients.
  • for Cassandra prior to 2.1: broadcast_address (or listen_address if omitted) will be reported to clients.
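The rules for which address ends up being reported to clients can be written down as a small decision function. This is a sketch of the behaviour described above, not Cassandra's implementation: the function name, the version tuple, and the error messages are all invented for illustration, and the case where rpc_address is omitted entirely (hostname lookup) is left out.

```python
from typing import Optional, Tuple


def address_reported_to_clients(version: Tuple[int, int],
                                rpc_address: str,
                                broadcast_rpc_address: Optional[str] = None,
                                broadcast_address: Optional[str] = None,
                                listen_address: Optional[str] = None) -> str:
    """Return the address a node would report to clients (illustrative model)."""
    wildcard = rpc_address == "0.0.0.0"

    if version >= (2, 1):
        # broadcast_rpc_address wins whenever it is set.
        if broadcast_rpc_address:
            return broadcast_rpc_address
        # 0.0.0.0 is not routable, so it cannot be reported as-is.
        if wildcard:
            raise ValueError(
                "broadcast_rpc_address must be set when rpc_address is 0.0.0.0")
        return rpc_address

    # Before 2.1 there is no broadcast_rpc_address property.
    if wildcard:
        fallback = broadcast_address or listen_address
        if fallback is None:
            raise ValueError("no routable address available to report")
        return fallback
    return rpc_address


# 2.1 and later, bound to all interfaces, public address reported.
print(address_reported_to_clients((2, 1), "0.0.0.0",
                                  broadcast_rpc_address="203.0.113.7"))

# 2.0, bound to all interfaces, falls back to listen_address.
print(address_reported_to_clients((2, 0), "0.0.0.0",
                                  listen_address="10.0.0.5"))
```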
Summary:
| Cassandra Version | Purpose | Properties | Typical Setting |
| --- | --- | --- | --- |
| All | gossip | listen_address or listen_interface (with storage_port or ssl_storage_port) | Omit to bind to InetAddress.getLocalHost() |
| All | peer discovery (within the cluster) | broadcast_address else listen_address | public address |
| All | client requests (CQL and Thrift) | rpc_address or rpc_interface (with rpc_port and native_transport_port) | Omit to bind to InetAddress.getLocalHost() |
| 2.1 and later | cluster discovery / metadata (to the client) | broadcast_rpc_address else rpc_address | Omit to broadcast InetAddress.getLocalHost() |
| 2.0 and prior | cluster discovery / metadata (to the client) | rpc_address or broadcast_address if 0.0.0.0 | Omit to broadcast InetAddress.getLocalHost() |
