Bootstrapping the Cluster

Bootstrapping the Cluster

This section explains how to start a Cluster for the first time depending on the consensus choice (crdt or raft) made during initialization.

The first start of the Cluster is the most critical step during the  lifetime. We must ensure that peers are able to contact each other  (connectivity) and discard common configuration errors (like using  different value for secret).

Starting a cluster peer is as easy as running:

$ ipfs-cluster-service daemon

BUT, unlike the IPFS daemon, which by default  connects to the public IPFS network and can discover other peers in it  by first connecting to a well known list of available bootstrappers, a  Cluster peer runs on a private network and does not have any public peer  to bootstrap to.

Make sure the ipfs daemon is running  before starting the Cluster. Although this is not an strict requirement,  it avoids a few error messages.

Thus, when starting IPFS Cluster peers for the first time,  it is important to provide information so that they can discover the  other peers and join the Cluster. Once a peer has successfully started  once, it can be subsequently re-started with the command above. During  shutdown, each peer’s peerstore file will be updated to remember known addresses for other peers.

As we will see below, the first start has slightly different requirements depending on whether you will be running a crdt-based or a raft-based Cluster. You can read more about the differences between the two in the CRDT vs Raft table.

All peers in a Cluster must run in the same mode, either CRDT or Raft.

Bootstrapping the Cluster in CRDT mode

This is the easiest option to start a cluster because the only  requirement a crdt-based peer has to become part of a Cluster is to  contact at least one other peer. This can be achieved in several ways:

  • Pre-filling the peerstore file with addresses for other peers (as we saw in the previous section).
  • Running with the --bootstrap <peer-multiaddress1,peer-multiaddress2> flag. Note that using this flag will automatically trust the given peers. For more information about trust, read the CRDT section.
  • In local networks with mDNS discovery support, peers will autodiscover each other and no additional measures are necessary.

Example 1. Starting the first peer in a CRDT-based Cluster:

 $ ipfs-cluster-service daemon --consensus crdt

Example 2. Starting more peers in a CRDT-based cluster by customizing the peerstore. The given multiaddress corresponds to the first peer:

$ echo "/dns4/cluster1.domain/tcp/9096/ipfs/QmcQ5XvrSQ4DouNkQyQtEoLczbMr6D9bSenGy6WQUCQUBt" >> ~/.ipfs-cluster/peerstore
ipfs-cluster-service daemon --consensus crdt

Example 3. Starting more peers in a CRDT-based cluster using the --bootstrap flag. The given multiaddress corresponds to the first peer:

$ ipfs-cluster-service daemon --consensus crdt --bootstrap /dns4/cluster1.domain/tcp/9096/ipfs/QmcQ5XvrSQ4DouNkQyQtEoLczbMr6D9bSenGy6WQUCQUBt

Bootstrapping the Cluster in Raft mode

In Raft Clusters, the first start of a peer must not only contact a  different peer, but also complete the task of becoming a member of the  Raft Cluster. Therefore the first start of a peer must always use the --bootstrap flag:

Example 1. Starting the first peer in a Raft-based Cluster:

$ ipfs-cluster-service daemon --consensus raft

Example 2. Starting more peers in a Raft-based cluster. The given multiaddress corresponds to the first peer:

$ ipfs-cluster-service daemon --consensus raft --bootstrap /dns4/cluster1.domain/tcp/9096/ipfs/QmcQ5XvrSQ4DouNkQyQtEoLczbMr6D9bSenGy6WQUCQUBt

Example 3. Subsequent starts when the peer already successfully joined a Raft cluster before:

$ ipfs-cluster-service daemon --consensus raft

Verifying a successful bootstrap

After starting your Cluster peers (especially the first time you are  doing so), you should check that things are working correctly:

Check for errors in the logs. A successful peer start will print the “READY” message:

INFO    cluster: ** IPFS Cluster is READY **

Run ipfs-cluster-ctl id to verify the details of your  local Cluster peer. You should be able to see information for the  Cluster peer and for the IPFS daemon it is connected to:

$ ipfs-cluster-ctl id
QmYY1ggjoew5eFrvkenTR3F4uWqtkBkmgfJk8g9Qqcwy51 | peername | Sees 3 other peers
  > Addresses:
    - /ip4/
    - /ip4/
  > IPFS: QmPFJcZfhFCmz1rAoew214h9d7Nv4aseqtCg5sm4fMdeYq
    - /ip4/
    - /ip4/
    - /ip6/::1/tcp/4001/ipfs/QmPFJcZfhFCmz1rAoew214h9d7Nv4aseqtCg5sm4fMdeYq
    - /ip6/::1/tcp/4002/ws/ipfs/QmPFJcZfhFCmz1rAoew214h9d7Nv4aseqtCg5sm4fMdeYq
  • CRDT Cluster peers may take a few minutes to discover additional  peers in the Cluster (depending on how they were bootstrapped), even  after the READY message.
  • Raft Cluster peers will fail to start after a few seconds if they  have not successfully joined or re-joined the Raft Cluster. The READY  message indicates that the peer is fine, even though it may show errors  when contacting other peers that are down.

Common issues

Here are some things that usually go wrong during the first boot. For more instructions on how to debug a cluster peer, see the Troubleshooting guide.

Different secret among Cluster peers

Using different Cluster secrets makes Cluster peers unable to  communicate while causing libp2p to throw unobvious errors on the logs:

dial attempt failed: incoming message was too large

No connectivity between peers

The following error messages indicate connectivity issues. Make sure that the necessary ports are open and that connectivity can be established between peers:

dial attempt failed: context deadline exceeded
dial backoff
dial attempt failed: connection refused

Subscribe to SimpleAsWater

Don’t miss out on the latest issues. Sign up now to get access to the library of members-only issues.