Apache ZooKeeper is a service used by a cluster (group of nodes) to coordinate between themselves and maintain shared data with robust synchronization techniques. ZooKeeper is itself a distributed application providing services for writing a distributed application.
The common services provided by ZooKeeper are as follows:
Naming service: Identifying the nodes in a cluster by name. It is similar to DNS, but for nodes.
Configuration management: Providing the latest configuration information of the system to a joining node.
Cluster management: Tracking nodes joining or leaving the cluster and node status in real time.
Leader election: Electing a node as leader for coordination purposes.
Locking and synchronization service: Locking the data while modifying it. This mechanism helps with automatic failure recovery when connecting to other distributed applications like Apache HBase.
Highly reliable data registry: Availability of data even when one or a few nodes are down.
Creating children is similar to creating new znodes. The only difference is that the path of the child znode will have the parent path as well.
create /parent/path/subnode/path /data
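To make the parent/child relationship concrete, here is a minimal sketch of a toy in-memory namespace. `ToyNamespace` and its methods are hypothetical names invented for illustration, not part of the ZooKeeper client. It assumes, as ZooKeeper does, that a child can be created only when its parent path already exists.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of a znode namespace; a hypothetical helper, not the ZooKeeper client API.
class ToyNamespace {
    private final Map<String, String> nodes = new HashMap<>();

    ToyNamespace() {
        nodes.put("/", ""); // the root always exists
    }

    // Returns the parent of an absolute path, e.g. "/app/config" -> "/app".
    static String parentOf(String path) {
        int idx = path.lastIndexOf('/');
        return idx <= 0 ? "/" : path.substring(0, idx);
    }

    // Creates a node only if the path is unused and its parent already exists.
    boolean create(String path, String data) {
        if (nodes.containsKey(path) || !nodes.containsKey(parentOf(path))) {
            return false;
        }
        nodes.put(path, data);
        return true;
    }

    boolean exists(String path) {
        return nodes.containsKey(path);
    }
}
```

In this model, creating /app/config before /app is rejected, while creating /app first and then /app/config succeeds.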
Reliability: Failure of a single system or a few systems does not cause the whole system to fail.
Scalability: Performance can be increased as and when needed by adding more machines, with only a minor change in the configuration of the application and no downtime.
Transparency: Hides the complexity of the system and presents itself as a single entity / application.
The central part of the ZooKeeper API is the ZooKeeper class. Its constructor provides options to connect to the ZooKeeper ensemble, and it has the following methods -
connect - connect to the ZooKeeper ensemble
ZooKeeper(String connectionString, int sessionTimeout, Watcher watcher)
create - create a znode
create(String path, byte[] data, List<ACL> acl, CreateMode createMode)
exists - check whether a znode exists and retrieve its metadata
exists(String path, boolean watch)
getData - get data from a particular znode
getData(String path, Watcher watcher, Stat stat)
setData - set data in a particular znode
setData(String path, byte[] data, int version)
getChildren - get all sub-nodes available in a particular znode
getChildren(String path, Watcher watcher)
delete - delete a particular znode (the call fails if the znode still has children)
delete(String path, int version)
close - close a connection
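The version argument of setData and delete acts as an optimistic concurrency check. The following is a minimal sketch of that idea using a toy in-memory store. `ToyVersionedStore` is a hypothetical class written for illustration and is not the real ZooKeeper class. It assumes that a new znode starts at data version 0, that each successful update increments the version, and that passing -1 matches any version.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of version-checked updates; hypothetical, not the real ZooKeeper class.
class ToyVersionedStore {
    private final Map<String, byte[]> data = new HashMap<>();
    private final Map<String, Integer> versions = new HashMap<>();

    void create(String path, byte[] bytes) {
        data.put(path, bytes);
        versions.put(path, 0); // assumption: a new node starts at data version 0
    }

    // Succeeds only if expectedVersion is -1 (match any) or equals the current version.
    boolean setData(String path, byte[] bytes, int expectedVersion) {
        Integer current = versions.get(path);
        if (current == null) return false;
        if (expectedVersion != -1 && expectedVersion != current) return false;
        data.put(path, bytes);
        versions.put(path, current + 1); // every successful write bumps the version
        return true;
    }

    // Same version check as setData, then removes the node.
    boolean delete(String path, int expectedVersion) {
        Integer current = versions.get(path);
        if (current == null) return false;
        if (expectedVersion != -1 && expectedVersion != current) return false;
        data.remove(path);
        versions.remove(path);
        return true;
    }

    // Returns the current data version, or -1 if the node does not exist.
    int versionOf(String path) {
        return versions.getOrDefault(path, -1);
    }
}
```

A writer that read the node at version 0 and then calls setData with version 0 succeeds only if nobody else has written in between. A second writer still holding version 0 is rejected and has to re-read the node.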
Race condition: Two or more machines trying to perform a particular task, which actually needs to be done only by a single machine at any given time. For example, shared resources should only be modified by a single machine at any given time.
Deadlock: Two or more operations waiting for each other to complete indefinitely.
Inconsistency: Partial failure of data.
ZooKeeper is a distributed coordination service for managing a large set of hosts. Coordinating and managing a service in a distributed environment is a complicated process. ZooKeeper solves this issue with its simple architecture and API. ZooKeeper allows developers to focus on core application logic without worrying about the distributed nature of the application.
The ZooKeeper framework was originally built at “Yahoo!” for accessing their applications in an easy and robust manner. Later, Apache ZooKeeper became a standard coordination service used by Hadoop, HBase, and other distributed frameworks.
The ZooKeeper Command Line Interface (CLI) is used to interact with the ZooKeeper ensemble for development purposes. It is useful for debugging and for experimenting with different options. To perform ZooKeeper CLI operations, first start your ZooKeeper server (“bin/zkServer.sh start”) and then the ZooKeeper client (“bin/zkCli.sh”).
Once the client starts, you can perform the following operations:
Create a znode with the given path. The flag argument specifies whether the created znode will be ephemeral, persistent, or sequential.
If no flags are specified, the znode is created as persistent.
create /path /data
To create a sequential znode, add the -s flag as shown below.
create -s /path /data
To create an ephemeral znode, add the -e flag as shown below.
create -e /path /data
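The flag combinations above can be summarized as a small decision table. The sketch below maps CLI-style flags to a creation mode. `ToyCreateMode` and `CreateFlags` are hypothetical names for illustration. The four constants mirror the PERSISTENT, PERSISTENT_SEQUENTIAL, EPHEMERAL, and EPHEMERAL_SEQUENTIAL modes of the Java client's CreateMode, and the sketch assumes that -s and -e may be combined in one command.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical enum whose names mirror the Java client's CreateMode constants.
enum ToyCreateMode { PERSISTENT, PERSISTENT_SEQUENTIAL, EPHEMERAL, EPHEMERAL_SEQUENTIAL }

// Hypothetical helper that interprets CLI-style create flags.
class CreateFlags {
    // No flags means persistent; -s adds a sequence suffix; -e ties the node to the session.
    static ToyCreateMode modeFor(String... flags) {
        List<String> given = Arrays.asList(flags);
        boolean sequential = given.contains("-s");
        boolean ephemeral = given.contains("-e");
        if (ephemeral) {
            return sequential ? ToyCreateMode.EPHEMERAL_SEQUENTIAL : ToyCreateMode.EPHEMERAL;
        }
        return sequential ? ToyCreateMode.PERSISTENT_SEQUENTIAL : ToyCreateMode.PERSISTENT;
    }
}
```

For example, `CreateFlags.modeFor()` yields PERSISTENT and `CreateFlags.modeFor("-s", "-e")` yields EPHEMERAL_SEQUENTIAL.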
Here are the benefits of using ZooKeeper:
Removes a specified znode and, recursively, all its children. This works only if the znode exists.
Below are some instances where Apache ZooKeeper is being utilized:
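A recursive removal has to delete children before their parents. The sketch below illustrates that ordering on a plain sorted set of paths. `ToyRecursiveDelete` is a hypothetical helper and says nothing about how the CLI implements the command internally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical helper showing children-before-parent deletion order.
class ToyRecursiveDelete {
    // Removes root and every descendant from paths; returns the order of deletion.
    static List<String> deleteAll(TreeSet<String> paths, String root) {
        List<String> doomed = new ArrayList<>();
        if (!paths.contains(root)) {
            return doomed; // nothing happens if the znode does not exist
        }
        for (String p : paths) {
            // The "/" guard keeps siblings such as /ab from matching /a.
            if (p.equals(root) || p.startsWith(root + "/")) {
                doomed.add(p);
            }
        }
        // A child path is always longer than its parent, so longest-first is children-first.
        doomed.sort((a, b) -> Integer.compare(b.length(), a.length()));
        paths.removeAll(doomed);
        return doomed;
    }
}
```

With the paths /a, /a/b, /a/b/c, and /ab, deleting /a removes /a/b/c, then /a/b, then /a, and leaves /ab untouched.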
Znodes are categorized as persistent, sequential, and ephemeral.
Persistent znode - A persistent znode stays alive even after the client that created it is disconnected. By default, all znodes are persistent unless otherwise specified.
Ephemeral znode - Ephemeral znodes are active only as long as the client is alive. When a client gets disconnected from the ZooKeeper ensemble, its ephemeral znodes are deleted automatically. For this reason, ephemeral znodes are not allowed to have children. If an ephemeral znode is deleted, then the next suitable node will fill its position. Ephemeral znodes play an important role in leader election.
Sequential znode - Sequential znodes can be either persistent or ephemeral. When a new znode is created as a sequential znode, ZooKeeper sets the path of the znode by appending a 10-digit sequence number to the original name. For example, if a znode with path /myapp is created as a sequential znode, ZooKeeper will change the path to /myapp0000000001 and set the next sequence number to 0000000002. If two sequential znodes are created concurrently, ZooKeeper never assigns the same number to both. Sequential znodes play an important role in locking and synchronization.
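The naming scheme for sequential znodes can be sketched with a zero-padded counter. `ToySequence` below is a hypothetical class for illustration only. Its counter starts at 1 purely to match the /myapp example above, and the `lowest` helper reflects a common locking and election recipe in which the client holding the smallest sequence number wins.

```java
import java.util.List;

// Hypothetical model of sequential znode naming; not the ZooKeeper implementation.
class ToySequence {
    private int next = 1; // start value chosen to match the /myapp example in the text

    // Appends a zero-padded 10-digit counter to the requested path.
    String nextName(String path) {
        return String.format("%s%010d", path, next++);
    }

    // Parses the trailing 10 digits of a sequential name.
    static long suffix(String name) {
        return Long.parseLong(name.substring(name.length() - 10));
    }

    // Returns the name with the smallest sequence suffix, or null for an empty list.
    static String lowest(List<String> names) {
        String best = null;
        for (String n : names) {
            if (best == null || suffix(n) < suffix(best)) {
                best = n;
            }
        }
        return best;
    }
}
```

Two consecutive calls to `nextName("/myapp")` on a fresh instance produce /myapp0000000001 and /myapp0000000002, and `lowest` picks the first of the two.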
Once a ZooKeeper ensemble starts, it waits for clients to connect. A client connects to one of the nodes in the ZooKeeper ensemble, which may be the leader or a follower. Once a client is connected, the node assigns a session ID to that client and sends it an acknowledgement. If the client does not get an acknowledgement, it simply tries to connect to another node in the ZooKeeper ensemble. Once connected to a node, the client sends heartbeats to the node at regular intervals to make sure that the connection is not lost.
If a client wants to read a particular znode, it sends a read request with the znode path to the node, and the node returns the requested znode from its own database. For this reason, reads are fast in a ZooKeeper ensemble.
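The heartbeat mechanism can be pictured as a timer that each heartbeat resets. The following sketch is a simplified model with explicit timestamps. `ToySession` and its fields are hypothetical, and the rule that a session expires when no heartbeat arrives within the timeout is the assumption being illustrated.

```java
// Hypothetical model of a client session kept alive by heartbeats.
class ToySession {
    final long sessionId;
    private final long timeoutMs;
    private long lastHeartbeatMs;

    ToySession(long sessionId, long timeoutMs, long nowMs) {
        this.sessionId = sessionId;
        this.timeoutMs = timeoutMs;
        this.lastHeartbeatMs = nowMs; // connecting counts as the first sign of life
    }

    // Records a heartbeat received at the given time.
    void heartbeat(long nowMs) {
        lastHeartbeatMs = nowMs;
    }

    // The session is expired once more than timeoutMs has passed since the last heartbeat.
    boolean isExpired(long nowMs) {
        return nowMs - lastHeartbeatMs > timeoutMs;
    }
}
```

With a 3000 ms timeout, a session created at time 0 that sends a heartbeat at 2500 ms is still alive at 5000 ms but counts as expired at 5501 ms.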
If a client wants to store data in the ZooKeeper ensemble, it sends the znode path and the data to the server. The connected server forwards the request to the leader, and the leader then reissues the write request to all the followers. The write succeeds, and a successful return code is sent to the client, only if a majority of the nodes respond successfully. Otherwise, the write request fails. This strict majority of nodes is called the quorum.
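The majority rule reduces to simple integer arithmetic. The sketch below shows it; `ToyQuorum` is a hypothetical helper, and counting the acknowledgements as one number covering all responding nodes is a simplification made for illustration.

```java
// Hypothetical helper illustrating the strict-majority (quorum) rule.
class ToyQuorum {
    // Smallest number of nodes that forms a strict majority of the ensemble.
    static int quorumSize(int ensembleSize) {
        return ensembleSize / 2 + 1;
    }

    // A write succeeds only when the acknowledgements reach the quorum size.
    static boolean writeSucceeds(int ensembleSize, int acks) {
        return acks >= quorumSize(ensembleSize);
    }
}
```

For a 5-node ensemble the quorum is 3, so a write still succeeds with two nodes down. A 4-node ensemble also needs 3 acknowledgements, which is why it tolerates no more failures than a 3-node ensemble.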
An application interacting with the ZooKeeper ensemble is referred to as a ZooKeeper client, or simply a client. The znode is the core component of the ZooKeeper ensemble, and the ZooKeeper API provides a small set of methods to manipulate all the details of a znode within the ensemble. A client should follow the steps given below to have a clear and clean interaction with the ZooKeeper ensemble.