[Cloudera] Installing CDH through Docker # 2 - Run Cloudera Server and Agents with Docker

2022, Jul 09    


Cluster Configuration with Cloudera Manager


  • The image below shows the configuration of the cluster that we will set up today

    • Hadoop01 : the namenode, with Cloudera Manager installed; this node will serve as our Cloudera Manager Server
    • Hadoop02 - 04 : datanodes linked to the Hadoop01 namenode; these will be our Cloudera Manager Agents


- Server/Agent Architecture of Cloudera

image

  • Cloudera Manager Server
    • Cloudera Manager runs a central server, the “Cloudera Manager Server”, which is also known as the “SCM Server”
    • It hosts the Cloudera Manager Admin Console (the web-based user interface that administrators use to manage clusters and Cloudera Manager) and the API
    • It is also responsible for installing software and for configuring, starting, and stopping services
    • It monitors the heartbeats coming from every Cloudera Manager Agent and sends those Agents the actions to perform
  • Agent
    • Installed on every host
    • Mainly responsible for starting and stopping the processes ordered by the Server, unpacking configurations, triggering installations, and monitoring the host
    • Reports the current state of its host to the Server and learns about new processes from the Server’s heartbeat responses
    • When it detects a new process, it creates a directory for it in /var/run/cloudera-scm-agent and unpacks the configuration. It then contacts “supervisord”, which actually starts the process
  • Heartbeating
    • The Server and the Agents communicate through a process called “heartbeating”, which is the primary communication mechanism in Cloudera Manager
    • By default, Agents send heartbeats to the Cloudera Manager Server every 15 seconds (the frequency can be tuned if necessary)
    • During the heartbeat exchange, the Agent notifies the Cloudera Manager Server of its state and activities. In response, the Cloudera Manager Server tells the Agent which actions it should perform
    • If an Agent stops heartbeating, its host is marked as being in bad health
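To make the heartbeat rule above concrete, here is a minimal sketch of how a server could classify a host from the age of its last heartbeat. It is not Cloudera Manager's actual logic: the function name `host_health` and the `MISSED_ALLOWED` threshold are assumptions made up for this illustration, and only the 15-second interval comes from the text above.

```shell
#!/bin/sh
# Illustration only, not Cloudera Manager code.
HEARTBEAT_INTERVAL=15   # default interval in seconds, as described above
MISSED_ALLOWED=2        # hypothetical: tolerated number of missed intervals

# host_health LAST_HEARTBEAT NOW   (both in epoch seconds)
# Prints "good" while the last heartbeat is recent enough, otherwise "bad".
host_health() {
  hb_age=$(( $2 - $1 ))
  if [ "$hb_age" -le $(( HEARTBEAT_INTERVAL * MISSED_ALLOWED )) ]; then
    echo "good"
  else
    echo "bad"
  fi
}

now=$(date +%s)
host_health "$(( now - 10 ))" "$now"    # last heartbeat 10 seconds ago
host_health "$(( now - 120 ))" "$now"   # last heartbeat 2 minutes ago
```

With these assumed values, the first call prints "good" and the second prints "bad", because 120 seconds is longer than the 30 seconds the sketch tolerates.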


1. Install Cloudera Manager and Create CentOS:CM Image


  • (local terminal) Create a container named hadoop01 from the centos:base image with the options below, then open a shell inside it
      $ docker run --privileged --name hadoop01 -p 7180:7180 -itd -h hadoop01.hadoop.com -e container=docker -v /sys/fs/cgroup:/sys/fs/cgroup:ro centos:base /usr/sbin/init
    
      $ docker exec -it hadoop01 /bin/bash
    


    • Docker run options (documentation)
      • -p : publishes a port, here forwarding port 7180 on localhost to port 7180 in the container
      • -h : container host name
      • -e (--env) : sets simple (non-array) environment variables in the container you’re running, or overwrites variables that are defined in the Dockerfile of the image you’re running
      • -v (--volume) : mounts a host path or volume into the container (here, the host’s /sys/fs/cgroup is mounted at the same path inside the container)
        • volumes : file systems mounted on Docker containers to preserve data generated by the running container
        • can be optionally suffixed with :ro or :rw to mount the volume in read-only or read-write mode, respectively
      • --privileged : gives extended privileges to this container (without this, mounting will be denied)
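As a quick way to review the options before running anything, the sketch below rebuilds the same `docker run` command one option at a time and only prints it (a dry run). The function name `build_hadoop01_cmd` is made up for this example, and the comments on `-i`, `-t`, and `-d` come from general Docker usage rather than from the list above.

```shell
#!/bin/sh
# Dry run: assemble the docker run command from this post and print it.
build_hadoop01_cmd() {
  set --                                           # start from an empty argument list
  set -- "$@" --privileged                         # extended privileges for the container
  set -- "$@" --name hadoop01                      # container name
  set -- "$@" -p 7180:7180                         # host port 7180 -> container port 7180
  set -- "$@" -itd                                 # -i keep stdin open, -t allocate a TTY, -d run detached
  set -- "$@" -h hadoop01.hadoop.com               # host name inside the container
  set -- "$@" -e container=docker                  # environment variable
  set -- "$@" -v /sys/fs/cgroup:/sys/fs/cgroup:ro  # read-only bind mount of the host cgroup tree
  set -- "$@" centos:base /usr/sbin/init           # image, then the command to run in it
  echo "docker run $*"
}

build_hadoop01_cmd
```

Once the printed command looks right, you can copy it into the terminal to create the container.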


  • (container : hadoop01) Once you are connected to the running container, run the Cloudera Manager installer
      $ cd
      $ ./cloudera-manager-installer.bin
    


    • Keep answering “yes” to the installer prompts
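After the installer finishes, the Cloudera Manager Server may need some time before the Admin Console answers on port 7180 (the port forwarded with -p earlier). The helper below is a small sketch for waiting on it from the local terminal. `retry_until` is a name invented for this example, and the commented usage line assumes that curl is installed on the Docker host and that the console responds on the root path.

```shell
#!/bin/sh
# retry_until MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_TRIES attempts have failed.
# Returns 0 on the first success, 1 if every attempt failed.
retry_until() {
  max_tries=$1
  delay=$2
  shift 2
  try=1
  while [ "$try" -le "$max_tries" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    try=$(( try + 1 ))
    sleep "$delay"
  done
  return 1
}

# Example usage on the Docker host (assumes curl is available):
#   retry_until 30 10 curl -fsS http://localhost:7180/ && echo "Admin Console is up"
```

Passing the probe as a command keeps the helper generic, so the same loop can wrap any other check you prefer.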