[Cloudera] Installing CDH through Docker #1 - Prepare Base Docker Image
2022, Jul 01
CDH (Cloudera's Distribution Including Apache Hadoop)
- The Hadoop ecosystem requires multiple components, and each component has complex version dependencies on the others, which makes it very hard for customers to manage an entire Hadoop cluster by themselves.
- CDH is the most widely deployed distribution of Apache Hadoop. It serves as an end-to-end management tool for a Hadoop system, allowing integrated control over all the necessary Hadoop components.
It provides an automated installation process, reduced deployment cost, a real-time view of nodes, a central console to enact configuration changes across the cluster, and a range of other tools needed to operate a Hadoop cluster.
- Today, we will use version 7.1.4 of Cloudera Manager (open source)
- Alternatively, you can simply download the Cloudera Docker image with a single command
docker pull cloudera/quickstart:latest
Create a Docker Image with Cloudera Manager Installed
- !) Before you start: if you're using a Mac with Apple Silicon (M1), you need to build the base Docker image in an AMD64 architecture environment instead of an ARM64 one
- Refer to HERE instead of this post
- Steps
- Create CentOS base Image - centos:base
- Install Cloudera Manager 7.1.4 on centos:base image - centos:CM
- Set up a Hadoop cluster with one namenode (Hadoop 01, Cloudera Manager server) and three datanodes (Hadoop 02~04, Cloudera Manager agent) -> this part will be addressed next time
1. Create CentOS Base Image
- (local terminal) first, create a new container named centos_base from the CentOS image (version 7 here)
$ docker run -dit --name centos_base centos:7
- open a shell in the centos_base container and install all the necessary basic components
## terminal
$ docker exec -it centos_base /bin/bash
## container
$ yum update
$ yum install wget -y
$ yum install vim -y
$ yum install openssh-server openssh-clients openssh-askpass -y
$ yum install initscripts -y
$ ssh-keygen -t dsa -P "" -f ~/.ssh/id_dsa ## create dsa key file with passphrase "" (empty)
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys ## append public key file to authorized_keys file
$ ssh-keygen -f /etc/ssh/ssh_host_rsa_key -t rsa -N "" ## rsa host key with passphrase "" (empty)
$ ssh-keygen -f /etc/ssh/ssh_host_ecdsa_key -t ecdsa -N "" ## ecdsa host key with passphrase "" (empty)
$ ssh-keygen -f /etc/ssh/ssh_host_ed25519_key -t ed25519 -N "" ## ed25519 host key with passphrase "" (empty)
- wget : non-interactive command-line tool for retrieving files over HTTP, HTTPS, FTP and FTPS
- vim : text editor for editing files in the terminal
- openssh-server openssh-clients openssh-askpass : connectivity tools for remote login between computers using the SSH protocol
- SSH protocol (Secure Shell Protocol)
- network communication protocol that enables two remote computers to share data or perform operations on each other
- Traffic between the remote computers is encrypted, and a pair of keys (private and public) can be used for authentication instead of passwords, which allows safer interaction between computers even on insecure, public networks
- ssh-keygen : generates private-public key pairs
- Command Options (for more information, here)
- -t : Specifies the type (algorithm) of key to create. RSA is the default in the OpenSSH version shipped with CentOS 7; alternatively you can use other algorithms such as DSA, ECDSA and ED25519
- -b : Specifies the number of bits in the key to create. In recent OpenSSH releases the default length is 3072 bits (RSA) or 256 bits (ECDSA); older releases, such as the one shipped with CentOS 7, default to 2048 bits for RSA
- -p : Requests changing the passphrase (the phrase that protects a private key file) of an existing private key file
- -P : old (current) passphrase
- -N : new passphrase
- -f : Specifies the filename of the key file.
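To make the ssh-keygen options above concrete, here is a minimal sketch that exercises them in a throwaway directory instead of ~/.ssh or /etc/ssh. The file name demo_ed25519 and the passphrase demo-passphrase are made up for illustration, and the block skips itself if ssh-keygen is not installed.

```shell
#!/bin/sh
# Sketch: try the ssh-keygen options in a scratch directory.
workdir=$(mktemp -d)                 # scratch directory (delete it when done)
key="$workdir/demo_ed25519"          # hypothetical key file name

if command -v ssh-keygen >/dev/null 2>&1; then
  # -t: key type, -N: new passphrase (empty here), -f: output file, -q: quiet
  ssh-keygen -q -t ed25519 -N "" -f "$key"

  # -p: change the passphrase; -P: old passphrase, -N: new passphrase
  ssh-keygen -q -p -P "" -N "demo-passphrase" -f "$key"

  # -y: re-derive the public key from the private key,
  # which only succeeds if the new passphrase is accepted
  ssh-keygen -y -P "demo-passphrase" -f "$key" > "$workdir/derived.pub"

  ls "$workdir"                      # private key, .pub file, derived.pub
else
  echo "ssh-keygen is not installed; skipping the demo"
fi
```

The commands in this post use an empty passphrase because the keys live inside a disposable container; for keys you keep, a non-empty passphrase is the safer choice.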
- (container) edit the “bashrc” file using vim
$ vim ~/.bashrc
## bashrc file: add the line below
/usr/sbin/sshd
$ source ~/.bashrc ## by sourcing it, you reload the file and execute the commands placed in it
- add the command that starts sshd, then save and exit vim (start insert mode - i / leave insert mode - Esc / save - :w / quit - :q)
- sshd (OpenSSH server process)
- receives incoming connections using the SSH protocol and acts as the server for the protocol.
- It handles user authentication, encryption, terminal connections, file transfers, and tunneling.
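If you prefer not to open vim, the same line can be appended from the shell. The sketch below is one possible way to do it, not the method used in this post: add_line_once and RC_FILE are hypothetical names, and RC_FILE defaults to a scratch file so the snippet is safe to try outside the container (inside the container you would set RC_FILE=~/.bashrc).

```shell
#!/bin/sh
# Sketch: append the sshd start command to a bashrc-style file exactly once.
RC_FILE="${RC_FILE:-$(mktemp)}"      # scratch file by default (assumption for the demo)
LINE='/usr/sbin/sshd'

add_line_once() {
  # $1 = file, $2 = line to add
  # grep -q: quiet, -x: match the whole line, -F: fixed string (no regex)
  grep -qxF "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

add_line_once "$RC_FILE" "$LINE"
add_line_once "$RC_FILE" "$LINE"     # second call changes nothing
grep -c -xF "$LINE" "$RC_FILE"       # prints 1
```

Guarding the append with grep keeps the file from accumulating duplicate lines if the setup is run more than once.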
2. Download Cloudera Manager Installer in the centos_base Container and Grant Execute Permission
- (container) now, let’s download the Cloudera Manager installer file in the container
$ wget https://archive.cloudera.com/cm7/7.1.4/cloudera-manager-installer.bin
- (container) grant execute permission on the installer file and exit
$ chmod u+x cloudera-manager-installer.bin
$ exit
- chmod (change mode)
- command used to change the access permissions (file mode) of files and directories
- In symbolic mode, you specify the classes of users the change applies to, an operator, and the types of access to grant or remove
- u : user, i.e. the file owner only
- + : operator indicating that the permissions that follow are added
- x : execute permission (for a directory, permission to enter it). It is not recursive; recursing into subdirectories is the separate -R option
- For more information, here
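The effect of u+x is easy to check on a scratch file. The sketch below uses a temporary file as a stand-in for cloudera-manager-installer.bin (an assumption for the demo, so nothing needs to be downloaded).

```shell
#!/bin/sh
# Sketch: observe what `chmod u+x` changes, using a scratch file.
f=$(mktemp)                          # stand-in for the installer file

chmod a-x "$f"                       # make sure no execute bit is set
[ -x "$f" ] || echo "before: not executable"

chmod u+x "$f"                       # u = owner, + = add, x = execute
[ -x "$f" ] && echo "after: executable by owner"

ls -l "$f"                           # the owner triplet now ends in x
```

Only the owner's triplet changes; the group and other permissions stay as they were.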
- (local terminal) now, commit the container (centos_base), with all the files and packages set up so far, into a Docker image named centos:base
$ docker commit centos_base centos:base
- you can see that a new Docker image named centos:base has just been created
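One way to confirm the commit from a script is sketched below. The helper name image_exists is hypothetical; it relies on `docker image inspect` returning a non-zero exit status when the image is not present locally (it also reports "not found" if Docker itself is missing).

```shell
#!/bin/sh
# Sketch: check whether an image is present in the local Docker image store.
image_exists() {
  # $1 = image reference, e.g. centos:base
  docker image inspect "$1" >/dev/null 2>&1
}

if image_exists centos:base; then
  echo "centos:base is available"
  docker images centos:base          # show repository, tag, image ID and size
else
  echo "centos:base not found"
fi
```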
- https://docs.cloudera.com/cloudera-manager/7.5.4/concepts/cm-concepts.pdf
- https://taaewoo.tistory.com/23?category=917407