I installed Cloudera Manager 4 on a 1GbE default setup (eth0/bond0) and now need to move to 10GbE, InfiniBand (IB), or 40GbE without a reinstall. Here's how I did it!!
The assumption here is that you already have CM 4.x installed with the default embedded PostgreSQL DB on a Linux server. If you want to learn how to set that up, just ask...
Let's start here with my 3-node cluster, yosemite001 - yosemite003:
hostname = yosemite00[1-3].somedomain.com = CM installed on 1GbE
- Shut down all services
- service cloudera-scm-agent stop on all nodes
- service cloudera-scm-server stop on the CM server (see the loop sketch below for running this across the cluster)
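If you have more than a handful of nodes, a quick loop saves some typing. A minimal sketch, assuming passwordless SSH as root to the nodes (the host list below is just my example cluster):
# stop the agent on every node, then the server on the CM host
for h in yosemite001 yosemite002 yosemite003; do ssh root@$h "service cloudera-scm-agent stop"; done
service cloudera-scm-server stop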
From here on, if you have a paid or supported version, you may void your support (or maybe they'll say "Cool, good job!!"). Proceed at your own risk...
From the CM server, as root run su - postgres (if you installed on a different database, then log in using those credentials), then:
psql -h localhost -p 7432 -U scm
When asked for the password, open another terminal and run:
[root@yosemite001 ~]# grep password /etc/cloudera-scm-server/db.properties
com.cloudera.cmf.db.password=TVkDZxuNCw
Paste the password and you should see the scm prompt.
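If you'd rather avoid the copy/paste dance, psql will read the password from libpq's PGPASSWORD variable. A minimal sketch, assuming the default embedded DB on port 7432:
# pull the scm password out of db.properties and hand it to psql
export PGPASSWORD=$(grep '^com.cloudera.cmf.db.password=' /etc/cloudera-scm-server/db.properties | cut -d= -f2)
psql -h localhost -p 7432 -U scm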
scm=> select host_id,host_identifier,name,ip_address from hosts;
Check the current config and save the values in a file/notepad; you will need this if you need to flip back.
 host_id |      host_identifier       |            name            |  ip_address
---------+----------------------------+----------------------------+--------------
       2 | yosemite001.somedomain.com | yosemite001.somedomain.com | 192.168.0.11
       3 | yosemite002.somedomain.com | yosemite002.somedomain.com | 192.168.0.12
       4 | yosemite003.somedomain.com | yosemite003.somedomain.com | 192.168.0.13
(3 rows)
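An easy way to keep that "flip back" copy is to dump the query straight to a file instead of copying it by hand. A sketch, assuming the same connection details as above (the output filename is just something I made up):
psql -h localhost -p 7432 -U scm -c "select host_id,host_identifier,name,ip_address from hosts;" > /root/hosts_before_10g.txt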
scm=> update hosts set (host_identifier,name,ip_address) = ('yosemite001-10g.somedomain.com','yosemite001-10g.somedomain.com','192.168.10.11') where host_id=2;
UPDATE 1
Update all the other rows.
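For completeness, the other two rows would look like this. The -10g hostnames and 192.168.10.x addresses here are assumptions following my naming pattern, so substitute whatever your new interface actually uses:
scm=> update hosts set (host_identifier,name,ip_address) = ('yosemite002-10g.somedomain.com','yosemite002-10g.somedomain.com','192.168.10.12') where host_id=3;
scm=> update hosts set (host_identifier,name,ip_address) = ('yosemite003-10g.somedomain.com','yosemite003-10g.somedomain.com','192.168.10.13') where host_id=4;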
Check if the updates went through:
scm=> select host_id,host_identifier,name,ip_address from hosts;
 host_id |        host_identifier         |              name              |   ip_address
---------+--------------------------------+--------------------------------+----------------
       3 | yosemite002.somedomain.com     | yosemite002.somedomain.com     | 192.168.0.12
       4 | yosemite003.somedomain.com     | yosemite003.somedomain.com     | 192.168.0.13
       2 | yosemite001-10g.somedomain.com | yosemite001-10g.somedomain.com | 192.168.10.11
(3 rows)
Exit the tool with “\q”.
Edit /etc/cloudera-scm-agent/config.ini on all nodes and update the server host and the listening IP & hostname entries to the new interface's IP and hostname.
Edit /etc/sysconfig/network with the new hostname for the interface.
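On RHEL/CentOS that file ends up looking something like this (again, the -10g name is my assumed naming convention):
NETWORKING=yes
HOSTNAME=yosemite001-10g.somedomain.com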
Run hostname yosemite001-10g (update the hostname on all the servers the same way).
Run “exec bash” and verify the hostname change is done.
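It is worth confirming the new names resolve before bringing anything back up, since the agents heartbeat by hostname. A quick check, assuming DNS or /etc/hosts already carries the new 10GbE entries:
hostname -f                                  # should print yosemite001-10g.somedomain.com
getent hosts yosemite001-10g.somedomain.com  # should return 192.168.10.11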
Run on the CM server:
chkconfig cloudera-scm-server on
service cloudera-scm-server start
Run the following on all the nodes:
chkconfig cloudera-scm-agent on
service cloudera-scm-agent start
Log in to the GUI and verify the host changes and “Good health”.
Force CM to re-run the configuration so it registers the changes (make a change and revert it to force CM to regenerate the config).
Go to the HDFS service-wide configuration, make any minimal change and save, then revert back to the original and save again.
Do the same for MapReduce as well.
Restart the services
Verify client functionality by running TeraSort.
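The smoke test I use: generate a small dataset with TeraGen and sort it. A sketch for an MRv1 CDH4 install; the examples jar path and row count are assumptions, so adjust them for your layout:
hadoop jar /usr/lib/hadoop-0.20-mapreduce/hadoop-examples.jar teragen 1000000 /tmp/tera-in
hadoop jar /usr/lib/hadoop-0.20-mapreduce/hadoop-examples.jar terasort /tmp/tera-in /tmp/tera-out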
Wow! Thanks. I don't like this move from "simple" configuration files to this series of incantations :(
Thank you very much, it really helped us
Thanks so much!!! That useful information helped solve many connection problems that led to failures during job runs....
Hi,
I have managed to screw up my HDFS (CDH 4) installation. Are you able to provide paid/unpaid help regarding this?
Thanks
Yes. Could you explain the issue with HDFS?
We have a 4-node CDH 4.1 cluster with HDFS on it on Amazon EC2 (1 x NameNode + 3 x DataNodes).
I took a snapshot of the data node servers at T1 but forgot to take a snapshot of the name node server as well.
18 hours later (T1 + 18) I realized that sometime during the last 18 hours the disk had filled up, and I took another snapshot of the 3 data nodes.
I increased the size of the disks, but when I tried to bring the cluster back up, the NameNode wouldn't come out of safe mode because a number of blocks were missing.
I do have a backup of the NameNode which ran sometime around T1 + 12 hours.
I have tried to bring the cluster back up with different combinations of the NameNode backup and DataNode backups but I get anywhere from 150 to 600 blocks missing (out of ~1200 total).
Is there a way to recover the data, or am I screwed?
ps40,
were you able to recover your files? If you need any help post back and we can sync up.
Hey, when I run the select query I get something like this:
scm=> select host_id,host_identifier,name,ip_address from hosts;
host_id | host_identifier | name | ip_address
---------+--------------------------------------+------------------------+---------------
8 | dc7c25c8-958a-4332-b78e-1780820333a4 | yamazaki.lunexa.local | 192.168.2.222
6 | 52a9d844-0f16-42b8-9f1b-2572dbbc04fa | woodford.lunexa.local | 192.168.2.224
The host_identifier seems odd. Do I still need to change that part? Mind you, I am running this select query while scm is still up; I just wanted to see how this would work when porting the machines to a new network/domain. Any ideas/help on this issue?
It should be fairly straightforward, and the earlier blog should help you transition over to the new network. If you are using Hive, then it gets a little complicated. The simpler option, if you are running 4.4 or above (it might work on an earlier version but I have not tried it): there is a file /etc/cloudera-scm-agent/config.ini -- in this file, uncomment the following lines
# listening_ip=
# Hostname that Agent reports as its hostname
# listening_hostname=
and fill these in with the new network or domain once you have moved over, but before you start any agents/services. This is very effective if you have multiple interfaces. Also, if you need to force it from the config side, you can use dfs.datanode.hostname in hdfs-site.xml and mapred.tasktracker.dns.nameserver or slave.host.name in mapred-site.xml.
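For example, the hdfs-site.xml override would look roughly like this on each DataNode (the hostname value here is an assumption; use whatever that node's new interface resolves to):
<!-- hdfs-site.xml, per DataNode -->
<property>
  <name>dfs.datanode.hostname</name>
  <value>yosemite002-10g.somedomain.com</value>
</property>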
Hopefully that should help you migrate to the new network/domain. Regarding the host identifier, you might get new records added for the new network, which is fine; you can delete the old network names from the cluster. As always, try this on a test site if you have one before trying it on production :-)