by Erik Nor

December 03, 2014

Handling an expired SSL Certificate in Hortonworks Ambari

It is your worst nightmare: Ambari loses touch with all of the nodes on your cluster. You can no longer manage the Hadoop processes across the cluster via the GUI, so you check the ambari-server log to find out why it is no longer talking to the agents. In the log you find an error like this:

WARN nio:651 - javax.net.ssl.SSLException: Received fatal alert: certificate_expired

Following the suggestions on Ambari's Apache wiki page, you attempt to delete the existing certificates and regenerate new ones via openssl. It is on the official wiki, so it has to be right, doesn't it? Once you get around the certificate passphrase not being the default, you get a new error:

WARN nio:651 - javax.net.ssl.SSLException: Received fatal alert: unknown_ca

Things are not looking good. Luckily, it is easy to recover from this error, and fast if you have the right tools installed. First, stop all of the agents and the Ambari server. Once everything is down, delete the agent certificates located here:

/var/lib/ambari-agent/keys/*

A tool such as pdsh makes this fast and easy. Once the keys are deleted, you can start all of the agents back up.
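The agent-side steps can be sketched with pdsh roughly as follows. This is a minimal sketch, not a tested procedure: the host list `node[01-10]` is a made-up placeholder, the `DRY_RUN` switch and the `run` and `reset_agent_keys` helpers are hypothetical names introduced here, and it assumes the script runs on the Ambari server host with passwordless root access to every node. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: reset Ambari agent certificates on every node with pdsh.
# AGENT_HOSTS is a placeholder pdsh host list; replace it with your own nodes.
AGENT_HOSTS="${AGENT_HOSTS:-node[01-10]}"
# DRY_RUN=1 (the default) only prints each command; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "+ $*"
    else
        "$@"
    fi
}

reset_agent_keys() {
    # 1. Stop every agent, then the server on this host.
    run pdsh -w "$AGENT_HOSTS" 'ambari-agent stop'
    run ambari-server stop

    # 2. Delete the agent keys on every node.
    run pdsh -w "$AGENT_HOSTS" 'rm -f /var/lib/ambari-agent/keys/*'

    # 3. Start the agents back up.
    run pdsh -w "$AGENT_HOSTS" 'ambari-agent start'
}

reset_agent_keys
```

Reviewing the dry-run output before setting `DRY_RUN=0` is a cheap way to confirm the host list expands to the nodes you expect. Note that the printed lines drop the shell quoting around the remote command.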

Now back up and then delete the keys on the server, located here:

/var/lib/ambari-server/ca.key

/var/lib/ambari-server/*.csr

/var/lib/ambari-server/*.crt

Then empty the certificate index:

echo "" > /var/lib/ambari-server/db/index.txt

Once those files are backed up and removed, start the Ambari server. It will generate new keys and be able to communicate with all of the agents again.
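The server-side cleanup might look something like the sketch below. The function name `reset_server_keys`, the `KEYS_DIR` variable, and the backup location are all hypothetical choices made for illustration. The default directory is the one given in this post. Some installations keep these files elsewhere (the `security.server.keys_dir` property in `ambari.properties` names the directory), so confirm the location before running anything.

```shell
#!/bin/sh
# Sketch: back up, then remove, the Ambari server certificate files.
# KEYS_DIR defaults to the path used in this post; adjust it if your install differs.
KEYS_DIR="${KEYS_DIR:-/var/lib/ambari-server}"

# reset_server_keys DIR [BACKUP_DIR]
# Moves ca.key, *.csr and *.crt out of DIR and resets db/index.txt.
# Prints the backup directory on success.
reset_server_keys() {
    dir="$1"
    # Hypothetical default backup location: a timestamped sibling directory.
    backup="${2:-$dir.bak.$(date +%Y%m%d%H%M%S)}"
    mkdir -p "$backup/db" || return 1

    # Move the CA key, signing requests and certificates out of the way.
    for f in "$dir"/ca.key "$dir"/*.csr "$dir"/*.crt; do
        if [ -e "$f" ]; then
            mv "$f" "$backup/" || return 1
        fi
    done

    # Keep a copy of the certificate index, then reset it as in the post.
    if [ -f "$dir/db/index.txt" ]; then
        cp "$dir/db/index.txt" "$backup/db/" || return 1
        echo "" > "$dir/db/index.txt"
    fi

    echo "$backup"
}

# Usage (with the Ambari server stopped):
#   reset_server_keys "$KEYS_DIR" && ambari-server start
```

Moving the files instead of deleting them keeps a way back if the server fails to start with the regenerated keys.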

By default the generated keys are good for one year, so expect to repeat this once a year unless you do a clean install with each new release.
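To avoid being surprised next year, one option is to check the expiry date ahead of time with openssl. The sketch below is an illustration under assumptions: `cert_valid_for_days` is a hypothetical helper name, and the certificate path in the usage comment follows the server paths given in this post. It relies on the `-checkend` option of `openssl x509`, which exits non-zero if the certificate expires within the given number of seconds.

```shell
#!/bin/sh
# Sketch: warn before an Ambari certificate expires.

# cert_valid_for_days CERT DAYS
# Returns 0 if CERT is still valid DAYS days from now, non-zero otherwise.
cert_valid_for_days() {
    cert="$1"
    days="$2"
    openssl x509 -in "$cert" -noout -checkend "$((days * 86400))" >/dev/null
}

# Usage, for example from a daily cron job on the Ambari server:
#   for c in /var/lib/ambari-server/*.crt; do
#       cert_valid_for_days "$c" 30 || echo "WARNING: $c expires within 30 days"
#   done
```

Running `openssl x509 -in <certificate> -noout -enddate` prints the exact expiry date if you would rather check by hand.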

Erik Nor

Erik Nor is a Principal Consultant and Big Data Technology Leader at Moser Consulting. He's been working with Big Data and Hadoop for over four years and holds multiple development and admin certifications from Hortonworks and Cloudera.