As the data in enterprise Hadoop clusters continues to grow, securing that data becomes an important part of any implementation. This article is the first in a series discussing the current state of security in the Hortonworks ecosystem. I will cover the four major tools used to secure a Hortonworks Data Platform (HDP) cluster. Ranger and Kerberos handle the four A’s of security: Authentication, Authorization, Auditing, and Administration. Data encryption and cluster access are provided by HDE and Knox, respectively.

In a previous article I covered troubleshooting a Kerberos one-way trust with Active Directory, but I did not explain the motivation for implementing Kerberos within your cluster. Kerberos provides a layer of secure and proven authentication. When a cluster is kerberized, services and clients authenticate to one another using tickets granted by a Key Distribution Center (KDC) rather than by simply asserting a username.

Currently, in an unsecured Hadoop cluster without Kerberos, no authentication is required. If you present yourself as any user, the system trusts the username without authenticating it. This means you can create an account named hdfs on a local VM, install the Hadoop client, and configure it to access any un-kerberized cluster.

Using this account and client configuration, you now have superuser (root-equivalent) access to HDFS on the target cluster, or the access of any other user you choose to present yourself as. If the cluster is kerberized, you would instead need valid Kerberos credentials, such as the keytab file issued to an account, in order to gain access. If that keytab is protected as it should be, unauthorized users will not be able to connect to the cluster.
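To make the "trust the username" behavior concrete, here is a minimal Python sketch of the same idea over WebHDFS, the REST interface to HDFS, rather than the local hdfs account described above. On a cluster with security off, WebHDFS takes the caller's identity from a `user.name` query parameter. The host name is hypothetical, the helper function is my own illustration, and the default port of 50070 assumes a Hadoop 2.x NameNode (Hadoop 3 uses 9870). The sketch only builds the URL and does not contact any cluster.

```python
from urllib.parse import quote, urlencode


def webhdfs_list_url(namenode_host, path, user, port=50070):
    """Build a WebHDFS LISTSTATUS URL that simply claims to be `user`.

    Hypothetical helper for illustration. With security off, the cluster
    accepts the user.name value as the caller's identity without proof.
    """
    query = urlencode({"op": "LISTSTATUS", "user.name": user})
    return f"http://{namenode_host}:{port}/webhdfs/v1{quote(path)}?{query}"


if __name__ == "__main__":
    # Hypothetical NameNode host. Note that no password, ticket, or keytab
    # appears anywhere in the request: the username alone is the identity.
    print(webhdfs_list_url("namenode.example.com", "/user", "hdfs"))
```

On a kerberized cluster, WebHDFS instead expects Kerberos SPNEGO or a Hadoop delegation token, so a request that only asserts a username in this way should be rejected.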

In closing, without Kerberos there is no authentication in a Hadoop cluster. Without authentication, there is no value in any authorization or auditing. Without authentication, authorization, or auditing there is no security in any service in the cluster.