
Issue:

When a pipeline tries to read from Hadoop on an unmanaged Cloudera node, the following error appears:

/usr/bin/hadoop: No such file or directory  
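
To confirm that the launcher really is missing before changing anything, a small check like the one below can help. This is only a sketch: the function name check_hadoop_client is hypothetical, and the default path is simply the one from the error message above.

```shell
#!/bin/sh
# Report whether a Hadoop launcher exists and is executable.
# check_hadoop_client is a hypothetical helper name, not part of any package.
# The default path is taken from the error message; pass another path to override it.
check_hadoop_client() {
    hadoop_bin="${1:-/usr/bin/hadoop}"
    if [ -x "$hadoop_bin" ]; then
        echo "found: $hadoop_bin"
        return 0
    fi
    echo "missing: $hadoop_bin (hadoop-client is probably not installed)" >&2
    return 1
}

# Usage:
#   check_hadoop_client || echo "install hadoop-client first"
```

If the check fails on the unmanaged node, continue with the solution below.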

Solution:

  1. On the external host, download the CDH repo file into the /etc/yum.repos.d/ directory (change the URL to match the OS release and CDH version of the client you need). curl -O saves into the current directory, so change into that directory first:
    $ cd /etc/yum.repos.d/
    $ curl -O https://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/cloudera-cdh5.repo
  2. Edit the base URL in the cloudera-cdh5.repo file to install a specific CDH version (otherwise, yum installs the latest). For example, to install the 5.7.1 hadoop-client, update the baseurl to:
    baseurl=https://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/5.7.1/
  3. Install the hadoop-client rpm:
    $ yum clean all
    $ yum install hadoop-client
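
The baseurl edit in step 2 can also be scripted instead of done by hand. The sketch below assumes the repo file contains a line starting with "baseurl=" and that the new URL contains no "|" or "&" characters; pin_cdh_baseurl is a hypothetical helper name used only for illustration.

```shell
#!/bin/sh
# Rewrite every "baseurl=" line in a yum repo file to point at one CDH release.
# pin_cdh_baseurl is a hypothetical helper name.
#   $1: path to the repo file
#   $2: the new base URL (assumed to contain no "|" or "&")
pin_cdh_baseurl() {
    repo_file="$1"
    new_url="$2"
    tmp_file="$(mktemp)" || return 1
    # "|" is the sed delimiter, so the slashes in the URL need no escaping.
    # Writing back with cat keeps the repo file's owner and permissions.
    sed "s|^baseurl=.*|baseurl=${new_url}|" "$repo_file" > "$tmp_file" &&
        cat "$tmp_file" > "$repo_file"
    status=$?
    rm -f "$tmp_file"
    return $status
}

# Usage (as root on the external host):
#   cd /etc/yum.repos.d/
#   pin_cdh_baseurl cloudera-cdh5.repo \
#       https://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/5.7.1/
#   yum clean all
#   yum install hadoop-client
```

Going through a temporary file rather than editing in place avoids relying on sed options that differ between platforms.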

Note: You can also download the RPM file and install it locally if desired.

(See http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/5.7.1/RPMS/x86_64/)

Once the needed packages are installed, configure the client on the unmanaged node:

  1. In Cloudera Manager, navigate to HDFS -> "Actions" drop-down -> "Download Client Configuration" (this downloads a zip file called hdfs-clientconfig.zip).
  2. Move the zip file over to the external host and unzip it.
  3. Copy all the unzipped configuration files to /etc/hadoop/conf. Example:
    $ cp * /etc/hadoop/conf

     

  4. Run Hadoop commands. Example:

    $ sudo -u hdfs hadoop fs -ls
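
The copy in step 3 can be wrapped in a small helper so it works no matter how the archive laid out its files. This is a minimal sketch, not the official procedure: copy_client_conf is a hypothetical name, and /tmp/hdfs-clientconfig in the usage note is just an example location for the unzipped files.

```shell
#!/bin/sh
# Copy every regular file found under an unzipped client-config directory
# into the Hadoop configuration directory.
# copy_client_conf is a hypothetical helper name.
#   $1: directory holding the unzipped files
#   $2: destination directory (defaults to /etc/hadoop/conf)
copy_client_conf() {
    src_dir="$1"
    dest_dir="${2:-/etc/hadoop/conf}"
    if [ ! -d "$src_dir" ]; then
        echo "no such directory: $src_dir" >&2
        return 1
    fi
    mkdir -p "$dest_dir" || return 1
    # find descends into subdirectories, so it does not matter whether the
    # archive unpacked its files at the top level or inside a folder.
    find "$src_dir" -type f -exec cp {} "$dest_dir"/ \;
}

# Usage (as root on the external host):
#   unzip hdfs-clientconfig.zip -d /tmp/hdfs-clientconfig
#   copy_client_conf /tmp/hdfs-clientconfig
#   sudo -u hdfs hadoop fs -ls
```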

     
