I couldn't submit a small Hadoop job today and was repeatedly getting errors like this:
Error: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class FOOCLASS not found
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1961)
    at org.apache.hadoop.mapreduce.task.JobContextImpl.getMapperClass(JobContextImpl.java:186)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:722)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.lang.ClassNotFoundException: Class WordCount$WordMapper not found
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1867)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1959)
    ... 8 more
The key line of error output was:
WARN mapreduce.JobSubmitter: No job jar file set. User classes may not be found. See Job or Job#setJar(String).
After Googling, I tried all the usual suggestions:
1. Make sure the jar has all the classes in it that it's supposed to have.
2. Make sure you're calling job.setJarByClass() in your code.
3. Make sure the permissions on the jar file are correct.
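The first of these checks can be scripted. Below is a minimal sketch using only the JDK; JarClassCheck and jarContainsClass are names made up for this example, not part of Hadoop. It looks for the .class entry of a given class inside a jar, and nested classes such as WordCount$WordMapper keep the '$' in their entry name.

```java
import java.io.IOException;
import java.util.jar.JarFile;

// Hypothetical helper for illustration; not part of Hadoop.
class JarClassCheck {

    /** Returns true if the jar at jarPath contains a .class entry for className. */
    static boolean jarContainsClass(String jarPath, String className) throws IOException {
        // com.example.WordCount$WordMapper -> com/example/WordCount$WordMapper.class
        String entryName = className.replace('.', '/') + ".class";
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(entryName) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java JarClassCheck <jar> <class name>");
            System.exit(2);
        }
        boolean found = jarContainsClass(args[0], args[1]);
        System.out.println(args[1] + (found ? " found in " : " NOT found in ") + args[0]);
    }
}
```

When running this from a shell, put the class name in single quotes so the '$' isn't expanded.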
But I still couldn't get it to work. Then a Stack Overflow post suggested calling job.setJar() to manually set the jar by its full path.
This worked. Why?
Duh ... Hadoop couldn't find the jar I was submitting. So here's a step 4 to try:
4. Make sure your jar can be found in the environment variable HADOOP_CLASSPATH.
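A quick way to check this step is to split the variable and look for the jar. The sketch below is only a rough approximation and does not reproduce how Hadoop itself resolves its classpath; ClasspathCheck and isOnClasspath are hypothetical names. It treats an entry as a match if it is the jar itself or a dir/* wildcard covering the jar's directory.

```java
import java.io.File;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Hadoop.
class ClasspathCheck {

    /** Returns true if jarPath is listed in classpath directly or via a dir/* entry. */
    static boolean isOnClasspath(String classpath, String jarPath) {
        if (classpath == null || classpath.isEmpty()) {
            return false;
        }
        Path jar = Paths.get(jarPath).toAbsolutePath().normalize();
        for (String entry : classpath.split(Pattern.quote(File.pathSeparator))) {
            if (entry.isEmpty()) {
                continue;
            }
            if (entry.endsWith("*")) {
                // Wildcard entry: covers .jar files directly inside that directory.
                String dirPart = entry.substring(0, entry.length() - 1);
                Path dir = Paths.get(dirPart.isEmpty() ? "." : dirPart).toAbsolutePath().normalize();
                boolean isJar = jarPath.toLowerCase().endsWith(".jar");
                if (isJar && dir.equals(jar.getParent())) {
                    return true;
                }
            } else if (Paths.get(entry).toAbsolutePath().normalize().equals(jar)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java ClasspathCheck <jar>");
            System.exit(2);
        }
        String classpath = System.getenv("HADOOP_CLASSPATH");
        System.out.println("HADOOP_CLASSPATH=" + classpath);
        System.out.println(isOnClasspath(classpath, args[0]) ? "jar is listed" : "jar is NOT listed");
    }
}
```

Run it in the same shell you submit the job from, since that is the environment whose HADOOP_CLASSPATH matters.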
An alternative may be to make sure your jar is on the classpath configured in yarn.application.classpath, but I didn't try this.
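As for why job.setJarByClass() can come up empty while job.setJar() works: as far as I understand it, setJarByClass() asks the class loader where the given class was loaded from and only sets the job jar if that location is inside a jar file. If the class was picked up from a plain classes directory instead, no jar is set and you get the "No job jar file set" warning. The sketch below approximates that lookup with plain JDK calls so you can see what your driver class resolves to. It is not Hadoop's actual code, and JarLocator and its methods are names invented for this example.

```java
import java.net.URL;

// Hypothetical helper for illustration; an approximation, not Hadoop's implementation.
class JarLocator {

    /** Returns the path of the jar cls was loaded from, or null if it did not come from a jar. */
    static String findContainingJar(Class<?> cls) {
        String resource = cls.getName().replace('.', '/') + ".class";
        ClassLoader loader = cls.getClassLoader();
        // Classes loaded by the bootstrap loader report a null class loader.
        URL url = (loader != null)
                ? loader.getResource(resource)
                : ClassLoader.getSystemResource(resource);
        return jarPathFromUrl(url);
    }

    /** Extracts /path/to/job.jar from jar:file:/path/to/job.jar!/Some.class, else null. */
    static String jarPathFromUrl(URL url) {
        if (url == null || !"jar".equals(url.getProtocol())) {
            return null; // e.g. a file: URL pointing into a classes directory
        }
        String path = url.getPath(); // file:/path/to/job.jar!/Some.class
        int bang = path.indexOf('!');
        if (bang >= 0) {
            path = path.substring(0, bang);
        }
        if (path.startsWith("file:")) {
            path = path.substring("file:".length());
        }
        return path; // note: percent-escaped characters are not decoded here
    }

    public static void main(String[] args) {
        // Swap in your own driver class to see where it is being loaded from.
        String jar = findContainingJar(JarLocator.class);
        System.out.println(jar != null ? "loaded from jar: " + jar : "not loaded from a jar");
    }
}
```

If this prints "not loaded from a jar" for your driver class, that would be consistent with the warning above, and passing the full jar path to job.setJar() sidesteps the lookup entirely.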