
Counting the total number of lines in the final MapReduce output in Hadoop

Posted by: admin, July 15, 2018

Questions:

Currently the number of reduce tasks in my job is set with job.setNumReduceTasks(100);

So my final output directory, which is in S3, looks like the following:

/output/part-r-00000.gz
/output/part-r-00001.gz
... etc

In order to count all the lines, I have to manually download and unzip every file, then go through each one and add up the line counts.

Is there a total line count metric stored somewhere in the Hadoop context?
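For reference, the manual counting step described above can be sketched roughly as follows. This is only an illustration, assuming the part files have already been downloaded into a local directory; it does not talk to S3, and the class name PartFileLineCounter and the default directory "output" are made up for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.GZIPInputStream;

// Hypothetical helper: counts lines across locally downloaded part-r-*.gz files.
public class PartFileLineCounter {

    // Counts the lines in a single gzip-compressed text file without unzipping it to disk.
    public static long countLines(Path gzFile) throws IOException {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(Files.newInputStream(gzFile)), StandardCharsets.UTF_8))) {
            long lines = 0;
            while (reader.readLine() != null) {
                lines++;
            }
            return lines;
        }
    }

    // Sums the line counts of every file in dir whose name matches part-r-*.gz.
    public static long countAll(Path dir) throws IOException {
        long total = 0;
        try (DirectoryStream<Path> parts = Files.newDirectoryStream(dir, "part-r-*.gz")) {
            for (Path part : parts) {
                total += countLines(part);
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Assumed layout: the directory holding the downloaded part files is the first argument.
        Path dir = Paths.get(args.length > 0 ? args[0] : "output");
        System.out.println(countAll(dir));
    }
}
```

This still requires pulling all 100 files down first, which is exactly the cost I would like to avoid.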

Answers: