Hadoop MCQs - CSE MCQs Questions and Answers


Which of the following scripts finds the scripts that use only the default parallelism?

A.
a = load '/mapred/history/done' using HadoopJobHistoryLoader() as (j:map[], m:map[], r:map[]);
b = foreach a generate (Chararray) j#'STATUS' as status, j#'PIG_SCRIPT_ID' as id, j#'USER' as user, j#'JOBNAME' as script_name, j#'JOBID' as job;
c = filter b by status != 'SUCCESS';
dump c;

B.
a = load '/mapred/history/done' using HadoopJobHistoryLoader() as (j:map[], m:map[], r:map[]);
b = foreach a generate j#'PIG_SCRIPT_ID' as id, j#'USER' as user, j#'JOBNAME' as script_name, (Long) r#'NUMBER_REDUCES' as reduces;
c = group b by (id, user, script_name) parallel 10;
d = foreach c generate group.user, group.script_name, MAX(b.reduces) as max_reduces;
e = filter d by max_reduces == 1;
dump e;

C.
a = load '/mapred/history/done' using HadoopJobHistoryLoader() as (j:map[], m:map[], r:map[]);
b = foreach a generate j#'PIG_SCRIPT_ID' as id, j#'USER' as user, j#'QUEUE_NAME' as queue;
c = group b by (id, user, queue) parallel 10;
d = foreach c generate group.user, group.queue, COUNT(b);
dump d;

D.
None of the mentioned

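Answer: Option B

Explanation:

The first map in the schema (j) contains job-related entries; the script in Option B also reads the reducer count from the third map (r#'NUMBER_REDUCES'). It groups the history entries by script, takes the maximum number of reducers any job of that script used, and keeps only the scripts whose maximum is 1, which the query treats as the default parallelism. The other two scripts answer different questions: one lists jobs whose status is not 'SUCCESS', and the other counts history entries per script, user, and queue.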
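The group, MAX, and filter steps of the script with the reducer count can be mimicked outside Pig to see why it isolates single-reducer scripts. The following is only a rough Python sketch, not how Pig executes the query: the sample records, their values, and the function name default_parallelism_scripts are hypothetical, and only the field names (id, user, script_name, reduces) are taken from the aliases in the script.

```python
from collections import defaultdict

# Hypothetical job-history records; field names mirror the aliases in the Pig script.
jobs = [
    {"id": "s1", "user": "alice", "script_name": "daily.pig", "reduces": 1},
    {"id": "s1", "user": "alice", "script_name": "daily.pig", "reduces": 1},
    {"id": "s2", "user": "bob", "script_name": "join.pig", "reduces": 1},
    {"id": "s2", "user": "bob", "script_name": "join.pig", "reduces": 20},
]


def default_parallelism_scripts(records):
    """Return (user, script_name, max_reduces) for scripts whose jobs never used more than one reducer."""
    # Equivalent of: group b by (id, user, script_name) and MAX(b.reduces)
    max_reduces = defaultdict(int)
    for rec in records:
        key = (rec["id"], rec["user"], rec["script_name"])
        max_reduces[key] = max(max_reduces[key], rec["reduces"])
    # Equivalent of: filter d by max_reduces == 1
    return [
        (user, name, reduces)
        for (_script_id, user, name), reduces in max_reduces.items()
        if reduces == 1
    ]


result = default_parallelism_scripts(jobs)
print(result)
```

In this made-up data, join.pig is dropped because one of its jobs ran with 20 reducers, while daily.pig is kept because every one of its jobs ran with a single reducer.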
