Discussion :: Hadoop MCQs
- Which of the following code snippets finds scripts that use only the default parallelism?
A.
a = load '/mapred/history/done' using HadoopJobHistoryLoader() as (j:map[], m:map[], r:map[]);
b = foreach a generate (Chararray) j#'STATUS' as status, j#'PIG_SCRIPT_ID' as id, j#'USER' as user, j#'JOBNAME' as script_name, j#'JOBID' as job;
c = filter b by status != 'SUCCESS';
dump c;
B.
a = load '/mapred/history/done' using HadoopJobHistoryLoader() as (j:map[], m:map[], r:map[]);
b = foreach a generate j#'PIG_SCRIPT_ID' as id, j#'USER' as user, j#'JOBNAME' as script_name, (Long) r#'NUMBER_REDUCES' as reduces;
c = group b by (id, user, script_name) parallel 10;
d = foreach c generate group.user, group.script_name, MAX(b.reduces) as max_reduces;
e = filter d by max_reduces == 1;
dump e;
C.
a = load '/mapred/history/done' using HadoopJobHistoryLoader() as (j:map[], m:map[], r:map[]);
b = foreach a generate j#'PIG_SCRIPT_ID' as id, j#'USER' as user, j#'QUEUE_NAME' as queue;
c = group b by (id, user, queue) parallel 10;
d = foreach c generate group.user, group.queue, COUNT(b);
dump d;
D.
None of the mentioned
Answer : Option B
Explanation :
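In the schema (j:map[], m:map[], r:map[]), the first map (j) contains job-related entries; the second and third maps contain map-task and reduce-task entries. Option B groups the job records by script, takes the maximum number of reducers used by any job of that script, and keeps only the scripts whose maximum is 1, that is, scripts that never asked for more than the default parallelism. Option A instead lists jobs whose status is not SUCCESS, and option C counts job records per script, user, and queue.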
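The group, MAX, and filter steps of option B can be sketched in plain Python. This is only an illustration, not the loader itself: the `jobs` records and the function name `scripts_with_default_parallelism` are hypothetical, and the dictionary keys simply mirror the map keys used in the Pig script.

```python
from collections import defaultdict

# Hypothetical job-history records; keys mirror those read in option B.
jobs = [
    {"PIG_SCRIPT_ID": "s1", "USER": "alice", "JOBNAME": "etl.pig", "NUMBER_REDUCES": 1},
    {"PIG_SCRIPT_ID": "s1", "USER": "alice", "JOBNAME": "etl.pig", "NUMBER_REDUCES": 1},
    {"PIG_SCRIPT_ID": "s2", "USER": "bob", "JOBNAME": "report.pig", "NUMBER_REDUCES": 1},
    {"PIG_SCRIPT_ID": "s2", "USER": "bob", "JOBNAME": "report.pig", "NUMBER_REDUCES": 10},
    {"PIG_SCRIPT_ID": "s3", "USER": "carol", "JOBNAME": "maponly.pig", "NUMBER_REDUCES": 0},
]


def scripts_with_default_parallelism(records):
    """Return (user, script_name) pairs whose jobs never used more than one reducer."""
    max_reduces = defaultdict(int)
    for rec in records:
        # Same grouping key as: group b by (id, user, script_name)
        key = (rec["PIG_SCRIPT_ID"], rec["USER"], rec["JOBNAME"])
        # Same aggregate as: MAX(b.reduces)
        max_reduces[key] = max(max_reduces[key], int(rec["NUMBER_REDUCES"]))
    # Same filter as: filter d by max_reduces == 1
    return sorted(
        (user, name)
        for (_script_id, user, name), most in max_reduces.items()
        if most == 1
    )


print(scripts_with_default_parallelism(jobs))
```

In this made-up data only the script run by alice qualifies: bob's script had one job with 10 reducers, and carol's script ran with no reducers at all, so its maximum is 0 rather than 1.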