CSE MCQs :: Hadoop MCQs :: Hadoop Pig

  1. Which of the following functions is used to read data in Pig?
     A. WRITE
     B. READ
     C. LOAD
     D. None of the mentioned
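For reference, data is read in Pig with the LOAD operator, optionally with a load function such as the built-in PigStorage and a declared schema. A minimal sketch (the file path and field names below are hypothetical):

```pig
-- Read a tab-delimited file into a relation with an explicit schema
-- ('input/users.txt' and the field names are illustrative)
users = LOAD 'input/users.txt' USING PigStorage('\t')
        AS (name:chararray, age:int);
DUMP users;  -- print the tuples of the relation
```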


  2. You can run Pig in interactive mode using the ______ shell.
     A. Grunt
     B. FS
     C. HDFS
     D. None of the mentioned
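Pig's interactive shell is Grunt, typically started with `pig -x local` (local mode) or plain `pig` (cluster mode). A sketch of statements entered at the prompt (the data file is hypothetical):

```pig
-- Typed at the grunt> prompt after running: pig -x local
lines = LOAD 'data.txt' AS (line:chararray);
DESCRIBE lines;  -- diagnostic operators execute immediately in Grunt
DUMP lines;      -- triggers execution of the pipeline
```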


  3. __________ is a framework for collecting and storing script-level statistics for Pig Latin.
     A. Pig Stats
     B. PStatistics
     C. Pig Statistics
     D. None of the mentioned


  4. The ________ class mimics the behavior of the Main class but gives users a statistics object back.
     A. PigRun
     B. PigRunner
     C. RunnerPig
     D. None of the mentioned


  5. ___________ returns a list of HDFS files to ship to the distributed cache.
     A. relativeToAbsolutePath()
     B. setUdfContextSignature()
     C. getCacheFiles()
     D. getShipFiles()


  6. The loader should use the ______ method to communicate the load information to the underlying InputFormat.
     A. relativeToAbsolutePath()
     B. setUdfContextSignature()
     C. getCacheFiles()
     D. setLocation()


  7. Which of the following commands can be used for debugging?
     A. exec
     B. execute
     C. error
     D. throw
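For context, Grunt's `exec` command runs a Pig script in a separate context (aliases defined in the script are not brought into the shell), which makes it useful when debugging scripts; Pig also provides diagnostic operators. A hedged sketch ('myscript.pig', the data file, and the schema are hypothetical):

```pig
-- At the grunt> prompt:
-- exec myscript.pig       -- run a script in isolation for debugging
a = LOAD 'data.txt' AS (name:chararray, age:int);
DESCRIBE a;    -- show the schema of relation a
EXPLAIN a;     -- show the logical, physical, and execution plans
ILLUSTRATE a;  -- step through execution on a small sample of the data
```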


  8. Which of the following files contains user-defined functions (UDFs)?
     A. script2-local.pig
     B. pig.jar
     C. tutorial.jar
     D. excite.log.bz2
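In the Pig tutorial, the UDFs ship in tutorial.jar and are made visible to a script with REGISTER, then invoked by fully qualified class name. A sketch in the spirit of the tutorial scripts (the field names and the ToLower UDF are assumptions based on the tutorial):

```pig
REGISTER ./tutorial.jar;  -- make the tutorial's UDFs available
raw = LOAD 'excite.log.bz2' USING PigStorage('\t')
      AS (user, time, query);
-- Invoke a UDF from the registered jar by its fully qualified name
clean = FOREACH raw GENERATE user, time,
        org.apache.pig.tutorial.ToLower(query) AS query;
```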


  9. Which of the following scripts is used to check for scripts that have failed jobs?
     A.

       a = load '/mapred/history/done' using HadoopJobHistoryLoader() as (j:map[], m:map[], r:map[]);
       b = foreach a generate (Chararray) j#'STATUS' as status, j#'PIG_SCRIPT_ID' as id, j#'USER' as user, j#'JOBNAME' as script_name, j#'JOBID' as job;
       c = filter b by status != 'SUCCESS';
       dump c;

     B.

       a = load '/mapred/history/done' using HadoopJobHistoryLoader() as (j:map[], m:map[], r:map[]);
       b = foreach a generate j#'PIG_SCRIPT_ID' as id, j#'USER' as user, j#'JOBNAME' as script_name, (Long) r#'NUMBER_REDUCES' as reduces;
       c = group b by (id, user, script_name) parallel 10;
       d = foreach c generate group.user, group.script_name, MAX(b.reduces) as max_reduces;
       e = filter d by max_reduces == 1;
       dump e;

     C.

       a = load '/mapred/history/done' using HadoopJobHistoryLoader() as (j:map[], m:map[], r:map[]);
       b = foreach a generate j#'PIG_SCRIPT_ID' as id, j#'USER' as user, j#'QUEUE_NAME' as queue;
       c = group b by (id, user, queue) parallel 10;
       d = foreach c generate group.user, group.queue, COUNT(b);
       dump d;

     D. None of the mentioned


  10. Which of the following scripts is used to find scripts that use only the default parallelism?
      A.

        a = load '/mapred/history/done' using HadoopJobHistoryLoader() as (j:map[], m:map[], r:map[]);
        b = foreach a generate (Chararray) j#'STATUS' as status, j#'PIG_SCRIPT_ID' as id, j#'USER' as user, j#'JOBNAME' as script_name, j#'JOBID' as job;
        c = filter b by status != 'SUCCESS';
        dump c;

      B.

        a = load '/mapred/history/done' using HadoopJobHistoryLoader() as (j:map[], m:map[], r:map[]);
        b = foreach a generate j#'PIG_SCRIPT_ID' as id, j#'USER' as user, j#'JOBNAME' as script_name, (Long) r#'NUMBER_REDUCES' as reduces;
        c = group b by (id, user, script_name) parallel 10;
        d = foreach c generate group.user, group.script_name, MAX(b.reduces) as max_reduces;
        e = filter d by max_reduces == 1;
        dump e;

      C.

        a = load '/mapred/history/done' using HadoopJobHistoryLoader() as (j:map[], m:map[], r:map[]);
        b = foreach a generate j#'PIG_SCRIPT_ID' as id, j#'USER' as user, j#'QUEUE_NAME' as queue;
        c = group b by (id, user, queue) parallel 10;
        d = foreach c generate group.user, group.queue, COUNT(b);
        dump d;

      D. None of the mentioned