MCQ on Big Data Computing with answers

In this blog post, we share MCQs on Big Data Computing. These multiple-choice questions on Big Data analytics are useful for technical exams in university placements, entrance exams, and similar tests.

1. The maximum number of super keys for the relation schema R(E,F,G,H) with E as the key is
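The counting behind this question can be checked by brute force. The sketch below assumes the usual textbook definition (a super key is any attribute set that contains a key) and uses a hypothetical helper, `count_superkeys`, that is not part of the original question.

```python
from itertools import combinations

def count_superkeys(attributes, key):
    """Count the attribute subsets that contain every attribute of `key`."""
    key = set(key)
    count = 0
    # Try every non-empty subset of the schema's attributes.
    for size in range(1, len(attributes) + 1):
        for subset in combinations(attributes, size):
            if key <= set(subset):  # subset includes the whole key
                count += 1
    return count

# R(E, F, G, H) with E as the key: E plus any subset of {F, G, H}.
print(count_superkeys("EFGH", "E"))  # 8, i.e. 2**3
```
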

2. Consider a B+-tree in which the maximum number of keys in a node is 5. What is the minimum number of keys in any non-root node?

3. Consider the join of a relation R with a relation S. If R has m tuples and S has n tuples, then the maximum and minimum sizes of the join, respectively, are:

m + n & 0

mn & 0

m + n & | m – n |

mn & m + n
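To see the two extremes of a join, here is a minimal sketch of a nested-loop join. It assumes a general join with an arbitrary predicate and no key or foreign-key constraints between R and S; the `join` helper and the sample data are illustrative only.

```python
def join(r, s, predicate):
    """Nested-loop join: keep every (a, b) pair that satisfies the predicate."""
    return [(a, b) for a in r for b in s if predicate(a, b)]

r = [1, 2, 3]   # m = 3 tuples
s = [10, 20]    # n = 2 tuples

# Every pair matches: the join degenerates into the cross product (m * n rows).
largest = join(r, s, lambda a, b: True)

# No pair matches: the join is empty.
smallest = join(r, s, lambda a, b: a == b)

print(len(largest), len(smallest))  # 6 0
```
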

4. Which one of the following is NOT a part of the ACID properties of database transactions?

Atomicity

Consistency

Isolation

Deadlock-freedom

5. In the IPv4 addressing format, the number of networks allowed under Class C addresses is:

2^14

2^7

2^21

2^24
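The arithmetic for classful addressing can be sketched as follows. The table assumes the standard classful layout (a fixed bit pattern at the start of the address, followed by the remaining network bits) and counts bit patterns only, ignoring reserved ranges; `CLASSFUL_LAYOUT` and `network_count` are names made up for this example.

```python
# Classful IPv4 layout: (bits in the network part, fixed leading bits).
CLASSFUL_LAYOUT = {
    "A": (8, 1),    # leading bit 0
    "B": (16, 2),   # leading bits 10
    "C": (24, 3),   # leading bits 110
}

def network_count(address_class):
    """Number of distinct network IDs: 2 ** (network bits - fixed bits)."""
    network_bits, fixed_bits = CLASSFUL_LAYOUT[address_class]
    return 2 ** (network_bits - fixed_bits)

for cls in "ABC":
    print(cls, network_count(cls))
```

For Class C, 24 network bits minus the 3 fixed bits leaves 21 free bits.
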

6. One of the header fields in an IP datagram is the Time to Live (TTL) field. Which of the following statements best explains the need for this field?

It can be used to prioritize packets

It can be used to reduce delays

It can be used to optimize throughput

It can be used to prevent packet looping
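A toy simulation can show what a hop counter does to a packet caught in a routing loop. This is a simplified model, not real IP forwarding: the `forward` function, the node names, and the routing tables are all hypothetical, and each hop simply decrements the TTL by one.

```python
def forward(next_hop, start, destination, ttl):
    """Follow next_hop entries until delivery or until the TTL runs out."""
    node = start
    hops = 0
    while node != destination:
        if ttl == 0:
            return ("dropped", hops)   # TTL exhausted: discard the packet
        ttl -= 1                       # each hop consumes one unit of TTL
        node = next_hop[node]
        hops += 1
    return ("delivered", hops)

# A misconfigured pair of routers that bounce the packet back and forth.
looping_routes = {"A": "B", "B": "A"}
print(forward(looping_routes, "A", "C", ttl=8))   # ('dropped', 8)

# A correct path A -> B -> C.
good_routes = {"A": "B", "B": "C"}
print(forward(good_routes, "A", "C", ttl=8))      # ('delivered', 2)
```

Without the TTL check, the first call would never return.
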

7. The address resolution protocol (ARP) is used for

Finding the IP address from the DNS

Finding the IP address of the default gateway

Finding the IP address that corresponds to a MAC address

Finding the MAC address that corresponds to an IP address

8. A process executes the code

fork();

9. The total number of child processes created is
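The counting argument for the fork() question can be sketched without spawning real processes. The number of fork() calls is left as a parameter (`n_calls`, a name introduced here) rather than fixed to any particular value; the sketch assumes the calls run one after another and that none of them fails.

```python
def simulate_forks(n_calls):
    """Return the number of child processes after n_calls sequential fork() calls."""
    processes = 1  # the original parent
    for _ in range(n_calls):
        # Every live process executes fork() and gains exactly one child,
        # so the process count doubles at each call.
        processes *= 2
    return processes - 1  # everything except the original parent is a child

for n in (1, 2, 3):
    print(n, simulate_forks(n))  # 1 1, 2 3, 3 7
```

In general, n sequential calls leave 2^n processes in total, of which 2^n - 1 are children.
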

10. Which of the following page replacement algorithms suffers from Belady’s anomaly?

FIFO

LRU

Optimal Page Replacement

Both LRU and FIFO
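Belady's anomaly means that adding page frames can increase the number of page faults. The sketch below counts faults for FIFO and LRU on a commonly used reference string; the function names and the choice of reference string are assumptions made for illustration.

```python
from collections import OrderedDict, deque

def fifo_faults(refs, frames):
    """Count page faults under first-in, first-out replacement."""
    memory = deque()
    faults = 0
    for page in refs:
        if page not in memory:
            faults += 1
            if len(memory) == frames:
                memory.popleft()          # evict the oldest loaded page
            memory.append(page)
    return faults

def lru_faults(refs, frames):
    """Count page faults under least-recently-used replacement."""
    memory = OrderedDict()
    faults = 0
    for page in refs:
        if page in memory:
            memory.move_to_end(page)      # mark as most recently used
        else:
            faults += 1
            if len(memory) == frames:
                memory.popitem(last=False)  # evict the least recently used
            memory[page] = True
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(refs, 3), fifo_faults(refs, 4))  # 9 10
print(lru_faults(refs, 3), lru_faults(refs, 4))    # 10 8
```

On this string, FIFO incurs more faults with 4 frames than with 3, while LRU incurs fewer.
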

11. ________________ is responsible for allocating system resources to the various applications running in a Hadoop cluster and scheduling tasks to be executed on different cluster nodes.

Hadoop Common

Hadoop Distributed File System (HDFS)

Hadoop YARN

Hadoop Map Reduce

12. Which of the following tools is designed for efficiently transferring bulk data between Apache Hadoop and structured datastores such as relational databases?

Pig

Mahout

Apache Sqoop

Flume

13. _________________ is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data.

Flume

Apache Sqoop

Pig

Mahout

14. _______________ refers to the connectedness of big data.

Value

Veracity

Velocity

Valence

15. Consider the following statements:

Statement 1: Volatility refers to the data velocity relative to the timescale of the event being studied.

Statement 2: Viscosity refers to the rate of data loss and the stable lifetime of data.

Only statement 1 is true

Only statement 2 is true

Both statements are true

Both statements are false

16. ________________ refers to the biases, noise, and abnormality in data, and to the trustworthiness of data.

Value

Veracity

Velocity

Volume

17. _____________ brings scalable parallel database technology to Hadoop and allows users to submit low-latency queries to data stored in HDFS or HBase without requiring a ton of data movement and manipulation.

Apache Sqoop

Mahout

Flume

Impala

18. True or False? NoSQL databases store unstructured data with no particular schema.

True

False

19. ____________ is a highly reliable distributed coordination kernel, which can be used for distributed locking, configuration management, leadership election, work queues, etc.

Apache Sqoop

Mahout

ZooKeeper

Flume

20. True or False? MapReduce is a programming model and an associated implementation for processing and generating large data sets.

True

False

21. ____ is the slave/worker node and holds the user data in the form of data blocks.

NameNode

Data block

Replication

DataNode

22. _______________ works as a master server that manages the file system namespace and regulates client access to files. It also keeps track of where the data is on the DataNodes and how the blocks are distributed.

Name Node

Data block

Replication

Data Node

23. The number of maps in MapReduce is usually driven by the total size of ____________

Inputs

Outputs

Tasks

None of the mentioned

24. True or False? The main duties of the task tracker are to break down the received job (a big computation) into small parts, allocate the partial computations (tasks) to the slave nodes, and monitor the progress and reports of task execution from the slaves.

True

False

25. Point out the correct statement in context of YARN:

YARN is highly scalable.

YARN enhances a Hadoop compute cluster in many ways

YARN extends the power of Hadoop to incumbent and new technologies found within the data center

All of the mentioned

26. The namenode knows that the datanode is active using a mechanism known as

Heartbeats

Datapulse

h-signal

Active-pulse

27. True or False? HDFS performs replication, although it results in data redundancy.

True

False

28. _______________function processes a key/value pair to generate a set of intermediate key/value pairs.

Map

Reduce

Both Map and Reduce

None of the mentioned
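The roles of the two functions in the MapReduce model can be sketched with a single-process word count. This is an illustration of the programming model only, not Hadoop's API: `map_fn`, `reduce_fn`, and `run_job` are hypothetical names, and the grouping step stands in for the framework's shuffle phase.

```python
from collections import defaultdict

def map_fn(doc_id, text):
    """Take one input key/value pair and emit intermediate (word, 1) pairs."""
    for word in text.split():
        yield word.lower(), 1

def reduce_fn(word, counts):
    """Merge all intermediate values that share the same key."""
    return word, sum(counts)

def run_job(documents):
    """Tiny in-memory driver: map, group by key, then reduce."""
    grouped = defaultdict(list)
    for doc_id, text in documents.items():
        for key, value in map_fn(doc_id, text):
            grouped[key].append(value)   # stand-in for the shuffle phase
    return dict(reduce_fn(key, values) for key, values in grouped.items())

print(run_job({"d1": "big data", "d2": "Big clusters"}))
# {'big': 2, 'data': 1, 'clusters': 1}
```
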

Thanks for visiting our website for MCQs on Big Data Computing.
