MCQ on Big Data Computing with answers

In this blog post, we share MCQs on Big Data Computing. These multiple-choice questions on Big Data analytics are useful for university technical exams, placement tests, and entrance exams.

1. The maximum number of super keys for the relation schema R(E,F,G,H) with E as the key is

 5

 6

 7

 8
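
The count in question 1 can be checked by brute force. The sketch below assumes E is the only candidate key, so every attribute subset that contains E is a super key; the function name `super_keys` is made up for this illustration.

```python
from itertools import combinations

def super_keys(attributes, key):
    """Return every attribute subset that contains the whole candidate key."""
    key = set(key)
    result = []
    for size in range(1, len(attributes) + 1):
        for subset in combinations(attributes, size):
            if key <= set(subset):  # subset includes every key attribute
                result.append(subset)
    return result

# Assumption: E is the only candidate key of R(E, F, G, H).
keys = super_keys(["E", "F", "G", "H"], ["E"])
print(len(keys))  # 8, i.e. 2^3 choices for including F, G, H
```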

2. Consider a B+-tree in which the maximum number of keys in a node is 5. What is the minimum number of keys in any non-root node?

 1

 2

 3

 4
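
For question 2, textbook conventions differ slightly, so treat the following as a sketch under one stated assumption: a non-root internal node that may hold at most p child pointers must keep at least ceil(p/2) of them, and a node with c children holds c - 1 keys. The helper name is hypothetical.

```python
import math

def min_keys_non_root(max_keys):
    """Minimum keys in a non-root node under the ceil(p/2) pointer rule."""
    max_children = max_keys + 1                  # p pointers for p - 1 keys
    min_children = math.ceil(max_children / 2)   # assumed occupancy rule
    return min_children - 1

print(min_keys_non_root(5))  # 2
```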

3. Consider the join of a relation R, with a relation S. If R has m number of tuples and S has n number of tuples then the maximum and minimum sizes of the join respectively are:

 m + n & 0

 mn & 0

 m + n & | m – n |

 mn & m + n
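
The two extremes in question 3 can be seen with a tiny nested-loop join. This is a minimal sketch: the relations are plain lists and the join condition is passed in as a function, both chosen only for illustration.

```python
def join(r, s, predicate):
    """Nested-loop join: keep every pair (a, b) that satisfies the predicate."""
    return [(a, b) for a in r for b in s if predicate(a, b)]

r = [1, 2, 3]   # m = 3 tuples
s = [10, 20]    # n = 2 tuples

# Condition that always holds: the join degenerates to the Cartesian product.
largest = join(r, s, lambda a, b: True)
# Equality condition on values that never match: the join is empty.
smallest = join(r, s, lambda a, b: a == b)

print(len(largest), len(smallest))  # 6 0
```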

4. Which one of the following is NOT a part of the ACID properties of database transactions?

 Atomicity

 Consistency

 Isolation

 Deadlock-freedom

5. In the IPv4 addressing format, the number of networks allowed under Class C addresses is:

 2^14

 2^7

 2^21

 2^24
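
The arithmetic behind question 5 follows from the classful layout: a Class C address uses the first 24 bits as the network field, and the leading 3 bits are fixed to 110. A short check, with constant names chosen for this example:

```python
NETWORK_FIELD_BITS = 24  # Class C: first three octets identify the network
CLASS_PREFIX_BITS = 3    # the leading bits are fixed to 110

class_c_networks = 2 ** (NETWORK_FIELD_BITS - CLASS_PREFIX_BITS)
print(class_c_networks)  # 2097152, i.e. 2^21
```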

6. One of the header fields in an IP datagram is the Time to Live (TTL) field. Which of the following statements best explains the need for this field?

 It can be used to prioritize packets

 It can be used to reduce delays

 It can be used to optimize throughput

 It can be used to prevent packet looping
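
To see why TTL matters in question 6, here is a simplified simulation of a routing loop. It is only a sketch: the routing table is a plain dictionary, the router names are invented, and real routers handle TTL expiry with more detail (for example, by sending an ICMP message).

```python
def route(next_hop, start, dest, ttl):
    """Forward a packet hop by hop; drop it once the TTL reaches zero."""
    router, hops = start, 0
    while router != dest:
        if ttl == 0:
            return ("dropped", hops)
        router = next_hop[router]  # hand the packet to the next router
        ttl -= 1                   # each hop consumes one unit of TTL
        hops += 1
    return ("delivered", hops)

# Misconfigured routes: A and B keep sending the packet to each other.
looping = {"A": "B", "B": "A"}
print(route(looping, "A", "C", ttl=8))   # ('dropped', 8)

# Correct routes: the packet reaches C well within its TTL.
working = {"A": "B", "B": "C"}
print(route(working, "A", "C", ttl=8))   # ('delivered', 2)
```

Without the TTL check, the first call would never terminate.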

7. The address resolution protocol (ARP) is used for

 Finding the IP address from the DNS

 Finding the IP address of the default gateway

 Finding the IP address that corresponds to a MAC address

 Finding the MAC address that corresponds to an IP address

8. A process executes the code

fork();

fork();

fork();

The total number of child processes created is

 3

 4

 7

 8
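
The count for the three fork() calls can be reasoned out without actually forking. The sketch below only models the process count: after each call, every existing process has created one child, so the total doubles. The function name is invented for this example.

```python
def processes_after_forks(n_forks):
    """Total processes alive after n consecutive fork() calls."""
    processes = 1  # the original process
    for _ in range(n_forks):
        processes *= 2  # every existing process creates one child
    return processes

total = processes_after_forks(3)
children = total - 1  # exclude the original process
print(total, children)  # 8 7
```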

10. Which of the following page replacement algorithms suffers from Belady’s anomaly?

 FIFO

 LRU

 Optimal Page Replacement

 Both LRU and FIFO
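
Belady's anomaly is easy to reproduce with a small FIFO simulation. The reference string below is the classic textbook example, not something taken from this quiz; with it, adding a fourth frame increases the number of page faults.

```python
from collections import deque

def fifo_faults(reference, frames):
    """Count page faults for FIFO replacement with a fixed number of frames."""
    memory = deque()
    faults = 0
    for page in reference:
        if page not in memory:
            faults += 1
            if len(memory) == frames:
                memory.popleft()  # evict the page that was loaded first
            memory.append(page)
    return faults

reference = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(reference, 3), fifo_faults(reference, 4))  # 9 10
```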

11. ________________ is responsible for allocating system resources to the various applications running in a Hadoop cluster and scheduling tasks to be executed on different cluster nodes. 

Hadoop Common 

Hadoop Distributed File System (HDFS) 

Hadoop YARN 

Hadoop Map Reduce 

12. Which of the following tools is designed for efficiently transferring bulk data between Apache Hadoop and structured datastores such as relational databases?

Pig 

Mahout 

Apache Sqoop 

Flume 

13. _________________ is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data.

Flume 

Apache Sqoop 

Pig 

Mahout 

14. _______________ refers to the connectedness of big data.

Value  

Veracity  

Velocity  

Valence 

15. Consider the following statements:  
 
Statement 1: Volatility refers to the data velocity relative to timescale of event being studied  
 
Statement 2: Viscosity refers to the rate of data loss and stable lifetime of data 

Only statement 1 is true 

Only statement 2 is true 

Both statements are true 

Both statements are false 

16. ________________ refers to the biases, noise, and abnormality in data, and to the trustworthiness of data.

Value 

Veracity 

Velocity 

Volume 

17. _____________ brings scalable parallel database technology to Hadoop and allows users to submit low-latency queries against data stored in HDFS or HBase without requiring a lot of data movement and manipulation.

Apache Sqoop 

Mahout 

Flume 

Impala 

True or False?
 
18. NoSQL databases store unstructured data with no particular schema. 

True 

False 

19. ____________ is a highly reliable distributed coordination kernel, which can be used for distributed locking, configuration management, leader election, work queues, etc.

Apache Sqoop 

Mahout 

ZooKeeper 

Flume 

True or False?
 
20. MapReduce is a programming model and an associated implementation for processing and generating large data sets. 

True 

False 

21. ____ is the slave/worker node and holds the user data in the form of data blocks.

NameNode 

Data block 

Replication 

DataNode 

22. _______________ works as a master server that manages the file system namespace, regulates client access to files, and keeps track of where the blocks are distributed across the DataNodes.

Name Node 

Data block 

Replication 

Data Node 

23. The number of maps in MapReduce is usually driven by the total size of ____________

Inputs 

Outputs 

Tasks 

None of the mentioned 
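
As a rough illustration of question 23, the number of map tasks can be estimated from the input size. This is a sketch under two assumptions: one map task per input split, and a split size equal to a 128 MB block. The function name is hypothetical, and real jobs split per file and can be tuned through configuration.

```python
def estimated_map_tasks(input_bytes, split_bytes=128 * 1024 * 1024):
    """Estimate map tasks as ceil(input size / split size)."""
    # Integer ceiling division avoids floating-point rounding on large sizes.
    return (input_bytes + split_bytes - 1) // split_bytes

ten_terabytes = 10 * 1024 ** 4
print(estimated_map_tasks(ten_terabytes))  # 81920
```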

24. True or False? The main duties of the task tracker are to break down the received job (a big computation) into small parts, allocate the partial computations (tasks) to the slave nodes, and monitor the progress and reports of task execution from the slaves.

True 

False 

25. Point out the correct statement in the context of YARN:

YARN is highly scalable. 

YARN enhances a Hadoop compute cluster in many ways 

YARN extends the power of Hadoop to incumbent and new technologies found within the data center 

All of the mentioned 

26. The NameNode knows that the DataNode is active using a mechanism known as

Heartbeats 

Datapulse 

h-signal 

Active-pulse 
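
The idea behind question 26 can be sketched as a small liveness monitor. This is not HDFS code: the class name, node names, and the 30-second timeout are all made up for illustration, and timestamps are passed in explicitly so the example is deterministic.

```python
class HeartbeatMonitor:
    """Track the last heartbeat per node and report which nodes look alive."""

    def __init__(self, timeout_seconds):
        self.timeout_seconds = timeout_seconds
        self.last_seen = {}

    def record(self, node, now):
        # Called whenever a worker node reports in.
        self.last_seen[node] = now

    def live_nodes(self, now):
        # A node counts as alive if its last heartbeat is recent enough.
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen <= self.timeout_seconds
        )

monitor = HeartbeatMonitor(timeout_seconds=30)
monitor.record("dn1", now=0)
monitor.record("dn2", now=0)
monitor.record("dn1", now=25)      # dn1 keeps reporting, dn2 goes silent
print(monitor.live_nodes(now=40))  # ['dn1']
```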

True or False?
 
27. HDFS performs replication, although it results in data redundancy.

True 

False 

28. _______________ function processes a key/value pair to generate a set of intermediate key/value pairs.

Map 

Reduce 

Both Map and Reduce 

None of the mentioned 
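
The description in question 28 can be made concrete with a single-machine word count. This is a minimal sketch of the programming model, not Hadoop's API: `map_fn`, `reduce_fn`, and `run` are names invented here, and the shuffle step is simulated with a dictionary.

```python
from collections import defaultdict

def map_fn(key, value):
    """Take one (line offset, line text) pair and emit intermediate (word, 1) pairs."""
    return [(word, 1) for word in value.split()]

def reduce_fn(key, values):
    """Merge all intermediate values that share the same key."""
    return (key, sum(values))

def run(lines):
    groups = defaultdict(list)
    for offset, line in enumerate(lines):
        for word, count in map_fn(offset, line):
            groups[word].append(count)  # simulated shuffle: group by key
    return dict(reduce_fn(word, counts) for word, counts in groups.items())

print(run(["big data", "big cluster"]))  # {'big': 2, 'data': 1, 'cluster': 1}
```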

Thanks for visiting our website for MCQs on Big Data Computing.
