In this blog post, we have shared MCQs on Big Data Computing. These multiple choice questions on Big Data analytics are very important for technical exams, university placements, entrance exams, etc.
MCQ on Big Data Computing with answers
1. The maximum number of super keys for the relation schema R(E,F,G,H) with E as the key is
5
6
7
8
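A quick way to check the arithmetic (taking the question at face value, with E as the only candidate key): every super key must contain E, and each of the remaining attributes F, G, H may or may not be included, giving 2^3 = 8 super keys.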
2. Consider a B+-tree in which the maximum number of keys in a node is 5. What is the minimum number of keys in any non-root node?
1
2
3
4
3. Consider the join of a relation R with a relation S. If R has m tuples and S has n tuples, then the maximum and minimum sizes of the join, respectively, are:
m + n & 0
mn & 0
m + n & | m – n |
mn & m + n
4. Which one of the following is NOT a part of the ACID properties of database transactions?
Atomicity
Consistency
Isolation
Deadlock-freedom
5. In the IPv4 addressing format, the number of networks allowed under Class C addresses is:
2^14
2^7
2^21
2^24
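The arithmetic behind this one: a Class C address uses 24 bits for the network portion, of which the leading three bits are fixed to 110, leaving 24 − 3 = 21 bits to identify networks, i.e., 2^21 possible networks.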
6. One of the header fields in an IP datagram is the Time to Live (TTL) field. Which of the following statements best explains the need for this field?
It can be used to prioritize packets
It can be used to reduce delays
It can be used to optimize throughput
It can be used to prevent packet looping
7. The address resolution protocol (ARP) is used for
Finding the IP address from the DNS
Finding the IP address of the default gateway
Finding the IP address that corresponds to a MAC address
Finding the MAC address that corresponds to an IP address
8. A process executes the code
fork();
fork();
fork();
The total number of child processes created is
3
4
7
8
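To see why the three fork() calls create seven children, here is a minimal, hedged C sketch (the printf and wait() lines are illustrative additions, not part of the original question): each fork() doubles the number of running processes, so 2 × 2 × 2 = 8 processes exist in total, and all but the original are children, i.e., 2^3 − 1 = 7.

#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void) {
    fork();
    fork();
    fork();
    /* All 2^3 = 8 processes reach this line; 7 of them are descendants of the original. */
    printf("pid=%d ppid=%d\n", (int)getpid(), (int)getppid());
    while (wait(NULL) > 0) { }  /* reap any direct children */
    return 0;
}

Running the program prints eight lines, one per process, which you can count to confirm the answer.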
9. Which of the following page replacement algorithms suffers from Belady’s anomaly?
FIFO
LRU
Optimal Page Replacement
Both LRU and FIFO
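For readers who want to verify this, here is a small, hedged C sketch (the reference string and frame counts are the classic textbook example, chosen only for illustration): simulating FIFO on the reference string 1,2,3,4,1,2,5,1,2,3,4,5 gives 9 page faults with 3 frames but 10 page faults with 4 frames, which is exactly Belady’s anomaly.

#include <stdio.h>

/* Count page faults for FIFO replacement with a given number of frames. */
static int fifo_faults(const int *ref, int n, int frames) {
    int mem[16];
    int used = 0, next = 0, faults = 0;
    for (int i = 0; i < n; i++) {
        int hit = 0;
        for (int j = 0; j < used; j++)
            if (mem[j] == ref[i]) { hit = 1; break; }
        if (!hit) {
            faults++;
            if (used < frames) mem[used++] = ref[i];          /* frame still free */
            else { mem[next] = ref[i]; next = (next + 1) % frames; }  /* evict oldest */
        }
    }
    return faults;
}

int main(void) {
    /* Classic reference string that exhibits Belady's anomaly under FIFO. */
    int ref[] = {1,2,3,4,1,2,5,1,2,3,4,5};
    int n = sizeof ref / sizeof ref[0];
    printf("3 frames: %d faults\n", fifo_faults(ref, n, 3));  /* prints 9 */
    printf("4 frames: %d faults\n", fifo_faults(ref, n, 4));  /* prints 10 */
    return 0;
}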
10. ________________ is responsible for allocating system resources to the various applications running in a Hadoop cluster and scheduling tasks to be executed on different cluster nodes.
Hadoop Common
Hadoop Distributed File System (HDFS)
Hadoop YARN
Hadoop MapReduce
11. Which of the following tools is designed for efficiently transferring bulk data between Apache Hadoop and structured datastores such as relational databases?
Pig
Mahout
Apache Sqoop
Flume
12. _________________ is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data.
Flume
Apache Sqoop
Pig
Mahout
13. _______________ refers to the connectedness of big data.
Value
Veracity
Velocity
Valence
14. Consider the following statements:
Statement 1: Volatility refers to the data velocity relative to the timescale of the event being studied
Statement 2: Viscosity refers to the rate of data loss and the stable lifetime of data
Only statement 1 is true
Only statement 2 is true
Both statements are true
Both statements are false
15. ________________ refers to the biases, noise, and abnormality in data, and the trustworthiness of data.
Value
Veracity
Velocity
Volume
16. _____________ brings scalable parallel database technology to Hadoop and allows users to submit low-latency queries to data stored in HDFS or HBase without requiring a lot of data movement and manipulation.
Apache Sqoop
Mahout
Flume
Impala
True or False?
17. NoSQL databases store unstructured data with no particular schema.
True
False
18. ____________ is a highly reliable distributed coordination kernel, which can be used for distributed locking, configuration management, leadership election, work queues, etc.
Apache Sqoop
Mahout
ZooKeeper
Flume
True or False?
19. MapReduce is a programming model and an associated implementation for processing and generating large data sets.
True
False
20. ____ is the slave/worker node and holds the user data in the form of data blocks.
NameNode
Data block
Replication
DataNode
21. _______________ works as the master server that manages the file system namespace, regulates access to files from clients, and keeps track of where the data blocks are stored and distributed across the DataNodes.
NameNode
Data block
Replication
DataNode
22. The number of maps in MapReduce is usually driven by the total size of ____________
Inputs
Outputs
Tasks
None of the mentioned
23. True or False? The main duties of the task tracker are to break down the received job (i.e., the big computation) into small parts, allocate the partial computations (i.e., tasks) to the slave nodes, monitor their progress, and collect reports of task execution from the slaves.
True
False
24. Point out the correct statement in the context of YARN:
YARN is highly scalable.
YARN enhances a Hadoop compute cluster in many ways
YARN extends the power of Hadoop to incumbent and new technologies found within the data center
All of the mentioned
25. The NameNode knows that a DataNode is active using a mechanism known as
Heartbeats
Datapulse
h-signal
Active-pulse
True or False?
26. HDFS performs replication, although it results in data redundancy.
True
False
27. The _______________ function processes a key/value pair to generate a set of intermediate key/value pairs.
Map
Reduce
Both Map and Reduce
None of the mentioned
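As a concrete illustration of this map step, here is a minimal, hedged sketch of a word-count mapper written in C in the style of Hadoop Streaming (which reads lines from standard input and emits tab-separated key/value pairs on standard output); the word-count task itself is an illustrative assumption, not part of the original question.

#include <stdio.h>
#include <string.h>

/* Word-count mapper: for every word read from stdin, emit the
   intermediate key/value pair "<word>\t1" on stdout. A separate
   reduce step would then sum the 1s for each distinct word. */
int main(void) {
    char line[4096];
    while (fgets(line, sizeof line, stdin) != NULL) {
        char *word = strtok(line, " \t\r\n");
        while (word != NULL) {
            printf("%s\t1\n", word);
            word = strtok(NULL, " \t\r\n");
        }
    }
    return 0;
}

You can inspect the intermediate pairs locally with something like cat input.txt | ./mapper; under Hadoop Streaming the compiled binary would typically be supplied via the -mapper option together with a matching reducer.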
Thanks for visiting our website for MCQs on Big Data Computing.