Data Mining MCQs

Multiple Choice Questions 33 Pages
Contributed by

Nidhi Chand
  • DNYANSAGAR INSTITUTE OF MANAGEMENT AND RESEARCH
    Prof. Dhananjay Bhavsar www.dimr.edu.in
    MCQ
    Specialization : Business Analytics
    Course Code : 206 Course Name Data Mining
    Sr.no
    Question
    Answer
    1
    According to analysts, for what can traditional IT systems provide a
    foundation when they’re integrated with big data technologies like
    Hadoop?
    a) Big data management and data mining
    b) Data warehousing and business intelligence
    c) Management of Hadoop clusters
    d) Collecting and storing unstructured data
    A
    2
    All of the following accurately describe Hadoop, EXCEPT:
    a) Open source
    b) Real-time
    c) Java-based
    d) Distributed computing approach
    B
    3
    __________ has the world’s largest Hadoop cluster.
    a) Apple
    b) Datamatics
    c) Facebook
    d) None of the mentioned
    C
    4
    What are the five V’s of Big Data?
    a) Volume
    b) Velocity
    c) Variety
    d) All the above
    D

    5
    _________ hides the limitations of Java behind a powerful and concise
    Clojure API for Cascading.
    a) Scalding
    b) Cascalog
    c) Hcatalog
    d) Hcalding
    B
    6
    What are the main components of Big Data?
    a) MapReduce
    b) HDFS
    c) YARN
    d) All of these
    D
    7
    What are the different features of Big Data Analytics?
    a) Open-Source
    b) Scalability
    c) Data Recovery
    d) All the above
    D
    8
    Define the Port Numbers for NameNode, Task Tracker and Job Tracker.
    a) NameNode
    b) Task Tracker
    c) Job Tracker
    d) All of the above
    D
    9
    This is an approach to selling goods and services in which a prospect
    explicitly agrees in advance to receive marketing information.
    a) customer managed relationship
    b) data mining
    c) permission marketing
    d) one-to-one marketing
    e) batch processing
    C

    10
    This is an XML-based metalanguage developed by the Business Process
    Management Initiative (BPMI) as a means of modeling business
    processes, much as XML is, itself, a metalanguage with the ability to
    model enterprise data.
    a. BizTalk
    b. BPML
    c. e-biz
    d. ebXML
    e. ECB
    B
    11
    This is a central point in an enterprise from which all customer contacts
    are managed.
    a. contact center
    b. help system
    c. multichannel marketing
    d. call center
    e. help desk
    A
    12
    This is the practice of dividing a customer base into groups of individuals
    that are similar in specific ways relevant to marketing, such as age,
    gender, interests, spending habits, and so on.
    a. customer service chat
    b. customer managed relationship
    c. customer life cycle
    d. customer segmentation
    e. change management
    D
    13
    Movie Recommendation systems are an example of:
    1. Classification
    2. Clustering
    3. Reinforcement Learning
    4. Regression
    Options:
    A. 2 Only
    B. 1 and 2
    C. 1 and 3
    D. 2 and 3
    E. 1, 2 and 3
    F. 1, 2, 3 and 4
    E

    14
    Sentiment Analysis is an example of:
    1. Regression
    2. Classification
    3. Clustering
    4. Reinforcement Learning
    Options:
    A. 1 Only
    B. 1 and 2
    C. 1 and 3
    D. 1, 2 and 3
    E. 1, 2 and 4
    F. 1, 2, 3 and 4
    E
    15
    Can decision trees be used for performing clustering?
    A. True
    B. False
    A
    16
    Which of the following is the most appropriate strategy for data
    cleaning before performing clustering analysis, given a less than
    desirable number of data points:
    1. Capping and flooring of variables
    2. Removal of outliers
    Options:
    A. 1 only
    B. 2 only
    C. 1 and 2
    D. None of the above
    A
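Capping and flooring keeps every record, which matters when data points are scarce: extreme values are clamped to chosen bounds instead of being dropped. The following is a minimal sketch, not a prescribed method. The function name `cap_and_floor`, the fixed bounds, and the sample values are all hypothetical; in practice the bounds are often taken from percentiles of the variable.

```python
def cap_and_floor(values, low, high):
    """Clamp each value into [low, high] instead of deleting outliers."""
    if low > high:
        raise ValueError("low must not exceed high")
    return [min(max(v, low), high) for v in values]


# Hypothetical monthly spend figures with two extreme values.
spend = [12, 15, 14, 400, 13, -50, 16]
clamped = cap_and_floor(spend, low=0, high=100)
print(clamped)  # [12, 15, 14, 100, 13, 0, 16]
```

The number of observations is unchanged, so the clustering step still sees all seven records.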
    17
    The problem of finding hidden structure in unlabeled data is called
    A. Supervised learning
    B. Unsupervised learning
    C. Reinforcement learning
    B
    18
    Task of inferring a model from labeled training data is called
    A. Unsupervised learning
    B. Supervised learning
    C. Reinforcement learning
    B
    19
    A telecommunication company wants to segment its customers into
    distinct groups in order to send appropriate subscription offers. This
    is an example of
    A. Supervised learning
    B. Data extraction
    C. Serration
    D. Unsupervised learning
    D

    20
    Self-organizing maps are an example of
    A. Unsupervised learning
    B. Supervised learning
    C. Reinforcement learning
    D. Missing data imputation
    A
    21
    You are given data about seismic activity in Japan, and you want to
    predict the magnitude of the next earthquake. This is an example of
    A. Supervised learning
    B. Unsupervised learning
    C. Serration
    D. Dimensionality reduction
    A
    22
    Assume you want to perform supervised learning and to predict the number
    of newborns according to the size of the stork population. This is an
    example of
    A. Classification
    B. Regression
    C. Clustering
    D. Structural equation modelling
    B
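Predicting a numeric outcome from a numeric feature is what distinguishes regression from classification. As a rough illustration, here is ordinary least squares for a single feature. The helper `fit_line` and the data are hypothetical and chosen only so the arithmetic is easy to follow.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: stork population (feature) and newborns (outcome).
storks = [10, 20, 30, 40]
newborns = [25, 45, 65, 85]
slope, intercept = fit_line(storks, newborns)
predicted = slope * 50 + intercept  # continuous-valued prediction
print(slope, intercept, predicted)  # 2.0 5.0 105.0
```

The model outputs a continuous value rather than a class label, which is why the task counts as regression.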
    23
    Discriminating between spam and ham e-mails is a classification task,
    true or false?
    A. True
    B. False
    A
    24
    In the example of predicting the number of babies based on the stork
    population size, the number of babies is the
    A. outcome
    B. feature
    C. attribute
    D. observation
    A
    25
    Data set {brown, black, blue, green, red} is an example of Select one:
    a. Continuous attribute
    b. Ordinal attribute
    c. Numeric attribute
    d. Nominal attribute
    D
    26
    Which of the following activities is NOT a data mining task?
    a. Predicting the future stock price of a company using historical records
    b. Monitoring and predicting failures in a hydropower plant
    c. Extracting the frequencies of a sound wave
    d. Monitoring the heart rate of a patient for abnormalities
    C
    27
    Data Visualization in mining cannot be done using Select one:
    a. Photos
    b. Graphs
    c. Charts
    d. Information Graphics
    A
    28
    Which of the following is not a data pre-processing method? Select one:
    a. Data Visualization
    b. Data Discretization
    c. Data Cleaning
    d. Data Reduction
    A
    29
    Dimensionality reduction reduces the data set size by removing _________
    Select one:
    a. composite attributes
    b. derived attributes
    c. relevant attributes
    d. irrelevant attributes
    D
    30
    The difference between supervised learning and unsupervised learning is given
    by Select one:
    a. unlike unsupervised learning, supervised learning needs labeled data
    b. unlike unsupervised learning, supervised learning can be used to detect
    outliers
    c. there is no difference
    d. unlike supervised learning, unsupervised learning can form new classes
    A
    31
    Which of the following activities is a data mining task? Select one:
    a. Monitoring the heart rate of a patient for abnormalities
    b. Extracting the frequencies of a sound wave
    c. Predicting the outcomes of tossing a (fair) pair of dice
    d. Dividing the customers of a company according to their profitability
    A
    32
    Identify the example of sequence data Select one:
    a. weather forecast
    b. data matrix
    c. market basket data
    d. genomic data
    D
    33
    To detect fraudulent usage of credit cards, the following data mining task
    should be used Select one:
    a. Outlier analysis
    b. prediction
    c. association analysis
    d. feature selection
    A
    34
    Which of the following is NOT an example of ordinal attributes? Select one:
    a. Zip codes
    b. Ordered numbers
    c. Movie ratings
    d. Military ranks
    A
    35
    Data scrubbing can be defined as Select one:
    a. Check field overloading
    b. Delete redundant tuples
    c. Use simple domain knowledge (e.g., postal code, spell-check) to detect
    errors and make corrections
    d. Analyzing data to discover rules and relationship to detect violators
    C
    36
    Which data mining task can be used for predicting wind velocities as a function
    of temperature, humidity, air pressure, etc.?
    Select one:
    a. Cluster Analysis
    b. Regression
    c. Classification
    d. Sequential pattern discovery
    B
    37
    In an asymmetric attribute Select one:
    a. No value is considered important over other values
    b. All values are equal
    c. Only non-zero values are important
    d. Range of values is important
    C
    38
    Which statement is not TRUE regarding a data mining task?
    Select one:
    a. Clustering is a descriptive data mining task
    b. Classification is a predictive data mining task
    c. Regression is a descriptive data mining task
    d. Deviation detection is a predictive data mining task
    C
    39
    Identify the example of Nominal attribute Select one:
    a. Temperature
    b. Salary
    c. Mass
    d. Gender
    D
    40
    Synonym for data mining is Select one:
    a. Data Warehouse
    b. Knowledge discovery in database
    c. Business intelligence
    d. OLAP
    B
    41
    Nominal and ordinal attributes can be collectively referred to as
    _________ attributes Select one:
    a. perfect
    b. qualitative
    c. consistent
    d. optimized
    B
    42
    Which of the following is not a data mining task?
    Select one:
    a. Feature Subset Detection
    b. Association Rule Discovery
    c. Regression
    d. Sequential Pattern Discovery
    A
    43
    Which of the following is an Entity identification problem? Select one:
    a. One person with different email address
    b. One person’s name written in different way
    c. Title for person
    d. One person with multiple phone numbers
    B
    44
    In Binning, we first sort data and partition into (equal-frequency) bins and then
    which of the following is not a valid step Select one:
    a. smooth by bin boundaries
    b. smooth by bin median
    c. smooth by bin means
    d. smooth by bin values
    D
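The standard smoothing choices after equal-frequency binning are bin means, bin medians, and bin boundaries. The sketch below shows what each one does to the data. The function names and the sample prices are illustrative assumptions rather than part of the course material.

```python
from statistics import mean, median


def equal_frequency_bins(values, bin_size):
    """Sort the data and cut it into consecutive bins of bin_size items."""
    ordered = sorted(values)
    return [ordered[i:i + bin_size] for i in range(0, len(ordered), bin_size)]


def smooth_by_means(bins):
    return [[mean(b)] * len(b) for b in bins]


def smooth_by_medians(bins):
    return [[median(b)] * len(b) for b in bins]


def smooth_by_boundaries(bins):
    """Replace each value with the nearer of its bin's min and max."""
    smoothed = []
    for b in bins:
        lo, hi = b[0], b[-1]  # bins are sorted, so these are min and max
        smoothed.append([lo if v - lo <= hi - v else hi for v in b])
    return smoothed


prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]  # illustrative sample values
bins = equal_frequency_bins(prices, 3)
print(smooth_by_means(bins))       # [[9, 9, 9], [22, 22, 22], [29, 29, 29]]
print(smooth_by_medians(bins))     # [[8, 8, 8], [21, 21, 21], [28, 28, 28]]
print(smooth_by_boundaries(bins))  # [[4, 4, 15], [21, 21, 24], [25, 25, 34]]
```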
    45
    Incorrect or invalid data is known as _________ Select one:
    a. Missing data
    b. Outlier
    c. Changing data
    d. Noisy data
    D
    The important characteristics of structured data are Select one:
    a. Sparsity, Resolution, Distribution, Tuples
    b. Sparsity, Centroid, Distribution, Dimensionality
    c. Resolution, Distribution, Dimensionality, Objects
    d. Dimensionality, Sparsity, Resolution, Distribution
    D
    46
    Which of the following are descriptive data mining activities? Select one:
    a. Deviation detection
    b. Classification
    c. Clustering
    d. Regression
    C
    47
    In a data mining task where it is not clear what type of patterns could be
    interesting, the data mining system should Select one:
    a. allow interaction with the user to guide the mining process
    b. perform both descriptive and predictive tasks
    c. perform all possible data mining tasks
    d. handle different granularities of data and patterns
    A
    48
    Correlation analysis is used for Select one:
    a. handling missing values
    b. identifying redundant attributes
    c. handling different data formats
    d. eliminating noise
    B
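During data integration, correlation analysis is commonly used to spot redundant attributes: two numeric attributes whose correlation coefficient is close to +1 or -1 carry nearly the same information. The sketch below computes the Pearson coefficient by hand. The attribute names and values are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two numeric attributes."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical attributes: the same height stored in two units,
# plus an unrelated-looking third attribute.
height_cm = [150.0, 160.0, 170.0, 180.0]
height_in = [h / 2.54 for h in height_cm]
shoe_size = [38, 44, 37, 41]

r_redundant = pearson(height_cm, height_in)
r_other = pearson(height_cm, shoe_size)
print(round(r_redundant, 3), round(r_other, 3))
```

A coefficient near 1 for `height_cm` and `height_in` suggests one of the two can be dropped; the weak coefficient for `shoe_size` gives no such signal.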
    49
    The number of item sets of cardinality 4 from the item list {A, B, C, D, E}
    Select one:
    a. 2
    b. 10
    c. 20
    d. 5
    D
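The count asked for here is the binomial coefficient C(5, 4) = 5! / (4! · 1!) = 5. A short sketch that enumerates the 4-item sets with Python's standard library and checks the count against the formula:

```python
from itertools import combinations
from math import comb

items = ["A", "B", "C", "D", "E"]
four_itemsets = list(combinations(items, 4))
for itemset in four_itemsets:
    print(itemset)

# The enumeration agrees with the binomial coefficient C(5, 4).
assert len(four_itemsets) == comb(len(items), 4) == 5
```

Each 4-item set corresponds to leaving out exactly one of the five items, which is another way to see why there are five.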
    50
    Which of the following is NOT a data quality related issue?
    Select one:
    a. Missing values
    b. Outlier records
    c. Duplicate records
    d. Attribute value range
    D

    MCQ
    Sr.no
    Question
    Answer
    1
    Data set {brown, black, blue, green, red} is an example of Select one:
    a. Continuous attribute
    b. Ordinal attribute
    c. Numeric attribute
    d. Nominal attribute
    D
    2
    Which of the following activities is NOT a data mining task? Select one:
    a. Predicting the future stock price of a company using historical
    records
    b. Monitoring and predicting failures in a hydropower plant
    c. Extracting the frequencies of a sound wave
    d. Monitoring the heart rate of a patient for abnormalities
    C
    3
    Data Visualization in mining cannot be done using Select one:
    a. Photos
    b. Graphs
    c. Charts
    d. Information Graphics
    A
    4
    Which of the following is not a data pre-processing method? Select
    one:
    a. Data Visualization
    b. Data Discretization
    c. Data Cleaning
    d. Data Reduction
    A
    5
    Dimensionality reduction reduces the data set size by removing
    _________ Select one:
    a. composite attributes
    b. derived attributes
    c. relevant attributes
    d. irrelevant attributes
    D

    6
    The difference between supervised learning and unsupervised learning
    is given by Select one:
    a. unlike unsupervised learning, supervised learning needs labeled
    data
    b. unlike unsupervised learning, supervised learning can be used to
    detect outliers
    c. there is no difference
    d. unlike supervised learning, unsupervised learning can form new
    classes
    A
    7
    Which of the following activities is a data mining task? Select one:
    a. Monitoring the heart rate of a patient for abnormalities
    b. Extracting the frequencies of a sound wave
    c. Predicting the outcomes of tossing a (fair) pair of dice
    d. Dividing the customers of a company according to their profitability
    A
    8
    Identify the example of sequence data Select one:
    a. weather forecast
    b. data matrix
    c. market basket data
    d. genomic data
    D
    9
    To detect fraudulent usage of credit cards, the following data mining
    task should be used Select one:
    a. Outlier analysis
    b. prediction
    c. association analysis
    d. feature selection
    A
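Fraudulent card use tends to show up as transactions that differ sharply from a customer's usual pattern, which is what outlier analysis looks for. The sketch below uses a simple z-score rule. The function name `flag_outliers`, the threshold of 2.0, and the transaction amounts are assumptions for illustration; real fraud detection uses many more attributes than the amount.

```python
from statistics import mean, pstdev


def flag_outliers(amounts, threshold=2.0):
    """Return amounts whose z-score magnitude exceeds the threshold."""
    mu = mean(amounts)
    sigma = pstdev(amounts)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [a for a in amounts if abs(a - mu) / sigma > threshold]


# Hypothetical card transactions for one customer.
transactions = [20, 25, 22, 30, 27, 24, 950]
print(flag_outliers(transactions))  # [950]
```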
    10
    Which of the following is NOT an example of ordinal attributes? Select
    one:
    a. Zip codes
    b. Ordered numbers
    c. Movie ratings
    d. Military ranks
