Data Science – Interview Q & A


Set-1:

  1. What Is The Difference Between A Training Set & A Testing Set?
  2. What Is The Difference Between A Validation Set & A Testing Set?
  3. Define Bias & Variance.
  4. How Will You Handle Missing Values In A Dataset?
  5. How Does A Decision Tree Classifier Work?
  6. How Is A Logistic Regression Model Evaluated?
  7. What Are The Assumptions Of A Linear Regression Model?
  8. What Is Multicollinearity & How Do You Handle It?
  9. Why Does XGBoost Often Perform Better Than Other Models?
  10. Why Is An Encoder-Decoder Model Used In NLP?

Set-2:

  1. What Is The Difference Between Machine Learning & Artificial Intelligence?
  2. What Is The Difference Between Deep Learning & Machine Learning?
  3. What Is Cross Validation?
  4. What Are The Types Of Machine Learning?
  5. What Is The Difference Between Supervised & Unsupervised Machine Learning?
  6. What Is Selection Bias?
  7. What Is The Difference Between Correlation & Causality?
  8. What Is The Difference Between Correlation & Covariance?
  9. What Is The Difference Between Supervised & Reinforcement Learning?
  10. What Are The Requirements Of A Reinforcement Learning Environment?

Set-3:

  1. What Different Targets Do Classification & Regression Algorithms Require?
  2. What Are Five Popular Algorithms Used In Machine Learning?
  3. What Is A Confusion Matrix?
  4. List The Differences Between KNN & K-Means Clustering.
  5. What Is The Difference Between Type-1 & Type-2 Errors?
  6. What Is Semi-Supervised Learning?
  7. Where Is Semi-Supervised Learning Applied?
  8. What Is Stemming?
  9. What Is Lemmatization?
  10. What Is PCA?

Set-4:

  1. What Are Support Vectors In SVM?
  2. In Terms Of Access, How Are Arrays & Linked Lists Different?
  3. What Is A P-Value?
  4. What Techniques Are Used To Find Similarity In A Recommendation System?
  5. What Is The Difference Between Regression & Classification?
  6. What Does The Area Under The ROC Curve Indicate?
  7. What Is A Neural Network?
  8. What Is An Outlier?
  9. What Is Another Name For A Bayesian Network?
  10. What Is Ensemble Learning?

Set-5:

  1. What Is Clustering?
  2. How Would You Define Collinearity?
  3. What Is Overfitting?
  4. What Is A Bayesian Network?
  5. What Is A Time Series?
  6. What Is Dimensionality Reduction In ML?
  7. What Is Underfitting?
  8. What Is Sensitivity?
  9. What Is Specificity?
  10. What Is The Difference Between Stochastic Gradient Descent & Gradient Descent?

Set-6:

  1. Explain Decision Trees In ML.
  2. Why Is The Naive Bayes Method ‘Naive’?
  3. State Bayes' Theorem For The Naive Bayes Algorithm.
  4. How Would You Define Precision & Recall?
  5. What Are Some Tools Used To Discover Outliers?
  6. Explain The Kernel In SVM.
  7. What Are The Different Types Of Clustering Algorithms?
  8. How Would You Describe Reinforcement Learning?
  9. What Are Content-Based Filtering & Collaborative Filtering?
  10. What Are Deductive Learning & Inductive Learning?

Set-7:

  1. How Do You Differentiate Data Mining From Machine Learning?
  2. Why Is The ROC Curve Important?
  3. Why Does Overfitting Occur In ML?
  4. What Are Some Functions Of Unsupervised Learning?
  5. What Are Some Functions Of Supervised Learning?
  6. What Are The Two Components Of Bayesian Logic?
  7. How Would You Describe A Recommender System?
  8. What Is Regularization In ML?
  9. What Are The Advantages & Disadvantages Of Decision Trees?
  10. What Do You Understand About The Exploding Gradient Problem In Machine Learning?

Set-8:

  1. What Is The Vanishing Gradient Problem In ML?
  2. What Do You Understand About The Bias-Variance Tradeoff?
  3. How Would You Describe The F1 Score And How Would You Use It?
  4. Explain The Difference Between A Loss Function & A Cost Function.
  5. How Would You Handle Outlier Values?
  6. What Is A Random Forest & How Does It Work?
  7. What Ensemble Techniques Can Be Used To Aggregate Multiple Models?
  8. What Methods Can Be Used To Find The Threshold Of A Classifier?
  9. How Can You Check The Normality Of A Dataset?
  10. How Can You Differentiate Between A Parametric & A Non-Parametric Model?

Set-9:

  1. How Can Logistic Regression Be Used For More Than Two Classes?
  2. What Is The Difference Between The Softmax & Sigmoid Functions?
  3. How Do You Avoid Overfitting In ML Models?
  4. Which Is Better To Have: A False Positive Or A False Negative?
  5. How Would You Handle A Dataset Suffering From High Variance?
  6. What Are Some Classification Methods That SVM Can Handle?
  7. Why Do You Think An Instance-Based Learning Algorithm Is Sometimes Referred To As A Lazy Learning Algorithm?
  8. Explain The Reason For Pruning In Decision Trees.
  9. How Does Regularization Reduce The Cost Term?
  10. What Is The Need To Convert Categorical Variables To Factors?

Set-10:

  1. Do You Believe Treating A Categorical Variable As A Continuous Variable Will Result In A Better Predictive Model?
  2. Why Do We Need The Confusion Matrix?
  3. What Is The Difference Between Gradient Boosting & Random Forest?
  4. How Does The Box-Cox Transformation Work?
  5. How Is Data Divided In Cross Validation?
  6. What Are Support Vectors In SVM?
  7. What Are The Different Methods To Split A Tree In The Decision Tree Algorithm?
  8. How Does The Support Vector Machine Algorithm Help Self-Learning?
  9. How Do You Choose The Optimal Number Of Clusters?
  10. What Is Feature Engineering? How Does It Affect Model Performance?

Set-11:

  1. Why Do We Perform Normalization?
  2. What Is The Difference Between Upsampling & Downsampling?
  3. What Is Data Leakage And How Do You Identify It?
  4. What Are Some Of The Hyperparameters Of The Random Forest Regressor That Help To Avoid Overfitting?
  5. Is It Always Necessary To Use An 80:20 Ratio For The Train-Test Split?
  6. What Is One-Shot Learning?
  7. What Is The Difference Between Manhattan Distance And Euclidean Distance?
  8. What Is The Difference Between One-Hot Encoding & Ordinal Encoding?
  9. Explain The Working Principle Of SVM.
  10. How Is Random Forest Robust To Outliers?

Set-12:

  1. How Do You Handle Data Imbalance In Machine Learning?
  2. Is The Accuracy Score Always A Good Metric To Measure The Performance Of A Classification Model?
  3. What Is The KNN Imputer And How Does It Work?
  4. Explain The Working Procedure Of The XGBoost Model.
  5. What Is Linear Discriminant Analysis?
  6. How Can You Visualize High-Dimensional Data In 2-D?
  7. What Is The Reason Behind The Curse Of Dimensionality?
  8. Which Metric Is More Robust To Outliers: MAE, MSE, Or RMSE?
  9. How Would You Assess The Goodness Of Fit For A Linear Regression Model?
  10. What Is The Null Hypothesis In A Linear Regression Model?

Set-13:

  1. Can SVMs Be Used For Both Classification & Regression Tasks?
  2. Explain The Concept Of Weighting In KNN. What are the different ways to assign weights, and how do they affect the model’s predictions?
  3. What is the concept of information gain in decision trees? How does it guide the creation of the tree structure?
  4. How does the independence assumption affect the accuracy of a Naive Bayes classifier?
  5. Why does PCA maximize the variance in the data?
  6. How do you evaluate the effectiveness of a machine learning model in an imbalanced dataset scenario? What metrics would you use instead of accuracy?
  7. How does the One-Class SVM algorithm work for anomaly detection?
  8. Explain the concept of “concept drift” in anomaly detection.
