Category: Data Science Interview Questions


  • Machine Learning – L1 & L2 Regularization.

    Machine Learning – L1 & L2 Regularization Table Of Contents: What Is L1 & L2 Regularization? How Does Controlling The Magnitude Of The Model’s Coefficients Overcome Overfitting? Why Are Too-Large Coefficients More Likely To Fit Random Noise In The Training Set? What Is Sparsity In The Model? How Does L2 Regularization Handle Larger Weights? Explain With A Mathematical Example How Weights Become Zero In L1 Regularization. Why Can’t Weights Be Zero In L2 Regularization? Explain With An Example. (1) What Is L1 & L2 Regularization? (2) How Does Controlling The Magnitude Of
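    The sparsity questions above can be illustrated numerically. The sketch below is a minimal pure-Python illustration (the learning rate and penalty strength are assumed values, not taken from the article): repeated penalty-only gradient steps on a single weight, contrasting L2's multiplicative shrinkage, which never reaches exactly zero, with L1's soft-thresholding, which does.

```python
# Hypothetical learning rate and regularization strength for illustration.
lr, lam = 0.1, 1.0

def l2_step(w):
    # Gradient of lam * w**2 is 2*lam*w: the update shrinks w by a constant
    # factor each step, so it approaches zero but never lands exactly on it.
    return w - lr * 2 * lam * w

def l1_step(w):
    # Soft-thresholding update for the lam * |w| penalty: subtract a fixed
    # amount lr*lam and clip at zero, which produces exact zeros (sparsity).
    if w > lr * lam:
        return w - lr * lam
    if w < -lr * lam:
        return w + lr * lam
    return 0.0

w_l2 = w_l1 = 0.5
for _ in range(100):
    w_l2 = l2_step(w_l2)
    w_l1 = l1_step(w_l1)

print(w_l2)  # tiny but non-zero
print(w_l1)  # exactly 0.0
```

    This is why L1 yields sparse models while L2 only shrinks weights toward zero.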

    Read More

  • What Are Hierarchical Representations In Deep Learning?

    What Are Hierarchical Representations In Deep Learning? Table Of Contents: What Is Hierarchical Feature Representation? Key Concepts Of Hierarchical Representations. (1) What Is Hierarchical Feature Representation? Hierarchical representations refer to the layered structure of features or patterns that a machine learning model, particularly in deep learning, learns from input data. These representations progress from simple, low-level features in early layers to more complex, high-level abstractions in deeper layers of a neural network. (2) Key Concepts Of Hierarchical Representations. (3) Benefits Of Hierarchical Representations.

    Read More

  • Probability Theory

    Probability Theory Table Of Contents: Probability Of ‘A’ And ‘B’ Happening Together. Probability Of ‘A’ Given ‘B’ Has Already Happened. (1) Probability Of ‘A’ And ‘B’ Happening Together. P(A∩B) = P(A) × P(B) for independent events: if there is no relationship between A and B, we multiply their individual probabilities. Since we are considering two events, we need to account for all possible outcomes of both events. For rolling two dice together we will have 36 possible outcomes. For tossing two coins together we will have 4 possible outcomes. (2) Probability Of ‘A’ Given ‘B’ Has Already Happened. (3)
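    The multiplication rule in the excerpt can be checked by brute-force counting over the sample space. A small sketch (the specific events chosen — a six on each die, heads on each coin — are illustrative assumptions):

```python
from fractions import Fraction

# A: first die shows 6, B: second die shows 6 (independent events).
p_a = Fraction(1, 6)
p_b = Fraction(1, 6)
product_rule = p_a * p_b  # P(A ∩ B) by the multiplication rule

# Brute-force count over all 36 equally likely outcomes of two dice.
favourable = sum(1 for d1 in range(1, 7) for d2 in range(1, 7)
                 if d1 == 6 and d2 == 6)
by_counting = Fraction(favourable, 36)

# Two coins: 4 outcomes (HH, HT, TH, TT), so P(both heads) = 1/2 * 1/2.
p_both_heads = Fraction(1, 2) * Fraction(1, 2)

print(product_rule, by_counting, p_both_heads)  # 1/36 1/36 1/4
```

    The enumeration and the product rule agree, which is exactly what independence guarantees.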

    Read More

  • Parametric & Non-Parametric Models

    Parametric Vs Non-Parametric Models Table Of Contents: Parametric Models Key Characteristics Examples Advantages Disadvantages Non-Parametric Models Key Characteristics Examples Advantages Disadvantages Parametric Model: A parametric model assumes a specific functional form for the relationship between the input features and the output. These models have a fixed number of parameters that are determined during the training process. Key Features – Parametric Model Examples – Parametric Model Advantages & Disadvantages – Parametric Model Non-Parametric Model: A non-parametric model makes no strong assumptions about the form of the mapping function. Instead, it learns the structure directly from the data,
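    The distinction can be made concrete with a toy sketch (the data values are made up for illustration): a parametric linear fit compresses the training set into two numbers, slope and intercept, while a non-parametric k-nearest-neighbours predictor keeps the whole training set and reads predictions off it directly.

```python
# Toy 1-D training data (hypothetical values, roughly y = 2x).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]

# Parametric: closed-form least-squares line. After fitting, only two
# parameters remain; the training data itself could be discarded.
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
intercept = my - slope * mx

def linear_predict(x):
    return intercept + slope * x

# Non-parametric: k-nearest-neighbours regression. No fixed parameter set;
# every prediction consults the stored training points.
def knn_predict(x, k=2):
    nearest = sorted(zip(xs, ys), key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / k

print(linear_predict(6.0))  # extrapolates along the fitted line
print(knn_predict(6.0))     # averages the nearest stored points
```

    Note how the linear model extrapolates beyond the data while k-NN can only echo stored neighbours — a direct consequence of the fixed-parameter vs data-driven distinction.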

    Read More

  • Topics To Learn In Support Vector Machine.

    Topics To Learn In SVM Table Of Contents: Introduction to SVM What is SVM? Use cases and applications of SVM Strengths and weaknesses of SVM Mathematical Foundations Linear separability Concept of a hyperplane Margin and margin maximization Support vectors and their role Functional and geometric margins SVM for Linearly Separable Data Objective function for linear SVM Hard margin SVM Optimization problem formulation Lagrange multipliers and the dual problem SVM for Non-Linearly Separable Data Soft margin SVM Slack variables and their significance Trade-off parameter C: bias-variance tradeoff Practical scenarios for soft margin SVM Kernel Trick What is the kernel trick? Common

    Read More

  • Hyper Parameters In Decision Tree.

    Hyper Parameters In Decision Tree Table Of Contents: Maximum Depth (max_depth) Minimum Samples Split (min_samples_split) Minimum Samples Per Leaf (min_samples_leaf) Maximum Features (max_features) Maximum Leaf Nodes (max_leaf_nodes) Minimum Impurity Decrease (min_impurity_decrease) Split Criterion (criterion) Random State (random_state) Class Weight (class_weight) Presort (presort) Splitter (splitter) (1) Maximum Depth (max_depth)

    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier, plot_tree
    import matplotlib.pyplot as plt

    # Load the Iris dataset
    iris = load_iris()
    X, y = iris.data, iris.target

    # Build a decision tree with max_depth=3
    clf = DecisionTreeClassifier(max_depth=3, random_state=42)
    clf.fit(X, y)

    # Plot the decision tree
    plt.figure(figsize=(12, 8))
    plot_tree(clf, feature_names=iris.feature_names, class_names=iris.target_names, filled=True)
    plt.show()

    Read More

  • Drawbacks Of Decision Tree.

    Drawbacks Of Decision Tree Table Of Contents: Overfitting Instability Biased Towards Features with More Levels Difficulty with Continuous Variables Greedy Nature (Local Optima) Lack of Smooth Decision Boundaries Poor Performance on Unstructured Data Computational Complexity for Large Datasets Difficulty in Handling Correlated Features Interpretability Challenges for Large Trees (1) Overfitting (2) Instability (3) Biased Towards Features with More Levels (4) Difficulty with Continuous Variables (5) Greedy Nature (Local Optima) (6) Lack of Smooth Decision Boundaries (7) Poor Performance on Unstructured Data (8) Computational Complexity for Large Datasets (9) Difficulty in Handling Correlated Features (10) Interpretability Challenges for Large Trees

    Read More

  • Gini Index In Decision Tree.

    What Is Gini Index? Table Of Contents: Introduction To Gini Index. Formula & Calculation. Gini Index In Decision Tree. Properties Of Gini Index. Weighted Gini Index. Gini Index Vs Entropy. Splitting Criteria And Gini Index. Advantages And Limitations Of Gini Index. Practical Implementation. Real-World Examples. Advanced Topics. (1) Introduction To Gini Index. Definition Of Gini Index In the context of decision trees, the Gini Index is a metric used to evaluate the purity of a split or node. It quantifies the probability of incorrectly classifying a randomly chosen element from the dataset if it were labeled randomly according to the
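    The definition above can be sketched directly in a few lines. A minimal pure-Python version (the class counts are hypothetical): Gini = 1 − Σ pₖ² per node, with a candidate split scored by the size-weighted average of its children.

```python
def gini(counts):
    # Gini index of a node given its class counts: 1 - sum(p_k ** 2).
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

# A pure node (all 10 samples in one class) has Gini 0.
pure = gini([10, 0])

# A maximally impure binary node (50/50 split of classes) has Gini 0.5.
impure = gini([5, 5])

# Score a candidate split by the size-weighted average of its children:
# 6 samples go left with class counts [5, 1], 4 go right with [1, 3].
left, right = [5, 1], [1, 3]
n_left, n_right = sum(left), sum(right)
weighted = (n_left * gini(left) + n_right * gini(right)) / (n_left + n_right)

print(pure, impure, weighted)
```

    A decision tree picks the split with the lowest weighted Gini, i.e. the one whose children are purest on average.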

    Read More

  • All Machine Learning Algorithms To Study

    All Machine Learning Algorithms To Study Table Of Contents: Regression Algorithms. Classification Algorithms. Clustering Algorithms. Ensemble Learning Algorithms. Dimensionality Reduction Algorithms. Association Algorithms. Reinforcement Learning Algorithms. Deep Learning Algorithms. (1) Regression Algorithms Linear Regression. Regression Trees. Non-Linear Regression. Bayesian Linear Regression. Polynomial Regression. LASSO Regression. Ridge Regression. Weighted Least Squares Regression. (2) Classification Algorithms Logistic Regression Decision Trees Random Forest Support Vector Machines K-Nearest Neighbors Naive Bayes Algorithm (3) Clustering Algorithms K-Means Clustering K-Medoids (PAM) Hierarchical Clustering (Agglomerative and Divisive) DBSCAN (Density-Based Spatial Clustering of Applications with Noise) Mean Shift Clustering Gaussian Mixture Models (GMM) Spectral Clustering Affinity

    Read More

  • Gain Ratio In Decision Tree.

    Gain Ratio In Decision Tree Table Of Contents: What Is Gain Ratio In Decision Tree? Example Of Gain Ratio. Interpreting Split Information. What Is The Range Of Gain Ratio? What Do We Want? Balanced, Unbalanced & Moderate Splits. Which Split Information Is Better: Balanced, Unbalanced Or Moderate? How Does Gain Ratio Penalize Lower Split Information? Advantages Of Gain Ratio. Disadvantages Of Gain Ratio. (1) What Is Gain Ratio In Decision Tree? In decision tree learning, the Gain Ratio is an improvement over Information Gain to evaluate splits. While Information Gain measures the effectiveness of a feature in classifying the data,
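    The idea can be sketched end-to-end with hypothetical counts: gain ratio divides information gain by split information, where split information is the entropy of the branch sizes, so a feature that shatters the data into many tiny branches is penalized.

```python
import math

def entropy(counts):
    # Shannon entropy (in bits) of the distribution given by counts.
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c)

# Parent node: 8 positive and 6 negative examples (hypothetical).
parent = [8, 6]
n = sum(parent)

# A candidate split produces two branches with these class counts.
branches = [[6, 2], [2, 4]]

# Information gain: parent entropy minus size-weighted child entropy.
info_gain = entropy(parent) - sum(sum(b) / n * entropy(b) for b in branches)

# Split information: entropy of the branch *sizes* (8 and 6 samples here).
split_info = entropy([sum(b) for b in branches])

gain_ratio = info_gain / split_info
print(info_gain, split_info, gain_ratio)
```

    Dividing by split information is what distinguishes gain ratio from plain information gain: a many-way split inflates split information and therefore deflates the ratio.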

    Read More