• Transformers – Layered Normalization

    Table Of Contents: What Is Normalization? What Is Batch Normalization? Why Does Batch Normalization Not Work On Sequential Data? (1) What Is Normalization? What Are We Normalizing? Generally, you normalize the input values that you pass to the neural network, and you can also normalize the output of a hidden layer. We normalize the hidden layer output because a hidden layer may produce values spanning a large range, so we normalize them to bring them back into a fixed range. Benefits Of Normalization. (2) What Is Batch Normalization? https://www.praudyog.com/deep-learning-tutorials/transformers-batch-normalization/

  • Deep Learning – Batch Normalization.

    Table Of Contents: What Is Batch Normalization? Why Is Batch Normalization Needed? Example Of Batch Normalization. Why Is Internal Covariate Shift (ICS) A Problem If Different Distributions Are Natural? (1) What Is Batch Normalization? Batch Normalization is a technique used in Deep Learning to speed up training and improve stability by normalizing the inputs of each layer over the current mini-batch. It keeps activations stable by normalizing each layer's output. Without Batch Normalization, the shifting distribution of activations can lead to unstable training, slow convergence, overfitting, or underfitting. Special Note: If at every
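As a rough illustration of the normalization step described in this excerpt, here is a minimal sketch in plain Python. It assumes training-time behaviour only (statistics computed from the current mini-batch, no running averages), and the names `batch_norm`, `gamma`, `beta`, and `eps` are illustrative choices, not taken from the original post.

```python
import math

def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each feature (column) of a mini-batch, then scale and shift.

    batch: list of rows, one row per sample, one column per feature.
    gamma, beta: scale and shift (learnable in a real layer; fixed here).
    eps: small constant that avoids division by zero.
    """
    n = len(batch)
    num_features = len(batch[0])
    out = [[0.0] * num_features for _ in range(n)]
    for j in range(num_features):
        column = [row[j] for row in batch]
        mean = sum(column) / n
        # Biased (divide-by-n) variance over the mini-batch.
        var = sum((x - mean) ** 2 for x in column) / n
        for i in range(n):
            out[i][j] = gamma * (column[i] - mean) / math.sqrt(var + eps) + beta
    return out

# Two features on very different scales end up on a comparable scale.
activations = [[1.0, 200.0], [2.0, 400.0], [3.0, 600.0]]
normalized = batch_norm(activations)
```

Each column of `normalized` has a mean of approximately zero and a variance of approximately one, regardless of the scale of the original feature.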

  • Transformers – Positional Encoding in Transformers

    Table Of Contents: What Is Positional Encoding In Transformers? Why Do We Need Positional Encoding? How Does Positional Encoding Work? Positional Encoding In The "Attention Is All You Need" Paper. Interesting Observations In The Sine & Cosine Curves. How Does Positional Encoding Capture The Relative Position Of Words? (1) What Is Positional Encoding In Transformers? Positional Encoding is a technique used in Transformers to add order (position) information to input sequences. Since Transformers do not have built-in sequence awareness (unlike RNNs), they use positional encodings to help the model understand the order of words in a sentence. (2)
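For readers who want to see the idea concretely, the following is a small sketch of the fixed sine/cosine encoding from the "Attention Is All You Need" paper, written in plain Python. The function name `positional_encoding` and the toy sizes are assumptions made for illustration; the excerpt itself does not show an implementation.

```python
import math

def positional_encoding(seq_len, d_model):
    """Return a seq_len x d_model table of sinusoidal position encodings.

    Even dimensions use sin(pos / 10000^(i / d_model)) and the following
    odd dimension uses cos of the same angle, where i is the even index.
    """
    pe = [[0.0] * d_model for _ in range(seq_len)]
    for pos in range(seq_len):
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)
            if i + 1 < d_model:
                pe[pos][i + 1] = math.cos(angle)
    return pe

# Toy example: 4 positions, 4-dimensional embeddings.
pe = positional_encoding(seq_len=4, d_model=4)
```

In a Transformer these vectors are added element-wise to the token embeddings, so two identical words at different positions receive different inputs.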

  • AWS – Jupyter Lab

    Table Of Contents: What Is Jupyter Lab? How To Open Jupyter Lab? (1) What Is Jupyter Lab? JupyterLab is an advanced, flexible, and interactive development environment for working with Jupyter Notebooks, code, and data. It is the next-generation interface to the classic Jupyter Notebook and provides a more powerful and modular experience. (2) How To Open Jupyter Lab? Step – 1: Search For SageMaker In The Search Box. Step – 2: Click On Amazon SageMaker. Step – 3: Click On Open Studio. Step – 4: It Will Open SageMaker Studio. Step – 5: Click On Jupyter

  • Transformers – Multi-Head Attention in Transformers 

    Multi-Head Attention. Table Of Contents: Disadvantages Of The Self Attention Mechanism. What Is Multi-Head Attention? How Does Multi-Head Attention Work? (1) Disadvantages Of Self Attention. The task is to read the sentence and state its meaning. Meaning-1: An astronomer was standing and another man saw him with a telescope. Meaning-2: An astronomer was standing with a telescope and another man just saw him. A single sentence therefore yields two different meanings. How Will Self Attention Work On This Sentence? Self attention will find the similarity of each word

  • Transformers – Why Self Attention Is Called Self ?

    Table Of Contents: Why Self Attention Is Called Self? (1) Why Self Attention Is Called Self? We learnt the attention concepts from Luong attention. In the Luong attention mechanism, we calculate which word of the Encoder is most important for predicting the Decoder's output at the current time step. To do this, we assign an attention score to each word of the Encoder and pass the result as input to the Decoder. We apply a SoftMax layer to normalize the attention scores. We perform the same mathematical operation in case

  • Transformers – Self Attention Geometric Intuition!!

    Table Of Contents: What Is Self Attention? Why Do We Need Self Attention? How Does Self Attention Work? Example Of Self Attention. Where Is Self-Attention Used? Geometric Intuition Of Self-Attention. (1) What Is Self Attention? Self-attention is a mechanism in deep learning that allows a model to focus on different parts of an input sequence when computing each word's representation. It helps the model understand relationships between words, even if they are far apart, by assigning a different attention weight to each word based on its importance in the context. (2) Why Do We Need Self Attention? (3) How
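To make the idea of per-word attention weights concrete, here is a deliberately simplified sketch in plain Python. It assumes no learned query/key/value projections (each embedding serves as its own query, key, and value) and uses scaled dot-product scores with a softmax; the names `softmax`, `self_attention`, and the toy `tokens` are hypothetical and not taken from the original post.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(embeddings):
    """Simplified self-attention: every token attends to every token.

    Returns the contextual vectors and the attention weight matrix.
    """
    d = len(embeddings[0])
    outputs, all_weights = [], []
    for query in embeddings:
        # Similarity of this token to every token, scaled by sqrt(d).
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
                  for key in embeddings]
        weights = softmax(scores)
        # Weighted sum of all token vectors gives the contextual vector.
        context = [sum(w * value[j] for w, value in zip(weights, embeddings))
                   for j in range(d)]
        outputs.append(context)
        all_weights.append(weights)
    return outputs, all_weights

# Toy 2-dimensional embeddings for a three-token sequence.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
contextual, weights = self_attention(tokens)
```

Each row of `weights` sums to 1 and shows how much one token draws on every token in the sequence, including itself.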

  • AWS – Cost Explorer & Budgets

  • AWS – Private Link

  • AWS – VPC (Virtual Private Cloud)
