GPT: GLU(x) = x ⊗ sigmoid(Wx + b)

  • ⊗ denotes element-wise multiplication (the Hadamard product).
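
The formula above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: it assumes x is a vector of length n, W is an n×n matrix, and b is a vector of length n (the original does not state shapes, but the element-wise product needs Wx + b to match x in length). The names `sigmoid` and `glu` are chosen here for illustration.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def glu(x, W, b):
    """Return x ⊗ sigmoid(Wx + b) for a vector x.

    Assumed shapes: x and b are lists of length n, W is a list of n rows
    with n entries each.
    """
    # Gate pre-activation: (Wx + b)_i = sum_j W[i][j] * x[j] + b[i]
    pre = [sum(w * xj for w, xj in zip(row, x)) + bi for row, bi in zip(W, b)]
    # Each gate value lies in (0, 1) and scales the matching entry of x.
    gate = [sigmoid(z) for z in pre]
    return [xi * gi for xi, gi in zip(x, gate)]


x = [1.0, -2.0]
W = [[0.0, 0.0], [0.0, 0.0]]  # zero weights, so every gate is sigmoid(0) = 0.5
b = [0.0, 0.0]
print(glu(x, W, b))  # [0.5, -1.0]
```

With zero weights and bias, every gate is 0.5, so each entry of x is halved. A large positive bias pushes the gate toward 1 and lets x through almost unchanged; a large negative bias pushes it toward 0 and suppresses x.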