Mar 01, 2025, 1 min read
This blog derives it: https://shivammehta25.github.io/posts/deriving-categorical-cross-entropy-and-softmax/ - but I'm suspicious because its final term doesn't look like $-\sum_{c=1}^{M} y_{o,c} \log(p_{o,c})$ - see cross-entropy loss for the definition of terms
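
To sanity-check the summed form, here is a minimal sketch for a single observation o. It assumes the usual reading of the symbols (M classes, `y_o` a one-hot indicator of the correct class, `p_o` the predicted probabilities); the function name `cross_entropy` and the example numbers are made up for illustration and are not taken from the linked blog.

```python
import math

def cross_entropy(y_o, p_o):
    """Return -sum_c y_o[c] * log(p_o[c]) for a single observation.

    Assumes y_o and p_o have the same length M and that p_o[c] > 0
    wherever y_o[c] is non-zero. Terms with y_o[c] == 0 are skipped,
    since they contribute nothing to the sum.
    """
    return -sum(y * math.log(p) for y, p in zip(y_o, p_o) if y != 0)

# Hypothetical example: 3 classes, the true class is index 1.
y_o = [0, 1, 0]
p_o = [0.2, 0.7, 0.1]

loss = cross_entropy(y_o, p_o)
print(loss)
```

With a one-hot `y_o`, every term except the true class drops out, so the sum reduces to `-log(p)` for the true class. A derivation that ends in that single-term form still agrees with the summed expression.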