🏖️ Kaggle Solutions
Explorer
competitions
    Bengali.AI Speech Recognition
    CAFA 5 Protein Function Prediction
    Child Mind Institute - Detect Sleep States
    CommonLit - Evaluate Student Summaries
    G-Research Crypto Forecasting
    Google - Fast or Slow? Predict AI Model Runtime
    Google QUEST Q&A Labeling
    ICR - Identifying Age-Related Conditions
    LANL Earthquake Prediction
    Linking Writing Processes to Writing Quality
    Mechanisms of Action (MoA) Prediction
    MLB Player Digital Engagement Forecasting
    NeurIPS 2023 - Machine Unlearning
    Novozymes Enzyme Stability Prediction
    Open Problems – Single-Cell Perturbations
    RSNA 2023 Abdominal Trauma Detection
    Stanford Ribonanza RNA Folding
    Tweet Sentiment Extraction
    UBC Ovarian Cancer Subtype Classification and Outlier Detection (UBC-OCEAN)
    Vesuvius Challenge - Ink Detection
eval-functions
    Average Precision
    Balanced Accuracy
    balanced logloss
    BCELoss
    BCEWithLogitsLoss
    categorical cross entropy
    CosineEmbeddingLoss
    cross-entropy loss
    DiceLoss
    F-score
    HingeLoss
    Huber loss
    Jaccard similarity (aka Intersection Over Union)
    Kendall Tau correlation
    KLDivergenceLoss
    Levenshtein distance
    ListMLE Loss
    log loss
    log-likelihood
    LogCosh
    Loss functions
    MAELoss
    MAPE loss
    MarginRankingLoss
    Mean absolute error (MAE)
    mean column-wise mean absolute error (MCMAE)
    mean reciprocal rank (MRR)
    Mean Rowwise Root Mean Squared Error
    MSELoss
    PairwiseHingeLoss
    Pearson's Correlation Coefficient
    Precision
    ranking loss functions
    recall
    receiver operating characteristic curve (ROC)
    RMSE
    Spearman's Correlation Coefficient
    substring segmentation
    Word Error Rate
ml-concepts
    activation functions
        activation functions
        ELU (Exponential Linear Unit)
        Gated Linear Units (GLU)
        Gaussian Error Linear Unit (GELU)
        sigmoid
        Sigmoid Linear Unit (Swish)
        Smish
        Softmax
        SwiGLU
    augmentations
        cutmix
        cutout
        mixup
        test time augmentation (tta)
    clustering
        cluster sampling
        DBSCAN
        K-nearest neighbour (KNN)
        kmeans
        mean-shift clustering
        TSNE
        UMAP dimension reduction
    cross-validation
        Blocked Cross-Validation
        forward chaining cross validation
        GroupKFold
        hold-out cross validation
        kfold
        stratified kfold
    input-tricks
        absolute positional embedding
        adversarial validation
        ALiBi positional encoding
        BPE tokenizer
        Online hard negative mining
        Sequence bucketing
        singular value decomposition (SVD)
    layers
        axial attention
        channel shuffle
        Convolutional Block Attention Module (CBAM)
        GPS Layers
        GRU
        linformers
        LSTM
        pointwise convolution
        SAGEConv
        Spectral Graph Convolutions
        Squeeze-and-Excitation layer
        Squeezeformer layer
    learning rates
        cosine annealing LR
        decreasing learning rate
        differential learning rate
    models
        AlphaFold
        alternative targets (auxiliary objective)
        autoencoder
        catboost
        denoise autoencoder
        Graph Attention Networks (GATs)
        Graph Auto-Encoders (GAEs)
        Graph Convolutional Network (GCN)
        graph isomorphism network
        graph neural networks
        GraphSAGE
        HuberRegressor
        lgbm
        linear regression
        Message Passing GNNs (MP-GNN)
        multi layer perceptron
        Pyboost
        RANSAC
        Relational Graph Convolutional Networks (R-GCNs)
        RoBERTa
        segformer
        stacking
        TabNet
        TabPFN
        Temporal Fusion Transformers (TFT)
        xgboost
    optimizers
        Adam optimizer
    statistics
        Bayes' Theorem
        Evidence lower bound (ELBO) (aka variational Lower Bound)
        likelihood
        Maximum Likelihood Estimation (MLE)
    beam search decoding
    bilinear interpolation
    Connectionist temporal classification (CTC)
    curse of dimensionality
    factor analysis
    Fast-Fourier Transform
    GNN Positional encodings
    gradient accumulation
    graph Laplacian
    Graph Segment Training
    image augmentation
    KS statistic
    Lasso Regression
    Masked Language Modeling (MLM)
    Mel frequency cepstral coefficients (MFCC)
    Multiple Instance Learning
    Over-smoothing
    overfitting yourself
    regularization
    ridge regression
    Short-time Fourier Transform (STFT)
    thresholding
    triplet mining
    voice activity detection (VAD)
problem-type
    binary classification
    classification
    hierarchical label classification
    Image Classification
    image segmentation
    learning to rank
    machine translation
    multi-class classification
    Multi-label Classification
    NLP
    Node Classification Task (using GCN)
    ordering objects in list
    regression
    semantic segmentation
    signal processing
    speech to Text (STT)
    Time Series
    transcription
    Unknown Class Classification
techniques
    code quality
        encapsulate team's code in class
    cross validation
        cross validation
        select 2 best models for each fold on CV
    diagnosis
        identify domain shift
    efficiency
        data compression
        identifying slow feature generation
        learn on subsets
        resize layer to reduce dimensions
        speedup iteration
        use a single embedding matrix
    ensembling
        lasso feature importance for ensembling
        Weighted Boxes Fusion (WBF) ensembling
    features
        Add noise to denoise median statistic
        considering features
        create features through the ratio between different features
        data period selection
        dimension reduction for feature generation
        distribution matching
        downscale upscale examples
        drop outliers
        drop redundant columns
        extract features using NLP on academic papers
        Fibonacci window lag
        filling training data (impute data)
        frequency encoding
        Identify poor data sources
        is present bit
        Leave-one-out encoding
        normalize features
        one-hot encoding
        permutation feature importance to select features
        placeholder for invalid values
        reduce resolution
        sliding window
        Spectrogram dithering
        subtraction to avoid dependence on mean
        target encoding
        text data cleaning
        thermometer encoding
        Training own Tokenizer
        Transformer to compress dimensions rather than flattening
    formulas
        entropy
        variance
    kaggle-only-techniques
        How to understand your place on an overfitted Leaderboard
        leaderboard probing
    models
        add signal to attention bias
        create enriching features first, then mix across time
        expert models
        Freezing Layers
        Gradient-Boosted Decision Tree
        GRU head (neck) after the backbone layer
        kenlm
        RAPIDS SVR
        Stochastic Weights Averaging
        Train on external data first
    regularization
        adding epsilon to regularize
        DropEdge
        dropout
        label smoothing
        weight decay
    target
        binary encoded categorical ordinal targets
        clip outputs to be within range
        custom labels
        derive results from logits
        downsample and upsample output
        drop bad targets from CV
        hardness to predict label
        ignore edge of output prediction
        percentile thresholding
        postprocess to match target distribution
        pseudo-labeling
        remove stray pixels
        target scaling
        time since an event occurred as an auxiliary target
        use intermediate layer results (weighted)
    custom loss
    remove easy examples
    remove rows where feat=x to find unknown data clusters
    remove test data leakage
    Rotary positional embedding
    sanity check
    xpos positional encoding
tools
    Demucs
    ftfy
    Numba
    polars
    Ray Tune
    runpod.io
    thefuzz (prev fuzzywuzzy)
    vast.ai
All Competitions
Kaggle Grandmaster Tools
activation functions
Folder: ml-concepts/activation-functions
9 items under this folder.
sigmoid · May 08, 2024
activation functions · May 08, 2024
SwiGLU · May 08, 2024
Softmax · May 08, 2024
Smish · May 08, 2024
Sigmoid Linear Unit (Swish) · May 08, 2024
Gaussian Error Linear Unit (GELU) · May 08, 2024
Gated Linear Units (GLU) · May 08, 2024
ELU (Exponential Linear Unit) · May 08, 2024
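For quick reference, the activations listed in this folder can be sketched in a few lines of NumPy. This is a minimal illustration of the commonly cited formulas, not code taken from the individual pages: GELU uses the usual tanh approximation, the GLU and SwiGLU variants assume the input has already been linearly projected and is simply split in half along the last axis, and Smish is assumed to follow the form x · tanh(ln(1 + sigmoid(x))).

```python
import numpy as np

def sigmoid(x):
    # 1 / (1 + e^-x)
    return 1.0 / (1.0 + np.exp(-x))

def silu(x):
    # Sigmoid Linear Unit (Swish with beta = 1): x * sigmoid(x)
    return x * sigmoid(x)

def gelu(x):
    # GELU, tanh approximation: 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def elu(x, alpha=1.0):
    # x for x > 0, alpha * (e^x - 1) otherwise
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

def smish(x):
    # Smish (assumed definition): x * tanh(ln(1 + sigmoid(x)))
    return x * np.tanh(np.log1p(sigmoid(x)))

def softmax(x, axis=-1):
    # Shift by the max for numerical stability before exponentiating
    z = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=axis, keepdims=True)

def glu(x, axis=-1):
    # Gated Linear Unit: split the (already projected) input in half,
    # gate one half with the sigmoid of the other
    a, b = np.split(x, 2, axis=axis)
    return a * sigmoid(b)

def swiglu(x, axis=-1):
    # SwiGLU: same gating as GLU but with SiLU/Swish as the gate
    a, b = np.split(x, 2, axis=axis)
    return a * silu(b)

if __name__ == "__main__":
    x = np.linspace(-3.0, 3.0, 8)
    print(silu(x))
    print(gelu(x))
    print(softmax(x))
    print(swiglu(x))
```

For actual training, the framework implementations (for example torch.nn.functional.silu, gelu, elu, softmax, and glu) are the ones to reach for; the sketch above is only meant for inspecting the shapes of the functions.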