Swish (also called SiLU) is x*sigmoid(x), a smooth self-gated activation used in EfficientNet and many modern architectures. It outperforms ReLU on many tasks while being differentiable everywhere.
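The two branches used in the kernels below come from rewriting the sigmoid so the exponential never overflows (a standard identity, shown here for reference):

$$\operatorname{swish}(x) = \frac{x}{1 + e^{-x}} = \frac{x\,e^{x}}{1 + e^{x}}$$

For x >= 0 the first form is safe; for sufficiently negative x (below roughly -88), exp(-x) overflows in float32, so the second form is used instead, where exp(x) at worst underflows to zero.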
Compute x*sigmoid(x) efficiently.
```cuda
__device__ float swish(float x) {
    // x * sigmoid(x) = x / (1 + exp(-x))
    if (x >= 0) {
        return x / (1.0f + __expf(-x));
    } else {
        // For negative x, use x * exp(x) / (1 + exp(x)) so exp never overflows.
        float ex = __expf(x);
        return x * ex / (1.0f + ex);
    }
}
```
Separate sigmoid then multiply.
```cuda
__global__ void swish_naive(float* x, float* y, int n) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n) {
        float v = x[idx];
        float sig = 1.0f / (1.0f + expf(-v));
        y[idx] = v * sig;
    }
}
```
Vectorized with stable formulation.
```cuda
__device__ float swish_stable(float x) {
    // Evaluate exp only on -|x| so the exponential can never overflow.
    float ax = fabsf(x);
    float ex = __expf(-ax);
    return (x >= 0) ? x / (1.0f + ex) : x * ex / (1.0f + ex);
}

// n is the number of float4 elements (total floats / 4).
__global__ void swish_opt(float4* x, float4* y, int n) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n) {
        float4 v = x[idx];
        y[idx] = make_float4(
            swish_stable(v.x), swish_stable(v.y),
            swish_stable(v.z), swish_stable(v.w)
        );
    }
}
```
| Metric | Naive | Optimized | Improvement |
|---|---|---|---|
| Throughput | 320 GB/s | 580 GB/s | 81% faster |
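For completeness, a minimal host-side launch sketch for swish_opt. The helper name and launch configuration are illustrative, not from the benchmark above; it assumes n is a multiple of 4 and that the device buffers are 16-byte aligned (which cudaMalloc guarantees).

```cuda
// Hypothetical launch helper, not part of the benchmarked code.
// Assumes n % 4 == 0; a tail of 1-3 elements would need a scalar
// fallback (e.g. swish_naive) or padded allocations.
void launch_swish_opt(float* d_x, float* d_y, int n) {
    int n4 = n / 4;                              // float4 element count
    int threads = 256;
    int blocks = (n4 + threads - 1) / threads;   // ceil(n4 / threads)
    swish_opt<<<blocks, threads>>>(
        reinterpret_cast<float4*>(d_x),
        reinterpret_cast<float4*>(d_y),
        n4);
}
```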
Similar smooth gated activations: GELU is x*Φ(x), where Φ is the Gaussian CDF. Both typically outperform ReLU; GELU is more common in transformers, while Swish appears mainly in CNNs such as EfficientNet.
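For comparison, a GELU device function in the same style. This is the exact erf-based form, sketched here for illustration rather than taken from the article; many frameworks use a tanh approximation instead.

```cuda
// Exact GELU: x * Phi(x), with Phi the standard normal CDF.
// Illustrative sketch only -- not part of the benchmarked kernels.
__device__ float gelu(float x) {
    // Phi(x) = 0.5 * (1 + erf(x / sqrt(2)))
    return 0.5f * x * (1.0f + erff(x * 0.70710678f));  // 0.70710678 = 1/sqrt(2)
}
```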
Ready to optimize your CUDA code? Download RightNow AI and get real-time performance analysis for your kernels.