Nullspace Augmented Finetuning
We introduce the $\epsilon$-approximate nullspace $\tilde{\mathcal{N}}_\phi(\epsilon)$, the set of perturbations that induce at most $\epsilon$ expected change in model output:
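The displayed definition is not reproduced here; one plausible formalization consistent with the prose, writing $f_\phi$ for the model and $\mathcal{D}$ for the data distribution (both symbols are assumptions, not fixed by the text), is:

```latex
\tilde{\mathcal{N}}_\phi(\epsilon) \;=\;
\Big\{ \xi \;:\;
\mathbb{E}_{x \sim \mathcal{D}}
\big[\, \| f_\phi(x + \xi) - f_\phi(x) \| \,\big]
\;\le\; \epsilon \Big\}
```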
To enhance model robustness, we fine-tune the model using these nullspace noise vectors via a bi-level optimization:
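The bi-level objective itself is missing from the text; a hedged reconstruction, with $\mathcal{L}$ a task loss and $\xi^*$ the early-stopped nullspace noise (again, assumed notation), would read:

```latex
\min_{\phi}\;
\mathbb{E}_{(x,y) \sim \mathcal{D}}
\Big[\, \mathcal{L}\big(f_\phi(x + \xi^*),\, y\big) \,\Big]
\quad \text{s.t.} \quad
\xi^* \in \tilde{\mathcal{N}}_\phi(\epsilon)
```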
Here, the noise is iteratively updated by gradient descent and early-stopped once it lies within the $\epsilon$-approximate nullspace, promoting invariance and robustness under distribution shifts.
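The inner search plus outer fine-tuning step can be sketched on a toy linear model, where the output change $\|W\xi\|^2$ and its gradient are available in closed form. All dimensions, learning rates, and the tolerance `eps` below are illustrative assumptions, not values from the paper, and the linear "model" is a stand-in for a ViT:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear "model" f(x) = W @ x (hypothetical stand-in for a transformer).
d_in, d_out, n = 8, 4, 32
W = rng.normal(size=(d_out, d_in)) * 0.5
X = rng.normal(size=(n, d_in))
Y = X @ (rng.normal(size=(d_in, d_out)) * 0.5)  # synthetic regression targets

eps = 1e-2                 # nullspace tolerance (assumed hyperparameter)
inner_lr, outer_lr = 0.05, 0.01

def output_change(W, xi):
    # For a linear model, E_x ||f(x + xi) - f(x)||^2 = ||W xi||^2.
    return float(np.sum((W @ xi) ** 2))

def find_nullspace_noise(W, steps=1000):
    """Inner loop: gradient descent on xi, early-stopped once the
    induced output change falls inside the eps-approximate nullspace."""
    xi = rng.normal(size=d_in)
    for _ in range(steps):
        if output_change(W, xi) <= eps:
            break                      # xi is now (approximately) nullspace noise
        grad = 2.0 * W.T @ (W @ xi)    # d/dxi of ||W xi||^2
        xi -= inner_lr * grad
    return xi

# Outer loop: fine-tune W on nullspace-perturbed inputs (MSE task loss).
for _ in range(50):
    xi = find_nullspace_noise(W)
    Xp = X + xi                                 # augment inputs with the noise
    grad_W = 2.0 * (Xp @ W.T - Y).T @ Xp / n    # gradient of mean squared error
    W -= outer_lr * grad_W
```

Because `d_in > d_out`, the linear map has a genuine nullspace, so the inner descent reliably finds perturbations the model is blind to; for a real network the same loop applies with autograd supplying the gradients.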
Our nullspace finetuning method consistently improves the robustness of models under distribution shifts and adversarial attacks, yielding large gains in average performance for the vanilla ViT-small and ViT-base models and slightly outperforming DAT. This not only shows that our nullspace finetuning method is effective but also validates our earlier hypothesis connecting a model's tolerance to nullspace noise with the robustness of transformer models.
We further compare our method with fine-tuning using two PGD-based adversarial training methods, Madry (Madry et al., 2018a) and TRADES (Zhang et al., 2019), on the ViT-S model.
Despite their strong performance under adversarial attacks, Madry and TRADES perform considerably worse than our method in the natural OOD setting.