Approximate Nullspace Augmented Finetuning for Robust Vision Transformers

Haoyang Liu1, Aditya Singh2, Yijiang Li3, Haohan Wang1
1University of Illinois Urbana-Champaign, 2Sorted Technologies, 3UC San Diego

An example of nullspace noise. (a) sample input image; (b) noise generated by the basis vectors of the nullspace; (c) noisy image as a result of adding the nullspace noise to the input.

Abstract

Enhancing the robustness of deep learning models, particularly in the realm of vision transformers (ViTs), is crucial for their real-world deployment. In this work, we provide a finetuning approach to enhance the robustness of vision transformers inspired by the concept of nullspace from linear algebra. Our investigation centers on whether a vision transformer can exhibit resilience to input variations akin to the nullspace property in linear mappings, which would imply that perturbations sampled from this nullspace do not influence the model's output when added to the input. We start from the observation that many existing ViTs satisfy this property because their patch embedding layer has a non-trivial nullspace. Then, we extend the notion of nullspace to nonlinear settings and demonstrate that it is possible to synthesize approximate nullspace elements for ViT's encoder blocks through optimization. Finally, we propose a finetuning strategy for ViTs wherein we augment the training data with synthesized approximate nullspace noise. We find that our finetuning approach significantly improves the models' robustness to both adversarial and natural image perturbations.

Nullspace in Vision Transformer

We first identify that most off-the-shelf pre-trained ViT models exhibit a nontrivial nullspace due to the linear patch embedding layer. Since this layer is the first block of a ViT, any perturbation that leaves its output unchanged leaves the output of the entire model unchanged. Consequently, a nontrivial nullspace also exists for the ViT as a whole.


Nullspace in Vision Transformer
An illustration of the nullspace in three cases (left top: projection function; left bottom: linear function; right: vision transformer)

The patch embedding layer of a Vision Transformer (ViT) defines a linear map $f_\theta$, whose nullspace $\mathcal{N}_\theta = \{\mathbf{v} : f_\theta(\mathbf{x} + \mathbf{v}) = f_\theta(\mathbf{x})\}$ is non-trivial whenever $cr^2 > d$, where $c$ is the number of input channels, $r$ the patch size, and $d$ the embedding dimension. Any perturbation in this nullspace leaves the model’s output invariant. For the self-attention encoder $f_\phi$, we generalize this idea and define the generalized nullspace as $$ \tilde{\mathcal{N}}_\phi = \{\mathbf{v} \mid f_\phi(\mathbf{u} + \mathbf{v}) = f_\phi(\mathbf{u}),\ \forall \mathbf{u} \in \mathcal{U}\}, $$ enabling us to synthesize input noise that does not affect predictions.
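The patch-embedding nullspace can be computed exactly with an SVD. A minimal sketch with NumPy, using small hypothetical dimensions ($c=3$, $r=4$, $d=32$, so $cr^2 = 48 > d$) and a random weight matrix standing in for a real patch embedding:

```python
import numpy as np

# Hypothetical patch-embedding dimensions: c=3 channels, patch size r=4,
# embedding dim d=32, so the input dimension c*r*r = 48 exceeds d and the
# linear map must have a nullspace of dimension >= 48 - 32 = 16.
c, r, d = 3, 4, 32
rng = np.random.default_rng(0)
W = rng.standard_normal((d, c * r * r))  # stand-in patch-embedding weights

# Right-singular vectors with (near-)zero singular values span the nullspace.
_, s, Vt = np.linalg.svd(W)
null_basis = Vt[np.sum(s > 1e-10):]      # rows: orthonormal nullspace basis
assert null_basis.shape[0] == c * r * r - d

# Any combination of basis vectors leaves the embedding of a patch unchanged.
x = rng.standard_normal(c * r * r)       # a flattened input patch
v = null_basis.T @ rng.standard_normal(null_basis.shape[0])
assert np.allclose(W @ (x + v), W @ x)
```

In a real ViT the same computation applies per patch: tiling a nullspace vector over all patches yields an image-sized perturbation that the embedding layer, and hence the whole model, ignores.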

Synthesizing (approximate) nullspace noise

To probe the existence of generalized nullspace elements $\tilde{\mathbf{v}}_\phi$ in transformers, we numerically optimize for additive perturbations that minimally affect the model’s output. This is achieved by minimizing the following loss function:

$$ \mathcal{L}_\phi(\tilde{\mathbf{v}}) = \mathbb{E}_{\mathbf{u} \in \mathcal{D}} \lVert f_\psi(f_\phi^{0}(\mathbf{u}+\tilde{\mathbf{v}})) - f_\psi(f_\phi^{0}(\mathbf{u}))\rVert - \lambda \log(\lVert \tilde{\mathbf{v}} \rVert). $$

By varying the regularization strength $\lambda$, we obtain perturbations of substantial norm that nevertheless leave the classifier's predictions nearly unchanged, revealing benign directions in input space. Experiments confirm that the learned vectors are stable and exhibit approximate closure under scaling and convex combination.
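The optimization above can be sketched on a toy nonlinear map $f(\mathbf{u}) = \tanh(A\mathbf{u})$ standing in for the encoder; the dimensions, step count, and learning rate here are illustrative assumptions, not the paper's settings:

```python
import numpy as np

# Toy nonlinear "encoder" f(u) = tanh(A u); we minimize
#   E_u ||f(u+v) - f(u)||^2 - lam * log ||v||
# by gradient descent on v, mirroring the paper's loss in spirit.
rng = np.random.default_rng(1)
n_in, n_out, batch = 20, 8, 64
A = rng.standard_normal((n_out, n_in)) / np.sqrt(n_in)
U = rng.standard_normal((batch, n_in))          # batch of inputs u ~ D
lam, lr = 0.05, 0.1

def f(x):
    return np.tanh(x @ A.T)

fU = f(U)
v = 0.01 * rng.standard_normal(n_in)            # small noise initialisation
for _ in range(500):
    z = (U + v) @ A.T
    diff = np.tanh(z) - fU                      # f(u + v) - f(u)
    # Gradient of E ||f(u+v) - f(u)||^2 w.r.t. v (chain rule through tanh),
    # plus the gradient of -lam * log ||v||, which pushes the norm up.
    grad = 2 * ((diff * (1 - np.tanh(z) ** 2)) @ A).mean(axis=0)
    grad -= lam * v / (v @ v)
    v -= lr * grad

# The learned v acquires a large norm yet barely changes the output.
out_change = np.linalg.norm(f(U + v) - fU, axis=1).mean()
```

Because $A$ is wide ($8 \times 20$) it has an exact nullspace; the log-norm term drives $\mathbf{v}$ to grow along those benign directions while the data term suppresses any component that would alter the output.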

Generalized Nullspace Illustration
Generalized nullspace. (Left) Solid lines (---): model performance under the learned noise; dashed lines ($\cdot\cdot\cdot$): performance after random permutation of the elements of the learned noise vector. (Right) By varying the regularization strength, we explore noise in the generalized nullspace at different magnitudes.

Nullspace Augmented Finetuning

We introduce the $\epsilon$-approximate nullspace $\tilde{\mathcal{N}}_\phi(\epsilon)$, the set of perturbations that induce at most $\epsilon$ expected change in model output:

$$ \tilde{\mathcal{N}}_\phi(\epsilon) = \left\{ \tilde{\mathbf{v}} \;\middle|\; \mathbb{E}_{\mathbf{u} \in \mathcal{D}} \lVert f(\mathbf{u} + \tilde{\mathbf{v}}) - f(\mathbf{u}) \rVert \leq \epsilon \right\}. $$
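Membership in this set is straightforward to test empirically. A hedged sketch, where `in_eps_nullspace` is a hypothetical helper name and `model` is any callable mapping a batch of inputs to outputs (a random linear map here, for a self-contained check):

```python
import numpy as np

def in_eps_nullspace(model, U, v, eps):
    """Return True if E_u ||f(u + v) - f(u)|| <= eps over the batch U."""
    change = np.linalg.norm(model(U + v) - model(U), axis=1).mean()
    return change <= eps

rng = np.random.default_rng(2)
W = rng.standard_normal((8, 20))
model = lambda X: X @ W.T                        # toy linear "model"
U = rng.standard_normal((32, 20))

# An exact nullspace vector of W passes the test for any eps >= 0.
_, s, Vt = np.linalg.svd(W)
v_null = Vt[-1]                                  # orthogonal to the row space
assert in_eps_nullspace(model, U, v_null, eps=1e-8)
# A random direction of comparable norm generally does not.
assert not in_eps_nullspace(model, U, rng.standard_normal(20), eps=1e-8)
```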

To enhance model robustness, we fine-tune the model using these nullspace noise vectors via a bi-level optimization:

$$ \min_{\phi} \;\mathbb{E}_{\mathbf{u} \in \mathcal{D}}\, \ell(f_\psi(f_\phi^{0}(\mathbf{u} + \tilde{\mathbf{v}}_\phi^*)), \mathbf{y}), \quad \tilde{\mathbf{v}}_{\phi}^* = \arg\max_{\tilde{\mathbf{v}} \in \tilde{\mathcal{N}}_\phi(\epsilon)} \lVert \tilde{\mathbf{v}} \rVert. $$

Here, the noise is updated iteratively by gradient descent and early-stopped once it lies within the $\epsilon$-approximate nullspace, promoting invariance and robustness under distribution shifts.
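The bi-level loop can be sketched on a toy model, a linear regressor trained with squared loss rather than a ViT; the inner search here simply scales a random direction until the mean output change hits $\epsilon$ (a crude stand-in for the gradient-based search with early stopping), and all constants are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)
n_in, n_out, batch = 20, 4, 64
W = rng.standard_normal((n_out, n_in)) * 0.1    # toy model parameters
U = rng.standard_normal((batch, n_in))
Y = rng.standard_normal((batch, n_out))         # toy regression targets
eps, lr = 0.1, 0.05
init_loss = np.mean((U @ W.T - Y) ** 2)

for step in range(50):
    # Inner maximisation: grow the noise norm while it stays inside the
    # eps-approximate nullspace of the current model.
    direction = rng.standard_normal(n_in)
    direction /= np.linalg.norm(direction)
    v = np.zeros(n_in)
    for scale in np.linspace(0.01, 5.0, 100):
        cand = scale * direction
        change = np.linalg.norm((U + cand) @ W.T - U @ W.T, axis=1).mean()
        if change > eps:
            break                               # early stop: left the set
        v = cand
    # Outer minimisation: one gradient step on the noise-augmented batch.
    pred = (U + v) @ W.T
    W -= lr * 2 * ((pred - Y).T @ (U + v)) / batch

train_loss = np.mean((U @ W.T - Y) ** 2)
```

The alternation mirrors the objective above: the inner step supplies the largest-norm perturbation that the model currently tolerates, and the outer step trains the model to predict correctly under that perturbation, which in turn enlarges the set the next inner step can exploit.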

Performance Table
Effect of our nullspace augmented finetuning (NS) method on different models evaluated on multiple benchmark datasets.

Our nullspace finetuning method consistently improves the robustness of models under distribution shifts and adversarial attacks, yielding a large gain in average performance for the vanilla ViT-small and ViT-base models and slightly outperforming DAT. This not only shows that our nullspace finetuning method is effective but also validates our earlier hypothesis about the connection between tolerance to nullspace noise and the robustness of transformer models.

We further compare our method with two PGD-based adversarial training methods, Madry (Madry et al., 2018a) and TRADES (Zhang et al., 2019), on the ViT-S model.

Performance Table
Comparison of our NS method with PGD-based adversarial robustness methods of Madry and TRADES.

Despite their strong performance under adversarial attacks, Madry and TRADES perform considerably worse than our method in the natural OOD setting.

Enlarged Approximate Nullspace

As training proceeds, the model tolerates noise of larger and larger norm within the $\epsilon$-approximate nullspace, which informally suggests an enlarging $\epsilon$-approximate nullspace.

Enlarged Approximate Nullspace
L2 norm of the learned noise during training.

This trend is accompanied by an increase in robustness scores in both the OOD and adversarial settings, corroborating our findings.

Ablation: Impact of $\epsilon$.

Ablation: Impact of epsilon

Citation

If you find our work useful in your research, please consider citing:

@article{liu2024approximate,
    title={Approximate Nullspace Augmented Finetuning for Robust Vision Transformers},
    author={Liu, Haoyang and Singh, Aditya and Li, Yijiang and Wang, Haohan},
    journal={arXiv preprint arXiv:2403.10476},
    year={2024}
}