ReFTA: Breaking the Weight Reconstruction Bottleneck in Tensorized Parameter-Efficient Fine-Tuning
Abstract
Tensor-based fine-tuning has attracted growing interest for its ability to reduce trainable parameters beyond matrix-based approaches such as LoRA and PiSSA, while capturing inter-layer correlations within networks. However, existing tensor-based methods typically require repeated reconstruction of model weights during training, leading to substantial computational and memory overhead. To overcome these limitations, we propose Reconstruction-Free Tensor-Based Adaptation (ReFTA), which offers four key advantages: (1) it eliminates repeated explicit tensor reconstruction by exploiting the algebraic properties of tensors; (2) it achieves lower quantization error by fine-tuning only the principal tensor components; (3) it is supported by a rigorous generalization guarantee rooted in the algebraic foundations of tensor product–based approaches; and (4) it adopts a unified design controlled by a single tensor rank configuration. Extensive experiments on both image classification (IC) and natural language understanding (NLU) tasks demonstrate that ReFTA achieves the best accuracy–efficiency trade-off among all evaluated methods. In most cases, ReFTA attains the highest average accuracy with the fewest trainable parameters. On NLU tasks with RoBERTa-Large, ReFTA improves the average accuracy by approximately 5% over most existing methods while using 86.4% fewer parameters than LoRA (r=1) and 97.5% fewer than PiSSA. In particular, ReFTA achieves substantially lower peak GPU memory consumption, reducing usage by over 30% compared with tensor-based baselines on the RTE dataset and demonstrating markedly improved scalability.
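To give intuition for the reconstruction-free idea described above, the sketch below illustrates one such algebraic property, using a Kronecker-product parameterization as a stand-in example (this is an assumption for illustration; ReFTA's actual factorization and update rule are defined in the body of the paper). Instead of explicitly reconstructing the full weight matrix A ⊗ B before applying it to an input, the vec identity (A ⊗ B) vec(X) = vec(B X Aᵀ) lets us apply the small factors directly, avoiding materialization of the large matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical factor shapes: A is (m, n), B is (p, q).
# The full (never-materialized) weight A ⊗ B would be (m*p, n*q).
m, n, p, q = 4, 3, 5, 2
A = rng.standard_normal((m, n))
B = rng.standard_normal((p, q))
x = rng.standard_normal(n * q)  # input vector of length n*q

# Naive path: explicitly reconstruct the full matrix, then multiply.
full = np.kron(A, B)            # costs O(m*p*n*q) memory
y_naive = full @ x

# Reconstruction-free path: (A ⊗ B) vec(X) = vec(B X A^T),
# where vec is column-major and X is x reshaped to (q, n).
X = x.reshape(q, n, order="F")
y_fast = (B @ X @ A.T).reshape(-1, order="F")

assert np.allclose(y_naive, y_fast)
```

The two paths agree numerically, but the second never forms the (m*p) × (n*q) matrix, which is the kind of saving that explains the reduced peak GPU memory reported in the experiments.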