Ruan vdM

Papers

Peer-reviewed and workshop publications. Full list on Google Scholar.

Preprint · 2026

Scaling Few-Shot Spoken Word Classification With Generative Meta-Continual Learning

Louise Beyers, Batsirayi Mupamhi Ziki, Ruan van der Merwe
arXiv preprint arXiv:2605.13075

Asks whether a spoken word classifier can sequentially learn to distinguish 1000 classes from only five examples per class. We train a model with the Generative Meta-Continual Learning (GeMCL) algorithm and compare it against strong baselines for this large-scale, few-shot continual setting.

GeMCL delivers exceptionally stable performance and adapts roughly 2000× faster than a frozen HuBERT encoder with repeatedly retrained classifiers, though it does not always surpass fully fine-tuned baselines. The result points toward practical, rapidly adaptable keyword systems that keep accumulating new classes without retraining from scratch.

arXiv

Preprint · 2026

Does Language Matter for Spoken Word Classification? A Multilingual Generative Meta-Learning Approach

Batsirayi Mupamhi Ziki, Louise Beyers, Ruan van der Merwe
arXiv preprint arXiv:2605.13084

Meta-learning outperforms supervised learning for few-shot monolingual spoken word classification, yet remains under-explored in the multilingual setting. We apply the Generative Meta-Continual Learning (GeMCL) algorithm to spoken word classification: its generative nature makes it viable in application, while the meta-learning component promotes the generalisation that multilingual use demands.

We train monolingual models on English, German, French, and Catalan, a bilingual model on English and German, and a multilingual model on all four languages. Although the multilingual model performs best, the differences between models are unexpectedly small. We also find that the number of hours of unique data seen during training is a stronger indicator of performance than the number of languages in the training data.

arXiv

Conference · 2023

Mitigating Catastrophic Forgetting for Few-Shot Spoken Word Classification Through Meta-Learning

Ruan van der Merwe, Herman Kamper
Interspeech 2023, pp. 441–445

Proposes MAMLCon, an extension of model-agnostic meta-learning (MAML) for continual few-shot spoken word classification. Each inner learning loop concludes with a consolidation gradient step that uses stored templates (one per class) drawn from all previously seen classes, letting the model rehearse without a large replay buffer.

MAMLCon consistently outperforms OML (Online-aware Meta-Learning) across varying numbers of shots and final class counts on Google Speech Commands and the FACC dataset. It targets the practical setting of user-defined keyword systems where new words are added incrementally, showing that meta-learning can mitigate catastrophic forgetting in genuinely on-device regimes.

arXiv

Workshop · 2022

Manifold Characteristics That Predict Downstream Task Performance

Ruan H. van der Merwe, Gregory Newman, Etienne Barnard
First Workshop on Pre-training: Perspectives, Pitfalls, and Paths Forward at ICML 2022

Introduces the Representation Manifold Quality Metric (RMQM), a principled geometric measure of learned representation quality that correlates positively with downstream task performance.

We characterise representation manifolds by tracking how data points move under successively larger perturbations (white-noise injection and PGD adversarial attacks), revealing that self-supervised methods learn smoother manifolds with larger but more consistent step sizes. RMQM gives a framework for comparing pretraining methods beyond the usual linear-probe accuracy, enabling more detailed structural analysis of embedding spaces.

arXiv