In my recent open source contribution I enabled callbacks in the scalable (multi-core) implementation of Latent Dirichlet Allocation in the gensim library [1]. This, in turn, lets users tune the popular topic extraction model faster and more accurately.
An obvious use case is monitoring and early stopping of training with popular coherence metrics such as \(U_{mass}\) and \(C_V\) [2]. On the 20 Newsgroups dataset, the training performance looks as follows:
[Figure: \(U_{mass}\) and \(C_V\) coherence over training passes on the 20 Newsgroups dataset]
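Monitoring like this can be wired up through gensim's callback API, which the contribution makes available in `LdaMulticore`. Below is a minimal sketch assuming gensim 4.x; the tiny hand-rolled corpus and the parameter values are placeholders standing in for the preprocessed 20 Newsgroups documents:

```python
import logging

from gensim.corpora import Dictionary
from gensim.models import LdaMulticore
from gensim.models.callbacks import CoherenceMetric

logging.basicConfig(level=logging.INFO)  # the 'shell' logger emits through standard logging

# Toy tokenised corpus standing in for the preprocessed 20 Newsgroups documents.
texts = [
    ["human", "interface", "computer"],
    ["survey", "user", "computer", "system", "response", "time"],
    ["eps", "user", "interface", "system"],
    ["system", "human", "system", "eps"],
    ["graph", "minors", "trees"],
    ["graph", "minors", "survey"],
]
dictionary = Dictionary(texts)
corpus = [dictionary.doc2bow(text) for text in texts]

if __name__ == "__main__":  # LdaMulticore uses multiprocessing, so guard the entry point
    # U_mass only needs the bag-of-words corpus; C_V would take `texts` instead.
    coherence_logger = CoherenceMetric(
        corpus=corpus, dictionary=dictionary, coherence="u_mass", logger="shell"
    )
    lda = LdaMulticore(
        corpus=corpus,
        id2word=dictionary,
        num_topics=2,
        passes=5,
        workers=2,
        callbacks=[coherence_logger],  # now accepted by the multi-core model
    )
    print(lda.metrics)  # per-pass metric values collected during training
```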
The achieved scores are decent, actually better than those reported in the literature [3], but this may be due to preprocessing rather than early stopping. A full example is shared in this Kaggle notebook.
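The callbacks themselves only report metrics; stopping is left to the user. One way to approximate early stopping is to add passes one at a time and track coherence between them. The sketch below uses a hypothetical helper, `train_with_early_stopping`, and checks \(U_{mass}\) via gensim's `CoherenceModel`:

```python
from gensim.models import LdaMulticore
from gensim.models.coherencemodel import CoherenceModel


def train_with_early_stopping(corpus, dictionary, num_topics=20,
                              max_passes=50, patience=3):
    """Add one pass at a time; stop when U_mass coherence stops improving."""
    lda = LdaMulticore(id2word=dictionary, num_topics=num_topics, workers=2)
    best_score, passes_since_best = float("-inf"), 0
    for _ in range(max_passes):
        lda.update(corpus)  # one extra pass over the corpus
        score = CoherenceModel(
            model=lda, corpus=corpus, dictionary=dictionary, coherence="u_mass"
        ).get_coherence()
        if score > best_score:  # U_mass is negative; higher means more coherent
            best_score, passes_since_best = score, 0
        else:
            passes_since_best += 1
            if passes_since_best >= patience:
                break
    return lda, best_score
```

A production version would also keep a copy of the best-scoring model rather than returning the last state, since coherence may have degraded during the patience window.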
- 1. Řehůřek R, Sojka P. Software Framework for Topic Modelling with Large Corpora. Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks. Published online 2010. doi:10.13140/2.1.2393.1847
- 2. Röder M, Both A, Hinneburg A. Exploring the Space of Topic Coherence Measures. Proceedings of the Eighth ACM International Conference on Web Search and Data Mining. Published online February 2, 2015. doi:10.1145/2684822.2685324
- 3. Zhang Z, Fang M, Chen L, Namazi Rad MR. Is Neural Topic Modelling Better than Clustering? An Empirical Study on Clustering with Contextual Embeddings for Topics. Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Published online 2022. doi:10.18653/v1/2022.naacl-main.285