Unsupervised Learning Using Variational Inference on Finite Inverted Dirichlet Mixture Models with Component Splitting


Abstract

Unsupervised learning is one of the essential tools of pattern recognition and data mining, and the Dirichlet family of mixture models plays a central role in this field. In this article, we propose a finite inverted Dirichlet mixture model for unsupervised learning via variational inference. In particular, we develop an incremental algorithm with a component-splitting approach for local model selection, which makes the clustering algorithm more efficient. We illustrate our model and learning algorithm on synthetic data and on real applications: occupancy estimation in smart homes and topic learning in images and videos. Extensive comparisons with recent comparable approaches show the merits of the proposed model.


Notes

  1. https://www.kaggle.com/alxmamaev/flowers-recognition.

  2. https://mmspg.epfl.ch/food-image-datasets.

References

  1. Li, X., Han, Q., & Qiu, B. (2018). A clustering algorithm with affine space-based boundary detection. Applied Intelligence, 48(2), 432–444.

  2. Abassi, L., & Boukhris, I. (2019). A worker clustering-based approach of label aggregation under the belief function theory. Applied Intelligence, 49(1), 53–62.

  3. Kumar, Y., & Singh, P. K. (2019). A chaotic teaching learning based optimization algorithm for clustering problems. Applied Intelligence, 49(3), 1036–1062.

  4. Chen, J., Lin, X., Xuan, Q., & Xiang, Y. (2019). FGCH: A fast and grid based clustering algorithm for hybrid data stream. Applied Intelligence, 49(4), 1228–1244.

  5. Lai, Y., He, W., Ping, Y., Qu, J., & Zhang, X. (2018). Variational Bayesian inference for infinite Dirichlet mixture towards accurate data categorization. Wireless Personal Communications, 102(3), 2307–2329.

  6. Sandhan, T., Sethi, A., Srivastava, T., & Choi, J.Y. (2013). Unsupervised learning approach for abnormal event detection in surveillance video by revealing infrequent patterns. In Proceedings of 28th international conference on image and vision computing New Zealand (IVCNZ 2013) (pp. 494–499).

  7. Sivic, J., Russell, B.C., Efros, A.A., Zisserman, A., & Freeman, W.T. (2004). Discovering object categories in image collections.

  8. Liu, D., & Chen, T. (2007). Unsupervised image categorization and object localization using topic models and correspondences between images. In Proceedings of IEEE 11th international conference on computer vision (pp. 1–7).

  9. Bouguila, N., Ziou, D., & Vaillancourt, J. (2004). Unsupervised learning of a finite mixture model based on the Dirichlet distribution and its application. IEEE Transactions on Image Processing, 13(11), 1533–1543.

  10. Figueiredo, M. A. T., & Jain, A. K. (2002). Unsupervised learning of finite mixture models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(3), 381–396.

  11. Zivkovic, Z. (2004). Improved adaptive Gaussian mixture model for background subtraction. In Proceedings of the 17th international conference on pattern recognition (ICPR 2004), Vol. 2, pp. 28–31.

  12. Lima, K. A. B., Aires, K. R. T., & Reis, F. W. P. D. (2015). Adaptive method for segmentation of vehicles through local threshold in the Gaussian mixture model. In Proceedings of the Brazilian conference on intelligent systems (BRACIS) (pp. 204–209).

  13. Li, Y., Xiong, C., Yin, Y., & Liu, Y. (2009). Moving object detection based on edged mixture Gaussian models. In Proceedings of the international workshop on intelligent systems and applications (pp. 1–5).

  14. Reynolds, D. (2015). Gaussian mixture models. Encyclopedia of Biometrics.

  15. Bouguila, N., & Ziou, D. (2004). Dirichlet-based probability model applied to human skin detection [image skin detection]. In Proceedings of the 2004 IEEE international conference on acoustics, speech, and signal processing (ICASSP), Vol. 5, p. V-521.

  16. Bouguila, N., & Ziou, D. (2004). A powerful finite mixture model based on the generalized Dirichlet distribution: Unsupervised learning and applications. In Proceedings of the 17th international conference on pattern recognition (ICPR 2004), Vol. 1, pp. 280–283.

  17. Bdiri, T., & Bouguila, N. (2012). Positive vectors clustering using inverted Dirichlet finite mixture models. Expert Systems with Applications, 39(2), 1869–1882.

  18. Gori, M., & Tesi, A. (1992). On the problem of local minima in backpropagation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14(1), 76–86.

  19. Bouguila, N., & Ziou, D. (2007). High-dimensional unsupervised selection and estimation of a finite generalized Dirichlet mixture model based on minimum message length. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(10), 1716–1731.

  20. Zamzami, N., & Bouguila, N. (2019). Model selection and application to high-dimensional count data clustering via finite EDCM mixture models. Applied Intelligence, 49(4), 1467–1488.

  21. Bourouis, S., Al Mashrgy, M., & Bouguila, N. (2014). Bayesian learning of finite generalized inverted Dirichlet mixtures: Application to object classification and forgery detection. Expert Systems with Applications, 41(5), 2329–2336.

  22. Bouguila, N., Wang, J. H., & Hamza, A. B. (2010). Software modules categorization through likelihood and Bayesian analysis of finite Dirichlet mixtures. Journal of Applied Statistics, 37(2), 235–252.

  23. Attias, H. (1999). Inferring parameters and structure of latent variable models by variational Bayes. In Proceedings of the fifteenth conference on uncertainty in artificial intelligence (pp. 21–30). Morgan Kaufmann Publishers Inc.

  24. Teh, Y.W., Newman, D., & Welling, M. (2007). A collapsed variational Bayesian inference algorithm for latent Dirichlet allocation. In Advances in neural information processing systems (pp. 1353–1360).

  25. Constantinopoulos, C., & Likas, A. (2007). Unsupervised learning of Gaussian mixtures based on variational component splitting. IEEE Transactions on Neural Networks, 18(3), 745–755.

  26. Jordan, M. I., Ghahramani, Z., Jaakkola, T. S., & Saul, L. K. (1999). An introduction to variational methods for graphical models. Machine Learning, 37(2), 183–233.

  27. Fan, W., Bouguila, N., & Ziou, D. (2011). A variational statistical framework for object detection. In B.-L. Lu, L. Zhang, & J. Kwok (Eds.), Neural Information Processing (pp. 276–283). Berlin, Heidelberg: Springer.

  28. Fan, W., Bouguila, N., & Ziou, D. (2014). Variational learning of finite Dirichlet mixture models using component splitting. Neurocomputing, 129, 3–16.

  29. Tiao, G. G., & Guttman, I. (1965). The inverted Dirichlet distribution with applications. Journal of the American Statistical Association, 60(311), 793–805.

  30. Corduneanu, A., & Bishop, C. M. (2001). Variational Bayesian model selection for mixture distributions. In Artificial intelligence and statistics. Morgan Kaufmann.

  31. Wang, H., Luo, B., Zhang, Q., & Wei, S. (2004). Estimation for the number of components in a mixture model using stepwise split-and-merge EM algorithm. Pattern Recognition Letters, 25(16), 1799–1809.

  32. Tirdad, P., Bouguila, N., & Ziou, D. (2015). Variational learning of finite inverted Dirichlet mixture models and applications. Artificial Intelligence Applications in Information and Communication Technologies.

  33. Fan, W., Bouguila, N., & Ziou, D. (2012). Variational learning for finite Dirichlet mixture models and applications. IEEE Transactions on Neural Networks and Learning Systems, 23(5), 762–774.

  34. Malazi, H. T., & Davari, M. (2018). Combining emerging patterns with random forest for complex activity recognition in smart homes. Applied Intelligence, 48(2), 315–330.

  35. Liouane, Z., Lemlouma, T., Roose, P., Weis, F., & Messaoud, H. (2018). An improved extreme learning machine model for the prediction of human scenarios in smart homes. Applied Intelligence, 48(8), 2017–2030.

  36. Amayri, M., & Ploix, S. (2018). Decision tree and parametrized classifier for estimating occupancy in energy management. In 5th International conference on control, decision and information technologies, CoDIT 2018, Thessaloniki, Greece, April 10–13, 2018 (pp. 397–402).

  37. Fränti, P., & Sieranoja, S. (2018). K-means properties on six clustering benchmark datasets. Applied Intelligence, 48(12), 4743–4759.

  38. Chen, Y., Wang, J. Z., & Krovetz, R. (2003). An unsupervised learning approach to content-based image retrieval. In Proceedings of the seventh international symposium on signal processing and its applications, Vol. 1, pp. 197–200.

  39. Chen, Y., Wang, J. Z., & Krovetz, R. (2005). Clue: Cluster-based retrieval of images by unsupervised learning. IEEE Transactions on Image Processing, 14(8), 1187–1201.

  40. Zakariya, S.M., Ali, R., & Ahmad, N. (2010). Combining visual features of an image at different precision value of unsupervised content based image retrieval. In Proceedings of IEEE international conference on computational intelligence and computing research (pp. 1–4).

  41. Gultepe, E., & Makrehchi, M. (2018). Improving clustering performance using independent component analysis and unsupervised feature learning. Human-centric Computing and Information Sciences, 8(1), 1.

  42. Lowe, D. G. (2004). Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2), 91–110.

  43. Bay, H., Ess, A., Tuytelaars, T., & Gool, L. V. (2008). Speeded-up robust features (SURF). Computer Vision and Image Understanding, 110(3), 346–359. Similarity Matching in Computer Vision and Multimedia.

  44. Dalal, N., & Triggs, B. (2005). Histograms of oriented gradients for human detection. In Proceedings of the IEEE computer society conference on computer vision and pattern recognition (CVPR'05), Vol. 1, pp. 886–893.

  45. Ravinder, M., & Venugopal, T. (2016). Content-based cricket video shot classification using bag-of-visual-features. Artificial Intelligence and Evolutionary Computations in Engineering Systems.

  46. Zhu, Q., Zhong, Y., Zhao, B., Xia, G., & Zhang, L. (2016). Bag-of-visual-words scene classifier with local and global features for high spatial resolution remote sensing imagery. IEEE Geoscience and Remote Sensing Letters, 13(6), 747–751.

  47. Csurka, G., Dance, C.R., Fan, L., Willamowski, J., & Bray, C. (2004). Visual categorization with bags of keypoints.

  48. Shao, H., Svoboda, T., & Van Gool, L. (2003). ZuBuD: Zurich buildings database for image based recognition.

  49. Lazebnik, S., Schmid, C., & Ponce, J. (2006). Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In IEEE conference on computer vision and pattern recognition (CVPR '06) (pp. 2169–2178). New York, United States: IEEE Computer Society.

  50. Soleymani, M., Larson, M., Pun, T., & Hanjalic, A. (2014). Corpus development for affective video indexing. IEEE Transactions on Multimedia, 16(4), 1075–1089.

  51. Schuldt, C., Laptev, I., & Caputo, B. (2004). Recognizing human actions: A local SVM approach. In Proceedings of the 17th international conference on pattern recognition (ICPR 2004), Vol. 3, pp. 32–36.

  52. Patel, D.M., & Upadhyay, S. (2013). Optical flow measurement using Lucas Kanade method.

  53. Becker, F., Petra, S., & Schnörr, C. (2015). Optical flow. Handbook of mathematical methods in imaging.

  54. Bellamine, I., & Tairi, H. (2015). Optical flow estimation based on the structure-texture image decomposition. Signal, Image and Video Processing, 9(1), 193.

  55. Miao, H., & Wang, Y. (2018). Optical flow based obstacle avoidance and path planning for quadrotor flight. Proceedings of 2017 Chinese intelligent automation conference.

  56. Araújo, T., Aresta, G., Rouco, J., Ferreira, C., Azevedo, E., & Campilho, A. (2015). Optical flow based approach for automatic cardiac cycle estimation in ultrasound images of the carotid. Image Analysis and Recognition.

  57. Scovanner, P., Ali, S., & Shah, M. (2007). A 3-dimensional sift descriptor and its application to action recognition. In Proceedings of the 15th international conference on multimedia, MULTIMEDIA ’07 (pp. 357–360). New York, NY, USA. ACM.

  58. Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19(6), 716–723.

  59. Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics, 6(2), 461–464.

  60. Zhang, X., & Qian, R. (2017). Research on technical state evaluation of vehicle equipment based on BIC cluster analysis. In 2017 IEEE 2nd international conference on big data analysis (ICBDA) (pp. 303–306).

  61. Mate, M. E., Sven, S., Peter, B., Zoltan, K., Zoltan, S., Miklos, S., et al. (2019). In situ cell cycle analysis in giant cell tumor of bone reveals patients with elevated risk of reduced progression-free survival. Bone.

  62. Yang, K., Zhou, N., Røste, T., Yu, J., Li, F., Chen, W., Eide, E., Ekman, T., Li, C., & Chang, F. (2019). High-speed vehicle-to-vehicle radio channel characteristics for suburban and municipal lake region at 5.9 GHz. In 2019 13th European conference on antennas and propagation (EuCAP) (pp. 1–5).

  63. Li, Y., Zheng, X., & Yau, C. Y. (2019). Generalized threshold latent variable model. Electronic Journal of Statistics, 13(1), 2043–2092.

  64. Fan, Q., Yin, C., & Liu, H. (2019). Accurate recovery of sparse objects with perfect mask based on joint sparse reconstruction. IEEE Access, 7, 73504–73515.

  65. Ma, Z., & Leijon, A. (2011). Bayesian estimation of beta mixture models with variational inference. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(11), 2160–2173.

  66. Woolrich, M. W., & Behrens, T. E. (2006). Variational Bayes inference of spatial mixture models for segmentation. IEEE Transactions on Medical Imaging, 25(10), 1380–1391.

Acknowledgements

The completion of this research was made possible thanks to the Natural Sciences and Engineering Research Council of Canada (NSERC) and Concordia University Research Chair Tier 2.

Corresponding author

Correspondence to Kamal Maanicshah.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Proof of Equations (18), (19), (20)

The variational solution \(Q_s\big (\varTheta _s\big )\) is given by Eq. (17) as,

$$\begin{aligned} \ln Q_s\big (\varTheta _s\big ) = \big <\ln p\big ({\mathcal {X}},\varTheta \big )\big >_{t\ne s} + {\text {const}} \end{aligned}$$
(37)

where the constant term collects all terms that are independent of \(\varTheta _s\). The solutions can be derived from the logarithm of the joint distribution \(p\big ({\mathcal {X}},\varTheta \big )\), given by,

$$\begin{aligned} \ln p\big ({\mathcal {X}},\varTheta \big )&= \sum _{i=1}^N\sum _{j=1}^MZ_{ij} \Bigg [\ln \frac{\varGamma \big (\sum _{l=1}^{D+1}\alpha _{jl}\big )}{\prod _{l=1}^{D+1}\varGamma \big (\alpha _{jl}\big )} + \sum _{l=1}^D\big (\alpha _{jl} - 1\big )\ln X_{il}\nonumber \\&\quad - \bigg (\sum _{l=1}^{D+1}\alpha _{jl}\bigg ) \ln \bigg (1 + \sum _{l=1}^D X_{il}\bigg )\Bigg ] \nonumber \\&\quad + \sum _{i=1}^N\Bigg [\sum _{j=1}^sZ_{ij}\ln \pi _j + \sum _{j=s+1}^MZ_{ij}\ln \pi _j^*\Bigg ]\nonumber \\&\quad -\big (M-s\big )\ln \Bigg [1-\sum _{k=1}^s\pi _k\Bigg ]+\ln \frac{\varGamma \big (\sum _{j=s+1}^{M}c_{j}\big )}{\prod _{j=s+1}^{M}\varGamma \big (c_{j}\big )}\nonumber \\&\quad + \sum _{j=s+1}^M\big (c_j-1\big )\bigg [\ln \pi _j^*-\ln \Big (1-\sum _{k=1}^s\pi _k\Big )\bigg ]\nonumber \\&\quad + \sum _{j=1}^M\sum _{l=1}^{D+1} \Big [u_{jl}\ln \nu _{jl} - \ln \varGamma \big (u_{jl}\big ) + \big (u_{jl} - 1\big )\ln \alpha _{jl} - \nu _{jl}\alpha _{jl}\Big ] \end{aligned}$$
(38)
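For concreteness, the inverted Dirichlet log-density inside the first bracket of Eq. (38) can be evaluated numerically. The following is a minimal sketch (not the authors' implementation), assuming `X` is an (N, D) array of positive vectors and `alpha` a (D+1)-dimensional parameter vector:

```python
import numpy as np
from scipy.special import gammaln

def inverted_dirichlet_logpdf(X, alpha):
    """ln ID(X_i | alpha) for each row X_i, cf. the bracket in Eq. (38)."""
    X = np.atleast_2d(X)                                   # (N, D)
    log_norm = gammaln(alpha.sum()) - gammaln(alpha).sum()
    data_term = ((alpha[:-1] - 1.0) * np.log(X)).sum(axis=1)
    tail_term = alpha.sum() * np.log1p(X.sum(axis=1))      # ln(1 + sum_l X_il)
    return log_norm + data_term - tail_term
```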

1.1 Proof of Equation (18): Variational Solution for \(Q\big ({\mathcal {Z}}\big )\)

Taking the expectation of \(\ln p\big ({\mathcal {X}},\varTheta \big )\) with respect to all factors except \(Z_i\) gives,

$$\begin{aligned} \ln Q\big (Z_i\big )&= \big<\ln p\big ({\mathcal {X}},\varTheta \big )\big>_{\varTheta \ne Z_i}\nonumber \\&=\sum _{j=1}^MZ_{ij} \Bigg [R_j + \sum _{l=1}^D\big ({\overline{\alpha }}_{jl} - 1\big )\ln X_{il} - \bigg (\sum _{l=1}^{D+1}{\overline{\alpha }}_{jl}\bigg ) \ln \bigg (1 + \sum _{l=1}^D X_{il}\bigg )\Bigg ]\nonumber \\&\quad + \sum _{j=1}^sZ_{ij}\ln \pi _j + \sum _{j=s+1}^MZ_{ij}\big<\ln \pi _j^*\big> + {\text {const}}\nonumber \\&=\sum _{j=1}^sZ_{ij} \Bigg [\ln \pi _j + R_j + \sum _{l=1}^D\big ({\overline{\alpha }}_{jl} - 1\big )\ln X_{il} - \bigg (\sum _{l=1}^{D+1}{\overline{\alpha }}_{jl}\bigg ) \ln \bigg (1 + \sum _{l=1}^D X_{il}\bigg )\Bigg ]\nonumber \\&\quad + \sum _{j=s+1}^MZ_{ij} \Bigg [\big <\ln \pi _j^*\big > + R_j + \sum _{l=1}^D\big ({\overline{\alpha }}_{jl} - 1\big )\ln X_{il}\nonumber \\&\quad - \bigg (\sum _{l=1}^{D+1}{\overline{\alpha }}_{jl}\bigg ) \ln \bigg (1 + \sum _{l=1}^D X_{il}\bigg )\Bigg ] + {\text {const}} \end{aligned}$$
(39)

where,

$$\begin{aligned} R_j = \Bigg<\ln \frac{\varGamma \big (\sum _{l=1}^{D+1}\alpha _{jl}\big )}{\prod _{l=1}^{D+1}\varGamma \big (\alpha _{jl}\big )}\Bigg>_{\alpha _{j1},\ldots ,\alpha _{jD+1}},\qquad {\overline{\alpha }}_{jl} = \big <\alpha _{jl}\big > = \frac{u_{jl}}{\nu _{jl}} \end{aligned}$$
(40)
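Since each \(Q\big (\alpha _{jl}\big )\) is a Gamma density (cf. Eq. (56)), the expectations used in Eq. (40), and later \(\big <\ln \alpha _{jl}\big >\) in Eq. (51), have closed forms. A small sketch, assuming the shape–rate parameterisation with hyper-parameters `u` and `nu`:

```python
import numpy as np
from scipy.special import digamma

def gamma_expectations(u, nu):
    """<alpha> and <ln alpha> under Gamma(u, nu) posteriors, cf. Eq. (40)."""
    alpha_bar = u / nu                   # <alpha_jl> = u_jl / nu_jl
    ln_alpha = digamma(u) - np.log(nu)   # <ln alpha_jl>, needed in Eq. (51)
    return alpha_bar, ln_alpha
```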

Here, \(R_j\) is intractable since it has no closed form. To make it tractable, we employ a second-order Taylor expansion, following the method in [65, 66]. This leads to Eq. (24), which is an approximation \({\tilde{R}}_j\) of \(R_j\) evaluated at the expected values \(\big ({\overline{\alpha }}_{j1},\ldots ,{\overline{\alpha }}_{jD+1}\big )\) of \(\varvec{\alpha }_j\). Thus, we can calculate \({\tilde{R}}_j\) using Eq. (24). This approximation is also a strict lower bound of \(R_j\), as proved in [32]. Equation (39) can now be rewritten as,

$$\begin{aligned} \ln Q\big ({\mathcal {Z}}\big ) =\sum _{i=1}^N\Bigg [ \sum _{j=1}^sZ_{ij}\ln {\tilde{r}}_{ij} + \sum _{j=s+1}^MZ_{ij}\ln {\tilde{r}}_{ij}^*\Bigg ] + {\text {const}} \end{aligned}$$
(41)

where

$$\begin{aligned} \ln {\tilde{r}}_{ij} = \ln \pi _j + {\tilde{R}}_j + \sum _{l=1}^D\big ({\overline{\alpha }}_{jl}-1\big )\ln X_{il} - \bigg (\sum _{l=1}^{D+1}{\overline{\alpha }}_{jl}\bigg ) \ln \bigg (1 + \sum _{l=1}^D X_{il}\bigg ) \end{aligned}$$
(42)

and

$$\begin{aligned} \ln {\tilde{r}}_{ij}^* = \big <\ln \pi _j^*\big > + {\tilde{R}}_j + \sum _{l=1}^D\big ({\overline{\alpha }}_{jl}-1\big )\ln X_{il} - \bigg (\sum _{l=1}^{D+1}{\overline{\alpha }}_{jl}\bigg ) \ln \bigg (1 + \sum _{l=1}^D X_{il}\bigg ) \end{aligned}$$
(43)

It can be seen that Eq. (41) has the same logarithmic form as Eq. (7), up to the constant. Exponentiating both sides of Eq. (7), we get,

$$\begin{aligned} Q\big ({\mathcal {Z}}\big ) \propto \prod _{i=1}^{N}\Bigg [\prod _{j=1}^{s}{\tilde{r}}_{ij}^{Z_{ij}}\prod _{j=s+1}^{M}{\tilde{r}}_{ij}^{*Z_{ij}}\Bigg ] \end{aligned}$$
(44)

Normalizing this expression, we can write the variational solution of \(Q\big ({\mathcal {Z}}\big )\) as,

$$\begin{aligned} Q\big ({\mathcal {Z}}\big ) = \prod _{i=1}^{N}\Bigg [\prod _{j=1}^{s}r_{ij}^{Z_{ij}}\prod _{j=s+1}^{M}r_{ij}^{*Z_{ij}}\Bigg ] \end{aligned}$$
(45)

where \(r_{ij}\) and \(r_{ij}^*\) are obtained from Eqs. (22) and (23). We then have \(\big <Z_{ij}\big > = r_{ij}\) for \(j = 1,\ldots ,s\) and \(\big <Z_{ij}\big > = r_{ij}^*\) for \(j = s+1,\ldots ,M\).
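Numerically, the step from Eq. (44) to Eq. (45) is a row-wise normalisation of the unnormalised responsibilities, best done in the log domain. A sketch under the assumption that `log_rho` is an (N, M) array holding \(\ln {\tilde{r}}_{ij}\) for \(j \le s\) and \(\ln {\tilde{r}}_{ij}^*\) for \(j > s\):

```python
import numpy as np
from scipy.special import logsumexp

def normalise_responsibilities(log_rho):
    """Row-normalise exp(log_rho) stably; rows of the result sum to 1."""
    log_r = log_rho - logsumexp(log_rho, axis=1, keepdims=True)
    return np.exp(log_r)   # r_ij = <Z_ij>, cf. Eq. (45)
```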

1.2 Proof of Equation (19): Variational Solution of \(Q(\varvec{\pi }^*)\)

Similarly, the logarithm of the variational solution \(Q\big (\varvec{\pi }^*\big )\) is given as,

$$\begin{aligned} \ln Q\big (\pi _j^*\big )&= \big<\ln p\big ({\mathcal {X}},\varTheta \big )\big>_{\varTheta \ne \pi _j^*}\nonumber \\&=\sum _{i=1}^N\big<Z_{ij}\big>\ln \pi _j^* + \big (c_j - 1\big ) \ln \pi _j^* + {\text {const}}\nonumber \\&=\ln \pi _j^*\Bigg [\sum _{i=1}^N\big <Z_{ij}\big > + c_j - 1 \Bigg ] + {\text {const}} \end{aligned}$$
(46)

This has the same logarithmic form as Eq. (9), so we can write the variational solution of \(Q\big (\varvec{\pi }^*\big )\) as,

$$\begin{aligned} Q\big (\varvec{\pi }^*\big ) = \Bigg (1 - \sum \limits _{k=1}^s\pi _k\Bigg )^{-M+s} \frac{\varGamma \big (\sum _{j=s+1}^Mc_j^*\big )}{\prod _{j=s+1}^M\varGamma \big (c_j^*\big )}\prod \limits _{j=s+1}^M\Bigg (\frac{\pi _j^*}{1-\sum _{k=1}^s\pi _k}\Bigg )^{c_j^*-1} \end{aligned}$$
(47)

where

$$\begin{aligned} c_j^* = \sum _{i=1}^N \big <Z_{ij}\big > + c_j \end{aligned}$$
(48)

\(\big <Z_{ij}\big > = r_{ij}^*\) in the above equation.
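Computationally, the update of Eq. (48) is a single accumulation; a brief sketch, assuming `r_star` holds the responsibilities \(r_{ij}^*\) for components \(j = s+1,\ldots ,M\) as an (N, M−s) array and `c` the prior parameters:

```python
import numpy as np

def update_c(r_star, c):
    """c_j* = sum_i <Z_ij> + c_j for j = s+1, ..., M, cf. Eq. (48)."""
    return r_star.sum(axis=0) + c
```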

1.3 Proof of Equation (20): Variational Solution of \(Q\big (\varvec{\alpha }\big )\)

As in the other two cases, the logarithm of the variational solution \(Q\big (\alpha _{jl}\big )\) is given by,

$$\begin{aligned} \ln Q\big (\alpha _{jl}\big )&= \big<\ln p\big ({\mathcal {X}},\varTheta \big )\big>_{\varTheta \ne \alpha _{jl}}\nonumber \\&=\sum _{i=1}^N\big<Z_{ij}\big>{\mathcal {J}}\big (\alpha _{jl}\big )+\alpha _{jl}\sum _{i=1}^N\big <Z_{ij}\big >\Bigg [\ln X_{il} - \ln \Bigg (1+\sum _{l=1}^{D}X_{il}\Bigg )\Bigg ]\nonumber \\&\quad +\big (u_{jl}-1 \big ) \ln \alpha _{jl} - \nu _{jl}\alpha _{jl} + {\text {const}} \end{aligned}$$
(49)

where,

$$\begin{aligned} {\mathcal {J}}\big (\alpha _{jl}\big ) = \Bigg <\ln \frac{\varGamma \big (\alpha _{jl}+\sum _{s \ne l}^{D+1}\alpha _{js}\big )}{\varGamma \big (\alpha _{jl}\big )\prod _{s \ne l}^{D+1}\varGamma \big (\alpha _{js}\big )}\Bigg >_{\varTheta \ne \alpha _{jl}} \end{aligned}$$
(50)

Similar to what we encountered in the case of \(R_j\), the expression for \({\mathcal {J}}\big (\alpha _{jl}\big )\) is also intractable. We solve this problem by finding a lower bound through a first-order Taylor expansion with respect to \({\overline{\alpha }}_{jl}\). The calculated lower bound is given by,

$$\begin{aligned} {\mathcal {J}}\big (\alpha _{jl}\big ) \ge \,\,&{\overline{\alpha }}_{jl} \ln \alpha _{jl}\Bigg [\psi \Bigg (\sum _{l=1}^{D+1}{\overline{\alpha }}_{jl}\Bigg )-\psi \big ({\overline{\alpha }}_{jl}\big )+ \sum _{s \ne l}^{D+1}{\overline{\alpha }}_{js}\nonumber \\&\quad \times \psi '\Bigg (\sum _{l=1}^{D+1}{\overline{\alpha }}_{jl}\Bigg )\big (\big <\ln \alpha _{js}\big >-\ln {\overline{\alpha }}_{js}\big )\Bigg ] + {\text {const}} \end{aligned}$$
(51)

This approximation is a strict lower bound of \({\mathcal {J}}\big (\alpha _{jl}\big )\), as proved in [32]. Substituting this lower bound into Eq. (49) gives,

$$\begin{aligned} \ln Q\big (\alpha _{jl}\big )&= \sum _{i=1}^N\big<Z_{ij}\big>{\overline{\alpha }}_{jl} \ln \alpha _{jl}\Bigg [\psi \Bigg (\sum _{l=1}^{D+1}{\overline{\alpha }}_{jl}\Bigg )-\psi \big ({\overline{\alpha }}_{jl}\big )\nonumber \\&\quad + \sum _{s \ne l}^{D+1}{\overline{\alpha }}_{js} \psi '\Bigg (\sum _{l=1}^{D+1}{\overline{\alpha }}_{jl}\Bigg )\big (\big<\ln \alpha _{js}\big>-\ln {\overline{\alpha }}_{js}\big )\Bigg ]\nonumber \\&\quad +\alpha _{jl}\sum _{i=1}^N\big <Z_{ij}\big >\Bigg [\ln X_{il} - \ln \Bigg (1+\sum _{l=1}^{D}X_{il}\Bigg )\Bigg ]\nonumber \\&\quad +\big (u_{jl}-1 \big ) \ln \alpha _{jl} - \nu _{jl}\alpha _{jl} + {\text {const}} \end{aligned}$$
(52)

This equation can be rewritten as,

$$\begin{aligned} \ln Q\big (\alpha _{jl}\big ) = \ln \alpha _{jl}\big (u_{jl}+\varphi _{jl} - 1\big ) - \alpha _{jl}\big (\nu _{jl}-\vartheta _{jl}\big ) + {\text {const}} \end{aligned}$$
(53)

where,

$$\begin{aligned} \varphi _{jl}&=\sum _{i=1}^N\big<Z_{ij}\big>{\overline{\alpha }}_{jl} \Bigg [\psi \Bigg (\sum _{l=1}^{D+1}{\overline{\alpha }}_{jl}\Bigg )-\psi \big ({\overline{\alpha }}_{jl}\big )\nonumber \\&\quad + \sum _{s \ne l}^{D+1}{\overline{\alpha }}_{js} \psi '\Bigg (\sum _{l=1}^{D+1}{\overline{\alpha }}_{jl}\Bigg )\big (\big <\ln \alpha _{js}\big >-\ln {\overline{\alpha }}_{js}\big )\Bigg ] \end{aligned}$$
(54)
$$\begin{aligned} \vartheta _{jl}&= \sum _{i=1}^N\big <Z_{ij}\big >\Bigg [\ln X_{il}- \ln \Bigg (1+\sum _{l=1}^D X_{il}\Bigg )\Bigg ] \end{aligned}$$
(55)

Equation (53) is the logarithmic form of a Gamma distribution. Exponentiating both sides, we get,

$$\begin{aligned} Q\big (\alpha _{jl}\big ) \propto \alpha _{jl}^{u_{jl}+\varphi _{jl} - 1}e^{-\big (\nu _{jl}-\vartheta _{jl}\big )\alpha _{jl}} \end{aligned}$$
(56)

This leaves us with the optimal solutions for the hyper-parameters \(u_{jl}\) and \(\nu _{jl}\), given by,

$$\begin{aligned} u_{jl}^* = u_{jl} + \varphi _{jl},\,\,\,\, \nu _{jl}^* = \nu _{jl}-\vartheta _{jl} \end{aligned}$$
(57)
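Putting Eqs. (54), (55) and (57) together for a single component \(j\), a minimal sketch follows (not the authors' code). It assumes `X` is (N, D) positive data, `r` holds the N expected assignments \(\big <Z_{ij}\big >\), `alpha_bar` and `ln_alpha` are the (D+1)-dimensional Gamma expectations from Eq. (40), and `u`, `nu` are the current hyper-parameters; the \((D+1)\)-th entry of \(\vartheta _j\), which Eq. (55) leaves implicit, is inferred here from the form of the density:

```python
import numpy as np
from scipy.special import digamma, polygamma

def update_hyperparameters(X, r, alpha_bar, ln_alpha, u, nu):
    """Return (u*, nu*) of Eq. (57) for one component j."""
    a_sum = alpha_bar.sum()
    # Cross terms alpha_bar_s * psi'(sum) * (<ln alpha_s> - ln alpha_bar_s).
    cross = alpha_bar * polygamma(1, a_sum) * (ln_alpha - np.log(alpha_bar))
    bracket = digamma(a_sum) - digamma(alpha_bar) + (cross.sum() - cross)
    phi = r.sum() * alpha_bar * bracket                 # Eq. (54)
    log_tail = np.log1p(X.sum(axis=1))                  # ln(1 + sum_l X_il)
    theta = np.empty_like(alpha_bar)
    theta[:-1] = (r[:, None] * (np.log(X) - log_tail[:, None])).sum(axis=0)  # Eq. (55)
    theta[-1] = -(r * log_tail).sum()                   # inferred (D+1)-th entry
    return u + phi, nu - theta                          # Eq. (57)
```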

Cite this article

Maanicshah, K., Amayri, M., Bouguila, N. et al. Unsupervised Learning Using Variational Inference on Finite Inverted Dirichlet Mixture Models with Component Splitting. Wireless Pers Commun 119, 1817–1844 (2021). https://doi.org/10.1007/s11277-021-08308-3
