This model driver implements the regularization method introduced by
Fraley and Raftery (2007) for univariate normal mixtures. Default
parameters for the regularization according to that paper may be
obtained using FLXMCregnorm_defaults(). We extend this to the
multivariate case assuming independence between variables within
components, i.e., we only implement the special case where the
covariance matrix is diagonal. For more general applications of normal
mixtures see package mclust.
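The diagonal-covariance restriction means that, within a component, the joint density factorizes into univariate normal densities. The following base R sketch illustrates that identity only; the helper names `log_dens_diag` and `log_dens_full` and the numeric values are made up for illustration and are not part of the package.

```r
# Log-density under a normal with diagonal covariance:
# the sum of univariate normal log-densities, one per variable.
log_dens_diag <- function(x, mean, sd) {
  sum(dnorm(x, mean = mean, sd = sd, log = TRUE))
}

# Reference: general multivariate normal log-density, written out in base R.
log_dens_full <- function(x, mean, Sigma) {
  d <- length(x)
  dev <- x - mean
  log_det <- as.numeric(determinant(Sigma, logarithm = TRUE)$modulus)
  -0.5 * (d * log(2 * pi) + log_det + sum(dev * solve(Sigma, dev)))
}

# Illustrative values (assumed, not estimated from any data set)
x  <- c(5.1, 3.5)
mu <- c(5.0, 3.4)
s  <- c(0.35, 0.38)

a <- log_dens_diag(x, mu, s)
b <- log_dens_full(x, mu, diag(s^2))
all.equal(a, b)
```

Models that need correlated variables within a component fall outside this special case, which is where mclust comes in.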
FLXMCregnorm(formula = . ~ ., params)
formula
A formula which is interpreted relative to the formula specified in
the call to flexmix::flexmix() using stats::update.formula(). Only the
left-hand side (response) of the formula is used. The default is to
use the original model formula specified in flexmix::flexmix().
params
Prior parameters for normal mixtures. You may obtain default values
according to Fraley and Raftery (2007) using FLXMCregnorm_defaults().
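How a driver formula such as `. ~ .` is resolved against a model formula can be sketched with stats::update.formula() alone. The variable names `y`, `x`, `a` and `b` below are placeholders, and `response_of` is a hypothetical helper, not a package function.

```r
# Model formula as it might be given to flexmix::flexmix() (placeholder names)
base_formula <- y ~ x

# The default `. ~ .` keeps both sides of the model formula unchanged
same <- update(base_formula, . ~ .)

# Supplying a different left-hand side replaces only the response
new_lhs <- update(base_formula, cbind(a, b) ~ .)

# Hypothetical helper: extract the response part of a two-sided formula as text
response_of <- function(f) deparse(f[[2]])

response_of(same)
response_of(new_lhs)
```

Since only the left-hand side is used by this driver, the response is the part of the updated formula that matters.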
As the prior depends on the number of components, it is probably not
advisable to run stepFlexmix with more than one value of k at a time.
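One way to follow this advice when several values of k are of interest is to recompute the prior parameters for each k and call stepFlexmix once per value. The sketch below is an assumption about a reasonable workflow, not a documented recipe: `fit_for_k` is a hypothetical helper, and only calls that appear in the examples of this page are used.

```r
# Hypothetical helper (not part of flexord): fit the regularized model for a
# single k, with prior parameters computed for exactly that k.
fit_for_k <- function(x, k) {
  params <- FLXMCregnorm_defaults(x, k = k)
  stepFlexmix(x ~ 1, k = k, model = FLXMCregnorm(params = params))
}

# Run only if the required packages are installed
if (requireNamespace("flexmix", quietly = TRUE) &&
    requireNamespace("flexord", quietly = TRUE)) {
  library("flexmix")
  library("flexord")

  data("iris", package = "datasets")
  my_iris <- as.matrix(iris[, setdiff(colnames(iris), "Species")])

  # One separate call per k, so that each fit uses its own prior
  ks <- 2:4
  fits <- lapply(ks, function(k) fit_for_k(my_iris, k))
  names(fits) <- paste0("k", ks)
}
```

The resulting list holds one fitted model per k, each estimated under the prior that matches its number of components.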
An object of class "FLXC".
For the regularization, the conjugate prior distributions for the normal distribution are used:
Normal prior with mean mu_p and variance sigma^2/kappa_p for the mean.
Inverse Gamma prior with shape nu_p/2 and scale zeta_p^2/2 for the variance.
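To see what these priors do in practice, the sketch below computes the joint posterior mode (MAP estimate) of mean and variance for a single univariate component. It is derived from the priors as stated above and is not taken from the package source; `map_normal`, its argument `zeta_p2` (standing for zeta_p^2), the weights `w` and all numeric values are assumptions made for illustration.

```r
# MAP estimates for one univariate normal component under a
# N(mu_p, sigma^2 / kappa_p) prior on the mean and an
# inverse gamma(nu_p / 2, zeta_p2 / 2) prior on the variance.
# `w` holds the weights of the observations for this component
# (all 1 corresponds to a hard assignment).
map_normal <- function(y, mu_p, kappa_p, nu_p, zeta_p2,
                       w = rep(1, length(y))) {
  n_k  <- sum(w)                      # effective component size
  ybar <- sum(w * y) / n_k            # weighted mean
  ss   <- sum(w * (y - ybar)^2)       # weighted sum of squares
  # The mean is shrunk towards mu_p with strength kappa_p
  mean_map <- (n_k * ybar + kappa_p * mu_p) / (n_k + kappa_p)
  # Extra spread from the distance between data mean and prior mean
  shrink <- kappa_p * n_k / (kappa_p + n_k) * (ybar - mu_p)^2
  # Mode of the joint posterior in sigma^2
  var_map <- (zeta_p2 + shrink + ss) / (nu_p + n_k + 3)
  list(mean = mean_map, var = var_map)
}

# A component with a single observation has zero sample variance,
# but the regularized variance estimate stays strictly positive.
est_single <- map_normal(y = 4.2, mu_p = 5, kappa_p = 0.01,
                         nu_p = 3, zeta_p2 = 0.2)

# Small example with several observations
est_small <- map_normal(y = c(1, 2, 3), mu_p = 2, kappa_p = 1,
                        nu_p = 1, zeta_p2 = 1)
```

The positive variance for a single observation shows why the regularization helps: it keeps components from degenerating when they capture very few points.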
Ernst, D, Ortega Menjivar, L, Scharl, T, Grün, B (2025). Ordinal Clustering with the flex-Scheme. Austrian Journal of Statistics. Submitted manuscript.
Fraley, C, Raftery, AE (2007). Bayesian Regularization for Normal Mixture Estimation and Model-Based Clustering. Journal of Classification, 24(2), 155-181.
FLXMCregnorm_defaults
library("flexmix")
library("flexord")
library("flexclust")
# example data
data("iris", package = "datasets")
my_iris <- subset(iris, select = setdiff(colnames(iris), "Species")) |>
  as.matrix()
# cluster one model with a scale parameter similar to the default for 3 components
params <- FLXMCregnorm_defaults(my_iris, zeta_p = c(0.23, 0.06, 1.04, 0.19))
m1 <- stepFlexmix(my_iris ~ 1, k = 3,
                  model = FLXMCregnorm(params = params))
#> 3 : * * *
summary(m1)
#>
#> Call:
#> stepFlexmix(my_iris ~ 1, model = FLXMCregnorm(params = params),
#>     k = 3)
#>
#>        prior size post>0 ratio
#> Comp.1 0.383   71    150 0.473
#> Comp.2 0.275   29    150 0.193
#> Comp.3 0.343   50    150 0.333
#>
#> 'log Lik.' -683.2037 (df=26)
#> AIC: 1418.407 BIC: 1496.684
#>
# adjusted Rand index of clusters vs. species
randIndex(clusters(m1), iris$Species)
#> ARI
#> 0.6311837
# cluster one model with default scale parameter
params <- FLXMCregnorm_defaults(my_iris, k = 3)
m2 <- stepFlexmix(my_iris ~ 1, k = 3,
                  model = FLXMCregnorm(params = params))
#> 3 : * * *
summary(m2)
#>
#> Call:
#> stepFlexmix(my_iris ~ 1, model = FLXMCregnorm(params = params),
#>     k = 3)
#>
#>        prior size post>0  ratio
#> Comp.1 0.552   94    150 0.6267
#> Comp.2 0.107    6    146 0.0411
#> Comp.3 0.341   50    150 0.3333
#>
#> 'log Lik.' -684.242 (df=26)
#> AIC: 1420.484 BIC: 1498.76
#>
# adjusted Rand index of clusters vs. species
randIndex(clusters(m2), iris$Species)
#> ARI
#> 0.5596496
# adjusted Rand index between both models (should be >= 0.8, but varies with the random initialization)
randIndex(clusters(m1), clusters(m2))
#> ARI
#> 0.6833932