This model driver implements the regularization method introduced by Fraley and Raftery (2007) for univariate normal mixtures. The default regularization parameters from that paper can be obtained with FLXMCregnorm_defaults(). We extend the method to the multivariate case by assuming independence between variables within components, i.e., we only implement the special case where the covariance matrix is diagonal. For more general applications of normal mixtures, see package mclust.

FLXMCregnorm(formula = . ~ ., params)

Arguments

formula

A formula which is interpreted relative to the formula specified in the call to flexmix::flexmix() using stats::update.formula(). Only the left-hand side (response) of the formula is used. Default is to use the original model formula specified in flexmix::flexmix().

params

Prior parameters for normal mixtures. Default values according to Fraley and Raftery (2007) can be obtained with FLXMCregnorm_defaults(). Because the prior depends on the number of components, it is probably not advisable to run stepFlexmix() with more than one value of k at a time.

Value

An object of class "FLXC".

Details

For the regularization the conjugate prior distributions for the normal distribution are used, which are:

  • Normal prior with mean mu_p and variance sigma^2/kappa_p for the mean.

  • Inverse Gamma prior with shape nu_p/2 and scale zeta_p^2/2 for the variance.
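
As a rough illustration of what these priors imply, the following base R sketch computes the posterior mode (MAP estimate) of the mean and variance for one variable in one component. It assumes the univariate formulas of Fraley and Raftery (2007); whether the driver computes exactly these quantities internally is an assumption. The helper map_regnorm() and the parameter values are hypothetical and not part of flexord.

```r
# Hypothetical helper (not part of flexord): MAP estimates for one variable
# in one component under the conjugate prior described above.
# y: numeric data; w: posterior weights of the observations for the component.
map_regnorm <- function(y, w, mu_p, kappa_p, nu_p, zeta_p) {
  n_k  <- sum(w)                   # effective component size
  ybar <- sum(w * y) / n_k         # weighted mean
  ss   <- sum(w * (y - ybar)^2)    # weighted sum of squares around ybar

  # Mode of the mean: shrinks ybar towards the prior mean mu_p
  mu_hat <- (n_k * ybar + kappa_p * mu_p) / (n_k + kappa_p)

  # Mode of the variance (joint mode of mean and variance); the prior scale
  # zeta_p^2 keeps the estimate away from zero even if ss is zero
  shrink <- kappa_p * n_k / (kappa_p + n_k) * (ybar - mu_p)^2
  sigma2_hat <- (zeta_p^2 + shrink + ss) / (nu_p + n_k + 3)

  list(mean = mu_hat, var = sigma2_hat)
}

# Illustrative values only (assumed, not the package defaults)
y <- c(4.9, 5.1, 5.0, 5.2, 4.8)
w <- rep(1, length(y))
est <- map_regnorm(y, w, mu_p = 5, kappa_p = 0.01, nu_p = 3, zeta_p = 0.2)
```

Because zeta_p^2 enters the numerator, the variance estimate stays strictly positive for a component that collapses onto a single point, which is the purpose of the regularization.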

References

  • Ernst, D, Ortega Menjivar, L, Scharl, T, Grün, B (2025). Ordinal Clustering with the flex-Scheme. Austrian Journal of Statistics. Submitted manuscript.

  • Fraley, C, Raftery, AE (2007). Bayesian Regularization for Normal Mixture Estimation and Model-Based Clustering. Journal of Classification, 24(2), 155-181.

See also

FLXMCregnorm_defaults

Examples

library("flexmix")
library("flexord")
library("flexclust")

# example data
data("iris", package = "datasets")
my_iris <- subset(iris, select = setdiff(colnames(iris), "Species")) |>
    as.matrix()

# cluster one model with a scale parameter similar to the default for 3 components
params <- FLXMCregnorm_defaults(my_iris, zeta_p = c(0.23, 0.06, 1.04, 0.19))
m1 <- stepFlexmix(my_iris ~ 1, k = 3,
    model = FLXMCregnorm(params = params))
#> 3 : * * *
summary(m1)
#> 
#> Call:
#> stepFlexmix(my_iris ~ 1, model = FLXMCregnorm(params = params), 
#>     k = 3)
#> 
#>        prior size post>0 ratio
#> Comp.1 0.383   71    150 0.473
#> Comp.2 0.275   29    150 0.193
#> Comp.3 0.343   50    150 0.333
#> 
#> 'log Lik.' -683.2037 (df=26)
#> AIC: 1418.407   BIC: 1496.684 
#> 

# adjusted Rand index (ARI) of clusters vs species
randIndex(clusters(m1), iris$Species)
#>       ARI 
#> 0.6311837 

# cluster one model with default scale parameter
params <- FLXMCregnorm_defaults(my_iris, k = 3)
m2 <- stepFlexmix(my_iris ~ 1, k = 3,
    model = FLXMCregnorm(params = params))
#> 3 : * * *
summary(m2)
#> 
#> Call:
#> stepFlexmix(my_iris ~ 1, model = FLXMCregnorm(params = params), 
#>     k = 3)
#> 
#>        prior size post>0  ratio
#> Comp.1 0.552   94    150 0.6267
#> Comp.2 0.107    6    146 0.0411
#> Comp.3 0.341   50    150 0.3333
#> 
#> 'log Lik.' -684.242 (df=26)
#> AIC: 1420.484   BIC: 1498.76 
#> 

# adjusted Rand index (ARI) of clusters vs species
randIndex(clusters(m2), iris$Species)
#>       ARI 
#> 0.5596496 

# adjusted Rand index between both models (ideally >= 0.8; no seed is set,
# so results vary between runs, and the run shown falls short of that)
randIndex(clusters(m1), clusters(m2))
#>       ARI 
#> 0.6833932
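
Since the default prior depends on the number of components, one way to compare several values of k is to recompute the defaults for each k and fit one model per value, rather than passing a vector of k to a single stepFlexmix() call. The following is a minimal sketch under that assumption; the seed, the range of k, and the use of BIC for comparison are illustrative choices, not recommendations from the package authors.

```r
library("flexmix")
library("flexord")

data("iris", package = "datasets")
my_iris <- as.matrix(iris[, setdiff(colnames(iris), "Species")])

set.seed(1)  # assumption: fixed only to make this sketch reproducible
ks <- 2:4    # illustrative range of component numbers

models <- lapply(ks, function(k) {
  # recompute the prior parameters for this number of components
  params <- FLXMCregnorm_defaults(my_iris, k = k)
  # fit a single value of k per call
  stepFlexmix(my_iris ~ 1, k = k,
      model = FLXMCregnorm(params = params))
})
names(models) <- paste0("k", ks)

# compare the fitted models, e.g. by BIC
sapply(models, BIC)
```

Note that flexmix may drop components during fitting, so a fitted model can end up with fewer components than the k it was started with.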