Title:  Processing of Model Parameters 

Description:  Utilities for processing the parameters of various statistical models. Beyond computing p values, CIs, and other indices for a wide variety of models (see list of supported models using the function 'insight::supported_models()'), this package implements features like bootstrapping or simulating of parameters and models, feature reduction (feature extraction and variable selection) as well as functions to describe data and variable characteristics (e.g. skewness, kurtosis, smoothness or distribution). 
Authors:  Daniel Lüdecke [aut, cre], Dominique Makowski [aut], Mattan S. Ben-Shachar [aut], Indrajeet Patil [aut], Søren Højsgaard [aut], Brenton M. Wiernik [aut], Zen J. Lau [ctb], Vincent Arel-Bundock [ctb], Jeffrey Girard [ctb], Christina Maimone [rev], Niels Ohlsen [rev], Douglas Ezra Morrison [ctb], Joseph Luchman [ctb]
Maintainer:  Daniel Lüdecke <[email protected]> 
License:  GPL-3
Version:  0.23.0.8 
Built:  2024-11-12 09:27:03 UTC
Source:  https://github.com/easystats/parameters 
Bootstrap a statistical model n times to return a data frame of estimates.
bootstrap_model(model, iterations = 1000, ...)

## Default S3 method:
bootstrap_model(
  model,
  iterations = 1000,
  type = "ordinary",
  parallel = c("no", "multicore", "snow"),
  n_cpus = 1,
  verbose = FALSE,
  ...
)

## S3 method for class 'merMod'
bootstrap_model(
  model,
  iterations = 1000,
  type = "parametric",
  parallel = c("no", "multicore", "snow"),
  n_cpus = 1,
  cluster = NULL,
  verbose = FALSE,
  ...
)
model 
Statistical model. 
iterations 
The number of draws to simulate/bootstrap. 
... 
Arguments passed to or from other methods. 
type 
Character string specifying the type of bootstrap. For mixed models
of class 
parallel 
The type of parallel operation to be used (if any). 
n_cpus 
Number of processes to be used in parallel operation. 
verbose 
Toggle warnings and messages. 
cluster 
Optional cluster when 
By default, boot::boot() is used to generate bootstraps from the model data, which are then used to update() the model, i.e. refit the model with the bootstrapped samples. For merMod objects (lme4) or models from glmmTMB, the lme4::bootMer() function is used to obtain bootstrapped samples. bootstrap_parameters() summarizes the bootstrapped model estimates.
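The resample-and-refit idea can be sketched in base R. This is only an illustration under simplifying assumptions (a plain lm fit, case resampling with sample(), 200 iterations), not the package's implementation, which relies on boot::boot(). The helper name refit_once is hypothetical.

```r
# Illustrative case-resampling bootstrap (base R only; not the package code)
set.seed(123)
model <- lm(mpg ~ wt + hp, data = mtcars)

# Hypothetical helper: draw rows with replacement and refit the model
refit_once <- function(fit, data) {
  idx <- sample(nrow(data), replace = TRUE)
  coef(update(fit, data = data[idx, , drop = FALSE]))
}

iterations <- 200
# One row per bootstrap replicate, one column per coefficient
boot_draws <- as.data.frame(
  t(replicate(iterations, refit_once(model, mtcars)))
)
head(boot_draws)
```

Each row of boot_draws is one refitted set of coefficients, which mirrors the shape of the data frame described for bootstrap_model().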
A data frame of bootstrapped estimates.
The output can be passed directly to the various functions from the emmeans package to obtain bootstrapped estimates, contrasts, simple slopes, etc. and their confidence intervals. These can then be passed to model_parameters() to obtain standard errors, p-values, etc. (see example).
Note that the p-values returned here are estimated under the assumption of translation equivariance: that the shape of the sampling distribution is unaffected by whether the null is true. If this assumption does not hold, p-values can be biased, and it is suggested to use proper permutation tests to obtain non-parametric p-values.
bootstrap_parameters(), simulate_model(), simulate_parameters()
model <- lm(mpg ~ wt + factor(cyl), data = mtcars)
b <- bootstrap_model(model)
print(head(b))

est <- emmeans::emmeans(b, consec ~ cyl)
print(model_parameters(est))
Compute bootstrapped parameters and their related indices such as Confidence Intervals (CI) and p-values.
bootstrap_parameters(model, ...)

## Default S3 method:
bootstrap_parameters(
  model,
  iterations = 1000,
  centrality = "median",
  ci = 0.95,
  ci_method = "quantile",
  test = "p-value",
  ...
)
model 
Statistical model. 
... 
Arguments passed to other methods, like 
iterations 
The number of draws to simulate/bootstrap. 
centrality 
The point-estimates (centrality indices) to compute. Character
(vector) or list with one or more of these options: 
ci 
Value or vector of probability of the CI (between 0 and 1)
to be estimated. Default to 
ci_method 
The type of index used for Credible Interval. Can be 
test 
The indices to compute. Character (vector) with one or more of
these options: 
This function first calls bootstrap_model() to generate bootstrapped coefficients. The resulting replicates for each coefficient are treated as a "distribution" and passed to bayestestR::describe_posterior() to calculate the related indices defined in the "test" argument.
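To make the summarizing step concrete, here is a rough base R sketch that treats bootstrap replicates as a distribution and reports the median with a 95% quantile interval. It assumes a simple lm and case resampling; it does not call bayestestR::describe_posterior() and is not the package's code.

```r
# Illustrative summary of bootstrap replicates (base R only)
set.seed(2)
draws <- t(replicate(300, {
  idx <- sample(nrow(iris), replace = TRUE)   # resample rows with replacement
  coef(lm(Sepal.Length ~ Petal.Width, data = iris[idx, ]))
}))

# Median as centrality index, equal-tailed 95% interval from the quantiles
summary_table <- data.frame(
  Parameter = colnames(draws),
  Median = apply(draws, 2, median),
  CI_low = apply(draws, 2, quantile, probs = 0.025),
  CI_high = apply(draws, 2, quantile, probs = 0.975),
  row.names = NULL
)
summary_table
```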
Note that the p-values returned here are estimated under the assumption of translation equivariance: that the shape of the sampling distribution is unaffected by whether the null is true. If this assumption does not hold, p-values can be biased, and it is suggested to use proper permutation tests to obtain non-parametric p-values.
A data frame summarizing the bootstrapped parameters.
The output can be passed directly to the various functions from the emmeans package to obtain bootstrapped estimates, contrasts, simple slopes, etc. and their confidence intervals. These can then be passed to model_parameters() to obtain standard errors, p-values, etc. (see example).
Note that the p-values returned here are estimated under the assumption of translation equivariance: that the shape of the sampling distribution is unaffected by whether the null is true. If this assumption does not hold, p-values can be biased, and it is suggested to use proper permutation tests to obtain non-parametric p-values.
Davison, A. C., & Hinkley, D. V. (1997). Bootstrap methods and their application (Vol. 1). Cambridge University Press.
bootstrap_model(), simulate_parameters(), simulate_model()
set.seed(2)
model <- lm(Sepal.Length ~ Species * Petal.Width, data = iris)
b <- bootstrap_parameters(model)
print(b)

# different type of bootstrapping
set.seed(2)
b <- bootstrap_parameters(model, type = "balanced")
print(b)

est <- emmeans::emmeans(b, trt.vs.ctrl ~ Species)
print(model_parameters(est))
Approximation of degrees of freedom based on a "between-within" heuristic.
ci_betwithin(model, ci = 0.95, ...)

dof_betwithin(model)

p_value_betwithin(model, dof = NULL, ...)
model 
A mixed model. 
ci 
Confidence Interval (CI) level. Default to 
... 
Additional arguments passed down to the underlying functions.
E.g., arguments like 
dof 
Degrees of Freedom. 
Inferential statistics (like p-values, confidence intervals and standard errors) may be biased in mixed models when the number of clusters is small (even if the sample size of level-1 units is high). In such cases it is recommended to approximate a more accurate number of degrees of freedom for such inferential statistics (see Li and Redden 2015). The Between-within denominator degrees of freedom approximation is recommended in particular for (generalized) linear mixed models with repeated measurements (longitudinal design). dof_betwithin() implements a heuristic based on the between-within approach. Note that this implementation does not return exactly the same results as shown in Li and Redden 2015, but similar.
In particular for repeated measure designs (longitudinal data analysis), the between-within heuristic is likely to be more accurate than simply using the residual or infinite degrees of freedom, because dof_betwithin() returns different degrees of freedom for within-cluster and between-cluster effects.
A data frame.
Elff, M.; Heisig, J.P.; Schaeffer, M.; Shikano, S. (2019). Multilevel Analysis with Few Clusters: Improving Likelihood-based Methods to Provide Unbiased Estimates and Accurate Inference, British Journal of Political Science.
Li, P., Redden, D. T. (2015). Comparing denominator degrees of freedom approximations for the generalized linear mixed model in analyzing binary outcome in small sample cluster-randomized trials. BMC Medical Research Methodology, 15(1), 38. doi:10.1186/s12874-015-0026-x
dof_betwithin() is a small helper-function to calculate approximated degrees of freedom of model parameters, based on the "between-within" heuristic.
if (require("lme4")) {
  data(sleepstudy)
  model <- lmer(Reaction ~ Days + (1 + Days | Subject), data = sleepstudy)
  dof_betwithin(model)
  p_value_betwithin(model)
}
An approximate F-test based on the Kenward-Roger (1997) approach.
ci_kenward(model, ci = 0.95)

dof_kenward(model)

p_value_kenward(model, dof = NULL)

se_kenward(model)
model 
A statistical model. 
ci 
Confidence Interval (CI) level. Default to 
dof 
Degrees of Freedom. 
Inferential statistics (like p-values, confidence intervals and standard errors) may be biased in mixed models when the number of clusters is small (even if the sample size of level-1 units is high). In such cases it is recommended to approximate a more accurate number of degrees of freedom for such inferential statistics. Unlike simpler approximation heuristics like the "m-l-1" rule (dof_ml1()), the Kenward-Roger approximation is also applicable in more complex multilevel designs, e.g. with cross-classified clusters. However, the "m-l-1" heuristic also applies to generalized mixed models, while approaches like Kenward-Roger or Satterthwaite are limited to linear mixed models only.
A data frame.
Kenward, M. G., & Roger, J. H. (1997). Small sample inference for fixed effects from restricted maximum likelihood. Biometrics, 983-997.
dof_kenward() and se_kenward() are small helper-functions to calculate approximated degrees of freedom and standard errors for model parameters, based on the Kenward-Roger (1997) approach.
dof_satterthwaite() and dof_ml1() approximate degrees of freedom based on Satterthwaite's method or the "m-l-1" rule.
if (require("lme4", quietly = TRUE)) {
  model <- lmer(Petal.Length ~ Sepal.Length + (1 | Species), data = iris)
  p_value_kenward(model)
}
Approximation of degrees of freedom based on a "m-l-1" heuristic as suggested by Elff et al. (2019).
ci_ml1(model, ci = 0.95, ...)

dof_ml1(model)

p_value_ml1(model, dof = NULL, ...)
model 
A mixed model. 
ci 
Confidence Interval (CI) level. Default to 
... 
Additional arguments passed down to the underlying functions.
E.g., arguments like 
dof 
Degrees of Freedom. 
Inferential statistics (like p-values, confidence intervals and standard errors) may be biased in mixed models when the number of clusters is small (even if the sample size of level-1 units is high). In such cases it is recommended to approximate a more accurate number of degrees of freedom for such inferential statistics (see Li and Redden 2015). The m-l-1 heuristic is such an approach that uses a t-distribution with fewer degrees of freedom (dof_ml1()) to calculate p-values (p_value_ml1()) and confidence intervals (ci(method = "ml1")).
In particular for repeated measure designs (longitudinal data analysis), the m-l-1 heuristic is likely to be more accurate than simply using the residual or infinite degrees of freedom, because dof_ml1() returns different degrees of freedom for within-cluster and between-cluster effects.
Note that the "m-l-1" heuristic is not applicable (or at least less accurate) for complex multilevel designs, e.g. with cross-classified clusters. In such cases, more accurate approaches like the Kenward-Roger approximation (dof_kenward()) are recommended. However, the "m-l-1" heuristic also applies to generalized mixed models, while approaches like Kenward-Roger or Satterthwaite are limited to linear mixed models only.
A data frame.
Elff, M.; Heisig, J.P.; Schaeffer, M.; Shikano, S. (2019). Multilevel Analysis with Few Clusters: Improving Likelihood-based Methods to Provide Unbiased Estimates and Accurate Inference, British Journal of Political Science.
Li, P., Redden, D. T. (2015). Comparing denominator degrees of freedom approximations for the generalized linear mixed model in analyzing binary outcome in small sample cluster-randomized trials. BMC Medical Research Methodology, 15(1), 38. doi:10.1186/s12874-015-0026-x
dof_ml1() is a small helper-function to calculate approximated degrees of freedom of model parameters, based on the "m-l-1" heuristic.
if (require("lme4")) {
  model <- lmer(Petal.Length ~ Sepal.Length + (1 | Species), data = iris)
  p_value_ml1(model)
}
An approximate F-test based on the Satterthwaite (1946) approach.
ci_satterthwaite(model, ci = 0.95, ...)

dof_satterthwaite(model)

p_value_satterthwaite(model, dof = NULL, ...)

se_satterthwaite(model)
model 
A statistical model. 
ci 
Confidence Interval (CI) level. Default to 
... 
Additional arguments passed down to the underlying functions.
E.g., arguments like 
dof 
Degrees of Freedom. 
Inferential statistics (like p-values, confidence intervals and standard errors) may be biased in mixed models when the number of clusters is small (even if the sample size of level-1 units is high). In such cases it is recommended to approximate a more accurate number of degrees of freedom for such inferential statistics. Unlike simpler approximation heuristics like the "m-l-1" rule (dof_ml1()), the Satterthwaite approximation is also applicable in more complex multilevel designs. However, the "m-l-1" heuristic also applies to generalized mixed models, while approaches like Kenward-Roger or Satterthwaite are limited to linear mixed models only.
A data frame.
Satterthwaite FE (1946) An approximate distribution of estimates of variance components. Biometrics Bulletin 2 (6):110–4.
dof_satterthwaite() and se_satterthwaite() are small helper-functions to calculate approximated degrees of freedom and standard errors for model parameters, based on the Satterthwaite (1946) approach.
dof_kenward() and dof_ml1() approximate degrees of freedom based on Kenward-Roger's method or the "m-l-1" rule.
if (require("lme4", quietly = TRUE)) {
  model <- lmer(Petal.Length ~ Sepal.Length + (1 | Species), data = iris)
  p_value_satterthwaite(model)
}
ci() attempts to return confidence intervals of model parameters.
## Default S3 method:
ci(x, ci = 0.95, dof = NULL, method = NULL, ...)

## S3 method for class 'glmmTMB'
ci(
  x,
  ci = 0.95,
  dof = NULL,
  method = "wald",
  component = "all",
  verbose = TRUE,
  ...
)

## S3 method for class 'merMod'
ci(x, ci = 0.95, dof = NULL, method = "wald", iterations = 500, ...)
x 
A statistical model. 
ci 
Confidence Interval (CI) level. Default to 
dof 
Number of degrees of freedom to be used when calculating
confidence intervals. If 
method 
Method for computing degrees of freedom for
confidence intervals (CI) and the related p-values. The following options
are allowed (which vary depending on the model class): 
... 
Additional arguments passed down to the underlying functions.
E.g., arguments like 
component 
Model component for which parameters should be shown. See
the documentation for your object's class in 
verbose 
Toggle warnings and messages. 
iterations 
The number of bootstrap replicates. Only applies to models
of class 
A data frame containing the CI bounds.
There are different ways of approximating the degrees of freedom depending on different assumptions about the nature of the model and its sampling distribution. The ci_method argument modulates the method for computing degrees of freedom (df) that are used to calculate confidence intervals (CI) and the related p-values. The following options are allowed, depending on the model class:
Classical methods:
Classical inference is generally based on the Wald method. The Wald approach to inference computes a test statistic by dividing the parameter estimate by its standard error (Coefficient / SE), then comparing this statistic against a t- or normal distribution. This approach can be used to compute CIs and p-values.
"wald": Applies to non-Bayesian models. For linear models, CIs computed using the Wald method (SE and a t-distribution with residual df); p-values computed using the Wald method with a t-distribution with residual df. For other models, CIs computed using the Wald method (SE and a normal distribution); p-values computed using the Wald method with a normal distribution.
"normal": Applies to non-Bayesian models. Compute Wald CIs and p-values, but always use a normal distribution.
"residual": Applies to non-Bayesian models. Compute Wald CIs and p-values, but always use a t-distribution with residual df when possible. If the residual df for a model cannot be determined, a normal distribution is used instead.
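The Wald computation described for these classical methods can be reproduced by hand in base R. The following is a minimal sketch for a linear model, shown only to illustrate the arithmetic (estimate, SE, and a t- or normal reference distribution); it does not use the package's own functions.

```r
# Wald CIs and p-values computed by hand for a linear model (base R only)
model <- lm(mpg ~ wt + hp, data = mtcars)

est <- coef(model)
se <- sqrt(diag(vcov(model)))       # standard errors
df_resid <- df.residual(model)      # residual degrees of freedom
t_stat <- est / se                  # Wald statistic: Coefficient / SE

# t-distribution with residual df (the "wald"/"residual" case for lm)
crit_t <- qt(0.975, df = df_resid)
wald_ci <- cbind(CI_low = est - crit_t * se, CI_high = est + crit_t * se)
p_t <- 2 * pt(abs(t_stat), df = df_resid, lower.tail = FALSE)

# Same statistic, but always referred to a normal distribution ("normal")
crit_z <- qnorm(0.975)
normal_ci <- cbind(CI_low = est - crit_z * se, CI_high = est + crit_z * se)
p_z <- 2 * pnorm(abs(t_stat), lower.tail = FALSE)

wald_ci
normal_ci
```

For this model the t-based intervals match confint(model), and the normal-based intervals are slightly narrower because the normal critical value is smaller than the t critical value at finite df.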
Methods for mixed models:
Compared to fixed effects (or single-level) models, determining appropriate df for Wald-based inference in mixed models is more difficult. See the R GLMM FAQ for a discussion.
Several approximate methods for computing df are available, but you should also consider using profile likelihood ("profile") or bootstrap ("boot") CIs and p-values instead.
"satterthwaite": Applies to linear mixed models. CIs computed using the Wald method (SE and a t-distribution with Satterthwaite df); p-values computed using the Wald method with a t-distribution with Satterthwaite df.
"kenward": Applies to linear mixed models. CIs computed using the Wald method (Kenward-Roger SE and a t-distribution with Kenward-Roger df); p-values computed using the Wald method with Kenward-Roger SE and t-distribution with Kenward-Roger df.
"ml1": Applies to linear mixed models. CIs computed using the Wald method (SE and a t-distribution with m-l-1 approximated df); p-values computed using the Wald method with a t-distribution with m-l-1 approximated df. See ci_ml1().
"betwithin": Applies to linear mixed models and generalized linear mixed models. CIs computed using the Wald method (SE and a t-distribution with between-within df); p-values computed using the Wald method with a t-distribution with between-within df. See ci_betwithin().
Likelihood-based methods:
Likelihood-based inference is based on comparing the likelihood for the maximum-likelihood estimate to the likelihood for models with one or more parameter values changed (e.g., set to zero or a range of alternative values). Likelihood ratios for the maximum-likelihood and alternative models are compared to a $\chi$-squared distribution to compute CIs and p-values.
"profile": Applies to non-Bayesian models of class glm, polr, merMod or glmmTMB. CIs computed by profiling the likelihood curve for a parameter, using linear interpolation to find where likelihood ratio equals a critical value; p-values computed using the Wald method with a normal-distribution (note: this might change in a future update!)
"uniroot": Applies to non-Bayesian models of class glmmTMB. CIs computed by profiling the likelihood curve for a parameter, using root finding to find where likelihood ratio equals a critical value; p-values computed using the Wald method with a normal-distribution (note: this might change in a future update!)
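The likelihood-ratio comparison underlying these methods can be illustrated in base R. This sketch assumes a simple logistic regression and tests a single parameter by setting it to zero; it shows the chi-squared comparison only and does not profile the likelihood to build an interval.

```r
# Likelihood-ratio test for one parameter of a logistic regression (base R)
full <- glm(am ~ wt, data = mtcars, family = binomial())
reduced <- glm(am ~ 1, data = mtcars, family = binomial())  # wt set to zero

# Twice the log-likelihood difference between the two fits
lr_stat <- as.numeric(2 * (logLik(full) - logLik(reduced)))

# Compare against a chi-squared distribution with 1 df (one parameter changed)
p_lr <- pchisq(lr_stat, df = 1, lower.tail = FALSE)

lr_stat
p_lr
```

A profile-likelihood interval follows the same logic in reverse: it collects the parameter values for which this statistic stays below the chi-squared critical value.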
Methods for bootstrapped or Bayesian models:
Bootstrap-based inference is based on resampling and refitting the model to the resampled datasets. The distribution of parameter estimates across resampled datasets is used to approximate the parameter's sampling distribution. Depending on the type of model, several different methods for bootstrapping and constructing CIs and p-values from the bootstrap distribution are available.
For Bayesian models, inference is based on drawing samples from the model posterior distribution.
"quantile" (or "eti"): Applies to all models (including Bayesian models). For non-Bayesian models, only applies if bootstrap = TRUE. CIs computed as equal tailed intervals using the quantiles of the bootstrap or posterior samples; p-values are based on the probability of direction. See bayestestR::eti().
"hdi": Applies to all models (including Bayesian models). For non-Bayesian models, only applies if bootstrap = TRUE. CIs computed as highest density intervals for the bootstrap or posterior samples; p-values are based on the probability of direction. See bayestestR::hdi().
"bci" (or "bcai"): Applies to all models (including Bayesian models). For non-Bayesian models, only applies if bootstrap = TRUE. CIs computed as bias corrected and accelerated intervals for the bootstrap or posterior samples; p-values are based on the probability of direction. See bayestestR::bci().
"si": Applies to Bayesian models with proper priors. CIs computed as support intervals comparing the posterior samples against the prior samples; p-values are based on the probability of direction. See bayestestR::si().
"boot": Applies to non-Bayesian models of class merMod. CIs computed using parametric bootstrapping (simulating data from the fitted model); p-values computed using the Wald method with a normal-distribution (note: this might change in a future update!).
For all iteration-based methods other than "boot" ("hdi", "quantile", "ci", "eti", "si", "bci", "bcai"), p-values are based on the probability of direction (bayestestR::p_direction()), which is converted into a p-value using bayestestR::pd_to_p().
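As a rough illustration of the last point, the probability of direction and its conversion to a p-value can be computed by hand. The sketch below uses simulated draws as a stand-in for bootstrap or posterior samples and assumes the two-sided conversion p = 2 * (1 - pd); it does not call bayestestR, so treat the formula as an assumption about that default rather than a statement of the library's internals.

```r
# Probability of direction (pd) and a two-sided p-value from draws (base R)
set.seed(1)
draws <- rnorm(2000, mean = 0.3, sd = 0.2)  # stand-in for bootstrap/posterior draws

# pd: share of draws on the dominant side of zero
pd <- max(mean(draws > 0), mean(draws < 0))

# Assumed two-sided conversion from pd to a p-value
p_two_sided <- 2 * (1 - pd)

pd
p_two_sided
```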
data(qol_cancer)
model <- lm(QoL ~ time + age + education, data = qol_cancer)

# regular confidence intervals
ci(model)

# using heteroscedasticity-robust standard errors
ci(model, vcov = "HC3")

library(parameters)
data(Salamanders, package = "glmmTMB")
model <- glmmTMB::glmmTMB(
  count ~ spp + mined + (1 | site),
  ziformula = ~mined,
  family = poisson(),
  data = Salamanders
)

ci(model)
ci(model, component = "zi")
Compute hierarchical or k-means cluster analysis and return the group assignment for each observation as a vector.
cluster_analysis(
  x,
  n = NULL,
  method = "kmeans",
  include_factors = FALSE,
  standardize = TRUE,
  verbose = TRUE,
  distance_method = "euclidean",
  hclust_method = "complete",
  kmeans_method = "Hartigan-Wong",
  dbscan_eps = 15,
  iterations = 100,
  ...
)
x 
A data frame (with at least two variables), or a matrix (with at least two columns). 
n 
Number of clusters used for supervised cluster methods. If 
method 
Method for computing the cluster analysis. Can be 
include_factors 
Logical, if 
standardize 
Standardize the dataframe before clustering (default). 
verbose 
Toggle warnings and messages. 
distance_method 
Distance measure to be used for methods based on
distances (e.g., when 
hclust_method 
Agglomeration method to be used when 
kmeans_method 
Algorithm used for calculating k-means cluster. Only applies,
if 
dbscan_eps 
The 
iterations 
The number of replications. 
... 
Arguments passed to or from other methods. 
The print() and plot() methods show the (standardized) mean value for each variable within each cluster. Thus, a higher absolute value indicates that a certain variable characteristic is more pronounced within that specific cluster (as compared to other cluster groups with lower absolute mean values).
Cluster classification can be obtained via predict(x, newdata = NULL, ...).
The group classification for each observation as vector. The returned vector includes missing values, so it has the same length as nrow(x).
There is also a plot() method implemented in the see-package.
Maechler M, Rousseeuw P, Struyf A, Hubert M, Hornik K (2014) cluster: Cluster Analysis Basics and Extensions. R package.
n_clusters() to determine the number of clusters to extract.
cluster_discrimination() to determine the accuracy of cluster group classification via linear discriminant analysis (LDA).
performance::check_clusterstructure() to check suitability of data for clustering.
https://www.datanovia.com/en/lessons/
set.seed(33)
# K-Means ====================================================
rez <- cluster_analysis(iris[1:4], n = 3, method = "kmeans")
rez # Show results
predict(rez) # Get clusters
summary(rez) # Extract the centers values (can use 'plot()' on that)
if (requireNamespace("MASS", quietly = TRUE)) {
  cluster_discrimination(rez) # Perform LDA
}

# Hierarchical k-means (more robust k-means)
if (require("factoextra", quietly = TRUE)) {
  rez <- cluster_analysis(iris[1:4], n = 3, method = "hkmeans")
  rez # Show results
  predict(rez) # Get clusters
}

# Hierarchical Clustering (hclust) ===========================
rez <- cluster_analysis(iris[1:4], n = 3, method = "hclust")
rez # Show results
predict(rez) # Get clusters

# K-Medoids (pam) ============================================
if (require("cluster", quietly = TRUE)) {
  rez <- cluster_analysis(iris[1:4], n = 3, method = "pam")
  rez # Show results
  predict(rez) # Get clusters
}

# PAM with automated number of clusters
if (require("fpc", quietly = TRUE)) {
  rez <- cluster_analysis(iris[1:4], method = "pamk")
  rez # Show results
  predict(rez) # Get clusters
}

# DBSCAN ====================================================
if (require("dbscan", quietly = TRUE)) {
  # Note that you can assimilate more outliers (cluster 0) to neighbouring
  # clusters by setting borderPoints = TRUE.
  rez <- cluster_analysis(iris[1:4], method = "dbscan", dbscan_eps = 1.45)
  rez # Show results
  predict(rez) # Get clusters
}

# Mixture ====================================================
if (require("mclust", quietly = TRUE)) {
  library(mclust) # Needs the package to be loaded
  rez <- cluster_analysis(iris[1:4], method = "mixture")
  rez # Show results
  predict(rez) # Get clusters
}
For each cluster, computes the mean (or other indices) of the variables. Can be used to retrieve the centers of clusters. Also returns the within Sum of Squares.
cluster_centers(data, clusters, fun = mean, ...)
data 
A data.frame. 
clusters 
A vector with cluster assignments (must have the same length as the number of rows in data). 
fun 
What function to use, 
... 
Other arguments to be passed to or from other functions. 
A dataframe containing the cluster centers. Attributes include performance statistics and distance between each observation and its respective cluster centre.
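What such a table of centers contains can be approximated in base R. The sketch below assumes a k-means solution on the iris measurements and computes per-cluster means with aggregate() plus the within-cluster sum of squares; it illustrates the quantities involved and is not the function's implementation.

```r
# Per-cluster means and within-cluster sum of squares (base R only)
set.seed(33)
data <- iris[1:4]
k <- kmeans(data, centers = 3)

# Mean of every variable within each cluster
centers <- aggregate(data, by = list(Cluster = k$cluster), FUN = mean)

# Within-cluster sum of squares: squared deviations from the cluster mean
within_ss <- sapply(split(data, k$cluster), function(d) {
  sum(scale(d, center = TRUE, scale = FALSE)^2)
})

centers
within_ss
```

Passing a different function to FUN (for example median) gives other per-cluster indices, analogous to the fun argument.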
k <- kmeans(iris[1:4], 3)
cluster_centers(iris[1:4], clusters = k$cluster)
cluster_centers(iris[1:4], clusters = k$cluster, fun = median)
Computes linear discriminant analysis (LDA) on classified cluster groups, and determines the goodness of classification for each cluster group. See MASS::lda() for details.
cluster_discrimination(x, cluster_groups = NULL, ...)
x 
A data frame 
cluster_groups 
Group classification of the cluster analysis, which can
be retrieved from the 
... 
Other arguments to be passed to or from. 
n_clusters() to determine the number of clusters to extract, cluster_analysis() to compute a cluster analysis and performance::check_clusterstructure() to check suitability of data for clustering.
# Retrieve group classification from hierarchical cluster analysis
clustering <- cluster_analysis(iris[, 1:4], n = 3)

# Goodness of group classification
cluster_discrimination(clustering)
One of the core "issues" of statistical clustering is that, in many cases, different methods will give different results. The meta-clustering approach proposed by easystats (that finds echoes in consensus clustering; see Monti et al., 2003) consists of treating the unique clustering solutions as an ensemble, from which we can derive a probability matrix. This matrix contains, for each pair of observations, the probability of being in the same cluster. For instance, if the 6th and the 9th rows of a dataframe have been assigned to the same cluster by 5 out of 10 clustering methods, then their probability of being grouped together is 0.5.
cluster_meta(list_of_clusters, rownames = NULL, ...)
list_of_clusters 
A list of vectors with the clustering assignments from various methods. 
rownames 
An optional vector of row.names for the matrix. 
... 
Currently not used. 
Metaclustering is based on the hypothesis that, as each clustering algorithm embodies a different prism through which it sees the data, running an infinite number of algorithms would result in the emergence of the "true" clusters. As the number of algorithms and parameters is finite, the probabilistic perspective is a useful proxy. This method is interesting when there are no obvious reasons to prefer one clustering method over another, as well as to investigate how robust some clusters are under different algorithms.
This metaclustering probability matrix can be transformed into a dissimilarity matrix (such as the one produced by dist()) and submitted, for instance, to hierarchical clustering (hclust()). See the example below.
A matrix containing all the pairwise (between each observation) probabilities of being clustered together by the methods.
data <- iris[1:4]

rez1 <- cluster_analysis(data, n = 2, method = "kmeans")
rez2 <- cluster_analysis(data, n = 3, method = "kmeans")
rez3 <- cluster_analysis(data, n = 6, method = "kmeans")

list_of_clusters <- list(rez1, rez2, rez3)

m <- cluster_meta(list_of_clusters)

# Visualize matrix without reordering
heatmap(m, Rowv = NA, Colv = NA, scale = "none")

# Reordered heatmap
heatmap(m, scale = "none")

# Extract 3 clusters
predict(m, n = 3)

# Convert to dissimilarity
d <- as.dist(abs(m - 1))
model <- hclust(d)
plot(model, hang = -1)
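The pairwise probability matrix described above can be reproduced by hand in base R. The following is a minimal sketch that assumes plain kmeans() solutions as input; it is not the code used by cluster_meta(), and the names assignments, same_cluster and prob are hypothetical.

```r
set.seed(1)
dat <- iris[1:4]

# three clustering solutions (here: k-means with a different number of clusters)
assignments <- list(
  kmeans(dat, 2)$cluster,
  kmeans(dat, 3)$cluster,
  kmeans(dat, 6)$cluster
)

# for one solution: TRUE where observations i and j share a cluster
same_cluster <- function(cl) outer(cl, cl, FUN = "==")

# average the indicator matrices over all solutions
prob <- Reduce(`+`, lapply(assignments, same_cluster)) / length(assignments)

dim(prob)  # one row and one column per observation
prob[6, 9] # share of solutions that group rows 6 and 9 together

# turn probabilities into dissimilarities and cluster hierarchically
hc <- hclust(as.dist(1 - prob))
table(cutree(hc, k = 3))
```

Each cell is the fraction of solutions in which two observations share a cluster, so the diagonal is always 1 and the matrix is symmetric.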
Compute performance indices for clustering solutions.
cluster_performance(model, ...)

## S3 method for class 'kmeans'
cluster_performance(model, ...)

## S3 method for class 'hclust'
cluster_performance(model, data, clusters, ...)

## S3 method for class 'dbscan'
cluster_performance(model, data, ...)

## S3 method for class 'parameters_clusters'
cluster_performance(model, ...)
model 
Cluster model. 
... 
Arguments passed to or from other methods. 
data 
A data.frame. 
clusters 
A vector with clusters assignments (must be same length as rows in data). 
# kmeans
model <- kmeans(iris[1:4], 3)
cluster_performance(model)

# hclust
data <- iris[1:4]
model <- hclust(dist(data))
clusters <- cutree(model, 3)
rez <- cluster_performance(model, data, clusters)
rez

# DBSCAN
model <- dbscan::dbscan(iris[1:4], eps = 1.45, minPts = 10)
rez <- cluster_performance(model, iris[1:4])
rez

# Retrieve performance from parameters
params <- model_parameters(kmeans(iris[1:4], 3))
cluster_performance(params)
Compute and extract model parameters of multiple regression models. See model_parameters() for further details.
compare_parameters(
  ...,
  ci = 0.95,
  effects = "fixed",
  component = "conditional",
  standardize = NULL,
  exponentiate = FALSE,
  ci_method = "wald",
  p_adjust = NULL,
  select = NULL,
  column_names = NULL,
  pretty_names = TRUE,
  coefficient_names = NULL,
  keep = NULL,
  drop = NULL,
  include_reference = FALSE,
  groups = NULL,
  verbose = TRUE
)

compare_models(
  ...,
  ci = 0.95,
  effects = "fixed",
  component = "conditional",
  standardize = NULL,
  exponentiate = FALSE,
  ci_method = "wald",
  p_adjust = NULL,
  select = NULL,
  column_names = NULL,
  pretty_names = TRUE,
  coefficient_names = NULL,
  keep = NULL,
  drop = NULL,
  include_reference = FALSE,
  groups = NULL,
  verbose = TRUE
)
... 
One or more regression model objects, or objects returned by

ci 
Confidence Interval (CI) level. Default to 0.95 (95%).
effects 
Should parameters for fixed effects ( 
component 
Model component for which parameters should be shown. See
documentation for related model class in 
standardize 
The method used for standardizing the parameters. Can be

exponentiate 
Logical, indicating whether or not to exponentiate the
coefficients (and related confidence intervals). This is typical for
logistic regression, or more generally speaking, for models with log or
logit links. It is also recommended to use 
ci_method 
Method for computing degrees of freedom for p-values
and confidence intervals (CI). See documentation for related model class
in 
p_adjust 
Character vector, if not 
select 
Determines which columns are printed and which layout is used. There are three options for this argument:
*. A string indicating a predefined layout
For 
column_names 
Character vector with strings that should be used as
column headers. Must be of same length as number of models in 
pretty_names 
Can be 
coefficient_names 
Character vector with strings that should be used
as column headers for the coefficient column. Must be of same length as
number of models in 
keep 
Character containing a regular expression pattern that
describes the parameters that should be included (for 
drop 
See 
include_reference 
Logical, if 
groups 
Named list, can be used to group parameters in the printed output.
List elements may either be character vectors that match the name of those
parameters that belong to one group, or list elements can be row numbers
of those parameter rows that should belong to one group. The names of the
list elements will be used as group names, which will be inserted as "header
row". A possible use case might be to emphasize focal predictors and control
variables, see 'Examples'. Parameters will be reordered according to the
order used in 
verbose 
Toggle warnings and messages. 
This function is in an early stage and does not yet cope with more complex models, and probably does not yet properly render all model components. It should also be noted that when including models with interaction terms, not only do the values of the parameters change, but so does their meaning (from main effects, to simple slopes), thereby making such comparisons hard. Therefore, you should not use this function to compare models with interaction terms with models without interaction terms.
A data frame of indices related to the model's parameters.
data(iris)
lm1 <- lm(Sepal.Length ~ Species, data = iris)
lm2 <- lm(Sepal.Length ~ Species + Petal.Length, data = iris)
compare_parameters(lm1, lm2)

# custom style
compare_parameters(lm1, lm2, select = "{estimate}{stars} ({se})")

# custom style, in HTML
result <- compare_parameters(lm1, lm2, select = "{estimate}<br>({se})|{p}")
print_html(result)

data(mtcars)
m1 <- lm(mpg ~ wt, data = mtcars)
m2 <- glm(vs ~ wt + cyl, data = mtcars, family = "binomial")
compare_parameters(m1, m2)

# exponentiate coefficients, but not for lm
compare_parameters(m1, m2, exponentiate = "nongaussian")

# change column names
compare_parameters("linear model" = m1, "logistic reg." = m2)
compare_parameters(m1, m2, column_names = c("linear model", "logistic reg."))

# or as list
compare_parameters(list(m1, m2))
compare_parameters(list("linear model" = m1, "logistic reg." = m2))
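The warning about comparing models with and without interaction terms can be illustrated with base R alone. The sketch below uses mtcars and two hypothetical model names (m_add, m_int); it shows that the wt coefficient of the interaction model is a simple slope at hp = 0 and therefore not directly comparable to the wt coefficient of the additive model.

```r
# additive model: the wt coefficient is the slope of wt at any value of hp
m_add <- lm(mpg ~ wt + hp, data = mtcars)

# interaction model: the wt coefficient is the slope of wt where hp equals 0
m_int <- lm(mpg ~ wt * hp, data = mtcars)

# slope of wt implied by the interaction model at a given value of hp
slope_wt_at <- function(hp) {
  b <- coef(m_int)
  unname(b["wt"] + b["wt:hp"] * hp)
}

coef(m_add)["wt"]
slope_wt_at(0)               # identical to coef(m_int)["wt"]
slope_wt_at(mean(mtcars$hp)) # simple slope at the average hp
```

Centering hp before fitting would make the wt coefficient of the interaction model refer to the average hp, which is usually closer to what the additive model estimates.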
Enables a conversion between Exploratory Factor Analysis (EFA) and Confirmatory Factor Analysis (CFA) lavaan-ready structure.
convert_efa_to_cfa(model, ...)

## S3 method for class 'fa'
convert_efa_to_cfa(
  model,
  threshold = "max",
  names = NULL,
  max_per_dimension = NULL,
  ...
)

efa_to_cfa(model, ...)
model 
An EFA model (e.g., a 
... 
Arguments passed to or from other methods. 
threshold 
A value between 0 and 1 indicates which (absolute) values
from the loadings should be removed. An integer higher than 1 indicates the
n strongest loadings to retain. Can also be 
names 
Vector containing dimension names. 
max_per_dimension 
Maximum number of variables to keep per dimension. 
Converted index.
library(parameters)
data(attitude)
efa <- psych::fa(attitude, nfactors = 3)

model1 <- efa_to_cfa(efa)
model2 <- efa_to_cfa(efa, threshold = 0.3)
model3 <- efa_to_cfa(efa, max_per_dimension = 2)

suppressWarnings(anova(
  lavaan::cfa(model1, data = attitude),
  lavaan::cfa(model2, data = attitude),
  lavaan::cfa(model3, data = attitude)
))
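To make the conversion more concrete, the following base R sketch turns a loadings matrix into lavaan-style model syntax. It rests on two assumptions that are not confirmed by this page: that the simplest structure assigns each item to the factor with its largest absolute loading, and that the loadings look like the made-up matrix below. It is not the implementation of convert_efa_to_cfa().

```r
# hypothetical loadings matrix (rows = items, columns = factors);
# the values are made up for illustration
loadings <- matrix(
  c(0.80,  0.10,
    0.70,  0.25,
    0.15,  0.65,
    0.05, -0.90,
    0.45,  0.40),
  ncol = 2, byrow = TRUE,
  dimnames = list(paste0("item", 1:5), c("F1", "F2"))
)

# assumption: assign every item to the factor with its largest absolute loading
strongest <- colnames(loadings)[apply(abs(loadings), 1, which.max)]

# build one lavaan-style line per factor, e.g. "F1 =~ item1 + item2"
items_by_factor <- split(rownames(loadings), strongest)
cfa_syntax <- paste(
  names(items_by_factor), "=~",
  vapply(items_by_factor, paste, character(1), collapse = " + "),
  collapse = "\n"
)
cat(cfa_syntax, "\n")
```

The resulting string has the same shape as the model syntax that lavaan::cfa() accepts as its first argument.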
Estimate or extract degrees of freedom of model parameters.
degrees_of_freedom(model, method = "analytical", ...)

dof(model, method = "analytical", ...)
model 
A statistical model. 
method 
Type of approximation for the degrees of freedom. Can be one of the following:
Usually, when degrees of freedom are required to calculate p-values or
confidence intervals, 
... 
Currently not used. 
In many cases, degrees_of_freedom() returns the same as df.residual(), or n - k (number of observations minus number of parameters). However, degrees_of_freedom() refers to the degrees of freedom of the distribution of the test statistic that is related to the model's parameters. Thus, for models with z-statistic, results from degrees_of_freedom() and df.residual() differ. Furthermore, for other approximation methods like "kenward" or "satterthwaite", each model parameter can have a different degree of freedom.
model <- lm(Sepal.Length ~ Petal.Length * Species, data = iris)
dof(model)

model <- glm(vs ~ mpg * cyl, data = mtcars, family = "binomial")
dof(model)

# lmer() comes from the lme4 package
model <- lme4::lmer(Sepal.Length ~ Petal.Length + (1 | Species), data = iris)
dof(model)

if (require("rstanarm", quietly = TRUE)) {
  model <- stan_glm(
    Sepal.Length ~ Petal.Length * Species,
    data = iris,
    chains = 2,
    refresh = 0
  )
  dof(model)
}
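The distinction between residual degrees of freedom and the degrees of freedom of the test statistic can be checked with base R only. This sketch does not call degrees_of_freedom(); it merely shows why a z-statistic corresponds to infinite degrees of freedom while df.residual() still returns n - k. The object names are hypothetical.

```r
# linear model: the t-statistic has n - k residual degrees of freedom
m_lm <- lm(Sepal.Length ~ Petal.Length * Species, data = iris)
n <- nobs(m_lm)
k <- length(coef(m_lm))
c(n_minus_k = n - k, df_residual = df.residual(m_lm))

# logistic regression: df.residual() is still n - k ...
m_glm <- glm(vs ~ mpg * cyl, data = mtcars, family = "binomial")
df.residual(m_glm)

# ... but its Wald tests use a z-statistic, i.e. a normal reference
# distribution, which equals a t-distribution with infinite df
all.equal(qt(0.975, df = Inf), qnorm(0.975))
```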
Prints tables (i.e. data frame) in different output formats. print_md() is an alias for display(format = "markdown"), print_html() is an alias for display(format = "html"). print_table() is for specific use cases only, and currently only works for compare_parameters() objects.
## S3 method for class 'parameters_model'
display(
  object,
  format = "markdown",
  pretty_names = TRUE,
  split_components = TRUE,
  select = NULL,
  caption = NULL,
  subtitle = NULL,
  footer = NULL,
  align = NULL,
  digits = 2,
  ci_digits = digits,
  p_digits = 3,
  footer_digits = 3,
  ci_brackets = c("(", ")"),
  show_sigma = FALSE,
  show_formula = FALSE,
  zap_small = FALSE,
  font_size = "100%",
  line_padding = 4,
  column_labels = NULL,
  include_reference = FALSE,
  verbose = TRUE,
  ...
)

## S3 method for class 'parameters_sem'
display(
  object,
  format = "markdown",
  digits = 2,
  ci_digits = digits,
  p_digits = 3,
  ci_brackets = c("(", ")"),
  ...
)

## S3 method for class 'parameters_efa_summary'
display(object, format = "markdown", digits = 3, ...)

## S3 method for class 'parameters_efa'
display(
  object,
  format = "markdown",
  digits = 2,
  sort = FALSE,
  threshold = NULL,
  labels = NULL,
  ...
)

## S3 method for class 'equivalence_test_lm'
display(object, format = "markdown", digits = 2, ...)

print_table(x, digits = 2, p_digits = 3, theme = "default", ...)
object 
An object returned by 
format 
String, indicating the output format. Can be 
pretty_names 
Can be 
split_components 
Logical, if 
select 
Determines which columns are printed and which layout is used. There are three options for this argument:
*. A string indicating a predefined layout
For 
caption 
Table caption as string. If 
subtitle 
Table title (same as caption) and subtitle, as strings. If 
footer 
Can either be 
align 
Only applies to HTML tables. May be one of 
digits, ci_digits, p_digits
Number of digits for rounding or
significant figures. May also be 
footer_digits 
Number of decimal places for values in the footer summary. 
ci_brackets 
Logical, if 
show_sigma 
Logical, if 
show_formula 
Logical, if 
zap_small 
Logical, if 
font_size 
For HTML tables, the font size. 
line_padding 
For HTML tables, the distance (in pixels) between lines.
column_labels 
Labels of columns for HTML tables. If 
include_reference 
Logical, if 
verbose 
Toggle messages and warnings. 
... 
Arguments passed down to 
sort 
Sort the loadings. 
threshold 
A value between 0 and 1 indicates which (absolute) values
from the loadings should be removed. An integer higher than 1 indicates the
n strongest loadings to retain. Can also be 
labels 
A character vector containing labels to be added to the loadings data. Usually, the question related to the item. 
x 
An object returned by 
theme 
String, indicating the table theme. Can be one of 
display() is useful when the table-output from functions, which is usually printed as formatted text-table to console, should be formatted for pretty table-rendering in markdown documents, or if knitted from rmarkdown to PDF or Word files. See vignette for examples.
print_table() is a special function for compare_parameters() objects, which prints the output as a formatted HTML table. It is still somewhat experimental, thus, only a fixed layout-style is available at the moment (columns for estimates, confidence intervals and p-values). However, it is possible to include other model components, like zero-inflation, or random effects in the table. See 'Examples'. An alternative is to set engine = "tt" in print_html() to use the tinytable package for creating HTML tables.
If format = "markdown", the return value will be a character vector in markdown-table format. If format = "html", an object of class gt_tbl. For print_table(), an object of class tinytable is returned.
See also print.parameters_model() and print.compare_parameters().
model <- lm(mpg ~ wt + cyl, data = mtcars)
mp <- model_parameters(model)
display(mp)

data(iris)
lm1 <- lm(Sepal.Length ~ Species, data = iris)
lm2 <- lm(Sepal.Length ~ Species + Petal.Length, data = iris)
lm3 <- lm(Sepal.Length ~ Species * Petal.Length, data = iris)
out <- compare_parameters(lm1, lm2, lm3)

print_html(
  out,
  select = "{coef}{stars}|({ci})",
  column_labels = c("Estimate", "95% CI")
)

# line break, unicode minus-sign
print_html(
  out,
  select = "{estimate}{stars}<br>({ci_low} \u2212 {ci_high})",
  column_labels = c("Est. (95% CI)")
)

data(iris)
data(Salamanders, package = "glmmTMB")
m1 <- lm(Sepal.Length ~ Species * Petal.Length, data = iris)
m2 <- lme4::lmer(
  Sepal.Length ~ Petal.Length + Petal.Width + (1 | Species),
  data = iris
)
m3 <- glmmTMB::glmmTMB(
  count ~ spp + mined + (1 | site),
  ziformula = ~mined,
  family = poisson(),
  data = Salamanders
)
out <- compare_parameters(m1, m2, m3, effects = "all", component = "all")
print_table(out)
Computes Dominance Analysis Statistics and Designations
dominance_analysis( model, sets = NULL, all = NULL, conditional = TRUE, complete = TRUE, quote_args = NULL, contrasts = model$contrasts, ... )
model 
A model object supported by 
sets 
A (named) list of formula objects with no left hand side/response. If the list has names, the name provided for each element will be used as the label for the set. Unnamed list elements will be given a set number name based on their position among the sets as entered. Predictors in each formula are bound together as a set in the dominance
analysis and dominance statistics and designations are computed for
the predictors together. Predictors in 
all 
A formula with no left hand side/response. Predictors in the formula are included in each subset in the dominance
analysis and the R2 value associated with them is subtracted from the
overall value. Predictors in 
conditional 
Logical. If FALSE, the conditional dominance matrix is not computed. If conditional dominance is not desired as an importance criterion, skipping this computation can save time.
complete 
Logical. If FALSE, complete dominance designations are not computed. If complete dominance is not desired as an importance criterion, skipping this computation can save time.
quote_args 
A character vector of arguments in the model submitted to

contrasts 
A named list of 
... 
Currently not used.
Computes two decompositions of the model's R2 and returns a matrix of designations from which predictor relative importance determinations can be obtained.
Note in the output that the "constant" subset is associated with a component of the model that does not directly contribute to the R2 such as an intercept. The "all" subset is apportioned a component of the fit statistic but is not considered a part of the dominance analysis and therefore does not receive a rank, conditional dominance statistics, or complete dominance designations.
The input model is parsed using insight::find_predictors(). It does not yet support interactions, transformations, or offsets applied in the R formula, and will fail with an error if any such terms are detected.
The model submitted must accept a formula object as a formula argument. In addition, the model object must accept the data on which the model is estimated as a data argument. Formulas submitted using object references (i.e., lm(mtcars$mpg ~ mtcars$vs)) and functions that accept data as a non-data argument (e.g., survey::svyglm() uses design) will fail with an error.
Models that return TRUE for the insight::model_info() function's values "is_bayesian", "is_mixed", "is_gam", "is_multivariate", "is_zero_inflated", or "is_hurdle" are currently not supported.
When performance::r2() returns multiple values, only the first is used by default.
Object of class "parameters_da".
An object of class "parameters_da" is a list of data.frames composed of the following elements:
General
A data.frame which associates dominance statistics with model parameters. The variables in this data.frame include:
Parameter
Parameter names.
General_Dominance
Vector of general dominance statistics. The R2 ascribed to variables in the all argument is also reported here, though it is not a general dominance statistic.
Percent
Vector of general dominance statistics normalized to sum to 1.
Ranks
Vector of ranks applied to the general dominance statistics.
Subset
Names of the subset to which the parameter belongs in the dominance analysis. Each other data.frame returned will refer to these subset names.
Conditional
A data.frame of conditional dominance statistics. Each observation represents a subset and each variable represents the average increment to R2 with a specific number of subsets in the model. NULL if the conditional argument is FALSE.
Complete
A data.frame of complete dominance designations. The subsets in the observations are compared to the subsets referenced in each variable. Whether the subset in each variable dominates the subset in each observation is represented in the logical value. NULL if the complete argument is FALSE.
Joseph Luchman
Azen, R., & Budescu, D. V. (2003). The dominance analysis approach for comparing predictors in multiple regression. Psychological Methods, 8(2), 129-148. doi:10.1037/1082-989X.8.2.129
Budescu, D. V. (1993). Dominance analysis: A new approach to the problem of relative importance of predictors in multiple regression. Psychological Bulletin, 114(3), 542-551. doi:10.1037/0033-2909.114.3.542
Groemping, U. (2007). Estimators of relative importance in linear regression based on variance decomposition. The American Statistician, 61(2), 139-147. doi:10.1198/000313007X188252
data(mtcars)

# Dominance Analysis with Logit Regression
model <- glm(vs ~ cyl + carb + mpg, data = mtcars, family = binomial())

performance::r2(model)
dominance_analysis(model)

# Dominance Analysis with Weighted Logit Regression
model_wt <- glm(vs ~ cyl + carb + mpg,
  data = mtcars,
  weights = wt, family = quasibinomial()
)

dominance_analysis(model_wt, quote_args = "weights")
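The conditional and general dominance statistics described above can be computed by hand for a small model. The sketch below follows the definitions in Budescu (1993) and Azen and Budescu (2003), but it is a simplified illustration rather than the code behind dominance_analysis(): it assumes an ordinary linear model for mpg and the usual R2 from summary.lm() instead of performance::r2(), and the helper names r2 and conditional_dominance are hypothetical.

```r
predictors <- c("cyl", "carb", "wt")

# R2 of a linear model for mpg using the given predictors (0 if none)
r2 <- function(vars) {
  if (length(vars) == 0) return(0)
  summary(lm(reformulate(vars, response = "mpg"), data = mtcars))$r.squared
}

# average R2 increment of `target` when added to subsets of a given size
conditional_dominance <- function(target) {
  others <- setdiff(predictors, target)
  vapply(0:length(others), function(size) {
    subsets <- if (size == 0) {
      list(character(0))
    } else {
      combn(others, size, simplify = FALSE)
    }
    mean(vapply(subsets, function(s) r2(c(s, target)) - r2(s), numeric(1)))
  }, numeric(1))
}

# rows = predictors, columns = number of other predictors already in the model
conditional <- t(vapply(predictors, conditional_dominance, numeric(length(predictors))))
colnames(conditional) <- paste0("subset_size_", seq_along(predictors) - 1)

# general dominance: mean of the conditional statistics of each predictor
general <- rowMeans(conditional)
general

# the general dominance statistics add up to the R2 of the full model
all.equal(sum(general), r2(predictors))
```

Because the general dominance statistics sum to the full-model R2, dividing them by that sum yields shares comparable to the Percent column described above.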
Compute the (conditional) equivalence test for frequentist models.
## S3 method for class 'lm'
equivalence_test(
  x,
  range = "default",
  ci = 0.95,
  rule = "classic",
  vcov = NULL,
  vcov_args = NULL,
  verbose = TRUE,
  ...
)

## S3 method for class 'merMod'
equivalence_test(
  x,
  range = "default",
  ci = 0.95,
  rule = "classic",
  effects = c("fixed", "random"),
  vcov = NULL,
  vcov_args = NULL,
  verbose = TRUE,
  ...
)

## S3 method for class 'ggeffects'
equivalence_test(
  x,
  range = "default",
  rule = "classic",
  test = "pairwise",
  verbose = TRUE,
  ...
)
x 
A statistical model. 
range 
The range of practical equivalence of an effect. May be

ci 
Confidence Interval (CI) level. Default to 0.95 (95%).
rule 
Character, indicating the rules when testing for practical
equivalence. Can be 
vcov 
Variance-covariance matrix used to compute uncertainty estimates (e.g., for robust standard errors). This argument accepts a covariance matrix, a function which returns a covariance matrix, or a string which identifies the function to be used to compute the covariance matrix.

vcov_args 
List of arguments to be passed to the function identified by
the 
verbose 
Toggle warnings and messages. 
... 
Arguments passed to or from other methods. 
effects 
Should parameters for fixed effects ( 
test 
Hypothesis test for computing contrasts or pairwise comparisons.
See 
In classical null hypothesis significance testing (NHST) within a frequentist framework, it is not possible to accept the null hypothesis, H0, unlike in Bayesian statistics, where such probability statements are possible. "... one can only reject the null hypothesis if the test statistics falls into the critical region(s), or fail to reject this hypothesis. In the latter case, all we can say is that no significant effect was observed, but one cannot conclude that the null hypothesis is true." (Pernet 2017). One way to address this issue without Bayesian methods is Equivalence Testing, as implemented in equivalence_test(). While you can either reject the null hypothesis or claim an inconclusive result in NHST, the equivalence test, according to Pernet, adds a third category, "accept". Roughly speaking, the idea behind equivalence testing in a frequentist framework is to check whether an estimate and its uncertainty (i.e. confidence interval) fall within a region of "practical equivalence". Depending on the rule for this test (see below), statistical significance does not necessarily indicate whether the null hypothesis can be rejected or not, i.e. the classical interpretation of the p-value may differ from the results returned from the equivalence test.
"bayes" - Bayesian rule (Kruschke 2018)
This rule follows the "HDI+ROPE decision rule" (Kruschke, 2014, 2018) used for the Bayesian counterpart. This means, if the confidence intervals are completely outside the ROPE, the "null hypothesis" for this parameter is "rejected". If the ROPE completely covers the CI, the null hypothesis is accepted. Else, it's undecided whether to accept or reject the null hypothesis. Desirable results are low proportions inside the ROPE (the closer to zero the better).
"classic" - The TOST rule (Lakens 2017)
This rule follows the "TOST rule", i.e. a two one-sided tests procedure (Lakens 2017). Following this rule...
practical equivalence is assumed (i.e. H0 "accepted") when the narrow confidence intervals are completely inside the ROPE, no matter if the effect is statistically significant or not;
practical equivalence (i.e. H0) is rejected when the coefficient is statistically significant and the narrow confidence intervals (i.e. 1 - 2 * alpha) are not fully covered by the ROPE, regardless of whether they include or exclude the ROPE boundaries;
else the decision whether to accept or reject practical equivalence is undecided (i.e. when effects are not statistically significant and the narrow confidence intervals overlap the ROPE).
"cet" - Conditional Equivalence Testing (Campbell/Gustafson 2018)
The Conditional Equivalence Testing as described by Campbell and Gustafson 2018. According to this rule, practical equivalence is rejected when the coefficient is statistically significant. When the effect is not significant and the narrow confidence intervals are completely inside the ROPE, we accept (i.e. assume) practical equivalence, else it is undecided.
For rule = "classic", "narrow" confidence intervals are used for equivalence testing. "Narrow" means that the interval is not 1 - alpha, but 1 - 2 * alpha. Thus, if ci = .95, alpha is assumed to be 0.05 and internally a CI-level of 0.90 is used. rule = "cet" uses both regular and narrow confidence intervals, while rule = "bayes" only uses the regular intervals.
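The "classic" (TOST) decision logic and the use of narrow confidence intervals can be sketched in base R as follows. This is an illustration of the rule as worded above, not the implementation of equivalence_test(). The ROPE limits of plus/minus 0.1 standard deviations of the response are an assumption made for this example, and tost_decision is a hypothetical helper.

```r
m <- lm(mpg ~ wt + hp, data = mtcars)

alpha <- 0.05
rope <- c(-0.1, 0.1) * sd(mtcars$mpg) # assumed region of practical equivalence

# "narrow" interval with level 1 - 2 * alpha (here: 90%)
ci_narrow <- confint(m, level = 1 - 2 * alpha)
p_values <- summary(m)$coefficients[, "Pr(>|t|)"]

tost_decision <- function(ci_low, ci_high, p) {
  if (ci_low >= rope[1] && ci_high <= rope[2]) {
    "accepted"  # narrow CI completely inside the ROPE
  } else if (p < alpha) {
    "rejected"  # significant and narrow CI not fully covered by the ROPE
  } else {
    "undecided" # not significant and narrow CI not fully inside the ROPE
  }
}

decisions <- mapply(tost_decision, ci_narrow[, 1], ci_narrow[, 2], p_values)
data.frame(ci_low = ci_narrow[, 1], ci_high = ci_narrow[, 2], decision = decisions)
```

In this example the coefficient of hp is statistically significant, yet its narrow interval lies entirely inside the assumed ROPE, so practical equivalence is accepted, which mirrors the first case of the rule.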
The equivalence p-value is the area of the (cumulative) confidence distribution that is outside of the region of equivalence. It can be interpreted as p-value for rejecting the alternative hypothesis and accepting the "null hypothesis" (i.e. assuming practical equivalence). That is, a high p-value means we reject the assumption of practical equivalence and accept the alternative hypothesis.
Second generation p-values (SGPV) were proposed as a statistic that represents the proportion of data-supported hypotheses that are also null hypotheses (Blume et al. 2018, Lakens and Delacre 2020). It represents the proportion of the full confidence interval range (assuming a normally or t-distributed, equal-tailed interval, based on the model) that is inside the ROPE. The SGPV ranges from zero to one. Higher values indicate that the effect is more likely to be practically equivalent ("not of interest").
Note that the assumed interval, which is used to calculate the SGPV, is an estimation of the full interval based on the chosen confidence level. For example, if the 95% confidence interval of a coefficient ranges from -1 to 1, the underlying full (normally or t-distributed) interval approximately ranges from -1.9 to 1.9, see also the following code:
# simulate full normal distribution
out <- bayestestR::distribution_normal(10000, 0, 0.5)
# range of "full" distribution
range(out)
# range of 95% CI
round(quantile(out, probs = c(0.025, 0.975)), 2)
This ensures that the SGPV always refers to the general compatible parameter space of coefficients, independent from the confidence interval chosen for testing practical equivalence. Therefore, the SGPV of the full interval is similar to the ROPE coverage of Bayesian equivalence tests, see following code:
library(bayestestR)
library(brms)
m <- lm(mpg ~ gear + wt + cyl + hp, data = mtcars)
m2 <- brm(mpg ~ gear + wt + cyl + hp, data = mtcars)
# SGPV for frequentist models
equivalence_test(m)
# similar to ROPE coverage of Bayesian models
equivalence_test(m2)
# similar to ROPE coverage of simulated draws / bootstrap samples
equivalence_test(simulate_model(m))
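The notion of a proportion of the full interval that lies inside the ROPE can be approximated without additional packages. The sketch below is a rough illustration and may differ from how equivalence_test() computes the SGPV internally: it assumes a t-distributed interval around the estimate, ROPE limits of plus/minus 0.1 standard deviations of the response, and a quantile grid of 10000 points.

```r
m <- lm(mpg ~ gear + wt + cyl + hp, data = mtcars)
est <- coef(m)["hp"]
se <- sqrt(diag(vcov(m)))["hp"]
rope <- c(-0.1, 0.1) * sd(mtcars$mpg) # assumed ROPE limits

# quantile grid of the t-distribution implied by estimate and standard error
probs <- ppoints(10000)
full_interval <- est + se * qt(probs, df = df.residual(m))

# share of that distribution lying inside the ROPE
sgpv <- mean(full_interval >= rope[1] & full_interval <= rope[2])
sgpv
```

A value near one means that almost the whole compatible range of the coefficient falls inside the assumed ROPE.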
Some attention is required for finding suitable values for the ROPE limits (argument range). See 'Details' in bayestestR::rope_range() for further information.
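The idea of a "full" interval and its overlap with the ROPE can be sketched in base R. This is only an approximation under stated assumptions: a normal distribution, evenly spaced quantiles standing in for the simulated distribution, and the share of values inside the ROPE used as an SGPV-like quantity. The values for `estimate`, `se` and `rope` are arbitrary, and the package's own computation may differ in detail.

```r
# Hypothetical sketch: share of a simulated "full" normal interval inside a ROPE
estimate <- 0
se <- 0.5
rope <- c(-0.1, 0.1)

# evenly spaced quantiles, mimicking a "full" distribution of 10,000 values
full <- qnorm(ppoints(10000), mean = estimate, sd = se)

# range of the "full" interval versus the 95% interval
range(full)
quantile(full, probs = c(0.025, 0.975))

# proportion of the values of the full interval that fall inside the ROPE
sgpv_like <- mean(full >= rope[1] & full <= rope[2])
sgpv_like
```

Because the proportion refers to the full interval, it does not change when a different confidence level is chosen for the equivalence test itself.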
A data frame.
There is no standardized approach to drawing conclusions based on the available data and statistical models. A frequently chosen but also much criticized approach is to evaluate results based on their statistical significance (Amrhein et al. 2017).
A more sophisticated way would be to test whether estimated effects exceed the "smallest effect size of interest", to avoid even the smallest effects being considered relevant simply because they are statistically significant, but clinically or practically irrelevant (Lakens et al. 2018, Lakens 2024).
A rather unconventional approach, which is nevertheless advocated by various authors, is to interpret results from classical regression models either in terms of probabilities, similar to the usual approach in Bayesian statistics (Schweder 2018; Schweder and Hjort 2003; Vos 2022) or in terms of a relative measure of "evidence" or "compatibility" with the data (Greenland et al. 2022; Rafi and Greenland 2020), which nevertheless comes close to a probabilistic interpretation.
A more detailed discussion of this topic is found in the documentation of p_function().
The parameters package provides several options or functions to aid statistical inference. These are, for example:
equivalence_test(), to compute the (conditional) equivalence test for frequentist models
p_significance(), to compute the probability of practical significance, which can be conceptualized as a unidirectional equivalence test
p_function(), or consonance function, to compute p-values and compatibility (confidence) intervals for statistical models
the pd argument (setting pd = TRUE) in model_parameters() includes a column with the probability of direction, i.e. the probability that a parameter is strictly positive or negative. See bayestestR::p_direction() for details. If plotting is desired, the p_direction() function can be used, together with plot().
the s_value argument (setting s_value = TRUE) in model_parameters() replaces the p-values with their related S-values (Rafi and Greenland 2020)
finally, it is possible to generate distributions of model coefficients by generating bootstrap-samples (setting bootstrap = TRUE) or simulating draws from model coefficients using simulate_model(). These samples can then be treated as "posterior samples" and used in many functions from the bayestestR package.
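Two of the indices listed above can be approximated with base R alone, which may help to see what they express. The sketch below computes S-values as the negative base-2 logarithm of the p-values and a probability of direction from simulated draws. Drawing each coefficient independently from a normal distribution is a simplifying assumption made here for illustration; it is not necessarily what simulate_model() or the pd argument do internally.

```r
# Fit a simple model and extract coefficients and p-values (base R only)
m <- lm(mpg ~ wt + hp, data = mtcars)
coefs <- summary(m)$coefficients
p <- coefs[, "Pr(>|t|)"]

# S-value: the p-value expressed as bits of information against the
# test hypothesis
s_value <- -log2(p)

# Probability of direction, approximated from simulated draws
# (assumption: independent normal draws per coefficient)
set.seed(123)
draws <- sapply(seq_len(nrow(coefs)), function(i) {
  rnorm(5000, mean = coefs[i, "Estimate"], sd = coefs[i, "Std. Error"])
})
colnames(draws) <- rownames(coefs)
pd <- apply(draws, 2, function(x) max(mean(x > 0), mean(x < 0)))

round(cbind(p = p, s_value = s_value, pd = pd), 3)
```

Smaller p-values map to larger S-values, and the probability of direction always lies between 0.5 and 1.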
Most of the above shown options or functions derive from methods originally implemented for Bayesian models (Makowski et al. 2019). However, assuming that model assumptions are met (which means, the model fits well to the data, the correct model is chosen that reflects the data generating process (distributional model family) etc.), it seems appropriate to interpret results from classical frequentist models in a "Bayesian way" (more details: documentation in p_function()).
There is also a plot() method implemented in the see-package.
Amrhein, V., Korner-Nievergelt, F., and Roth, T. (2017). The earth is flat (p > 0.05): Significance thresholds and the crisis of unreplicable research. PeerJ, 5, e3544. doi:10.7717/peerj.3544
Blume, J. D., D'Agostino McGowan, L., Dupont, W. D., & Greevy, R. A. (2018). Second-generation p-values: Improved rigor, reproducibility, & transparency in statistical analyses. PLOS ONE, 13(3), e0188299. https://doi.org/10.1371/journal.pone.0188299
Campbell, H., & Gustafson, P. (2018). Conditional equivalence testing: An alternative remedy for publication bias. PLOS ONE, 13(4), e0195145. doi: 10.1371/journal.pone.0195145
Greenland S, Rafi Z, Matthews R, Higgs M. To Aid Scientific Inference, Emphasize Unconditional Compatibility Descriptions of Statistics. (2022) https://arxiv.org/abs/1909.08583v7 (Accessed November 10, 2022)
Kruschke, J. K. (2014). Doing Bayesian data analysis: A tutorial with R, JAGS, and Stan. Academic Press
Kruschke, J. K. (2018). Rejecting or accepting parameter values in Bayesian estimation. Advances in Methods and Practices in Psychological Science, 1(2), 270–280. doi: 10.1177/2515245918771304
Lakens, D. (2017). Equivalence Tests: A Practical Primer for t Tests, Correlations, and Meta-Analyses. Social Psychological and Personality Science, 8(4), 355–362. doi: 10.1177/1948550617697177
Lakens, D. (2024). Improving Your Statistical Inferences (Version v1.5.1). Retrieved from https://lakens.github.io/statistical_inferences/. doi:10.5281/ZENODO.6409077
Lakens, D., and Delacre, M. (2020). Equivalence Testing and the Second Generation P-Value. Meta-Psychology, 4. https://doi.org/10.15626/MP.2018.933
Lakens, D., Scheel, A. M., and Isager, P. M. (2018). Equivalence Testing for Psychological Research: A Tutorial. Advances in Methods and Practices in Psychological Science, 1(2), 259–269. doi:10.1177/2515245918770963
Makowski, D., Ben-Shachar, M. S., Chen, S. H. A., and Lüdecke, D. (2019). Indices of Effect Existence and Significance in the Bayesian Framework. Frontiers in Psychology, 10, 2767. doi:10.3389/fpsyg.2019.02767
Pernet, C. (2017). Null hypothesis significance testing: A guide to commonly misunderstood concepts and recommendations for good practice. F1000Research, 4, 621. doi: 10.12688/f1000research.6963.5
Rafi Z, Greenland S. Semantic and cognitive tools to aid statistical science: replace confidence and significance by compatibility and surprise. BMC Medical Research Methodology (2020) 20:244.
Schweder T. Confidence is epistemic probability for empirical science. Journal of Statistical Planning and Inference (2018) 195:116–125. doi:10.1016/j.jspi.2017.09.016
Schweder T, Hjort NL. Frequentist analogues of priors and posteriors. In Stigum, B. (ed.), Econometrics and the Philosophy of Economics: Theory Data Confrontation in Economics, pp. 285–317. Princeton University Press, Princeton, NJ, 2003
Vos P, Holbert D. Frequentist statistical inference without repeated sampling. Synthese 200, 89 (2022). doi:10.1007/s11229-022-03560-x
For more details, see bayestestR::equivalence_test(). Further readings can be found in the references. See also p_significance() for a unidirectional equivalence test.
library(parameters)
data(qol_cancer)
model <- lm(QoL ~ time + age + education, data = qol_cancer)

# default rule
equivalence_test(model)

# using heteroscedasticity-robust standard errors
equivalence_test(model, vcov = "HC3")

# conditional equivalence test
equivalence_test(model, rule = "cet")

# plot method
if (require("see", quietly = TRUE)) {
  result <- equivalence_test(model)
  plot(result)
}
The functions principal_components() and factor_analysis() can be used to perform a principal component analysis (PCA) or a factor analysis (FA). They return the loadings as a data frame, and various methods and functions are available to access / display other information (see the Details section).
factor_analysis( x, n = "auto", rotation = "none", sort = FALSE, threshold = NULL, standardize = TRUE, cor = NULL, ... ) principal_components( x, n = "auto", rotation = "none", sparse = FALSE, sort = FALSE, threshold = NULL, standardize = TRUE, ... ) rotated_data(pca_results, verbose = TRUE) ## S3 method for class 'parameters_efa' predict( object, newdata = NULL, names = NULL, keep_na = TRUE, verbose = TRUE, ... ) ## S3 method for class 'parameters_efa' print(x, digits = 2, sort = FALSE, threshold = NULL, labels = NULL, ...) ## S3 method for class 'parameters_efa' sort(x, ...) closest_component(pca_results)
x 
A data frame or a statistical model. 
n 
Number of components to extract. If 
rotation 
If not 
sort 
Sort the loadings. 
threshold 
A value between 0 and 1 indicates which (absolute) values
from the loadings should be removed. An integer higher than 1 indicates the
n strongest loadings to retain. Can also be 
standardize 
A logical value indicating whether the variables should be standardized (centered and scaled) to have unit variance before the analysis (in general, such scaling is advisable). 
cor 
An optional correlation matrix that can be used (note that the
data must still be passed as the first argument). If 
... 
Arguments passed to or from other methods. 
sparse 
Whether to compute sparse PCA (SPCA, using 
pca_results 
The output of the 
verbose 
Toggle warnings. 
object 
An object of class 
newdata 
An optional data frame in which to look for variables with which to predict. If omitted, the fitted values are used. 
names 
Optional character vector to name columns of the returned data frame. 
keep_na 
Logical, if 
digits 
Argument for 
labels 
Argument for 
n_components() and n_factors() automatically estimate the optimal number of dimensions to retain.
performance::check_factorstructure() checks the suitability of the data for factor analysis using the sphericity (see performance::check_sphericity_bartlett()) and the KMO (see performance::check_kmo()) measure.
performance::check_itemscale() computes various measures of internal consistencies applied to the (sub)scales (i.e., components) extracted from the PCA.
Running summary() returns information related to each component/factor, such as the explained variance and the Eigenvalues.
Running get_scores() computes scores for each subscale.
Running closest_component() will return a numeric vector with the assigned component index for each column from the original data frame.
Running rotated_data() will return the rotated data, including missing values, so it matches the original data frame.
Running plot() visually displays the loadings (that requires the see-package to work).
Complexity represents the number of latent components needed to account for the observed variables. Whereas a perfect simple structure solution has a complexity of 1 in that each item would only load on one factor, a solution with evenly distributed items has a complexity greater than 1 (Hofmann, 1978; Pettersson and Turkheimer, 2010).
Uniqueness represents the variance that is 'unique' to the variable and not shared with other variables. It is equal to 1 - communality (variance that is shared with other variables). A uniqueness of 0.20 suggests that 20% of that variable's variance is not shared with other variables in the overall factor model. The greater the 'uniqueness', the lower the relevance of the variable in the factor model.
MSA represents the Kaiser-Meyer-Olkin Measure of Sampling Adequacy (Kaiser and Rice, 1974) for each item. It indicates whether there is enough data for each factor to give reliable results for the PCA. The value should be > 0.6, and desirable values are > 0.8 (Tabachnick and Fidell, 2013).
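To make complexity and uniqueness more concrete, the following base R sketch derives both from unrotated PCA loadings. It is an illustration under assumptions: loadings are computed by hand from the correlation matrix, two components are retained arbitrarily, and complexity uses Hofmann's index (squared sum of squared loadings divided by the sum of loadings to the fourth power). The values need not match the output of principal_components() exactly, and MSA is not covered here.

```r
# Unrotated PCA loadings from the correlation matrix (base R only)
x <- mtcars[, 1:5]
e <- eigen(cor(x))
loadings <- e$vectors %*% diag(sqrt(e$values))
rownames(loadings) <- colnames(x)

k <- 2 # number of retained components (arbitrary choice for illustration)
L <- loadings[, 1:k, drop = FALSE]

communality <- rowSums(L^2)   # variance shared with the retained components
uniqueness <- 1 - communality # variance not shared
complexity <- rowSums(L^2)^2 / rowSums(L^4) # Hofmann's complexity index

round(cbind(communality, uniqueness, complexity), 2)
```

An item loading on a single component only would get a complexity of 1, while an item loading equally on both retained components would get a complexity of 2.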
There is a simplified rule of thumb that may help to decide whether to run a factor analysis or a principal component analysis:
Run factor analysis if you assume or wish to test a theoretical model of latent factors causing observed variables.
Run principal component analysis if you want to simply reduce your correlated observed variables to a smaller set of important independent composite variables.
(Source: CrossValidated)
Use get_scores() to compute scores for the "subscales" represented by the extracted principal components. get_scores() takes the results from principal_components() and extracts the variables for each component found by the PCA. Then, for each of these "subscales", raw means are calculated (which equals adding up the single items and dividing by the number of items). This results in a sum score for each component from the PCA, which is on the same scale as the original, single items that were used to compute the PCA.
One can also use predict() to back-predict scores for each component, to which one can provide newdata or a vector of names for the components.
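The subscale logic described for get_scores() can be sketched in base R. This is a simplified illustration, not the package's implementation: loadings are computed by hand, each variable is assigned to the component with its largest absolute loading, and row means are taken per group. Note that the mtcars columns are measured in different units, so the resulting means are only meaningful as a demonstration; with questionnaire items on a common scale, the scores stay on that scale.

```r
x <- mtcars[, 1:7]
e <- eigen(cor(x))
k <- 2 # number of retained components (arbitrary choice for illustration)
loadings <- e$vectors[, 1:k] %*% diag(sqrt(e$values[1:k]))
rownames(loadings) <- colnames(x)

# assign each variable to the component with the largest absolute loading
closest <- apply(abs(loadings), 1, which.max)

# raw mean of the items belonging to each component ("subscale")
items <- split(names(closest), closest)
scores <- sapply(items, function(v) rowMeans(x[, v, drop = FALSE]))
head(scores)
```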
Use summary() to get the Eigenvalues and the explained variance for each extracted component. The eigenvectors and eigenvalues represent the "core" of a PCA: The eigenvectors (the principal components) determine the directions of the new feature space, and the eigenvalues determine their magnitude. In other words, the eigenvalues explain the variance of the data along the new feature axes.
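The relation between eigenvalues and explained variance can be checked directly in base R. The sketch below uses the correlation matrix, which corresponds to a PCA on standardized variables; the choice of columns is arbitrary.

```r
x <- mtcars[, 1:5]
e <- eigen(cor(x)) # PCA on standardized variables

eigenvalues <- e$values
explained <- eigenvalues / sum(eigenvalues)

data.frame(
  component = seq_along(eigenvalues),
  eigenvalue = round(eigenvalues, 3),
  variance = round(explained, 3),
  cumulative = round(cumsum(explained), 3)
)

# the same eigenvalues are the squared standard deviations from prcomp()
pc <- prcomp(x, scale. = TRUE)
all.equal(eigenvalues, pc$sdev^2)
```

For standardized variables the eigenvalues sum to the number of variables, so each eigenvalue divided by that number gives the share of variance explained by its component.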
A data frame of loadings.
Kaiser, H.F. and Rice, J. (1974). Little jiffy, mark iv. Educational and Psychological Measurement, 34(1):111–117
Hofmann, R. (1978). Complexity and simplicity as objective indices descriptive of factor solutions. Multivariate Behavioral Research, 13(2), 247–250, doi:10.1207/s15327906mbr1302_9
Pettersson, E., & Turkheimer, E. (2010). Item selection, evaluation, and simple structure in personality data. Journal of research in personality, 44(4), 407–420, doi:10.1016/j.jrp.2010.03.002
Tabachnick, B. G., and Fidell, L. S. (2013). Using multivariate statistics (6th ed.). Boston: Pearson Education.
library(parameters)

# Principal Component Analysis (PCA)
principal_components(mtcars[, 1:7], n = "all", threshold = 0.2)

# Automated number of components
principal_components(mtcars[, 1:4], n = "auto")

# labels can be useful if variable names are not self-explanatory
print(
  principal_components(mtcars[, 1:4], n = "auto"),
  labels = c(
    "Miles/(US) gallon",
    "Number of cylinders",
    "Displacement (cu.in.)",
    "Gross horsepower"
  )
)

# Sparse PCA
principal_components(mtcars[, 1:7], n = 4, sparse = TRUE)
principal_components(mtcars[, 1:7], n = 4, sparse = "robust")

# Rotated PCA
principal_components(mtcars[, 1:7],
  n = 2, rotation = "oblimin",
  threshold = "max", sort = TRUE
)
principal_components(mtcars[, 1:7], n = 2, threshold = 2, sort = TRUE)

pca <- principal_components(mtcars[, 1:5], n = 2, rotation = "varimax")
pca # Print loadings
summary(pca) # Print information about the factors
predict(pca, names = c("Component1", "Component2")) # Back-predict scores

# which variables from the original data belong to which extracted component?
closest_component(pca)

# Factor Analysis (FA)
factor_analysis(mtcars[, 1:7], n = "all", threshold = 0.2)
factor_analysis(mtcars[, 1:7],
  n = 2, rotation = "oblimin",
  threshold = "max", sort = TRUE
)
factor_analysis(mtcars[, 1:7], n = 2, threshold = 2, sort = TRUE)

efa <- factor_analysis(mtcars[, 1:5], n = 2)
summary(efa)
predict(efa, verbose = FALSE)

# Automated number of components
factor_analysis(mtcars[, 1:4], n = "auto")
Format the name of the degrees-of-freedom adjustment methods.
format_df_adjust( method, approx_string = "approximated", dof_string = " degrees of freedom" )
method 
Name of the method. 
approx_string , dof_string

Suffix added to the name of the method in the returned string. 
A formatted string.
library(parameters)
format_df_adjust("kenward")
format_df_adjust("kenward", approx_string = "", dof_string = " DoF")
Format order.
format_order(order, textual = TRUE, ...)
order 
value or vector of orders. 
textual 
Return number as words. If 
... 
Arguments to be passed to 
A formatted string.
format_order(2)
format_order(8)
format_order(25, textual = FALSE)
Format the name of the p-value adjustment methods.
format_p_adjust(method)
method 
Name of the method. 
A string with the full surname(s) of the author(s), including year of publication, for the adjustment-method.
library(parameters)
format_p_adjust("holm")
format_p_adjust("bonferroni")
This function formats the names of model parameters (coefficients) to make them more human-readable.
format_parameters(model, ...) ## Default S3 method: format_parameters(model, brackets = c("[", "]"), ...)
model 
A statistical model. 
... 
Currently not used. 
brackets 
A character vector of length two, indicating the opening and closing brackets. 
A (named) character vector with formatted parameter names. The value names refer to the original names of the coefficients.
Note that the interpretation of interaction terms depends on many characteristics of the model. The number of parameters, and overall performance of the model, can differ or not between a * b, a : b, and a / b, suggesting that sometimes interaction terms give different parameterizations of the same model, but other times they give completely different models (depending on a or b being factors or covariates, included as main effects or not, etc.). Their interpretation depends on the full context of the model, which should not be inferred from the parameters table alone; rather, we recommend using packages that calculate estimated marginal means or marginal effects, such as modelbased, emmeans, ggeffects, or marginaleffects. To raise awareness of this issue, you may use print(..., show_formula = TRUE) to add the model specification to the output of the print() method for model_parameters().
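The point about parameterizations can be illustrated with base R. In the sketch below, which uses the iris data with a factor (Species) and a covariate (Sepal.Width), the `*` and `/` formulas describe the same model with different coefficients, whereas `:` alone yields a different, smaller model. This holds for this particular combination of terms and is not a general rule.

```r
m_star <- lm(Sepal.Length ~ Species * Sepal.Width, data = iris)
m_slash <- lm(Sepal.Length ~ Species / Sepal.Width, data = iris)
m_colon <- lm(Sepal.Length ~ Species:Sepal.Width, data = iris)

# number of coefficients and residual sum of squares for each specification
sapply(list(star = m_star, slash = m_slash, colon = m_colon), function(m) {
  c(n_parameters = length(coef(m)), residual_ss = deviance(m))
})

# same fit, but differently named (and differently interpreted) coefficients
names(coef(m_star))
names(coef(m_slash))
```

The first two models have an identical residual sum of squares, but the slopes in the `/` version are species-specific slopes, while the `*` version reports a reference slope plus differences.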
model <- lm(Sepal.Length ~ Species * Sepal.Width, data = iris)
format_parameters(model)

model <- lm(Sepal.Length ~ Petal.Length + (Species / Sepal.Width), data = iris)
format_parameters(model)

model <- lm(Sepal.Length ~ Species + poly(Sepal.Width, 2), data = iris)
format_parameters(model)

model <- lm(Sepal.Length ~ Species + poly(Sepal.Width, 2, raw = TRUE), data = iris)
format_parameters(model)
A print() method for objects from compare_parameters().
## S3 method for class 'compare_parameters' format( x, split_components = TRUE, select = NULL, digits = 2, ci_digits = digits, p_digits = 3, ci_width = NULL, ci_brackets = NULL, zap_small = FALSE, format = NULL, groups = NULL, engine = NULL, ... ) ## S3 method for class 'compare_parameters' print( x, split_components = TRUE, caption = NULL, subtitle = NULL, footer = NULL, digits = 2, ci_digits = digits, p_digits = 3, zap_small = FALSE, groups = NULL, column_width = NULL, ci_brackets = c("[", "]"), select = NULL, ... ) ## S3 method for class 'compare_parameters' print_html( x, caption = NULL, subtitle = NULL, footer = NULL, digits = 2, ci_digits = digits, p_digits = 3, zap_small = FALSE, groups = NULL, select = NULL, ci_brackets = c("(", ")"), font_size = "100%", line_padding = 4, column_labels = NULL, engine = "gt", ... ) ## S3 method for class 'compare_parameters' print_md( x, digits = 2, ci_digits = digits, p_digits = 3, caption = NULL, subtitle = NULL, footer = NULL, select = NULL, split_components = TRUE, ci_brackets = c("(", ")"), zap_small = FALSE, groups = NULL, engine = "tt", ... )
x 
An object returned by 
split_components 
Logical, if 
select 
Determines which columns and which layout columns are printed. There are three options for this argument:
*. A string indicating a predefined layout
For 
digits , ci_digits , p_digits

Number of digits for rounding or
significant figures. May also be 
ci_width 
Minimum width of the returned string for confidence
intervals. If not 
ci_brackets 
Logical, if 
zap_small 
Logical, if 
format 
String, indicating the output format. Can be 
groups 
Named list, can be used to group parameters in the printed output.
List elements may either be character vectors that match the name of those
parameters that belong to one group, or list elements can be row numbers
of those parameter rows that should belong to one group. The names of the
list elements will be used as group names, which will be inserted as "header
row". A possible use case might be to emphasize focal predictors and control
variables, see 'Examples'. Parameters will be reordered according to the
order used in 
engine 
Character string, naming the package or engine to be used for
printing into HTML or markdown format. Currently supported 
... 
Arguments passed down to 
caption 
Table caption as string. If 
subtitle 
Table title (same as caption) and subtitle, as strings. If 
footer 
Can either be 
column_width 
Width of table columns. Can be either 
font_size 
For HTML tables, the font size. 
line_padding 
For HTML tables, the distance (in pixel) between lines. 
column_labels 
Labels of columns for HTML tables. If 
Invisibly returns the original input object.
The verbose argument can be used to display or silence messages and warnings for the different functions in the parameters package. However, some messages providing additional information can be displayed or suppressed using options():
parameters_info: options(parameters_info = TRUE) will override the include_info argument in model_parameters() and always show the model summary for non-mixed models.
parameters_mixed_info: options(parameters_mixed_info = TRUE) will override the include_info argument in model_parameters() for mixed models, and will then always show the model summary.
parameters_cimethod: options(parameters_cimethod = TRUE) will show the additional information about the approximation method used to calculate confidence intervals and p-values. Set to FALSE to hide this message when printing model_parameters() objects.
parameters_exponentiate: options(parameters_exponentiate = TRUE) will show the additional information on how to interpret coefficients of models with log-transformed response variables or with log-/logit-links when the exponentiate argument in model_parameters() is not TRUE. Set this option to FALSE to hide this message when printing model_parameters() objects.
There are further options that can be used to modify the default behaviour for printed outputs:
parameters_labels: options(parameters_labels = TRUE) will use variable and value labels for pretty names, if data is labelled. If no labels are available, default pretty names are used.
parameters_interaction: options(parameters_interaction = <character>) will replace the interaction mark (by default, *) with the related character.
parameters_select: options(parameters_select = <value>) will set the default for the select argument. See argument's documentation for available options.
easystats_table_width: options(easystats_table_width = <value>) will set the default width for tables in text-format, i.e. for most of the outputs printed to console. If not specified, tables will be adjusted to the current available width, e.g. of the console (or any other source for textual output, like markdown files). The argument table_width can also be used in most print() methods to specify the table width as desired.
easystats_html_engine: options(easystats_html_engine = "gt") will set the default HTML engine for tables to gt, i.e. the gt package is used to create HTML tables. If set to tt, the tinytable package is used.
insight_use_symbols: options(insight_use_symbols = TRUE) will try to print unicode-chars for symbols as column names, wherever possible (e.g., ω instead of Omega).
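A typical way to work with these global options is to set them at the start of a session and restore the previous state afterwards. The following sketch uses only base R's options() with option names taken from the list above; the chosen values are arbitrary examples, and the options only take effect once functions from the respective packages are called.

```r
# store the previous values so they can be restored afterwards
old <- options(
  parameters_cimethod = FALSE,     # hide the CI-approximation message
  parameters_exponentiate = FALSE, # hide the hint on exponentiated coefficients
  easystats_html_engine = "tt",    # use tinytable for HTML tables
  insight_use_symbols = TRUE       # print unicode symbols where possible
)

# inspect the currently active settings
active <- options()[c(
  "parameters_cimethod", "parameters_exponentiate",
  "easystats_html_engine", "insight_use_symbols"
)]
str(active)

# restore the previous state
options(old)
```

Keeping the return value of options() makes it easy to limit the changes to a single script or report.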
data(iris)
lm1 <- lm(Sepal.Length ~ Species, data = iris)
lm2 <- lm(Sepal.Length ~ Species + Petal.Length, data = iris)

# custom style
result <- compare_parameters(lm1, lm2, select = "{estimate}{stars} ({se})")
print(result)

# custom style, in HTML
result <- compare_parameters(lm1, lm2, select = "{estimate}<br>({se}){p}")
print_html(result)
A print() method for objects from model_parameters().
## S3 method for class 'parameters_model' format( x, pretty_names = TRUE, split_components = TRUE, select = NULL, digits = 2, ci_digits = digits, p_digits = 3, ci_width = NULL, ci_brackets = NULL, zap_small = FALSE, format = NULL, groups = NULL, include_reference = FALSE, ... ) ## S3 method for class 'parameters_model' print( x, pretty_names = TRUE, split_components = TRUE, select = NULL, caption = NULL, footer = NULL, digits = 2, ci_digits = digits, p_digits = 3, footer_digits = 3, show_sigma = FALSE, show_formula = FALSE, zap_small = FALSE, groups = NULL, column_width = NULL, ci_brackets = c("[", "]"), include_reference = FALSE, ... ) ## S3 method for class 'parameters_model' summary(object, ...) ## S3 method for class 'parameters_model' print_html( x, pretty_names = TRUE, split_components = TRUE, select = NULL, caption = NULL, subtitle = NULL, footer = NULL, align = NULL, digits = 2, ci_digits = digits, p_digits = 3, footer_digits = 3, ci_brackets = c("(", ")"), show_sigma = FALSE, show_formula = FALSE, zap_small = FALSE, groups = NULL, font_size = "100%", line_padding = 4, column_labels = NULL, include_reference = FALSE, verbose = TRUE, ... ) ## S3 method for class 'parameters_model' print_md( x, pretty_names = TRUE, split_components = TRUE, select = NULL, caption = NULL, subtitle = NULL, footer = NULL, align = NULL, digits = 2, ci_digits = digits, p_digits = 3, footer_digits = 3, ci_brackets = c("(", ")"), show_sigma = FALSE, show_formula = FALSE, zap_small = FALSE, groups = NULL, include_reference = FALSE, verbose = TRUE, ... )
x , object

An object returned by 
pretty_names 
Can be 
split_components 
Logical, if 
select 
Determines which columns and which layout columns are printed. There are three options for this argument:
*. A string indicating a predefined layout
For 
digits , ci_digits , p_digits

Number of digits for rounding or
significant figures. May also be 
ci_width 
Minimum width of the returned string for confidence
intervals. If not 
ci_brackets 
Logical, if 
zap_small 
Logical, if 
format 
String, indicating the output format. Can be 
groups 
Named list, can be used to group parameters in the printed output.
List elements may either be character vectors that match the name of those
parameters that belong to one group, or list elements can be row numbers
of those parameter rows that should belong to one group. The names of the
list elements will be used as group names, which will be inserted as "header
row". A possible use case might be to emphasize focal predictors and control
variables, see 'Examples'. Parameters will be reordered according to the
order used in 
include_reference 
Logical, if 
... 
Arguments passed down to 
caption 
Table caption as string. If 
footer 
Can either be 
footer_digits 
Number of decimal places for values in the footer summary. 
show_sigma 
Logical, if 
show_formula 
Logical, if 
column_width 
Width of table columns. Can be either 
subtitle 
Table title (same as caption) and subtitle, as strings. If 
align 
Only applies to HTML tables. May be one of 
font_size 
For HTML tables, the font size. 
line_padding 
For HTML tables, the distance (in pixels) between lines. 
column_labels 
Labels of columns for HTML tables. If 
verbose 
Toggle messages and warnings. 
summary() is a convenient shortcut for print(object, select = "minimal", show_sigma = TRUE, show_formula = TRUE).
Invisibly returns the original input object.
The verbose argument can be used to display or silence messages and warnings for the different functions in the parameters package. However, some messages providing additional information can be displayed or suppressed using options():
parameters_info: options(parameters_info = TRUE) will override the include_info argument in model_parameters() and always show the model summary for non-mixed models.
parameters_mixed_info: options(parameters_mixed_info = TRUE) will override the include_info argument in model_parameters() for mixed models, and will then always show the model summary.
parameters_cimethod: options(parameters_cimethod = TRUE) will show the additional information about the approximation method used to calculate confidence intervals and p-values. Set to FALSE to hide this message when printing model_parameters() objects.
parameters_exponentiate: options(parameters_exponentiate = TRUE) will show the additional information on how to interpret coefficients of models with log-transformed response variables or with log-/logit-links when the exponentiate argument in model_parameters() is not TRUE. Set this option to FALSE to hide this message when printing model_parameters() objects.
There are further options that can be used to modify the default behaviour for printed outputs:
parameters_labels: options(parameters_labels = TRUE) will use variable and value labels for pretty names, if data is labelled. If no labels are available, default pretty names are used.
parameters_interaction: options(parameters_interaction = <character>) will replace the interaction mark (by default, *) with the related character.
parameters_select: options(parameters_select = <value>) will set the default for the select argument. See the argument's documentation for available options.
easystats_table_width: options(easystats_table_width = <value>) will set the default width for tables in text-format, i.e. for most of the outputs printed to console. If not specified, tables will be adjusted to the current available width, e.g. of the console (or any other source for textual output, like markdown files). The argument table_width can also be used in most print() methods to specify the table width as desired.
easystats_html_engine: options(easystats_html_engine = "gt") will set the default HTML engine for tables to gt, i.e. the gt package is used to create HTML tables. If set to tt, the tinytable package is used.
insight_use_symbols: options(insight_use_symbols = TRUE) will try to print unicode-chars for symbols as column names, wherever possible (e.g., ω instead of Omega).
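As a minimal sketch of how such options can be set for a session and restored afterwards: the option names are the ones listed above, while the chosen values (and the assumption that a plain number is a valid table width) are illustrative only.

```r
# options() returns the previous values, so the old state can be restored.
old <- options(
  parameters_cimethod = FALSE,  # hide the note on the CI approximation method
  easystats_table_width = 80    # illustrative width for text tables (assumed numeric)
)

# inspect the values that are now active
getOption("parameters_cimethod")
getOption("easystats_table_width")

# restore whatever was set before
options(old)
```

Saving the return value of options() avoids permanently changing the settings of the calling session.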
Note that the interpretation of interaction terms depends on many characteristics of the model. The number of parameters, and overall performance of the model, can differ or not between a * b, a : b, and a / b, suggesting that sometimes interaction terms give different parameterizations of the same model, but other times they give completely different models (depending on a or b being factors or covariates, included as main effects or not, etc.). Their interpretation depends on the full context of the model, which should not be inferred from the parameters table alone; rather, we recommend using packages that calculate estimated marginal means or marginal effects, such as modelbased, emmeans, ggeffects, or marginaleffects. To raise awareness of this issue, you may use print(..., show_formula = TRUE) to add the model specification to the output of the print() method for model_parameters().
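The point about parameterizations can be illustrated with a small base R sketch. The data set (mtcars) and the variables are arbitrary choices for illustration and do not come from the package documentation.

```r
# One factor (am) and one numeric covariate (wt).
d <- mtcars
d$am <- factor(d$am)

m_star  <- lm(mpg ~ am * wt, data = d)  # main effects plus interaction
m_slash <- lm(mpg ~ am / wt, data = d)  # slope of wt nested within am
m_colon <- lm(mpg ~ am:wt, data = d)    # interaction term only

models <- list(star = m_star, slash = m_slash, colon = m_colon)

# number of estimated coefficients per formula
sapply(models, function(m) length(coef(m)))

# log-likelihoods: `*` and `/` describe the same model here, `:` does not
sapply(models, function(m) as.numeric(logLik(m)))
```

In this setup a * b and a / b are two parameterizations of the same fit (separate intercepts and slopes per group), whereas a : b alone forces a common intercept and is a different model.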
Throughout the parameters package, we decided to label the residual degrees of freedom df_error. The reason for this is that these degrees of freedom do not always refer to the residuals. For certain models, they refer to the estimate error: in a linear model these are the same, but in, for instance, any mixed effects model, this isn't strictly true. Hence, we think that df_error is the most generic label for these degrees of freedom.
See also display().
library(parameters)
# the Salamanders data ships with glmmTMB
data(Salamanders, package = "glmmTMB")
model <- glmmTMB::glmmTMB(
  count ~ spp + mined + (1 | site),
  ziformula = ~mined,
  family = poisson(),
  data = Salamanders
)
mp <- model_parameters(model)

print(mp, pretty_names = FALSE)
print(mp, split_components = FALSE)
print(mp, select = c("Parameter", "Coefficient", "SE"))
print(mp, select = "minimal")

# group parameters
data(iris)
model <- lm(
  Sepal.Width ~ Sepal.Length + Species + Petal.Length,
  data = iris
)
# don't select "Intercept" parameter
mp <- model_parameters(model, parameters = "^(?!\\(Intercept)")
groups <- list(
  "Focal Predictors" = c("Speciesversicolor", "Speciesvirginica"),
  "Controls" = c("Sepal.Length", "Petal.Length")
)
print(mp, groups = groups)

# or use row indices
print(mp, groups = list(
  "Focal Predictors" = c(1, 4),
  "Controls" = c(2, 3)
))

# only show coefficients, CI and p,
# put non-matched parameters to the end
data(mtcars)
mtcars$cyl <- as.factor(mtcars$cyl)
mtcars$gear <- as.factor(mtcars$gear)
model <- lm(mpg ~ hp + gear * vs + cyl + drat, data = mtcars)
# don't select "Intercept" parameter
mp <- model_parameters(model, parameters = "^(?!\\(Intercept)")
print(mp, groups = list(
  "Engine" = c("cyl6", "cyl8", "vs", "hp"),
  "Interactions" = c("gear4:vs", "gear5:vs")
))

# custom column layouts
data(iris)
lm1 <- lm(Sepal.Length ~ Species, data = iris)
lm2 <- lm(Sepal.Length ~ Species + Petal.Length, data = iris)

# custom style
result <- compare_parameters(lm1, lm2, select = "{estimate}{stars} ({se})")
print(result)

# custom style, in HTML
result <- compare_parameters(lm1, lm2, select = "{estimate}<br>({se}){p}")
print_html(result)
get_scores() takes n_items amount of items that load the most (either by loading cutoff or number) on a component, and then computes their average.
get_scores(x, n_items = NULL)
x 
An object returned by 
n_items 
Number of required (i.e. non-missing) items to build the sum
score. If 
get_scores() takes the results from principal_components() and extracts the variables for each component found by the PCA. Then, for each of these "subscales", row means are calculated (which equals adding up the single items and dividing by the number of items). This results in a sum score for each component from the PCA, which is on the same scale as the original, single items that were used to compute the PCA.
A data frame with subscales, which are average sum scores for all items from each component.
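The row-mean computation described above can be sketched in base R. This is not the implementation of get_scores(); the assignment of "hp" and "qsec" to one subscale is borrowed from the example for this function, and the object names are hypothetical.

```r
# Items assumed to belong to one component (taken from the get_scores() example).
items <- c("hp", "qsec")

# Row means over the items of the subscale ...
subscale <- rowMeans(mtcars[, items])

# ... equal the sum of the items divided by the number of items.
manual <- (mtcars$hp + mtcars$qsec) / 2

all.equal(unname(subscale), manual)
```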
library(parameters)
if (require("psych")) {
  pca <- principal_components(mtcars[, 1:7], n = 2, rotation = "varimax")

  # PCA extracted two components
  pca

  # assignment of items to each component
  closest_component(pca)

  # now we want to have sum scores for each component
  get_scores(pca)

  # compare to manually computed sum score for 2nd component, which
  # consists of items "hp" and "qsec"
  (mtcars$hp + mtcars$qsec) / 2
}
Compute and extract model parameters. The available options and arguments depend on the modeling package and model class. Follow one of these links to read the model-specific documentation:
Default method: lm, glm, stats, censReg, MASS, survey, ...
Additive models: bamlss, gamlss, mgcv, scam, VGAM, Gam, gamm, ...
ANOVA: afex, aov, anova, ...
Bayesian: BayesFactor, blavaan, brms, MCMCglmm, posterior, rstanarm, bayesQR, bcplm, BGGM, blmrm, blrm, mcmc.list, MCMCglmm, ...
Clustering: hclust, kmeans, mclust, pam, ...
Correlations, t-tests, etc.: lmtest, htest, pairwise.htest, ...
Meta-Analysis: metaBMA, metafor, metaplus, ...
Mixed models: cplm, glmmTMB, lme4, lmerTest, nlme, ordinal, robustlmm, spaMM, mixed, MixMod, ...
Multinomial, ordinal and cumulative link: brglm2, DirichletReg, nnet, ordinal, mlm, ...
Multiple imputation: mice
PCA, FA, CFA, SEM: FactoMineR, lavaan, psych, sem, ...
Zero-inflated and hurdle: cplm, mhurdle, pscl, ...
Other models: aod, bbmle, betareg, emmeans, epiR, ggeffects, glmx, ivfixed, ivprobit, JRM, lmodel2, logitsf, marginaleffects, margins, maxLik, mediation, mfx, multcomp, mvord, plm, PMCMRplus, quantreg, selection, systemfit, tidymodels, varEST, WRS2, bfsl, deltaMethod, fitdistr, mjoint, mle, model.avg, ...
model_parameters(model, ...) parameters(model, ...)
model 
Statistical Model. 
... 
Arguments passed to or from other methods. Non-documented arguments are

A data frame of indices related to the model's parameters.
Standardization is based on standardize_parameters(). In case of standardize = "refit", the data used to fit the model will be standardized and the model is completely refitted. In such cases, standard errors and confidence intervals refer to the standardized coefficient. The default, standardize = "refit", never standardizes categorical predictors (i.e. factors), which may be a different behaviour compared to other R packages or other software packages (like SPSS). To mimic behaviour of SPSS or packages such as lm.beta, use standardize = "basic".
refit: This method is based on a complete model refit with a standardized version of the data. Hence, this method is equal to standardizing the variables before fitting the model. It is the "purest" and the most accurate (Neter et al., 1989), but it is also the most computationally costly and slow (especially for heavy models such as Bayesian models). This method is particularly recommended for complex models that include interactions or transformations (e.g., polynomial or spline terms). The robust argument (defaults to FALSE) enables a robust standardization of data, i.e., based on the median and MAD instead of the mean and SD. See datawizard::standardize() for more details. Note that standardize_parameters(method = "refit") may not return the same results as fitting a model on data that has been standardized with standardize(); standardize_parameters() uses the data used by the model fitting function, which might not be the same data if there are missing values. See the remove_na argument in standardize().
posthoc: Post-hoc standardization of the parameters, aiming at emulating the results obtained by "refit" without refitting the model. The coefficients are divided by the standard deviation (or MAD if robust) of the outcome (which becomes their expression 'unit'). Then, the coefficients related to numeric variables are additionally multiplied by the standard deviation (or MAD if robust) of the related terms, so that they correspond to changes of 1 SD of the predictor (e.g., "A change in 1 SD of x is related to a change of 0.24 of the SD of y"). This does not apply to binary variables or factors, so the coefficients are still related to changes in levels. This method is not accurate and tends to give aberrant results when interactions are specified.
basic: This method is similar to method = "posthoc", but treats all variables as continuous: it also scales the coefficients of factor levels (transformed to integers) or binary predictors by the standard deviation of the corresponding column of the model matrix. Although it is inappropriate for these cases, this method is the one implemented by default in other software packages, such as lm.beta::lm.beta().
smart (Standardization of Model's parameters with Adjustment, Reconnaissance and Transformation, experimental): Similar to method = "posthoc" in that it does not involve model refitting. The difference is that the SD (or MAD if robust) of the response is computed on the relevant section of the data. For instance, if a factor with 3 levels A (the intercept), B and C is entered as a predictor, the effect corresponding to B vs. A will be scaled by the variance of the response at the intercept only. As a result, the coefficients for effects of factors are similar to a Glass' delta.
pseudo (for 2-level (G)LMMs only): In this (post-hoc) method, the response and the predictor are standardized based on the level of prediction (levels are detected with performance::check_heterogeneity_bias()): Predictors are standardized based on their SD at level of prediction (see also datawizard::demean()); the outcome (in linear LMMs) is standardized based on a fitted random-intercept-model, where sqrt(random-intercept-variance) is used for level 2 predictors, and sqrt(residual-variance) is used for level 1 predictors (Hoffman 2015, page 342). A warning is given when a within-group variable is found to have excess between-group variance.
See also package vignette.
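To make the relation between "refit" and "posthoc" concrete, here is a minimal base R sketch for a linear model with numeric predictors only. It does not call standardize_parameters(); the model and data are illustrative assumptions, and the agreement shown here is not expected to hold once factors, interactions or transformations are involved.

```r
# Unstandardized model with two numeric predictors.
fit <- lm(mpg ~ wt + hp, data = mtcars)
b <- coef(fit)[c("wt", "hp")]

# "posthoc" idea: multiply by SD of the predictor, divide by SD of the outcome.
b_posthoc <- b * sapply(mtcars[c("wt", "hp")], sd) / sd(mtcars$mpg)

# "refit" idea: standardize the data first, then fit the model again.
z <- as.data.frame(scale(mtcars[c("mpg", "wt", "hp")]))
b_refit <- coef(lm(mpg ~ wt + hp, data = z))[c("wt", "hp")]

all.equal(b_posthoc, b_refit)
```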
Throughout the parameters package, we decided to label the residual degrees of freedom df_error. The reason for this is that these degrees of freedom do not always refer to the residuals. For certain models, they refer to the estimate error: in a linear model these are the same, but in, for instance, any mixed effects model, this isn't strictly true. Hence, we think that df_error is the most generic label for these degrees of freedom.
There are different ways of approximating the degrees of freedom depending on different assumptions about the nature of the model and its sampling distribution. The ci_method argument modulates the method for computing degrees of freedom (df) that are used to calculate confidence intervals (CI) and the related p-values. The following options are allowed, depending on the model class:
Classical methods:
Classical inference is generally based on the Wald method. The Wald approach to inference computes a test statistic by dividing the parameter estimate by its standard error (Coefficient / SE), then comparing this statistic against a t- or normal distribution. This approach can be used to compute CIs and p-values.
"wald": Applies to non-Bayesian models. For linear models, CIs computed using the Wald method (SE and a t-distribution with residual df); p-values computed using the Wald method with a t-distribution with residual df. For other models, CIs computed using the Wald method (SE and a normal distribution); p-values computed using the Wald method with a normal distribution.
"normal": Applies to non-Bayesian models. Compute Wald CIs and p-values, but always use a normal distribution.
"residual": Applies to non-Bayesian models. Compute Wald CIs and p-values, but always use a t-distribution with residual df when possible. If the residual df for a model cannot be determined, a normal distribution is used instead.
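The Wald computations described above can be reproduced by hand for a linear model. This is a base R sketch under the assumption of a 95% interval; it does not show how the package computes these values internally, and the model is an arbitrary example.

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)

est  <- coef(fit)
se   <- sqrt(diag(vcov(fit)))
df   <- df.residual(fit)
stat <- est / se  # Wald statistic: Coefficient / SE

# t-distribution with residual df (as for "wald" in linear models, or "residual")
ci_t <- cbind(est - qt(0.975, df) * se, est + qt(0.975, df) * se)
p_t  <- 2 * pt(abs(stat), df, lower.tail = FALSE)

# normal distribution (as for "normal")
ci_z <- cbind(est - qnorm(0.975) * se, est + qnorm(0.975) * se)
p_z  <- 2 * pnorm(abs(stat), lower.tail = FALSE)

# the t-based interval matches confint() for lm objects
all.equal(unname(ci_t), unname(confint(fit)))
```

The normal-based intervals are slightly narrower than the t-based ones, which matters mostly when the residual df are small.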
Methods for mixed models:
Compared to fixed effects (or single-level) models, determining appropriate df for Wald-based inference in mixed models is more difficult. See the R GLMM FAQ for a discussion.
Several approximate methods for computing df are available, but you should also consider using profile likelihood ("profile") or bootstrap ("boot") CIs and p-values instead.
"satterthwaite": Applies to linear mixed models. CIs computed using the Wald method (SE and a t-distribution with Satterthwaite df); p-values computed using the Wald method with a t-distribution with Satterthwaite df.
"kenward": Applies to linear mixed models. CIs computed using the Wald method (Kenward-Roger SE and a t-distribution with Kenward-Roger df); p-values computed using the Wald method with Kenward-Roger SE and t-distribution with Kenward-Roger df.
"ml1": Applies to linear mixed models. CIs computed using the Wald method (SE and a t-distribution with m-l-1 approximated df); p-values computed using the Wald method with a t-distribution with m-l-1 approximated df. See ci_ml1().
"betwithin": Applies to linear mixed models and generalized linear mixed models. CIs computed using the Wald method (SE and a t-distribution with between-within df); p-values computed using the Wald method with a t-distribution with between-within df. See ci_betwithin().
Likelihood-based methods:
Likelihood-based inference is based on comparing the likelihood for the maximum-likelihood estimate to the likelihood for models with one or more parameter values changed (e.g., set to zero or a range of alternative values). Likelihood ratios for the maximum-likelihood and alternative models are compared to a $\chi$-squared distribution to compute CIs and p-values.
"profile": Applies to non-Bayesian models of class glm, polr, merMod or glmmTMB. CIs computed by profiling the likelihood curve for a parameter, using linear interpolation to find where likelihood ratio equals a critical value; p-values computed using the Wald method with a normal-distribution (note: this might change in a future update!)
"uniroot": Applies to non-Bayesian models of class glmmTMB. CIs computed by profiling the likelihood curve for a parameter, using root finding to find where likelihood ratio equals a critical value; p-values computed using the Wald method with a normal-distribution (note: this might change in a future update!)
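The basic idea of a likelihood ratio comparison can be sketched in base R for the simplest case, where one term is set to zero. This is not the profiling algorithm behind "profile" or "uniroot"; the Poisson model on the warpbreaks data is an illustrative assumption.

```r
# Full model and a reduced model in which the effect of `wool` is set to zero.
full    <- glm(breaks ~ wool + tension, family = poisson(), data = warpbreaks)
reduced <- glm(breaks ~ tension, family = poisson(), data = warpbreaks)

# Likelihood ratio statistic: twice the difference in log-likelihoods.
lr <- as.numeric(2 * (logLik(full) - logLik(reduced)))

# Compare against a chi-squared distribution with 1 df (one parameter removed).
p_lr <- pchisq(lr, df = 1, lower.tail = FALSE)

# The same test as reported by anova()
anova(reduced, full, test = "Chisq")
```

A profile likelihood CI inverts this kind of test: it collects all parameter values for which the likelihood ratio stays below the critical value.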
Methods for bootstrapped or Bayesian models:
Bootstrap-based inference is based on resampling and refitting the model to the resampled datasets. The distribution of parameter estimates across resampled datasets is used to approximate the parameter's sampling distribution. Depending on the type of model, several different methods for bootstrapping and constructing CIs and p-values from the bootstrap distribution are available.
For Bayesian models, inference is based on drawing samples from the model posterior distribution.
"quantile" (or "eti"): Applies to all models (including Bayesian models). For non-Bayesian models, only applies if bootstrap = TRUE. CIs computed as equal tailed intervals using the quantiles of the bootstrap or posterior samples; p-values are based on the probability of direction. See bayestestR::eti().
"hdi": Applies to all models (including Bayesian models). For non-Bayesian models, only applies if bootstrap = TRUE. CIs computed as highest density intervals for the bootstrap or posterior samples; p-values are based on the probability of direction. See bayestestR::hdi().
"bci" (or "bcai"): Applies to all models (including Bayesian models). For non-Bayesian models, only applies if bootstrap = TRUE. CIs computed as bias corrected and accelerated intervals for the bootstrap or posterior samples; p-values are based on the probability of direction. See bayestestR::bci().
"si": Applies to Bayesian models with proper priors. CIs computed as support intervals comparing the posterior samples against the prior samples; p-values are based on the probability of direction. See bayestestR::si().
"boot": Applies to non-Bayesian models of class merMod. CIs computed using parametric bootstrapping (simulating data from the fitted model); p-values computed using the Wald method with a normal-distribution (note: this might change in a future update!).
For all iteration-based methods other than "boot" ("hdi", "quantile", "ci", "eti", "si", "bci", "bcai"), p-values are based on the probability of direction (bayestestR::p_direction()), which is converted into a p-value using bayestestR::pd_to_p().
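A minimal base R sketch of the quantities named above, using simulated draws as a stand-in for bootstrap or posterior samples. The conversion p = 2 * (1 - pd) is the usual two-sided formula and is assumed here to correspond to what bayestestR::pd_to_p() does by default; the draws themselves are made up for illustration.

```r
set.seed(123)
# stand-in for bootstrap or posterior samples of one coefficient
draws <- rnorm(4000, mean = 0.3, sd = 0.2)

# equal-tailed 95% interval: the 2.5% and 97.5% quantiles of the samples
ci_eti <- quantile(draws, probs = c(0.025, 0.975))

# probability of direction: share of samples with the dominant sign
pd <- max(mean(draws > 0), mean(draws < 0))

# two-sided p-value derived from pd (assumed conversion, see lead-in)
p_two_sided <- 2 * (1 - pd)

ci_eti
pd
p_two_sided
```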
There is no standardized approach to drawing conclusions based on the available data and statistical models. A frequently chosen but also much criticized approach is to evaluate results based on their statistical significance (Amrhein et al. 2017).
A more sophisticated way would be to test whether estimated effects exceed the "smallest effect size of interest", to avoid even the smallest effects being considered relevant simply because they are statistically significant, but clinically or practically irrelevant (Lakens et al. 2018, Lakens 2024).
A rather unconventional approach, which is nevertheless advocated by various authors, is to interpret results from classical regression models either in terms of probabilities, similar to the usual approach in Bayesian statistics (Schweder 2018; Schweder and Hjort 2003; Vos 2022) or in terms of relative measure of "evidence" or "compatibility" with the data (Greenland et al. 2022; Rafi and Greenland 2020), which nevertheless comes close to a probabilistic interpretation.
A more detailed discussion of this topic is found in the documentation of p_function().
The parameters package provides several options or functions to aid statistical inference. These are, for example:
equivalence_test(), to compute the (conditional) equivalence test for frequentist models
p_significance(), to compute the probability of practical significance, which can be conceptualized as a unidirectional equivalence test
p_function(), or consonance function, to compute p-values and compatibility (confidence) intervals for statistical models
the pd argument (setting pd = TRUE) in model_parameters() includes a column with the probability of direction, i.e. the probability that a parameter is strictly positive or negative. See bayestestR::p_direction() for details. If plotting is desired, the p_direction() function can be used, together with plot().
the s_value argument (setting s_value = TRUE) in model_parameters() replaces the p-values with their related S-values (Rafi and Greenland 2020)
finally, it is possible to generate distributions of model coefficients by generating bootstrap-samples (setting bootstrap = TRUE) or simulating draws from model coefficients using simulate_model(). These samples can then be treated as "posterior samples" and used in many functions from the bayestestR package.
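For orientation, the S-value (Rafi and Greenland 2020) is the negative base-2 logarithm of the p-value, so it can be read as "bits of information against the test hypothesis". A small sketch in base R, with made-up p-values:

```r
# illustrative p-values
p <- c(0.5, 0.05, 0.005)

# S-value: -log2(p)
s <- -log2(p)

data.frame(p = p, S = round(s, 2))
```

A p-value of 0.5 corresponds to one bit, i.e. as surprising as a single coin toss landing heads; smaller p-values give larger S-values.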
Most of the options or functions shown above derive from methods originally implemented for Bayesian models (Makowski et al. 2019). However, assuming that model assumptions are met (which means, the model fits well to the data, the correct model is chosen that reflects the data generating process (distributional model family) etc.), it seems appropriate to interpret results from classical frequentist models in a "Bayesian way" (more details: documentation in p_function()).
Note that the interpretation of interaction terms depends on many characteristics of the model. The number of parameters, and overall performance of the model, can differ or not between a * b, a : b, and a / b, suggesting that sometimes interaction terms give different parameterizations of the same model, but other times they give completely different models (depending on a or b being factors or covariates, included as main effects or not, etc.). Their interpretation depends on the full context of the model, which should not be inferred from the parameters table alone; rather, we recommend using packages that calculate estimated marginal means or marginal effects, such as modelbased, emmeans, ggeffects, or marginaleffects. To raise awareness of this issue, you may use print(..., show_formula = TRUE) to add the model specification to the output of the print() method for model_parameters().
The verbose argument can be used to display or silence messages and warnings for the different functions in the parameters package. However, some messages providing additional information can be displayed or suppressed using options():
parameters_info: options(parameters_info = TRUE) will override the include_info argument in model_parameters() and always show the model summary for non-mixed models.
parameters_mixed_info: options(parameters_mixed_info = TRUE) will override the include_info argument in model_parameters() for mixed models, and will then always show the model summary.
parameters_cimethod: options(parameters_cimethod = TRUE) will show the additional information about the approximation method used to calculate confidence intervals and p-values. Set to FALSE to hide this message when printing model_parameters() objects.
parameters_exponentiate: options(parameters_exponentiate = TRUE) will show the additional information on how to interpret coefficients of models with log-transformed response variables or with log-/logit-links when the exponentiate argument in model_parameters() is not TRUE. Set this option to FALSE to hide this message when printing model_parameters() objects.
There are further options that can be used to modify the default behaviour for printed outputs:
parameters_labels: options(parameters_labels = TRUE) will use variable and value labels for pretty names, if data is labelled. If no labels are available, default pretty names are used.
parameters_interaction: options(parameters_interaction = <character>) will replace the interaction mark (by default, *) with the related character.
parameters_select: options(parameters_select = <value>) will set the default for the select argument. See the argument's documentation for available options.
easystats_table_width: options(easystats_table_width = <value>) will set the default width for tables in text-format, i.e. for most of the outputs printed to console. If not specified, tables will be adjusted to the current available width, e.g. of the console (or any other source for textual output, like markdown files). The argument table_width can also be used in most print() methods to specify the table width as desired.
easystats_html_engine: options(easystats_html_engine = "gt") will set the default HTML engine for tables to gt, i.e. the gt package is used to create HTML tables. If set to tt, the tinytable package is used.
insight_use_symbols: options(insight_use_symbols = TRUE) will try to print unicode-chars for symbols as column names, wherever possible (e.g., ω instead of Omega).
The print() method has several arguments to tweak the output. There is also a plot() method implemented in the see-package, and a dedicated method for use inside rmarkdown files, print_md().
For developers, if speed performance is an issue, you can use the (undocumented) pretty_names argument, e.g. model_parameters(..., pretty_names = FALSE). This will skip the formatting of the coefficient names and makes model_parameters() faster.
Amrhein, V., Korner-Nievergelt, F., and Roth, T. (2017). The earth is flat (p > 0.05): Significance thresholds and the crisis of unreplicable research. PeerJ, 5, e3544. doi:10.7717/peerj.3544
Greenland S, Rafi Z, Matthews R, Higgs M. To Aid Scientific Inference, Emphasize Unconditional Compatibility Descriptions of Statistics. (2022) https://arxiv.org/abs/1909.08583v7 (Accessed November 10, 2022)
Hoffman, L. (2015). Longitudinal analysis: Modeling within-person fluctuation and change. Routledge.
Lakens, D. (2024). Improving Your Statistical Inferences (Version v1.5.1). Retrieved from https://lakens.github.io/statistical_inferences/. doi:10.5281/ZENODO.6409077
Lakens, D., Scheel, A. M., and Isager, P. M. (2018). Equivalence Testing for Psychological Research: A Tutorial. Advances in Methods and Practices in Psychological Science, 1(2), 259–269. doi:10.1177/2515245918770963
Makowski, D., Ben-Shachar, M. S., Chen, S. H. A., and Lüdecke, D. (2019). Indices of Effect Existence and Significance in the Bayesian Framework. Frontiers in Psychology, 10, 2767. doi:10.3389/fpsyg.2019.02767
Neter, J., Wasserman, W., and Kutner, M. H. (1989). Applied linear regression models.
Rafi Z, Greenland S. Semantic and cognitive tools to aid statistical science: replace confidence and significance by compatibility and surprise. BMC Medical Research Methodology (2020) 20:244.
Schweder T. Confidence is epistemic probability for empirical science. Journal of Statistical Planning and Inference (2018) 195:116–125. doi:10.1016/j.jspi.2017.09.016
Schweder T, Hjort NL. Frequentist analogues of priors and posteriors. In Stigum, B. (ed.), Econometrics and the Philosophy of Economics: Theory Data Confrontation in Economics, pp. 285-217. Princeton University Press, Princeton, NJ, 2003
Vos P, Holbert D. Frequentist statistical inference without repeated sampling. Synthese 200, 89 (2022). doi:10.1007/s11229-022-03560-x
insight::standardize_names() to rename columns into a consistent, standardized naming scheme.
Parameters from ANOVAs
## S3 method for class 'aov' model_parameters( model, type = NULL, df_error = NULL, ci = NULL, alternative = NULL, test = NULL, power = FALSE, es_type = NULL, keep = NULL, drop = NULL, table_wide = FALSE, verbose = TRUE, ... ) ## S3 method for class 'afex_aov' model_parameters( model, es_type = NULL, df_error = NULL, type = NULL, keep = NULL, drop = NULL, verbose = TRUE, ... )
model 
Object of class 
type 
Numeric, type of sums of squares. May be 1, 2 or 3. If 2 or 3,
ANOVA-tables using 
df_error 
Denominator degrees of freedom (or degrees of freedom of the
error estimate, i.e., the residuals). This is used to compute effect sizes
for ANOVA-tables from mixed models. See 'Examples'. (Ignored for

ci 
Confidence Interval (CI) level for effect sizes specified in

alternative 
A character string specifying the alternative hypothesis;
Controls the type of CI returned: 
test 
String, indicating the type of test for 
power 
Logical, if 
es_type 
The effect size of interest. Note that not all effect sizes may be applicable to the model object. See 'Details'. For Anova models, can also be a character vector with multiple effect size names. 
keep 
Character containing a regular expression pattern that
describes the parameters that should be included (for 
drop 
See 
table_wide 
Logical that decides whether the ANOVA table should be in
wide format, i.e. should the numerator and denominator degrees of freedom
be in the same row. Default: 
verbose 
Toggle warnings and messages. 
... 
Arguments passed to 
For an object of class htest, data is extracted via insight::get_data(), and passed to the relevant function according to:
A t-test, depending on type: "cohens_d" (default), "hedges_g", or one of "p_superiority", "u1", "u2", "u3", "overlap".
For a paired t-test, depending on type: "rm_rm", "rm_av", "rm_b", "rm_d", "rm_z".
A Chi-squared test of independence or Fisher's Exact Test, depending on type: "cramers_v" (default), "tschuprows_t", "phi", "cohens_w", "pearsons_c", "cohens_h", "oddsratio", "riskratio", "arr", or "nnt".
A Chi-squared test of goodness-of-fit, depending on type: "fei" (default), "cohens_w", or "pearsons_c".
A One-way ANOVA test, depending on type: "eta" (default), "omega" or "epsilon" squared, "f", or "f2".
A McNemar test returns Cohen's g.
A Wilcoxon test, depending on type: returns "rank_biserial" correlation (default) or one of "p_superiority", "vda", "u2", "u3", "overlap".
A Kruskal-Wallis test, depending on type: "epsilon" (default) or "eta".
A Friedman test returns Kendall's W.
(Where applicable, ci and alternative are taken from the htest if not otherwise provided.)
For an object of class BFBayesFactor, using bayestestR::describe_posterior():
A t-test, depending on type: "cohens_d" (default) or one of "p_superiority", "u1", "u2", "u3", "overlap".
A correlation test returns r.
A contingency table test, depending on type: "cramers_v" (default), "phi", "tschuprows_t", "cohens_w", "pearsons_c", "cohens_h", "oddsratio", "riskratio", "arr", or "nnt".
A proportion test returns p.
Objects of class anova, aov, aovlist or afex_aov, depending on type: "eta" (default), "omega" or "epsilon" squared, "f", or "f2".
Other objects are passed to parameters::standardize_parameters().
For statistical models it is recommended to directly use the listed functions, for the full range of options they provide.
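To make the "eta" effect size for a one-way ANOVA more concrete, the following base-R sketch computes eta squared by hand from the sums of squares of an aov fit. It only illustrates the quantity; it is not the package's internal computation, and the object names (fit, ss, eta_sq) are chosen for this example only.

```r
# One-way ANOVA on a built-in dataset
fit <- aov(Sepal.Length ~ Species, data = iris)

# Sums of squares: first entry is the effect, second the residuals
ss <- summary(fit)[[1]][["Sum Sq"]]

# Eta squared = SS_effect / SS_total
eta_sq <- ss[1] / sum(ss)
eta_sq
```

For a one-way design, this value coincides with the R-squared of the corresponding linear model, which is a convenient sanity check.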
A data frame of indices related to the model's parameters.
For ANOVA-tables from mixed models (i.e. anova(lmer())), only partial or adjusted effect sizes can be computed. Note that type 3 ANOVAs with interactions involved only give sensible and informative results when covariates are mean-centred and factors are coded with orthogonal contrasts (such as those produced by contr.sum, contr.poly, or contr.helmert, but not by the default contr.treatment).
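The preparation described above (mean-centred covariates, non-default contrasts) can be sketched in base R as follows. The column name Petal.Width_c and the choice of contr.sum are assumptions made for this illustration; contr.poly or contr.helmert would be set the same way.

```r
df <- iris

# Mean-centre the covariate
df$Petal.Width_c <- df$Petal.Width - mean(df$Petal.Width)

# Replace the default contr.treatment coding with sum-to-zero contrasts
contrasts(df$Species) <- contr.sum(nlevels(df$Species))

# Model with an interaction, prepared for a type 3 ANOVA
fit <- lm(Sepal.Length ~ Species * Petal.Width_c, data = df)

# Inspect the coding: each contrast column sums to zero
contrasts(df$Species)
```

Setting the contrasts on the factor itself, rather than globally via options(), keeps the change local to this data frame.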
df <- iris
df$Sepal.Big <- ifelse(df$Sepal.Width >= 3, "Yes", "No")

model <- aov(Sepal.Length ~ Sepal.Big, data = df)
model_parameters(model)

model_parameters(model, es_type = c("omega", "eta"), ci = 0.9)

model <- anova(lm(Sepal.Length ~ Sepal.Big, data = df))
model_parameters(model)
model_parameters(
  model,
  es_type = c("omega", "eta", "epsilon"),
  alternative = "greater"
)

model <- aov(Sepal.Length ~ Sepal.Big + Error(Species), data = df)
model_parameters(model)

df <- iris
df$Sepal.Big <- ifelse(df$Sepal.Width >= 3, "Yes", "No")
mm <- lme4::lmer(Sepal.Length ~ Sepal.Big + Petal.Width + (1 | Species), data = df)
model <- anova(mm)

# simple parameters table
model_parameters(model)

# parameters table including effect sizes
model_parameters(
  model,
  es_type = "eta",
  ci = 0.9,
  df_error = dof_satterthwaite(mm)[2:3]
)
Format Bayesian Exploratory Factor Analysis objects from the BayesFM package.
## S3 method for class 'befa'
model_parameters(
  model, sort = FALSE, centrality = "median", dispersion = FALSE,
  ci = 0.95, ci_method = "eti", test = NULL, verbose = TRUE, ...
)
model 
Bayesian EFA created by the 
sort 
Sort the loadings. 
centrality 
The point-estimates (centrality indices) to compute. Character
(vector) or list with one or more of these options: 
dispersion 
Logical, if 
ci 
Value or vector of probability of the CI (between 0 and 1)
to be estimated. Defaults to 0.95.
ci_method 
The type of index used for Credible Interval. Can be 
test 
The indices of effect existence to compute. Character (vector) or
list with one or more of these options: 
verbose 
Toggle warnings. 
... 
Arguments passed to or from other methods. 
A data frame of loadings.
library(parameters)

if (require("BayesFM")) {
  efa <- BayesFM::befa(mtcars, iter = 1000)
  results <- model_parameters(efa, sort = TRUE, verbose = FALSE)
  results
  efa_to_cfa(results, verbose = FALSE)
}
Parameters from BFBayesFactor objects from the {BayesFactor} package.
## S3 method for class 'BFBayesFactor'
model_parameters(
  model, centrality = "median", dispersion = FALSE, ci = 0.95,
  ci_method = "eti", test = "pd", rope_range = "default", rope_ci = 0.95,
  priors = TRUE, es_type = NULL, include_proportions = FALSE,
  verbose = TRUE, ...
)
model 
Object of class 
centrality 
The point-estimates (centrality indices) to compute. Character
(vector) or list with one or more of these options: 
dispersion 
Logical, if 
ci 
Value or vector of probability of the CI (between 0 and 1)
to be estimated. Defaults to 0.95.
ci_method 
The type of index used for Credible Interval. Can be 
test 
The indices of effect existence to compute. Character (vector) or
list with one or more of these options: 
rope_range 
ROPE's lower and higher bounds. Should be a vector of two
values (e.g., 
rope_ci 
The Credible Interval (CI) probability, corresponding to the proportion of HDI, to use for the percentage in ROPE. 
priors 
Add the prior used for each parameter. 
es_type 
The effect size of interest. Note that not all effect sizes may be applicable to the model object. See 'Details'. For Anova models, can also be a character vector with multiple effect size names. 
include_proportions 
Logical that decides whether to include posterior
cell proportions/counts for Bayesian contingency table analysis (from

verbose 
Toggle off warnings. 
... 
Additional arguments to be passed to or from methods. 
The meaning of the extracted parameters:
For BayesFactor::ttestBF(): Difference is the raw difference between the means.
For BayesFactor::correlationBF(): rho is the linear correlation estimate (equivalent to Pearson's r).
For BayesFactor::lmBF() / BayesFactor::generalTestBF() / BayesFactor::regressionBF() / BayesFactor::anovaBF(): in addition to parameters of the fixed and random effects, there are: mu is the (mean-centered) intercept; sig2 is the model's sigma; g / g_* are the g parameters. See the Bayes Factors for ANOVAs paper (doi:10.1016/j.jmp.2012.08.001).
A data frame of indices related to the model's parameters.
# Bayesian t-test
model <- BayesFactor::ttestBF(x = rnorm(100, 1, 1))
model_parameters(model)
model_parameters(model, es_type = "cohens_d", ci = 0.9)

# Bayesian contingency table analysis
data(raceDolls)
bf <- BayesFactor::contingencyTableBF(
  raceDolls,
  sampleType = "indepMulti",
  fixedMargin = "cols"
)
model_parameters(bf,
  centrality = "mean",
  dispersion = TRUE,
  verbose = FALSE,
  es_type = "cramers_v"
)
Extract and compute indices and measures to describe parameters of generalized additive models (GAM(M)s).
## S3 method for class 'cgam'
model_parameters(
  model, ci = 0.95, ci_method = "residual", bootstrap = FALSE,
  iterations = 1000, standardize = NULL, exponentiate = FALSE,
  p_adjust = NULL, keep = NULL, drop = NULL, verbose = TRUE, ...
)

## S3 method for class 'gamm'
model_parameters(
  model, ci = 0.95, bootstrap = FALSE, iterations = 1000,
  verbose = TRUE, ...
)

## S3 method for class 'Gam'
model_parameters(
  model, es_type = NULL, df_error = NULL, type = NULL,
  table_wide = FALSE, verbose = TRUE, ...
)

## S3 method for class 'scam'
model_parameters(
  model, ci = 0.95, ci_method = "residual", bootstrap = FALSE,
  iterations = 1000, standardize = NULL, exponentiate = FALSE,
  p_adjust = NULL, keep = NULL, drop = NULL, verbose = TRUE, ...
)
model 
A gam/gamm model. 
ci 
Confidence Interval (CI) level. Defaults to 0.95.
ci_method 
Method for computing degrees of freedom for
confidence intervals (CI) and the related p-values. The following options are
allowed (which vary depending on the model class): 
bootstrap 
Should estimates be based on bootstrapped model? If

iterations 
The number of bootstrap replicates. This only applies in the case of bootstrapped frequentist models. 
standardize 
The method used for standardizing the parameters. Can be

exponentiate 
Logical, indicating whether or not to exponentiate the
coefficients (and related confidence intervals). This is typical for
logistic regression, or more generally speaking, for models with log or
logit links. It is also recommended to use 
p_adjust 
Character vector, if not 
keep 
Character containing a regular expression pattern that
describes the parameters that should be included (for 
drop 
See 
verbose 
Toggle warnings and messages. 
... 
Arguments passed to or from other methods. For instance, when
Further non-documented arguments are:

es_type 
The effect size of interest. Note that not all effect sizes may be applicable to the model object. See 'Details'. For Anova models, can also be a character vector with multiple effect size names. 
df_error 
Denominator degrees of freedom (or degrees of freedom of the
error estimate, i.e., the residuals). This is used to compute effect sizes
for ANOVA-tables from mixed models. See 'Examples'. (Ignored for

type 
Numeric, type of sums of squares. May be 1, 2 or 3. If 2 or 3,
ANOVA-tables using 
table_wide 
Logical that decides whether the ANOVA table should be in
wide format, i.e. should the numerator and denominator degrees of freedom
be in the same row. Default: FALSE.
The reporting of degrees of freedom for the spline terms slightly differs from the output of summary(model), for example in the case of mgcv::gam(). The estimated degrees of freedom, column edf in the summary-output, is named df in the returned data frame, while the column df_error in the returned data frame refers to the residual degrees of freedom that are returned by df.residual(). Hence, the values in the column df_error differ from the column Ref.df from the summary, which is intentional, as these reference degrees of freedom “is not very interpretable” (web).
A data frame of indices related to the model's parameters.
insight::standardize_names() to rename columns into a consistent, standardized naming scheme.
library(parameters)
if (require("mgcv")) {
  dat <- gamSim(1, n = 400, dist = "normal", scale = 2)
  model <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat)
  model_parameters(model)
}
Parameters from (linear) mixed models.
## S3 method for class 'cpglmm'
model_parameters(
  model, ci = 0.95, ci_method = NULL, ci_random = NULL, bootstrap = FALSE,
  iterations = 1000, standardize = NULL, effects = "all",
  group_level = FALSE, exponentiate = FALSE, p_adjust = NULL,
  include_sigma = FALSE, keep = NULL, drop = NULL, verbose = TRUE, ...
)

## S3 method for class 'glmmTMB'
model_parameters(
  model, ci = 0.95, ci_method = "wald", ci_random = NULL, bootstrap = FALSE,
  iterations = 1000, standardize = NULL, effects = "all", component = "all",
  group_level = FALSE, exponentiate = FALSE, p_adjust = NULL,
  wb_component = TRUE,
  summary = getOption("parameters_mixed_summary", FALSE),
  include_info = getOption("parameters_mixed_info", FALSE),
  include_sigma = FALSE, keep = NULL, drop = NULL, verbose = TRUE, ...
)

## S3 method for class 'merMod'
model_parameters(
  model, ci = 0.95, ci_method = NULL, ci_random = NULL, bootstrap = FALSE,
  iterations = 1000, standardize = NULL, effects = "all",
  group_level = FALSE, exponentiate = FALSE, p_adjust = NULL,
  vcov = NULL, vcov_args = NULL, wb_component = TRUE,
  summary = getOption("parameters_mixed_summary", FALSE),
  include_info = getOption("parameters_mixed_info", FALSE),
  include_sigma = FALSE, keep = NULL, drop = NULL, verbose = TRUE, ...
)

## S3 method for class 'mixed'
model_parameters(
  model, ci = 0.95, ci_method = "wald", ci_random = NULL, bootstrap = FALSE,
  iterations = 1000, standardize = NULL, effects = "all", component = "all",
  group_level = FALSE, exponentiate = FALSE, p_adjust = NULL,
  wb_component = TRUE,
  summary = getOption("parameters_mixed_summary", FALSE),
  include_info = getOption("parameters_mixed_info", FALSE),
  include_sigma = FALSE, keep = NULL, drop = NULL, verbose = TRUE, ...
)

## S3 method for class 'MixMod'
model_parameters(
  model, ci = 0.95, ci_method = "wald", ci_random = NULL, bootstrap = FALSE,
  iterations = 1000, standardize = NULL, effects = "all", component = "all",
  group_level = FALSE, exponentiate = FALSE, p_adjust = NULL,
  wb_component = TRUE,
  summary = getOption("parameters_mixed_summary", FALSE),
  include_info = getOption("parameters_mixed_info", FALSE),
  include_sigma = FALSE, keep = NULL, drop = NULL, verbose = TRUE, ...
)

## S3 method for class 'lme'
model_parameters(
  model, ci = 0.95, ci_method = NULL, ci_random = NULL, bootstrap = FALSE,
  iterations = 1000, standardize = NULL, effects = "all",
  group_level = FALSE, exponentiate = FALSE, p_adjust = NULL,
  vcov = NULL, vcov_args = NULL, wb_component = TRUE,
  summary = getOption("parameters_mixed_summary", FALSE),
  include_info = getOption("parameters_mixed_info", FALSE),
  include_sigma = FALSE, keep = NULL, drop = NULL, verbose = TRUE, ...
)

## S3 method for class 'clmm2'
model_parameters(
  model, ci = 0.95, bootstrap = FALSE, iterations = 1000,
  component = c("all", "conditional", "scale"), standardize = NULL,
  exponentiate = FALSE, p_adjust = NULL,
  summary = getOption("parameters_summary", FALSE),
  include_info = getOption("parameters_info", FALSE),
  keep = NULL, drop = NULL, verbose = TRUE, ...
)

## S3 method for class 'clmm'
model_parameters(
  model, ci = 0.95, ci_method = NULL, ci_random = NULL, bootstrap = FALSE,
  iterations = 1000, standardize = NULL, effects = "all",
  group_level = FALSE, exponentiate = FALSE, p_adjust = NULL,
  include_sigma = FALSE, keep = NULL, drop = NULL, verbose = TRUE, ...
)
model 
A mixed model. 
ci 
Confidence Interval (CI) level. Defaults to 0.95.
ci_method 
Method for computing degrees of freedom for
confidence intervals (CI) and the related p-values. The following options are
allowed (which vary depending on the model class): 
ci_random 
Logical, if 
bootstrap 
Should estimates be based on bootstrapped model? If

iterations 
The number of draws to simulate/bootstrap. 
standardize 
The method used for standardizing the parameters. Can be

effects 
Should parameters for fixed effects ( 
group_level 
Logical, for multilevel models (i.e. models with random
effects) and when 
exponentiate 
Logical, indicating whether or not to exponentiate the
coefficients (and related confidence intervals). This is typical for
logistic regression, or more generally speaking, for models with log or
logit links. It is also recommended to use 
p_adjust 
Character vector, if not 
include_sigma 
Logical, if 
keep 
Character containing a regular expression pattern that
describes the parameters that should be included (for 
drop 
See 
verbose 
Toggle warnings and messages. 
... 
Arguments passed to or from other methods. For instance, when
Further non-documented arguments are:

component 
Should all parameters, parameters for the conditional model,
for the zero-inflation part of the model, or the dispersion model be returned?
Applies to models with zero-inflation and/or dispersion component. 
wb_component 
Logical, if 
summary 
Deprecated, please use 
include_info 
Logical, if 
vcov 
Variance-covariance matrix used to compute uncertainty estimates (e.g., for robust standard errors). This argument accepts a covariance matrix, a function which returns a covariance matrix, or a string which identifies the function to be used to compute the covariance matrix.

vcov_args 
List of arguments to be passed to the function identified by
the 
A data frame of indices related to the model's parameters.
For models of class merMod and glmmTMB, confidence intervals for random effect variances can be calculated.
For models from package lme4, when ci_method is either "profile" or "boot", and effects is either "random" or "all", profiled or bootstrapped confidence intervals, respectively, are computed for the random effects.
For all other options of ci_method, and only when the merDeriv package is installed, confidence intervals for random effects are based on a normal-distribution approximation, using the delta-method to transform standard errors for constructing the intervals around the log-transformed SD parameters. These are then back-transformed, so that random effect variances, standard errors and confidence intervals are shown on the original scale. Due to the transformation, the intervals are asymmetrical; however, they are within the correct bounds (i.e. no negative interval for the SD, and the interval for the correlations is within the range from -1 to +1).
For models of class glmmTMB, confidence intervals for random effect variances always use a Wald t-distribution approximation.
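The log-scale interval construction described for the merDeriv-based case can be sketched with a few lines of base R. The numbers below (sd_hat, se_sd) are made up for illustration, and the code shows only the general delta-method idea, not the package's internal implementation.

```r
# Hypothetical estimate of a random-effect SD and its standard error
sd_hat <- 0.8
se_sd <- 0.35

# Delta method: the SE of log(SD) is approximately SE(SD) / SD
se_log_sd <- se_sd / sd_hat

# Normal-approximation interval on the log scale, then back-transform
z <- qnorm(0.975)
ci <- exp(log(sd_hat) + c(-1, 1) * z * se_log_sd)
ci
```

Because the bounds are exponentiated, the lower limit can never be negative, and the interval is wider above the estimate than below it.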
If a model is "singular", this means that some dimensions of the variance-covariance matrix have been estimated as exactly zero. This often occurs for mixed models with complex random effects structures.
There is no gold-standard about how to deal with singularity and which random-effects specification to choose. One way is to fully go Bayesian (with informative priors). Other proposals are listed in the documentation of performance::check_singularity(). However, since version 1.1.9, the glmmTMB package allows using priors in a frequentist framework, too. One recommendation is to use a Gamma prior (Chung et al. 2013). The mean may vary from 1 to very large values (like 1e8), and the shape parameter should be set to a value of 2.5. You can then update() your model with the specified prior. In glmmTMB, the code would look like this:
# "model" is an object of class glmmTMB
prior <- data.frame(
  prior = "gamma(1, 2.5)", # mean can be 1, but even 1e8
  class = "ranef" # for random effects
)
model_with_priors <- update(model, priors = prior)
Large values for the mean parameter of the Gamma prior have no large impact on the random effects variances in terms of a "bias". Thus, if 1 doesn't fix the singular fit, you can safely try larger values.
For some models from package glmmTMB, both the dispersion parameter and the residual variance from the random effects parameters are shown. Usually, these are the same but presented on different scales, e.g.
model <- glmmTMB(Sepal.Width ~ Petal.Length + (1 | Species), data = iris)
exp(fixef(model)$disp) # 0.09902987
sigma(model)^2 # 0.09902987
For models where the dispersion parameter and the residual variance are the same, only the residual variance is shown in the output.
There are different ways of approximating the degrees of freedom depending on different assumptions about the nature of the model and its sampling distribution. The ci_method argument modulates the method for computing degrees of freedom (df) that are used to calculate confidence intervals (CI) and the related p-values. The following options are allowed, depending on the model class:
Classical methods:
Classical inference is generally based on the Wald method. The Wald approach to inference computes a test statistic by dividing the parameter estimate by its standard error (Coefficient / SE), then comparing this statistic against a t- or normal distribution. This approach can be used to compute CIs and p-values.
"wald": Applies to non-Bayesian models. For linear models, CIs computed using the Wald method (SE and a t-distribution with residual df); p-values computed using the Wald method with a t-distribution with residual df. For other models, CIs computed using the Wald method (SE and a normal distribution); p-values computed using the Wald method with a normal distribution.
"normal": Applies to non-Bayesian models. Compute Wald CIs and p-values, but always use a normal distribution.
"residual": Applies to non-Bayesian models. Compute Wald CIs and p-values, but always use a t-distribution with residual df when possible. If the residual df for a model cannot be determined, a normal distribution is used instead.
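The Wald computation described above can be reproduced by hand for a linear model. The following is a minimal base-R sketch, not the package's own code; the object names (fit, est, se, ci_t, ci_z) are chosen for this example.

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)
est <- coef(fit)
se <- sqrt(diag(vcov(fit)))

# t-distribution with residual df (what "wald" and "residual" describe
# for a linear model)
t_crit <- qt(0.975, df = df.residual(fit))
ci_t <- cbind(lower = est - t_crit * se, upper = est + t_crit * se)
p_t <- 2 * pt(abs(est / se), df = df.residual(fit), lower.tail = FALSE)

# Same statistic compared against a normal distribution ("normal")
z_crit <- qnorm(0.975)
ci_z <- cbind(lower = est - z_crit * se, upper = est + z_crit * se)
p_z <- 2 * pnorm(abs(est / se), lower.tail = FALSE)

ci_t
ci_z
```

The t-based intervals agree with confint(fit), and the normal-based intervals are slightly narrower because the normal critical value is smaller than the t critical value at finite df.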
Methods for mixed models:
Compared to fixed effects (or single-level) models, determining appropriate df for Wald-based inference in mixed models is more difficult. See the R GLMM FAQ for a discussion.
Several approximate methods for computing df are available, but you should also consider using profile likelihood ("profile") or bootstrap ("boot") CIs and p-values instead.
"satterthwaite": Applies to linear mixed models. CIs computed using the Wald method (SE and a t-distribution with Satterthwaite df); p-values computed using the Wald method with a t-distribution with Satterthwaite df.
"kenward": Applies to linear mixed models. CIs computed using the Wald method (Kenward-Roger SE and a t-distribution with Kenward-Roger df); p-values computed using the Wald method with Kenward-Roger SE and t-distribution with Kenward-Roger df.
"ml1": Applies to linear mixed models. CIs computed using the Wald method (SE and a t-distribution with m-l-1 approximated df); p-values computed using the Wald method with a t-distribution with m-l-1 approximated df. See ci_ml1().
"betwithin": Applies to linear mixed models and generalized linear mixed models. CIs computed using the Wald method (SE and a t-distribution with between-within df); p-values computed using the Wald method with a t-distribution with between-within df. See ci_betwithin().
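All of the approximations above feed a df value into the same t-based Wald interval, so the choice of method matters mainly through the critical value. The following base-R sketch uses hypothetical df values (they do not come from any of the named methods) to show how strongly the critical value depends on df.

```r
# Hypothetical df values, as different approximations might yield them
df_candidates <- c(small = 3, moderate = 30, large = 300, normal = Inf)

# Two-sided 95% critical values; the CI half-width is this value times the SE
crit <- qt(0.975, df = df_candidates)
round(crit, 3)
```

With few groups the approximated df can be small, and the resulting interval is then considerably wider than a normal-based one; with infinite df the t critical value equals the normal critical value.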
Likelihood-based methods:
Likelihood-based inference is based on comparing the likelihood for the maximum-likelihood estimate to the likelihood for models with one or more parameter values changed (e.g., set to zero or a range of alternative values). Likelihood ratios for the maximum-likelihood and alternative models are compared to a $\chi$-squared distribution to compute CIs and p-values.
"profile": Applies to non-Bayesian models of class glm, polr, merMod or glmmTMB. CIs computed by profiling the likelihood curve for a parameter, using linear interpolation to find where likelihood ratio equals a critical value; p-values computed using the Wald method with a normal-distribution (note: this might change in a future update!).
"uniroot": Applies to non-Bayesian models of class glmmTMB. CIs computed by profiling the likelihood curve for a parameter, using root finding to find where likelihood ratio equals a critical value; p-values computed using the Wald method with a normal-distribution (note: this might change in a future update!).
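The basic likelihood-ratio comparison behind these methods can be illustrated in base R by setting one parameter to zero (dropping the term) and comparing the two fits. This is a sketch of the general principle only; it does not reproduce the profiling or root-finding that the "profile" and "uniroot" options perform, and the object names are chosen for this example.

```r
# Logistic regression with and without the predictor of interest
full <- glm(am ~ wt + hp, data = mtcars, family = binomial())
reduced <- update(full, . ~ . - hp)

# Likelihood-ratio statistic, compared to a chi-squared distribution
# with 1 df (one parameter was set to zero)
lr_stat <- as.numeric(2 * (logLik(full) - logLik(reduced)))
p_lr <- pchisq(lr_stat, df = 1, lower.tail = FALSE)

lr_stat
p_lr
```

The same p-value is reported by anova(reduced, full, test = "Chisq"). A profile-likelihood CI collects all parameter values for which this statistic stays below the chi-squared critical value.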
Methods for bootstrapped or Bayesian models:
Bootstrap-based inference is based on resampling and refitting the model to the resampled datasets. The distribution of parameter estimates across resampled datasets is used to approximate the parameter's sampling distribution. Depending on the type of model, several different methods for bootstrapping and constructing CIs and p-values from the bootstrap distribution are available.
For Bayesian models, inference is based on drawing samples from the model posterior distribution.
"quantile" (or "eti"): Applies to all models (including Bayesian models). For non-Bayesian models, only applies if bootstrap = TRUE. CIs computed as equal tailed intervals using the quantiles of the bootstrap or posterior samples; p-values are based on the probability of direction. See bayestestR::eti().
"hdi": Applies to all models (including Bayesian models). For non-Bayesian models, only applies if bootstrap = TRUE. CIs computed as highest density intervals for the bootstrap or posterior samples; p-values are based on the probability of direction. See bayestestR::hdi().
"bci" (or "bcai"): Applies to all models (including Bayesian models). For non-Bayesian models, only applies if bootstrap = TRUE. CIs computed as bias corrected and accelerated intervals for the bootstrap or posterior samples; p-values are based on the probability of direction. See bayestestR::bci().
"si": Applies to Bayesian models with proper priors. CIs computed as support intervals comparing the posterior samples against the prior samples; p-values are based on the probability of direction. See bayestestR::si().
"boot": Applies to non-Bayesian models of class merMod. CIs computed using parametric bootstrapping (simulating data from the fitted model); p-values computed using the Wald method with a normal-distribution (note: this might change in a future update!).
For all iteration-based methods other than "boot" ("hdi", "quantile", "ci", "eti", "si", "bci", "bcai"), p-values are based on the probability of direction (bayestestR::p_direction()), which is converted into a p-value using bayestestR::pd_to_p().
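The chain from resampling to an equal-tailed interval and a direction-based p-value can be sketched in base R. This is an illustration under stated assumptions: it uses a simple nonparametric row bootstrap of a linear model, and the two-sided conversion p = 2 * (1 - pd) is assumed here. In practice, the bayestestR functions named above should be preferred.

```r
set.seed(123)

# Nonparametric bootstrap: resample rows, refit, keep the slope for wt
boot_slopes <- replicate(500, {
  idx <- sample(nrow(mtcars), replace = TRUE)
  coef(lm(mpg ~ wt, data = mtcars[idx, ]))[["wt"]]
})

# Equal-tailed 95% interval from the quantiles of the bootstrap draws
eti <- quantile(boot_slopes, probs = c(0.025, 0.975))

# Probability of direction: share of draws with the dominant sign
pd <- max(mean(boot_slopes > 0), mean(boot_slopes < 0))

# Two-sided p-value derived from pd (assumed conversion, see lead-in)
p_value <- 2 * (1 - pd)

eti
pd
p_value
```

With only a few hundred draws, pd can reach exactly 1 and the derived p-value exactly 0; more iterations give a finer resolution.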
If the calculation of random effects parameters takes too long, you may use effects = "fixed". There is also a plot()-method implemented in the see-package.
Chung Y, Rabe-Hesketh S, Dorie V, Gelman A, and Liu J. 2013. "A Nondegenerate Penalized Likelihood Estimator for Variance Parameters in Multilevel Models." Psychometrika 78 (4): 685–709. doi:10.1007/s11336-013-9328-2
insight::standardize_names() to rename columns into a consistent, standardized naming scheme.
library(parameters)
data(mtcars)
model <- lme4::lmer(mpg ~ wt + (1 | gear), data = mtcars)
model_parameters(model)

data(Salamanders, package = "glmmTMB")
model <- glmmTMB::glmmTMB(
  count ~ spp + mined + (1 | site),
  ziformula = ~mined,
  family = poisson(),
  data = Salamanders
)
model_parameters(model, effects = "all")

model <- lme4::lmer(mpg ~ wt + (1 | gear), data = mtcars)
model_parameters(model, bootstrap = TRUE, iterations = 50, verbose = FALSE)
Format cluster models obtained for example by kmeans().
## S3 method for class 'dbscan'
model_parameters(model, data = NULL, clusters = NULL, ...)

## S3 method for class 'hclust'
model_parameters(model, data = NULL, clusters = NULL, ...)

## S3 method for class 'pvclust'
model_parameters(model, data = NULL, clusters = NULL, ci = 0.95, ...)

## S3 method for class 'kmeans'
model_parameters(model, ...)

## S3 method for class 'hkmeans'
model_parameters(model, ...)

## S3 method for class 'Mclust'
model_parameters(model, data = NULL, clusters = NULL, ...)

## S3 method for class 'pam'
model_parameters(model, data = NULL, clusters = NULL, ...)
model 
Cluster model. 
data 
A data.frame. 
clusters 
A vector of cluster assignments (must have the same length as the number of rows in data). 
... 
Arguments passed to or from other methods. 
ci 
Confidence Interval (CI) level. Default to 0.95 (95%). 
# DBSCAN
if (require("dbscan", quietly = TRUE)) {
  model <- dbscan::dbscan(iris[1:4], eps = 1.45, minPts = 10)

  rez <- model_parameters(model, iris[1:4])
  rez

  # Get clusters
  predict(rez)

  # Cluster centers in long form
  attributes(rez)$means

  # Between and Total Sum of Squares
  attributes(rez)$Sum_Squares_Total
  attributes(rez)$Sum_Squares_Between

  # HDBSCAN
  model <- dbscan::hdbscan(iris[1:4], minPts = 10)
  model_parameters(model, iris[1:4])
}

# Hierarchical clustering (hclust)
data <- iris[1:4]
model <- hclust(dist(data))
clusters <- cutree(model, 3)

rez <- model_parameters(model, data, clusters)
rez

# Get clusters
predict(rez)

# Cluster centers in long form
attributes(rez)$means

# Between and Total Sum of Squares
attributes(rez)$Total_Sum_Squares
attributes(rez)$Between_Sum_Squares

# pvclust (finds "significant" clusters)
if (require("pvclust", quietly = TRUE)) {
  data <- iris[1:4]
  # NOTE: pvclust works on transposed data
  model <- pvclust::pvclust(datawizard::data_transpose(data, verbose = FALSE),
    method.dist = "euclidean",
    nboot = 50,
    quiet = TRUE
  )

  rez <- model_parameters(model, data, ci = 0.90)
  rez

  # Get clusters
  predict(rez)

  # Cluster centers in long form
  attributes(rez)$means

  # Between and Total Sum of Squares
  attributes(rez)$Sum_Squares_Total
  attributes(rez)$Sum_Squares_Between
}

# K-means
model <- kmeans(iris[1:4], centers = 3)
rez <- model_parameters(model)
rez

# Get clusters
predict(rez)

# Cluster centers in long form
attributes(rez)$means

# Between and Total Sum of Squares
attributes(rez)$Sum_Squares_Total
attributes(rez)$Sum_Squares_Between

# Hierarchical K-means (factoextra::hkmeans)
if (require("factoextra", quietly = TRUE)) {
  data <- iris[1:4]
  model <- factoextra::hkmeans(data, k = 3)

  rez <- model_parameters(model)
  rez

  # Get clusters
  predict(rez)

  # Cluster centers in long form
  attributes(rez)$means

  # Between and Total Sum of Squares
  attributes(rez)$Sum_Squares_Total
  attributes(rez)$Sum_Squares_Between
}

if (require("mclust", quietly = TRUE)) {
  model <- mclust::Mclust(iris[1:4], verbose = FALSE)
  model_parameters(model)
}

# K-Medoids (PAM and HPAM)
if (require("cluster", quietly = TRUE)) {
  model <- cluster::pam(iris[1:4], k = 3)
  model_parameters(model)
}
if (require("fpc", quietly = TRUE)) {
  model <- fpc::pamk(iris[1:4], criterion = "ch")
  model_parameters(model)
}
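The examples above read the between and total sums of squares from attributes of the returned object. As a rough illustration of what these two quantities mean, the following base R sketch computes them by hand for a k-means solution. It is not the package's implementation; the names grand_mean, ss_total and ss_between are chosen only for this example.

```r
set.seed(1)
data <- iris[1:4]
model <- kmeans(data, centers = 3)

# Total sum of squares: squared distances of all observations to the grand mean
grand_mean <- colMeans(data)
ss_total <- sum(sweep(as.matrix(data), 2, grand_mean)^2)

# Between sum of squares: cluster size times the squared distance of each
# cluster center to the grand mean, summed over clusters
ss_between <- sum(model$size * rowSums(sweep(model$centers, 2, grand_mean)^2))

# Share of the total variation that is accounted for by the clustering
ss_between / ss_total
```

For k-means, the same values are stored by stats::kmeans() as model$totss and model$betweenss, which makes the hand computation easy to check.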
Extract and compute indices and measures to describe parameters of (generalized) linear models (GLMs).
## Default S3 method:
model_parameters(
  model,
  ci = 0.95,
  ci_method = NULL,
  bootstrap = FALSE,
  iterations = 1000,
  standardize = NULL,
  exponentiate = FALSE,
  p_adjust = NULL,
  vcov = NULL,
  vcov_args = NULL,
  summary = getOption("parameters_summary", FALSE),
  include_info = getOption("parameters_info", FALSE),
  keep = NULL,
  drop = NULL,
  verbose = TRUE,
  ...
)

## S3 method for class 'glm'
model_parameters(
  model,
  ci = 0.95,
  ci_method = NULL,
  bootstrap = FALSE,
  iterations = 1000,
  standardize = NULL,
  exponentiate = FALSE,
  p_adjust = NULL,
  vcov = NULL,
  vcov_args = NULL,
  summary = getOption("parameters_summary", FALSE),
  include_info = getOption("parameters_info", FALSE),
  keep = NULL,
  drop = NULL,
  verbose = TRUE,
  ...
)

## S3 method for class 'censReg'
model_parameters(
  model,
  ci = 0.95,
  ci_method = NULL,
  bootstrap = FALSE,
  iterations = 1000,
  standardize = NULL,
  exponentiate = FALSE,
  p_adjust = NULL,
  vcov = NULL,
  vcov_args = NULL,
  summary = getOption("parameters_summary", FALSE),
  include_info = getOption("parameters_info", FALSE),
  keep = NULL,
  drop = NULL,
  verbose = TRUE,
  ...
)

## S3 method for class 'ridgelm'
model_parameters(model, verbose = TRUE, ...)
model 
Model object. 
ci 
Confidence Interval (CI) level. Default to 0.95 (95%). 
ci_method 
Method for computing degrees of freedom for confidence intervals (CI) and the related p-values. The following options are allowed (which vary depending on the model class): 
bootstrap 
Should estimates be based on bootstrapped model? If

iterations 
The number of bootstrap replicates. This only applies in the case of bootstrapped frequentist models. 
standardize 
The method used for standardizing the parameters. Can be

exponentiate 
Logical, indicating whether or not to exponentiate the
coefficients (and related confidence intervals). This is typical for
logistic regression, or more generally speaking, for models with log or
logit links. It is also recommended to use 
p_adjust 
Character vector, if not 
vcov 
Variance-covariance matrix used to compute uncertainty estimates (e.g., for robust standard errors). This argument accepts a covariance matrix, a function which returns a covariance matrix, or a string which identifies the function to be used to compute the covariance matrix.

vcov_args 
List of arguments to be passed to the function identified by
the 
summary 
Deprecated, please use 
include_info 
Logical, if 
keep 
Character containing a regular expression pattern that
describes the parameters that should be included (for 
drop 
See 
verbose 
Toggle warnings and messages. 
... 
Arguments passed to or from other methods. For instance, when
Further non-documented arguments are:

A data frame of indices related to the model's parameters.
There are different ways of approximating the degrees of freedom depending on different assumptions about the nature of the model and its sampling distribution. The ci_method argument modulates the method for computing degrees of freedom (df) that are used to calculate confidence intervals (CI) and the related p-values. The following options are allowed, depending on the model class:
Classical methods:
Classical inference is generally based on the Wald method. The Wald approach to inference computes a test statistic by dividing the parameter estimate by its standard error (Coefficient / SE), then comparing this statistic against a t- or normal distribution. This approach can be used to compute CIs and p-values.
"wald"
Applies to non-Bayesian models. For linear models, CIs computed using the Wald method (SE and a t-distribution with residual df); p-values computed using the Wald method with a t-distribution with residual df. For other models, CIs computed using the Wald method (SE and a normal distribution); p-values computed using the Wald method with a normal distribution.
"normal"
Applies to non-Bayesian models. Compute Wald CIs and p-values, but always use a normal distribution.
"residual"
Applies to non-Bayesian models. Compute Wald CIs and p-values, but always use a t-distribution with residual df when possible. If the residual df for a model cannot be determined, a normal distribution is used instead.
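To make the difference between the t-based and the normal-based Wald variants concrete, here is a minimal base R sketch for a linear model. It only illustrates the arithmetic described above (Coefficient / SE compared against a t- or normal distribution) and is not taken from the package's source code; all object names are chosen for this example.

```r
# Fit a simple linear model on a built-in dataset
model <- lm(mpg ~ wt + cyl, data = mtcars)

est <- coef(model)
se <- sqrt(diag(vcov(model)))
stat <- est / se                # Wald statistic: Coefficient / SE
df_resid <- df.residual(model)  # residual degrees of freedom

# t-based variant (what "wald" and "residual" describe for linear models)
crit_t <- qt(0.975, df = df_resid)
ci_t <- cbind(low = est - crit_t * se, high = est + crit_t * se)
p_t <- 2 * pt(abs(stat), df = df_resid, lower.tail = FALSE)

# normal-based variant (what "normal" describes)
crit_z <- qnorm(0.975)
ci_z <- cbind(low = est - crit_z * se, high = est + crit_z * se)
p_z <- 2 * pnorm(abs(stat), lower.tail = FALSE)

ci_t
ci_z
```

Because the t-distribution has heavier tails than the normal distribution, the t-based intervals are slightly wider and the t-based p-values slightly larger, with the gap shrinking as the residual df grow.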
Methods for mixed models:
Compared to fixed effects (or single-level) models, determining appropriate df for Wald-based inference in mixed models is more difficult. See the R GLMM FAQ for a discussion.
Several approximate methods for computing df are available, but you should also consider using profile likelihood ("profile") or bootstrap ("boot") CIs and p-values instead.
"satterthwaite"
Applies to linear mixed models. CIs computed using the Wald method (SE and a t-distribution with Satterthwaite df); p-values computed using the Wald method with a t-distribution with Satterthwaite df.
"kenward"
Applies to linear mixed models. CIs computed using the Wald method (Kenward-Roger SE and a t-distribution with Kenward-Roger df); p-values computed using the Wald method with Kenward-Roger SE and a t-distribution with Kenward-Roger df.
"ml1"
Applies to linear mixed models. CIs computed using the Wald method (SE and a t-distribution with m-l-1 approximated df); p-values computed using the Wald method with a t-distribution with m-l-1 approximated df. See ci_ml1().
"betwithin"
Applies to linear mixed models and generalized linear mixed models. CIs computed using the Wald method (SE and a t-distribution with between-within df); p-values computed using the Wald method with a t-distribution with between-within df. See ci_betwithin().
Likelihood-based methods:
Likelihood-based inference is based on comparing the likelihood for the maximum-likelihood estimate to the likelihood for models with one or more parameter values changed (e.g., set to zero or a range of alternative values). Likelihood ratios for the maximum-likelihood and alternative models are compared to a $\chi$-squared distribution to compute CIs and p-values.
"profile"
Applies to non-Bayesian models of class glm, polr, merMod or glmmTMB. CIs computed by profiling the likelihood curve for a parameter, using linear interpolation to find where the likelihood ratio equals a critical value; p-values computed using the Wald method with a normal distribution (note: this might change in a future update!).
"uniroot"
Applies to non-Bayesian models of class glmmTMB. CIs computed by profiling the likelihood curve for a parameter, using root finding to find where the likelihood ratio equals a critical value; p-values computed using the Wald method with a normal distribution (note: this might change in a future update!).
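The likelihood-ratio comparison described above can be sketched in base R for a single parameter. This is only an illustration of the principle, assuming a logistic regression on mtcars; it does not reproduce how "profile" or "uniroot" trace the likelihood curve, and the object names are made up for this example.

```r
# Full model and an alternative model in which the coefficient of wt is set to zero
model_full <- glm(vs ~ wt + cyl, data = mtcars, family = binomial())
model_null <- glm(vs ~ cyl, data = mtcars, family = binomial())

# Likelihood-ratio statistic: twice the difference in log-likelihoods
lr_stat <- as.numeric(2 * (logLik(model_full) - logLik(model_null)))

# One parameter is constrained, so the reference distribution is chi-squared with 1 df
p_lr <- pchisq(lr_stat, df = 1, lower.tail = FALSE)

# Critical value for a 95% interval: a candidate value for wt lies inside the
# profile-likelihood interval as long as its likelihood-ratio statistic stays
# below this cutoff
cutoff <- qchisq(0.95, df = 1)

c(statistic = lr_stat, p = p_lr, cutoff = cutoff)
```

A profile-likelihood CI repeats this comparison over a range of fixed values for the parameter and reports the values at which the statistic crosses the cutoff.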
Methods for bootstrapped or Bayesian models:
Bootstrap-based inference is based on resampling and refitting the model to the resampled datasets. The distribution of parameter estimates across resampled datasets is used to approximate the parameter's sampling distribution. Depending on the type of model, several different methods for bootstrapping and constructing CIs and p-values from the bootstrap distribution are available.
For Bayesian models, inference is based on drawing samples from the model posterior distribution.
"quantile" (or "eti")
Applies to all models (including Bayesian models). For non-Bayesian models, only applies if bootstrap = TRUE. CIs computed as equal-tailed intervals using the quantiles of the bootstrap or posterior samples; p-values are based on the probability of direction. See bayestestR::eti().
"hdi"
Applies to all models (including Bayesian models). For non-Bayesian models, only applies if bootstrap = TRUE. CIs computed as highest density intervals for the bootstrap or posterior samples; p-values are based on the probability of direction. See bayestestR::hdi().
"bci" (or "bcai")
Applies to all models (including Bayesian models). For non-Bayesian models, only applies if bootstrap = TRUE. CIs computed as bias-corrected and accelerated intervals for the bootstrap or posterior samples; p-values are based on the probability of direction. See bayestestR::bci().
"si"
Applies to Bayesian models with proper priors. CIs computed as support intervals comparing the posterior samples against the prior samples; p-values are based on the probability of direction. See bayestestR::si().
"boot"
Applies to non-Bayesian models of class merMod. CIs computed using parametric bootstrapping (simulating data from the fitted model); p-values computed using the Wald method with a normal distribution (note: this might change in a future update!).
For all iteration-based methods other than "boot" ("hdi", "quantile", "ci", "eti", "si", "bci", "bcai"), p-values are based on the probability of direction (bayestestR::p_direction()), which is converted into a p-value using bayestestR::pd_to_p().
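As a rough base R sketch of the iteration-based approach, the code below draws a simple case-resampling bootstrap for one slope, computes an equal-tailed interval from the quantiles, and turns the probability of direction into a two-sided p-value. The resampling scheme and the conversion p = 2 * (1 - pd) are assumptions made for this illustration; the functions in bayestestR handle more cases than this.

```r
set.seed(123)

# Case-resampling bootstrap of the slope for wt
boot_wt <- replicate(500, {
  idx <- sample(nrow(mtcars), replace = TRUE)
  coef(lm(mpg ~ wt, data = mtcars[idx, ]))[["wt"]]
})

# Equal-tailed 95% interval from the quantiles of the draws
eti <- quantile(boot_wt, probs = c(0.025, 0.975))

# Probability of direction: share of draws on the dominant side of zero
pd <- max(mean(boot_wt > 0), mean(boot_wt < 0))

# Two-sided p-value derived from the probability of direction
p_value <- 2 * (1 - pd)

eti
pd
p_value
```

With a finite number of draws the smallest non-zero p-value this can produce is 2 / iterations, which is one reason to use more iterations than in this short example.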
See also insight::standardize_names() to rename columns into a consistent, standardized naming scheme.
library(parameters)
model <- lm(mpg ~ wt + cyl, data = mtcars)

model_parameters(model)

# bootstrapped parameters
model_parameters(model, bootstrap = TRUE)

# standardized parameters
model_parameters(model, standardize = "refit")

# robust, heteroskedasticity-consistent standard errors
model_parameters(model, vcov = "HC3")

model_parameters(model,
  vcov = "vcovCL",
  vcov_args = list(cluster = mtcars$cyl)
)

# different p-value style in output
model_parameters(model, p_digits = 5)
model_parameters(model, digits = 3, ci_digits = 4, p_digits = "scientific")

# report S-value or probability of direction for parameters
model_parameters(model, s_value = TRUE)
model_parameters(model, pd = TRUE)

# logistic regression model
model <- glm(vs ~ wt + cyl, data = mtcars, family = "binomial")
model_parameters(model)

# show odds ratio / exponentiated coefficients
model_parameters(model, exponentiate = TRUE)

# bias-corrected logistic regression with penalized maximum likelihood
# (the "brglmFit" method comes from the brglm2 package, which must be attached)
model <- glm(
  vs ~ wt + cyl,
  data = mtcars,
  family = "binomial",
  method = "brglmFit"
)
model_parameters(model)
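The exponentiate argument is described above as typical for models with log or logit links. The following base R sketch shows the underlying transformation for a logistic regression, assuming normal-based Wald intervals from confint.default(); it is meant as an illustration and not as a description of what the package does internally.

```r
model <- glm(vs ~ wt + cyl, data = mtcars, family = binomial())

# Coefficients and Wald intervals on the logit (log-odds) scale
log_odds <- coef(model)
ci_log_odds <- confint.default(model)

# Exponentiating the estimates and the interval bounds gives odds ratios
odds_ratios <- exp(log_odds)
ci_odds_ratios <- exp(ci_log_odds)

cbind(OR = odds_ratios, ci_odds_ratios)
```

Because exp() is monotone, the interval bounds can be transformed directly, and the resulting intervals are no longer symmetric around the point estimate.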
Parameters from multinomial or cumulative link models
## S3 method for class 'DirichletRegModel'
model_parameters(
  model,
  ci = 0.95,
  bootstrap = FALSE,
  iterations = 1000,
  component = c("all", "conditional", "precision"),
  standardize = NULL,
  exponentiate = FALSE,
  p_adjust = NULL,
  keep = NULL,
  drop = NULL,
  verbose = TRUE,
  ...
)

## S3 method for class 'bifeAPEs'
model_parameters(model, ...)

## S3 method for class 'bracl'
model_parameters(
  model,
  ci = 0.95,
  bootstrap = FALSE,
  iterations = 1000,
  standardize = NULL,
  exponentiate = FALSE,
  p_adjust = NULL,
  summary = getOption("parameters_summary", FALSE),
  include_info = getOption("parameters_info", FALSE),
  keep = NULL,
  drop = NULL,
  verbose = TRUE,
  ...
)

## S3 method for class 'mlm'
model_parameters(
  model,
  ci = 0.95,
  vcov = NULL,
  vcov_args = NULL,
  bootstrap = FALSE,
  iterations = 1000,
  standardize = NULL,
  exponentiate = FALSE,
  p_adjust = NULL,
  keep = NULL,
  drop = NULL,
  verbose = TRUE,
  ...
)

## S3 method for class 'clm2'
model_parameters(
  model,
  ci = 0.95,
  bootstrap = FALSE,
  iterations = 1000,
  component = c("all", "conditional", "scale"),
  standardize = NULL,
  exponentiate = FALSE,
  p_adjust = NULL,
  summary = getOption("parameters_summary", FALSE),
  include_info = getOption("parameters_info", FALSE),
  keep = NULL,
  drop = NULL,
  verbose = TRUE,
  ...
)
model 
A model with multinomial or categorical response value. 
ci 
Confidence Interval (CI) level. Default to 0.95 (95%). 
bootstrap 
Should estimates be based on bootstrapped model? If

iterations 
The number of bootstrap replicates. This only applies in the case of bootstrapped frequentist models. 
component 
Should all parameters, parameters for the conditional model, for the zero-inflation part of the model, or the dispersion model be returned? Applies to models with zero-inflation and/or dispersion component. 
standardize 
The method used for standardizing the parameters. Can be

exponentiate 
Logical, indicating whether or not to exponentiate the
coefficients (and related confidence intervals). This is typical for
logistic regression, or more generally speaking, for models with log or
logit links. It is also recommended to use 
p_adjust 
Character vector, if not 
keep 
Character containing a regular expression pattern that
describes the parameters that should be included (for 
drop 
See 
verbose 
Toggle warnings and messages. 
... 
Arguments passed to or from other methods. For instance, when
Further non-documented arguments are:

summary 
Deprecated, please use 
include_info 
Logical, if 
vcov 
Variance-covariance matrix used to compute uncertainty estimates (e.g., for robust standard errors). This argument accepts a covariance matrix, a function which returns a covariance matrix, or a string which identifies the function to be used to compute the covariance matrix.

vcov_args 
List of arguments to be passed to the function identified by
the 
Multinomial or cumulative link models, i.e. models where the response value (dependent variable) is categorical and has more than two levels, usually return coefficients for each response level. Hence, the output from model_parameters() will split the coefficient tables by the different levels of the model's response.
A data frame of indices related to the model's parameters.
See also insight::standardize_names() to rename columns into a consistent, standardized naming scheme.
data("stemcell", package = "brglm2")
model <- brglm2::bracl(
  research ~ as.numeric(religion) + gender,
  weights = frequency,
  data = stemcell,
  type = "ML"
)
model_parameters(model)
Parameters from Hypothesis Testing.
## S3 method for class 'glht'
model_parameters(
  model,
  ci = 0.95,
  exponentiate = FALSE,
  keep = NULL,
  drop = NULL,
  verbose = TRUE,
  ...
)
model 
Object of class 
ci 
Confidence Interval (CI) level. Default to 0.95 (95%). 
exponentiate 
Logical, indicating whether or not to exponentiate the
coefficients (and related confidence intervals). This is typical for
logistic regression, or more generally speaking, for models with log or
logit links. It is also recommended to use 
keep 
Character containing a regular expression pattern that
describes the parameters that should be included (for 
drop 
See 
verbose 
Toggle warnings and messages. 
... 
Arguments passed to or from other methods. For instance, when
Further non-documented arguments are:

A data frame of indices related to the model's parameters.
if (require("multcomp", quietly = TRUE)) {
  # multiple linear model, swiss data
  lmod <- lm(Fertility ~ ., data = swiss)
  mod <- glht(
    model = lmod,
    linfct = c(
      "Agriculture = 0",
      "Examination = 0",
      "Education = 0",
      "Catholic = 0",
      "Infant.Mortality = 0"
    )
  )
  model_parameters(mod)
}
if (require("PMCMRplus", quietly = TRUE)) {
  model <- suppressWarnings(
    kwAllPairsConoverTest(count ~ spray, data = InsectSprays)
  )
  model_parameters(model)
}
Parameters from special regression models not listed under one of the previous categories yet.
## S3 method for class 'glimML'
model_parameters(
  model,
  ci = 0.95,
  bootstrap = FALSE,
  iterations = 1000,
  component = c("conditional", "random", "dispersion", "all"),
  standardize = NULL,
  exponentiate = FALSE,
  p_adjust = NULL,
  summary = getOption("parameters_summary", FALSE),
  include_info = getOption("parameters_info", FALSE),
  keep = NULL,
  drop = NULL,
  verbose = TRUE,
  ...
)

## S3 method for class 'averaging'
model_parameters(
  model,
  ci = 0.95,
  component = c("conditional", "full"),
  exponentiate = FALSE,
  p_adjust = NULL,
  summary = getOption("parameters_summary", FALSE),
  include_info = getOption("parameters_info", FALSE),
  keep = NULL,
  drop = NULL,
  verbose = TRUE,
  ...
)

## S3 method for class 'betareg'
model_parameters(
  model,
  ci = 0.95,
  bootstrap = FALSE,
  iterations = 1000,
  component = c("conditional", "precision", "all"),
  standardize = NULL,
  exponentiate = FALSE,
  p_adjust = NULL,
  summary = getOption("parameters_summary", FALSE),
  include_info = getOption("parameters_info", FALSE),
  keep = NULL,
  drop = NULL,
  verbose = TRUE,
  ...
)

## S3 method for class 'emm_list'
model_parameters(
  model,
  ci = 0.95,
  exponentiate = FALSE,
  p_adjust = NULL,
  verbose = TRUE,
  ...
)

## S3 method for class 'glmx'
model_parameters(
  model,
  ci = 0.95,
  bootstrap = FALSE,
  iterations = 1000,
  component = c("all", "conditional", "extra"),
  standardize = NULL,
  exponentiate = FALSE,
  p_adjust = NULL,
  keep = NULL,
  drop = NULL,
  verbose = TRUE,
  ...
)

## S3 method for class 'marginaleffects'
model_parameters(model, ci = 0.95, exponentiate = FALSE, ...)

## S3 method for class 'metaplus'
model_parameters(
  model,
  ci = 0.95,
  bootstrap = FALSE,
  iterations = 1000,
  standardize = NULL,
  exponentiate = FALSE,
  include_studies = TRUE,
  keep = NULL,
  drop = NULL,
  verbose = TRUE,
  ...
)

## S3 method for class 'meta_random'
model_parameters(
  model,
  ci = 0.95,
  ci_method = "eti",
  exponentiate = FALSE,
  include_studies = TRUE,
  verbose = TRUE,
  ...
)

## S3 method for class 'meta_bma'
model_parameters(
  model,
  ci = 0.95,
  ci_method = "eti",
  exponentiate = FALSE,
  include_studies = TRUE,
  verbose = TRUE,
  ...
)

## S3 method for class 'betaor'
model_parameters(
  model,
  ci = 0.95,
  bootstrap = FALSE,
  iterations = 1000,
  component = c("conditional", "precision", "all"),
  standardize = NULL,
  exponentiate = FALSE,
  p_adjust = NULL,
  verbose = TRUE,
  ...
)

## S3 method for class 'betamfx'
model_parameters(
  model,
  ci = 0.95,
  bootstrap = FALSE,
  iterations = 1000,
  component = c("all", "conditional", "precision", "marginal"),
  standardize = NULL,
  exponentiate = FALSE,
  p_adjust = NULL,
  keep = NULL,
  drop = NULL,
  verbose = TRUE,
  ...
)

## S3 method for class 'mjoint'
model_parameters(
  model,
  ci = 0.95,
  effects = "fixed",
  component = c("all", "conditional", "survival"),
  exponentiate = FALSE,
  p_adjust = NULL,
  keep = NULL,
  drop = NULL,
  verbose = TRUE,
  ...
)

## S3 method for class 'mvord'
model_parameters(
  model,
  ci = 0.95,
  component = c("all", "conditional", "thresholds", "correlation"),
  standardize = NULL,
  exponentiate = FALSE,
  p_adjust = NULL,
  summary = getOption("parameters_summary", FALSE),
  include_info = getOption("parameters_info", FALSE),
  keep = NULL,
  drop = NULL,
  verbose = TRUE,
  ...
)

## S3 method for class 'selection'
model_parameters(
  model,
  ci = 0.95,
  component = c("all", "selection", "outcome", "auxiliary"),
  bootstrap = FALSE,
  iterations = 1000,
  standardize = NULL,
  exponentiate = FALSE,
  p_adjust = NULL,
  summary = getOption("parameters_summary", FALSE),
  include_info = getOption("parameters_info", FALSE),
  keep = NULL,
  drop = NULL,
  verbose = TRUE,
  ...
)
model 
Model object. 
ci 
Confidence Interval (CI) level. Default to 0.95 (95%). 
bootstrap 
Should estimates be based on bootstrapped model? If

iterations 
The number of bootstrap replicates. This only applies in the case of bootstrapped frequentist models. 
component 
Model component for which parameters should be shown. May be
one of 
standardize 
The method used for standardizing the parameters. Can be

exponentiate 
Logical, indicating whether or not to exponentiate the
coefficients (and related confidence intervals). This is typical for
logistic regression, or more generally speaking, for models with log or
logit links. It is also recommended to use 
p_adjust 
Character vector, if not 
summary 
Deprecated, please use 
include_info 
Logical, if 
keep 
Character containing a regular expression pattern that
describes the parameters that should be included (for 
drop 
See 
verbose 
Toggle warnings and messages. 
... 
Arguments passed to or from other methods. For instance, when
Further non-documented arguments are:

include_studies 
Logical, if 
ci_method 
Method for computing degrees of freedom for
confidence intervals (CI) and the related p-values. The following options
are allowed (which vary depending on the model class): 
effects 
Should results for fixed effects, random effects or both be returned? Only applies to mixed models. May be abbreviated. 
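As an illustration of what the exponentiate argument described above amounts to for a logit-link model, the following base R sketch exponentiates the coefficients and their Wald confidence limits by hand. It is only an approximation of the idea under stated assumptions: the model (am ~ wt on mtcars) is chosen for illustration, and model_parameters() may use a different CI method for a given model class.

```r
# Logistic regression on a built-in data set (base R only).
fit <- glm(am ~ wt, data = mtcars, family = binomial())

# Coefficients and Wald confidence limits on the log-odds (link) scale.
est <- coef(fit)
ci <- confint.default(fit, level = 0.95)

# Exponentiating estimates and limits yields odds ratios with their CIs.
odds_ratios <- cbind(OR = exp(est), exp(ci))
print(odds_ratios)
```

Note that the interval is exponentiated limit by limit, so it is no longer symmetric around the odds ratio.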
A data frame of indices related to the model's parameters.
See also insight::standardize_names() to rename columns into a consistent, standardized naming scheme.
library(parameters)
if (require("brglm2", quietly = TRUE)) {
  data("stemcell")
  # bias-reduced adjacent category logit model
  model <- bracl(
    research ~ as.numeric(religion) + gender,
    weights = frequency,
    data = stemcell,
    type = "ML"
  )
  model_parameters(model)
}
Parameters of h-tests (correlations, t-tests, chi-squared, ...).
## S3 method for class 'htest' model_parameters( model, ci = 0.95, alternative = NULL, bootstrap = FALSE, es_type = NULL, verbose = TRUE, ... ) ## S3 method for class 'coeftest' model_parameters( model, ci = 0.95, ci_method = "wald", keep = NULL, drop = NULL, verbose = TRUE, ... )
model 
Object of class 
ci 
Level of confidence intervals for effect size statistic. Currently
only applies to objects from 
alternative 
A character string specifying the alternative hypothesis;
Controls the type of CI returned: 
bootstrap 
Should estimates be bootstrapped? 
es_type 
The effect size of interest. Note that not all effect sizes may be applicable to the model object. See 'Details'. For Anova models, can also be a character vector with multiple effect size names. 
verbose 
Toggle warnings and messages. 
... 
Arguments passed to or from other methods. For instance, when
Further non-documented arguments are:

ci_method 
Method for computing degrees of freedom for
confidence intervals (CI) and the related p-values. The following options
are allowed (which vary depending on the model class): 
keep 
Character containing a regular expression pattern that
describes the parameters that should be included (for 
drop 
See 
For an object of class htest, data is extracted via insight::get_data(), and passed to the relevant function according to:
  A t-test, depending on type: "cohens_d" (default), "hedges_g", or one of "p_superiority", "u1", "u2", "u3", "overlap".
  For a paired t-test, depending on type: "rm_rm", "rm_av", "rm_b", "rm_d", "rm_z".
  A Chi-squared test of independence or Fisher's Exact Test, depending on type: "cramers_v" (default), "tschuprows_t", "phi", "cohens_w", "pearsons_c", "cohens_h", "oddsratio", "riskratio", "arr", or "nnt".
  A Chi-squared test of goodness-of-fit, depending on type: "fei" (default), "cohens_w", "pearsons_c".
  A One-way ANOVA test, depending on type: "eta" (default), "omega" or "epsilon" -squared, "f", or "f2".
  A McNemar test returns Cohen's g.
  A Wilcoxon test, depending on type: returns "rank_biserial" correlation (default) or one of "p_superiority", "vda", "u2", "u3", "overlap".
  A Kruskal-Wallis test, depending on type: "epsilon" (default) or "eta".
  A Friedman test returns Kendall's W.
  (Where applicable, ci and alternative are taken from the htest if not otherwise provided.)
For an object of class BFBayesFactor, using bayestestR::describe_posterior():
  A t-test, depending on type: "cohens_d" (default) or one of "p_superiority", "u1", "u2", "u3", "overlap".
  A correlation test returns r.
  A contingency table test, depending on type: "cramers_v" (default), "phi", "tschuprows_t", "cohens_w", "pearsons_c", "cohens_h", "oddsratio", "riskratio", "arr", or "nnt".
  A proportion test returns p.
Objects of class anova, aov, aovlist or afex_aov, depending on type: "eta" (default), "omega" or "epsilon" -squared, "f", or "f2".
Other objects are passed to parameters::standardize_parameters().
For statistical models it is recommended to directly use the listed functions, for the full range of options they provide.
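To make the "cohens_d" and "hedges_g" options for a two-sample t-test more concrete, here is a hand computation in base R. It is a sketch under assumptions, not the package's implementation: it uses the pooled standard deviation and the common approximate small-sample correction for Hedges' g, whereas the effect size functions that model_parameters() delegates to may use different defaults (e.g. an exact correction factor).

```r
# Two groups from a built-in data set: mpg for vs == 0 and vs == 1.
x <- mtcars$mpg[mtcars$vs == 0]
y <- mtcars$mpg[mtcars$vs == 1]
nx <- length(x)
ny <- length(y)

# Pooled standard deviation (assumes equal variances in both groups).
pooled_sd <- sqrt(((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2))

# Cohen's d: standardized mean difference.
cohens_d <- (mean(x) - mean(y)) / pooled_sd

# Hedges' g: d shrunk by an approximate small-sample correction factor.
correction <- 1 - 3 / (4 * (nx + ny) - 9)
hedges_g <- cohens_d * correction

print(c(cohens_d = cohens_d, hedges_g = hedges_g))
```

Because the correction factor is below 1, Hedges' g is always slightly closer to zero than Cohen's d, and the difference shrinks as the sample grows.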
A data frame of indices related to the model's parameters.
library(parameters)

model <- cor.test(mtcars$mpg, mtcars$cyl, method = "pearson")
model_parameters(model)

model <- t.test(iris$Sepal.Width, iris$Sepal.Length)
model_parameters(model, es_type = "hedges_g")

model <- t.test(mtcars$mpg ~ mtcars$vs)
model_parameters(model, es_type = "hedges_g")

model <- t.test(iris$Sepal.Width, mu = 1)
model_parameters(model, es_type = "cohens_d")

data(airquality)
airquality$Month <- factor(airquality$Month, labels = month.abb[5:9])
model <- pairwise.t.test(airquality$Ozone, airquality$Month)
model_parameters(model)

smokers <- c(83, 90, 129, 70)
patients <- c(86, 93, 136, 82)
model <- suppressWarnings(pairwise.prop.test(smokers, patients))
model_parameters(model)

model <- suppressWarnings(chisq.test(table(mtcars$am, mtcars$cyl)))
model_parameters(model, es_type = "cramers_v")
Parameters from Bayesian models.
## S3 method for class 'MCMCglmm' model_parameters( model, centrality = "median", dispersion = FALSE, ci = 0.95, ci_method = "eti", test = "pd", rope_range = "default", rope_ci = 0.95, bf_prior = NULL, diagnostic = c("ESS", "Rhat"), priors = TRUE, keep = NULL, drop = NULL, verbose = TRUE, ... ) ## S3 method for class 'data.frame' model_parameters( model, as_draws = FALSE, exponentiate = FALSE, verbose = TRUE, ... ) ## S3 method for class 'brmsfit' model_parameters( model, centrality = "median", dispersion = FALSE, ci = 0.95, ci_method = "eti", test = "pd", rope_range = "default", rope_ci = 0.95, bf_prior = NULL, diagnostic = c("ESS", "Rhat"), priors = FALSE, effects = "fixed", component = "all", exponentiate = FALSE, standardize = NULL, group_level = FALSE, keep = NULL, drop = NULL, verbose = TRUE, ... ) ## S3 method for class 'draws' model_parameters( model, centrality = "median", dispersion = FALSE, ci = 0.95, ci_method = "eti", test = "pd", rope_range = "default", rope_ci = 0.95, exponentiate = FALSE, keep = NULL, drop = NULL, verbose = TRUE, ... ) ## S3 method for class 'stanreg' model_parameters( model, centrality = "median", dispersion = FALSE, ci = 0.95, ci_method = "eti", test = "pd", rope_range = "default", rope_ci = 0.95, bf_prior = NULL, diagnostic = c("ESS", "Rhat"), priors = TRUE, effects = "fixed", exponentiate = FALSE, standardize = NULL, group_level = FALSE, keep = NULL, drop = NULL, verbose = TRUE, ... )
model 
Bayesian model (including SEM from blavaan). May also be
a data frame with posterior samples, however, 
centrality 
The point-estimates (centrality indices) to compute. Character
(vector) or list with one or more of these options: 
dispersion 
Logical, if 
ci 
Credible Interval (CI) level. Defaults to 0.95 (95%).
ci_method 
Method for computing degrees of freedom for
confidence intervals (CI) and the related p-values. The following options
are allowed (which vary depending on the model class): 
test 
The indices of effect existence to compute. Character (vector) or
list with one or more of these options: 
rope_range 
ROPE's lower and higher bounds. Should be a vector of two
values (e.g., 
rope_ci 
The Credible Interval (CI) probability, corresponding to the proportion of HDI, to use for the percentage in ROPE. 
bf_prior 
Distribution representing a prior for the computation of Bayes factors / SI. Used if the input is a posterior, otherwise (in the case of models) ignored. 
diagnostic 
Diagnostic metrics to compute. Character (vector) or list
with one or more of these options: 
priors 
Add the prior used for each parameter. 
keep 
Character containing a regular expression pattern that
describes the parameters that should be included (for 
drop 
See 
verbose 
Toggle messages and warnings. 
... 
Currently not used. 
as_draws 
Logical, if 
exponentiate 
Logical, indicating whether or not to exponentiate the
coefficients (and related confidence intervals). This is typical for
logistic regression, or more generally speaking, for models with log or
logit links. It is also recommended to use 
effects 
Should results for fixed effects, random effects or both be returned? Only applies to mixed models. May be abbreviated. 
component 
Which type of parameters to return, such as parameters for the
conditional model, the zero-inflation part of the model, the dispersion
term, or other auxiliary parameters. Applies to models with
zero-inflation and/or dispersion formula, or if parameters such as 
standardize 
The method used for standardizing the parameters. Can be

group_level 
Logical, for multilevel models (i.e. models with random
effects) and when 
A data frame of indices related to the model's parameters.
There are different ways of approximating the degrees of freedom depending on different assumptions about the nature of the model and its sampling distribution. The ci_method argument modulates the method for computing degrees of freedom (df) that are used to calculate confidence intervals (CI) and the related p-values. The following options are allowed, depending on the model class:
Classical methods:
Classical inference is generally based on the Wald method. The Wald approach to inference computes a test statistic by dividing the parameter estimate by its standard error (Coefficient / SE), then comparing this statistic against a t- or normal distribution. This approach can be used to compute CIs and p-values.
"wald"
Applies to non-Bayesian models. For linear models, CIs computed using the Wald method (SE and a t-distribution with residual df); p-values computed using the Wald method with a t-distribution with residual df. For other models, CIs computed using the Wald method (SE and a normal distribution); p-values computed using the Wald method with a normal distribution.
"normal"
Applies to non-Bayesian models. Compute Wald CIs and p-values, but always use a normal distribution.
"residual"
Applies to non-Bayesian models. Compute Wald CIs and p-values, but always use a t-distribution with residual df when possible. If the residual df for a model cannot be determined, a normal distribution is used instead.
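The Wald computations described above can be reproduced by hand for a linear model. The base R sketch below is for illustration only (the model mpg ~ wt on mtcars is an assumption chosen for the example): it computes the statistic Coefficient / SE, then derives CIs and p-values once with a t-distribution on residual df (what "wald" and "residual" describe for linear models) and once with a normal distribution (what "normal" describes).

```r
fit <- lm(mpg ~ wt, data = mtcars)
coefs <- summary(fit)$coefficients

est <- coefs[, "Estimate"]
se <- coefs[, "Std. Error"]
stat <- est / se                      # Wald statistic: Coefficient / SE
df_resid <- df.residual(fit)

# t-distribution with residual df.
crit_t <- qt(0.975, df = df_resid)
ci_wald <- cbind(low = est - crit_t * se, high = est + crit_t * se)
p_wald <- 2 * pt(abs(stat), df = df_resid, lower.tail = FALSE)

# Normal distribution, regardless of residual df.
crit_z <- qnorm(0.975)
ci_normal <- cbind(low = est - crit_z * se, high = est + crit_z * se)
p_normal <- 2 * pnorm(abs(stat), lower.tail = FALSE)

print(ci_wald)
print(ci_normal)
```

The t-based limits agree with what confint() returns for an lm object; the normal-based interval is slightly narrower because the normal quantile is smaller than the t quantile at finite df.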
Methods for mixed models:
Compared to fixed effects (or single-level) models, determining appropriate df for Wald-based inference in mixed models is more difficult. See the R GLMM FAQ for a discussion.
Several approximate methods for computing df are available, but you should also consider using profile likelihood ("profile") or bootstrap ("boot") CIs and p-values instead.
"satterthwaite"
Applies to linear mixed models. CIs computed using the Wald method (SE and a t-distribution with Satterthwaite df); p-values computed using the Wald method with a t-distribution with Satterthwaite df.
"kenward"
Applies to linear mixed models. CIs computed using the Wald method (Kenward-Roger SE and a t-distribution with Kenward-Roger df); p-values computed using the Wald method with Kenward-Roger SE and t-distribution with Kenward-Roger df.
"ml1"
Applies to linear mixed models. CIs computed using the Wald method (SE and a t-distribution with m-l-1 approximated df); p-values computed using the Wald method with a t-distribution with m-l-1 approximated df. See ci_ml1().
"betwithin"
Applies to linear mixed models and generalized linear mixed models. CIs computed using the Wald method (SE and a t-distribution with between-within df); p-values computed using the Wald method with a t-distribution with between-within df. See ci_betwithin().
Likelihood-based methods:
Likelihood-based inference is based on comparing the likelihood for the maximum-likelihood estimate to the likelihood for models with one or more parameter values changed (e.g., set to zero or a range of alternative values). Likelihood ratios for the maximum-likelihood and alternative models are compared to a $\chi$-squared distribution to compute CIs and p-values.
"profile"
Applies to non-Bayesian models of class glm, polr, merMod or glmmTMB. CIs computed by profiling the likelihood curve for a parameter, using linear interpolation to find where likelihood ratio equals a critical value; p-values computed using the Wald method with a normal-distribution (note: this might change in a future update!)
"uniroot"
Applies to non-Bayesian models of class glmmTMB. CIs computed by profiling the likelihood curve for a parameter, using root finding to find where likelihood ratio equals a critical value; p-values computed using the Wald method with a normal-distribution (note: this might change in a future update!)
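The comparison at the heart of likelihood-based inference can be sketched in base R as a likelihood-ratio test. This example is an illustration of the principle only, not the profiling algorithm behind "profile" or "uniroot": it fits a model with and without one parameter (the slope for wt, an arbitrary choice) and compares twice the log-likelihood difference to a $\chi$-squared distribution with one degree of freedom.

```r
# Model with the parameter of interest, and the same model with it set to zero.
full <- glm(am ~ wt, data = mtcars, family = binomial())
reduced <- glm(am ~ 1, data = mtcars, family = binomial())

# Likelihood-ratio statistic: 2 * (logLik at the MLE - logLik of the constrained model).
lr_stat <- as.numeric(2 * (logLik(full) - logLik(reduced)))

# One parameter was constrained, hence one degree of freedom.
p_lr <- pchisq(lr_stat, df = 1, lower.tail = FALSE)

# Critical value that a profile-likelihood CI at the 95% level would use.
crit <- qchisq(0.95, df = 1)

print(c(lr_stat = lr_stat, p_value = p_lr, critical_value = crit))
```

A profile-likelihood interval collects all values of the parameter for which the statistic stays below the critical value, which is why such intervals need not be symmetric around the estimate.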
Methods for bootstrapped or Bayesian models:
Bootstrap-based inference is based on resampling and refitting the model to the resampled datasets. The distribution of parameter estimates across resampled datasets is used to approximate the parameter's sampling distribution. Depending on the type of model, several different methods for bootstrapping and constructing CIs and p-values from the bootstrap distribution are available.
For Bayesian models, inference is based on drawing samples from the model posterior distribution.
"quantile" (or "eti")
Applies to all models (including Bayesian models). For non-Bayesian models, only applies if bootstrap = TRUE. CIs computed as equal tailed intervals using the quantiles of the bootstrap or posterior samples; p-values are based on the probability of direction. See bayestestR::eti().
"hdi"
Applies to all models (including Bayesian models). For non-Bayesian models, only applies if bootstrap = TRUE. CIs computed as highest density intervals for the bootstrap or posterior samples; p-values are based on the probability of direction. See bayestestR::hdi().
"bci" (or "bcai")
Applies to all models (including Bayesian models). For non-Bayesian models, only applies if bootstrap = TRUE. CIs computed as bias corrected and accelerated intervals for the bootstrap or posterior samples; p-values are based on the probability of direction. See bayestestR::bci().
"si"
Applies to Bayesian models with proper priors. CIs computed as support intervals comparing the posterior samples against the prior samples; p-values are based on the probability of direction. See bayestestR::si().
"boot"
Applies to non-Bayesian models of class merMod. CIs computed using parametric bootstrapping (simulating data from the fitted model); p-values computed using the Wald method with a normal-distribution (note: this might change in a future update!).
For all iteration-based methods other than "boot" ("hdi", "quantile", "ci", "eti", "si", "bci", "bcai"), p-values are based on the probability of direction (bayestestR::p_direction()), which is converted into a p-value using bayestestR::pd_to_p().
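The resample-and-refit idea, the equal-tailed interval, and the conversion from probability of direction to a p-value can be sketched in base R as follows. This is a simplified illustration under assumptions: it uses an ordinary case-resampling bootstrap of a linear model (mpg ~ wt on mtcars) with 500 replicates, and computes the two-sided conversion p = 2 * (1 - pd) by hand rather than calling the bayestestR functions named above.

```r
set.seed(123)
n_iter <- 500

# Resample rows with replacement, refit, and keep the slope each time.
boot_slopes <- vapply(seq_len(n_iter), function(i) {
  idx <- sample(nrow(mtcars), replace = TRUE)
  coef(lm(mpg ~ wt, data = mtcars[idx, ]))[["wt"]]
}, numeric(1))

# Equal-tailed 95% interval from the quantiles of the bootstrap distribution.
eti <- quantile(boot_slopes, probs = c(0.025, 0.975))

# Probability of direction: share of draws with the dominant sign.
pd <- max(mean(boot_slopes > 0), mean(boot_slopes < 0))

# Two-sided p-value derived from pd.
p_value <- 2 * (1 - pd)

print(eti)
print(c(pd = pd, p_value = p_value))
```

With a finite number of draws the smallest non-zero p-value obtainable this way is 2 / n_iter, so more iterations are needed when small p-values matter.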
When standardize = "refit", columns diagnostic, bf_prior and priors refer to the original model. If model is a data frame, arguments diagnostic, bf_prior and priors are ignored.
There is also a plot() method implemented in the see-package.
See also insight::standardize_names() to rename columns into a consistent, standardized naming scheme.
library(parameters)
if (require("rstanarm")) {
  model <- suppressWarnings(stan_glm(
    Sepal.Length ~ Petal.Length * Species,
    data = iris,
    iter = 500,
    refresh = 0
  ))
  model_parameters(model)
}
Format models of class mira, obtained from mice::with.mids(), or of class mipo.
## S3 method for class 'mipo' model_parameters( model, ci = 0.95, exponentiate = FALSE, p_adjust = NULL, keep = NULL, drop = NULL, verbose = TRUE, ... ) ## S3 method for class 'mira' model_parameters( model, ci = 0.95, exponentiate = FALSE, p_adjust = NULL, keep = NULL, drop = NULL, verbose = TRUE, ... )
model 
An object of class 
ci 
Confidence Interval (CI) level. Defaults to 0.95 (95%).
exponentiate 
Logical, indicating whether or not to exponentiate the
coefficients (and related confidence intervals). This is typical for
logistic regression, or more generally speaking, for models with log or
logit links. It is also recommended to use 
p_adjust 
Character vector, if not 
keep 
Character containing a regular expression pattern that
describes the parameters that should be included (for 
drop 
See 
verbose 
Toggle warnings and messages. 
... 
Arguments passed to or from other methods. 
model_parameters() for objects of class mira works similarly to summary(mice::pool()), i.e. it generates the pooled summary of multiple imputed repeated regression analyses.
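Pooling of this kind is conventionally done with Rubin's rules. The base R sketch below shows those rules for a single coefficient; it is an illustration under assumptions, not the code used by mice or by model_parameters(). The per-imputation estimates and standard errors are hypothetical numbers made up for the example, and the degrees-of-freedom adjustment applied by real pooling routines is omitted.

```r
# Hypothetical results for one coefficient from m = 5 imputed data sets.
estimates <- c(1.92, 2.10, 2.05, 1.88, 2.01)
std_errors <- c(0.41, 0.39, 0.43, 0.40, 0.42)
m <- length(estimates)

# Pooled point estimate: average across imputations.
pooled_est <- mean(estimates)

# Within-imputation variance: average squared standard error.
within_var <- mean(std_errors^2)

# Between-imputation variance: variance of the estimates themselves.
between_var <- var(estimates)

# Total variance combines both, inflating the between part by (1 + 1/m).
total_var <- within_var + (1 + 1 / m) * between_var
pooled_se <- sqrt(total_var)

print(c(estimate = pooled_est, se = pooled_se))
```

The pooled standard error is never smaller than the within-imputation standard error, since the between-imputation term reflects the extra uncertainty caused by the missing data.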
library(parameters)
data(nhanes2, package = "mice")
imp <- mice::mice(nhanes2)
fit <- with(data = imp, expr = lm(bmi ~ age + hyp + chl))
model_parameters(fit)

# model_parameters() also works for models that have no "tidy"-method in mice
data(warpbreaks)
set.seed(1234)
warpbreaks$tension[sample(1:nrow(warpbreaks), size = 10)] <- NA
imp <- mice::mice(warpbreaks)
fit <- with(data = imp, expr = gee::gee(breaks ~ tension, id = wool))

# does not work:
# summary(mice::pool(fit))

model_parameters(fit)

# and it works with pooled results
data("nhanes2", package = "mice")
imp <- mice::mice(nhanes2)
fit <- with(data = imp, expr = lm(bmi ~ age + hyp + chl))
pooled <- mice::pool(fit)
model_parameters(pooled)
Format structural models from the psych or FactoMineR packages.
## S3 method for class 'PCA' model_parameters( model, sort = FALSE, threshold = NULL, labels = NULL, verbose = TRUE, ... ) ## S3 method for class 'lavaan' model_parameters( model, ci = 0.95, standardize = FALSE, component = c("regression", "correlation", "loading", "defined"), keep = NULL, drop = NULL, verbose = TRUE, ... ) ## S3 method for class 'principal' model_parameters( model, sort = FALSE, threshold = NULL, labels = NULL, verbose = TRUE, ... )
model 
Model object. 
sort 
Sort the loadings. 
threshold 
A value between 0 and 1 indicates which (absolute) values
from the loadings should be removed. An integer higher than 1 indicates the
n strongest loadings to retain. Can also be 
labels 
A character vector containing labels to be added to the loadings data. Usually, the question related to the item. 
verbose 
Toggle warnings and messages. 
... 
Arguments passed to or from other methods. 
ci 
Confidence Interval (CI) level. Defaults to 0.95 (95%).
standardize 
Return standardized parameters (standardized coefficients).
Can be 
component 
What type of links to return. Can be 
keep 
Character containing a regular expression pattern that
describes the parameters that should be included (for 
drop 
See 
For the structural models obtained with psych, the following indices are present:
Complexity (Hoffman's, 1978; Pettersson and Turkheimer, 2010) represents the number of latent components needed to account for the observed variables. Whereas a perfect simple structure solution has a complexity of 1 in that each item would only load on one factor, a solution with evenly distributed items has a complexity greater than 1.
Uniqueness represents the variance that is 'unique' to the variable and not shared with other variables. It is equal to 1 – communality (variance that is shared with other variables). A uniqueness of 0.20 suggests that 20% of that variable's variance is not shared with other variables in the overall factor model. The greater the 'uniqueness', the lower the relevance of the variable in the factor model.
MSA represents the Kaiser-Meyer-Olkin Measure of Sampling Adequacy (Kaiser and Rice, 1974) for each item. It indicates whether there is enough data for each factor to give reliable results for the PCA. The value should be > 0.6, and desirable values are > 0.8 (Tabachnick and Fidell, 2013).
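The relation between loadings, communality, uniqueness and complexity can be illustrated with a small base R computation. This is a sketch under assumptions rather than what psych computes internally: it derives unrotated principal component loadings from the correlation matrix of the attitude data, retains k = 3 components (an arbitrary choice), and uses the usual form of Hoffman's index, (sum of squared loadings)^2 / (sum of loadings to the fourth power). Rotated solutions give the same communalities but different complexities.

```r
# Unrotated PCA loadings from the correlation matrix of a built-in data set.
R <- cor(attitude)
eig <- eigen(R)
k <- 3
load_mat <- eig$vectors[, 1:k] %*% diag(sqrt(eig$values[1:k]))
rownames(load_mat) <- colnames(attitude)

# Communality: variance of each item explained by the retained components.
communality <- rowSums(load_mat^2)

# Uniqueness: the remainder, 1 - communality.
uniqueness <- 1 - communality

# Hoffman's complexity: 1 if an item loads on one component only, up to k if spread evenly.
complexity <- rowSums(load_mat^2)^2 / rowSums(load_mat^4)

print(round(cbind(communality, uniqueness, complexity), 3))
```

If all components are retained instead of three, every communality equals 1 and every uniqueness equals 0, because the full set of components reproduces the correlation matrix exactly.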
A data frame of indices or loadings.
There is also a plot() method for lavaan models implemented in the see-package.
Kaiser, H. F., and Rice, J. (1974). Little jiffy, mark iv. Educational and Psychological Measurement, 34(1), 111–117.
Pettersson, E., and Turkheimer, E. (2010). Item selection, evaluation, and simple structure in personality data. Journal of Research in Personality, 44(4), 407-420.
Revelle, W. (2016). How To: Use the psych package for Factor Analysis and data reduction.
Tabachnick, B. G., and Fidell, L. S. (2013). Using multivariate statistics (6th ed.). Boston: Pearson Education.
Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2), 1-36.
Merkle, E. C., and Rosseel, Y. (2018). blavaan: Bayesian Structural Equation Models via Parameter Expansion. Journal of Statistical Software, 85(4), 1-30. http://www.jstatsoft.org/v85/i04/
library(parameters)
if (require("psych", quietly = TRUE)) {
  # Principal Component Analysis (PCA)
  pca <- psych::principal(attitude)
  model_parameters(pca)

  pca <- psych::principal(attitude, nfactors = 3, rotate = "none")
  model_parameters(pca, sort = TRUE, threshold = 0.2)

  principal_components(attitude, n = 3, sort = TRUE, threshold = 0.2)

  # Exploratory Factor Analysis (EFA)
  efa <- psych::fa(attitude, nfactors = 3)
  model_parameters(efa,
    threshold = "max", sort = TRUE,
    labels = as.character(1:ncol(attitude))
  )

  # Omega
  omega <- psych::omega(mtcars, nfactors = 3)
  params <- model_parameters(omega)
  params
  summary(params)
}

# lavaan
library(parameters)
if (require("lavaan", quietly = TRUE)) {
  # Confirmatory Factor Analysis (CFA)
  structure <- "
    visual  =~ x1 + x2 + x3
    textual =~ x4 + x5 + x6
    speed   =~ x7 + x8 + x9
  "
  model <- lavaan::cfa(structure, data = HolzingerSwineford1939)
  model_parameters(model)
  model_parameters(model, standardize = TRUE)

  # filter parameters
  model_parameters(
    model,
    parameters = list(
      To = "^(?!visual)",
      From = "^(?!(x7|x8))"
    )
  )

  # Structural Equation Model (SEM)
  structure <- "
    # latent variable definitions
    ind60 =~ x1 + x2 + x3
    dem60 =~ y1 + a*y2 + b*y3 + c*y4
    dem65 =~ y5 + a*y6 + b*y7 + c*y8
    # regressions
    dem60 ~ ind60
    dem65 ~ ind60 + dem60
    # residual correlations
    y1 ~~ y5
    y2 ~~ y4 + y6
    y3 ~~ y7
    y4 ~~ y8
    y6 ~~ y8
  "
  model <- lavaan::sem(structure, data = PoliticalDemocracy)
  model_parameters(model)
  model_parameters(model, standardize = TRUE)
}
Extract and compute indices and measures to describe parameters of meta-analysis models.
## S3 method for class 'rma' model_parameters( model, ci = 0.95, bootstrap = FALSE, iterations = 1000, standardize = NULL, exponentiate = FALSE, include_studies = TRUE, keep = NULL, drop = NULL, verbose = TRUE, ... )
model 
Model object. 
ci 
Confidence Interval (CI) level. Defaults to 0.95 (95%).
bootstrap 
Should estimates be based on bootstrapped model? If

iterations 
The number of bootstrap replicates. This only applies in the case of bootstrapped frequentist models.
standardize 
The method used for standardizing the parameters. Can be

exponentiate 
Logical, indicating whether or not to exponentiate the
coefficients (and related confidence intervals). This is typical for
logistic regression, or more generally speaking, for models with log or
logit links. It is also recommended to use 
include_studies 
Logical, if 
keep 
Character containing a regular expression pattern that
describes the parameters that should be included (for 
drop 
See 
verbose 
Toggle warnings and messages. 
... 
Arguments passed to or from other methods. For instance, when
Further non-documented arguments are:

A data frame of indices related to the model's parameters.
library(parameters)
mydat <<- data.frame(
  effectsize = c(0.393, 0.675, 0.282, 1.398),
  stderr = c(0.317, 0.317, 0.13, 0.36)
)
if (require("metafor", quietly = TRUE)) {
  model <- rma(yi = effectsize, sei = stderr, method = "REML", data = mydat)
  model_parameters(model)
}

# with subgroups
if (require("metafor", quietly = TRUE)) {
  data(dat.bcg)
  dat <- escalc(
    measure = "RR",
    ai = tpos,
    bi = tneg,
    ci = cpos,
    di = cneg,
    data = dat.bcg
  )
  dat$alloc <- ifelse(dat$alloc == "random", "random", "other")
  d <<- dat
  model <- rma(yi, vi, mods = ~alloc, data = d, digits = 3, slab = author)
  model_parameters(model)
}

if (require("metaBMA", quietly = TRUE)) {
  data(towels)
  m <- suppressWarnings(meta_random(logOR, SE, study, data = towels))
  model_parameters(m)
}
Parameters from robust statistical objects in WRS2.
## S3 method for class 't1way' model_parameters(model, keep = NULL, verbose = TRUE, ...)
model 
Object from 
keep 
Character containing a regular expression pattern that
describes the parameters that should be included (for 
verbose 
Toggle warnings and messages. 
... 
Arguments passed to or from other methods. 
A data frame of indices related to the model's parameters.
if (require("WRS2") && packageVersion("WRS2") >= "1.1.3") {
  model <- t1way(libido ~ dose, data = viagra)
  model_parameters(model)
}
Parameters from zero-inflated models (from packages like pscl, cplm or countreg).
## S3 method for class 'zcpglm' model_parameters( model, ci = 0.95, bootstrap = FALSE, iterations = 1000, component = c("all", "conditional", "zi", "zero_inflated"), standardize = NULL, exponentiate = FALSE, p_adjust = NULL, summary = getOption("parameters_summary", FALSE), include_info = getOption("parameters_info", FALSE), keep = NULL, drop = NULL, verbose = TRUE, ... ) ## S3 method for class 'mhurdle' model_parameters( model, ci = 0.95, component = c("all", "conditional", "zi", "zero_inflated", "infrequent_purchase", "ip", "auxiliary"), exponentiate = FALSE, p_adjust = NULL, keep = NULL, drop = NULL, verbose = TRUE, ... )
model 
A model with a zero-inflation component.
ci 
Confidence Interval (CI) level. Default to 0.95 (95%).
bootstrap 
Should estimates be based on bootstrapped model? If

iterations 
The number of bootstrap replicates. This only applies in the case of bootstrapped frequentist models.
component 
Should all parameters, parameters for the conditional model,
for the zero-inflation part of the model, or the dispersion model be returned?
Applies to models with zero-inflation and/or dispersion component.
standardize 
The method used for standardizing the parameters. Can be

exponentiate 
Logical, indicating whether or not to exponentiate the
coefficients (and related confidence intervals). This is typical for
logistic regression, or more generally speaking, for models with log or
logit links. It is also recommended to use 
p_adjust 
Character vector, if not 
summary 
Deprecated, please use 
include_info 
Logical, if 
keep 
Character containing a regular expression pattern that
describes the parameters that should be included (for 
drop 
See 
verbose 
Toggle warnings and messages. 
... 
Arguments passed to or from other methods. For instance, when
Further non-documented arguments are:

A data frame of indices related to the model's parameters.
insight::standardize_names() to rename columns into a consistent, standardized naming scheme.
library(parameters)
if (require("pscl")) {
  data("bioChemists")
  model <- zeroinfl(art ~ fem + mar + kid5 + ment | kid5 + phd, data = bioChemists)
  model_parameters(model)
}
Similarly to n_factors() for factor / principal component analysis, n_clusters() is the main function to find out the optimal number of clusters present in the data based on the maximum consensus of a large number of methods.
Essentially, there exist many methods to determine the optimal number of clusters, each with pros and cons, benefits and limitations. The main n_clusters function proposes to run all of them, and find out the number of clusters that is suggested by the majority of methods (in case of ties, it will select the most parsimonious solution with fewer clusters).
Note that we also implement some specific, commonly used methods, like the Elbow or the Gap method, with their own visualization functionalities. See the examples below for more details.
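The consensus rule described above (majority vote, with ties resolved in favour of fewer clusters) can be sketched in base R. This is only an illustration of the stated rule, not the package's internal code: the helper name consensus_n and the vector of suggestions are hypothetical.

```r
# Hypothetical numbers of clusters suggested by nine different methods
# (illustrative values only, not the output of any real method)
suggested <- c(2, 3, 3, 2, 4, 3, 2, 5, 6)

# Hypothetical helper: pick the most frequently suggested number of clusters;
# in case of a tie, prefer the most parsimonious (smallest) solution
consensus_n <- function(votes) {
  counts <- table(votes)
  winners <- as.numeric(names(counts)[counts == max(counts)])
  min(winners)
}

consensus_n(suggested)
```

Here 2 and 3 clusters are each suggested three times, so the tie is resolved towards the smaller solution.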
n_clusters(
  x,
  standardize = TRUE,
  include_factors = FALSE,
  package = c("easystats", "NbClust", "mclust"),
  fast = TRUE,
  nbclust_method = "kmeans",
  n_max = 10,
  ...
)

n_clusters_elbow(
  x,
  standardize = TRUE,
  include_factors = FALSE,
  clustering_function = stats::kmeans,
  n_max = 10,
  ...
)

n_clusters_gap(
  x,
  standardize = TRUE,
  include_factors = FALSE,
  clustering_function = stats::kmeans,
  n_max = 10,
  gap_method = "firstSEmax",
  ...
)

n_clusters_silhouette(
  x,
  standardize = TRUE,
  include_factors = FALSE,
  clustering_function = stats::kmeans,
  n_max = 10,
  ...
)

n_clusters_dbscan(
  x,
  standardize = TRUE,
  include_factors = FALSE,
  method = c("kNN", "SS"),
  min_size = 0.1,
  eps_n = 50,
  eps_range = c(0.1, 3),
  ...
)

n_clusters_hclust(
  x,
  standardize = TRUE,
  include_factors = FALSE,
  distance_method = "correlation",
  hclust_method = "average",
  ci = 0.95,
  iterations = 100,
  ...
)
x 
A data frame. 
standardize 
Standardize the data frame before clustering (default).
include_factors 
Logical, if 
package 
Package from which methods are to be called to determine the
number of clusters. Can be 
fast 
If 
nbclust_method 
The clustering method (passed to 
n_max 
Maximal number of clusters to test. 
... 
Arguments passed to or from other methods. For instance, when
Further non-documented arguments are:

clustering_function, gap_method
Other arguments passed to other functions.
method, min_size, eps_n, eps_range
Arguments for the DBSCAN algorithm.
distance_method 
The distance method (passed to 
hclust_method 
The hierarchical clustering method (passed to 
ci 
Confidence Interval (CI) level. Default to 0.95 (95%).
iterations 
The number of bootstrap replicates. This only applies in the case of bootstrapped frequentist models.
There is also a plot() method implemented in the see package.
library(parameters)

# The main 'n_clusters' function ===============================
if (require("mclust", quietly = TRUE) && require("NbClust", quietly = TRUE) &&
  require("cluster", quietly = TRUE) && require("see", quietly = TRUE)) {
  n <- n_clusters(iris[, 1:4], package = c("NbClust", "mclust")) # package can be "all"
  n
  summary(n)
  as.data.frame(n) # Duration is the time elapsed for each method in seconds
  plot(n)

  # The following runs all methods but is significantly slower
  # n_clusters(iris[1:4], standardize = FALSE, package = "all", fast = FALSE)
}

# Elbow method
x <- n_clusters_elbow(iris[1:4])
x
as.data.frame(x)
plot(x)

# Gap method
if (require("see", quietly = TRUE) &&
  require("cluster", quietly = TRUE) &&
  require("factoextra", quietly = TRUE)) {
  x <- n_clusters_gap(iris[1:4])
  x
  as.data.frame(x)
  plot(x)
}

# Silhouette method
if (require("factoextra", quietly = TRUE)) {
  x <- n_clusters_silhouette(iris[1:4])
  x
  as.data.frame(x)
  plot(x)
}

# DBSCAN method
if (require("dbscan", quietly = TRUE)) {
  # NOTE: This actually primarily estimates the 'eps' parameter, the number of
  # clusters is a side effect (it's the number of clusters corresponding to
  # this 'optimal' EPS parameter).
  x <- n_clusters_dbscan(iris[1:4], method = "kNN", min_size = 0.05) # 5 percent
  x
  head(as.data.frame(x))
  plot(x)

  x <- n_clusters_dbscan(iris[1:4], method = "SS", eps_n = 100, eps_range = c(0.1, 2))
  x
  head(as.data.frame(x))
  plot(x)
}

# hclust method
if (require("pvclust", quietly = TRUE)) {
  # iterations should be higher for real analyses
  x <- n_clusters_hclust(iris[1:4], iterations = 50, ci = 0.90)
  x
  head(as.data.frame(x), n = 10) # Print 10 first rows
  plot(x)
}
This function runs many existing procedures for determining how many factors to retain/extract from factor analysis (FA) or dimension reduction (PCA). It returns the number of factors based on the maximum consensus between methods. In case of ties, it will keep the simplest model and select the solution with fewer factors.
n_factors(
  x,
  type = "FA",
  rotation = "varimax",
  algorithm = "default",
  package = c("nFactors", "psych"),
  cor = NULL,
  safe = TRUE,
  n_max = NULL,
  ...
)

n_components(
  x,
  type = "PCA",
  rotation = "varimax",
  algorithm = "default",
  package = c("nFactors", "psych"),
  cor = NULL,
  safe = TRUE,
  ...
)
x 
A data frame. 
type 
Can be 
rotation 
Only used for VSS (Very Simple Structure criterion, see

algorithm 
Factoring method used by VSS. Can be 
package 
Package from which respective methods are used. Can be

cor 
An optional correlation matrix that can be used (note that the
data must still be passed as the first argument). If 
safe 
If 
n_max 
If set to a value (e.g., 
... 
Arguments passed to or from other methods. 
n_components() is actually an alias for n_factors(), with different defaults for the function arguments.
A data frame.
There is also a plot() method implemented in the see package.
n_components() is a convenient shortcut for n_factors(type = "PCA").
Bartlett, M. S. (1950). Tests of significance in factor analysis. British Journal of Statistical Psychology, 3(2), 77-85.
Bentler, P. M., & Yuan, K. H. (1996). Test of linear trend in eigenvalues of a covariance matrix with application to data analysis. British Journal of Mathematical and Statistical Psychology, 49(2), 299-312.
Cattell, R. B. (1966). The scree test for the number of factors. Multivariate Behavioral Research, 1(2), 245-276.
Finch, W. H. (2019). Using Fit Statistic Differences to Determine the Optimal Number of Factors to Retain in an Exploratory Factor Analysis. Educational and Psychological Measurement.
Zoski, K. W., & Jurs, S. (1996). An objective counterpart to the visual scree test for factor analysis: The standard error scree. Educational and Psychological Measurement, 56(3), 443-451.
Zoski, K., & Jurs, S. (1993). Using multiple regression to determine the number of factors to retain in factor analysis. Multiple Linear Regression Viewpoints, 20(1), 5-9.
Nasser, F., Benson, J., & Wisenbaker, J. (2002). The performance of regression-based variations of the visual scree for determining the number of common factors. Educational and Psychological Measurement, 62(3), 397-419.
Golino, H., Shi, D., Garrido, L. E., Christensen, A. P., Nieto, M. D., Sadana, R., & Thiyagarajan, J. A. (2018). Investigating the performance of Exploratory Graph Analysis and traditional techniques to identify the number of latent factors: A simulation and tutorial.
Golino, H. F., & Epskamp, S. (2017). Exploratory graph analysis: A new approach for estimating the number of dimensions in psychological research. PloS one, 12(6), e0174035.
Revelle, W., & Rocklin, T. (1979). Very simple structure: An alternative procedure for estimating the optimal number of interpretable factors. Multivariate Behavioral Research, 14(4), 403-414.
Velicer, W. F. (1976). Determining the number of components from the matrix of partial correlations. Psychometrika, 41(3), 321-327.
library(parameters)
n_factors(mtcars, type = "PCA")

result <- n_factors(mtcars[1:5], type = "FA")
as.data.frame(result)
summary(result)

# Setting package = 'all' will increase the number of methods (but is slow)
n_factors(mtcars, type = "PCA", package = "all")
n_factors(mtcars, type = "FA", algorithm = "mle", package = "all")
Compute calibrated p-values that can be interpreted probabilistically, i.e. as posterior probability of H0 (given that H0 and H1 have equal prior probabilities).
p_calibrate(x, ...)

## Default S3 method:
p_calibrate(x, type = "frequentist", verbose = TRUE, ...)
x 
A numeric vector of p-values, or a regression model object.
... 
Currently not used. 
type 
Type of calibration. Can be "frequentist" or "bayesian".
verbose 
Toggle warnings. 
The Bayesian calibration, i.e. when type = "bayesian", can be interpreted as the lower bound of the Bayes factor for H0 to H1, based on the data. The full Bayes factor would then require multiplying by the prior odds of H0 to H1. The frequentist calibration also has a Bayesian interpretation; it is the posterior probability of H0, assuming that H0 and H1 have equal prior probabilities of 0.5 each (Sellke et al. 2001).
The calibration only works for p-values lower than or equal to 1/e.
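As a rough illustration of the two calibrations, the following base R sketch uses the lower bound on the Bayes factor given by Sellke et al. (2001), B(p) = -e * p * log(p), and the corresponding posterior probability 1 / (1 + 1 / B(p)). The helper name calibrate_p is hypothetical, and it is an assumption that p_calibrate() follows exactly these formulas in every detail.

```r
# Hypothetical helper illustrating the calibrations from Sellke et al. (2001);
# this is a sketch, not the implementation used by p_calibrate()
calibrate_p <- function(p, type = c("frequentist", "bayesian")) {
  type <- match.arg(type)
  # the calibration is only defined for p-values <= 1/e
  p[p > exp(-1)] <- NA
  # lower bound of the Bayes factor for H0 to H1
  bf_bound <- -exp(1) * p * log(p)
  if (type == "bayesian") {
    bf_bound
  } else {
    # posterior probability of H0, assuming prior probabilities of 0.5 each
    1 / (1 + 1 / bf_bound)
  }
}

calibrate_p(c(0.05, 0.01, 0.5))
calibrate_p(c(0.05, 0.01, 0.5), type = "bayesian")
```

The last p-value exceeds 1/e, so the sketch returns NA for it in both cases.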
A data frame with p-values and calibrated p-values.
Thomas Sellke, M. J Bayarri and James O Berger (2001) Calibration of p Values for Testing Precise Null Hypotheses, The American Statistician, 55:1, 62-71, doi:10.1198/000313001300339950
model <- lm(mpg ~ wt + as.factor(gear) + am, data = mtcars)
p_calibrate(model, verbose = FALSE)
Compute the Probability of Direction (pd, also known as the Maximum Probability of Effect, MPE). This can be interpreted as the probability that a parameter (described by its full confidence, or "compatibility" interval) is strictly positive or negative (whichever is the most probable). Although differently expressed, this index is fairly similar (i.e., is strongly correlated) to the frequentist p-value (see 'Details').
## S3 method for class 'lm'
p_direction(
  x,
  ci = 0.95,
  method = "direct",
  null = 0,
  vcov = NULL,
  vcov_args = NULL,
  ...
)
x 
A statistical model. 
ci 
Confidence Interval (CI) level. Default to 0.95 (95%).
method 
Can be 
null 
The value considered as a "null" effect. Traditionally 0, but could also be 1 in the case of ratios of change (OR, IRR, ...). 
vcov 
Variance-covariance matrix used to compute uncertainty estimates (e.g., for robust standard errors). This argument accepts a covariance matrix, a function which returns a covariance matrix, or a string which identifies the function to be used to compute the covariance matrix.

vcov_args 
List of arguments to be passed to the function identified by the vcov argument.
... 
Arguments passed to other methods, e.g. 
A data frame.
The Probability of Direction (pd) is an index of effect existence, representing the certainty with which an effect goes in a particular direction (i.e., is positive or negative / has a sign), typically ranging from 0.5 to 1 (but see next section for cases where it can range between 0 and 1). Beyond its simplicity of interpretation, understanding and computation, this index also presents other interesting properties:
Like other posterior-based indices, pd is solely based on the posterior distributions and does not require any additional information from the data or the model (e.g., such as priors, as in the case of Bayes factors).
It is robust to the scale of both the response variable and the predictors.
It is strongly correlated with the frequentist p-value, and can thus be used to draw parallels and give some reference to readers not familiar with Bayesian statistics (Makowski et al., 2019).
In most cases, it seems that the pd has a direct correspondence with the frequentist one-sided p-value through the formula (for two-sided p):
p = 2 * (1 - p_{d})
Thus, a two-sided p-value of respectively .1, .05, .01 and .001 would correspond approximately to a pd of 95%, 97.5%, 99.5% and 99.95%.
See pd_to_p() for details.
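The correspondence can be checked with a few lines of base R. The two helper functions below are hypothetical names written only for this illustration of the formula; for actual analyses, the pd_to_p() function referenced above is the place to look.

```r
# Hypothetical helpers implementing p = 2 * (1 - pd) and its inverse
pd_to_p_two_sided <- function(pd) 2 * (1 - pd)
p_to_pd_two_sided <- function(p) 1 - p / 2

pd <- c(0.95, 0.975, 0.995, 0.9995)

# two-sided p-values corresponding to the pd values above
pd_to_p_two_sided(pd)

# and back again
p_to_pd_two_sided(c(0.1, 0.05, 0.01, 0.001))
```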
The largest value pd can take is 1, in which case the posterior is strictly directional. However, the smallest value pd can take depends on the parameter space represented by the posterior.
For a continuous parameter space, exact values of 0 (or any point null value) are not possible, and so 100% of the posterior has some sign, some positive, some negative. Therefore, the smallest the pd can be is 0.5, with an equal posterior mass of positive and negative values. Values close to 0.5 cannot be used to support the null hypothesis (that the parameter does not have a direction), in a similar way to how large p-values cannot be used to support the null hypothesis (see pd_to_p(); Makowski et al., 2019).
For a discrete parameter space or a parameter space that is a mixture between discrete and continuous spaces, exact values of 0 (or any point null value) are possible! Therefore, the smallest the pd can be is 0, with 100% of the posterior mass on 0. Thus values close to 0 can be used to support the null hypothesis (see van den Bergh et al., 2021).
Examples of posteriors representing discrete parameter space:
When a parameter can only take discrete values.
When a mixture prior/posterior is used (such as the spike-and-slab prior; see van den Bergh et al., 2021).
When conducting Bayesian model averaging (e.g., weighted_posteriors() or brms::posterior_average).
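The difference between continuous and mixed parameter spaces can be made concrete with simulated draws. The sketch below computes pd directly as the larger of the two proportions of draws above and below the null value; the helper pd_from_draws and the simulated "posteriors" are hypothetical and only serve as an illustration, assuming this direct definition of pd.

```r
# Hypothetical helper: pd as the larger share of draws on either side of the null
pd_from_draws <- function(draws, null = 0) {
  max(mean(draws > null), mean(draws < null))
}

set.seed(123)

# Continuous parameter space: every draw has a sign, so pd is at least 0.5
continuous <- rnorm(10000, mean = 0.5, sd = 1)

# Mixture of a point mass at 0 (70% of draws) and a continuous part,
# loosely mimicking a spike-and-slab posterior: pd can fall below 0.5
spike_slab <- c(rep(0, 7000), rnorm(3000, mean = 0.5, sd = 1))

pd_from_draws(continuous)
pd_from_draws(spike_slab)

# All posterior mass exactly on the null value: pd is 0
pd_from_draws(rep(0, 100))
```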
There is no standardized approach to drawing conclusions based on the available data and statistical models. A frequently chosen but also much criticized approach is to evaluate results based on their statistical significance (Amrhein et al. 2017).
A more sophisticated way would be to test whether estimated effects exceed the "smallest effect size of interest", to avoid even the smallest effects being considered relevant simply because they are statistically significant, but clinically or practically irrelevant (Lakens et al. 2018, Lakens 2024).
A rather unconventional approach, which is nevertheless advocated by various authors, is to interpret results from classical regression models either in terms of probabilities, similar to the usual approach in Bayesian statistics (Schweder 2018; Schweder and Hjort 2003; Vos 2022) or in terms of relative measure of "evidence" or "compatibility" with the data (Greenland et al. 2022; Rafi and Greenland 2020), which nevertheless comes close to a probabilistic interpretation.
A more detailed discussion of this topic is found in the documentation of p_function().
The parameters package provides several options or functions to aid statistical inference. These are, for example:
equivalence_test(), to compute the (conditional) equivalence test for frequentist models
p_significance(), to compute the probability of practical significance, which can be conceptualized as a unidirectional equivalence test
p_function(), or consonance function, to compute p-values and compatibility (confidence) intervals for statistical models
the pd argument (setting pd = TRUE) in model_parameters() includes a column with the probability of direction, i.e. the probability that a parameter is strictly positive or negative. See bayestestR::p_direction() for details. If plotting is desired, the p_direction() function can be used, together with plot().
the s_value argument (setting s_value = TRUE) in model_parameters() replaces the p-values with their related S-values (Rafi and Greenland 2020)
finally, it is possible to generate distributions of model coefficients by generating bootstrap samples (setting bootstrap = TRUE) or simulating draws from model coefficients using simulate_model(). These samples can then be treated as "posterior samples" and used in many functions from the bayestestR package.
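A short sketch of two of these options follows. It first computes S-values by hand as -log2(p), the definition given by Rafi and Greenland (2020), and then calls model_parameters() with the pd and s_value arguments described above. The model and the choice of the mtcars data are illustrative assumptions, and the package calls only run if parameters is installed.

```r
# Illustrative model (the choice of data and predictors is arbitrary)
model <- lm(mpg ~ wt + hp, data = mtcars)

# S-values computed by hand: s = -log2(p), measured in bits of information
p <- summary(model)$coefficients[, "Pr(>|t|)"]
s_by_hand <- -log2(p)
s_by_hand

if (requireNamespace("parameters", quietly = TRUE)) {
  # adds a column with the probability of direction
  print(parameters::model_parameters(model, pd = TRUE))
  # replaces p-values with S-values
  print(parameters::model_parameters(model, s_value = TRUE))
}
```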
Most of the options and functions shown above derive from methods originally implemented for Bayesian models (Makowski et al. 2019). However, assuming that model assumptions are met (which means, the model fits well to the data, the correct model is chosen that reflects the data generating process (distributional model family) etc.), it seems appropriate to interpret results from classical frequentist models in a "Bayesian way" (more details: documentation in p_function()).
Amrhein, V., Korner-Nievergelt, F., and Roth, T. (2017). The earth is flat (p > 0.05): Significance thresholds and the crisis of unreplicable research. PeerJ, 5, e3544. doi:10.7717/peerj.3544
Greenland S, Rafi Z, Matthews R, Higgs M. To Aid Scientific Inference, Emphasize Unconditional Compatibility Descriptions of Statistics. (2022) https://arxiv.org/abs/1909.08583v7 (Accessed November 10, 2022)
Lakens, D. (2024). Improving Your Statistical Inferences (Version v1.5.1). Retrieved from https://lakens.github.io/statistical_inferences/. doi:10.5281/ZENODO.6409077
Lakens, D., Scheel, A. M., and Isager, P. M. (2018). Equivalence Testing for Psychological Research: A Tutorial. Advances in Methods and Practices in Psychological Science, 1(2), 259–269. doi:10.1177/2515245918770963
Makowski, D., Ben-Shachar, M. S., Chen, S. H. A., and Lüdecke, D. (2019). Indices of Effect Existence and Significance in the Bayesian Framework. Frontiers in Psychology, 10, 2767. doi:10.3389/fpsyg.2019.02767
Rafi Z, Greenland S. Semantic and cognitive tools to aid statistical science: replace confidence and significance by compatibility and surprise. BMC Medical Research Methodology (2020) 20:244.
Schweder T. Confidence is epistemic probability for empirical science. Journal of Statistical Planning and Inference (2018) 195:116–125. doi:10.1016/j.jspi.2017.09.016
Schweder T, Hjort NL. Frequentist analogues of priors and posteriors. In Stigum, B. (ed.), Econometrics and the Philosophy of Economics: Theory Data Confrontation in Economics, pp. 285-217. Princeton University Press, Princeton, NJ, 2003
Vos P, Holbert D. Frequentist statistical inference without repeated sampling. Synthese 200, 89 (2022). doi:10.1007/s11229-022-03560-x
See also equivalence_test(), p_function() and p_significance() for functions related to checking effect existence and significance.
data(qol_cancer)
model <- lm(QoL ~ time + age + education, data = qol_cancer)
p_direction(model)

# based on heteroscedasticity-robust standard errors
p_direction(model, vcov = "HC3")

result <- p_direction(model)
plot(result)
Compute p-values and compatibility (confidence) intervals for statistical models, at different levels. This function is also called consonance function. It shows which estimates are compatible with the model at various compatibility levels. Use plot() to generate plots of the p-function (or consonance function) and compatibility intervals at different levels.
p_function(
  model,
  ci_levels = c(0.25, 0.5, 0.75, emph = 0.95),
  exponentiate = FALSE,
  effects = "fixed",
  component = "all",
  vcov = NULL,
  vcov_args = NULL,
  keep = NULL,
  drop = NULL,
  verbose = TRUE,
  ...
)

consonance_function(
  model,
  ci_levels = c(0.25, 0.5, 0.75, emph = 0.95),
  exponentiate = FALSE,
  effects = "fixed",
  component = "all",
  vcov = NULL,
  vcov_args = NULL,
  keep = NULL,
  drop = NULL,
  verbose = TRUE,
  ...
)

confidence_curve(
  model,
  ci_levels = c(0.25, 0.5, 0.75, emph = 0.95),
  exponentiate = FALSE,
  effects = "fixed",
  component = "all",
  vcov = NULL,
  vcov_args = NULL,
  keep = NULL,
  drop = NULL,
  verbose = TRUE,
  ...
)
model 
Statistical Model. 
ci_levels 
Vector of scalars, indicating the different levels at which compatibility intervals should be printed or plotted. In plots, these levels are highlighted by vertical lines. It is possible to increase thickness for one or more of these lines by providing a named vector, where the values to be highlighted should be named emph, e.g. ci_levels = c(0.25, 0.5, emph = 0.95).
exponentiate 
Logical, indicating whether or not to exponentiate the
coefficients (and related confidence intervals). This is typical for
logistic regression, or more generally speaking, for models with log or
logit links. It is also recommended to use 
effects 
Should parameters for fixed effects ( 
component 
Should all parameters, parameters for the conditional model,
for the zero-inflation part of the model, or the dispersion model be returned?
Applies to models with zero-inflation and/or dispersion component.
vcov 
Variance-covariance matrix used to compute uncertainty estimates (e.g., for robust standard errors). This argument accepts a covariance matrix, a function which returns a covariance matrix, or a string which identifies the function to be used to compute the covariance matrix.

vcov_args 
List of arguments to be passed to the function identified by the vcov argument.
keep 
Character containing a regular expression pattern that
describes the parameters that should be included (for 
drop 
See keep.
verbose 
Toggle warnings and messages. 
... 
Arguments passed to or from other methods. Non-documented arguments are

p_function() only returns the compatibility interval estimates, not the related p-values. The reason is that the p-value for a given estimate value is just 1 - CI_level. The values indicating the lower and upper limits of the intervals are the related estimates associated with the p-value. E.g., if a parameter x has a 75% compatibility interval of (0.81, 1.05), then the p-value for the estimate value of 0.81 would be 1 - 0.75, which is 0.25. This relationship is more intuitive and easier to understand when looking at the plots (using plot()).
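The relationship between interval limits and p-values can be verified by hand in base R. The sketch below assumes an ordinary linear model with t-based intervals, as returned by confint(); the model itself and the helper p_at are hypothetical choices made for illustration.

```r
# Illustrative model (the choice of data and predictor is arbitrary)
model <- lm(mpg ~ wt, data = mtcars)

est <- coef(model)[["wt"]]
se <- sqrt(diag(vcov(model)))[["wt"]]
df <- df.residual(model)

# 75% compatibility (confidence) interval for the slope
level <- 0.75
ci <- confint(model, "wt", level = level)

# Hypothetical helper: two-sided p-value for the test hypothesis
# "the true slope equals `value`"
p_at <- function(value) 2 * pt(-abs((est - value) / se), df)

p_at(ci[1]) # lower limit: approximately 1 - level
p_at(ci[2]) # upper limit: approximately 1 - level
p_at(est)   # the point estimate itself has a p-value of 1
```

Repeating this for several levels traces out the curve that plot() draws for a p_function() result.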
p_function(), and in particular its plot() method, aims at re-interpreting p-values and confidence intervals (better named: compatibility intervals) in unconditional terms. Instead of referring to the long-term property and repeated trials when interpreting interval estimates (so-called "aleatory probability", Schweder 2018), and assuming that all underlying assumptions are correct and met, p_function() interprets p-values in a Fisherian way as "continuous measure of evidence against the very test hypothesis and entire model (all assumptions) used to compute it" (P-Values Are Tough and S-Values Can Help, lesslikely.com/statistics/s-values; see also Amrhein and Greenland 2022).
This interpretation as a continuous measure of evidence against the test hypothesis and the entire model used to compute it is illustrated in a figure from P-Values Are Tough and S-Values Can Help (lesslikely.com/statistics/s-values). The "conditional" interpretation of p-values and interval estimates (A) implicitly assumes certain assumptions to be true, thus the interpretation is "conditioned" on these assumptions (i.e. assumptions are taken as given). The unconditional interpretation (B), however, questions all these assumptions.
"Emphasizing unconditional interpretations helps avoid overconfident and misleading inferences in light of uncertainties about the assumptions used to arrive at the statistical results." (Greenland et al. 2022).
Note: The term "conditional" as used by Rafi and Greenland probably has a slightly different meaning than usual. "Conditional" in this notion means that all model assumptions are taken as given; it should not be confused with terms like "conditional probability". See also Greenland et al. 2022 for a detailed elaboration on this issue.
In other words, the term compatibility interval emphasizes "the dependence of the p-value on the assumptions as well as on the data, recognizing that p<0.05 can arise from assumption violations even if the effect under study is null" (Gelman/Greenland 2019).
Schweder (2018) and Schweder and Hjort (2016) (and others) argue that confidence curves (as produced by p_function()) have a valid probabilistic interpretation. They distinguish between two kinds of probability. Aleatory probability describes the aleatory stochastic element of a distribution ex ante, i.e. before the data are obtained. This is the classical interpretation of confidence intervals following the Neyman-Pearson school of statistics. However, there is also an ex post probability, called epistemic probability, for confidence curves. The shift in terminology from confidence intervals to compatibility intervals may help emphasize this interpretation.
In this sense, the probabilistic interpretation of p-values and compatibility intervals is "conditional" on the data and model assumptions (which is in line with the "unconditional" interpretation in the sense of Rafi and Greenland).
Ascribing a probabilistic interpretation to one realized confidence interval is possible without repeated sampling of the specific experiment. Important is the assumption that a sampling distribution is a good description of the variability of the parameter (Vos and Holbert 2022). At the core, the interpretation of a confidence interval is "I assume that this sampling distribution is a good description of the uncertainty of the parameter. If that's a good assumption, then the values in this interval are the most plausible or compatible with the data". The source of confidence in probability statements is the assumption that the selected sampling distribution is appropriate.
"The realized confidence distribution is clearly an epistemic probability distribution" (Schweder 2018). In Bayesian words, compatibility intervals (or confidence distributions, or consonance curves) are "posteriors without priors" (Schweder and Hjort 2003).
The p-value indicates the degree of compatibility of the endpoints of the interval at a given confidence level with (1) the observed data and (2) model assumptions. The observed point estimate (p-value = 1) is the value estimated to be most compatible with the data and model assumptions, whereas values far from the observed point estimate (where p approaches 0) are least compatible with the data and model assumptions (Schweder and Hjort 2016, pp. 60-61; Amrhein and Greenland 2022). In this regard, p-values are statements about confidence or compatibility:
The pvalue is not an absolute measure of evidence for a model (such as the
null/alternative model), it is a continuous measure of the compatibility of
the observed data with the model used to compute it (Greenland et al. 2016,
Greenland 2023). Going one step further, and following Schweder, pvalues
can be considered as epistemic probability  "not necessarily of the
hypothesis being true, but of it possibly being true" (Schweder 2018).
Hence, the interpretation of pvalues might be guided using
bayestestR::p_to_pd()
.
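As a rough illustration of this compatibility reading, the following base R sketch computes intervals at several confidence levels for one coefficient and checks that the p-value for testing each interval endpoint equals one minus the confidence level. It relies only on confint() and the t-distribution for a linear model; it is not meant to mirror how p_function() works internally, and the chosen levels and the coefficient are arbitrary assumptions.

```r
# Linear model from the iris data
model <- lm(Sepal.Length ~ Species, data = iris)
term <- "Speciesversicolor"

est <- coef(model)[[term]]
se <- sqrt(diag(vcov(model)))[[term]]
df <- df.residual(model)

# Compatibility intervals at several (arbitrarily chosen) levels
levels <- c(0.25, 0.5, 0.75, 0.95)
intervals <- t(sapply(levels, function(l) confint(model, term, level = l)))
colnames(intervals) <- c("lower", "upper")

# Two-sided p-value for the hypothesis "coefficient = upper endpoint"
p_at_upper <- 2 * pt(abs((est - intervals[, "upper"]) / se), df, lower.tail = FALSE)

# Each endpoint has a p-value of (1 - confidence level)
cbind(level = levels, intervals, p_at_upper)
```

Narrower intervals (lower confidence levels) have endpoints with higher p-values, i.e. they contain the values that are most compatible with the data and the model.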
We here presented the discussion of p-values and confidence intervals from the perspective of two paradigms, one saying that probability statements can be made, one saying that interpretation is guided in terms of "compatibility". Cox and Hinkley say, "interval estimates cannot be taken as probability statements" (Cox and Hinkley 1979: 208), which conflicts with the Schweder and Hjort confidence distribution school. However, if you view interval estimates as being intervals of values being consistent with the data, this comes close to the idea of (epistemic) probability. We do not believe that these two paradigms contradict or exclude each other. Rather, the aim is to emphasize one point of view or the other, i.e. to place the linguistic nuances either on 'compatibility' or 'probability'.
The main takeaway is not to interpret p-values as dichotomous decisions that distinguish between "we found an effect" (statistically significant) vs. "we found no effect" (statistically not significant) (Altman and Bland 1995).
It is confusing and unfortunate that the term "conditional" is used with different meanings in statistics. Thus, we would summarize the (probabilistic) interpretation of compatibility intervals as follows: The intervals are built from the data and our modeling assumptions. The accuracy of the intervals depends on our model assumptions. If a value is outside the interval, that might be because (1) that parameter value isn't supported by the data, or (2) the modeling assumptions are a poor fit for the situation. When we make bad assumptions, the compatibility interval might be too wide or (more commonly and seriously) too narrow, making us think we know more about the parameter than is warranted.
When we say "there is a 95% chance the true value is in the interval", that is a statement of epistemic probability (i.e. description of uncertainty related to our knowledge or belief). When we talk about repeated samples or sampling distributions, that is referring to aleatoric (physical properties) probability. Frequentist inference is built on defining estimators with known aleatoric probability properties, from which we can draw epistemic probabilistic statements of uncertainty (Schweder and Hjort 2016).
The parameters package provides several options or functions to aid statistical inference. Beyond p_function(), there are, for example:
equivalence_test(), to compute the (conditional) equivalence test for frequentist models
p_significance(), to compute the probability of practical significance, which can be conceptualized as a unidirectional equivalence test
the pd argument (setting pd = TRUE) in model_parameters() includes a column with the probability of direction, i.e. the probability that a parameter is strictly positive or negative. See bayestestR::p_direction() for details. If plotting is desired, the p_direction() function can be used, together with plot().
the s_value argument (setting s_value = TRUE) in model_parameters() replaces the p-values with their related S-values (Rafi and Greenland 2020)
finally, it is possible to generate distributions of model coefficients by generating bootstrap samples (setting bootstrap = TRUE) or simulating draws from model coefficients using simulate_model(). These samples can then be treated as "posterior samples" and used in many functions from the bayestestR package.
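To make the S-value option more concrete, here is a minimal base R sketch of the transformation described by Rafi and Greenland (2020), where the S-value is the negative base-2 logarithm of the p-value. The model is an arbitrary example, and the code only illustrates the transformation, not the internals of model_parameters().

```r
# p-values of an (arbitrary) linear model
model <- lm(mpg ~ wt + hp, data = mtcars)
p <- summary(model)$coefficients[, "Pr(>|t|)"]

# S-value: bits of information against the test hypothesis
s <- -log2(p)

round(cbind(p = p, S = s), 3)
```

A p-value of 0.5 corresponds to one bit of information (about as surprising as one coin toss landing heads), while p = 0.05 corresponds to roughly 4.3 bits.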
A data frame with p-values and compatibility intervals.
Currently, p_function() computes intervals based on Wald t- or z-statistics. For certain models (like mixed models), profiled intervals may be more accurate; however, this is currently not supported.
Altman DG, Bland JM. Absence of evidence is not evidence of absence. BMJ. 1995;311(7003):485. doi:10.1136/bmj.311.7003.485
Amrhein V, Greenland S. Discuss practical importance of results based on interval estimates and p-value functions, not only on point estimates and null p-values. Journal of Information Technology 2022;37:316–20. doi:10.1177/02683962221105904
Cox DR, Hinkley DV. 1979. Theoretical Statistics. 6th edition. Chapman and Hall/CRC
Fraser DAS. The P-value function and statistical inference. The American Statistician. 2019;73(sup1):135-147. doi:10.1080/00031305.2018.1556735
Gelman A, Greenland S. Are confidence intervals better termed "uncertainty intervals"? BMJ (2019) l5381. doi:10.1136/bmj.l5381
Greenland S, Rafi Z, Matthews R, Higgs M. To Aid Scientific Inference, Emphasize Unconditional Compatibility Descriptions of Statistics. (2022) https://arxiv.org/abs/1909.08583v7 (Accessed November 10, 2022)
Greenland S, Senn SJ, Rothman KJ, Carlin JB, Poole C, Goodman SN, et al. (2016). Statistical tests, P values, confidence intervals, and power: A guide to misinterpretations. European Journal of Epidemiology. 31:337-350. doi:10.1007/s10654-016-0149-3
Greenland S (2023). Divergence versus decision P-values: A distinction worth making in theory and keeping in practice: Or, how divergence P-values measure evidence even when decision P-values do not. Scand J Statist, 50(1), 54-88.
Rafi Z, Greenland S. Semantic and cognitive tools to aid statistical science: Replace confidence and significance by compatibility and surprise. BMC Medical Research Methodology. 2020;20(1):244. doi:10.1186/s12874-020-01105-9
Schweder T. Confidence is epistemic probability for empirical science. Journal of Statistical Planning and Inference (2018) 195:116–125. doi:10.1016/j.jspi.2017.09.016
Schweder T, Hjort NL. Confidence and Likelihood. Scandinavian Journal of Statistics. 2002;29(2):309-332. doi:10.1111/1467-9469.00285
Schweder T, Hjort NL. Frequentist analogues of priors and posteriors. In Stigum, B. (ed.), Econometrics and the Philosophy of Economics: Theory Data Confrontation in Economics, pp. 285-317. Princeton University Press, Princeton, NJ, 2003
Schweder T, Hjort NL. Confidence, Likelihood, Probability: Statistical inference with confidence distributions. Cambridge University Press, 2016.
Vos P, Holbert D. Frequentist statistical inference without repeated sampling. Synthese 200, 89 (2022). doi:10.1007/s11229-022-03560-x
See also equivalence_test()
and p_significance()
for
functions related to checking effect existence and significance.
model <- lm(Sepal.Length ~ Species, data = iris)
p_function(model)

model <- lm(mpg ~ wt + as.factor(gear) + am, data = mtcars)
result <- p_function(model)

# single panels
plot(result, n_columns = 2)

# integrated plot, the default
plot(result)
Compute the probability of Practical Significance (ps), which can be conceptualized as a unidirectional equivalence test. It returns the probability that an effect is above a given threshold corresponding to a negligible effect in the median's direction, considering a parameter's full confidence interval. In other words, it returns the probability of a clear direction of an effect, which is larger than the smallest effect size of interest (e.g., a minimal important difference). Its theoretical range is from zero to one, but the ps is typically larger than 0.5 (to indicate practical significance).
In comparison to the equivalence_test() function, where the SGPV (second generation p-value) describes the proportion of the full confidence interval that is inside the ROPE, the value returned by p_significance() describes the larger proportion of the full confidence interval that is outside the ROPE. This makes p_significance() comparable to bayestestR::p_direction(). However, while p_direction() compares to a point-null by default, p_significance() compares to a range-null.
## S3 method for class 'lm' p_significance( x, threshold = "default", ci = 0.95, vcov = NULL, vcov_args = NULL, verbose = TRUE, ... )
x 
A statistical model. 
threshold 
The threshold value that separates significant from negligible effect, which can have following possible values:

ci 
Confidence Interval (CI) level. Default to 
vcov 
Variance-covariance matrix used to compute uncertainty estimates (e.g., for robust standard errors). This argument accepts a covariance matrix, a function which returns a covariance matrix, or a string which identifies the function to be used to compute the covariance matrix.

vcov_args 
List of arguments to be passed to the function identified by
the 
verbose 
Toggle warnings and messages. 
... 
Arguments passed to other methods. 
p_significance()
returns the proportion of the full confidence
interval range (assuming a normally or t-distributed, equal-tailed interval,
based on the model) that is outside a certain range (the negligible effect,
or ROPE, see argument threshold
). If there are values of the distribution
both below and above the ROPE, p_significance()
returns the higher
probability of a value being outside the ROPE. Typically, this value should
be larger than 0.5 to indicate practical significance. However, if the range
of the negligible effect is rather large compared to the range of the
confidence interval, p_significance()
will be less than 0.5, which
indicates no clear practical significance.
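The idea can be sketched analytically in base R: assume a t-distribution centered on the estimate, then take the larger of the two tail areas that fall outside the ROPE. This is only an approximation of the concept under stated assumptions; p_significance() itself may arrive at its result differently (for example by working with a simulated distribution), and the ROPE limits of -0.5 and 0.5 are hypothetical values chosen for illustration.

```r
# Simple linear model; we look at the coefficient of "wt"
model <- lm(mpg ~ wt, data = mtcars)
est <- coef(model)[["wt"]]
se <- sqrt(diag(vcov(model)))[["wt"]]
df <- df.residual(model)

# Hypothetical range of negligible effects (ROPE)
rope <- c(-0.5, 0.5)

# Probability mass of the assumed t-distribution below and above the ROPE
p_below <- pt((rope[1] - est) / se, df)
p_above <- pt((rope[2] - est) / se, df, lower.tail = FALSE)

# Practical significance: the larger of the two proportions
ps <- max(p_below, p_above)
ps
```

Widening the hypothetical ROPE moves more of the distribution inside the negligible range and therefore lowers the resulting value.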
Note that the assumed interval, which is used to calculate the practical significance, is an estimation of the full interval based on the chosen confidence level. For example, if the 95% confidence interval of a coefficient ranges from -1 to 1, the underlying full (normally or t-distributed) interval approximately ranges from -1.9 to 1.9, see also the following code:
# simulate full normal distribution
out <- bayestestR::distribution_normal(10000, 0, 0.5)
# range of "full" distribution
range(out)
# range of 95% CI
round(quantile(out, probs = c(0.025, 0.975)), 2)
This ensures that the practical significance always refers to the general compatible parameter space of coefficients. Therefore, the full interval is similar to a Bayesian posterior distribution of an equivalent Bayesian model, see following code:
library(bayestestR)
library(brms)
m <- lm(mpg ~ gear + wt + cyl + hp, data = mtcars)
m2 <- brm(mpg ~ gear + wt + cyl + hp, data = mtcars)
# probability of significance (ps) for frequentist model
p_significance(m)
# similar to ps of Bayesian models
p_significance(m2)
# similar to ps of simulated draws / bootstrap samples
p_significance(simulate_model(m))
A data frame with columns for the parameter names, the confidence intervals and the values for practical significance. Higher values indicate more practical significance (upper bound is one).
There is no standardized approach to drawing conclusions based on the available data and statistical models. A frequently chosen but also much criticized approach is to evaluate results based on their statistical significance (Amrhein et al. 2017).
A more sophisticated way would be to test whether estimated effects exceed the "smallest effect size of interest", to avoid even the smallest effects being considered relevant simply because they are statistically significant, but clinically or practically irrelevant (Lakens et al. 2018, Lakens 2024).
A rather unconventional approach, which is nevertheless advocated by various authors, is to interpret results from classical regression models either in terms of probabilities, similar to the usual approach in Bayesian statistics (Schweder 2018; Schweder and Hjort 2003; Vos and Holbert 2022) or in terms of a relative measure of "evidence" or "compatibility" with the data (Greenland et al. 2022; Rafi and Greenland 2020), which nevertheless comes close to a probabilistic interpretation.
A more detailed discussion of this topic is found in the documentation of
p_function()
.
The parameters package provides several options or functions to aid statistical inference. These are, for example:
equivalence_test(), to compute the (conditional) equivalence test for frequentist models
p_significance(), to compute the probability of practical significance, which can be conceptualized as a unidirectional equivalence test
p_function(), or consonance function, to compute p-values and compatibility (confidence) intervals for statistical models
the pd argument (setting pd = TRUE) in model_parameters() includes a column with the probability of direction, i.e. the probability that a parameter is strictly positive or negative. See bayestestR::p_direction() for details. If plotting is desired, the p_direction() function can be used, together with plot().
the s_value argument (setting s_value = TRUE) in model_parameters() replaces the p-values with their related S-values (Rafi and Greenland 2020)
finally, it is possible to generate distributions of model coefficients by generating bootstrap samples (setting bootstrap = TRUE) or simulating draws from model coefficients using simulate_model(). These samples can then be treated as "posterior samples" and used in many functions from the bayestestR package.
Most of the above shown options or functions derive from methods originally
implemented for Bayesian models (Makowski et al. 2019). However, assuming
that model assumptions are met (which means, the model fits well to the data,
the correct model is chosen that reflects the data generating process
(distributional model family) etc.), it seems appropriate to interpret
results from classical frequentist models in a "Bayesian way" (more details:
documentation in p_function()
).
There is also a plot() method implemented in the see package.
Amrhein, V., Korner-Nievergelt, F., and Roth, T. (2017). The earth is flat (p > 0.05): Significance thresholds and the crisis of unreplicable research. PeerJ, 5, e3544. doi:10.7717/peerj.3544
Greenland S, Rafi Z, Matthews R, Higgs M. To Aid Scientific Inference, Emphasize Unconditional Compatibility Descriptions of Statistics. (2022) https://arxiv.org/abs/1909.08583v7 (Accessed November 10, 2022)
Lakens, D. (2024). Improving Your Statistical Inferences (Version v1.5.1). Retrieved from https://lakens.github.io/statistical_inferences/. doi:10.5281/ZENODO.6409077
Lakens, D., Scheel, A. M., and Isager, P. M. (2018). Equivalence Testing for Psychological Research: A Tutorial. Advances in Methods and Practices in Psychological Science, 1(2), 259–269. doi:10.1177/2515245918770963
Makowski, D., Ben-Shachar, M. S., Chen, S. H. A., and Lüdecke, D. (2019). Indices of Effect Existence and Significance in the Bayesian Framework. Frontiers in Psychology, 10, 2767. doi:10.3389/fpsyg.2019.02767
Rafi Z, Greenland S. Semantic and cognitive tools to aid statistical science: replace confidence and significance by compatibility and surprise. BMC Medical Research Methodology (2020) 20:244.
Schweder T. Confidence is epistemic probability for empirical science. Journal of Statistical Planning and Inference (2018) 195:116–125. doi:10.1016/j.jspi.2017.09.016
Schweder T, Hjort NL. Frequentist analogues of priors and posteriors. In Stigum, B. (ed.), Econometrics and the Philosophy of Economics: Theory Data Confrontation in Economics, pp. 285-317. Princeton University Press, Princeton, NJ, 2003
Vos P, Holbert D. Frequentist statistical inference without repeated sampling. Synthese 200, 89 (2022). doi:10.1007/s11229-022-03560-x
For more details, see bayestestR::p_significance()
. See also
equivalence_test()
, p_function()
and bayestestR::p_direction()
for functions related to checking effect existence and significance.
data(qol_cancer)
model <- lm(QoL ~ time + age + education, data = qol_cancer)

p_significance(model)
p_significance(model, threshold = c(-0.5, 1.5))

# based on heteroscedasticity-robust standard errors
p_significance(model, vcov = "HC3")

if (require("see", quietly = TRUE)) {
  result <- p_significance(model)
  plot(result)
}
This function attempts to return, or compute, p-values of a model's parameters. See the documentation for your object's class:
Bayesian models (rstanarm, brms, MCMCglmm, ...)
Zero-inflated models (hurdle, zeroinfl, zerocount, ...)
Marginal effects models (mfx)
Models with special components (DirichletRegModel, clm2, cgam, ...)
p_value(model, ...) ## Default S3 method: p_value( model, dof = NULL, method = NULL, component = "all", vcov = NULL, vcov_args = NULL, verbose = TRUE, ... ) ## S3 method for class 'emmGrid' p_value(model, ci = 0.95, adjust = "none", ...)
model 
A statistical model. 
... 
Additional arguments 
dof 
Number of degrees of freedom to be used when calculating
confidence intervals. If 
method 
Method for computing degrees of freedom for
confidence intervals (CI) and the related p-values. The following
options are allowed (which vary depending on the model class): 
component 
Model component for which parameters should be shown. See
the documentation for your object's class in 
vcov 
Variance-covariance matrix used to compute uncertainty estimates (e.g., for robust standard errors). This argument accepts a covariance matrix, a function which returns a covariance matrix, or a string which identifies the function to be used to compute the covariance matrix.

vcov_args 
List of arguments to be passed to the function identified by
the 
verbose 
Toggle warnings and messages. 
ci 
Confidence Interval (CI) level. Default to 
adjust 
Character value naming the method used to adjust p-values or
confidence intervals. See 
A data frame with at least two columns: the parameter names and the p-values. Depending on the model, may also include columns for model components etc.
There are different ways of approximating the degrees of freedom depending
on different assumptions about the nature of the model and its sampling
distribution. The ci_method
argument modulates the method for computing degrees
of freedom (df) that are used to calculate confidence intervals (CI) and the
related p-values. The following options are allowed, depending on the model
class:
Classical methods:
Classical inference is generally based on the Wald method. The Wald approach to inference computes a test statistic by dividing the parameter estimate by its standard error (Coefficient / SE), then comparing this statistic against a t- or normal distribution. This approach can be used to compute CIs and p-values.
"wald": Applies to non-Bayesian models. For linear models, CIs computed using the Wald method (SE and a t-distribution with residual df); p-values computed using the Wald method with a t-distribution with residual df. For other models, CIs computed using the Wald method (SE and a normal distribution); p-values computed using the Wald method with a normal distribution.
"normal": Applies to non-Bayesian models. Compute Wald CIs and p-values, but always use a normal distribution.
"residual": Applies to non-Bayesian models. Compute Wald CIs and p-values, but always use a t-distribution with residual df when possible. If the residual df for a model cannot be determined, a normal distribution is used instead.
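The Wald computation described above can be reproduced by hand in base R. The sketch below is an illustration under simple assumptions (an ordinary linear model), not the package's internal code: it divides each coefficient by its standard error and compares the statistic against a t-distribution with residual df and against a normal distribution.

```r
model <- lm(Petal.Length ~ Sepal.Length + Species, data = iris)

est <- coef(model)
se <- sqrt(diag(vcov(model)))
stat <- est / se # Wald statistic: Coefficient / SE

# t-distribution with residual df (the "wald" / "residual" idea for linear models)
p_t <- 2 * pt(abs(stat), df = df.residual(model), lower.tail = FALSE)
ci_t <- cbind(
  lower = est - qt(0.975, df.residual(model)) * se,
  upper = est + qt(0.975, df.residual(model)) * se
)

# normal distribution (the "normal" idea)
p_z <- 2 * pnorm(abs(stat), lower.tail = FALSE)

cbind(estimate = est, se = se, p_t = p_t, p_z = p_z)
```

For a linear model, the t-based p-values agree with those reported by summary(), and the normal-based p-values are never larger because the normal distribution has lighter tails.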
Methods for mixed models:
Compared to fixed effects (or single-level) models, determining appropriate df for Wald-based inference in mixed models is more difficult. See the R GLMM FAQ for a discussion.
Several approximate methods for computing df are available, but you should also consider using profile likelihood ("profile") or bootstrap ("boot") CIs and p-values instead.
"satterthwaite": Applies to linear mixed models. CIs computed using the Wald method (SE and a t-distribution with Satterthwaite df); p-values computed using the Wald method with a t-distribution with Satterthwaite df.
"kenward": Applies to linear mixed models. CIs computed using the Wald method (Kenward-Roger SE and a t-distribution with Kenward-Roger df); p-values computed using the Wald method with Kenward-Roger SE and t-distribution with Kenward-Roger df.
"ml1": Applies to linear mixed models. CIs computed using the Wald method (SE and a t-distribution with m-l-1 approximated df); p-values computed using the Wald method with a t-distribution with m-l-1 approximated df. See ci_ml1().
"betwithin": Applies to linear mixed models and generalized linear mixed models. CIs computed using the Wald method (SE and a t-distribution with between-within df); p-values computed using the Wald method with a t-distribution with between-within df. See ci_betwithin().
Likelihood-based methods:
Likelihood-based inference is based on comparing the likelihood for the maximum-likelihood estimate to the likelihood for models with one or more parameter values changed (e.g., set to zero or a range of alternative values). Likelihood ratios for the maximum-likelihood and alternative models are compared to a $\chi^2$ distribution to compute CIs and p-values.
"profile": Applies to non-Bayesian models of class glm, polr, merMod or glmmTMB. CIs computed by profiling the likelihood curve for a parameter, using linear interpolation to find where likelihood ratio equals a critical value; p-values computed using the Wald method with a normal distribution (note: this might change in a future update!)
"uniroot": Applies to non-Bayesian models of class glmmTMB. CIs computed by profiling the likelihood curve for a parameter, using root finding to find where likelihood ratio equals a critical value; p-values computed using the Wald method with a normal distribution (note: this might change in a future update!)
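The general principle behind likelihood-based inference can be sketched in base R with a likelihood ratio test. The example compares a logistic regression with the same model where one coefficient is fixed at zero, and refers twice the difference in log-likelihoods to a chi-squared distribution with one degree of freedom. It only illustrates the principle; it does not reproduce the profiling done for "profile" or "uniroot", and the model is an arbitrary choice.

```r
# Maximum-likelihood model and an alternative with the slope fixed at zero
full <- glm(am ~ wt, family = binomial, data = mtcars)
reduced <- glm(am ~ 1, family = binomial, data = mtcars)

# Likelihood ratio statistic: 2 * (logLik(full) - logLik(reduced))
lr <- as.numeric(2 * (logLik(full) - logLik(reduced)))

# Compare against a chi-squared distribution with 1 df
# (one parameter was constrained)
p_lr <- pchisq(lr, df = 1, lower.tail = FALSE)

# For comparison: Wald p-value based on a normal distribution
p_wald <- summary(full)$coefficients["wt", "Pr(>|z|)"]

c(LR = lr, p_lr = p_lr, p_wald = p_wald)
```

The likelihood-ratio and Wald p-values usually differ somewhat in small samples, which is one reason profiled intervals can be preferable to Wald intervals for such models.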
Methods for bootstrapped or Bayesian models:
Bootstrap-based inference is based on resampling and refitting the model to the resampled datasets. The distribution of parameter estimates across resampled datasets is used to approximate the parameter's sampling distribution. Depending on the type of model, several different methods for bootstrapping and constructing CIs and p-values from the bootstrap distribution are available.
For Bayesian models, inference is based on drawing samples from the model posterior distribution.
"quantile" (or "eti"): Applies to all models (including Bayesian models). For non-Bayesian models, only applies if bootstrap = TRUE. CIs computed as equal tailed intervals using the quantiles of the bootstrap or posterior samples; p-values are based on the probability of direction. See bayestestR::eti().
"hdi": Applies to all models (including Bayesian models). For non-Bayesian models, only applies if bootstrap = TRUE. CIs computed as highest density intervals for the bootstrap or posterior samples; p-values are based on the probability of direction. See bayestestR::hdi().
"bci" (or "bcai"): Applies to all models (including Bayesian models). For non-Bayesian models, only applies if bootstrap = TRUE. CIs computed as bias corrected and accelerated intervals for the bootstrap or posterior samples; p-values are based on the probability of direction. See bayestestR::bci().
"si": Applies to Bayesian models with proper priors. CIs computed as support intervals comparing the posterior samples against the prior samples; p-values are based on the probability of direction. See bayestestR::si().
"boot": Applies to non-Bayesian models of class merMod. CIs computed using parametric bootstrapping (simulating data from the fitted model); p-values computed using the Wald method with a normal distribution (note: this might change in a future update!).
For all iteration-based methods other than "boot" ("hdi", "quantile", "ci", "eti", "si", "bci", "bcai"), p-values are based on the probability of direction (bayestestR::p_direction()), which is converted into a p-value using bayestestR::pd_to_p().
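The link between bootstrap samples, the probability of direction and the resulting p-value can be sketched in base R. The example below is a simplified illustration: it uses an ordinary case-resampling bootstrap with an arbitrary number of 500 draws, computes the probability of direction as the larger share of draws on one side of zero, and applies the two-sided conversion p = 2 * (1 - pd). The packaged functions may differ in details.

```r
set.seed(123)

# Ordinary (case-resampling) bootstrap of the slope of "wt"
boot_slopes <- replicate(500, {
  idx <- sample(nrow(mtcars), replace = TRUE)
  coef(lm(mpg ~ wt, data = mtcars[idx, ]))[["wt"]]
})

# Equal-tailed 95% interval from the quantiles of the bootstrap draws
eti <- quantile(boot_slopes, probs = c(0.025, 0.975))

# Probability of direction: share of draws with the dominant sign
pd <- max(mean(boot_slopes > 0), mean(boot_slopes < 0))

# Two-sided p-value derived from the probability of direction
p <- 2 * (1 - pd)

list(eti = eti, pd = pd, p = p)
```

A probability of direction of 0.975 thus corresponds to a two-sided p-value of 0.05, and pd = 0.5 (no dominant direction) corresponds to p = 1.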
data(iris)
model <- lm(Petal.Length ~ Sepal.Length + Species, data = iris)
p_value(model)
This function attempts to return, or compute, p-values of Bayesian models.
## S3 method for class 'BFBayesFactor' p_value(model, ...)
model 
A statistical model. 
... 
Additional arguments 
For Bayesian models, the p-values correspond to the probability of direction (bayestestR::p_direction()), which is converted to a p-value using bayestestR::convert_pd_to_p().
The p-values.
data(iris)
model <- lm(Petal.Length ~ Sepal.Length + Species, data = iris)
p_value(model)
This function attempts to return, or compute, p-values of models with special model components.
## S3 method for class 'DirichletRegModel' p_value(model, component = c("all", "conditional", "precision"), ...) ## S3 method for class 'averaging' p_value(model, component = c("conditional", "full"), ...) ## S3 method for class 'betareg' p_value( model, component = c("all", "conditional", "precision"), verbose = TRUE, ... ) ## S3 method for class 'cgam' p_value(model, component = c("all", "conditional", "smooth_terms"), ...) ## S3 method for class 'clm2' p_value(model, component = c("all", "conditional", "scale"), ...)
model 
A statistical model. 
component 
Should all parameters, parameters for the conditional model,
precision- or scale-component or smooth_terms be returned? 
... 
Additional arguments 
verbose 
Toggle warnings and messages. 
The p-values.
This function attempts to return, or compute, p-values of marginal effects models from package mfx.
## S3 method for class 'poissonmfx' p_value(model, component = c("all", "conditional", "marginal"), ...) ## S3 method for class 'betaor' p_value(model, component = c("all", "conditional", "precision"), ...) ## S3 method for class 'betamfx' p_value( model, component = c("all", "conditional", "precision", "marginal"), ... )
model 
A statistical model. 
component 
Should all parameters, parameters for the conditional model,
precision-component or marginal effects be returned? 
... 
Currently not used. 
A data frame with at least two columns: the parameter names and the p-values. Depending on the model, may also include columns for model components etc.
if (require("mfx", quietly = TRUE)) {
  set.seed(12345)
  n <- 1000
  x <- rnorm(n)
  y <- rnegbin(n, mu = exp(1 + 0.5 * x), theta = 0.5)
  d <- data.frame(y, x)
  model <- poissonmfx(y ~ x, data = d)

  p_value(model)
  p_value(model, component = "marginal")
}
This function attempts to return, or compute, p-values of hurdle and zero-inflated models.
## S3 method for class 'zcpglm' p_value(model, component = c("all", "conditional", "zi", "zero_inflated"), ...) ## S3 method for class 'zeroinfl' p_value( model, component = c("all", "conditional", "zi", "zero_inflated"), method = NULL, verbose = TRUE, ... )
model 
A statistical model. 
component 
Model component for which parameters should be shown. See
the documentation for your object's class in 
... 
Additional arguments 
method 
Method for computing degrees of freedom for
confidence intervals (CI) and the related p-values. The following
options are allowed (which vary depending on the model class): 
verbose 
Toggle warnings and messages. 
A data frame with at least two columns: the parameter names and the p-values. Depending on the model, may also include columns for model components etc.
if (require("pscl", quietly = TRUE)) {
  data("bioChemists")
  model <- zeroinfl(art ~ fem + mar + kid5 | kid5 + phd, data = bioChemists)
  p_value(model)
  p_value(model, component = "zi")
}
In a regression model, the parameters do not all have the same meaning. For instance, the intercept has to be interpreted as the theoretical outcome value under some conditions (when predictors are set to 0), whereas other coefficients are to be interpreted as amounts of change. Others, such as interactions, represent changes in another parameter. The parameters_type() function attempts to retrieve information about the meaning of parameters. It outputs a data frame of information for each parameter, such as the Type (whether the parameter corresponds to a factor or a numeric predictor, or whether it is a (regular) interaction or a nested one), the Link (whether the parameter can be interpreted as a mean value, the slope of an association or a difference between two levels) and, in the case of interactions, which other parameter is impacted by which parameter.
parameters_type(model, ...)
model 
A statistical model. 
... 
Arguments passed to or from other methods. 
A data frame.
library(parameters)

model <- lm(Sepal.Length ~ Petal.Length + Species, data = iris)
parameters_type(model)

model <- lm(Sepal.Length ~ Species + poly(Sepal.Width, 2), data = iris)
parameters_type(model)

model <- lm(Sepal.Length ~ Species + poly(Sepal.Width, 2, raw = TRUE), data = iris)
parameters_type(model)

# Interactions
model <- lm(Sepal.Length ~ Sepal.Width * Species, data = iris)
parameters_type(model)

model <- lm(Sepal.Length ~ Sepal.Width * Species * Petal.Length, data = iris)
parameters_type(model)

model <- lm(Sepal.Length ~ Species * Sepal.Width, data = iris)
parameters_type(model)

model <- lm(Sepal.Length ~ Species / Sepal.Width, data = iris)
parameters_type(model)

# Complex interactions
data <- iris
data$fac2 <- ifelse(data$Sepal.Width > mean(data$Sepal.Width), "A", "B")
model <- lm(Sepal.Length ~ Species / fac2 / Petal.Length, data = data)
parameters_type(model)

model <- lm(Sepal.Length ~ Species / fac2 * Petal.Length, data = data)
parameters_type(model)
This function "pools" (i.e. combines) model parameters in a similar fashion as mice::pool(). However, this function pools parameters from parameters_model objects, as returned by model_parameters().
pool_parameters(
  x,
  exponentiate = FALSE,
  effects = "fixed",
  component = "all",
  verbose = TRUE,
  ...
)
x 
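A list of parameters_model objects, as returned by model_parameters(), or a list of model objects (as in the examples below).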
exponentiate 
Logical, indicating whether or not to exponentiate the
coefficients (and related confidence intervals). This is typical for
logistic regression, or more generally speaking, for models with log or
logit links. It is also recommended to use 
effects 
Should parameters for fixed effects ( 
component 
Should all parameters, parameters for the conditional model,
for the zero-inflation part of the model, or the dispersion model be returned?
Applies to models with zero-inflation and/or dispersion component. 
verbose 
Toggle warnings and messages. 
... 
Arguments passed down to 
Averaging of parameters follows Rubin's rules (Rubin, 1987, p. 76). The pooled degrees of freedom are based on the Barnard-Rubin adjustment for small samples (Barnard and Rubin, 1999).
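The pooling rules named above can be sketched in base R. This is a minimal illustration of the textbook formulas, not the package's own code: pool_rubin() is a hypothetical helper, and bootstrap resamples of mtcars merely stand in for imputed data sets.

```r
# Hypothetical helper (not part of the package): pool one coefficient across
# m fits with Rubin's rules and the Barnard-Rubin degrees of freedom.
pool_rubin <- function(estimates, std_errors, df_complete) {
  m <- length(estimates)
  q_bar <- mean(estimates)             # pooled point estimate
  u_bar <- mean(std_errors^2)          # within-imputation variance
  b <- stats::var(estimates)           # between-imputation variance
  t_var <- u_bar + (1 + 1 / m) * b     # total variance
  lambda <- (1 + 1 / m) * b / t_var    # share of variance due to missingness
  df_old <- (m - 1) / lambda^2         # Rubin (1987)
  df_obs <- (df_complete + 1) / (df_complete + 3) * df_complete * (1 - lambda)
  df_adj <- df_old * df_obs / (df_old + df_obs)  # Barnard and Rubin (1999)
  data.frame(Coefficient = q_bar, SE = sqrt(t_var), df_error = df_adj)
}

# Stand-in for five imputed data sets: bootstrap resamples of mtcars
set.seed(123)
fits <- lapply(1:5, function(i) {
  lm(mpg ~ wt, data = mtcars[sample(nrow(mtcars), replace = TRUE), ])
})
est <- vapply(fits, function(f) coef(f)[["wt"]], numeric(1))
se <- vapply(fits, function(f) sqrt(diag(vcov(f)))[["wt"]], numeric(1))

pooled <- pool_rubin(est, se, df_complete = df.residual(fits[[1]]))
pooled
```

The pooled standard error is never smaller than the average within-fit standard error, and the adjusted degrees of freedom never exceed the complete-data degrees of freedom.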
A data frame of indices related to the model's parameters.
Models with multiple components (for instance, models with zero-inflation, where predictors appear in the count and zero-inflation part, or models with a dispersion component) may fail in rare situations. In this case, compute the pooled parameters for components separately, using the component argument.
Some model objects do not return standard errors (e.g. objects of class htest). For these models, neither pooled confidence intervals nor p-values are returned.
Barnard, J. and Rubin, D.B. (1999). Small sample degrees of freedom with multiple imputation. Biometrika, 86, 948-955.
Rubin, D.B. (1987). Multiple Imputation for Nonresponse in Surveys. New York: John Wiley and Sons.
# example for multiple imputed datasets
data("nhanes2", package = "mice")
imp <- mice::mice(nhanes2, printFlag = FALSE)
models <- lapply(1:5, function(i) {
  lm(bmi ~ age + hyp + chl, data = mice::complete(imp, action = i))
})
pool_parameters(models)

# should be identical to:
m <- with(data = imp, expr = lm(bmi ~ age + hyp + chl))
summary(mice::pool(m))

# For glm, mice uses residual df, while `pool_parameters()` uses `Inf`
nhanes2$hyp <- datawizard::slide(as.numeric(nhanes2$hyp))
imp <- mice::mice(nhanes2, printFlag = FALSE)
models <- lapply(1:5, function(i) {
  glm(hyp ~ age + chl, family = binomial, data = mice::complete(imp, action = i))
})
m <- with(data = imp, expr = glm(hyp ~ age + chl, family = binomial))

# residual df
summary(mice::pool(m))$df

# df = Inf
pool_parameters(models)$df_error

# use residual df instead
pool_parameters(models, ci_method = "residual")$df_error
Predict method for parameters_clusters objects
## S3 method for class 'parameters_clusters'
predict(object, newdata = NULL, names = NULL, ...)
object 
a model object for which prediction is desired. 
newdata 
data.frame 
names 
character vector or list 
... 
additional arguments affecting the predictions produced. 
A sample data set with longitudinal data, used in the vignette describing the datawizard::demean() function. Health-related quality of life from cancer-patients was measured at three time points (pre-surgery, 6 and 12 months after surgery).
A data frame with 564 rows and 7 variables:
Patient ID
Quality of Life Score
Timepoint of measurement
Age in years
Patients' Health Questionnaire, 4-item version
Hospital ID, where patient was treated
Patients' educational level
This function extracts the different variance components of a mixed model and returns the result as a data frame.
random_parameters(model, component = "conditional")
model 
A mixed effects model (including 
component 
Should all parameters, parameters for the conditional model,
for the zero-inflation part of the model, or the dispersion model be returned?
Applies to models with zero-inflation and/or dispersion component. 
The variance components are obtained from insight::get_variance() and are denoted as follows:
The residual variance, σ^{2}_{ε}, is the sum of the distribution-specific variance and the variance due to additive dispersion. It indicates the within-group variance.
The random intercept variance, or between-group variance for the intercept (τ_{00}), is obtained from VarCorr(). It indicates how much groups or subjects differ from each other.
The random slope variance, or between-group variance for the slopes (τ_{11}), is obtained from VarCorr(). This measure is only available for mixed models with random slopes. It indicates how much groups or subjects differ from each other according to their slopes.
The random slope-intercept correlation (ρ_{01}) is obtained from VarCorr(). This measure is only available for mixed models with random intercepts and slopes.
Note: For the within-group and between-group variance, both variances and standard deviations (which are simply the square root of the variance) are shown.
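For a linear mixed model, the four quantities described above can be read directly from lme4::VarCorr(). The sketch below assumes the lme4 package is installed and only illustrates where the numbers come from; random_parameters() itself goes through insight::get_variance() and may handle other model classes differently.

```r
if (requireNamespace("lme4", quietly = TRUE)) {
  model <- lme4::lmer(
    Reaction ~ Days + (1 + Days | Subject),
    data = lme4::sleepstudy
  )

  # One row per variance, covariance and the residual
  vc <- as.data.frame(lme4::VarCorr(model))
  is_subject <- vc$grp == "Subject"

  # Between-group variances (rows without a second variable)
  tau_00 <- vc$vcov[is_subject & vc$var1 %in% "(Intercept)" & is.na(vc$var2)]
  tau_11 <- vc$vcov[is_subject & vc$var1 %in% "Days" & is.na(vc$var2)]

  # Slope-intercept correlation (the row with two variables; sdcor holds
  # the correlation for such rows)
  rho_01 <- vc$sdcor[is_subject & !is.na(vc$var2)]

  # Within-group (residual) variance
  sigma_2 <- vc$vcov[vc$grp == "Residual"]

  c(sigma_2 = sigma_2, tau_00 = tau_00, tau_11 = tau_11, rho_01 = rho_01)
}
```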
A data frame with random effects statistics for the variance components, including number of levels per random effect group, as well as complete observations in the model.
if (require("lme4")) {
  data(sleepstudy)
  model <- lmer(Reaction ~ Days + (1 + Days | Subject), data = sleepstudy)
  random_parameters(model)
}
This function performs a reduction in the parameter space (the number of variables). It starts by creating a new set of variables, based on the given method (the default method is "PCA", but others are available via the method argument, such as "cMDS", "DRR" or "ICA"). Then, it names these new dimensions using the original variables that correlate the most with them. For instance, a variable named 'V1_0.97/V4_0.88' means that the V1 and the V4 variables correlate maximally (with respective coefficients of .97 and .88) with this dimension. Although this function can be useful in exploratory data analysis, it's best to perform the dimension reduction step in a separate and dedicated stage, as this is a very important process in the data analysis workflow. reduce_data() is an alias for reduce_parameters.data.frame().
reduce_parameters(x, method = "PCA", n = "max", distance = "euclidean", ...)

reduce_data(x, method = "PCA", n = "max", distance = "euclidean", ...)
x 
A data frame or a statistical model. 
method 
The feature reduction method. Can be one of 
n 
Number of components to extract. If 
distance 
The distance measure to be used. Only applies when

... 
Arguments passed to or from other methods. 
The different methods available are described below:
PCA: See principal_components().
cMDS / PCoA: Classical Multidimensional Scaling (cMDS) takes a set of dissimilarities (i.e., a distance matrix) and returns a set of points such that the distances between the points are approximately equal to the dissimilarities.
DRR: Dimensionality Reduction via Regression (DRR) is a very recent technique extending PCA (Laparra et al., 2015). Starting from a rotated PCA, it predicts redundant information from the remaining components using nonlinear regression. Some of the most notable advantages of performing DRR are avoidance of multicollinearity between predictors and overfitting mitigation. DRR tends to perform well when the first principal component is enough to explain most of the variation in the predictors. Requires the DRR package to be installed.
ICA: Performs an Independent Component Analysis using the FastICA algorithm. Contrary to PCA, which attempts to find uncorrelated sources (through least squares minimization), ICA attempts to find independent sources, i.e., the source space that maximizes the "non-Gaussianity" of all sources. Contrary to PCA, ICA does not rank each source, which makes it a poor tool for dimensionality reduction. Requires the fastICA package to be installed.
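To make the PCA and cMDS descriptions concrete, the base R sketch below extracts two dimensions with each method and then labels the first dimension by the original variable that correlates most with it. It assumes standardized variables and Euclidean distances and uses stats::prcomp() and stats::cmdscale() directly; it is not the package's internal code.

```r
x <- iris[, 1:4]

# PCA on standardized variables
pca <- stats::prcomp(x, center = TRUE, scale. = TRUE)
pca_scores <- pca$x[, 1:2]

# Classical MDS on Euclidean distances between the same standardized rows
mds_scores <- stats::cmdscale(stats::dist(scale(x), method = "euclidean"), k = 2)

# With Euclidean distances both methods give the same coordinates,
# up to the sign of each axis
max(abs(abs(pca_scores) - abs(mds_scores)))

# Label the first dimension by the variable that correlates most with it
loads <- stats::cor(x, pca_scores[, 1])[, 1]
best <- names(which.max(abs(loads)))
paste0(best, "_", round(abs(loads[[best]]), 2))
```

With other distance measures the cMDS solution generally no longer coincides with PCA, which is where the distance argument matters.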
See also package vignette.
Nguyen, L. H., and Holmes, S. (2019). Ten quick tips for effective dimensionality reduction. PLOS Computational Biology, 15(6).
Laparra, V., Malo, J., and Camps-Valls, G. (2015). Dimensionality reduction via regression in hyperspectral imagery. IEEE Journal of Selected Topics in Signal Processing, 9(6), 1026-1036.
data(iris)
model <- lm(Sepal.Width ~ Species * Sepal.Length + Petal.Width, data = iris)
model
reduce_parameters(model)

out <- reduce_data(iris, method = "PCA", n = "max")
head(out)
Reshape loadings between wide/long formats.
reshape_loadings(x, ...)

## S3 method for class 'parameters_efa'
reshape_loadings(x, threshold = NULL, ...)

## S3 method for class 'data.frame'
reshape_loadings(x, threshold = NULL, loadings_columns = NULL, ...)
x 
A data frame or a statistical model. 
... 
Arguments passed to or from other methods. 
threshold 
A value between 0 and 1 indicates which (absolute) values
from the loadings should be removed. An integer higher than 1 indicates the
n strongest loadings to retain. Can also be 
loadings_columns 
Vector indicating the columns corresponding to loadings. 
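As a rough illustration of what reshaping loadings between wide and long formats means, the base R sketch below builds a wide table of PCA loadings, converts it to long format, and drops small absolute loadings. The column names Variable, Component and Loading and the cut-off of 0.3 are assumptions made for this example; the actual output of reshape_loadings() may be laid out differently.

```r
# Wide format: one row per variable, one column per component
pca <- stats::prcomp(attitude, scale. = TRUE)
wide <- data.frame(
  Variable = rownames(pca$rotation),
  pca$rotation[, 1:3],
  row.names = NULL
)

# Long format: one row per variable-component pair
long <- stats::reshape(
  wide,
  direction = "long",
  varying = c("PC1", "PC2", "PC3"),
  v.names = "Loading",
  timevar = "Component",
  times = c("PC1", "PC2", "PC3"),
  idvar = "Variable"
)
rownames(long) <- NULL

# Keep only loadings whose absolute value reaches the (assumed) threshold
threshold <- 0.3
strong <- long[abs(long$Loading) >= threshold, ]
head(strong)
```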
if (require("psych")) {
  pca <- model_parameters(psych::fa(attitude, nfactors = 3))
  loadings <- reshape_loadings(pca)

  loadings
  reshape_loadings(loadings)
}
This function performs an automated selection of the 'best' parameters, updating and returning the "best" model.
select_parameters(model, ...)

## S3 method for class 'lm'
select_parameters(model, direction = "both", steps = 1000, k = 2, ...)

## S3 method for class 'merMod'
select_parameters(model, direction = "backward", steps = 1000, ...)
model 
A statistical model (of class 
... 
Arguments passed to or from other methods. 
direction 
the mode of stepwise search, can be one of 
steps 
the maximum number of steps to be considered. The default is 1000 (essentially as many as required). It is typically used to stop the process early. 
k 
The multiple of the number of degrees of freedom used for the penalty.
Only 
The model refitted with the optimal number of parameters.
For frequentist GLMs, select_parameters() performs an AIC-based stepwise selection.
For mixed-effects models of class merMod, stepwise selection is based on cAIC4::stepcAIC(). This step function only searches the "best" model based on the random-effects structure, i.e. select_parameters() adds or excludes random-effects until the cAIC can't be improved further.
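The arguments direction, steps and k of the lm method mirror those of stats::step(). The sketch below runs an AIC-based stepwise search with stats::step() directly; treating this as equivalent to select_parameters() for linear models is an assumption, not something the text above states.

```r
model <- lm(mpg ~ cyl + disp + hp + wt + qsec, data = mtcars)

# AIC-based stepwise search in both directions (k = 2 is the AIC penalty)
selected <- stats::step(
  model,
  direction = "both",
  steps = 1000,
  k = 2,
  trace = 0
)

formula(selected)

# The selected model never has a worse AIC than the starting model
c(start = AIC(model), selected = AIC(selected))
```

Setting k = log(nrow(mtcars)) instead would penalize each parameter more heavily, which corresponds to a BIC-based search.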
model <- lm(mpg ~ ., data = mtcars)
select_parameters(model)

model <- lm(mpg ~ cyl * disp * hp * wt, data = mtcars)
select_parameters(model)

# lme4
model <- lme4::lmer(
  Sepal.Width ~ Sepal.Length * Petal.Width * Petal.Length + (1 | Species),
  data = iris
)
select_parameters(model)
Simulate draws from a statistical model to return a data frame of estimates.
simulate_model(model, iterations = 1000, ...)

## S3 method for class 'glmmTMB'
simulate_model(
  model,
  iterations = 1000,
  component = "all",
  verbose = FALSE,
  ...
)
model 
Statistical model (no Bayesian models). 
iterations 
The number of draws to simulate/bootstrap. 
... 
Arguments passed to 
component 
Should all parameters, parameters for the conditional model,
for the zero-inflation part of the model, or the dispersion model be returned?
Applies to models with zero-inflation and/or dispersion component. 
verbose 
Toggle warnings and messages. 
simulate_model() is a computationally faster alternative to bootstrap_model(). Simulated draws for coefficients are based on a multivariate normal distribution (MASS::mvrnorm()) with mean mu = coef(model) and variance Sigma = vcov(model).
For models from packages glmmTMB, pscl, GLMMadaptive and countreg, the component argument can be used to specify which parameters should be simulated. For all other models, parameters from the conditional component (fixed effects) are simulated. This may include smooth terms, but not random effects.
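The draw described above can be reproduced by hand for a simple linear model. This sketch calls MASS::mvrnorm() with the stated mean and variance; the seed and the conversion to a data frame are choices made for the example.

```r
model <- lm(Sepal.Length ~ Species * Petal.Width + Petal.Length, data = iris)

# One row per simulated draw, one column per coefficient
set.seed(123)
draws <- MASS::mvrnorm(n = 1000, mu = coef(model), Sigma = vcov(model))
draws <- as.data.frame(draws)

head(draws)

# The column means of the draws stay close to the fitted coefficients
round(colMeans(draws) - coef(model), 3)
```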
A data frame.
simulate_parameters(), bootstrap_model(), bootstrap_parameters()
model <- lm(Sepal.Length ~ Species * Petal.Width + Petal.Length, data = iris)
head(simulate_model(model))

if (require("glmmTMB", quietly = TRUE)) {
  model <- glmmTMB(
    count ~ spp + mined + (1 | site),
    ziformula = ~mined,
    family = poisson(),
    data = Salamanders
  )
  head(simulate_model(model))
  head(simulate_model(model, component = "zero_inflated"))
}
Compute simulated draws of parameters and their related indices such as Confidence Intervals (CI) and p-values. Simulating parameter draws can be seen as a (computationally faster) alternative to bootstrapping.
## S3 method for class 'glmmTMB'
simulate_parameters(
  model,
  iterations = 1000,
  centrality = "median",
  ci = 0.95,
  ci_method = "quantile",
  test = "pvalue",
  ...
)

simulate_parameters(model, ...)

## Default S3 method:
simulate_parameters(
  model,
  iterations = 1000,
  centrality = "median",
  ci = 0.95,
  ci_method = "quantile",
  test = "pvalue",
  ...
)
model 
Statistical model (no Bayesian models). 
iterations 
The number of draws to simulate/bootstrap. 
centrality 
The point-estimates (centrality indices) to compute. Character
(vector) or list with one or more of these options: 
ci 
Value or vector of probability of the CI (between 0 and 1)
to be estimated. Default to 
ci_method 
The type of index used for Credible Interval. Can be 
test 
The indices of effect existence to compute. Character (vector) or
list with one or more of these options: 
... 
Arguments passed to 
simulate_parameters() is a computationally faster alternative to bootstrap_parameters(). Simulated draws for coefficients are based on a multivariate normal distribution (MASS::mvrnorm()) with mean mu = coef(model) and variance Sigma = vcov(model).
For models from packages glmmTMB, pscl, GLMMadaptive and countreg, the component argument can be used to specify which parameters should be simulated. For all other models, parameters from the conditional component (fixed effects) are simulated. This may include smooth terms, but not random effects.
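Once draws are available, the defaults in the usage above (centrality = "median", ci = 0.95, ci_method = "quantile") amount to simple summaries of each column. The sketch below uses a hypothetical helper, summarise_draws(); its p-value, taken as twice the share of draws on the minority side of zero, is an assumption about how test = "pvalue" could be computed, not a statement of the package's implementation.

```r
model <- lm(mpg ~ wt + hp, data = mtcars)

set.seed(123)
draws <- MASS::mvrnorm(n = 1000, mu = coef(model), Sigma = vcov(model))

# Hypothetical helper: median, quantile interval and a two-sided p-value
summarise_draws <- function(x, ci = 0.95) {
  probs <- c((1 - ci) / 2, 1 - (1 - ci) / 2)
  limits <- stats::quantile(x, probs = probs, names = FALSE)
  p_direction <- max(mean(x > 0), mean(x < 0))
  c(
    Coefficient = stats::median(x),
    CI_low = limits[1],
    CI_high = limits[2],
    p = 2 * (1 - p_direction)
  )
}

# One row per parameter
result <- t(apply(draws, 2, summarise_draws))
result
```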
A data frame with simulated parameters.
There is also a plot() method implemented in the see-package.
Gelman, A., and Hill, J. (2007). Data analysis using regression and multilevel/hierarchical models. Cambridge; New York: Cambridge University Press, pp. 140-143.
bootstrap_model(), bootstrap_parameters(), simulate_model()
model <- lm(Sepal.Length ~ Species * Petal.Width + Petal.Length, data = iris)
simulate_parameters(model)

if (require("glmmTMB", quietly = TRUE)) {
  model <- glmmTMB(
    count ~ spp + mined + (1 | site),
    ziformula = ~mined,
    family = poisson(),
    data = Salamanders
  )
  simulate_parameters(model, centrality = "mean")
  simulate_parameters(model, ci = c(.8, .95), component = "zero_inflated")
}
Sort parameters by coefficient values
sort_parameters(x, ...)

## Default S3 method:
sort_parameters(x, sort = "none", column = "Coefficient", ...)
x 
A data frame or a 
... 
Arguments passed to or from other methods. 
sort 
If 
column 
The column containing model parameter estimates. This will be

A sorted data frame or original object.
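For a plain data frame, sorting by a coefficient column can be sketched with order(). The table below is built by hand from coef(); whether sort_parameters() treats particular rows (such as the intercept) specially is not covered by this sketch.

```r
model <- lm(wt ~ am * cyl, data = mtcars)

# Minimal parameter table with a "Coefficient" column
params <- data.frame(
  Parameter = names(coef(model)),
  Coefficient = unname(coef(model))
)

# Ascending and descending order of the estimates
ascending <- params[order(params$Coefficient), ]
descending <- params[order(params$Coefficient, decreasing = TRUE), ]

ascending
descending
```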
# creating object to sort (can also be a regular data frame)
mod <- model_parameters(stats::lm(wt ~ am * cyl, data = mtcars))

# original output
mod

# sorted outputs
sort_parameters(mod, sort = "ascending")
sort_parameters(mod, sort = "descending")
standard_error() attempts to return standard errors of model parameters.
standard_error(model, ...)

## Default S3 method:
standard_error(
  model,
  component = "all",
  vcov = NULL,
  vcov_args = NULL,
  verbose = TRUE,
  ...
)

## S3 method for class 'factor'
standard_error(model, force = FALSE, verbose = TRUE, ...)

## S3 method for class 'glmmTMB'
standard_error(
  model,
  effects = "fixed",
  component = "all",
  verbose = TRUE,
  ...
)

## S3 method for class 'merMod'
standard_error(
  model,
  effects = "fixed",
  method = NULL,
  vcov = NULL,
  vcov_args = NULL,
  ...
)
model 
A model. 
... 
Arguments passed to or from other methods. 
component 
Model component for which standard errors should be shown.
See the documentation for your object's class in 
vcov 
Variance-covariance matrix used to compute uncertainty estimates (e.g., for robust standard errors). This argument accepts a covariance matrix, a function which returns a covariance matrix, or a string which identifies the function to be used to compute the covariance matrix.

vcov_args 
List of arguments to be passed to the function identified by
the 
verbose 
Toggle warnings and messages. 
force 
Logical, if 
effects 
Should standard errors for fixed effects ( 
method 
Method for computing degrees of freedom for
confidence intervals (CI) and the related p-values. The following options
are allowed (which vary depending on the model class): 
A data frame with at least two columns: the parameter names and the standard errors. Depending on the model, may also include columns for model components etc.
For Bayesian models (from rstanarm or brms), the standard error is the SD of the posterior samples.
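For a frequentist model, the standard errors are the square roots of the diagonal of a variance-covariance matrix, so changing that matrix changes the result. The sketch below shows this in base R, first with vcov(model) and then with a hand-computed HC3 matrix using the textbook formula; the package's robust options may rely on other packages and differ in detail.

```r
model <- lm(Petal.Length ~ Sepal.Length * Species, data = iris)

# Classical standard errors from the model's variance-covariance matrix
se <- sqrt(diag(vcov(model)))
data.frame(Parameter = names(se), SE = unname(se))

# They match the values reported by summary()
all.equal(unname(se), unname(summary(model)$coefficients[, "Std. Error"]))

# Heteroskedasticity-consistent (HC3) matrix, computed by hand:
# (X'X)^-1 X' diag(e^2 / (1 - h)^2) X (X'X)^-1
X <- model.matrix(model)
e <- residuals(model)
h <- hatvalues(model)
bread <- solve(crossprod(X))
meat <- crossprod(X * (e / (1 - h)))
se_hc3 <- sqrt(diag(bread %*% meat %*% bread))

cbind(classical = se, HC3 = se_hc3)
```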
model <- lm(Petal.Length ~ Sepal.Length * Species, data = iris)
standard_error(model)

# robust standard errors
standard_error(model, vcov = "HC3")

# cluster-robust standard errors
standard_error(model,
  vcov = "vcovCL",
  vcov_args = list(cluster = iris$Species)
)
This function extracts information, such as the deviations (SD or MAD) from parent variables, that are necessary for post-hoc standardization of parameters. This function gives a window on how standardized parameters are obtained, i.e., by what they are divided. The "basic" method of standardization uses.
standardize_info(model, ...)

## Default S3 method:
standardize_info(
  model,
  robust = FALSE,
  two_sd = FALSE,
  include_pseudo = FALSE,
  verbose = TRUE,
  ...
)
model 
A statistical model. 
... 
Arguments passed to or from other methods. 
robust 
Logical, if 
two_sd 
If 
include_pseudo 
(For (G)LMMs) Should pseudo-standardized information be included? 
verbose 
Toggle warnings and messages on or off. 
A data frame with information on each parameter (see parameters_type()), and various standardization coefficients for the post-hoc methods (see standardize_parameters()) for the predictor and the response.
Other standardize: standardize_parameters()
model <- lm(mpg ~ ., data = mtcars)
standardize_info(model)
standardize_info(model, robust = TRUE)
standardize_info(model, two_sd = TRUE)
Compute standardized model parameters (coefficients).
standardize_parameters(
  model,
  method = "refit",
  ci = 0.95,
  robust = FALSE,
  two_sd = FALSE,
  include_response = TRUE,
  verbose = TRUE,
  ...
)

standardize_posteriors(
  model,
  method = "refit",
  robust = FALSE,
  two_sd = FALSE,
  include_response = TRUE,
  verbose = TRUE,
  ...
)
model 
A statistical model. 
method 
The method used for standardizing the parameters. Can be

ci 
Confidence Interval (CI) level 
robust 
Logical, if 
two_sd 
If 
include_response 
If 
verbose 
Toggle warnings and messages on or off. 
... 
For

refit: This method is based on a complete model refit with a standardized version of the data. Hence, this method is equal to standardizing the variables before fitting the model. It is the "purest" and the most accurate (Neter et al., 1989), but it is also the most computationally costly and long (especially for heavy models such as Bayesian models). This method is particularly recommended for complex models that include interactions or transformations (e.g., polynomial or spline terms). The robust (default to FALSE) argument enables a robust standardization of data, i.e., based on the median and MAD instead of the mean and SD. See datawizard::standardize() for more details.
Note that standardize_parameters(method = "refit") may not return the same results as fitting a model on data that has been standardized with standardize(); standardize_parameters() uses the data used by the model fitting function, which might not be the same data if there are missing values. See the remove_na argument in standardize().
posthoc: Post-hoc standardization of the parameters, aiming at emulating the results obtained by "refit" without refitting the model. The coefficients are divided by the standard deviation (or MAD if robust) of the outcome (which becomes their expression 'unit'). Then, the coefficients related to numeric variables are additionally multiplied by the standard deviation (or MAD if robust) of the related terms, so that they correspond to changes of 1 SD of the predictor (e.g., "A change in 1 SD of x is related to a change of 0.24 of the SD of y"). This does not apply to binary variables or factors, so the coefficients are still related to changes in levels. This method is not accurate and tends to give aberrant results when interactions are specified.
basic: This method is similar to method = "posthoc", but treats all variables as continuous: it also scales the coefficients by the standard deviation of the model matrix's columns for factor levels (transformed to integers) or binary predictors. Although inappropriate for these cases, this method is the one implemented by default in other software packages, such as lm.beta::lm.beta().
smart (Standardization of Model's parameters with Adjustment, Reconnaissance and Transformation - experimental): Similar to method = "posthoc" in that it does not involve model refitting. The difference is that the SD (or MAD if robust) of the response is computed on the relevant section of the data. For instance, if a factor with 3 levels A (the intercept), B and C is entered as a predictor, the effect corresponding to B vs. A will be scaled by the variance of the response at the intercept only. As a result, the coefficients for effects of factors are similar to a Glass' delta.
pseudo (for 2-level (G)LMMs only): In this (post-hoc) method, the response and the predictor are standardized based on the level of prediction (levels are detected with performance::check_heterogeneity_bias()): Predictors are standardized based on their SD at level of prediction (see also datawizard::demean()); The outcome (in linear LMMs) is standardized based on a fitted random-intercept-model, where sqrt(random-intercept-variance) is used for level 2 predictors, and sqrt(residual-variance) is used for level 1 predictors (Hoffman 2015, page 342). A warning is given when a within-group variable is found to have access between-group variance.
sdy (for logistic regression models only): This y-standardization is useful when comparing coefficients of logistic regression models across models for the same sample. Unobserved heterogeneity varies across models with different independent variables, and thus, odds ratios from the same predictor of different models cannot be compared directly. The y-standardization makes coefficients "comparable across models by dividing them with the estimated standard deviation of the latent variable for each model" (Mood 2010). Thus, whenever one has multiple logistic regression models that are fit to the same data and share certain predictors (e.g. nested models), it can be useful to use this standardization approach to make log-odds or odds ratios comparable.
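The relation between the "refit" and "posthoc" descriptions can be checked by hand for a simple additive model with numeric predictors only. The sketch below uses base R and does not call the package; with interactions or factors the two approaches would no longer agree.

```r
model <- lm(mpg ~ wt + hp, data = mtcars)

# "posthoc": rescale each slope by SD(predictor) / SD(response)
sd_y <- sd(mtcars$mpg)
sd_x <- c(wt = sd(mtcars$wt), hp = sd(mtcars$hp))
posthoc <- coef(model)[c("wt", "hp")] * sd_x / sd_y

# "refit": standardize the data first, then fit the model again
z <- as.data.frame(scale(mtcars[, c("mpg", "wt", "hp")]))
refit <- coef(lm(mpg ~ wt + hp, data = z))[c("wt", "hp")]

# Identical for this additive model
all.equal(posthoc, refit)
cbind(posthoc, refit)
```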
When the model's formula contains transformations (e.g. y ~ exp(X)) method = "refit" will give different results compared to method = "basic" ("posthoc" and "smart" do not support such transformations): While "refit" standardizes the data prior to the transformation (e.g. equivalent to exp(scale(X))), the "basic" method standardizes the transformed data (e.g. equivalent to scale(exp(X))).
See the Transformed Variables section in datawizard::standardize.default() for more details on how different transformations are dealt with when method = "refit".
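The difference between the two orders of operation is easy to see on a small vector. The values of x below are made up for illustration only.

```r
# Illustrative values, not taken from any data set
x <- c(0.5, 1, 1.5, 2, 4)

# "refit"-like: standardize first, then transform
refit_like <- exp(as.numeric(scale(x)))

# "basic"-like: transform first, then standardize
basic_like <- as.numeric(scale(exp(x)))

cbind(x, refit_like, basic_like)

# Only the second variant has mean 0 and SD 1
c(mean = mean(basic_like), sd = sd(basic_like))
c(mean = mean(refit_like), sd = sd(refit_like))
```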
The returned confidence intervals are rescaled versions of the unstandardized confidence intervals, and not "true" confidence intervals of the standardized coefficients (cf. Jones & Waller, 2015).
Standardization for generalized linear models (GLM, GLMM, etc) is done only with respect to the predictors (while the outcome remains as-is, unstandardized), maintaining the interpretability of the coefficients (e.g., in a binomial model: the exponent of the standardized parameter is the OR of a change of 1 SD in the predictor, etc.).
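The odds-ratio interpretation can be verified with a small logistic regression in base R. The sketch standardizes the predictor inside the formula and leaves the binary outcome untouched; it is an illustration of the statement above, not the package's code path.

```r
# Raw predictor
raw <- glm(am ~ mpg, data = mtcars, family = binomial())

# Standardized predictor, outcome left as-is
std <- glm(am ~ scale(mpg), data = mtcars, family = binomial())

# Odds ratio for a change of 1 SD in mpg, obtained in two ways
or_std <- exp(coef(std)[[2]])
or_raw_per_sd <- exp(coef(raw)[["mpg"]] * sd(mtcars$mpg))

c(standardized = or_std, raw_times_sd = or_raw_per_sd)
all.equal(or_std, or_raw_per_sd, tolerance = 1e-4)
```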
standardize(model) or standardize_parameters(model, method = "refit") do not standardize categorical predictors (i.e. factors) / their dummy-variables, which may be a different behaviour compared to other R packages (such as lm.beta) or other software packages (like SPSS). To mimic such behaviours, either use standardize_parameters(model, method = "basic") to obtain post-hoc standardized parameters, or standardize the data with datawizard::standardize(data, force = TRUE) before fitting the model.
A data frame with the standardized parameters (Std_*, depending on the model type) and their CIs (CI_low and CI_high). Where applicable, standard errors (SEs) are returned as an attribute (attr(x, "standard_error")).
Hoffman, L. (2015). Longitudinal analysis: Modeling within-person fluctuation and change. Routledge.
Jones, J. A., & Waller, N. G. (2015). The normal-theory and asymptotic distribution-free (ADF) covariance matrix of standardized regression coefficients: theoretical extensions and finite sample behavior. Psychometrika, 80(2), 365-378.
Neter, J., Wasserman, W., & Kutner, M. H. (1989). Applied linear regression models.
Gelman, A. (2008). Scaling regression inputs by dividing by two standard deviations. Statistics in medicine, 27(15), 2865-2873.
Mood C. Logistic Regression: Why We Cannot Do What We Think We Can Do, and What We Can Do About It. European Sociological Review (2010) 26:67–82.
See also package vignette.
Other standardize: standardize_info()
model <- lm(len ~ supp * dose, data = ToothGrowth)
standardize_parameters(model, method = "refit")
standardize_parameters(model, method = "posthoc")
standardize_parameters(model, method = "smart")
standardize_parameters(model, method = "basic")

# Robust and 2 SD
standardize_parameters(model, robust = TRUE)
standardize_parameters(model, two_sd = TRUE)

model <- glm(am ~ cyl * mpg, data = mtcars, family = "binomial")
standardize_parameters(model, method = "refit")
standardize_parameters(model, method = "posthoc")
standardize_parameters(model, method = "basic", exponentiate = TRUE)

m <- lme4::lmer(mpg ~ cyl + am + vs + (1 | cyl), mtcars)
standardize_parameters(m, method = "pseudo", ci_method = "satterthwaite")

model <- rstanarm::stan_glm(rating ~ critical + privileges, data = attitude, refresh = 0)
standardize_posteriors(model, method = "refit", verbose = FALSE)
standardize_posteriors(model, method = "posthoc", verbose = FALSE)
standardize_posteriors(model, method = "smart", verbose = FALSE)
head(standardize_posteriors(model, method = "basic", verbose = FALSE))