check_outliers()
with method = "optics"
now returns a further refined
cluster selection, by passing the optics_xi
argument to dbscan::extractXi()
.
Deprecated arguments and alias-function-names have been removed.
Argument names in check_model()
that refer to plot-aesthetics (like
dot_size
) are now harmonized across easystats packages, meaning that
these have been renamed. They now follow the pattern aesthetic_type
, e.g.
size_dot
(instead of dot_size
).
check_convergence()
for glmmTMB models.
check_outliers()
did not warn that no numeric variables were found when only
the response variable was numeric, but all relevant predictors were not.
check_dag()
now also checks for colliders, and suggests removing them in the
printed output.
Minor revisions to the printed output of check_dag()
.
check_dag()
, to check DAGs for correct adjustment sets.
check_heterogeneity_bias()
gets a nested
argument. Furthermore, by
can
specify more than one variable, meaning that nested or cross-classified
model designs can also be tested for heterogeneity bias.
Patch release, to ensure that performance runs with older versions of datawizard on Mac OSX with R (old-release).
icc()
and r2_nakagawa()
get a null_model
argument. This can be useful
when computing R2 or ICC for mixed models, where the internal computation of
the null model fails, or when you have already fit the null model and want
to save time.
icc()
and r2_nakagawa()
get an approximation
argument indicating the
approximation method for the distribution-specific (residual) variance. See
Nakagawa et al. 2017 for details.
icc()
and r2_nakagawa()
get a model_component
argument indicating the
component for zero-inflation or hurdle models.
performance_rmse()
(or rmse()
) can now compute analytical and
bootstrapped confidence intervals. The function gains the following new arguments:
ci
, ci_method
and iterations
.
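As an illustration of what a bootstrapped confidence interval for the RMSE involves, here is a minimal base R sketch. It is not the package's implementation: the helper boot_rmse(), the percentile interval, and the choice of 200 iterations are assumptions made for this example only.

```r
# Sketch only: RMSE with a percentile bootstrap interval, base R
set.seed(123)
model <- lm(mpg ~ wt + hp, data = mtcars)

# RMSE on the response scale: square root of the mean squared residual
rmse_value <- sqrt(mean(residuals(model)^2))

# Hypothetical helper: refit the model on a resampled data set, return its RMSE
boot_rmse <- function(data) {
  idx <- sample(nrow(data), replace = TRUE)
  fit <- lm(mpg ~ wt + hp, data = data[idx, ])
  sqrt(mean(residuals(fit)^2))
}

iterations <- 200
boot_values <- replicate(iterations, boot_rmse(mtcars))

# 95% percentile interval of the bootstrapped RMSE values
ci <- quantile(boot_values, probs = c(0.025, 0.975), names = FALSE)
```

More iterations give a more stable interval at the cost of more refits, which is presumably why the number of iterations is exposed as an argument.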
New function r2_ferrari()
to compute Ferrari & Cribari-Neto's R2 for
generalized linear models, in particular beta-regression.
Improved documentation of some functions.
Fixed issue in check_model()
when model contained a transformed response
variable that was named like a valid R function name (e.g., lm(log(lapply) ~ x)
,
when data contained a variable named lapply
).
Fixed issue in check_predictions()
for linear models when response was
transformed as a ratio (e.g. lm(success/trials ~ x)
).
Fixed issue in r2_bayes()
for mixed models from rstanarm.
Aliases posterior_predictive_check()
and check_posterior_predictions()
for
check_predictions()
are deprecated.
Arguments named group
or group_by
will be deprecated in a future release.
Please use by
instead. This affects check_heterogeneity_bias()
in
performance.
Improved documentation and new vignettes added.
check_model()
gets a base_size
argument, to set the base font size for plots.
check_predictions()
for stanreg
and brmsfit
models now returns plots in
the usual style as for other models and no longer returns plots from
bayesplot::pp_check()
.
Updated the trained model that is used to predict distributions in
check_distribution()
.
check_model()
now falls back on normal Q-Q plots when a model is not supported
by the DHARMa package and simulated residuals cannot be calculated.
serp
from package serp.
simulate_residuals()
and check_residuals()
, to simulate and check residuals
from generalized linear (mixed) models. Simulating residuals is based on the
DHARMa package, and objects returned by simulate_residuals()
inherit from
the DHARMa
class, and thus can be used with any functions from the DHARMa
package. However, there are also implementations in the performance package,
such as check_overdispersion()
, check_zeroinflation()
, check_outliers()
or check_model()
.
Plots for check_model()
have been improved. The Q-Q plots are now based
on simulated residuals from the DHARMa package for non-Gaussian models, thus
providing more accurate and informative plots. The half-normal QQ plot for
generalized linear models can still be obtained by setting the new argument
residual_type = "normal"
.
The following functions now support simulated residuals (from simulate_residuals()
)
or objects returned from DHARMa::simulateResiduals()
:
check_overdispersion()
check_zeroinflation()
check_outliers()
check_model()
Improved error messages for check_model()
when QQ-plots cannot be created.
check_distribution()
is more stable for possibly sparse data.
Fixed issue in check_normality()
for t-tests.
Fixed issue in check_itemscale()
for data frame inputs, when factor_index
was not a named vector.
r2()
for models of class glmmTMB
without random effects now returns the
correct r-squared value for non-mixed models.
check_itemscale()
now also accepts data frames as input. In this case,
factor_index
must be specified, which must be a numeric vector of the same
length as the number of columns in x
, where each element is the index of the
factor to which the respective column in x
belongs.
check_itemscale()
gets a print_html()
method.
Clarification in the documentation of the estimator
argument for
performance_aic()
.
Improved plots for overdispersion-checks for negative-binomial models from
package glmmTMB (affects check_overdispersion()
and check_model()
).
Improved detection rates for singularity in check_singularity()
for models
from package glmmTMB.
For models of class glmmTMB
, deviance residuals are now used in the
check_model()
plot.
Improved (easier to understand) error messages for check_model()
,
check_collinearity()
and check_outliers()
for models with non-numeric
response variables.
r2_kullback()
now gives an informative error for non-supported models.
Fixed issue in binned_residuals()
for models with binary outcome, where
on rare occasions empty bins could occur.
performance_score()
should no longer fail for models where scoring rules
can't be calculated. Instead, an informative message is returned.
check_outliers()
now properly accepts the percentage_central
argument when
using the "mcd"
method.
Fixed edge cases in check_collinearity()
and check_outliers()
for models
with response variables of classes Date
, POSIXct
, POSIXlt
or difftime
.
Fixed issue with check_model()
for models of package quantreg.
check_predictions()
for models from binomial family,
to get comparable plots for different ways of outcome specification. Now,
if the outcome is a proportion, or defined as matrix of trials and successes,
the produced plots are the same (because the models should be the same, too).
Fixed CRAN check errors.
Fixed issue with binned_residuals()
for models with binomial family, where
the outcome was a proportion.
binned_residuals()
gains a few new arguments to control the residuals used
for the test, as well as different options to calculate confidence intervals
(namely, ci_type
, residuals
, ci
and iterations
). The default values
to compute binned residuals have changed. Default residuals are now "deviance"
residuals (and no longer "response" residuals). Default confidence intervals
are now "exact" intervals (and no longer based on Gaussian approximation).
Use ci_type = "gaussian"
and residuals = "response"
to get the old defaults.
binned_residuals()
- like check_model()
- gains a show_dots
argument to
show or hide data points that lie inside error bounds. This is particularly
useful for models with many observations, where generating the plot would be
very slow.
nestedLogit
models.
check_outliers()
for method "ics"
now detects number of available cores
for parallel computing via the "mc.cores"
option. This is more robust than
the previous method, which used parallel::detectCores()
. Now you should
set the number of cores via options(mc.cores = 4)
.
check_model()
for models that used data sets with
variables of class "haven_labelled"
.
More informative message for test_*()
functions that "nesting" only refers
to fixed effects parameters and currently ignores random effects when detecting
nested models.
check_outliers()
for "ICS"
method is now more stable and less likely to
fail.
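The "mc.cores" option mentioned above for the "ics" method is an ordinary R option. The sketch below shows how it can be set and read back in base R; the value 2 and the fallback of 1 core are assumptions chosen for the example, not defaults documented here.

```r
# Set the option and keep the previous value so it can be restored later
old <- options(mc.cores = 2)

# Read the option, falling back to a single core if it is unset
n_cores <- getOption("mc.cores", default = 1L)

# Restore the previous setting
options(old)
```

Reading an explicit option avoids relying on hardware detection, which can report more cores than a session is actually allowed to use.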
check_convergence()
now works for parsnip _glm
models.
check_collinearity()
did not work for hurdle- or zero-inflated models of
package pscl when model had no explicitly defined formula for the
zero-inflation model.
icc()
and r2_nakagawa()
gain a ci_method
argument, to either calculate
confidence intervals using boot::boot()
(instead of lme4::bootMer()
) when
ci_method = "boot"
or analytical confidence intervals
(ci_method = "analytical"
). Use ci_method = "boot"
when the default method
fails to compute confidence intervals and use ci_method = "analytical"
if
bootstrapped intervals cannot be calculated at all. Note that the default
computation method is preferred.
check_predictions()
accepts a bandwidth
argument (smoothing bandwidth),
which is passed down to the plot()
method's density estimation.
check_predictions()
gains a type
argument, which is passed down to the
plot()
method to change plot-type (density or discrete dots/intervals).
By default, type
is set to "default"
for models without discrete outcomes,
and else type = "discrete_interval"
.
performance_accuracy()
now includes confidence intervals, and reports those
by default (the standard error is no longer reported, but still included).
check_collinearity()
for fixest models that used i()
to create interactions in formulas.
item_discrimination()
, to calculate the discrimination of a scale's items.
model_performance()
, check_overdispersion()
, check_outliers()
and r2()
now work with objects of class fixest_multi
(@etiennebacher, #554).
model_performance()
can now return the "Weak instruments" statistic and
p-value for models of class ivreg
with metrics = "weak_instruments"
(@etiennebacher, #560).
Support for mclogit
models.
test_*()
functions now automatically fit a null-model when only one model
object was provided for testing multiple models.
Warnings in model_performance()
for unsupported objects of class
BFBayesFactor
can now be suppressed with verbose = FALSE
.
check_predictions()
no longer fails with issues when re_formula = NULL
for mixed models, but instead gives a warning and tries to compute posterior
predictive checks with re_formula = NA
.
check_outliers()
now also works for meta-analysis models from packages
metafor and meta.
plot()
for performance::check_model()
no longer produces a normal QQ plot
for GLMs. Instead, it now shows a half-normal QQ plot of the absolute value
of the standardized deviance residuals.
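To make the half-normal QQ plot concrete, here is a small base R sketch of the idea: sorted absolute standardized deviance residuals are plotted against half-normal quantiles. This only illustrates the concept and is not the plotting code used by the package; the example model and the use of ppoints() for plotting positions are assumptions.

```r
# Example GLM on a built-in data set (assumption for illustration)
model <- glm(am ~ wt, data = mtcars, family = binomial())

# Absolute standardized deviance residuals, sorted ascending
abs_res <- sort(abs(rstandard(model, type = "deviance")))
n <- length(abs_res)

# Half-normal quantiles: if Z is standard normal, |Z| has quantile
# function qnorm((1 + p) / 2)
theoretical <- qnorm((1 + ppoints(n)) / 2)

# Draw the plot only in an interactive session
if (interactive()) {
  plot(theoretical, abs_res,
       xlab = "Half-normal quantiles",
       ylab = "|Standardized deviance residuals|")
  abline(0, 1, lty = 2)
}
```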
print()
method for check_collinearity()
, which could mix
up the correct order of parameters.
insight::get_data()
to meet forthcoming changes in the
insight package.
check_collinearity()
now accepts NULL
for the ci
argument.
item_difficulty()
with detecting the maximum values of an
item set. Furthermore, item_difficulty()
gets a maximum_value
argument
in case no item contains the maximum value due to missing values.
icc()
and r2_nakagawa()
get ci
and iterations
arguments, to compute
confidence intervals for the ICC or R2, based on bootstrapped sampling.
r2()
gets ci
, to compute (analytical) confidence intervals for the R2.
The model underlying check_distribution()
was now also trained to detect
cauchy, half-cauchy and inverse-gamma distributions.
model_performance()
can now include the ICC for Bayesian models.
verbose
didn't work for r2_bayes()
with BFBayesFactor
objects.
Fixed issues in check_model()
for models with convergence issues that lead
to NA
values in residuals.
Fixed bug in check_outliers
whereby passing multiple elements to the
threshold list generated an error (#496).
test_wald()
now warns the user about inappropriate F test and calls
test_likelihoodratio()
for binomial models.
Fixed edge case for usage of parallel::detectCores()
in check_outliers()
.
The minimum needed R version has been bumped to 3.6
.
The alias performance_lrt()
was removed. Use test_lrt()
or
test_likelihoodratio()
.
check_sphericity_bartlett()
, check_kmo()
, check_factorstructure()
and
check_clusterstructure()
.
check_normality()
, check_homogeneity()
and check_symmetry()
now work
for htest
objects.
Print method for check_outliers()
changed significantly: now states the
methods, thresholds, and variables used, reports outliers per variable (for
univariate methods) as well as any observation flagged for several
variables/methods. Includes a new optional ID argument to add along the
row number in the output (@rempsyc #443).
check_outliers()
now uses more conventional outlier thresholds. The IQR
and confidence interval methods now gain improved distance scores that
are continuous instead of discrete.
Fixed wrong z-score values when using a vector instead of a data frame in
check_outliers()
(#476).
Fixed cronbachs_alpha()
for objects from parameters::principal_component()
.
print()
methods for model_performance()
and compare_performance()
get a
layout
argument, which can be "horizontal"
(default) or "vertical"
, to
switch the layout of the printed table.
Improved speed performance for check_model()
and some other
performance_*()
functions.
Improved support for models of class geeglm
.
check_model()
gains a show_dots
argument, to show or hide data points.
This is particularly useful for models with many observations, where generating
the plot would be very slow.
model_performance()
output for kmeans
objects
(#453)
icc()
is now named "unadjusted" ICC.
performance_cv()
for cross-validated model performance.
check_overdispersion()
gets a plot()
method.
check_outliers()
now also works for models of classes gls
and lme
. As a
consequence, check_model()
will no longer fail for these models.
check_collinearity()
now includes the confidence intervals for the VIFs and
tolerance values.
model_performance()
now also includes within-subject R2 measures, where
applicable.
Improved handling of random effects in check_normality()
(i.e. when argument
effects = "random"
).
check_predictions()
did not work for GLMs with matrix-response.
check_predictions()
did not work for logistic regression models (i.e. models
with binary response) from package glmmTMB.
item_split_half()
did not work when the input data frame or matrix only
contained two columns.
Fixed wrong computation of BIC
in model_performance()
when models had
transformed response values.
Fixed issues in check_model()
for GLMs with matrix-response.
check_concurvity()
, which returns GAM concurvity measures (comparable to
collinearity checks).
check_predictions()
, check_collinearity()
and check_outliers()
now
support (mixed) regression models from BayesFactor
.
check_zeroinflation()
now also works for lme4::glmer.nb()
models.
check_collinearity()
better supports GAM models.
test_performance()
now calls test_lrt()
or test_wald()
instead of
test_vuong()
when package CompQuadForm is missing.
test_performance()
and test_lrt()
now compute the corrected log-likelihood
when models with transformed response variables (such as log- or
sqrt-transformations) are passed to the functions.
performance_aic()
now corrects the AIC value for models with transformed
response variables. This also means that comparing models using
compare_performance()
allows comparisons of AIC values for models with and
without transformed response variables.
Also, model_performance()
now corrects both AIC and BIC values for models
with transformed response variables.
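The correction for transformed responses can be understood through the change-of-variables (Jacobian) adjustment of the log-likelihood. The base R sketch below shows this for a log-transformed response. It is a generic illustration under that assumption and is not taken from the package's source.

```r
# Two models for the same outcome, one on the raw and one on the log scale
model_raw <- lm(mpg ~ wt, data = mtcars)
model_log <- lm(log(mpg) ~ wt, data = mtcars)

# Jacobian of the log transformation: d log(y) / dy = 1 / y
log_jacobian <- sum(log(1 / mtcars$mpg))

# Log-likelihood of the log-model expressed on the original response scale
loglik_corrected <- as.numeric(logLik(model_log)) + log_jacobian

# Number of estimated parameters (coefficients plus residual variance)
k <- attr(logLik(model_log), "df")

# AIC that is comparable with AIC(model_raw)
aic_corrected <- -2 * loglik_corrected + 2 * k
```

Without the adjustment, AIC(model_log) and AIC(model_raw) refer to likelihoods of different variables and cannot be compared directly.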
The print()
method for binned_residuals()
now prints a short summary of
the results (and no longer generates a plot). A plot()
method was added to
generate plots.
The plot()
output for check_model()
was revised:
For binomial models, the constant variance plot was omitted, and a binned residuals plot included.
The density-plot that showed normality of residuals was replaced by the posterior predictive check plot.
model_performance()
for models from lme4 did not report AICc when
requested.
r2_nakagawa()
messed up order of group levels when by_group
was TRUE
.
The ci
-level in r2()
for Bayesian models now defaults to 0.95
, to be in
line with the latest changes in the bayestestR package.
S3-method dispatch for pp_check()
was revised, to avoid problems with the
bayesplot package, where the generic is located.
Minor revisions to wording for messages from some of the check-functions.
posterior_predictive_check()
and check_predictions()
were added as aliases
for pp_check()
.
check_multimodal()
and check_heterogeneity_bias()
. These functions will be
removed from the parameters package in the future.
r2()
for linear models can now compute confidence intervals, via the ci
argument.
Fixed issues in check_model()
for Bayesian models.
Fixed issue in pp_check()
for models with transformed response variables, so
now predictions and observed response values are on the same (transformed)
scale.
check_outliers()
has new ci
(or hdi
, eti
) method to filter based on
Confidence/Credible intervals.
compare_performance()
now also accepts a list of model objects.
performance_roc()
now also works for binomial models from other classes than
glm.
Several functions, like icc()
or r2_nakagawa()
, now have an
as.data.frame()
method.
check_collinearity()
now correctly handles objects from forthcoming afex
update.
performance_mae()
to calculate the mean absolute error.
Fixed issue with "data length differs from size of matrix"
warnings in
examples in forthcoming R 4.2.
Fixed issue in check_normality()
for models with sample size larger than
5,000 observations.
Fixed issue in check_model()
for glmmTMB models.
Fixed issue in check_collinearity()
for glmmTMB models with
zero-inflation, where the zero-inflated model was an intercept-only model.
Add support for model_fit
(tidymodels).
model_performance
supports kmeans models.
Give more informative warning when r2_bayes()
for BFBayesFactor objects
can't be calculated.
Several check_*()
functions now return informative messages for invalid
model types as input.
r2()
supports mhurdle
(mhurdle) models.
Added print()
methods for more classes of r2()
.
The performance_roc()
and performance_accuracy()
functions unfortunately
had spelling mistakes in the output columns: Sensitivity was called
Sensivity and Specificity was called Specifity. We think these are
understandable mistakes :-)
check_model()
check_model()
gains more arguments, to customize plot appearance.
Added option to detrend QQ/PP plots in check_model()
.
model_performance()
The metrics
argument from model_performance()
and compare_performance()
gains an "AICc"
option, to also compute the 2nd order AIC.
"R2_adj"
is now an explicit option in the metrics
argument from
model_performance()
and compare_performance()
.
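For readers unfamiliar with the "AICc" option mentioned above, the second-order AIC is commonly defined as AIC plus a small-sample penalty. The following base R sketch uses that textbook formula; whether the package computes it in exactly this way is an assumption.

```r
model <- lm(mpg ~ wt + hp, data = mtcars)

# Number of estimated parameters, including the residual variance
k <- attr(logLik(model), "df")
# Number of observations used in the fit
n <- nobs(model)

# Second-order AIC: AIC plus the small-sample correction term
aicc <- AIC(model) + (2 * k * (k + 1)) / (n - k - 1)
```

The correction term shrinks towards zero as n grows, so AICc and AIC agree for large samples.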
The default-method for r2()
now tries to compute an r-squared for all models
that have no specific r2()
-method yet, by using the following formula:
1 - sum((y - y_hat)^2) / sum((y - y_bar)^2)
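The formula above can be checked by hand in base R. The sketch below assumes that y is the observed response, y_hat the fitted values and y_bar the mean of the response, and compares the result with the R2 reported by summary() for a linear model.

```r
model <- lm(mpg ~ wt, data = mtcars)

y <- mtcars$mpg          # observed response
y_hat <- fitted(model)   # fitted values
y_bar <- mean(y)         # mean of the response

# One minus the ratio of residual to total sum of squares
r2_manual <- 1 - sum((y - y_hat)^2) / sum((y - y_bar)^2)

# For an ordinary linear model this matches the classical R2
all.equal(r2_manual, summary(model)$r.squared)
```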
The column name Parameter
in check_collinearity()
is now more
appropriately named Term
.
test_likelihoodratio()
now correctly sorts models with identical fixed
effects part, but different other model parts (like zero-inflation).
Fixed incorrect computation of models from inverse-Gaussian families, or
Gaussian families fitted with glm()
.
Fixed issue in performance_roc()
for models where outcome was not 0/1
coded.
Fixed issue in performance_accuracy()
for logistic regression models when
method = "boot"
.
cronbachs_alpha()
did not work for matrix
-objects, as stated in the docs.
It now does.
compare_performance()
doesn't return the models' Bayes Factors, now returned
by test_performance()
and test_bf()
.
test_vuong()
, to compare models using Vuong's (1989) Test.
test_bf()
, to compare models using Bayes factors.
test_likelihoodratio()
as an alias for performance_lrt()
.
test_wald()
, as a rough approximation for the LRT.
test_performance()
, to run the most relevant and appropriate tests based on
the input.
performance_lrt()
performance_lrt()
gets an alias test_likelihoodratio()
.
Does not return AIC/BIC now (as they are not related to LRT per se and can be easily obtained with other functions).
Now contains a column with the difference in degrees of freedom between models.
Fixed column names for consistency.
model_performance()
ivreg
.
Revised computation of performance_mse()
, to ensure that it's always based
on response residuals.
performance_aic()
is now more robust.
Fixed issue in icc()
and variance_decomposition()
for multivariate
response models, where not all model parts contained random effects.
Fixed issue in compare_performance()
with duplicated rows.
check_collinearity()
no longer breaks for models with rank deficient model
matrix, but gives a warning instead.
Fixed issue in check_homogeneity()
for method = "auto"
, which wrongly
tested the response variable, not the residuals.
Fixed issue in check_homogeneity()
for edge cases where predictor had
non-syntactic names.
check_collinearity()
gains a verbose
argument, to toggle warnings and
messages.
model_performance()
now supports margins
, gamlss
, stanmvreg
and
semLme
.
r2_somers()
, to compute Somers' Dxy rank-correlation as R2-measure for
logistic regression models.
display()
, to print output from package-functions into different formats.
print_md()
is an alias for display(format = "markdown")
.
model_performance()
model_performance()
is now more robust and doesn't fail if an index could
not be computed. Instead, it returns all indices that were possible to
calculate.
model_performance()
gains a default-method that catches all model objects
not previously supported. If model object is also not supported by the
default-method, a warning is given.
model_performance()
for metafor-models now includes the degrees of freedom
for Cochran's Q.
performance_mse()
and performance_rmse()
now always try to return the
(R)MSE on the response scale.
performance_accuracy()
now accepts all types of linear or logistic
regression models, even if these are not of class lm
or glm
.
performance_roc()
now accepts all types of logistic regression models, even
if these are not of class glm
.
r2()
for mixed models and r2_nakagawa()
gain a tolerance
-argument, to
set the tolerance level for singularity checks when computing random effect
variances for the conditional r-squared.
Fixed issue in icc()
introduced in the last update that made lme
-models
fail.
Fixed issue in performance_roc()
for models with factors as response.
model_performance()
and compare_performance()
were
changed to be in line with the easystats naming convention: LOGLOSS
is now
Log_loss
, SCORE_LOG
is Score_log
and SCORE_SPHERICAL
is now
Score_spherical
.
r2_posterior()
for Bayesian models to obtain posterior distributions of
R-squared.
r2_bayes()
works with Bayesian models from BayesFactor
( #143 ).
model_performance()
works with Bayesian models from BayesFactor
( #150 ).
model_performance()
now also includes the residual standard deviation.
Improved formatting for Bayes factors in compare_performance()
.
compare_performance()
with rank = TRUE
doesn't use the BF
values when
BIC
are present, to prevent "double-dipping" of the BIC values (#144).
The method
argument in check_homogeneity()
gains a "levene"
option, to
use Levene's Test for homogeneity.
compare_performance()
when ...
arguments were function calls to
regression objects, instead of direct function calls.
r2()
and icc()
support semLME
models (package smicd).
check_heteroscedasticity()
should now also work with zero-inflated mixed
models from glmmTMB and GLMMadaptive.
check_outliers()
now returns a logical vector. Original numerical vector is
still accessible via as.numeric()
.
pp_check()
to compute posterior predictive checks for frequentist models.
Fixed issue with incorrect labeling of groups from icc()
when by_group = TRUE
.
Fixed issue in check_heteroscedasticity()
for mixed models where sigma could
not be calculated in a straightforward way.
Fixed issues in check_zeroinflation()
for MASS::glm.nb()
.
Fixed CRAN check issues.
icc()
now also computes a "classical" ICC for brmsfit
models. The former
way of calculating an "ICC" for brmsfit
models is now available as new
function called variance_decomposition()
.
Fix issue with new version of bigutilsr for check_outliers()
.
Fix issue with model order in performance_lrt()
.
model_performance.rma()
now includes results from heterogeneity test for
meta-analysis objects.
check_normality()
now also works for mixed models (with the limitation that
studentized residuals are used).
check_normality()
gets an effects
-argument for mixed models, to check
random effects for normality.
Fixed issue in performance_accuracy()
for binomial models when response
variable had non-numeric factor levels.
Fixed issues in performance_roc()
, which printed 1 - AUC instead of AUC.
Minor revisions to model_performance()
to meet changes in mlogit package.
Support for bayesx
models.
icc()
gains a by_group
argument, to compute ICCs per different group
factors in mixed models with multiple levels or cross-classified design.
r2_nakagawa()
gains a by_group
argument, to compute explained variance at
different levels (following the variance-reduction approach by Hox 2010).
performance_lrt()
now works on lavaan objects.
Fix issues in some functions for models with logical dependent variable.
Fix bug in check_itemscale()
, which caused multiple computations of skewness
statistics.
Fix issues in r2()
for gam models.
model_performance()
and r2()
now support rma-objects from package
metafor, mlm and bife models.
compare_performance()
gets a bayesfactor
argument, to include or exclude
the Bayes factor for model comparisons in the output.
Added r2.aov()
.
Fixed issue in performance_aic()
for models from package survey, which
returned three different AIC values. Now only the AIC value is returned.
Fixed issue in check_collinearity()
for glmmTMB models when zero-inflated
formula only had one predictor.
Fixed issue in check_model()
for lme models.
Fixed issue in check_distribution()
for brmsfit models.
Fixed issue in check_heteroscedasticity()
for aov objects.
Fixed issues for lmrob and glmrob objects.