Title: | Easy Access to Model Information for Various Model Objects |
---|---|
Description: | A tool to provide an easy, intuitive and consistent access to information contained in various R models, like model formulas, model terms, information about random effects, data that was used to fit the model or data from response variables. 'insight' mainly revolves around two types of functions: Functions that find (the names of) information, starting with 'find_', and functions that get the underlying data, starting with 'get_'. The package has a consistent syntax and works with many different model objects, where otherwise functions to access these information are missing. |
Authors: | Daniel Lüdecke [aut, cre] , Dominique Makowski [aut, ctb] , Indrajeet Patil [aut, ctb] , Philip Waggoner [aut, ctb] , Mattan S. Ben-Shachar [aut, ctb] , Brenton M. Wiernik [aut, ctb] , Vincent Arel-Bundock [aut, ctb] , Etienne Bacher [aut, ctb] , Alex Hayes [rev] , Grant McDermott [ctb] , Rémi Thériault [ctb] , Alex Reinhart [ctb] |
Maintainer: | Daniel Lüdecke <[email protected]> |
License: | GPL-3 |
Version: | 0.99.0.16 |
Built: | 2024-11-20 15:28:17 UTC |
Source: | https://github.com/easystats/insight |
Small helper that checks if all objects are supported (regression) model objects and of same class.
all_models_equal(..., verbose = FALSE)

all_models_same_class(..., verbose = FALSE)
... |
A list of objects. |
verbose |
Toggle off warnings. |
A logical, TRUE if all objects passed via ... are supported model objects of the same class.
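As a rough illustration of what "same class" means here, the check can be approximated in base R by comparing the first class of each object. This is only a sketch under that assumption, not the implementation behind all_models_same_class(); the helper name same_first_class() is hypothetical, and it does not test whether the objects are supported models.

```r
# Hypothetical helper (not part of insight): TRUE if all objects share the
# same first entry of their class attribute.
same_first_class <- function(...) {
  classes <- vapply(list(...), function(obj) class(obj)[1], character(1))
  length(unique(classes)) == 1
}

m1 <- lm(mpg ~ wt + cyl + vs, data = mtcars)
m2 <- lm(mpg ~ wt + cyl, data = mtcars)
m4 <- glm(vs ~ wt, family = binomial(), data = mtcars)

res_lm <- same_first_class(m1, m2)    # two "lm" objects
res_mixed <- same_first_class(m1, m4) # "lm" compared with "glm"
```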
data(mtcars)
data(sleepstudy, package = "lme4")

m1 <- lm(mpg ~ wt + cyl + vs, data = mtcars)
m2 <- lm(mpg ~ wt + cyl, data = mtcars)
m3 <- lme4::lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)
m4 <- glm(formula = vs ~ wt, family = binomial(), data = mtcars)

all_models_same_class(m1, m2)
all_models_same_class(m1, m2, m3)
all_models_same_class(m1, m4, m2, m3, verbose = TRUE)
all_models_same_class(m1, m4, mtcars, m2, m3, verbose = TRUE)
Function to export data frames into tables, which can be printed
to the console, or displayed in markdown or HTML format (and thereby, exported
to other formats like Word or PDF). The table width is automatically adjusted
to fit into the width of the display device (e.g., width of console). Use
the table_width argument to control this behaviour.
apply_table_theme(out, x, theme = "default", sub_header_positions = NULL)

export_table(
  x,
  sep = " | ",
  header = "-",
  cross = NULL,
  empty_line = NULL,
  digits = 2,
  protect_integers = TRUE,
  missing = "",
  width = NULL,
  format = NULL,
  title = NULL,
  caption = title,
  subtitle = NULL,
  footer = NULL,
  align = NULL,
  by = NULL,
  zap_small = FALSE,
  table_width = "auto",
  remove_duplicates = FALSE,
  verbose = TRUE,
  ...
)
out |
A |
x |
A data frame. May also be a list of data frames, to export multiple data frames into multiple tables. |
theme |
The theme to apply to the table. One of |
sub_header_positions |
A vector of row positions to apply a border to. Currently particular for internal use of other easystats packages. |
sep |
Column separator. |
header |
Header separator. Can be |
cross |
Character that is used where separator and header lines cross. |
empty_line |
Separator used for empty lines. If |
digits |
Number of digits for rounding or significant figures. May also
be |
protect_integers |
Should integers be kept as integers (i.e., without decimals)? |
missing |
Value by which |
width |
Refers to the width of columns (with numeric values). Can be
either |
format |
Name of output-format, as string. If |
title , caption , subtitle
|
Table title (same as caption) and subtitle, as strings. If |
footer |
Table footer, as string. For markdown-formatted tables, table
footers, due to the limitation in markdown rendering, are actually just a
new text line under the table. If |
align |
Column alignment. For markdown-formatted tables, the default
|
by |
Name of column in |
zap_small |
Logical, if |
table_width |
Numeric,
|
remove_duplicates |
Logical, if |
verbose |
Toggle messages and warnings. |
... |
Currently not used. |
If format = "text" (or NULL), a formatted character string is returned. format = "markdown" (or "md") returns a character string of class knitr_kable, which renders nicely in markdown files. format = "html" returns a gt object (created by the gt package), which, by default, is displayed in the IDE's viewer pane or default browser. This object can be further modified with the various gt-functions.
The values for caption, subtitle and footer can also be provided as attributes of x, e.g. if caption = NULL and x has attribute table_caption, the value for this attribute will be used as table caption. table_subtitle is the attribute for subtitle, and table_footer for footer.
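The attribute fallback described above can be pictured with a small base R sketch. The helper resolve_caption() is hypothetical and only mimics the documented lookup order (an explicit argument first, then the table_caption attribute); it is not the code that export_table() uses internally.

```r
# Hypothetical helper (not part of insight): use the explicit caption if
# given, otherwise fall back to the "table_caption" attribute of x.
resolve_caption <- function(x, caption = NULL) {
  if (is.null(caption)) attr(x, "table_caption") else caption
}

x <- head(iris)
attr(x, "table_caption") <- "First rows of iris"

from_attr <- resolve_caption(x)
from_arg <- resolve_caption(x, caption = "Explicit caption")
```

The same pattern would apply to table_subtitle and table_footer.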
Vignettes Formatting, printing and exporting tables and Formatting model parameters.
export_table(head(iris))
export_table(head(iris), cross = "+")
export_table(head(iris), sep = " ", header = "*", digits = 1)

# split longer tables
export_table(head(iris), table_width = 30)

# colored footers
data(iris)
x <- as.data.frame(iris[1:5, ])
attr(x, "table_footer") <- c("This is a yellow footer line.", "yellow")
export_table(x)

attr(x, "table_footer") <- list(
  c("\nA yellow line", "yellow"),
  c("\nAnd a red line", "red"),
  c("\nAnd a blue line", "blue")
)
export_table(x)

attr(x, "table_footer") <- list(
  c("Without the ", "yellow"),
  c("new-line character ", "red"),
  c("we can have multiple colors per line.", "blue")
)
export_table(x)

# column-width
d <- data.frame(
  x = c(1, 2, 3),
  y = c(100, 200, 300),
  z = c(10000, 20000, 30000)
)
export_table(d)
export_table(d, width = 8)
export_table(d, width = c(x = 5, z = 10))
export_table(d, width = c(x = 5, y = 5, z = 10), align = "lcr")
Checking if needed package is installed
check_if_installed(
  package,
  reason = "for this function to work",
  stop = TRUE,
  minimum_version = NULL,
  quietly = FALSE,
  prompt = interactive(),
  ...
)
package |
A character vector naming the package(s), whose installation needs to be checked in any of the libraries. |
reason |
A phrase describing why the package is needed. The default is a generic description. |
stop |
Logical that decides whether the function should stop if the needed package is not installed. |
minimum_version |
A character vector, representing the minimum package
version that is required for each package. Should be of same length as
|
quietly |
Logical, if |
prompt |
If |
... |
Currently ignored |
If stop = TRUE, and package is not yet installed, the function stops and throws an error. Else, a named logical vector is returned, indicating which of the packages are installed, and which not.
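To make the shape of the return value concrete, the following base R sketch builds a comparable named logical vector with requireNamespace() and performs a minimum-version comparison with utils::packageVersion(). It is an approximation for illustration, not the internals of check_if_installed(); the package name "nonexistent_package" is assumed not to be installed.

```r
pkgs <- c("stats", "nonexistent_package")

# Named logical vector: TRUE for packages that can be loaded, FALSE otherwise.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# A minimum-version check compares the installed version against a string.
new_enough <- utils::packageVersion("stats") >= "99.8.7"
```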
check_if_installed("insight")
try(check_if_installed("datawizard", stop = FALSE))
try(check_if_installed("rstanarm", stop = FALSE))
try(check_if_installed("nonexistent_package", stop = FALSE))
try(check_if_installed("insight", minimum_version = "99.8.7"))
try(check_if_installed(c("nonexistent", "also_not_here"), stop = FALSE))
try(check_if_installed(c("datawizard", "rstanarm"), stop = FALSE))
try(check_if_installed(
  c("datawizard", "rstanarm"),
  minimum_version = c(NA, "2.21.1"),
  stop = FALSE
))
This function "cleans" names of model terms (or a character vector with such names) by removing patterns like log() or as.factor() etc.
clean_names(x, ...)

## S3 method for class 'character'
clean_names(x, include_names = FALSE, ...)
x |
A fitted model, or a character vector. |
... |
Currently not used. |
include_names |
Logical, if |
The "cleaned" variable names as character vector, i.e. patterns like s() for splines or log() are removed from the model terms.
Typically, this method is intended to work on character vectors, in order to remove patterns that obscure the variable names. For convenience, it is also possible to call clean_names() on a model object. If x is a regression model, this function is (almost) equal to calling find_variables(). The main difference is that clean_names() always returns a character vector, while find_variables() returns a list of character vectors, unless flatten = TRUE. See 'Examples'.
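The difference between a model term and its "cleaned" variable name can also be seen with base R alone. The sketch below uses terms() and all.vars() as stand-ins; it only illustrates the idea of stripping wrappers such as log() and as.factor(), and does not reproduce how clean_names() is implemented.

```r
f <- counts ~ log(outcome) + as.factor(treatment)

# Terms as written in the formula, including the transformations.
term_labels <- attr(terms(f), "term.labels")

# Underlying variable names, with log() and as.factor() stripped.
variable_names <- all.vars(f)
```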
# example from ?stats::glm
counts <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome <- as.numeric(gl(3, 1, 9))
treatment <- gl(3, 3)
m <- glm(counts ~ log(outcome) + as.factor(treatment), family = poisson())
clean_names(m)

# difference "clean_names()" and "find_variables()"
data(cbpp, package = "lme4")
m <- lme4::glmer(
  cbind(incidence, size - incidence) ~ period + (1 | herd),
  data = cbpp,
  family = binomial
)
clean_names(m)
find_variables(m)
find_variables(m, flatten = TRUE)
This function "cleans" names of model parameters by removing patterns like "r_" or "b[]" (mostly applicable to Stan models) and adding columns with information to which group or component parameters belong (i.e. fixed or random, count or zero-inflated...).
The main purpose of this function is to easily filter and select model parameters, in particular of - but not limited to - posterior samples from Stan models, depending on certain characteristics. This might be useful when only selective results should be reported or results from all parameters should be filtered to return only certain results (see print_parameters()).
clean_parameters(x, ...)
x |
A fitted model. |
... |
Currently not used. |
The Effects column indicates if a parameter is a fixed or random effect. The Component can either be conditional or zero_inflated. For models with random effects, the Group column indicates the grouping factor of the random effects. For multivariate response models from brms or rstanarm, an additional Response column is included, to indicate which parameters belong to which response formula. Furthermore, a Cleaned_Parameter column is returned that contains "human readable" parameter names (which are mostly identical to Parameter, except for models from brms or rstanarm, or for specific terms like smooth- or spline-terms).
A data frame with "cleaned" parameter names and information on the effects, component and group that the parameters belong to. To be consistent across different models, the returned data frame always has at least four columns: Parameter, Effects, Component and Cleaned_Parameter. See 'Details'.
model <- download_model("brms_zi_2")
clean_parameters(model)
Convenient function that formats columns in data frames with color codes, where the color is chosen based on certain conditions. Columns are then printed in color in the console.
color_if(
  x,
  columns,
  predicate = `>`,
  value = 0,
  color_if = "green",
  color_else = "red",
  digits = 2
)

colour_if(
  x,
  columns,
  predicate = `>`,
  value = 0,
  colour_if = "green",
  colour_else = "red",
  digits = 2
)
x |
A data frame |
columns |
Character vector with column names of |
predicate |
A function that takes |
value |
The comparator. May be used in conjunction with |
color_if , colour_if
|
Character vector, indicating the color code used to
format values in |
color_else , colour_else
|
See |
digits |
Digits for rounded values. |
The predicate-function simply works like this:
which(predicate(x[, columns], value))
x, where columns matched by predicate are wrapped into color codes.
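The predicate expression above can be run on its own in base R to see which rows would be matched. This sketch only evaluates the which(predicate(x[, columns], value)) step quoted in the text; it does not apply any color codes, and the variable names matched and matched_custom are illustrative.

```r
x <- iris[1:10, ]

# Built-in comparison operator as predicate: rows with Sepal.Length > 5.
matched <- which(`>`(x[, "Sepal.Length"], 5))

# Custom predicate with two arguments; the second one (value) is ignored.
p <- function(x, y) {
  x >= 4.9 & x <= 5.1
}
matched_custom <- which(p(x[, "Sepal.Length"], 0))
```

The rows in matched would receive the color given by color_if, all others the color given by color_else.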
# all values in Sepal.Length larger than 5 in green, all remaining in red
x <- color_if(iris[1:10, ], columns = "Sepal.Length", predicate = `>`, value = 5)
x
cat(x$Sepal.Length)

# all levels "setosa" in Species in green, all remaining in red
x <- color_if(iris, columns = "Species", predicate = `==`, value = "setosa")
cat(x$Species)

# own function, argument "value" not needed here
p <- function(x, y) {
  x >= 4.9 & x <= 5.1
}
# all values in Sepal.Length between 4.9 and 5.1 in green, all remaining in red
x <- color_if(iris[1:10, ], columns = "Sepal.Length", predicate = p)
cat(x$Sepal.Length)
Remove empty strings from character
compact_character(x)
x |
A single character or a vector of characters. |
A character or a character vector with empty strings removed.
compact_character(c("x", "y", NA))
compact_character(c("x", "NULL", "", "y"))
Remove empty elements from lists
compact_list(x, remove_na = FALSE)
x |
A list or vector. |
remove_na |
Logical to decide if |
compact_list(list(NULL, 1, c(NA, NA)))
compact_list(c(1, NA, NA))
compact_list(c(1, NA, NA), remove_na = TRUE)
display() is a generic function to export data frames into various table formats (like plain text, markdown, ...). print_md() usually is a convenient wrapper for display(format = "markdown"). Similarly, print_html() is a shortcut for display(format = "html"). See the documentation for the specific objects' classes.
display(object, ...)

print_md(x, ...)

print_html(x, ...)

## S3 method for class 'data.frame'
display(object, format = "markdown", ...)

## S3 method for class 'data.frame'
print_md(x, ...)

## S3 method for class 'data.frame'
print_html(x, ...)
object , x
|
A data frame. |
... |
Arguments passed to other methods. |
format |
String, indicating the output format. Can be |
Depending on format, either an object of class gt_tbl or a character vector of class knitr_kable.
display(iris[1:5, ], format = "html")
Downloads pre-compiled models from the circus-repository. The circus-repository contains a variety of fitted models to help the systematic testing of other packages.
download_model(
  name,
  url = "https://raw.github.com/easystats/circus/master/data/",
  extension = ".rda",
  verbose = TRUE
)
name |
Model name. |
url |
String with the URL from where to download the model data.
Optional, and should only be used in case the repository-URL is
changing. By default, models are downloaded from
|
extension |
File extension. Default is |
verbose |
Toggle messages and warnings. |
The code that generated the model is available at https://easystats.github.io/circus/reference/index.html.
A model from the circus-repository, or NULL if model could not be downloaded (e.g., due to server problems).
https://easystats.github.io/circus/
download_model("aov_1")
try(download_model("non_existent_model"))
Provides information regarding the models entered in an ellipsis. It detects whether all are models, regressions, nested regressions etc., assigning different classes to the list of objects.
ellipsis_info(objects, ...)

## Default S3 method:
ellipsis_info(..., only_models = TRUE, verbose = TRUE)
objects , ...
|
Arbitrary number of objects. May also be a list of model objects. |
only_models |
Only keep supported models (default to |
verbose |
Toggle warnings. |
The list with objects that were passed to the function, including additional information as attributes (e.g. if models have same response or are nested).
m1 <- lm(Sepal.Length ~ Petal.Width + Species, data = iris)
m2 <- lm(Sepal.Length ~ Species, data = iris)
m3 <- lm(Sepal.Length ~ Petal.Width, data = iris)
m4 <- lm(Sepal.Length ~ 1, data = iris)
m5 <- lm(Petal.Width ~ 1, data = iris)

objects <- ellipsis_info(m1, m2, m3, m4)
class(objects)

objects <- ellipsis_info(m1, m2, m4)
attributes(objects)$is_nested

objects <- ellipsis_info(m1, m2, m5)
attributes(objects)$same_response
Returns information on the sampling or estimation algorithm as well as optimization functions, or, for Bayesian models, information on chains, iterations and warmup-samples.
find_algorithm(x, ...)
x |
A fitted model. |
... |
Currently not used. |
A list with elements depending on the model.
For frequentist models:
algorithm, for instance "OLS" or "ML"
optimizer, name of optimizing function, only applies to specific models (like gam)
For frequentist mixed models:
algorithm, for instance "REML" or "ML"
optimizer, name of optimizing function
For Bayesian models:
algorithm, the algorithm
chains, number of chains
iterations, number of iterations per chain
warmup, number of warmups per chain
data(sleepstudy, package = "lme4")
m <- lme4::lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)
find_algorithm(m)

data(sleepstudy, package = "lme4")
m <- suppressWarnings(rstanarm::stan_lmer(
  Reaction ~ Days + (1 | Subject),
  data = sleepstudy,
  refresh = 0
))
find_algorithm(m)
Returns the formula(s) for the different parts of a model (like fixed or random effects, zero-inflated component, ...). formula_ok() checks if a model formula has valid syntax regarding writing TRUE instead of T inside poly() and that no data names are used (i.e. no data$variable, but rather variable).
find_formula(x, ...)

formula_ok(x, verbose = TRUE, ...)

## Default S3 method:
find_formula(x, verbose = TRUE, ...)

## S3 method for class 'nestedLogit'
find_formula(x, dichotomies = FALSE, verbose = TRUE, ...)
x |
A fitted model. |
... |
Currently not used. |
verbose |
Toggle warnings. |
dichotomies |
Logical, if model is a |
A list of formulas that describe the model. For simple models, only one list-element, conditional, is returned. For more complex models, the returned list may have the following elements:
conditional, the "fixed effects" part from the model (in the context of fixed-effects or instrumental variable regression, also called regressors). One exception are DirichletRegModel models from DirichletReg, which have two or three components, depending on model.
random, the "random effects" part from the model (or the id for gee-models and similar)
zero_inflated, the "fixed effects" part from the zero-inflation component of the model
zero_inflated_random, the "random effects" part from the zero-inflation component of the model
dispersion, the dispersion formula
instruments, for fixed-effects or instrumental variable regressions like ivreg::ivreg(), lfe::felm() or plm::plm(), the instrumental variables
cluster, for fixed-effects regressions like lfe::felm(), the cluster specification
correlation, for models with correlation-component like nlme::gls(), the formula that describes the correlation structure
scale, for distributional models such as mgcv::gaulss() family fitted with mgcv::gam(), the formula that describes the scale parameter
slopes, for fixed-effects individual-slope models like feisr::feis(), the formula for the slope parameters
precision, for DirichletRegModel models from DirichletReg, when parametrization (i.e. model) is "alternative".
For models of class lme or gls the correlation-component is only returned when it is explicitly defined as named argument (form), e.g. corAR1(form = ~1 | Mare).
data(mtcars)
m <- lm(mpg ~ wt + cyl + vs, data = mtcars)
find_formula(m)

m <- lme4::lmer(Sepal.Length ~ Sepal.Width + (1 | Species), data = iris)
f <- find_formula(m)
f
format(f)
Returns all lowest to highest order interaction terms from a model.
find_interactions(
  x,
  component = c("all", "conditional", "zi", "zero_inflated", "dispersion", "instruments"),
  flatten = FALSE
)
x |
A fitted model. |
component |
Which type of parameters to return, such as parameters for the conditional model, the zero-inflated part of the model, the dispersion term, the instrumental variables or marginal effects. Applies to models with zero-inflated and/or dispersion formula, or to models with instrumental variables (so called fixed-effects regressions), or models with marginal effects (from mfx). See details in section Model Components. May be abbreviated. Note that the conditional component also refers to the count or mean component - names may differ, depending on the modeling package. There are three convenient shortcuts (not applicable to all model classes):
|
flatten |
Logical, if |
A list of character vectors that represent the interaction terms. Depending on component, the returned list has the following elements (or NULL, if model has no interaction term):
conditional, interaction terms that belong to the "fixed effects" terms from the model
zero_inflated, interaction terms that belong to the "fixed effects" terms from the zero-inflation component of the model
instruments, for fixed-effects regressions like ivreg, felm or plm, interaction terms that belong to the instrumental variables
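For a plain lm() fit, the interaction terms of the conditional component can be read off the terms object in base R, which gives a feel for what "lowest to highest order" covers. This is a sketch for simple models only, under the assumption that the term order is the relevant criterion; it is not how find_interactions() handles multi-component models.

```r
m <- lm(mpg ~ wt * cyl + vs * hp * gear + carb, data = mtcars)

tt <- terms(m)
term_labels <- attr(tt, "term.labels")
term_orders <- attr(tt, "order")

# Keep only terms of order 2 or higher, i.e. the interactions.
interaction_terms <- term_labels[term_orders > 1]
```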
data(mtcars)
m <- lm(mpg ~ wt + cyl + vs, data = mtcars)
find_interactions(m)

m <- lm(mpg ~ wt * cyl + vs * hp * gear + carb, data = mtcars)
find_interactions(m)
Returns a character vector with the name(s) of offset terms.
find_offset(x)
x |
A fitted model. |
A character vector with the name(s) of offset terms.
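When an offset is written inside the formula, its name can be recovered in base R from the terms object, as sketched below. This covers only offsets given as offset() in the formula; offsets passed through a separate offset argument are not visible in the formula, and this sketch makes no claim about how find_offset() retrieves them.

```r
f <- y ~ offset(logOff) + x
tt <- terms(f)

# All variables of the formula as calls/symbols, without the leading `list`.
vars <- as.list(attr(tt, "variables"))[-1]

# The "offset" attribute holds the positions of offset() calls in `vars`.
offset_calls <- vars[attr(tt, "offset")]
offset_names <- unlist(lapply(offset_calls, all.vars))
```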
# Generate some zero-inflated data
set.seed(123)
N <- 100 # Samples
x <- runif(N, 0, 10) # Predictor
off <- rgamma(N, 3, 2) # Offset variable
yhat <- -1 + x * 0.5 + log(off) # Prediction on log scale
dat <- data.frame(y = NA, x, logOff = log(off))
dat$y <- rpois(N, exp(yhat)) # Poisson process
dat$y <- ifelse(rbinom(N, 1, 0.3), 0, dat$y) # Zero-inflation process

# zeroinfl() is provided by the pscl package
m1 <- pscl::zeroinfl(y ~ offset(logOff) + x | 1, data = dat, dist = "poisson")
find_offset(m1)

m2 <- pscl::zeroinfl(y ~ x | 1, data = dat, offset = logOff, dist = "poisson")
find_offset(m2)
Returns the names of model parameters, like they typically appear in the summary() output. For Bayesian models, the parameter names equal the column names of the posterior samples after coercion from as.data.frame(). See the documentation for your object's class:
Bayesian models (rstanarm, brms, MCMCglmm, ...)
Generalized additive models (mgcv, VGAM, ...)
Marginal effects models (mfx)
Estimated marginal means (emmeans)
Mixed models (lme4, glmmTMB, GLMMadaptive, ...)
Zero-inflated and hurdle models (pscl, ...)
Models with special components (betareg, MuMIn, ...)
find_parameters(x, ...)

## Default S3 method:
find_parameters(x, flatten = FALSE, verbose = TRUE, ...)
x |
A fitted model. |
... |
Currently not used. |
flatten |
Logical, if |
verbose |
Toggle messages and warnings. |
A list of parameter names. For simple models, only one list-element, conditional, is returned.
Possible values for the component argument depend on the model class. The following are valid options:
"all": returns all model components, applies to all models, but will only have an effect for models with more than just the conditional model component.
"conditional": only returns the conditional component, i.e. "fixed effects" terms from the model. Will only have an effect for models with more than just the conditional model component.
"smooth_terms": returns smooth terms, only applies to GAMs (or similar models that may contain smooth terms).
"zero_inflated" (or "zi"): returns the zero-inflation component.
"dispersion": returns the dispersion model component. This is common for models with zero-inflation or that can model the dispersion parameter.
"instruments": for instrumental-variable or some fixed effects regression, returns the instruments.
"nonlinear": for non-linear models (like models of class nlmerMod or nls), returns starting estimates for the nonlinear parameters.
"correlation": for models with correlation-component, like gls, the variables used to describe the correlation structure are returned.
"location": returns location parameters such as conditional, zero_inflated, smooth_terms, or instruments (everything that is a fixed or random effect, depending on the effects argument, but no auxiliary parameters).
"distributional" (or "auxiliary"): components like sigma, dispersion, beta or precision (and other auxiliary parameters) are returned.
Special models
Some model classes also allow rather uncommon options. These are:
mhurdle: "infrequent_purchase", "ip", and "auxiliary"
BGGM: "correlation" and "intercept"
BFBayesFactor, glmx: "extra"
averaging: "conditional" and "full"
mjoint: "survival"
mfx: "precision", "marginal"
betareg, DirichletRegModel: "precision"
mvord: "thresholds" and "correlation"
clm2: "scale"
selection: "selection", "outcome", and "auxiliary"
For models of class brmsfit (package brms), even more options are possible for the component argument, which are not all documented in detail here.
There are four functions that return information about the variables in a model: find_predictors(), find_variables(), find_terms() and find_parameters(). There are some differences between those functions, which are explained using the following model. Note that some, but not all of those functions return information about the dependent and independent variables. In this example, we only show the differences for the independent variables.
model <- lm(mpg ~ factor(gear), data = mtcars)
find_terms(model) returns the model terms, i.e. how the variables were used in the model, e.g. applying transformations like factor(), poly() etc. find_terms() may return a variable name multiple times in case of multiple transformations. The return value would be "factor(gear)".
find_parameters(model) returns the names of the model parameters (coefficients). The return value would be "(Intercept)", "factor(gear)4" and "factor(gear)5".
find_variables() returns the original variable names. find_variables() returns each variable name only once. The return value would be "gear".
find_predictors() is comparable to find_variables() and also returns the original variable names, but excludes the dependent (response) variables. The return value would be "gear".
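The same distinction can be reproduced with base R accessors, which may help to see where each kind of name comes from. The mapping below is an analogy for this simple lm() model, not a description of how the insight functions are implemented; the variable names ending in _like are illustrative.

```r
model <- lm(mpg ~ factor(gear), data = mtcars)

# Terms as written in the formula (compare find_terms()).
terms_like <- attr(terms(model), "term.labels")

# Coefficient names (compare find_parameters()).
params_like <- names(coef(model))

# Original variable names, response included (compare find_variables()).
vars_like <- all.vars(formula(model))

# Original variable names without the response (compare find_predictors()).
preds_like <- all.vars(delete.response(terms(model)))
```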
data(mtcars)
m <- lm(mpg ~ wt + cyl + vs, data = mtcars)
find_parameters(m)
Returns the names of model parameters, like they typically appear in the summary() output.
## S3 method for class 'averaging'
find_parameters(x, component = "conditional", flatten = FALSE, ...)
x |
A fitted model. |
component |
Which type of parameters to return, such as parameters for the conditional model, the zero-inflated part of the model, the dispersion term, the instrumental variables or marginal effects. Applies to models with zero-inflated and/or dispersion formula, or to models with instrumental variables (so-called fixed-effects regressions), or models with marginal effects (from mfx). See details in section Model Components. May be abbreviated. Note that the conditional component also refers to the count or mean component - names may differ, depending on the modeling package. There are three convenient shortcuts (not applicable to all model classes):
|
flatten |
Logical, if |
... |
Currently not used. |
A list of parameter names. The returned list may have the following elements, usually requested via the component argument:
conditional, the "fixed effects" part from the model.
full, parameters from the full model.
precision for models of class betareg.
survival for models of class mjoint.
extra for models of class glmx.
Possible values for the component argument depend on the model class. The following are valid options:
"all": returns all model components, applies to all models, but will only have an effect for models with more than just the conditional model component.
"conditional": only returns the conditional component, i.e. "fixed effects" terms from the model. Will only have an effect for models with more than just the conditional model component.
"smooth_terms": returns smooth terms, only applies to GAMs (or similar models that may contain smooth terms).
"zero_inflated" (or "zi"): returns the zero-inflation component.
"dispersion": returns the dispersion model component. This is common for models with zero-inflation or that can model the dispersion parameter.
"instruments": for instrumental-variable or some fixed effects regression, returns the instruments.
"nonlinear": for non-linear models (like models of class nlmerMod or nls), returns starting estimates for the nonlinear parameters.
"correlation": for models with correlation-component, like gls, the variables used to describe the correlation structure are returned.
"location": returns location parameters such as conditional, zero_inflated, smooth_terms, or instruments (everything that is a fixed or random effect - depending on the effects argument - but no auxiliary parameters).
"distributional" (or "auxiliary"): components like sigma, dispersion, beta or precision (and other auxiliary parameters) are returned.
Special models
Some model classes also allow rather uncommon options. These are:
mhurdle: "infrequent_purchase", "ip", and "auxiliary"
BGGM: "correlation" and "intercept"
BFBayesFactor, glmx: "extra"
averaging: "conditional" and "full"
mjoint: "survival"
mfx: "precision", "marginal"
betareg, DirichletRegModel: "precision"
mvord: "thresholds" and "correlation"
clm2: "scale"
selection: "selection", "outcome", and "auxiliary"
For models of class brmsfit (package brms), even more options are possible for the component argument, which are not all documented in detail here.
data("GasolineYield", package = "betareg")
m <- betareg::betareg(yield ~ batch + temp, data = GasolineYield)
find_parameters(m)
find_parameters(m, component = "precision")
Returns the names of model parameters, like they typically appear in the summary() output.
## S3 method for class 'betamfx'
find_parameters(x, component = "all", flatten = FALSE, ...)
x |
A fitted model. |
component |
Which type of parameters to return, such as parameters for the conditional model, the zero-inflated part of the model, the dispersion term, the instrumental variables or marginal effects. Applies to models with zero-inflated and/or dispersion formula, or to models with instrumental variables (so-called fixed-effects regressions), or models with marginal effects (from mfx). See details in section Model Components. May be abbreviated. Note that the conditional component also refers to the count or mean component - names may differ, depending on the modeling package. There are three convenient shortcuts (not applicable to all model classes):
|
flatten |
Logical, if |
... |
Currently not used. |
A list of parameter names. The returned list may have the following elements:
conditional, the "fixed effects" part from the model.
marginal, the marginal effects.
precision, the precision parameter.
Possible values for the component argument depend on the model class. The following are valid options:
"all": returns all model components, applies to all models, but will only have an effect for models with more than just the conditional model component.
"conditional": only returns the conditional component, i.e. "fixed effects" terms from the model. Will only have an effect for models with more than just the conditional model component.
"smooth_terms": returns smooth terms, only applies to GAMs (or similar models that may contain smooth terms).
"zero_inflated" (or "zi"): returns the zero-inflation component.
"dispersion": returns the dispersion model component. This is common for models with zero-inflation or that can model the dispersion parameter.
"instruments": for instrumental-variable or some fixed effects regression, returns the instruments.
"nonlinear": for non-linear models (like models of class nlmerMod or nls), returns starting estimates for the nonlinear parameters.
"correlation": for models with correlation-component, like gls, the variables used to describe the correlation structure are returned.
"location": returns location parameters such as conditional, zero_inflated, smooth_terms, or instruments (everything that is a fixed or random effect - depending on the effects argument - but no auxiliary parameters).
"distributional" (or "auxiliary"): components like sigma, dispersion, beta or precision (and other auxiliary parameters) are returned.
Special models
Some model classes also allow rather uncommon options. These are:
mhurdle: "infrequent_purchase", "ip", and "auxiliary"
BGGM: "correlation" and "intercept"
BFBayesFactor, glmx: "extra"
averaging: "conditional" and "full"
mjoint: "survival"
mfx: "precision", "marginal"
betareg, DirichletRegModel: "precision"
mvord: "thresholds" and "correlation"
clm2: "scale"
selection: "selection", "outcome", and "auxiliary"
For models of class brmsfit (package brms), even more options are possible for the component argument, which are not all documented in detail here.
data(mtcars)
m <- lm(mpg ~ wt + cyl + vs, data = mtcars)
find_parameters(m)
Returns the names of model parameters, like they typically appear in the summary() output. For Bayesian models, the parameter names equal the column names of the posterior samples after coercion from as.data.frame().
## S3 method for class 'BGGM'
find_parameters(x, component = "correlation", flatten = FALSE, ...)

## S3 method for class 'brmsfit'
find_parameters(
  x,
  effects = "all",
  component = "all",
  flatten = FALSE,
  parameters = NULL,
  ...
)
x |
A fitted model. |
component |
Which type of parameters to return, such as parameters for the conditional model, the zero-inflated part of the model, the dispersion term, the instrumental variables or marginal effects. Applies to models with zero-inflated and/or dispersion formula, or to models with instrumental variables (so-called fixed-effects regressions), or models with marginal effects (from mfx). See details in section Model Components. May be abbreviated. Note that the conditional component also refers to the count or mean component - names may differ, depending on the modeling package. There are three convenient shortcuts (not applicable to all model classes):
|
flatten |
Logical, if |
... |
Currently not used. |
effects |
Should variables for fixed effects ( |
parameters |
Regular expression pattern that describes the parameters that should be returned. |
A list of parameter names. For simple models, only one list-element, conditional, is returned. For more complex models, the returned list may have the following elements:
conditional, the "fixed effects" part from the model
random, the "random effects" part from the model
zero_inflated, the "fixed effects" part from the zero-inflation component of the model
zero_inflated_random, the "random effects" part from the zero-inflation component of the model
smooth_terms, the smooth parameters
Furthermore, some models, especially from brms, can also return auxiliary parameters. These may be one of the following:
sigma, the residual standard deviation (auxiliary parameter)
dispersion, the dispersion parameters (auxiliary parameter)
beta, the beta parameter (auxiliary parameter)
simplex, simplex parameters of monotonic effects (brms only)
mix, mixture parameters (brms only)
shiftprop, shifted proportion parameters (brms only)
Models of class BGGM additionally can return the elements correlation and intercept.
Models of class BFBayesFactor additionally can return the element extra.
data(mtcars)
m <- lm(mpg ~ wt + cyl + vs, data = mtcars)
find_parameters(m)
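For a simple model such as this one, the structure of the return value can be inspected as follows. This is a sketch, assuming insight is installed; it uses only the flatten argument from the usage shown in this manual, and the object names params_list and params_flat are illustrative.

```r
library(insight)

m <- lm(mpg ~ wt + cyl + vs, data = mtcars)

# Default: a list. A simple linear model only has the "conditional" element.
params_list <- find_parameters(m)
names(params_list)
params_list$conditional

# With flatten = TRUE, the names come back as a character vector
# instead of a list.
params_flat <- find_parameters(m, flatten = TRUE)
is.character(params_flat)
```

The flattened form is convenient when the parameter names are passed on to functions that expect a plain character vector.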
Returns the parameter names from a model.
## S3 method for class 'emmGrid'
find_parameters(x, flatten = FALSE, merge_parameters = FALSE, ...)
x |
A fitted model. |
flatten |
Logical, if |
merge_parameters |
Logical, if |
... |
Currently not used. |
A list of parameter names. For simple models, only one list-element, conditional, is returned.
data(mtcars)
model <- lm(mpg ~ wt * factor(cyl), data = mtcars)
emm <- emmeans::emmeans(model, c("wt", "cyl"))
find_parameters(emm)
Returns the names of model parameters, like they typically appear in the summary() output.
## S3 method for class 'gamlss'
find_parameters(x, flatten = FALSE, ...)

## S3 method for class 'gam'
find_parameters(x, component = "all", flatten = FALSE, ...)
x |
A fitted model. |
flatten |
Logical, if |
... |
Currently not used. |
component |
Which type of parameters to return, such as parameters for the conditional model, the zero-inflated part of the model, the dispersion term, the instrumental variables or marginal effects. Applies to models with zero-inflated and/or dispersion formula, or to models with instrumental variables (so-called fixed-effects regressions), or models with marginal effects (from mfx). See details in section Model Components. May be abbreviated. Note that the conditional component also refers to the count or mean component - names may differ, depending on the modeling package. There are three convenient shortcuts (not applicable to all model classes):
|
A list of parameter names. The returned list may have the following elements:
conditional, the "fixed effects" part from the model.
smooth_terms, the smooth parameters.
data(mtcars)
m <- mgcv::gam(mpg ~ s(hp) + gear, data = mtcars)
find_parameters(m)
Returns the names of model parameters, like they typically appear in the summary() output.
## S3 method for class 'glmmTMB'
find_parameters(x, effects = "all", component = "all", flatten = FALSE, ...)
x |
A fitted model. |
effects |
Should variables for fixed effects ( |
component |
Which type of parameters to return, such as parameters for the conditional model, the zero-inflated part of the model, the dispersion term, the instrumental variables or marginal effects. Applies to models with zero-inflated and/or dispersion formula, or to models with instrumental variables (so-called fixed-effects regressions), or models with marginal effects (from mfx). See details in section Model Components. May be abbreviated. Note that the conditional component also refers to the count or mean component - names may differ, depending on the modeling package. There are three convenient shortcuts (not applicable to all model classes):
|
flatten |
Logical, if |
... |
Currently not used. |
A list of parameter names. The returned list may have the following elements, usually returned based on the combination of the effects and component arguments:
conditional, the "fixed effects" part from the model.
random, the "random effects" part from the model.
zero_inflated, the "fixed effects" part from the zero-inflation component of the model.
zero_inflated_random, the "random effects" part from the zero-inflation component of the model.
dispersion, the dispersion parameters (auxiliary parameter)
dispersion_random, the "random effects" part from the dispersion parameters (auxiliary parameter)
nonlinear, the parameters from the nonlinear formula.
Possible values for the component argument depend on the model class. The following are valid options:
"all": returns all model components, applies to all models, but will only have an effect for models with more than just the conditional model component.
"conditional": only returns the conditional component, i.e. "fixed effects" terms from the model. Will only have an effect for models with more than just the conditional model component.
"smooth_terms": returns smooth terms, only applies to GAMs (or similar models that may contain smooth terms).
"zero_inflated" (or "zi"): returns the zero-inflation component.
"dispersion": returns the dispersion model component. This is common for models with zero-inflation or that can model the dispersion parameter.
"instruments": for instrumental-variable or some fixed effects regression, returns the instruments.
"nonlinear": for non-linear models (like models of class nlmerMod or nls), returns starting estimates for the nonlinear parameters.
"correlation": for models with correlation-component, like gls, the variables used to describe the correlation structure are returned.
"location": returns location parameters such as conditional, zero_inflated, smooth_terms, or instruments (everything that is a fixed or random effect - depending on the effects argument - but no auxiliary parameters).
"distributional" (or "auxiliary"): components like sigma, dispersion, beta or precision (and other auxiliary parameters) are returned.
Special models
Some model classes also allow rather uncommon options. These are:
mhurdle: "infrequent_purchase", "ip", and "auxiliary"
BGGM: "correlation" and "intercept"
BFBayesFactor, glmx: "extra"
averaging: "conditional" and "full"
mjoint: "survival"
mfx: "precision", "marginal"
betareg, DirichletRegModel: "precision"
mvord: "thresholds" and "correlation"
clm2: "scale"
selection: "selection", "outcome", and "auxiliary"
For models of class brmsfit (package brms), even more options are possible for the component argument, which are not all documented in detail here.
data(sleepstudy, package = "lme4")
m <- lme4::lmer(
  Reaction ~ Days + (1 + Days | Subject),
  data = sleepstudy
)
find_parameters(m)
Returns the names of model parameters, like they typically appear in the summary() output.
## S3 method for class 'zeroinfl'
find_parameters(x, component = "all", flatten = FALSE, ...)
x |
A fitted model. |
component |
Which type of parameters to return, such as parameters for the conditional model, the zero-inflated part of the model, the dispersion term, the instrumental variables or marginal effects. Applies to models with zero-inflated and/or dispersion formula, or to models with instrumental variables (so-called fixed-effects regressions), or models with marginal effects (from mfx). See details in section Model Components. May be abbreviated. Note that the conditional component also refers to the count or mean component - names may differ, depending on the modeling package. There are three convenient shortcuts (not applicable to all model classes):
|
flatten |
Logical, if |
... |
Currently not used. |
A list of parameter names. The returned list may have the following elements:
conditional, the "fixed effects" part from the model.
zero_inflated, the "fixed effects" part from the zero-inflation component of the model.
Special models are mhurdle, which also can have the components infrequent_purchase, ip, and auxiliary.
Possible values for the component argument depend on the model class. The following are valid options:
"all": returns all model components, applies to all models, but will only have an effect for models with more than just the conditional model component.
"conditional": only returns the conditional component, i.e. "fixed effects" terms from the model. Will only have an effect for models with more than just the conditional model component.
"smooth_terms": returns smooth terms, only applies to GAMs (or similar models that may contain smooth terms).
"zero_inflated" (or "zi"): returns the zero-inflation component.
"dispersion": returns the dispersion model component. This is common for models with zero-inflation or that can model the dispersion parameter.
"instruments": for instrumental-variable or some fixed effects regression, returns the instruments.
"nonlinear": for non-linear models (like models of class nlmerMod or nls), returns starting estimates for the nonlinear parameters.
"correlation": for models with correlation-component, like gls, the variables used to describe the correlation structure are returned.
"location": returns location parameters such as conditional, zero_inflated, smooth_terms, or instruments (everything that is a fixed or random effect - depending on the effects argument - but no auxiliary parameters).
"distributional" (or "auxiliary"): components like sigma, dispersion, beta or precision (and other auxiliary parameters) are returned.
Special models
Some model classes also allow rather uncommon options. These are:
mhurdle: "infrequent_purchase", "ip", and "auxiliary"
BGGM: "correlation" and "intercept"
BFBayesFactor, glmx: "extra"
averaging: "conditional" and "full"
mjoint: "survival"
mfx: "precision", "marginal"
betareg, DirichletRegModel: "precision"
mvord: "thresholds" and "correlation"
clm2: "scale"
selection: "selection", "outcome", and "auxiliary"
For models of class brmsfit (package brms), even more options are possible for the component argument, which are not all documented in detail here.
data(bioChemists, package = "pscl")
m <- pscl::zeroinfl(
  art ~ fem + mar + kid5 + ment | kid5 + phd,
  data = bioChemists
)
find_parameters(m)
Returns the names of the predictor variables for the different parts of a model (like fixed or random effects, zero-inflated component, ...). Unlike find_parameters(), the names from find_predictors() match the original variable names from the data that was used to fit the model.
find_predictors(x, ...)

## Default S3 method:
find_predictors(
  x,
  effects = "fixed",
  component = "all",
  flatten = FALSE,
  verbose = TRUE,
  ...
)
x |
A fitted model. |
... |
Currently not used. |
effects |
Should variables for fixed effects ( |
component |
Which type of parameters to return, such as parameters for the conditional model, the zero-inflated part of the model, the dispersion term, the instrumental variables or marginal effects. Applies to models with zero-inflated and/or dispersion formula, or to models with instrumental variables (so-called fixed-effects regressions), or models with marginal effects (from mfx). See details in section Model Components. May be abbreviated. Note that the conditional component also refers to the count or mean component - names may differ, depending on the modeling package. There are three convenient shortcuts (not applicable to all model classes):
|
flatten |
Logical, if |
verbose |
Toggle warnings. |
A list of character vectors that represent the name(s) of the predictor variables. Depending on the combination of the arguments effects and component, the returned list can have the following elements:
conditional, the "fixed effects" terms from the model
random, the "random effects" terms from the model
zero_inflated, the "fixed effects" terms from the zero-inflation component of the model
zero_inflated_random, the "random effects" terms from the zero-inflation component of the model
dispersion, the dispersion terms
instruments, for fixed-effects regressions like ivreg, felm or plm, the instrumental variables
correlation, for models with correlation-component like gls, the variables used to describe the correlation structure
nonlinear, for non-linear models (like models of class nlmerMod or nls), the starting estimates for the nonlinear parameters
smooth_terms, the smooth terms, only applies to GAMs (or similar models that may contain smooth terms)
Possible values for the component argument depend on the model class. The following are valid options:
"all": returns all model components, applies to all models, but will only have an effect for models with more than just the conditional model component.
"conditional": only returns the conditional component, i.e. "fixed effects" terms from the model. Will only have an effect for models with more than just the conditional model component.
"smooth_terms": returns smooth terms, only applies to GAMs (or similar models that may contain smooth terms).
"zero_inflated" (or "zi"): returns the zero-inflation component.
"dispersion": returns the dispersion model component. This is common for models with zero-inflation or that can model the dispersion parameter.
"instruments": for instrumental-variable or some fixed effects regression, returns the instruments.
"nonlinear": for non-linear models (like models of class nlmerMod or nls), returns starting estimates for the nonlinear parameters.
"correlation": for models with correlation-component, like gls, the variables used to describe the correlation structure are returned.
"location": returns location parameters such as conditional, zero_inflated, smooth_terms, or instruments (everything that is a fixed or random effect - depending on the effects argument - but no auxiliary parameters).
"distributional" (or "auxiliary"): components like sigma, dispersion, beta or precision (and other auxiliary parameters) are returned.
Special models
Some model classes also allow rather uncommon options. These are:
mhurdle: "infrequent_purchase", "ip", and "auxiliary"
BGGM: "correlation" and "intercept"
BFBayesFactor, glmx: "extra"
averaging: "conditional" and "full"
mjoint: "survival"
mfx: "precision", "marginal"
betareg, DirichletRegModel: "precision"
mvord: "thresholds" and "correlation"
clm2: "scale"
selection: "selection", "outcome", and "auxiliary"
For models of class brmsfit (package brms), even more options are possible for the component argument, which are not all documented in detail here.
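The statement that "all" and "conditional" only differ for models with more than the conditional component can be illustrated with an ordinary linear model. This is a sketch, assuming insight is installed; the object names preds_all and preds_cond are illustrative, and the expectation that both calls agree follows from the description of the options in this section rather than from a documented guarantee.

```r
library(insight)

m <- lm(mpg ~ wt + cyl, data = mtcars)

preds_all <- find_predictors(m, component = "all")
preds_cond <- find_predictors(m, component = "conditional")

# A linear model has no zero-inflation, dispersion or other extra
# components, so both calls are expected to give the same result.
identical(preds_all, preds_cond)
preds_all$conditional
```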
There are four functions that return information about the variables in a model: find_predictors(), find_variables(), find_terms() and find_parameters(). There are some differences between those functions, which are explained using the following model. Note that some, but not all, of those functions return information about both the dependent and independent variables. In this example, we only show the differences for the independent variables.
model <- lm(mpg ~ factor(gear), data = mtcars)
find_terms(model) returns the model terms, i.e. how the variables were used in the model, e.g. applying transformations like factor(), poly() etc. find_terms() may return a variable name multiple times in case of multiple transformations. The return value would be "factor(gear)".
find_parameters(model) returns the names of the model parameters (coefficients). The return value would be "(Intercept)", "factor(gear)4" and "factor(gear)5".
find_variables() returns the original variable names. find_variables() returns each variable name only once. The return value would be "gear".
find_predictors() is comparable to find_variables() and also returns the original variable names, but excludes the dependent (response) variables. The return value would be "gear".
data(mtcars)
m <- lm(mpg ~ wt + cyl + vs, data = mtcars)
find_predictors(m)
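To illustrate that find_predictors() returns the original variable names rather than coefficient names, here is a short sketch with transformed predictors. It assumes insight is installed; the model and the object names preds and params are chosen for illustration only.

```r
library(insight)

m <- lm(mpg ~ log(hp) + factor(cyl), data = mtcars)

# Original variable names, without log() or factor().
preds <- find_predictors(m)
preds$conditional

# In contrast, parameter names reflect transformations and factor levels.
params <- find_parameters(m)
params$conditional
```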
Return the name of the grouping factors from mixed effects models.
find_random(x, split_nested = FALSE, flatten = FALSE)
x |
A fitted mixed model. |
split_nested |
Logical, if |
flatten |
Logical, if |
A list of character vectors that represent the name(s) of the random effects (grouping factors). Depending on the model, the returned list has the following elements:
random, the "random effects" terms from the conditional part of the model
zero_inflated_random, the "random effects" terms from the zero-inflation component of the model
data(sleepstudy, package = "lme4")
sleepstudy$mygrp <- sample(1:5, size = 180, replace = TRUE)
sleepstudy$mysubgrp <- NA
for (i in 1:5) {
  filter_group <- sleepstudy$mygrp == i
  sleepstudy$mysubgrp[filter_group] <-
    sample(1:30, size = sum(filter_group), replace = TRUE)
}
m <- lme4::lmer(
  Reaction ~ Days + (1 | mygrp / mysubgrp) + (1 | Subject),
  data = sleepstudy
)
find_random(m)
find_random(m, split_nested = TRUE)
Return the name of the random slopes from mixed effects models.
find_random_slopes(x)
x |
A fitted mixed model. |
A list of character vectors with the name(s) of the random slopes, or NULL if the model has no random slopes. Depending on the model, the returned list has the following elements:
random, the random slopes from the conditional part of the model
zero_inflated_random, the random slopes from the zero-inflation component of the model
data(sleepstudy, package = "lme4")
m <- lme4::lmer(Reaction ~ Days + (1 + Days | Subject), data = sleepstudy)
find_random_slopes(m)
Returns the name(s) of the response variable(s) from a model object.
find_response(x, combine = TRUE, ...)

## S3 method for class 'mjoint'
find_response(
  x,
  combine = TRUE,
  component = c("conditional", "survival", "all"),
  ...
)

## S3 method for class 'joint'
find_response(
  x,
  combine = TRUE,
  component = c("conditional", "survival", "all"),
  ...
)
x |
A fitted model. |
combine |
Logical, if |
... |
Currently not used. |
component |
Character, if |
The name(s) of the response variable(s) from x as character vector, or NULL if the response variable could not be found.
data(cbpp, package = "lme4")
cbpp$trials <- cbpp$size - cbpp$incidence
m <- glm(cbind(incidence, trials) ~ period, data = cbpp, family = binomial)
find_response(m, combine = TRUE)
find_response(m, combine = FALSE)
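The effect of combine can also be tried without lme4. The sketch below assumes insight is installed and uses a small made-up data frame (the columns success, failure and dose are hypothetical and only serve to build a two-column binomial response).

```r
library(insight)

# Made-up data: counts of successes and failures at four dose levels.
d <- data.frame(
  success = c(2, 4, 6, 8),
  failure = c(8, 6, 4, 2),
  dose = 1:4
)
m <- glm(cbind(success, failure) ~ dose, family = binomial(), data = d)

# combine = TRUE keeps the response as it is written in the formula.
resp_combined <- find_response(m, combine = TRUE)
resp_combined

# combine = FALSE returns the individual variables inside cbind().
resp_split <- find_response(m, combine = FALSE)
resp_split
```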
Return the names of smooth terms from a model object.
find_smooth(x, flatten = FALSE)
x |
A (gam) model. |
flatten |
Logical, if |
A character vector with the name(s) of the smooth terms.
data(iris)
model <- mgcv::gam(Petal.Length ~ Petal.Width + s(Sepal.Length), data = iris)
find_smooth(model)
Returns the statistic for a regression model (t-statistic, z-statistic, etc.).
Small helper that checks if a model is a regression model object and returns the statistic used.
find_statistic(x, ...)
x |
An object. |
... |
Currently not used. |
A character describing the type of statistic. If there is no statistic available with a distribution, NULL will be returned.
# regression model object
data(mtcars)
m <- lm(mpg ~ wt + cyl + vs, data = mtcars)
find_statistic(m)
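The returned statistic depends on the model type. As a sketch, assuming insight is installed, a linear model and a logistic regression can be compared; the object names are illustrative.

```r
library(insight)

# Linear model: coefficients are tested with a t-statistic.
m_lm <- lm(mpg ~ wt + cyl, data = mtcars)
stat_lm <- find_statistic(m_lm)
stat_lm

# Logistic regression: coefficients are tested with a z-statistic.
m_glm <- glm(vs ~ wt, family = binomial(), data = mtcars)
stat_glm <- find_statistic(m_glm)
stat_glm
```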
Returns a list with the names of all terms, including response value and random effects, "as is". This means that on-the-fly transformations or arithmetic expressions like log(), I(), as.factor() etc. are preserved.
find_terms(x, ...)

## Default S3 method:
find_terms(x, flatten = FALSE, as_term_labels = FALSE, verbose = TRUE, ...)
x |
A fitted model. |
... |
Currently not used. |
flatten |
Logical, if |
as_term_labels |
Logical, if |
verbose |
Toggle warnings. |
A list with (depending on the model) the following elements (character vectors):
response, the name of the response variable
conditional, the names of the predictor variables from the conditional model (as opposed to the zero-inflated part of a model)
random, the names of the random effects (grouping factors)
zero_inflated, the names of the predictor variables from the zero-inflated part of the model
zero_inflated_random, the names of the random effects (grouping factors)
dispersion, the name of the dispersion terms
instruments, the names of instrumental variables
Returns NULL if no terms could be found (for instance, due to problems in accessing the formula).
There are four functions that return information about the variables in a model: find_predictors(), find_variables(), find_terms() and find_parameters(). There are some differences between those functions, which are explained using the following model. Note that some, but not all, of those functions return information about both the dependent and independent variables. In this example, we only show the differences for the independent variables.
model <- lm(mpg ~ factor(gear), data = mtcars)
find_terms(model) returns the model terms, i.e. how the variables were used in the model, e.g. applying transformations like factor(), poly() etc. find_terms() may return a variable name multiple times in case of multiple transformations. The return value would be "factor(gear)".
find_parameters(model) returns the names of the model parameters (coefficients). The return value would be "(Intercept)", "factor(gear)4" and "factor(gear)5".
find_variables() returns the original variable names. find_variables() returns each variable name only once. The return value would be "gear".
find_predictors() is comparable to find_variables() and also returns the original variable names, but excludes the dependent (response) variables. The return value would be "gear".
The difference to find_variables() is that find_terms() may return a variable multiple times in case of multiple transformations (see examples below), while find_variables() returns each variable name only once.
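The comparison above can be reproduced directly. The following is a minimal sketch that assumes the insight package is installed and looks only at the conditional element of each return value; the object names (terms_cond and so on) are chosen for illustration.

```r
library(insight)

model <- lm(mpg ~ factor(gear), data = mtcars)

# Terms keep the transformation exactly as written in the formula
terms_cond <- find_terms(model)$conditional

# Variables and predictors give the untransformed column name
vars_cond <- find_variables(model)$conditional
preds_cond <- find_predictors(model)$conditional

# Parameters are the names of the estimated coefficients
params_cond <- find_parameters(model)$conditional

print(list(
  terms = terms_cond,
  variables = vars_cond,
  predictors = preds_cond,
  parameters = params_cond
))
```

Use find_terms() when the exact formula expression matters, and find_variables() or find_predictors() when column names of the original data are needed.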
data(sleepstudy, package = "lme4")
m <- suppressWarnings(lme4::lmer(
  log(Reaction) ~ Days + I(Days^2) + (1 + Days + exp(Days) | Subject),
  data = sleepstudy
))
find_terms(m)

# sometimes, it is necessary to retrieve terms from "term.labels" attribute
m <- lm(mpg ~ hp * (am + cyl), data = mtcars)
find_terms(m, as_term_labels = TRUE)
This function checks whether any transformation, such as log- or exp-transforming, was applied to the response variable (dependent variable) in a regression formula. Optionally, all model terms can also be checked for any such transformation. Currently, the following patterns are detected: log, log1p, log2, log10, exp, expm1, sqrt, log(y+<number>), log-log, power (e.g. to 2nd power, like I(y^2)), inverse (like 1/y), scale (e.g., y/3), and box-cox (e.g., (y^lambda - 1) / lambda).
find_transformation(x, ...)

## Default S3 method:
find_transformation(x, include_all = FALSE, ...)
x |
A regression model or a character string of the formulation of the (response) variable. |
... |
Currently not used. |
include_all |
Logical, if |
A string, with the name of the function of the applied transformation. Returns "identity" for no transformation, and e.g. "log(y+3)" when a specific value was added to the response variable before log-transforming. For unknown transformations, returns NULL.
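One possible use of the returned string is deciding how to back-transform predictions. The sketch below assumes the insight package is installed; the variable names and the decision to handle only the "log" case are illustrative choices, not a complete treatment of all detected patterns.

```r
library(insight)

model <- lm(log(Sepal.Length) ~ Species, data = iris)

# Name of the transformation applied to the response
trans <- find_transformation(model)

# Predictions are on the log scale; bring them back to the original scale
pred_link <- predict(model)
pred_original <- if (identical(trans, "log")) exp(pred_link) else pred_link

head(pred_original)
```

For an untransformed response, find_transformation() returns "identity" and the predictions are left unchanged.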
# identity, no transformation
model <- lm(Sepal.Length ~ Species, data = iris)
find_transformation(model)

# log-transformation
model <- lm(log(Sepal.Length) ~ Species, data = iris)
find_transformation(model)

# log+2
model <- lm(log(Sepal.Length + 2) ~ Species, data = iris)
find_transformation(model)

# find transformation for all model terms
model <- lm(mpg ~ log(wt) + I(gear^2) + exp(am), data = mtcars)
find_transformation(model, include_all = TRUE)

# inverse, response provided as character string
find_transformation("1 / y")
Returns a list with the names of all variables, including response value and random effects.
find_variables(
  x,
  effects = "all",
  component = "all",
  flatten = FALSE,
  verbose = TRUE
)
x |
A fitted model. |
effects |
Should variables for fixed effects ( |
component |
Which type of parameters to return, such as parameters for the conditional model, the zero-inflated part of the model, the dispersion term, the instrumental variables or marginal effects. Applies to models with zero-inflated and/or dispersion formula, or to models with instrumental variables (so-called fixed-effects regressions), or models with marginal effects (from mfx). See details in section Model Components. May be abbreviated. Note that the conditional component also refers to the count or mean component - names may differ, depending on the modeling package. There are three convenient shortcuts (not applicable to all model classes):
|
flatten |
Logical, if |
verbose |
Toggle warnings. |
A list with (depending on the model) the following elements (character vectors):
response, the name of the response variable
conditional, the names of the predictor variables from the conditional model (as opposed to the zero-inflated part of a model)
cluster, the names of cluster or grouping variables
dispersion, the names of the dispersion terms
instruments, the names of instrumental variables
random, the names of the random effects (grouping factors)
zero_inflated, the names of the predictor variables from the zero-inflated part of the model
zero_inflated_random, the names of the random effects (grouping factors)
Possible values for the component argument depend on the model class. The following are valid options:
"all": returns all model components, applies to all models, but will only have an effect for models with more than just the conditional model component.
"conditional": only returns the conditional component, i.e. "fixed effects" terms from the model. Will only have an effect for models with more than just the conditional model component.
"smooth_terms": returns smooth terms, only applies to GAMs (or similar models that may contain smooth terms).
"zero_inflated" (or "zi"): returns the zero-inflation component.
"dispersion": returns the dispersion model component. This is common for models with zero-inflation or that can model the dispersion parameter.
"instruments": for instrumental-variable or some fixed effects regression, returns the instruments.
"nonlinear": for non-linear models (like models of class nlmerMod or nls), returns starting estimates for the nonlinear parameters.
"correlation": for models with correlation-component, like gls, the variables used to describe the correlation structure are returned.
"location": returns location parameters such as conditional, zero_inflated, smooth_terms, or instruments (everything that is a fixed or random effect - depending on the effects argument - but no auxiliary parameters).
"distributional" (or "auxiliary"): components like sigma, dispersion, beta or precision (and other auxiliary parameters) are returned.
Special models
Some model classes also allow rather uncommon options. These are:
mhurdle: "infrequent_purchase", "ip", and "auxiliary"
BGGM: "correlation" and "intercept"
BFBayesFactor, glmx: "extra"
averaging: "conditional" and "full"
mjoint: "survival"
mfx: "precision", "marginal"
betareg, DirichletRegModel: "precision"
mvord: "thresholds" and "correlation"
clm2: "scale"
selection: "selection", "outcome", and "auxiliary"
For models of class brmsfit (package brms), even more options are possible for the component argument, which are not all documented in detail here.
There are four functions that return information about the variables in a model: find_predictors(), find_variables(), find_terms() and find_parameters(). There are some differences between those functions, which are explained using the following model. Note that some, but not all, of those functions return information about the dependent and independent variables. In this example, we only show the differences for the independent variables.
model <- lm(mpg ~ factor(gear), data = mtcars)
find_terms(model) returns the model terms, i.e. how the variables were used in the model, e.g. applying transformations like factor(), poly() etc. find_terms() may return a variable name multiple times in case of multiple transformations. The return value would be "factor(gear)".
find_parameters(model) returns the names of the model parameters (coefficients). The return value would be "(Intercept)", "factor(gear)4" and "factor(gear)5".
find_variables() returns the original variable names. find_variables() returns each variable name only once. The return value would be "gear".
find_predictors() is comparable to find_variables() and also returns the original variable names, but excludes the dependent (response) variables. The return value would be "gear".
The difference to find_terms() is that find_variables() returns each variable name only once, while find_terms() may return a variable multiple times in case of transformations or when arithmetic expressions were used in the formula.
data(cbpp, package = "lme4")
data(sleepstudy, package = "lme4")

# some data preparation...
cbpp$trials <- cbpp$size - cbpp$incidence
sleepstudy$mygrp <- sample(1:5, size = 180, replace = TRUE)
sleepstudy$mysubgrp <- NA
for (i in 1:5) {
  filter_group <- sleepstudy$mygrp == i
  sleepstudy$mysubgrp[filter_group] <-
    sample(1:30, size = sum(filter_group), replace = TRUE)
}

m1 <- lme4::glmer(
  cbind(incidence, size - incidence) ~ period + (1 | herd),
  data = cbpp,
  family = binomial
)
find_variables(m1)

m2 <- lme4::lmer(
  Reaction ~ Days + (1 | mygrp / mysubgrp) + (1 | Subject),
  data = sleepstudy
)
find_variables(m2)
find_variables(m2, flatten = TRUE)
Returns the name of the variable that describes the weights of a model.
find_weights(x, ...)
x |
A fitted model. |
... |
Currently not used. |
The name of the weighting variable as character vector, or NULL if no weights were specified.
data(mtcars)
mtcars$weight <- rnorm(nrow(mtcars), 1, .3)
m <- lm(mpg ~ wt + cyl + vs, data = mtcars, weights = weight)
find_weights(m)
Bayes Factor formatting
format_bf(
  bf, stars = FALSE, stars_only = FALSE, inferiority_star = "°",
  name = "BF", protect_ratio = FALSE, na_reference = NA, exact = FALSE
)
bf |
Bayes Factor. |
stars |
Add significance stars (e.g., p < .001***). For Bayes factors, the thresholds for "significant" results are values larger than 3, 10, and 30. |
stars_only |
Return only significance stars. |
inferiority_star |
String, indicating the symbol that is used to indicate inferiority, i.e. when the Bayes Factor is smaller than one third (the thresholds are 1/3, 1/10 and 1/30). |
name |
Name prefixing the text. Can be |
protect_ratio |
Should values smaller than 1 be represented as ratios? |
na_reference |
How to format missing values ( |
exact |
Should very large or very small values be reported with a scientific format (e.g., 4.24e5), or as truncated values (as "> 1000" and "< 1/1000"). |
A formatted string.
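The thresholds described for stars and inferiority_star can be illustrated with a small base R sketch. The helper bf_stars() below is hypothetical and not part of insight; it only mirrors the documented cut-offs (larger than 3, 10 and 30; smaller than 1/3, 1/10 and 1/30) and assumes strict inequalities, so the actual formatting done by format_bf() may differ in detail.

```r
# Hypothetical helper, not part of 'insight': maps Bayes factors to star strings
bf_stars <- function(bf, inferiority_star = "\u00b0") {
  # number of "evidence for" thresholds that are exceeded
  n_up <- (bf > 3) + (bf > 10) + (bf > 30)
  # number of "evidence against" thresholds that are undercut
  n_down <- (bf < 1 / 3) + (bf < 1 / 10) + (bf < 1 / 30)
  # missing Bayes factors get no symbol
  n_up[is.na(n_up)] <- 0
  n_down[is.na(n_down)] <- 0
  paste0(strrep("*", n_up), strrep(inferiority_star, n_down))
}

stars <- bf_stars(c(0.02, 0.2, 1, 5, 15, 50, NA))
print(stars)
```

A Bayes factor can only exceed one set of thresholds at a time, so each result contains either stars, inferiority symbols, or nothing.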
bfs <- c(0.000045, 0.033, NA, 1557, 3.54)
format_bf(bfs)
format_bf(bfs, exact = TRUE, name = NULL)
format_bf(bfs, stars = TRUE)
format_bf(bfs, protect_ratio = TRUE)
format_bf(bfs, protect_ratio = TRUE, exact = TRUE)
format_bf(bfs, na_reference = 1)
This function converts the first letter in a string into upper case.
format_capitalize(x, verbose = TRUE)
x |
A character vector or a factor. The latter is coerced to character. All other objects are returned unchanged. |
verbose |
Toggle warnings. |
x, with first letter capitalized.
format_capitalize("hello")
format_capitalize(c("hello", "world"))
unique(format_capitalize(iris$Species))
Confidence/Credible Interval (CI) Formatting
format_ci(CI_low, ...)

## S3 method for class 'numeric'
format_ci(
  CI_low, CI_high, ci = 0.95, digits = 2, brackets = TRUE, width = NULL,
  width_low = width, width_high = width, missing = "", zap_small = FALSE,
  ci_string = "CI", ...
)
CI_low |
Lower CI bound. Usually a numeric value, but can also be a
CI output returned |
... |
Arguments passed to or from other methods. |
CI_high |
Upper CI bound. |
ci |
CI level in percentage. |
digits |
Number of digits for rounding or significant figures. May also
be |
brackets |
Either a logical, and if |
width |
Minimum width of the returned string. If not |
width_low , width_high
|
Like |
missing |
Value by which |
zap_small |
Logical, if |
ci_string |
String to be used in the output to indicate the type of
interval. Default is |
A formatted string.
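As a quick orientation, the sketch below shows how the ci and brackets arguments change the returned string. It assumes the insight package is installed; the variable names are illustrative, and the strings in the comments are what the default settings (digits = 2, ci_string = "CI") are expected to produce.

```r
library(insight)

# CI level is prepended when 'ci' is given
ci_90 <- format_ci(1.20, 3.57, ci = 0.90)       # "90% CI [1.20, 3.57]"

# With ci = NULL only the bracketed interval is returned
ci_plain <- format_ci(1.20, 3.57, ci = NULL)    # "[1.20, 3.57]"

# Custom brackets are passed as a character vector of length two
ci_round <- format_ci(1.20, 3.57, ci = NULL, brackets = c("(", ")"))

print(c(ci_90, ci_plain, ci_round))
```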
format_ci(1.20, 3.57, ci = 0.90)
format_ci(1.20, 3.57, ci = NULL)
format_ci(1.20, 3.57, ci = NULL, brackets = FALSE)
format_ci(1.20, 3.57, ci = NULL, brackets = c("(", ")"))
format_ci(c(1.205645, 23.4), c(3.57, -1.35), ci = 0.90)
format_ci(c(1.20, NA, NA), c(3.57, -1.35, NA), ci = 0.90)

# automatic alignment of width, useful for printing multiple CIs in columns
x <- format_ci(c(1.205, 23.4, 100.43), c(3.57, -13.35, 9.4))
cat(x, sep = "\n")

x <- format_ci(c(1.205, 23.4, 100.43), c(3.57, -13.35, 9.4), width = "auto")
cat(x, sep = "\n")
Inserts line breaks into a longer message or warning string. Line length is adjusted to maximum length of the console, if the width can be accessed. By default, new lines are indented by two spaces.
format_alert() is a wrapper that combines formatting a string with a call to message(), warning() or stop(). By default, format_alert() creates a message(). format_warning() and format_error() change the default type of exception to warning() and stop(), respectively.
format_message(
  string,
  ...,
  line_length = 0.9 * getOption("width", 80),
  indent = "  "
)

format_alert(
  string,
  ...,
  line_length = 0.9 * getOption("width", 80),
  indent = "  ",
  type = "message",
  call = FALSE,
  immediate = FALSE
)

format_warning(..., immediate = FALSE)

format_error(...)
string |
A string. |
... |
Further strings that will be concatenated as indented new lines. |
line_length |
Numeric, the maximum length of a line. The default is 90% of the width of the console window. |
indent |
Character vector. If further lines are specified in |
type |
Type of exception alert to raise.
Can be |
call |
Logical. Indicating if the call should be included in the
error message. This is usually confusing for users when the function
producing the warning or error is deep within another function, so the
default is |
immediate |
Logical. Indicating if the warning should be printed
immediately. Only applies to |
There is an experimental formatting feature implemented in this function. You can use the following tags:
{.b text} for bold formatting
{.i text} to use italic font style
{.url www.url.com} formats the string as URL (i.e., enclosing URL in < and >, blue color and italic font style)
{.pkg packagename} formats the text in blue color.
This feature has some limitations: it's hard to detect the exact length for each line when the string has multiple lines (after line breaks) and the string contains formatting tags. Thus, it can happen that lines are wrapped at an earlier length than expected. Furthermore, if you have multiple words in a format tag ({.b one two three}), a line break might occur inside this tag, and the formatting no longer works (messing up the message-string).
For format_message(), a formatted string.
For format_alert() and related functions, the requested exception, with the exception formatted using format_message().
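Because format_warning() and format_error() raise ordinary R conditions, they can be intercepted with tryCatch() like any other warning or error. The following is a minimal sketch, assuming the insight package is installed; the message texts and variable names are made up for illustration.

```r
library(insight)

# format_warning() signals a regular warning, so tryCatch() can intercept it
res_warning <- tryCatch(
  format_warning(
    "Model may not have converged.",
    "Check the optimizer settings."
  ),
  warning = function(w) conditionMessage(w)
)

# format_error() signals an error via stop()
res_error <- tryCatch(
  format_error("Unsupported model class."),
  error = function(e) conditionMessage(e)
)

cat(res_warning, "\n")
cat(res_error, "\n")
```

The second string passed to format_warning() is appended as an indented new line, as described for the ... argument.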
msg <- format_message("Much too long string for just one line, I guess!",
  line_length = 15
)
message(msg)

msg <- format_message("Much too long string for just one line, I guess!",
  "First new line",
  "Second new line",
  "(both indented)",
  line_length = 30
)
message(msg)

msg <- format_message("Much too long string for just one line, I guess!",
  "First new line",
  "Second new line",
  "(not indented)",
  line_length = 30,
  indent = ""
)
message(msg)

# Caution, experimental! See 'Details'
msg <- format_message(
  "This is {.i italic}, visit {.url easystats.github.io/easystats}",
  line_length = 30
)
message(msg)

# message
format_alert("This is a message.")
format_alert("This is a warning.", type = "message")

# error
try(format_error("This is an error."))

# warning
format_warning("This is a warning.")
Convert number to words
format_number(x, textual = TRUE, ...)
x |
Number. |
textual |
Return words. If |
... |
Arguments to be passed to |
A formatted string.
The code has been adapted from here https://github.com/ateucher/useful_code/blob/master/R/numbers2words.r
format_number(2)
format_number(45)
format_number(324.68765)
Format p-values.
format_p(
  p, stars = FALSE, stars_only = FALSE, whitespace = TRUE, name = "p",
  missing = "", decimal_separator = NULL, digits = 3, ...
)
p |
value or vector of p-values. |
stars |
Add significance stars (e.g., p < .001***). For Bayes factors, the thresholds for "significant" results are values larger than 3, 10, and 30. |
stars_only |
Return only significance stars. |
whitespace |
Logical, if |
name |
Name prefixing the text. Can be |
missing |
Value by which |
decimal_separator |
Character, if not |
digits |
Number of significant digits. May also be |
... |
Arguments from other methods. |
A formatted string.
format_p(c(.02, .065, 0, .23))
format_p(c(.02, .065, 0, .23), name = NULL)
format_p(c(.02, .065, 0, .23), stars_only = TRUE)

model <- lm(mpg ~ wt + cyl, data = mtcars)
p <- coef(summary(model))[, 4]
format_p(p, digits = "apa")
format_p(p, digits = "scientific")
format_p(p, digits = "scientific2")
Probability of direction (pd) formatting
format_pd(pd, stars = FALSE, stars_only = FALSE, name = "pd")
pd |
Probability of direction (pd). |
stars |
Add significance stars (e.g., p < .001***). For Bayes factors, the thresholds for "significant" results are values larger than 3, 10, and 30. |
stars_only |
Return only significance stars. |
name |
Name prefixing the text. Can be |
A formatted string.
format_pd(0.12)
format_pd(c(0.12, 1, 0.9999, 0.98, 0.995, 0.96), name = NULL)
format_pd(c(0.12, 1, 0.9999, 0.98, 0.995, 0.96), stars = TRUE)
Percentage in ROPE formatting
format_rope(rope_percentage, name = "in ROPE", digits = 2)
rope_percentage |
Value or vector of percentages in ROPE. |
name |
Name prefixing the text. Can be |
digits |
Number of significant digits. May also be |
A formatted string.
format_rope(c(0.02, 0.12, 0.357, 0))
format_rope(c(0.02, 0.12, 0.357, 0), name = NULL)
String Values Formatting
format_string(x, ...)

## S3 method for class 'character'
format_string(x, length = NULL, abbreviate = "...", ...)
x |
String value. |
... |
Arguments passed to or from other methods. |
length |
Numeric, maximum length of the returned string. If not
|
abbreviate |
String that will be used as suffix, if |
A formatted string.
s <- "This can be considered as very long string!"

# string is shorter than max.length, so returned as is
format_string(s, 60)

# string is shortened to as many words that result in
# a string of maximum 20 chars
format_string(s, 20)
This function takes a data frame (usually with model parameters) as input and formats certain columns into a more readable layout (like collapsing separate columns for lower and upper confidence interval values). Furthermore, column names are formatted as well. Note that format_table() converts all columns into character vectors!
format_table(
  x, pretty_names = TRUE, stars = FALSE, digits = 2, ci_width = "auto",
  ci_brackets = TRUE, ci_digits = 2, p_digits = 3, rope_digits = 2,
  ic_digits = 1, zap_small = FALSE, preserve_attributes = FALSE,
  exact = TRUE, use_symbols = getOption("insight_use_symbols", FALSE),
  verbose = TRUE, ...
)
x |
A data frame of model's parameters, as returned by various functions
of the easystats-packages. May also be a result from
|
pretty_names |
Return "pretty" (i.e. more human readable) parameter names. |
stars |
If |
digits , ci_digits , p_digits , rope_digits , ic_digits
|
Number of digits for
rounding or significant figures. May also be |
ci_width |
Minimum width of the returned string for confidence
intervals. If not |
ci_brackets |
Logical, if |
zap_small |
Logical, if |
preserve_attributes |
Logical, if |
exact |
Formatting for Bayes factor columns, in case the provided data
frame contains such a column (i.e. columns named |
use_symbols |
Logical, if |
verbose |
Toggle messages and warnings. |
... |
Arguments passed to or from other methods. |
A data frame. Note that format_table() converts all columns into character vectors!
options(insight_use_symbols = TRUE) overrides the use_symbols argument and always displays symbols, if possible.
Vignettes "Formatting, printing and exporting tables" and "Formatting model parameters".
format_table(head(iris), digits = 1)

m <- lm(Sepal.Length ~ Species * Sepal.Width, data = iris)
x <- parameters::model_parameters(m)
as.data.frame(format_table(x))
as.data.frame(format_table(x, p_digits = "scientific"))

model <- rstanarm::stan_glm(
  Sepal.Length ~ Species,
  data = iris,
  refresh = 0,
  seed = 123
)
x <- parameters::model_parameters(model, ci = c(0.69, 0.89, 0.95))
as.data.frame(format_table(x))
format_value() converts numeric values into formatted string values, where formatting can be something like rounding digits, scientific notation etc. format_percent() is a short-cut for format_value(as_percent = TRUE).
format_value(x, ...)

## S3 method for class 'data.frame'
format_value(
  x, digits = 2, protect_integers = FALSE, missing = "", width = NULL,
  as_percent = FALSE, zap_small = FALSE, lead_zero = TRUE,
  style_positive = "none", style_negative = "hyphen",
  decimal_point = getOption("OutDec"), ...
)

## S3 method for class 'numeric'
format_value(
  x, digits = 2, protect_integers = FALSE, missing = "", width = NULL,
  as_percent = FALSE, zap_small = FALSE, lead_zero = TRUE,
  style_positive = "none", style_negative = "hyphen",
  decimal_point = getOption("OutDec"), ...
)

format_percent(x, ...)
x |
Numeric value. |
... |
Arguments passed to or from other methods. |
digits |
Number of digits for rounding or significant figures. May also
be |
protect_integers |
Should integers be kept as integers (i.e., without decimals)? |
missing |
Value by which |
width |
Minimum width of the returned string. If not |
as_percent |
Logical, if |
zap_small |
Logical, if |
lead_zero |
Logical, if |
style_positive |
A string that determines the style of positive numbers.
May be |
style_negative |
A string that determines the style of negative numbers.
May be |
decimal_point |
Character string containing a single character that is used as decimal point in output conversions. |
A formatted string.
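To make the effect of the most common arguments concrete, here is a short sketch that assumes the insight package is installed and the default decimal point ("."). The variable names are illustrative; the strings in the comments are the values expected under the default digits = 2.

```r
library(insight)

# Default: round to two digits and return a string
v_default <- format_value(1.2)                         # "1.20"

# Whole numbers also get decimals unless integers are protected
v_whole <- format_value(3)                             # "3.00"
v_protected <- format_value(3, protect_integers = TRUE) # "3"

# Proportions can be shown as percentages
v_percent <- format_value(0.12, as_percent = TRUE)     # "12.00%"

print(c(v_default, v_whole, v_protected, v_percent))
```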
format_value(1.20)
format_value(1.2)
format_value(1.2012313)
format_value(c(0.0045, 234, -23))
format_value(c(0.0045, 0.12, 0.34))
format_value(c(0.0045, 0.12, 0.34), as_percent = TRUE)
format_value(c(0.0045, 0.12, 0.34), digits = "scientific")
format_value(c(0.0045, 0.12, 0.34), digits = "scientific2")
format_value(c(0.045, 0.12, 0.34), lead_zero = FALSE)
format_value(c(0.0045, 0.12, 0.34), decimal_point = ",")

# default
format_value(c(0.0045, 0.123, 0.345))
# significant figures
format_value(c(0.0045, 0.123, 0.345), digits = "signif")

format_value(as.factor(c("A", "B", "A")))
format_value(iris$Species)

format_value(3)
format_value(3, protect_integers = TRUE)

format_value(head(iris))
Returns the requested auxiliary parameters from models, like dispersion, sigma, or beta...
get_auxiliary(
  x, type = "sigma", summary = TRUE, centrality = "mean",
  verbose = TRUE, ...
)

get_dispersion(x, ...)

## Default S3 method:
get_dispersion(x, ...)
x |
A model. |
type |
The name of the auxiliary parameter that should be retrieved.
|
summary |
Logical, indicates whether the full posterior samples
( |
centrality |
Only for models with posterior samples, and when
|
verbose |
Toggle warnings. |
... |
Currently not used. |
Currently, only sigma and the dispersion parameter are returned, and only for a limited set of models.
The requested auxiliary parameter, or NULL if this information could not be accessed.
See get_sigma().
There are many different definitions of "dispersion", depending on the context. get_auxiliary() returns the dispersion parameters that usually can be considered as variance-to-mean ratio for generalized (linear) mixed models. Exceptions are models of class glmmTMB, where the dispersion equals σ².
In detail, the computation of the dispersion parameter for generalized linear models is the ratio of the sum of the squared working-residuals and the residual degrees of freedom. For mixed models of class glmer, the dispersion parameter is also called φ and is the ratio of the sum of the squared Pearson-residuals and the residual degrees of freedom. For models of class glmmTMB, dispersion is σ².
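The ratio described for generalized linear models can be checked by hand in base R. The sketch below is an illustration under one assumption: the squared working residuals are weighted by the working weights, which is how base R's summary.glm() computes its dispersion estimate. It does not call insight and is not the package's own code.

```r
# Data from ?glm
clotting <- data.frame(
  u = c(5, 10, 15, 20, 30, 40, 60, 80, 100),
  lot1 = c(118, 58, 42, 35, 27, 25, 21, 19, 18)
)
model <- glm(lot1 ~ log(u), data = clotting, family = Gamma())

# Weighted sum of squared working residuals over residual degrees of freedom
w <- weights(model, type = "working")
r <- residuals(model, type = "working")
disp_working <- sum(w * r^2) / df.residual(model)

# Equivalent form based on Pearson residuals
disp_pearson <- sum(residuals(model, type = "pearson")^2) / df.residual(model)

print(c(
  working = disp_working,
  pearson = disp_pearson,
  summary = summary(model)$dispersion
))
```

All three values agree, because each weighted squared working residual equals the corresponding squared Pearson residual.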
For models of class brmsfit, there are different options for the type argument. See a list of supported auxiliary parameters here: find_parameters.BGGM().
# from ?glm
clotting <- data.frame(
  u = c(5, 10, 15, 20, 30, 40, 60, 80, 100),
  lot1 = c(118, 58, 42, 35, 27, 25, 21, 19, 18),
  lot2 = c(69, 35, 26, 21, 18, 16, 13, 12, 12)
)
model <- glm(lot1 ~ log(u), data = clotting, family = Gamma())
get_auxiliary(model, type = "dispersion") # same as summary(model)$dispersion
Returns the model's function call when available.
get_call(x)
x |
A fitted model. |
A function call.
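One thing the returned call can be used for is refitting a model with a changed formula. This is a minimal sketch, assuming the insight package is installed and that the returned object is an ordinary call that can be modified and evaluated, which holds for lm(); the names cl and m2 are illustrative.

```r
library(insight)

m <- lm(mpg ~ wt, data = mtcars)

# Retrieve the original call
cl <- get_call(m)

# Swap in a new formula and re-evaluate the call to refit the model
cl$formula <- mpg ~ wt + cyl
m2 <- eval(cl)

print(cl)
print(coef(m2))
```

The data argument of the call is evaluated again, so the data must still be available in the calling environment.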
data(mtcars)
m <- lm(mpg ~ wt + cyl + vs, data = mtcars)
get_call(m)

m <- lme4::lmer(Sepal.Length ~ Sepal.Width + (1 | Species), data = iris)
get_call(m)
This function tries to get the data that was used to fit the model and returns it as a data frame.
get_data(x, ...)
## Default S3 method:
get_data(x, source = "environment", verbose = TRUE, ...)
## S3 method for class 'glmmTMB'
get_data(
  x,
  effects = "all",
  component = "all",
  source = "environment",
  verbose = TRUE,
  ...
)
## S3 method for class 'afex_aov'
get_data(x, shape = c("long", "wide"), ...)
## S3 method for class 'rma'
get_data(
  x,
  source = "environment",
  verbose = TRUE,
  include_interval = FALSE,
  transf = NULL,
  transf_args = NULL,
  ci = 0.95,
  ...
)
x |
A fitted model. |
... |
Currently not used. |
source |
String, indicating from where data should be recovered. If
|
verbose |
Toggle messages and warnings. |
effects |
Should model data for fixed effects ( |
component |
Which type of parameters to return, such as parameters for the conditional model, the zero-inflated part of the model, the dispersion term, the instrumental variables or marginal effects. Applies to models with zero-inflated and/or dispersion formula, or to models with instrumental variables (so-called fixed-effects regressions), or models with marginal effects (from mfx). See details in section Model Components. May be abbreviated. Note that the conditional component also refers to the count or mean component; names may differ, depending on the modeling package. There are three convenient shortcuts (not applicable to all model classes):
|
shape |
Return long or wide data? Only applicable in repeated measures designs. |
include_interval |
For meta-analysis models, should normal-approximation confidence intervals be added for each response effect size? |
transf |
For meta-analysis models, if intervals are included, a function applied to each response effect size and its interval. |
transf_args |
For meta-analysis models, an optional list of arguments
passed to the |
ci |
For meta-analysis models, the Confidence Interval (CI) level if
|
The data that was used to fit the model.
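To illustrate why recovering data from the environment can differ from using the model frame, here is a minimal base R sketch. The helper `recover_data()` is hypothetical and is not how insight implements `get_data()`; it only assumes that the object passed as `data` is still available where the formula was created.

```r
clotting <- data.frame(
  u = c(5, 10, 15, 20, 30, 40, 60, 80, 100),
  lot1 = c(118, 58, 42, 35, 27, 25, 21, 19, 18)
)
model <- glm(lot1 ~ log(u), data = clotting, family = Gamma())

# The model frame stores the transformed predictor, not the original column
names(model.frame(model))

# Hypothetical helper (assumption, not part of insight): evaluate the `data`
# argument of the call in the environment where the formula was created
recover_data <- function(fit) {
  eval(getCall(fit)$data, envir = environment(formula(fit)))
}

recovered <- recover_data(model)
names(recovered)
```

The model frame contains a column named `log(u)`, whereas the recovered data frame keeps the untransformed `u`.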
Possible values for the component argument depend on the model class. The following are valid options:
"all": returns all model components, applies to all models, but will only have an effect for models with more than just the conditional model component.
"conditional": only returns the conditional component, i.e. "fixed effects" terms from the model. Will only have an effect for models with more than just the conditional model component.
"smooth_terms": returns smooth terms, only applies to GAMs (or similar models that may contain smooth terms).
"zero_inflated" (or "zi"): returns the zero-inflation component.
"dispersion": returns the dispersion model component. This is common for models with zero-inflation or that can model the dispersion parameter.
"instruments": for instrumental-variable or some fixed effects regression, returns the instruments.
"nonlinear": for non-linear models (like models of class nlmerMod or nls), returns starting estimates for the nonlinear parameters.
"correlation": for models with correlation-component, like gls, the variables used to describe the correlation structure are returned.
"location": returns location parameters such as conditional, zero_inflated, smooth_terms, or instruments (everything that is a fixed or random effect, depending on the effects argument, but no auxiliary parameters).
"distributional" (or "auxiliary"): components like sigma, dispersion, beta or precision (and other auxiliary parameters) are returned.
Special models
Some model classes also allow rather uncommon options. These are:
mhurdle: "infrequent_purchase", "ip", and "auxiliary"
BGGM: "correlation" and "intercept"
BFBayesFactor, glmx: "extra"
averaging: "conditional" and "full"
mjoint: "survival"
mfx: "precision", "marginal"
betareg, DirichletRegModel: "precision"
mvord: "thresholds" and "correlation"
clm2: "scale"
selection: "selection", "outcome", and "auxiliary"
For models of class brmsfit (package brms), even more options are possible for the component argument, which are not all documented in detail here.
data(cbpp, package = "lme4")
cbpp$trials <- cbpp$size - cbpp$incidence
m <- glm(cbind(incidence, trials) ~ period, data = cbpp, family = binomial)
head(get_data(m))
Create a reference matrix, useful for visualisation, with evenly spread and combined values. Usually used to generate predictions using get_predicted(). See this vignette for a tutorial on how to create a visualisation matrix using this function.
Alternatively, this function can also be used to extract the "grid" columns from objects generated by emmeans and marginaleffects.
get_datagrid(x, ...)
## S3 method for class 'data.frame'
get_datagrid(
  x,
  by = "all",
  factors = "reference",
  numerics = "mean",
  preserve_range = FALSE,
  reference = x,
  length = 10,
  range = "range",
  ...
)
## S3 method for class 'numeric'
get_datagrid(x, length = 10, range = "range", ...)
## S3 method for class 'factor'
get_datagrid(x, ...)
## Default S3 method:
get_datagrid(
  x,
  by = "all",
  factors = "reference",
  numerics = "mean",
  preserve_range = TRUE,
  reference = x,
  include_smooth = TRUE,
  include_random = FALSE,
  include_response = FALSE,
  data = NULL,
  verbose = TRUE,
  ...
)
## S3 method for class 'emmGrid'
get_datagrid(x, ...)
## S3 method for class 'slopes'
get_datagrid(x, ...)
x |
An object from which to construct the reference grid. |
... |
Arguments passed to or from other methods (for instance, |
by |
Indicates the focal predictors (variables) for the reference grid
and at which values focal predictors should be represented. If not specified
otherwise, representative values for numeric variables or predictors are
evenly distributed from the minimum to the maximum, with a total number of
There is a special handling of assignments with brackets, i.e. values
defined inside
For factor variables, the value(s) inside the brackets should indicate
one or more factor levels, like The remaining variables not specified in |
factors |
Type of summary for factors. Can be |
numerics |
Type of summary for numeric values. Can be |
preserve_range |
In the case of combinations between numeric variables
and factors, setting |
reference |
The reference vector from which to compute the mean and SD.
Used when standardizing or unstandardizing the grid using |
length |
Length of numeric target variables selected in |
range |
Option to control the representative values given in
|
include_smooth |
If |
include_random |
If |
include_response |
If |
data |
Optional, the data frame that was used to fit the model. Usually,
the data is retrieved via |
verbose |
Toggle warnings. |
Reference grid data frame.
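The idea behind a reference grid can be sketched in base R. The following is only an illustration of the defaults shown in the usage above (`length = 10`, `range = "range"`, `numerics = "mean"`, `factors = "reference"`); it is not insight's implementation, and the object names are made up for this example.

```r
# Focal predictor: 10 evenly spaced values from the minimum to the maximum
focal <- seq(min(iris$Sepal.Length), max(iris$Sepal.Length), length.out = 10)

# Non-focal numeric variables are held at their mean,
# factors at their reference (first) level
grid <- data.frame(
  Sepal.Length = focal,
  Sepal.Width = mean(iris$Sepal.Width),
  Petal.Length = mean(iris$Petal.Length),
  Petal.Width = mean(iris$Petal.Width),
  Species = factor(levels(iris$Species)[1], levels = levels(iris$Species))
)
grid

# Two focal predictors amount to a full crossing of their values
crossed <- expand.grid(
  Sepal.Length = range(iris$Sepal.Length),
  Species = levels(iris$Species)
)
nrow(crossed)
```

Such a data frame can then be passed as new data to a prediction function, which is the typical use of the grid.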
# Datagrids of variables and dataframes =====================================
# Single variable is of interest; all others are "fixed" ------------------
# Factors
get_datagrid(iris, by = "Species") # Returns all the levels
get_datagrid(iris, by = "Species = c('setosa', 'versicolor')") # Specify an expression
# Numeric variables
get_datagrid(iris, by = "Sepal.Length") # default spread length = 10
get_datagrid(iris, by = "Sepal.Length", length = 3) # change length
get_datagrid(iris[2:150, ],
  by = "Sepal.Length",
  factors = "mode", numerics = "median"
) # change non-targets fixing
get_datagrid(iris, by = "Sepal.Length", range = "ci", ci = 0.90) # change min/max of target
get_datagrid(iris, by = "Sepal.Length = [0, 1]") # Manually change min/max
get_datagrid(iris, by = "Sepal.Length = [sd]") # -1 SD, mean and +1 SD
# identical to previous line: -1 SD, mean and +1 SD
get_datagrid(iris, by = "Sepal.Length", range = "sd", length = 3)
get_datagrid(iris, by = "Sepal.Length = [quartiles]") # quartiles
# Numeric and categorical variables, generating a grid for plots
# default spread length = 10
get_datagrid(iris, by = c("Sepal.Length", "Species"), range = "grid")
# default spread length = 3 (-1 SD, mean and +1 SD)
get_datagrid(iris, by = c("Species", "Sepal.Length"), range = "grid")
# Standardization and unstandardization
data <- get_datagrid(iris, by = "Sepal.Length", range = "sd", length = 3)
data$Sepal.Length # It is a named vector (extract names with `names(data$Sepal.Length)`)
datawizard::standardize(data, select = "Sepal.Length")
data <- get_datagrid(iris, by = "Sepal.Length = c(-2, 0, 2)") # Manually specify values
data
datawizard::unstandardize(data, select = "Sepal.Length")
# Multiple variables are of interest, creating a combination --------------
get_datagrid(iris, by = c("Sepal.Length", "Species"), length = 3)
get_datagrid(iris, by = c("Sepal.Length", "Petal.Length"), length = c(3, 2))
get_datagrid(iris, by = c(1, 3), length = 3)
get_datagrid(iris, by = c("Sepal.Length", "Species"), preserve_range = TRUE)
get_datagrid(iris, by = c("Sepal.Length", "Species"), numerics = 0)
get_datagrid(iris, by = c("Sepal.Length = 3", "Species"))
get_datagrid(iris, by = c("Sepal.Length = c(3, 1)", "Species = 'setosa'"))
# With list-style by-argument
get_datagrid(iris, by = list(Sepal.Length = c(1, 3), Species = "setosa"))
# With models ===============================================================
# Fit a linear regression
model <- lm(Sepal.Length ~ Sepal.Width * Petal.Length, data = iris)
# Get datagrid of predictors
data <- get_datagrid(model, length = c(20, 3), range = c("range", "sd"))
# same as: get_datagrid(model, range = "grid", length = 20)
# Add predictions
data$Sepal.Length <- get_predicted(model, data = data)
# Visualize relationships (each color is at -1 SD, Mean, and + 1 SD of Petal.Length)
plot(data$Sepal.Width, data$Sepal.Length,
  col = data$Petal.Length,
  main = "Relationship at -1 SD, Mean, and + 1 SD of Petal.Length"
)
Returns model deviance (see stats::deviance()).
get_deviance(x, ...)
## Default S3 method:
get_deviance(x, verbose = TRUE, ...)
x |
A model. |
... |
Not used. |
verbose |
Toggle warnings and messages. |
For GLMMs of class glmerMod, glmmTMB or MixMod, the absolute unconditional deviance is returned (see 'Details' in ?lme4::merMod-class), i.e. minus twice the log-likelihood. To get the relative conditional deviance (relative to a saturated model, conditioned on the conditional modes of random effects), use deviance().
The value returned by get_deviance() usually equals the deviance value from summary().
The model deviance.
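As a point of orientation, the two notions of deviance mentioned above can be compared in base R for a simple linear model. This sketch uses only `stats` functions and makes no claim about what `get_deviance()` does internally.

```r
x <- lm(mpg ~ cyl, data = mtcars)

# For a linear model, the deviance is the residual sum of squares
rss <- sum(residuals(x)^2)
all.equal(deviance(x), rss)

# "Minus twice the log-likelihood" is a different quantity for the same model
minus2ll <- -2 * as.numeric(logLik(x))
c(deviance = deviance(x), minus2ll = minus2ll)
```

The two numbers differ, which is why it matters which definition a function returns for a given model class.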
data(mtcars)
x <- lm(mpg ~ cyl, data = mtcars)
get_deviance(x)
Estimate or extract residual or model-based degrees of freedom from regression models.
get_df(x, ...)
## Default S3 method:
get_df(x, type = "residual", verbose = TRUE, ...)
x |
A statistical model. |
... |
Currently not used. |
type |
Type of approximation for the degrees of freedom. Can be one of the following:
Usually, when degrees of freedom are required to calculate p-values or
confidence intervals, |
verbose |
Toggle warnings. |
Degrees of freedom for mixed models
Inferential statistics (like p-values, confidence intervals and standard errors) may be biased in mixed models when the number of clusters is small (even if the sample size of level-1 units is high). In such cases it is recommended to approximate a more accurate number of degrees of freedom for such inferential statistics (see Li and Redden 2015).
m-l-1 degrees of freedom
The m-l-1 heuristic is an approach that uses a t-distribution with fewer degrees of freedom. In particular for repeated measure designs (longitudinal data analysis), the m-l-1 heuristic is likely to be more accurate than simply using the residual or infinite degrees of freedom, because get_df(type = "ml1") returns different degrees of freedom for within-cluster and between-cluster effects. Note that the "m-l-1" heuristic is not applicable (or at least less accurate) for complex multilevel designs, e.g. with cross-classified clusters. In such cases, more accurate approaches like the Kenward-Roger approximation are recommended. However, the "m-l-1" heuristic also applies to generalized mixed models, while approaches like Kenward-Roger or Satterthwaite are limited to linear mixed models only.
Between-within degrees of freedom
The between-within denominator degrees of freedom approximation is, similar to the "m-l-1" heuristic, recommended in particular for (generalized) linear mixed models with repeated measurements (longitudinal design). get_df(type = "betwithin") implements a heuristic based on the between-within approach, i.e. this type returns different degrees of freedom for within-cluster and between-cluster effects. Note that this implementation does not return exactly the same results as shown in Li and Redden 2015, but similar ones.
Satterthwaite and Kenward-Roger degrees of freedom
Unlike simpler approximation heuristics like the "m-l-1" rule (type = "ml1"), the Satterthwaite or Kenward-Roger approximation is also applicable in more complex multilevel designs. However, the "m-l-1" or "between-within" heuristics also apply to generalized mixed models, while approaches like Kenward-Roger or Satterthwaite are limited to linear mixed models only.
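To make the role of degrees of freedom concrete, the following base R sketch shows the residual and model-based degrees of freedom for a linear model, and how the critical t-value grows as the degrees of freedom shrink. It is an illustration only and does not implement any of the mixed-model approximations described above.

```r
model <- lm(Sepal.Length ~ Petal.Length * Species, data = iris)
n <- nobs(model)
p <- length(coef(model))

# Residual degrees of freedom: observations minus estimated coefficients
n - p == df.residual(model)

# Model degrees of freedom: coefficients plus the residual variance
p + 1 == attr(logLik(model), "df")

# Fewer degrees of freedom give a larger critical value, hence wider intervals
crit <- qt(0.975, df = c(5, 30, df.residual(model), Inf))
crit
```

With only a handful of clusters, the difference between a small number of degrees of freedom and the residual or infinite degrees of freedom can noticeably change p-values and confidence intervals.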
Kenward, M. G., & Roger, J. H. (1997). Small sample inference for fixed effects from restricted maximum likelihood. Biometrics, 983-997.
Satterthwaite, F. E. (1946). An approximate distribution of estimates of variance components. Biometrics Bulletin, 2(6), 110-114.
Elff, M.; Heisig, J.P.; Schaeffer, M.; Shikano, S. (2019). Multilevel Analysis with Few Clusters: Improving Likelihood-based Methods to Provide Unbiased Estimates and Accurate Inference, British Journal of Political Science.
Li, P., Redden, D. T. (2015). Comparing denominator degrees of freedom approximations for the generalized linear mixed model in analyzing binary outcome in small sample cluster-randomized trials. BMC Medical Research Methodology, 15(1), 38
model <- lm(Sepal.Length ~ Petal.Length * Species, data = iris)
get_df(model) # same as df.residual(model)
get_df(model, type = "model") # same as attr(logLik(model), "df")
A robust and resilient alternative to stats::family(), which avoids issues with models like gamm4.
get_family(x, ...)
x |
A statistical model. |
... |
Further arguments passed to methods. |
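For readers unfamiliar with family objects, this short sketch shows what `stats::family()` returns for a simple GLM. It uses `stats::family()` as a stand-in; whether `get_family()` returns an identical object for every model class is not stated here and should not be assumed.

```r
x <- glm(vs ~ wt, data = mtcars, family = "binomial")
fam <- stats::family(x)

fam$family      # name of the distribution
fam$link        # name of the link function
fam$linkinv(0)  # inverse link evaluated at 0 on the linear predictor scale
```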
data(mtcars)
x <- glm(vs ~ wt, data = mtcars, family = "binomial")
get_family(x)
x <- mgcv::gamm(
  vs ~ am + s(wt),
  random = list(cyl = ~1),
  data = mtcars,
  family = "binomial"
)
get_family(x)
Returns the value at the intercept (i.e., the intercept parameter), and NA if there isn't one.
get_intercept(x, ...)
x |
A model. |
... |
Not used. |
The value of the intercept.
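The described behaviour (a value when an intercept exists, NA otherwise) can be mimicked for simple models with base R. The helper `intercept_or_na()` below is hypothetical and only covers models whose `coef()` returns a named vector; it is not insight's implementation.

```r
# Hypothetical helper (assumption): look up "(Intercept)" among the coefficients
intercept_or_na <- function(fit) {
  cf <- coef(fit)
  if ("(Intercept)" %in% names(cf)) unname(cf["(Intercept)"]) else NA
}

with_int <- lm(Sepal.Length ~ Petal.Width, data = iris)
without_int <- lm(Sepal.Length ~ 0 + Petal.Width, data = iris)

intercept_or_na(with_int)
intercept_or_na(without_int)
```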
get_intercept(lm(Sepal.Length ~ Petal.Width, data = iris))
get_intercept(lm(Sepal.Length ~ 0 + Petal.Width, data = iris))
get_intercept(lme4::lmer(Sepal.Length ~ Sepal.Width + (1 | Species), data = iris))
get_intercept(gamm4::gamm4(Sepal.Length ~ s(Petal.Width), data = iris))
A robust function to compute the log-likelihood of a model, as well as individual log-likelihoods (for each observation) whenever possible. Can be used as a replacement for stats::logLik() out of the box, as the returned object is of the same class (and it gives the same results by default).
get_loglikelihood_adjustment() can be used to correct the log-likelihood for models with transformed response variables. The adjustment value can be added to the log-likelihood to get the corrected value. This is done automatically in get_loglikelihood() if check_response = TRUE.
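The kind of correction described above can be worked through by hand for a log-transformed response. The sketch below uses the change-of-variables (Jacobian) term for y -> log(y); treating this as the adjustment value is an assumption made for illustration, and other transformations need a different term.

```r
# Model with a log-transformed response
x <- lm(log(mpg) ~ wt, data = mtcars)

# Log-likelihood on the log scale, as reported by stats::logLik()
ll_log_scale <- as.numeric(logLik(x))

# Jacobian term for y -> log(y): sum of log(1 / y)
adjustment <- sum(log(1 / mtcars$mpg))

# Corrected log-likelihood on the original scale of mpg
ll_original <- ll_log_scale + adjustment

# Cross-check against the log-normal density evaluated directly
sigma_ml <- sqrt(mean(residuals(x)^2))
ll_check <- sum(dlnorm(mtcars$mpg, meanlog = fitted(x), sdlog = sigma_ml, log = TRUE))
all.equal(ll_original, ll_check)
```

Only after such a correction are log-likelihoods (and criteria derived from them) comparable between models fitted to the transformed and the untransformed response.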
get_loglikelihood(x, ...)
loglikelihood(x, ...)
get_loglikelihood_adjustment(x)
## S3 method for class 'lm'
get_loglikelihood(
  x,
  estimator = "ML",
  REML = FALSE,
  check_response = FALSE,
  verbose = TRUE,
  ...
)
x |
A model. |
... |
Passed down to |
estimator |
Corresponds to the different estimators for the standard
deviation of the errors. If |
REML |
Only for linear models. This argument is present for
compatibility with |
check_response |
Logical, if |
verbose |
Toggle warnings and messages. |
get_loglikelihood() returns an object of class "logLik", also containing the log-likelihoods for each observation as a per_observation attribute (attributes(get_loglikelihood(x))$per_observation) when possible. The code was partly inspired by the nonnest2 package.
get_loglikelihood_adjustment() returns the adjustment value to be added to the log-likelihood to correct for transformed response variables, or NULL if the adjustment could not be computed.
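What "log-likelihoods for each observation" means can be shown for a linear model with base R. This is a sketch under the assumption of the ML estimator for the residual standard deviation; it does not reproduce the structure of the object returned by `get_loglikelihood()`.

```r
x <- lm(Sepal.Length ~ Petal.Width + Species, data = iris)
res <- residuals(x)

# ML estimate of the residual standard deviation (divides by n, not n - p)
sigma_ml <- sqrt(mean(res^2))

# Log-likelihood contribution of each observation
per_obs <- dnorm(res, mean = 0, sd = sigma_ml, log = TRUE)

# Their sum matches stats::logLik()
all.equal(sum(per_obs), as.numeric(logLik(x)))
```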
x <- lm(Sepal.Length ~ Petal.Width + Species, data = iris)
get_loglikelihood(x, estimator = "ML") # Equivalent to stats::logLik(x)
get_loglikelihood(x, estimator = "REML") # Equivalent to stats::logLik(x, REML=TRUE)
get_loglikelihood(x, estimator = "OLS")
Creates a design matrix from the description. Any character variables are coerced to factors.
get_modelmatrix(x, ...)
x |
An object. |
... |
Passed down to other methods (mainly |
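The coercion of character variables to factors can be seen with `stats::model.matrix()` on a small made-up data set. The data frame `d` is invented for this illustration, and the sketch uses base R rather than `get_modelmatrix()` itself.

```r
# Toy data (assumption): `g` is stored as character, not as factor
d <- data.frame(
  y = c(1.2, 2.3, 3.1, 4.8, 5.0, 6.4),
  g = c("a", "b", "c", "a", "b", "c"),
  stringsAsFactors = FALSE
)
model <- lm(y ~ g, data = d)

# The character column is expanded into dummy columns, as a factor would be
mm <- model.matrix(model)
colnames(mm)
```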
data(mtcars)
model <- lm(am ~ vs, data = mtcars)
get_modelmatrix(model)
Returns the coefficients (or posterior samples for Bayesian models) from a model. See the documentation for your object's class:
Bayesian models (rstanarm, brms, MCMCglmm, ...)
Estimated marginal means (emmeans)
Generalized additive models (mgcv, VGAM, ...)
Marginal effects models (mfx)
Mixed models (lme4, glmmTMB, GLMMadaptive, ...)
Zero-inflated and hurdle models (pscl, ...)
Models with special components (betareg, MuMIn, ...)
Hypothesis tests (htest)
get_parameters(x, ...)
## Default S3 method:
get_parameters(x, verbose = TRUE, ...)
x |
A fitted model. |
... |
Currently not used. |
verbose |
Toggle messages and warnings. |
In most cases when models either return different "effects" (fixed, random) or "components" (conditional, zero-inflated, ...), the arguments effects and component can be used.
get_parameters() is comparable to coef(); however, the coefficients are returned as a data frame (with columns for names and point estimates of coefficients). For Bayesian models, the posterior samples of parameters are returned.
For non-Bayesian models, a data frame with two columns: the parameter names and the related point estimates.
For Anova (aov()) with error term, a list of parameters for the conditional and the random effects parameters.
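The relation to `coef()` can be sketched in base R as follows. The column names `Parameter` and `Estimate` are an assumption made for this illustration; check the actual output of `get_parameters()` before relying on them.

```r
m <- lm(mpg ~ wt + cyl + vs, data = mtcars)
cf <- coef(m)

# Named coefficient vector reshaped into a two-column data frame
params <- data.frame(
  Parameter = names(cf),
  Estimate = unname(cf),
  stringsAsFactors = FALSE
)
params
```

A data frame is often easier to filter, merge or print than a named vector, which is the practical difference from `coef()`.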
Possible values for the component argument depend on the model class. The following are valid options:
"all": returns all model components, applies to all models, but will only have an effect for models with more than just the conditional model component.
"conditional": only returns the conditional component, i.e. "fixed effects" terms from the model. Will only have an effect for models with more than just the conditional model component.
"smooth_terms": returns smooth terms, only applies to GAMs (or similar models that may contain smooth terms).
"zero_inflated" (or "zi"): returns the zero-inflation component.
"dispersion": returns the dispersion model component. This is common for models with zero-inflation or that can model the dispersion parameter.
"instruments": for instrumental-variable or some fixed effects regression, returns the instruments.
"nonlinear": for non-linear models (like models of class nlmerMod or nls), returns starting estimates for the nonlinear parameters.
"correlation": for models with correlation-component, like gls, the variables used to describe the correlation structure are returned.
"location": returns location parameters such as conditional, zero_inflated, smooth_terms, or instruments (everything that is a fixed or random effect, depending on the effects argument, but no auxiliary parameters).
"distributional" (or "auxiliary"): components like sigma, dispersion, beta or precision (and other auxiliary parameters) are returned.
Special models
Some model classes also allow rather uncommon options. These are:
mhurdle: "infrequent_purchase", "ip", and "auxiliary"
BGGM: "correlation" and "intercept"
BFBayesFactor, glmx: "extra"
averaging: "conditional" and "full"
mjoint: "survival"
mfx: "precision", "marginal"
betareg, DirichletRegModel: "precision"
mvord: "thresholds" and "correlation"
clm2: "scale"
selection: "selection", "outcome", and "auxiliary"
For models of class brmsfit (package brms), even more options are possible for the component argument, which are not all documented in detail here.
data(mtcars)
m <- lm(mpg ~ wt + cyl + vs, data = mtcars)
get_parameters(m)
Returns the coefficients from a model.
## S3 method for class 'betamfx'
get_parameters(x, component = "all", ...)
x |
A fitted model. |
component |
Which type of parameters to return, such as parameters for the conditional model, the zero-inflated part of the model, the dispersion term, the instrumental variables or marginal effects. Applies to models with zero-inflated and/or dispersion formula, or to models with instrumental variables (so-called fixed-effects regressions), or models with marginal effects (from mfx). See details in section Model Components. May be abbreviated. Note that the conditional component also refers to the count or mean component; names may differ, depending on the modeling package. There are three convenient shortcuts (not applicable to all model classes):
|
... |
Currently not used. |
A data frame with three columns: the parameter names, the related point estimates and the component.
Possible values for the component argument depend on the model class. The following are valid options:
"all": returns all model components, applies to all models, but will only have an effect for models with more than just the conditional model component.
"conditional": only returns the conditional component, i.e. "fixed effects" terms from the model. Will only have an effect for models with more than just the conditional model component.
"smooth_terms": returns smooth terms, only applies to GAMs (or similar models that may contain smooth terms).
"zero_inflated" (or "zi"): returns the zero-inflation component.
"dispersion": returns the dispersion model component. This is common for models with zero-inflation or that can model the dispersion parameter.
"instruments": for instrumental-variable or some fixed effects regression, returns the instruments.
"nonlinear": for non-linear models (like models of class nlmerMod or nls), returns starting estimates for the nonlinear parameters.
"correlation": for models with correlation-component, like gls, the variables used to describe the correlation structure are returned.
"location": returns location parameters such as conditional, zero_inflated, smooth_terms, or instruments (everything that is a fixed or random effect, depending on the effects argument, but no auxiliary parameters).
"distributional" (or "auxiliary"): components like sigma, dispersion, beta or precision (and other auxiliary parameters) are returned.
Special models
Some model classes also allow rather uncommon options. These are:
mhurdle: "infrequent_purchase", "ip", and "auxiliary"
BGGM: "correlation" and "intercept"
BFBayesFactor, glmx: "extra"
averaging: "conditional" and "full"
mjoint: "survival"
mfx: "precision", "marginal"
betareg, DirichletRegModel: "precision"
mvord: "thresholds" and "correlation"
clm2: "scale"
selection: "selection", "outcome", and "auxiliary"
For models of class brmsfit (package brms), even more options are possible for the component argument, which are not all documented in detail here.
data(mtcars)
m <- lm(mpg ~ wt + cyl + vs, data = mtcars)
get_parameters(m)
Returns the coefficients from a model.
## S3 method for class 'betareg'
get_parameters(x, component = "all", ...)
x |
A fitted model. |
component |
Which type of parameters to return, such as parameters for the conditional model, the zero-inflated part of the model, the dispersion term, the instrumental variables or marginal effects. Applies to models with zero-inflated and/or dispersion formula, or to models with instrumental variables (so-called fixed-effects regressions), or models with marginal effects (from mfx). See details in section Model Components. May be abbreviated. Note that the conditional component also refers to the count or mean component; names may differ, depending on the modeling package. There are three convenient shortcuts (not applicable to all model classes):
|
... |
Currently not used. |
A data frame with three columns: the parameter names, the related point estimates and the component.
Possible values for the component
argument depend on the model class.
Following are valid options:
"all"
: returns all model components, applies to all models, but will only
have an effect for models with more than just the conditional model component.
"conditional"
: only returns the conditional component, i.e. "fixed effects"
terms from the model. Will only have an effect for models with more than
just the conditional model component.
"smooth_terms"
: returns smooth terms, only applies to GAMs (or similar
models that may contain smooth terms).
"zero_inflated"
(or "zi"
): returns the zero-inflation component.
"dispersion"
: returns the dispersion model component. This is common
for models with zero-inflation or that can model the dispersion parameter.
"instruments"
: for instrumental-variable or some fixed effects regression,
returns the instruments.
"nonlinear"
: for non-linear models (like models of class nlmerMod
or
nls
), returns staring estimates for the nonlinear parameters.
"correlation"
: for models with correlation-component, like gls
, the
variables used to describe the correlation structure are returned.
"location"
: returns location parameters such as conditional
,
zero_inflated
, smooth_terms
, or instruments
(everything that are
fixed or random effects - depending on the effects
argument - but no
auxiliary parameters).
"distributional"
(or "auxiliary"
): components like sigma
, dispersion
,
beta
or precision
(and other auxiliary parameters) are returned.
Special models
Some model classes also allow rather uncommon options. These are:
mhurdle: "infrequent_purchase"
, "ip"
, and "auxiliary"
BGGM: "correlation"
and "intercept"
BFBayesFactor, glmx: "extra"
averaging: "conditional"
and "full"
mjoint: "survival"
mfx: "precision"
, "marginal"
betareg, DirichletRegModel: "precision"
mvord: "thresholds"
and "correlation"
clm2: "scale"
selection: "selection"
, "outcome"
, and "auxiliary"
For models of class brmsfit
(package brms), even more options are
possible for the component
argument, which are not all documented in detail
here.
data("GasolineYield", package = "betareg")
m <- betareg::betareg(yield ~ batch + temp, data = GasolineYield)
get_parameters(m)
get_parameters(m, component = "precision")
Returns the coefficients (or posterior samples for Bayesian models) from a model.
## S3 method for class 'BGGM'
get_parameters(
  x,
  component = "correlation",
  summary = FALSE,
  centrality = "mean",
  ...
)

## S3 method for class 'BFBayesFactor'
get_parameters(
  x,
  effects = "all",
  component = "all",
  iterations = 4000,
  progress = FALSE,
  verbose = TRUE,
  summary = FALSE,
  centrality = "mean",
  ...
)

## S3 method for class 'brmsfit'
get_parameters(
  x,
  effects = "fixed",
  component = "all",
  parameters = NULL,
  summary = FALSE,
  centrality = "mean",
  ...
)
x |
A fitted model. |
component |
Which type of parameters to return, such as parameters for the conditional model, the zero-inflated part of the model, the dispersion term, the instrumental variables, or marginal effects. Applies to models with a zero-inflated and/or dispersion formula, to models with instrumental variables (so-called fixed-effects regressions), or to models with marginal effects (from mfx). See details in section Model Components. May be abbreviated. Note that the conditional component also refers to the count or mean component; names may differ depending on the modeling package. There are three convenient shortcuts (not applicable to all model classes):
|
summary |
Logical, indicates whether the full posterior samples
( |
centrality |
Only for models with posterior samples, and when
|
... |
Currently not used. |
effects |
Should variables for fixed effects ( |
iterations |
Number of posterior draws. |
progress |
Display progress. |
verbose |
Toggle messages and warnings. |
parameters |
Regular expression pattern that describes the parameters that should be returned. |
In most cases when models either return different "effects" (fixed,
random) or "components" (conditional, zero-inflated, ...), the arguments
effects
and component
can be used.
The posterior samples from the requested parameters as a data frame.
If summary = TRUE
, returns a data frame with two columns: the
parameter names and the related point estimates (based on centrality
).
Note that for BFBayesFactor
models (from the BayesFactor package),
posteriors are only extracted from the first numerator model (i.e.,
model[1]
). If you want to apply some function foo()
to another
model stored in the BFBayesFactor
object, index it directly, e.g.
foo(model[2])
, foo(1/model[5])
, etc.
See also bayestestR::weighted_posteriors()
.
Possible values for the component
argument depend on the model class.
Following are valid options:
"all"
: returns all model components, applies to all models, but will only
have an effect for models with more than just the conditional model component.
"conditional"
: only returns the conditional component, i.e. "fixed effects"
terms from the model. Will only have an effect for models with more than
just the conditional model component.
"smooth_terms"
: returns smooth terms, only applies to GAMs (or similar
models that may contain smooth terms).
"zero_inflated"
(or "zi"
): returns the zero-inflation component.
"dispersion"
: returns the dispersion model component. This is common
for models with zero-inflation or that can model the dispersion parameter.
"instruments"
: for instrumental-variable or some fixed effects regression,
returns the instruments.
"nonlinear"
: for non-linear models (like models of class nlmerMod
or
nls
), returns starting estimates for the nonlinear parameters.
"correlation"
: for models with correlation-component, like gls
, the
variables used to describe the correlation structure are returned.
"location"
: returns location parameters such as conditional
,
zero_inflated
, smooth_terms
, or instruments
(all
fixed or random effects, depending on the effects
argument, but no
auxiliary parameters).
"distributional"
(or "auxiliary"
): components like sigma
, dispersion
,
beta
or precision
(and other auxiliary parameters) are returned.
Special models
Some model classes also allow rather uncommon options. These are:
mhurdle: "infrequent_purchase"
, "ip"
, and "auxiliary"
BGGM: "correlation"
and "intercept"
BFBayesFactor, glmx: "extra"
averaging: "conditional"
and "full"
mjoint: "survival"
mfx: "precision"
, "marginal"
betareg, DirichletRegModel: "precision"
mvord: "thresholds"
and "correlation"
clm2: "scale"
selection: "selection"
, "outcome"
, and "auxiliary"
For models of class brmsfit
(package brms), even more options are
possible for the component
argument, which are not all documented in detail
here.
data(mtcars)
m <- lm(mpg ~ wt + cyl + vs, data = mtcars)
get_parameters(m)
Returns the coefficients from a model.
## S3 method for class 'emmGrid'
get_parameters(x, summary = FALSE, merge_parameters = FALSE, ...)
x |
A fitted model. |
summary |
Logical, indicates whether the full posterior samples
( |
merge_parameters |
Logical, if |
... |
Currently not used. |
A data frame with two columns: the parameter names and the related point estimates.
Note that emmGrid
or emm_list
objects returned by functions from
emmeans have a different structure compared to usual regression models.
Hence, the Parameter
column does not always contain names of variables,
but may rather contain values, e.g. for contrasts. See an example for
pairwise comparisons below.
data(mtcars)
model <- lm(mpg ~ wt * factor(cyl), data = mtcars)

# estimated marginal means
emm <- emmeans::emmeans(model, "cyl")
get_parameters(emm)

# pairwise comparisons
emm <- emmeans::emmeans(model, pairwise ~ cyl)
get_parameters(emm)
Returns the coefficients from a model.
## S3 method for class 'gamm'
get_parameters(x, component = "all", ...)
x |
A fitted model. |
component |
Which type of parameters to return, such as parameters for the conditional model, the zero-inflated part of the model, the dispersion term, the instrumental variables, or marginal effects. Applies to models with a zero-inflated and/or dispersion formula, to models with instrumental variables (so-called fixed-effects regressions), or to models with marginal effects (from mfx). See details in section Model Components. May be abbreviated. Note that the conditional component also refers to the count or mean component; names may differ depending on the modeling package. There are three convenient shortcuts (not applicable to all model classes):
|
... |
Currently not used. |
For models with smooth terms or zero-inflation component, a data frame with three columns: the parameter names, the related point estimates and the component.
Possible values for the component
argument depend on the model class.
Following are valid options:
"all"
: returns all model components, applies to all models, but will only
have an effect for models with more than just the conditional model component.
"conditional"
: only returns the conditional component, i.e. "fixed effects"
terms from the model. Will only have an effect for models with more than
just the conditional model component.
"smooth_terms"
: returns smooth terms, only applies to GAMs (or similar
models that may contain smooth terms).
"zero_inflated"
(or "zi"
): returns the zero-inflation component.
"dispersion"
: returns the dispersion model component. This is common
for models with zero-inflation or that can model the dispersion parameter.
"instruments"
: for instrumental-variable or some fixed effects regression,
returns the instruments.
"nonlinear"
: for non-linear models (like models of class nlmerMod
or
nls
), returns starting estimates for the nonlinear parameters.
"correlation"
: for models with correlation-component, like gls
, the
variables used to describe the correlation structure are returned.
"location"
: returns location parameters such as conditional
,
zero_inflated
, smooth_terms
, or instruments
(all
fixed or random effects, depending on the effects
argument, but no
auxiliary parameters).
"distributional"
(or "auxiliary"
): components like sigma
, dispersion
,
beta
or precision
(and other auxiliary parameters) are returned.
Special models
Some model classes also allow rather uncommon options. These are:
mhurdle: "infrequent_purchase"
, "ip"
, and "auxiliary"
BGGM: "correlation"
and "intercept"
BFBayesFactor, glmx: "extra"
averaging: "conditional"
and "full"
mjoint: "survival"
mfx: "precision"
, "marginal"
betareg, DirichletRegModel: "precision"
mvord: "thresholds"
and "correlation"
clm2: "scale"
selection: "selection"
, "outcome"
, and "auxiliary"
For models of class brmsfit
(package brms), even more options are
possible for the component
argument, which are not all documented in detail
here.
data(mtcars)
m <- lm(mpg ~ wt + cyl + vs, data = mtcars)
get_parameters(m)
Returns the coefficients from a model.
## S3 method for class 'glmmTMB'
get_parameters(x, effects = "fixed", component = "all", ...)
x |
A fitted model. |
effects |
Should variables for fixed effects ( |
component |
Which type of parameters to return, such as parameters for the conditional model, the zero-inflated part of the model, the dispersion term, the instrumental variables, or marginal effects. Applies to models with a zero-inflated and/or dispersion formula, to models with instrumental variables (so-called fixed-effects regressions), or to models with marginal effects (from mfx). See details in section Model Components. May be abbreviated. Note that the conditional component also refers to the count or mean component; names may differ depending on the modeling package. There are three convenient shortcuts (not applicable to all model classes):
|
... |
Currently not used. |
In most cases when models either return different "effects" (fixed,
random) or "components" (conditional, zero-inflated, ...), the arguments
effects
and component
can be used. See details in the section
Model Components.
If effects = "fixed"
, a data frame with two columns: the
parameter names and the related point estimates. If effects = "random"
, a list of data frames with the random effects (as returned by
ranef()
), unless the random effects have the same simplified
structure as fixed effects (e.g. for models from MCMCglmm).
Possible values for the component
argument depend on the model class.
Following are valid options:
"all"
: returns all model components, applies to all models, but will only
have an effect for models with more than just the conditional model component.
"conditional"
: only returns the conditional component, i.e. "fixed effects"
terms from the model. Will only have an effect for models with more than
just the conditional model component.
"smooth_terms"
: returns smooth terms, only applies to GAMs (or similar
models that may contain smooth terms).
"zero_inflated"
(or "zi"
): returns the zero-inflation component.
"dispersion"
: returns the dispersion model component. This is common
for models with zero-inflation or that can model the dispersion parameter.
"instruments"
: for instrumental-variable or some fixed effects regression,
returns the instruments.
"nonlinear"
: for non-linear models (like models of class nlmerMod
or
nls
), returns starting estimates for the nonlinear parameters.
"correlation"
: for models with correlation-component, like gls
, the
variables used to describe the correlation structure are returned.
"location"
: returns location parameters such as conditional
,
zero_inflated
, smooth_terms
, or instruments
(all
fixed or random effects, depending on the effects
argument, but no
auxiliary parameters).
"distributional"
(or "auxiliary"
): components like sigma
, dispersion
,
beta
or precision
(and other auxiliary parameters) are returned.
Special models
Some model classes also allow rather uncommon options. These are:
mhurdle: "infrequent_purchase"
, "ip"
, and "auxiliary"
BGGM: "correlation"
and "intercept"
BFBayesFactor, glmx: "extra"
averaging: "conditional"
and "full"
mjoint: "survival"
mfx: "precision"
, "marginal"
betareg, DirichletRegModel: "precision"
mvord: "thresholds"
and "correlation"
clm2: "scale"
selection: "selection"
, "outcome"
, and "auxiliary"
For models of class brmsfit
(package brms), even more options are
possible for the component
argument, which are not all documented in detail
here.
data(Salamanders, package = "glmmTMB")
m <- glmmTMB::glmmTMB(
  count ~ mined + (1 | site),
  ziformula = ~mined,
  family = poisson(),
  data = Salamanders
)
get_parameters(m)
Returns the parameters from a hypothesis test.
## S3 method for class 'htest'
get_parameters(x, ...)
x |
A fitted model. |
... |
Currently not used. |
A data frame with two columns: the parameter names and the related point estimates.
get_parameters(t.test(1:10, y = c(7:20)))
Returns the coefficients from a model.
## S3 method for class 'zeroinfl'
get_parameters(x, component = "all", ...)
x |
A fitted model. |
component |
Which type of parameters to return, such as parameters for the conditional model, the zero-inflated part of the model, the dispersion term, the instrumental variables, or marginal effects. Applies to models with a zero-inflated and/or dispersion formula, to models with instrumental variables (so-called fixed-effects regressions), or to models with marginal effects (from mfx). See details in section Model Components. May be abbreviated. Note that the conditional component also refers to the count or mean component; names may differ depending on the modeling package. There are three convenient shortcuts (not applicable to all model classes):
|
... |
Currently not used. |
For models with smooth terms or zero-inflation component, a data frame with three columns: the parameter names, the related point estimates and the component.
Possible values for the component
argument depend on the model class.
Following are valid options:
"all"
: returns all model components, applies to all models, but will only
have an effect for models with more than just the conditional model component.
"conditional"
: only returns the conditional component, i.e. "fixed effects"
terms from the model. Will only have an effect for models with more than
just the conditional model component.
"smooth_terms"
: returns smooth terms, only applies to GAMs (or similar
models that may contain smooth terms).
"zero_inflated"
(or "zi"
): returns the zero-inflation component.
"dispersion"
: returns the dispersion model component. This is common
for models with zero-inflation or that can model the dispersion parameter.
"instruments"
: for instrumental-variable or some fixed effects regression,
returns the instruments.
"nonlinear"
: for non-linear models (like models of class nlmerMod
or
nls
), returns starting estimates for the nonlinear parameters.
"correlation"
: for models with correlation-component, like gls
, the
variables used to describe the correlation structure are returned.
"location"
: returns location parameters such as conditional
,
zero_inflated
, smooth_terms
, or instruments
(all
fixed or random effects, depending on the effects
argument, but no
auxiliary parameters).
"distributional"
(or "auxiliary"
): components like sigma
, dispersion
,
beta
or precision
(and other auxiliary parameters) are returned.
Special models
Some model classes also allow rather uncommon options. These are:
mhurdle: "infrequent_purchase"
, "ip"
, and "auxiliary"
BGGM: "correlation"
and "intercept"
BFBayesFactor, glmx: "extra"
averaging: "conditional"
and "full"
mjoint: "survival"
mfx: "precision"
, "marginal"
betareg, DirichletRegModel: "precision"
mvord: "thresholds"
and "correlation"
clm2: "scale"
selection: "selection"
, "outcome"
, and "auxiliary"
For models of class brmsfit
(package brms), even more options are
possible for the component
argument, which are not all documented in detail
here.
data(mtcars)
m <- lm(mpg ~ wt + cyl + vs, data = mtcars)
get_parameters(m)
The get_predicted()
function is a robust, flexible and user-friendly
alternative to the base R predict()
function. Additional features and
advantages include availability of uncertainty intervals (CI), bootstrapping,
a more intuitive API and the support of more models than base R's predict()
function. However, although the interface is simplified, it is still very
important to read the documentation of the arguments. This is because making
"predictions" (a loose term for a variety of things) is a non-trivial process,
with lots of caveats and complications. Read the 'Details' section for more
information.
get_predicted_ci()
returns the confidence (or prediction) interval (CI)
associated with predictions made by a model. This function can be called
separately on a vector of predicted values. get_predicted()
usually
returns confidence intervals (included as attribute, and accessible via the
as.data.frame()
method) by default. It is preferred to rely on the
get_predicted()
function for standard errors and confidence intervals -
use get_predicted_ci()
only if standard errors and confidence intervals
are not available otherwise.
get_predicted(x, ...)

## Default S3 method:
get_predicted(
  x,
  data = NULL,
  predict = "expectation",
  ci = NULL,
  ci_type = "confidence",
  ci_method = NULL,
  dispersion_method = "sd",
  vcov = NULL,
  vcov_args = NULL,
  verbose = TRUE,
  ...
)

## S3 method for class 'lm'
get_predicted(
  x,
  data = NULL,
  predict = "expectation",
  ci = NULL,
  iterations = NULL,
  verbose = TRUE,
  ...
)

## S3 method for class 'stanreg'
get_predicted(
  x,
  data = NULL,
  predict = "expectation",
  iterations = NULL,
  ci = NULL,
  ci_method = NULL,
  include_random = "default",
  include_smooth = TRUE,
  verbose = TRUE,
  ...
)

## S3 method for class 'gam'
get_predicted(
  x,
  data = NULL,
  predict = "expectation",
  ci = NULL,
  include_random = TRUE,
  include_smooth = TRUE,
  iterations = NULL,
  verbose = TRUE,
  ...
)

## S3 method for class 'lmerMod'
get_predicted(
  x,
  data = NULL,
  predict = "expectation",
  ci = NULL,
  ci_method = NULL,
  include_random = "default",
  iterations = NULL,
  verbose = TRUE,
  ...
)

## S3 method for class 'principal'
get_predicted(x, data = NULL, ...)
x |
A statistical model (can also be a data.frame, in which case the second argument has to be a model). |
... |
Other argument to be passed, for instance to |
data |
An optional data frame in which to look for variables with which
to predict. If omitted, the data used to fit the model is used. Visualization
matrices can be generated using |
predict |
string or
|
ci |
The interval level. Default is |
ci_type |
Can be |
ci_method |
The method for computing p values and confidence intervals. Possible values depend on model type.
See |
dispersion_method |
Bootstrap dispersion and Bayesian posterior summary:
|
vcov |
Variance-covariance matrix used to compute uncertainty estimates (e.g., for robust standard errors). This argument accepts a covariance matrix, a function which returns a covariance matrix, or a string which identifies the function to be used to compute the covariance matrix.
|
vcov_args |
List of arguments to be passed to the function identified by
the |
verbose |
Toggle warnings. |
iterations |
For Bayesian models, this corresponds to the number of
posterior draws. If |
include_random |
If |
include_smooth |
For Generalized Additive Models (GAMs). If |
In insight::get_predicted()
, the predict
argument jointly
modulates two separate concepts, the scale and the uncertainty interval.
The fitted values (i.e. predictions for the response). For Bayesian
or bootstrapped models (when iterations != NULL
), iterations (as
columns, with observations as rows) can be accessed via as.data.frame()
.
Linear models - lm()
: For linear models, prediction
intervals (predict="prediction"
) show the range that likely
contains the value of a new observation (in what range it is likely to
fall), whereas confidence intervals (predict="expectation"
or
predict="link"
) reflect the uncertainty around the estimated
parameters (and give the range of uncertainty of the regression line). In
general, Prediction Intervals (PIs) account for both the uncertainty in the
model's parameters and the random variation of the individual values.
Thus, prediction intervals are always wider than confidence intervals.
Moreover, prediction intervals will not necessarily become narrower as the
sample size increases (as they do not reflect only the quality of the fit,
but also the variability within the data).
Generalized Linear models - glm()
: For binomial models,
prediction intervals are somewhat useless (for instance, for a binomial
(Bernoulli) model for which the dependent variable is a vector of 1s and
0s, the prediction interval is... [0, 1]
).
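The difference between the two interval types for a linear model can be checked directly. The following is a minimal sketch that assumes only base R: it calls stats::predict() on an lm fit rather than get_predicted(), and the object names (fit, conf_int, pred_int) are chosen for illustration.

```r
# Base R sketch: confidence vs. prediction intervals for a linear model
data(mtcars)
fit <- lm(mpg ~ cyl + hp, data = mtcars)

# Passing newdata explicitly avoids the "future responses" warning that
# predict.lm() emits for prediction intervals on the fitted data.
conf_int <- predict(fit, newdata = mtcars, interval = "confidence", level = 0.95)
pred_int <- predict(fit, newdata = mtcars, interval = "prediction", level = 0.95)

ci_width <- conf_int[, "upr"] - conf_int[, "lwr"]
pi_width <- pred_int[, "upr"] - pred_int[, "lwr"]

# The prediction interval adds the residual variance, so it is wider
# for every observation.
all(pi_width > ci_width)
```

The point estimates in the "fit" column are identical in both results; only the interval bounds differ.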
When users set the predict
argument to "expectation"
, the predictions
are returned on the response scale, which is arguably the most convenient
way to understand and visualize relationships of interest. When users set
the predict
argument to "link"
, predictions are returned on the link
scale, and no transformation is applied. For instance, for a logistic
regression model, the response scale corresponds to the predicted
probabilities, whereas the link-scale makes predictions of log-odds
(probabilities on the logit scale). Note that when users select
predict="classification"
in binomial models, the get_predicted()
function will first calculate predictions as if the user had selected
predict="expectation"
. Then, it will round the responses in order to
return the most likely outcome.
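The relation between the link scale, the response scale and a rounded classification can be sketched with base R alone. This is an illustration under the assumption of a logistic regression fitted with glm(); it uses stats::predict() instead of get_predicted(), and the variable names are hypothetical.

```r
# Base R sketch: link scale, response scale and classification
data(mtcars)
fit <- glm(vs ~ wt, data = mtcars, family = binomial())

# Log-odds (link scale), no transformation applied
link_scale <- predict(fit, type = "link")

# Predicted probabilities (response scale)
response_scale <- predict(fit, type = "response")

# For a logit link, the inverse link function is plogis()
all.equal(plogis(link_scale), response_scale)

# Rounding the probabilities yields the most likely outcome (0 or 1)
classified <- round(response_scale)
table(classified, observed = mtcars$vs)
```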
The arguments vcov
and vcov_args
can be used to calculate robust
standard errors for confidence intervals of predictions. These arguments,
when provided in get_predicted()
, are passed down to get_predicted_ci()
. See the related documentation there for more
details.
For predictions based on multiple iterations, for instance in the case of Bayesian
models and bootstrapped predictions, the function used to compute the centrality
(point-estimate predictions) can be modified via the centrality_function
argument. For instance, get_predicted(model, centrality_function = stats::median)
.
The default is mean
. Individual draws can be accessed by running
iter <- as.data.frame(get_predicted(model))
, and their iterations can be
reshaped into a long format by bayestestR::reshape_iterations(iter)
.
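What a centrality function does to a set of iterations can be illustrated without a Bayesian model. The sketch below is an assumption-laden stand-in: it builds a matrix of bootstrap draws by hand in base R (observations as rows, iterations as columns), summarises each row with the mean and the median, and reshapes the draws into a long format. It does not call get_predicted() or bayestestR::reshape_iterations(); the names draws, center_mean, center_median and long are hypothetical.

```r
# Base R sketch: summarising iterations with different centrality functions
set.seed(42)
data(mtcars)

# Hypothetical stand-in for "iterations": 50 bootstrap refits
draws <- replicate(50, {
  idx <- sample(nrow(mtcars), replace = TRUE)
  refit <- lm(mpg ~ cyl + hp, data = mtcars[idx, ])
  predict(refit, newdata = mtcars)
})

# One point estimate per observation (row), using two centrality functions
center_mean <- rowMeans(draws)
center_median <- apply(draws, 1, stats::median)

# Long format: one row per observation and iteration
long <- data.frame(
  observation = rep(seq_len(nrow(draws)), times = ncol(draws)),
  iteration = rep(seq_len(ncol(draws)), each = nrow(draws)),
  value = as.vector(draws)
)
head(long)
```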
data(mtcars)
x <- lm(mpg ~ cyl + hp, data = mtcars)

predictions <- get_predicted(x, ci = 0.95)
predictions

# Options and methods ---------------------
get_predicted(x, predict = "prediction")

# Get CI
as.data.frame(predictions)

# Bootstrapped
as.data.frame(get_predicted(x, iterations = 4))
# Same as as.data.frame(..., keep_iterations = FALSE)
summary(get_predicted(x, iterations = 4))

# Different prediction types ------------------------
data(iris)
data <- droplevels(iris[1:100, ])

# Fit a logistic model
x <- glm(Species ~ Sepal.Length, data = data, family = "binomial")

# Expectation (default): response scale + CI
pred <- get_predicted(x, predict = "expectation", ci = 0.95)
head(as.data.frame(pred))

# Prediction: response scale + PI
pred <- get_predicted(x, predict = "prediction", ci = 0.95)
head(as.data.frame(pred))

# Link: link scale + CI
pred <- get_predicted(x, predict = "link", ci = 0.95)
head(as.data.frame(pred))

# Classification: classification "type" + PI
pred <- get_predicted(x, predict = "classification", ci = 0.95)
head(as.data.frame(pred))
Confidence intervals around predicted values
get_predicted_ci(x, ...)

## Default S3 method:
get_predicted_ci(
  x,
  predictions = NULL,
  data = NULL,
  se = NULL,
  ci = 0.95,
  ci_type = "confidence",
  ci_method = NULL,
  dispersion_method = "sd",
  vcov = NULL,
  vcov_args = NULL,
  verbose = TRUE,
  ...
)
x |
A statistical model (can also be a data.frame, in which case the second argument has to be a model). |
... |
Other argument to be passed, for instance to |
predictions |
A vector of predicted values (as obtained by
|
data |
An optional data frame in which to look for variables with which
to predict. If omitted, the data used to fit the model is used. Visualization
matrices can be generated using |
se |
Numeric vector of standard error of predicted values. If |
ci |
The interval level. Default is |
ci_type |
Can be |
ci_method |
The method for computing p values and confidence intervals. Possible values depend on model type.
See |
dispersion_method |
Bootstrap dispersion and Bayesian posterior summary:
|
vcov |
Variance-covariance matrix used to compute uncertainty estimates (e.g., for robust standard errors). This argument accepts a covariance matrix, a function which returns a covariance matrix, or a string which identifies the function to be used to compute the covariance matrix.
|
vcov_args |
List of arguments to be passed to the function identified by
the |
verbose |
Toggle warnings. |
Typically, get_predicted()
returns confidence intervals based on the standard
errors as returned by the predict()
-function, assuming normal distribution
(+/- 1.96 * SE
) or a Student's t-distribution (if degrees of freedom are
available). If predict()
for a certain class does not return standard
errors (for example, merMod-objects), these are calculated manually, based
on the following steps: matrix-multiply X
by the parameter vector B
to get the
predictions, then extract the variance-covariance matrix V
of the parameters
and compute XVX'
to get the variance-covariance matrix of the predictions.
The square-root of the diagonal of this matrix represents the standard errors
of the predictions, which are then multiplied by the critical test-statistic
value (e.g., ~1.96 for normal distribution) for the confidence intervals.
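The manual steps described above can be reproduced in base R. The sketch below is an illustration, not the package's internal code: it assumes an lm fit (for which predict() does return standard errors, so the result can be checked) and uses the t-distribution for the critical value.

```r
# Base R sketch: standard errors of predictions via X V X'
data(mtcars)
fit <- lm(mpg ~ cyl + hp, data = mtcars)

X <- model.matrix(fit)  # design matrix
B <- coef(fit)          # parameter vector
V <- vcov(fit)          # variance-covariance matrix of the parameters

# Predictions: X %*% B
predictions <- as.vector(X %*% B)

# Variance-covariance matrix of the predictions: X V X'
vcov_pred <- X %*% V %*% t(X)

# Standard errors: square root of the diagonal
se <- sqrt(diag(vcov_pred))

# Confidence interval: prediction +/- critical value * SE
crit <- qt(0.975, df = df.residual(fit))
ci_low <- predictions - crit * se
ci_high <- predictions + crit * se

# Cross-check against the standard errors returned by predict()
reference <- predict(fit, se.fit = TRUE)
all.equal(se, reference$se.fit, check.attributes = FALSE)
```

For model classes where predict() offers no standard errors, the same algebra applies as long as a design matrix and a parameter covariance matrix can be extracted.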
If ci_type = "prediction"
, prediction intervals are calculated. These are
wider than confidence intervals, because they also take into account the
residual variance of the model. Before taking the square root of the
diagonal of the variance-covariance matrix, get_predicted_ci()
adds the
residual variance to these values. For mixed models, get_variance_residual()
is used, while get_sigma()^2
is used for non-mixed models.
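Adding the residual variance before taking the square root can be sketched as follows for a non-mixed model. This is a base R illustration under stated assumptions: stats::sigma() stands in for get_sigma(), and the result is compared with predict.lm() rather than with get_predicted_ci().

```r
# Base R sketch: prediction interval = parameter uncertainty + residual variance
data(mtcars)
fit <- lm(mpg ~ cyl + hp, data = mtcars)

X <- model.matrix(fit)
predictions <- as.vector(X %*% coef(fit))

# Variance of the predictions from parameter uncertainty only
var_conf <- diag(X %*% vcov(fit) %*% t(X))

# Add the residual variance, then take the square root
se_pred <- sqrt(var_conf + sigma(fit)^2)

crit <- qt(0.975, df = df.residual(fit))
pi_low <- predictions - crit * se_pred
pi_high <- predictions + crit * se_pred

# Cross-check against predict.lm()
reference <- predict(fit, newdata = mtcars, interval = "prediction", level = 0.95)
all.equal(pi_low, reference[, "lwr"], check.attributes = FALSE)
```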
It is preferred to rely on standard errors returned by get_predicted()
(i.e.
returned by the predict()
-function), because these are more accurate than
manually calculated standard errors. Use get_predicted_ci()
only if standard
errors are not available otherwise. Exceptions are Bayesian models or
bootstrapped predictions, where get_predicted_ci()
returns quantiles of the
posterior distribution or bootstrapped samples of the predictions. These are
actually accurate standard errors or confidence (or uncertainty) intervals.
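The quantile-based approach for bootstrapped predictions can be illustrated with a hand-rolled bootstrap. The sketch below assumes base R only and a simple case-resampling scheme; it is not the resampling used internally by get_predicted(), and the names boot_pred and ci are hypothetical.

```r
# Base R sketch: percentile intervals from bootstrapped predictions
set.seed(123)
data(mtcars)

n_iter <- 200

# Each column holds the predictions from one bootstrap refit
boot_pred <- replicate(n_iter, {
  idx <- sample(nrow(mtcars), replace = TRUE)
  refit <- lm(mpg ~ cyl + hp, data = mtcars[idx, ])
  predict(refit, newdata = mtcars)
})

# 95% interval per observation: 2.5% and 97.5% quantiles across iterations
ci <- t(apply(boot_pred, 1, quantile, probs = c(0.025, 0.975)))
colnames(ci) <- c("CI_low", "CI_high")
head(ci)
```

Because the interval bounds are read directly from the distribution of the draws, no normality assumption or standard error is needed.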
# Confidence Intervals for Model Predictions
# ------------------------------------------
data(mtcars)

# Linear model
# ------------
x <- lm(mpg ~ cyl + hp, data = mtcars)
predictions <- predict(x)
ci_vals <- get_predicted_ci(x, predictions, ci_type = "prediction")
head(ci_vals)
ci_vals <- get_predicted_ci(x, predictions, ci_type = "confidence")
head(ci_vals)
ci_vals <- get_predicted_ci(x, predictions, ci = c(0.8, 0.9, 0.95))
head(ci_vals)

# Bootstrapped
# ------------
predictions <- get_predicted(x, iterations = 500)
get_predicted_ci(x, predictions)

ci_vals <- get_predicted_ci(x, predictions, ci = c(0.80, 0.95))
head(ci_vals)
datawizard::reshape_ci(ci_vals)

ci_vals <- get_predicted_ci(x,
  predictions,
  dispersion_method = "MAD",
  ci_method = "HDI"
)
head(ci_vals)

# Logistic model
# --------------
x <- glm(vs ~ wt, data = mtcars, family = "binomial")
predictions <- predict(x, type = "link")
ci_vals <- get_predicted_ci(x, predictions, ci_type = "prediction")
head(ci_vals)
ci_vals <- get_predicted_ci(x, predictions, ci_type = "confidence")
head(ci_vals)
Returns the data from all predictor variables (fixed effects).
get_predictors(x, verbose = TRUE)
x |
A fitted model. |
verbose |
Toggle messages and warnings. |
The data from all predictor variables, as a data frame.
m <- lm(mpg ~ wt + cyl + vs, data = mtcars)
head(get_predictors(m))
Provides a summary of the prior distributions used for the parameters in a given model.
get_priors(x, ...)

## S3 method for class 'brmsfit'
get_priors(x, verbose = TRUE, ...)
x |
A Bayesian model. |
... |
Currently not used. |
verbose |
Toggle warnings and messages. |
A data frame with a summary of the prior distributions used for the parameters in a given model.
library(rstanarm)
model <- stan_glm(Sepal.Width ~ Species * Petal.Length, data = iris)
get_priors(model)
Returns the data from all random effects terms.
get_random(x)
x |
A fitted mixed model. |
The data from all random effects terms, as a data frame, or NULL if the model has no random effects.
library(lme4)
data(sleepstudy)
# prepare some data...
sleepstudy$mygrp <- sample(1:5, size = 180, replace = TRUE)
sleepstudy$mysubgrp <- NA
for (i in 1:5) {
  filter_group <- sleepstudy$mygrp == i
  sleepstudy$mysubgrp[filter_group] <-
    sample(1:30, size = sum(filter_group), replace = TRUE)
}

m <- lmer(
  Reaction ~ Days + (1 | mygrp / mysubgrp) + (1 | Subject),
  data = sleepstudy
)

head(get_random(m))
Returns the residuals from regression models.
get_residuals(x, ...)

## Default S3 method:
get_residuals(x, weighted = FALSE, verbose = TRUE, ...)
x |
A model. |
... |
Passed down to |
weighted |
Logical, if |
verbose |
Toggle warnings and messages. |
The residuals, or NULL if this information could not be accessed.
This function returns the default type of residuals, i.e. the response residuals for linear models, the deviance residuals for models of class glm, etc. To access different types, pass down the type argument (see 'Examples').
This function is a robust alternative to residuals(), as it works for some special model objects that otherwise do not respond properly to calling residuals().
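As a rough illustration of the default residual types described above, the following sketch uses only base R's residuals() instead of get_residuals(). It assumes that the type argument is forwarded to the underlying residuals() method, as the examples suggest; it is not a description of the package internals.

```r
# Base R sketch: default residual types for lm and glm
data(mtcars)

# Linear model: default residuals are observed minus fitted values
m_lm <- lm(mpg ~ wt + cyl + vs, data = mtcars)
res_lm <- residuals(m_lm)
all.equal(unname(res_lm), unname(mtcars$mpg - fitted(m_lm)))

# Logistic model: default residuals are deviance residuals
m_glm <- glm(vs ~ wt + cyl + mpg, data = mtcars, family = binomial())
res_default <- residuals(m_glm)
res_deviance <- residuals(m_glm, type = "deviance")
res_response <- residuals(m_glm, type = "response")

# The default equals type = "deviance"; response residuals are y - fitted
identical(res_default, res_deviance)
all.equal(unname(res_response), unname(mtcars$vs - fitted(m_glm)))
```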
data(mtcars)
m <- lm(mpg ~ wt + cyl + vs, data = mtcars)
get_residuals(m)

m <- glm(vs ~ wt + cyl + mpg, data = mtcars, family = binomial())
get_residuals(m) # type = "deviance" by default
get_residuals(m, type = "response")
Returns the values of the response variable(s) from a model object. If the model is a multivariate response model, a data frame with values from all response variables is returned.
get_response(x, ...)

## Default S3 method:
get_response(
  x,
  select = NULL,
  as_proportion = TRUE,
  source = "environment",
  verbose = TRUE,
  ...
)

## S3 method for class 'nestedLogit'
get_response(x, dichotomies = FALSE, source = "environment", ...)
x |
A fitted model. |
... |
Currently not used. |
select |
Optional name(s) of response variables for which to extract values. Can be used in case of regression models with multiple response variables. |
as_proportion |
Logical, if |
source |
String, indicating from where data should be recovered. If
|
verbose |
Toggle warnings. |
dichotomies |
Logical, if model is a |
The values of the response variable, as vector, or a data frame if x has more than one defined response variable.
data(cbpp, package = "lme4")
cbpp$trials <- cbpp$size - cbpp$incidence
dat <<- cbpp

m <- glm(cbind(incidence, trials) ~ period, data = dat, family = binomial)
head(get_response(m))
get_response(m, select = "incidence")

data(mtcars)
m <- lm(mpg ~ wt + cyl + vs, data = mtcars)
get_response(m)
Returns sigma, which corresponds to the estimated standard deviation of the residuals. This function extends the sigma() base R generic to models that have not implemented it. It also computes the confidence interval (CI), which is stored as an attribute.
Sigma is a key component of regression models, and part of the so-called auxiliary parameters that are estimated. Linear models, for instance, assume that the residuals come from a normal distribution with mean 0 and standard deviation sigma. See the details section below for more information about its interpretation and calculation.
get_sigma(x, ci = NULL, verbose = TRUE)
x |
A model. |
ci |
Scalar, the CI level. The default ( |
verbose |
Toggle messages and warnings. |
The residual standard deviation (sigma), or NULL if this information could not be accessed.
The residual standard deviation, σ, indicates that the predicted outcome will be within +/- σ units of the linear predictor for approximately 68% of the data points (Gelman, Hill & Vehtari 2020, p.84). In other words, the residual standard deviation indicates the accuracy for a model to predict scores, thus it can be thought of as "a measure of the average distance each observation falls from its prediction from the model" (Gelman, Hill & Vehtari 2020, p.168). σ can be considered as a measure of the unexplained variation in the data, or of the precision of inferences about regression coefficients.
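The "approximately 68%" interpretation can be checked informally on a fitted model. The sketch below uses base R only (stats::sigma() and residuals()); the variable name share_within is made up for illustration, and in a small sample the share will only roughly match the nominal value.

```r
# Base R sketch: share of observations within +/- sigma of their prediction
data(mtcars)
m <- lm(mpg ~ wt + cyl + vs, data = mtcars)

s <- stats::sigma(m)        # residual standard deviation
res <- stats::residuals(m)  # observed minus predicted values

# Proportion of residuals whose absolute value does not exceed sigma
share_within <- mean(abs(res) <= s)
share_within
```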
By default, get_sigma() tries to extract sigma by calling stats::sigma(). If the model-object has no sigma() method, the next step is calculating sigma as square-root of the model-deviance divided by the residual degrees of freedom. Finally, if even this approach fails, and x is a mixed model, the residual standard deviation is accessed using the square-root from get_variance_residual().
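The first two steps of this fallback chain can be compared directly for a linear model. This is a minimal base R sketch, not the package's implementation; it only shows that both computations agree when a sigma() method exists.

```r
# Base R sketch: sigma() versus the deviance-based fallback
data(mtcars)
m <- lm(mpg ~ wt + cyl + vs, data = mtcars)

# Step 1: the sigma() generic from the stats package
s1 <- stats::sigma(m)

# Step 2 (fallback): square root of deviance / residual degrees of freedom
s2 <- sqrt(stats::deviance(m) / stats::df.residual(m))

all.equal(s1, s2)
```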
Gelman, A., Hill, J., & Vehtari, A. (2020). Regression and Other Stories. Cambridge University Press.
data(mtcars)
m <- lm(mpg ~ wt + cyl + vs, data = mtcars)
get_sigma(m)
Returns the statistic (t, z, ...) for model estimates. In most cases, this is the related column from coef(summary()).
get_statistic(x, ...)

## Default S3 method:
get_statistic(x, column_index = 3, verbose = TRUE, ...)

## S3 method for class 'glmmTMB'
get_statistic(x, component = "all", ...)

## S3 method for class 'emmGrid'
get_statistic(x, ci = 0.95, adjust = "none", merge_parameters = FALSE, ...)

## S3 method for class 'gee'
get_statistic(x, robust = FALSE, ...)
x |
A model. |
... |
Currently not used. |
column_index |
For model objects that have no defined
|
verbose |
Toggle messages and warnings. |
component |
String, indicating the model component for which parameters
should be returned. The default for all models is
|
ci |
Confidence Interval (CI) level. Default to |
adjust |
Character value naming the method used to adjust p-values or
confidence intervals. See |
merge_parameters |
Logical, if |
robust |
Logical, if |
A data frame with the model's parameter names and the related test statistic.
data(mtcars)
m <- lm(mpg ~ wt + cyl + vs, data = mtcars)
get_statistic(m)
This function checks whether any transformation, such as log- or exp-transforming, was applied to the response variable (dependent variable) in a regression formula, and returns the related function that was used for transformation. See find_transformation() for an overview of supported transformations that are detected.
get_transformation(x, include_all = FALSE, verbose = TRUE)
x |
A regression model or a character string of the formulation of the (response) variable. |
include_all |
Logical, if |
verbose |
Logical, if |
A list of two functions: $transformation, the function that was used to transform the response variable; $inverse, the inverse-function of $transformation (can be used for "back-transformation"). If no transformation was applied, both list-elements $transformation and $inverse just return function(x) x. If transformation is unknown, NULL is returned.
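A typical use of such a list is back-transforming predictions to the original scale of the response. The sketch below builds a stand-in list by hand (the object trans is hypothetical and only mirrors the documented structure), so it runs with base R alone.

```r
# Base R sketch: back-transforming predictions with a transformation/inverse pair
data(iris)
model <- lm(log(Sepal.Length) ~ Species, data = iris)

# Hand-built stand-in for the list structure described above (hypothetical)
trans <- list(transformation = log, inverse = exp)

# Predictions are on the log scale; the inverse maps them back
pred_log <- predict(model)
pred_original <- trans$inverse(pred_log)

# Round trip: transforming the back-transformed values recovers the log scale
all.equal(trans$transformation(pred_original), pred_log)
range(pred_original)
```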
# identity, no transformation
model <- lm(Sepal.Length ~ Species, data = iris)
get_transformation(model)

# log-transformation
model <- lm(log(Sepal.Length) ~ Species, data = iris)
get_transformation(model)

# log-function
get_transformation(model)$transformation(0.3)
log(0.3)

# inverse function is exp()
get_transformation(model)$inverse(0.3)
exp(0.3)

# get transformations for all model terms
model <- lm(mpg ~ log(wt) + I(gear^2) + exp(am), data = mtcars)
get_transformation(model, include_all = TRUE)
Returns the variance-covariance matrix, as retrieved by stats::vcov(), but also works for model objects that may not provide a vcov()-method.
get_varcov(x, ...)

## Default S3 method:
get_varcov(x, verbose = TRUE, vcov = NULL, vcov_args = NULL, ...)

## S3 method for class 'glmgee'
get_varcov(
  x,
  verbose = TRUE,
  vcov = c("robust", "df-adjusted", "model", "bias-corrected", "jackknife"),
  ...
)

## S3 method for class 'nestedLogit'
get_varcov(
  x,
  component = "all",
  verbose = TRUE,
  vcov = NULL,
  vcov_args = NULL,
  ...
)

## S3 method for class 'betareg'
get_varcov(
  x,
  component = c("conditional", "precision", "all"),
  verbose = TRUE,
  ...
)

## S3 method for class 'clm2'
get_varcov(x, component = c("all", "conditional", "scale"), ...)

## S3 method for class 'truncreg'
get_varcov(x, component = c("conditional", "all"), verbose = TRUE, ...)

## S3 method for class 'hurdle'
get_varcov(
  x,
  component = c("conditional", "zero_inflated", "zi", "all"),
  vcov = NULL,
  vcov_args = NULL,
  verbose = TRUE,
  ...
)

## S3 method for class 'glmmTMB'
get_varcov(
  x,
  component = c("conditional", "zero_inflated", "zi", "dispersion", "all"),
  verbose = TRUE,
  ...
)

## S3 method for class 'MixMod'
get_varcov(
  x,
  effects = c("fixed", "random"),
  component = c("conditional", "zero_inflated", "zi", "dispersion", "auxiliary", "all"),
  verbose = TRUE,
  ...
)

## S3 method for class 'brmsfit'
get_varcov(x, component = "conditional", verbose = TRUE, ...)

## S3 method for class 'betamfx'
get_varcov(
  x,
  component = c("conditional", "precision", "all"),
  verbose = TRUE,
  ...
)

## S3 method for class 'aov'
get_varcov(x, complete = FALSE, verbose = TRUE, ...)

## S3 method for class 'mixor'
get_varcov(x, effects = c("all", "fixed", "random"), verbose = TRUE, ...)
x |
A model. |
... |
Currently not used. |
verbose |
Toggle warnings. |
vcov |
Variance-covariance matrix used to compute uncertainty estimates (e.g., for robust standard errors). This argument accepts a covariance matrix, a function which returns a covariance matrix, or a string which identifies the function to be used to compute the covariance matrix.
|
vcov_args |
List of arguments to be passed to the function identified by
the |
component |
Should the complete variance-covariance matrix of the model
be returned, or only for specific model components (like count or
zero-inflated model parts)? Applies to models with zero-inflated component,
or models with precision (e.g. |
effects |
Should the complete variance-covariance matrix of the model
be returned, or only for specific model parameters? Currently only
applies to models of class |
complete |
Logical, if |
The variance-covariance matrix, as matrix-object.
get_varcov() tries to return the nearest positive definite matrix in case of negative eigenvalues of the variance-covariance matrix. This ensures that it is still possible, for instance, to calculate standard errors of model parameters. A message is shown when the matrix is negative definite and a corrected matrix is returned.
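To see why negative eigenvalues matter, the sketch below inspects the eigenvalues of an ordinary variance-covariance matrix and derives standard errors from its diagonal. It uses base R only and does not reproduce the correction step itself; how get_varcov() computes the nearest positive definite matrix is not shown here.

```r
# Base R sketch: eigenvalue check and standard errors from a vcov matrix
data(mtcars)
m <- lm(mpg ~ wt + cyl + vs, data = mtcars)
V <- stats::vcov(m)

# A usable variance-covariance matrix has no negative eigenvalues
ev <- eigen(V, symmetric = TRUE, only.values = TRUE)$values
all(ev > 0)

# Standard errors are the square roots of the diagonal elements;
# this would produce NaN for negative diagonal entries
se <- sqrt(diag(V))
all.equal(se, coef(summary(m))[, "Std. Error"])
```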
data(mtcars)
m <- lm(mpg ~ wt + cyl + vs, data = mtcars)
get_varcov(m)

# vcov of zero-inflation component from hurdle-model
library(pscl)
data("bioChemists", package = "pscl")
mod <- hurdle(art ~ phd + fem | ment, data = bioChemists, dist = "negbin")
get_varcov(mod, component = "zero_inflated")

# robust vcov of count component from hurdle-model
data("bioChemists", package = "pscl")
mod <- hurdle(art ~ phd + fem | ment, data = bioChemists, dist = "negbin")
get_varcov(
  mod,
  component = "conditional",
  vcov = "BS",
  vcov_args = list(R = 50)
)
This function extracts the different variance components of a mixed model and returns the result as list. Functions like get_variance_residual(x) or get_variance_fixed(x) are shortcuts for get_variance(x, component = "residual") etc.
get_variance(x, ...)

## S3 method for class 'merMod'
get_variance(
  x,
  component = c("all", "fixed", "random", "residual", "distribution", "dispersion",
    "intercept", "slope", "rho01", "rho00"),
  tolerance = 1e-08,
  null_model = NULL,
  approximation = "lognormal",
  verbose = TRUE,
  ...
)

## S3 method for class 'glmmTMB'
get_variance(
  x,
  component = c("all", "fixed", "random", "residual", "distribution", "dispersion",
    "intercept", "slope", "rho01", "rho00"),
  model_component = NULL,
  tolerance = 1e-08,
  null_model = NULL,
  approximation = "lognormal",
  verbose = TRUE,
  ...
)

get_variance_residual(x, verbose = TRUE, ...)

get_variance_fixed(x, verbose = TRUE, ...)

get_variance_random(x, verbose = TRUE, tolerance = 1e-08, ...)

get_variance_distribution(x, verbose = TRUE, ...)

get_variance_dispersion(x, verbose = TRUE, ...)

get_variance_intercept(x, verbose = TRUE, ...)

get_variance_slope(x, verbose = TRUE, ...)

get_correlation_slope_intercept(x, verbose = TRUE, ...)

get_correlation_slopes(x, verbose = TRUE, ...)
x |
A mixed effects model. |
... |
Currently not used. |
component |
Character value, indicating the variance component that
should be returned. By default, all variance components are returned. The
distribution-specific ( |
tolerance |
Tolerance for singularity check of random effects, to decide
whether to compute random effect variances or not. Indicates up to which
value the convergence result is accepted. The larger tolerance is, the
stricter the test will be. See |
null_model |
Optional, a null-model to be used for the calculation of
random effect variances. If |
approximation |
Character string, indicating the approximation method
for the distribution-specific (observation level, or residual) variance. Only
applies to non-Gaussian models. Can be |
verbose |
Toggle off warnings. |
model_component |
For models that can have a zero-inflation component,
specify for which component variances should be returned. If |
This function returns different variance components from mixed models, which are needed, for instance, to calculate r-squared measures or the intraclass-correlation coefficient (ICC).
A list with the following elements:
var.fixed, variance attributable to the fixed effects
var.random, (mean) variance of random effects
var.residual, residual variance (sum of dispersion and distribution-specific/observation level variance)
var.distribution, distribution-specific (or observation level) variance
var.dispersion, variance due to additive dispersion
var.intercept, the random-intercept-variance, or between-subject-variance (τ₀₀)
var.slope, the random-slope-variance (τ₁₁)
cor.slope_intercept, the random-slope-intercept-correlation (ρ₀₁)
cor.slopes, the correlation between random slopes (ρ₀₀)
The fixed effects variance, σ²_f, is the variance of the matrix-multiplication β∗X (parameter vector by model matrix).
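The matrix multiplication can be written out in a few lines. For a self-contained example, this sketch uses an ordinary linear model and base R only; for a mixed model the parameter vector would be the fixed effects, which is an assumption not demonstrated here.

```r
# Base R sketch: variance of the linear predictor X %*% beta
data(mtcars)
m <- lm(mpg ~ wt + cyl + vs, data = mtcars)

X <- stats::model.matrix(m)  # model matrix
beta <- stats::coef(m)       # parameter vector

# Variance of the fixed-effects part of the prediction
var_fixed <- stats::var(as.vector(X %*% beta))
var_fixed

# For a linear model, X %*% beta is simply the vector of fitted values
all.equal(var_fixed, stats::var(unname(fitted(m))))
```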
The random effect variance, σ²_i, represents the mean random effect variance of the model. Since this variance reflects the "average" random effects variance for mixed models, it is also appropriate for models with more complex random effects structures, like random slopes or nested random effects. Details can be found in Johnson 2014, in particular equation 10. For simple random-intercept models, the random effects variance equals the random-intercept variance.
The distribution-specific variance, σ²_d, is the conditional variance of the response given the predictors, Var[y|x], which depends on the model family.
Gaussian: For Gaussian models, it is σ² (i.e. sigma(model)^2).
Bernoulli: For models with binary outcome, it is π²/3 for logit-link, 1 for probit-link, and π²/6 for cloglog-links.
Binomial: For other binomial models, the distribution-specific variance for Bernoulli models is used, divided by a weighting factor based on the number of trials and successes.
Gamma: Models from Gamma-families use μ² (as obtained from family$variance()).
For all other models, the distribution-specific variance is by default based on lognormal approximation, log(1 + var(x) / μ²) (see Nakagawa et al. 2017). Other approximation methods can be specified with the approximation argument.
Zero-inflation models: The expected variance of a zero-inflated model is computed according to Zuur et al. 2012, p277.
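The constants and the lognormal approximation listed above are easy to evaluate numerically. The sketch below is illustrative only: the helper lognormal_approx() and the mean value mu are made up, and the Poisson case is chosen as an assumption because its variance equals its mean.

```r
# Base R sketch: distribution-specific variances for some families

# Link-specific constants for binary outcomes
var_logit <- pi^2 / 3
var_probit <- 1
var_cloglog <- pi^2 / 6

# Lognormal approximation log(1 + var(x) / mu^2); hypothetical helper
lognormal_approx <- function(variance, mu) {
  log(1 + variance / mu^2)
}

# Assumed example: Poisson-distributed outcome with mean 4,
# where the variance of the distribution equals the mean
mu <- 4
var_poisson <- lognormal_approx(variance = mu, mu = mu)

c(logit = var_logit, probit = var_probit, cloglog = var_cloglog,
  poisson = var_poisson)
```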
The variance for the additive overdispersion term, σ²_e, represents "the excess variation relative to what is expected from a certain distribution" (Nakagawa et al. 2017). In (most? many?) cases, this will be 0.
The residual variance, σ²_ε, is simply σ²_d + σ²_e.
The random intercept variance, or between-subject variance (τ₀₀), is obtained from VarCorr(). It indicates how much groups or subjects differ from each other, while the residual variance σ²_ε indicates the within-subject variance.
The random slope variance (τ₁₁) is obtained from VarCorr(). This measure is only available for mixed models with random slopes.
The random slope-intercept correlation (ρ₀₁) is obtained from VarCorr(). This measure is only available for mixed models with random intercepts and slopes.
This function supports models of class merMod (including models from blme), clmm, cpglmm, glmmadmb, glmmTMB, MixMod, lme, mixed, rlmerMod, stanreg, brmsfit or wbm. Support for objects of class MixMod (GLMMadaptive), lme (nlme) or brmsfit (brms) is not fully implemented or tested, and therefore may not work for all models of the aforementioned classes.
The results are validated against the solutions provided by Nakagawa et al. (2017), in particular the examples shown in Supplement 2 of the paper. Other model families are validated against results from the MuMIn package. This means that the returned variance components should be accurate and reliable for the following mixed models or model families:
Bernoulli (logistic) regression
Binomial regression (with other than binary outcomes)
Poisson and Quasi-Poisson regression
Negative binomial regression (including nbinom1, nbinom2 and nbinom12 families)
Gaussian regression (linear models)
Gamma regression
Tweedie regression
Beta regression
Ordered beta regression
The following model families are not yet validated, but should work:
Zero-inflated and hurdle models
Beta-binomial regression
Compound Poisson regression
Generalized Poisson regression
Log-normal regression
Skew-normal regression
Extracting variance components for models with a zero-inflation part is not straightforward, because it is not entirely clear how the distribution-specific variance should be calculated. Therefore, it is recommended to carefully inspect the results, and possibly validate them against other models, e.g. Bayesian models (although results may be only roughly comparable).
Log-normal regressions (e.g. lognormal() family in glmmTMB or gaussian("log")) often have a very low fixed effects variance (if they were calculated as suggested by Nakagawa et al. 2017). This results in very low ICC or r-squared values, which may not be meaningful (see performance::icc() or performance::r2_nakagawa()).
Johnson, P. C. D. (2014). Extension of Nakagawa & Schielzeth’s R2 GLMM to random slopes models. Methods in Ecology and Evolution, 5(9), 944–946. doi:10.1111/2041-210X.12225
Nakagawa, S., Johnson, P. C. D., & Schielzeth, H. (2017). The coefficient of determination R2 and intra-class correlation coefficient from generalized linear mixed-effects models revisited and expanded. Journal of The Royal Society Interface, 14(134), 20170213. doi:10.1098/rsif.2017.0213
Zuur, A. F., Savel'ev, A. A., & Ieno, E. N. (2012). Zero inflated models and generalized linear mixed models with R. Newburgh, United Kingdom: Highland Statistics.
library(lme4)
data(sleepstudy)
m <- lmer(Reaction ~ Days + (1 + Days | Subject), data = sleepstudy)

get_variance(m)
get_variance_fixed(m)
get_variance_residual(m)
Returns weighting variable of a model.
get_weights(x, ...)

## Default S3 method:
get_weights(x, remove_na = FALSE, null_as_ones = FALSE, ...)
x |
A fitted model. |
... |
Currently not used. |
remove_na |
Logical, if |
null_as_ones |
Logical, if |
The weighting variable, or NULL if no weights were specified.
If the weighting variable should also be returned (instead of NULL) when all weights are set to 1 (i.e. no weighting), set null_as_ones = TRUE.
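The behaviour described for null_as_ones can be mimicked with base R. This sketch relies on stats::weights(), which returns NULL for a model fitted without weights; the replacement by a vector of ones is an illustration of the idea, not the package's code.

```r
# Base R sketch: NULL weights replaced by a vector of ones
data(mtcars)
m_unweighted <- lm(mpg ~ wt, data = mtcars)

# No weights were specified, so base R returns NULL
w <- stats::weights(m_unweighted)
is.null(w)

# Fallback: one weight of 1 per observation
if (is.null(w)) {
  w <- rep(1, stats::nobs(m_unweighted))
}
length(w)
```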
data(mtcars)
set.seed(123)
mtcars$weight <- rnorm(nrow(mtcars), 1, .3)

# LMs
m <- lm(mpg ~ wt + cyl + vs, data = mtcars, weights = weight)
get_weights(m)

get_weights(lm(mpg ~ wt, data = mtcars), null_as_ones = TRUE)

# GLMs
m <- glm(vs ~ disp + mpg, data = mtcars, weights = weight, family = quasibinomial)
get_weights(m)
m <- glm(cbind(cyl, gear) ~ mpg, data = mtcars, weights = weight, family = binomial)
get_weights(m)
Checks if model has an intercept.
has_intercept(x, verbose = TRUE)
x |
A model object. |
verbose |
Toggle warnings. |
TRUE if x has an intercept, FALSE otherwise.
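For models with a standard terms object, the same information is available from base R. The helper has_int() below is hypothetical and shown only as a sketch; has_intercept() may determine the result differently and supports more model classes.

```r
# Base R sketch: checking the intercept attribute of the terms object
data(mtcars)
m0 <- lm(mpg ~ 0 + gear, data = mtcars)  # intercept removed
m1 <- lm(mpg ~ gear, data = mtcars)      # intercept included

# Hypothetical helper: the "intercept" attribute is 1 if an intercept is present
has_int <- function(model) {
  attr(stats::terms(model), "intercept") == 1
}

has_int(m0)
has_int(m1)
```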
library(lme4)

model <- lm(mpg ~ 0 + gear, data = mtcars)
has_intercept(model)

model <- lm(mpg ~ gear, data = mtcars)
has_intercept(model)

model <- lmer(Reaction ~ 0 + Days + (Days | Subject), data = sleepstudy)
has_intercept(model)

model <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)
has_intercept(model)
is_converged() provides an alternative convergence test for merMod-objects.
is_converged(x, tolerance = 0.001, ...)
x |
A |
tolerance |
Indicates up to which value the convergence result is
accepted. The smaller |
... |
Currently not used. |
TRUE if convergence is fine and FALSE if convergence is suspicious. Additionally, the convergence value is returned as attribute.
Convergence problems typically arise when the model hasn't converged to a solution where the log-likelihood has a true maximum. This may result in unreliable and overly complex (or non-estimable) estimates and standard errors.
lme4 performs a convergence-check (see ?lme4::convergence), however, as discussed here and suggested by one of the lme4-authors in this comment, this check can be too strict. is_converged() thus provides an alternative convergence test for merMod-objects.
Convergence issues are not easy to diagnose. The help page on ?lme4::convergence provides most of the current advice about how to resolve convergence issues. Another clue might be large parameter values, e.g. estimates (on the scale of the linear predictor) larger than 10 in (non-identity link) generalized linear models might indicate complete separation, which can be addressed by regularization, e.g. penalized regression or Bayesian regression with appropriate priors on the fixed effects.
Note the different meaning between singularity and convergence: singularity indicates an issue with the "true" best estimate, i.e. whether the maximum likelihood estimation for the variance-covariance matrix of the random effects is positive definite or only semi-definite. Convergence is a question of whether we can assume that the numerical optimization has worked correctly or not.
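One common way to judge whether a numerical optimization has worked is to look at the gradient scaled by the Hessian at the reported solution and compare its largest absolute value with a tolerance. The sketch below applies this idea to a simple normal likelihood in base R. Whether is_converged() uses exactly this criterion is an assumption; all function names here are made up for illustration.

```r
# Base R sketch: scaled-gradient check at an optimum and away from it
data(mtcars)
y <- mtcars$mpg
n <- length(y)

# Gradient of the negative log-likelihood of a normal model,
# parameterised as par = c(mu, log(sigma))
grad <- function(par) {
  mu <- par[1]
  s2 <- exp(2 * par[2])
  c(-sum(y - mu) / s2, n - sum((y - mu)^2) / s2)
}

# Analytic Hessian of the same negative log-likelihood
hess <- function(par) {
  mu <- par[1]
  s2 <- exp(2 * par[2])
  off <- 2 * sum(y - mu) / s2
  matrix(c(n / s2, off, off, 2 * sum((y - mu)^2) / s2), nrow = 2)
}

# Hypothetical check: largest scaled gradient must stay below the tolerance
scaled_gradient_ok <- function(par, tolerance = 0.001) {
  max(abs(solve(hess(par), grad(par)))) < tolerance
}

# Closed-form maximum likelihood estimates
mle <- c(mean(y), log(sqrt(mean((y - mean(y))^2))))

at_optimum <- scaled_gradient_ok(mle)             # gradient is (numerically) zero
off_optimum <- scaled_gradient_ok(mle + c(5, 0))  # mean shifted by 5 units
c(at_optimum = at_optimum, off_optimum = off_optimum)
```

A smaller tolerance makes the check stricter, because the scaled gradient has to be closer to zero before the solution is accepted.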
library(lme4)
library(glmmTMB)

data(cbpp)
set.seed(1)
cbpp$x <- rnorm(nrow(cbpp))
cbpp$x2 <- runif(nrow(cbpp))

model <- glmer(
  cbind(incidence, size - incidence) ~ period + x + x2 + (1 + x | herd),
  data = cbpp,
  family = binomial()
)

is_converged(model)

model <- glmmTMB(
  Sepal.Length ~ poly(Petal.Width, 4) * poly(Petal.Length, 4) +
    (1 + poly(Petal.Width, 4) | Species),
  data = iris
)

is_converged(model)
Check if object is empty
is_empty_object(x)
x |
A list, a vector, or a dataframe. |
A logical indicating whether the entered object is empty.
is_empty_object(c(1, 2, 3, NA))
is_empty_object(list(NULL, c(NA, NA)))
is_empty_object(list(NULL, NA))
Small helper that checks if a model is a generalized additive model.
is_gam_model(x)
x |
A model object. |
A logical, TRUE if x is a generalized additive model and has smooth-terms.
This function only returns TRUE when the model inherits from a typical GAM model class and when smooth terms are present in the model formula. If the model has no smooth terms or is not from a typical gam class, FALSE is returned.
data(iris)
model1 <- lm(Petal.Length ~ Petal.Width + Sepal.Length, data = iris)
model2 <- mgcv::gam(Petal.Length ~ Petal.Width + s(Sepal.Length), data = iris)
is_gam_model(model1)
is_gam_model(model2)
Small helper that checks if a model is a mixed effects model, i.e. if the model has random effects.
is_mixed_model(x)
x |
A model object. |
A logical, TRUE if x is a mixed model.
data(mtcars)
model <- lm(mpg ~ wt + cyl + vs, data = mtcars)
is_mixed_model(model)

data(sleepstudy, package = "lme4")
model <- lme4::lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)
is_mixed_model(model)
Small helper that checks if a model is a regression model or a statistical object. is_regression_model() is stricter and only returns TRUE for regression models, but not for, e.g., htest objects.
is_model(x)

is_regression_model(x)
x |
An object. |
This function returns TRUE if x is a model object.
A logical, TRUE if x is a (supported) model object.
data(mtcars)
m <- lm(mpg ~ wt + cyl + vs, data = mtcars)

is_model(m)
is_model(mtcars)

test <- t.test(1:10, y = c(7:20))
is_model(test)
is_regression_model(test)
Small helper that checks if a model is a supported (regression) model object. supported_models() prints a list of currently supported model classes.
is_model_supported(x)

supported_models()
x |
An object. |
This function returns TRUE if x is a model object that works with the package's functions. A list of supported models can also be found here: https://github.com/easystats/insight.
A logical, TRUE if x is a (supported) model object.
data(mtcars)
m <- lm(mpg ~ wt + cyl + vs, data = mtcars)

is_model_supported(m)
is_model_supported(mtcars)

# to see all supported models
supported_models()
Small helper that checks if a model is a multivariate response model, i.e. a model with multiple outcomes.
is_multivariate(x)
x |
A model object, or an object returned by a function from this package. |
A logical, TRUE if x is a model object and is a multivariate response model, or if x is a return value from an insight function that was applied to a multivariate response model.
library(rstanarm)
data("pbcLong")
model <- suppressWarnings(stan_mvmer(
  formula = list(
    logBili ~ year + (1 | id),
    albumin ~ sex + year + (year | id)
  ),
  data = pbcLong,
  chains = 1, cores = 1, seed = 12345, iter = 1000,
  show_messages = FALSE, refresh = 0
))

f <- find_formula(model)
is_multivariate(model)
is_multivariate(f)
Checks whether a list of models are nested models, strictly following the order they were passed to the function.
is_nested_models(...)
... |
Multiple regression model objects. |
The term "nested" here means that all the fixed predictors of a model are contained within the fixed predictors of a larger model (sometimes referred to as the encompassing model). Currently, is_nested_models() ignores random effects parameters.
TRUE if models are nested, FALSE otherwise. If models are nested, also returns two attributes that indicate whether nesting of models is in decreasing or increasing order.
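The definition of nesting given here can be illustrated with a minimal base-R sketch. This is not insight's implementation: fixed_terms_nested() is a hypothetical helper that only compares the fixed-effects term labels of two models, and it returns neither the order attributes nor any handling of random effects.

```r
# Hypothetical helper (not part of insight): TRUE if every fixed-effects
# term of `small` also appears among the terms of `large`.
fixed_terms_nested <- function(small, large) {
  small_terms <- attr(stats::terms(small), "term.labels")
  large_terms <- attr(stats::terms(large), "term.labels")
  all(small_terms %in% large_terms)
}

m1 <- lm(Sepal.Length ~ Petal.Width + Species, data = iris)
m2 <- lm(Sepal.Length ~ Species, data = iris)
m3 <- lm(Sepal.Length ~ Petal.Width, data = iris)
m4 <- lm(Sepal.Length ~ 1, data = iris)

fixed_terms_nested(m2, m1) # Species is one of the terms of m1
fixed_terms_nested(m4, m2) # an intercept-only model has no terms to miss
fixed_terms_nested(m3, m2) # Petal.Width is not a term of m2
```

Checking a whole sequence of models, as is_nested_models() does, would amount to applying such a pairwise check to each adjacent pair in the order the models were passed.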
m1 <- lm(Sepal.Length ~ Petal.Width + Species, data = iris)
m2 <- lm(Sepal.Length ~ Species, data = iris)
m3 <- lm(Sepal.Length ~ Petal.Width, data = iris)
m4 <- lm(Sepal.Length ~ 1, data = iris)

is_nested_models(m1, m2, m4)
is_nested_models(m4, m2, m1)
is_nested_models(m1, m2, m3)
Checks if model is a null-model (intercept-only), i.e. if the conditional part of the model has no predictors.
is_nullmodel(x)
x |
A model object. |
TRUE if x is a null-model, FALSE otherwise.
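For plain lm objects, the idea of "no predictors in the conditional part" can be sketched in base R by inspecting the term labels of the model. The helper has_no_predictors() below is a hypothetical name used for illustration only and is not how insight necessarily implements the check.

```r
# Hypothetical helper (not part of insight): a model is intercept-only
# when its terms object lists no predictor terms.
has_no_predictors <- function(model) {
  length(attr(stats::terms(model), "term.labels")) == 0
}

null_fit <- lm(mpg ~ 1, data = mtcars)
full_fit <- lm(mpg ~ gear, data = mtcars)

has_no_predictors(null_fit)
has_no_predictors(full_fit)
```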
model <- lm(mpg ~ 1, data = mtcars)
is_nullmodel(model)

model <- lm(mpg ~ gear, data = mtcars)
is_nullmodel(model)

data(sleepstudy, package = "lme4")
model <- lme4::lmer(Reaction ~ 1 + (Days | Subject), data = sleepstudy)
is_nullmodel(model)

model <- lme4::lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)
is_nullmodel(model)
Returns the link-function from a model object.
link_function(x, ...)

## S3 method for class 'betamfx'
link_function(x, what = c("mean", "precision"), ...)

## S3 method for class 'gamlss'
link_function(x, what = c("mu", "sigma", "nu", "tau"), ...)

## S3 method for class 'betareg'
link_function(x, what = c("mean", "precision"), ...)

## S3 method for class 'DirichletRegModel'
link_function(x, what = c("mean", "precision"), ...)
x |
A fitted model. |
... |
Currently not used. |
what |
For |
A function, describing the link-function from a model-object. For multivariate-response models, a list of functions is returned.
# example from ?stats::glm
counts <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome <- gl(3, 1, 9)
treatment <- gl(3, 3)
m <- glm(counts ~ outcome + treatment, family = poisson())

link_function(m)(0.3)
# same as
log(0.3)
Returns the link-inverse function from a model object.
link_inverse(x, ...)

## S3 method for class 'betareg'
link_inverse(x, what = c("mean", "precision"), ...)

## S3 method for class 'DirichletRegModel'
link_inverse(x, what = c("mean", "precision"), ...)

## S3 method for class 'betamfx'
link_inverse(x, what = c("mean", "precision"), ...)

## S3 method for class 'gamlss'
link_inverse(x, what = c("mu", "sigma", "nu", "tau"), ...)
x |
A fitted model. |
... |
Currently not used. |
what |
For |
A function, describing the inverse-link function from a model-object. For multivariate-response models, a list of functions is returned.
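How a link function and its inverse relate can be sketched in base R for a glm, where both are stored in the family object. This only uses stats::family(); that link_function() and link_inverse() return equivalent functions for a glm is suggested by the examples in this manual but is not verified here.

```r
counts <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome <- gl(3, 1, 9)
treatment <- gl(3, 3)
m <- glm(counts ~ outcome + treatment, family = poisson())

fam <- family(m)
eta <- fam$linkfun(0.3) # log link: maps the mean to the linear predictor
mu <- fam$linkinv(eta)  # inverse link: maps the linear predictor back

all.equal(eta, log(0.3))
all.equal(mu, 0.3)
```

Applying the inverse link to the output of the link function recovers the original value, which is the property that makes the pair useful for moving between the response scale and the scale of the linear predictor.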
# example from ?stats::glm
counts <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome <- gl(3, 1, 9)
treatment <- gl(3, 3)
m <- glm(counts ~ outcome + treatment, family = poisson())

link_inverse(m)(0.3)
# same as
exp(0.3)
Retrieve information from model objects.
model_info(x, ...)

## Default S3 method:
model_info(x, verbose = TRUE, ...)
x |
A fitted model. |
... |
Currently not used. |
verbose |
Toggle off warnings. |
model_info() returns a list with information about the model for many different model objects. The following information is returned, where all values starting with is_ are logicals.
is_binomial: family is binomial (but not negative binomial)
is_bernoulli: special case of binomial models: family is Bernoulli
is_poisson: family is poisson
is_negbin: family is negative binomial
is_count: model is a count model (i.e. family is either poisson or negative binomial)
is_beta: family is beta
is_betabinomial: family is beta-binomial
is_orderedbeta: family is ordered beta
is_dirichlet: family is dirichlet
is_exponential: family is exponential (e.g. Gamma or Weibull)
is_logit: model has logit link
is_probit: model has probit link
is_linear: family is gaussian
is_tweedie: family is tweedie
is_ordinal: family is ordinal or cumulative link
is_cumulative: family is ordinal or cumulative link
is_multinomial: family is multinomial or categorical link
is_categorical: family is categorical link
is_censored: model is a censored model (has a censored response, including survival models)
is_truncated: model is a truncated model (has a truncated response)
is_survival: model is a survival model
is_zero_inflated: model has zero-inflation component
is_hurdle: model has zero-inflation component and is a hurdle-model (truncated family distribution)
is_dispersion: model has dispersion component (not only dispersion parameter)
is_mixed: model is a mixed effects model (with random effects)
is_multivariate: model is a multivariate response model (currently only works for brmsfit and vglm/vgam objects)
is_trial: model response contains additional information about the trials
is_bayesian: model is a Bayesian model
is_gam: model is a generalized additive model
is_anova: model is an Anova object
is_ttest: model is an object of class htest, returned by t.test()
is_correlation: model is an object of class htest, returned by cor.test()
is_ranktest: model is an object of class htest, returned by cor.test() (if Spearman's rank correlation), wilcox.test() or kruskal.test()
is_variancetest: model is an object of class htest, returned by bartlett.test(), shapiro.test() or car::leveneTest()
is_levenetest: model is an object of class anova, returned by car::leveneTest()
is_onewaytest: model is an object of class htest, returned by oneway.test()
is_proptest: model is an object of class htest, returned by prop.test()
is_binomtest: model is an object of class htest, returned by binom.test()
is_chi2test: model is an object of class htest, returned by chisq.test()
is_xtab: model is an object of class htest or BFBayesFactor, and the test statistic stems from a contingency table (i.e. chisq.test() or BayesFactor::contingencyTableBF())
link_function: the link-function
family: name of the distributional family of the model. For some exceptions (like some htest objects), can also be the name of the test.
n_obs: number of observations
n_grouplevels: for mixed models, returns names and numbers of random effect groups
A list with information about the model, like family, link-function etc. (see 'Details').
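For a glm, several of the flags described above correspond to information that base R already stores in the family object. The following sketch derives a few of them by hand; it is an illustration of what the flags mean, not a description of how model_info() computes them, and the list name flags is made up for this example.

```r
ldose <- rep(0:5, 2)
numdead <- c(1, 4, 9, 13, 18, 20, 0, 2, 6, 10, 12, 16)
sex <- factor(rep(c("M", "F"), c(6, 6)))
SF <- cbind(numdead, numalive = 20 - numdead)
m <- glm(SF ~ sex * ldose, family = binomial)

fam <- family(m)

# Hand-built counterparts of some model_info() elements (assumed mapping)
flags <- list(
  is_binomial = fam$family == "binomial",
  is_poisson = fam$family == "poisson",
  is_logit = fam$link == "logit",
  is_probit = fam$link == "probit",
  family = fam$family
)
str(flags)
```

The value of a helper like model_info() is that the same element names are available across many model classes, whereas the lookup shown here only works for objects that carry a family.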
ldose <- rep(0:5, 2)
numdead <- c(1, 4, 9, 13, 18, 20, 0, 2, 6, 10, 12, 16)
sex <- factor(rep(c("M", "F"), c(6, 6)))
SF <- cbind(numdead, numalive = 20 - numdead)
dat <- data.frame(ldose, sex, SF, stringsAsFactors = FALSE)
m <- glm(SF ~ sex * ldose, family = binomial)

# logistic regression
model_info(m)

# t-test
m <- t.test(1:10, y = c(7:20))
model_info(m)
Returns the "name" (class attribute) of a model, possibly including further information.
model_name(x, ...)

## Default S3 method:
model_name(x, include_formula = FALSE, include_call = FALSE, ...)
x |
A model. |
... |
Currently not used. |
include_formula |
Should the name include the model's formula. |
include_call |
If |
A character string of a name (which usually equals the model's class attribute).
m <- lm(Sepal.Length ~ Petal.Width, data = iris)
model_name(m)
model_name(m, include_formula = TRUE)
model_name(m, include_call = TRUE)

model_name(lme4::lmer(Sepal.Length ~ Sepal.Width + (1 | Species), data = iris))
Returns the number of group levels of random effects from mixed models.
n_grouplevels(x, ...)
x |
A mixed model. |
... |
Additional arguments that can be passed to the function. Currently,
you can use |
The number of group levels in the model.
data(sleepstudy, package = "lme4")
set.seed(12345)
sleepstudy$grp <- sample(1:5, size = 180, replace = TRUE)
sleepstudy$subgrp <- NA
for (i in 1:5) {
  filter_group <- sleepstudy$grp == i
  sleepstudy$subgrp[filter_group] <-
    sample(1:30, size = sum(filter_group), replace = TRUE)
}
model <- lme4::lmer(
  Reaction ~ Days + (1 | grp / subgrp) + (1 | Subject),
  data = sleepstudy
)
n_grouplevels(model)
This method returns the number of observations that were used to fit the model, as a numeric value.
n_obs(x, ...)

## S3 method for class 'glm'
n_obs(x, disaggregate = FALSE, ...)

## S3 method for class 'svyolr'
n_obs(x, weighted = FALSE, ...)

## S3 method for class 'afex_aov'
n_obs(x, shape = c("long", "wide"), ...)

## S3 method for class 'stanmvreg'
n_obs(x, select = NULL, ...)
x |
A fitted model. |
... |
Currently not used. |
disaggregate |
For binomial models with aggregated data, |
weighted |
For survey designs, returns the weighted sample size. |
shape |
Return long or wide data? Only applicable in repeated measures designs. |
select |
Optional name(s) of response variables for which to extract values. Can be used in case of regression models with multiple response variables. |
The number of observations used to fit the model, or NULL if this information is not available.
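For binomial models fitted to aggregated data there are two plausible counts: the number of data rows and the total number of individual trials. The base-R sketch below computes both for a small aggregated data set. The description of the disaggregate argument is truncated in this manual, so reading disaggregate = TRUE as "count the trials" is an assumption.

```r
ldose <- rep(0:5, 2)
numdead <- c(1, 4, 9, 13, 18, 20, 0, 2, 6, 10, 12, 16)
SF <- cbind(numdead, numalive = 20 - numdead)
m <- glm(SF ~ ldose, family = binomial)

n_rows <- stats::nobs(m) # one observation per aggregated row
n_trials <- sum(SF)      # successes plus failures over all rows

n_rows
n_trials
```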
data(mtcars)
m <- lm(mpg ~ wt + cyl + vs, data = mtcars)
n_obs(m)

data(cbpp, package = "lme4")
m <- glm(
  cbind(incidence, size - incidence) ~ period,
  data = cbpp,
  family = binomial(link = "logit")
)
n_obs(m)
n_obs(m, disaggregate = TRUE)
Returns the number of parameters (coefficients) of a model.
n_parameters(x, ...)

## Default S3 method:
n_parameters(x, remove_nonestimable = FALSE, ...)

## S3 method for class 'merMod'
n_parameters(
  x,
  effects = c("fixed", "random"),
  remove_nonestimable = FALSE,
  ...
)

## S3 method for class 'glmmTMB'
n_parameters(
  x,
  effects = c("fixed", "random"),
  component = c("all", "conditional", "zi", "zero_inflated"),
  remove_nonestimable = FALSE,
  ...
)

## S3 method for class 'zeroinfl'
n_parameters(
  x,
  component = c("all", "conditional", "zi", "zero_inflated"),
  remove_nonestimable = FALSE,
  ...
)

## S3 method for class 'gam'
n_parameters(
  x,
  component = c("all", "conditional", "smooth_terms"),
  remove_nonestimable = FALSE,
  ...
)

## S3 method for class 'brmsfit'
n_parameters(x, effects = "all", component = "all", ...)
x |
A statistical model. |
... |
Arguments passed to or from other methods. |
remove_nonestimable |
Logical, if |
effects |
Should variables for fixed effects ( |
component |
Should total number of parameters, number parameters for the conditional model, the zero-inflated part of the model, the dispersion term or the instrumental variables be returned? Applies to models with zero-inflated and/or dispersion formula, or to models with instrumental variable (so called fixed-effects regressions). May be abbreviated. |
The number of parameters in the model.
This function returns the number of parameters for the fixed effects by default, as returned by find_parameters(x, effects = "fixed"). It does not include all estimated model parameters, i.e. auxiliary parameters like sigma or dispersion are not counted. To get the number of all estimated parameters, use get_df(x, type = "model").
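The difference between counting coefficients and counting all estimated parameters can be seen in base R for a linear model. This sketch uses only coef() and logLik(); that these two numbers match what n_parameters() and get_df(x, type = "model") return for this model follows from the description above but is not checked here.

```r
model <- lm(Sepal.Length ~ Sepal.Width * Species, data = iris)

# Regression coefficients only: intercept, slope, two species
# contrasts and two interaction terms
n_coef <- length(coef(model))

# All estimated parameters: the coefficients plus the residual sigma
n_all <- attr(logLik(model), "df")

n_coef
n_all
```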
data(iris)
model <- lm(Sepal.Length ~ Sepal.Width * Species, data = iris)
n_parameters(model)
This function computes the null-model (i.e. y ~ 1) of a model. For mixed models, the null-model takes random effects into account.
null_model(model, verbose = TRUE, ...)
model |
A (mixed effects) model. |
verbose |
Toggle off warnings. |
... |
Arguments passed to or from other methods. |
The null-model of model.
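For a model without random effects, a null-model can be obtained in base R by refitting with an intercept-only right-hand side. The sketch below uses stats::update() on an lm; it does not cover the mixed-model case, where the random effects would have to be kept, and it is not a description of how null_model() works internally.

```r
m <- lm(mpg ~ wt + cyl, data = mtcars)

# Refit with the same response and data, but only an intercept
m0 <- update(m, . ~ 1)

formula(m0)
coef(m0) # for a linear model, the intercept equals the mean of the response
```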
library(lme4)
data(sleepstudy)
m <- lmer(Reaction ~ Days + (1 + Days | Subject), data = sleepstudy)
summary(m)
summary(null_model(m))
object_has_names() checks if specified names are present in the given object.
object_has_rownames() checks if rownames are present in a dataframe.
object_has_names(x, names)

object_has_rownames(x)
x |
A named object (an atomic vector, a list, a dataframe, etc.). |
names |
A single character or a vector of characters. |
A logical or a vector of logicals.
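The kind of result described here, one logical per requested name, corresponds to a simple membership test in base R. The sketch below is only meant to illustrate the shape of the return value; it is an assumed equivalent, not insight's code.

```r
x <- list(x = 1, y = 2)

# One logical per requested name: is it among the names of the object?
has <- c("x", "a") %in% names(x)
has

# The same test works for data frames, whose names are the column names
"am" %in% names(mtcars)
```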
# check if specified names are present in the given object
object_has_names(mtcars, "am")
object_has_names(anscombe, c("x1", "z1", "y1"))
object_has_names(list("x" = 1, "y" = 2), c("x", "a"))

# check if a dataframe has rownames
object_has_rownames(mtcars)
Convenient function that allows coloured output in the console. Mainly implemented to reduce package dependencies.
print_color(text, color)

print_colour(text, colour)

color_text(text, color)

colour_text(text, colour)

color_theme()
text |
The text to print. |
color , colour
|
Character vector, indicating the colour for printing.
May be one of |
This function prints text directly to the console using cat(), so no string is returned. color_text(), however, returns only the formatted string, without using cat().
color_theme() returns "dark" when RStudio is used with a dark color scheme, "light" when it is used with a light theme, and NULL if the theme could not be detected.
Nothing.
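Coloured console output without extra dependencies is commonly achieved by wrapping text in ANSI escape sequences. The sketch below shows the general technique with a hypothetical helper, blue_text(); the exact escape codes that insight emits are not stated in this manual, so the codes used here are an assumption.

```r
# Hypothetical helper (not part of insight): wrap text in the ANSI codes
# for a blue foreground (34) and a reset to the default foreground (39).
blue_text <- function(text) {
  paste0("\033[34m", text, "\033[39m")
}

s <- blue_text("I'm blue")

# Returning the string and printing it are separate steps, mirroring the
# split between color_text() and print_color() described above.
cat(s, "\n")
```

Terminals that do not interpret ANSI sequences will show the raw codes instead of colour.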
print_color("I'm blue dabedi dabedei", "blue")
This function takes a data frame, typically a data frame with information on summaries of model parameters like bayestestR::describe_posterior(), bayestestR::hdi() or parameters::model_parameters(), as input and splits this information into several parts, depending on the model. See details below.
print_parameters(
  x,
  ...,
  by = c("Effects", "Component", "Group", "Response"),
  format = "text",
  parameter_column = "Parameter",
  keep_parameter_column = TRUE,
  remove_empty_column = FALSE,
  titles = NULL,
  subtitles = NULL
)
x |
A fitted model, or a data frame returned by |
... |
One or more objects (data frames), which contain information about the model parameters and related statistics (like confidence intervals, HDI, ROPE, ...). |
by |
|
format |
Name of output-format, as string. If |
parameter_column |
String, name of the column that contains the
parameter names. Usually, for data frames returned by functions from the
easystats packages, this will be |
keep_parameter_column |
Logical, if |
remove_empty_column |
Logical, if |
titles , subtitles
|
By default, the names of the model components (like
fixed or random effects, count or zero-inflated model part) are added as
attributes |
This function prepares data frames that contain information about model parameters for clear printing.
First, x is required, which should either be a model object or a prepared data frame as returned by clean_parameters(). If x is a model, clean_parameters() is called on that model object to find out which model components the parameters are associated with.
Then, ... takes one or more data frames that also contain information about parameters from the same model, but also have additional information provided by other methods. For instance, a data frame in ... might be the result of bayestestR::describe_posterior() or parameters::model_parameters(), where we have a) a Parameter column and b) columns with other parameter values (like CI, HDI, test statistic, etc.).
Now we have two data frames: one with model parameters and their association with the different model components, and one with model parameters and some summary statistics. print_parameters() then merges these data frames, so the parameters or statistics of interest are also associated with the different model components. The data frame is split into a list for clearer printing. Users can loop over this list and print each component for a better overview. Further, parameter names are "cleaned", if necessary, also for a cleaner print. See also 'Examples'.
A data frame or a list of data frames (if by is not NULL). If a list is returned, the element names reflect the model components to which the extracted information in the data frames belongs, e.g. random.zero_inflated.Intercept: persons. This is the data frame that contains the parameters for the random effects from group-level "persons" from the zero-inflated model component.
library(bayestestR)
model <- download_model("brms_zi_2")
x <- hdi(model, effects = "all", component = "all")

# hdi() returns a data frame; here we use only the
# information on parameter names and HDI values
tmp <- as.data.frame(x)[, 1:4]
tmp

# Based on the "by" argument, we get a list of data frames that
# is split into several parts that reflect the model components.
print_parameters(model, tmp)

# This is the standard print()-method for "bayestestR::hdi"-objects.
# For printing methods, it is easy to print complex summary statistics
# in a clean way to the console by splitting the information into
# different model components.
x
Standardizes order of columns for dataframes and other objects from easystats and broom ecosystem packages.
standardize_column_order(data, ...)

## S3 method for class 'parameters_model'
standardize_column_order(data, style = c("easystats", "broom"), ...)
data |
A data frame. In particular, objects from easystats
package functions like |
... |
Currently not used. |
style |
Standardization can either be based on the naming conventions from the easystats-project, or on broom's naming scheme. |
A data frame, with standardized column order.
# easystats conventions
df1 <- cbind.data.frame(
  CI_low = -2.873,
  t = 5.494,
  CI_high = -1.088,
  p = 0.00001,
  Parameter = -1.980,
  CI = 0.95,
  df = 29.234,
  Method = "Student's t-test"
)
standardize_column_order(df1, style = "easystats")

# broom conventions
df2 <- cbind.data.frame(
  conf.low = -2.873,
  statistic = 5.494,
  conf.high = -1.088,
  p.value = 0.00001,
  estimate = -1.980,
  conf.level = 0.95,
  df = 29.234,
  method = "Student's t-test"
)
standardize_column_order(df2, style = "broom")
Standardize column names from data frames, in particular objects returned from parameters::model_parameters(), so column names are consistent and the same for any model object.
standardize_names(data, ...)

## S3 method for class 'parameters_model'
standardize_names(
  data,
  style = c("easystats", "broom"),
  ignore_estimate = FALSE,
  ...
)
data |
A data frame. In particular, objects from easystats
package functions like |
... |
Currently not used. |
style |
Standardization can either be based on the naming conventions from the easystats-project, or on broom's naming scheme. |
ignore_estimate |
Logical, if |
This method is particularly useful for package developers or users who use, e.g., parameters::model_parameters() in their own code or functions to retrieve model parameters for further processing. As model_parameters() returns a data frame with varying column names (depending on the input), accessing the required information is not always straightforward. In such cases, standardize_names() can be used to get consistent column names, i.e. always the same names no matter what kind of model was used in model_parameters().
For style = "broom", column names are renamed to match broom's naming scheme, i.e. Parameter is renamed to term, Coefficient becomes estimate and so on.
For style = "easystats", when data is an object from broom::tidy(), column names are converted from "broom"-style into "easystats"-style.
A data frame, with standardized column names.
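Renaming columns from one naming scheme to another can be sketched in base R with a named lookup vector. The example below covers only the two mappings named in this section (Parameter to term, Coefficient to estimate); the lookup vector and the hand-built data frame are illustrative assumptions, and the full mapping used by standardize_names() is not reproduced here.

```r
model <- lm(mpg ~ wt + cyl, data = mtcars)

# A small data frame using easystats-style column names
df <- data.frame(
  Parameter = names(coef(model)),
  Coefficient = unname(coef(model))
)

# Lookup from easystats-style to broom-style names (only the two
# mappings mentioned in the text)
lookup <- c(Parameter = "term", Coefficient = "estimate")

# Rename only those columns that have an entry in the lookup
hits <- names(df) %in% names(lookup)
names(df)[hits] <- unname(lookup[names(df)[hits]])

names(df)
```

Columns without an entry in the lookup keep their original names, which is why the renaming is restricted to the matching positions.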
library(parameters)
model <- lm(mpg ~ wt + cyl, data = mtcars)
mp <- model_parameters(model)

as.data.frame(mp)
standardize_names(mp)
standardize_names(mp, style = "broom")
This function removes backticks from a string.
text_remove_backticks(x, ...)

## S3 method for class 'data.frame'
text_remove_backticks(x, column = "Parameter", verbose = FALSE, ...)
x |
A character vector, a data frame or a matrix. If a matrix, backticks are removed from the column and row names, not from values of a character vector. |
... |
Currently not used. |
column |
If |
verbose |
Toggle warnings. |
x, where all backticks are removed.
If x is a character vector or data frame, backticks are removed from the elements of that character vector (or from the character vectors in the data frame). If x is a matrix, the behaviour differs slightly: in this case, backticks are removed from the column and row names. The reason for this behaviour is that this function mainly serves to format coefficient names. For vcov() (a matrix), row and column names equal the coefficient names and are therefore the part that is modified.
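The two behaviours described above, values for character vectors and dimnames for matrices, can be sketched in base R with gsub(). The helper strip_backticks() is a hypothetical name for illustration and is not insight's implementation.

```r
# Hypothetical helper (not part of insight): drop every backtick
strip_backticks <- function(x) {
  gsub("`", "", x, fixed = TRUE)
}

# Character vector: the elements themselves are cleaned
cleaned <- strip_backticks(c("`Sepal Width`", "Petal.Length", "`a m`versicolor"))
cleaned

# Matrix: only the row and column names are cleaned, values are untouched
m <- matrix(1:4, nrow = 2, dimnames = list(c("`a b`", "c"), c("`a b`", "c")))
dimnames(m) <- lapply(dimnames(m), strip_backticks)
m
```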
# example model
data(iris)
iris$`a m` <- iris$Species
iris$`Sepal Width` <- iris$Sepal.Width
model <- lm(`Sepal Width` ~ Petal.Length + `a m`, data = iris)

# remove backticks from string
names(coef(model))
text_remove_backticks(names(coef(model)))

# remove backticks from character variable in a data frame
# column defaults to "Parameter".
d <- data.frame(
  Parameter = names(coef(model)),
  Estimate = unname(coef(model))
)
d
text_remove_backticks(d)
Collection of small helper functions. trim_ws() is an efficient function to trim leading and trailing whitespace from character vectors or strings. n_unique() returns the number of unique values in a vector. has_single_value() is equivalent to n_unique() == 1 but is faster.
safe_deparse() is comparable to deparse1(), i.e. it can safely deparse very long expressions into a single string. safe_deparse_symbol() only deparses a substituted expression when possible, which can be much faster than deparse(substitute()) for those cases where substitute() returns no valid object name.
trim_ws(x, ...)

## S3 method for class 'data.frame'
trim_ws(x, character_only = TRUE, ...)

n_unique(x, ...)

## Default S3 method:
n_unique(x, remove_na = TRUE, ...)

safe_deparse(x, ...)

safe_deparse_symbol(x)

has_single_value(x, remove_na = FALSE, ...)
x |
A (character) vector, or for some functions may also be a data frame. |
... |
Currently not used. |
character_only |
Logical, if |
remove_na |
Logical, if missing values should be removed from the input. |
n_unique(): For a vector, n_unique() always returns an integer value, even if the input is NULL (the return value will be 0 then). For data frames or lists, n_unique() returns a named numeric vector, with the number of unique values for each element.
has_single_value(): TRUE if x has only one unique value, FALSE otherwise.
trim_ws(): A character vector, where trailing and leading white spaces are removed.
safe_deparse(): A character string of the unevaluated expression or symbol.
safe_deparse_symbol(): A character string of the unevaluated expression or symbol, if x was a symbol. If x is no symbol (i.e. if is.name(x) would return FALSE), NULL is returned.
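For orientation, the helpers described above have close base-R counterparts. The sketch below shows those counterparts only; it does not call the insight functions, and any differences in speed or edge-case handling between the two are not examined here.

```r
x <- c(1, 1, 2, NA)

# Counterpart of trim_ws(): strip leading and trailing whitespace
trimmed <- trimws("  no space!  ")

# Counterpart of n_unique() with missing values removed
n_distinct <- length(unique(x[!is.na(x)]))

# Counterpart of has_single_value(): exactly one distinct value
single <- length(unique(c(1, 1, 1))) == 1

# The symbol test that safe_deparse_symbol() is described as relying on
sym_check <- c(is.name(as.name("test")), is.name(quote(a + b)))

trimmed
n_distinct
single
sym_check
```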
trim_ws(" no space! ")
n_unique(iris$Species)
has_single_value(c(1, 1, 2))

# safe_deparse_symbol() compared to deparse(substitute())
safe_deparse_symbol(as.name("test"))
deparse(substitute(as.name("test")))
This is a replacement for match.arg() with a more informative error message for users. The message shows the name of the affected argument, suggests possible typos, and lists the remaining valid options.
validate_argument(argument, options)
argument |
The bare name of the argument to be validated. |
options |
Valid options, usually a character vector. |
argument if it is a valid option, else an error is thrown.
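The idea of a validation helper that names the argument, suggests a likely intended value and lists the valid options can be sketched in base R. The function check_option() below is hypothetical and is not insight's implementation; in particular, using utils::adist() to pick the closest option and the wording of the message are assumptions made for this example.

```r
# Hypothetical helper (not part of insight): return `value` if it is a
# valid option, otherwise stop with a message that suggests the closest one.
check_option <- function(value, options, arg_name = "argument") {
  if (length(value) == 1 && value %in% options) {
    return(value)
  }
  # Closest valid option by edit distance, used as a typo suggestion
  closest <- options[which.min(utils::adist(value, options))]
  stop(
    sprintf(
      "Invalid option for `%s`: \"%s\". Did you mean \"%s\"? Valid options are: %s.",
      arg_name, value, closest, paste(options, collapse = ", ")
    ),
    call. = FALSE
  )
}

sizes <- c("small", "medium", "large")
ok <- check_option("small", sizes, "test")

# Capture the error message instead of stopping the script
msg <- tryCatch(
  check_option("masll", sizes, "test"),
  error = function(e) conditionMessage(e)
)
msg
```

Unlike match.arg(), this sketch does not do partial matching; it only accepts exact matches and reports everything else as an error.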
foo <- function(test = "small") {
  validate_argument(test, c("small", "medium", "large"))
}
foo("small")
# errors:
# foo("masll")