Package: datawizard 1.0.1

Etienne Bacher

datawizard: Easy Data Wrangling and Statistical Transformations

A lightweight package to assist in key steps involved in any data analysis workflow: (1) wrangling the raw data to get it in the needed form, (2) applying preprocessing steps and statistical transformations, and (3) computing statistical summaries of data properties and distributions. It is also the data wrangling backend for packages in the 'easystats' ecosystem. References: Patil et al. (2022) <doi:10.21105/joss.04684>.

Authors: Indrajeet Patil [aut], Etienne Bacher [aut, cre], Dominique Makowski [aut], Daniel Lüdecke [aut], Mattan S. Ben-Shachar [aut], Brenton M. Wiernik [aut], Rémi Thériault [ctb], Thomas J. Faulkenberry [rev], Robert Garrett [rev]

datawizard_1.0.1.tar.gz
datawizard_1.0.1.zip (r-4.5), datawizard_1.0.1.zip (r-4.4), datawizard_1.0.1.zip (r-4.3)
datawizard_1.0.1.tgz (r-4.5-any), datawizard_1.0.1.tgz (r-4.4-any), datawizard_1.0.1.tgz (r-4.3-any)
datawizard_1.0.1.tar.gz (r-4.5-noble), datawizard_1.0.1.tar.gz (r-4.4-noble)
datawizard_1.0.1.tgz (r-4.4-emscripten), datawizard_1.0.1.tgz (r-4.3-emscripten)
datawizard.pdf | datawizard.html
datawizard/json (API)
NEWS

# Install 'datawizard' in R:

install.packages('datawizard', repos = c('https://easystats.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/easystats/datawizard/issues

Pkgdown site: https://easystats.github.io

Datasets:

efc - Sample dataset from the EFC Survey
nhanes_sample - Sample dataset from the National Health and Nutrition Examination Survey

On CRAN:

data dplyr hacktoberfest janitor manipulation reshape tidyr wrangling

14.71 score, 222 stars, 119 packages, 436 scripts, 125k downloads, 110 exports, 1 dependency

Last updated 4 days ago from: 1af27228d3. Checks: 9 OK. Indexed: yes.

Target	Result	Latest binary
Doc / Vignettes	OK	Mar 07 2025
R-4.5-win	OK	Mar 07 2025
R-4.5-mac	OK	Mar 07 2025
R-4.5-linux	OK	Mar 07 2025
R-4.4-win	OK	Mar 07 2025
R-4.4-mac	OK	Mar 07 2025
R-4.4-linux	OK	Mar 07 2025
R-4.3-win	OK	Mar 07 2025
R-4.3-mac	OK	Mar 07 2025

Exports: adjust assign_labels categorize center centre change_scale coef_var coerce_to_numeric colnames_to_row column_as_rownames contr.deviation convert_na_to convert_to_na data_addprefix data_addsuffix data_adjust data_arrange data_codebook data_duplicated data_extract data_filter data_group data_join data_match data_merge data_modify data_partition data_peek data_read data_relocate data_remove data_rename data_rename_rows data_reorder data_replicate data_restoretype data_rotate data_seek data_select data_separate data_summary data_tabulate data_to_long data_to_wide data_transpose data_ungroup data_unique data_unite data_write degroup demean describe_distribution detrend distribution_coef_var distribution_mode empty_columns empty_rows extract_column_names find_columns kurtosis labels_to_levels mean_sd means_by_group median_mad normalize print_html print_md ranktransform recode_into recode_values remove_empty remove_empty_columns remove_empty_rows replace_nan_inf rescale rescale_weights reshape_ci reshape_longer reshape_wider reverse reverse_scale row_count row_means row_sums row_to_colnames rowid_as_column rownames_as_column skewness slide smoothness standardise standardize text_concatenate text_format text_fullstop text_lastchar text_paste text_remove text_wrap to_factor to_numeric unnormalize unstandardise unstandardize visualisation_recipe weighted_mad weighted_mean weighted_median weighted_sd winsorize

Dependencies: insight

Overview of Vignettes

Rendered from overview_of_vignettes.Rmd using knitr::rmarkdown on Mar 07 2025.

Last update: 2024-09-02
Started: 2021-05-26

Help page	Topics
Adjust data for the effect of other variable(s)	adjust data_adjust
Assign variable and value labels	assign_labels assign_labels.data.frame assign_labels.numeric
Recode (or "cut" / "bin") data into groups of values.	categorize categorize.data.frame categorize.numeric
Centering (Grand-Mean Centering)	center center.data.frame center.numeric centre
Compute the coefficient of variation	coef_var coef_var.numeric distribution_coef_var distribution_cv
Convert to Numeric (if possible)	coerce_to_numeric
Deviation Contrast Matrix	contr.deviation
Replace missing values in a variable or a data frame.	convert_na_to convert_na_to.character convert_na_to.data.frame convert_na_to.numeric
Convert non-missing values in a variable into missing values.	convert_to_na convert_to_na.data.frame convert_to_na.factor convert_to_na.numeric
Add a prefix or suffix to column names	data_addprefix data_addsuffix
Arrange rows by column values	data_arrange
Generate a codebook of a data frame.	data_codebook print_html.data_codebook
Extract all duplicates	data_duplicated
Extract one or more columns or elements from an object	data_extract data_extract.data.frame
Create a grouped data frame	data_group data_ungroup
Return filtered or sliced data frame, or row indices	data_filter data_match
Merge (join) two data frames, or a list of data frames	data_join data_merge data_merge.data.frame data_merge.list
Create new variables in a data frame	data_modify data_modify.data.frame
Partition data	data_partition
Peek at values and type of variables in a data frame	data_peek data_peek.data.frame
Read (import) data files from various sources	data_read data_write
Relocate (reorder) columns of a data frame	data_relocate data_remove data_reorder
Rename columns and variable names	data_rename data_rename_rows
Expand (i.e. replicate rows) a data frame	data_replicate
Restore the type of columns according to a reference data frame	data_restoretype
Rotate a data frame	data_rotate data_transpose
Find variables by their names, variable or value labels	data_seek
Find or get columns in a data frame based on search patterns	data_select extract_column_names find_columns
Separate single variable into multiple variables	data_separate
Summarize data	data_summary data_summary.data.frame
Create frequency and crosstables of variables	as.data.frame.datawizard_tables data_tabulate data_tabulate.data.frame data_tabulate.default
Reshape (pivot) data from wide to long	data_to_long reshape_longer
Reshape (pivot) data from long to wide	data_to_wide reshape_wider
Keep only one row from all with duplicated IDs	data_unique
Unite ("merge") multiple variables	data_unite
Compute group-meaned and de-meaned variables	degroup demean detrend
Describe a distribution	describe_distribution describe_distribution.data.frame describe_distribution.factor describe_distribution.numeric
Compute mode for a statistical distribution	distribution_mode
Sample dataset from the EFC Survey	efc
Convert value labels into factor levels	labels_to_levels labels_to_levels.data.frame labels_to_levels.factor
Utility Function for Safe Prediction with 'datawizard' transformers	makepredictcall.dw_transformer
Summary Helpers	mean_sd median_mad
Summary of mean values by group	means_by_group means_by_group.data.frame means_by_group.numeric
Sample dataset from the National Health and Nutrition Examination Survey	nhanes_sample
Normalize numeric variable to 0-1 range	normalize normalize.data.frame normalize.numeric unnormalize unnormalize.data.frame unnormalize.grouped_df unnormalize.numeric
(Signed) rank transformation	ranktransform ranktransform.data.frame ranktransform.numeric
Recode values from one or more variables into a new variable	recode_into
Recode old values of variables into new values	recode_values recode_values.data.frame recode_values.numeric
Return or remove variables or observations that are completely missing	empty_columns empty_rows remove_empty remove_empty_columns remove_empty_rows
Convert infinite or 'NaN' values into 'NA'	replace_nan_inf
Rescale Variables to a New Range	change_scale rescale rescale.data.frame rescale.numeric
Rescale design weights for multilevel analysis	rescale_weights
Reshape CI between wide/long formats	reshape_ci
Reverse-Score Variables	reverse reverse.data.frame reverse.numeric reverse_scale
Count specific values row-wise	row_count
Row means or sums (optionally with minimum amount of valid values)	row_means row_sums
Tools for working with column names	colnames_to_row row_to_colnames
Tools for working with row names or row ids	column_as_rownames rowid_as_column rownames_as_column
Compute Skewness and (Excess) Kurtosis	kurtosis kurtosis.numeric print.parameters_kurtosis print.parameters_skewness skewness skewness.numeric summary.parameters_kurtosis summary.parameters_skewness
Shift numeric value range	slide slide.data.frame slide.numeric
Quantify the smoothness of a vector	smoothness
Standardization (Z-scoring)	standardise standardize standardize.data.frame standardize.factor standardize.numeric unstandardise unstandardize unstandardize.data.frame unstandardize.numeric
Re-fit a model with standardized data	standardize.default standardize_models
Convenient text formatting functionalities	text_concatenate text_format text_fullstop text_lastchar text_paste text_remove text_wrap
Convert data to factors	to_factor to_factor.data.frame to_factor.numeric
Convert data to numeric	to_numeric to_numeric.data.frame
Prepare objects for visualisation	visualisation_recipe
Weighted Mean, Median, SD, and MAD	weighted_mad weighted_mean weighted_median weighted_sd
Winsorize data	winsorize winsorize.numeric
