Package: datawizard 0.11.0.2

Etienne Bacher

datawizard: Easy Data Wrangling and Statistical Transformations

A lightweight package to assist in key steps involved in any data analysis workflow: (1) wrangling the raw data to get it in the needed form, (2) applying preprocessing steps and statistical transformations, and (3) computing statistical summaries of data properties and distributions. It is also the data wrangling backend for packages in the 'easystats' ecosystem. References: Patil et al. (2022) <doi:10.21105/joss.04684>.

Authors: Indrajeet Patil [aut], Etienne Bacher [aut, cre], Dominique Makowski [aut], Daniel Lüdecke [aut], Mattan S. Ben-Shachar [aut], Brenton M. Wiernik [aut], Rémi Thériault [ctb], Thomas J. Faulkenberry [rev], Robert Garrett [rev]

datawizard_0.11.0.2.tar.gz
datawizard_0.11.0.2.zip (r-4.5), datawizard_0.11.0.2.zip (r-4.4), datawizard_0.11.0.2.zip (r-4.3)
datawizard_0.11.0.2.tgz (r-4.4-any), datawizard_0.11.0.2.tgz (r-4.3-any)
datawizard_0.11.0.2.tar.gz (r-4.5-noble), datawizard_0.11.0.2.tar.gz (r-4.4-noble)
datawizard_0.11.0.2.tgz (r-4.4-emscripten), datawizard_0.11.0.2.tgz (r-4.3-emscripten)
datawizard.pdf | datawizard.html
datawizard/json (API)
NEWS

# Install datawizard in R:
install.packages('datawizard', repos = c('https://easystats.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/easystats/datawizard/issues

Datasets:
  • efc - Sample dataset from the EFC Survey
  • nhanes_sample - Sample dataset from the National Health and Nutrition Examination Survey

Topics: data, dplyr, hacktoberfest, janitor, manipulation, reshape, tidyr, wrangling

112 exports 192 stars 7.97 score 1 dependencies 106 dependents 109.3k downloads

Last updated 13 days ago from: 59d24e975222fb9c90a984897f24675485f9f9da

Exports: adjust, assign_labels, categorize, center, centre, change_code, change_scale, coef_var, coerce_to_numeric, colnames_to_row, column_as_rownames, contr.deviation, convert_na_to, convert_to_na, data_addprefix, data_addsuffix, data_adjust, data_arrange, data_codebook, data_duplicated, data_extract, data_filter, data_find, data_group, data_join, data_match, data_merge, data_modify, data_partition, data_peek, data_read, data_relocate, data_remove, data_rename, data_rename_rows, data_reorder, data_replicate, data_restoretype, data_rotate, data_seek, data_select, data_separate, data_summary, data_tabulate, data_to_long, data_to_wide, data_transpose, data_ungroup, data_unique, data_unite, data_write, degroup, demean, describe_distribution, detrend, distribution_coef_var, distribution_mode, empty_columns, empty_rows, extract_column_names, find_columns, format_text, get_columns, kurtosis, labels_to_levels, mean_sd, means_by_group, median_mad, normalize, print_html, print_md, ranktransform, recode_into, recode_values, remove_empty, remove_empty_columns, remove_empty_rows, replace_nan_inf, rescale, rescale_weights, reshape_ci, reshape_longer, reshape_wider, reverse, reverse_scale, row_means, row_to_colnames, rowid_as_column, rownames_as_column, skewness, slide, smoothness, standardise, standardize, text_concatenate, text_format, text_fullstop, text_lastchar, text_paste, text_remove, text_wrap, to_factor, to_numeric, unnormalize, unstandardise, unstandardize, visualisation_recipe, weighted_mad, weighted_mean, weighted_median, weighted_sd, winsorize

Dependencies: insight

A quick summary of selection syntax in {datawizard}

Rendered from selection_syntax.Rmd using knitr::rmarkdown on Jun 08 2024.

Last update: 2024-05-19
Started: 2022-09-15

Coming from 'tidyverse'

Rendered from tidyverse_translation.Rmd using knitr::rmarkdown on Jun 08 2024.

Last update: 2023-06-13
Started: 2022-07-26

Data Standardization

Rendered from standardize_data.Rmd using knitr::rmarkdown on Jun 08 2024.

Last update: 2023-12-21
Started: 2021-06-20

Readme and manuals

Help Manual

Help page | Topics
Adjust data for the effect of other variable(s) | adjust, data_adjust
Assign variable and value labels | assign_labels, assign_labels.data.frame, assign_labels.numeric
Recode (or "cut" / "bin") data into groups of values. | categorize, categorize.data.frame, categorize.numeric
Centering (Grand-Mean Centering) | center, center.data.frame, center.numeric, centre
Compute the coefficient of variation | coef_var, coef_var.numeric, distribution_coef_var, distribution_cv
Convert to Numeric (if possible) | coerce_to_numeric
Deviation Contrast Matrix | contr.deviation
Replace missing values in a variable or a data frame. | convert_na_to, convert_na_to.character, convert_na_to.data.frame, convert_na_to.numeric
Convert non-missing values in a variable into missing values. | convert_to_na, convert_to_na.data.frame, convert_to_na.factor, convert_to_na.numeric
Rename columns and variable names | data_addprefix, data_addsuffix, data_rename, data_rename_rows
Arrange rows by column values | data_arrange
Generate a codebook of a data frame. | data_codebook, print_html.data_codebook
Extract all duplicates | data_duplicated
Extract one or more columns or elements from an object | data_extract, data_extract.data.frame
Create a grouped data frame | data_group, data_ungroup
Return filtered or sliced data frame, or row indices | data_filter, data_match
Merge (join) two data frames, or a list of data frames | data_join, data_merge, data_merge.data.frame, data_merge.list
Create new variables in a data frame | data_modify, data_modify.data.frame
Partition data | data_partition
Peek at values and type of variables in a data frame | data_peek, data_peek.data.frame
Read (import) data files from various sources | data_read, data_write
Relocate (reorder) columns of a data frame | data_relocate, data_remove, data_reorder
Expand (i.e. replicate rows) a data frame | data_replicate
Restore the type of columns according to a reference data frame | data_restoretype
Rotate a data frame | data_rotate, data_transpose
Find variables by their names, variable or value labels | data_seek
Find or get columns in a data frame based on search patterns | data_find, data_select, extract_column_names, find_columns, get_columns
Separate single variable into multiple variables | data_separate
Summarize data | data_summary, data_summary.data.frame
Create frequency and crosstables of variables | data_tabulate, data_tabulate.data.frame, data_tabulate.default
Reshape (pivot) data from wide to long | data_to_long, reshape_longer
Reshape (pivot) data from long to wide | data_to_wide, reshape_wider
Keep only one row from all with duplicated IDs | data_unique
Unite ("merge") multiple variables | data_unite
Compute group-meaned and de-meaned variables | degroup, demean, detrend
Describe a distribution | describe_distribution, describe_distribution.data.frame, describe_distribution.factor, describe_distribution.numeric
Compute mode for a statistical distribution | distribution_mode
Sample dataset from the EFC Survey | efc
Convert value labels into factor levels | labels_to_levels, labels_to_levels.data.frame, labels_to_levels.factor
Utility Function for Safe Prediction with 'datawizard' transformers | makepredictcall.dw_transformer
Summary Helpers | mean_sd, median_mad
Summary of mean values by group | means_by_group, means_by_group.data.frame, means_by_group.numeric
Sample dataset from the National Health and Nutrition Examination Survey | nhanes_sample
Normalize numeric variable to 0-1 range | normalize, normalize.data.frame, normalize.numeric, unnormalize, unnormalize.data.frame, unnormalize.grouped_df, unnormalize.numeric
(Signed) rank transformation | ranktransform, ranktransform.data.frame, ranktransform.numeric
Recode values from one or more variables into a new variable | recode_into
Recode old values of variables into new values | change_code, recode_values, recode_values.data.frame, recode_values.numeric
Return or remove variables or observations that are completely missing | empty_columns, empty_rows, remove_empty, remove_empty_columns, remove_empty_rows
Convert infinite or 'NaN' values into 'NA' | replace_nan_inf
Rescale Variables to a New Range | change_scale, rescale, rescale.data.frame, rescale.numeric
Rescale design weights for multilevel analysis | rescale_weights
Reshape CI between wide/long formats | reshape_ci
Reverse-Score Variables | reverse, reverse.data.frame, reverse.numeric, reverse_scale
Row means (optionally with minimum amount of valid values) | row_means
Tools for working with column names | colnames_to_row, row_to_colnames
Tools for working with row names or row ids | column_as_rownames, rowid_as_column, rownames_as_column
Compute Skewness and (Excess) Kurtosis | kurtosis, kurtosis.numeric, print.parameters_kurtosis, print.parameters_skewness, skewness, skewness.numeric, summary.parameters_kurtosis, summary.parameters_skewness
Shift numeric value range | slide, slide.data.frame, slide.numeric
Quantify the smoothness of a vector | smoothness
Standardization (Z-scoring) | standardise, standardize, standardize.data.frame, standardize.factor, standardize.numeric, unstandardise, unstandardize, unstandardize.data.frame, unstandardize.numeric
Re-fit a model with standardized data | standardize.default, standardize_models
Convenient text formatting functionalities | format_text, text_concatenate, text_format, text_fullstop, text_lastchar, text_paste, text_remove, text_wrap
Convert data to factors | to_factor, to_factor.data.frame, to_factor.numeric
Convert data to numeric | to_numeric, to_numeric.data.frame
Prepare objects for visualisation | visualisation_recipe
Weighted Mean, Median, SD, and MAD | weighted_mad, weighted_mean, weighted_median, weighted_sd
Winsorize data | winsorize, winsorize.numeric