Generates descriptive statistics for a given data frame. Statistics for continuous variables are collected in a single data frame, while each categorical variable has its statistics stored in a separate data frame.

get_descriptive_stat(df, vars = NULL, include_long_catg = FALSE)

Arguments

df

A data.frame object to analyse.

vars

Names of the variables to compute statistics for, given as a character vector. NULL (the default) means all variables.

include_long_catg

Whether to include statistics for categorical variables with more than 50 unique values. Defaults to FALSE, so such variables are left out unless this is set to TRUE.
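
To see in advance which columns the 50-unique-value threshold would affect, something like the following base R sketch may help. It is not part of the package: the toy data frame and the names is_catg, n_unique and long_catg are hypothetical, and it assumes that factor and character columns are the ones treated as categorical, which this page does not confirm.

```r
# Hypothetical pre-check, not part of the package: count the unique values
# in each factor or character column of a data frame.
df <- data.frame(
  id    = sprintf("row%03d", 1:120),          # 120 unique values
  group = rep(c("a", "b", "c"), times = 40),  # 3 unique values
  score = seq_len(120),                       # numeric, ignored below
  stringsAsFactors = FALSE
)

# Assumption: factor and character columns count as categorical.
is_catg <- vapply(df, function(col) is.factor(col) || is.character(col), logical(1))
n_unique <- vapply(df[is_catg], function(col) length(unique(col)), integer(1))

# Columns above the documented threshold of 50 unique values.
long_catg <- names(n_unique)[n_unique > 50]
long_catg
```

In this toy data only id exceeds the threshold, so it is the kind of column that include_long_catg = TRUE would be needed for.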

Value

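A named list of data frames with descriptive statistics. As the examples show, each categorical variable has its own element, named after the variable, with columns values, count and percentage. The continuous variables share one element, ContinuousVariables, with one row per variable and columns min, max, mean, median, std, variance, Q1 and Q3.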
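
The columns of the returned data frames can be cross-checked with base R. The sketch below rebuilds one continuous row and one categorical table for iris. It is an illustration only, not the package's implementation, and it assumes that std and variance are the sample versions returned by sd() and var(), and that Q1 and Q3 use the default quantile() method.

```r
# Sketch only: rebuild one row of the ContinuousVariables table by hand.
x <- iris$Petal.Length
q <- quantile(x, probs = c(0.25, 0.75), names = FALSE)
manual_cont <- data.frame(
  min = min(x), max = max(x), mean = mean(x), median = median(x),
  std = sd(x), variance = var(x), Q1 = q[1], Q3 = q[2],
  row.names = "Petal.Length"
)
manual_cont

# Sketch only: rebuild the table of a categorical variable by hand.
counts <- table(iris$Species)
manual_catg <- data.frame(
  values     = names(counts),
  count      = as.integer(counts),
  percentage = 100 * as.integer(counts) / sum(counts)
)
manual_catg
```

Under those assumptions the values should agree with the Petal.Length row and the Species table shown in the examples.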

Examples

get_descriptive_stat(iris)
#> $Species
#>       values count percentage
#> 1     setosa    50   33.33333
#> 2 versicolor    50   33.33333
#> 3  virginica    50   33.33333
#> 
#> $ContinuousVariables
#>              min max     mean median       std  variance  Q1  Q3
#> Sepal.Length 4.3 7.9 5.843333   5.80 0.8280661 0.6856935 5.1 6.4
#> Sepal.Width  2.0 4.4 3.057333   3.00 0.4358663 0.1899794 2.8 3.3
#> Petal.Length 1.0 6.9 3.758000   4.35 1.7652982 3.1162779 1.6 5.1
#> Petal.Width  0.1 2.5 1.199333   1.30 0.7622377 0.5810063 0.3 1.8
#> 
get_descriptive_stat(iris, vars = c('Petal.Length', 'Species'))
#> $Species
#>       values count percentage
#> 1     setosa    50   33.33333
#> 2 versicolor    50   33.33333
#> 3  virginica    50   33.33333
#> 
#> $ContinuousVariables
#>              min max  mean median      std variance  Q1  Q3
#> Petal.Length   1 6.9 3.758   4.35 1.765298 3.116278 1.6 5.1
#> 
get_descriptive_stat(iris, include_long_catg = TRUE)
#> $Species
#>       values count percentage
#> 1     setosa    50   33.33333
#> 2 versicolor    50   33.33333
#> 3  virginica    50   33.33333
#> 
#> $ContinuousVariables
#>              min max     mean median       std  variance  Q1  Q3
#> Sepal.Length 4.3 7.9 5.843333   5.80 0.8280661 0.6856935 5.1 6.4
#> Sepal.Width  2.0 4.4 3.057333   3.00 0.4358663 0.1899794 2.8 3.3
#> Petal.Length 1.0 6.9 3.758000   4.35 1.7652982 3.1162779 1.6 5.1
#> Petal.Width  0.1 2.5 1.199333   1.30 0.7622377 0.5810063 0.3 1.8
#>