Expand categorical attribute variables to a series of dichotomous variables

expand_attributes(
  data,
  attributes,
  valueLabels = NULL,
  prefix = "",
  glue = "__",
  suffix = "",
  falseValue = 0,
  trueValue = 1,
  valueFirst = TRUE,
  append = TRUE
)

Arguments

data

The data frame, normally the $qdt data frame that exists in the object returned by a call to parse_sources().

attributes

The name of the attribute(s) to expand.

valueLabels

By default, the created variables are named after the attribute values. Use valueLabels to supply different names. If only one attribute is specified, pass a named vector; if multiple attributes are specified, pass a named list of named vectors, where the name of each vector corresponds to an attribute passed in attributes. In both cases, the names of the vector elements must correspond to the values of the attribute, and the elements themselves are the labels to use (see the example).
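As a minimal sketch of the multiple-attribute form described above, the following builds such a named list in base R. The education attribute and all of its values and labels are hypothetical and serve only to illustrate the shape of the object; they do not come from the package's example data.

```r
### Attributes to expand; "education" is a hypothetical second attribute
attributes <- c("age_group", "education");

### One named vector per attribute: element names are the attribute
### values, elements are the labels to use in the new variable names
valueLabels <-
  list(
    age_group = c("<18" = "youngest",
                  "18-30" = "youngish",
                  "31-60" = "oldish",
                  ">60" = "oldest"),
    education = c("primary" = "edu_low",
                  "tertiary" = "edu_high")
  );

### The list names should match the attribute names
all(names(valueLabels) %in% attributes);
#> [1] TRUE
```

A list of this shape would then be passed as valueLabels, together with the same attributes vector.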

prefix, suffix

The prefix and suffix to add to the names of the variables that are returned.

glue

The glue used to paste the first part and the second part of the composite variable name together.

falseValue, trueValue

The values to set for rows that, respectively, do not match and do match an attribute value.
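The dichotomization described above can be sketched in base R as follows. This is an illustration of the idea for a single attribute value, not the package's implementation; the short age_group vector is made up for the example.

```r
### A small, made-up attribute column
age_group <- c("<18", "18-30", ">60", "<18");

### The defaults of the two arguments
trueValue <- 1;
falseValue <- 0;

### Rows matching "<18" get trueValue; all other rows get falseValue
youngest <- ifelse(age_group == "<18", trueValue, falseValue);
print(youngest);
#> [1] 1 0 0 1
```

Setting trueValue = TRUE and falseValue = FALSE in the same sketch would yield a logical vector instead of a numeric one.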

valueFirst

Whether the attribute value (TRUE) or the attribute name (FALSE) comes first in the composite variable names.
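To make the interplay of prefix, glue, suffix and valueFirst concrete, here is a rough sketch of how a composite name could be assembled. The helper compose_name() is hypothetical and not part of the package, and the placement of prefix and suffix around the whole name is an assumption; only the attribute-first result is confirmed by the example output on this page.

```r
### Hypothetical helper, for illustration only (not the package's code)
compose_name <- function(attribute,
                         value,
                         prefix = "",
                         glue = "__",
                         suffix = "",
                         valueFirst = TRUE) {
  ### Order the two parts, then wrap them in prefix and suffix
  core <-
    if (valueFirst) {
      paste0(value, glue, attribute);
    } else {
      paste0(attribute, glue, value);
    }
  return(paste0(prefix, core, suffix));
}

### Attribute name first, as in the example on this page
compose_name("age_group", "youngest", valueFirst = FALSE);
#> [1] "age_group__youngest"

### Value first (the default)
compose_name("age_group", "youngest");
#> [1] "youngest__age_group"
```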

append

Whether to append the columns to the supplied data frame or not.

Value

A data.frame

Examples

### Get path to example source
examplePath <-
  system.file("extdata", package="rock");

### Get a path to one example file
exampleFile <-
  file.path(examplePath, "example-1.rock");

### Parse single example source
parsedExample <- rock::parse_source(exampleFile);

### Create a categorical attribute column
parsedExample$qdt$age_group <-
  c(rep(c("<18", "18-30", "31-60", ">60"),
        each=19),
    rep(c("<18", ">60"),
        times = c(3, 4)));

### Expand to four dichotomous (0/1) columns
parsedExample$qdt <-
  rock::expand_attributes(
    parsedExample$qdt,
    "age_group",
    valueLabels =
      c(
        "<18" = "youngest",
        "18-30" = "youngish",
        "31-60" = "oldish",
        ">60" = "oldest"
       ),
    valueFirst = FALSE
);

### Show some of the result
table(parsedExample$qdt$age_group,
      parsedExample$qdt$age_group__youngest);
#>        
#>          0  1
#>   18-30 19  0
#>   31-60 19  0
#>   <18    0 22
#>   >60   23  0
table(parsedExample$qdt$age_group,
      parsedExample$qdt$age_group__oldish);
#>        
#>          0  1
#>   18-30 19  0
#>   31-60  0 19
#>   <18   22  0
#>   >60   23  0