dichotomize.Rd
Converts factor or character columns of a data frame into a set of dichotomous columns. The new columns are named after the label or text of each category.
dichotomize(data, variables, sep = "", min = 1, length = 0, values = NULL, sparse = FALSE, add = TRUE, sort = TRUE, nas = "None")
Argument | Description |
---|---|
data | a data frame with a factor or text column, which can be simple (only one value per scenario) or multiple if its components are delimited with a separator. |
variables | vector of column names to be converted into dichotomous vectors. |
sep | character vector of separators used to split cells that contain multiple events. If the separator is "", every unique cell value of every column becomes a dichotomous column of the data frame. |
min | convert to dichotomous vectors only those labels or texts whose frequency is greater than or equal to this value. If min is between 0 and 1, it is interpreted as a percentage. |
length | maximum number of dichotomous columns generated for each variable. |
values | vector of labels or texts selected for conversion into dichotomous columns. |
sparse | produce a sparse matrix instead of a data.frame. |
add | add the new columns to the input data.frame. |
sort | order the new columns by their frequencies. |
nas | name given to the dichotomous column that captures the NA values of the set of variables. |
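The interaction between `sep` and `min` can be illustrated with a minimal base-R sketch. It is not the netCoin implementation: `dichotomize_sketch` is a hypothetical helper written only for this illustration, and it assumes that `min` is a lower bound on the number of rows in which a category appears, which is what the package examples with the default `min = 1` suggest. It ignores `length`, `values`, `sparse`, `add`, `sort` and `nas`.

```r
# Hypothetical helper: a base-R illustration, not the netCoin code.
dichotomize_sketch <- function(x, sep = "; ", min = 1) {
  x <- as.character(x)
  # With an empty separator each cell is taken whole; otherwise it is split.
  parts <- if (identical(sep, "")) as.list(x) else strsplit(x, sep, fixed = TRUE)
  # Count the rows in which each category occurs (repeats in a cell count once).
  freq <- table(unlist(lapply(parts, unique)))
  # Assumption: a `min` strictly between 0 and 1 is a share of the rows.
  threshold <- if (min > 0 && min < 1) min * length(x) else min
  keep <- names(freq)[freq >= threshold]
  # One 0/1 column per retained category, one row per input cell.
  out <- vapply(keep, function(k) {
    as.integer(vapply(parts, function(p) k %in% p, logical(1)))
  }, integer(length(x)))
  as.data.frame(out)
}

frame2 <- data.frame(A = c("Man; Women", "Women; Women",
                           "Man; Man", "Undet.; Women; Man"))
all_cats <- dichotomize_sketch(frame2$A, sep = "; ")           # every category
frequent <- dichotomize_sketch(frame2$A, sep = "; ", min = 2)  # drops "Undet."
```

Unlike `dichotomize`, this sketch returns its columns in alphabetical order rather than sorted by frequency, and it does not create a column for missing values.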
A data frame composed of the original data plus the added dichotomous columns.
Escobar, M. and Martinez-Uribe, L. (2020) Network Coincidence Analysis: The netCoin R Package. Journal of Statistical Software, 93, 1-32. doi: 10.18637/jss.v093.i11.
Modesto Escobar, Department of Sociology and Communication, University of Salamanca, and Luis Martinez Uribe, Fundacion Juan March. See https://sociocav.usal.es/blog/modesto-escobar/
# A character column
frame1 <- data.frame(A = c("Man", "Women", "Man", "Undet."))
dichotomize(frame1, "A", sep = "; ")
#>         A Man Undet. Women A:None
#> V1    Man   1      0     0      0
#> V2  Women   0      0     1      0
#> V3    Man   1      0     0      0
#> V4 Undet.   0      1     0      0

# A character column (with separator)
frame2 <- data.frame(A = c("Man; Women", "Women; Women",
                           "Man; Man", "Undet.; Women; Man"))
dichotomize(frame2, "A", sep = "; ")
#>                     A Man Women Undet. A:None
#> V1         Man; Women   1     1      0      0
#> V2       Women; Women   0     1      0      0
#> V3           Man; Man   1     0      0      0
#> V4 Undet.; Women; Man   1     1      1      0

# A character column and another factor column (same separator)
frame3 <- data.frame(A = c("Man; Women", "Women; Women",
                           "Man; Man", "Undet.; Women; Man"),
                     C = factor(c(1:4),
                                labels = c("Paris", "New York",
                                           "London; New York", "<NA>")))
dichotomize(frame3, c("A", "C"), sep = "; ")
#>                     A                C A:Man A:Women A:Undet. A:None C:New York
#> V1         Man; Women            Paris     1       1        0      0          0
#> V2       Women; Women         New York     0       1        0      0          1
#> V3           Man; Man London; New York     1       0        0      0          1
#> V4 Undet.; Women; Man             <NA>     1       1        1      0         NA
#>    C:None C:London C:Paris
#> V1      0        0       1
#> V2      0        0       0
#> V3      0        1       0
#> V4     NA       NA      NA

# A set of simple character or factor (same levels) variables.
# In this case, you must use "C" separator.
frame4 <- data.frame(A = c("Man", "Women", "Man", "Undet", NA),
                     B = c("Women", "Women", "Man", "Women", NA),
                     C = c(NA, NA, NA, "Man", NA))
dichotomize(frame4, c("A", "B", "C"), sep = "C")
#>        A     B    C Man Women None Undet String:None
#> V1   Man Women <NA>   1     1    0     0           0
#> V2 Women Women <NA>   0     1    0     0           0
#> V3   Man   Man <NA>   1     0    0     0           0
#> V4 Undet Women  Man   1     1    0     1           0
#> V5  <NA>  <NA> <NA>   0     0    1     0           0