Calculates a similarity/distance matrix from either an incidence data frame/matrix or a coin object.

sim(input, procedures="Jaccard", level=.95, distance=FALSE, 
    minimum=1, maximum=Inf, sort=FALSE, decreasing=FALSE, 
    weight = NULL, pairwise = FALSE)

Arguments

input

a binary data frame or a coin object, that is, an R list composed of the number of scenarios ($n) and a coincidence matrix of frequencies ($f).

procedures

a vector of similarity statistics to compute. See details below.

level

confidence level

distance

convert the similarity matrix into a distance matrix

minimum

minimum frequency to obtain a similarity/distance measure.

maximum

maximum frequency to obtain a similarity/distance measure.

sort

sort the list according to the values of a statistic. See details below

decreasing

sort in decreasing order.

weight

a vector of weights. Optimal for data.framed tables

pairwise

Pairwise mode of handling missing values if TRUE. Listwise by default.
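
The two forms of input described above carry the same information. As a minimal sketch in base R (not the package's own constructor), the number of scenarios and the coincidence matrix can be derived from a binary incidence matrix as follows. The object name coin_like is hypothetical, and a real coin object may carry additional attributes or a class that this list lacks.

```r
# Small fixed incidence matrix: 5 scenarios (rows), 3 events (columns)
I <- matrix(c(1, 1, 0,
              1, 0, 0,
              1, 1, 1,
              0, 1, 1,
              0, 0, 1),
            ncol = 3, byrow = TRUE,
            dimnames = list(NULL, c("A", "B", "C")))

n <- nrow(I)       # number of scenarios
f <- crossprod(I)  # t(I) %*% I: diagonal holds event frequencies,
                   # off-diagonal cells hold pairwise coincidences

# Hypothetical stand-in for the structure described under 'input';
# this is NOT produced by coin() and has no special class
coin_like <- list(n = n, f = f)
coin_like$f
```

Here the diagonal of f gives how often each event occurs, and each off-diagonal cell counts the scenarios in which both events occur together.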

Details

Possible measures in procedures are:

  • Frequencies (f), Relative frequencies (x), Conditional frequencies (i), Coincidence degree (cc), Probable degree (cp),

  • Expected (e), Confidence interval (con)

  • Matching (m), Rogers & Tanimoto (t), Gower (g), Sneath (s), Anderberg (and),

  • Jaccard (j), Dice (d), antiDice (a), Ochiai (o), Kulczynski (k),

  • Hamann (ham), Yule (y), Pearson (p), odds ratio (od), Rusell (r),

  • Haberman (h), Z value of Haberman (z).

  • Hypergeometric p greater value (hyp).
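
To make a few of these measures concrete, the following base R sketch computes them from a coincidence matrix using their textbook definitions. It is an illustration under assumptions, not the package's code: the variable names are hypothetical, and the exact formulas used by sim (for instance on the diagonal) may differ.

```r
# Small fixed incidence matrix: 5 scenarios (rows), 3 events (columns)
I <- matrix(c(1, 1, 0,
              1, 0, 0,
              1, 1, 1,
              0, 1, 1,
              0, 0, 1),
            ncol = 3, byrow = TRUE,
            dimnames = list(NULL, c("A", "B", "C")))

n  <- nrow(I)       # number of scenarios
f  <- crossprod(I)  # coincidence matrix of frequencies
fi <- diag(f)       # marginal frequency of each event

# Jaccard: coincidences / scenarios where at least one of the two events occurs
jaccard <- f / (outer(fi, fi, "+") - f)

# Dice: twice the coincidences / sum of the two marginal frequencies
dice <- 2 * f / outer(fi, fi, "+")

# Ochiai: coincidences / geometric mean of the two marginal frequencies
ochiai <- f / sqrt(outer(fi, fi))

# Expected coincidences if the two events were independent
expected <- outer(fi, fi) / n

round(jaccard, 3)
```

With this matrix, A and B coincide in 2 scenarios and at least one of them occurs in 4, so their Jaccard similarity is 2/4 = 0.5.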

Value

A similarity/distance matrix.
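
A distance matrix is typically what clustering functions expect. The sketch below uses one common conversion for similarities bounded in [0, 1], namely 1 - similarity; this is an assumption for illustration, since this page does not state which conversion sim applies when distance=TRUE. The matrix S is a hand-written example, not package output.

```r
# Hand-written symmetric similarity matrix with values in [0, 1]
S <- matrix(c(1.0, 0.5, 0.2,
              0.5, 1.0, 0.5,
              0.2, 0.5, 1.0),
            ncol = 3, byrow = TRUE,
            dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

# Assumed conversion: distance = 1 - similarity
D <- as.dist(1 - S)

# Base R hierarchical clustering accepts the resulting 'dist' object
hc <- hclust(D, method = "average")

as.matrix(D)
```

Any monotone decreasing transformation would preserve the ordering of pairs; 1 - similarity is chosen here only because it keeps the values in [0, 1].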

Author

Modesto Escobar, Department of Sociology and Communication, University of Salamanca. See https://sociocav.usal.es/blog/modesto-escobar/

Examples

# From a random incidence matrix I(25X4)
I<-matrix(rbinom(100,1,.5),nrow=25,ncol=4,
          dimnames=list(NULL,c("A","B","C","D")))
sim(I)
#>           A         B         C         D
#> A 1.0000000 0.4090909 0.5238095 0.3181818
#> B 0.4090909 1.0000000 0.4090909 0.4000000
#> C 0.5238095 0.4090909 1.0000000 0.3809524
#> D 0.3181818 0.4000000 0.3809524 1.0000000

# Same results
C<-coin(I)
sim(C)
#>           A         B         C         D
#> A 1.0000000 0.4090909 0.5238095 0.3181818
#> B 0.4090909 1.0000000 0.4090909 0.4000000
#> C 0.5238095 0.4090909 1.0000000 0.3809524
#> D 0.3181818 0.4000000 0.3809524 1.0000000