# Issue

This content is from Stack Overflow. Question asked by md hossain.

This is my dataframe:

```r
disease <- c("high", "high", "high", "high", "low", "low", "low", "low")
A <- c("P", "A", "P", "P", "A", "A", "A", "P")
B <- c("P", "A", "A", "P", "A", "P", "A", "P")
C <- c("P", "P", "A", "P", "A", "A", "A", "P")
df <- data.frame(disease, A, B, C)
```

I am looking for a contingency table like the one in the image (where row 1 = A, row 2 = B, row 3 = C).


For each column in the dataframe, I need to calculate the frequency (count) of each combination with the `disease` column: P-high, P-low, A-high and A-low, as in the image above. I can do that with `nrow` for each column separately, as below:

```r
# Counts for column 2 (A) of df
high_P <- nrow(df[df$disease == "high" & df$A == "P", ])
high_A <- nrow(df[df$disease == "high" & df$A == "A", ])
low_P <- nrow(df[df$disease == "low" & df$A == "P", ])
low_A <- nrow(df[df$disease == "low" & df$A == "A", ])
A_df <- data.frame(high_P, high_A, low_P, low_A)
```

```r
# Counts for column 3 (B) of df
high_P <- nrow(df[df$disease == "high" & df$B == "P", ])
high_A <- nrow(df[df$disease == "high" & df$B == "A", ])
low_P <- nrow(df[df$disease == "low" & df$B == "P", ])
low_A <- nrow(df[df$disease == "low" & df$B == "A", ])
B_df <- data.frame(high_P, high_A, low_P, low_A)
```

```r
# Counts for column 4 (C) of df
high_P <- nrow(df[df$disease == "high" & df$C == "P", ])
high_A <- nrow(df[df$disease == "high" & df$C == "A", ])
low_P <- nrow(df[df$disease == "low" & df$C == "P", ])
low_A <- nrow(df[df$disease == "low" & df$C == "A", ])
C_df <- data.frame(high_P, high_A, low_P, low_A)

# Stack the per-column results into one table
Data <- rbind(A_df, B_df, C_df)
```

It does what I want, but I would like to calculate this for each column in turn using a loop, since for a big data set it would be impractical to do it manually, column by column. Could anyone suggest how I can calculate the contingency table in R, as in the image, using a loop or some other approach?

# Solution

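You can reshape the data to long format, count each combination of disease, column and value, and then pivot the counts back to wide format: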

```r
library(dplyr)
library(tidyr)

df %>%
  pivot_longer(!disease, names_to = 'columns', values_to = 'vals') %>%
  count(disease, columns, vals) %>%
  pivot_wider(names_from = c(disease, vals), values_from = n,
              names_sep = '_')

# A tibble: 3 × 5
#   columns high_A high_P low_A low_P
#   <chr>    <int>  <int> <int> <int>
# 1 A            1      3     3     1
# 2 B            2      2     2     2
# 3 C            1      3     3     1
```
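
Since the question asked for a loop, here is a minimal base R sketch of that idea. It is not part of the original answer. It applies the same four counts to every column except `disease`. The names `count_one` and `value_cols` are made up for illustration, and the data are retyped from the question on the assumption that every entry of `C` is an uppercase "P" or "A".

```r
# Example data retyped from the question (assumption: all values are "P" or "A")
disease <- c("high", "high", "high", "high", "low", "low", "low", "low")
A <- c("P", "A", "P", "P", "A", "A", "A", "P")
B <- c("P", "A", "A", "P", "A", "P", "A", "P")
C <- c("P", "P", "A", "P", "A", "A", "A", "P")
df <- data.frame(disease, A, B, C)

# Hypothetical helper: the four disease/value counts for one column of df
count_one <- function(col) {
  c(high_P = sum(df$disease == "high" & df[[col]] == "P"),
    high_A = sum(df$disease == "high" & df[[col]] == "A"),
    low_P  = sum(df$disease == "low"  & df[[col]] == "P"),
    low_A  = sum(df$disease == "low"  & df[[col]] == "A"))
}

# Every column except disease; sapply() loops over them and returns one
# column of counts per variable, so t() puts one variable on each row
value_cols <- setdiff(names(df), "disease")
Data <- t(sapply(value_cols, count_one))
Data
```

`Data` is a matrix with one row per column of `df` and the same four count columns as the manual version in the question, so it should scale to wider data frames without any extra code per column.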

This question was asked on Stack Overflow by md hossain and answered by Anoushiravan R. It is licensed under the terms of CC BY-SA 2.5, CC BY-SA 3.0, CC BY-SA 4.0.