Saturday 13 April 2013

Counting Distinct Values in Excel


Very recently some one had posted an excel problem on LinkedIn. The user wanted to get a count of distinct values. Microsoft offers a solution to this need.

={=SUM(IF(FREQUENCY(IF(LEN(A2:A10)>0,MATCH(A2:A10,A2:A10,0),""), IF(LEN(A2:A10)>0,MATCH(A2:A10,A2:A10,0),""))>0,1))}

http://office.microsoft.com/en-in/excel-help/count-occurrences-of-values-or-unique-values-in-a-data-range-HP003056118.aspx

 But the user wanted to count unique values from multiple columns, and the above formula cannot achieve that. (that is because the match function can handle only a single column/row array)

 A Much simpler and more effective formula is

 {=SUM(IFERROR(1/COUNTIF(A$2:A$10,A$2:A$10),""))}

 You could conveniently extend it to multiple rows.

 {=SUM(IFERROR(1/COUNTIF(A$2:C$10,A$2:C$10),""))}

What the formula does is, it counts the number of occurrences and find its reciprocal and add them all up. So, if "Apple" occurs 5 times in your array, the formula adds 1/5 5 times.

Distinct function is one of the most sought-after functions for spreadsheet users from across the world. Microsoft should introduce this function in their future updates. Remove duplicates , feature sometimes simply does not suffice.

Note:
the formulas with flower brackets are array formulas. You need to press CRTL+SHIFT+ENTER to get them to work.
Tip: Enter the formula without flower-brackets  The Cell will read #VALUE!. go back to the cell and hit F2 and then CRTL+SHIFT+ENTER

2 comments:

  1. avoid the CSE combination by doing:
    =SUMPRODUCT(1/COUNTIF(B$2:B$10,B$2:B$10))

    if it contains blanks, then:
    =SUMPRODUCT((A2:A10<>"")/COUNTIF(A$2:A$10,A$2:A$10&""))

    ReplyDelete

Was this useful Information?