Home » excel » How to extract distinct rows in excel?

# How to extract distinct rows in excel?

Questions:

I am attempting to reduce an existing array of cells into a list of the distinct rows. For example, I would like to take an array like this:

``````Ripe  Fruit
Yes   Apple
Yes   Apple
No    Apple
No    Apple
No    Apple
Yes   Orange
Yes   Orange
No    Pear
``````

And reduce it down into this:

``````Ripe  Fruit
Yes   Apple
No    Apple
Yes   Orange
No    Pear
``````

Such that only one copy of each distinct row exists in the new array.

Thus far I have tried to use a mix of the index, countifs, and the small functions, but haven’t been able to produce the result I am looking for.

Edit: I apologize for not clarifying this earlier, but this is part of a system I am trying to build to automate some work. Thus, remove duplicates, while being an excellent one off solution, doesn’t suit my needs here.

As per requests here are some of the solutions I’ve attempted:

The first thing I attempted was to do Countifs such as ‘Countifs(A1:A8, A1, B1:B8, B1) which would yield the new column:

``````Ripe  Fruit   Countif
Yes   Apple   2
Yes   Apple   2
No    Apple   3
No    Apple   3
No    Apple   3
Yes   Orange  2
Yes   Orange  2
No    Pear    1
``````

From there I was debating using the small function in some way, but I ran into an issue where if two distinct rows had the same number of results then the function would not work, so I scrapped the idea.

My next thought was slightly more obtuse; I thought that it might work to convert each of the distinct rows to a number somehow, but I haven’t developed the thought experiment much beyond that.

I suspect somewhere out there is a more efficient formula, but for now that’s all I could develop so… here it goes…

You’d need to introduce a “helper” column that joins both texts using some unique delimiter, e.g. `=TEXTJOIN("||",,A2:B2)`

The next step is to have the following Array function (Ctrl+Shift+Enter) in column G – this will show you the unique values from column C:

``````=IFERROR(INDEX(\$C\$1:\$C\$9,MATCH(0,COUNTIF(\$G\$1:G1,\$C\$1:\$C\$9),0)),"")
``````

Now the job is easy – add `=IF(LEN(G2),LEFT(G2,FIND("||",G2)-1),"")` to column E and `=IF(LEN(G2),RIGHT(G2,LEN(G2)-FIND("||",G2)-1),"")` to column F:

You can hide columns C & G for the presentation purposes.