Home » excel » How to extract distinct rows in excel?

How to extract distinct rows in excel?

Posted by: admin May 14, 2020 Leave a comment

Questions:

I am attempting to reduce an existing array of cells into a list of the distinct rows. For example, I would like to take an array like this:

Ripe  Fruit
Yes   Apple
Yes   Apple
No    Apple
No    Apple
No    Apple
Yes   Orange
Yes   Orange
No    Pear 

And reduce it down into this:

Ripe  Fruit
Yes   Apple
No    Apple
Yes   Orange
No    Pear

Such that only one copy of each distinct row exists in the new array.

Thus far I have tried to use a mix of the index, countifs, and the small functions, but haven’t been able to produce the result I am looking for.

Edit: I apologize for not clarifying this earlier, but this is part of a system I am trying to build to automate some work. Thus, remove duplicates, while being an excellent one off solution, doesn’t suit my needs here.

As per requests here are some of the solutions I’ve attempted:

The first thing I attempted was to do Countifs such as ‘Countifs(A1:A8, A1, B1:B8, B1) which would yield the new column:

Ripe  Fruit   Countif
Yes   Apple   2
Yes   Apple   2
No    Apple   3
No    Apple   3
No    Apple   3
Yes   Orange  2
Yes   Orange  2
No    Pear    1

From there I was debating using the small function in some way, but I ran into an issue where if two distinct rows had the same number of results then the function would not work, so I scrapped the idea.

My next thought was slightly more obtuse; I thought that it might work to convert each of the distinct rows to a number somehow, but I haven’t developed the thought experiment much beyond that.

How to&Answers:

I suspect somewhere out there is a more efficient formula, but for now that’s all I could develop so… here it goes…

You’d need to introduce a “helper” column that joins both texts using some unique delimiter, e.g. =TEXTJOIN("||",,A2:B2)

enter image description here

The next step is to have the following Array function (Ctrl+Shift+Enter) in column G – this will show you the unique values from column C:

=IFERROR(INDEX($C$1:$C$9,MATCH(0,COUNTIF($G$1:G1,$C$1:$C$9),0)),"")

enter image description here

Now the job is easy – add =IF(LEN(G2),LEFT(G2,FIND("||",G2)-1),"") to column E and =IF(LEN(G2),RIGHT(G2,LEN(G2)-FIND("||",G2)-1),"") to column F:

enter image description here

You can hide columns C & G for the presentation purposes.

Answer:

Select all your data.

Then, in the options ribbon, tab DATA, look up for Remove Duplicates:

enter image description here

Click on Accept. You will get this output:

enter image description here

More info:

Find and remove duplicates