Java 8 Filter set based on another set

Posted by: admin, December 28, 2021

Questions:

Using new Java 8 constructs such as streams, is there a way to filter a Set based on another collection while keeping the order of that collection?

Set<Person> persons = new HashSet<>();

persons.add(new Person("A", 23));
persons.add(new Person("B", 27));
persons.add(new Person("C", 20));

List<String> names = new ArrayList<>();
names.add("B");
names.add("A");

I want to filter items from set persons based on names, such that only those persons who have their names specified in names will be retained, but in the order that they appear in names.

So, I want

Set<Person> filteredPersons = ...;

where the first element is Person("B", 27) and the second element is Person("A", 23).

If I do the following,

Set<Person> filteredPersons = new HashSet<>(persons);
filteredPersons = filteredPersons.stream().filter(p -> names.contains(p.getName())).collect(Collectors.toSet());

the order is not guaranteed to be the same as in names, if I am not mistaken.

I know how to achieve this using a simple for loop; I’m just looking for a Java 8 way of doing it.

Thanks for looking!

EDIT:

The for loop that achieves the same result:

Set<Person> filteredPersons = new LinkedHashSet<>();
for (String name : names) {
  for (Person person : persons) {
    if (person.getName().equalsIgnoreCase(name)) {
      filteredPersons.add(person);
      break;
    }
  }
}

LinkedHashSet preserves insertion order, so the result follows the order of names.

Answers:
final Set<Person> persons = ...
Set<Person> filteredPersons = names.stream()
    .flatMap(n -> 
        persons.stream().filter(p -> n.equals(p.getName()))
    )
    .collect(Collectors.toCollection(LinkedHashSet::new));

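For each name, this filters the whole set of persons and flattens the matches into a single stream, so the result follows the order of names. That is fast enough for small inputs like the example provided, but every name triggers a full scan of persons, so the cost grows as O(N*P), where N is the number of names and P is the number of persons.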
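To make the snippet above easier to try out, here is a minimal, self-contained sketch. The Person class is a hypothetical stand-in (the question does not show its definition), and the class and method names FlatMapFilterDemo and filterInOrder are made up for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class FlatMapFilterDemo {

    // Hypothetical minimal Person; the original post does not show this class.
    static final class Person {
        private final String name;
        private final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }

        String getName() { return name; }

        int getAge() { return age; }

        @Override
        public String toString() { return name + "(" + age + ")"; }
    }

    // For each name, in list order, emit every person with exactly that name.
    // The LinkedHashSet keeps the order in which persons are emitted.
    static Set<Person> filterInOrder(Set<Person> persons, List<String> names) {
        return names.stream()
                .flatMap(n -> persons.stream().filter(p -> n.equals(p.getName())))
                .collect(Collectors.toCollection(LinkedHashSet::new));
    }

    public static void main(String[] args) {
        Set<Person> persons = new HashSet<>(Arrays.asList(
                new Person("A", 23), new Person("B", 27), new Person("C", 20)));
        List<String> names = Arrays.asList("B", "A");

        System.out.println(filterInOrder(persons, names)); // prints [B(27), A(23)]
    }
}
```

Note that this sketch compares names with equals, as the answer does, whereas the for loop in the question uses equalsIgnoreCase.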

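For larger collections of persons and names, building an index that looks up a person by name will be faster overall, scaling as O(N+P). This assumes each name is unique among persons, since Collectors.toMap without a merge function throws an IllegalStateException on duplicate keys: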

Map<String, Person> index = persons.stream()
    .collect(Collectors.toMap(Person::getName, Function.identity()));
Set<Person> filteredPersons = names.stream()
    .map(index::get)
    .filter(Objects::nonNull)
    .collect(Collectors.toCollection(LinkedHashSet::new));
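A runnable sketch of the index approach might look like the following. The Person class and the names IndexFilterDemo and filterInOrder are hypothetical. The merge function passed to Collectors.toMap is an addition that is not in the answer: it keeps whichever person with a given name is encountered first rather than throwing on duplicates, which may or may not be the behavior you want.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;

public class IndexFilterDemo {

    // Hypothetical minimal Person; the original post does not show this class.
    static final class Person {
        private final String name;
        private final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }

        String getName() { return name; }

        @Override
        public String toString() { return name + "(" + age + ")"; }
    }

    static Set<Person> filterInOrder(Set<Person> persons, List<String> names) {
        // One pass over persons builds the name -> person index.
        // Assumption: on duplicate names, keep the first person encountered.
        Map<String, Person> index = persons.stream()
                .collect(Collectors.toMap(
                        Person::getName, Function.identity(), (first, second) -> first));

        // One pass over names; names with no matching person map to null and are dropped.
        return names.stream()
                .map(index::get)
                .filter(Objects::nonNull)
                .collect(Collectors.toCollection(LinkedHashSet::new));
    }

    public static void main(String[] args) {
        Set<Person> persons = new HashSet<>(Arrays.asList(
                new Person("A", 23), new Person("B", 27), new Person("C", 20)));
        List<String> names = Arrays.asList("B", "X", "A");

        System.out.println(filterInOrder(persons, names)); // prints [B(27), A(23)]
    }
}
```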

###

I am open to changing the set implementation, for example to a
LinkedHashSet as shown in the example above to achieve the final goal.

If you are open to changing the data structure used to store the persons, you should probably consider a Map keyed by name. It makes the algorithm much more efficient, because each name becomes a single lookup instead of a scan over all persons.

Map<String, Person> persons = new HashMap<>();

persons.put("A", new Person("A", 23));
persons.put("B", new Person("B", 27));
persons.put("C", new Person("C", 20));

List<String> names = new ArrayList<>();
names.add("B");
names.add("A");

List<Person> filteredPersons = names.stream()
        .map(persons::get)
        .filter(Objects::nonNull)
        .collect(Collectors.toList());

If the letter case may differ between persons and names, normalize both sides, for example by calling .toLowerCase() on the keys of the Map and on each name before the lookup.
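As a rough illustration of the case-insensitive variant, the sketch below lower-cases both the Map keys and the names being looked up. The Person class, the key helper, and the other names are hypothetical, and using Locale.ROOT is an assumption made here to keep the lower-casing independent of the default locale.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Objects;
import java.util.function.Function;
import java.util.stream.Collectors;

public class CaseInsensitiveLookupDemo {

    // Hypothetical minimal Person; the original post does not show this class.
    static final class Person {
        private final String name;
        private final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }

        String getName() { return name; }

        @Override
        public String toString() { return name + "(" + age + ")"; }
    }

    // Hypothetical helper: the single place where names are normalized.
    static String key(String name) {
        return name.toLowerCase(Locale.ROOT);
    }

    // Index persons by normalized name; on a clash, keep the first one seen (an assumption).
    static Map<String, Person> indexByName(Collection<Person> persons) {
        return persons.stream()
                .collect(Collectors.toMap(
                        p -> key(p.getName()), Function.identity(), (first, second) -> first));
    }

    // Look up each name, normalized the same way, in the order given by names.
    static List<Person> lookup(Map<String, Person> index, List<String> names) {
        return names.stream()
                .map(n -> index.get(key(n)))
                .filter(Objects::nonNull)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, Person> index = indexByName(Arrays.asList(
                new Person("a", 23), new Person("B", 27), new Person("C", 20)));

        System.out.println(lookup(index, Arrays.asList("b", "A"))); // prints [B(27), a(23)]
    }
}
```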

###

You could use something like the following (not tested), applied to the filtered stream of persons:

.sorted((p1, p2) ->
        Integer.compare(names.indexOf(p1.getName()),
                        names.indexOf(p2.getName())))

And, as mentioned above, collect to a List instead of a Set, since Collectors.toSet() makes no guarantee about iteration order.


As mentioned in a comment by Alexis, you can also write this more concisely:

.sorted(comparingInt(p -> names.indexOf(p.getName())))

Here comparingInt is available through a static import of java.util.Comparator.comparingInt.
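Since the sorted fragments above are not complete pipelines, here is one way they might fit together, as a sketch only. The Person class and the names SortByNameOrderDemo and filterAndSort are hypothetical. The filter step matters: names.indexOf returns -1 for a name that is not in the list, so unfiltered persons would sort to the front. Both contains and indexOf scan the list, so this is best suited to short lists of names.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class SortByNameOrderDemo {

    // Hypothetical minimal Person; the original post does not show this class.
    static final class Person {
        private final String name;
        private final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }

        String getName() { return name; }

        @Override
        public String toString() { return name + "(" + age + ")"; }
    }

    static List<Person> filterAndSort(Set<Person> persons, List<String> names) {
        return persons.stream()
                // Keep only persons whose name is listed.
                .filter(p -> names.contains(p.getName()))
                // Order the survivors by the position of their name in the list.
                .sorted(Comparator.comparingInt(p -> names.indexOf(p.getName())))
                // Collect to a List so the sorted order is kept.
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Set<Person> persons = new HashSet<>(Arrays.asList(
                new Person("A", 23), new Person("B", 27), new Person("C", 20)));
        List<String> names = Arrays.asList("B", "A");

        System.out.println(filterAndSort(persons, names)); // prints [B(27), A(23)]
    }
}
```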