Home » Java » Java 8 Streams: multiple filters vs. complex condition

Java 8 Streams: multiple filters vs. complex condition

Posted by: admin November 30, 2017 Leave a comment

Questions:

Sometimes you want to filter a Stream with more than one condition:

myList.stream().filter(x -> x.size() > 10).filter(x -> x.isCool()) ...

or you could do the same with a complex condition and a single filter:

myList.stream().filter(x -> x.size() > 10 && x -> x.isCool()) ...

My guess is that the second approach has better performance characteristics, but I don’t know it.

The first approach wins in readability, but what is better for the performance?

Answers:

The code that has to be executed for both alternatives is so similar that you can’t predict a result reliably. The underlying object structure might differ but that’s no challenge to the hotspot optimizer. So it depends on other surrounding conditions which will yield to a faster execution, if there is any difference.

Combining two filter instances creates more objects and hence more delegating code but this can change if you use method references rather than lambda expressions, e.g. replace filter(x -> x.isCool()) by filter(ItemType::isCool). That way you have eliminated the synthetic delegating method created for your lambda expression. So combining two filters using two method references might create the same or lesser delegation code than a single filter invocation using a lambda expression with &&.

But, as said, this kind of overhead will be eliminated by the HotSpot optimizer and is negligible.

In theory, two filters could be easier parallelized than a single filter but that’s only relevant for rather computational intense tasks.

So there is no simple answer.

The bottom line is, don’t think about such performance differences below the odor detection threshold. Use what is more readable.

Questions:
Answers:

In a Set, i have 10 thousand references to a distinct objects. Then i did the “speed test”:

long time1 = System.currentTimeMillis();    
users.stream()
     .filter((u) -> u.getGender() == Gender.FEMALE && u.getAge() % 2 == 0);
long time2 = System.currentTimeMillis();
System.out.printf("%d milli seconds", time2 - time1);

After 10 executions the average time was 45,5 milliseconds. Next, i use multi filter():

long time1 = System.currentTimeMillis();    
users.stream()
     .filter((u) -> u.getGender() == Gender.FEMALE)
     .filter((u) -> u.getAge() % 2 == 0);
long time2 = System.currentTimeMillis();
System.out.printf("%d milli seconds", time2 - time1);

The average time was 44,7 milliseconds.

Of course this is a test with simple operations, but i believe there is no difference, because, Stream uses internal iteration, unlike “for” loopings that uses external iteration, in other words, the looping happens only once and after completed the result is returned. It’s likely that there is a difference in time that the compiler will take to read the filters, which is insignificant.

Questions:
Answers:

This test shows that your second option can perform significantly better. Findings first, then the code:

one filter with predicate of form u -> exp1 && exp2, list size 10000000, averaged over 100 runs: LongSummaryStatistics{count=100, sum=4142, min=29, average=41.420000, max=82}
two filters with predicates of form u -> exp1, list size 10000000, averaged over 100 runs: LongSummaryStatistics{count=100, sum=13315, min=117, average=133.150000, max=153}
one filter with predicate of form predOne.and(pred2), list size 10000000, averaged over 100 runs: LongSummaryStatistics{count=100, sum=10320, min=82, average=103.200000, max=127}

now the code:

enum Gender {
    FEMALE,
    MALE
}

static class User {
    Gender gender;
    int age;

    public User(Gender gender, int age){
        this.gender = gender;
        this.age = age;
    }

    public Gender getGender() {
        return gender;
    }

    public void setGender(Gender gender) {
        this.gender = gender;
    }

    public int getAge() {
        return age;
    }

    public void setAge(int age) {
        this.age = age;
    }
}

static long test1(List<User> users){
    long time1 = System.currentTimeMillis();
    users.stream()
            .filter((u) -> u.getGender() == Gender.FEMALE && u.getAge() % 2 == 0)
            .allMatch(u -> true);                   // least overhead terminal function I can think of
    long time2 = System.currentTimeMillis();
    return time2 - time1;
}

static long test2(List<User> users){
    long time1 = System.currentTimeMillis();
    users.stream()
            .filter(u -> u.getGender() == Gender.FEMALE)
            .filter(u -> u.getAge() % 2 == 0)
            .allMatch(u -> true);                   // least overhead terminal function I can think of
    long time2 = System.currentTimeMillis();
    return time2 - time1;
}

static long test3(List<User> users){
    long time1 = System.currentTimeMillis();
    users.stream()
            .filter(((Predicate<User>) u -> u.getGender() == Gender.FEMALE).and(u -> u.getAge() % 2 == 0))
            .allMatch(u -> true);                   // least overhead terminal function I can think of
    long time2 = System.currentTimeMillis();
    return time2 - time1;
}

public static void main(String... args) {
    int size = 10000000;
    List<User> users =
    IntStream.range(0,size)
            .mapToObj(i -> i % 2 == 0 ? new User(Gender.MALE, i % 100) : new User(Gender.FEMALE, i % 100))
            .collect(Collectors.toCollection(()->new ArrayList<>(size)));
    repeat("one filter with predicate of form u -> exp1 && exp2", users, Temp::test1, 100);
    repeat("two filters with predicates of form u -> exp1", users, Temp::test2, 100);
    repeat("one filter with predicate of form predOne.and(pred2)", users, Temp::test3, 100);
}

private static void repeat(String name, List<User> users, ToLongFunction<List<User>> test, int iterations) {
    System.out.println(name + ", list size " + users.size() + ", averaged over " + iterations + " runs: " + IntStream.range(0, iterations)
            .mapToLong(i -> test.applyAsLong(users))
            .summaryStatistics());
}