What is the best way to filter a Java Collection?

Posted by: admin November 2, 2017

Questions:

I want to filter a java.util.Collection based on a predicate.

Answers:

Java 8 (2014) solves this problem using streams and lambdas in one line of code:

List<Person> beerDrinkers = persons.stream()
    .filter(p -> p.getAge() > 16).collect(Collectors.toList());

Here’s a tutorial.

Use Collection#removeIf to modify the collection in place (provided it supports element removal):

persons.removeIf(p -> p.getAge() <= 16);
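
The difference between the two approaches is whether the source collection is touched. Here is a minimal sketch, assuming a hypothetical Person class with nothing but an age (the answer does not show the real one). It also shows what "provided it supports element removal" means in practice: a read-only view throws instead of removing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

final class FilterVsRemoveIf {
    // Hypothetical stand-in for the Person type used in the answer.
    static final class Person {
        private final int age;
        Person(int age) { this.age = age; }
        int getAge() { return age; }
    }

    // Returns a new list; the input list is left untouched.
    static List<Person> olderThan16(List<Person> persons) {
        return persons.stream()
                .filter(p -> p.getAge() > 16)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Person> persons = new ArrayList<>(
                Arrays.asList(new Person(12), new Person(17), new Person(30)));

        List<Person> copy = olderThan16(persons);
        System.out.println(copy.size() + " of " + persons.size()); // 2 of 3

        // In place: remove everyone who does NOT match the condition.
        persons.removeIf(p -> p.getAge() <= 16);
        System.out.println(persons.size()); // 2

        // A read-only view rejects removal.
        List<Person> readOnly = Collections.unmodifiableList(persons);
        try {
            readOnly.removeIf(p -> p.getAge() <= 16);
        } catch (UnsupportedOperationException e) {
            System.out.println("removal not supported");
        }
    }
}
```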

lambdaj allows filtering collections without writing loops or inner classes:

List<Person> beerDrinkers = select(persons, having(on(Person.class).getAge(),
    greaterThan(16)));

Can you imagine something more readable?

Disclaimer: I am a contributor on lambdaj

Questions:
Answers:

Assuming that you are using Java 1.5, and that you cannot add Google Collections, I would do something very similar to what the Google guys did. This is a slight variation on Jon’s comments.

First add this interface to your codebase.

public interface IPredicate<T> { boolean apply(T type); }

Its implementers can answer when a certain predicate is true of a certain type. E.g. If T were User and AuthorizedUserPredicate<User> implements IPredicate<T>, then AuthorizedUserPredicate#apply returns whether the passed in User is authorized.

Then in some utility class, you could say

public static <T> Collection<T> filter(Collection<T> target, IPredicate<T> predicate) {
    Collection<T> result = new ArrayList<T>();
    for (T element: target) {
        if (predicate.apply(element)) {
            result.add(element);
        }
    }
    return result;
}

So the use of the above might be

IPredicate<User> isAuthorized = new IPredicate<User>() {
    public boolean apply(User user) {
        // binds a boolean method in User to a reference
        return user.isAuthorized();
    }
};
// allUsers is a Collection<User>
Collection<User> authorizedUsers = filter(allUsers, isAuthorized);

If performance on the linear check is of concern, then I might want to have a domain object that has the target collection. The domain object that has the target collection would have filtering logic for the methods that initialize, add and set the target collection.
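
One way to read the domain-object idea is sketched below. The class name FilteredRegistry and its methods are hypothetical, and the sketch uses java.util.function.Predicate from Java 8 rather than the IPredicate interface above, purely to stay short. The predicate is applied when elements go in, so reading the collection needs no filtering pass.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical domain object: it owns the collection and filters on the way in.
final class FilteredRegistry<T> {
    private final Predicate<T> accept;
    private final List<T> items = new ArrayList<>();

    FilteredRegistry(Predicate<T> accept) {
        this.accept = accept;
    }

    // Adds the element only if it satisfies the predicate.
    boolean add(T element) {
        return accept.test(element) && items.add(element);
    }

    // Replaces the contents, keeping only accepted elements.
    void setAll(Collection<? extends T> source) {
        items.clear();
        for (T element : source) {
            add(element);
        }
    }

    // Read-only view of the already-filtered elements.
    List<T> items() {
        return Collections.unmodifiableList(items);
    }

    public static void main(String[] args) {
        FilteredRegistry<Integer> positives = new FilteredRegistry<Integer>(n -> n > 0);
        positives.add(-1);
        positives.add(5);
        System.out.println(positives.items()); // [5]
    }
}
```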

UPDATE:

In the utility class (let’s say Predicate), I have added a select method with an option for default value when the predicate doesn’t return the expected value, and also a static property for params to be used inside the new IPredicate.

public class Predicate {
    public static Object predicateParams;

    public static <T> Collection<T> filter(Collection<T> target, IPredicate<T> predicate) {
        Collection<T> result = new ArrayList<T>();
        for (T element : target) {
            if (predicate.apply(element)) {
                result.add(element);
            }
        }
        return result;
    }

    public static <T> T select(Collection<T> target, IPredicate<T> predicate) {
        T result = null;
        for (T element : target) {
            if (!predicate.apply(element))
                continue;
            result = element;
            break;
        }
        return result;
    }

    public static <T> T select(Collection<T> target, IPredicate<T> predicate, T defaultValue) {
        T result = defaultValue;
        for (T element : target) {
            if (!predicate.apply(element))
                continue;
            result = element;
            break;
        }
        return result;
    }
}

The following example looks for missing objects between collections:

List<MyTypeA> missingObjects = (List<MyTypeA>) Predicate.filter(myCollectionOfA,
    new IPredicate<MyTypeA>() {
        public boolean apply(MyTypeA objectOfA) {
            Predicate.predicateParams = objectOfA.getName();
            return Predicate.select(myCollectionB, new IPredicate<MyTypeB>() {
                public boolean apply(MyTypeB objectOfB) {
                    return objectOfB.getName().equals(Predicate.predicateParams.toString());
                }
            }) == null;
        }
    });

The following example looks for an instance in a collection, and returns the first element of the collection as the default value when the instance is not found:

MyType myObject = Predicate.select(collectionOfMyType, new IPredicate<MyType>() {
    public boolean apply(MyType objectOfMyType) {
        return objectOfMyType.isDefault();
    }
}, collectionOfMyType.get(0));

UPDATE (after Java 8 release):

It’s been several years since I (Alan) first posted this answer, and I still cannot believe I am collecting SO points for this answer. At any rate, now that Java 8 has introduced closures to the language, my answer would now be considerably different, and simpler. With Java 8, there is no need for a distinct static utility class. So if you want to find the first element that matches your predicate:

final UserService userService = ... // perhaps injected IoC
final Optional<UserModel> userOption = userCollection.stream().filter(u -> {
    boolean isAuthorized = userService.isAuthorized(u);
    return isAuthorized;
}).findFirst();

The JDK 8 API for optionals has the ability to get(), isPresent(), orElse(defaultUser), orElseGet(userSupplier) and orElseThrow(exceptionSupplier), as well as other ‘monadic’ functions such as map, flatMap and filter.
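
A small sketch of several of those Optional methods, using plain strings instead of the UserModel type above. The class and method names here are made up for the illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

final class OptionalDemo {
    // First name longer than minLength, if there is one.
    static Optional<String> firstLongerThan(List<String> names, int minLength) {
        return names.stream().filter(n -> n.length() > minLength).findFirst();
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Al", "Alan", "Barbara");

        Optional<String> hit = firstLongerThan(names, 3);
        Optional<String> miss = firstLongerThan(names, 10);

        System.out.println(hit.isPresent());                   // true
        System.out.println(hit.get());                         // Alan
        System.out.println(miss.orElse("nobody"));             // nobody
        System.out.println(miss.orElseGet(() -> "n/a"));       // n/a
        System.out.println(hit.map(String::length).orElse(0)); // 4
        System.out.println(hit.filter(n -> n.startsWith("B")).isPresent()); // false
    }
}
```

Calling get() on an empty Optional throws NoSuchElementException, so orElse and its relatives are usually the safer choice when a miss is possible.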

If you want to simply collect all the users which match the predicate, then use the Collectors to terminate the stream in the desired collection.

final UserService userService = ... // perhaps injected IoC
final List<UserModel> userOption = userCollection.stream().filter(u -> {
    boolean isAuthorized = userService.isAuthorized(u);
    return isAuthorized;
}).collect(Collectors.toList());

See here for more examples on how Java 8 streams work.

Questions:
Answers:

Use CollectionUtils.filter(Collection,Predicate), from Apache Commons.

Questions:
Answers:

Consider Google Collections for an updated Collections framework that supports generics.

UPDATE: The google collections library is now deprecated. You should use the latest release of Guava instead. It still has all the same extensions to the collections framework including a mechanism for filtering based on a predicate.

Questions:
Answers:

“Best” way is too wide a request. Is it “shortest”? “Fastest”? “Readable”?
Filter in place or into another collection?

Simplest (but not most readable) way is to iterate it and use Iterator.remove() method:

Iterator<Foo> it = col.iterator();
while( it.hasNext() ) {
  Foo foo = it.next();
  if( !condition(foo) ) it.remove();
}

Now, to make it more readable, you can wrap it into a utility method. Then invent an IPredicate interface, create an anonymous implementation of that interface and do something like:

CollectionUtils.filterInPlace(col,
  new IPredicate<Foo>(){
    public boolean keepIt(Foo foo) {
      return foo.isBar();
    }
  });

where filterInPlace() iterates over the collection and calls IPredicate.keepIt() to learn whether the instance is to be kept in the collection.
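
The answer does not show the utility itself. A possible sketch follows, with the interface and method written the way the description suggests; the holder class name InPlaceFilter is an assumption, and the code sticks to pre-Java 8 syntax.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;

final class InPlaceFilter {
    // Hypothetical interface matching the keepIt() call described above.
    interface IPredicate<T> {
        boolean keepIt(T item);
    }

    // Removes every element the predicate does not want to keep.
    static <T> void filterInPlace(Collection<T> col, IPredicate<? super T> predicate) {
        for (Iterator<T> it = col.iterator(); it.hasNext();) {
            if (!predicate.keepIt(it.next())) {
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<String>(Arrays.asList("bar", "foo", "baz"));
        filterInPlace(words, new IPredicate<String>() {
            public boolean keepIt(String word) {
                return word.startsWith("ba");
            }
        });
        System.out.println(words); // [bar, baz]
    }
}
```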

I don’t really see a justification for bringing in a third-party library just for this task.

Questions:
Answers:

Wait for Java 8:

List<Person> olderThan30 = 
  //Create a Stream from the personList
  personList.stream().
  //filter the element to select only those with age >= 30
  filter(p -> p.age >= 30).
  //put those filtered elements into a new List.
  collect(Collectors.toList());

Questions:
Answers:

Since the early release of Java 8, you could try something like:

Collection<T> collection = ...;
Stream<T> stream = collection.stream().filter(...);

For example, if you had a list of integers and you wanted to filter the numbers that are > 10 and then print out those numbers to the console, you could do something like:

List<Integer> numbers = Arrays.asList(12, 74, 5, 8, 16);
numbers.stream().filter(n -> n > 10).forEach(System.out::println);

Questions:
Answers:

I’ll throw RxJava in the ring, which is also available on Android. RxJava might not always be the best option, but it will give you more flexibility if you wish to add more transformations on your collection or handle errors while filtering.

Observable.from(Arrays.asList(1, 2, 3, 4, 5))
    .filter(new Func1<Integer, Boolean>() {
        public Boolean call(Integer i) {
            return i % 2 != 0;
        }
    })
    .subscribe(new Action1<Integer>() {
        public void call(Integer i) {
            System.out.println(i);
        }
    });

Output:

1
3
5

More details on RxJava’s filter can be found here.

Questions:
Answers:

The setup:

public interface Predicate<T> {
  public boolean filter(T t);
}

<T> void filterCollection(Collection<T> col, Predicate<T> predicate) {
  for (Iterator<T> i = col.iterator(); i.hasNext();) {
    T obj = i.next();
    if (predicate.filter(obj)) {
      i.remove();
    }
  }
}

The usage:

List<MyObject> myList = ...;
filterCollection(myList, new Predicate<MyObject>() {
  public boolean filter(MyObject obj) {
    return obj.shouldFilter();
  }
});

Questions:
Answers:

Are you sure you want to filter the Collection itself, rather than an iterator?

see org.apache.commons.collections.iterators.FilterIterator

or using version 4 of apache commons org.apache.commons.collections4.iterators.FilterIterator
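
To illustrate the idea of filtering an iterator rather than the collection, here is a JDK-only sketch. It is not the Apache Commons implementation; the class FilteringIterator is hypothetical and relies on java.util.function.Predicate from Java 8. The underlying collection is never copied or modified.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

// Hypothetical filtering iterator: wraps another iterator and skips non-matches.
final class FilteringIterator<T> implements Iterator<T> {
    private final Iterator<T> source;
    private final Predicate<? super T> predicate;
    private T next;
    private boolean hasNext;

    FilteringIterator(Iterator<T> source, Predicate<? super T> predicate) {
        this.source = source;
        this.predicate = predicate;
        advance(); // look ahead for the first match
    }

    // Moves to the next matching element, if there is one.
    private void advance() {
        hasNext = false;
        while (source.hasNext()) {
            T candidate = source.next();
            if (predicate.test(candidate)) {
                next = candidate;
                hasNext = true;
                return;
            }
        }
    }

    @Override
    public boolean hasNext() {
        return hasNext;
    }

    @Override
    public T next() {
        if (!hasNext) {
            throw new NoSuchElementException();
        }
        T result = next;
        advance();
        return result;
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(12, 74, 5, 8, 16);
        Iterator<Integer> it =
                new FilteringIterator<Integer>(numbers.iterator(), n -> n > 10);
        while (it.hasNext()) {
            System.out.println(it.next()); // 12, then 74, then 16
        }
    }
}
```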

Questions:
Answers:

Let’s look at how to filter a built-in JDK List and a MutableList using Eclipse Collections (formerly GS Collections).

List<Integer> jdkList = Arrays.asList(1, 2, 3, 4, 5);
MutableList<Integer> ecList = Lists.mutable.with(1, 2, 3, 4, 5);

If you wanted to filter the numbers less than 3, you would expect the following outputs.

List<Integer> selected = Lists.mutable.with(1, 2);
List<Integer> rejected = Lists.mutable.with(3, 4, 5);

Here’s how you can filter using an anonymous inner class as the Predicate.

Predicate<Integer> lessThan3 = new Predicate<Integer>()
{
    public boolean accept(Integer each)
    {
        return each < 3;
    }
};

Assert.assertEquals(selected, Iterate.select(jdkList, lessThan3));

Assert.assertEquals(selected, ecList.select(lessThan3));

Here are some alternatives to filtering JDK lists and Eclipse Collections MutableLists using the Predicates factory.

Assert.assertEquals(selected, Iterate.select(jdkList, Predicates.lessThan(3)));

Assert.assertEquals(selected, ecList.select(Predicates.lessThan(3)));

Here is a version that doesn’t allocate an object for the predicate, by using the Predicates2 factory instead with the selectWith method that takes a Predicate2.

Assert.assertEquals(
    selected, ecList.selectWith(Predicates2.<Integer>lessThan(), 3));

Sometimes you want to filter on a negative condition. There is a special method in Eclipse Collections for that called reject.

Assert.assertEquals(rejected, Iterate.reject(jdkList, lessThan3));

Assert.assertEquals(rejected, ecList.reject(lessThan3));

Here’s how you can filter using a Java 8 lambda as the Predicate.

Assert.assertEquals(selected, Iterate.select(jdkList, each -> each < 3));
Assert.assertEquals(rejected, Iterate.reject(jdkList, each -> each < 3));

Assert.assertEquals(selected, ecList.select(each -> each < 3));
Assert.assertEquals(rejected, ecList.reject(each -> each < 3));

The method partition will return two collections, containing the elements selected by and rejected by the Predicate.

PartitionIterable<Integer> jdkPartitioned = Iterate.partition(jdkList, lessThan3);
Assert.assertEquals(selected, jdkPartitioned.getSelected());
Assert.assertEquals(rejected, jdkPartitioned.getRejected());

PartitionList<Integer> ecPartitioned = ecList.partition(lessThan3);
Assert.assertEquals(selected, ecPartitioned.getSelected());
Assert.assertEquals(rejected, ecPartitioned.getRejected());
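
For comparison only, and not part of Eclipse Collections: with nothing but the JDK 8 stream API, a similar split can be sketched with Collectors.partitioningBy. The class and method names below are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class PartitionDemo {
    // Key true holds the elements below the limit, key false holds the rest.
    static Map<Boolean, List<Integer>> partitionBelow(List<Integer> numbers, int limit) {
        return numbers.stream().collect(Collectors.partitioningBy(n -> n < limit));
    }

    public static void main(String[] args) {
        Map<Boolean, List<Integer>> parts = partitionBelow(Arrays.asList(1, 2, 3, 4, 5), 3);
        System.out.println(parts.get(true));  // [1, 2]
        System.out.println(parts.get(false)); // [3, 4, 5]
    }
}
```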

Note: I am a committer for Eclipse Collections.

Questions:
Answers:

With the ForEach DSL you may write

import static ch.akuhn.util.query.Query.select;
import static ch.akuhn.util.query.Query.$result;
import ch.akuhn.util.query.Select;

Collection<String> collection = ...

for (Select<String> each : select(collection)) {
    each.yield = each.value.length() > 3;
}

Collection<String> result = $result();

Given a collection of [The, quick, brown, fox, jumps, over, the, lazy, dog] this results in [quick, brown, jumps, over, lazy], i.e. all strings longer than three characters.

All iteration styles supported by the ForEach DSL are

  • AllSatisfy
  • AnySatisfy
  • Collect
  • Count
  • CutPieces
  • Detect
  • GroupedBy
  • IndexOf
  • InjectInto
  • Reject
  • Select

For more details, please refer to https://www.iam.unibe.ch/scg/svn_repos/Sources/ForEach

Questions:
Answers:

The Collections2.filter(Collection,Predicate) method in Google’s Guava library does just what you’re looking for.

Questions:
Answers:

How about some plain and straightforward Java

 List<Customer> list = ...;
 List<Customer> newList = new ArrayList<>();
 for (Customer c : list){
    if (c.getName().equals("dd")) newList.add(c);
 }

Simple, readable and easy (and works in Android!)
But if you’re using Java 8 you can do it in a sweet one line:

List<Customer> newList = list.stream().filter(c -> c.getName().equals("dd")).collect(toList());

Note that toList() is statically imported

Questions:
Answers:

This, combined with the lack of real closures, is my biggest gripe with Java.
Honestly, most of the methods mentioned above are pretty easy to read and REALLY efficient; however, after spending time with .Net, Erlang, etc… list comprehension integrated at the language level makes everything so much cleaner. Without additions at the language level, Java just can’t be as clean as many other languages in this area.

If performance is a huge concern, Google collections is the way to go (or write your own simple predicate utility). Lambdaj syntax is more readable for some people, but it is not quite as efficient.

And then there is a library I wrote. I will ignore any questions in regard to its efficiency (yeah, it’s that bad). Yes, I know it’s clearly reflection-based, and no, I don’t actually use it, but it does work:

LinkedList<Person> list = ......
LinkedList<Person> filtered = 
           Query.from(list).where(Condition.ensure("age", Op.GTE, 21));

OR

LinkedList<Person> list = ....
LinkedList<Person> filtered = Query.from(list).where("x => x.age >= 21");

Questions:
Answers:

JFilter http://code.google.com/p/jfilter/ is best suited for your requirement.

JFilter is a simple and high performance open source library to query collection of Java beans.

Key features

  • Support of collection (java.util.Collection, java.util.Map and Array) properties.
  • Support of collection inside collection of any depth.
  • Support of inner queries.
  • Support of parameterized queries.
  • Can filter 1 million records in a few hundred ms.
  • The filter (query) is given in a simple JSON format, similar to MongoDB queries. Following are some examples.
  • { "id": {"$le": "10"}}
    • where the object id property is less than or equal to 10.
  • { "id": {"$in": ["0", "100"]}}
    • where the object id property is 0 or 100.
  • {"lineItems": {"lineAmount": "1"}}
    • where the lineItems collection property of parameterized type has lineAmount equal to 1.
  • { "$and": [{"id": "0"}, {"billingAddress": {"city": "DEL"}}]}
    • where the id property is 0 and the billingAddress.city property is DEL.
  • {"lineItems": {"taxes": { "key": {"code": "GST"}, "value": {"$gt": "1.01"}}}}
    • where the lineItems collection property of parameterized type has a taxes map property of parameterized type whose key has code equal to GST and whose value is greater than 1.01.
  • {'$or':[{'code':'10'},{'skus': {'$and':[{'price':{'$in':['20', '40']}}, {'code':'RedApple'}]}}]}
    • Select all products where the product code is 10, or the sku price is in 20 and 40 and the sku code is "RedApple".
Questions:
Answers:

I wrote an extended Iterable class that supports applying functional algorithms without copying the collection content.

Usage:

List<Integer> myList = new ArrayList<Integer>(Arrays.asList(1, 2, 3, 4, 5));

Iterable<Integer> filtered = Iterable.wrap(myList).select(new Predicate1<Integer>()
{
    public Boolean call(Integer n) throws FunctionalException
    {
        return n % 2 == 0;
    }
});

for( int n : filtered )
{
    System.out.println(n);
}

The code above will actually execute

for( int n : myList )
{
    if( n % 2 == 0 ) 
    {
        System.out.println(n);
    }
}

Questions:
Answers:

Use Collection Query Engine (CQEngine). It is by far the fastest way to do this.

See also: How do you query object collections in Java (Criteria/SQL-like)?

Questions:
Answers:

The simple pre-Java8 solution:

ArrayList<Item> filtered = new ArrayList<Item>(); 
for (Item item : items) if (condition(item)) filtered.add(item);

Unfortunately this solution isn’t fully generic, outputting a list rather than the type of the given collection. Also, bringing in libraries or writing functions that wrap this code seems like overkill to me unless the condition is complex, but then you can write a function for the condition.
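
If the result type matters, one option is to let the caller supply the target collection. The sketch below is a Java 8 variation on the loop above, not something the answer proposes; GenericFilter and its filter method are hypothetical names.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;
import java.util.function.Predicate;
import java.util.function.Supplier;

final class GenericFilter {
    // The caller chooses the result type by passing a factory for it.
    static <T, C extends Collection<T>> C filter(
            Iterable<? extends T> items,
            Predicate<? super T> condition,
            Supplier<C> factory) {
        C filtered = factory.get();
        for (T item : items) {
            if (condition.test(item)) {
                filtered.add(item);
            }
        }
        return filtered;
    }

    public static void main(String[] args) {
        List<Integer> items = Arrays.asList(5, 12, 8, 74, 12);
        // A TreeSet result is sorted and drops the duplicate 12.
        TreeSet<Integer> big = filter(items, n -> n > 10, TreeSet::new);
        System.out.println(big); // [12, 74]
    }
}
```

With streams, Collectors.toCollection(TreeSet::new) plays the same role as the factory argument here.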

Questions:
Answers:

https://code.google.com/p/joquery/

Supports different possibilities,

Given collection,

Collection<Dto> testList = new ArrayList<>();

of type,

class Dto
{
    private int id;
    private String text;

    public int getId()
    {
        return id;
    }

    public String getText()
    {
        return text;
    }
}

Filter

Java 7

Filter<Dto> query = CQ.<Dto>filter(testList)
    .where()
    .property("id").eq().value(1);
Collection<Dto> filtered = query.list();

Java 8

Filter<Dto> query = CQ.<Dto>filter(testList)
    .where()
    .property(Dto::getId)
    .eq().value(1);
Collection<Dto> filtered = query.list();

Also,

Filter<Dto> query = CQ.<Dto>filter()
        .from(testList)
        .where()
        .property(Dto::getId).between().value(1).value(2)
        .and()
        .property(Dto::getText).in().value(new String[]{"a","b"});

Sorting (also available for Java 7)

Filter<Dto> query = CQ.<Dto>filter(testList)
        .orderBy()
        .property(Dto::getId)
        .property(Dto::getText);
Collection<Dto> sorted = query.list();

Grouping (also available for Java 7)

GroupQuery<Integer,Dto> query = CQ.<Dto,Dto>query(testList)
        .group()
        .groupBy(Dto::getId);
Collection<Grouping<Integer,Dto>> grouped = query.list();

Joins (also available for Java 7)

Given,

class LeftDto
{
    private int id;
    private String text;

    public int getId()
    {
        return id;
    }

    public String getText()
    {
        return text;
    }
}

class RightDto
{
    private int id;
    private int leftId;
    private String text;

    public int getId()
    {
        return id;
    }

    public int getLeftId()
    {
        return leftId;
    }

    public String getText()
    {
        return text;
    }
}

class JoinedDto
{
    private int leftId;
    private int rightId;
    private String text;

    public JoinedDto(int leftId,int rightId,String text)
    {
        this.leftId = leftId;
        this.rightId = rightId;
        this.text = text;
    }

    public int getLeftId()
    {
        return leftId;
    }

    public int getRightId()
    {
        return rightId;
    }

    public String getText()
    {
        return text;
    }
}

Collection<LeftDto> leftList = new ArrayList<>();

Collection<RightDto> rightList = new ArrayList<>();

Can be Joined like,

Collection<JoinedDto> results = CQ.<LeftDto, LeftDto>query().from(leftList)
                .<RightDto, JoinedDto>innerJoin(CQ.<RightDto, RightDto>query().from(rightList))
                .on(LeftDto::getId, RightDto::getLeftId)
                .transformDirect(selection ->  new JoinedDto(selection.getLeft().getId()
                                                     , selection.getRight().getId()
                                                     , selection.getLeft().getText())
                                 )
                .list();

Expressions

Filter<Dto> query = CQ.<Dto>filter()
    .from(testList)
    .where()
    .exec(s -> s.getId() + 1).eq().value(2);

Questions:
Answers:

Some really great answers here. Me, I’d like to keep things as simple and readable as possible:

public abstract class AbstractFilter<T> {

/**
 * Method that returns whether an item is to be excluded or not.
 * @param item an item from the given collection.
 * @return true if this item is to be removed from the collection, false if it is to be kept.
 */
protected abstract boolean excludeItem(T item);

public void filter(Collection<T> collection) {
    if (CollectionUtils.isNotEmpty(collection)) {
        Iterator<T> iterator = collection.iterator();
        while (iterator.hasNext()) {
            if (excludeItem(iterator.next())) {
                iterator.remove();
            }
        }
    }
}

}

Questions:
Answers:

My answer builds on that from Kevin Wong, here as a one-liner using CollectionUtils from spring and a Java 8 lambda expression.

CollectionUtils.filter(list, p -> ((Person) p).getAge() > 16);

This is as concise and readable as any alternative I have seen (without using aspect-based libraries)

Spring CollectionUtils is available from spring version 4.0.2.RELEASE, and remember you need JDK 1.8 and language level 8+.

Questions:
Answers:

Using java 8, specifically lambda expression, you can do it simply like the below example:

myProducts.stream().filter(prod -> prod.price>10).collect(Collectors.toList())

Here, each product in the myProducts collection whose price is greater than 10 is added to the new filtered list.

Questions:
Answers:

With Guava:

Collection<Integer> collection = Lists.newArrayList(1, 2, 3, 4, 5);

Iterators.removeIf(collection.iterator(), new Predicate<Integer>() {
    @Override
    public boolean apply(Integer i) {
        return i % 2 == 0;
    }
});

System.out.println(collection); // Prints [1, 3, 5]

Questions:
Answers:

I needed to filter a list depending on the values already present in the list. For example, remove every value that is less than a value before it: {2 5 3 4 7 5} -> {2 5 7}. Or remove all duplicates: {3 5 4 2 3 5 6} -> {3 5 4 2 6}.

public class Filter {
    public static <T> void List(List<T> list, Chooser<T> chooser) {
        List<Integer> toBeRemoved = new ArrayList<>();
        leftloop:
        for (int right = 1; right < list.size(); ++right) {
            for (int left = 0; left < right; ++left) {
                if (toBeRemoved.contains(left)) {
                    continue;
                }
                Keep keep = chooser.choose(list.get(left), list.get(right));
                switch (keep) {
                    case LEFT:
                        toBeRemoved.add(right);
                        continue leftloop;
                    case RIGHT:
                        toBeRemoved.add(left);
                        break;
                    case NONE:
                        toBeRemoved.add(left);
                        toBeRemoved.add(right);
                        continue leftloop;
                }
            }
        }

        Collections.sort(toBeRemoved, new Comparator<Integer>() {
            @Override
            public int compare(Integer o1, Integer o2) {
                return o2 - o1;
            }
        });

        for (int i : toBeRemoved) {
            if (i >= 0 && i < list.size()) {
                list.remove(i);
            }
        }
    }

    public static <T> void List(List<T> list, Keeper<T> keeper) {
        Iterator<T> iterator = list.iterator();
        while (iterator.hasNext()) {
            if (!keeper.keep(iterator.next())) {
                iterator.remove();
            }
        }
    }

    public interface Keeper<E> {
        boolean keep(E obj);
    }

    public interface Chooser<E> {
        Keep choose(E left, E right);
    }

    public enum Keep {
        LEFT, RIGHT, BOTH, NONE;
    }
}

This will be used like this.

List<String> names = new ArrayList<>();
names.add("Anders");
names.add("Stefan");
names.add("Anders");
Filter.List(names, new Filter.Chooser<String>() {
    @Override
    public Filter.Keep choose(String left, String right) {
        return left.equals(right) ? Filter.Keep.LEFT : Filter.Keep.BOTH;
    }
});
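
For the two specific cases mentioned at the start of this answer, a JDK-only sketch is also possible. It does not use the Filter class above; the class and method names are hypothetical, and both methods return a new list instead of modifying the input.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

final class OrderDependentFilters {
    // Drops every value that is less than a value kept before it.
    static List<Integer> dropBelowRunningMax(List<Integer> values) {
        List<Integer> kept = new ArrayList<>();
        for (Integer value : values) {
            // The kept list never decreases, so its last element is the maximum so far.
            if (kept.isEmpty() || value >= kept.get(kept.size() - 1)) {
                kept.add(value);
            }
        }
        return kept;
    }

    // Drops later duplicates while preserving first-seen order.
    static <T> List<T> dropDuplicates(List<T> values) {
        return new ArrayList<>(new LinkedHashSet<>(values));
    }

    public static void main(String[] args) {
        System.out.println(dropBelowRunningMax(Arrays.asList(2, 5, 3, 4, 7, 5))); // [2, 5, 7]
        System.out.println(dropDuplicates(Arrays.asList(3, 5, 4, 2, 3, 5, 6)));   // [3, 5, 4, 2, 6]
    }
}
```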