performance – PHP, in_array and fast searches (by the end) in arrays

Posted by: admin July 12, 2020

Questions:

I have a question about the best way to do a fast search in arrays (I’m talking about a specific case).

Suppose that I start with an array L = [A, B, C]. While the program is running, L may grow, but only at the end; for example, by the time I do the search, L might be [A, B, C, D, E].

The fact is that when I search, the values I want to find can only be D or E. Right now I’m using in_array(elem, array), but this function can’t be “tweaked” to start at the end and work backwards, and I’m afraid that on every search in_array will examine all the elements with lower indexes before it finds the value I’m looking for.

Is there another search function that fits my problem better? How does the in_array function work internally?

Thanks in advance

Answers:

I assume that in_array is a linear search from 0 to n-1.

The fastest search will be to store the values as the keys and use array_key_exists.

$a = [];
$a['foo'] = true;
$a['bar'] = true;

if (array_key_exists('foo', $a)) {
  // 'foo' is in the set
}
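Applied to the growing list from the question, this could look like the minimal sketch below. It assumes the values are strings or integers, since those are the only types PHP accepts as array keys; the variable name `$seen` and the placeholder values are hypothetical.

```php
<?php
// Hypothetical set: the values themselves are stored as keys.
$seen = ['A' => true, 'B' => true, 'C' => true];

// "Appending" D and E later is just another key assignment.
$seen['D'] = true;
$seen['E'] = true;

// Hash lookup: the cost does not depend on when the value was added.
$hasE = array_key_exists('E', $seen); // true
$hasZ = array_key_exists('Z', $seen); // false

// array_keys() recovers the values in insertion order if a list is needed.
$list = array_keys($seen); // ['A', 'B', 'C', 'D', 'E']
```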

But if that’s not an option, you can make your own for indexed arrays quite easily:

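// Linear search of a 0-indexed array, starting at offset $i.
function in_array_i($needle, array $a, $i = 0)
{
  $c = count($a);
  for (; $i < $c; ++$i) {
    // Loose comparison, like in_array() without its strict flag.
    if ($a[$i] == $needle) return true;
  }
  return false;
}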

It will start at $i, which you can keep track of yourself in order to skip the first elements.
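Since the question asks specifically about searching from the end, the same idea can also be run backwards. The following is only a sketch: the helper name `in_array_from_end` and its `$stop` parameter are hypothetical, and it assumes a 0-indexed list without gaps.

```php
<?php
// Hypothetical helper: scans a 0-indexed list from the last element back
// down to index $stop (inclusive). Uses loose comparison, like in_array()
// without its strict flag.
function in_array_from_end($needle, array $a, $stop = 0)
{
  for ($i = count($a) - 1; $i >= $stop; --$i) {
    if ($a[$i] == $needle) return true;
  }
  return false;
}

$l = ['A', 'B', 'C', 'D', 'E'];
$foundE = in_array_from_end('E', $l, 3); // true, E is the first element checked
$foundA = in_array_from_end('A', $l, 3); // false, index 0 lies below $stop
```

With `$stop` set to the length the array had before it started growing, only the newly appended elements are examined.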

Or alternatively…

// Same interface, but lets in_array() search a slice that starts at $i.
function in_array_i($needle, array $a, $i = 0)
{
  return in_array($needle, $i ? array_slice($a, $i) : $a);
}

You can benchmark to see which is faster.
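A rough way to run such a benchmark is sketched below. The two variants are renamed (`in_array_loop`, `in_array_slice`) so they can coexist in one script, and the data size, start offset and run count are arbitrary assumptions; timings will vary with PHP version and hardware.

```php
<?php
// Variant 1: manual loop starting at offset $i.
function in_array_loop($needle, array $a, $i = 0)
{
  $c = count($a);
  for (; $i < $c; ++$i) {
    if ($a[$i] == $needle) return true;
  }
  return false;
}

// Variant 2: in_array() on a slice starting at offset $i.
function in_array_slice($needle, array $a, $i = 0)
{
  return in_array($needle, $i ? array_slice($a, $i) : $a);
}

// Hypothetical test data: 100000 integers, searching only the last 10.
$data  = range(0, 99999);
$start = count($data) - 10;
$runs  = 1000;

$results = [];
foreach (['in_array_loop', 'in_array_slice'] as $fn) {
  $t0 = microtime(true);
  for ($r = 0; $r < $runs; ++$r) {
    $found = $fn(99999, $data, $start);
  }
  $results[$fn] = ['found' => $found, 'seconds' => microtime(true) - $t0];
}

foreach ($results as $fn => $res) {
  printf("%s: %.6f s\n", $fn, $res['seconds']);
}
```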

Answer:

How does the in_array function work internally?

Internally, in_array() searches from the beginning of the array to the end, so in your case it is slow.

Depending on the nature of your data, you can change the search strategy. If you have only non-duplicate values and all of them are either strings or integers (not NULL), a common trick is to array_flip() the array, which is pretty fast, and then check via isset() whether your value exists as a key in the flipped array:

  $array = array( ... non-duplicate string and integer values ... );
  $needle = 'find me!';
  $lookup = array_flip($array);
  $found = isset($lookup[$needle]) ? $lookup[$needle] : false;
  if (false === $found) {
    echo "Not found!\n";
  } else {
    echo "Found at {$found}!\n";
  }

If these preconditions are not met, you can do what konforce suggested.

If you have a lot of data and the values you look for are not always near the beginning or the end, you might want to implement a search algorithm of your own, for example one that starts neither at the beginning nor at the end but at a random position and wraps around, to distribute the search time.
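One possible reading of the wrap-around idea is sketched here. The helper name `array_search_wrapped` is hypothetical, and the sketch assumes a non-empty, 0-indexed list.

```php
<?php
// Hypothetical helper: visits every element of a 0-indexed list exactly once,
// starting at $start and wrapping around past the end.
// Returns the index of the first match found, or false if there is none.
function array_search_wrapped($needle, array $a, $start)
{
  $c = count($a);
  for ($n = 0; $n < $c; ++$n) {
    $i = ($start + $n) % $c;
    if ($a[$i] == $needle) return $i;
  }
  return false;
}

$l = ['A', 'B', 'C', 'D', 'E'];
$start = mt_rand(0, count($l) - 1);              // random starting position
$pos   = array_search_wrapped('D', $l, $start);  // 3, whatever $start is
$none  = array_search_wrapped('Z', $l, $start);  // false
```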

Additionally, you can probably keep the elements sorted as you add them to the array, which then allows a much faster search with a suitable algorithm such as binary search.
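A minimal sketch of the sorted approach follows, assuming the values are mutually comparable with `<` (here integers). The helper names `lower_bound`, `sorted_insert` and `sorted_contains` are hypothetical, and `intdiv()` requires PHP 7 or later.

```php
<?php
// Hypothetical helper: returns the first index in the sorted list $a whose
// value is >= $needle, or count($a) if there is no such element.
function lower_bound(array $a, $needle)
{
  $lo = 0;
  $hi = count($a);
  while ($lo < $hi) {
    $mid = intdiv($lo + $hi, 2);
    if ($a[$mid] < $needle) {
      $lo = $mid + 1;
    } else {
      $hi = $mid;
    }
  }
  return $lo;
}

// Insert $value at its sorted position, keeping $a sorted and 0-indexed.
function sorted_insert(array &$a, $value)
{
  array_splice($a, lower_bound($a, $value), 0, [$value]);
}

// Binary search: O(log n) comparisons instead of a linear scan.
function sorted_contains(array $a, $needle)
{
  $i = lower_bound($a, $needle);
  return $i < count($a) && $a[$i] === $needle;
}

$sorted = [];
foreach ([42, 7, 19, 3, 25] as $v) {
  sorted_insert($sorted, $v);
}
// $sorted is now [3, 7, 19, 25, 42]
$has19 = sorted_contains($sorted, 19); // true
$has20 = sorted_contains($sorted, 20); // false
```

Note that the insertion itself still has to shift elements, so this trades slower inserts for faster lookups.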

Answer:

Tweaking an extensive comparative test of the array search functions timed below, for numerical and string searches, posted on GitHub by Kasim Kochkin, I found the following results using PHP 7.3.11.

Using array_flip once, followed by multiple searches:

  • For one or a few searches, in_array and array_search are faster.

  • For string searches, flip (once) + isset becomes faster above 200 searches.

  • For numerical searches, flip (once) + isset becomes faster above 10 searches.
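The "flip once, search many times" pattern behind these observations can be sketched as follows. This is not the benchmark referred to above; the data set, needle count and variable names are hypothetical, and the break-even point will differ between machines and PHP versions.

```php
<?php
// Hypothetical data: 10000 distinct strings and 500 needles, half of them absent.
$haystack = [];
for ($i = 0; $i < 10000; ++$i) {
  $haystack[] = "item$i";
}
$needles = [];
for ($i = 0; $i < 500; ++$i) {
  $needles[] = 'item' . ($i * 40); // item0, item40, ..., item19960
}

// Approach 1: one linear in_array() scan per needle.
$t0 = microtime(true);
$hitsLinear = 0;
foreach ($needles as $n) {
  if (in_array($n, $haystack, true)) ++$hitsLinear;
}
$linearSeconds = microtime(true) - $t0;

// Approach 2: pay for array_flip() once, then one isset() per needle.
$t0 = microtime(true);
$lookup = array_flip($haystack);
$hitsFlip = 0;
foreach ($needles as $n) {
  if (isset($lookup[$n])) ++$hitsFlip;
}
$flipSeconds = microtime(true) - $t0;

printf("in_array:     %d hits in %.6f s\n", $hitsLinear, $linearSeconds);
printf("flip + isset: %d hits in %.6f s\n", $hitsFlip, $flipSeconds);
```

Varying the number of needles shows where the one-time cost of array_flip starts to pay off.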

String search results:

N=1000000 (million)
in_array: 0.00845003
flip: 0.17343211
isset: 2.86E-6
array_search: 0.00835395
array_key_exists: 5.01E-6

N=100000
in_array: 0.00854707
flip: 0.12469196
isset: 7.15E-6
array_search: 0.00861216
array_key_exists: 6.2E-6

N=10000
in_array: 0.00854087
flip: 0.10549212
isset: 6.91E-6
array_search: 0.00846505
array_key_exists: 4.05E-6

Numerical search results:

N=1000000
in_array: 0.01197696
flip: 0.06217289
isset: 6.2E-6
array_search: 0.01673698
array_key_exists: 4.05E-6

N=100000
in_array: 0.01191092
flip: 0.06582093
isset: 6.91E-6
array_search: 0.01637983
array_key_exists: 4.05E-6

N=10000
in_array: 0.01375008
flip: 0.07185006
isset: 5.01E-6
array_search: 0.01485705
array_key_exists: 4.05E-6