Home » Ruby » ruby efficient way to get multiple hash keys for a given value

ruby efficient way to get multiple hash keys for a given value

Posted by: admin December 28, 2017 Leave a comment

Questions:

What’s the most efficient way to get all the hash keys from a given value.

my_hash = {"a"=>"aa", "b"=>"bb", "c"=>"bb"}

I want to give the hash “bb” as an input value and get all they keys (b,c) back as an array

Returns only one key:

my_hash.index("bb")
# returns only b

This works but seems inefficient:

my_hash.select{|k,v| v == 'bb' }.map{|i| i[0] }
# returns b and c

I’ve read all the docs. I feel like there’s something obvious I’m missing.

Thanks!

Update:

I ended up switching the keys and values for the hash creation and going with an array for the values. This is a more efficient solution. See below for best ways to do value look ups if you have to.

New structure:

my_hash = {"aa"=>["a"],"bb"=>["b","c"]}
Answers:

Slightly faster:

my_hash.map{ |k,v| v=='bb' ? k : nil }.compact

Slower for a small hash and single query only. Faster if you need to request the reverse mapping for multiple values. I’d recommend maintaining a reverse map if this is important to your application.

rev = Hash.new{ |h,k| h[k]=[] }
my_hash.each{ |k,v| rev[v] << k }
rev['bb']

Benchmarked:

require 'benchmark'
N = 1_000_000
my_hash = {"a"=>"aa", "b"=>"bb", "c"=>"bb"}

Benchmark.bmbm do |x|
  x.report('select/map'){ N.times{
    my_hash.select{|k,v|v=='bb'}.map{|i| i[0]}
  }}
  x.report('select/map/destructure'){ N.times{
    my_hash.select{|k,v|v=='bb'}.map{|k,v| k}
  }}
  x.report('map/compact'){ N.times{
    my_hash.map{|k,v|v=='bb' ? k : nil}.compact        
  }}
  x.report('reverse map'){ N.times{
    rev = Hash.new{|h,k|h[k]=[]}
    my_hash.each{ |k,v| rev[v]<<k }
    rev['bb']
  }}
  x.report('reject'){ N.times{
    my_hash.reject{|k,v|v != "bb"}.keys
  }}
end
#=> Rehearsal ----------------------------------------------------------
#=> select/map               1.950000   0.000000   1.950000 (  1.950137)
#=> select/map/destructure   1.960000   0.010000   1.970000 (  1.963740)
#=> map/compact              1.200000   0.000000   1.200000 (  1.197340)
#=> reverse map              3.660000   0.000000   3.660000 (  3.658245)
#=> reject                   2.110000   0.000000   2.110000 (  2.115805)
#=> ------------------------------------------------ total: 10.890000sec
#=> 
#=>                              user     system      total        real
#=> select/map               1.950000   0.000000   1.950000 (  1.948784)
#=> select/map/destructure   1.970000   0.010000   1.980000 (  1.966636)
#=> map/compact              1.190000   0.000000   1.190000 (  1.192052)
#=> reverse map              3.670000   0.000000   3.670000 (  3.664798)
#=> reject                   2.140000   0.000000   2.140000 (  2.135069)

Questions:
Answers:

Associative data structures, by design, make it fast to look up a value given a key. They aren’t intended to be efficient for looking up keys by value.

In C++ std::map does not do this efficiently, but boost::bimap does. I don’t know how it works underneath, but logically it could work by creating two maps (key to value and value to key) and making an interface to make it look like they’re one. You could do the same thing. Since your map appears to be single-valued in one direction and multivalued in the other direction (i.e., no one-to-one correspondence) I doubt there are many prebuilt libraries that do precisely what you want.

Disclaimer: I do not know Ruby.

Questions:
Answers:

As hashes aren’t even ordered by value (which MIGHT speed things up a bit by maybe allowing use of a binary search), you’re not going to find a way of doing this without walking through the entire hash anyway, so your solution actually looks like the most efficient one.

An alternative might be:

my_hash.reject{|k,v|v != "bb"}.keys

It’s slightly shorter code-wise but IMHO also a bit harder to understand than your version. Also the reject method builds a copy of the hash so it’s more memory-intensive. If you don’t care about the original hash, use delete_if which works in-place:

my_hash.delete_if{|k,v|v != "bb"}.keys

Questions:
Answers:

Another alternative: my_hash.find_all{|k,v|v == "bb"}.map(&:first)

Not the fastest, but nice to human eyes 🙂

Using Phrogz’s benchmark:

Rehearsal ----------------------------------------------------------
select/map               3.604000   0.000000   3.604000 (  3.638208)
select/map/destructure   3.682000   0.000000   3.682000 (  3.678210)
map/compact              2.620000   0.000000   2.620000 (  2.620150)
reverse map              5.991000   0.000000   5.991000 (  5.985342)
reject                   3.572000   0.000000   3.572000 (  3.612207)
find_all/map             2.964000   0.000000   2.964000 (  2.965169)
------------------------------------------------ total: 22.433000sec

                             user     system      total        real
select/map               3.619000   0.000000   3.619000 (  3.634207)
select/map/destructure   3.698000   0.000000   3.698000 (  3.702212)
map/compact              2.589000   0.000000   2.589000 (  2.620150)
reverse map              5.913000   0.000000   5.913000 (  6.013344)
reject                   3.557000   0.000000   3.557000 (  3.569204)
find_all/map             2.948000   0.000000   2.948000 (  2.959169)