I’m looking to geocode over 5000 addresses at once in a PHP script (this will only ever be run once).
I have been looking into Google as a potential resource for doing just this, however I’ve read reports that after you run 200 or so queries through them, Google will kick you off for the day.
I was just wondering if there is any other way to geocode 5000 or so addresses: another service like the one Google offers, or something similar I could use?
Or will I just have to stagger this? The problem is that I don’t have much time, and doing 200 or 300 a day for 5000 results would take almost 5 (working) weeks.
You could use Bing Maps instead: the Spatial Data API is made for batch geocoding thousands of addresses at once (that link is even a detailed tutorial on how to use it with PHP).
You just need to register a key at http://www.bingmapsportal.com but that’s free and fast (you get the confirmation email within minutes).
Is there a limit to the number of geocode requests I can submit?
If more than 2,500 geocode requests in a 24 hour period are received from a single IP address, or geocode requests are submitted from a single IP address at too fast a rate, the Google Maps API geocoder will begin responding with a status code of 620.[…]
If you need to submit a very large set of addresses to the Geocoding Web Service to cache for later use, you should consider Google Maps API Premier, which provides a separate batch geocoding quota for this purpose.
As @Pekka mentioned: note that Google’s terms of service forbid geocoding stuff for purposes other than showing it on a map.
The most reliable solution is to download a geolocation database to your host so that you can run unlimited queries.
As @Bart Kiers says, there’s a limit on the number of requests you can make in a 24-hour period; there’s also a “not too fast” rate limit (possibly per hour). I’d suggest dividing the seconds in a day (86,400) by the daily limit (2,500) to get a query rate that shouldn’t trip the “too fast” limit. That comes out to about one query every 35 seconds, which should get you all the results in two days.
However, do check the return codes: if the service starts returning 620, stop and give it a rest for some time, else you risk a ban.
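To make the pacing advice concrete, here is a minimal sketch of a throttled loop that stops as soon as a 620 comes back. It assumes you supply your own `$geocode` callable that returns an array with a `status` key; that callable, the function names, and the return shape are all hypothetical and not part of any Google client library. Only the 620 code and the 86,400 / 2,500 arithmetic come from the answers above.

```php
<?php
// Status code the geocoder returns when you are over quota or too fast
// (620, as quoted in the FAQ excerpt in this thread).
const STATUS_TOO_MANY_QUERIES = 620;

// Seconds to wait between queries so a full day never exceeds $dailyLimit.
function secondsPerQuery(int $dailyLimit): float
{
    return 86400 / $dailyLimit;
}

// Default sleeper: whole seconds via sleep(), the remainder via usleep(),
// because usleep() with more than one second is not portable.
function pauseFor(float $seconds): void
{
    $whole = (int) floor($seconds);
    sleep($whole);
    usleep((int) round(($seconds - $whole) * 1000000));
}

// $geocode is a HYPOTHETICAL callable: fn(string $address): array with a
// 'status' key. Plug in whatever HTTP call you actually use.
// Returns the finished results plus the addresses still left to do, so the
// script can be resumed later from 'pending'.
function geocodeBatch(
    array $addresses,
    callable $geocode,
    float $delaySeconds,
    ?callable $sleep = null
): array {
    $sleep = $sleep ?? 'pauseFor';
    $done = [];
    $pending = array_values($addresses);

    while ($pending) {
        $address = $pending[0];
        $response = $geocode($address);

        if ($response['status'] === STATUS_TOO_MANY_QUERIES) {
            break; // over quota: stop and give the service a rest
        }

        array_shift($pending);
        $done[] = ['address' => $address, 'response' => $response];

        if ($pending) {
            $sleep($delaySeconds);
        }
    }

    return ['done' => $done, 'pending' => $pending];
}
```

You would call it as `geocodeBatch($addresses, $myGeocoder, secondsPerQuery(2500))`, save `done` somewhere, and feed `pending` back in on the next run. Responses with any other non-success status are stored as-is, so check them afterwards.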
What you’re trying to do is indeed not according to Google’s terms of service.
That said, Google will start returning ‘over-quota’ responses if you don’t pause at least 250 ms between geocoding requests.
In practice, if you make only 2 requests a second, you won’t get throttled until you hit the limit of 2,500 requests per day.
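As a rough check on what that means for the original 5000 addresses, the arithmetic can be sketched like this. The function name and return shape are made up for illustration; the 2,500-per-day limit and the 2 requests per second are the figures quoted in this thread, not values I have verified against current Google documentation.

```php
<?php
// Rough planning helper: how many days a batch needs under a daily cap,
// and how long the requests take on a full day at a given rate.
function planBatch(int $total, int $dailyLimit, float $requestsPerSecond): array
{
    $perDay = min($total, $dailyLimit);

    return [
        'days' => (int) ceil($total / $dailyLimit),
        'secondsPerDay' => $perDay / $requestsPerSecond,
    ];
}

$plan = planBatch(5000, 2500, 2.0);
printf("%d days, %.0f seconds per day\n", $plan['days'], $plan['secondsPerDay']);
// prints "2 days, 1250 seconds per day"
```

So at 2 requests a second the daily quota is used up in about 21 minutes, and the whole job still takes two days because of the daily cap, not because of the per-request pause.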