I maintain a site with a bunch of downloadable files. It is currently hosted on a server in the U.S., but I have recently acquired a new server in Germany. I would like to mirror the downloads to the server in Germany, and have a PHP script on the first server (hosting the website) detect which file mirror to use based on the user’s location. For instance, if the user is in Canada, they should download the file from my current server in the U.S. If they’re in France, they should get the file from Germany, rather than downloading across the Atlantic. How, then, can I determine which country they are closer to?
I know about MaxMind GeoIP, and have it installed, but that just gives me a country, and AFAIK, there is no way to automatically determine which of my two mirror countries the given country is closest to. I suppose what I could do is go by continent: have users in Asia, Europe, Africa, and Australia get the content from Germany, and have visitors from North and South America get the file from the U.S. If anyone can think of a better solution, I’m open to suggestions.
Well, I guess I am going to go with my original idea of checking by continents. For others looking to do this sort of thing, that will be a good place to start. The problem will come when I have multiple mirrors in Europe, but the continent idea will have to work for now.
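For anyone who wants to see the continent idea spelled out, here is a minimal sketch in Python (the same lookup-table logic ports directly to PHP). It assumes the GeoIP lookup returns a two-letter continent code such as NA, SA, EU, AS, AF, or OC. The mirror URLs and the function name are hypothetical placeholders.

```python
# Hypothetical mirror base URLs; replace with the real hosts.
MIRRORS = {
    "us": "https://us.example.com",
    "de": "https://de.example.com",
}

# The split described above: the Americas use the U.S. mirror,
# everything else uses the German mirror.
CONTINENT_TO_MIRROR = {
    "NA": "us", "SA": "us",
    "EU": "de", "AS": "de", "AF": "de", "OC": "de",
}

def pick_mirror(continent_code, default="us"):
    """Return the mirror URL for a continent code, or the default mirror
    when the code is missing or unknown (e.g. the lookup failed)."""
    key = CONTINENT_TO_MIRROR.get((continent_code or "").upper(), default)
    return MIRRORS[key]
```

Adding a second European mirror later would mean replacing the continent table with a finer-grained one, such as a country table, which is the limitation mentioned above.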
Seems there is a lot of developer overhead in the proposed solutions thus far. If this were a problem I had to address in my own applications, I might save myself a few hours of work by opting not to reinvent the wheel on this one.
Determining the Closest Mirror (Using Zip Codes)
- Maintain a list of postal codes in an array for those mirror servers available.
- Determine the postal code of the user agent (e.g. user input or PHP library)
- Calculate the distance between the two postal codes (e.g. PHP library)
- Proceed with mirror selection based on distance returned
Please keep in mind that a closer distance does not necessarily constitute a faster response time. In the context of your scenario, however, a mirror in one country will obviously be faster than a mirror in another, assuming both mirrors are up. Continue reading for what I consider to be a more “robust” solution.
Resources & Links
PHP 5 Zip Code Range and Distance Calculation: http://www.micahcarrick.com/php5-zip-code-range-and-distance.html
ip2nation Country Based Redirection: http://www.ip2nation.com/ip2nation/Sample_Scripts/Country_Based_Redirect
Might also be good to look for a library or MySQL dump that will help you derive the postal code or region of a given IP address (e.g. ip2nation). Example: How to determine a zip code and city from an IP address?
The “Maverick” Approach
In my opinion, Mavericks are the innovators, problem solvers, and inventors of the great libraries and frameworks we all use today. They are sometimes mistakenly associated with “hackish” ideas, but we embrace the compliment 🙂
Create your own API service on each of your mirror servers that will accept either a $_GET or $_POST request.
This API service will take the IP address it is given and ping() it several times, calculating the average response time and returning it to the requesting interface (e.g. your frontend portal through which clients are connecting and/or the server trying to determine the closest mirror). The mirror that responds with the lowest average ought to be your quickest responding server, albeit not necessarily the closest. Which is more important to you? See Ping site and return result in PHP for a working ping() function that does not rely on executing shell commands locally (i.e. it is platform independent).
Last step, derive the IP address of the requesting client and pass it to your API service running on either mirror server in the background. And we all know how to derive the IP, but not as well as you think we might. If you’re load balanced or behind a proxy, you may want to first check to see if any of these headers came through (HTTP_FORWARDED, HTTP_FORWARDED_FOR, HTTP_X_FORWARDED, HTTP_X_FORWARDED_FOR, HTTP_CLIENT_IP). If so, that’s probably the real IP address of the user agent.
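Here is a rough sketch of that header check, written in Python with a plain dictionary standing in for PHP's $_SERVER. The order in which the headers are tried is my own assumption, and the function name is hypothetical. Keep in mind that these headers are sent by the client or an intermediary and can be forged, so only trust them when the request really comes through your own proxy or load balancer.

```python
import ipaddress

# The headers named above. The order they are tried in is an assumption;
# adjust it to whatever your proxy or load balancer actually sets.
FORWARD_HEADERS = (
    "HTTP_X_FORWARDED_FOR",
    "HTTP_FORWARDED_FOR",
    "HTTP_X_FORWARDED",
    "HTTP_FORWARDED",
    "HTTP_CLIENT_IP",
)

def client_ip(server_vars):
    """Return the first syntactically valid IP found in the forwarding
    headers, falling back to REMOTE_ADDR when none is present."""
    for name in FORWARD_HEADERS:
        value = server_vars.get(name, "")
        # A header may carry a comma-separated chain of addresses;
        # entries that are not bare IP addresses are skipped.
        for candidate in value.split(","):
            candidate = candidate.strip()
            try:
                ipaddress.ip_address(candidate)
            except ValueError:
                continue
            return candidate
    return server_vars.get("REMOTE_ADDR")
```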
It is at this point (Step 3) where you would compare the averages of response times that each mirror replied with when they went to ping the user agent. Then proceed with selecting which mirror the user agent should download from. The service flow you will have created then resembles something like this:
- User agent visits portal
- Portal forwards user agent’s IP address to API service running separately on both mirrors using a background AJAX/jQuery request (or traditional POST and redirect).
- API service running on mirrors pings the IP address it receives and returns the average response time over the number of pings it is configured to send.
- Portal reads the returned averages and compares them.
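The comparison in the last step can be sketched as follows. This is Python for illustration only, and it assumes each mirror's API has already reported a list of round-trip times in milliseconds; the pinging itself is not shown. The function name and data layout are hypothetical.

```python
from statistics import mean

def fastest_mirror(ping_results):
    """ping_results maps a mirror name to the list of round-trip times (ms)
    that mirror measured. Mirrors with no samples (e.g. all pings timed
    out) are ignored. Returns the name with the lowest average, or None."""
    averages = {
        name: mean(times)
        for name, times in ping_results.items()
        if times
    }
    if not averages:
        return None
    return min(averages, key=averages.get)
```

For example, `fastest_mirror({"us": [120.0, 130.0], "de": [35.0, 40.0, 45.0]})` picks `"de"`. A `None` result is the cue to fall back to a default mirror.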
Hope that helps and happy coding!
If you have just two mirrors, kick off AJAX requests in your browser that download a 50K file from each server. This is small enough not to represent a huge delay for the user, but large enough to make timer measurement differences significant – though of course you should play with that figure a bit.
Then, once you’ve got a ‘best time’, set a JS cookie and redirect to the preferred mirror whenever a download is required. The measurement can be kicked off from a download page in the background, so the user probably won’t notice the delay (whilst they are selecting the file they want).
You could even reply with a ‘server load’ in each AJAX op too, and select the best server not just on response time but on current load also. So, a UK user would use the US server, even though the closest server is in Germany, if the load on the latter is significantly higher than the first.
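One way to combine the two signals is to inflate each measured time by the reported load and pick the lowest score. The sketch below is Python rather than browser JavaScript, purely to show the arithmetic. The scoring formula, the `load_weight` parameter, and the 0.0 to 1.0 load scale are all my own assumptions, so tune them to taste.

```python
def choose_by_time_and_load(measurements, load_weight=1.0):
    """measurements maps a mirror name to (download_seconds, load), where
    load is assumed to be a fraction between 0.0 (idle) and 1.0 (saturated).
    The score penalises a busy server: seconds * (1 + load_weight * load)."""
    def score(item):
        seconds, load = item[1]
        return seconds * (1.0 + load_weight * load)
    return min(measurements.items(), key=score)[0]
```

With `{"de": (0.20, 0.9), "us": (0.30, 0.1)}` the scores are 0.38 and 0.33, so the U.S. server wins despite the slower raw time, which matches the UK example above.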
I don’t remember any library that can do this. But instead of building a whole system, I have an idea that might help you out.
Calculate the distance between two IPs using this distance calculator. Or find the latitude and longitude of the two IP addresses (one for the server and one for the guest) and calculate the distance. Here is some pseudocode to do that, where 34.1012181 and -118.325739 are the server’s latitude and longitude, latitude and longitude are the guest’s, and the result is in miles:
distance = 3956 * 2 * ASIN( SQRT( POWER( SIN( ( 34.1012181 - latitude ) * PI() / 180 / 2 ), 2 ) + COS( 34.1012181 * PI() / 180 ) * COS( latitude * PI() / 180 ) * POWER( SIN( ( -118.325739 - longitude ) * PI() / 180 / 2 ), 2 ) ) )
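The formula above is the haversine great-circle distance. A small Python sketch of it, extended to pick the nearest of several mirrors, could look like this. The U.S. coordinates are the ones from the pseudocode. The German coordinates (roughly Frankfurt) and all names are placeholders I made up for illustration.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3956  # same radius as in the pseudocode

# Mirror coordinates as (latitude, longitude) in degrees.
# "de" is a placeholder roughly at Frankfurt; use your server's location.
MIRRORS = {
    "us": (34.1012181, -118.325739),
    "de": (50.1109, 8.6821),
}

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))

def nearest_mirror(lat, lon):
    """Return the name of the mirror geographically closest to (lat, lon)."""
    return min(MIRRORS, key=lambda name: haversine_miles(lat, lon, *MIRRORS[name]))
```

Because the mirrors live in a dictionary, this works unchanged for any number of mirrors. It measures geographic distance only, not network distance.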
Do a traceroute (configure the traceroute client to not resolve hostnames and with a small timeout).
Based on the number of hops and the location of the traceroute client (I suppose it’s the same as the PHP script), select between the U.S. and Germany.
Geographical distance has nothing to do with network distance and network speed, or bandwidth costs.
As an alternative to the traceroute (since it’s a hackish, small-code solution),
I recommend you use $_SERVER["REMOTE_ADDR"] and look it up in a geo IP database to get the country code. If the country code is not one of the countries on the American continents, fall back to Germany to avoid crossing a crowded internet backbone (additionally, you could require the country code to be from Europe).
Once you set up the geo IP database, I recommend you convert the IP addresses in the ranges from dotted format to integer format for speed and ease of querying.
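To illustrate why the integer form helps: once every range is a pair of integers, finding the range that contains an address is a plain numeric comparison, or a binary search over sorted start values. Below is a minimal Python sketch. The sample rows are made up (they use documentation-only address blocks), and the function names are hypothetical. In a real setup the rows would come from the geo IP database.

```python
import bisect
import ipaddress

def ip_to_int(dotted):
    """Convert a dotted IPv4 address to its 32-bit integer value."""
    return int(ipaddress.IPv4Address(dotted))

# Made-up sample rows (range_start, range_end, country_code), sorted by start.
RANGES = sorted([
    (ip_to_int("192.0.2.0"), ip_to_int("192.0.2.255"), "US"),
    (ip_to_int("198.51.100.0"), ip_to_int("198.51.100.255"), "DE"),
])
STARTS = [row[0] for row in RANGES]

def country_for(dotted):
    """Binary-search the sorted ranges; return the country code, or None
    when the address falls outside every known range."""
    n = ip_to_int(dotted)
    i = bisect.bisect_right(STARTS, n) - 1
    if i >= 0 and RANGES[i][0] <= n <= RANGES[i][1]:
        return RANGES[i][2]
    return None
```

In SQL the same idea becomes a `BETWEEN` comparison on two integer columns.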
From my experience with the above geo ip database, it misses so rarely it doesn’t matter.
Isn’t it easier to use a library like GeoIP, as you said, and use the latitude and longitude to compare the distance between the mirrors and the user?
I think it’s less complicated and much easier to implement. It works for N mirrors, and you don’t need to ask for a zip code or any other kind of data to make the comparison.