
php – Get domain from URL

Posted by: admin July 12, 2020

Questions:

OK, before you say “oh, come on! this is easy”, I must tell you that I’ve been testing many different methods for this for a long time, and I haven’t found one that really works for any URL and any domain.

Examples:

  • http://www.this-is-a-url.com = this-is-a-url.com
  • www.this-is-another-url.com/some-folder = this-is-another-url.com
  • subdomain.somesub.domain.com/index.php = domain.com
  • diff.erentltd.in = erentltd.in
  • www.andanotherone.org.uk = andanotherone.org.uk

So, any ideas? Do you know of any working function/script?


To anyone interested: please have a look at @bystwn22’s answer. It’s one of the smoothest working solutions you could possibly find! 🙂

How to & Answers:

Okay, try this. I know the question is really tricky :\

<?php
  $urls = array(
    "http://www.this-is-a-url.com",
    "www.this-is-another-url.com/some-folder",
    "subdomain.somesub.domain.com/index.php",
    "diff.erentltd.in",
    "www.andanotherone.org.uk"
  );

  foreach( $urls as $url ) {
    var_dump( get_domain( $url ) );
  }

  /** Output **/
  // string(17) "this-is-a-url.com"
  // string(23) "this-is-another-url.com"
  // string(10) "domain.com"
  // string(11) "erentltd.in"
  // string(20) "andanotherone.org.uk"
?>

Function get_domain

<?php
  function get_domain( $url ) {
    $regex  = "/^((http|ftp|https):\/\/)?([\w-]+(\.[\w-]+)+)([\w.,@?^=%&:\/~+#-]*[\w@?^=%&\/~+#-])?$/i";
    if ( !preg_match( $regex, $url, $matches ) ) {
      return false;
    }
    $url    = $matches[3];
    $tlds   = array( 'ac', 'ad', 'ae', 'aero', 'af', 'ag', 'ai', 'al', 'am', 'an', 'ao', 'aq', 'ar', 'arpa', 'as', 'asia', 'at', 'au', 'aw', 'ax', 'az', 'ba', 'bb', 'bd', 'be', 'bf', 'bg', 'bh', 'bi', 'biz', 'bj', 'bm', 'bn', 'bo', 'br', 'bs', 'bt', 'bv', 'bw', 'by', 'bz', 'ca', 'cat', 'cc', 'cd', 'cf', 'cg', 'ch', 'ci', 'ck', 'cl', 'cm', 'cn', 'co', 'com', 'coop', 'cr', 'cu', 'cv', 'cx', 'cy', 'cz', 'de', 'dj', 'dk', 'dm', 'do', 'dz', 'ec', 'edu', 'ee', 'eg', 'er', 'es', 'et', 'eu', 'fi', 'fj', 'fk', 'fm', 'fo', 'fr', 'ga', 'gb', 'gd', 'ge', 'gf', 'gg', 'gh', 'gi', 'gl', 'gm', 'gn', 'gov', 'gp', 'gq', 'gr', 'gs', 'gt', 'gu', 'gw', 'gy', 'hk', 'hm', 'hn', 'hr', 'ht', 'hu', 'id', 'ie', 'il', 'im', 'in', 'info', 'int', 'io', 'iq', 'ir', 'is', 'it', 'je', 'jm', 'jo', 'jobs', 'jp', 'ke', 'kg', 'kh', 'ki', 'km', 'kn', 'kp', 'kr', 'kw', 'ky', 'kz', 'la', 'lb', 'lc', 'li', 'lk', 'lr', 'ls', 'lt', 'lu', 'lv', 'ly', 'ma', 'mc', 'md', 'me', 'mg', 'mh', 'mil', 'mk', 'ml', 'mm', 'mn', 'mo', 'mobi', 'mp', 'mq', 'mr', 'ms', 'mt', 'mu', 'museum', 'mv', 'mw', 'mx', 'my', 'mz', 'na', 'name', 'nc', 'ne', 'net', 'nf', 'ng', 'ni', 'nl', 'no', 'np', 'nr', 'nu', 'nz', 'om', 'org', 'pa', 'pe', 'pf', 'pg', 'ph', 'pk', 'pl', 'pm', 'pn', 'pr', 'pro', 'ps', 'pt', 'pw', 'py', 'qa', 're', 'ro', 'rs', 'ru', 'rw', 'sa', 'sb', 'sc', 'sd', 'se', 'sg', 'sh', 'si', 'sj', 'sk', 'sl', 'sm', 'sn', 'so', 'sr', 'st', 'su', 'sv', 'sy', 'sz', 'tc', 'td', 'tel', 'tf', 'tg', 'th', 'tj', 'tk', 'tl', 'tm', 'tn', 'to', 'tp', 'tr', 'travel', 'tt', 'tv', 'tw', 'tz', 'ua', 'ug', 'uk', 'us', 'uy', 'uz', 'va', 'vc', 've', 'vg', 'vi', 'vn', 'vu', 'wf', 'ws', 'ye', 'yt', 'yu', 'za', 'zm', 'zw' );
    $parts  = array_reverse( explode( ".", $url ) );
    $domain = array();

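    // Walk the labels from right to left and stop at the first one that is not a known TLD.
    foreach( $parts as $part ) {
      $domain[] = $part;
      if ( !in_array( strtolower( $part ), $tlds ) ) {
        return implode( ".", array_reverse( $domain ) );
      }
    }
    // Every label was a TLD (e.g. "co.uk"), so there is no domain name to return.
    return false;
  }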
?>

Answer:

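I worked out a simpler solution because of the issues we faced with parse_url. Note that it keeps only the last two labels of the host, so www.andanotherone.org.uk comes out as org.uk: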

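<?php
check("www.google.com"); // prints google.com

function check($url) {
        // parse_url() only reports a host when the URL has a scheme, so add one if it is missing.
        if (!preg_match("/^https?:\/\//i", $url)) $url = "http://" . $url;
        // Keep only the last two dot-separated labels of the host.
        echo preg_replace("/.*\.([^\.]+\.[^\.]+)/", "$1", parse_url($url, PHP_URL_HOST));
}
?>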
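
For context on the parse_url issue mentioned in this answer: parse_url() treats a scheme-less string such as www.google.com/some-folder as a path and reports no host at all. Here is a minimal sketch of a helper that works around this, assuming PHP 7.1 or later. The name host_of is made up for illustration and is not part of PHP or of the answer above.

```php
<?php
// host_of() is a hypothetical helper name used only for this sketch.
function host_of(string $url): ?string {
    // Without a scheme, parse_url() puts the whole string into the 'path' component,
    // so prepend a default scheme when none is present.
    if (!preg_match('#^[a-z][a-z0-9+.-]*://#i', $url)) {
        $url = 'http://' . $url;
    }
    $host = parse_url($url, PHP_URL_HOST);
    // parse_url() returns null (or false on a malformed URL) when there is no host.
    return is_string($host) ? strtolower($host) : null;
}

var_dump(parse_url('www.google.com/some-folder', PHP_URL_HOST)); // NULL
var_dump(host_of('www.google.com/some-folder'));                 // string(14) "www.google.com"
```

Returning the host instead of echoing it makes the helper easier to combine with whichever domain-extraction step comes next.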

Answer:

Well, you actually need two lists: second-level domains (such as org.uk) and top-level domains.

  1. Get the host from your URL with preg_match or parse_url; let’s say it is subdomain.domain.org.uk.

  2. Explode it by dot, take the last two elements of that array, and join them with a dot again (org.uk). If that is one of the second-level domains, prepend the previous element of the array and you have your domain (domain.org.uk).

  3. Otherwise, your domain is what you checked in step 2. (You can also verify that the last element of the array is one of the top-level domains, but you can skip that check if you are pretty sure the domain is valid.) If your original host was subdomain.domain.com, you have checked that domain.com is not a second-level domain, which means domain.com is what you were looking for.

Here is the list of second-level domains. Or you can try to find a better one.
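
The steps above can be sketched roughly as follows, assuming PHP 7.1 or later and a host that has already been extracted (step 1). The function name registrable_domain and the tiny $suffixes array are assumptions made for illustration; a real list of second-level domains would be far longer.

```php
<?php
// registrable_domain() and $suffixes are illustrative assumptions, not a complete solution.
function registrable_domain(string $host): ?string {
    $suffixes = ['org.uk', 'co.uk', 'com.au', 'co.in']; // hypothetical, incomplete sample list
    $parts = explode('.', strtolower(trim($host, '.')));
    $count = count($parts);
    if ($count < 2) {
        return null; // a single label cannot form name + TLD
    }
    // Step 2: join the last two labels and check them against the second-level list.
    $lastTwo = $parts[$count - 2] . '.' . $parts[$count - 1];
    if (in_array($lastTwo, $suffixes, true)) {
        // The last two labels are a second-level domain, so one more label is needed.
        return $count >= 3 ? $parts[$count - 3] . '.' . $lastTwo : null;
    }
    // Step 3: otherwise the last two labels are the domain.
    return $lastTwo;
}

var_dump(registrable_domain('subdomain.domain.org.uk')); // string(13) "domain.org.uk"
var_dump(registrable_domain('subdomain.domain.com'));    // string(10) "domain.com"
```

The sketch skips the optional top-level-domain check from step 3, so it trusts that the host it receives is valid.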