Google Sites API via CURL in PHP – Getting "Content is not allowed in prolog."

Posted by: admin, July 12, 2020

Questions:
$curl = new Curl();
$data = 'Email='.urlencode('MYEMAIL').'&Passwd='.urlencode('MYPASSWORD').'&accountType=GOOGLE&source=Google-cURL-Example&service=jotspot';
$curl->post('https://www.google.com/accounts/ClientLogin',$data);

//match authorization token
preg_match("!Auth=(.*)!",$curl->response,$match);
$auth = $match[1];

//set curl headers
$curl->set_headers(array(
'Content-Type: application/atom+xml; charset=utf-8',
'Host: sites.google.com',
'GData-Version: 1.4',
'Authorization: GoogleLogin auth='. trim($auth)));

//get a list of sites associated with my domain
$curl->get('https://sites.google.com/feeds/site/clevertechie.mygbiz.com'); 

//contains data returned by $curl->get();
echo $curl->response;

Instead of the list of sites, $curl->response contains the message “Content is not allowed in prolog.” I’ve looked everywhere and haven’t been able to find a solution. Please help! Thanks! 🙂
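
As a side note on the token-matching step in the code above: the pattern "!Auth=(.*)!" is unanchored, so it can match "Auth=" in the middle of a line. Below is a minimal sketch of a stricter extraction, assuming the login response is a plain-text body of KEY=value lines. The function name extract_auth_token and the sample values are hypothetical, not part of the original code.

```php
<?php
// Hypothetical helper: pull the Auth token out of a ClientLogin-style
// response body made of KEY=value lines (for example SID=, LSID=, Auth=).
function extract_auth_token(string $response): ?string
{
    // The "m" flag anchors ^ and $ to line boundaries, so "Auth=" has to
    // start a line; \S+ stops at the first whitespace character.
    if (preg_match('/^Auth=(\S+)\s*$/m', $response, $match) === 1) {
        return $match[1];
    }
    // No token found: the login failed or the body was an error message.
    return null;
}

// Made-up sample body, for illustration only.
$sample = "SID=aaa\nLSID=bbb\nAuth=ccc123\n";
$token = extract_auth_token($sample);
```

Returning null when no token is found makes a failed login visible before the second request is sent with an empty Authorization header.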

This is the XML that is supposed to be returned by the response:

<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom' xmlns:gAcl='http://schemas.google.com/acl/2007' xmlns:sites='http://schemas.google.com/sites/2008' xmlns:gs='http://schemas.google.com/spreadsheets/2006' xmlns:dc='http://purl.org/dc/terms' xmlns:batch='http://schemas.google.com/gdata/batch' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'>
<updated>2012-10-31T19:00:17.297Z</updated>
<app:edited xmlns:app='http://www.w3.org/2007/app'>2012-10-31T19:00:17.297Z</app:edited>
<title>My Site Title</title>
<summary>My Site Summary</summary>
<sites:siteName>my-site-title</sites:siteName>
<sites:theme>slate</sites:theme>
</entry>

I can’t paste the source of “https://sites.google.com/feeds/site/clevertechie.mygbiz.com” because it can’t be accessed directly without an authorization token, which is specified in the headers. The only way to retrieve its data is by sending the token in the headers, which I’ve done. Instead of getting the above XML, I’m getting “Content is not allowed in prolog”.
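
For context, “Content is not allowed in prolog” is the wording Java XML parsers typically use when they find characters before the first piece of markup, so it usually points at a body that does not start with XML. Below is a rough sketch of a local check for that condition. The function name has_content_in_prolog is hypothetical, and the check is a simplification rather than a full well-formedness test.

```php
<?php
// Hypothetical check: would an XML parser see stray characters before the
// first markup of this body?
function has_content_in_prolog(string $body): bool
{
    // Skip a UTF-8 byte order mark if one is present.
    if (strncmp($body, "\xEF\xBB\xBF", 3) === 0) {
        $body = (string) substr($body, 3);
    }
    // Ignore leading whitespace for the purpose of this rough check.
    $trimmed = ltrim($body);
    // An XML document has to begin with "<" (declaration or root element).
    return $trimmed !== '' && $trimmed[0] !== '<';
}

// A form-encoded body is not XML, so it trips the check.
$formBody = 'Email=MYEMAIL&Passwd=MYPASSWORD&accountType=GOOGLE';
$isSuspect = has_content_in_prolog($formBody);
```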

var_dump of $curl:

object(Curl)#1 (11) { ["curl_resource":protected]=> resource(4) of type (Unknown) ["proxy":protected]=> bool(false) 
["proxy_type":protected]=> NULL ["response"]=> string(33) "Content is not allowed in prolog." ["time"]=> float(249)
 ["info"]=> array(26) { ["url"]=> string(59) "https://sites.google.com/feeds/site/clevertechie.mygbiz.com" 
 ["content_type"]=> string(24) "text/html; charset=UTF-8" ["http_code"]=> int(400) ["header_size"]=> int(676) 
 ["request_size"]=> int(1935) ["filetime"]=> int(-1) ["ssl_verify_result"]=> int(20) ["redirect_count"]=> int(0) 
 ["total_time"]=> float(0.249) ["namelookup_time"]=> float(0.015) ["connect_time"]=> float(0.046) 
 ["pretransfer_time"]=> float(0.109) ["size_upload"]=> float(111) ["size_download"]=> float(33) 
 ["speed_download"]=> float(132) ["speed_upload"]=> float(445) ["download_content_length"]=> float(-1) 
 ["upload_content_length"]=> float(111) ["starttransfer_time"]=> float(0.249) ["redirect_time"]=> float(0) 
 ["certinfo"]=> array(0) { } ["primary_ip"]=> string(14) "74.125.224.194" ["primary_port"]=> int(443) 
 ["local_ip"]=> string(13) "192.168.1.133" ["local_port"]=> int(61985) ["redirect_url"]=> string(0) "" } 
 ["error"]=> NULL ["custom_headers"]=> NULL ["cookie_file"]=> string(46) "cookies.txt" 
 ["custom_curl_options":protected]=> array(3) { [47]=> int(1) [10015]=> string(111) 
 "Email=MYEMAIL&Passwd=MYPASSWORD&accountType=GOOGLE&source=Google-cURL-Example&service=jotspot" 
 [10023]=> array(4) { [0]=> string(49) "Content-Type: application/atom+xml; charset=utf-8" 
 [1]=> string(22) "Host: sites.google.com" [2]=> string(18) "GData-Version: 1.4" [3]=> string(320) 
 "Authorization: GoogleLogin auth=DQAAAMMAAAXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX" } } ["curl_options":protected]=> array(9) { [10018]=> string(74) 
 "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:14.0) Gecko/20100101 Firefox/14.0.1" [10016]=> string(22) 
 "http://www.google.com/" [13]=> int(60) [78]=> int(60) [19913]=> int(1) [52]=> int(1) [64]=> int(0) [81]=> int(0) [42]=> int(0) } }

$auth is just a string; it’s not supposed to be formatted as XML. I verified that there are no extra spaces or characters and that it exactly matches the token returned by the first $curl->post request.
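
The numeric keys in the dump above are PHP cURL option constants, and translating them back to names makes the dump easier to read. The sketch below hardcodes the values that appear in the dump; the function name name_curl_options is hypothetical. Read this way, the custom options recorded for the request to sites.google.com include CURLOPT_POST (47) and CURLOPT_POSTFIELDS (10015) holding the login form data, which suggests (an inference from the dump, not something confirmed in the post) that the options from the first POST were still set when the GET was made.

```php
<?php
// Numeric values of the PHP cURL option constants that appear in the dump.
const CURL_OPTION_NAMES = [
    13    => 'CURLOPT_TIMEOUT',
    42    => 'CURLOPT_HEADER',
    47    => 'CURLOPT_POST',
    52    => 'CURLOPT_FOLLOWLOCATION',
    64    => 'CURLOPT_SSL_VERIFYPEER',
    78    => 'CURLOPT_CONNECTTIMEOUT',
    81    => 'CURLOPT_SSL_VERIFYHOST',
    10015 => 'CURLOPT_POSTFIELDS',
    10016 => 'CURLOPT_REFERER',
    10018 => 'CURLOPT_USERAGENT',
    10023 => 'CURLOPT_HTTPHEADER',
    19913 => 'CURLOPT_RETURNTRANSFER',
];

// Hypothetical helper: re-key an option array by constant name.
function name_curl_options(array $options): array
{
    $named = [];
    foreach ($options as $number => $value) {
        $name = CURL_OPTION_NAMES[$number] ?? "unknown($number)";
        $named[$name] = $value;
    }
    return $named;
}

// The first two custom options from the dump, re-keyed by name.
$custom = name_curl_options([
    47    => 1,
    10015 => 'Email=MYEMAIL&Passwd=MYPASSWORD&accountType=GOOGLE&source=Google-cURL-Example&service=jotspot',
]);
```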

Answers:

Leave the request as a POST with the Atom XML payload as its body, and set the content type to “application/atom+xml”, but pass all of the OAuth parameters as GET parameters, i.e. as URL-encoded values in the URL query string.
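
A minimal sketch of how that request could be assembled is shown below. The answer does not name the OAuth parameters, so the oauth_token key, its placeholder value, and the function name build_atom_post are assumptions for illustration only.

```php
<?php
// Hypothetical helper: put the OAuth values on the query string and keep
// the Atom XML as the POST body.
function build_atom_post(string $feedUrl, array $oauthParams, string $atomXml): array
{
    // PHP_QUERY_RFC3986 percent-encodes values (spaces become %20).
    $query = http_build_query($oauthParams, '', '&', PHP_QUERY_RFC3986);
    $separator = (strpos($feedUrl, '?') === false) ? '?' : '&';

    return [
        'url'     => $feedUrl . $separator . $query,
        'headers' => ['Content-Type: application/atom+xml'],
        'body'    => $atomXml,
    ];
}

$request = build_atom_post(
    'https://sites.google.com/feeds/site/clevertechie.mygbiz.com',
    ['oauth_token' => 'PLACEHOLDER TOKEN/1'], // assumed parameter name and value
    "<entry xmlns='http://www.w3.org/2005/Atom'><title>My Site Title</title></entry>"
);

// Sending it with PHP's cURL extension would then look roughly like this:
// $ch = curl_init($request['url']);
// curl_setopt_array($ch, [
//     CURLOPT_POST           => true,
//     CURLOPT_POSTFIELDS     => $request['body'],
//     CURLOPT_HTTPHEADER     => $request['headers'],
//     CURLOPT_RETURNTRANSFER => true,
// ]);
// $response = curl_exec($ch);
```

Building the request as a plain array first makes it easy to inspect the final URL and body before anything goes over the wire.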