Query YouTube Video XML Data

July 4, 2015 Update: YouTube has changed its API, and the XML feed is now deprecated. Going forward, use this simple JSON feed to get information about a YouTube video: http://www.youtube.com/oembed?url=https%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3DVm3wabL5-Bk&format=json Replace the value of the url parameter with the URL-encoded address of the video you want.
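
As a rough sketch of how the JSON feed could be used from PHP: the helper names below (buildOEmbedUrl, parseOEmbedJson) are made up for illustration, and the fields read (title, thumbnail_url) are assumed to be present in the response. Treat it as an example under those assumptions, not as an official client.

```php
<?php
// Build the oEmbed URL for a video ID (hypothetical helper name).
function buildOEmbedUrl($id) {
    $watchUrl = "https://www.youtube.com/watch?v=" . $id;
    return "http://www.youtube.com/oembed?url=" . urlencode($watchUrl) . "&format=json";
}

// Decode an oEmbed JSON string into an array with title, thumbnails and url.
// Returns false when the JSON is invalid or has no title (assumed field name).
function parseOEmbedJson($json, $id) {
    $data = json_decode($json, true);
    if (!is_array($data) || !isset($data['title'])) {
        return false;
    }
    return array(
        'title'      => $data['title'],
        'thumbnails' => isset($data['thumbnail_url']) ? array($data['thumbnail_url']) : array(),
        'url'        => "http://www.youtube.com/watch?v=" . $id,
    );
}

// Usage (needs network access and allow_url_fopen):
// $json = @file_get_contents(buildOEmbedUrl("Vm3wabL5-Bk"));
// $info = ($json !== false) ? parseOEmbedJson($json, "Vm3wabL5-Bk") : false;
```

Keeping the URL building and the JSON decoding in separate functions means the decoding can be checked against a saved response without touching the network.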

 


You might know that you can obtain XML information about a YouTube video by its ID using the YouTube API, which basically means downloading an XML document from a URL. The URL is http://gdata.youtube.com/feeds/api/videos/VIDEO_ID with VIDEO_ID replaced by the YouTube ID string. Example: http://gdata.youtube.com/feeds/api/videos/iZ_wNWSV-aQ (open it to see the XML data for a real video).

Here is a function to extract desired information from the XML document, in PHP:

Extract Information from a YouTube XML File:

<?php
class Video {

	// Get & parse YouTube XML data
	static function getYouTubeInfo($id){

		$xml_data = @file_get_contents("http://gdata.youtube.com/feeds/api/videos/".$id);
		if($xml_data === false || strlen($xml_data) < 25){
			// Request failed or not enough data
			return Video::getYouTubeInfoBasic($id);
		}

		$doc = new DOMDocument();
		$doc->preserveWhiteSpace = false;
		if( ! $doc->loadXML($xml_data) ){
			// Failed loading XML doc
			return Video::getYouTubeInfoBasic($id);
		}

		$ret = array();

		// Get thumbnail URLs (the loop simply does nothing if none are found)
		$media = $doc->getElementsByTagNameNS("*","thumbnail");
		foreach($media as $node){
			$ret["thumbnails"][] = $node->getAttribute("url");
		}

		// Get video title (a DOMNodeList is always truthy, so check its length)
		$titles = $doc->getElementsByTagNameNS("*","title");
		if($titles->length > 0){
			$ret["title"] = $titles->item(0)->nodeValue;
		}

		// Get link to video
		$link = $doc->getElementsByTagNameNS("*","player");
		if($link->length > 0){
			$ret['url'] = $link->item(0)->getAttribute("url");
		}

		return $ret;
	}

	// Fallback for when the function above doesn't work (i.e. the XML is not available):
	// checks whether the video exists by looking for its thumbnail, then fills in the rest manually
	static function getYouTubeInfoBasic($id){

		$check = @get_headers("http://img.youtube.com/vi/".$id."/1.jpg",true);

		// Request failed, or the thumbnail does not exist
		if( $check === false || strpos($check[0],"404") !== false || strpos($check[0],"400") !== false ){
			return false;
		}

		$ret = array();
		$ret['thumbnails'][] = "http://img.youtube.com/vi/".$id."/1.jpg";
		$ret['thumbnails'][] = "http://img.youtube.com/vi/".$id."/2.jpg";
		$ret['thumbnails'][] = "http://img.youtube.com/vi/".$id."/3.jpg";

		// The real title is not available here, so use a placeholder
		$ret['title'] = "Watch Video";
		$ret['url'] = "http://www.youtube.com/watch?v=".$id;

		return $ret;
	}
}
?>

Sometimes requesting the YouTube XML data URL results in a 404 or 403 error. When the XML data is not found (usually because the video is new, though it can happen with old videos too), the fallback function getYouTubeInfoBasic() checks whether the video exists based on the existence of its thumbnails. The fallback does not obtain the title of the video and returns the placeholder “Watch Video” instead, though it could be edited to grab the title by parsing the actual YouTube video page for the <title> tag.
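
The <title> idea mentioned above could look roughly like the sketch below. The function name extractPageTitle is hypothetical, and the stripping of a trailing " - YouTube" is an assumption about how the page title is formatted; adjust or drop it if the pages you fetch look different.

```php
<?php
// Pull the text of the first <title> tag out of an HTML string.
// Returns null when no usable title is found. (Hypothetical helper name.)
function extractPageTitle($html) {
    if (!preg_match('/<title[^>]*>(.*?)<\/title>/is', $html, $m)) {
        return null;
    }
    // Turn entities such as &amp; back into plain characters
    $title = trim(html_entity_decode($m[1], ENT_QUOTES, 'UTF-8'));
    // Assumption: the page appends " - YouTube" to the video title
    $title = preg_replace('/\s*-\s*YouTube$/', '', $title);
    return $title === '' ? null : $title;
}

// Usage inside getYouTubeInfoBasic() (needs network access):
// $html = @file_get_contents("http://www.youtube.com/watch?v=" . $id);
// $title = ($html !== false) ? extractPageTitle($html) : null;
// $ret['title'] = ($title !== null) ? $title : "Watch Video";
```

A regular expression is enough here because only one well-known tag is read; for anything more involved, loading the page into a DOMDocument would be the sturdier route.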

Here’s another helpful function to extract the YouTube ID from different kinds of URLs and even the <embed> code that a user might copy & paste:

Extract YouTube ID From a String using Regular Expression

<?php
	// Static method; add it to the Video class
	static function getYouTubeID($str){
		$matches = array();

		// Links from a user channel page carry the ID as the last path segment after the # hash
		if(strpos($str,"/user/") !== false && strpos($str,"#") !== false){
			$params = explode("/",$str);

			return array_pop($params);
		}

		// Normal YouTube public page link (the ? must be escaped in the pattern)
		if(preg_match("/watch\?v=([^&]+)/i",$str,$matches)){
			return $matches[1];
		}

		// Last resort, try getting the ID from embed code
		if(preg_match("/\/v\/([^&?\"']+)/i",$str,$matches)){
			return $matches[1];
		}

		// No ID found
		return null;
	}

?>

Obtaining the Duration of the Video in Seconds

Use this snippet to read the duration of the video in seconds.

// Goes inside getYouTubeInfo(), before the return statement
$durations = $doc->getElementsByTagNameNS("*","duration");
if($durations->length > 0){
	$ret["duration"] = $durations->item(0)->getAttribute('seconds');
	// echo $ret['duration'];
}
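
To see the duration lookup in isolation, the sketch below runs it against a small hand-written XML string. The sample entry and its namespace URIs are stand-ins, not a real feed response (the "*" wildcard means the namespace does not affect the lookup), and the duration_text key is an extra added here for display purposes.

```php
<?php
// Hypothetical, trimmed-down entry; a real feed carried many more elements.
$xml_data = '<entry xmlns="http://www.w3.org/2005/Atom"'
          . ' xmlns:media="http://search.yahoo.com/mrss/"'
          . ' xmlns:yt="http://gdata.youtube.com/schemas/2007">'
          . '<title>Sample Video</title>'
          . '<media:group><yt:duration seconds="215"/></media:group>'
          . '</entry>';

$doc = new DOMDocument();
$doc->loadXML($xml_data);

$ret = array();

// "*" matches the element in any namespace
$durations = $doc->getElementsByTagNameNS("*", "duration");
if ($durations->length > 0) {
    // The attribute comes back as a string, so cast it to an integer
    $seconds = (int) $durations->item(0)->getAttribute("seconds");
    $ret["duration"] = $seconds;
    // Format as minutes:seconds for display
    $ret["duration_text"] = sprintf("%d:%02d", floor($seconds / 60), $seconds % 60);
}
```

With the sample above, $ret["duration"] is 215 and $ret["duration_text"] is "3:35".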

Disabling Apache’s Mod Security Rules

I tried upgrading phpMyAdmin to a new directory on a server, not using the built-in cPanel environment. It installed fine, but I couldn’t run certain SQL queries such as DROP TABLE tablename; they would generate an Internal Server Error 500.

After some testing I realized that if I tried to access any URL containing the string “x=DROP TABLE abcxyz”, the response would be an Internal Server Error 500. To see if your server has mod_security enabled, create a test PHP page with hello world in it, call it test.php, and try to access yourdomain.com/test.php?x=DROP TABLE xyz (even if your script doesn’t do anything with the x variable).
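
The same check can be scripted. The sketch below is only an illustration: the function names are made up, yourdomain.com/test.php is the placeholder from the paragraph above, and which error code a blocked request returns depends on how the rules are configured (500 in my case; other setups may return something different).

```php
<?php
// Extract the numeric status code from a status line such as
// "HTTP/1.1 500 Internal Server Error". Returns null if the line doesn't parse.
function statusCodeFromLine($line) {
    return preg_match('/^HTTP\/\S+\s+(\d{3})/', $line, $m) ? (int) $m[1] : null;
}

// Request the test page with a SQL-looking query string and return the
// status code, or null if the request itself failed. (Hypothetical helper.)
function probeModSecurity($baseUrl) {
    $url = $baseUrl . "?x=" . rawurlencode("DROP TABLE xyz");
    $headers = @get_headers($url);
    if ($headers === false) {
        return null;
    }
    return statusCodeFromLine($headers[0]);
}

// Usage (needs network access):
// $code = probeModSecurity("http://yourdomain.com/test.php");
// A 200 suggests the request got through; an error code while the plain
// test.php loads fine suggests a rule blocked it.
```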

To get around this for certain web applications (which are denied access from the public anyway with password protected directories), find the file /usr/local/apache/conf/modsec2/whitelist.conf and add this to it:

<LocationMatch /phpMyAdmin/*>
<IfModule mod_security2.c>
SecRuleEngine Off
</IfModule>
</LocationMatch>

You may replace the /phpMyAdmin/* part with any regular expression matching the part of your site for which you would like mod_security turned off. If you cannot find whitelist.conf, you can try adding the same code to your httpd.conf (run updatedb and then locate httpd.conf to find the file).

After you save the change, it might not take effect for a few minutes, or you might have to restart the web server!