How to Create a PHP Crawler

rockmanx

Credits: http://vision-media.ca/resources/php/create-a-php-web-crawler-or-scraper-5-minutes

Using PHP, we show you how to create an easily extendable web crawler in under 5 minutes that collects images and links.

The Crawler Framework

First we need to create the crawler class as follows:

<?php

class Crawler {

}

?>

Next, we create methods to fetch a web page's markup and to parse it for the data we want to collect. The only public methods will be getMarkup() and get(). The parsing methods are meant for the crawler's internal use, but their visibility is set to protected rather than private, since you never know who will want to extend the class.

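<?php

class Crawler {

    // Raw markup of the fetched page.
    protected $markup = '';

    public function __construct($uri) {
    }

    // Public: return the fetched markup.
    public function getMarkup() {
    }

    // Public: return the collected data for the given type.
    public function get($type) {
    }

    // Protected parsers, so that subclasses can reuse or override them.
    protected function _get_images() {
    }

    protected function _get_links() {
    }

}

?>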
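To make the division of labour concrete, here is one way the skeleton could be filled in. This is only a sketch under stated assumptions, not the original post's implementation: the class names SketchCrawler and TitleCrawler, the attributeValues() helper, the use of file_get_contents() and DOMDocument, and the choice to return false for unknown types are all hypothetical. It shows get() mapping a type name onto a protected _get_<type>() method, and a subclass adding a new type without touching the base class.

```php
<?php

// Hypothetical sketch: class names and method bodies are assumptions.
class SketchCrawler {

    protected $markup = '';

    public function __construct($uri) {
        // Assumption: file_get_contents() reads the markup. It accepts
        // local paths, and URLs when allow_url_fopen is enabled.
        $contents = @file_get_contents($uri);
        $this->markup = ($contents === false) ? '' : $contents;
    }

    public function getMarkup() {
        return $this->markup;
    }

    public function get($type) {
        // Map a type such as 'images' onto a method named _get_images().
        $method = '_get_' . $type;
        if (method_exists($this, $method)) {
            return $this->$method();
        }
        return false; // unknown type
    }

    protected function _get_images() {
        return $this->attributeValues('img', 'src');
    }

    protected function _get_links() {
        return $this->attributeValues('a', 'href');
    }

    // Hypothetical helper: collect one attribute from every matching tag.
    protected function attributeValues($tag, $attribute) {
        $values = array();
        if ($this->markup === '') {
            return $values;
        }
        $dom = new DOMDocument();
        // Suppress the warnings that non-validating HTML tends to trigger.
        @$dom->loadHTML($this->markup);
        foreach ($dom->getElementsByTagName($tag) as $node) {
            if ($node->hasAttribute($attribute)) {
                $values[] = $node->getAttribute($attribute);
            }
        }
        return $values;
    }
}

// Extending: a subclass adds a new type just by defining _get_<type>().
class TitleCrawler extends SketchCrawler {
    protected function _get_titles() {
        preg_match_all('/<title>(.*?)<\/title>/is', $this->markup, $matches);
        return $matches[1];
    }
}

// Usage with a local temporary file, so no network access is needed.
$path = tempnam(sys_get_temp_dir(), 'crawl');
file_put_contents($path, '<html><head><title>Demo</title></head><body>'
    . '<a href="/about">About</a><img src="logo.png"></body></html>');

$crawler = new TitleCrawler($path);
$links   = $crawler->get('links');
$images  = $crawler->get('images');
$titles  = $crawler->get('titles');
unlink($path);
```

Because get() only looks for methods prefixed with _get_, callers cannot use it to reach arbitrary methods on the object, and the protected visibility is what lets TitleCrawler read $markup directly.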

Fetching Site Markup

The constructor will accept a…

