Version 10^-1.9 of MD_Extract pushed to github

I’ve completely changed the way the string representing the HTML is preprocessed before being fed to tidy, replacing both the function and the approach. The new function is not very elegant, but it fixes a bunch of bugs. It’s mostly character iteration and lots and lots of flags (old school style!!). After some quick browsing of the HTML parsing algorithm provided by the WHATWG, it got me wondering whether I shouldn’t just write my own parser (though it looks sort of hard and especially time consuming). I’ve also been looking at the source code of tidy. The other option would be to contribute to it and help update it to HTML 5, but it would take me some time to get to know the code base, the project seems to have been abandoned, and it might be too big for just one person to work on. Anyhow, I’m not promising anything so far.

I do understand that the library’s current approach (preprocessing and then sending to tidy) is not the most efficient one. However, there is another take on efficiency, and that’s economic efficiency. Except for really heavy-duty Microdata consumption, the library does fulfill its purpose, and the truth is that Microdata is a new spec that still has to be widely adopted, so performance is not a real concern right now. So the question is whether it makes sense to spend the next 3 months writing a parser from scratch when the one I have fits my needs (and probably those of 99.999% of the PHP developers who may use the library). So far I don’t see the point. But then again, my geeky side keeps bugging me to do it right.

Well, anyhow, if you find any bugs (and I’m sure there might be many, simply because there are very few microdata examples and I might be missing strange markup some user might come up with), please report them!! Other than that, my next post will be on why I believe microdata to be better than microformats, and I will probably also write a personal post that I sort of owe myself.

My first attempt at a Microdata Extractor.

I’ve just pushed version 10^-2 of MD_Extract to github. It’s my first attempt at a Microdata consumer.

I based the extraction algorithm on the one published by the WHATWG, though the implementation has some variations, mainly for clarity of code and also due to the particulars of it being done in PHP. I took Tab’s suggestion: it does a first pass through the HTML tree to collect references to elements with IDs, which makes the code so much clearer and nicer than what I was originally planning to do. In fact, I think the algorithm is beautiful (and it’s O(n), where n is the number of nodes in the HTML tree).
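A minimal sketch of that first pass, assuming PHP's DOM extension is used to hold the tree. The function name collect_ids and the sample markup are hypothetical and are not taken from MD_Extract; the point is only that every node is visited once, so the ID lookup table is built in O(n).

```php
<?php
// Hypothetical helper for illustration; this is not MD_Extract's actual code.
// Walks the tree once and records the first element seen for each id value.
function collect_ids(DOMNode $node, array &$ids): void
{
    if ($node instanceof DOMElement && $node->hasAttribute('id')) {
        $id = $node->getAttribute('id');
        if (!isset($ids[$id])) {
            $ids[$id] = $node;
        }
    }
    if ($node->hasChildNodes()) {
        foreach ($node->childNodes as $child) {
            collect_ids($child, $ids);
        }
    }
}

libxml_use_internal_errors(true); // keep parser warnings out of the output
$doc = new DOMDocument();
$doc->loadHTML('<div itemscope itemref="a"><p id="a" itemprop="name">Ann</p><span id="b">x</span></div>');

$ids = [];
collect_ids($doc, $ids);
echo implode(',', array_keys($ids)), "\n";
```

With the table built up front, resolving an itemref later becomes a plain array lookup instead of another walk over the tree.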

I have versioned it at V. 10^-2 because I have not found many examples to test it with, there are some anticipated problems with character encodings that are not supersets of ASCII, and there are a couple of little things I’d like to add. But as far as I know, regarding microdata syntax, it’s 100% compliant with the latest spec.

Version 0.7.2 of Extensible Microformat Parser Released

Maintenance release that fixes the issues caused by the changes in the PHP language. My apologies to the people who reported the issue: due to some google code settings, I was not receiving emails. Anyhow, other than maintenance and bug fixing, I won’t be maintaining the code anymore, since I find microdata to be a much better spec than microformats.

Version 0.7 of Extensible Microformat Parser Released

I’ve just officially uploaded to google code the new version of XMFP for download. This release adds transformation of the parsed microformat content into JSON and support for a wider array of microformats, and fixes a bunch of bugs and some design issues from the older version. It is basically the downloadable version of the changes that I’ve added to the SVN version since the last downloadable version.

I’ve also changed the License to an MIT License.

Using PDO in PHP 5 with MYSQL

Installation

You should first refer to the installation guide in PDO Installation.

In my particular case since I was intending to use this with the MySQL database, the configuration options I had to add were:
'--enable-pdo=shared' '--with-pdo-mysql=shared,/var/production/mysql' '--with-sqlite=shared' '--with-pdo-sqlite=shared'

And I also had to add the following lines to my php.ini file:
extension=pdo.so
extension=pdo_mysql.so

Testing the installation


//How to get which drivers are available for PDO
print_r( PDO::getAvailableDrivers() );

which shows
Array ( [0] => mysql )

Starting a PDO Connection

 

	//Starting a PDO Connection
	try {
		//Please fill this in with appropriate data for your Mysql Database, User and User Password.
		$dbh = new PDO('mysql:host=localhost;dbname=test', 'test_user', 'test_user_password');
	} catch(PDOException $e) {
    	echo $e->getMessage();
	}
	print_r($dbh);

Which shows
PDO Object ( )

You might get an exception there if your connection data is wrong, or if there is no driver for your DB engine.

For example if I try the following:


	//Starting a PDO Connection
	try {
		//Sending the wrong Password.
		$dbh = new PDO('mysql:host=localhost;dbname=test', 'test_user', 'invalid_password');
	} catch(PDOException $e) {
    	echo $e->getMessage();
	}

I get SQLSTATE[28000] [1045] Access denied for user 'test_user'@'localhost' (using password: YES), so you should use try-catch and write appropriate code for the event of the PDO connection failing.
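One way to keep that failure handling in a single place is a small wrapper around the constructor. This is only a sketch: connect_or_null is a hypothetical helper name, and the demo uses an in-memory SQLite DSN (assuming pdo_sqlite is installed) plus a deliberately unknown driver so that it runs without a MySQL server.

```php
<?php
// Hypothetical helper: returns a PDO handle, or null when the connection fails.
function connect_or_null(string $dsn, ?string $user = null, ?string $password = null): ?PDO
{
    try {
        return new PDO($dsn, $user, $password);
    } catch (PDOException $e) {
        // Log the driver message instead of echoing it to the visitor.
        error_log('PDO connection failed: ' . $e->getMessage());
        return null;
    }
}

// Assumption: pdo_sqlite is available, so this connection succeeds.
$ok = connect_or_null('sqlite::memory:');
// No such driver exists, so the constructor throws and we get null back.
$bad = connect_or_null('no_such_driver:dbname=test');

var_dump($ok instanceof PDO);
var_dump($bad === null);
```

For the MySQL setup used in this post, the call would be connect_or_null('mysql:host=localhost;dbname=test', 'test_user', 'test_user_password').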

Simple PDO Operations

For the purpose of these examples, I’m setting up the following table in a DB called test, and inserting some basic data.


 -- Create database test
 create database test;
 use test;
 CREATE TABLE `test_one` (
    `test_oneID` int(10) unsigned NOT NULL,
    `test_char` varchar(127) default NULL
 ) ENGINE=MyISAM DEFAULT CHARSET=latin1;

INSERT INTO `test_one` VALUES (1,'aaaa'),(2,'bbbb'),(3,'cccc'),(4,'dddd'),(5,'eeee');

Executing a simple statement


	//Preparing the statement
	$stm = $dbh->prepare("Select * from test_one");
	//Simple execution of the statement
	$stm->execute();
	print_r( $stm );

Shows PDOStatement Object ( [queryString] => Select * from test_one )

Getting the row count


	//Getting row count of the statement
	$count = $stm->rowCount();
	echo("Count: " . $count . "\n");

Shows Count: 5
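Note that rowCount() on a SELECT is not guaranteed to return the number of rows for every driver. A more portable way to get the count is to ask the database for it. The sketch below assumes pdo_sqlite is installed and uses an in-memory SQLite copy of the test_one table so that it runs without a MySQL server; with MySQL only the DSN and credentials would change.

```php
<?php
// Assumption: pdo_sqlite is available; an in-memory database stands in for MySQL.
$dbh = new PDO('sqlite::memory:');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$dbh->exec("CREATE TABLE test_one (test_oneID INTEGER NOT NULL, test_char VARCHAR(127))");
$dbh->exec("INSERT INTO test_one VALUES (1,'aaaa'),(2,'bbbb'),(3,'cccc'),(4,'dddd'),(5,'eeee')");

// Let the database do the counting instead of relying on rowCount().
$stm = $dbh->prepare("SELECT COUNT(*) FROM test_one");
$stm->execute();
$count = (int) $stm->fetchColumn(); // first column of the first row
$stm->closeCursor();

echo "Count: " . $count . "\n";
```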

Iterating through the rows of the PDO statement.


	//PDO::FETCH_ASSOC means it fetches an associative array.
	while ( $row = $stm->fetch( PDO::FETCH_ASSOC ) ) {
		print_r($row);
		echo("\n");
	}

Which gives:


Array ( [test_oneID] => 1 [test_char] => aaaa )
Array ( [test_oneID] => 2 [test_char] => bbbb )
Array ( [test_oneID] => 3 [test_char] => cccc )
Array ( [test_oneID] => 4 [test_char] => dddd )
Array ( [test_oneID] => 5 [test_char] => eeee )

It is very important to close the cursor to free up resources.


$stm->closeCursor();

Fetching statements into objects


	$stm = $dbh->prepare("Select * from test_one");
	//Simple execution of the statement
	$stm->execute();
	//PDO::FETCH_OBJ means it fetches an Object.
	while ( $row = $stm->fetch( PDO::FETCH_OBJ ) ) {
		print_r($row);
		echo("\n");
	}
	$stm->closeCursor();

By using PDO::FETCH_OBJ, each row is retrieved into a stdClass object, PHP’s generic empty class, with one public property per column.

What we get in return is:


stdClass Object ( [test_oneID] => 1 [test_char] => aaaa )
stdClass Object ( [test_oneID] => 2 [test_char] => bbbb )
stdClass Object ( [test_oneID] => 3 [test_char] => cccc )
stdClass Object ( [test_oneID] => 4 [test_char] => dddd )
stdClass Object ( [test_oneID] => 5 [test_char] => eeee )

Extracting to a predefined object

First we define a Mock object, and then we set the fetch mode to this object.


	//A Mock Object used for testing
	class Test {
		private $test_oneID;
		public $test_char;
		public $another_att ="predefined value";
		static $static_att = "An Static Attribute";
	}

	$stm = $dbh->prepare("Select * from test_one");
	$stm->execute();

	//Determining the Class to be fetched to.
	$stm->setFetchMode( PDO::FETCH_CLASS, 'Test');
	$row = $stm->fetch( PDO::FETCH_CLASS );
	print_r($row);
	$stm->closeCursor();

Which gives us the data of the first row inserted into an instance of our mock object:


Test Object
(
    [test_oneID:private] => 1
    [test_char] => aaaa
    [another_att] => predefined value
)

Note that there are two ways to do this. We could have done the same thing by changing:

$row = $stm->fetch( PDO::FETCH_CLASS );

to:

$row = $stm->fetchObject('Test');

Fetching a set of rows to a predefined object

We could have also fetched the full set of results to our mock class.


	$stm = $dbh->prepare("Select * from test_one");
	//For fetching a set of rows to objects.
	$stm->execute();
	while( $row = $stm->fetchObject('Test')) {
		print_r($row);
		echo("\n");
	}

Which will give us:


Test Object ( [test_oneID:private] => 1 [test_char] => aaaa [another_att] => predefined value )
Test Object ( [test_oneID:private] => 2 [test_char] => bbbb [another_att] => predefined value )
Test Object ( [test_oneID:private] => 3 [test_char] => cccc [another_att] => predefined value )
Test Object ( [test_oneID:private] => 4 [test_char] => dddd [another_att] => predefined value )
Test Object ( [test_oneID:private] => 5 [test_char] => eeee [another_att] => predefined value )
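If the whole result set is wanted at once, fetchAll() accepts the same PDO::FETCH_CLASS mode and returns an array of objects. The sketch below is an illustration under assumptions: it uses an in-memory SQLite database (pdo_sqlite required) in place of MySQL, and a hypothetical Test_Row class with an id() accessor that is not part of the examples above.

```php
<?php
// Hypothetical row class; PDO fills the private attribute as well as the public one.
class Test_Row {
    private $test_oneID;
    public $test_char;
    public function id() { return (int) $this->test_oneID; }
}

// Assumption: pdo_sqlite is available; an in-memory database stands in for MySQL.
$dbh = new PDO('sqlite::memory:');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$dbh->exec("CREATE TABLE test_one (test_oneID INTEGER NOT NULL, test_char VARCHAR(127))");
$dbh->exec("INSERT INTO test_one VALUES (1,'aaaa'),(2,'bbbb'),(3,'cccc'),(4,'dddd'),(5,'eeee')");

$stm = $dbh->prepare("SELECT * FROM test_one ORDER BY test_oneID");
$stm->execute();
// One Test_Row instance per row, all fetched in a single call.
$rows = $stm->fetchAll(PDO::FETCH_CLASS, 'Test_Row');

foreach ($rows as $row) {
    echo $row->id(), ' => ', $row->test_char, "\n";
}
```

Fetching everything at once is convenient for small result sets; for large ones the row-by-row loop keeps memory use lower.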

Bound PDO Statements

One of the big advantages of PDO is the use of bound (prepared) statements, because:

  • They prevent SQL injection, since bound values are never interpreted as SQL.
  • They can mean a speed increase when the same statement is executed many times, because the DB prepares the statement once before execution.

A simple example


	$stm2 = $dbh->prepare("Select * from test_one where test_oneID = :test_oneID");
	$stm2->bindValue(":test_oneID", 3);
	$stm2->execute();
	echo("\n");
	$row = $stm2->fetchObject('Test');
	print_r($row);

Which returns:


Test Object ( [test_oneID:private] => 3 [test_char] => cccc [another_att] => predefined value )
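Both advantages can be seen in a small sketch that prepares one statement and executes it several times, including once with an injection-style input. It assumes pdo_sqlite is installed and uses an in-memory SQLite copy of test_one; the $inputs list is made up for illustration.

```php
<?php
// Assumption: pdo_sqlite is available; an in-memory database stands in for MySQL.
$dbh = new PDO('sqlite::memory:');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$dbh->exec("CREATE TABLE test_one (test_oneID INTEGER NOT NULL, test_char VARCHAR(127))");
$dbh->exec("INSERT INTO test_one VALUES (1,'aaaa'),(2,'bbbb'),(3,'cccc'),(4,'dddd'),(5,'eeee')");

// Prepared once, executed once per input.
$stm = $dbh->prepare("SELECT test_oneID FROM test_one WHERE test_char = :test_char");

// The last input is an injection attempt; as a bound value it is just a string.
$inputs = ['bbbb', 'dddd', "aaaa' OR '1'='1"];
$found = [];
foreach ($inputs as $input) {
    $stm->bindValue(':test_char', $input, PDO::PARAM_STR);
    $stm->execute();
    $id = $stm->fetchColumn();            // false when no row matched
    $found[$input] = ($id === false) ? null : (int) $id;
    $stm->closeCursor();
}

print_r($found);
```

The injection string matches no row because it is compared as a literal value and never parsed as part of the query.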

Dealing with statement errors

Now let’s suppose the following code, where we are trying to fetch data with an invalid query (a non-existent table, with a stray space in its name):


	$sth = $dbh->prepare("Select * from non_existing table");
	$sth->execute();
	$row = $sth->fetch( PDO::FETCH_ASSOC );
	print_r($row);

Well, this code will produce no result at all. To access the PDO Statement error, we must explicitly address it.


	$arr = $sth->errorInfo();
	print_r($arr);

Which will show:

Array ( [0] => 00000 [1] => 1064 [2] => You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ‘table’ at line 1 )

Setting PDO to treat PDO Statements errors as exceptions

With the following line, we can set up the PDO object to treat all PDO Statement errors as exceptions:


$dbh->setAttribute(	PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

Now we can deal with statement errors as exceptions, for example:


	$dbh = new PDO('mysql:host=localhost;dbname=test', 'test_user', 'test_user_password');
	$dbh->setAttribute(	PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
	$sth = $dbh->prepare("Select * from test_onedfgsd");
	try {
		$sth->execute();
	} catch ( Exception $e) {
		var_dump($e);
	}

Which will give us:


object(PDOException)#3 (7) {
  ["message:protected"]=>
  string(92) "SQLSTATE[42S02]: Base table or view not found: 1146 Table 'test.test_onedfgsd' doesn't exist"
  ["string:private"]=>
  string(0) ""
  ["code:protected"]=>
  string(5) "42S02"
  ["file:protected"]=>
  string(36) "/home/www/ttiw/article.PDO_tests.php"
  ["line:protected"]=>
  int(231)
  ["trace:private"]=>
  array(1) {
    [0]=>
    array(6) {
      ["file"]=>
      string(36) "/home/www/ttiw/article.PDO_tests.php"
      ["line"]=>
      int(231)
      ["function"]=>
      string(7) "execute"
      ["class"]=>
      string(12) "PDOStatement"
      ["type"]=>
      string(2) "->"
      ["args"]=>
      array(0) {
      }
    }
  }
  ["errorInfo"]=>
  array(3) {
    [0]=>
    string(5) "42S02"
    [1]=>
    int(1146)
    [2]=>
    string(40) "Table 'test.test_onedfgsd' doesn't exist"
  }
}
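In practice, dumping the whole exception is rarely needed; the SQLSTATE code, the driver details and the message can be read individually. The sketch below assumes pdo_sqlite is installed and runs against an in-memory SQLite database, so the codes and message differ from the MySQL output above. Note that some drivers report the error already at prepare(), which is why that call sits inside the try block here.

```php
<?php
// Assumption: pdo_sqlite is available; an in-memory database stands in for MySQL.
$dbh = new PDO('sqlite::memory:');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$caught = false;
try {
    // Depending on the driver, the failure surfaces at prepare() or at execute().
    $sth = $dbh->prepare("SELECT * FROM test_onedfgsd");
    $sth->execute();
} catch (PDOException $e) {
    $caught = true;
    $sqlstate = $e->getCode();   // SQLSTATE code
    $info = $e->errorInfo;       // same layout as PDOStatement::errorInfo()
    echo "SQLSTATE: ", $sqlstate, "\n";
    echo "Message: ", $e->getMessage(), "\n";
}
```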

Well, that pretty much wraps up the basics of PDO. As a side note, instead of using try-catch on every PDO connection or PDO statement, I would suggest using a higher-level exception handler and writing some code there for PDO exceptions. For how to achieve this, see the set_exception_handler PHP function.
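A minimal sketch of such a handler, assuming PHP 7 or later (for Throwable). The function name describe_uncaught and the wording of the messages are hypothetical; the only PHP API relied on is set_exception_handler itself.

```php
<?php
// Hypothetical formatter: builds the log line for any uncaught exception.
function describe_uncaught(Throwable $e): string
{
    if ($e instanceof PDOException) {
        return 'Database error [' . $e->getCode() . ']: ' . $e->getMessage();
    }
    return 'Unhandled ' . get_class($e) . ': ' . $e->getMessage();
}

// Registered once at start-up; it runs only when no try-catch handled the exception.
set_exception_handler(function (Throwable $e): void {
    error_log(describe_uncaught($e));
    echo "Something went wrong, please try again later.\n";
});
```

The script still terminates after the handler runs, so try-catch remains the right tool wherever the code can actually recover from the error.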

Static (Class Definition) attributes on PHP 5

Static attributes (or methods) can be considered class definition attributes (or methods). They are not accessed from a class instance but directly from the definition of the class itself. As such, they can be accessed directly using the Class_Name::$attribute_name syntax, or within the class using the self::$attribute_name syntax.

The following are a couple of tests of this.

We create a simple class with a static attribute


class Static_Test {
	public static $counter = 0;
	function show_counter() {
		echo self::$counter;
	}
	function increase_counter() {
		self::$counter++;
	}
}

Accessing the static attribute from the class definition


echo Static_Test::$counter;

Results: 0

Accessing the static attribute within a method from an instance of the Class


$obj1 = new Static_Test;
$obj1->show_counter();

Results: 0

Modifying the attribute value


Static_Test::$counter++;
$obj1->show_counter(); //1

Results: 1


// $obj1->counter++; //Does not touch the static attribute.

Results: nothing. Static attributes can not be accessed through an instance with the -> operator, even when declared as public.


$obj1->increase_counter();
$obj1->show_counter(); //2

Results: 2. They can be accessed within methods if the syntax used in the method is appropriate.

Accessing the static attribute within a method from another instance of the Class


$obj2 = new Static_Test;
$obj2->increase_counter();

$obj1->show_counter(); //3

Results: 3. Changes done to a static attribute on an instance are reflected across all instances of the class.

A graphical representation of what we’ve done

Class Definition Attributes Graph

Accessing Static Attributes from a Child Class


class Static_Child_Test extends Static_Test {
	function increase_parent_counter() {
		parent::$counter++ ;
	}
}
$obj3 = new Static_Child_Test;
$obj3->increase_parent_counter();

$obj2->show_counter(); //4

Results: 4.
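The child class above shares the parent's counter because it does not redeclare it. As a hedged extension of the example (the class names are hypothetical, and static:: requires PHP 5.3 or later), this sketch shows what changes when a child does redeclare the static attribute: self:: keeps pointing at the class where the method was written, while static:: follows the class the method was called on.

```php
<?php
class Base_Counter {
    public static $counter = 0;
    // self:: always refers to Base_Counter, where this method is defined.
    public static function bump_self() { self::$counter++; }
    // static:: refers to the class the call was made on (late static binding).
    public static function bump_static() { static::$counter++; }
}

class Child_Counter extends Base_Counter {
    public static $counter = 100; // redeclared: a separate static attribute
}

Child_Counter::bump_self();   // increments Base_Counter::$counter
Child_Counter::bump_static(); // increments Child_Counter::$counter

echo Base_Counter::$counter, ' ', Child_Counter::$counter, "\n"; // prints "1 101"
```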

Accessing Static Methods

The same holds true for static methods. An example:


class Static_Test2 {
	public static function static_method() {
		echo 'This is an static Function. Called directly from the Class Definition and not from an Instance of the Class.';
	}
}

Static_Test2::static_method();

Notes on Attribute Visibility on PHP 5

A couple of years ago when I was studying PHP 5 OOP, I did a couple of scripts to test the different aspects of PHP’s particular object implementation. Since I’m currently migrating a whole bunch of PHP 4 scripts to PHP 5, I’ve decided to consult my notes, which I’m presenting here, plus some explanations for extra clarity.

These are my notes on attribute and method visibility within objects.

First we create a simple class with an attribute (variable) of each type of visibility:


class MyClass {
	public $public = 'public';
	protected $protected = 'protected';
	private $private = 'private';
	function printHello() {
		echo $this->public; echo ', ';
		echo $this->protected; echo ', ';
		echo $this->private;
	}
}

Accessing Attributes Directly


$obj = new MyClass;
echo $obj->public; //Works
//echo $obj->protected; //Fatal Error
//echo $obj->private; //Fatal Error

$obj->printHello(); //Prints Everything.

Access Conclusions

Public attributes can be accessed directly.

Protected and Private can not.

Accessibility in Child classes


class MyClass2 extends MyClass {
	function testPublic() {
		echo $this->public; //Prints "public".
	}
	function testProtected() {
		echo $this->protected; //Prints "protected".
	}
	function testPrivate() {
		echo $this->private; //Prints nothing: the parent's private attribute is not visible here.
	}
}
$obj2 = new MyClass2;
$obj2->printHello(); //Prints everything.
$obj2->testProtected(); //Prints "protected".
$obj2->testPrivate(); //Prints nothing.

Conclusions

Public and Protected attributes can be accessed in child objects.

Private can not.

Redeclaring (or Overriding ) attributes in Child classes


class MyClass3 extends MyClass {
	public $public = 'new public value';
	protected $protected = 'new protected value';
	private $private = 'new private value';
}
$obj3 = new MyClass3;
$obj3->printHello(); //Prints "new public value, new protected value, private"

Conclusions

Public and Protected attributes can be redeclared in child objects.

Private can not. A private attribute redeclared in the child is a separate attribute: methods inherited from the parent still see the parent’s value.
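A short sketch of why the private redeclaration seems to have no effect: the object ends up holding two separate private attributes with the same name, and each method sees the one that belongs to the class it was written in. The class names here are hypothetical.

```php
<?php
class Parent_Box {
    private $secret = 'parent secret';
    // Defined in the parent, so it reads the parent's private attribute.
    public function parent_view() { return $this->secret; }
}

class Child_Box extends Parent_Box {
    private $secret = 'child secret';
    // Defined in the child, so it reads the child's private attribute.
    public function child_view() { return $this->secret; }
}

$box = new Child_Box;
echo $box->parent_view(), ' / ', $box->child_view(), "\n"; // prints "parent secret / child secret"
```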

Final Conclusions

Note: The same holds true for object methods.

Encapsulation is the reason behind the separation of methods and attributes by levels of visibility. When refactoring an object, private methods and attributes can be modified freely, since they are only accessible within the object and never directly in the scripts or by inheritance in other objects. Protected makes things a little more complex: even though protected members can not be accessed directly, they can be accessed, redeclared or overridden by child classes. Lots of care needs to be taken when refactoring public methods or attributes.

Notes on Object Assignment under PHP 5

This is an extended explanation of the examples on Object Assignment (example 19-5) from OOP5 Basics on the PHP Manual.

The full script can be accessed here

First we create the simple class from the example.


class SimpleClass {
	//member declaration
	public $var = 'default value';
	//method declaration
	public function displayVar() {
		echo $this->var;
	}
}

Creating a New Instance


$instance = new SimpleClass();

What exactly does this do? It allocates a space in memory for the new object and then makes the variable $instance reference that space (representing the instance of the object).

Instance equals new object Graph

Assigning a Variable to an already created object


$assigned = $instance;

References the variable $assigned to the object in memory referenced by $instance.

Assigned equals instance Graph

Referencing a Variable to a variable that references an object


$reference =& $instance;

References the variable $reference not to the object in memory referenced by $instance but to the variable $instance itself.

Reference references instance Graph

Cloning an object


$cloned = clone $instance;

Allocates space in memory for a new instance of the object, copies all the values from the instance referenced by $instance, and makes the variable $cloned reference this new instance.

Cloned clones instance Graph
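One detail worth keeping in mind is that clone copies values one level deep: a member that itself holds an object is copied as a handle to the same object unless the class defines __clone. The classes in this sketch (Engine, Car, Deep_Car) are hypothetical and only illustrate that behaviour.

```php
<?php
class Engine {
    public $power = 100;
}

class Car {
    public $engine;
    public function __construct() { $this->engine = new Engine; }
}

class Deep_Car extends Car {
    // Called on the new copy right after cloning: give it its own Engine.
    public function __clone() { $this->engine = clone $this->engine; }
}

$car = new Car;
$copy = clone $car;
$copy->engine->power = 200;     // both cars share one Engine
echo $car->engine->power, "\n"; // prints 200

$deep = new Deep_Car;
$deepCopy = clone $deep;
$deepCopy->engine->power = 200;  // only the copy's Engine changes
echo $deep->engine->power, "\n"; // prints 100
```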

Full Picture up to now

If we dump all of this with var_dump (see script), we get the following:

Instance: object(SimpleClass)#1 (1) { ["var"]=> string(13) "default value" }
Assigned: object(SimpleClass)#1 (1) { ["var"]=> string(13) "default value" }
Referenced: object(SimpleClass)#1 (1) { ["var"]=> string(13) "default value" }
Cloned: object(SimpleClass)#2 (1) { ["var"]=> string(13) "default value" }

A good graphical representation of this is:

PHP 5Object Assignment Script State 1

Changing the value of a Member and then Dereferencing $Instance

Now, we’ll do a couple of changes so as to make all of this even clearer.


//Change variable in instance (will be changed in all but cloned)
$instance->var = 'New value';

//Dereference $instance.
$instance = NULL;

What we have just done is change the value of the member var in obj 1 and then dereference $instance from obj 1.

Instance equals null Graph

Full Final Picture of the script

Let’s see what happens now when we dump all of the variables (see script). Results:

Instance: NULL
Reference: NULL
Assigned: object(SimpleClass)#1 (1) { ["var"]=> string(9) "New value" }
Cloned: object(SimpleClass)#2 (1) { ["var"]=> string(13) "default value" }

A graphical representation of the final state of our script:

PHP 5Object Assignment Script State 2
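The same states can also be checked without var_dump output by comparing the variables: === is true only when both sides refer to the same object, while == is true when two objects of the same class hold equal values. This is a compact sketch of the whole walkthrough; the $same_object, $same_as_clone and $equal_to_clone variables are added here for illustration.

```php
<?php
class SimpleClass {
    public $var = 'default value';
}

$instance = new SimpleClass();
$assigned = $instance;       // second variable pointing at the same object
$reference =& $instance;     // alias of the variable $instance itself
$cloned = clone $instance;   // separate object with copied values

$same_object = ($assigned === $instance);  // true: one object, two variables
$same_as_clone = ($cloned === $instance);  // false: two distinct objects
$equal_to_clone = ($cloned == $instance);  // true: same class, equal values

$instance->var = 'New value';
$instance = null;            // also sets $reference, since it aliases $instance

var_dump($reference);        // NULL
var_dump($assigned->var);    // string(9) "New value"
var_dump($cloned->var);      // string(13) "default value"
```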
