Class Serialization in Perl

Submitted by esalazar on Thu, 12/27/2007 - 6:15pm.
Serialization in Perl is the process of converting an object, which may hold several kinds of data, into a scalar (a string of bytes). The string can then be saved to a file or transmitted across the Internet. In this article I describe the basics of creating a class in Perl and serializing it using the following packages: Data::Dumper, FreezeThaw, PHP::Serialization, and XML::Dumper.

Data types in Perl
Before we serialize anything, we first need to learn a bit about the data types in Perl. Perl has three basic data types: scalars, arrays, and hashes (hash tables). Below is an example of each.
$myScalar = 'This is a Scalar';
@myArray = ('Element zero','Element one','Element Two');
%myHash = ( keyOne => 'Data1', keyTwo => 'Data2');

print "Scalar Data: " . $myScalar . "\n";
print "Array element at index 1: " . $myArray[1] . "\n";
print "Hash at KeyOne: " . $myHash{'keyOne'} . "\n";
Output:
Scalar Data: This is a Scalar
Array element at index 1: Element one
Hash at KeyOne: Data1
As in C and C++, you can also have references to the data in each data type. A reference is not the data itself, but a location where the data can be found. Dereferencing syntax retrieves the data through the reference.
$scalarReference = \$myScalar;
$arrayReference = \@myArray;
$hashReference = \%myHash;

print "Scalar Reference: $scalarReference \n";
print "Array Reference:  $arrayReference \n";
print "Hash Reference:   $hashReference \n";

#Different ways to print the referenced data

print "Scalar Data: " . ${$scalarReference} . "\n"; #Or
print "Scalar Data: " . $$scalarReference . "\n";

print "Array element at index 1: " . ${$arrayReference}[1] . "\n"; #Or
print "Array element at index 1: " . $$arrayReference[1] . "\n";   #Or
print "Array element at index 1: " . $arrayReference->[1] . "\n";

print "Hash at KeyOne: " . ${$hashReference}{'keyOne'} . "\n";  #Or
print "Hash at KeyOne: " . $hashReference->{'keyOne'} . "\n";
Output:
Scalar Reference: SCALAR(0x8153600)
Array Reference:  ARRAY(0x8153630)
Hash Reference:   HASH(0x8153690)
Scalar Data: This is a Scalar
Scalar Data: This is a Scalar
Array element at index 1: Element one
Array element at index 1: Element one
Array element at index 1: Element one
Hash at KeyOne: Data1
Hash at KeyOne: Data1

Classes in Perl
Perl was not originally designed as an object-oriented language, but it can be used as one with the following techniques. A class in Perl is basically just a package whose constructor returns a reference to a hash containing the object's data. This reference is then “blessed”, which means that it is bound to the package so that the package's subroutines can be called on it as methods. Perl uses some syntactic sugar to make this process easy. Below is a test class in Perl that uses each of the data types. This class was created to test serialization with each package.
#!/usr/bin/perl
#####################################################
#Author:Evan Salazar
#This is just a simple Perl class that uses
#different data types to be used in serialization
#####################################################
package TestClass;
use strict; #Catches undeclared variables and other common mistakes

#The Constructor
sub new {
    print "Creating the Class \n";
    
    my $obj = {
        name => {firstName => 'none', lastName => 'none'}, #Test Hash  
        email => 'none@none.com', #Test Scalar
        skills => undef,         #Test array
        gpgKey => '' };                  #Binary Data
    bless($obj);

    return $obj;

}

#Set the Contact Name
sub setName {
    my $self = shift;
    $self->{'name'}->{'firstName'} = $_[0];
    $self->{'name'}->{'lastName'} = $_[1];
}


#Set the E-Mail
sub setEmail {
    my $self = shift;
    $self->{'email'} = $_[0];    
}

#Set the Skills
sub setSkills {
    my $self = shift;
    $self->{'skills'} = \@_;    
}

#Set the GPG Key
sub setGpgKey {
    my $self = shift;
    $self->{'gpgKey'} = $_[0];    
}

#Print all of the class Data
sub printAll {

    my $self = shift;
 
    #Print Full Name
    print "Name: " .
          $self->{'name'}->{'firstName'} .
          " " .
          $self->{'name'}->{'firstName'} . "\n";
    #Print E-Mail
    print "E-Mail: " . $self->{'email'} . "\n";
    
    #Print Skills
    print "Skills: ";
    for(my $i=0;$i<=$#{$self->{'skills'}};$i++) {
        print $self->{'skills'}[$i] . " ";
    }
    print "\n";
    print "GPG Key: " . unpack('H*',$self->{'gpgKey'}) . "\n";

}

1;
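
TestClass uses the one-argument form of bless, which always binds the object to the package the constructor was compiled in. A more conventional constructor takes the class name as its first argument and passes it to bless. The sketch below is an illustration only: the Contact class and its fields are hypothetical and are not part of the test class above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

{
    # Hypothetical class, used only for illustration
    package Contact;

    sub new {
        my ($class, %args) = @_;    # the class name arrives as the first argument
        my $self = {
            email  => exists $args{email}  ? $args{email}  : 'none@none.com',
            skills => exists $args{skills} ? $args{skills} : [],
        };
        return bless($self, $class);    # two-argument bless binds to $class
    }

    sub email {
        my $self = shift;
        return $self->{email};
    }
}

my $contact = Contact->new(email => 'someone@example.com', skills => ['Perl']);
print ref($contact), "\n";      # the class the object is blessed into
print $contact->email, "\n";
```

Calling the constructor as Contact->new(...) is what supplies the class name, and it lets a subclass inherit the same constructor.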

Perl Serialization Methods
Perl has no serialization syntax built into the language, so serialization is done with a module. After searching CPAN I found the following packages.
  • Data::Dumper - Serialize data into Perl code that can then be unserialized using the eval() procedure.
  • FreezeThaw - Converts objects to a string for data storage and retrieval.
  • PHP::Serialization - Serialize using a method that is compatible with the serialize() method in PHP.
  • XML::Dumper - Serialize to XML. Does not work with raw binary data.

Data::Dumper ships with Perl. The other packages can be installed by downloading the source code and building it, or by using the following 'cpan' commands.
sudo cpan FreezeThaw
sudo cpan PHP::Serialization
sudo cpan XML::Dumper

Serializing TestClass
Below is a program that serializes a TestClass object initialized with some sample data. The data is serialized using all four packages.
#!/usr/bin/perl
#####################################################
#Author: Evan Salazar
#Serialize The data in TestClass
#####################################################
use strict;
use Storable;
use PHP::Serialization;
use FreezeThaw;
use TestClass;
use XML::Dumper;
use Data::Dumper;


#Create the New Test Class
my $myclass = TestClass->new();

#Fill the Test Class with data
$myclass->setName('Evan','Salazar');
$myclass->setEmail('esalazar1981@gmail.com');
$myclass->setSkills('Perl','PHP','Java','C','C++');
#Comment out the next line to use XML::Dumper, which cannot handle raw binary data
$myclass->setGpgKey(pack('H*','11061398fe828dcd83a4b9a79594c399')); #Not really my key but 128bits of random data



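#Open files for writing serialized data
open(PHPSER,     ">phpser")        or die "Cannot open phpser: $!";
open(FREEZETHAW, ">freezeThaw")    or die "Cannot open freezeThaw: $!";
open(XMLDUMPER,  ">xmldumper.xml") or die "Cannot open xmldumper.xml: $!";
open(DATADUMPER, ">dataDumper.pl") or die "Cannot open dataDumper.pl: $!";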


#Serialize Using Data::Dumper
print "Data::Dumper\n";
print DATADUMPER Data::Dumper::Dumper($myclass);

#Serialize Using FreezeThaw
print "FreezeThaw\n";
print FREEZETHAW FreezeThaw::freeze($myclass);

#Serialize Using PHP::Serialization
print "PHP::Serialization\n";
print PHPSER PHP::Serialization::serialize($myclass);

#Serialize Using XML::Dumper
print "XML::Dumper\n";
my $dump = XML::Dumper->new;
print XMLDUMPER $dump->pl2xml($myclass);

This is the output from each method:

dataDumper.pl
$VAR1 = bless( { 'email' => 'esalazar1981@gmail.com', 'skills' => [ 'Perl', 'PHP', 'Java', 'C', 'C++' ], 'name' => { 'firstName' => 'Evan', 'lastName' => 'Salazar' }, 'gpgKey' => '���������Ù' }, 'TestClass' );
FreezeThaw
FrT;@1|>>0|$9|TestClass%8|$5|email$6|gpgKey$4|name$6|skills$22|esalazar1981@gmail.com$16|���������Ù%4|$9|firstName$8|lastName$4|Evan$7|Salazar@5|$4|Perl$3|PHP$4|Java$1|C$3|C++
phpser
O:9:"TestClass":4:{s:5:"email";s:22:"esalazar1981@gmail.com";s:6:"skills";a:5:{i:0;s:4:"Perl";i:1;s:3:"PHP";i:2;s:4:"Java";i:3;s:1:"C";i:4;s:3:"C++";}s:4:"name";a:2:{s:9:"firstName";s:4:"Evan";s:8:"lastName";s:7:"Salazar";}s:6:"gpgKey";s:16:"���������Ù";}
xmldumper.xml
<perldata> <hashref blessed_package="TestClass" memory_address="0x8314de4"> <item key="email">esalazar1981@gmail.com</item> <item key="gpgKey">���������Ù</item> <item key="name"> <hashref memory_address="0x8152c28"> <item key="firstName">Evan</item> <item key="lastName">Salazar</item> </hashref> </item> <item key="skills"> <arrayref memory_address="0x823bcf0"> <item key="0">Perl</item> <item key="1">PHP</item> <item key="2">Java</item> <item key="3">C</item> <item key="4">C++</item> </arrayref> </item> </hashref> </perldata>

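The binary key shows up as unprintable characters in the files above. For Data::Dumper, one way to keep the output printable is its Useqq option, which writes strings in double quotes with non-printable bytes escaped. The sketch below is a minimal illustration using a plain hash in place of TestClass; the field values are sample data.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

# Sample data with a 16-byte binary value, as in the test class
my $data = {
    email  => 'someone@example.com',
    gpgKey => pack('H*', '11061398fe828dcd83a4b9a79594c399'),
};

my $dumper = Data::Dumper->new([$data]);
$dumper->Useqq(1);       # escape non-printable bytes inside double quotes
$dumper->Sortkeys(1);    # stable key order between runs
$dumper->Indent(1);      # compact indentation
my $text = $dumper->Dump;
print $text;

# Read the data back; eval should only be used on data you produced yourself
my $VAR1;
eval $text;
die "eval failed: $@" if $@;
print "GPG Key: " . unpack('H*', $VAR1->{gpgKey}) . "\n";
```

The escaped form is still valid Perl, so the same eval() approach restores the original bytes.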
Unserializing the Data
Below is the program that unserializes the data written by the previous program. Note that binary data cannot be unserialized using XML::Dumper. If you need to serialize binary data with this package, consider first encoding it with uuencode or Base64.
#!/usr/bin/perl
#####################################################
#Author: Evan Salazar
#Unserialize the data in TestClass
#####################################################
use strict;
use Storable;
use PHP::Serialization;
use FreezeThaw;
use TestClass;
use XML::Dumper;
use Data::Dumper;

#Set Slurp mode for reading
local($/) = undef;


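#Open files for reading serialized data
open(PHPSER,     "phpser")        or die "Cannot open phpser: $!";
open(FREEZETHAW, "freezeThaw")    or die "Cannot open freezeThaw: $!";
open(XMLDUMPER,  "xmldumper.xml") or die "Cannot open xmldumper.xml: $!";
open(DATADUMPER, "dataDumper.pl") or die "Cannot open dataDumper.pl: $!";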

#Unserialize Using Data::Dumper
print "Data::Dumper\n";
my $dataDumper = <DATADUMPER>;
#print $dataDumper
my $VAR1;
eval($dataDumper); #Data is stored into variable $VAR1; only eval data you trust
die "Could not eval dataDumper.pl: $@" if $@;
$VAR1->printAll;
print "\n";

#Unserialize Using FreezeThaw
print "FreezeThaw\n";
my $freezeThaw = <FREEZETHAW>;
#print $freezeThaw;
my ($classFreeze) = FreezeThaw::thaw($freezeThaw);
$classFreeze->printAll;
print "\n";


#Unserialize Using PHP::Serialization
print "PHP::Serializatoin\n";
my $phpser = <PHPSER>;
#print $phpser;
my $classPHP = PHP::Serialization::unserialize($phpser);
bless($classPHP,'TestClass');
$classPHP->printAll;
print "\n";


#Unserialize Using XML::Dumper
print "XML::Dumper\n";
my $xmlDumper = <XMLDUMPER>;
my $dump = XML::Dumper->new;
my $xmlClass = $dump->xml2pl($xmlDumper);
$xmlClass->printAll;

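Program Output
(The outputs in this section were captured with printAll printing firstName in both positions, which is why the name reads "Evan Evan". With lastName in the second position it reads "Evan Salazar".)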
Data::Dumper
Name: Evan Evan
E-Mail: esalazar1981@gmail.com
Skills: Perl PHP Java C C++
GPG Key: 11061398fe828dcd83a4b9a79594c399

FreezeThaw
Name: Evan Evan
E-Mail: esalazar1981@gmail.com
Skills: Perl PHP Java C C++
GPG Key: 11061398fe828dcd83a4b9a79594c399

PHP::Serializatoin
Name: Evan Evan
E-Mail: esalazar1981@gmail.com
Skills: Perl PHP Java C C++
GPG Key: 11061398fe828dcd83a4b9a79594c399

XML::Dumper

not well-formed (invalid token) at line 4, column 21, byte 148 at /usr/lib/perl5/XML/Parser.pm line 187
If you remove the binary data you will get:
Data::Dumper
Name: Evan Evan
E-Mail: esalazar1981@gmail.com
Skills: Perl PHP Java C C++
GPG Key:

FreezeThaw
Name: Evan Evan
E-Mail: esalazar1981@gmail.com
Skills: Perl PHP Java C C++
GPG Key:

PHP::Serializatoin
Name: Evan Evan
E-Mail: esalazar1981@gmail.com
Skills: Perl PHP Java C C++
GPG Key:

XML::Dumper
Name: Evan Evan
E-Mail: esalazar1981@gmail.com
Skills: Perl PHP Java C C++
GPG Key:
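
The XML::Dumper failure above comes from raw bytes that are not valid in an XML document. A minimal sketch of the Base64 workaround, using the core MIME::Base64 module, is shown below. It only demonstrates the encoding round trip. Passing the encoded value to setGpgKey before calling pl2xml, and decoding it after xml2pl, is an assumed usage that was not tested for this article.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64 decode_base64);

# The same 16 bytes of sample binary data used in the test program
my $key = pack('H*', '11061398fe828dcd83a4b9a79594c399');

# The empty second argument suppresses the newline that encode_base64 appends by default
my $encoded = encode_base64($key, '');
print "Encoded: $encoded\n";

# Decode after unserializing to recover the original bytes
my $decoded = decode_base64($encoded);
print "Decoded: " . unpack('H*', $decoded) . "\n";
```

The encoded value contains only letters, digits, '+', '/' and '=', so it is safe to place inside an XML element.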

Conclusion
All of these methods worked well for the given class, although binary-to-ASCII encoding is necessary for XML serialization. I personally prefer XML serialization because the data can be used with a wide variety of languages and is human-readable.
Submitted by Sasha on Fri, 12/28/2007 - 2:33am.

1. It is called 'scalar', not 'scaler'.
2. not '@myArray[1]' but $myArray[1]
3. Best module to use for serialization is Storable, and it is in core Perl along with Data::Dumper. You don't need to install them.
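
The Storable suggestion above can be sketched as follows. This is a minimal example, not a benchmark or a recommendation: the Contact class is hypothetical and stands in for TestClass, and only Storable's freeze and thaw functions are used.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Storable qw(freeze thaw);

{
    # Hypothetical class standing in for TestClass
    package Contact;
    sub new   { my ($class, %args) = @_; return bless({ %args }, $class); }
    sub email { my $self = shift; return $self->{email}; }
}

my $original = Contact->new(
    email  => 'someone@example.com',
    gpgKey => pack('H*', '11061398fe828dcd83a4b9a79594c399'),   # binary data
);

my $frozen = freeze($original);   # a scalar holding the serialized bytes
my $copy   = thaw($frozen);       # a new object with the same class and data

print ref($copy), "\n";
print $copy->email, "\n";
print "GPG Key: " . unpack('H*', $copy->{gpgKey}) . "\n";
```

Storable's format is binary and specific to Perl, so unlike the XML output it is not meant to be read by people or by other languages.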

Submitted by esalazar on Fri, 12/28/2007 - 6:38pm.

I apologize for my misspelling of 'scalar', this has been corrected.

I have found that both '@myArray[1]' and '$myArray[1]' work the same. It would be nice to know the advantages of using one versus the other.

I have not used Storable, but I will try it out.

I seem to recall having to install Data::Dumper, I suppose that it could be in core Perl. I will look into it.

Submitted by Sasha on Sat, 12/29/2007 - 5:08pm.

@myArray[1] is a list (a slice). You can write @myArray[1,2] to get a list of the second and third elements. $myArray[1] is the second element and is a scalar.

Data::Dumper has been in core since Perl 5.005.
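
The distinction described above can be seen in a short sketch that reuses the array from the article. The variable names $element, @slice and $fromSlice are only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @myArray = ('Element zero', 'Element one', 'Element Two');

my $element = $myArray[1];        # a single element: a scalar
my @slice   = @myArray[1, 2];     # a slice: a list of elements

print "Element: $element\n";
print "Slice:   @slice\n";

# A slice evaluated in scalar context yields its last element
my $fromSlice = @myArray[0, 1];
print "From slice: $fromSlice\n";
```

With warnings enabled, Perl also reports a one-element slice such as @myArray[1] as better written $myArray[1].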

Submitted by Anonymous Coward on Thu, 01/24/2008 - 1:27am.

my @array = ('one', 'two', 'three');
my $y = @array[0];
my $x = $array[0];

# $y eq 'one' as well (a slice in scalar context yields its last element)
# $x eq 'one'

# P.S.
# What's with the CAPTCHA? It's so trivial as to be no deterrent at all.
# It would be far harder to write the spam bot than to solve it.
# P. P. S.
# I'm not implying that writing a spam bot is hard.

Submitted by esalazar on Sun, 01/27/2008 - 7:19pm.

You're right about the difference between @array[0] and $array[0]. For the printing above, both work because @array[0] produces a one-element slice, but $array[0] would be more "correct".

We have a trivial CAPTCHA just to block 99.9% of spam bots. Our page is not yet popular enough for anyone to waste their time writing a custom spam bot for the sole purpose of spamming us. This is based on the philosophy of www.codinghorror.com, which uses a far more trivial CAPTCHA with much more success.

Submitted by Anonymous but not a coward(that was rude) on Fri, 08/22/2008 - 8:58am.

What I would like to ask is what happens if you serialize the object and deserialize it again,
say to a variable xyz. Can I use this xyz variable to call any of the subroutines of the class? I am asking because I have seen that once you deserialize it and try to use it, it fails, but once you load the module in your code it works. Could you explain the logic behind it if you know?

Submitted by esalazar on Mon, 08/25/2008 - 7:49am.

The subroutines are stored inside the module. Only the data for the object is serialized. That is why, once you deserialize, you need to load the module and bless the object with the class name so that you can use the module's subroutines with the object.
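
A small sketch of this point, assuming a hypothetical Contact class in place of TestClass: the serialized text names the class but carries none of its code, so method calls only work when the package is loaded in the program that deserializes.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

{
    # Hypothetical class standing in for TestClass
    package Contact;
    sub new   { my ($class, %args) = @_; return bless({ %args }, $class); }
    sub email { my $self = shift; return $self->{email}; }
}

my $text = Data::Dumper->new([ Contact->new(email => 'someone@example.com') ])->Dump;

# The serialized text records the class name but none of its subroutines
print $text;

my $VAR1;
eval $text;
die "eval failed: $@" if $@;

# Methods are looked up at call time in the package the object is blessed into
print ref($VAR1), "\n";
print $VAR1->can('email')   ? "email: found\n"   : "email: missing\n";
print $VAR1->can('missing') ? "missing: found\n" : "missing: not defined in Contact\n";

# Plain data with no class attached needs an explicit bless before method calls
my $plain = { email => 'other@example.com' };
bless($plain, 'Contact');
print $plain->email, "\n";
```

If the Contact package were not defined in this program, the eval would still rebuild the data, but the call to email would fail because Perl could not find the method.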
