Convert::BER - ASN.1 Basic Encoding Rules
use Convert::BER;
$ber = new Convert::BER;
$ber->encode( INTEGER => 1, SEQUENCE => [ BOOLEAN => 0, STRING => "Hello", ], REAL => 3.7, );
$ber->decode( INTEGER => \$i, SEQUENCE => [ BOOLEAN => \$b, STRING => \$s, ], REAL => \$r, );
Convert::BER provides an OO interface to encoding and decoding data
using the ASN.1 Basic Encoding Rules (BER), a platform-independent way
of encoding structured binary data together with the structure.
new creates a new Convert::BER object.
pos returns the offset where the last decode finished, or the last
offset set by pos. If POS is specified then POS will be where the
next decode starts.
If encode or decode returns undef, check this.
Writes to the filehandle FH, or STDERR if not specified. The
output contains the hex dump of each element, and an ASN.1-like text
representation of that element.
Writes to the filehandle FH, or STDERR if not specified. The
output is hex with the possibly-printable text alongside.
An opList is a list of operator-value pairs. An operator can
be any of those defined below, or any defined by sub-classing
Convert::BER, which will probably be derived from the primitives
given here.
The values depend on whether BER is being encoded or decoded:
These operators encode and decode the basic primitive types defined by BER.
A BOOLEAN value is either true or false.
    # Encode a TRUE value
    $ber->encode( BOOLEAN => 1, ) or die;

    # Decode a boolean value into $bval
    $ber->decode( BOOLEAN => \$bval, ) or die;
An INTEGER value is either a positive whole number, or a negative
whole number, or zero. Numbers can either be native perl integers, or
values of the Math::BigInt class.
$ber->encode( INTEGER => -123456, ) or die;
$ber->decode( INTEGER => \$ival, ) or die;
This is an OCTET STRING, which is an arbitrarily long binary value.
$ber->encode( STRING => "\xC0First character is hex C0", ) or die;
$ber->decode( STRING => \$sval, ) or die;
There is no value for NULL. You often use NULL in ASN.1 when you want to denote that something else is absent rather than just not encoding the 'something else'.
$ber->encode( NULL => undef, ) or die;
$ber->decode( NULL => \$nval, ) or die;
An OBJECT_ID value is an OBJECT IDENTIFIER (also called an OID). This is a hierarchically structured value that is used in protocols to uniquely identify something. For example, SNMP (the Simple Network Management Protocol) uses OIDs to denote the information being requested, and LDAP (the Lightweight Directory Access Protocol, RFC 2251) uses OIDs to denote each attribute in a directory entry.
Each level of the OID hierarchy is either zero or a positive integer.
    $ber->encode(
        OBJECT_ID => '2.5.4.0',   # LDAP objectClass
    ) or die;
$ber->decode( OBJECT_ID => \$oval, ) or die;
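As a rough illustration of what an OID looks like once encoded, the sketch below builds the content octets (without tag or length) in plain Perl, following the X.690 rule that the first two levels are combined as 40 * first + second and every value is written in base 128. The helper names sketch_base128 and sketch_oid_content are hypothetical and are not part of Convert::BER; how the module itself performs this step is not shown here.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Convert::BER): base-128 encoding with the
# high bit set on every octet except the last.
sub sketch_base128 {
    my $n = 0 + $_[0];                      # force numeric context
    my @octets = ($n & 0x7F);               # last octet, high bit clear
    while ($n >>= 7) {
        unshift @octets, 0x80 | ($n & 0x7F);
    }
    return pack('C*', @octets);
}

# Hypothetical helper: content octets of an OBJECT IDENTIFIER.
sub sketch_oid_content {
    my ($oid) = @_;
    my ($first, $second, @rest) = split /\./, $oid;
    # The first two levels share a single value: 40 * first + second.
    return join '', map { sketch_base128($_) } (40 * $first + $second, @rest);
}

print unpack('H*', sketch_oid_content('2.5.4.0')), "\n";          # 550400
print unpack('H*', sketch_oid_content('1.2.840.113549')), "\n";   # 2a864886f70d
```

Small levels fit in one octet each; a level of 128 or more spills into several octets, as the second example shows.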
The ENUMERATED type is effectively the same as the INTEGER type. It exists so that friendly names can be assigned to certain integer values. To be useful, you should sub-class this operator.
The BIT STRING type is an arbitrarily long string of bits - 0's and 1's.
When encoding, the value is a string of 0 and 1 characters. As
these are packed into 8-bit octets when encoding and there may not be
a multiple of 8 bits to be encoded, trailing padding bits are added in
the encoding.
$ber->encode( BIT_STRING => '0011', ) or die;
When decoding, the value is set to a string of 0 and 1 characters. The
string will have the same number of bits as were encoded (the padding
bits are ignored).
$ber->decode( BIT_STRING => \$bval, ) or die;
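The padding described above can be sketched in plain Perl. Under X.690 the content of a BIT STRING starts with one octet giving the number of unused (padding) bits in the final octet, followed by the packed bits. The helpers sketch_bits_content and sketch_bits_decode are hypothetical names used only for this illustration; they are not Convert::BER functions.

```perl
use strict;
use warnings;

# Hypothetical helper: content octets of a BIT STRING built from a string
# of '0' and '1' characters.
sub sketch_bits_content {
    my ($bits) = @_;
    my $unused = (8 - length($bits) % 8) % 8;   # padding bits in the last octet
    # pack 'B*' fills each octet from the most significant bit and
    # zero-pads the final octet.
    return chr($unused) . pack('B*', $bits);
}

# Hypothetical helper: recover the '0'/'1' string, dropping the padding bits.
sub sketch_bits_decode {
    my ($content) = @_;
    my $unused = ord(substr($content, 0, 1));
    my $bits   = unpack('B*', substr($content, 1));
    return substr($bits, 0, length($bits) - $unused);
}

my $content = sketch_bits_content('0011');
print unpack('H*', $content), "\n";          # 0430
print sketch_bits_decode($content), "\n";    # 0011
```

Four bits need four padding bits to fill an octet, so the leading count is 04 and the packed octet is 0x30; decoding strips the padding again.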
This is a variation of the BIT_STRING operator, which is optimized for writing bit strings whose length is a multiple of 8 bits. You can use the BIT_STRING operator to decode BER encoded with the BIT_STRING8 operator (and vice versa).
0 and 1 characters.
$ber->encode( BIT_STRING8 => pack('B8', '10110101'), ) or die;
$ber->decode( BIT_STRING8 => \$bval, ) or die;
The REAL type encodes a floating-point number. It requires the POSIX module.
$ber->encode( REAL => 3.14159265358979, ) or die;
$ber->decode( REAL => \$rval, );
The ObjectDescriptor type encodes an ObjectDescriptor string. It is a
sub-class of STRING.
The UTF8String type encodes a string encoded in UTF-8. It is a
sub-class of STRING.
The NumericString type encodes a NumericString, which is defined to
only contain the characters 0-9 and space. It is a sub-class of
STRING.
The PrintableString type encodes a PrintableString, which is defined
to only contain the characters A-Z, a-z, 0-9, space, and the
punctuation characters ()-+=:',./?. It is a sub-class of STRING.
The TeletexString type encodes a TeletexString, which is a string
containing characters according to the T.61 character set. Each T.61
character may be one or more bytes wide. It is a sub-class of
STRING.
T61String is an alternative name for TeletexString.
The VideotexString type encodes a VideotexString, which is a
string. It is a sub-class of STRING.
The IA5String type encodes an IA5String. IA5 (International Alphabet
5) is equivalent to US-ASCII. It is a sub-class of STRING.
The UTCTime type encodes a UTCTime value. Note this value only
represents years using two digits, so it is not recommended in
Y2K-compliant applications. It is a sub-class of STRING.
UTCTime values must be strings like:
    yymmddHHMM[SS]Z
or:
    yymmddHHMM[SS]sHHMM
Where yy is the year, mm is the month (01-12), dd is the day (01-31), HH is the hour (00-23), MM is the minutes (00-59). SS is the optional seconds (00-61).
The time is either terminated by the literal character Z, or a timezone offset. The "Z" character indicates Zulu time or UTC. The timezone offset specifies the sign s, which is + or -, and the difference in hours and minutes.
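Since UTCTime values are passed as plain strings, it can help to check their shape before encoding. The sketch below is one possible check in plain Perl; sketch_is_utctime is a hypothetical helper, not something Convert::BER provides. It tests the layout only (not calendar validity), assumes minutes run 00-59, and accepts any two digits for the optional seconds.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the string matches one of the two
# UTCTime layouts, 0 otherwise.
sub sketch_is_utctime {
    my ($s) = @_;
    return $s =~ /\A
        \d{2}                       # yy
        (?:0[1-9]|1[0-2])           # mm
        (?:0[1-9]|[12]\d|3[01])     # dd
        (?:[01]\d|2[0-3])           # HH
        [0-5]\d                     # MM (assumed 00-59)
        (?:\d{2})?                  # optional SS
        (?:Z|[+-]\d{4})             # Z, or a signed HHMM offset
    \z/x ? 1 : 0;
}

print sketch_is_utctime('991231235959Z'), "\n";     # 1
print sketch_is_utctime('9912312359+0100'), "\n";   # 1
print sketch_is_utctime('991331235959Z'), "\n";     # 0 (month 13)
```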
The GeneralizedTime type encodes a GeneralizedTime value. Unlike
UTCTime it represents years using 4 digits, so is Y2K-compliant. It
is a sub-class of STRING.
GeneralizedTime values must be strings like:
    yyyymmddHHMM[SS][.U][Z]
or:
    yyyymmddHHMM[SS][.U]sHHMM
Where yyyy is the year, mm is the month (01-12), dd is the day (01-31), HH is the hour (00-23), MM is the minutes (00-59). SS is the optional seconds (00-61). U is the optional fractional seconds value; a comma is permitted instead of a dot before this value.
The time may be terminated by the literal character Z, or a timezone offset. The "Z" character indicates Zulu time or UTC. The timezone offset specifies the sign s, which is + or -, and the difference in hours and minutes. If there is no timezone specified, UTC is assumed.
The GraphicString type encodes a GraphicString value. It is a
sub-class of STRING.
The VisibleString type encodes a VisibleString value, which is a value
using the ISO646 character set. It is a sub-class of STRING.
ISO646String is an alternative name for VisibleString.
The GeneralString type encodes a GeneralString value. It is a
sub-class of STRING.
The UniversalString type encodes a UniversalString value, which is a
value using the ISO10646 character set. Each character in ISO10646 is
4-bytes wide. It is a sub-class of STRING.
CharacterString is an alternative name for UniversalString.
The BMPString type encodes a BMPString value, which is a value using
the Unicode character set. Each character in the Unicode character set
is 2-bytes wide. It is a sub-class of STRING.
These operators are used to build constructed types, which contain values of different types, like a C structure.
A SEQUENCE is a complex type that contains other types, a bit like a C structure. Elements inside a SEQUENCE are encoded and decoded in the order given.
$ber->encode( SEQUENCE => [ INTEGER => 123, BOOLEAN => [ 1, 0 ], ] ) or die;
$ber->decode( SEQUENCE => [ INTEGER => \$ival, BOOLEAN => \@bvals, ] ) or die;
A SET is a complex type that contains other types, rather like a SEQUENCE. Elements inside a SET may be present in any order.
$ber->encode( SET => [ INTEGER => 13, STRING => 'Hello', ] ) or die;
$ber->decode( SET => [ STRING => \$sval, INTEGER => \$ival, ] ) or die;
A SEQUENCE_OF is an ordered list of other types.
The remaining opList will then usually contain values which are code references. If the ref is to a list, then the contents of that item in the list are passed as the only argument to the code reference. If the ref is to a hash, then only the key is passed to the code.
    @vals = ( [ 10, 'Foo' ], [ 20, 'Bar' ] ); # List of refs to lists
    $ber->encode(
        SEQUENCE_OF => [ \@vals,
            SEQUENCE => [
                INTEGER => sub { $_[0][0] }, # Passed a ref to the inner list
                STRING  => sub { $_[0][1] }, # Passed a ref to the inner list
            ]
        ]
    ) or die;

    %hash = ( 40 => 'Baz', 30 => 'Bletch' ); # Just a hash
    $ber->encode(
        SEQUENCE_OF => [ \%hash,
            SEQUENCE => [
                INTEGER => sub { $_[0] },        # Passed the key
                STRING  => sub { $hash{$_[0]} }, # Passed the key
            ]
        ]
    );
    $ber->decode(
        SEQUENCE_OF => [ \$count,
            # In the following subs, make space at the end of an array, and
            # return a reference to that newly created space.
            SEQUENCE => [
                INTEGER => sub { $ival[$_[0]] = undef; \$ival[-1] },
                STRING  => sub { $sval[$_[0]] = undef; \$sval[-1] },
            ]
        ]
    ) or die;
A SET_OF is an unordered list. This is treated in an identical way to a SEQUENCE_OF, except that no ordering should be inferred from the list passed or returned.
It is sometimes useful to construct or deconstruct BER encodings in several pieces. The BER operator lets you do this.
When encoding, the value is a Convert::BER object, which will be
inserted into the buffer. If value is undefined then nothing is
added.
    $tmp->encode(
        SEQUENCE => [
            INTEGER => 20,
            STRING  => 'Foo',
        ]
    );
    $ber->encode(
        BER     => $tmp,
        BOOLEAN => 1
    );
When decoding, the value is a reference which receives a Convert::BER
object. This object will contain the remainder of the
current sequence or set being decoded.
    # After this, ber2 will contain the encoded INTEGER and STRING.
    # sval will be ignored and left undefined, but bval will be decoded. The
    # decode of ber2 will return the integer and string values.
    $ber->decode(
        SEQUENCE => [
            BER    => \$ber2,
            STRING => \$sval,
        ],
        BOOLEAN => \$bval,
    );
    $ber2->decode(
        INTEGER => \$ival,
        STRING  => \$sval2,
    );
This is like the BER operator except that when decoding only the
next item is decoded and placed into the Convert::BER object
returned. There is no difference when encoding.
When decoding, the value is a reference which receives a Convert::BER
object. This object will only contain the next single
item in the current sequence being decoded.
    # After this, ber2 will decode further, and ival and sval
    # will be decoded.
    $ber->decode(
        INTEGER => \$ival,
        ANY     => \$ber2,
        STRING  => \$sval,
    );
This operator allows you to specify that an element is absent from the encoding.
    $ber->encode(
        SEQUENCE => [
            INTEGER  => 16,           # Will be encoded
            OPTIONAL => [
                INTEGER => undef,     # Will not be encoded
            ],
            STRING   => 'Foo',        # Will be encoded
        ]
    );

    $ber->decode(
        SEQUENCE => [
            INTEGER  => \$ival1,
            OPTIONAL => [
                INTEGER => \$ival2,
            ],
            STRING   => \$sval,
        ]
    );
The opList is a list of alternate operator-value pairs. Only one will be encoded, and only one will be decoded.
    # Encode the BMPString alternate of the CHOICE
    $ber->encode(
        CHOICE => [ 2,
            PrintableString => 'Printable',
            TeletexString   => 'Teletex/T61',
            BMPString       => 'BMP/Unicode',
            UniversalString => 'Universal/ISO10646',
        ]
    ) or die;

    # Decode the above.
    # Afterwards, $alt will be set to 2, $str will be set to 'BMP/Unicode'.
    $ber->decode(
        CHOICE => [ \$alt,
            PrintableString => \$str,
            TeletexString   => \$str,
            BMPString       => \$str,
            UniversalString => \$str,
        ]
    ) or die;
In BER everything being encoded has a tag, a length, and a value. Normally the tag is derived from the operator - so INTEGER has a different tag from a BOOLEAN, for instance.
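To make the tag, length, value layout concrete, here is a minimal plain-Perl sketch that assembles two small elements by hand. The helpers sketch_length and sketch_tlv are hypothetical names for this illustration only; they are not Convert::BER functions, and the module's internals may differ. The tag numbers (0x02 for INTEGER, 0x04 for OCTET STRING) are the universal tags from X.690.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Convert::BER): encode a definite length.
sub sketch_length {
    my ($len) = @_;
    return chr($len) if $len < 0x80;             # short form: a single octet
    my $bytes = '';
    while ($len > 0) {                           # long form: big-endian octets
        $bytes = chr($len & 0xFF) . $bytes;
        $len >>= 8;
    }
    return chr(0x80 | length($bytes)) . $bytes;  # 0x80 | count of length octets
}

# Hypothetical helper: wrap already-encoded content in a tag and a length.
sub sketch_tlv {
    my ($tag, $content) = @_;
    return chr($tag) . sketch_length(length($content)) . $content;
}

my $int = sketch_tlv(0x02, chr(5));     # INTEGER 5
my $str = sketch_tlv(0x04, "Hello");    # OCTET STRING "Hello"

print unpack('H*', $int), "\n";   # 020105
print unpack('H*', $str), "\n";   # 040548656c6c6f
```

Each element reads as tag octet, length octet(s), then the value; changing only the first octet is what gives two otherwise identical values different tags.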
In some applications it is necessary to change the tags used. For example, a SET may need to contain two different INTEGER values. Tags may be changed in two ways, either IMPLICITly or EXPLICITly. With IMPLICIT tagging, the new tag completely replaces the old tag. With EXPLICIT tagging, the new tag is used as well as the old tag.
Convert::BER supports two ways of using IMPLICIT tagging. One
method is to sub-class Convert::BER, which is described in the next
section. For small applications, or where sub-classing seems like
too much, the operator may be passed as an arrayref. The array
must contain two elements: the first is the usual operator name and
the second is the tag value to use, as shown below.
$ber->encode( [ SEQUENCE => 0x34 ] => [ INTEGER => 10, STRING => "A" ] ) or die;
This will encode a sequence, with a tag value of 0x34, which will
contain an integer and a string which will have their default tag
values.
You may wish to construct your tags using some pre-defined functions
such as &Convert::BER::BER_APPLICATION, &Convert::BER::BER_CONTEXT,
etc, instead of calculating the tag values yourself.
To use EXPLICIT tagging, enclose the original element in a SEQUENCE,
and just override the SEQUENCE's tag as above. Don't forget to set the
constructed bit using &Convert::BER::BER_CONSTRUCTOR. For example,
the ASN.1 definition:
Foo ::= SEQUENCE { [0] EXPLICIT INTEGER, INTEGER }
might be encoded using this:
$ber->encode( SEQUENCE => [ [ SEQUENCE => &Convert::BER::BER_CONTEXT | &Convert::BER::BER_CONSTRUCTOR | 0 ] => [ INTEGER => 10, ], INTEGER => 11, ], ) or die;
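The difference between the two tagging styles is easiest to see in the raw bytes. The plain-Perl sketch below builds both by hand; it does not call Convert::BER, and the SK_ constants are local to the sketch, with values taken from the identifier octet layout in X.690 rather than from the module's exports. The output shows what X.690 prescribes, not a captured result from the module.

```perl
use strict;
use warnings;

# Class and form bits of a BER identifier octet (X.690). These names are
# local to this sketch, not imported from Convert::BER.
use constant {
    SK_UNIVERSAL   => 0x00,
    SK_APPLICATION => 0x40,
    SK_CONTEXT     => 0x80,
    SK_PRIVATE     => 0xC0,
    SK_CONSTRUCTED => 0x20,
};

# [0] EXPLICIT INTEGER 10: a constructed context tag wrapping a whole INTEGER.
my $inner    = pack('C3', 0x02, 1, 10);                 # INTEGER 10
my $tag      = SK_CONTEXT | SK_CONSTRUCTED | 0;         # 0xA0
my $explicit = chr($tag) . chr(length($inner)) . $inner;

# [0] IMPLICIT INTEGER 10: the context tag replaces the INTEGER tag.
my $implicit = pack('C3', SK_CONTEXT | 0, 1, 10);

print unpack('H*', $explicit), "\n";   # a00302010a
print unpack('H*', $implicit), "\n";   # 80010a
```

With EXPLICIT tagging the original 02 tag survives inside the wrapper; with IMPLICIT tagging it is gone, so the decoder has to know from the definition that the value is an INTEGER.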
For large applications where operators with non-default tags are used
a lot, the above mechanism can be very error-prone. For this reason,
Convert::BER may be sub-classed.
To do this the sub-class must call a static method define. The
argument to define is a list of arrayrefs. Each arrayref will
define one new operator. Each arrayref contains three values: the
first is the name of the operator, the second is how the data is
encoded and the third is the tag value. To aid with the creation of
these arguments Convert::BER exports some variables and constant
subroutines.
For each operator defined by Convert::BER, or a Convert::BER
sub-class, a scalar variable with the same name is available for
import, for example $INTEGER is available from Convert::BER. And
any operators defined by a new sub-class will be available for import
from that class. One of these variables may be used as the second
element of each arrayref.
Convert::BER also exports some constant subroutines that can be
used to create the tag value. The subroutines exported are:
BER_BOOLEAN BER_INTEGER BER_BIT_STR BER_OCTET_STR BER_NULL BER_OBJECT_ID BER_SEQUENCE BER_SET
BER_UNIVERSAL BER_APPLICATION BER_CONTEXT BER_PRIVATE BER_PRIMITIVE BER_CONSTRUCTOR
Convert::BER also provides a subroutine called ber_tag to calculate
an integer value that will be used to represent a tag. For tags with
values less than 30 this is not needed, but for tags >= 30 the tag
value passed for an operator definition must be the result of ber_tag.
ber_tag takes two arguments: the first is the tag class and the second
is the tag value.
Using this information a sub-class of Convert::BER can be created as shown below.
package Net::LDAP::BER;
use Convert::BER qw(/^(\$|BER_)/);
use strict; use vars qw($VERSION @ISA);
@ISA = qw(Convert::BER); $VERSION = "1.00";
Net::LDAP::BER->define(
# Name Type Tag ########################################
[ REQ_UNBIND => $NULL, BER_APPLICATION | 0x02 ],
[ REQ_COMPARE => $SEQUENCE, BER_APPLICATION | BER_CONSTRUCTOR | 0x0E ],
[ REQ_ABANDON => $INTEGER, ber_tag(BER_APPLICATION, 0x10) ], );
This will create a new class Net::LDAP::BER which has three new operators
available. This class then may be used as follows
$ber = new Net::LDAP::BER;
$ber->encode( REQ_UNBIND => 0, REQ_COMPARE => [ REQ_ABANDON => 123, ] );
$ber->decode( REQ_UNBIND => \$var, REQ_COMPARE => [ REQ_ABANDON => \$num, ] );
Which will encode or decode the data using the formats and tags
defined in the Net::LDAP::BER sub-class. It also helps to make the
code more readable.
As well as defining new operators which inherit from existing operators it is also possible to define a new operator and how data is encoded and decoded. The interface for doing this is still changing but will be documented here when it is done. To be continued ...
Convert::BER cannot support tags that contain more bits than can be stored in a scalar variable, typically this is 32 bits.
Convert::BER cannot support items that have a packed length which cannot be stored in 32 bits.
The SET decode method fails if the encoded order is different to
the opList order.
Graham Barr <gbarr@pobox.com>
Significant POD updates from Chris Ridd <Chris.Ridd@messagingdirect.com>
Copyright (c) 1995-2000 Graham Barr. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.