Bit::Vector - efficient base class implementing bit vectors.


NAME

Bit::Vector - efficient base class implementing bit vectors.


SUPPORTED PLATFORMS

This module is not included with the standard ActivePerl distribution. It is available as a separate download using PPM.

PREFACE

This module implements bit vectors of arbitrary size and provides efficient methods for handling them.

This goes far beyond the built-in capabilities of Perl for handling bit vectors (compare with the method list below).

Moreover, the C core of this module can be used ``stand-alone'' in other C applications; Perl is not necessarily required.

The module is intended to serve as a base class for other applications or application classes, such as implementing sets or performing big integer arithmetic.

All methods are implemented in C internally for maximum performance.

The module also provides overloaded arithmetic and relational operators for maximum ease of use (Perl only).

Note that there is (of course) a little speed penalty to pay for overloaded operators. If speed is crucial, use the methods of this module directly instead of their corresponding overloaded operators!

This module is useful for a large range of different tasks:

-
for example for implementing sets and performing set operations (like union, difference, intersection, complement, check for subset relationship etc.),

-
as a basis for many efficient algorithms, for instance the ``Sieve of Erathostenes'' (for calculating prime numbers),

(The complexities of the methods in this module are usually either O(1) or O(n/b), where ``b'' is the number of bits in a machine word on your system.)

-
for shift registers of arbitrary length (for example for cyclic redundancy checksums),

-
to calculate ``look-ahead'', ``first'' and ``follow'' character sets for parsers and compiler-compilers,

-
for graph algorithms,

-
for efficient storage and retrieval of status information,

-
for performing text synthesis ruled by boolean expressions,

-
for ``big integer'' arithmetic with arbitrarily large integers,

-
for manipulations of chunks of bits of arbitrary size,

-
for bitwise processing of audio CD wave files,

-
to convert formats of data files,

and more.

(A number of example applications is available from my web site at http://www.engelschall.com/u/sb/download/.)

A large number of import/export methods allow you to access individual bits, contiguous ranges of bits, machine words, arbitrary chunks of bits, lists (arrays) of chunks of bits or machine words and a whole bit vector at once (for instance for blockwrite/-read to and from a file).

You can also import and export the contents of a bit vector in binary, hexadecimal and decimal representation as well as ``.newsrc'' style enumerations.

Note that this module is specifically designed for efficiency, which is also the reason why its methods are implemented in C.

To further increase execution speed, the module doesn't use bytes as its basic storage unit, but rather uses machine words, assuming that a machine word is the most efficiently handled size of all scalar types on all machines (that's what the ANSI C standard proposes and assumes anyway).

In order to achieve this, it automatically determines the number of bits in a machine word on your system and then adjusts its internal configuration constants accordingly.

The greater the size of this basic storage unit, the better the complexity (= execution speed) of the methods in this module, but also the greater the average waste of unused bits in the last word.

Example applications:

See the module ``Set::IntRange'' for an easy-to-use module for sets of integers of arbitrary ranges.

See the module ``Math::MatrixBool'' for an efficient implementation of boolean matrices and boolean matrix operations.

(Both modules are also available from my web site at http://www.engelschall.com/u/sb/download/ or any CPAN server.)

An application relying crucially on this ``Bit::Vector'' module is ``Slice'', a tool for generating different document versions out of a single master file, ruled by boolean expressions (``include english version of text plus examples but not ...'').

(See also http://www.engelschall.com/sw/slice/.)

This tool is itself part of another tool, ``Website META Language'' (``WML''), which allows you to generate and maintain large web sites.

Among many other features, it allows you to define your own HTML tags which will be expanded either at generation or at run time, depending on your choice.

(See also http://www.engelschall.com/sw/wml/.)


SYNOPSIS

CLASS METHODS (implemented in C)

  Version
      $version = Bit::Vector->Version();
  Word_Bits
      $bits = Bit::Vector->Word_Bits();  #  bits in a machine word
  Long_Bits
      $bits = Bit::Vector->Long_Bits();  #  bits in an unsigned long
  new
      $vector = Bit::Vector->new($bits);  #  bit vector constructor
  new_Hex
      $vector = Bit::Vector->new_Hex($bits,$string);
  new_Bin
      $vector = Bit::Vector->new_Bin($bits,$string);
  new_Dec
      $vector = Bit::Vector->new_Dec($bits,$string);
  new_Enum
      $vector = Bit::Vector->new_Enum($bits,$string);
  Concat_List
      $vector = Bit::Vector->Concat_List(@vectors);

OBJECT METHODS (implemented in C)

  new
      $vec2 = $vec1->new($bits);  #  alternative call of constructor
  Shadow
      $vec2 = $vec1->Shadow();  #  new vector, same size but empty
  Clone
      $vec2 = $vec1->Clone();  #  new vector, exact duplicate
  Concat
      $vector = $vec1->Concat($vec2);
  Concat_List
      $vector = $vec1->Concat_List($vec2,$vec3,...);
  Size
      $bits = $vector->Size();
  Resize
      $vector->Resize($bits);
      $vector->Resize($vector->Size()+5);
      $vector->Resize($vector->Size()-5);
  Copy
      $vec2->Copy($vec1);
  Empty
      $vector->Empty();
  Fill
      $vector->Fill();
  Flip
      $vector->Flip();
  Primes
      $vector->Primes();  #  Sieve of Erathostenes
  Reverse
      $vec2->Reverse($vec1);
  Interval_Empty
      $vector->Interval_Empty($min,$max);
  Interval_Fill
      $vector->Interval_Fill($min,$max);
  Interval_Flip
      $vector->Interval_Flip($min,$max);
  Interval_Reverse
      $vector->Interval_Reverse($min,$max);
  Interval_Scan_inc
      if (($min,$max) = $vector->Interval_Scan_inc($start))
  Interval_Scan_dec
      if (($min,$max) = $vector->Interval_Scan_dec($start))
  Interval_Copy
      $vec2->Interval_Copy($vec1,$offset2,$offset1,$length);
  Interval_Substitute
      $vec2->Interval_Substitute($vec1,$off2,$len2,$off1,$len1);
  is_empty
      if ($vector->is_empty())
  is_full
      if ($vector->is_full())
  equal
      if ($vec1->equal($vec2))
  Lexicompare (unsigned)
      if ($vec1->Lexicompare($vec2) == 0)
      if ($vec1->Lexicompare($vec2) != 0)
      if ($vec1->Lexicompare($vec2) <  0)
      if ($vec1->Lexicompare($vec2) <= 0)
      if ($vec1->Lexicompare($vec2) >  0)
      if ($vec1->Lexicompare($vec2) >= 0)
  Compare (signed)
      if ($vec1->Compare($vec2) == 0)
      if ($vec1->Compare($vec2) != 0)
      if ($vec1->Compare($vec2) <  0)
      if ($vec1->Compare($vec2) <= 0)
      if ($vec1->Compare($vec2) >  0)
      if ($vec1->Compare($vec2) >= 0)
  to_Hex
      $string = $vector->to_Hex();
  from_Hex
      $vector->from_Hex($string);
  to_Bin
      $string = $vector->to_Bin();
  from_Bin
      $vector->from_Bin($string);
  to_Dec
      $string = $vector->to_Dec();
  from_Dec
      $vector->from_Dec($string);
  to_Enum
      $string = $vector->to_Enum();  #  e.g. "2,3,5-7,11,13-19"
  from_Enum
      $vector->from_Enum($string);
  Bit_Off
      $vector->Bit_Off($index);
  Bit_On
      $vector->Bit_On($index);
  bit_flip
      $bit = $vector->bit_flip($index);
  bit_test, contains
      $bit = $vector->bit_test($index);
      $bit = $vector->contains($index);
      if ($vector->bit_test($index))
      if ($vector->contains($index))
  Bit_Copy
      $vector->Bit_Copy($index,$bit);
  LSB (least significant bit)
      $vector->LSB($bit);
  MSB (most significant bit)
      $vector->MSB($bit);
  lsb (least significant bit)
      $bit = $vector->lsb();
  msb (most significant bit)
      $bit = $vector->msb();
  rotate_left
      $carry = $vector->rotate_left();
  rotate_right
      $carry = $vector->rotate_right();
  shift_left
      $carry = $vector->shift_left($carry);
  shift_right
      $carry = $vector->shift_right($carry);
  Move_Left
      $vector->Move_Left($bits);  #  shift left "$bits" positions
  Move_Right
      $vector->Move_Right($bits);  #  shift right "$bits" positions
  Insert
      $vector->Insert($offset,$bits);
  Delete
      $vector->Delete($offset,$bits);
  increment
      $carry = $vector->increment();
  decrement
      $carry = $vector->decrement();
  add
      $carry = $vec3->add($vec1,$vec2,$carry);
  subtract
      $carry = $vec3->subtract($vec1,$vec2,$carry);
  Negate
      $vec2->Negate($vec1);
  Absolute
      $vec2->Absolute($vec1);
  Sign
      if ($vector->Sign() == 0)
      if ($vector->Sign() != 0)
      if ($vector->Sign() <  0)
      if ($vector->Sign() <= 0)
      if ($vector->Sign() >  0)
      if ($vector->Sign() >= 0)
  Multiply
      $vec3->Multiply($vec1,$vec2);
  Divide
      $quot->Divide($vec1,$vec2,$rest);
  GCD (Greatest Common Divisor)
      $vec3->GCD($vec1,$vec2);
  Block_Store
      $vector->Block_Store($buffer);
  Block_Read
      $buffer = $vector->Block_Read();
  Word_Size
      $size = $vector->Word_Size();  #  number of words in "$vector"
  Word_Store
      $vector->Word_Store($offset,$word);
  Word_Read
      $word = $vector->Word_Read($offset);
  Word_List_Store
      $vector->Word_List_Store(@words);
  Word_List_Read
      @words = $vector->Word_List_Read();
  Word_Insert
      $vector->Word_Insert($offset,$count);
  Word_Delete
      $vector->Word_Delete($offset,$count);
  Chunk_Store
      $vector->Chunk_Store($chunksize,$offset,$chunk);
  Chunk_Read
      $chunk = $vector->Chunk_Read($chunksize,$offset);
  Chunk_List_Store
      $vector->Chunk_List_Store($chunksize,@chunks);
  Chunk_List_Read
      @chunks = $vector->Chunk_List_Read($chunksize);
  Index_List_Remove
      $vector->Index_List_Remove(@indices);
  Index_List_Store
      $vector->Index_List_Store(@indices);
  Index_List_Read
      @indices = $vector->Index_List_Read();
  Union
      $set3->Union($set1,$set2);
  Intersection
      $set3->Intersection($set1,$set2);
  Difference
      $set3->Difference($set1,$set2);
  ExclusiveOr
      $set3->ExclusiveOr($set1,$set2);
  Complement
      $set2->Complement($set1);
  subset
      if ($set1->subset($set2))  #  true if $set1 is subset of $set2
  Norm
      $norm = $set->Norm();
  Min
      $min = $set->Min();
  Max
      $max = $set->Max();
  Multiplication
      $matrix3->Multiplication($rows3,$cols3,
                      $matrix1,$rows1,$cols1,
                      $matrix2,$rows2,$cols2);
  Product
      $matrix3->Product($rows3,$cols3,
               $matrix1,$rows1,$cols1,
               $matrix2,$rows2,$cols2);
  Closure
      $matrix->Closure($rows,$cols);
  Transpose
      $matrix2->Transpose($rows2,$cols2,$matrix1,$rows1,$cols1);

CLASS METHODS (implemented in Perl)

  Configuration
      $config = Bit::Vector->Configuration();
      Bit::Vector->Configuration($config);
      $oldconfig = Bit::Vector->Configuration($newconfig);

OVERLOADED OPERATORS (implemented in Perl)

  String Conversion
      $string = "$vector";             #  depending on configuration
      print "\$vector = '$vector'\n";
  Emptyness
      if ($vector)  #  if not empty (non-zero)
      if (! $vector)  #  if empty (zero)
      unless ($vector)  #  if empty (zero)
  Complement (one's complement)
      $vector2 = ~$vector1;
      $vector = ~$vector;
  Negation (two's complement)
      $vector2 = -$vector1;
      $vector = -$vector;
  Norm
      $norm = abs($vector);  #  depending on configuration
  Absolute
      $vector2 = abs($vector1);  #  depending on configuration
  Concatenation
      $vector3 = $vector1 . $vector2;
      $vector1 .= $vector2;
      $vector1 = $vector2 . $vector1;
      $vector2 = $vector1 . $scalar;  #  depending on configuration
      $vector2 = $scalar . $vector1;
      $vector .= $scalar;
  Duplication
      $vector2 = $vector1 x $factor;
      $vector x= $factor;
  Shift Left
      $vector2 = $vector1 << $bits;
      $vector <<= $bits;
  Shift Right
      $vector2 = $vector1 >> $bits;
      $vector >>= $bits;
  Union
      $vector3 = $vector1 | $vector2;
      $vector1 |= $vector2;
      $vector2 = $vector1 | $scalar;
      $vector |= $scalar;
      $vector3 = $vector1 + $vector2;  #  depending on configuration
      $vector1 += $vector2;
      $vector2 = $vector1 + $scalar;
      $vector += $scalar;
  Intersection
      $vector3 = $vector1 & $vector2;
      $vector1 &= $vector2;
      $vector2 = $vector1 & $scalar;
      $vector &= $scalar;
      $vector3 = $vector1 * $vector2;  #  depending on configuration
      $vector1 *= $vector2;
      $vector2 = $vector1 * $scalar;
      $vector *= $scalar;
  ExclusiveOr
      $vector3 = $vector1 ^ $vector2;
      $vector1 ^= $vector2;
      $vector2 = $vector1 ^ $scalar;
      $vector ^= $scalar;
  Set Difference
      $vector3 = $vector1 - $vector2;  #  depending on configuration
      $vector1 -= $vector2;
      $vector1 = $vector2 - $vector1;
      $vector2 = $vector1 - $scalar;
      $vector2 = $scalar - $vector1;
      $vector -= $scalar;
  Addition
      $vector3 = $vector1 + $vector2;  #  depending on configuration
      $vector1 += $vector2;
      $vector2 = $vector1 + $scalar;
      $vector += $scalar;
  Subtraction
      $vector3 = $vector1 - $vector2;  #  depending on configuration
      $vector1 -= $vector2;
      $vector1 = $vector2 - $vector1;
      $vector2 = $vector1 - $scalar;
      $vector2 = $scalar - $vector1;
      $vector -= $scalar;
  Multiplication
      $vector3 = $vector1 * $vector2;  #  depending on configuration
      $vector1 *= $vector2;
      $vector2 = $vector1 * $scalar;
      $vector *= $scalar;
  Division
      $vector3 = $vector1 / $vector2;
      $vector1 /= $vector2;
      $vector1 = $vector2 / $vector1;
      $vector2 = $vector1 / $scalar;
      $vector2 = $scalar / $vector1;
      $vector /= $scalar;
  Modulo
      $vector3 = $vector1 % $vector2;
      $vector1 %= $vector2;
      $vector1 = $vector2 % $vector1;
      $vector2 = $vector1 % $scalar;
      $vector2 = $scalar % $vector1;
      $vector %= $scalar;
  Increment
      ++$vector;
      $vector++;
  Decrement
      --$vector;
      $vector--;
  Lexical Comparison (unsigned)
      $cmp = $vector1 cmp $vector2;
      if ($vector1 lt $vector2)
      if ($vector1 le $vector2)
      if ($vector1 gt $vector2)
      if ($vector1 ge $vector2)
      $cmp = $vector cmp $scalar;
      if ($vector lt $scalar)
      if ($vector le $scalar)
      if ($vector gt $scalar)
      if ($vector ge $scalar)
  Comparison (signed)
      $cmp = $vector1 <=> $vector2;
      if ($vector1 < $vector2)  #  depending on configuration
      if ($vector1 <= $vector2)
      if ($vector1 > $vector2)
      if ($vector1 >= $vector2)
      $cmp = $vector <=> $scalar;
      if ($vector < $scalar)  #  depending on configuration
      if ($vector <= $scalar)
      if ($vector > $scalar)
      if ($vector >= $scalar)
  Equality
      if ($vector1 eq $vector2)
      if ($vector1 ne $vector2)
      if ($vector eq $scalar)
      if ($vector ne $scalar)
      if ($vector1 == $vector2)
      if ($vector1 != $vector2)
      if ($vector == $scalar)
      if ($vector != $scalar)
  Subset Relationship
      if ($vector1 <= $vector2)  #  depending on configuration
  True Subset Relationship
      if ($vector1 < $vector2)  #  depending on configuration
  Superset Relationship
      if ($vector1 >= $vector2)  #  depending on configuration
  True Superset Relationship
      if ($vector1 > $vector2)  #  depending on configuration


IMPORTANT NOTES


DESCRIPTION

CLASS METHODS (implemented in C)

OBJECT METHODS (implemented in C)

CLASS METHODS (implemented in Perl)

Scalar input:

In the case of scalar input, ``^bit$'', ``^index'', or ``^indice'' all cause scalar input to be considered to represent a bit index, i.e., ``$vector ^= 5;'' will flip bit #5 in the given bit vector (this is essentially the same as ``$vector->bit_flip(5);'').

Note that ``bit indices'' is the default setting for ``scalar input''.

The keyword ``^hex'' will cause scalar input to be considered as being in hexadecimal, i.e., ``$vector ^= 5;'' will flip bit #0 and bit #2 (because hexadecimal ``5'' is binary ``0101'').

(Note though that hexadecimal input should always be enclosed in quotes, otherwise it will be interpreted as a decimal number by Perl! The example relies on the fact that hexadecimal 0-9 and decimal 0-9 are the same.)

The keyword ``^bin'' will cause scalar input to be considered as being in binary format. All characters except ``0'' and ``1'' are forbidden in this case (i.e., produce a syntax error).

``$vector ^= '0101';'', for instance, will flip bit #0 and bit #2.

The keyword ``^dec'' causes scalar input to be considered as integers in decimal format, i.e., ``$vector ^= 5;'' will flip bit #0 and bit #2 (because decimal ``5'' is binary ``0101'').

(Note though that all decimal input should be enclosed in quotes, because for large numbers, Perl will use scientific notation internally for representing them, which produces a syntax error because scientific notation is neither supported by this module nor needed.)

Finally, the keyword ``^enum'' causes scalar input to be considered as being a list (``enumeration'') of indices and ranges of (contiguous) indices, i.e., ``$vector |= '2,3,5,7-13,17-23';'' will cause bits #2, #3, #5, #7 through #13 and #17 through #23 to be set.

Operator semantics:

Several overloaded operators can have two distinct functions depending on this setting.

The affected operators are: ``+'', ``-'', ``*'', ``<'', ``<='', ``>'' and ``>=''.

With the default setting, ``set operations'', these operators perform:

  +       set union                           ( set1  u   set2 )
  -       set difference                      ( set1  \   set2 )
  *       set intersection                    ( set1  n   set2 )
  <       true subset relationship            ( set1  <   set2 )
  <=      subset relationship                 ( set1  <=  set2 )
  >       true superset relationship          ( set1  >   set2 )
  >=      superset relationship               ( set1  >=  set2 )

With the alternative setting, ``arithmetic operations'', these operators perform:

  +       addition                            ( num1  +   num2 )
  -       subtraction                         ( num1  -   num2 )
  *       multiplication                      ( num1  *   num2 )
  <       "less than" comparison              ( num1  <   num2 )
  <=      "less than or equal" comparison     ( num1  <=  num2 )
  >       "greater than" comparison           ( num1  >   num2 )
  >=      "greater than or equal" comparison  ( num1  >=  num2 )

Note that these latter comparison operators (``<'', ``<='', ``>'' and ``>='') regard their operands as being SIGNED.

To perform comparisons with UNSIGNED operands, use the operators ``lt'', ``le'', ``gt'' and ``ge'' instead (in contrast to the operators above, these operators are NOT affected by the ``operator semantics'' setting).

String output:

There are four methods which convert the contents of a given bit vector into a string: ``to_Hex()'', ``to_Bin()'', ``to_Dec()'' and ``to_Enum()'' (not counting ``Block_Read()'', since this method does not return a human-readable string).

(For conversion to octal, see the description of the method ``Chunk_List_Read()''.)

Therefore, there are four possible formats into which a bit vector can be converted when it is enclosed in double quotes, for example:

  print "\$vector = '$vector'\n";
  $string = "$vector";

Hence you can set ``string output'' to four different values: To ``hex'' for hexadecimal format (which is the default), to ``bin'' for binary format, to ``dec'' for conversion to decimal numbers and to ``enum'' for conversion to enumerations (``.newsrc'' style sets).

BEWARE that the conversion to decimal numbers is inherently slow; it can easily take up several seconds for a single large bit vector!

Therefore you should store the decimal strings returned to you rather than converting a given bit vector again.

Examples:

The default setting as returned by the method ``Configuration()'' is:

        Scalar Input       = Bit Index
        Operator Semantics = Set Operators
        String Output      = Hexadecimal

Performing a statement such as:

  Bit::Vector->Configuration("in=bin,ops=arithmetic,out=bin");
  print Bit::Vector->Configuration(), "\n";

yields the following output:

        Scalar Input       = Binary
        Operator Semantics = Arithmetic Operators
        String Output      = Binary

Note that you can always feed this output back into the ``Configuration()'' method to restore that setting later.

This also means that you can enter the same given setting with almost any degree of verbosity you like (as long as the required keywords appear and no ambiguities arise).

Note further that any aspect you do not specify is not changed, i.e., the statement

  Bit::Vector->Configuration("operators = arithmetic");

leaves all other aspects unchanged.

OVERLOADED OPERATORS (implemented in Perl)


SEE ALSO

Set::IntRange(3), Math::MatrixBool(3), Math::MatrixReal(3), DFA::Kleene(3), Math::Kleene(3), Graph::Kruskal(3).

perl(1), perlsub(1), perlmod(1), perlref(1), perlobj(1), perlbot(1), perltoot(1), perlxs(1), perlxstut(1), perlguts(1), overload(3).


VERSION

This man page documents ``Bit::Vector'' version 5.7.


AUTHOR

  Steffen Beyer
  Ainmillerstr. 5 / App. 513
  D-80801 Munich
  Germany
  mailto:sb@engelschall.com
  http://www.engelschall.com/u/sb/download/

Please contact me by e-mail whenever possible!


COPYRIGHT

Copyright (c) 1995, 1996, 1997, 1998, 1999 by Steffen Beyer. All rights reserved.


LICENSE

This package is free software; you can redistribute it and/or modify it under the same terms as Perl itself, i.e., under the terms of the ``Artistic License'' or the ``GNU General Public License''.

The C library at the core of this Perl module can additionally be redistributed and/or modified under the terms of the ``GNU Library General Public License''.

Please refer to the files ``Artistic.txt'', ``GNU_GPL.txt'' and ``GNU_LGPL.txt'' in this distribution for details!


DISCLAIMER

This package is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

See the ``GNU General Public License'' for more details.

 Bit::Vector - efficient base class implementing bit vectors.