NAME
Sort::DataTypes - Sort a list of data using methods relevant to the type
of data
SYNOPSIS
use Sort::DataTypes qw(:all);
DESCRIPTION
This module allows you to sort a list of data elements using methods
that are relevant to the type of data being sorted. This module does not
attempt to be the fastest sorter on the block. If you are sorting
thousands of elements and need a lot of speed, you should refer to a
module specializing in the specific type of sort you will be doing.
However, to do smaller sorts of different types of data, this is the
module to use.
ROUTINES
All sort routines are named sort_METHOD where METHOD is the name of the
method. Every sort_METHOD routine has both a forward and a reverse form:
sort_METHOD(\@list,@args);
sort_rev_METHOD(\@list,@args);
where @args are any additional arguments needed for that sort method.
Corresponding to every sort_METHOD routine is a cmp_METHOD routine which
takes two elements (and possibly additional arguments as required by the
actual method) and returns a -1, 0, or 1 (similar to the cmp or <=>
operators).
$flag = cmp_METHOD($x,$y,@args);
$flag = cmp_rev_METHOD($x,$y,@args);
All sort_METHOD functions can also be used to sort a list using a hash:
sort_METHOD(\@list,[@args],\%hash);
sort_rev_METHOD(\@list,[@args],\%hash);
In this case, elements of @list are used as keys in %hash. The values of
the hash are compared using the cmp_METHOD function to sort the keys in
@list.
For example, if %hash contains the key/value pairs:
foo => 3
bar => 5
ick => 1
and @list contains (foo,bar,ick), then sorting:
sort_numerical(\@list,\%hash)
=> @list = (ick,foo,bar)
since "ick" corresponds to a numerical value of 1, "foo" to 3, and "bar"
to 5.
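The effect of sorting by hash can be reproduced with Perl's built-in sort, which may help clarify what is being compared. The following is a core-Perl sketch of the example above; it does not call this module and is not its implementation:

```perl
use strict;
use warnings;

my %hash = (foo => 3, bar => 5, ick => 1);
my @list = qw(foo bar ick);

# Order the keys in @list by their associated values in %hash.
@list = sort { $hash{$a} <=> $hash{$b} } @list;

print "@list\n";    # ick foo bar
```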
sort_valid_method, cmp_valid_method
use Sort::DataTypes qw(:all);
$flag = sort_valid_method($string);
$flag = cmp_valid_method($string);
These are identical and return 1 if there is a valid sort method
named $string in the module, and 0 otherwise. For example, there is
a function "sort_numerical" defined in this module, but there is no
function "sort_foobar", so the following would occur:
sort_valid_method("numerical")
=> 1
sort_valid_method("foobar")
=> 0
Note that the methods must NOT include the "sort_" or "cmp_" prefix.
sort_by_method, cmp_by_method
use Sort::DataTypes qw(:all);
sort_by_method($method,\@list [,@args]);
cmp_by_method ($method,$ele1,$ele2 [,@args]);
These sort a list, or compare two elements, using the given method
(which is any string that returns 1 when passed to
sort_valid_method). @args are arguments to pass to the sort.
If the method is not valid, the list is left untouched.
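One way to picture name-based dispatch is a table that maps method names to comparison routines. The sketch below is only an illustration of that idea in core Perl: the names %cmp_for, valid_method_sketch and sort_by_method_sketch are hypothetical, and nothing here is taken from the module's source.

```perl
use strict;
use warnings;

# Hypothetical table mapping method names (without any prefix) to comparators.
my %cmp_for = (
    numerical  => sub { $_[0] <=> $_[1] },
    alphabetic => sub { $_[0] cmp $_[1] },
);

# Return 1 if the named method is known, 0 otherwise.
sub valid_method_sketch {
    my ($method) = @_;
    return exists $cmp_for{$method} ? 1 : 0;
}

# Sort @$list in place using the named method; leave it untouched if unknown.
sub sort_by_method_sketch {
    my ($method, $list) = @_;
    return unless valid_method_sketch($method);
    my $cmp = $cmp_for{$method};
    @$list = sort { $cmp->($a, $b) } @$list;
    return;
}

my @nums = (10, 9, 100);
sort_by_method_sketch('numerical', \@nums);   # @nums is now (9, 10, 100)
sort_by_method_sketch('foobar',    \@nums);   # unknown method: no change
```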
sort_numerical, sort_rev_numerical, cmp_numerical, cmp_rev_numerical
use Sort::DataTypes qw(:all);
sort_numerical(\@list);
sort_rev_numerical(\@list);
sort_numerical(\@list,\%hash);
sort_rev_numerical(\@list,\%hash);
$flag = cmp_numerical($x,$y);
$flag = cmp_rev_numerical($x,$y);
These sort a list numerically in forward or reverse order, or
compare two elements numerically. There is little reason to use
any of these routines, since it would be more efficient to simply
call sort as:
sort { $a <=> $b } @list
They are included for the sake of completeness (and for use by
the sort_by_method/cmp_by_method routines). Also, if the code is
being automatically generated, numerical sorts won't have to be a
special case.
sort_alphabetic, sort_rev_alphabetic, cmp_alphabetic, cmp_rev_alphabetic
use Sort::DataTypes qw(:all);
sort_alphabetic(\@list);
sort_rev_alphabetic(\@list);
sort_alphabetic(\@list,\%hash);
sort_rev_alphabetic(\@list,\%hash);
$flag = cmp_alphabetic($x,$y);
$flag = cmp_rev_alphabetic($x,$y);
These do alphabetic sorts. As with numerical sorts, there is little
reason to call these, and they are included for the sake of
completeness.
sort_length, sort_rev_length, cmp_length, cmp_rev_length
use Sort::DataTypes qw(:all);
sort_length(\@list);
sort_rev_length(\@list);
sort_length(\@list,\%hash);
sort_rev_length(\@list,\%hash);
$flag = cmp_length($x,$y);
$flag = cmp_rev_length($x,$y);
These take strings and compare them by length. Strings of the same
length are compared alphabetically.
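The two-level comparison described above can be sketched in core Perl as follows. The name cmp_length_sketch is hypothetical, and this is an illustration of the ordering rather than the module's own code:

```perl
use strict;
use warnings;

# Hypothetical comparator: shorter strings first, ties broken alphabetically.
sub cmp_length_sketch {
    my ($x, $y) = @_;
    return (length($x) <=> length($y)) || ($x cmp $y);
}

my @words = sort { cmp_length_sketch($a, $b) } qw(pear fig apple kiwi);
print "@words\n";    # fig kiwi pear apple
```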
sort_ip, sort_rev_ip, cmp_ip, cmp_rev_ip
use Sort::DataTypes qw(:all);
sort_ip(\@list);
sort_rev_ip(\@list);
sort_ip(\@list,\%hash);
sort_rev_ip(\@list,\%hash);
$flag = cmp_ip($x,$y);
$flag = cmp_rev_ip($x,$y);
These sort/compare IP addresses of the form A.B.C.D.
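A plain string comparison would place 10.0.0.10 before 10.0.0.9, so each of the four fields needs to be compared as a number. A minimal core-Perl sketch, assuming well-formed A.B.C.D input (cmp_ip_sketch is a hypothetical name, not the module's routine):

```perl
use strict;
use warnings;

# Hypothetical comparator: compare the four fields numerically, left to right.
sub cmp_ip_sketch {
    my ($x, $y) = @_;
    my @x = split /\./, $x;
    my @y = split /\./, $y;
    for my $i (0 .. 3) {
        my $c = $x[$i] <=> $y[$i];
        return $c if $c;
    }
    return 0;
}

my @ips = sort { cmp_ip_sketch($a, $b) } qw(10.0.0.10 10.0.0.9 9.255.0.1);
print "@ips\n";    # 9.255.0.1 10.0.0.9 10.0.0.10
```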
sort_domain, sort_rev_domain, cmp_domain, cmp_rev_domain
use Sort::DataTypes qw(:all);
sort_domain(\@list [,$sep]);
sort_rev_domain(\@list [,$sep]);
sort_domain(\@list, [$sep,] \%hash);
sort_rev_domain(\@list, [$sep,] \%hash);
$flag = cmp_domain($x,$y [,$sep]);
$flag = cmp_rev_domain($x,$y [,$sep]);
These sort domain names (A.B.C...) or anything else consisting of a
class, subclass, subsubclass, etc., with the most significant class
at the right.
Elements in the domain are separated from each other by a period (.)
unless $sep is passed in. If $sep is passed in, it is a regular
expression to split the elements in a domain.
Since the most significant element in the domain is at the right,
any domain ending with ".com" would come before any domain ending in
".edu".
a.b < z.b < a.bb < z.bb < a.c
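Comparing from the right can be sketched by splitting each name and reversing the fields before comparing them one by one. The code below is a core-Perl illustration with a hypothetical name (cmp_domain_sketch); placing a shorter name before a longer one that shares all of its fields is an assumption made here, not something stated above.

```perl
use strict;
use warnings;

# Hypothetical comparator: most significant field is the rightmost one.
sub cmp_domain_sketch {
    my ($x, $y, $sep) = @_;
    $sep = qr/\./ unless defined $sep;
    my @x = reverse split $sep, $x;
    my @y = reverse split $sep, $y;
    while (@x && @y) {
        my $c = shift(@x) cmp shift(@y);
        return $c if $c;
    }
    # Assumption: when all shared fields are equal, fewer fields sort first.
    return @x <=> @y;
}

my @names = sort { cmp_domain_sketch($a, $b) } qw(a.c z.bb a.b a.bb z.b);
print "@names\n";    # a.b z.b a.bb z.bb a.c
```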
sort_numdomain, sort_rev_numdomain, cmp_numdomain, cmp_rev_numdomain
use Sort::DataTypes qw(:all);
sort_numdomain(\@list [,$sep]);
sort_rev_numdomain(\@list [,$sep]);
sort_numdomain(\@list, [$sep,] \%hash);
sort_rev_numdomain(\@list, [$sep,] \%hash);
$flag = cmp_numdomain($x,$y [,$sep]);
$flag = cmp_rev_numdomain($x,$y [,$sep]);
A related type of sorting is numdomain sorting. This is identical to
domain sorting except that if the two elements being compared are
both integers, they are compared numerically. So:
a.2.c < a.11.c
It should be noted that if a field may be either numeric or
alphanumeric, sorting with this method may yield unexpected results.
For example, sorting the three elements:
a.1.b
a.2.b
a.X.b
will use numeric comparisons when comparing the 2nd field of the
first and second elements, but it will use alphabetic comparisons
when comparing the first and third elements (or the second and third
elements).
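The per-field rule, and the reason mixed fields can surprise, can be seen in a small core-Perl sketch. The helper name cmp_field_sketch is hypothetical, and treating only strings of digits as integers is an assumption of this sketch:

```perl
use strict;
use warnings;

# Hypothetical per-field comparison: numeric only when BOTH fields are integers.
sub cmp_field_sketch {
    my ($x, $y) = @_;
    return $x <=> $y if $x =~ /\A\d+\z/ && $y =~ /\A\d+\z/;
    return $x cmp $y;
}

print cmp_field_sketch('2', '11'), "\n";   # -1: numeric, 2 < 11
print cmp_field_sketch('2', 'X'),  "\n";   # -1: alphabetic, "2" lt "X"
print '11' cmp '2', "\n";                  # -1: what a purely alphabetic compare gives
```

Because the kind of comparison depends on the pair being compared, a field that is sometimes numeric and sometimes not is compared under two different rules within the same sort.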
sort_path, sort_rev_path, cmp_path, cmp_rev_path
use Sort::DataTypes qw(:all);
sort_path(\@list [,$sep]);
sort_rev_path(\@list [,$sep]);
sort_path(\@list, [$sep,] \%hash);
sort_rev_path(\@list, [$sep,] \%hash);
$flag = cmp_path($x,$y [,$sep]);
$flag = cmp_rev_path($x,$y [,$sep]);
This sorts paths (/A/B/C...) or anything else consisting of a class,
subclass, subsubclass, etc., with the most significant class at the
left.
Elements in a path (or classes, subclasses, etc.) are separated from
each other by a slash (/) unless $sep is passed in. If $sep is
passed in, it is a regular expression to split the elements in a
path.
Since the most significant element in the path is at the left, you
get the following behavior:
a/b < a/z < aa/b < aa/z < b/b
When sorting lists that have a mixture of relative paths and
explicit paths, the explicit paths will come first. So:
/b/c < a/b
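A left-to-right field comparison reproduces both orderings shown above. In the core-Perl sketch below (cmp_path_sketch is a hypothetical name, not the module's routine), a path with a leading slash splits into an empty first field, which sorts before any non-empty field; that is one possible explanation for such paths coming first, offered here as an assumption.

```perl
use strict;
use warnings;

# Hypothetical comparator: most significant field is the leftmost one.
sub cmp_path_sketch {
    my ($x, $y, $sep) = @_;
    $sep = qr{/} unless defined $sep;
    my @x = split $sep, $x;    # "/b/c" yields ("", "b", "c")
    my @y = split $sep, $y;
    while (@x && @y) {
        my $c = shift(@x) cmp shift(@y);
        return $c if $c;
    }
    # Assumption: when all shared fields are equal, fewer fields sort first.
    return @x <=> @y;
}

my @paths = sort { cmp_path_sketch($a, $b) } qw(b/b a/z /b/c aa/b a/b aa/z);
print "@paths\n";    # /b/c a/b a/z aa/b aa/z b/b
```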
sort_numpath, sort_rev_numpath, cmp_numpath, cmp_rev_numpath
use Sort::DataTypes qw(:all);
sort_numpath(\@list [,$sep]);
sort_rev_numpath(\@list [,$sep]);
sort_numpath(\@list, [$sep,] \%hash);
sort_rev_numpath(\@list, [$sep,] \%hash);
$flag = cmp_numpath($x,$y [,$sep]);
$flag = cmp_rev_numpath($x,$y [,$sep]);
A related type of sorting is numpath sorting. This is identical to
path sorting except that if the two elements being compared are both
integers, they are compared numerically. So:
a/2/c < a/11/c
sort_random, sort_rev_random, cmp_random, cmp_rev_random
use Sort::DataTypes qw(:all);
sort_random(\@list);
sort_rev_random(\@list);
sort_random(\@list,\%hash);
sort_rev_random(\@list,\%hash);
$flag = cmp_random($x,$y);
$flag = cmp_rev_random($x,$y);
This uses the Fisher-Yates algorithm to randomly shuffle an array in
place. This routine was derived from The Perl Cookbook by Tom
Christiansen and Nathan Torkington.
sort_rev_random is identical to sort_random, and is included simply
for the situation where the sort routines are being called from
automatically generated code that may add the 'rev_' prefix.
cmp_random simply returns a random -1, 0, or 1.
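For readers unfamiliar with the algorithm, a Fisher-Yates shuffle walks the array from the last index down and swaps each element with a randomly chosen element at or below it. The following is a generic core-Perl sketch of that technique (shuffle_in_place is a hypothetical name), not a copy of the module's or the book's code:

```perl
use strict;
use warnings;

# Fisher-Yates shuffle: permute @$list in place.
sub shuffle_in_place {
    my ($list) = @_;
    for (my $i = $#$list; $i > 0; $i--) {
        my $j = int rand($i + 1);              # random index in 0 .. $i
        @$list[$i, $j] = @$list[$j, $i];       # swap elements $i and $j
    }
    return;
}

my @deck = (1 .. 10);
shuffle_in_place(\@deck);
print "@deck\n";    # same ten numbers, in some random order
```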
sort_version, sort_rev_version, cmp_version, cmp_rev_version
use Sort::DataTypes qw(:all);
sort_version(\@list);
sort_rev_version(\@list);
sort_version(\@list,\%hash);
sort_rev_version(\@list,\%hash);
$flag = cmp_version($x,$y);
$flag = cmp_rev_version($x,$y);
These sort a list of version numbers of the form
MAJOR.MINOR.SUBMINOR ... (any number of levels is allowed). The
following examples should illustrate the ordering:
1.1.x < 1.2 < 1.2.x   Numerical versions are compared first at
                      the highest level, then at the next highest,
                      etc. The first non-equal compare sets the
                      order.

1.a < 1.b             Alphanumeric levels that start with a letter
                      are compared alphabetically.

1.2a < 1.2 < 1.03a    Alphanumeric levels that start with a number
                      are first compared numerically with only the
                      numeric part. If they are equal, alphanumeric
                      levels come before purely numerical levels.
                      Otherwise, they are compared alphabetically.

1.a < 1.2a            An alphanumeric level that starts with a letter
                      comes before one that starts with a number.

1.01a < 1.1a          Two alphanumeric levels that are numerically
                      equal in the number part and equal in the
                      remaining part are compared alphabetically.
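The rules above can be read as a per-level comparison applied left to right. The core-Perl sketch below is one interpretation of those rules, written only to reproduce the listed examples; cmp_level_sketch and cmp_version_sketch are hypothetical names, and the module's actual behavior on other inputs may differ.

```perl
use strict;
use warnings;

# Compare a single level (hypothetical helper, one reading of the rules).
sub cmp_level_sketch {
    my ($x, $y) = @_;
    my ($xn, $xr) = $x =~ /^(\d*)(.*)$/s;    # numeric prefix, remainder
    my ($yn, $yr) = $y =~ /^(\d*)(.*)$/s;
    my $x_alpha = ($xn eq '');               # level does not start with a digit
    my $y_alpha = ($yn eq '');

    return $x cmp $y if $x_alpha && $y_alpha;    # 1.a < 1.b
    return -1 if $x_alpha;                       # 1.a < 1.2a
    return  1 if $y_alpha;

    my $c = $xn <=> $yn;                         # numeric parts first
    return $c if $c;
    return -1 if $xr ne '' && $yr eq '';         # 1.2a < 1.2
    return  1 if $xr eq '' && $yr ne '';
    return ($xr cmp $yr) || ($x cmp $y);         # 1.01a < 1.1a
}

# Compare two versions level by level; fewer levels sort first on a tie.
sub cmp_version_sketch {
    my ($v1, $v2) = @_;
    my @x = split /\./, $v1;
    my @y = split /\./, $v2;
    while (@x && @y) {
        my $c = cmp_level_sketch(shift @x, shift @y);
        return $c if $c;
    }
    return @x <=> @y;                            # 1.2 < 1.2.x
}

my @versions = sort { cmp_version_sketch($a, $b) } qw(1.2 1.03a 1.2a 1.a);
print "@versions\n";    # 1.a 1.2a 1.2 1.03a
```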
sort_date, sort_rev_date, cmp_date, cmp_rev_date
use Sort::DataTypes qw(:all);
sort_date(\@list);
sort_rev_date(\@list);
sort_date(\@list,\%hash);
sort_rev_date(\@list,\%hash);
$flag = cmp_date($x,$y);
$flag = cmp_rev_date($x,$y);
These sort a list of dates. Dates are anything that can be parsed
with Date::Manip.
sort_line, sort_rev_line, cmp_line, cmp_rev_line
use Sort::DataTypes qw(:all);
sort_line(\@list,$n [,$sep]);
sort_rev_line(\@list,$n [,$sep]);
sort_line(\@list,$n, [$sep,] \%hash);
sort_rev_line(\@list,$n, [$sep,] \%hash);
$flag = cmp_line($x,$y,$n [,$sep]);
$flag = cmp_rev_line($x,$y,$n [,$sep]);
These take a list of lines and sort on the Nth field using $sep as
the regular expression splitting the lines into fields. Fields are
numbered starting at 0. If no $sep is given, it defaults to white
space.
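Sorting on the Nth field amounts to splitting each line and comparing one element of the result. A minimal core-Perl sketch, assuming every line has at least $n+1 fields and no leading whitespace (cmp_line_sketch is a hypothetical name, not the module's routine):

```perl
use strict;
use warnings;

# Hypothetical comparator: compare field $n (0-based) of two lines as strings.
sub cmp_line_sketch {
    my ($x, $y, $n, $sep) = @_;
    $sep = qr/\s+/ unless defined $sep;    # default: split on whitespace
    my $fx = (split $sep, $x)[$n];
    my $fy = (split $sep, $y)[$n];
    return $fx cmp $fy;
}

my @lines   = ("carol 30 nyc", "alice 25 sfo", "bob 41 bos");
my @by_city = sort { cmp_line_sketch($a, $b, 2) } @lines;
print "$_\n" for @by_city;    # bob 41 bos, carol 30 nyc, alice 25 sfo
```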
sort_numline, sort_rev_numline, cmp_numline, cmp_rev_numline
use Sort::DataTypes qw(:all);
sort_numline(\@list,$n [,$sep]);
sort_rev_numline(\@list,$n [,$sep]);
sort_numline(\@list,$n, [$sep,] \%hash);
sort_rev_numline(\@list,$n, [$sep,] \%hash);
$flag = cmp_numline($x,$y,$n [,$sep]);
$flag = cmp_rev_numline($x,$y,$n [,$sep]);
These are similar to the line routines, but sort numerically if the
Nth field is an integer, and alphabetically otherwise.
sort_function, sort_rev_function, cmp_function, cmp_rev_function
use Sort::DataTypes qw(:all);
sort_function(\@list,\&func);
sort_rev_function(\@list,\&func);
sort_function(\@list,\&func,\%hash);
sort_rev_function(\@list,\&func,\%hash);
$flag = cmp_function($x,$y,\&func);
$flag = cmp_rev_function($x,$y,\&func);
This is a catch-all sort function. It takes a reference to a
function that compares two elements and returns -1, 0, or 1
depending on the order of the elements.
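As an illustration of the kind of function that could be passed, the core-Perl sketch below sorts a list in place with a caller-supplied comparator. The names sort_function_sketch and by_last_char are hypothetical, and passing the two elements as ordinary arguments is an assumption of this sketch:

```perl
use strict;
use warnings;

# Hypothetical in-place sort driven by a comparator reference.
sub sort_function_sketch {
    my ($list, $func) = @_;
    @$list = sort { $func->($a, $b) } @$list;
    return;
}

# Example comparator: order strings by their last character.
sub by_last_char {
    my ($x, $y) = @_;
    return substr($x, -1) cmp substr($y, -1);
}

my @animals = qw(kiwi zebra apple);
sort_function_sketch(\@animals, \&by_last_char);
print "@animals\n";    # zebra apple kiwi
```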
BACKWARDS INCOMPATIBILITIES
The following is a list of backwards incompatibilities.
Version 2.00 handling of hashes
In version 1.xx, when sorting by hash, the hash itself was passed in
(as %hash). As of 2.00, it is passed in by reference (as \%hash) to
avoid any confusion with optional arguments.
KNOWN PROBLEMS
None at this point.
LICENSE
This script is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.
AUTHOR
Sullivan Beck (sbeck@cpan.org)