Data-ULID

Perl implementation of ULID (Universally Unique Lexicographically Sortable Identifier)

8
6
8
4
Perl
public

NAME

Data::ULID - Universally Unique Lexicographically Sortable Identifier

SYNOPSIS

use Data::ULID qw/ulid binary_ulid ulid_date/;

my $ulid = ulid();  # e.g. 01ARZ3NDEKTSV4RRFFQ69G5FAV
my $bin_ulid = binary_ulid($ulid);
my $datetime_obj = ulid_date($ulid);  # e.g. 2016-06-13T13:25:20
my $uuid = ulid_to_uuid($ulid);
my $ulid2 = uuid_to_ulid($uuid);

DESCRIPTION

Background

This is an implementation in Perl of the ULID identifier type introduced by
Alizain Feerasta. The original implementation (in Javascript) can be found at
https://github.com/alizain/ulid.

ULIDs have several advantages over UUIDs in many contexts. The advantages
include:

  • Lexicographically sortable
  • The canonical representation is shorter than UUID (26 vs 36 characters)
  • Case insensitve and safely chunkable.
  • URL-safe
  • Timestamp can always be easily extracted if so desired.
  • Limited compatibility with UUIDs, since both are 128-bit formats.
    Some conversion back and forth is possible.

Canonical representation

The canonical representation of a ULID is a 26-byte, base32-encoded string
consisting of (1) a 10-byte timestamp with millisecond-resolution; and (2) a
16-byte random part.

Without paramters, the ulid() function returns a new ULID in the canonical
representation, with the current time (up to the nearest millisecond) in the
timestamp part.

$ulid = ulid();

Given a DateTime object as parameter, the function will set the timestamp part
based on that:

$ulid = ulid($datetime_obj);

Given a binary ULID as parameter, it returns the same ULID in canonical
format:

$ulid = ulid($binary_ulid);

Binary representation

The binary representation of a ULID is 16 octets long, with each component in
network byte order (most significant byte first). The components are (1) a
48-bit (6-byte) timestamp in a 32-bit and a 16-bit chunk; (2) an 80-bit
(10-byte) random part in a 16-bit and two 32-bit chunks.

The binary_ulid() function returns a ULID in binary representation. Like
ulid(), it can take no parameters or a DateTime, but it can also take a
ULID in the canonical representation and convert it to binary:

$binary_ulid = binary_ulid($canonical_ulid);

Datetime extraction

The ulid_date() function takes a ULID (canonical or binary) and returns
a DateTime object corresponding to the timestamp it encodes.

$datetime = ulid_date($ulid);

UUID conversion

Very limited conversion between UUIDs and ULIDs is provided.

In order to convert a UUID to ULID:

$ulid = uuid_to_ulid($uuid);

Both binary and hexadecimal UUIDs (with or without separators) are accepted.
The return value is a ULID string in the canonical Base32 form. Note that the
“timestamp” of such a ULID is not to be relied upon.

A ULID can also be converted to a UUID:

$uuid = ulid_to_uuid($binary_or_canonical_ulid);

The UUID returned by this function is a string in the standard hyphenated
hexadecimal format. Note that the variant and version indicators of such a
UUID are meaningless.

UUID conversion limitations

Since both ULIDs and UUIDs are 128-bit, conversion back and forth is possible
in principle. However, the two formats have different semantics. Also, any
given UUID version has at most 122 bits of variance (4 bits being reserved as
variant and version indicators), while all 128 bits of the ULID format can
vary without violating the format description. This means that the conversion
can never be made perfect.

It would be possible to maintain the approximate timestamp of a Version 1 UUID
when converting to ULID, as well as to keep the timestamp of a ULID when
converting to UUID. However, since many UUIDs are not of Version 1, and given
the different semantics of the two formats, the conversion provided by this
module is much simpler and does not preserve the timestamps. In fact, about
the only desirable property that the chosen conversion method has is that it
is uniformly bidirectional, i.e.

$uuid eq ulid_to_uuid(ulid_to_uuid($uuid))

and

$ulid eq uuid_to_ulid(ulid_to_uuid($ulid))

This approach has two immediate consequences:

  1. The “timestamps” of ULIDs created by converting UUIDs are meaningless.
  2. The variant and version indicators of UUIDs created by converting ULIDs are
    similarly wrong. Such UUIDs should only be used in contexts where no checking
    of these fields will be performed and no attempt will be made to extract or
    validate non-random information (i.e. timestamp, MAC address or namespace).

AUTHOR

Baldur Kristinsson, December 2016

LICENSE

This is free software. It may be copied, distributed and modified under the
same terms as Perl itself.

v0.3.3[beta]