Class CharBase

A character base is constructed from a collection of character sets. It holds a list of disjoint, non-empty character sets - the base sets - that can be used to construct every character set in the collection it was constructed from.

Guarantees

  • The base sets are guaranteed to be mutually disjoint and non-empty.

  • Every character set in the collection can be constructed by combining (union) a unique set of base sets.

  • The list of base sets is guaranteed to be as small as possible. There are at most min(n^2, o) base sets where n is the number of unique, non-empty character sets in the collection, and o is the number of characters in the union of all character sets in the collection.

Use case

The primary purpose of base sets is to remap alphabets. Some FA operations scale with the number of characters in the alphabet of the FA (e.g. DFA minimization).

Base sets can be used to determine which characters in an FA's alphabet Σ cannot be distinguished by the FA A. Two characters a,b in Σ are indistinguishable if for all inputs w the following hold true:

  1. w is accepted by A iff w with all occurrences of a replaced with b is accepted by A.
  2. w is accepted by A iff w with all occurrences of b replaced with a is accepted by A.

Two indistinguishable characters are guaranteed to be in the same base set.

By treating each base set as a character, it is possible to create a new (smaller) alphabet Γ (|Γ| <= |Σ|) such that the FA A still behaves the same.

Since Γ is typically (several orders of magnitude) smaller, operations that scale with the size of the alphabet can be done more quickly.

Hierarchy

  • CharBase

Constructors

Properties

Methods

Constructors

  • Create the base sets of the given collection of character sets.

    See CharBase to learn more.

    Parameters

    Returns CharBase

    Throws

    RangeError if the collection contains two character sets with different maximums.

Properties

sets: readonly CharSet[]

A list of disjoint, non-empty character sets.

See CharBase to learn more.

Methods

  • Splits the given character set into its base sets.

    The returned array will be a list of indexes of base sets necessary to construct the given character sets. The indexes will be sorted and occur at most once.

    Note: This assumes that charSet is either empty or can be constructed from the base sets. If the assumption is not met, the output of this function will be undefined.

    Parameters

    Returns number[]