Readonly
setsA list of disjoint, non-empty character sets.
See CharBase to learn more.
Splits the given character set into its base sets.
The returned array will be a list of indexes of base sets necessary to construct the given character sets. The indexes will be sorted and occur at most once.
Note: This assumes that charSet
is either empty or can be constructed from the base sets. If the
assumption is not met, the output of this function will be undefined.
A character base is constructed from a collection of character sets. It holds a list of disjoint, non-empty character sets - the base sets - that can be used to construct every character set in the collection it was constructed from.
Guarantees
The base sets are guaranteed to be mutually disjoint and non-empty.
Every character set in the collection can be constructed by combining (union) a unique set of base sets.
The list of base sets is guaranteed to be as small as possible. There are at most
min(n^2, o)
base sets wheren
is the number of unique, non-empty character sets in the collection, ando
is the number of characters in the union of all character sets in the collection.Use case
The primary purpose of base sets is to remap alphabets. Some FA operations scale with the number of characters in the alphabet of the FA (e.g. DFA minimization).
Base sets can be used to determine which characters in an FA's alphabet Σ cannot be distinguished by the FA A. Two characters a,b in Σ are indistinguishable if for all inputs w the following hold true:
Two indistinguishable characters are guaranteed to be in the same base set.
By treating each base set as a character, it is possible to create a new (smaller) alphabet Γ (|Γ| <= |Σ|) such that the FA A still behaves the same.
Since Γ is typically (several orders of magnitude) smaller, operations that scale with the size of the alphabet can be done more quickly.