
Free monoid

Article Id: WHEBN0000523166

Title: Free monoid  
Author: World Heritage Encyclopedia
Language: English
Subject: Semiautomaton, Monoid, Regular language, Weight (strings), Monoid factorisation
Collection: Combinatorics on Words, Formal Languages, Free Algebraic Structures, Semigroup Theory
Publisher: World Heritage Encyclopedia


In abstract algebra, the free monoid on a set is the monoid whose elements are all the finite sequences (or strings) of zero or more elements from that set, with string concatenation as the monoid operation and with the unique sequence of zero elements, often called the empty string and denoted by ε or λ, as the identity element. The free monoid on a set A is usually denoted A*. The free semigroup on A is the subsemigroup of A* containing all elements except the empty string. It is usually denoted A+.[1][2]

More generally, an abstract monoid (or semigroup) S is described as free if it is isomorphic to the free monoid (or semigroup) on some set.[3]

As the name implies, free monoids and semigroups are those objects which satisfy the usual universal property defining free objects, in the respective categories of monoids and semigroups. It follows that every monoid (or semigroup) arises as a homomorphic image of a free monoid (or semigroup). The study of semigroups as images of free semigroups is called combinatorial semigroup theory.


  • 1 Examples
    • 1.1 Natural numbers
    • 1.2 Kleene star
  • 2 Conjugate words
    • 2.1 Equidivisibility
  • 3 Free generators and rank
    • 3.1 Codes
  • 4 Free hull
  • 5 Morphisms
    • 5.1 Test sets
  • 6 Endomorphisms
    • 6.1 String projection
    • 6.2 Sturmian endomorphisms
  • 7 The free commutative monoid
  • 8 Generalization
  • 9 Free monoids and computing
  • 10 See also
  • 11 Notes
  • 12 References


Examples

Natural numbers

The monoid (N0,+) of natural numbers (including zero) under addition is a free monoid on a singleton free generator, in this case the natural number 1. According to the formal definition, this monoid consists of all sequences like "1", "1+1", "1+1+1", "1+1+1+1", and so on, including the empty sequence. Mapping each such sequence to its evaluation result [4] and the empty sequence to zero establishes an isomorphism from the set of such sequences to N0. This isomorphism is compatible with "+", that is, for any two sequences s and t, if s is mapped (i.e. evaluated) to a number m and t to n, then their concatenation s+t is mapped to the sum m+n.
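The isomorphism described above can be sketched in a few lines of Python. This is only an illustration: it writes the sequences as unary words over the one-letter alphabet {"1"} rather than as "1+1+1", and the function names evaluate and unary are hypothetical helpers, not part of the source.

```python
def evaluate(word: str) -> int:
    """Map a unary word such as "111" to the natural number it denotes."""
    if set(word) - {"1"}:
        raise ValueError("expected a word over the one-letter alphabet {'1'}")
    return len(word)


def unary(n: int) -> str:
    """Inverse map: the natural number n as a word of n copies of '1'."""
    return "1" * n


s, t = unary(3), unary(4)
# Concatenation of words corresponds to addition of numbers.
assert evaluate(s + t) == evaluate(s) + evaluate(t)
# The empty word corresponds to zero, the identity of (N0,+).
assert evaluate("") == 0
```

Because evaluate and unary are mutually inverse and evaluate respects concatenation, the pair witnesses the isomorphism on whatever inputs are tried; it is of course not a proof.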

Kleene star

In formal language theory, one usually considers a finite set A of "symbols", called an "alphabet"; a finite sequence of symbols is called a "word over A", and the free monoid A* is called the "Kleene star of A". Thus, the abstract study of formal languages can be thought of as the study of subsets of finitely generated free monoids. There are deep connections between the theory of semigroups and that of automata. For example, the regular languages over A are the homomorphic pre-images in A* of subsets of finite monoids.

For example, assuming an alphabet A = {a, b, c}, its Kleene star A* contains all concatenations of a, b, and c:

{ε, a, ab, ba, caa, cccbabbc, ...}.
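As a rough illustration, the elements of A* can be enumerated lazily, shortest words first. The generator name kleene_star is a hypothetical choice; the sketch assumes the alphabet is a finite collection of single-character strings.

```python
from itertools import count, islice, product
from typing import Iterable, Iterator


def kleene_star(alphabet: Iterable[str]) -> Iterator[str]:
    """Yield every word over the alphabet, ordered by length, then alphabetically."""
    letters = sorted(set(alphabet))
    if not letters:
        # Over the empty alphabet the only word is the empty word.
        yield ""
        return
    for n in count(0):
        for letters_of_word in product(letters, repeat=n):
            yield "".join(letters_of_word)


first_words = list(islice(kleene_star("abc"), 8))
print(first_words)  # ['', 'a', 'b', 'c', 'aa', 'ab', 'ac', 'ba']
```

The set is infinite as soon as the alphabet is non-empty, so the generator never terminates by itself; islice takes a finite prefix.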

If A is any set, the word length function on A* is the unique monoid homomorphism from A* to (N0,+) that maps each element of A to 1. A free monoid is thus a graded monoid.[5]

Conjugate words

Figure: example for the first case of equidivisibility: m = "UNCLE", n = "ANLY", p = "UN", q = "CLEANLY", and s = "CLE".

We define a pair of words in A* of the form uv and vu as conjugate: the conjugates of a word are thus its circular shifts.[6] Two words are conjugate in this sense if they are conjugate in the sense of group theory as elements of the free group generated by A.[7]
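A minimal sketch of conjugacy as circular shifting, with words represented as Python strings. The helper names conjugates and are_conjugate are assumptions made for this example.

```python
def conjugates(w: str) -> set:
    """Return all circular shifts vu of w = uv."""
    # The empty word has no positions to cut at, so it is its own only conjugate.
    return {w[i:] + w[:i] for i in range(len(w))} or {w}


def are_conjugate(u: str, v: str) -> bool:
    """u and v are conjugate iff they have equal length and v occurs in uu."""
    return len(u) == len(v) and v in u + u


print(sorted(conjugates("abc")))      # ['abc', 'bca', 'cab']
print(are_conjugate("abab", "baba"))  # True
```

The test "v occurs in uu" is a standard shortcut: every circular shift of u appears as a factor of length |u| in uu, and conversely.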


Equidivisibility

A free monoid is equidivisible: if the equation mn = pq holds, then there exists an s such that either m = ps, sn = q (the first case is illustrated by the example with m = "UNCLE" and n = "ANLY" given above) or ms = p, n = sq.[8] This result is also known as Levi's lemma.[9]
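For words represented as strings, the word s whose existence is asserted here can be computed directly by comparing lengths. The following is a sketch; the function name levi and the convention of returning the case number together with s are choices made for this illustration.

```python
from typing import Tuple


def levi(m: str, n: str, p: str, q: str) -> Tuple[int, str]:
    """Given mn = pq, return (case, s) with
    case 1: m = ps and sn = q,  case 2: ms = p and n = sq."""
    if m + n != p + q:
        raise ValueError("the words do not satisfy mn = pq")
    if len(m) >= len(p):
        s = m[len(p):]          # the part of m that overlaps q
        assert m == p + s and s + n == q
        return 1, s
    s = p[len(m):]              # the part of p that overlaps n
    assert m + s == p and n == s + q
    return 2, s


print(levi("UNCLE", "ANLY", "UN", "CLEANLY"))  # (1, 'CLE')
```

When m and p have the same length, s is the empty word and both cases hold; the sketch reports case 1.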

A monoid is free if and only if it is graded and equidivisible.[8]

Free generators and rank

The members of a set A are called the free generators for A* and A+. The superscript * is then commonly understood to be the Kleene star. More generally, if S is an abstract free monoid (semigroup), then a set of elements which maps onto the set of single-letter words under an isomorphism to a monoid A* (semigroup A+) is called a set of free generators for S.

Each free semigroup (or monoid) S has exactly one set of free generators, the cardinality of which is called the rank of S.

Two free monoids or semigroups are isomorphic if and only if they have the same rank. In fact, every set of generators for a free semigroup or monoid S contains the free generators. It follows that a free semigroup or monoid is finitely generated if and only if it has finite rank.

A submonoid N of A* is stable if u, v, ux, xv in N together imply x in N.[10] A submonoid of A* is stable if and only if it is free.[11] For example, using the set of bits { "0", "1" } as A, the set N of all bit strings containing an even number of "1"s is a stable[12] submonoid[13] of the set A* of all bit strings. While N cannot be freely generated by any set of single bits, it can be freely generated by the set of bit strings { "0", "11", "101", "1001", "10001", ... }.
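The free generation of N in this example can be made concrete: each bit string with an even number of "1"s splits in exactly one way into factors "0" and "1" followed by k zeros and a closing "1". The sketch below computes that factorization; the function name factor_even_ones is hypothetical.

```python
from typing import List


def factor_even_ones(bits: str) -> List[str]:
    """Factor a bit string with an even number of '1's over the generators
    '0', '11', '101', '1001', ... (read left to right)."""
    factors = []
    i = 0
    while i < len(bits):
        if bits[i] == "0":
            factors.append("0")
            i += 1
        elif bits[i] == "1":
            j = bits.find("1", i + 1)   # position of the matching closing '1'
            if j == -1:
                raise ValueError("odd number of '1's: not an element of N")
            factors.append(bits[i:j + 1])
            i = j + 1
        else:
            raise ValueError("not a bit string")
    return factors


print(factor_even_ones("011001010"))  # ['0', '11', '0', '0', '101', '0']
```

At every position the next factor is forced by the current bit, which is why the factorization is unique.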


Codes

A set of free generators for a free monoid P is referred to as a basis for P: a set of words C is a code if C* is a free monoid and C is a basis.[3] A set X of words in A* is a prefix, or has the prefix property, if it does not contain a proper (string) prefix of any of its elements. Every prefix in A+ is a code, indeed a prefix code.[3][14]
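The prefix property is easy to test for a finite set of words, and it makes decoding deterministic: reading left to right, at most one code word can match at each position. The following sketch assumes a finite set of non-empty strings; is_prefix_set and decode are hypothetical names.

```python
from typing import Iterable, List


def is_prefix_set(words: Iterable[str]) -> bool:
    """True if no word of the set is a proper prefix of another word of the set."""
    ws = set(words)
    return not any(u != v and v.startswith(u) for u in ws for v in ws)


def decode(word: str, code: Iterable[str]) -> List[str]:
    """Factor `word` over a finite prefix code contained in A+."""
    code = set(code)
    if "" in code or not is_prefix_set(code):
        raise ValueError("expected a prefix set of non-empty words")
    factors, i = [], 0
    while i < len(word):
        matches = [c for c in code if word.startswith(c, i)]
        if not matches:
            raise ValueError("word is not a product of code words")
        factors.append(matches[0])      # unique because of the prefix property
        i += len(matches[0])
    return factors


print(is_prefix_set({"0", "10", "110"}))        # True
print(decode("0100110", {"0", "10", "110"}))    # ['0', '10', '0', '110']
```

A set such as {"0", "01"} fails the test because "0" is a proper prefix of "01".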

A submonoid N of A* is right unitary if x, xy in N implies y in N. A submonoid is generated by a prefix if and only if it is right unitary.[15]

Free hull

The intersection of free submonoids of a free monoid A* is again free.[16][17] If S is a subset of a free monoid A* then the intersection of all free submonoids of A* containing S is well-defined, since A* itself is free, and contains S; it is a free monoid. A basis for this intersection is the free hull of S.

The defect theorem[16][17][18] states that if X is finite and C is the free hull of X, then either X is a code and C = X, or

|C| ≤ |X| − 1 .


Morphisms

A monoid morphism f from a free monoid B* to a monoid M is a map such that f(xy) = f(x)⋅f(y) for words x,y and f(ε) = ι, where ε and ι denote the identity elements of B* and M, respectively. The morphism f is determined by its values on the letters of B and conversely any map from B to M extends to a morphism. A morphism is non-erasing[19] or continuous[20] if no letter of B maps to ι and trivial if every letter of B maps to ι.[21]

A morphism f from a free monoid B* to a free monoid A* is total if every letter of A occurs in some word in the image of f; cyclic[21] or periodic[22] if the image of f is contained in {w}* for some word w of A*. A morphism f is k-uniform if the length |f(a)| is constant and equal to k for all a in B.[23][24] A 1-uniform morphism is strictly alphabetic[20] or a coding.[25]
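Since a morphism between free monoids is determined by the images of the letters, it can be sketched as a dictionary from letters to words. The helper names apply, is_nonerasing and uniform_length below are assumptions made for this illustration, and the example morphism (a to ab, b to ba) is chosen only as a convenient 2-uniform instance.

```python
from typing import Dict, Optional

Morphism = Dict[str, str]   # letter of B -> word over A


def apply(f: Morphism, word: str) -> str:
    """Extend the letter map to words: f(x1...xn) = f(x1)...f(xn), f(empty) = empty."""
    return "".join(f[letter] for letter in word)


def is_nonerasing(f: Morphism) -> bool:
    """No letter is mapped to the empty word."""
    return all(image != "" for image in f.values())


def uniform_length(f: Morphism) -> Optional[int]:
    """Return k if f is k-uniform, otherwise None."""
    lengths = {len(image) for image in f.values()}
    return lengths.pop() if len(lengths) == 1 else None


f = {"a": "ab", "b": "ba"}
x, y = "ab", "ba"
assert apply(f, x + y) == apply(f, x) + apply(f, y)   # morphism property
print(apply(f, "ab"), is_nonerasing(f), uniform_length(f))  # abba True 2
```

A map such as {"a": "", "b": "b"} is erasing and not uniform, so is_nonerasing returns False and uniform_length returns None for it.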

A morphism f from a free monoid B* to a free monoid A* is simplifiable if there is an alphabet C of cardinality less than that of B such that the morphism f factors through C*; otherwise f is elementary. The morphism f is called a code if the image of the alphabet B under f is a code: every elementary morphism is a code.[26]

Test sets

For L a subset of B*, a finite subset T of L is a test set for L if morphisms f and g on B* agree on L if and only if they agree on T. The Ehrenfeucht conjecture is that any subset L has a test set:[27] it has been proved[28] independently by Albert and Lawrence; McNaughton; and Guba. The proofs rely on Hilbert's basis theorem.[29]


Endomorphisms

An endomorphism of A* is a morphism from A* to itself.[30] The identity map I is an endomorphism of A*, and the endomorphisms form a monoid under composition of functions.

An endomorphism f is prolongable if there is a letter a such that f(a) = as for a non-empty string s.[31]
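When f(a) = as with s non-empty, each iterate of f on a is a proper prefix of the next one, so the iterates grow toward an infinite word. A small sketch, again representing an endomorphism as a dictionary on letters; the names is_prolongable_on and iterate, and the example endomorphism (a to ab, b to ba), are illustrative assumptions.

```python
from typing import Dict


def is_prolongable_on(f: Dict[str, str], a: str) -> bool:
    """True if f(a) = a s for some non-empty word s."""
    return len(f[a]) > 1 and f[a][0] == a


def iterate(f: Dict[str, str], a: str, n: int) -> str:
    """Return the n-fold image f^n(a)."""
    word = a
    for _ in range(n):
        word = "".join(f[letter] for letter in word)
    return word


f = {"a": "ab", "b": "ba"}
print(is_prolongable_on(f, "a"))            # True
print([iterate(f, "a", n) for n in range(4)])
# ['a', 'ab', 'abba', 'abbabaab']
```

Each listed word is a prefix of its successor, which is the property that prolongability is meant to guarantee.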

String projection

The operation of string projection is an endomorphism. That is, given a letter a ∈ Σ and a string s ∈ Σ*, the string projection p_a(s) removes every occurrence of a from s; it is formally defined by

p_a(s) = \begin{cases} \varepsilon & \text{if } s=\varepsilon, \text{ the empty string} \\ p_a(t) & \text{if } s=ta \\ p_a(t)b & \text{if } s=tb \text{ and } b\ne a. \end{cases}

Note that string projection is well-defined even if the rank of the monoid is infinite, as the above recursive definition works for all strings of finite length. String projection is a morphism in the category of free monoids, so that

p_a\left(\Sigma^*\right)= \left(\Sigma-a\right)^*

where p_a\left(\Sigma^*\right) is understood to be the free monoid of all finite strings that don't contain the letter a. The identity morphism is p_\varepsilon, as clearly p_\varepsilon(s)=s for all strings s. Of course, it commutes with the operation of string concatenation, so that p_a(st)=p_a(s)p_a(t) for all strings s and t. There are many right inverses to string projection, and thus it is a split epimorphism.
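For strings over single-character letters, the recursive definition above amounts to filtering out one letter. The sketch below uses the hypothetical name project and checks the morphism property on one example.

```python
def project(a: str, s: str) -> str:
    """String projection p_a(s): delete every occurrence of the letter a from s."""
    return "".join(b for b in s if b != a)


s, t = "banana", "bandana"
print(project("a", s))   # bnn
# Projection commutes with concatenation: p_a(st) = p_a(s) p_a(t).
assert project("a", s + t) == project("a", s) + project("a", t)
# The image contains no occurrence of the removed letter.
assert "a" not in project("a", s + t)
```

The comprehension visits the letters from left to right, whereas the formal definition peels letters off the right end; both produce the same word.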

String projection is commutative, as clearly

p_a(p_b(s))=p_b(p_a(s)).


For free monoids of finite rank, this follows from the fact that free monoids of the same rank are isomorphic, as projection reduces the rank of the monoid by one.

String projection is idempotent, as

p_a(p_a(s))=p_a(s)


for all strings s. Thus, projection is an idempotent, commutative operation, and so it forms a bounded semilattice or a commutative band.

Sturmian endomorphisms

An endomorphism of the free monoid B* on a 2-letter alphabet B is Sturmian if it maps every Sturmian word to a Sturmian word[32][33] and locally Sturmian if it maps some Sturmian word to a Sturmian word.[34] The Sturmian endomorphisms form a submonoid of the monoid of endomorphisms of B*.[32]

Define endomorphisms φ and ψ of B*, where B = {0,1}, by φ(0) = 01, φ(1) = 0 and ψ(0) = 10, ψ(1) = 0. Then I, φ and ψ are Sturmian,[35] and the Sturmian endomorphisms of B* are precisely those endomorphisms in the submonoid of the endomorphism monoid generated by {I,φ,ψ}.[33][34][36]
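The two endomorphisms can be tried out on finite words. The sketch below iterates φ on the letter 0 and checks that the result is balanced, using the usual definition that a binary word is balanced when any two factors of equal length differ by at most one in their number of 1s (this definition, and the names PHI, PSI, apply, iterate and is_balanced, are supplied here for illustration and do not come from the text above).

```python
from typing import Dict

PHI = {"0": "01", "1": "0"}
PSI = {"0": "10", "1": "0"}


def apply(f: Dict[str, str], word: str) -> str:
    """Image of a word under the endomorphism given by its values on letters."""
    return "".join(f[letter] for letter in word)


def iterate(f: Dict[str, str], word: str, n: int) -> str:
    """Apply f to the word n times."""
    for _ in range(n):
        word = apply(f, word)
    return word


def is_balanced(w: str) -> bool:
    """Any two factors of w of the same length differ by at most one in their count of '1'."""
    for n in range(1, len(w) + 1):
        ones = [w[i:i + n].count("1") for i in range(len(w) - n + 1)]
        if max(ones) - min(ones) > 1:
            return False
    return True


w = iterate(PHI, "0", 5)
print(w)                # 0100101001001
print(is_balanced(w))   # True
print(is_balanced("0011"))  # False
```

Checking finite words only gives evidence, not a proof, that an endomorphism is Sturmian; the sketch is meant to make the definitions tangible.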

A primitive substitution is Sturmian if the image of the word 10010010100101 is balanced.[33][37]

The free commutative monoid

Given a set A, the free commutative monoid on A is the set of all finite multisets with elements drawn from A, with the monoid operation being multiset sum and the monoid unit being the empty multiset.

For example, if A = {a, b, c}, elements of the free commutative monoid on A are of the form

{ε, a, ab, a^2b, ab^3c^4, ...}.
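Finite multisets with multiset sum are available in Python as collections.Counter with the + operator, which gives a convenient model of the free commutative monoid. The variable names below are illustrative.

```python
from collections import Counter

# Multisets over A = {a, b, c}; Counter("abbbcccc") stands for a b^3 c^4.
x = Counter("ab")
y = Counter("abbbcccc")
unit = Counter()                 # the empty multiset

assert x + y == y + x            # the operation is commutative
assert x + unit == x             # the empty multiset is the identity
assert Counter("ab") == Counter("ba")   # order of letters is forgotten
print(sorted((x + y).items()))   # [('a', 2), ('b', 4), ('c', 4)]
```

Two words over A give the same multiset exactly when one is a rearrangement of the other, which is how the free commutative monoid arises as a quotient of the free monoid.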

The fundamental theorem of arithmetic states that the monoid of positive integers under multiplication is a free commutative monoid on an infinite set of generators, the prime numbers.
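This correspondence can be sketched by mapping each positive integer to the multiset of its prime factors; multiplication of integers then becomes multiset sum. The function name prime_multiset is hypothetical, and trial division is used only because it is short.

```python
from collections import Counter


def prime_multiset(n: int) -> Counter:
    """Multiset of prime factors of a positive integer, by trial division."""
    if n < 1:
        raise ValueError("expected a positive integer")
    factors = Counter()
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] += 1
            n //= p
        p += 1
    if n > 1:
        factors[n] += 1          # the remaining cofactor is prime
    return factors


assert prime_multiset(1) == Counter()     # 1 corresponds to the empty multiset
assert prime_multiset(12 * 35) == prime_multiset(12) + prime_multiset(35)
print(sorted(prime_multiset(360).items()))  # [(2, 3), (3, 2), (5, 1)]
```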

The free commutative semigroup is the subset of the free commutative monoid which contains all multisets with elements drawn from A except the empty multiset.


Generalization

The free partially commutative monoid, or trace monoid, is a generalization that encompasses both the free and free commutative monoids as instances. This generalization finds applications in combinatorics and in the study of parallelism in computer science.

Free monoids and computing

The free monoid on a set A corresponds to lists of elements from A with concatenation as the binary operation. A monoid homomorphism from the free monoid to any other monoid (M,•) is a function f such that

  • f(x1…xn) = f(x1) • … • f(xn)
  • f() = e

where e is the identity on M. Computationally, every such homomorphism corresponds to a map operation applying f to all the elements of a list, followed by a fold operation which combines the results using the binary operator •. This computational paradigm (which can be generalised to non-associative binary operators) has inspired the MapReduce software framework.
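The map-then-fold description can be sketched with Python lists playing the role of the free monoid. The helper hom below is a hypothetical name; it takes the images of single elements, the binary operation and the identity of the target monoid as parameters.

```python
import operator
from functools import reduce
from typing import Callable, List, TypeVar

X = TypeVar("X")
M = TypeVar("M")


def hom(f: Callable[[X], M], op: Callable[[M, M], M], e: M) -> Callable[[List[X]], M]:
    """Build the homomorphism from lists to (M, op, e) determined by f on single elements."""
    def h(xs: List[X]) -> M:
        # Map f over the list, then fold the results with op, starting from e.
        return reduce(op, map(f, xs), e)
    return h


total_length = hom(len, operator.add, 0)      # lists of strings -> (N0, +)
xs, ys = ["ab", "c"], ["", "def"]
assert total_length([]) == 0                                        # h() = e
assert total_length(xs + ys) == total_length(xs) + total_length(ys)
print(total_length(xs + ys))   # 6
```

Associativity of the operation is what allows the fold to be split across sublists and recombined, as in the second assertion.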

See also


Notes

  1. ^ Lothaire (1997, pp. 2–3)
  2. ^ Pytheas Fogg (2002, p. 2)
  3. ^ a b c Lothaire (1997, p. 5)
  4. ^ Since addition of natural numbers is associative, the result doesn't depend on the order of evaluation, thus ensuring the mapping to be well-defined.
  5. ^ Sakarovitch (2009) p.382
  6. ^ Sakarovitch (2009) p.27
  7. ^ Pytheas Fogg (2002, p. 297)
  8. ^ a b Sakarovitch (2009) p.26
  9. ^ Aldo de Luca; Stefano Varricchio (1999). Finiteness and Regularity in Semigroups and Formal Languages. Springer Berlin Heidelberg. p. 2.  
  10. ^ Berstel, Perrin & Reutenauer (2010, p. 61)
  11. ^ Berstel, Perrin & Reutenauer (2010, p. 62)
  12. ^ if u contains an even number of "1"s, and ux as well, then x must contain an even number of "1"s, too
  13. ^ since it is closed with respect to string concatenation
  14. ^ Berstel, Perrin & Reutenauer (2010, p. 58)
  15. ^ Lothaire (1997, p. 15)
  16. ^ a b Lothaire (1997, p. 6)
  17. ^ a b Lothaire (2011, p. 204)
  18. ^ Berstel, Perrin & Reutenauer (2010, p. 66)
  19. ^ Lothaire (1997, p. 7)
  20. ^ a b Sakarovitch (2009, p. 25)
  21. ^ a b Lothaire (1997, p. 164)
  22. ^ Salomaa (1981) p.77
  23. ^ Lothaire (2005, p. 522)
  24. ^ Berstel, Jean; Reutenauer, Christophe (2011). Noncommutative rational series with applications. Encyclopedia of Mathematics and Its Applications 137. Cambridge:  
  25. ^ Allouche & Shallit (2003, p. 9)
  26. ^ Salomaa (1981) p.72
  27. ^ Lothaire (1997, pp. 178–179)
  28. ^ Lothaire (2011, p. 451)
  29. ^  
  30. ^ Lothaire (2011, p. 450)
  31. ^ Allouche & Shallit (2003) p.10
  32. ^ a b Lothaire (2011, p. 83)
  33. ^ a b c Pytheas Fogg (2002, p. 197)
  34. ^ a b Lothaire (2011, p. 85)
  35. ^ Lothaire (2011, p. 84)
  36. ^ Berstel, J.; Séébold, P. (1994). "A remark on morphic Sturmian words". RAIRO, Inform. Théor. Appl. 28 (3-4): 255–263.
  37. ^ Berstel, Jean; Séébold, Patrice (1993), "A characterization of Sturmian morphisms", in Borzyszkowski, Andrzej M.; Sokołowski, Stefan, Mathematical Foundations of Computer Science 1993. 18th International Symposium, MFCS'93 Gdańsk, Poland, August 30–September 3, 1993 Proceedings, Lecture Notes in Computer Science 711, pp. 281–290,  


References

  • Allouche, Jean-Paul;
  • Berstel, Jean; Perrin, Dominique; Reutenauer, Christophe (2010), Codes and automata, Encyclopedia of Mathematics and its Applications 129, Cambridge:  
  • Pytheas Fogg, N. (2002), Substitutions in dynamics, arithmetics and combinatorics, Lecture Notes in Mathematics 1794, Editors Berthé, Valérie; Ferenczi, Sébastien; Mauduit, Christian; Siegel, A., Berlin:  
  • Sakarovitch, Jacques (2009), Elements of automata theory, Translated from the French by Reuben Thomas, Cambridge:  
This article was sourced under the Creative Commons Attribution-ShareAlike License; additional terms may apply.