CANTE: Descriptional and computational complexity of formal languages

Automata theory is a fundamental part of Computer Science that was intensively studied during the 1960 ́s, and has again, in the last years, become a matter of fruitful research, both at theoretical and applied level. This renewed interest is explained by the important role that new applications of automata theory play in fields such as computational linguistics, bioinformatics, speech and image recognition, software certification and computer networks, to mention only a few. We emphasize the emergence in the last 12 years of annual international conferences exclusively in this area, such as the "International Conference on Implementation and Application of Automata" (CIAA), "Developments in Language Theory" (DLT), "Descriptional Complexity of Formal Systems" (DCFS) and "Conference on Language and Automata Theory and Applications" (LATA) (this last one, since 2006).
One of the areas that has received special attention is the descriptional complexity of formal languages, which includes complexity measures, relative conciseness, complexity of language operations and of conversions between representations. These studies are motivated by the need to have good estimates of the amount of resources required to manipulate those representations. This is crucial in new applied areas where automata and other formal models are used, for instance, for pattern matching in bioinformatics or network security, or for model checking or security certificates in formal verification systems. In general, having smaller objects will improve our control on software, which may become smaller, more efficient and certifiable.
Another decisive issue for the success of these applications is a new approach to computational complexity. Worst-case analysis for most classical finite automata algorithms are well known. The practical performance of these algorithms, however, is sometimes surprising. The fastest finite automata minimization algorithms are not the ones with the best worst-case complexity. On the other hand, worst-case exponential algorithms, such as the powerset construction, rarely leads to an exponential size explosion. This motivates complexity analyses in the average-case, or other fine tuned analyses as the recent successful smoothed analysis. For average-case analysis, enumeration and random generation of representations are essential.
Finally, it is also a consolidated idea that computational symbolic manipulation environments are important tools for assisting research on formal and computational models. But, due to its combinatorial nature, high-performance implementations are required.
The ultimate aim of this project proposal is the obtention of new, concrete characterizations of formal language representations, both from the descriptional as well as computational complexity perspective, leading to efficient manipulation methods.
The innovation of this proposal resides in the combination of different approaches, namely combinatorial, algorithmic, and the use of several complexity measures, connected with the diverse background expertise of the team members. We highlight some of the expected contributions to open problems and challenges in the area:

  • Constructive enumeration of formal language representations that can lead to efficient random generators;
  • Graph theory-based characterizations of some automata properties that allow a better understanding of transformations and operations between representations;
  • Practical analysis of classical algorithms by means of average-case analysis, Kolmogrov complexity, or smoothed analysis;
  • Compact and decidable procedures for extended Kleene algebras and automata-based formal verification, based on results obtained for regular language characterizations;
  • High-performance software tools for manipulation and visualization of automata and related structures.

We are particularly motivated in the application of the results of this project for the construction of automata-based certificates of program properties, and algebraic and coalgebraic methods for proof systems based in Kleene algebras. Research on this topic is currently undertaken by some team members within the Rescue project (PTDC/EIA/65862/2006).



Start date: 

Friday, 1 January, 2010

Area / Group: 

Semigroups, Automata and Languages


Rogério Reis
António Machiavelo
Nelma Moreira

Other members: 

Luis Filipe Coelho Antunes
Marco André Ferreira de Almeida

Financial support: 

90 000EUR

Funding entity: 


Project reference: