Mikołaj Bojańczyk


I work at the University of Warsaw. My research is on logic in computer science, with a focus on automata.



Papers selected and grouped

In the list below, papers are grouped by topic, and links are provided to the most recent versions (i.e. journal preferred over conferences). A complete list of papers with pdfs, chronologically organised based on dblp, is also provided.

Recognisable Languages over Monads

  1. Mikołaj Bojańczyk
    Recognisable languages over monads. CoRR, 2015   PDF

There are regular languages for: finite words, finite trees, infinite words, various kinds of infinite trees, queries, tuples of words, tuples of trees, etc etc. This paper shows that monads, a concept from category theory, can be used to unify some of the theory of “regularity”. The idea is that for each setting, like finite words or labelled countable total orders, or infinite trees with countably many branches, one chooses a monad, and then many definitions and some theorems come out for free. Here are some slides.

Transducers with origin information

  1. Mikołaj Bojańczyk
    Transducers with Origin Information. ICALP (2), 2014   PDF

This paper is about transducers. The idea is to change the semantics of a transducer, by adding information about which positions in the output originate from which positions in the input. Under such semantics, called origin semantics, fewer transducers are equivalent to each other, e.g. the reverse and identity transformations over a one letter alphabet are not equivalent under origin semantics, but they are equivalent under standard semantics. The paper shows that with origin semantics, many technical and combinatorial problems go away. There is also an implicit and debatable philosophical point, which is that origin semantics better reflects the “true essence” of a transducer, because when thinking of a transducer one also typically has in mind some origin information. Here are some slides.


  1. Mikołaj Bojańczyk
    A Bounding Quantifier. CSL, 2004   PDF

  2. Mikołaj Bojańczyk, Thomas Colcombet
    Bounds in w-Regularity. LICS, 2006   PDF

The logic MSO+U is the extension of MSO with the quantifier U, where UX \phi(X) says that there are arbitrarily large finite sets X which make \phi(X) true. The point of the logic is to express boundedness properties like: “a graph has bounded outdegree” or “a sequence of numbers tends to infinity”. When introducing the logic in the first paper above, I thought that it would be decidable. The first signs were promising, e.g. the paper “Bounds in \omega-regularity” showed that counter automata could be used to decide fragments of the logic. Now we know that the full logic is undecidable, but the problems seem to come from quantifying over infinite sets, and weak fragments are decidable. Here are some slides.

MSO+U is undecidable

  1. Mikołaj Bojańczyk, Pawel Parys, Szymon Torunczyk
    The MSO+U theory of (N, <) is undecidable. CoRR, 2015   PDF

  2. Mikołaj Bojańczyk, Tomasz Gogacz, Henryk Michalewski, Michal Skrzypczak
    On the Decidability of MSO+U on Infinite Trees. ICALP (2), 2014   PDF

These papers show that MSO+U (a quantitative extension of MSO) is undecidable. The 2014 paper shows that MSO+U is undecidable over infinite trees, and with an additional set-theoretic assumption called V=L. The 2015 paper (preprint) deprecates the 2014 one by showing undecidability already on infinite words, and not using any set-theoretic assumptions. This group of undecidability results should be contrasted with the decidability results on weak MSO+U.

Weak MSO+U on infinite words and trees

  1. Mikołaj Bojańczyk
    Weak MSO with the Unbounding Quantifier. Theory Comput. Syst., 2011   PDF

  2. Mikołaj Bojańczyk, Szymon Torunczyk
    Deterministic Automata and Extensions of Weak MSO. FSTTCS, 2009   PDF

  3. Mikołaj Bojańczyk, Szymon Torunczyk
    Weak MSO+U over infinite trees. STACS, 2012   PDF

  4. Mikołaj Bojańczyk
    Weak MSO+U with Path Quantifiers over Infinite Trees. ICALP (2), 2014   PDF

This group of papers is about the weak fragment of MSO+U. This is a logic where one can quantify over elements, finite sets of elements (hence the name weak), and use the quantifier U, which is used to say that certain sets have unbounded size.  Without the weak restriction, i.e. for MSO+U, the logic is undecidable already on infinite words. This undecidability motivates the study of fragments of MSO+U, and the “weak” restriction which allows only quantification over finite sets is one obvious way to go.


  1. Mikołaj Bojańczyk, Laurent Braud, Bartek Klin, Slawomir Lasota
    Towards nominal computation. POPL, 2012   PDF

  2. Mikołaj Bojańczyk, Szymon Torunczyk
    Imperative Programming in Sets with Atoms. FSTTCS, 2012   PDF

  3. Mikołaj Bojańczyk, Thomas Place
    Toward Model Theory with Data Values. ICALP (2), 2012   PDF

  4. Mikołaj Bojańczyk
    Modelling Infinite Structures with Atoms. WoLLIC, 2013   PDF

  5. Mikołaj Bojańczyk, Bartek Klin, Slawomir Lasota, Szymon Torunczyk
    Turing Machines with Atoms. LICS, 2013   PDF

Atoms are a theory project that we are developing in Warsaw, which even has its blog. I am very slowly writing a book on this, although progress has recently stalled. The project was originally motivated by trying to understand automata on data words and data trees. The idea is to consider sets with ur-elements that are called atoms, and to consider a different notion of finiteness: a set is called orbit-finite if it has finitely many elements up to permutations of the atoms.

Automata with atoms

  1. Mikołaj Bojańczyk
    Nominal Monoids. Theory Comput. Syst., 2013   PDF

  2. Mikołaj Bojańczyk, Bartek Klin, Slawomir Lasota
    Automata theory in nominal sets. Logical Methods in Computer Science, 2014   PDF

  3. Mikołaj Bojańczyk, Slawomir Lasota
    A Machine-Independent Characterization of Timed Languages. ICALP (2), 2012   PDF

  4. Mikołaj Bojańczyk, Luc Segoufin, Szymon Torunczyk
    Verification of database-driven systems via amalgamation. PODS, 2013   PDF

This is a series of papers on the question: “what is a regular language in sets with atoms?” and as such is a part of the project on sets with atoms. The first paper is about orbit-finite monoids, and it proves a Schützenberger theorem  for them, demonstrating that the structural theory of monoids extends rather smoothly to orbit-finite monoids. Paper [2] studies orbit-finite automata, which are more powerful than finite monoids. In a certain sense, orbit-fintie automata are the same thing as register automata of Francez and Kaminski, but the formalism of orbit-finiteness is a bit more flexible, e.g. allowing a Myhill-Nerode theorem and effective minimization. Paper [3] continues the study of the Myhill-Nerode theorem, but it does so for a kind of atoms that correspond to timed automata, where the assumptions of paper [2] are not satisfied. Finally, paper [4] gives a database angle on the automata from [2], although it tries to avoid using the word atom in order not to annoy database theorists, including Luc Segoufin.

Data words

  1. Mikołaj Bojańczyk, Claire David, Anca Muscholl, Thomas Schwentick, Luc Segoufin
    Two-variable logic on data words. ACM Trans. Comput. Log., 2011   PDF

  2. Henrik Björklund, Mikołaj Bojańczyk
    Shuffle Expressions and Words with Nested Data. MFCS, 2007   PDF

  3. Mikołaj Bojańczyk
    Automata for Data Words and Data Trees. RTA, 2010   PDF

A data word is like a word over an infinite alphabet. (A related topic is data trees, discussed here.) The definition used in the papers above is that each position contains a label from a finite set, plus a data value from an infinite set. The data values can only be compared for equality, so a typical property of words is: “there exist two positions which have the same data value and both have label a“. (A more general notion of data word is obtained by using atoms, discussed here.) These papers study the automata-logic connection for data words. The problem is that this connection is not so robust as in the case of finite alphabets, and there are lots of different models which achieve different trade-offs between expressivity and decidability.  The first two papers are about two variable first-order logic, and the third one is a (slightly dated) survey of the topic.

Data trees

  1. Mikołaj Bojańczyk, Anca Muscholl, Thomas Schwentick, Luc Segoufin
    Two-variable logic on data trees and XML reasoning. J. ACM, 2009   PDF

  2. Henrik Björklund, Mikołaj Bojańczyk
    Bounded Depth Data Trees. ICALP, 2007   PDF

  3. Mikołaj Bojańczyk, Slawomir Lasota
    An extension of data automata that captures XPath Logical Methods in Computer Science, 2012   PDF

  4. Vince Bárány, Mikołaj Bojańczyk, Diego Figueira, Pawel Parys
    Decidable classes of documents for XPath. FSTTCS, 2012   PDF

  5. Mikołaj Bojańczyk, Filip Murlak, Adam Witkowski
    Containment of Monadic Datalog Programs via Bounded Clique-Width. ICALP (2), 2015   PDF

These are papers about data trees, which are the tree version of data words, i.e. these are trees where each node has a label from a finite alphabet and a data value from an infinite alphabet, with the provision that data values can only be compared for equality. The motivation behind data trees is that they are supposed to be like XML documents. This is a sequence of papers which tries to tackle the satisfiability problem for various logics over data trees. All of the papers use automata that run over data trees.

XML algorithms

  1. Mikołaj Bojańczyk, Pawel Parys
    XPath evaluation in linear time. J. ACM, 2011   PDF

  2. Mikołaj Bojańczyk, Pawel Parys
    Efficient Evaluation of Nondeterministic Automata Using Factorization Forests. ICALP (1), 2010   PDF

  3. Mikołaj Bojańczyk, Diego Figueira
    Efficient evaluation for a temporal logic on changing XML documents. PODS  PDF

This is a group of algorithms that work with XML documents in linear or almost linear time. This is in contrast with the XML motivated algorithms on data words and data trees, where typically the running time is exponential or much worse. The first paper shows that XPath can be evaluated in linear time (improving on previous quadratic algorithms), while the second paper tries to explain this phenomenon using automata theory. The last paper considers evolving XML documents. It describes a logic that can express properties like “if a node in the document gets label a at some point in time, then at later points in time all of its descendants will have label b“. The main contribution is an algorithm that runs in time c \cdot n \cdot \log n where n is the number of edits, and c is a constant that depends (unfortunately, in a nonelementary way) on the formula.

Algebra for finite trees

  1. Mikołaj Bojańczyk, Igor Walukiewicz
    Forest algebras. Logic and Automata, 2008   PDF

  2. Mikołaj Bojańczyk
    Two-Way Unary Temporal Logic over Trees Logical Methods in Computer Science, 2009   PDF

  3. Mikołaj Bojańczyk, Luc Segoufin, Howard Straubing
    Piecewise testable tree languages Logical Methods in Computer Science, 2012   PDF

  4. Mikołaj Bojańczyk, Luc Segoufin
    Tree Languages Defined in First-Order Logic with One Quantifier Alternation Logical Methods in Computer Science, 2010   PDF

  5. Mikołaj Bojańczyk, Howard Straubing, Igor Walukiewicz
    Wreath Products of Forest Algebras, with Applications to Tree Logics Logical Methods in Computer Science, 2012   PDF

The main motivation for this group of papers is the following decision problem, whose decidability remains open: decide if a given regular language of finite trees can be defined by a formula of first-order logic that uses predicates for the labels and the descendant relation. In other words, the question is about the tree version of the Schützenberger theorem. Since Schützenberger was so successful with monoids, we thought it would be a good idea to introduce something like monoids for trees, which is the topic of paper [1] that introduces forest algebra. Unfortunately, we were unable to use these monoids to characterize all of first-order logics, but at least some fragments can be described in terms of structural properties of the underlying forest algebras.

Algebra for infinite trees

  1. Mikołaj Bojańczyk, Tomasz Idziaszek
    Algebra for Infinite Forests with an Application to the Temporal Logic EF. CONCUR, 2009   PDF

  2. Mikołaj Bojańczyk, Thomas Place
    Regular Languages of Infinite Trees That Are Boolean Combinations of Open Sets. ICALP (2), 2012   PDF

  3. Mikołaj Bojańczyk, Tomasz Idziaszek, Michal Skrzypczak
    Regular languages of thin trees. STACS, 2013   PDF

There are algebras for finite words (monoids or semigroups), infinite words (e.g. ω-semigroups or Wilke semigroups) and finite trees (e.g. clones or forest algebras). What about infinite trees? In [1] we propose a notion of algebra for infinite trees, which is an extension of forest algebra. It is not completely satisfactory (there is no result that finite algebras can be represented in a finite way), but it is good enough to characterize some logics, e.g. the temporal logic EF. The paper [2] continues the study of algebras for infinite trees, by giving a structural characterization of some random logic (namely, Boolean combinations of sets that are open in the Borel topology). The idea is to learn what kind of structural theory of regular tree languages is needed to do nontrivial characterizations. Finally, the last paper [3] shows that most of the technical problems go away when considering thin trees, which are infinite trees with countably many branches.

Tree-walking automata

  1. Mikołaj Bojańczyk, Thomas Colcombet
    Tree-walking automata cannot be determinized. Theor. Comput. Sci., 2006   PDF

  2. Mikołaj Bojańczyk, Thomas Colcombet
    Tree-Walking Automata Do Not Recognize All Regular Languages. SIAM J. Comput., 2008   PDF

  3. Mikołaj Bojańczyk
    Tree-Walking Automata. LATA, 2008   PDF

A tree-walking automaton is a natural model of automaton for trees. If you would be absolutely fixed on the idea that an automaton must have a single head, then this is the model you would come up with. The idea is that the automaton has a single head, and traverses the tree by going up and down, typically doing something like a depth-first search. This is a series of papers devoted to showing that the model is not well-behaved. The first paper shows they cannot be determinised, the second that they are strictly weaker than the classical model of branching tree automata, and the third is a survey of the topic.

Finite satisfiability for two-way μ-calculus and guarded fixpoint logic

  1. Mikołaj Bojańczyk
    Two-Way Alternating Automata and Finite Models. ICALP, 2002   PDF

  2. Vince Bárány, Mikołaj Bojańczyk
    Finite satisfiability for guarded fixpoint logic. Inf. Process. Lett., 2012   PDF

The two-way modal μ-calculus is a logic which does not have the finite model property, i.e. some formulas are satisfiable but not satisfiable in any finite model. The same holds for guarded fixpoint logic, a logic introduced by Grädel and Walukiewicz. These two papers show that for both of these logics, one can decide if a given formula has a finite model. For the first paper, about the μ-calculus, the solution boils down to finding trees where certain patterns have bounded size. (Trying to understand this boundedness, I later introduced the logic MSO+U.) For the second paper, about guarded fixpoint logic, one needs to use the first paper plus results on cutting and pasting in models for guarded logics by Bárány, Gottlob and Otto.