How likely is that a randomly given (non-) deterministic finite automaton recognizes no word? A quick reflection seems to indicate that not too many finite automata accept no word; but, can this intuition be confirmed?
In this paper we offer a statistical approach which allows us to conclude that for automata, with a large enough number of states, the probability that a given (non-) deterministic finite automaton recognizes no word is close to zero. More precisely, we will show, with a high degree of accuracy (i.e., with precision higher than 99% and level of confidence 0.9973), that for both deterministic and non-deterministic finite automata: a) the probability that an automaton recognizes no word tends to zero when the number of states and the number of letters in the alphabet tend to infinity, b) if the number of states is fixed and rather small, then even if the number of letters of the alphabet of the automaton tends to infinity, the probability is strictly positive. The result a) is obtained via a statistical analysis; for b) we use a combinatorial and statistical analysis.
The present analysis shows that for all practical purposes the fraction of automata recognizing no words tends to zero when the number of states and the number of letters in the alphabet grow indefinitely. From a theoretical point of view, the result can motivate the search for ``certitude'', that is, a proof of the fact established here in probabilistic terms.
In the last section we critically discuss the result and the method used in this paper.
An algebraic characterization of simple splicing languages given in [6] is extended to semi-simple splicing languages.
We consider two complexity parameters related to the graph of reachable configurations of a given P system, namely the outdegree as a measure of the degree of non-determinism, and the indegree as a measure of the degree of confluence. These parameters can be defined for both the generative and the accepting mode of using a P system. We investigate here these parameters in what concerns hierarchies and decidability issues. We prove that all hierarchies have only two levels and that all considered decidability problems have a negative answer.
In this paper we discuss some relationships between cooperating distributed (CD) grammar systems and the basic process algebra (BPA) calculus. We associate different types of process graphs from this calculus to CD grammar systems which describe the behavior of the components of the system under cooperation. We prove that these process graphs form a subalgebra of the graph model of BPA. It is also shown that for certain restricted variants of CD grammar systems and for certain types of these process graphs the bisimilarity of two process graphs is decidable.
Secret sharing schemes, introduced by Blakley and Shamir independently in 1979, have a number of applications in security systems. One approach to the construction of secret sharing schemes is based on coding theory. In principle, every linear code can be used to construct secret sharing schemes. But only well structured linear codes give secret sharing schemes with nice access structures in the sense that every pair of participants plays the same role in the secret sharing. In this paper, we construct a class of good linear codes, and use them to obtain a class of secret sharing schemes with nice access structures.
The paper investigates inference based on quantities |w|_{u}, the number of occurrences of a word u as a scattered subword of w. Parikh matrices recently introduced are useful tools for such investigations. We introduce and study universal languages for Parikh matrices. We also obtain results concerning the inference from numbers |w|_{u} to w, as well as from certain entries of a Parikh matrix to other entries.
We introduce generalized trajectories where the individual symbols are interpreted as operations performed on the operand words. The various previously considered trajectory-based operations can all be expressed in this formalism. It is shown that the generalized operations can simulate Turing machine computations. We consider the equivalence problem and a notion of unambiguity that is sufficient to make equivalence decidable for regular sets of trajectories under nonincreasing interpretations.
We continue investigations of weighted finite automata (WFA) as devices to compute real functions. Based on eigenvalues of the transition matrices of automata we provide a simple necessary condition for continuity and smoothness properties of the functions they compute. Using this condition we show that polynomials are the only smooth functions computed by WFA and that any WFA computing a polynomial of degree k must have at least k+1 states. The results answer problems left open in [7].
We address the problem of designing optimal prefix-free codes over an encoding alphabet with unequal integer letter costs. The most efficient algorithm proposed so far has O(n^{C+2}) time complexity, where n is the number of codewords and C is the maximum letter cost. For the special case when the encoding alphabet is binary, a faster solution was proposed, namely of O(n^{C}) time complexity, based on a more sophisticated modeling of the problem, and on exploiting the Monge property of the cost function. However, those techniques seemed not to extend to the r-letter alphabet. This work proves that, on the contrary, the generalization to the r-letter case is possible, thus leading to a O(n^{C}) time complexity algorithm for the case of arbitrary number of letters.
In the infinite Post Correspondence Problem an instance (h,g) consists of two morphisms h and g, and the problem is to determine whether or not there exists an infinite word a such that h(a) = g(a). In the general case this problem was shown to be undecidable by K. Ruohonen (1985). Recently, it was proved that the infinite PCP is undecidable already when the domain alphabet of the morphisms consists of at least 9 letters. Here we show that the problem is undecidable for instances where the morphisms have a domain of 6 letters, when we restrict to solutions of w-languages of the form R^{w} where R is a given regular language.
We show that it is undecidable whether or not a given N-rational series assumes all nonnegative values. As a consequence we see that it is undecidable whether the image of a given N-rational series equals the image of a given N-rational sequence. We discuss also related results concerning DT0L and D0L length sets.
The original definition of P systems calls for rules to be applied in a maximally parallel fashion. However, in some cases a sequential model may be a more reasonable assumption. Here we study the computational power of different variants of sequential P systems. Initially we look at cooperative systems operating on symbol objects and without prioritized rules, but which allow membrane dissolution and bounded creation rules. We show that they are equivalent to vector addition systems and, hence, nonuniversal. When these systems are used as language acceptors, they are equivalent to communicating P systems which, in turn, are equivalent to partially blind multicounter machines. In contrast, if such cooperative systems are allowed to create an unbounded number of new membranes (i.e., with unbounded membrane creation rules) during the course of the computation, then they become universal. We then consider systems with prioritized rules operating on symbol objects. We show two types of results: there are sequential P systems that are universal and sequential P systems that are nonuniversal. In particular, both communicating and cooperative P systems are universal, even if restricted to 1-deterministic systems with one membrane. However, the reachability problem for multi-membrane catalytic P systems with prioritized rules is NP-complete and, hence, these systems are nonuniversal.
Viruses compress their genome to reduce space. One of the main techniques is overlapping genes. We model this process by the shortest common superstring problem. We give an algorithm for computing optimal solutions which is slow in the number of strings but fast (linear) in their total length. This algorithm is used for a number of viruses with relatively few genes. When the number of genes is larger, we compute approximate solutions using the greedy algorithm which gives an upper bound for the optimal solution. We give also a lower bound for the shortest common superstring problem. The results obtained are then compared with what happens in nature. Remarkably, the compression obtained by viruses is very close to the one achieved by modern computers.
We introduce a type of substitution operation inspired by errors occurring in biologically encoded information. We derive the closure properties of language families in the Chomsky hierarchy under these substitution operations. Moreover, we consider some language equations involving these operations and investigate decidability of the existence of solutions to such equations.
We investigate in this paper a simple intramolecular model for gene assembly in ciliates. Unlike the general intramolecular model, the folds that a micronuclear chromosome may form to assemble the genes is very restricted (minimal) here: between any two pointers there may be at most one coding block. It has been known that the general model is universal, being able to assemble any gene pattern (to sort any signed permutation). The simple model on the other hand is not universal: there exist signed permutations that cannot be sorted in this model. Remarkably though, all known micronuclear gene patterns in ciliates can indeed be assembled in the simple model. We prove in this paper that while the general model is non-deterministic, the simple model is ``weakly deterministic'': any gene pattern either has only successful or only unsuccessful sorting strategies. Moreover, although different strategies may lead to different permutations for the same input, these final results have the same structure.
In this paper we continue the study on synchronized shuffle started in [5] by introducing the condition that the two words which are to be shuffled have to synchronize on a predefined set of backbones. As far as the language-theoretic properties of this operation are concerned, we prove that in a trio the closure under shuffle is equivalent to the closure under synchronized shuffle on a regular set of backbones. Based on this result we show that a set of backbones is regular if and only if the synchronized shuffle of every two regular languages on that set is also regular. Some relationships between this operation and the synchronized shuffle operations introduced in [5] are presented and some open problem are finally discussed.
We investigate the problem of finding monoids that recognize languages of the form L_{1}\Join_{T}L_{2} where T is an arbitrary set of routes. We present a uniform method based on routes to find such monoids. Many classical operations from the theory of formal languages, such as catenation, bi-catenation, simple splicing, shuffle, literal shuffle, and insertion are shown to be just particular instances of the operation \Join_{T}.
With inspiration from the economic reality, where numbers are basic entities to work with, we propose a genuinely new kind of P systems, where numerical variables evolve, starting from initial values, by means of production functions and repartition protocols. We prove that non-deterministic systems of this type, using polynomial production functions, characterize the Turing computable sets of natural numbers, while deterministic systems, with polynomial production functions having non-negative coefficients, compute strictly more than semilinear sets of natural numbers. A series of research topics to be addressed in this framework are mentioned.
The relationship between graphoid independency relations (defined in the text) and such relations induced by Probabilistic Distributions (PD) with binary random variables is investigated. It is shown that there are axioms that are sound for a subset of PD-induced relations with binary variables and are independent of the Graphoid axioms (cannot be logically derived from those axioms).
Bimachines are important conceptual tools used for the characterization of rational word functions (realized by single-valued transducers). Despite the attention received in the past, these sequential machines are far from being exhaustively studied. A natural question which has not been addressed so far is what family of transductions are realized by bimachines that operate nondeterministically. We show that these machines characterize input-unambiguous (IU) rational transductions, i.e., those transductions that can be written as a composition of rational functions and finite substitutions. Two more families of rational transductions are defined and related in a natural way to IU transductions: input-deterministic transductions and rational transductions with finite codomain (FC). We have shown that FC transductions are recognizable and that they can be expressed as finite union of subsequential functions. Moreover, they can be realized by nondeterministic bimachines. Finally, we have defined the so called restricted nondeterministic bimachines and shown that, surprisingly, they are more powerful than nondeterministic bimachines: they characterize exactly the family of finitely ambiguous rational transductions.
We deal with the notion of M-unambiguity [5] in connection with the Parikh matrix mapping introduced by Mateescu and others in [7]. M-unambiguity is studied both in terms of words and matrices and several sufficient criteria for M-unambiguity are provided in both cases, nontrivially generalizing the criteria based on the g-property introduced by Salomaa in [15]. Also, the notion of M-unambiguity with respect to a word is defined in connection with the extended Parikh matrix morphism [16] and some of the M-unambiguity criteria are lifted from the classical setting to the extended one.
This paper is a revised and extended version of [17].
We present a model and a core programming language appropriate for modeling and programming interactive computing systems.
The model consists of rv-systems (interactive systems with registers and voices); it includes register machines, is space-time invariant, is compositional, may describe computations extending in both time and space, and is applicable to open, interactive systems. To achieve modularity in space the model uses voices (a voice is the time dual of a register) - they provide a high level organization of temporal data and are used to describe interaction interfaces of processes.
The programming language uses novel techniques for syntax and semantics to support computation in space paradigm. We describe rv-programs and base their syntax and operational semantics on FIS-es (finite interactive systems) and their grid languages (a FIS is a kind of 2-dimensional automaton specifying both control and interaction used in rv-programs).
We also present specification techniques for rv-systems, using relations between input registers and voices and their output counterparts. The paper includes simple specifications for an OO-system and for an interactive game.
This note begins a study of some elementary properties related to the order structures applied in the algebraic approach to processes semantics. The support examples come from the partially additive semantics developed by Steenstrup (1985) and Manes and Arbib (1986) and from process algebra of Baeten and Weijland (1990). The main sources for the algebraic theory are F.A.Smith (1966) and Golan (1999). We show that different properties can be extended to partially additive distributive algebras more general than sum-ordered partial semirings. One establishes that the support examples constitute multilattices, in the sense of Benado (1955). By the examples, the ordering considered, and the references, this preliminary study is related to Rudeanu et al. (2004) and to the algebraic approach to languages due to Mateescu, e.g., (1996), (1989), (1994).
The Compound Term Composition Algebra (CTCA) is an algebra with four algebraic operators, which can be used to generate the valid (meaningful) compound terms of a given faceted taxonomy, in an efficient and flexible manner. The positive operations allow the derivation of valid compound terms through the declaration of a small set of valid compound terms. The negative operations allow the derivation of valid compound terms through the declaration of a small set of invalid compound terms. In this paper, we show how CTCA can be represented in logic programming with negation-as-failure, according to both Clark's and well-founded semantics. Indeed, the SLDNF-resolution can be used for checking compound term validity and well-formedness of an algebraic expression in polynomial time w.r.t. the size of the expression and the number of terms in the taxonomy. This result makes our logic programming representation a competitive alternative to imperative algorithms. Embedding of our logic programming representation to the programming environment of a web portal for a computer sales company is demonstrated.
We exhibit a low-complexity but non-trivial distance between strings to be used in biology. The experimental results we provide were obtained on a standard laptop and, even if preliminary, are quite encouraging.
A low-complexity grayscale image embedding scheme that can embed multiple secret images is proposed in this paper. In this scheme, different users can extract different secret images according to the secret keys they hold. To reduce the storage cost of the secret images, each of the secret images is first compressed using an improved version of the moment preserving block truncation coding scheme. The compressed message of each secret image is then encrypted by the DES cryptography system with different secret key and then embedded into the host image using the modulus least-significant-bit substitution technique.
According to the experimental results, it is shown that the proposed scheme consumes a little computational cost. Besides, different users can extract different number of secret images. In other words, the proposed scheme indeed provides a good approach for the hiding of grayscale images with access control.
In this paper, we propose an adaptive algorithm for merging n (n ³ 2) prioritized knowledge bases which takes into account the degrees of conflict and agreement among these knowledge bases. The algorithm first selects largely partially maximal consistent subsets (LPMCS) of sources by assessing how (partially) consistent the information in the subset is. Then within each of these created subsets, a maximal consistent subset is further selected and knowledge bases in it are merged with a suitable conjunctive operator based on the degree of agreement among them. This result is then merged with the remaining knowledge bases in the corresponding LPMCS in the second step through the relaxation of the minimum operator. Finally, the knowledge bases obtained from the second step are merged by a maximum operator. In comparison with other merging methods, our approach is more context dependent and is especially useful when most sources of information are in conflict.
In recent years the consideration that events in evolutions of concurrent systems can happen with different histories has received ground. In particular the possibility that part of the history can be abstracted away or identified, like in the collective tokens philosophy for Petri Nets, has gained the stage. The various brands of event structures considered in literature are tailored to a fixed interpretation with respect to the history of an event. We investigate the adequateness of event structures with a disabling/enabling relation, to settle a common ground for the history dependent and history independent interpretations, and we establish a relationship between event automata and these notions of event structures.
The paper proposes a theoretical study of a coordination language embodying Linda's asynchronous communication primitives with a refined matching mechanism based on pairs composed of attribute names associated with their values. Computations in this language are described by means of an operational semantics, reporting the whole traces of executions. The non-compositionality of this intuitive operational semantics motivates the design of a compositional and fully abstract denotational semantics.
In this paper we investigate security problems which occur when exploiting a Linda-like data driven coordination model in an open environment. In this scenario, there is no guarantee that all the agents accessing the shared tuple space are trusted. Starting from a formalization of some typical security properties in the standard Linda coordination model, we present a novel data-driven coordination model which provides mechanisms to support the considered security properties. The first of these mechanisms supports logical partitions of the shared repository: in this way we can restrict the access to tuples stored inside a partition, simply by limiting the access to the partition itself. The second mechanism consists of adding to the tuples some extra information which permits to authenticate the producer of a tuple or to identify its reader/consumer. Finally, we support the possibility to define access control policies based on the kind of operations an agent performs on a tuple, thus discriminating between (destructive) input and (non-destructive) read permissions on each single tuple.
Coordination models like LINDA were first conceived in the context of closed systems, like high-performance parallel applications. There, all coordinated entities were known once and for all at design time, and coordination media were conceptually part of the coordinated application. Correspondingly, traditional formalisations of coordination models - where both coordinated entities and coordination media are uniformly represented as terms of a process algebra - endorse the viewpoint of coordination as a language for building concurrent systems.
The complexity of today application scenarios calls for a new approach to the formalisation of coordination models. Open systems, typically hosting a multiplicity of applications working concurrently, require coordination to be imposed through powerful abstractions that (i) persist through the whole engineering process - from design to execution time - and (ii) provide coordination services to applications by a shared infrastructure in the form of coordination media.
As a unifying framework for a number of existing works on the semantics of coordination media, in this paper we present a basic ontology and a formal framework endorsing the viewpoint of coordination as a service. By this framework, coordination media are characterised in terms of their interactive behaviour, and are seen as primary abstractions amenable of formal investigation, promoting their exploitation at every step of the engineering process
We study a simple software architecture, in which components are coordinated by writing into and reading from a global set. This simple architecture is inspired by the industrial software architecture Splice. We present two results. First, a distributed implementation of the architecture is given and proved correct formally. In the implementation, local sets are maintained and data items are exchanged between these local sets. Next we show that the architecture is sufficiently expressive in principle. In particular, every global specification of a system's behaviour can be divided into components, which coordinate by read and write primitives on a global set only. We heavily rely on recent concepts and proof methods from process algebra.
In this paper we present a coordination model for component-based software systems based on the notion of mobile channels, define it in terms of a compositional trace-based semantics, and describe its implementation in the Java language. Channels allow anonymous, and point-to-point communication among components, while mobility allows dynamic reconfiguration of channel connections in a system. This model supports dynamic distributed systems where components can be mobile. It provides an efficient way of interaction among components. Furthermore, our model provides a clear separation between the computational part and the coordination part of a system, allowing the development and description of the coordination structure of a system to be done in a transparent and exogenous manner. Our description of the Java implementation of this coordination model demonstrates that it is self-contained enough for developing component-based systems in object-oriented languages. However, if desired, our model can be used as a basis to extend other models that focus on other aspects of components that are less concerned with composition and coordination issues.
This paper proposes the use of session types to extend with behavioural information the simple descriptions usually provided by software component interfaces. We show how session types allow not only high level specifications of complex interactions, but also the definition of powerful interoperability tests at the protocol level, namely compatibility and substitutability of components. We present a decidable proof system to verify these notions, which makes our approach of a pragmatic nature.