CSP Tutorial

This tutorial was put together both as an introduction to CSPs and as a resource that can be referred back to.

Sections:
1 What is a constraint satisfaction problem?
2 What are the practical applications of CSPs?
3 Definition & model of a CSP
4 Constraints, variables, and values
5 Search: basic
6 Search: propagating constraints
7 Search: local consistency
8 Search: other
9 Complexity
10 Symmetry
11 Optimization problems

Consulted sources

[1] K. R. Apt. Principles of Constraint Programming. Cambridge University Press, 2003.
[2] R. Dechter. Constraint Processing. Morgan Kaufmann Publishers, 2003.
[3] P. V. Hentenryck. Constraint Satisfaction in Logic Programming. The MIT Press, 1989.
[4] S. Rusell and P. Norvig. Artificial Intelligence: A Modern Approach, 2nd Edition. Prentice Hall, 2003. 137-59.
[5] B. M. Smith. “A Tutorial on Constraint Programming.” Research Report 95.14, School of Computer Studies, University of Leeds, April 1995.
[6] P. Cheeseman, et al. "Where the REALLY Hard Problems Are." Proceedings of IJCAI-91, 331-337, 1991.
[7] D. Mitchell, et al. "Hard and Easy Distributions of SAT Problems." Proceedings of AAAI-92, pages 459-465, 1992.
[8] B. Smith. “Phase Transition and the Mushy Region in Constraint Satisfaction Problems.” Proceedings of ECAI-94, pages 100-104, 1994.


What is a constraint satisfaction problem?

As can be inferred from the name, this set of problems deal with constraints. These constraints are no different from the ones that inhabit our real-world. There are constraints all around us, such as temporal constraints (managing work and home life), or tangible constraints (making sure we don't go over budget), and we figure out ways to deal with them to varying success. Where we don't enjoy success and run into problems, especially solutions that may work, but are not exactly what we wanted, are due in no small part to our limited capacity to deal with problems involving a large amount of variables and constraints. This is where computers, and more specifically, constraint satisfaction problems (CSPs), are necessary.

Like most problems in artificial intelligence (AI), CSPs are solved through search. What makes CSPs unique, however, is the structure of the problem; unlike other AI problems, there is a standard structure to CSPs that allows general search algorithms using heuristics (with knowledge about the structure of the problem and not necessarily domain-specific knowledge) to be implemented for any CSP. In addition to this, all CSPs are also commutative--they can be searched in any order and still give the same result. These special and defining characteristics makes CSPs both interesting and worthwhile to study.


What are the practical applications of CSPs?

The practical applications of CSPs are very straightforward. CSPs are very good for solving general temporal and combinatorial problems, among other things. The following are examples where constraint programming has been successfully applied in various other fields:

   - Operations Research (scheduling, timetabling)
   - Bioinformatics (DNA sequencing)
   - Electrical engineering (circuit layout-ing)
   - Telecommunications (CTVR @ 4C)
   - Hubbell telescope/Satellite scheduling

Generally speaking, CSPs are a rather recent formulation. There is not extensive published literature on the subject, but they are widely studied and their applications will continue to increase.


Definition of a CSP

Formal definition

The formal definition of a CSP involves variables and their domains, and constraints. Suppose we have a set of variables, X1, X2, ..., Xn, all with domains D1, D2, ..., Dn such that all variables Xi have a value in their respective domain Di. There is also a set of constraints, C1, C2, ..., Cm, such that a constraint Ci restricts (imposes a constraint on) the possible values in the domain of some subset of the variables. A solution to a CSP is an assignment of every variable some value in its domain such that every constraint is satisfied. Therefore, each assignment (a state change or step in a search) of a value to a variable must be consistent: it must not violate any of the constraints.

As in any AI search problem, there can be multiple solutions (or none). To address this, a CSP may have a preference of one solution over another using some preference constraints (as opposed to all absolute constraints), want all solutions, or the optimal solution, given by an objective function. Optimizing a CSP model will be explored later in the tutorial.

Finite vs real-valued domains

This explanation of constraint programming will only touch on problems that have finite domain variables. This means that the domains are a finite set of integers, as opposed to a real-valued domain that would include an infinite number of real-values between two bounds.

The modeling of a real problem

Consider the popular N-Queens problem used throughout AI. This problem involves placing n queens on an n x n chessboard such that no queen is attacking another queen. (According to the rules of chess, a queen is able to attack another piece--in this case, a queen--if it is in the same row, column, or diagonal from that queen.) There are, of course, many ways to formulate this problem as a CSP (think: variables, domains, and constraints). A simple model is to represent each queen as a row so that (for example) to solve the 4-queen problem, we have variables Q1, Q2, Q3, and Q4. Each of these variables has an integer domain, whose values correspond to the different columns, 1-4. An assignment consists of assigning a column to a queen, i.e. { Q1 = 2 }, which "places" a queen in row 1, column 2. The constraints on the problem restrict certain values for each variable so that all assignments are consistent. For example, after we have assigned Q1, and now want to assign Q2, we know we cannot use value 2, since this would violate a constraint: Q1 could attack Q2 and visa versa. Thus we come up with the following variables, values, and constraints to model this problem:

Variables: { Q1, Q2, Q3, Q4 }

Domain: { (1, 2, 3, 4), (1, 2, 3, 4), (1, 2, 3, 4), (1, 2, 3, 4) }

Constraints: Alldifferent( Q1, Q2, Q3, Q4 ) and
for i = 0...n and j = (i+1)...n, k = j-i, Q[i] != Q[j] + k and Q[i] != Q[j] - k.


(Some more on) constraints, variables, and values

Constraints

As mentioned, the structure of the CSP is the most important part of it since the same algorithms can be used to search any CSP. Since we know that the structure is standard across all CSPs, we can take a look at heuristics that are able to operate on all different types of problems. However, that is not to say that all algorithms are equally tractable and efficient on all sorts of problems. Currently, the decision of the use of an algorithm for a certain problem is determined empirically.

A constraint is considered n-ary if it involves n variables. So if a constraint affects just a single variable, it is considered unary. Unary constraints can be dealt with as a preprocessing step. Constraints that involve two variables are binary constraints and are of particular interest for two reasons. One is that they can be modeled as a constraint graph, where the nodes of the graph represent the variables and an edge connects two nodes if a constraint exists between the two variables. The second is that a constraint of higher arity (the number of variables involved in a constraint) can always be reduced to a set of binary constraints! (However, that doesn't mean that this is always a good idea--in some cases, the number of binary constraints for a problem can be exponential, thus creating an intractable model.) More complex constraints, with arity > 2, are called global constraints. A simple example of a global constraint is the Alldifferent constraint; this constraint forces all the variables it touches to have different values. (Note: It is easy to see how this particular global constraint could be decomposed into many "not equal" binary constraints.) There is much to be said about global constraints, however, they are beyond the scope of this tutorial.

Constraint programming has close ties with integer programming (IP): one way of representing constraints is to form equations using subsets of the variables. One example of a simple constraint modeled as an equation, where X1 and X2 are variables, is X1 < X2. A more complex constraint is one we would use in the SEND + MORE = MONEY problem. Without elaborating extensively about the problem, the basic idea is that we try to find decimals to represent each letter in the equation so that if all the letters { S, E, N, D, M, O, R, Y } are replaced by decimals (with S and M != 0), SEND + MORE = MONEY. In order to model this problem, we have to construct constraints, such as D + E = 10 * C1 + Y. This constraint makes sure that the sum of the values of D + E on top of the equation are equal to Y and the carry value of the addition. This is a real constraint that is widely used to solve this problem, not some diluted example; the majority of the time defining constraints for a problem really is this intuitive.

Variable ordering

Deciding on the variables to be included in your problem's model is usually not too difficult: there are the obvious variables that must be assigned values for a solution to exist (decision variables) and variables that help make the problem more efficient or contribute to some objective function. While tricks can be used (such as was earlier when the queens were represented as rows) to increase performance, they are just that.

A part of any search algorithm is choosing a variable that has not yet been instantiated and assigning it a value from its domain. There are both static and dynamic variable ordering heuristics available that decide how to choose this next variable. One such heuristic is MRV (minimum-remaining values), which comes from the fail-first principle. The MRV heuristic selects from the set of unassigned variables the next variable with the fewest remaining values in its domain. Essentially this allows us to discover a dead end sooner than we would have and as a result reduce the overall size of our search tree. This heuristic becomes much more useful when dealing with a problem with noticeable variances in the cardinality of domains, both during the preprocessing and (dynamically) as the search progresses. Different search techniques, explained later in this tutorial will delete values from the domains of variables, making MRV appealing.

Another heuristic for variable ordering, often used as a tie-breaker is the degree heuristic. This heuristic attempts to choose the unassigned variable that is involved in the most constraints with other unassigned variables. This reduces the number of children of each node in the search tree by decreasing the domain sizes of other variables.

Value ordering

After we have a variable, we must assign it a value. The way in which we choose values, or value ordering, is also important because we want to branch as often as possible towards a solution (though value ordering is a waste of time if we are looking for all solutions). The most popular heuristic for choosing a value is LCV, or least-constraining value. The idea is to choose the value that would eliminate the fewest values in the domains of other variables and thus hopefully steer the search away from a dead end. By doing so, it leaves the most choices open for subsequent assignments to unassigned variables.


Search: basic

Searching a CSP involves first, choosing a variable, and then assigning a value to it. In a search tree, each node is a variable and branches leading away from that node are the different values that can be assigned to that variable. Therefore, a CSP with n variables will generate a search tree of depth n. Successors are generated for only the current variable and the state of the problem is appended each time the search branches.

Backtracking

If we consider a simple depth-first search on a CSP, we realize that because of the constraints we have imposed, at some point during our search we may be unable to instantiate a variable because its domain is empty! In the case that we arrive at a node where the goal test returns false (there are still unassigned variables) and there are no branches leading away from that node, we must go backward. This is called backtracking and it is the most basic of searches for CSPs. A variable is assigned a value and then the consistency of that assignment is checked. If the assignment is not consistent with the state of the problem, another value is assigned. When a consistent value is found, another variable is chosen and this is repeated. If all values in the domain of a variable are inconsistent, the algorithm will backtrack to the previous assignment and assign it a new value.

The figure below illustrates how a simple backtracking search would work. It is the n-queens problem that was presented earlier in this tutorial:



Search: propagating constraints

When our backtracking search chooses a value for a variable, it checks to see whether that assignment is consistent with the constraints on our problem. Clearly, this is not very efficient. Consider a simple graph-colouring problem. If there is an edge between two nodes, then once we assign a colour to one of these nodes, we know choosing that same colour for the other will not be consistent; therefore, we must temporarily remove the values from the domains of yet uninstantiated variables that are not consistent with the current problem state with the new assignment.

Forward checking

The forward checking algorithm does just this. Every time an assignment is made to a variable, all other variables connected to the variable (that is currently being instantiated) by constraints are retrieved and the values in their domains which are inconsistent with the current assignment are temporarily removed. In this way the domain of a variable can become empty and another value must be chosen for the current assignment. If there are no values left with which to assign the current variable, the search may need to backtrack, in which case those values that were temporarily removed as a result of the original assignment are reinstated to their respective domains. Forward checking is able to predict, in a sense, what assignments will lead to a failure and can act accordingly by pruning branches. It will, of course, encounter these inconsistencies much sooner than backtracking, especially when used in conjunction with the fail-first heuristic described earlier.

The below diagram illustrates a search using the above forward checking algorithm on an n-queens problem:



Search: local consistency

Forward checking utilizes a basic notion of consistency: an assignment to a variable is consistent with other assignments given a set of constraints. k-consistency is a term that defines the extent to which constraints are propagated. By definition, a CSP is k-consistent if for any subset of k – 1 variables in the problem, a consistent assignment to one of those variables allows a consistent assignment to any kth variable. Below, popular consistencies are discussed.

In addition to a problem being k-consistent, it can also be strongly k-consistent, which means it is consistent for k and all weaker consistencies less than k. The benefit of a strongly k-consistent problem is that we will never have to backtrack! As with most things CSP, determining the correct level of consistency checking for a given problem is done empirically.

Node consistency

This is the weakest consistency check and simply assures that each variable is consistent with itself; if a variable is assigned a value, the value must be in that variable’s domain.

Arc consistency (AC)

2-consistency is the most popular consistency and can be used as both a preprocessing step, or dynamically as part of the maintaining arc consistency (MAC) algorithm. The simple definition of arc consistency is that given a constraint CXY between two variables X and Y, for any value of X, there is a consistent value that can be chosen for Y such that CXY is satisfied, and visa versa. Thus, unlike forward checking, arc consistency is directed and is checked in both directions for two connected variables. This makes it stronger than forward checking.

When applied as a preprocessing step, arc consistency removes all values from domains that are inconsistent with one another. If it is applied dynamically as MAC, the same algorithm that is used to check AC for preprocessing is applied after every variable instantiation during the search. That algorithm is presented here along with a demo:

Path consistency

The easiest way to think about path consistency is to consider a triangle, with three points labeled a, b, and c where edge( a, c ) is not a solid line[5]. This represents a problem where there are constraints between a and b, and b and c. Path consistency considers triples of variables, so that while a and c are not explicitly constrained, there is a constraint induced on them through the transitive nature of their constraints involving b. Thus our triangle is an easy representation of this relationship. If the constraints are that a > b and b > c, it is clear that there is an implicit constraint between a and c, such that a > c. 3-consistency, though obviously stronger than arc consistency, is not generally used. While arc consistency checks pairs of variables for consistency, path consistency must check all triples of variables—for a large problem, it is easy to see that the number of combinations is potentially huge. In worst-case time, the complexity is O(d3n3).


Search: other

Local search

Local search for AI problems involves making a complete assignment of variables and then switching (flipping) a value for a variable and checking to see if we have found a solution. This is no different in constraint programming. Assignments are made to all our variables, but these assignments will not be consistent with the constraints on the problem. In local search, each node in the search tree is a complete assignment, with branches involving flipping different variables within the complete assignment until a solution is found. Local search works very well for some types of CSPs. The most popular heuristic for local search in CP is min-conflicts. When min-conflicts flips one of the variables in the assignment, it will choose a value for that variable that results in the minimum number of conflicts with other assignments. This way we have some idea of progress in our local search. The min-conflicts algorithm is presented here:

min-conflicts(csp, max):
  initial := complete assignment
  for 1..max do:
    if initial is a solution:
      return initial
    var := nextVar()
    value := leastConflicts( var )
    var := value in initial
  return failure


Complexity

Phase transition

Studying the phase transition for a type of problem allows us to locate where the fundamentally hard CSP and SAT problems are. In the context of a CSP, constraint tightness refers to the number of instances where, given a constraint between two variables X and Y, the pair of values for X and Y are inconsistent. A phase transition is a phenomenon that is observed by considering the graph of search effort against constraint tightness for many instances of a problem as the constraint tightness of the problem increases from 0 to 1. As this happens, the CSP will move from the part of the problem space that is underconstrained and where there are many solutions to a space that is overconstrained and where problems are not satisfiable.

The transition between the soluble and insoluble regions is referred to as the “mushy region”—a term coined by B Smith—and is populated both with problems that have a solution and those that don't: it is in this region that the peak search effort is spent trying to find a solution that exists with low probability. Because of this, phase transitions are important in the study of NP-complete problems and can give some understanding as to whether a problem is likely to be easy or difficult.

Phase transitions are not algorithm-specific.

Symmetry

When modelling a CSP, it is common that one may encounter symmetry. Symmetry is defined as an assignment, which is equivalent to another assignment; that is to say, the assignments are interchangeable. If you consider these instances, it is easy to see that if one of these assignments is consistent, then they all are. Hence it is possible to have classes of equivalent solutions where different symmetrically equivalent assignments can be interchanged and one can be assured a solution can be found for this "different" problem.

It is important to take notice of symmetries because they can be used to shorten search time (by not searching symmetrically equivalent branches). "Breaking" symmetries, as this is called, can often be the difference in whether a problem is tractable or not. In order to exploit this feature, additional constraints must be added to our model. These constraints (which are model-specific) must make sure that if one assignment does not work, all symmetrically equivalent sets of assignments are automatically ruled out.



COP: Constraint Optimization Problems

Problems which must be optimized for a given constraint--representing an objective function--are solved in the same way as other CSPs. When a solution is encountered, it will have associated with it some "ranking" value for the objective function. The search continues to find all solutions and then chooses the solution with the most optimal value. That is to say, for every solution found if for some objective function the value is more optimal than the previous one found, that solution is saved until all solutions have been found. This idea is also referred to as preferred constraints.

Minimizing and maximizing

The most common objective functions are minimize and maximize. These functions try to minimize a given constraint (or variable) or maximize it. An example of a constraint is a linear equation between two variables, x and y. The constraint may be that x + y > 100, but we want to maximize this constraint for x or y such that we find the largest value(s), which satisfy the all the constraints of the problem.

Scheduling problems are obvious places where objective functions will pop-up. Let's say you work at a grocery store that is open from 7am to 2am. You wouldn't want to work one night from 6pm-2am and then the next morning from 7am-3pm, would you? Using an objective function to create a schedule where all shifts are covered by employees, but minimizing the number of consecutive night-morning shifts would certainly be very useful.

Branch and bound

Branch and bound is an optimizing method for solving CSPs that are too large (whether it be in the number of variables and constraints or the complexity of the constraints) to search all the solutions. The idea borrows from that of partitioning; the first solution to the problem is found (and along with it, some evaluation) and a constraint is added on the objective function to form a "subproblem" of the original so that the subproblem will be searched for the first solution found again and repeated until some optimal solution for a minimizing/maximizing function is found or there are no more solutions left. The creation of a new subproblem from the original is the branching part of the algorithm, while the bounding is the use of the evaluation for a solution to bound the new constraint. While this approach to large problems that need to be optimized is practical for real constraints like time and space, it is also more efficient. Once you have a bound, the search can stop before finding a solution if it knows that the evaluation will not be as low as one previously found.


Please take some time to experiment with some of the games provided on this site to get a sense of constraint programming!