Hardware acceleration of constrained satisfaction and logic optimization problems




Скачать 457.9 Kb.
НазваниеHardware acceleration of constrained satisfaction and logic optimization problems
страница1/7
Дата07.10.2012
Размер457.9 Kb.
ТипДокументы
  1   2   3   4   5   6   7


HARDWARE ACCELERATION OF CONSTRAINED SATISFACTION AND LOGIC OPTIMIZATION PROBLEMS

USING VELOCE EMULATOR


Edited by Marek Perkowski


Chapter 1. Introduction, Marek Perkowski.

Chapter 2. Satisfiability,


CHAPTER 1. INTRODUCTION


Draft 1.23

Version February 11, 2011

By Marek Perkowski

    1. Our Goals.


It is well-known that there are many problems in robotics, computational intelligence, artificial intelligence, computer vision, scheduling of industrial operations, planning of actions (for instance actions of an assembly line), computer aided design, computational biology, genetic engineering and other areas that require extremely fast processing speeds for specialized problems, assuming however a limited cost of the computing system – so that a supercomputer is not used for computations. These are basically logic, not arithmetic problems. Many of these problems can be formulated in a maximally simplified form as logic puzzles that you can find in high school textbooks or popular magazines. The books proposes to teach many important concepts of hardware design using a sequence of relatively simple problems. This is not a systematic textbook on logic design similar to Wakerly or a texbook on logic synthesis similar to Hachtel or Kohavi. Our approach here is pragmatic – we are interested not in design algorithms for standard designs but in intuitive methods that you can apply to some special designs, methods that require some kind of ingenuity. This way, our book can be a supplement to standard textbooks, and is especially suited to teach high school students who look for advanced digital design projects.


Field Programmable Gate Arrays (FPGAs) are special electronic devices that their hardware can be programmed in arbitrary way by their user and thus they can replace standard computers in many applications. The user can program arbitrary logic gates, their structures and connections, including processors with high parallelism. You can find many FPGAs in your cars, refrigerators and daily life “embedded systems”. FPGAs are also used to build inexpensive supercomputers; specialized rather than general-purpose ones. Using FPGAs a student can design his or her “own computer” dedicated to some task or a general purpose processor with their own list of instructions. Several FPGA-based hardware accelerators have been already built or proposed for specialized tasks that require high computing speeds. They allow to accelerate algorithms by implementing them all or their crucial parts, in hardware. Such accelerators have for instance applications in robotics, multimedia and medical diagnostics. For example, an imaging system for medical application may use multiple ASICs and FPGAs.


This book is intended as a supplemental textbook for undergraduate students in Electrical Engineering and Computer Science. It shows how to design special classes of hardware accelerators, with special emphasis on intelligent robotics applications. We will formulate three classes of problems that can be hardware-accelerated, and in our opinion, should be accelerated. As the hardware/software system to help design these accelerators we select the VELOCE hardware emulator from Mentor Graphics. Our goal is thus to teach both computer architecture of special high performance systems and the methods to emulate and verify digital designs using modern software/hardware tools, like the VELOCE emulator from Mentor. Our exercises and projects serve both of these goals. Students are encouraged to solve them individually or in groups. In our graduate classes, students solve the problems together with the professor interactively in the class, and next simulate and emulate them individually. We discuss merits and demerits of various created by variants of architectures.


The book’s main original thesis is that problems that massively use logic operations need a different type of hardware than that used in contemporary standard computers, or in DSP computers, in which architectures the arithmetic operations such as multiplication are accelerated. Our interest is in problems with massive parallelism but simple processors – such processors may have no arithmetic unit and multipliers at all! Such models of computing have been proposed by several new technologies such as nano, quantum or DNA computing.


An example of a logic problem that needs hardware acceleration is the Satisfiability Problem (SAT). Students will learn in this book how popular logic puzzles or robot planners can be reduced to SAT.


SAT problem is formulated as follows:

1) Given is a Boolean equation (Boolean function F) with many input variables and one output variable,

2) Find if there exists an assignment of binary values to Boolean variables that function F is satisfied (has value 1) and if yes, find what is this assignment of Boolean values to variables.

In other words, find a binary vector of all variable values (called a minterm) that satisfies the logic equation, if such vector exist. In similar formulation we may look for an assignment of not all, but a satisfactory number of variables.


Many important and very practical problems can be reduced to the SAT problem as formulated above. For instance in robotics, these problems are usually solved with advanced software programs. Special computers for this task were however also built for CAD applications by NEC, IBM, and other companies, because speeding up SAT would enable efficient solving of important problems that are reducible to SAT. There are thousands of such problems. There are also several important problems that are not identical but similar to SAT. For instance, Tsutomu Sasao from Kyushu Institute of Technology in Japan built a logic machine to solve the logic Tautology problem. Tautology problem is to check for a given function F if F = 1 for all combinations of values of variables that are used in function F (we call them the input variables of F). Many machines to solve Boolean problems, such as solving Boolean Equations, were built in Russia and Germany in 19th and 20th Centuries [ref, Zakrievskij]. We believe that future computers Constraint Satisfaction Problems (CSP) or robotic problems for CAD problems, will use hardware accelerators of various kinds, including logic accelerators as those proposed in this book. These will especially include special hardware accelerators to solve the kinds of logic-intensive problems that we are interested in here. This book is a speculation about the nature and technology of general purpose logic accelerators for a selected set of important problems. These problems have been never addressed in a unified way so far and as far as we know, they were never a subject of a book.


Hardware emulators such as Veloce are used to emulate/simulate new digital designs, usually general purpose processors, DSP and special embedded systems. They serve thus to verify if the hardware is doing what it was expected by its designer to do. Sometimes the emulators can also help to evaluate testability or speed of the designed prototypes. In this book we want to achieve these goals, and even more. We want to check if a hardware emulator can be used as a hardware accelerator in order to obtain a significant speed-up over a standard software implemented on a standard computer (such as a laptop). If yes, then Veloce will be used in the next phase to run, via internet, the real-time control software for mobile robots equipped with stereo-vision. Veloce box will become thus a “poor man specialized supercomputer” for our robotics projects.


A complete cycle of designing a logic accelerator processor will include therefore the stages of designing, simulating, emulating and verifying for correctness and analyzing the potential speed-up over standard software for a PC. Even if we will find out that the hardware emulator Veloce is not giving a sufficient speedup, these experiments will help us, for sure, to evaluate what would be a possible speedup of a VLSI-based (or FPGA-based) system that have been first prototyped using the Veloce tool. For this, we should approximately know only the delays of logic gates in the selected by us implementation technology, such as one of Xilinx or Altera FPGA architectures. The emulated design will be thus redesigned with other tools, widely available free FPGA-based software tools such as logic synthesizers and physical design tools, but with a small risk that the design will have different speed or functionality, as the circuits found and emulated on VELOCE will be very accurate. Therefore, there is no danger that the material developed in this book will be of no practical use for its readers. The readers will learn about high-performance special processors, reconfigurable computing, use of CAD tools and last but not least, many constraint satisfaction problems (such as logic puzzles or known mathematical problems) and how to solve them.


The general class of CSP-accelerated algorithms should have a wide applicability to logic problems; for automated logic circuit synthesis and optimization, machine learning, robotics, vision, and directed knowledge discovery. It is so because, as it will be shown in this book, all these research areas are closely related. Although some problems and their reductions are known from classical algorithms, they may be new to our reader, so all background material will be explained to make the book self-contained. The book assumes only the knowledge of basic logic design and machines, on the level of PSU ECE 271 class (second class in logic design). This material can be found in Wakerly’s book, or equivalent.

The book is intended for undergraduate students and even smart high school teens. Therefore, dear reader of the book, please help us to improve it when something will be not clear. Grammar, interpunction, organization, examples should be improved. Please correct the files that you get from me and send them back to me.

1.2. Why Logic Accelerators are superior to classical computers when solving certain classes of problems.


This book is devoted to some aspects of designing one type of future computers – computers with accelerated logic operations. We can call them the “logic accelerators”, or “logic computers” as they have been historically called. One may ask “Why such computers are of interest and why are they are expected to be more powerful than the standard computers of year 2010? “


  1. First, such logic computer will operate on very long words and will be specialized to execute operations on Boolean, Multiple-Valued logic functions, rough sets, sets, partitions, morphological data and similar data types. A standard processor operates on numbers. Evolution of standard processor architectures and lists of instructions in last 20 years developed very few ideas how to speed-up logic operations. There was much more emphasis on arithmetics, communication and multimedia processing, as well as general DSP algorithms and especially graphic algorithms.




  1. Second, logic computers will have high level of parallelism because all the real estate of the available system’s hardware will be used to implement logic operations and not other operations. These other operations (for instance, floating-point arithmetics) are not important for logic algorithms (even very little of integer arithmetics is used in logic algorithms).




  1. Third, logic computers may use probabilistic algorithms. Probabilistic algorithms are known to be superior to standard (deterministic) algorithms when solving many discrete optimization problems. Only limited forms of hardware for probabilistic algorithms are built into standard processors hardware. There are pseudo-random number generators in some computers. In this book the probabilistic hardware will be built-in on a very low level of logic circuits and it will be parallelized and distributed.




  1. Fourth, these logic computers will be optimized for search algorithms, which are common to many Constraint Satisfaction Problems, such as SAT. Standard computers are not optimized for search, they are designed for general-purpose numerical algorithms. Systems developed by us can be thus called the “hardware search computers”.


1.3. Towards combinatorial problem solving.


1.3.1. NP problems and CSP problems.


It is popularly known, even among non-specialists, that modern computers and all integrated circuits are built using a computer equipped with Computer-Aided-Design software. Humans are just not able to deal with the enormous complexity of such designs without the use of computers in all stages of simulating, designing, optimizing, verifying, validating and testing modern digital systems. This is why CAD tools are developed for all kind of digital and analog design and will remain to be perfected and made more powerful and general.


It is however less well known that several basic problems in Computer Aided Design of standard logic circuits, parts of standard software packages, are “NP-hard”. NP stands for non-polynomially, relating to the complexity that is non-polynomial with respect to the size of the problem. If for instance the size of the problem is n and the complexity of the algorithm is 2n, the problem is non-polynomial. The problem of complexity 4n3+3n2+8n+99 would be polynomial. Many familiar problems known to be hard to solve (like the travelling salesman problem) are NP problems. The word NP-hard means that these problems are optimization problems that are counterparts of the “NP-complete” decision problems. NP-complete problems are decision problems that allow verifying that S is a solution to problem P in polynomial time, but these problems need an exponential time to find the solution. The solution to an NP-complete problem is of Yes/No type. An example of such problem is the Satisfiability Problem in which we have a Boolean formula and we have to answer a question: “does there exist an assignment of values to Boolean variables from the formula such that this formula is satisfied?” Many CAD and robotics problems can be thus reduced to SAT (satisfiability) and few other similar basic combinatorial search problems. These are called the Constraint Satisfaction Problems or CSP problems for short. CSP problems are those that we have a set variables and a set of constraints on values that can be assigned to these variables. We look for a mapping of these variables to their binary values such that all constraints are satisfied. For instance, a problem of coloring a graph with the given specific number of colors K, such that every two adjacent nodes (nodes connected by an edge) have different colors -- is a CSP problem. It is an NP-complete problem. The minimum number of such colors for a given graph G is called the chromatic number of graph G. If one is asked to find the coloring with the number of colors equal to an unknown chromatic number, then it will be the NP-hard optimization problem. If the numerical value of the chromatic number is known, the coloring problem will be NP-complete. In this book we will deal with both types of problems; NP-complete and NP-hard.


1.3.2. Building a circuit for a very simple oracle.


Building SAT oracle is just to realize a circuit for the Boolean Function F from the SAT problem. This function in general can have arbitrary operators (logic gates) and arbitrary form (structure, number of logic levels). A commonly used form is the “Product of Sums of literals of variables form”. (A literal is a variable or a negated variable). An example of a SAT problem with Product of Sums (POS) formula of F is the following:


F = (a’ + b’ + c) (a’ + b’ + c’) (a + b’) (b’ + c). Is there a set of literals that satisfies F?


Oracle for this formula would have one AND gate with 4 inputs and four OR gates, each of OR gates would realize one of OR terms such as (a’ + b’ + c), (a’ + b’ + c’), (a + b’) and (b’ + c). Outputs of OR gates would be connected to inputs of the AND gate and the output of the AND gate would be F. We would generate all inputs to OR gates to find a solution. We leave to our reader the task of drawing the circuit and simulating it with various input combinations of variables a, b and c. The book will illustrate many important and practical problems, from robotics, CAD and other areas that can be reduced to SAT.


Now we will present one more example of building a simple oracle for a problem other than SAT, just to illustrate where we are heading for in our book.


The problem is this. We want to color nodes of the graph from Figure 1.1 below with as few colors as possible, so that any two nodes linked by an edge have different colors. Assuming that we have no any knowledge of the graph that is being colored other than that this graph has five nodes, we have to assume pessimistically that in the worst case the graph needs as many colors as there are nodes, which means five. Five numbers need at least 3 bits to encode them, it would be too bad to have this kind of a problem for a graph with 10,000,000 nodes which would be colorable with 2 colors, but let us make important point again that we have absolutely no information about the data (the graph properties that may help to evaluate its chromatic number) in this variant. We have only pure graph data, nothing else. However, if we would know some additional information, for instance that the graph is planar, we could solve this problem much more efficiently. For example, one can use the famous “Four Color Theorem” to know that only four colors are sufficient to color a planar graph, and thus encode the colors with only two bits. Thus, “knowledge is power” – any piece of general knowledge can help to solve the problem by designing a smarter search algorithm for it.




Figure 1.1: Graph for coloring with five nodes. It is colored with red, blue and yellow colors in such a way that every two neighbor nodes have different colors. The chromatic number of this graph is 3.

Assuming no knowledge of the chromatic number of the graph the encoding requires three bits for each color and is shown as in Table from Figure 1.2 below. One particular example of encoding another simpler graph is shown in Figure 1.5.


Color

Bit

red

a1, a2, a3

blue

b1, b2, b3

blue

c1, c2, c3

yellow

d1, d2, d3

red

e1, e2, e3

Figure 1.2: Assignment of bits to encoded colors of nodes for the graph from Figure 1.1.


An inequality comparator circuit is used to compare two nodes of the graph, as shown in Figure 1.3 for nodes a and b. Such comparator is connected to encoding bits of any two nodes that are linked by an edge in the graph. If the colors of nodes a and b are the same then the output of the comparator will be zero. If the codes are different (which is good) then the output will be 1. Therefore, if oracle has such a comparator for every two nodes of the graph linked by an edge and if a global AND gate of outputs of comparators is created, the output of this AND gate will be one for a good coloring and will be a zero for a bad coloring. Even if only for one pair of neighbor nodes of the graph the proper coloring is violated, the AND gate will produce a 0. See Figure 1.6 for the standard (combinational) oracle.



Figure 1.3: The inequality comparator used in Map Coloring and Graph Coloring problems. Here it compares node a with node b. Observe that the size of this comparator depends significantly on the possible maximum number of colors. The comparator produces “1” at its output if the arguments a and b are different binary vectors of width 3. The binary signal (a ≠ b) is also called a predicate.


(a)

Figure 1.4: (a) The inequality comparator from Figure 3.3.3 applied assuming five or more ( ≤ 8) colors in the graph. This is a schematic for the inequality operator circuit, which is used in many oracles.


The classical schematics of the comparator using EXOR, NOT and AND gates is shown in Figure 1.4a.


red

000

n1

blue

001

n2

yellow

010

n3


Figure 1.5: Encoding of colors for the graph coloring oracle of another graph having 3 nodes.




Figure 1.6: Principle of graph coloring applied to a simple graph from Figure 1.5. This is a classical oracle. In this and previous graph coloring problem we are not checking for a minimal solution. We look here only for a coloring that satisfies the constraint of correct coloring. Thus every proper coloring that uses any 3 of 5 colors is good (this example is trivial, but we wanted to have a simple circuit for the example).


At this point our only goal was to explain the concept of an oracle for a constraints satisfaction problem different than SAT. Remember that many oracles can use various constraints similar to the constraint (A  B) used here. In this example the oracle is very simple and can be designed by hand. In general, the oracle in the processor is a very complex circuit with many logic blocks; its design will require automation and the logic synthesis CAD should be used to build more complex oracles. Think about a graph coloring problem with 10,000 nodes. Think about some image processing system that uses many types of constraint operators other than the inequality constraint (A  B) used above. Many such problems exist in real life. The need to design CAD tools for hardware design of large and complex oracles is a one more reason for us to use the VELOCE emulator.

  1   2   3   4   5   6   7

Похожие:

Hardware acceleration of constrained satisfaction and logic optimization problems iconOn a Feasible-Infeasible Two-Population (fi-2Pop) Genetic Algorithm for Constrained Optimization: Distance Tracing and No Free Lunch

Hardware acceleration of constrained satisfaction and logic optimization problems iconTitle: Solving global unconstrained optimization problems by symmetry-breaking

Hardware acceleration of constrained satisfaction and logic optimization problems iconA new Approach to Solving Problems of Multi-state system reliability optimization

Hardware acceleration of constrained satisfaction and logic optimization problems iconДосистемным в том смысле, что для них не рассматривались вопросы создания целых систем на одном кристалле
По архитектуре эти плис делятся на подклассы программируемых логических матриц плм (pla, Programmable Logic Arrays) и программируемой...
Hardware acceleration of constrained satisfaction and logic optimization problems iconA multi-Subpopulation Particle Swarm Optimization: a hybrid Intelligent Computing for Function Optimization

Hardware acceleration of constrained satisfaction and logic optimization problems iconПри первой загрузке попадаешь сначала в меню настроек, где требуется выставить драйверы устройства (или устройств) вывода. Некоторые из устройств поддерживаются
Очень удобно, что для инициализации каждого нового драйвера не требуется вручную перезапускать Logic, достаточно в появившемся диалоговом...
Hardware acceleration of constrained satisfaction and logic optimization problems iconBoundAry value problems and characteristic value problems

Hardware acceleration of constrained satisfaction and logic optimization problems iconView coherence acceleration

Hardware acceleration of constrained satisfaction and logic optimization problems iconContemporary Crustal Deformation in East Asia Constrained by Global Positioning System Measurements

Hardware acceleration of constrained satisfaction and logic optimization problems iconSimilarity for ultra-relativistic laser plasmas and the optimal acceleration regime

Разместите кнопку на своём сайте:
Библиотека


База данных защищена авторским правом ©lib.znate.ru 2014
обратиться к администрации
Библиотека
Главная страница