# Implementation of 16 bit RISC Processor by FSM

Manoj Barfa<sup>1</sup>, Deepak Sharma<sup>2</sup>

*<sup>1</sup>M.Tech. Scholar, Lord Krishna College of Technology, Indore, M.P.; <sup>2</sup>Asst. Prof., Lord Krishna College of Technology, Indore, M.P.*

*Abstract*- **Processors are the central units of all smart devices, electronic or otherwise. The smartness of such devices comes as a direct result of the decisions and controls that the processor makes. Existing commercial microprocessors are provided as black-box units: users can neither monitor the internal signals and operation process nor modify the original structure. To solve this problem, a 16-bit fully functional single-cycle processor is designed in terms of its architecture and functional capabilities. This paper introduces the design and verification procedure for the 16-bit processor. The key architectural elements are described, together with the hardware block diagram and internal structure, and a summary of the instruction set is presented. The processor is modeled in Verilog Hardware Description Language (Verilog HDL), which gives access to every internal signal. The design of the arithmetic logic unit (ALU) is optimized to consume fewer resources. The RTL views, the verified simulation results and the synthesis report of the design are presented. The design is written in Verilog HDL and is synthesized and simulated with the Xilinx ISE tool.**

*Index Terms*- **Arithmetic logic unit (ALU); control unit (CU); comparator; shifter; rotation; instruction set; Verilog HDL; Xilinx.**

#### I. INTRODUCTION

Processors fall into three categories, 8-bit, 16-bit and 32-bit, depending on the required performance, cost, power and programmability. 8-bit processors have extremely low cost and consume little power, which suits simple control systems. In contrast, 32-bit processors offer high programmability and high performance and are widely used in cellular phones and PDAs that need heavy computation, but they have high power consumption. 16-bit processors, on the other hand, offer higher performance than 8-bit processors and lower power consumption than 32-bit processors. They are often used in applications such as disk drive controllers, cellular communication and airbags.

The 16-bit fully functional single-cycle processor is applicable to real tasks and can also be used for assembly language programming. We need to participate in the process of processor design and to understand the inner structure of the processor. Its architecture is therefore well structured and simple enough to be built by first-year students without any design experience. All of these requirements can be met by an FPGA-based processor described in Verilog HDL. Figure 1 shows the basic steps to design the processor.

The remainder of the paper is organized as follows. Section II describes the internal architecture of the 16-bit fully functional single-cycle processor, the system operation and the interaction between different units, the ALU and the instructions. Subsection D of that section presents the functional simulation results and subsection E describes logic synthesis.



Fig.1. Design Flow Steps Methodology

### II. GENERAL ARCHITECTURE OF TEACHING PROCESSOR

A processor incorporates most or all of the functions of a computer's Central Processing Unit (CPU) on a single IC or microchip. To accomplish this, several innovative and unconventional design tradeoffs have been made without compromising the goals. The general architecture of the 16-bit teaching processor is shown in Figure 2. It contains a number of basic blocks: a register array (8-bit and 16-bit), a 16-bit ALU, a 16-bit shifter, a program counter, an instruction register, a 16-bit comparator, an address register and a control unit. All of these units communicate through a common 16-bit tri-state data bus.

#### A. System operations and interaction between different units

The processor fetches instructions from external memory and executes them to run a program. The instructions are stored in the instruction register and decoded by the control unit, which drives the appropriate signal interactions so that the processor units execute the instruction. If the instruction is an add of two registers, the control unit causes the first register value to be written to the operational register (OpReg) for temporary storage. The second register value is then placed on the data bus. The ALU is set to add mode and the result is stored in the output register (OutReg), which holds the resulting value until it is copied to the final destination.

Executing an instruction takes a number of steps. The program counter holds the memory address of the current instruction. After an instruction has finished execution, the program counter is advanced to where the next instruction is located. If the processor is executing a linear stream of instructions, this is the next sequential instruction. If a branch is taken, the program counter is loaded with the next instruction location directly. The processor loads the address register, which outputs the new address on the address bus. At the same time, the control unit sets the R/W (read/write) signal to "0" for a read operation and sets the VMA (Valid Memory Address) signal to "1", signaling to the memory that the address is now valid. The memory decodes the address and places the memory data on the data bus. When the data has been placed on the data bus, the memory sets the READY signal to "1", indicating that the memory data is ready for consumption.

The control unit causes the memory data to be written into the instruction register. The control unit then accesses and decodes the instruction. The decoded instruction executes, and the process starts over again.
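The fetch, decode and execute sequence described above can be sketched as a small finite state machine in software. The following Python model is only an illustration under stated assumptions, not the Verilog HDL design itself: the state names, the tuple-encoded instructions standing in for memory words, the eight-entry register file, the single extra wait cycle before READY, and the choice to make JMP relative to the address of the jump itself are all hypothetical.

```python
from enum import Enum, auto

MASK16 = 0xFFFF  # all registers and buses are 16 bits wide


class State(Enum):
    FETCH = auto()    # load address register, VMA = 1, R/W = 0
    WAIT = auto()     # hold the request until memory raises READY
    DECODE = auto()   # data bus -> instruction register
    EXECUTE = auto()  # carry out the instruction and update the PC


def run(program, extra_wait=1, max_cycles=1000):
    """Run `program`, a list of tuples standing in for memory words.

    Returns (registers, trace), where trace lists the state of every cycle.
    """
    regs = [0] * 8                      # hypothetical register file size
    pc = addr_reg = data_bus = wait_left = 0
    vma = rw = 0                        # modeled for illustration only
    instr_reg = None
    state = State.FETCH
    trace = []
    for _ in range(max_cycles):
        trace.append(state)
        if state is State.FETCH:
            addr_reg, vma, rw = pc, 1, 0      # valid address, read access
            wait_left = extra_wait
            state = State.WAIT
        elif state is State.WAIT:
            ready = 1 if wait_left == 0 else 0  # READY from the memory
            if ready:
                data_bus = program[addr_reg]    # memory drives the data bus
                state = State.DECODE
            else:
                wait_left -= 1
        elif state is State.DECODE:
            instr_reg = data_bus                # latch the instruction
            vma = 0
            state = State.EXECUTE
        else:  # State.EXECUTE
            op = instr_reg[0]
            if op == "STOP":
                return regs, trace
            if op == "LDI":
                _, rd, k = instr_reg
                regs[rd] = k & MASK16
                pc = (pc + 1) & MASK16
            elif op == "ADD":
                _, rd, rs = instr_reg
                op_reg = regs[rd]               # first operand parked in OpReg
                data_bus = regs[rs]             # second operand on the data bus
                out_reg = (op_reg + data_bus) & MASK16  # ALU in add mode
                regs[rd] = out_reg              # OutReg copied to destination
                pc = (pc + 1) & MASK16
            elif op == "JMP":
                pc = (pc + instr_reg[1]) & MASK16  # branch loads the PC directly
            else:
                raise ValueError(f"unknown instruction {op!r}")
            state = State.FETCH
    raise RuntimeError("program did not reach STOP")
```

With these assumptions every instruction passes through FETCH, WAIT, WAIT, DECODE and EXECUTE, so the state trace makes the handshake with memory visible cycle by cycle.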



Fig.2. Internal architecture of 16-bit processor

#### B. Arithmetic logical unit (ALU)

The arithmetic operations comprise addition, addition with carry, subtraction and subtraction with borrow. Most of the time, a behavioral method of hardware description is employed, using the following expressions:

$$A+B,\quad A+B+C,\quad A-B,\quad A-B-C$$

The behavioral capabilities of an HDL can be more powerful and more convenient for some designs. In this case, however, a behavioral description is likely to imply the use of more adder units to realize these functions.

The block diagram of the ALU is shown in Figure 3. It uses only one adder unit, which cooperates with a multiplexer to realize the different calculations. Table 1 shows the resources consumed by the two methods.



Fig.3. ALU Block Diagram



During a component design experiment, the width parameter is set to 4 bits to facilitate manual operation; for the principal machine design, only this parameter needs to be changed to 16 bits to obtain a 16-bit ALU.
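The idea of sharing one adder between the four arithmetic expressions, and of scaling the unit with a single width parameter, can be illustrated with a short Python model. This is a minimal sketch under assumptions rather than the ALU of Figure 3: the operation names `ADD`, `ADC`, `SUB` and `SBB`, the function names and the convention that a carry-out of 0 signals a borrow are all hypothetical. Subtraction is mapped onto the adder through the two's-complement identity $A - B = A + \bar{B} + 1$.

```python
def adder(a, b, carry_in, width=16):
    """The single shared adder: returns (sum, carry_out) for width-bit inputs."""
    mask = (1 << width) - 1
    total = (a & mask) + (b & mask) + (carry_in & 1)
    return total & mask, (total >> width) & 1


def alu_arith(op, a, b, c=0, width=16):
    """Realize A+B, A+B+C, A-B and A-B-C with one adder plus multiplexers.

    op is "ADD", "ADC", "SUB" or "SBB"; c is the incoming carry/borrow flag.
    Returns (result, carry_out); for SUB and SBB carry_out == 0 means borrow.
    """
    mask = (1 << width) - 1
    subtract = op in ("SUB", "SBB")
    # Operand multiplexer: pass B for addition, its complement for subtraction.
    b_mux = (~b & mask) if subtract else (b & mask)
    # Carry-in multiplexer.
    if op == "ADD":
        carry_in = 0
    elif op == "ADC":
        carry_in = c & 1
    elif op == "SUB":
        carry_in = 1              # A + ~B + 1 = A - B
    elif op == "SBB":
        carry_in = 1 - (c & 1)    # A + ~B + (1 - C) = A - B - C
    else:
        raise ValueError(f"unknown operation {op!r}")
    return adder(a, b_mux, carry_in, width)
```

Calling `alu_arith("ADD", 9, 8, width=4)` exercises the 4-bit experimental configuration, while the default `width=16` corresponds to the processor configuration; only the parameter changes.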

The advanced ALU supports basic arithmetic and logical operations including addition (+), subtraction (-), multiplication (\*) and negation, in addition to other operations such as bit shifts (<<, >>), bitwise logic operations (&, ^, |) and logical operations (&&, !, ||).

#### C. Instructions

Instructions can be divided into five major categories:

- CPU Control instructions like NOP, STOP or SET do not generate a numeric result but alter the processor's state. The SET and CLR instructions allow any status or control flag to be set or cleared.
- Data Transfer instructions like MOV, LOAD or PUSH copy the content of an internal register to another register or a memory location, or load data from these sources into the register file.
- Branch and Subroutine instructions like JMP, CALL or RET alter the value of the program counter and access the call stack.
- Arithmetic and Logic instructions like ADD, NEG or XOR generate a numeric result as a function of two source operands, in unsigned integer form depending on the state of the SF flag.
- Multiplication and Division instructions generate two results during execution. Multiplication outputs a 32-bit result and division outputs a quotient and a remainder.
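The two-result behaviour of the multiplication and division category can be made concrete with a small Python sketch. It assumes unsigned operands and assumes that the 32-bit product is returned as a high and a low 16-bit word; the function names and this register split are hypothetical, since the text does not specify how the results are stored.

```python
MASK16 = 0xFFFF


def mul16(a, b):
    """Unsigned 16x16 multiply: returns (high_word, low_word) of the 32-bit product."""
    product = (a & MASK16) * (b & MASK16)
    return (product >> 16) & MASK16, product & MASK16


def div16(a, b):
    """Unsigned 16-bit divide: returns (quotient, remainder)."""
    if b & MASK16 == 0:
        raise ZeroDivisionError("divisor is zero")
    return divmod(a & MASK16, b & MASK16)
```

For example, `mul16(300, 400)` yields a product that no longer fits in 16 bits, which is why two destination words are needed.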

Table 2. INSTRUCTION SET

| Mnemonic | Description | Syntax | Operation |
|----------|-------------|--------|-----------|
| SUBI | Immediate subtraction | SUBI Rd, k | $Rd \leftarrow Rd - k$ |
| AND | Logic AND | AND Rd, Rs | $Rd \leftarrow Rd \cdot Rs$ |
| OR | Logic OR | OR Rd, Rs | $Rd \leftarrow Rd$ or $Rs$ |
| NOT | Logic NOT | NOT Rd | $Rd \leftarrow$ NOT $(Rd)$ |
| SHL | Shift register left | SHL Rd | $Rd(n+1) \leftarrow Rd(n)$, $Rd(0) \leftarrow 0$ |
| SHR | Shift register right | SHR Rd | $Rd(n) \leftarrow Rd(n+1)$, $Rd(7) \leftarrow 0$ |
| JMP | Immediate jump | JMP k | $PC \leftarrow PC + k$ |
| JMR | Jump to register | JMR Rd | $PC \leftarrow Rd$ |
| BRC | Branch if carry | BRC k | if $(C = 1)$ then $PC \leftarrow PC + k$ |
| BRZ | Branch if zero | BRZ k | if $(Z = 1)$ then $PC \leftarrow PC + k$ |
| BRH | Branch if half-carry | BRH k | if $(H = 1)$ then $PC \leftarrow PC + k$ |
| LDI | Load immediate | LDI Rd, k | $Rd \leftarrow k$ |
| LDD | Direct load from memory | LDD Rd, [A] | $Rd \leftarrow [A]$ |
| LDX | Indirect load from memory | LDX Rd, [Rs] | $Rd \leftarrow [Rs]$ |
| STD | Direct storage to memory | STD [A], Rd | $[A] \leftarrow Rs$ |
| STX | Indirect storage to memory | STX [Rd], Rs | $[Rd] \leftarrow Rs$ |
| LDP | Load from program counter | LDP Rd | $Rd \leftarrow PC$ |

Rd: Destination register, Rs: Source register, k: Constant, PC: Program counter, A: Address

| Type | Field 1 | Field 2 | Field 3 | Field 4 |
|------|---------|---------|---------|---------|
| J | OPCODE | Not used | K | |
| I | OPCODE | Rd | K/A | |
| R | OPCODE | Rd | Rs | Not used |

Fig.4. Instruction format per instruction type

#### D. Simulation Results

Functional simulation is the way to verify a design. We verified the components of the processor by functional simulation and obtained simulated data that confirm the workability of our design. Functional simulation was also verified for the whole processor.
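A component-level functional check of the kind described above can be mirrored in software by comparing a reference model against hand-computed test vectors. The Python sketch below models a few of the operations listed in Table 2 on a 16-bit register. It is an illustration only: the function names and the vectors are hypothetical, and it assumes that SHR clears the most significant bit of a 16-bit register, whereas the table names bit 7.

```python
MASK16 = 0xFFFF


def execute(op, rd, operand=0):
    """Reference model: value of Rd after one instruction (operand is k or Rs)."""
    rd &= MASK16
    operand &= MASK16
    if op == "SUBI":
        return (rd - operand) & MASK16
    if op == "AND":
        return rd & operand
    if op == "OR":
        return rd | operand
    if op == "NOT":
        return ~rd & MASK16
    if op == "SHL":
        return (rd << 1) & MASK16   # bit 0 is filled with 0
    if op == "SHR":
        return rd >> 1              # top bit is filled with 0 (assumption)
    raise ValueError(f"unknown instruction {op!r}")


# (operation, Rd before, operand, expected Rd after); values chosen by hand.
VECTORS = [
    ("SUBI", 0x0005, 0x0007, 0xFFFE),
    ("AND", 0xF0F0, 0x0FF0, 0x00F0),
    ("OR", 0xF000, 0x000F, 0xF00F),
    ("NOT", 0x00FF, 0x0000, 0xFF00),
    ("SHL", 0x8001, 0x0000, 0x0002),
    ("SHR", 0x8001, 0x0000, 0x4000),
]


def failing_vectors(vectors):
    """Return the vectors whose modeled result differs from the expected value."""
    return [v for v in vectors if execute(v[0], v[1], v[2]) != v[3]]
```

An empty list from `failing_vectors(VECTORS)` plays the role of a passing simulation run; the same vectors could in principle be applied to the HDL component in a testbench.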


Fig.5. Simulation Result of ALU







Fig.7. Simulation Result of Comparator


Fig.8. Simulation Result of Biregister



Fig.9. Simulation Result of Triregister


Fig.10. Simulation Result of RAM



Fig.11. Simulation Result of Control Unit



Fig.12. Simulation Result of CPU

#### E. Logic Synthesis

The RTL description of a design is taken through logic synthesis in an EDA tool, which automatically generates a gate-level description (netlist). It converts the Verilog HDL code into a gate-level architecture.





Fig.14. RTL View of Shifter



Fig.15. RTL View of Comparator



Fig.16. RTL View of RAM



Fig.17. RTL View of Control Unit

Top-level ports shown in the RTL view: addr(15:0), vma, wrb, data(15:0), clk, ready, reset.
Fig.18. RTL View of CPU

Pushing into the top-level design reveals its internal structure, shown in Fig. 19.



Fig.19. Internal Structure of the CPU

#### III. CONCLUSION

The 16-bit fully functional single-cycle processor was described using Verilog HDL. The design of the ALU was optimized so that it consumes fewer resources. Compared with existing commercial microprocessors, it has the advantage of being an open core, which allows an in-depth understanding of the microprocessor's interior structure. Functional simulation shows that the processor executes all of the various instructions. We verified all the results and found them to be correct.

### IV. APPLICATION DOMAIN

The program used to evaluate the performance of the CPU must make use of all types of instructions to manipulate the stack, execute subroutines and access RAM. To meet these requirements, a program that calculates the elements of the Fibonacci series can be used.
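As a point of reference for such a test program, the expected memory contents can be computed with a few lines of Python. The sketch below is an assumption-laden illustration, not the program run on the CPU: it starts the series at 0 and 1, wraps sums to 16 bits as a 16-bit register would, and uses a Python list as a stand-in for RAM and a helper function as a stand-in for a subroutine call.

```python
MASK16 = 0xFFFF


def add16(a, b):
    """Stand-in for an addition subroutine: 16-bit wrap-around sum."""
    return (a + b) & MASK16


def fib_to_ram(count):
    """Return a toy RAM image holding the first `count` Fibonacci elements."""
    ram = [0] * count
    prev, curr = 0, 1
    for addr in range(count):
        ram[addr] = prev                      # store the element to RAM
        prev, curr = curr, add16(prev, curr)  # advance the series
    return ram
```

Because the 26th element exceeds 65535, comparing the RAM image beyond that point also checks that the processor's addition wraps as expected.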

#### V. FUTURE WORK

To obtain a more sophisticated architecture, advanced techniques such as pipelining, an interrupt handler and input/output controllers can be added, yielding a competitive general-purpose 16-bit RISC processor.
