Digital Circuit Experiments
Final Project - The Micro-Processor HCL 2516
Murphy Chen, Lan Chang, and Ing-Jei Huang
National Taiwan University
Department of Electrical Engineering
0 Abstract
-- Written by Murphy Chen <B82503131>
This report describes the 16-bit micro-processor HCL-2516 that we
built. We describe the design and function of its registers, flip-flops,
bus system, control unit, and arithmetic and logic unit. We also describe
the assembler, the download kit, and the applications we made.
The experience we gained and the lessons we learned are also presented.
1 System Overview
-- Written by Murphy Chen <B82503131>
What we have built is primarily a 16-bit micro-processor. The micro-processor
consists of 9 registers, 7 flip-flops, 25 instructions, a memory unit
with 4096 words of 16 bits each, an arithmetic and logic unit, and a
control unit. In addition, we have developed an assembler for this micro-processor
so that we can write assembly programs conveniently. Finally, we have
written several application programs to demonstrate the power of
this micro-processor.
To communicate with the real world, the micro-processor has two
registers, OUTR and INPR, both 16 bits wide. Together with the two
flip-flops FGI and FGO, they are used to transfer 16-bit data to
or from other devices.
To perform useful tasks, the micro-processor has the following instructions:
And, Add, Load Word, Store Word, Branch Unconditionally, Branch and
Save Return Address, Increment and Skip if Zero, Clear Register, Complement
Register, Shift Register, Increment Register, Skip Next Instruction
Conditionally, Halt Computer, Input Character, Output Character, Enable
Interrupt, Disable Interrupt.
Application programs are written in assembly language and translated
by the assembler, which outputs the resulting machine code for
programming into EPROMs. To make developing application programs
easier, we also tried to build a download kit that transfers the
machine code from a PC to SRAMs through a parallel printer port,
but we failed.
The block diagram of the system is depicted as follows:
Fig.1 The System Overview
2 Introduction and Explanation of Each Subsystem
The system is divided into nine subsystems: memory unit,
registers, flip-flops, bus & decoders, control unit, arithmetic and
logic unit, assembler, download kit, and applications. They are described
in the following sections.
2.1 Memory Unit
-- Written by Murphy Chen <B82503131>
The memory unit is used to provide a memory with 4096 words of 16
bits each. We can store data and instructions in it so as to execute
user programs.
There are many practical ways of arranging the memory unit, and the user
can choose whichever they like. For example, during system development
the user may wish to use four 2K*8bit SRAMs as the memory unit,
because it is convenient to download programs into SRAMs and test them
over and over again. After the programs have been fully tested,
the user may want to use two 2K*8bit EPROMs and two 2K*8bit SRAMs as the
memory unit, programming the EPROMs with read-only instructions and
static data, and using the SRAMs for storing and retrieving dynamic data.
In our applications, we arrange the memory unit in this way: address
0x000 to 0x7FF belongs to EPROMs, and address 0x800 to 0xFFF belongs
to SRAMs.
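The address split above can be sketched in a few lines. This is only an illustrative model of the memory map stated in the text; the function name decode_address and the constant BANK_SIZE are hypothetical and do not come from our design files.

```python
# Illustrative model of the memory map described in the report:
# 0x000-0x7FF belongs to the EPROMs, 0x800-0xFFF belongs to the SRAMs.
BANK_SIZE = 0x800  # 2K words per bank


def decode_address(addr):
    """Return (bank_name, offset) for a 12-bit word address."""
    if not 0 <= addr <= 0xFFF:
        raise ValueError("address must fit in 12 bits")
    a11 = (addr >> 11) & 1            # most significant address bit
    bank = "SRAM" if a11 else "EPROM"
    offset = addr & (BANK_SIZE - 1)   # low 11 bits go to the chips
    return bank, offset


print(decode_address(0x7FF))
print(decode_address(0x800))
```

In this model the most significant address bit alone decides which pair of chips is selected, which is why only one extra signal is needed to pick the bank.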
The SRAMs we use are 6116s. The 6116 is a 16,384-bit high-speed static
RAM organized as 2K*8bit. We pair two 6116s to provide an address
space of 2K words of 16 bits each. The EPROMs we use are 27C256s.
We chose this type because of its low price, but we soon found that it
provides another benefit as well. The 27C256 is a 256K-bit erasable
programmable ROM organized as 32K*8bit. We pair two 27C256s to provide
the address space of 2K words of 16 bits each that we need. In fact,
two 27C256s provide a 32K*16bit address space, so we can regard
them as eight memory blocks, with each block providing a 2K*16bit
address space. We can therefore write up to eight programs into the two
27C256s at one time. When we want to execute one of the programs, we
only need to set the corresponding most significant address bits of the
27C256s. This means that when a program has bugs, we do not need to
wait 30 minutes for the EPROMs to be erased; we simply write the new
program into the next memory block of the same EPROMs.
For the data sheet of SRAMs and EPROMs we used, please refer to the
appendix.
2.2 Registers
-- Written by Murphy Chen <B82503131>
There are 9 registers in our micro-processor. The registers are AR,
PC, DR, AC, IR, TR, OUTR, INPR, and SC.
AR stands for Address Register. It tells the memory unit
where to retrieve a word from or where to store a value.
PC stands for Program Counter. It always points to the next
instruction to be executed, and tells the control unit where to find
that instruction. DR stands for Data Register. It receives
data from memory, and it also provides operands for the adder and
logic circuit. AC stands for Accumulator. It holds the
results calculated by the adder and logic circuit or loaded from
memory. IR stands for Instruction Register. It holds the
current instruction fetched from memory and tells the control unit
which operations are to be performed. TR stands for
Temporary Register. It temporarily holds the value of PC during the
interrupt cycle. OUTR stands for OUTput Register. It sends
data to the peripheral devices. INPR stands for INPut
Register. It receives data from the peripheral devices. SC
stands for Sequential Counter. It provides timing signals for the
control unit.
The text design files corresponding to these registers are REG1.TDF
for PC, REG2.TDF for TR, REG4.TDF for SC,
REGAR.TDF for AR, REGDR.TDF for DR, REGIR.TDF
for IR, and REGOUTR.TDF for OUTR, all listed in the appendix.
INPR and AC are combined with the Adder
and Logic Unit.
There are so many text design files for the registers because
every register has its own particular function. Some can be
incremented, some can be cleared, some can be loaded, and so on.
The detailed description and the simulated results of each register
are shown as follows.
2.2.1 REG1 (Program Counter)
Input           | Output
CLR | LD | INR  | a[].d is connected to
H   | X  | X    | 0
L   | H  | X    | i[]
L   | L  | H    | a[].q+1
L   | L  | L    | a[].q
Function Table of REG1
The Simulated Result of REG1 from MAXPLUSII
REG1.TDF functions as a 12-bit program counter. It can be loaded with
a new 12-bit address value, cleared when the user presses the reset
button to restart the micro-processor, and incremented during the fetch
phase of every instruction and during some conditional branch
instructions.
The input and output widths differ because the input of the Program
Counter comes from the bus, which is 16 bits wide, while the Program
Counter itself is only 12 bits wide, since the address space is 4K
words. The four most significant bits coming from the bus are therefore
discarded by the Program Counter. Under normal conditions, when the
Program Counter is ready to receive data from the bus, the four most
significant bits of the data should always be zeros.
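The function table of REG1 can be read as a priority rule: CLR overrides LD, which overrides INR. A small behavioural sketch follows; it is not the AHDL in REG1.TDF, the function name reg1_next is hypothetical, and wrapping from 0xFFF back to 0 on increment is an assumption about how a 12-bit counter overflows.

```python
def reg1_next(q, clr, ld, inr, bus):
    """Next state of the 12-bit program counter, following the function table.

    q   : current 12-bit value (a[].q)
    bus : 16-bit value on the bus (i[])
    """
    if clr:
        return 0                  # CLR has the highest priority
    if ld:
        return bus & 0xFFF        # the upper four bus bits are discarded
    if inr:
        return (q + 1) & 0xFFF    # assumed to wrap around at 12 bits
    return q                      # hold the current value


print(hex(reg1_next(0x005, 0, 1, 0, 0xF234)))
```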
2.2.2 REG2 (Temporary Register)
Input           | Output
CLR | LD | INR  | a[].d is connected to
H   | X  | X    | 0
L   | H  | X    | i[]
L   | L  | H    | a[].q+1
L   | L  | L    | a[].q
Function Table of REG2
The Simulated Result of REG2 from MAXPLUSII
REG2 functions as a 16-bit temporary register. Its function
is similar to that of REG1, except that its input and output have the
same width: both are 16 bits wide. It can be loaded
with a new 16-bit value, cleared, and incremented.
2.2.3 REG4 (Sequential Counter)
Input      | Output
CLR | INR  | a[].d is connected to
H   | X    | 0
L   | H    | a[].q+1
L   | L    | a[].q
Function Table of REG4
The Simulated Result of REG4 from MAXPLUSII
REG4 functions as a 3-bit sequential counter. It is cleared
when the user restarts the micro-processor, at the end of
an instruction cycle, or after the HLT instruction is executed, and it
is incremented at each phase of an instruction cycle.
2.2.4 REGAR (Address Register)
Input           | Output
CLR | LD | INR  | a[].d is connected to
H   | X  | X    | 0
L   | H  | X    | i[]
L   | L  | H    | a[].q+1
L   | L  | L    | a[].q
Function Table of REGAR
The Simulated Result of REGAR from MAXPLUSII
REGAR functions as a 12-bit address register. Its function
is similar to that of REG1, except that it has one more output, a11_not,
which gives the inverse of the most significant bit of a[] to the memory
unit. This makes it convenient to combine four 2K*8bit SRAMs or EPROMs
into the memory unit. REGAR can be loaded with a new address value from
the 16-bit bus, cleared, and incremented. The reason for the mismatch
in width between the input and the output is the same as for REG1.
2.2.5 REGDR (Data Register)
Input           | Output
CLR | LD | INR  | a[].d is connected to
H   | X  | X    | 0
L   | H  | X    | i[]
L   | L  | H    | a[].q+1
L   | L  | L    | a[].q
Function Table of REGDR
The Simulated Result of REGDR from MAXPLUSII
REGDR functions as a 16-bit data register. Its function is
similar to that of REG2, except that it has one more output, DR_ZERO,
which tells the control unit whether DR equals zero, in order to
simplify the implementation of the ISZ instruction. It can be
loaded with a new 16-bit value, cleared, and incremented.
2.2.6 REGIR (Instruction Register)
Input | Output
LD    | a[].d is connected to
H     | i[]
L     | a[].q
Function Table of REGIR
The Simulated Result of REGIR from MAXPLUSII
REGIR functions as a 16-bit instruction register. It is
loaded with a 16-bit instruction during the fetch phase of an instruction
cycle. Its outputs are IR_15, OP[2..0], and b[11..0].
IR_15 is the most significant bit of the instruction register.
For a memory-reference instruction it indicates whether the reference
is direct or indirect; otherwise it distinguishes a register-reference
instruction from an input-output instruction.
OP[2..0] comes from bits 12 to 14
of the instruction register and indicates which instruction
it is. If OP[2..0] is equal to b"111" and IR_15
is equal to zero, the current instruction
is a register-reference instruction. If OP[2..0] is equal to
b"111" and IR_15 is equal to one, the
current instruction is an input-output instruction. b[11..0]
comes from bits 0 to 11 of the instruction register. It contains
an address value for a memory-reference instruction, and it
indicates which instruction it is for a register-reference
or an input-output instruction. For more information about the instructions
of the micro-processor, please refer to the instruction table in the appendix.
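The field layout described above can be illustrated with a short decoder. This is a sketch of the classification rule stated in the text only; the function name decode_ir and the category strings are hypothetical.

```python
def decode_ir(ir):
    """Split a 16-bit instruction word into IR_15, OP[2..0] and b[11..0],
    and classify it according to the rule described in the report."""
    ir_15 = (ir >> 15) & 1       # bit 15
    op = (ir >> 12) & 0b111      # bits 14..12
    b = ir & 0xFFF               # bits 11..0
    if op != 0b111:
        kind = ("memory-reference (indirect)" if ir_15
                else "memory-reference (direct)")
    elif ir_15 == 0:
        kind = "register-reference"
    else:
        kind = "input-output"
    return ir_15, op, b, kind


print(decode_ir(0x1234))
print(decode_ir(0x9234))
```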
2.2.7 REGOUTR (Output Register)
Input | Output
LD    | a[].d is connected to
H     | i[]
L     | a[].q
Function Table of REGOUTR
The Simulated Result of REGOUTR from MAXPLUSII
REGOUTR functions as a 16-bit output register. It is loaded
with 16-bit data from AC when the OUT instruction is executed.
The output of this register is connected directly to the I/O pins of
the micro-processor, so it can transfer data from inside the
micro-processor to the outside world.
2.3 Flip-Flops
-- Written by Murphy Chen <B82503131>
There are 7 flip-flops in our micro-processor. The flip-flops are
I, S, E, R, IEN, FGI, and FGO.
I indicates whether the current instruction
uses direct or indirect memory addressing.
S indicates whether or not to halt the computer.
E receives the LSB of AC when a circulate
right is performed on AC, and the MSB of AC when a circulate
left is performed on AC. The status of E can also be used to determine
whether or not to skip the next instruction. R is used to enter the
interrupt cycle when an interrupt occurs. IEN is used to disable
or enable the interrupt. FGI indicates whether
the input device can continue to send data. FGO indicates
whether the output device can continue to receive data.
The text design files corresponding to these flip-flops are FF.TDF
and JK.TDF, listed in the appendix. Their detailed description and
simulated results are shown as follows.
2.3.1 FF & JK
Input      | Output
CLR | SET  | a.d is connected to
H   | X    | 0
L   | H    | 1
L   | L    | a.q
Function Table of FF & JK
The simulated result of FF & JK
In fact, FF and JK have the same function, although we ended up creating
both of them. They serve as I, R, IEN, FGI, FGO, and S. Flip-flop
E is closely tied to AC and the ALU, so it is put together with them
in one text design file.
2.4 Bus & Decoders
2.4.1 Bus
-- Written by Murphy Chen <B82503131>
The bus connects the memory and the registers. It is 16 bits wide and is
implemented as an 8-to-1 multiplexer with a 3-bit select input. The
control unit can choose one of the six registers AR, PC, DR, AC, IR, TR
or the memory unit to drive the bus, and the bus is connected back to
the inputs of all of them.
For example, to store the value of DR into memory, the control
unit sends a select signal s[] to the bus to choose DR as the source.
The control unit also sends a LOAD signal to the memory (in fact, it
sends both Output Disable and Write Enable). After the next positive
edge of the clock, the memory has written the data coming from DR into
itself. The other registers do not change, because the control unit
does not send a LOAD signal to them.
The text design file corresponding to the bus is BUS.TDF, listed in the
appendix. The function table and the simulated result of the bus
are shown as follows.
Input   | Output
S[2..0] | o[15..0]
1       | AR[11..0]
2       | PC[11..0]
3       | DR[15..0]
4       | AC[15..0]
5       | IR[15..0]
6       | TR[15..0]
7       | MEM[15..0]
Function Table of Bus
The Simulated Result of Bus
Note that some inputs are 12 bits wide, while the output is 16 bits wide.
For these inputs, the bus pads zeros into the four most significant
bits, bit 12 to bit 15, of the bus output.
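The selection and zero padding performed by the bus can be modelled as follows. This is a behavioural sketch rather than the contents of BUS.TDF; the function name bus_output is hypothetical, and returning 0 for S = 0 is an assumption, because the function table does not list that case.

```python
def bus_output(s, ar=0, pc=0, dr=0, ac=0, ir=0, tr=0, mem=0):
    """Return the 16-bit bus value o[15..0] for select code s (S[2..0])."""
    sources = {
        1: ar & 0x0FFF,    # 12-bit source, upper four bits padded with zeros
        2: pc & 0x0FFF,    # 12-bit source, upper four bits padded with zeros
        3: dr & 0xFFFF,
        4: ac & 0xFFFF,
        5: ir & 0xFFFF,
        6: tr & 0xFFFF,
        7: mem & 0xFFFF,
    }
    return sources.get(s, 0)   # S = 0 is not in the table; 0 is assumed here


print(hex(bus_output(2, pc=0xABC)))
print(hex(bus_output(3, dr=0xBEEF)))
```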
2.4.2 Decoders
-- Written by Murphy Chen <B82503131>
Decoders are used to decode operation codes of instructions, and to
decode the timing signals generated from the sequential counter.
The text design file corresponding to the decoder is DECODER.TDF,
listed in the appendix. The function table and the simulated result
of the decoder are shown as follows.
Input      | Output
code[2..0] | out[7..0]
0          | B"00000001"
1          | B"00000010"
2          | B"00000100"
3          | B"00001000"
4          | B"00010000"
5          | B"00100000"
6          | B"01000000"
7          | B"10000000"
Function Table of Decoder
The Simulated Result of Decoder
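The decoder's function table is a one-hot encoding, which can be reproduced with a single shift. The sketch below only restates the table; the function name decode3to8 is hypothetical.

```python
def decode3to8(code):
    """Return out[7..0] as an integer with exactly one bit set."""
    if not 0 <= code <= 7:
        raise ValueError("code must fit in 3 bits")
    return 1 << code


# Print the whole function table in the same binary notation as above.
for code in range(8):
    print(code, 'B"%s"' % format(decode3to8(code), "08b"))
```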
2.4.3 Bidirectional I/O
-- Written by Lan Chang<B82503081>
We want to use RAM, so we must both read data from it and write data
to it. The RAM uses the same data lines for reading and writing, so
bidirectional I/O becomes necessary.
We implemented it successfully with tri-state buffers described in TDF.
Luckily, because we used TDF rather than GDF, the problem the TA had
mentioned did not occur at all.
2.5 Control Unit
-- Written by Lan Chang<B82503081>
The control unit is an important part of the micro-processor. Its job
is to receive the input from the input devices, the output of the ALU,
the contents of the registers, and the content of the sequence counter,
process them, and send the resulting signals to every register, the
output devices, the ALU, and the bus. Without it, the micro-processor
could never function correctly.
The control unit is built entirely from combinational logic. Through
its outputs it controls the bus, the registers, and the output
devices directly. It also tells the ALU what to do.
The control unit controls the registers mainly through three kinds of signals:
- Load(LD): it tells the register to load the content on the bus line.
- Increase(INR): it tells the register to increase its content by one.
- Clear(CLR): it tells the register to clear its content to zero.
The control unit must meet the requirements of our assembly (S language)
instructions, so it must handle every instruction precisely in every
clock cycle. As an example, suppose we want to execute the following instruction:
ADD NUMBER
The CPU executes the instruction over several clock cycles. To
know which clock cycle of the sequence it is in, the control unit
reads the sequence counter (SC). The control unit (CU) first clears SC;
when SC is 0, we call the cycle T0. At T0, according to the
function table (see appendix), we must execute the following microoperation:
Fetch R'T0: AR<- PC
Notice that R' represents the complement of the R register. The line means
that when the R register is zero and T0 is one (T0 is the LSB of the decoded
contents of SC), the contents of PC (program counter) are sent to
AR (address register). To carry out this function, we must put
the content of PC on the bus line and activate the load line of AR (AR_LD);
the content of PC is then sent to AR. So we added the following lines
to control.tdf (its complete content is in the appendix):
AR_LD = R'T[0] # …….
x2_PC= R'T[0] # …….
In the first line, the "……" stands for the other situations in
which AR loads the contents of the bus line. If any situation
in this line is matched, AR_LD goes high so that AR loads
the content on the bus line.
In the second line, x2_PC is a variable in the TDF file, and it marks
the situations in which PC is put on the bus line. Through an encoder,
x1_AR, x2_PC, x3_DR, x4_AC, x5_IR, x6_TR, x7_Mem are encoded into
a 3-bit signal called S0, S1, S2. The signal S[2..0] controls
the multiplexer, which then puts the required content on the bus line.
Then the following microoperation is executed:
R'T1: IR<- M[AR], PC<- PC+1
For the same reason, we have the following three lines in control.tdf:
IR_LD = R'T1 # ……
x7_MEM = R'T1 # ……
PC_INR = R'T1 # ……
We can see that the two cycles R'T0 and R'T1 complete the action
of fetching an instruction into IR and incrementing PC by one. This is
the basic fetch cycle with which every instruction begins.
After this cycle, IR[14..12] is automatically decoded into D[7..0]
because it is wired to a combinational decoder. The following microoperation
is then executed. (Notice that SC automatically increases by one
while the S register is high, because the S register is wired to SC_INR;
so while a program is executing, the control unit keeps the S register
high.)
R'T2: AR<- IR(0-11), I<- IR(15)
Of course, we have the following two lines in control.tdf:
AR_LD = R'T0 # R'T2 # ……
x5_IR = R'T2 # ……
Because the SET pin of the I register is connected to IR(15), we
do not need to add any line to control.tdf for it.
At T3 there is only one microoperation:
D7'IT3: AR<-M[AR]
Because our example is not an indirect instruction, I = 0 and this
microoperation is not executed. The control unit must still decide
whether to execute it, so in control.tdf we write:
AR_LD = R'T0 # R'T2 # D7'IT3 # ……
x7_Mem = D7'IT3 # ……
The next step prepares the ADD action:
D1T4: DR<- M[AR]
For this we have:
DR_LD = D1T4 # ……
x7_Mem = D7'IT3 # D1T4 # ……
(Notice that because the output of AR is connected directly to the address
pins of the memory, the CU need not do anything to send the address out.)
The next step is the main function of the ADD instruction:
D1T5: AC<- AC+DR, E<-Cout, SC<- 0
So in the CU we have:
AC_LD = D1T5
ADD = D1T5
SC_CLR = D1T5
(Notice that ADD is an output that controls the ALU, and the operation
on the E register is handled by the ALU.)
So far we have seen how an ADD operation is completed by the CU, ALU,
bus, and registers. The other operations are implemented in the CU by
the same method, and control.tdf was completed in this way.
(Please see details of control.tdf in the appendix behind the report)
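The ADD walk-through can be replayed with a small model of the control equations. The sketch below includes only the terms quoted in this section; the terms elided with "……" in control.tdf are left out, so it is not a model of the full control unit. The function names control_signals and active are hypothetical, and D and T are passed as the indices of the active decoder outputs.

```python
def control_signals(R, I, D, T):
    """Evaluate the control equations quoted in the text.

    R, I : flip-flop values (0 or 1)
    D    : index of the active decoded opcode line (0..7)
    T    : index of the active timing signal (0..7)
    """
    nR = not R

    def t(n):
        return T == n

    def d(n):
        return D == n

    sig = {
        "AR_LD": (nR and t(0)) or (nR and t(2)) or (not d(7) and I and t(3)),
        "IR_LD": nR and t(1),
        "PC_INR": nR and t(1),
        "DR_LD": d(1) and t(4),
        "AC_LD": d(1) and t(5),
        "ADD": d(1) and t(5),
        "SC_CLR": d(1) and t(5),
        "x2_PC": nR and t(0),
        "x5_IR": nR and t(2),
        "x7_MEM": (nR and t(1)) or (not d(7) and I and t(3)) or (d(1) and t(4)),
    }
    return {name: bool(value) for name, value in sig.items()}


def active(R, I, D, T):
    """Names of the signals that are high, in alphabetical order."""
    return sorted(n for n, on in control_signals(R, I, D, T).items() if on)


# Direct ADD (D1, I = 0): list the active signals for T0..T5.
for T in range(6):
    print("T%d" % T, active(R=0, I=0, D=1, T=T))
```

With I = 0 nothing is active at T3, which matches the remark that the indirect-address microoperation is skipped for a direct instruction.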
The simulation of the control unit is listed in the appendix
(up_ledt.scf) and is described as follows.
We implemented an S program in MEM.TDF. The program increases AC by
one on every pass and sends AC out to OUTR, and its behaviour can be
seen clearly in the simulation. When reset is pressed, a low pulse
makes the computer start working and execute the program. Because the
program does not use INPR, INPR stays at zero throughout. Most importantly,
we see that SC (S[2..0]) works properly; only when SC works properly
is anything else possible. We can also see that IR changes every
time a new instruction is read in, so we know that PC and
IR work together properly.
We can see that when AC.INR is high, AC is immediately increased by one,
which demonstrates that AC_INR functions properly. We also see that when
an indirect address is used, I.D (the output of the I register) is high,
so the test of I succeeds. Finally, OUTR increases by one, following
AC. We obtain the correct result, and we are confident that there is no
bug left in the functions we used. We repeated this process three times
and caught 7 bugs along the way.
2.6 Arithmetic and Logic Circuit
-- Written by Ing-Jye Huang <B82503007>
This unit implements the arithmetic and logic functions
that a basic computer should contain. Its inputs are
AC[15..0], which comes from the output of this unit, INP[15..0], which
comes from the input of the whole CPU, DR[15..0], and the control signals
from the control unit. The output of this unit is AC[15..0], because
this unit implements the instructions that operate on AC.
The control signals from the control unit are as follows:
Symbol | Action               | Description
AND    | AC←AC ^ DR           | AND AC with DR
ADD    | AC←AC + DR           | Add AC with DR
DR     | AC←DR                | Transfer from DR
INPR   | AC←INPR              | Transfer from INPR
COM    | AC←AC'               | Complement
SHR    | AC←shr AC , AC(15)←E | Shift right
SHL    | AC←shl AC , AC(0)←E  | Shift left
CLR    | AC←0                 | Clear
INC    | AC←AC+1              | Increment
The logic circuit of the ALU is as follows.
The Logic Circuit of ALU
These operations fall into two groups. AND, ADD, DR, INPR, COM, SHR,
and SHL belong to the LD group: each loads a new value into AC, and the
control unit sends a signal to indicate which one. CLR and INC are the
general operations that every kind of register has.
To implement AND, we use an AND gate to and each bit of AC[] with the
corresponding bit of DR[] and the AND control signal.
The ADD instruction is implemented with full adders. We and the
SUM of each full adder with the ADD control signal, and send the Carry
to the next full adder.
The DR operation is obtained by anding DR[] with the DR control signal.
The INPR instruction is similar to DR, except that INPR[] is anded with
the INPR control signal.
The COM instruction is implemented by anding the complement of AC[] with
the COM control signal.
The SHR operation is obtained by anding AC[i+1] with the SHR control
signal. Likewise, the SHL operation ands AC[i-1] with the
SHL control signal.
The INC operation works in the same way as ADD, except that we add 1
to AC. Because only one of the control signals is HIGH at any one
time, ORing the outputs of the AND gates mentioned above gives the data
that should be transferred into AC. The logic circuit is like the figure
shown above. Because the unit is highly repetitive, we use TDF to implement
it.
Another part of this unit deals with the instructions that involve the E
register. In an ADD instruction, the Carry of AC is transferred
to E. The CME operation complements E. In the SHR instruction, the
data stored in E is sent to AC[15], and AC[0] is sent to E. Similarly,
in the SHL instruction, the data stored in E is sent to AC[0], and
AC[15] is sent to E. These instructions are implemented in a similar
way to those mentioned above.
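The AND-OR structure of the LD group and the handling of E can be sketched as below. This is a word-level behavioural model, not the bit-slice AHDL of our design: each candidate result is gated by a mask that is all ones only when its control signal is active, and the gated terms are ORed together. The function name alu_step is hypothetical, exactly one signal is assumed to be active at a time, and CLR and INC are left out because they are ordinary register functions.

```python
MASK = 0xFFFF
LD_SIGNALS = ("AND", "ADD", "DR", "INPR", "COM", "SHR", "SHL")


def alu_step(ac, e, dr, inpr, signal):
    """Return (new_ac, new_e) when the named control signal is HIGH."""

    def gate(name):
        # 16 copies of the control signal: all ones if active, else zero
        return MASK if signal == name else 0

    total = ac + dr
    loaded = (
        ((ac & dr) & gate("AND"))
        | ((total & MASK) & gate("ADD"))
        | ((dr & MASK) & gate("DR"))
        | ((inpr & MASK) & gate("INPR"))
        | ((~ac & MASK) & gate("COM"))
        | (((ac >> 1) | (e << 15)) & gate("SHR"))      # AC(15) <- E
        | ((((ac << 1) & MASK) | e) & gate("SHL"))     # AC(0)  <- E
    )
    new_ac = loaded if signal in LD_SIGNALS else ac    # AC loads only on LD

    if signal == "ADD":
        new_e = (total >> 16) & 1      # carry out of bit 15
    elif signal == "SHR":
        new_e = ac & 1                 # AC[0] goes to E
    elif signal == "SHL":
        new_e = (ac >> 15) & 1         # AC[15] goes to E
    elif signal == "CME":
        new_e = e ^ 1                  # complement E
    else:
        new_e = e
    return new_ac, new_e


print(alu_step(0xFFFF, 0, 0x0001, 0, "ADD"))
```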
The Simulated Result of ALU from MAXPLUSII
The simulation result of this unit is shown above. We set the control
signals HIGH one at a time, in order. The AND and OR gates needed
to implement these instructions introduce some delay, so if the CLK to the
registers and the control signals go HIGH at the same time, the desired
result is obtained on the next clock. This does not break correct
operation, because all instructions are delayed by one clock cycle.
2.7 Assembler
-- Written by Murphy Chen <B82503131>
The assembler translates assembly programs into machine
instructions. It runs on a PC, and its output file
can be used to program EPROMs or to download into SRAMs, which are
connected to the micro-processor to execute application programs.
The steps for developing an application program are as follows.
First, users write their application program in assembly language and
save the file with the extension '.s'. Next, they
run the assembler with the filename as a parameter. The assembler
prints some useful information on the screen and writes the translated
binary code to a file with the extension '.lst'. Users
can then use this binary code to program EPROMs or download it into SRAMs.
Finally, users turn on the power of the micro-processor to run the
resulting program.
The source code of the assembler I created is listed in the appendix; its
filename is AS.CC. The operation of the assembler is described as follows.
The assembler uses several tables to identify the keywords
and the user-defined symbols. PseudoTable[] contains the pseudo
instructions of the assembler. There are four pseudo instructions:
ORG, END, DEC, and HEX. ORG defines the new location of the next
instruction. END indicates the end of the program. DEC
indicates that the line is not an instruction but
a decimal datum. Similarly, HEX indicates that the line contains
a hexadecimal datum. MRITable[] contains the 7 memory-reference instructions
and the corresponding machine codes. nMRITable[] contains the
18 non-memory-reference instructions and the corresponding machine codes.
UATSymbol[] stores the user-defined symbols. UATaddr[]
stores the address of each user-defined symbol. UATlen[]
stores the length of each user-defined symbol.
Other variables include Code[], LSCode[], MSCode[],
and buffer. Code[] stores the resulting machine
codes. LSCode[] stores the 8 least significant bits
of each machine code, and MSCode[] stores
the 8 most significant bits. Why do we need
LSCode[] and MSCode[]? It is because
most EPROMs and SRAMs are 8 bits wide, while the instructions
and data are 16 bits wide. To program them into EPROMs/SRAMs,
we have to split the 16-bit machine codes into two 8-bit parts
and program each part into a separate EPROM/SRAM. buffer stores
the content of the input assembly file.
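The split into low and high bytes is simple to illustrate. The sketch below is written in Python rather than taken from AS.CC, and the function name split_code is hypothetical.

```python
def split_code(code):
    """Split a list of 16-bit words into (low bytes, high bytes)."""
    ls_code = [word & 0xFF for word in code]          # 8 least significant bits
    ms_code = [(word >> 8) & 0xFF for word in code]   # 8 most significant bits
    return ls_code, ms_code


ls, ms = split_code([0x1234, 0xF400])
print([hex(b) for b in ls], [hex(b) for b in ms])
```

Each list then goes to one of the two 8-bit memory chips, so that the same address on both chips returns the two halves of one 16-bit word.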
First, the assembler reads the content of the input assembly file into
buffer. Then it scans the entire buffer
twice. During the first pass, the assembler only cares about the two
pseudo instructions ORG and END and about user-defined symbols. Every
time it encounters a user-defined symbol, it first checks whether the
symbol has already been defined. If it has, the assembler outputs an
error message and exits. If it has not, the assembler adds
the new user-defined symbol to UATSymbol[], stores the
address of the symbol in UATaddr[], and stores the length of
the symbol in UATlen[].
During the second pass, the assembler translates every instruction
it encounters into machine code according to the two tables MRITable[]
and nMRITable[]. When the assembler encounters a non-memory-reference
instruction, it directly generates the corresponding machine code
from nMRITable[]. When the assembler encounters
a memory-reference instruction, it checks the address part of the
instruction, which is a user-defined symbol. The assembler
searches UATSymbol[] to check whether this symbol has been
defined in the file. If the symbol has not been defined, the
assembler outputs an error message and exits. If the symbol has
been defined, the assembler generates the address value corresponding
to it. Then the assembler checks whether the instruction
ends with an 'I', which indicates an indirect memory reference.
After these checks, the assembler generates the corresponding
machine code from MRITable[] and UATaddr[].
After translating all of the assembly instructions into machine code
and storing it in Code[], the assembler writes Code[]
to a file with the extension '.bin' for downloading into SRAMs. Then
the assembler splits Code[] into two parts, LSCode[]
and MSCode[], as described before, and writes LSCode[] and MSCode[]
to a file with the extension '.lst'. Users can copy the content
of this file and paste it into EXPRO for programming EPROMs.
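The two passes can be sketched compactly. This is not AS.CC: it is a Python illustration of the procedure described in this section, and several details are assumptions made only for the example. The opcode and instruction encodings in MRI and NON_MRI are placeholders (the real tables hold 7 and 18 entries), labels are assumed to end with a comma, the ORG operand is assumed to be hexadecimal, and every non-blank line is assumed to contain an operation.

```python
MRI = {"AND": 0x0, "ADD": 0x1, "LDA": 0x2}     # assumed opcodes (partial table)
NON_MRI = {"OUT": 0xF400, "HLT": 0x7001}       # assumed encodings (partial table)


def parse(line):
    """Split a source line into (label or None, list of fields)."""
    fields = line.split()
    label = None
    if fields[0].endswith(","):
        label = fields[0][:-1]
        fields = fields[1:]
    return label, fields


def assemble(source):
    lines = [parse(ln) for ln in source.splitlines() if ln.strip()]

    # Pass 1: only ORG, END and user-defined symbols matter.
    symbols, loc = {}, 0
    for label, fields in lines:
        if fields[0] == "ORG":
            loc = int(fields[1], 16)
            continue
        if fields[0] == "END":
            break
        if label is not None:
            if label in symbols:
                raise ValueError("symbol defined twice: " + label)
            symbols[label] = loc
        loc += 1

    # Pass 2: translate every line into a 16-bit word.
    code, loc = {}, 0
    for label, fields in lines:
        op = fields[0]
        if op == "ORG":
            loc = int(fields[1], 16)
            continue
        if op == "END":
            break
        if op == "DEC":
            word = int(fields[1]) & 0xFFFF
        elif op == "HEX":
            word = int(fields[1], 16) & 0xFFFF
        elif op in NON_MRI:
            word = NON_MRI[op]
        elif op in MRI:
            if fields[1] not in symbols:
                raise ValueError("undefined symbol: " + fields[1])
            indirect = len(fields) > 2 and fields[2] == "I"
            word = (indirect << 15) | (MRI[op] << 12) | symbols[fields[1]]
        else:
            raise ValueError("unknown instruction: " + op)
        code[loc] = word
        loc += 1
    return code


program = """
ORG 100
LDA VAL
ADD VAL I
OUT
HLT
VAL, HEX 0005
END
"""
for addr, word in sorted(assemble(program).items()):
    print("%03X: %04X" % (addr, word))
```

The forward reference to VAL in the first instruction is the reason two passes are needed: the address of VAL is only known after the whole file has been scanned once.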
2.8 Download Kit
-- Written by Ing-Jye Huang <B82503007>
Our project is to make a CPU. Without applications, we cannot demonstrate
its ability. The applications can be written in assembly language
and then converted into machine code. The problem is how to transfer
the programs to the SRAMs. So Murphy Chen and I decided to build a
download kit for transferring our programs. Our first idea was to download
the programs via the printer port. We have 12 address inputs,
16 data inputs, and some control signals including WE, OE, …,
and the printer port cannot carry so many signals simultaneously, so we
use the concept of shift registers.
We send one bit of data on each clock. This means the data is transferred
serially, not in parallel. For instance, the address has 12 bits.
We send the bits one by one, and after 12 clocks the complete 12-bit
address is stored in the registers. The 16-bit data is then transferred
in the same way. After the address and data are ready, we send the write
enable signal to tell the SRAM that we want to write the data into
it. To check whether the data has been downloaded correctly, we
give the SRAM an address and read the data out one bit at a time
(still using shift registers). Then we can compare it with what we
sent.
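The serial transfer described above can be modelled in a few lines. This is a behavioural sketch, not the circuit we built: the SRAM is a Python dictionary, the function names are hypothetical, and sending the most significant bit first is an assumption, since the bit order is not stated here.

```python
def to_bits(value, width):
    """Serialize a value into a list of bits, most significant bit first."""
    return [(value >> i) & 1 for i in range(width - 1, -1, -1)]


def shift_in(bits, width):
    """Clock the bits, one per cycle, into a shift register of the given width."""
    reg = 0
    for bit in bits:
        reg = ((reg << 1) | (bit & 1)) & ((1 << width) - 1)
    return reg


def download_word(sram, addr, data):
    """Shift in 12 address bits and 16 data bits, then pulse write enable."""
    a = shift_in(to_bits(addr, 12), 12)   # 12 clocks for the address
    d = shift_in(to_bits(data, 16), 16)   # 16 clocks for the data
    sram[a] = d                           # WE: the SRAM stores the word
    return a, d


def read_back(sram, addr):
    """Read a word out one bit at a time and reassemble it for comparison."""
    return shift_in(to_bits(sram[addr], 16), 16)


sram = {}
download_word(sram, 0x801, 0xBEEF)
print(hex(read_back(sram, 0x801)))
```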
The block diagram is as follows:

The Circuit of Download Kit
CLK1 : the clock of shifting data
CLK2 : the clock of shifting address
CLK3 : the clock of shifting din
LOAD : load DIN[0..15] to registers
OE1 : output enable of DATA[0..15]
OE : output enable of SRAM
WE : write enable of SRAM
The Simulated Result of Shift Register from MAXPLUSII
We connect DIN[0..15] and DATA[0..15] together. After downloading
programs into the SRAMs, we want to read the data back to check that
it is correct. At that time the output of DATA[0..15] must be high
impedance, or it will affect DIN[0..15] and we will not get the desired
data. This is why we need OE1, and it was the first problem we encountered.
After finding this problem, we almost downloaded the program into
the SRAMs correctly. (Noted by Murphy Chen, "we had nearly successfully
downloaded the data into SRAMs, but when we read back the SRAMs for
check, a strange bug existed that the most significant bit of every
data in every address is always one. I could not figure out why that
happened.") The next time we came back to the lab to try to
fix the strange bug, even the previous result could not be obtained.
We used a logic analyzer (LA) to observe the data arriving at the ALTERA
input, and it was correct. But the output of the ALTERA was wrong, even
though the simulation result was right. We tried again and again, even
rewiring the circuit, but we still could not reproduce the earlier
correct result. We spent almost two whole days on this problem. Finally,
to finish the experiment before the deadline, we were forced to give up
this idea and use another method: storing the programs in EPROM.
2.9 Applications
2.9.1 The Sine Wave Generator (SIN.S)
-- Written by Murphy Chen <B82503131>
In order to demonstrate the power of our micro-processor HCL-2516,
we needed to write applications. My first idea was a digital
voltmeter and a digital function generator, two in one, and I asked
Lan to do that. I wrote a C program to generate three tables, for a
sine wave, a triangle wave, and a square wave respectively. It is listed
in the appendix as FUN.CC. That program generates tables the assembler
can understand, which are suitable for writing the digital function
generator application.
But because there was something wrong with the timing of our
micro-processor (we have since figured out why; this is discussed in
the next chapter), we were forced to use only the EPROM for the
demonstration, without SRAMs. This means we cannot write to memory at
all, which is a serious limitation!
So I came up with the idea of displaying a sine wave for the
demonstration. It is simple. I wrote a C program called SIN.CC (listed
in the appendix) to generate the assembly program. There are two loops
in the program. The first loop generates the two instructions:
LDA SINi
OUT
where i is a variable running from 0 to 255. These two instructions
load one sample of the sine wave from memory at address SINi
and output it to a DAC.
The second loop generates the following line:
SINi, HEX (sin(i*2*3.14159/256)+1)*128
where SINi is a label for the assembler to identify, and the word at
that address stores one sample value of the sine wave.
The circuit was wired by Ing-Jye. The result of the application,
hardcopied from a digital oscilloscope, is as follows.
Fig 2 The Sine Wave Generated By Our Micro-Processor
We can see that there is a strange peak. This is because of the
quality of the EPROM we used. At some address in the EPROM corresponding
to an entry of the sine wave table, the data is corrupted, which
produces the spike.
The source code of the sine wave generator is not listed in this report,
because it consists purely of repetitive instructions and data.
2.9.2 ZERO.S
-- Written by Lan Chang<B82503081>
This program is very simple: it only displays a zero on the 7-segment
display. It was the first program we used to test the whole system
while we were trying to find the bug.
2.9.3 7_SEG_T.S
-- Written by Lan Chang<B82503081>
This program displays a 2 on the 7-segment display. Its purpose is
to test the 7-segment display and the RAM.
2.9.4 DICE.S
-- Written by Lan Chang<B82503081>
This program rapidly displays the numbers 0-9 in sequence, over and
over, on the 7-segment display. When a button is pushed, the cycling
stops and the final number is displayed. The program runs from ROM; we
wrote it for the demo because we could not use RAM at that time.
2.9.5 7_SEG_1.S
-- Written by Lan Chang<B82503081>
This program counts from 0 to 9999 and displays the number on the
7-segment display. Between two numbers there is a delay loop of 10000
iterations so that we can see the number clearly. Unfortunately, the
program cannot function properly because we failed to get the RAM
working.
2.9.6 0-9.S
-- Written by Lan Chang<B82503081>
This program displays 0-9 in sequence on the 7-segment display.
Unfortunately, it is also unusable because of the RAM.
2.9.7 V_METER.S
-- Written by Lan Chang<B82503081>
This program functions as a voltmeter. It reads data from the output
of the ADC, whose input is the voltage we want to measure. It then
converts the binary data to BCD, converts the BCD digits to 7-segment
codes, and finally displays them on the 7-segment display. It was
originally our main goal, but because of the RAM……
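The conversion chain described for V_METER.S (binary to BCD, then BCD
to 7-segment codes) can be sketched in C as follows. This is not a
translation of V_METER.S. The shift-and-add-3 method is chosen here only
because it needs nothing beyond shifts, adds and compares, and the
segment patterns assume a display wired in gfedcba bit order with
active-high segments. All names are hypothetical.

```c
#include <stdint.h>

/* Segment codes for digits 0-9, bit order gfedcba, active high (assumed). */
const uint8_t SEG7[10] = {
    0x3F, 0x06, 0x5B, 0x4F, 0x66, 0x6D, 0x7D, 0x07, 0x7F, 0x6F
};

/* Convert a binary value in 0..9999 to four packed BCD digits
 * using the shift-and-add-3 ("double dabble") method. */
uint16_t bin_to_bcd(uint16_t value)
{
    uint32_t scratch = value;   /* bits 0-15: binary, bits 16-31: BCD */
    for (int i = 0; i < 16; i++) {
        for (int d = 0; d < 4; d++) {
            unsigned shift = 16u + 4u * (unsigned)d;
            if (((scratch >> shift) & 0xFu) >= 5u)
                scratch += (uint32_t)3 << shift;   /* pre-correct digit */
        }
        scratch <<= 1;          /* move the next binary bit into the BCD */
    }
    return (uint16_t)(scratch >> 16);
}

/* Expand four packed BCD digits into segment codes, leftmost digit first. */
void bcd_to_segments(uint16_t bcd, uint8_t out[4])
{
    for (int d = 0; d < 4; d++)
        out[d] = SEG7[(bcd >> (4 * (3 - d))) & 0xF];
}
```

For example, bin_to_bcd(1234) gives 0x1234, and bcd_to_segments then
yields the codes for the digits 1, 2, 3 and 4 in display order.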
3 Discussions
3.1 Murphy Chen <B82503131> says
3.1.1 The Bidirectional Bus in MAX+PLUS II
We heard from classmates that the T.A. had said there was something
wrong with the bidirectional bus in the version of MAX+PLUS II we were
using. But our design uses a bidirectional bus to connect to the
memory, so Ing-Jye and I set out to test the functionality of the
bidirectional bus of the Altera 8636 device.
To test the bidirectional bus, Ing-Jye and I first built the following
circuit, using the tri-state buffer that MAX+PLUS II provides in the
Graphic Design File:

When we compiled that graphic design file in MAX+PLUS II, there was
always an error message which we could not understand. So we decided
to make our own tri-state buffer and try again. After we built our
own tri-state buffer in a text design file and compiled again, there
were no error messages. We downloaded the resulting ttf file and tested
the circuit by providing inputs and observing outputs manually, and we
found that it worked! So we could finally be sure that the bidirectional
bus in the Altera device is okay. (Refer to BIDIR.GDF and TRI2.TDF in
the appendix.)
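A bidirectional bus relies on at most one tri-state buffer driving the
line at any time. The following C sketch models only that rule in
software. It is not a translation of TRI2.TDF, and the type and function
names are hypothetical.

```c
#include <stdint.h>

typedef enum { BUS_FLOATING, BUS_DRIVEN, BUS_CONTENTION } BusState;

typedef struct {
    uint16_t value;   /* value the buffer would put on the bus */
    int oe;           /* output enable: non-zero means the buffer drives */
} TriBuffer;

/* Resolve a bus shared by n tri-state buffers.
 * *bus_out is meaningful only when the result is BUS_DRIVEN. */
BusState resolve_bus(const TriBuffer *drivers, int n, uint16_t *bus_out)
{
    int active = 0;
    for (int i = 0; i < n; i++) {
        if (drivers[i].oe) {
            *bus_out = drivers[i].value;
            active++;
        }
    }
    if (active == 0)
        return BUS_FLOATING;      /* every output is high impedance */
    return active == 1 ? BUS_DRIVEN : BUS_CONTENTION;
}
```

With two buffers on one line, enabling exactly one gives BUS_DRIVEN,
and enabling both gives BUS_CONTENTION, which is the situation an
output-enable signal is there to prevent.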
3.1.2 EPROM programming algorithms in EXPRO
When we programmed EPROMs using EXPRO, we once ran into a problem:
the EPROMs passed the blank check but could not be programmed
successfully. I checked the settings in EXPRO and found something
suspicious: the EPROM programming algorithm was set to 'fast'. After
resetting the programming algorithm to 'normal', the EPROMs could
finally be programmed successfully. So I think it is important to note
that when we program EPROMs, besides setting the right brand and the
right device, we should always check that the EPROM programming
algorithm is set to the appropriate one.
3.1.3 Binary Format Files in EXPRO
There is something strange in the way EXPRO handles binary format
files. When I compiled assembly programs, generated the binary files,
and read them into EXPRO, every 0x0a byte in the original binary file
was changed by EXPRO into the two bytes 0x0a 0x0d. So binary files
could not be used. I figured out an alternative way: copy & paste. I
made the assembler generate a text file consisting of hexadecimal
characters instead of a binary file; we can then copy these characters
and paste them in the buffer edit mode of EXPRO. This solved the
problem!
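The workaround amounts to writing each machine word as hexadecimal text
rather than as raw bytes, so that no 0x0a byte ever appears in the data
itself. A minimal C sketch is shown below. The function name and the
layout (four hex digits per word, separated by spaces, a fixed number of
words per line) are assumptions; the exact layout the assembler produced
for EXPRO is not given here.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Format n 16-bit words as hex text into `out`.
 * Returns the number of characters written, or -1 on error. */
int words_to_hex_text(const uint16_t *words, size_t n, size_t per_line,
                      char *out, size_t out_size)
{
    size_t used = 0;
    if (out_size == 0 || per_line == 0)
        return -1;
    out[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        /* End the line after per_line words or after the last word. */
        int end_of_line = ((i + 1) % per_line == 0) || (i + 1 == n);
        int w = snprintf(out + used, out_size - used, "%04X%c",
                         (unsigned)words[i], end_of_line ? '\n' : ' ');
        if (w < 0 || (size_t)w >= out_size - used)
            return -1;                /* buffer too small */
        used += (size_t)w;
    }
    return (int)used;
}
```

A word such as 0x000A is then emitted as the four characters "000A",
so the byte value that EXPRO altered never occurs as data.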
3.1.4 The Download Kit
Ing-Jye and I cooperated to build the download kit. Ing-Jye has
described what we did; I will add some more details.
The first idea for transferring data from the PC to the SRAMs was to
use a shift register. We once succeeded, but then something went wrong.
I talked about it with some classmates. They told me to try the shift
register provided by MAX+PLUS II in graphic design instead of building
our own. So I tried that, hoping it would work, but in vain. Please
refer to the appendix for SHIFT3.GDF, which I tried.
Then I came up with another idea: instead of sending the address
through a shift register, build a counter, which can be cleared to zero
(corresponding to address 0) and incremented to step through the
following addresses. Every time the counter is incremented, the data
can be written to the SRAMs at the address given by the counter. But
this still did not work. Please refer to the appendix for DOWNLOAD.GDF
and ADDR.TDF, which I tried.
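The counter-based addressing idea can be illustrated with a short C
model. It is not derived from DOWNLOAD.GDF or ADDR.TDF; the names and
the SRAM depth of 2048 words are assumptions for illustration.

```c
#include <stdint.h>

#define SRAM_WORDS 2048u   /* assumed SRAM depth, for illustration only */

typedef struct {
    uint16_t count;              /* address counter */
    uint16_t sram[SRAM_WORDS];   /* software stand-in for the SRAMs */
} CounterKit;

/* Clear the counter so that it points at address 0. */
void counter_clear(CounterKit *k)
{
    k->count = 0;
}

/* Step to the next address, wrapping at the end of the SRAM. */
void counter_increment(CounterKit *k)
{
    k->count = (uint16_t)((k->count + 1u) % SRAM_WORDS);
}

/* Write consecutive words starting at address 0; no address is sent. */
void download(CounterKit *k, const uint16_t *prog, unsigned n)
{
    counter_clear(k);
    for (unsigned i = 0; i < n; i++) {
        k->sram[k->count] = prog[i];   /* write at the current address */
        counter_increment(k);
    }
}
```

Compared with shifting in a full address for every word, this needs only
a clear signal and one increment pulse per word from the PC side.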
I think something goes wrong when the PC side sends a signal to act
as the positive edge of a clock pulse. But that still does not
completely explain why our circuit could not work! We really hated this
bug; it delayed our project.
3.1.5 The Assembler
The assembler was not built once and then left alone; it grew. Because
its user, Lan Chang, had many requirements and ran into many bugs in the
assembler, I continuously improved its performance and capabilities.
For example, the assembler could originally handle only three-character
labels, but following Lan Chang's suggestions I improved it to handle
labels of up to 256 characters.
One thing to note: all the C/C++ programs I wrote were compiled with
DJGPP. It is a protected-mode DOS port of GNU C/C++, and it's free.
3.1.6 Acknowledgement
It was very nice to cooperate with my partners. Ing-Jye did a lot of
work wiring the circuit and also had a lot of experience with AHDL and
pin assignment in MAX+PLUS II. Besides building the arithmetic and logic
circuits, he also worked with me on the design of the download kit and
the sine wave generator application.
Lan Chang did a lot of work designing the control unit of our
micro-processor. Simulating his circuits was really a big job, but he
made it. He also wrote a lot of assembly applications and caught many
bugs in my assembler.
Thanks to my partners for agreeing to my crazy idea of building
a computer! Without you guys, I could not have fulfilled the dream.
Finally, thanks to the T.A.; without you, we would not have had the
chance to use the most advanced instruments and to touch the most
advanced technology in the field of digital circuits.
3.2 Ing-Jye Huang <B82503007> says
We really encountered a lot of frustration in finishing this
experiment, even though we started the project quite early and worked
on it regularly each week. The main part of the project, the CPU, was
finished very soon, but the problem of downloading programs to the SRAM
via the printer port could not be solved and cost us a lot of time. One
reason we did not give up on this part was that we had already spent
time writing the downloading program (this was all due to Chen's hard
work); another reason was that downloading applications from the PC to
the SRAMs is much more convenient than writing them into EPROMs. But
unfortunately, in the end we still had to give up and use EPROMs to
store our applications. By the way, we learned a quite valuable lesson
from using EPROMs: don't try to save money by buying cheaper ones.
Cheaper ROMs are really less reliable than high-quality ones, even
though the latter are more expensive.
It was really a nice experience to cooperate with my two partners.
I learned very much from them. I have to admit that my contribution
to this experiment was less than my partners', but they never
complained. This is also why I felt very good working with them. So
finally, I want to say it again: "Thank you, my partners."
3.3 Lan Chang <B82503081> says
It's like a novel, I think, because we really made a computer! It feels
like a dream, but we did spend our time and energy on it. Yes, we have
really done it.
The control unit is designed properly, I think. At first we decided
to follow Ch. 5 and Ch. 6 of《Computer System Architecture》; then I
designed the control unit according to the operation function table.
Fortunately, I could use the function table directly to design a purely
combinational TDF file, which is easy to design and to read.
But the trouble began when I wanted to simulate the control unit. I had
to draw every input in the SCF by hand and inspect it carefully.
In the end, catching three bugs made it perfect. (Maybe not perfect;
perhaps we just did not touch the wrong part.)
After the CPU and bus were combined with the ALU and registers, we faced
a bigger problem: we had to simulate the whole system!! Wow, that is not
simple work, for there are so many inputs and outputs, and the ROM and
RAM were not yet ready. We had to make a "fake" memory to simulate it
in the MAX+PLUS II software. Finally we did it. I wrote the hex machine
code directly into MEM.tdf and then plugged it into the system. Then
there was a long bug-hunting journey. I tried three MEMs (MEM, MEM1,
and MEM2, which hold different programs) and fixed about 7 bugs. The
system seemed okay, though there may still have been hidden bugs.
During the same period, Murphy Chen completed the assembler and
began trying to download from the PC through the printer port directly
to the RAM, and Mr. Huang completed his ALU and wired the circuits we
needed.
By then it was about the end of December, and I began to write
applications. In the DC lab, everyone was using MAX+PLUS II on the
computer except me. I was using Notepad to write our S programs!! It
was interesting work because I felt that I was different (ha…). And
whenever MAX+PLUS II had a problem, only I could still use the computer,
because NOTEPAD never fails!! (HA HA).
Were it not for the RAM problem, we would have gained a greater sense
of achievement from the work. It's a pity, but we got more achievement,
learning, and happiness than regret and tiredness. Thank you, T.A., and
thanks to every classmate who helped us.
4 References
1. M. Morris Mano, Computer System Architecture, Chapter 5: Basic
Computer Organization and Design, Prentice-Hall.
2. M. Morris Mano, Computer System Architecture, Chapter 6: Programming
the Basic Computer, Prentice-Hall.
3. Joseph D. Greenfield, Practical Digital Design Using ICs, Chapter 13:
Memories, Prentice-Hall.
4. Joseph D. Greenfield, Practical Digital Design Using ICs, Chapter 18:
Analog to Digital Conversion, Prentice-Hall.
5. MAX+PLUS II On-Line Help.
6. Data sheet of IDT6116SA, Integrated Device Technology, Inc.
7. Data sheet of M27C256B, SGS-THOMSON Microelectronics.
8. Information about printer ports, http://www.centronics.com.
5 Appendix
See the following pages.
Fetch      R'T0:                     AR ← PC
           R'T1:                     IR ← M[AR], PC ← PC+1
Decode     R'T2:                     D0, …, D7 ← Decode IR(12-14),
                                     AR ← IR(0-11), I ← IR(15)
Indirect   D7'IT3:                   AR ← M[AR]
Interrupt  T0'T1'T2'(IEN)(FGI+FGO):  R ← 1
           RT0:                      AR ← 0, TR ← PC
           RT1:                      M[AR] ← TR, PC ← 0
           RT2:                      PC ← PC+1, R ← 0, SC ← 0

Memory-reference instructions
AND        D0T4:                     DR ← M[AR]
           D0T5:                     AC ← AC ∧ DR, SC ← 0
ADD        D1T4:                     DR ← M[AR]
           D1T5:                     AC ← AC+DR, E ← Cout, SC ← 0
LDA        D2T4:                     DR ← M[AR]
           D2T5:                     AC ← DR, SC ← 0
STA        D3T4:                     M[AR] ← AC, SC ← 0
BUN        D4T4:                     PC ← AR, SC ← 0
BSA        D5T4:                     M[AR] ← PC, AR ← AR+1
           D5T5:                     PC ← AR, SC ← 0
ISZ        D6T4:                     DR ← M[AR]
           D6T5:                     DR ← DR+1
           D6T6:                     M[AR] ← DR,
                                     if (DR=0) then (PC ← PC+1), SC ← 0

Register-reference instructions
D7I'T3 = r (common to all register-reference instructions)
IR(i) = Bi (i = 0, 1, 2, …, 11)
           r:                        SC ← 0
CLA        rB11:                     AC ← 0
CLE        rB10:                     E ← 0
CMA        rB9:                      AC ← !AC
CME        rB8:                      E ← !E
CIR        rB7:                      AC ← shr AC, AC(15) ← E, E ← AC(0)
CIL        rB6:                      AC ← shl AC, AC(0) ← E, E ← AC(15)
INC        rB5:                      AC ← AC+1
SPA        rB4:                      if (AC(15)=0) then (PC ← PC+1)
SNA        rB3:                      if (AC(15)=1) then (PC ← PC+1)
SZA        rB2:                      if (AC=0) then (PC ← PC+1)
SZE        rB1:                      if (E=0) then (PC ← PC+1)
HLT        rB0:                      S ← 0

Input-output instructions
D7IT3 = p (common to all input-output instructions)
IR(i) = Bi (i = 6, 7, 8, 9, 10, 11)
           p:                        SC ← 0
INP        pB11:                     AC(0-7) ← INPR, FGI ← 0
OUT        pB10:                     OUTR ← AC(0-7), FGO ← 0
SKI        pB9:                      if (FGI=1) then (PC ← PC+1)
SKO        pB8:                      if (FGO=1) then (PC ← PC+1)

Control Functions and Microoperations for our Micro-processor HCL-2516
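To make the table easier to follow, here is a minimal C sketch of an
instruction-level simulator built from it. It is only an illustration,
not the AHDL design of the HCL-2516. It executes one whole instruction
per step instead of one timing signal per clock, and it leaves out the
interrupt cycle and the INP, SKI and SKO instructions. LDA is taken to
load AC from memory, as its name (Load Word) indicates. The instruction
encodings are inferred from the table (opcode in IR(12-14), I in IR(15),
Bi = IR(i)), register-reference words are assumed to have a single Bi
bit set, and all C names are hypothetical.

```c
#include <stdint.h>

#define MEM_WORDS 4096u

typedef struct {
    uint16_t mem[MEM_WORDS];
    uint16_t pc, ar, ir, ac, outr;   /* PC and AR hold 12-bit values */
    unsigned e;                      /* carry flip-flop E */
    unsigned s;                      /* start-stop flip-flop S, 1 = running */
} Cpu;

/* Register-reference instructions: bit Bi of IR selects the operation. */
void exec_register_ref(Cpu *c)
{
    unsigned b = c->ir & 0x0FFFu, old_e = c->e;
    if (b & 0x800) c->ac = 0;                                  /* CLA */
    if (b & 0x400) c->e = 0;                                   /* CLE */
    if (b & 0x200) c->ac = (uint16_t)~c->ac;                   /* CMA */
    if (b & 0x100) c->e ^= 1u;                                 /* CME */
    if (b & 0x080) { c->e = c->ac & 1u;                        /* CIR */
                     c->ac = (uint16_t)((c->ac >> 1) | (old_e << 15)); }
    if (b & 0x040) { c->e = (unsigned)(c->ac >> 15);           /* CIL */
                     c->ac = (uint16_t)((c->ac << 1) | old_e); }
    if (b & 0x020) c->ac++;                                    /* INC */
    if ((b & 0x010) && !(c->ac & 0x8000)) c->pc++;             /* SPA */
    if ((b & 0x008) && (c->ac & 0x8000)) c->pc++;              /* SNA */
    if ((b & 0x004) && c->ac == 0) c->pc++;                    /* SZA */
    if ((b & 0x002) && c->e == 0) c->pc++;                     /* SZE */
    if (b & 0x001) c->s = 0;                                   /* HLT */
    c->pc &= 0x0FFF;
}

/* Execute one whole instruction: fetch, decode, indirect, execute. */
void cpu_step(Cpu *c)
{
    c->ar = c->pc;                                   /* R'T0 */
    c->ir = c->mem[c->ar];                           /* R'T1 */
    c->pc = (c->pc + 1) & 0x0FFF;
    unsigned op = (c->ir >> 12) & 7u;                /* R'T2 */
    unsigned indirect = (unsigned)(c->ir >> 15);
    c->ar = c->ir & 0x0FFF;

    if (op == 7) {
        if (!indirect) exec_register_ref(c);
        else if (c->ir & 0x0400) c->outr = c->ac & 0x00FF;     /* OUT */
        return;
    }
    if (indirect) c->ar = c->mem[c->ar] & 0x0FFF;    /* D7'IT3 */
    switch (op) {
    case 0: c->ac &= c->mem[c->ar]; break;                     /* AND */
    case 1: { uint32_t sum = (uint32_t)c->ac + c->mem[c->ar];  /* ADD */
              c->ac = (uint16_t)sum; c->e = (sum >> 16) & 1u; break; }
    case 2: c->ac = c->mem[c->ar]; break;                      /* LDA */
    case 3: c->mem[c->ar] = c->ac; break;                      /* STA */
    case 4: c->pc = c->ar; break;                              /* BUN */
    case 5: c->mem[c->ar] = c->pc;                             /* BSA */
            c->pc = (c->ar + 1) & 0x0FFF; break;
    case 6: c->mem[c->ar]++;                                   /* ISZ */
            if (c->mem[c->ar] == 0) c->pc = (c->pc + 1) & 0x0FFF;
            break;
    }
}

/* Run until HLT clears S or max_steps instructions have executed. */
unsigned cpu_run(Cpu *c, unsigned max_steps)
{
    unsigned steps = 0;
    c->s = 1;
    while (c->s && steps < max_steps) { cpu_step(c); steps++; }
    return steps;
}
```

As a usage example, loading the words 0x2004 (LDA 4), 0x1005 (ADD 5),
0xF400 (OUT) and 0x7001 (HLT) at addresses 0 to 3, with the data words
3 and 4 at addresses 4 and 5, and calling cpu_run leaves 7 in both AC
and OUTR.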