
/diy/ - Do It Yourself



File: 49 KB, 500x595, IBM1401 Front Panel.jpg
No.1335417

CPU Architectures thread?
Post and discuss various architectures that have been designed over the decades and what their advantages or disadvantages were.

I'm thinking about making my own toy architecture and a 74xx series logic implementation of it.
So far I'm thinking of having 2 accumulators, 8 general purpose registers, and an instruction count register.
Other features would be a hardware timer/counter, an external interrupt pin, and some memory mapped I/O.
I haven't nailed down the instruction set quite yet, but I'd like to keep it fairly simple, probably no more than 40 instructions.
Additionally, I'm planning on giving it a front panel similar to pic related from the IBM 1401.

>> No.1335546

I'm considering making a redstone computer in minecraft, so I'm interested in this topic. Are instructions the same thing as op-codes?

>> No.1335547

>>1335546
yes

>> No.1335548

>>1335547
The most curious thing to me is how such simple things as gates and op-codes can be combined to create complex programs; I guess most complexity is math based. Is there any reason to build a computer rather than to just simulate one (besides aesthetics/cool factor)?

>> No.1335549

>>1335546
It'll be pretty fun. I never made a computer but I made some automation for my base before.

>> No.1335550

>>1335549
are you playing on 4craft? If I start working on a redstone computer I'll post a thread here. Maybe I'll get some cool insight.

>> No.1335583

>>1335417
>2 accumulators
>8 gpr
What the fuck!
Don't run before you can walk mate.
Download something like Quartus and simulate it first, you can do modules of digital logic and write tests for them. Also write Verilog if you want a test bench for parts of it.
Logic computers are exciting but try pricing it up, plus boards, power, LEDs, switches! Switches alone are fucking expensive man.

>> No.1335591

this is now a 4craft thread
1.12.2
4craft.us
get in here

>> No.1335613

>>1335583
I know. It's a hobby project, so I'm willing to drop a reasonable amount of cash on it.

And yes, I'm going to write the initial implementations in verilog.
I've already made lots of basic components like a decimal to 7 segment hex display, and a 4 bit adder. My major is Computer Engineering.
I figured making a CPU would be a fun way to tie everything I've done together.

>> No.1335645

>>1335613
They don't make you do a basic cpu on your course anyway?
I just don't know what you are going to do with two accumulators? Something specific planned? 8 registers? You know you have to stack that shit if you want interrupts. But you know, power to you I guess keep us posted.
Have you looked at the homebrew cpu ring? Some of them are wild.

>> No.1335787

>>1335645
It's only Intro to Digital Logic (a 200 level class). The closest we get is making a 4 bit ALU with control registers. Not quite a full blown processor.
Making a CPU is for the next course.

I don't really have anything planned for the two accumulators. I know it's a bit unconventional, but I thought it could be handy.
Is there anything weird about having 8 general purpose registers? I mean, AVRs have 32 of them, and the z80 has like 6.
And yes, I know I'll have to allocate space in the stack to store the registers if I want interrupts.
I was thinking about just reserving 0x00 through 0x0A for that purpose.

>> No.1336009

Ok, I think I've nailed down a preliminary ISA and the registers.
Obviously I'll change them if any of you have any good insights or ideas.

>Registers:
ACC - Accumulator, duh
ART - "Accumulator" 2. It's not quite a true accumulator, but it's reserved for multiplication and division operations.
AB
CD
EF
GH
O - Overflow flag
LOGIC - Boolean, holds the result of any logical operations
COUNT - this is the counter for the timer
S - Stack pointer

A through H are the 8 general purpose registers. They can be combined to hold a 16 bit value.

>Instructions:
Math operations:
Addition, subtraction, multiplication, division, modulo, root, power

Logic:
'Is equal to' (==), 'Less than' (<)

Bitwise Operators:
AND, OR, XOR, NOT, ROT
ROT will take a register and a signed integer, then bit shift (rotate) the chosen register according to the integer. e.g. ROT A -2 will rotate register A left two bits.

Interrupts:
HALT - halts the CPU until an interrupt happens.
EXT - whatever is after this will be executed when the external interrupt pin is triggered
COUNT - whatever is after this will be executed when the timer/counter reaches a certain threshold

Flow:
JMP - this will move the stack pointer to a specific location if the value in a specified register is true
JMPR - this will move the stack pointer relative to the current location if the value in a specified register is true
NOP - does nothing

Things I'm considering:
Shenzen I/O instructions, like TEQ, TLT, TCP.

>> No.1336034
File: 552 KB, 669x1052, 1519181867854.png

>>1335417
Time to play stationeers.

>> No.1336054

i never understood why you would design your own CPU and not just directly wire up the logic up for whatever specific thing you are doing

>> No.1336056

>>1336054
Because a CPU can do more than one thing.

I'm mainly doing it because I'm interested in how architectures work.

>> No.1336063

>>1336054
Generally speaking, using 0.1 million transistors for the processor and peripherals and 0.9 million transistors for memories results in a system which can do much more complex things (at lower speed) than using all the transistors for logic. Processor, peripherals and memories can all be standard chips, so you can do all kinds of stuff with relatively few building blocks. It is also much easier to change the software than to change the hardware. This is why processors are popular to begin with.

That said, the usual reasons for hobbyists are that they're doing it for fun and (officially) for learning purposes. A big part of the fun tends to be some self-imposed limitation like using only 7400 series stuff, or discrete transistors.

>> No.1336069
File: 163 KB, 1280x414, megaprocessor-panorama.jpg

One funny computer of that kind I've witnessed working: http://www.megaprocessor.com/

>> No.1336148

>>1335548
assuming you're not just some hipster with a box of 74xx's and pretensions of talent, speed is a big one

>>1335787
>Is there anything weird about having 8 general purpose registers? I mean, AVRs have 32 of them, and the z80 has like 6.
no, the 8051 has 8 general registers (in 4 banks, no less)
in your new architecture you should be considering the instruction word formats in conjunction with the facilities you're offering
>I was thinking about just reserving 0x00 through 0x0A for that purpose.
see above about four banks. iirc one CPU's fast IRQ mode (was it a 680x or ARM? I forget) has a separate set of some registers for IRQ handling
if you didn't allow interrupt nesting, a single shadow set of general regs, accumulator, SP, PC, etc. would be useful for quickly servicing interrupts

>>1336009
>multiplication, division, modulo, root, power
these are pretty advanced ops. it is probably not worth implementing the last three in dedicated hardware, but there's nothing much wrong with a remainder as a secondary result of a 16/8=8 division
>AND, OR, XOR, NOT, ROT
assuming a barrel shifter, ROT is a good cheap way to get bit rotations. otoh you may want other kinds of shifts too (arithmetic/logical shifts)
>stack pointer
don't you mean program counter?
>EXT, COUNT
>not using vectors
I see a source of errors and complexity here. might want to consider a "control register space" a la MIPS/ARM/x86 to store the vectors, if you don't want to indirect through main memory for them
>no LD, ST
oh wait, no main memory?

>>1336054
processors are just sequencers for logic, more or less. you trade away time to gain space by reusing gates

>> No.1336156

>>1335417
Write it up in VHDL and put it in an FPGA

>> No.1336158

>>1336054
furthermore, consider that most MP3 and other audio decoder chips, for ex., are microcoded DSP machines for exactly that reason

>>1336063
for me, anyway, the big part of the fun is exploring the
>implied implications
of some architectural choice, like an otherwise dull load-store machine with a 4-stage pipeline, supplied with an immediate-extension prefix word that works on the (possibly 0-bit long) immediate or displacement operand of the next instruction. e.g. LDI r1, #x has a 4-bit argument which becomes 16-bit when extended, LDW r2, [r1] gets a ±2kiB displacement when extended, ADD r2, r1 becomes a three-operand ADD instruction where r2 ← r2 + r1 + a -2048 to +2047 immediate, etc. and what would gcc think of all that
or, whether I gain speed by adding a complex math accelerator that can run steps of a large FFT etc. on its own little memory space
or, what might happen if the crippled PIC architecture were extended to 32 bits and 32 working registers and the way to and from main memory were through a handful of register-mapped bus masters with auto-increment controls etc
>using only 7400 series stuff, or discrete transistors
i fail to get it

>> No.1336264

>>1336148
Hmm, I'll look into vectors. I'll be honest, I don't entirely understand how those work yet (inside of the processor, that is).

And yes, I agree that multiplication, division, modulo, root, power are pretty advanced ops.
I'll most likely just reserve opcodes for those and implement them later on.
I'm also considering not implementing those on the processor itself and instead just leaving a way to attach a math co-processor.

>no LD, ST
>oh wait, no main memory?
Oops, forgot to list those.

>>1336156
I'll probably do that too, especially for simulating the design before I commit to ordering 80+ chips.

>> No.1336295

>>1336264
one common implementation is to jam the interrupt entry point into the PC during the next instruction start after the interrupt is recognized, also setting interrupt flags as desired. some processors (8051, AVR, others) will use a fixed entry point corresponding to the interrupt type, others (x86, 680x0) will read the entry point from a memory location corresponding to the interrupt type. the cleanup coming into and out of an interrupt gets more complicated along with the rest of the processor, but this seems like a simple enough machine that you can wing most of it
>multiplication, division
the trouble with these is that they take a lot of gates or a lot of time. u8*u8=u16 is easy enough to do with a couple of shift registers and some steering logic, if your accumulator, true to its name, has an adder built into it. if you felt like cheating a little you could use a couple of 64kx8 (E)EPROMs to store u8*u8=u16 multiplication tables, or larger ones to include signed variants. you might even consider using a few (E)EPROMs to store microcode for sequencing and control. instead of using the old windowed ceramic EPROMs, I suggest you use parallel flash
also, you might want to consider some form of carry handling, at least for addition and subtraction, and maybe a "rotate through carry" operation if you're feeling ambitious. it's essential for dealing with multi-word numbers. maybe the upper accumulator? (tip: a compare instruction is just a subtraction without writing the result back to a register)

>> No.1336675

>>1336295
I'm considering using microcode, although I'll admit it's a bit beyond me at the moment. I'm reading up on it from other people's implementations though.

I'm also trying to figure out what I want to do for address space.
Obviously 8 bit addresses are pretty limiting, so the next easiest option is 16 bit addresses, which gives me 64K to work with, but even that starts sounding limited once you factor in peripherals and all that.

>> No.1336820

This is a cool project Anon, keep it up.

>> No.1336856
File: 1.15 MB, 944x900, 1499562598854.png

>>1336675
the long and short of microcode: instead of using logic to decode instructions and drive control lines of such as logic units and internal bus enables in the correct sequence, you use the data outputs of a wide ROM, whose address bus = a small counter of cycles since the beginning of the instruction, concatenated with some or all of the bit pattern of the instruction, possibly concatenated with other inputs (IRQ status, carry flags, wait state input, etc). you write the microassembly code, in which you specify the data flows per cycle, and write a microassembler that maps microinstructions to bit patterns. you would probably allow some of the bits of the instruction word to directly control some parts of the processor as enabled by the microcode, eg register selection
thus, your design can perform fairly complex operations without a too complex instruction decoder or tight coupling to instruction encoding, and it's a lot easier to maintain in the face of a changing instruction set and possibly a changing implementation. just update the microassembler, reassemble microcode, reflash
>peripherals
don't usually take up that much I/O space. the MOS 6522 VIA only has sixteen 8-bit registers yet provides two 16-bit counters, two 8-bit GPIO ports, and a serial port. a typical SPI slave's I/O space is only four 8-bit registers, and master functionality adds just a couple of counters, a FF, and clock in/out
>16 bit address
is a good size for a design like this. paired 8-bit registers lead to that naturally. you can very easily use half of the address space for a 62256 32kx8 RAM (available but expensive in the 1980s) and divide the other half as you please between boot/other ROM and peripherals. a lot of 16-bit designs use a 74138 3-to-8 decoder to subdecode the mixed block addresses into individual chip selects
address sizes larger than one large-size register tend to introduce complications (bank-switching, segmentation, etc) you probably don't need right now

>> No.1337207

>>1336820
Thanks! Hopefully I can do something cool with it once I'm done, maybe some demoscene.
After all, I'll literally know it inside and out.

I'll also post plans (and an emulator) online so you guys can play with it too!

>>1336856
Huh, interesting.

Also, one of my goals with this project is to make an architecture that's fun to use.
I've heard that MIPS and 68k are pretty fun to program for, however my only experience is with the z80, so I can't really compare them.
What is it that makes for a nice architecture?
The instructions I came up with are just what I thought would be nice, but I don't really have a whole lot to compare it to.

>> No.1337259

>>1337207
having had a moment of pretensions of demoscene activity, imo 68k is fun to program because
1. its instruction set is highly orthogonal. any data operation instruction takes nearly any addressing mode on at least one of its operands, and it had a lot of complex addressing modes on the '000, even more on later members of the family
2. instruction cycle times were deterministic, and most earlier machines didn't have or need caches which would fog up that determinism
3. the machines they were found in often had some usefully buggy hardware that could be tortured into unspecified but cool behavior
anyway, I would add if I were you:
>a DJNZ instruction (pretty easy if you microcode)
>a stack pointer, and instructions to get and set it
>figure out how you'll get from a register content through the zero/negative tests to branch decisions
>flesh out the addressing modes for memory accesses
it would probably benefit you to read up on the 6502. it's similar to the Z80 but different, maybe a bit dumber. also on the 68000, which is probably way more complex than anyone should consider as their first processor project but might have some ideas worth stealing
if something for demoscene lovers is your aim, an HDL implementation would be really nice. almost everyone has an FPGA board knocking around in their desk drawers :^)
also important for scene fun is some sort of video output, which is a ball of wax possibly bigger than the processor! these days the baseline would probably be something like a dumb framebuffer in a private 32kB of RAM accessed from the CPU via bank switching, 640x400x16 fixed colors, and a VGA-analog output, which is fairly straightforward to implement as far as video goes. an easy way to make it more fun would be a horizontal blanking IRQ and user-accessible display counters, so people could go in and dick with the video control registers every line and also time their video memory accesses for minimal disruption and delay
also audio output

>> No.1337304

>>1337259
Good points.
And yes, there will be a HDL implementation. I'll be making one anyways so I don't have to test in hardware, which could get expensive.
The end goal is to have a hardware implementation though.

As for the video output, I probably won't include that in version 1. I'll probably save that for v2.0
Actually, I might just set up some double buffered ram and just let a PIC or Atmega read from that to do the video output, rather than making my own "GPU".
Another feature I'm considering for v2.0 is a blitter.
For vblank and hblank IRQ, I'll probably just put those on external interrupt pins and let the video peripheral handle issuing those.

For the stack pointer, I was just thinking about making that a register, so you can read and write to it like any other register.

>> No.1337322
File: 101 KB, 1648x481, 1518893373601.png

>>1337304
>let a PIC or ATmega
skip the PIC, go with the mega, it's 4x faster at manipulating the address lines. you'll still need some logic to shift bits unless you want to spend a byte per pixel (256kB, my math was off by a factor of 4 in previous post) on it, which would be unusual, and probably slow for an 8-bit processor to do anything fun with
you might want to price out discrete dual-port RAM in large sizes before you commit to that course. FPGAs are actually a really good fit for video and often have a lot of dual-port RAM spread across the array. either way you'll need to work out some bus arbitration/interfacing logic so that the main system can read/write video RAM without glitching the display scan. one cheap, cheezy and not very deterministic way is to load the next line into a FIFO ahead of time. cheaper and cheesier is to put the CPU in wait state until the h/v blanking intervals
another option is to use an LCD controller instead of a video output, some of which are respectably fast and present 8-bit microprocessor-compatible interfaces
>blitter
another good idea. packed 8bpp video actually helps here, since your blitter will never have to mask bits etc. something not far from a straight DMA engine could work here
>for v2.0
another good idea. the blitter will also compete with the video shifter and the CPU for access to video RAM, making for another slightly more complicated bus arbiter
>just put those on external interrupt pins
good idea. just ensure a user can get them somehow
>stack pointer a register
in general this is a good idea, but it may cost a lot of space in the opcode map, and as a general register doesn't allow for easy/fast push/pop operations as dedicated increment/decrement would. this might be a good time for you to sketch out opcode maps and formats, and make sure there's room to fit everything together in a way that will be relatively easy for an assembler to encode and for the processor to decode

>> No.1337343

>>1337322
>those prices
Jesus Christ. Did it skyrocket recently?
I was looking at it for a different project about a year ago and I could have sworn it was more like $15-$20.
Fuck that.
I might just use an LCD controller then.

I'll play around with the stack pointer implementation. I haven't done too much work on opcodes yet since I'm still trying to figure out what all I want to do.

One other thing I need to think about is how storage will work.
I'm used to programming for the Gameboy, where RAM and storage are one and the same, since it's all on the same address bus.

>> No.1337347

>>1336069
That's a big CPU

>> No.1337353

>>1337322
>>1337343
I might take a page from this guy's book for the video output.
https://hackaday.io/project/20781-gigatron-ttl-microcomputer
Just bitbang it out in software, initially at least.
Once I get some more experience under my belt I'll try actually making a discrete "GPU" and all the challenges that come with managing the shared RAM.

>> No.1337623

>>1337343
maybe Mouser's just high, maybe RoHS happened to them
the reason I suggest keeping your eye on opcodes is because it determines just how much room you have for doing things. if your opcode map isn't full you can more easily add room. extension opcodes kinda suck
check out LCD panels on ali with a built-in SSD1963 controller, if you want a reasonably priced output device without a lot of logic
one saving grace for low-bandwidth video applications is that you can use DRAM a bit more easily than you might think at first, as your video address counter usually eliminates the need for a separate refresh counter. just keep that in the back of your mind for v2.0
>One other thing I need to think about is how storage will work.
consider defining some "BIOS" calls in ROM so that you can defer those considerations for a minute. with a device-independent block layer, you can get blocks to and from whatever device you decide on later, whether it be SD cards in SPI mode, or NAND flash as in https://eng.umd.edu/~blj/CS-590.26/micron-tn2919.pdf
later still you could add a FAT16 layer to run on top of all that
>bitbang video
one concern with software driven video output is jitter, which becomes interesting if each instruction's microcode is a different length. one hybrid solution: divide the CPU clock by 32 or so and add a wait instruction or wait state to sync the instruction cycle to that divided clock. then read and discard a line worth of data from RAM, while the video hardware snoops the read data bus and puts the data on the video output (ZX81, iirc)

>> No.1337806

Oh! I just found this resource!
This looks like some good info.

https://en.wikibooks.org/wiki/Microprocessor_Design

>> No.1337915
File: 88 KB, 560x679, muh henny.jpg

>>1337806
it is good for an introduction to the concerns you will be dealing with as a computer architect. it is thin and not especially accurate on finer details (e.g. Cell processor description) but if you're especially intuitive it just might be enough inspiration
if you've got time (and the next C3 is months away, after all) there is literally no substitute for Pic related, even if you just skim over the quantitative parts and look at the examples of ways designers' problems are solved

>> No.1337919

>>1337915
Thanks! I'll give that a read too.

>> No.1338822

>>1336264

Multiplication and reciprocal are very advanced, sure. But so much uses them (plus multiply-accumulate) that you get a big boost from having it. As in, most matrix operations.

>> No.1338853

>>1335546
A basic calculator is pretty easy to make in minecraft, if you want to mess around with gates and make a proper computer with gui...etc, then I recommend gmod with wiremod installed

>> No.1338858

>>1337343

Get a Zynq 900X board. FPGA, ARM on board to control it (useful when verifying subsystems), great for a fuck ton of things. Dev boards aren't exactly cheap but $100-200 for the whole dev system setup isn't bad at all.

>> No.1338963

>>1338822
>so much uses them
>matrix operations are so common on small machines
Amdahl's Law, anon
besides a slow multiply won't help that much. see also: 68000, where the demo kiddies and no few compilers shifted and added instead
>floats
>on an 8/16 bit machine
top lel

>>1338858
>ARM on board
that's almost cheating

>> No.1339201

>>1338822
I agree that a matrix unit is quite handy, but I’m not going to implement one.
Doing so would probably triple the number of gates I’m using.
I feel that it’s just not worth the trouble.

>>1338858
I’ll give that a look.

>>1338963
Shifts can only do multiplication and division using powers of two, right?
I mean, I’m including shifts anyways.

>> No.1339219

>>1339201
depends on implementation.
if it's based on a full-size lookup table or fast logic, the result will return very quickly. if it's a microcoded multiplier that shifts and adds internally, it could take somewhat longer. consider the following two code snippets:
>x = x * 3;
vs.
>y = x;
>x = x + x;
>x = x + y;
it turns out the adds take about half the time on a 68000/68010 processor! early 68000 internally uses a microcoded shift and accumulate algorithm. (38 + 2 * (number of 1 bits in multiplier) = 42 vs. 4 + 8 + 8 = 20)
example 2:
>x = x * 127;
vs.
>y = x;
>x <<= 7;
>x -= y;
(38 + 2 * n = 52 vs. 4 + (8 + 2*n = 22) + 8 = 34)
I'm cheating a bit with these cycle counts, assuming unsigned 16x16=32 multiply which is the widest instruction available, unsigned 16x32=32 for adds/shifts. signed multiplication by 127 would be a 42 cycle op
aside: it was not uncommon back in the day to need to multiply by 40 or 160 to address rows in the frame buffer. 14 cycles to just look in a pre-calculated LUT maybe half a kilobyte long was not usually a big deal, but more time savings yet was achieved by so-called strength reduction; e.g. instead of
>for(i=0;i<200;i++) { int j=i*160; ... }
spend a register and do
>for(i=0,j=0;i<200;i++,j+=160) { ... }
if your processor multiplies using an 8x8=16 LUT and you have no barrel shifter, it might be cheaper in most cases to use the multiplier than to shift, with the possible exception of very small multipliers. if you have a barrel shifter and a shift+accumulate multiplier unit, shift+add is even more beneficial
division is a whole other circus, harder to do in hardware or software. there are algorithms that can divide numbers a couple or three bits at a time (look up SRT division for one idea) but they take up quite a few gates. you could do it yet faster with yet another LUT for division, eight bits at a time. it depends on how much hardware you want to throw at it, with the ghost of Amdahl watching over you always

>> No.1339230

>>1338963
I'm not talking a matrix unit for parallelization. I'm talking about increasing even single-thread performance. A dedicated multiplier or reciprocal function can unroll the loop and reduce the iterations required (eg if you use Newton's method for reciprocals). The nice thing is that sequencing the same step multiple times takes up silicon but not as much design effort.

>> No.1339269

Speaking of shifts and all that, which is generally more useful?
Circular (barrel) shifts, or arithmetic shifts?

>> No.1339292

>>1339230
doesn't a reciprocal function
>imply
a floating point unit, which doesn't seem all that useful in the case of an 8-bit CPU ffs? also, isn't all that application-dependent, and in any case could be implemented as a separate peripheral core with DMA and all that should it be found necessary to an application?

>>1339269
whynotboth.jpg
it's just another layer of muxing on the MSBs
for stuff like CRC calculation, bitstream/bitfield parsing, etc., a barrel shift would be essential to performance. for more mathematical applications, I would probably enjoy being able to divide by 2 in one machine cycle. so, as a general matter, probably the barrel shifter
it wouldn't be too hard to implement a sign-extension function in the register-to-register move anyway, even from a non-byte-aligned source size, which would also greatly help parsing bitstreams and bitfields

>> No.1339422

>>1339292
>doesn't a reciprocal function
>>imply
>a floating point unit, which doesn't seem all that useful in the case of an 8-bit CPU ffs?

It implies that you're implementing division as multiply by reciprocal.

>> No.1339613

>>1339422
which, in context, is topkek

>> No.1339965

>>1339613
>which, in context, is totally awesome

FTFY

What's the point of doing it if you can't do fun stuff? Why the obsession with reimplementing the 68000 if OP wants to try doing something of his own?

>> No.1340075

>>1339965
>let's skip the first system and go straight to second-system syndrome
>he thinks a fucking FPU is fun
>everything on 4chin should be trolling on some level
1/8 made me reply

>> No.1340100
File: 2.00 MB, 360x355, yes_this_looks_perfectly_normal_now_go_to_your_room.gif

>>1339965
>What's the point of doing it if you can't do fun stuff?

exactly. not every project has to be meaningful or have intrinsic value. If I want to spend two years making a lathe or a CPU that is inferior in every way to mass produced items, who gives a shit?

>> No.1340144
File: 25 KB, 492x428, 1506959368363.jpg

>>1340100
>not every project has to be meaningful or have intrinsic value

>> No.1340146

>>1338853
what about computercraft

>> No.1340215

OP here. Would you guys please stop arguing.

Yes FPUs and matrix operation units are useful.
No, I'm not going to include them in v1.0
I might not even include multiplication and division in v1.0

Also, speaking of multiplication and division: I gotta admit, as simple as using a LUT is, I don't think I want to use one.
I'm doing this as a hobby/learning project, and I feel that just using a LUT is counter to that goal.
I mean, you could use a LUT for everything if you wanted.
>Opcode decoder? LUT
>Multiplication and division? LUT
>Addition and Subtraction? LUT
>etc
I mean, creating a CPU purely from memory chips would be kind of a cool experiment, but not really what I want to do.
Also that's effectively just a really shitty FPGA.

>> No.1340289
File: 469 KB, 714x1000, 1470350480695.png

>>1335417
Is there any kind of free, easy to use simulators to build CPUs/computers with?

I once tried doing shit in minecraft, but it's such a hassle to build literally every single thing from scratch, and routing redstone is a cunt.

>> No.1340303

>>1340289
Modelsim is free, if your design has under 10,000 “wires”.

>> No.1340377
File: 101 KB, 842x872, 1500255025057.png

>>1340215
>Opcode decoder? LUT
uh, that's what microcode is, approximately
you could do what the 6502 designers did and use a PAL/GAL/CPLD-style AND-OR matrix to do your calculations, if not against the spirit of your effort. source your diodes or small-signal MOSFETs in family-size boxes
>Multiplication and division? LUT
either shift/accumulate (Booth's algorithm), probably easiest via microcode, or a Wallace or Dadda tree, both of which are kind of heavy on gates
note, you can emulate multiplication not terribly inefficiently in software with an extended 1-bit rotate on an 8-bit register, a conditional 16+16-bit accumulate (effectively a 1-bit multiply), and a 16+16-bit add.
>Addition and Subtraction? LUT
could, but obvs ridiculous, and really not a win on performance
polite suggestion, if you're not too keen on microcode, the data paths should probably be designed first. it's easier to add/subtract logic at this stage, and the data paths will constrain to some extent the design of the instruction set, instruction decoder, opcode formats, and also the gate count of the entire design

>>1340289
>Is there any kind of free, easy to use simulators to build CPUs/computers with?
Logisim is apparently pretty popular these days and has hierarchical design facilities to build your own blocks and reuse them
not necessarily easy, but there's always lerning2verilog, which has the advantage of being quite easy to realize into a form you can connect to real wires

>> No.1340699

>>1340215

I'm right there with you on LUTs but one way they might make sense is as placeholders. So you build out the design in broad strokes and then refactor the LUTs into logic.

The other way LUTs stand out is reprogrammability, which is a concern when you go to silicon and start testing.

And in fairness, real world hardware often uses LUTs as they save die space and cycles over calculating everything from scratch.

Sad you're not doing multiplication and reciprocals but good to know that it's off the menu.

>> No.1340703

>>1340377
>Logisim is apparently pretty popular these days

I thought Google killed Logisim after Carl left academia to work there a few years ago. Has there been any work since 2011? Did someone else take over the project?

>> No.1340733

>>1340377
>>1340703

I heard someone say to try Xyce and it's free and open source. How is that one?

>> No.1340747

>>1335417
https://numato.com/product/mimas-spartan-6-fpga-development-board

This is a good FPGA for the price and you don't need a JTAG cable to program it. Bought one 3 weeks ago. Delivered in 4 days. The Xilinx IDE, ISE, is free and very intuitive.

>> No.1340870

>>1340703
what needs development, exactly?

>>1340747
that is a bretty gud board for the price
is the on-board programmer compatible with iMPACT?

>> No.1340906
File: 79 KB, 508x600, tay_so_nice_no_red_lipstick.jpg

>>1340870
>bretty gud

nothing more cringe inducing than nerds trying to be cool

>> No.1340997

>>1340906
>https://numato.com/product/mimas-spartan-6-fpga-development-board

>...he says, sporting a tay-tay reaction image

>> No.1341043

>>1335548
>The most curious thing to me is how such simple things as gates and op-codes can be combined to create complex programs

LOL. Look up "Turing Machine" if you want to see something simple that can run any complex program.

>> No.1341045

>>1341043
but how will you know when it's finished?

>> No.1341161

>>1341045
By including a halt state.
When the program terminates, the machine will stop.
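A hedged Python sketch of the idea above: a tiny Turing machine with an explicit halt state. The example rule table (a bit-flipper that inverts 0/1 until it hits a blank) is invented for illustration:

```python
def run_tm(tape, rules, state="start", halt="HALT", max_steps=1000):
    """rules: (state, symbol) -> (write, move, next_state), move in {-1, +1}.
    The machine stops when it enters the halt state."""
    tape = dict(enumerate(tape))       # sparse tape, blank cell = '_'
    pos = 0
    for _ in range(max_steps):
        if state == halt:              # explicit halt state: machine stops
            break
        sym = tape.get(pos, "_")
        write, move, state = rules[(state, sym)]
        tape[pos] = write
        pos += move
    return "".join(tape[i] for i in sorted(tape)).strip("_")

# bit-flipper: invert 0/1 moving right, halt on the first blank
rules = {
    ("start", "0"): ("1", +1, "start"),
    ("start", "1"): ("0", +1, "start"),
    ("start", "_"): ("_", +1, "HALT"),
}
```

`run_tm("0110", rules)` walks the tape once and stops, which is the whole trick: a trivial mechanism, arbitrary programs in the rule table.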

>> No.1341278

>>1335546
>redstone computer in minecraft
I see Eloraam is back on Twitter but not on eloraam.com yet. Had some interesting stuff such as 65el02.

>> No.1341733

>>1340870
You can use it with iMPACT but there's a PIC connected to the USB to upload your firmware. On Linux, I used the Python script they give to upload and it works pretty well.

>> No.1341864
File: 312 KB, 1800x1200, 6502_Monster_A1_1.jpg

>>1335417
>CPU Architectures thread?
Sure! It has been rather stagnant the last 10 years, looking like a smelly pond, until RISC-V got there. Even so it is a bit too dogmatic and orthodox for my tastes. So why not give it a major new spin?
>Post and discuss various architectures that have been designed over the decades and what their advantages or disadvantages were.
Accumulator designs are my favourite, probably since I started with 6502. Now it is more like register files but that costs a lot in terms of bits in decoding instructions.

>I'm thinking about making my own toy architecture and a 74xx series logic implementation of it.
There are several TTL-designs out there, at least one you can log into. Have you looked at those?

>So far I'm thinking of having 2 accumulators, 8 general purpose registers, and an instruction count register.
This does not really make much sense. Accumulators are general purpose, are those a sub set of the 8 general purpose registers? 6809 had dual accumulators, 68k had 8 data registers and 8 address registers, the latter not being general purpose. What do you wish to achieve with your design?

>Other features would be a hardware timer/counter, an external interrupt pin, and some memory mapped I/O.
Timer/counter is more of a peripheral than a core feature, or do you want to bring it into the core?
External interrupt is normal though 1802 goes all the way. What are your preferences?

>I haven't nailed out the instruction set quite yet, but I'd like to keep it fairly simple, probably no more than 40 instructions.
The instruction set and the instruction set architecture (ISA) should be well thought out before you start wiring up chips. Or will you prototype in an FPGA?

>Additionally, I'm planning on giving it a front panel similar to pic related from the IBM 1401.
How about pic related?

>> No.1342175

>>1341864
>This does not really make much sense. Accumulators are general purpose
Accumulator, meaning it can be used as a destination for accumulation operations.
You won't be able to do that for the general purpose registers.

>What do you wish to achieve with your design?
Learn how CPUs work from the inside out, and then do shit with it.
It's not like I have some grand plan to take over the desktop market.

>The instruction set and the instruction set architecture (ISA) should be well thought of before you start wiring up chips.
No shit. I thought I'd get input before I finalize the design.
I'd hate to get 80% of the way through design and realize that I'd made it hell to program for by having a crappy ISA.

>> No.1342266

>>1341864
>Even so it is a bit too dogmatic and orthodox to my tastes.
mainly because, with few exceptions, processors these days are designed to be programmed by compilers, not humans
>Accumulators are general purpose
the 8051 politely smirks at your statement

>>1342175
if you want to integrate the timer/counter, I can think of an instruction that would make programming this machine more interesting: a WAIT instruction that waits for the timer/counter register value to equal an immediate or register operand before proceeding with execution, for super precision timing without having to NOP slide or other synchronization techniques
that said,
>Learn how CPUs work from the inside out
might be better done by implementing an existing ISA. if nothing else you'll not run short of test cases any time soon
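The proposed WAIT-on-timer instruction can be modeled in a toy Python simulator. This is a hedged sketch; the opcode names (WAIT, OUT) and the trace format are invented for illustration, and it assumes the target count is still ahead of the counter:

```python
def run(program):
    """Toy CPU loop: a free-running counter ticks once per cycle.
    WAIT stalls the instruction stream until counter == operand,
    giving cycle-exact timing without NOP slides."""
    counter = 0
    trace = []
    pc = 0
    while pc < len(program):
        op, arg = program[pc]
        if op == "WAIT" and counter != arg:
            counter += 1           # stall: burn a cycle, don't advance pc
            continue
        if op == "OUT":
            trace.append((counter, arg))   # record when the output fired
        counter += 1
        pc += 1
    return trace

# second pulse lands at a fixed cycle no matter what ran before the WAIT
prog = [("OUT", "A"), ("WAIT", 10), ("OUT", "B")]
</n>```

Here `run(prog)` emits "B" one cycle after the counter hits 10, regardless of how long the preceding code took.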

>> No.1342425

>>1342175
>You won't be able to do that for the general purpose registers.
It would appear then that it is not as general as the name suggests. My experiences are from 6502, DSP56300 and similar ISA philosophies, and the accumulator(s) in those tend to be the most general of the registers.

>Learn how CPUs work
That is perfectly fine. I have a similar project as well, just a little different focus.

>No shit. I thought I'd get input before I finalize the design.
Good to hear, I was not sure how open you were for inputs.

>>1342266
>the 8051 politely smirks at your statement
I did a fair bit of assembly programming some time ago but never had to do the 8051. Looking at the design I find the term "general register" a bit strained; the zero page on 6502 seems far more flexible. And I wonder why the 8051 was that successful, was it the built-in multiplier?

Indubitably I am quite biased here and it seems those who started on 6502 or 6809 are in a very different camp from those starting with Intel and Zilog.

>> No.1342609

>>1342425
The 51 can't use its accumulator for addressing, except when it can. Also, the decrement-and-jump instruction does not work with the accumulator, unless you refer to it via its address (it has a memory address). So the 51's accumulator isn't general purpose. Or so I think. It's been years since I last used the 8051 or its variants.

>why 8051 was that successful, was it the built-in multiplier?
It's a microcontroller instead of a microprocessor and compared to the other microcontrollers on the market at that time, it was quite nice. It had also shitloads of second sources, which was important in the olden times. Dunno how much the multiplier/divider mattered. At least Intel didn't make that much noise about it; instead they advertised the improved bit-twiddling capacities, which were more relevant to its intended use.

>> No.1342721

>>1342266
>if you want to integrate the timer/counter, I can think of an instruction that would make programming this machine more interesting: a WAIT instruction

Yeah, I was thinking of something like that.
I'm basing it in part on how timers work on the AVR.

>> No.1342730

>>1342425
I started in the great MOS-Motorola tradition, myself. I took a side trip into 8051 when I first encountered them, when a fast-food cow orker was going for his EE degree. I don't particularly like the '51 series either, like most everything else out of Intel it was a clusterfuck
>y 8051 tho
everything >>1342609 said. it was sort of the ARM of its day, as sourcing and variation went, including improved versions (only 4 cycles per basic instruction instead of 12!) with wider memory addressing capabilities. some crazy crackwhores over at Dallas Semiconductor even ported something like a JVM to one

>> No.1342900

>>1335591
Can I join with a pirated/cracked client?

>> No.1343004

>>1342900
it's dead
coty killed it last week for reasons unknown
~30 people were playin then he shoah'd the discord and fucked off

>> No.1343034

>>1335417
>So far I'm thinking of having 2 accumulators, 8 general purpose registers, and an instruction count register.
Judging by http://www.homebrewcpu.com/overview.htm which also was made in 74 series logic I guess the registers alone will take up a lot of space. The wiring alone will be scary.

>> No.1343043
File: 247 KB, 1273x720, 1470946489393.jpg

>>1335417
What is the absolute minimum instruction set needed for a functional CPU?

>> No.1343045

>>1343043
Depends on your idea of "functional" and "CPU".
https://en.wikipedia.org/wiki/One_instruction_set_computer
https://en.wikipedia.org/wiki/Zero_instruction_set_computer
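To make the one-instruction idea from the first link concrete, here is a hedged sketch of a SUBLEQ machine (the classic OISC): every instruction is "subtract and branch if the result is non-positive", and everything else is synthesized from it. The halt convention (negative branch target) and the sample program are illustrative choices:

```python
def subleq(mem, pc=0, max_steps=10_000):
    """One-instruction computer: mem[b] -= mem[a];
    jump to c if the result <= 0, else fall through."""
    mem = list(mem)
    for _ in range(max_steps):
        if pc < 0:                   # convention: negative address halts
            break
        a, b, c = mem[pc:pc + 3]
        mem[b] -= mem[a]
        pc = c if mem[b] <= 0 else pc + 3
    return mem

# one instruction that clears cell 6 (x -= x gives 0) and then halts
prog = [6, 6, -1, 0, 0, 0, 42]
```

Copies, adds, and unconditional jumps are all built from a few of these triples, which is why the machine is Turing-complete despite having a single opcode.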

>> No.1343047

>>1343045
Oh, and for more practical implementations:
https://en.wikipedia.org/wiki/Minimal_instruction_set_computer

>> No.1343316

>>1343043
If you use Brainfuck as a minimum, 8 instructions.
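Those 8 instructions really are the whole ISA; a hedged Python sketch of a Brainfuck interpreter shows how little machinery is needed (tape size and EOF behavior are conventional choices, not part of the language spec):

```python
def bf(code, inp=""):
    """Interpret Brainfuck: > < + - . , [ ] are the entire instruction set."""
    tape, ptr, out, i, it = [0] * 30000, 0, [], 0, iter(inp)
    jump, stack = {}, []
    for pos, ch in enumerate(code):      # precompute matching brackets
        if ch == "[":
            stack.append(pos)
        elif ch == "]":
            j = stack.pop()
            jump[j], jump[pos] = pos, j
    while i < len(code):
        c = code[i]
        if c == ">": ptr += 1
        elif c == "<": ptr -= 1
        elif c == "+": tape[ptr] = (tape[ptr] + 1) % 256
        elif c == "-": tape[ptr] = (tape[ptr] - 1) % 256
        elif c == ".": out.append(chr(tape[ptr]))
        elif c == ",": tape[ptr] = ord(next(it, "\0"))  # EOF reads as 0
        elif c == "[" and tape[ptr] == 0: i = jump[i]
        elif c == "]" and tape[ptr] != 0: i = jump[i]
        i += 1
    return "".join(out)
```

The loop brackets are the only "control flow hardware"; everything else is a one-cell ALU and a pointer.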

>> No.1343352

>>1342730
>improved versions (only 4 cycles per basic instruction instead of 12!
There's also the Silicon Labs' version of 8051 which uses one clock per instruction and runs at 100MHz. Even faster versions exist, but (afaik) they are available as IP only.

>>1343316
8X300 had 8 instructions and it was even quite popular. The instruction set was awful, but much less so than Brainfuck.
Maxim is still offering a series of microcontrollers which basically have just one instruction, move.

>> No.1343451

>>1343352
>8X300
I never knew that one. I thought 1802 was weird but this one is a strong competitor.

>> No.1343643

>>1343043
PDP-5 had six or seven instructions, plus a few weird ALU instructions that were combined as a bitfield. it's hardly minimum, but it's definitely lean

>>1343352
oddest duck I've seen yet
>1208 decoupling cap on board
kawaii

>>1343034
SOIC would be less than half the size. it's a shame those 16x4-bit TTL RAMs aren't around much anymore

>> No.1344194

>>1343043
4 bit processors have necessarily few instructions and those are selected to make it useful. Might be worth looking into. For instance this:
https://www.bigmessowires.com/nibbler/

>> No.1344428
File: 730 KB, 2868x1836, 1471855669705.jpg

>>1344194
>>1343643
This stuff is pretty cool.

I've always been interested in computers, but never been able to really wrap my head around how CPUs, machine code, and programming all actually work on the lowest level.

Are there any good youtube series or shit that can educate me on how it all works and the logic behind how they're designed?

How did you guys learn all this?

>> No.1344481

>>1344428
yes. Xilinx and Altera have instructional videos on how to use their ASIC/FPGA logic block design suites, but first you will need to familiarize yourself with the basics of logic gates and logic blocks.

machine code consists of distinct fields within each instruction. depending on how the format is defined, some of the bits are mode/configuration bits, and the rest form the instruction proper: the operator and the operand (what you're doing and what you're doing it to).

first, though, you need to decide which layer of the stack, from gates up through the microarchitecture to the instruction set, you want to learn about.
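As a hedged illustration of that operator/operand split, here is a made-up 8-bit instruction format in Python; the field widths and mnemonics are invented, not taken from any real machine:

```python
# hypothetical format: top 3 bits select the operation,
# low 5 bits are the operand (register number or small immediate)
OPS = {"LOAD": 0b000, "ADD": 0b001, "STORE": 0b010, "JMP": 0b011}

def encode(op, operand):
    assert 0 <= operand < 32           # operand must fit in 5 bits
    return (OPS[op] << 5) | operand    # pack operator + operand into a byte

def decode(byte):
    rev = {v: k for k, v in OPS.items()}
    return rev[byte >> 5], byte & 0x1F
```

The decoder in hardware is exactly this split done with wires and gates: some bits steer the ALU, the rest select a register or feed an immediate.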

>> No.1344489

>>1344428
I learned it over the course of about 35 years of self-study and a few years in industry, all starting in primary school and working my way out from LOGO. in particular, I got started programming in assembly language pretty early, first 6502, then 68000, then gotten involved in the demoscene via friends, which exposes you to a lot of neat tricks. also reading every little thing I could get my hands on, which is easier when whole system schematics and source code listings for ROMs are made available in reference manuals or on the interweb
the bible of CPU design is Hennessy & Patterson's _Computer Architecture: A Quantitative Approach_. I blew $80 on that on a lark and read it from front to back, and it was highly educational. their complement to CA:AQA, _Computer Organization & Design_, I have not yet read, but I did get a decent background in the "stuff around the processor" from experimentation and reading the travails of others building CPUs and similar, seeing the bad ideas and the good and developing a sense of taste
tl;dr: read literally everything related you can get your hands on. find a logic simulator and build some things in it

>> No.1344612

>>1344194
4004 had 46 instructions and that was a rather typical number for older 4b designs. Much newer EM6580 (advertised as a RISC processor, btw) has 72 instructions.

>> No.1344841

>>1344489
>in particular, I got started programming in assembly language pretty early, first 6502, then 68000,
Most excellent taste, anon.
>then gotten involved in the demoscene via friends, which exposes you to a lot of neat tricks.
True. Also the Woz made a few assembly programs well worth studying: one calculates e, a really clever design; the other is a virtual machine (Sweet-16) that is used to overcome lack of 16 bit pointers in 6502.

>>1344612
>Much newer EM6580 (advertised as a RISC processor, btw) has 72 instructions.
6502 has, I believe, 57 instructions, yet is supposedly not RISC since it has a lot of modes.

>> No.1345462

>>1344841
The 6502 had 16-bit pointers. Maybe you meant it didn't have 16-bit arithmetic.

>> No.1345681

>>1345462
>The 6502 had 16-bit pointers.
You are right in that 6502 had (ZP,X) and (ZP),Y so I was not precise enough there.

Sweet-16 made a lot of improvements on that in
- auto increment
- auto decrement
- easy pointer arithmetic

If 6502 could do (ZP), Y++ or (ZP++) it would address the main pointer problems and allow for the normal C construct n = *p++

6502 is OK if your tables or arrays are max 256 entries long. After that it is still possible but a lot more involved.

I always wondered why Sweet-16 was not turned into silicon, after all Apple were in close dialogue with Western Design Center.
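To make the pointer problem above concrete, here is a hedged Python model of the manual 16-bit bump a 6502 programmer writes for n = *p++; the addresses and values are made up for illustration:

```python
def deref_postinc(mem, zp):
    """Read through the 16-bit pointer stored at mem[zp]/mem[zp+1],
    then increment that pointer -- roughly what LDA (ZP),Y followed by
    INC zp / BNE skip / INC zp+1 does on a 6502, since there is no (ZP++)."""
    ptr = mem[zp] | (mem[zp + 1] << 8)   # assemble 16-bit pointer from ZP
    value = mem[ptr]                     # the LDA (ZP),Y part, with Y = 0
    mem[zp] = (mem[zp] + 1) & 0xFF       # INC low byte
    if mem[zp] == 0:                     # wrapped -> carry into high byte
        mem[zp + 1] = (mem[zp + 1] + 1) & 0xFF
    return value

mem = [0] * 65536
mem[0x10], mem[0x11] = 0xFF, 0x02        # zero-page pointer holds 0x02FF
mem[0x02FF] = 7
```

The page-crossing case (low byte wrapping from 0xFF to 0x00) is exactly where the extra instructions and cycles come from, and what auto-increment modes would have hidden.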

>> No.1345808

>>1336054
If you want to do application-specific stuff, might as well use an FPGA and model what you want it to do. Speed of hardware and easier (probably) than chaining up a bunch of logic chips. Though more expensive...

>> No.1345857

>>1345681
to be fair, many modern processors don't do auto-increment/decrement, with the exception of the stack pointer. much easier and faster, at least for a single-cycle machine, to have the user do a small-constant add or subtract than to add logic to the instruction decoder to steer pointers around to do that or to insert adders into each general register (Amdahl's Law at work, again)
>why not hardware
understand that, with Apple only a few years old at the time, it would have needed bit operations, among other things, to be a usable main processor, and it would have been a lot more expensive. recall that the meager 6502 was still a $20 or so part, even with the manufacturing technology improvements they had to offer

>>1345808
not necessarily... iCE40s aren't all that expensive, on the order of $4/kLUT

>> No.1345920

>>1345857
>to be fair, many modern processors don't do auto-increment/decrement, with the exception of the stack pointer.
I knew RISC-V didn't have auto-inc/dec but felt this was excessive orthodoxy on their part.
>much easier and faster, at least for a single-cycle machine, to have the user do a small-constant add or subtract
The job has to be done anyway and this way you also add code bloat. RISC, sure, but still makes no sense. Also with auto-inc/dec you gain a little bit of parallelism, especially with post inc/dec.
>than to add logic to the instruction decoder to steer pointers around to do that
That is another part of orthodoxy: to keep the decoder very, very simple.
>or to insert adders into each general register (Amdahl's Law at work, again).
That would not be necessary. The index register would in any case have to be transferred into the address register, and that is easily connected to an incrementer or decrementer. A full signed adder is overkill.
>understand that, just a few years before Apple came to be,
>>why not hardware
>it would have needed bit operations, among other things, to be a usable main processor,
I was not thinking of a hardware version of Sweet-16 as a replacement, for the most part 6502 did its job well.
>and it would have been a lot more expensive. recall that the meager 6502 was still a $20 or so part, even with the manufacturing technology improvements they had to offer
We are talking about 28 rather simple instructions that can be seen as an extension of what is already available as 8-bit instructions in the 6502. BTW the Wikipedia article on Sweet16 was recently improved; for instance most of the branch instructions are the same.

>> No.1346196
File: 63 KB, 600x450, 1502261324014.jpg

Could you make a functional computer with just a PIC microcontroller?

>> No.1346197

>>1345857
>to be fair, many modern processors don't do auto-increment/decrement, with the exception of the stack pointer
instruction pointer is incremented
ARM has auto-increment/decrement
i686/x86_64 have similar instructions
MSP430
AVR
etc

>> No.1346216

>>1346196
a PIC32, maybe
2/8 anime tiddies made me reply

>> No.1346231

>>1345681
>why Sweet-16 was not turned into silicon
but it was, minus the large register file which would have been quite an extravagance
>Development of the W65C816S commenced in 1982 after Bill Mensch, founder and CEO of WDC, as well as the designer of the 65C02 microprocessor, consulted with Apple Computer on a new version of the Apple II series of personal computers that would, among other things, have improved graphics and sound. Apple wanted an MPU that would be software compatible with the 6502 then in use in the Apple II but with the ability to address more memory, and to load and store 16 bit words.

>> No.1346339

>>1346231
>but it was, minus the large register file which would have been quite an extravagance
Sure?

RCA 1802 had a register file of 16 words plus an 8 bit accumulator back around the same time as 6502. It would not appear that extravagant.

Moreover:
R0 is the accumulator and could be implemented as an extension to the existing accumulator A.
R12 is the subroutine stack pointer and could be implemented as an extension to the existing stack pointer S.
R14 is status register, and is the same as the existing processor status with an extension of an (unused) byte.
R15 is the program counter, can be fully reused from the existing program counter PC

You might byte extend X and Y and reuse those as other registers. All in all you might get away with far fewer added registers than 16.

>> No.1346357

>>1346339
your theorizing is useless until you start talking actual, real transistors, or at least come back with some Verilog

>> No.1346401
File: 1.46 MB, 1600x1227, RCA_1802E_20x_top_c1_cwP036942_1600w.jpg

>>1346357
Google says 1802 had around 5000 transistors and 6502 about 3500. 1802 came out first as two chips, the bigger chip containing the register file.

>> No.1346434

>>1346357
Punching in Verilog before you have thought through your design is the path to a lot of pain.

>> No.1346436

>>1346401
the 1802 was also a bit on the clock-inefficient side, especially compared to the 6502, so probably had rather less functional hardware that had lots of opportunities during a machine cycle to be reused. also, I'm neglecting reuse of registers because SW16 in common usage was expected to preserve the 6502 registers, except for the PC, and also because a simple process such as the 6502 was built with might not easily accommodate long lines between the two relatively independent instruction decoders
a register file consisting of 1024 transistors, not counting the selection logic or any other aspect of SW16, would constitute a sizable addition to the processor (over 30%). silicon wasn't as super-purified back in the day and processes weren't as tightly managed, so double-digit percentages of bad chips were normal. (defective device rate is proportional to area * crystal defect density)
even if the ~1000 transistors for the expanded register file and additional logic for decoding them (100 or so) could be neglected, the point stands about the endeavor of designing and laying out a custom chip (manually, in those days) not being a particularly economic enterprise for a consumer products company not concerned with pushing performance envelopes, if merely in order to accelerate a BASIC interpreter that almost nobody ever used
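That proportionality between defect rate and area is usually written as the Poisson yield model, Y = exp(-A * D). A hedged sketch with illustrative numbers (neither the die area nor the defect density below is the 6502's actual figure):

```python
import math

def poisson_yield(area_cm2, defects_per_cm2):
    """Fraction of dice with zero fatal defects,
    under the Poisson yield model Y = exp(-A * D)."""
    return math.exp(-area_cm2 * defects_per_cm2)
```

At, say, 0.2 cm^2 and 5 defects/cm^2, only about 37% of dice survive, which is why a 30% area increase for an extra register file was a real cost, not a rounding error.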
fortunately we are free to reimplement this sweet little thing in FPGAs all day long

>>1346434
yep

>> No.1346478

>>1346436
>the 1802 was also a bit on the clock-inefficient side, especially compared to the 6502
I think it used bit serial logic. It is simple and needs fewer transistors supposedly but at the expense of speed.

>if merely in order to accelerate a BASIC interpreter that almost nobody ever used
I worked on an embedded system mainly running on BBC Basic with some assembly sprinkled over it. Nobody must know of this.

>> No.1346792

>>1346478
the original 8051 had a 12 clock machine cycle, and it didn't use bit-serial logic. probably more like a very simple sequencer that just enabled/disabled stages whose timing and steering was fixed relative to the machine cycle, which makes the decoder and sequencer simpler and smaller. the only processor I can think of that did single-bit processing is some ridiculous industrial controller out of Motorola whose type number I can't even remember

>> No.1346972

>>1346401
According to the Wikipedia transistor count list, the 6502 had 3500 transistors and the 1802 had 5000.
Apple II was released in 1977, and already the next year new designs passed 9000 transistors. Transistor count alone should not be a problem, especially as the 65C02 had even more.

>>1346436
>SW16 in common usage was expected to preserve the 6502 registers, except for the PC
Correct. Interfacing to SW16 was done through zero page addresses for the registers. That would be inefficient in a processor so reuse of A, X and Y would be simpler though different.

This was during the early days for 16 bit processors and 6502 sadly lost the race.

>fortunately we are free to reimplement this sweet little thing in FPGAs all day long
True. And I would like to.

>> No.1347255

>>1346972
>inefficient in a processor
>forgets that decoders are supposed to be complicated
it's time for you to shut the fuck up and design something

>> No.1347282

>>1347255
>>forgets that decoders are supposed to be complicated
What are you trying to say?
>it's time for you to shut the fuck up and design something
There is no reason to take advice from someone who cannot even work the shift key.