Systems Architecture
Machine Instructions
A computer’s fundamental building blocks give it the capability to perform simple operations like:
- Add two numbers
- Check if a number is zero
- Copy data from one memory location to another
Layers of abstraction are built upon these basic machine instructions, leading to more human-friendly ways of describing logic and ultimately building out complex systems, user interfaces, networking, etc…: everything that makes up a modern computer.
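As a rough illustration, a single high-level statement compiles down to a handful of these primitives. The assembly in the comment below is illustrative pseudo-assembly only, not the exact output of any particular compiler:

```c
#include <stdio.h>

int main(void) {
    int a = 2, b = 3;

    /* This one statement compiles to a few machine instructions,
       roughly (illustrative x86-style pseudo-assembly):
           mov eax, a     ; copy a from memory into a register
           add eax, b     ; add b to it
           mov sum, eax   ; copy the result back to memory */
    int sum = a + b;

    printf("%d\n", sum);
    return 0;
}
```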
Six-level Architecture
While it’s not always a simple delineation, it’s handy to think of most computer architectures as being roughly split into the following levels:
- Digital logic/hardware: Gates, registers, and combinational circuits built from boolean operations (AND, OR, etc…)
- Microarchitecture: Concrete registers and the ALU circuitry that carry out basic math operations
- Instruction set: A standardized set of operations supported by varied implementations of the microarchitecture level and the hardware
- Operating system: Orchestrates processes; manages memory, storage, and I/O devices; handles permissions
- Assembly language: Human-readable mnemonics that an assembler converts into architecture-specific instruction-set byte strings
- Problem space/user code: Application logic written in high-level languages
“Any operation performed by software can also be built directly into the hardware, preferably after it is sufficiently well understood.” - Structured Computer Organization (pg. 8)
Operating systems date back to the 1960s, when they were designed to replace much of the tedium performed by human operators, like loading interpreters and programs from cards into memory. Beyond simply automating tasks, the operating system became an abstraction layer on top of the ISA on which application logic could be built; this layer includes what we refer to as system calls. The operating system also introduced capabilities for scheduling workloads and remote access.
The microprogramming layer has historically been in flux, with instruction counts growing and shrinking depending on whether it was more performant or cost-effective to implement those instructions in hardware. For software engineers, though, how this microprogramming layer behaves is not terribly important; we tend to care about the abstractions at and above the ISA.
Due to problems with heat dissipation, it’s been difficult for personal computer clock speeds to break the 4 GHz threshold. Because of this, CPU manufacturers have invested in other ways to squeeze performance out of their chips, whether through clever branch prediction or by packing multiple processing cores into a single CPU die. To take advantage of multiple cores, however, programs must be written to handle the problems of parallelism.
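A minimal sketch of what that looks like in practice, using POSIX threads to split a sum across two threads (the workload here is hypothetical; thread overhead only pays off when the work is large enough):

```c
#include <pthread.h>
#include <stdio.h>

#define N 1000000

static long data[N];

struct range { int start, end; long sum; };

/* Each thread sums its own half of the array independently. */
static void *partial_sum(void *arg) {
    struct range *r = arg;
    for (int i = r->start; i < r->end; i++)
        r->sum += data[i];
    return NULL;
}

int main(void) {
    for (int i = 0; i < N; i++) data[i] = i;

    struct range halves[2] = { {0, N / 2, 0}, {N / 2, N, 0} };
    pthread_t tids[2];

    /* Run both halves in parallel on separate cores (if available). */
    for (int t = 0; t < 2; t++)
        pthread_create(&tids[t], NULL, partial_sum, &halves[t]);
    for (int t = 0; t < 2; t++)
        pthread_join(tids[t], NULL);

    printf("total: %ld\n", halves[0].sum + halves[1].sum);
    return 0;
}
```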
Digital Logic/Hardware Level
The Arithmetic Logic Unit is a multi-functional digital logic circuit composed of rudimentary computational pieces like boolean logic gates and full adders. It is the primary workhorse of a microprocessor.
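To make the full adder concrete, here is a sketch that mimics one in software: chaining four 1-bit adders produces a 4-bit ripple-carry adder, the same structure an ALU builds out of gates:

```c
#include <stdio.h>

/* A 1-bit full adder expressed as boolean logic:
   sum   = a XOR b XOR carry_in
   carry = majority(a, b, carry_in) */
static void full_adder(int a, int b, int cin, int *sum, int *cout) {
    *sum  = a ^ b ^ cin;
    *cout = (a & b) | (a & cin) | (b & cin);
}

int main(void) {
    /* Add two 4-bit numbers by chaining four full adders,
       carry rippling from one bit position to the next. */
    int x = 5, y = 3, carry = 0, result = 0;
    for (int i = 0; i < 4; i++) {
        int s;
        full_adder((x >> i) & 1, (y >> i) & 1, carry, &s, &carry);
        result |= s << i;
    }
    printf("%d + %d = %d (carry out %d)\n", x, y, result, carry);
    return 0;
}
```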
Instruction Set Architectures
x86
The x86 architecture refers to a class of compatible CPUs descended from the Intel 8086/8088 and carried forward through successors like the 80386 and 80486. It is the predominant architecture found in personal computers and server systems.
ARM
ARM is chiefly used in low-power, mobile, and embedded applications, including newer video game consoles; however, Apple’s M-series chips have notably pulled ARM into the personal computing space.
RISC-V
RISC-V is a relatively new open-source architecture with a rapidly growing ecosystem.
AVR
AVR is a common ISA used in microcontrollers, chiefly the ATmega line (used in the Arduino).
MIPS
MIPS is mostly used for educational purposes.
Pipelines, Instruction- and Processor-level Parallelism
The five-stage pipeline has the following components:
- Instruction fetch unit
- Instruction decode unit
- Operand fetch unit
- Instruction execution unit
- Write back unit
Processors can improve performance by using multiple pipelines or “superscalar” parallelism (multiple operations running on different parts of the CPU at once). Instruction-level parallelism commonly only helps by a factor of 5-10, but processor-level parallelism pushes beyond that.
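A rough feel for the distinction: hardware can overlap independent operations, but a dependency chain forces serial execution. This fragment is illustrative; the actual scheduling happens inside the CPU:

```c
/* Independent operations: the CPU can issue these to separate
   execution units and overlap their latency. */
double independent(double w, double x, double y, double z) {
    double a = w * 2.0;
    double b = x * 2.0;
    double c = y * 2.0;
    double d = z * 2.0;
    return a + b + c + d;
}

/* Dependent chain: each multiply needs the previous result,
   so the pipeline must run them one after another. */
double chained(double w, double x, double y, double z) {
    return ((w * x) * y) * z;
}
```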
Additional Terminology
- Superscalar: Single pipeline leads into many specialized instruction execution units that can operate in parallel. This way long-running memory fetches and stores don’t block operations.
- SIMD (Single Instruction Stream, Multiple Data Stream) is used by GPUs to run data in parallel through the same instruction over multiple processing units (see the sketch after this list).
- Vector processors are capable of operating on arrays of values held in vector registers.
- Multiprocessors are individual CPUs with shared and/or local memory.
- Multicomputers are networked computer systems that compute large-scale problems together.
- Direct Memory Access allows I/O devices to transfer data directly to and from memory without CPU involvement, triggering an interrupt on completion.
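CPUs expose the same SIMD idea through vector instructions. A minimal sketch using x86 SSE intrinsics (this assumes an x86 machine and a compiler shipping <xmmintrin.h>); a single add instruction operates on four floats at once:

```c
#include <stdio.h>
#include <xmmintrin.h>  /* SSE intrinsics, x86 only */

int main(void) {
    float a[4]   = {1.0f, 2.0f, 3.0f, 4.0f};
    float b[4]   = {10.0f, 20.0f, 30.0f, 40.0f};
    float out[4];

    /* One instruction adds all four lanes at once. */
    __m128 va = _mm_loadu_ps(a);
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(out, _mm_add_ps(va, vb));

    for (int i = 0; i < 4; i++)
        printf("%.1f ", out[i]);
    printf("\n");
    return 0;
}
```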
Operating System
The operating system provides a layer of abstraction over the ISA level to (among other things) simplify tedious tasks like:
File I/O
File I/O is a pretty grueling task through CPU instructions alone, so the OS provides an opinionated abstraction layer over these operations, establishing a standardized way to assemble bytes into named files.
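A minimal sketch of that abstraction on a POSIX system; the open/write/close calls hide driver commands, block allocation, buffering, and permission checks entirely:

```c
#include <fcntl.h>
#include <unistd.h>

int main(void) {
    /* The OS turns this into driver commands, block allocation,
       and permission checks -- none of which we see. */
    int fd = open("example.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return 1;

    const char msg[] = "hello, file system\n";
    write(fd, msg, sizeof msg - 1);
    close(fd);
    return 0;
}
```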
Virtual Memory and Paging
Direct memory management is another difficult problem, especially because operating systems allow timesharing between multiple processes which may compete for resources. Application code uses virtual memory, which is mapped (by a combination of hardware and OS software) to physical memory through either paging or segmentation. Either way, as memory becomes scarce or multiple virtual memory locations collide on the same physical address, pages are swapped on and off of disk.
Note: When resources are taxed, frequent swaps can lead to a performance degradation called thrashing.
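One place this machinery is directly visible to user code is mmap on POSIX systems: the kernel returns a virtual address immediately and only assigns a physical frame when the page is first touched. A sketch (MAP_ANONYMOUS is a widely supported extension on Linux and the BSDs):

```c
#include <stdio.h>
#include <sys/mman.h>

int main(void) {
    size_t len = 4096;  /* one typical page */

    /* Ask the kernel for an anonymous page. No physical frame is
       assigned until the first access triggers a page fault. */
    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return 1;

    p[0] = 'x';  /* first touch: the OS maps a physical frame here */
    printf("virtual address: %p\n", (void *)p);

    munmap(p, len);
    return 0;
}
```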
Process Synchronization and IPC
- Threads
- Message Queues
- Shared Memory
- Pipes
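As a sketch of one of these mechanisms, a POSIX pipe lets a parent and child process pass bytes through a kernel-managed channel:

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void) {
    int fds[2];               /* fds[0] = read end, fds[1] = write end */
    if (pipe(fds) < 0)
        return 1;

    if (fork() == 0) {        /* child: write a message and exit */
        close(fds[0]);
        const char msg[] = "ping";
        write(fds[1], msg, strlen(msg));
        _exit(0);
    }

    close(fds[1]);            /* parent: read what the child sent */
    char buf[16] = {0};
    read(fds[0], buf, sizeof buf - 1);
    printf("parent got: %s\n", buf);
    return 0;
}
```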
System Calls
Hardware (ALU, registers, memory, bus devices) is accessed by microcode and the instruction set. The operating system uses the instruction set to create a running metaprogram that orchestrates user code and everything else the OS needs to do. Commonly, the OS provides an API out to user-space code to allow it to do some of these things when it’s necessary and permitted. Additionally, these OS-level abstractions make it much simpler to make syscalls, as there are complex constraints on how parameters are supplied to the kernel. UNIX includes a great number of helper functions in its C library that perform syscalls on the caller’s behalf.
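For instance, on Linux the C library’s write() wrapper hides the kernel calling convention behind an ordinary function call; the raw syscall(2) interface below shows roughly what it wraps (a Linux-specific sketch):

```c
#define _GNU_SOURCE
#include <unistd.h>
#include <sys/syscall.h>

int main(void) {
    const char msg[] = "hello via the libc wrapper\n";

    /* The friendly libc wrapper... */
    write(STDOUT_FILENO, msg, sizeof msg - 1);

    /* ...and the same call made through the raw syscall interface,
       where we supply the syscall number and arguments ourselves. */
    const char raw[] = "hello via raw syscall\n";
    syscall(SYS_write, STDOUT_FILENO, raw, sizeof raw - 1);

    return 0;
}
```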
Common roles of system calls are:
- Process control
- File management
- Device management
- Information maintenance
- Communication
- Protection
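To ground each role, here is a sketch pairing one representative POSIX call with each category (the pairings are illustrative, not an official taxonomy; error handling is omitted for brevity):

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork();                    /* process control: create a child */
    if (pid == 0)
        _exit(0);                          /* process control: terminate      */

    int fd = open("demo.txt",
                  O_RDWR | O_CREAT, 0600); /* file management: create/open    */
    write(fd, "x", 1);                     /* device management: descriptors
                                              treat devices like files        */
    getpid();                              /* information: query the OS       */

    int fds[2];
    pipe(fds);                             /* communication: set up IPC       */

    chmod("demo.txt", 0400);               /* protection: change access bits  */

    close(fd);
    return 0;
}
```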