Data Engineering for Beginners – Introduction to Computer Systems
When I started my career, not having a Computer Science background felt like a huge disadvantage. This varies from person to person, but for me it was a difficult mental block to overcome.
This post is my humble attempt to reconstruct my journey as a Data Engineer in a structured way, starting from the foundations. Unlike my previous posts, some of these topics may seem very basic, but there are two reasons why I am including them here:
- Some of these were light bulb moments for me (when I started out) and helped me connect a lot of the dots.
- This post is mainly for folks who come from non-CS backgrounds. They might find this more helpful. If you are from a CS background, you can start from a later post where I’ll discuss more advanced concepts.
So let’s get started.
Computer Hardware Organization
Hardware organization of a computer (also known as a node) refers to the way the different components of a computer are connected and interact with each other.
The main components of a computer include:
- Central Processing Unit (CPU)
- Main memory (RAM)
- Secondary storage (e.g. hard disk, SSD)
- Input / Output devices (e.g. keyboard, mouse, screen)
These components are connected to each other through a system bus, which transfers data and control signals between the components.
The motherboard serves as the central hub, connecting all of the components to each other and allowing them to communicate.
- The CPU is the brain of the computer and performs most of the computations.
- RAM provides temporary storage for the data and instructions being processed by the CPU.
- Secondary storage provides long-term storage for data and programs.
- Input / Output devices allow the user to interact with the computer and receive output.
- A bus in a computer refers to a communication pathway that allows components to exchange data and control signals.
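To make these components a little more concrete, here is a small sketch that asks your own machine about some of them using only the Python standard library. The function name describe_node is my own choice for illustration. Note that the standard library has no portable call for the amount of RAM, so the sketch leaves that out rather than guessing.

```python
import os
import platform
import shutil


def describe_node():
    """Collect a few facts about the machine (node) this script runs on."""
    # Root of the current drive: "/" on Linux/macOS, e.g. "C:\\" on Windows.
    root = os.path.abspath(os.sep)
    disk = shutil.disk_usage(root)  # secondary storage
    return {
        "cpu_architecture": platform.machine(),  # e.g. x86_64 or arm64
        "logical_cpus": os.cpu_count(),          # may be None if unknown
        "disk_total_gb": round(disk.total / 1e9, 1),
        "disk_free_gb": round(disk.free / 1e9, 1),
    }


if __name__ == "__main__":
    for name, value in describe_node().items():
        print(f"{name}: {value}")
```

The values will differ from machine to machine, which is exactly the point: every node has the same kinds of components, just in different sizes.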
Components of a CPU
The major structural components of a Central Processing Unit (CPU) are:
- Control Unit (CU): The Control Unit manages the flow of data and instructions within the CPU, fetches instructions from memory, and decodes them into operations that can be performed by the ALU.
- Arithmetic Logic Unit (ALU): The ALU performs mathematical and logical operations on data.
- Registers: Registers are high-speed storage locations within the CPU that temporarily hold data and results of operations.
- Cache memory: Cache memory is a small, fast memory located close to the CPU that stores frequently used data to reduce the number of accesses to slower main memory.
Note that the CPU can only read data directly from cache or RAM. It cannot read directly from the disk, so data stored on disk must first be loaded into RAM before the CPU can work on it.
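The disk-to-RAM step is easy to overlook because everyday code hides it. The sketch below is a minimal illustration, assuming a small text file with one number per line (the file name numbers.txt and the function name are made up for this example). Reading the file copies its bytes from secondary storage into memory, and only then does the computation happen on the in-memory copy.

```python
import os
import tempfile


def sum_numbers_from_disk(path):
    """Load a file from disk into memory, then compute on the in-memory copy."""
    # Step 1: the operating system copies the file's contents from
    # secondary storage into RAM; `text` now lives in memory.
    with open(path, "r", encoding="utf-8") as f:
        text = f.read()
    # Step 2: the CPU works on the in-memory data, not on the disk itself.
    numbers = [int(line) for line in text.splitlines() if line.strip()]
    return sum(numbers)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "numbers.txt")  # hypothetical example file
        with open(path, "w", encoding="utf-8") as f:
            f.write("10\n20\n30\n")
        print(sum_numbers_from_disk(path))  # prints 60
```

This is also why file size matters so much in data engineering: a file that does not fit in RAM cannot simply be read in one go like this.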
Data and Instructions
It is worth noting that data and instructions play different roles in the system. The instructions tell the computer what to do, whereas the data is what the computer operates on.
Since a running program needs both, it is worth asking how they travel within the system.
Most systems today use the von Neumann architecture. Without going into details, its notable feature is that data and instructions are stored in the same memory and travel over the same path (bus). Because the CPU cannot fetch an instruction and transfer data over this shared path at the same time, the path limits how fast the CPU can work. This limitation is known as the von Neumann bottleneck.
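A toy simulation can make this idea easier to picture. The sketch below is not how any real CPU is built: the instruction names (LOAD, ADD, SUB, STORE, HALT) and the memory layout are invented for illustration. What it does show is the key point of the architecture: instructions and data sit in the same memory list, and the control loop reaches both through the same lookup.

```python
def alu(op, a, b):
    """Arithmetic Logic Unit: performs the actual calculation."""
    if op == "ADD":
        return a + b
    if op == "SUB":
        return a - b
    raise ValueError(f"unknown ALU operation: {op}")


def run(memory):
    """Control unit: fetch, decode, and execute instructions until HALT."""
    pc = 0   # program counter register: address of the next instruction
    acc = 0  # accumulator register: holds the intermediate result
    while True:
        opcode, operand = memory[pc]  # fetch an instruction from memory
        pc += 1
        if opcode == "LOAD":
            acc = memory[operand]     # read data from the same memory
        elif opcode in ("ADD", "SUB"):
            acc = alu(opcode, acc, memory[operand])
        elif opcode == "STORE":
            memory[operand] = acc     # write the result back to memory
        elif opcode == "HALT":
            return memory
        else:
            raise ValueError(f"unknown instruction: {opcode}")


# One shared memory: addresses 0-3 hold instructions, 4-6 hold data.
program = [
    ("LOAD", 4),     # 0: acc = memory[4]
    ("ADD", 5),      # 1: acc = acc + memory[5]
    ("STORE", 6),    # 2: memory[6] = acc
    ("HALT", None),  # 3: stop
    2,               # 4: data
    3,               # 5: data
    0,               # 6: the result goes here
]
run(program)
print(program[6])  # prints 5
```

Every pass through the loop touches memory at least once for the instruction and often a second time for the data, one access after the other. That back-and-forth over a single path is the bottleneck described above.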
Software Components
Now that we are familiar with the hardware components, we will look at the software components as well.
There are broadly three types of software components in a computer:
- Operating System
- Firmware
- Application Software
The Operating System (OS) is what allows a user to interact seamlessly with the computer. It also takes care of background functions such as resource management and interrupt handling. You can install the OS of your choice on a system.
Firmware comes pre-installed on a computer. It is the program that helps to boot (start) the computer: it loads the Operating System into memory, and the OS then completes the rest of the boot process.
Application Software consists of programs that we install on a computer to perform tasks such as watching a video, creating art, or converting files.
1) Stallings, William (2016). Computer Organization and Architecture (Tenth Edition). Pearson.