New programmers often express a desire to have a deeper understanding of how their program works. In this chapter we'll take a brief look at just what Go programs are doing behind the scenes.
Variables are an abstract concept which the Go compiler understands and translates into something a computer can actually use.
A computer is very complex, but its basic architecture - the von Neumann architecture - involves 3 components:
A central processing unit (CPU) which actually performs the instructions generated by the Go compiler
Various input / output devices (like your keyboard, monitor, hard drive, WiFi, etc...)
Read/write, random access memory (RAM) which stores both the data and instructions for your program
Main memory is a really, really big array of bytes. Imagine a giant post office with billions of mailboxes. Each one of those mailboxes has an address (Box #1, Box #2 ... Box #1,000,000,000) and each one of those mailboxes can contain a single byte - a byte being made up of 8 bits (or on/off flags) and so storing a number between 0 and 255. (255 is 2^8 - 1, since we start at 0.)
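As a quick check of that arithmetic, here is a minimal Go sketch that prints the largest value a byte can hold and the 8 on/off flags behind a few values. The helper name `bits` is made up for this illustration.

```go
package main

import (
	"fmt"
	"math"
)

// bits returns the 8 on/off flags of b as a string of 1s and 0s.
// (Illustrative helper, not part of the standard library.)
func bits(b byte) string {
	return fmt.Sprintf("%08b", b)
}

func main() {
	// A byte has 8 bits, so it can hold 2^8 = 256 distinct values: 0 through 255.
	fmt.Println(1<<8 - 1)      // 255
	fmt.Println(math.MaxUint8) // 255

	fmt.Println(bits(0))   // 00000000
	fmt.Println(bits(72))  // 01001000
	fmt.Println(bits(255)) // 11111111
}
```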
It turns out we can represent pretty much anything this way. In particular, we represent "Hello World" by converting each letter into a number and then assigning those numbers to a sequential series of boxes:
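A minimal Go sketch of that conversion follows. The helper `toBytes` and the starting box number 1,001 are assumptions made for illustration; real addresses are chosen by the Go runtime and the operating system.

```go
package main

import "fmt"

// toBytes returns the numeric byte values that make up s.
// (Illustrative helper; the conversion []byte(s) does the work.)
func toBytes(s string) []byte {
	return []byte(s)
}

func main() {
	for i, b := range toBytes("Hello World") {
		// Box numbers start at 1,001 purely to match the mailbox picture.
		fmt.Printf("Box #%d: %3d (%q)\n", 1001+i, b, b)
	}
}
```

Each line of output pairs one box with the number stored in it and the letter that number stands for.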
So the way a program works is that variables are translated into memory addresses (the box numbers above). This:
fmt.Println(x)
Means something like:
fmt.Println( [BOX #1,001 -> BOX #1,011] )
Indeed, before the advent of programming languages like Go, this is how programmers had to work with a computer. All that record keeping was tedious and error prone, and the use of named variables with well-defined types and scope rules makes it a whole lot easier to write software.
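Go still lets us peek at the box number behind a name. The sketch below is only an illustration: `&x` asks for the address of `x`, `*p` reads or writes the memory at that address, and `overwrite` is a helper invented for this example. The address printed will differ from run to run.

```go
package main

import "fmt"

// overwrite stores v in the memory that p points to.
// (Illustrative helper, not from the original text.)
func overwrite(p *string, v string) {
	*p = v
}

func main() {
	x := "Hello World"
	p := &x // p holds the address of the memory where x lives

	fmt.Printf("x lives at address %p\n", p)
	fmt.Println(*p) // reading through the address gives the value of x

	overwrite(p, "Goodbye World") // writing through the address changes x
	fmt.Println(x)
}
```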
The representation for Hello World we used above is an example of a type known as a string. In the next chapter we will discuss many different types.
In the above example, how does fmt.Println actually result in Hello World being displayed on the screen?
Well this is really complex, but here's the basic idea:
We know that Hello World is actually just a sequence of bytes stored at consecutive memory addresses (#1,001 - #1,011), so fmt.Println is given the starting address and a length.
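The "starting address and a length" idea can be made visible with the unsafe package. This is a sketch for curiosity only, assuming Go 1.20 or later (where `unsafe.StringData` and `unsafe.String` exist). The helpers `describe` and `rebuild` are invented here, and ordinary programs never need them.

```go
package main

import (
	"fmt"
	"unsafe"
)

// describe returns the address of the first byte of s and its length in bytes.
func describe(s string) (*byte, int) {
	return unsafe.StringData(s), len(s)
}

// rebuild turns a starting address and a length back into a string.
func rebuild(start *byte, n int) string {
	return unsafe.String(start, n)
}

func main() {
	start, n := describe("Hello World")
	fmt.Printf("starts at %p, length %d bytes\n", start, n)

	// The address and the length are enough to recover the text.
	fmt.Println(rebuild(start, n))
}
```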
If we look at the source code for fmt.Println (in the fmt package of Go's standard library), we see that it just calls fmt.Fprintln with os.Stdout.
os.Stdout
is a special file that represents the output of
the terminal and we can write to it as if it were any other file.
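Because fmt.Fprintln accepts any destination that can receive bytes, the same call works for the terminal and for other targets. A minimal sketch, in which `greet` and `captured` are illustrative helpers rather than part of the fmt package:

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"os"
)

// greet writes the greeting to any destination that accepts bytes.
func greet(w io.Writer) error {
	_, err := fmt.Fprintln(w, "Hello World")
	return err
}

// captured returns what greet writes, collected in memory instead of a file.
func captured() string {
	var buf bytes.Buffer
	if err := greet(&buf); err != nil {
		return ""
	}
	return buf.String()
}

func main() {
	// Writing to os.Stdout shows the text in the terminal,
	// much as fmt.Println("Hello World") would.
	if err := greet(os.Stdout); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}

	// The same function can fill an in-memory buffer.
	fmt.Printf("%q\n", captured()) // "Hello World\n"
}
```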
Writing to a file is done via a special mechanism known as a system call. System calls are implemented by our Operating System, and when we make one we are transferring control to the Operating System for a specific purpose. Like the programs we write, the Operating System is itself a large program.
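For the curious, Go's syscall package exposes the write system call more or less directly. The sketch below assumes a Unix-like system (`syscall.Write` has a different signature on Windows), and `writeDirect` is a name invented for this example. In everyday code you would use fmt or os instead.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// writeDirect asks the operating system to write msg to the file
// descriptor fd, bypassing the fmt and os layers.
// Assumes a Unix-like system.
func writeDirect(fd int, msg string) (int, error) {
	return syscall.Write(fd, []byte(msg))
}

func main() {
	// os.Stdout.Fd() is the descriptor behind standard output (conventionally 1).
	n, err := writeDirect(int(os.Stdout.Fd()), "Hello World\n")
	if err != nil {
		fmt.Fprintln(os.Stderr, "write failed:", err)
		return
	}
	fmt.Println("bytes written:", n)
}
```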
Through a convoluted set of processes, the character data we passed in (Hello World) is translated into a bitmap of pixel data. For example, H might become this:
That pixel data is then placed somewhere on a very large grid of pixel data. This large grid of data is sent to your display that then draws each pixel with the appropriate color.
Crucially that large grid is itself stored as a big sequence of bytes in memory somewhere. Each pixel is probably represented by 3 bytes: 1 byte for red, 1 byte for green and 1 byte for blue (RGB) and the combination of those primary colors allows us to make any color.
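To make the idea concrete, here is a toy sketch of a letter being drawn into a grid of RGB bytes. Everything in it is an assumption for illustration: the 5x7 shape of H, the 8x8 "screen" and the `draw` helper are all made up, and real font rendering is far more involved.

```go
package main

import "fmt"

// glyphH is a hypothetical 5x7 bitmap for the letter H: '#' is on, '.' is off.
var glyphH = []string{
	"#...#",
	"#...#",
	"#...#",
	"#####",
	"#...#",
	"#...#",
	"#...#",
}

const (
	width, height = 8, 8 // a tiny "screen"
	bytesPerPixel = 3    // red, green, blue
)

// draw copies the glyph into an RGB grid, white on black, at offset (x0, y0).
func draw(grid []byte, glyph []string, x0, y0 int) {
	for y, row := range glyph {
		for x, c := range row {
			if c != '#' {
				continue
			}
			// Pixels are stored row by row, 3 bytes each.
			i := ((y0+y)*width + (x0 + x)) * bytesPerPixel
			grid[i], grid[i+1], grid[i+2] = 255, 255, 255
		}
	}
}

func main() {
	grid := make([]byte, width*height*bytesPerPixel) // all zero: a black screen
	draw(grid, glyphH, 1, 0)
	fmt.Println(len(grid), "bytes for", width*height, "pixels")
}
```

Even this tiny 8x8 screen needs 192 bytes, which hints at why a full display takes up millions of them.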
Now, perhaps when you were first introduced to our Hello World program you had this question: "How exactly does a computer work anyway?", and you saw this chapter and thought: "Ah, here we go, he's going to answer my question", and now having seen a brief description you're left more confused and with more questions than when you started. You just wanted to know how fmt.Println worked, and now you want to know how stdout works, how system calls work, how text is converted into pixels, how your display renders pixels, etc... Haven't I just made the situation worse?
I think the key here is that although learning the intricate details of how computers work is useful, it is not necessary to know these things to program effectively.
A large, complex machine like a computer is made up of thousands of components and fundamentally each component works the same way: it takes in data, processes it, and sends it out again. That data is in binary form - 1s and 0s or ons and offs - which are usually grouped together as byte-sized (8-bit) pieces.
In order to understand one component it is only really necessary to understand the format of the input data it expects, how it processes that data and the format of the output data it generates.
In our example, as long as we store Hello World
in the format
I described above, we can trust that fmt.Println
will do the
right thing. In turn fmt.Println
can trust that
fmt.Fprintln
will do what it's supposed to do, and the OS
write system call will do what it's supposed to do, and the OS can trust
that whatever mechanism it uses to convert text data into pixels will do
what it's supposed to do and so-on down the line.
It would be impossible to write software without relying on abstractions. Any single program - when considered in its entirety - is simply far too complex to grasp exhaustively. Programming requires a great deal of knowledge (what we're covering in this book), but it also requires some degree of skill: productive programmers have learned the fine art of taking a complex system, breaking it down into easier-to-grasp sub-systems, whittling that collection of sub-systems down to only the ones that actually matter, and ignoring all the rest.
One of the most fascinating things about a computer is that it not only stores data (like the string Hello World) in main memory, it also stores instructions there.
Let's look at a basic mechanical machine: a pendulum clock.
A pendulum clock has a pendulum (a weight on a rod) which swings back and forth due to the force of gravity. A physics formula describes how often that will occur, with the key property being that the period of a swing (how long it takes) is mostly independent of the angle it starts at, as long as the swings are small.
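As an aside, the usual small-angle approximation for that formula is T = 2π·sqrt(L/g), where L is the length of the pendulum and g is gravitational acceleration. A minimal Go sketch, assuming g = 9.81 m/s² and using a helper name (`period`) chosen for this example:

```go
package main

import (
	"fmt"
	"math"
)

// period returns the approximate time in seconds for one full swing of a
// pendulum of the given length in meters, using the small-angle formula
// T = 2π·sqrt(L/g). Note that the starting angle does not appear in it.
func period(length float64) float64 {
	const g = 9.81 // gravitational acceleration in m/s², assumed value
	return 2 * math.Pi * math.Sqrt(length/g)
}

func main() {
	fmt.Printf("%.2f s\n", period(1.0))  // about 2.01 s
	fmt.Printf("%.2f s\n", period(0.25)) // about 1.00 s
}
```

Only the length changes the result, which is why a clock-maker adjusts a clock's speed by changing the length of the pendulum.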
Through the use of an escapement, the motion of the pendulum can be converted into fixed increments, and then, through various gears attached to ticking hands, those increments can be displayed on a human-readable clock face.
In a sense the program underlying a clock has the same 3 components all programs do:
The input is the motion of the pendulum.
The processing is performed by all of the various gears and weights.
The output is a visible clock-face that displays the current time.
This program is defined in the language of mechanical machines: the layout and configuration of mechanical components according to the laws of physics. When properly configured, the clock will show the correct time. So, in a sense, a clock-maker is a kind of programmer.
The problem with a clock, like most machines, is that in order to change the program one would need to take it apart and rearrange all of its components. You would need bigger or smaller gears, more or less weight, a longer rod or a different kind of face.
Early computers were programmed in a similar way: through the manipulation of tubes, switches and levers you could change what a computer would do when run.
But what if you could create a program that runs other programs?