computology.org

Programming

If you're quite clear about what computer programming is, what compilers and interpreters are, and which language you would like to learn, then why not take a look at the drop-down menu under "Programming" in the top menu to see if your preferred language is listed there.

Which Language should I Start with?

This may seem like a hard decision. First, we assume that if you have arrived here you are not looking to develop websites, because for that you want to look at the heading HTML & CSS.

In making your choice of language it is worth taking account of the following list of considerations.

You can take a look at some languages by selecting them from the dropdown list under "Programming" in the top menu or you can read a bit more about the origins of programming on this page.

Backgrounder

Here I am going to go through a little history that people who have grown up using home computers etc. have missed. I will fill in a bit so you can understand what programming and software are and how they developed. Don't try to learn what is written here but just read it for interest. You don't have to know it in order to start your programming but once you do start programming you might refer back to it from time to time.

A Program

Everyone is familiar with reading and writing, where we use a bunch of symbols called letters to compose words, sentences, paragraphs and even books. It is also possible to use symbols to write instructions to do things, for example: walk to shop, buy milk, walk home. This list of instructions is a simple program. The commands (or verbs) walk and buy are executed by a person. The person probably already knows (remembers the sub-programs for) how to walk and how to buy. One could, however, write the sub-programs for walking and for buying in terms of much simpler commands like step, look, and so on.
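
Just to give a flavour, here is a minimal sketch in Python of the shopping errand as a program built from sub-programs. The function names (walk, buy, run_errand) and the "simpler commands" inside them are my own inventions for illustration, nothing more.

```python
# Each sub-program is written in terms of simpler commands.
def walk(destination):
    """Sub-program for walking: returns the simpler steps it is made of."""
    return [f"step towards {destination}", f"arrive at {destination}"]


def buy(item):
    """Sub-program for buying: returns the simpler steps it is made of."""
    return [f"look for {item}", f"pay for {item}"]


def run_errand():
    """The main program: walk to shop, buy milk, walk home."""
    steps = []
    steps += walk("shop")
    steps += buy("milk")
    steps += walk("home")
    return steps


for step in run_errand():
    print(step)
```

The main program only mentions walk and buy; the detail of how to do each one lives in the sub-programs.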

The Jacquard Loom

[Image: Jacquard loom with punched cards]

Imagine building a machine that can execute instructions coded using some kind of symbols! Perhaps the first such machine was the Jacquard Loom, which executes instructions written on cards in a language of holes to weave patterns in the cloth it produces. Jacquard looms are programmable machines because the function they perform is not determined by the machine's hardware but by two symbols, "hole" and "no hole", coded on the cards. The writing on the cards we would now call "software" because it is flexible and can be changed easily. Just load a different stack of cards onto the loom and it produces a different pattern. It's just like running a different program on a computer.

The pattern of holes on the cards lifted or lowered the threads during the weaving process to create the pattern in the cloth. It is a very simple mechanism in this case, but the point is that the binary (two symbol) language of hole/nohole could be turned into a pattern in the cloth.
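
The idea of cards turning into cloth can be imitated in a few lines of Python. This is only a sketch: I am assuming that a "1" stands for a hole and shows up as a "#" in the cloth, and the function name weave and the card patterns are made up for the example.

```python
def weave(cards):
    """Turn each card (a string of 1s and 0s) into one row of 'cloth'.

    Assumption for this sketch: 1 = hole, drawn as '#'; 0 = no hole, drawn as '.'.
    """
    return ["".join("#" if bit == "1" else "." for bit in card) for card in cards]


# One "stack of cards" gives one pattern...
diamond_cards = ["00100", "01010", "10001", "01010", "00100"]
for row in weave(diamond_cards):
    print(row)

print()

# ...and a different stack gives a different pattern on the same "loom".
stripe_cards = ["10101", "10101", "10101"]
for row in weave(stripe_cards):
    print(row)
```

The weave function never changes; only the cards do, which is the whole point of software.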

Use of Binary Codes

The Jacquard Loom was one of the first machines to use a binary code. Electronic computers use a binary (two symbol) on/off based language, but usually rather than say on/off we say 1 or 0 and refer to a 1 or a 0 as a "bit".

Although today you don't need to understand binary to program a computer and we probably won't even mention binary in any program you write, an appreciation of how and why computers use binary, a technology from weaving looms, can be enlightening and can help you write better programs.

Whether we represent 0 and 1 by nohole/hole or by off/on isn't important. But with just two symbols, 1 and 0, what can you do?

In the decimal number system, you will remember, there are just ten symbols: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9. What can you do with those?

Well you can write bigger numbers than 9 by using the idea of "place value". So 4217 is understood to mean the number resulting from 4x10x10x10 + 2x10x10 + 1x10 + 7.

You can do the same sort of thing with binary (the 0/1 system), so 1101 is understood to mean the number resulting from 1x2x2x2 + 1x2x2 + 0x2 + 1, which in decimal is 13.

Any number you can represent in decimal you can also represent in binary so 4217 is 1000001111001.
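
If you want to check the place value arithmetic above, here is a small Python sketch. The function names from_binary and to_binary are my own; Python also has this built in as int("1101", 2) and bin(4217), but spelling it out shows the idea.

```python
def from_binary(digits):
    """Read a string of 0s and 1s using place value, left to right."""
    value = 0
    for digit in digits:
        value = value * 2 + int(digit)  # shift one place left, then add the new digit
    return value


def to_binary(number):
    """Write a non-negative whole number as a string of 0s and 1s."""
    if number == 0:
        return "0"
    digits = ""
    while number > 0:
        digits = str(number % 2) + digits  # the remainder is the next digit from the right
        number //= 2
    return digits


print(from_binary("1101"))  # 13
print(to_binary(4217))      # 1000001111001
```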

A computer can easily do any calculations you like using binary.

In modern computing, simply for convenience and as a matter of convention, we choose to group 1s and 0s together in groups of 8 bits and call that a byte. Using a byte it is possible to represent numbers from 0 to 255.

Current processors typically handle 64 bit numbers, that is, numbers which are 8x8 bits or 8 bytes long, but 1 byte is a good size for representing the symbols people need, like the letters of the alphabet.
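
Where does 255 come from? With 8 bits there are 2x2x2x2x2x2x2x2 = 256 different patterns, and since we start counting at 0 the largest is 255. A tiny Python sketch (the function name largest_unsigned is my own) shows the same sum for 8 and 64 bits, assuming plain non-negative numbers.

```python
def largest_unsigned(bits):
    """Largest non-negative number that fits in the given number of bits."""
    return 2 ** bits - 1


print(largest_unsigned(8))   # 255
print(largest_unsigned(64))  # 18446744073709551615
```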

The Terminal

[Image: Teleprinter]

Initially the telegraph used Morse code to transmit messages over long distances, but as machinery became more advanced this was largely replaced by a machine called a teleprinter, where you could press keys on the keyboard of a teleprinter in one place and have the text print out on a teleprinter in another place miles away. Of course the preference for communication was to use an on/off code, that is, binary.

Eventually ASCII became the standard coding system. It began as a 7 bit code and today is normally stored one character per 8 bit unit, or 1 byte. Every character is assigned its own code.

'A' is 01000001, which is equivalent to the decimal number 65. 'B' is 01000010, which is equivalent to the decimal number 66. 'a' is 01100001, which is equivalent to the decimal number 97. 'b' is 01100010, which is equivalent to the decimal number 98. I am sure you get the idea. A few special codes mean start a 'new line' or 'return' to the left of the paper or screen. Code 00000111 means ding the bell on the teleprinter to attract attention. Yes, there actually was a little bell built into the old teleprinters like the one in the picture.
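
You can look these codes up yourself. Here is a short Python sketch using the built-in ord function, which gives the code number of a character; the helper name ascii_bits is my own.

```python
def ascii_bits(character):
    """Return the code of a character as 8 binary digits."""
    return format(ord(character), "08b")


for character in "ABab":
    print(character, ascii_bits(character), ord(character))

# "\a" is how Python writes the bell character (code 7).
print("bell", ascii_bits("\a"), ord("\a"))
```

The first line printed is "A 01000001 65", matching the table above.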

Not surprisingly, computers initially combined the hole/nohole binary codes of punched card technology with the on/off electrical binary codes of teleprinter technology as a way to store text information.

[Images: Punched paper tape; Punched cards]

Even as late as 1980 there were still computers where users stored text and numbers as hole/nohole binary codes on punched cards and punched paper tape, and where users submitted programs and data to the computer room of the organisation as a roll of punched paper tape or a bunch of punched cards held together with an elastic band! What came back was your original submitted tape or cards, plus possibly some more tape or cards, or perhaps a printout with the results from the execution of your program on the data you submitted. By the way, the tape was about 1 inch wide and the common 80-column cards were about 3.25 x 7.4 inches (1 inch = 2.54cm).

If you were lucky you had a direct connection to the computer via a teleprinter like the one pictured above, where you could type commands and write programs, and where the results would print directly on your teleprinter. A teleprinter connected to a computer was often called a "computer terminal" or simply a "terminal".

The concept of the terminal is not out of date any more than the concept of the wheel is out of date and we still use terminals for programming though they are no longer electro-mechanical devices but rather simulations of those devices on our computer desktop. It's quite odd that the concept of terminals lives on in this way.

The Memory

In a typical computer there is a CPU (Central Processing Unit), which is an electronic device to execute instructions and process data. Both instructions and data are stored in the computer's memory.

A typical computer has;

A computer memory can remember a lot of binary 1s and 0s which, in Dynamic RAM, are actually physically stored by either putting an electric charge or putting no charge on an electronic component called a capacitor. You can go out and buy a capacitor if you want to and charge and discharge it. Of course modern memories have billions of components fabricated on a single chip of silicon and so can remember billions of binary bits on billions of capacitors.

Usually the memories are built with the binary bits grouped together as bytes (8 bits) or larger groupings of 16, 32 or 64 bit chunks often called "words".

Each memory word in a computer's memory is referenced by a unique number called its "address", which refers to its position in the memory. So the CPU can read or write any word that is stored at any given address. The programmer generally does not need to reference words in memory using addresses, because they will typically be programming in a high level language that allows them to see data in memory as numbers or strings of text, each of which can be referenced by a name chosen by the programmer.
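
To see how names can hide addresses, here is a toy model in Python. It is only a sketch: the 16 word "memory", the symbol_table and the store and load functions are all made up for illustration, and a real language does this bookkeeping in a far more sophisticated way.

```python
memory = [0] * 16    # 16 words of pretend memory, at addresses 0 to 15
symbol_table = {}    # name -> address, the bookkeeping the language does for you


def store(name, value):
    """Store a value under a name, picking the next free address if the name is new."""
    if name not in symbol_table:
        symbol_table[name] = len(symbol_table)
    memory[symbol_table[name]] = value


def load(name):
    """Fetch the value stored under a name."""
    return memory[symbol_table[name]]


store("age", 42)
store("year", 1980)
print(load("age"), "is stored at address", symbol_table["age"])
print(load("year"), "is stored at address", symbol_table["year"])
```

The programmer only ever writes "age" and "year"; the addresses 0 and 1 are worked out behind the scenes.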

The hard disk also has an addressing scheme for accessing blocks of binary bits. The programmer generally does not need to reference blocks on disk using addresses, because they will typically be programming in a high level language that allows them to see data on disk as files of data and folders, each of which can be referenced by a name chosen by the programmer or the computer user.

Booting, Machine Language and CPUs

I have already mentioned the CPU (Central Processor Unit) which is an electronic device to execute instructions and process data stored in the memory.

Normally, when the computer is switched on and the CPU first receives power, it reads a specific memory location (fixed by the manufacturer) to get the address where the first instruction of the first program it must execute is stored. Both that memory location and the first program are held in ROM (Read Only Memory). This first program is normally called the BIOS (Basic Input Output System). It deals with key presses on the keyboard and basic output to the display while the machine starts up. It then reads the boot sector of the hard disk to copy into RAM (or load) and execute another program responsible for loading and executing the Operating System, which is also on the hard disk, be that Linux, macOS or Windows. The Operating System consists of a number of programs which not only provide the user interface we see but much else besides.

The CPU only executes programs written in a specific "machine language" and stored as binary in its memory. The machine language is specific to the type of processor. An ARM processor and an x86 processor cannot execute each other's machine language.

Writing programs in machine language is hard. Everything has to be turned into binary codes and stored in the computer's memory for execution. Machine language programmers do have to worry about addresses in memory.

Once you have done a bit more you might want to understand the boot process better. You can read the page on the BIOS, the first program to run when the system starts.

Assembly Language

To make writing instructions for the CPU easier, humans found that it worked well to use letter codes for each command, for example ADC meaning ADd with Carry, MUL meaning MULtiply, MOV meaning MOVe or LDR meaning LoaD Register, and so on. (A register is a special memory location inside the CPU.)

This is called "Assembly Language". The assembly language program text file is processed by another (tool) program called an "Assembler" that translates the assembly code into machine code, but even then the programs have to be written for a specific processor. An ARM processor and an x86 processor still cannot execute each other's assembly language programs, and translation from one to the other would be very hard and inefficient. Assembly language corresponds directly to the machine language of the processor: in general, one assembly language command corresponds to one machine language command.
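
To show how mechanical the assembler's job is, here is a toy assembler sketched in Python. Please note the processor is imaginary: the opcode numbers in OPCODES and the function name assemble are invented for this example and do not belong to ARM, x86 or any real chip.

```python
# Hypothetical opcode numbers for an imaginary 8 bit processor.
OPCODES = {"MOV": 0x01, "ADD": 0x02, "MUL": 0x03, "HLT": 0xFF}


def assemble(source):
    """Translate lines like 'MOV 7' into a list of machine code byte values."""
    machine_code = []
    for line in source.splitlines():
        parts = line.split()
        if not parts:
            continue                                        # skip blank lines
        machine_code.append(OPCODES[parts[0]])              # one command, one opcode
        machine_code.extend(int(part) for part in parts[1:])  # then its numbers
    return machine_code


program = """
MOV 7
ADD 5
HLT
"""
print(assemble(program))  # [1, 7, 2, 5, 255]
```

Each letter code is simply looked up in a table, which is why one assembly command gives one machine command.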

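Here is an x86 (64 bit, Linux) assembly language program to display "Hello, world" on the screen:

    .global _start

    .text
    _start:
        # write(1, message, 13)
        mov     $1, %rax            # system call 1 is write
        mov     $1, %rdi            # file handle 1 is the screen (stdout)
        mov     $message, %rsi      # address of the text to output
        mov     $13, %rdx           # number of bytes
        syscall

        # exit(0)
        mov     $60, %rax           # system call 60 is exit
        xor     %rdi, %rdi          # return code 0
        syscall
    message:
        .ascii  "Hello, world\n"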

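Here is an ARM (32 bit, Linux) assembly language program to display "Hello, world" on the screen:

    .global _start

    .text
    _start:
        mov     r0, #1              @ file handle 1 is the screen (stdout)
        ldr     r1, =message        @ address of the text to output
        ldr     r2, =len            @ number of bytes
        mov     r7, #4              @ system call 4 is write
        swi     0

        mov     r0, #0              @ return code 0
        mov     r7, #1              @ system call 1 is exit
        swi     0

    .data
    message:
        .ascii  "Hello, world\n"
    len = .-message

They are written as similarly as possible! Can you spot the differences? By the way, I have not executed either program.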

High Level Languages

Writing programs in machine language means everything has to be turned into binary codes and stored in the computer's memory for execution. Writing programs in assembly language is easier, but the programs still have to be written for a specific processor type.

To overcome this problem "high level" languages were developed, where you can write the program in a standard way for all processors and where you can use names to refer to data stored in memory.

Here is a program to ask your name and say hello to you. It puts your name in the memory in a space that it calls A.

In the programming language C it looks like this:

    #include <stdio.h>

    int main(void)
    {
        char A[64] = "";                /* space in memory for the name */

        printf("What is your name? ");
        scanf("%63s", A);               /* read one word, at most 63 characters */
        printf("Hello %s.\n", A);
        return 0;
    }

In Ada it looks like this:

    with Ada.Text_IO; use Ada.Text_IO;

    procedure Hello is
       A : String (1 .. 64);   -- space in memory for the name
       L : Natural;            -- number of characters actually read
    begin
       Put_Line ("What is your name?");
       Get_Line (A, L);
       Put_Line ("Hello " & A (1 .. L) & ".");
    end Hello;

In Python it looks like this:

    A = input("What is your name? ")
    print("Hello " + A + ".")

But before you decide Python is for you, remember that Python is much slower than compiled languages: very roughly 1/20th the speed of C and 1/10th the speed of Ada, though the exact figures depend heavily on the task. There are also a good many reasons why I favour Ada over C/C++, even though it runs at roughly 1/2 the speed of C/C++. I cover this on the page Why Ada? on my site www.adaworks.it. I cannot give an objective view of Java from a speed point of view, but it lacks Ada's multi-tasking and Ada's track record in life-critical systems like air traffic control and railways. Rust combines a lot of good integrity features that are not in C/C++, but it is a new language compared to Ada and not ideal for beginners.

Compilers and Interpreters

High level languages like C/C++ and Ada use a (tool) program called a "compiler", which translates the program into the machine language of the computer so it can be executed. The program written in the high level language is known as the "source code", while the translation is known as the "executable" or "machine code". It is just a machine language program to be executed by the CPU.

High level languages like Python or JavaScript do not normally use a compiler in this way. They use another program called an "interpreter" which, rather than translating the program into the machine language of the computer, executes it directly. The interpreter is itself a machine code program that simulates a processor which executes the high level language as it is, no translation necessary. Some languages like Java compile the source code into something called "byte code" and then an interpreter (called the Java virtual machine) executes the byte code. So it compiles a bit and interprets a bit!
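
To make the idea of an interpreter less mysterious, here is a very small one sketched in Python. The little language it understands (SET, ADD and SHOW) is invented purely for this example, as is the function name interpret; real interpreters are vastly bigger but work on the same principle of reading a command and carrying it out straight away.

```python
def interpret(source):
    """Execute a toy language one line at a time, with no translation step.

    Commands (made up for this sketch):
      SET name number   store a number under a name
      ADD name number   add a number to what is stored
      SHOW name         report the stored value
    """
    variables = {}
    output = []
    for line in source.strip().splitlines():
        command, name, *rest = line.split()
        if command == "SET":
            variables[name] = int(rest[0])
        elif command == "ADD":
            variables[name] += int(rest[0])
        elif command == "SHOW":
            output.append(f"{name} = {variables[name]}")
        else:
            raise ValueError(f"unknown command: {command}")
    return output


program = """
SET total 10
ADD total 5
SHOW total
"""
print("\n".join(interpret(program)))  # total = 15
```

Notice that the toy program is never turned into machine code; the interpreter (itself a program) does the work on its behalf.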

Compiled programs usually run faster than interpreted programs but can take more memory and have to be compiled for the particular type of computer they are to run on.

When you sell a compiled program you don't give away the source code but only give customers the executable, which makes the program much harder to reverse engineer. Interpreted programs are easier to reverse engineer.

Program Files

In reality large programs can be composed of many sub-programs appearing in many files so the reality of compilation and interpretation can be a bit more complicated than described above because all the sub-programs in the various files have to be "linked" together by a program called a linker.

Often there are standard sub-programs that get used a lot and the files containing them are called "libraries".

The C program above tells the compiler to "include stdio" the standard input output library and the Ada program above tells the compiler to compile "with Ada.Text_IO" which is Ada's text input output library. Linking to these extra library files is required to display text on the screen, which is what these simple programs do.

I am not going to explain these programs here and you don't need to worry about what exactly they mean. Once you start a tutorial on your chosen language you will be able to get a better understanding. Here I just wanted to give you a flavour of things to come.

About Tools

Besides compilers and interpreters, programmers use many other "tool" programs to assist in the programming process itself. Essential is a "text editor" to edit the source program files, and if that has "syntax highlighting" to colour the commands of the particular language, all the better. It's just a sort of simple word processor.

Another tool program that is very useful is a "debugger", a program that lets you step through your program one instruction (line) at a time.

Programmers often use an IDE (Integrated Development Environment) which integrates an editor, debugger and various other tool programs as a single 'do it all' tool program.

Although IDEs can be great, they can be hard to set up if you haven't experienced using the tools independently before, so they are a mixed blessing on occasions.

Currently the internet is full of sites that allow you to compile/interpret programs online, but they are for short programs like those above, and if you want to do any significant project you will need a decent IDE. Take a look at the drop-down menu under "Programming" in the top menu to see if your preferred language is listed there. Click on it to see a page giving more details about it, possibly including where to download an IDE.

Lastly, remember JSA (Jolly Silly Acronyms). The computer world is full of them. Never take the acronyms too seriously; it's almost worth forgetting what they stand for and just learning them as words referring to something or other.

Fun and Frustration

Programming is what drives all the intelligent (and not so intelligent) machines you use each day, from your washing machine to your phone and your computer. There is nothing like the satisfaction of seeing your own program actually doing things, whether you're just starting or writing epic programs for the web or apps (just cool programs) for your smartphone or computer. But I can assure you there will be times when you have some days or, rarely I hope, a few weeks of frustration.

You are likely to have more of these at the beginning of your programming life than the end. Being able to get through such times is not just necessary to reach your goals in programming but also often to reach goals in life.

Programming isn't a linear learning process. It's one where your achievements accelerate as you progress; like a train, it starts slow.

Make friends with people who know more than you as well as people who know less. Join a group, a "maker space" or "fab-lab" where programmers hang out.

Don't bother people who know more than you until you have struggled with the problem for an hour or two yourself. Help people who know less than you by sharing your knowledge with them when they ask for it. Basically "be helped and be helpful".