Sunday 20 May 2012

Compiling Andromeda


Andromeda is an interesting project, but in order to see what we're up to, you might just want to look beyond the code and into the binary.
So far there's not that much going on, but the mere fact that we can boot, handle key-presses and actually have scrolling text on our screens is already quite a feat for first-time kernel programmers like us. Just imagine that the stuff going on in the background is a thousand times more epic than what you can see on screen.
This article will guide you through the build process for the Andromeda kernel project. We will start off by getting our tool chain in place. We won't go into detail on how the tools work by themselves. That's what Google is for.
Once we've got our tool chain in place, we'll go into basic compiling. With that behind us, we'll look at the options for providing compiler flags, and last but certainly not least, we'll go into booting the kernel.

Tool chain

To compile Andromeda, we will need the following:
  • gcc
  • nasm
  • GNU-Make
  • ld
  • git
The compiler, assembler, build tool, linker and version control system, respectively. For those who don't know what the different tasks of these tools are, here's a brief summary:

GCC

The compiler translates C code into binary object files. The C compiler doesn't know about the final layout of the binary image; all it knows how to do is translate C statements into machine instructions.

NASM

The assembler takes human-readable machine instructions and translates them into machine-readable ones. The output is surprisingly similar to that of the compiler.

Make

Make is the conductor of this small orchestra. It tells the compiler and assembler to translate their sources into binary object files, and once that is done it tells the linker to put it all together.

LD

The linker is the pasting tool. Up to this point our sources have been translated into several binary object files, but none of them is executable, because they contain references to one another which need to be resolved by linking them together first.

Git

Git, the stupid content tracker, is a quick and easy-to-use version control system that's used by projects such as the Linux kernel and Android. This will be used to get our code, and maybe even do some work on it. Who knows?!

Compiling

In order to build the kernel, we'll be issuing the make command in the src directory of the repository. This command reads the Makefile and, based on what we've got there, determines how to compile each source file.
Make takes some arguments. One of them is "-s". This silences make and keeps the terminal relatively clean: only warnings, errors and verbose messages will show up, making it easy to follow what's happening.
Another argument is the "-j n" option, in which n is the number of jobs with which you want to compile the kernel. To optimize the compilation for your system, it is generally best practice to take the number of cores (counting hyperthreaded ones if you have them) and double it.
For example, I have a Core 2 Duo with 2 cores, so my most optimized command is:
make -sj 4

Compiler flags

To activate or disable some functionality in the kernel, it is possible to add flags to the compilation. The easiest way to do this is to put the flags into the FLAGS variable you hand to make. To enable a feature, use the -D flag; to disable one, use the -U flag. So, for example, to compile the slab allocator into the kernel the following command can be used:
make FLAGS=-D\ SLAB
or
make FLAGS="-D SLAB"
Work is under way on a kernel configuration editor, so that not all flags will have to be looked up or remembered. For now, to find out which flags are available, run make usage.
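Inside the sources, such a flag is typically picked up with conditional compilation. Here's a hedged sketch of how a -D SLAB flag might be consumed; the function names (kmalloc, slab_alloc, slob_alloc) are hypothetical and not necessarily what the Andromeda sources use:

#include <stddef.h>

/* Sketch only: names are hypothetical. The point is that the code guarded by
 * #ifdef SLAB is only compiled in when the flag is passed with -D SLAB. */
void *slab_alloc(size_t size);
void *slob_alloc(size_t size);

void *kmalloc(size_t size)
{
#ifdef SLAB
        return slab_alloc(size);        /* slab allocator compiled in */
#else
        return slob_alloc(size);        /* otherwise fall back to the slob allocator */
#endif
}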

Booting

There are a couple of different ways to boot Andromeda. The first and easiest is through the make test command.
Make test builds the source tree (the FLAGS variable works with this target as well) and then tries to run it in a Qemu/KVM environment. Qemu has a pretty convenient option to emulate grub, which we take advantage of here. For those who don't know what Qemu is: it is a virtual machine manager, and KVM stands for kernel-based virtual machine (the commands can pretty much be used interchangeably).
Another way is to create a directory at /media/loop and download a floppy image from GitHub. First you build the kernel according to the instructions above, then you put the image into the src/scripts directory. Next we issue the ./updatefloppy.sh command, which will ask you for your sudo password (please install sudo for this, if you haven't already).
To recap, the options are:
make test
or
make
cd scripts
wget https://github.com/downloads/Andromeda-Kernel/andromeda/floppy.img
./updatefloppy.sh; kvm -fda floppy.img -m 16M

Thursday 3 May 2012

What is virtual memory?


Lately I have been working on the virtual memory system. Now most of you out there won’t know what virtual memory is, so here is a brief explanation.
Virtual memory is the mechanism used to make every task think it's the only task running on the system at any given time, unless communication through memory is desired, in which case the operating system makes that possible through a mechanism called shared memory.
That’s virtual memory in one full sentence. If you still don’t get it, that’s perfectly understandable. So here is an explanation which is a tiny bit longer. It explains the goals of virtual memory and explains different approaches to solving the very same problem.
When writing the code for any particular task on any system, it's a nuisance, to say the least, to have to take the possible existence of other tasks on that system into account. The sole role of the operating system is to help other tasks run, so what it does is try to give the entire memory space to each and every task.
While handing out the whole memory space won't be possible due to technical restrictions (the kernel itself must be somewhere in memory as well), quite a lot can be made available to user space (that's where user tasks run).
One of the ways to get this all to work is to have each task ask the kernel for more memory each time they need it, while keeping all the data available to all the other processes. Sometimes, on architectures which don’t support memory protection, this is the only way to go forward.
Another way is to have the processor translate the virtual addresses into physical ones while the kernel concerns itself with the allocation of physical pages (a page is a chunk of memory of a predefined size). When a process isn’t allowed to access a particular physical page, it is simply not mapped to any virtual address at the moment that task is running (again, besides the kernel, but this region of memory has other ways to protect itself).
The advantage of the latter approach is that in this model it isn't possible for one task to modify critical data or code in another task. When it is desired to have two tasks share a region of memory, that can be accomplished by mapping virtual pages in both tasks to the very same physical pages.
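To make that a bit more concrete, here's a toy sketch of the second approach, not the actual Andromeda code: each task gets its own table mapping virtual page numbers to physical frames, translation walks that table, and shared memory is simply two tables pointing at the same frame.

#include <stdint.h>

#define PAGE_SIZE 4096
#define NUM_PAGES 1024                  /* toy example: only 4 MiB of virtual space */
#define NOT_MAPPED ((uint32_t)-1)

struct task_pages {
        uint32_t frame[NUM_PAGES];      /* virtual page number -> physical frame */
};

/* Translate a virtual address for one task; NOT_MAPPED means a page fault. */
static uint32_t virt_to_phys(struct task_pages *task, uint32_t vaddr)
{
        uint32_t vpage = vaddr / PAGE_SIZE;
        if (vpage >= NUM_PAGES || task->frame[vpage] == NOT_MAPPED)
                return NOT_MAPPED;      /* this task isn't allowed to touch it */
        return task->frame[vpage] * PAGE_SIZE + vaddr % PAGE_SIZE;
}

/* Shared memory: map the very same physical frame into two tasks. */
static void share_frame(struct task_pages *a, struct task_pages *b,
                        uint32_t vpage, uint32_t frame)
{
        a->frame[vpage] = frame;
        b->frame[vpage] = frame;
}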

Thursday 19 April 2012

Printf flavours


I’ve been working on several things in the kernel lately. Among them are the slab allocator and the different flavours of the printf function (some of them).
I’ve now finished work (mostly) on these printf functions and I’m happy to present:
  1. sprintf,
  2. vsprintf,
  3. fprintf and
  4. vfprintf.
These functions have better formatting support than the current printf function. So in order to use these new formatting features one will have to make a string, format it using sprintf and then print it using printf.
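Concretely, that workaround looks something like this (a small sketch: the values and format string are made up, and printf and sprintf here are the kernel's own functions, assumed to behave like their standard C counterparts):

char buf[64];
unsigned int addr = 0xC0100000;         /* made-up values, just for the example */
int err = 14;

sprintf(buf, "page fault at %x, error %d\n", addr, err);  /* new formatting code */
printf("%s", buf);                                        /* old printf just prints the string */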
As soon as we have a better self cleaning buffer in the kernel, we’ll implement it in the vga-text driver and make a new printf function that uses fprintf to write to stdout (which is connected to the driver).
Also work on the slab allocator is progressing, albeit slowly. There’s a lot of thinking involved in building a slab allocator, and I’m hoping to do it well the first time. This means there’s even more thinking involved.
Also, this slab allocator will have to be used for page allocation, which means it comes with a couple of restrictions (we can't just assume we've got space everywhere outside of the text segment, and annoying details like that). So for now we're still using the slob (single list of blocks) allocator to do all the memory allocation for us, but we're looking forward to a functional slab allocator (probably done in a couple of weeks).
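For the curious: a slob really is just that, a single linked list of free blocks that gets searched on every allocation. A rough first-fit sketch, purely illustrative and not the Andromeda implementation:

#include <stddef.h>

struct slob_block {
        size_t size;                    /* usable bytes in this free block */
        struct slob_block *next;        /* next free block in the single list */
};

static struct slob_block *free_list;    /* head of the list of free blocks */

/* First fit: walk the list and unlink the first block that is big enough.
 * Block splitting and freeing are left out to keep the sketch short. */
static void *slob_alloc(size_t size)
{
        struct slob_block **prev = &free_list;
        struct slob_block *blk;

        for (blk = free_list; blk != NULL; prev = &blk->next, blk = blk->next) {
                if (blk->size >= size) {
                        *prev = blk->next;
                        return (void *)(blk + 1);
                }
        }
        return NULL;                    /* out of memory */
}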
That’s it for now!

Saturday 11 February 2012

OSdev resources

I know a couple of people who are interested in OS development but have no clue where to start, and even though I occasionally show them one or two of the sites I get my information from, this blog post is supposed to give them a better set of pointers to documentation.

I started programming in C using a good tutorial on cprogramming.com. Now I understand if this isn't enough for you, and one book I highly recommend is The C Programming Language by Kernighan and Ritchie. Those are the two men behind the language, and they've found a way to describe it in a simple and brief way.

Also something worth learning, although not absolutely necessary when joining an existing project, is a form of assembly language. I personally got started with The Art of Assembly, and honestly haven't found a better resource yet.

Now that we have some grasp of what it means to be a programmer, we can start thinking about a simple kernel. For that we'll go to a man called James Molloy. He's written a tutorial which is very easy to understand. It might not produce the most powerful and flexible kernel out there, but it'll get the job done for a first kernel. There's also this beautiful wiki on how to do some things not covered in the tutorial.

Another tutorial that may interest you is one that follows the Windows path a little more, as opposed to the Unix route used by James Molloy. I'm talking about the brokenthorn web book. This is one of the first tutorials I found on the subject, and although I didn't take its path, it has taught me quite a lot.

Saturday 4 February 2012

The VFS


The virtual file system (VFS) is one of the most important parts of the operating system. It handles communication with permanent storage. In our case the VFS doesn't handle the file operations themselves but requires the file system to give us function pointers to work with.

A file in our VFS is nothing more than a data structure with a pointer to the file system driver file, which itself is attached to one of the permanent storage devices. This might seem complex, but it provides us with a whole lot of possibilities, ones offered not only by our system but also by others like Linux and BSD.

The file system mounts still have to be written, though, so I can't go too deep into the implementation details of those yet.

One thing we already have working, though, is a special kind of file we call a buffer. When we open a file with the buffer's initialiser function, it loads all of its functions into the buffer's function pointers, which then allows us to read from, write to, seek in and close the buffer like it's a normal file somewhere on a disk. Internally it keeps all the data in the form of a tree, and when it's time to close the buffer, provided no other part of the application has it open, it will remove all of that data, which is quite unlike a normal file.
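To give an idea of the shape of such a file, here's a rough sketch of a file structure built around function pointers. The field and type names are made up for illustration; the real Andromeda definitions will differ:

#include <stddef.h>
#include <stdint.h>

struct vfile;

struct file_ops {
        int (*open) (struct vfile *file);
        int (*close)(struct vfile *file);
        int (*read) (struct vfile *file, void *buf, size_t num);
        int (*write)(struct vfile *file, const void *buf, size_t num);
        int (*seek) (struct vfile *file, int64_t offset);
};

struct vfile {
        struct file_ops ops;    /* filled in by the fs driver, or by the buffer initialiser */
        uint64_t index;         /* current position: 64 bits, hence the huge maximum size */
        void *private_data;     /* driver- or buffer-specific state (e.g. the buffer's tree) */
};

/* Reading simply dispatches through whatever the driver installed. */
static int vfs_read(struct vfile *file, void *buf, size_t num)
{
        return file->ops.read(file, buf, num);
}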

Because of the index variable, which is 64 bits, the buffer supports files of up to 16 EiB (roughly 16,000 PiB). If you don't know what that means, well, let's just say that it's plenty for the coming decade if not more (I personally have trouble filling my 320 GiB, and that's roughly 0.000002% of what the buffer can address …).

The driver model


Since the beginning of this year we've been busy in the Andromeda team. One of the things we've been working on is the driver model. Our design is fairly simple, and I think I can explain it, so here we go.
The device model consists of a tree of devices, starting with the root device. This is a virtual device to which everything is attached in some form. Attached to it are, for example, the CPUs, memory, the PCI bus and some virtual buses.

The reason for choosing a tree instead of a plain list of devices is simple. When the system is to shut down or suspend, we want to disable the devices first, then the buses, and last the power supply (if necessary). If we were to walk a plain list, things would shut down in arbitrary order, which might get interesting: the PCI bus could, for example, be shut down before the graphics card, so the card never receives the shut-down signal.

The reason the CPUs, memory and PCI bus are attached to the root device is simply that they sit at the head of the system. The reason for the virtual buses might not be so obvious yet.
There are two virtual buses: one for virtual devices, which reside in memory, don't really have to be suspended and can't have physical hardware attached to them, and one for legacy devices, such as the VGA and PS/2 controllers, to name a few.

Devices in the model are nothing more than data structures with a file pointer, an open function, a name/unique identifier, a driver pointer and a void pointer for special data structures. When interaction with a device is needed, the device file can be opened and written to. The device might respond, and by reading from the device file the answer can be retrieved.
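Roughly, that boils down to something like the sketch below. The names are made up for illustration (the real structures live in the Andromeda headers), but it shows both the fields listed above and why the tree pays off at shutdown time:

struct device;
struct vfile;

struct driver {
        int (*suspend)(struct device *dev);     /* called when shutting down or suspending */
};

struct device {
        char name[64];                          /* name / unique identifier */
        struct vfile *dev_file;                 /* the device file used for interaction */
        int (*open)(struct device *dev);        /* open function */
        struct driver *driver;                  /* the driver handling this device */
        void *device_data;                      /* special, device-specific data */

        struct device *parent;                  /* the bus this device hangs off */
        struct device *children;                /* first child, e.g. devices on this bus */
        struct device *next;                    /* next sibling on the same bus */
};

/* Walk the tree depth-first: the children (devices) go down before their bus does. */
static void shutdown_tree(struct device *dev)
{
        struct device *child;

        for (child = dev->children; child != NULL; child = child->next)
                shutdown_tree(child);
        if (dev->driver && dev->driver->suspend)
                dev->driver->suspend(dev);
}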

In the case of a permanent storage device this file can hold pointers to partitions which can in turn hold files which can then be mounted to the VFS.

What we've been up to

I know, it's been a long while, and quite a bit has happened since.

A summary of what's been done:

  1. A driver model has been designed and implemented and is nearing completion (aside from the actual implementation of drivers).
  2. A virtual file system has been designed and implementations are on the way.
  3. Network stack development is currently being done by Michel.
  4. Ideas have formed on a new memory allocation algorithm.
Now this doesn't sound like much, but for two people it's quite a bit of work, considering that when designing core kernel features we prefer to guarantee quality.

In following posts we'll go deeper into the separate items.