Friday, 24 June 2011

What's going on here?

If you have been following the code lately, you will have seen several things changing quite dramatically. One of them is that there is a new contributor, named Steven van der Schoot. His name was already in the CREDITS file, but his role has changed considerably.

Previously, all he was known for was the printhex function, which couldn't do much more than print out numbers in hex. I thought that was a bit messy, as I also wanted something that could do binary, decimal and octal, so I just hacked them together into a single function. (If you know the maths it's not that difficult, I admit.)

Now he has more time because he has passed his final secondary school exams, for which I wish to congratulate him.
But he has decided to do several things with his spare time. One of them is Minecraft. The other? The VGA driver for the platform.

Yes, we already had text output, and that probably won't change in the near future, but what he is enabling here is much more than basic text printing. Basically, the bootloader configures the machine to use VGA text mode, which lets you write characters to a location somewhere in memory. Now that a driver is coming up, we will still be writing to this buffer, albeit at another location; only god (and my paging code) knows where in physical memory. But once this is done, he can start focusing on drawing shapes to the screen, such as rectangles, triangles and circles.
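In VGA text mode, that buffer is an array of 16-bit cells (a character byte plus an attribute byte), conventionally at physical address 0xB8000. A minimal sketch in C follows; the function name and the buffer-pointer parameter are my own, chosen so the routine can be tried against a plain array instead of real video memory:

```c
#include <stdint.h>

#define VGA_COLS 80
#define VGA_ROWS 25

/* Each cell in the VGA text buffer is two bytes: the low byte is
 * the ASCII character, the high byte is the colour attribute
 * (foreground | background << 4). */
static void vga_putc(volatile uint16_t *buf, int row, int col,
                     char c, uint8_t attr)
{
    buf[row * VGA_COLS + col] = (uint16_t)((uint8_t)c | (attr << 8));
}

/* On real hardware the buffer sits at physical 0xB8000 (or wherever
 * the paging code ends up mapping it):
 *   volatile uint16_t *vga = (volatile uint16_t *)0xB8000;
 *   vga_putc(vga, 0, 0, 'A', 0x07);   // light grey on black
 */
```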

Now that sounds fantastic and all, we've got graphics working, or ... do we?

Technically, yes, but don't get too impressed. I still don't know much about resolutions, but for all I know VGA is pretty limited, not being able to go over 640x480 at, what was it, 256 or 16 colours.

Now this is nice and all, but it doesn't really have my priority. I'm more worried about getting the source code into a new layout, one that lets me distinguish between the compressed image (or nano kernel; yes, I'm changing the terminology here) and the core image on the basis of path, in contrast to a complex, hard-to-notice, easy-to-forget structure of ifdefs.

This is actually the reason why you see these huge spikes in the impact graphs on github. It basically boils down to the entire project being assigned a new location, which generates a huge impact, while in reality, nothing changes in the content. It's a mere process of copying and removing.

I am going to have to do some work on the makefiles though, and I probably won't like doing so. The makefiles need to become more scalable, as they aren't exactly scalable as of now.

But is your elf code done?

Well, no.
I needed to get my head away from it for a second. Making the project more scalable had a higher priority than the elf code, despite the elf code being half done. This is due to the fact that I'm no longer the sole developer of this project, and I need to make it possible to give the other developer(s) all the space they need.

I still don't know where I should put Steven's drivers, for example. The drivers could be necessary for the nano kernel, since the drivers need to be installed from unreal mode, and the core image can't go there any more (unless you have 3.5 GiB or more). I could also use v8086 mode, but I feel like that could pose security issues.

I still haven't figured it out yet, but I'm working on it. Fortunately with a new developer.

Thursday, 16 June 2011

Is it me, or has it been a while?

It's not you. It indeed has been a while and it is all due to the fact that this week is exam week at school. Those things require my attention a little more than the kernel.

So you've done nothing to the kernel since last update?

Well, that's not entirely true. I've been reading the Elf32 specifications and have managed to pull some more info from the Elf headers. Among that info is a segment with a type number that doesn't appear to make sense according to the specifications.
However, after doing some googling I found out that Linux uses its own types. One of those marks a stack segment, in which the kernel is asked to reserve room for the stack. In normal circumstances that's nice, but since my kernel sets up its own stack I'm just going to ignore this segment.
As for the rest?
Well, I've been redoing my printhex function so that it now prints numbers in every base in the range from 2 to 36. All printf has to do now is call this printnum function with the number (named index), the base, a boolean called signed and a boolean for capital characters.
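Such a printnum might look roughly like this in C. The parameter names (`is_signed`, `uppercase`) and the exact interface are my guesses, not the project's actual signature; `signed` itself is a reserved word in C, so it can't literally be a parameter name:

```c
#include <stdint.h>

/* Convert `index` to a string in any base from 2 to 36.
 * `is_signed` treats the value as two's-complement; `uppercase`
 * selects 'A'-'Z' over 'a'-'z' for digits above 9.
 * Returns the number of characters written (excluding the NUL). */
static int printnum(char *out, uint32_t index, int base,
                    int is_signed, int uppercase)
{
    const char *digits = uppercase ? "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
                                   : "0123456789abcdefghijklmnopqrstuvwxyz";
    char tmp[33];              /* enough for 32 bits in base 2 */
    int i = 0, n = 0, neg = 0;

    if (base < 2 || base > 36)
        return 0;
    if (is_signed && (int32_t)index < 0) {
        neg = 1;
        index = 0u - index;    /* unsigned negate, safe for INT32_MIN */
    }
    do {                       /* emit digits least-significant first */
        tmp[i++] = digits[index % (uint32_t)base];
        index /= (uint32_t)base;
    } while (index != 0);
    if (neg)
        out[n++] = '-';
    while (i > 0)              /* reverse into the output buffer */
        out[n++] = tmp[--i];
    out[n] = '\0';
    return n;
}
```

A printf implementation can then route %x, %o, %b and %d through this one routine with different base arguments.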

So I guess that's all.

Well, not entirely. I have also changed the layout of the Orion project itself. Andromeda is no longer merged into the Orion source tree. It is instead imported as a sub-module. This means that when you clone the Orion source tree, all you get is a .gitmodules file and a bunch of empty directories.

You then have to do git submodule init; git submodule update; to get the sources from the sub-projects.
Now that's all.

And what can we expect for the future?

Well, school is about to break up for the summer, which gives me a lot of spare time. I'll be spending some of that spare time in the woods with my dog, and some of that time in France. I will also be spending some of that time on the project, so don't worry about that.

I think the summer break will give me more time to work on the project, so you might see the impact graphs show a lot more impact.

Wednesday, 1 June 2011

What? Nothing new?

Well, according to my dictionary elves are typically irritating little things, and that is exactly the way Elf loading turns out to be. A whole heap of irritating little things.

So I've been busy for sure, but I also have school and other things to keep working on, so unfortunately this one feature will return to this week's sprint, and I have a very irritating little feeling that it will continue to come back in the next sprints for a while.

This basically means we have to put up with the slowing development for a while. If I have to choose between school and this project, I'll definitely opt for school, and as it turns out, that is exactly the choice I have to make this weekend.

I don't expect to do a lot this week, because the work I can do on the project is really whatever time is left once all of my spare time has been divided up.

I have hacked together a little document on scheduling though. It can be found at humane hours (from 7:00 through 23:00 GMT) behind this link.

Wednesday, 25 May 2011

And here come the elves!

Elves, what do you mean by that?

Well, there is this file format called ELF. It is an acronym for Executable and Linkable Format.

Now what does this mean? Or, what the heck, why not start with: what is a file?

Good question. What is a file? It seems so obvious: a file is a set of collected data. But how does the computer know that this piece of data is part of this file, and that what you've placed in another file is indeed the content of that other file?

This has to do with the way the hard disk is made up but also how the main memory or RAM works.

The system implements a file-system. Now what is a file-system? It basically is an index of which regions of the disk are part of one file and which regions are part of another.

Okay, this is on disk, now what about in main memory?

The main memory contains tables which point to places on the disk. These tables are read from fixed places on the disk, or rather the partition. For simplicity's sake we'll assume the disk holds only one partition, meaning that those terms can nearly be used interchangeably.

So we now have a table of pointers to what? Files? Directories? Other tables?

Well, I'm still investigating this part, but as far as I know the table points to files. There can be special files, which are called directories, and other types of files called symbolic links.

A special case is the hard-link, which makes two pointers point to the same file.

That's fine and all, but how do we get this all from disk to memory? Is it just a matter of reading from a pointer and getting the answer, or is there something more complicated going on?

Well, I know it's a little more complicated than that, but I still need to investigate this bit. All I know is that Grub has done some things for me, and I can just use a region in memory Grub has given me, in which the core image is loaded. I need to parse it myself into the regions where it needs to be.

So the core image needs to be put in some place other than where it is right now. Why is that, and how do you do such a thing?

The reason why the image needs to be relocated is that it is an ELF image, which is more or less compressed. That's fine and all, but it means the image misses some very important things. One of which is that there is no space reserved for variables, only indicators of where they need to be. Another issue is that I have written my code to go to a fixed place, somewhere very high (or very low, depending on how you look at it) in memory. Grub can't load to this place since it doesn't support virtual memory (at least, it doesn't initialise it for the client, according to the multiboot specifications). This means I need to initialise it myself and put the image there.

I've chosen ELF because it's flexible, but I also could have chosen the plain binary format. That's nice and all, but it would also make linking a bit tougher. From a security point of view that isn't necessarily a bad thing, but from a development perspective it could mean that the image is harder to inspect with an object dump.

As to how we do this, it turns out to be quite well documented in the ELF specifications. The ELF header holds pointers into the file, with notes on where each section should go, and all I need to do is put the sections into place. Once that's done, I can (in the case of the core image) transfer control to it. In the case of a user-space application I should probably fork first.
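A sketch of that loading loop, assuming the ELF32 layouts from the specification: the structures here are hand-rolled and trimmed to the fields used (real code would take them from elf.h), and `load_base` is a stand-in for however the kernel reaches physical memory:

```c
#include <stdint.h>
#include <string.h>

/* Simplified ELF32 structures, following the layout in the ELF
 * specification; only the fields used below are commented. */
typedef struct {
    uint8_t  e_ident[16];
    uint16_t e_type, e_machine;
    uint32_t e_version, e_entry;
    uint32_t e_phoff;          /* file offset of the program header table */
    uint32_t e_shoff, e_flags;
    uint16_t e_ehsize, e_phentsize;
    uint16_t e_phnum;          /* number of program headers */
    uint16_t e_shentsize, e_shnum, e_shstrndx;
} Elf32_Ehdr;

typedef struct {
    uint32_t p_type;           /* PT_LOAD = 1 */
    uint32_t p_offset;         /* where the segment sits in the file */
    uint32_t p_vaddr, p_paddr;
    uint32_t p_filesz;         /* bytes present in the file */
    uint32_t p_memsz;          /* bytes needed in memory (>= p_filesz) */
    uint32_t p_flags, p_align;
} Elf32_Phdr;

#define PT_LOAD 1

/* Copy every PT_LOAD segment from the in-memory ELF image to the
 * address its header asks for.  The gap between p_filesz and
 * p_memsz (typically .bss, the unreserved variable space mentioned
 * above) is zero-filled.  Returns the entry point. */
static uint32_t elf_load(const uint8_t *image, uint8_t *load_base)
{
    const Elf32_Ehdr *eh = (const Elf32_Ehdr *)image;
    const Elf32_Phdr *ph = (const Elf32_Phdr *)(image + eh->e_phoff);
    uint16_t i;

    for (i = 0; i < eh->e_phnum; i++) {
        if (ph[i].p_type != PT_LOAD)
            continue;
        memcpy(load_base + ph[i].p_vaddr, image + ph[i].p_offset,
               ph[i].p_filesz);
        memset(load_base + ph[i].p_vaddr + ph[i].p_filesz, 0,
               ph[i].p_memsz - ph[i].p_filesz);
    }
    return eh->e_entry;        /* jump target once loading is done */
}
```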

Now to get the image as it is on disk it first needs to be transferred to main memory. Luckily Grub has done all the hard work for me, meaning I can transfer control without the need for a messy hard disk driver.

Monday, 16 May 2011

Hello paging world!

Yes, that's right. We've just entered paging mode.

I just made the commit available which solves the double fault caused by the page fault. The issue came from the halt & catch fire instruction, because halt releases after an interrupt.

I had some issues with paging because I didn't map the VGA memory and the stack.
Both of these are now solved.

So what is paging precisely?

Paging is using tables in the CPU to translate virtual addresses into actual physically accessible memory.

How do we do that?
Well, to explain that we're going to split the memory up into parts. We're also going to split your memory addresses up into parts. Furthermore, we'll expect the Intel CPU to be in 32-bit protected mode, so our linear addresses reach up to 4 GiB.

The memory is divided into pages, each being 4096 bytes (or 4 KiB) in size. So that means we have about 1 million pages. To reference these pages, we're going to need page tables.

Each page table is exactly 1 page in size, with each entry being 4 bytes. That means only 1024 pages can be accessed in a single page table. If you've been paying attention you can see that this only references 0.1% of all the memory space.

That's somewhat of an issue, but because the designers at Intel aren't stupid, they've created what they call a page directory. This page directory also holds 1024 entries of 4 bytes each. These entries point to the page tables. That means we've now got 1024*1024 pages. That's more like it.
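Put into code, splitting a 32-bit linear address into its three fields looks like this (the helper names are mine):

```c
#include <stdint.h>

/* A 32-bit linear address splits into three fields:
 *   bits 22-31: index into the page directory (1024 entries)
 *   bits 12-21: index into the page table     (1024 entries)
 *   bits  0-11: offset within the 4 KiB page                 */
static uint32_t pd_index(uint32_t vaddr)  { return vaddr >> 22; }
static uint32_t pt_index(uint32_t vaddr)  { return (vaddr >> 12) & 0x3FFu; }
static uint32_t pg_offset(uint32_t vaddr) { return vaddr & 0xFFFu; }
```

The CPU walks this the same way: CR3 points at the page directory, the directory entry points at a page table, and the table entry gives the physical frame to which the 12-bit offset is added.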

Now, if you don't understand the way this works, you should probably read the Intel manual Volume 3A, chapter 6. It also covers PAE and long mode paging. Furthermore, it holds the flags required for the implementation.

All the cache bits are set to 0, so the default mode gets used.

Tuesday, 10 May 2011

All on a big heap

Now what have I been working on?

Well, I've been working on dynamic heap allocation. It took me quite a while as it required me to use the memory map which I had to set up to figure out which pages I am able to use and which I should leave as they are.

What are pages?

Pages are basically chunks of memory. They are usually the same size and can be used for many different tasks. One of the most common tasks is to let every process think it owns the entire address space, while in reality there is only, say, 10 MiB of physical memory in place. Also, this must be shared between processes, which makes it even more complex.

And how did you get this memory map?

The memory map was provided to me by grub. Grub uses the BIOS to figure out what can be used and what can't be used. It provides areas, which might overlap. Because this map is so unreliable in some regions I decided to set up my own memory map, in a much more flexible style.

What kind of style is that?

I chose to give each physical page an owner. This can be used in the future to decide whether or not a process can actually access it. This is all dumped into an array, which is currently 2 MiB in size.

2 MiB? That's huge!
I know, especially if you have only 10 MiB of memory. However, those PCs are rarely ever found these days, and the smallest I have found is 64 MiB, so the table is only about 3-4% of the actual amount of physical memory.
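The post doesn't show the table's layout, but the numbers work out if each 4 KiB frame gets a 16-bit owner id: 2^20 frames times 2 bytes is exactly 2 MiB. A hypothetical sketch along those lines (the names and the owner-id scheme are my own guesses, not the project's code):

```c
#include <stdint.h>

#define PAGE_SIZE  4096u
#define PAGE_COUNT (1u << 20)   /* 4 GiB / 4 KiB */

/* One 16-bit owner id per physical frame:
 * 2^20 entries * 2 bytes = 2 MiB, matching the size in the post.
 * Id 0 is assumed to mean "free". */
static uint16_t page_owner[PAGE_COUNT];

static void claim_page(uint32_t paddr, uint16_t owner)
{
    page_owner[paddr / PAGE_SIZE] = owner;
}

static int owns_page(uint32_t paddr, uint16_t owner)
{
    return page_owner[paddr / PAGE_SIZE] == owner;
}
```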

And why was this so hard?

Well, I kind of made a small error in the memory map, and because of that I started writing data into my code. And if there's anything that's difficult to detect, it's writing data into code.
I solved it now, and since the code has passed every test suite I think it's time for a new tag, called Indev-0.0.2. It marks the current commit, as being the latest "stable" indev release.

So what can we see coming in the future?

Well, I think in the near future paging will start happening, and once that's done, I think I'll start worrying about elf loading the core image, and of course jumping into that image.
Beyond that the planning isn't fixed, but I think I'll (or hope it's we'll by that time) be working on getting into usermode. Once that's done I will probably start worrying about drivers and such, but I don't know for sure.

Wednesday, 4 May 2011

Have you been gone?

Ok, there have been few code updates lately. But there's a reason for that.
I was working on getting the decompressed image working, and at this moment that's more a matter of thinking than of writing code.

So what have you done?

Well, I've split up the linker script into one for the decompressed image and one for the compressed image. I removed the hardware interrupt code from the compressed image, and I've written a new entry point, this time for the decompressed image (because the linker started complaining).
I also had to do some work on the makefiles, which fortunately for me, is done now.

Where are you going next?

Well, there are still some issues related to expanding the code. For example I don't want to write an #ifndef around every source file I don't want in the compressed image. Unfortunately I still don't quite know how to do this. It will probably boil down to rewriting the Makefiles though.
I also want to get some space from grub to put the heap for the compressed image. I don't know how I'm going to do this either, but I think it will be a case of declaring a humongous1 variable which will then be my heap, but if there is a possibility to do this from the linker script, I'll do that, since I think that's a more elegant2 way of coding.

1Something in the region of 32 MiB.
2I consider elegant code to be:

  • expandable
  • readable
  • understandable
  • reusable
  • stable
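
The footnoted idea of one humongous variable acting as the heap could be sketched like this; the bump allocator on top is my own illustration of how such a region might be handed out during early boot, not the project's actual allocator:

```c
#include <stdint.h>
#include <stddef.h>

#define HEAP_SIZE (32u * 1024 * 1024)   /* the "humongous" 32 MiB */

/* The whole heap is one static array; the linker places it in
 * .bss, so it costs nothing in the on-disk image. */
static uint8_t heap[HEAP_SIZE];
static size_t  heap_top;

/* A trivial bump allocator over that array: no free(), just
 * enough to hand out blocks at 8-byte boundaries. */
static void *early_alloc(size_t bytes)
{
    size_t aligned = (heap_top + 7u) & ~(size_t)7u;  /* round up */
    if (bytes > HEAP_SIZE - aligned)
        return NULL;                                 /* out of heap */
    heap_top = aligned + bytes;
    return &heap[aligned];
}
```

The linker-script alternative would instead reserve the region with a symbol in a dedicated section, which keeps the array out of the C sources entirely.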