Calgary is working.
Sometime near 4AM on Sunday night I got half dynamic mappings working. Half dynamic mappings are still mapping from the physical address in the PCI address space to the same physical address in the system memory space (like static mappings), but only map those addresses the driver requests from the DMA API (unlike static mappings which map everything). The name is courtesy of Olof Johansson, who did a large chunk of the PPC IOMMU work (Hi Olof :-)).
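To make the difference between the modes concrete, here is a toy Python model. It is only a sketch and not Calgary code: the `ToyIommu` class, the `map_*` helpers and the 4KB page size are all assumptions made for illustration. Static mode identity-maps everything up front, half dynamic identity-maps only what a driver asks for, and fully dynamic lets an allocator pick the bus address.

```python
PAGE_SHIFT = 12  # assumption: 4KB IO pages
PAGE_SIZE = 1 << PAGE_SHIFT


class ToyIommu:
    """Toy translation table: bus page number -> physical page number."""

    def __init__(self):
        self.table = {}

    def translate(self, bus_addr):
        """Return the physical address for bus_addr, or None if unmapped."""
        phys_page = self.table.get(bus_addr >> PAGE_SHIFT)
        if phys_page is None:
            return None  # a real IOMMU would reject the DMA here
        return (phys_page << PAGE_SHIFT) | (bus_addr & (PAGE_SIZE - 1))


def map_static(iommu, mem_size):
    """Static: identity-map every page of memory up front."""
    for page in range(mem_size >> PAGE_SHIFT):
        iommu.table[page] = page


def map_half_dynamic(iommu, phys_addr):
    """Half dynamic: identity-map only the page the driver asked for."""
    page = phys_addr >> PAGE_SHIFT
    iommu.table[page] = page
    return phys_addr  # bus address equals physical address


def map_fully_dynamic(iommu, phys_addr, free_bus_page):
    """Fully dynamic: an allocator (not shown) chose free_bus_page."""
    iommu.table[free_bus_page] = phys_addr >> PAGE_SHIFT
    return (free_bus_page << PAGE_SHIFT) | (phys_addr & (PAGE_SIZE - 1))
```

In the half dynamic case a device can only reach pages that went through the mapping call, so a driver that bypassed the DMA API would hit an unmapped address right away.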
Once I had half dynamic mode working, I started experimenting with simple transformations on the addresses. When even those worked, which showed that none of the drivers are bypassing the DMA API, I was in a real quandary – why do they work, and fully dynamic, which is just a different transformation on the address, doesn’t?
At this point in time my machine’s remote console dropped off the network and I went to sleep.
On Monday morning, inspiration hit: fully dynamic really is just another transformation on the address – but it’s the only one that I tried that was *0 based*! I immediately made a trivial change to my brain dead TCE allocator: instead of starting to allocate from address 0, start from 0x7000000 (arbitrary, I have 2GB of memory in the victim machine). When this worked, I knew I hit the jackpot.
Investigation later showed that 1MB was the cut-off point. As long as I allocated TCEs above 1MB, everything was fine. 1MB… there’s something about this address… BIOS? And indeed, further investigation showed that the region between 640KB and 1MB is special and requires special handling. Now my allocator simply starts from 1MB and all is well with the world.
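The workaround amounts to giving the allocator a floor. A minimal sketch in Python, assuming 4KB pages and one table entry per page; the `TceAllocator` class and its first-fit scan are hypothetical stand-ins, not the actual Calgary allocator:

```python
PAGE_SHIFT = 12                          # assumption: 4KB IO pages
LEGACY_END = 1 << 20                     # 1MB, top of the legacy region
FIRST_USABLE = LEGACY_END >> PAGE_SHIFT  # first entry index at or above 1MB


class TceAllocator:
    """Naive first-fit allocator that never hands out bus addresses below 1MB."""

    def __init__(self, num_entries, first_usable=FIRST_USABLE):
        if first_usable >= num_entries:
            raise ValueError("table too small to skip the legacy region")
        self.used = [False] * num_entries
        self.first_usable = first_usable

    def alloc(self):
        """Return the bus address of the first free entry above the floor."""
        for index in range(self.first_usable, len(self.used)):
            if not self.used[index]:
                self.used[index] = True
                return index << PAGE_SHIFT
        raise MemoryError("no free TCE entries")

    def free(self, bus_addr):
        """Release the entry backing bus_addr."""
        index = bus_addr >> PAGE_SHIFT
        if not (self.first_usable <= index < len(self.used)) or not self.used[index]:
            raise ValueError("bus address was not allocated")
        self.used[index] = False
```

With a 1024-entry table, the first allocation comes back as 0x100000 rather than 0, so nothing ever lands in the 640KB to 1MB range.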
Next dual steps: moving the guts of Calgary into Xen so that we can use it for isolation and preparing it for submission to the mainline kernel.
Can you tell I’m happy? 🙂
I’m not a specialist in your field, but your description reminds me of some work I’ve done with northbridge programming.
In many northbridges, there are special registers that I have to program to prevent the northbridge from decoding certain legacy address ranges (like 640k-1M) to the device bus instead of physical memory. Perhaps the problem is that the northbridge was decoding your original target address range to the device bus instead of to physical memory?
Comment by jspence — February 21, 2006 @ 9:18 AM |
We know that some of the addresses in that range are being “kept” on the PCI bus and not translated, and which register controls it; what we don’t understand yet is what the right thing to do here is (should this check happen before or after address translation, for example). We sent an inquiry to the HW designers and are now waiting with bated breath 🙂
Comment by mulix — February 21, 2006 @ 9:24 AM |
You’re hitting UMA
Back in the day the upper memory limit of the 8086 architecture was 1MB but the IBM PC BIOS reserved everything above 640k for ROM memory (VGA BIOS, MMIO, etc.).
When a PC first boots, it’s actually in 8086 compatibility mode and it enables a bunch of extensions to get at the rest of physical memory. The 640k-1M reserved region never goes away of course.
— aliguori
Comment by Anonymous — February 25, 2006 @ 6:37 PM |
Re: You’re hitting UMA
What does ‘UMA’ mean in this context?
I know what we’re hitting now (and how to work around it, it’s all in the spec if you know where to look), it’s figuring out what was happening that wasn’t trivial. In hindsight, of course, it’s perfectly obvious 🙂
Comment by mulix — February 25, 2006 @ 10:29 PM |
Re: You’re hitting UMA
Upper Memory Area. It’s just a region of physical memory that is reserved for historical reasons.
It’s where stuff like the CGA/VGA planar memory lives.
Comment by Anonymous — February 28, 2006 @ 6:05 AM