Q1: When I run your program the output is all messed up.
Q2: The amount of RAM reported is wrong.
Q3: Did you code all of these yourself?
Q4: What resources do you use to find out about all the different architectures?
Q5: Why?
Q6: What are the most unusual opcodes you've come across?
Q7: How do you port to a new architecture?

--------------------

Q1: When I run your program the output is all messed up.

A1: That's more of a statement than a question, but OK.

On some architectures, most notably x86, and most recently with Core and Core 2 processors, you will get ugly results. Because ll is optimized for size, it blindly reports whatever is in the /proc/cpuinfo file. If Intel or your BIOS manufacturer put really ugly and pointlessly long info there, there's not much I can do about it. If you want pretty output, use linux_logo instead.

Q2: The amount of RAM reported is wrong.

A2: See the previous question. I report what the sysinfo syscall reports. This doesn't include memory taken by ACPI, shared video RAM, and other things, so you might appear to have less RAM than is installed. For a more accurate tool use linux_logo.

Q3: Did you code all of these yourself?

A3: Originally I only had an x86 version that I posted to my website, and Stephan Walter found the page and sent me a more optimized version. We had a little contest going on for a while, seeing who could shave the most bytes off the x86 code. He's the one who had the great idea of using LZSS compression. Since in the end the code for all of the other architectures is loosely based on the x86 code, a lot is owed to him. Overall though, most of the code was done by me, by hand.

Q4: What resources do you use to find out about all the different architectures?

A4: The architecture manuals provided by the manufacturer are the best resource there is, as the companies involved want you to use their processor. Knowing which instruction does what doesn't help, though, if you don't know how to interface with the Linux kernel. The best reference for that is the web.
Bizarrely, the best code examples you'll find are by shell-coders. These are hackers who try to exploit buffer overflow flaws in programs to get a root shell on your machine (hence the shell-coder moniker). While their motives might be questionable, they are the experts at using Linux at the syscall level without a C library, and their source examples are an invaluable resource.

There are other useful tools. strace is great at helping you see if you are calling the syscalls with the proper arguments. objdump (specifically the --disassemble-all option) will show you exactly how much space each instruction is taking in the final executable. The Linux kernel source is useful, primarily for the asm/unistd.h header, which has all the syscall info for an arch. Beyond that, the Linux kernel source doesn't help much. Neither does the glibc code, nor does using gcc -S.

Q5: Why?

A5: Assembly language is the ultimate puzzle. You have this black box (a computer) that only accepts a series of coded numbers. Different numbers make it do specified things. Using only this limited set of numbered codes, make the machine do the task at hand. Extra Credit: make it as small as possible (or as fast as possible). Fun!

More practically, my work as a computer architect has me often working at a low level with different architectures. It is nice having a "rosetta stone" set of programs that all do more or less the same thing, but on different platforms. Just knowing how to run an exit() syscall on a platform can make programming it in assembly that much easier.

Q6: What are the weirdest opcodes you've come across?

A6:
abcd - m68k
aaa - x86
eieio - ppc
pea - m68k
bra - m68k, sh3
bras - s390
sex - such an opcode exists, but not in any platforms Linux supports.
      Most processors have a "Sign EXtend" instruction, but most companies were too afraid to use that opcode (for example, it's cdq on x86).
brb - vax
spanc - vax
shad - sh3
lfsux - ppc
sob - pdp-11
doze/nap/sleep/rvwinkle - power6

Q7: How do you port to a new architecture?

A7: The first step is to find a machine of that architecture capable of running Linux. For the first few easy ports this was trivial. After that it got harder and harder to find machines. Eventually, for some of the more obscure architectures I had to use simulators (vax, m68k), and for sh3 I ended up using a user-space-only qemu.

The second step is getting development tools. This is pretty easy if you are running a full distribution on actual hardware. It's a bit harder in the obscure cases; you have to build cross-compile tools, and that can be tricky.

The third step is to find architectural documentation. The biggest help is the manual the company provides that describes the architecture and all available opcodes. That gets you most of the way. There can still be some tricky aspects. Finding out how to do syscalls properly can be a hurdle. The best help is usually the arch-specific unistd.h file that comes with Linux. Next is the uClibc code. Worst case, you might have to statically link a binary and dig through a disassembly of the code with objdump. Another problem can be differences in the assembly syntax that gas (the GNU assembler) expects. The gas info page is a big help here.

The fourth step is to start coding. Usually I start by getting the "exit()" syscall working, as that's one of the easier syscalls and you pass along an exit value that's easy to check with "echo $?". Once that works, it's time to do a "hello world", which exercises the write() syscall and memory accesses. After this, gradually build up to the system info lines, then to the centering part (division can be tricky). Last is usually the actual logo decompression.

The fifth step is debugging, which can be difficult.
The best tool to help is "strace", which shows which syscalls are happening and with which values. gdb (if it is working) can also be a big help. On some of the more obscure architectures you can't have either strace or gdb and have to resort to scattering write() and exit() calls around, plus a lot of commenting out.

The sixth and final step is optimizing. There is no quick guide to this; you just have to find the parts of the code that are excessively big and try to make them smaller. Using objdump to disassemble the code before and after is a good tactic, especially on architectures with variable-sized instructions.