Single syscall "Hello, world" - in Rust - part 2
… or There and Back Again
Returning to Rust
As in every hero’s journey, after gaining the wisdom of the gods, the hero shall bring it back to his roots. If we squint our eyes a bit, that is (kind of) what we are going to do.
Our exploration in the previous post revealed two suspects that blow up the syscall count of a simple “Hello, world” Rust application:
- Rust runtime
- libc
Let’s try getting rid of them.
No std
As a first step of our journey back to Rust, we need to cut some fat out of the Rust runtime. We can do it by removing the standard library with the no_std attribute. Our minimal “Hello, world” program will look like this:
|
|
Since there are plenty of good resources about no_std Rust, let’s just get a quick overview of what is happening here:
- The #![no_std] attribute tells Rust not to link the standard library.
- Since it is the standard library that defines the panic_handler for Rust programs, without it we need to define our own using the #[panic_handler] attribute on a function with the proper signature.
- On the same note, we cannot use Rust’s default main function as the “entry point” to our program, so we need to use the #![no_main] attribute and provide our own main function.
- The #[no_mangle] attribute tells the compiler not to change the name of our function, so it can be found (and called) by libc.
- And finally, we do not have access to the println macro, so we use libc::printf instead.
For our Cargo.toml, we specify libc as a dependency and tell the compiler that we want our program to abort on panic. This second piece is necessary, again, because we do not use std and we are building a program for a target where eh_personality (eh stands for “exception handling”) is defined in the standard library. Since eh_personality is necessary for stack unwinding when a panic occurs, aborting absolves us from the need to provide it:
|
|
You can try it for yourself by removing the panic = "abort" lines. The Rust compiler’s error messages will point you in the right direction.
And voilà, it works:
|
|
|
|
However… Out of respect for your screen real estate, I am not even going to bother pasting the strace output here, as it is bloated like a dead whale (OK, maybe not that bad, just 35 syscalls…). This is because we still use libc and its printf function and, on top of that, link it dynamically…
Statically linking libc (especially musl) into a no_std program turns out to be a non-trivial task, and since we need to get rid of libc anyway, let’s not go down this rabbit hole. Let’s instead remove it altogether.
No libc
Okay, we can remove libc from our dependencies, remove the calls to printf that depend on it, and we are good. Right?
|
|
Yes… but well, no…
|
|
(cut down to only relevant parts)
We have at least two problems here. The first one is what we see on the screen – a beautiful linker error – and the second one is what we do not see on the screen – our “Hello, world!” message – because we removed the printf function call.
Since tackling the first one is a prerequisite for the second, let’s start with it.
As I mentioned earlier, with our #![no_std] binary we had to provide a custom main function that is called by libc. But now that we do not have libc, there is nothing to call our main function…
However, as we can see in the error message above, something still refers to the __libc_start_main function, which, reasonably enough, cannot be found. We can see that the error originates in the _start function of the Scrt1.o file. Scrt1.o is part of the C runtime startup code, so we can infer that Rust still tries to link it into our binary.
Since this is not really a problem with our code but rather with the build (linking) process, we need to tell the compiler not to link those files. We can do that by passing the -nostartfiles flag to the linker.
|
|
In the gcc docs we can read:
-nostartfiles: Do not use the standard system startup files when linking. The standard system libraries are used normally, unless -nostdlib, -nolibc, or -nodefaultlibs is used.
To avoid specifying RUSTFLAGS every time, we can move them to .cargo/config.toml:
|
|
And we are good to go!
|
|
|
|
I will take that as a no…
Remember the _start function? Well, since we no longer link the startup files, it is no longer there (surprise!), and apparently it is “kind of” needed.
When the OS loads the program, it looks up the entry point address in the ELF file header to know where to start execution. However, if the entry point symbol was not found during linking, the address will be set to 0x0 (NULL), which is usually a protected memory area.
|
|
|
|
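To make the "entry point address in the ELF file header" a bit more tangible, here is a minimal sketch that reads that field directly from the bytes of a file. It assumes a 64-bit little-endian ELF (the usual case on x86_64 Linux), where the entry point is the 8-byte e_entry field at offset 24. The function name elf64_entry_point is made up for this illustration.

```rust
/// Reads the entry point address (`e_entry`) from a 64-bit little-endian ELF header.
/// Returns `None` if the bytes do not look like such a file.
/// `elf64_entry_point` is a hypothetical helper, not part of any library.
fn elf64_entry_point(image: &[u8]) -> Option<u64> {
    // e_ident: magic 0x7f 'E' 'L' 'F', then EI_CLASS (2 = 64-bit) and EI_DATA (1 = little endian).
    let looks_like_elf64_le =
        image.len() >= 32 && image.starts_with(b"\x7fELF") && image[4] == 2 && image[5] == 1;
    if !looks_like_elf64_le {
        return None;
    }
    // Layout: e_ident (16 bytes), e_type (2), e_machine (2), e_version (4), then e_entry at offset 24.
    let mut raw = [0u8; 8];
    raw.copy_from_slice(&image[24..32]);
    Some(u64::from_le_bytes(raw))
}
```

Feeding it the contents of a binary (for example via std::fs::read) should give the same value that readelf reports as the entry point address; a result of Some(0) would correspond to the 0x0 case described above.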
The _start function is the default entry point expected by the linker. This means we could simply rename our main function to _start… Or we can convince the linker that our function is better! We can use the same trick as before – providing a linker configuration – this time by passing the --entry flag as a link-arg:
|
|
|
|
😑 …
Is the entry point set?
|
|
|
|
Looks like it is, but… Remember our Assembly program? On top of write we also used the exit syscall, and without it, the program would segfault too. We might be facing a similar issue here.
We can take a closer look at this by checking the Assembly code generated by the Rust compiler. To do that, we need to extend our RUSTFLAGS, this time with the --emit=asm flag:
|
|
We are building in release mode this time to cut the debug symbols noise out of the Assembly file. With that, we can find a concise .s file in the target/release/deps directory, and see our main function:
|
|
Side note: be careful how you specify RUSTFLAGS. Initially, I did it like this:
RUSTFLAGS="-C link-arg=--entry=main, --emit=asm" cargo...
Notice the comma (,) after main… And yeah, that was fun… The Assembly generation was working, but the program was suddenly segfaulting. It took me a bit of head-scratching and a Google search to find this:
export RUSTC_LOG=rustc_codegen_ssa::back::link=info
which shows the linker command that is being executed as well as its output. And it revealed this abomination in the linker command:
"cc" ... "-nodefaultlibs" "--entry=main," ...
The linker was looking for a main, function instead of main… 🤦‍♂️ All this could probably have been avoided if I had spent a few minutes reading the docs on how to pass multiple RUSTFLAGS, but hey, that would not have been as much fun… Taking a step back and looking at the entry point address would also have helped, but I did not think of it at the time. The --print link-args flag would have been sufficient too.
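To see why the stray comma ends up in the linker command, it helps to remember that RUSTFLAGS is a space-separated list, so the comma simply stays glued to the preceding token. The sketch below mimics that splitting. The linker_args helper is hypothetical and only illustrates the idea; it is not how Cargo or rustc are actually implemented.

```rust
/// Splits a RUSTFLAGS-style string on whitespace and collects the values of
/// all `-C link-arg=...` options, i.e. what would be handed to the linker.
/// `linker_args` is a made-up helper for illustration only.
fn linker_args(rustflags: &str) -> Vec<String> {
    let mut args = Vec::new();
    let mut tokens = rustflags.split_whitespace();
    while let Some(token) = tokens.next() {
        // Accept both the `-C link-arg=X` and the `-Clink-arg=X` spellings.
        let codegen_opt = if token == "-C" {
            tokens.next()
        } else {
            token.strip_prefix("-C")
        };
        if let Some(value) = codegen_opt.and_then(|opt| opt.strip_prefix("link-arg=")) {
            // Nothing strips punctuation here: a trailing comma stays part of the value.
            args.push(value.to_string());
        }
    }
    args
}
```

With the flags from this side note, the helper yields "--entry=main," (comma included), whereas dropping the comma yields the intended "--entry=main".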
The function contains only a single retq instruction, no syscalls, and no exit codes. To get even closer to the binary we can disassemble main directly, and confirm what we have already seen:
|
|
|
|
Since we have yet to research how to print to the screen without libc, let’s confirm the hypothesis by doing something that we can empirically detect in our code. We can do it by calling panic!, or simply by adding an endless loop:
Since that is exactly the behavior of our panic_handler, the result is effectively the same.
|
|
|
|
And now we are stuck, which is what endless loops usually do. We can update .cargo/config.toml with our new link-arg and move on:
|
|
We know the drill from our previous adventures: we just need write(2) and exit(2), and we are done! It might be time to reach out to some old “friends”…
Assembly. Again…
Remember the “wisdom of the gods” part? Yeah, that was not (entirely) a joke.
Since we already have tremendous experience with assembly after writing our “Hello, world” program, it would be a shame not to use it again… You might be asking, “Am I cheating once more?”. Maybe. Or no, because it is me who made up those rules [evil laugh or something…].
Regardless, this time we are going to use Assembly from Rust (see, it is not cheating!). Fortunately, we can do that fairly easily. All we need is the asm! macro:
|
|
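As a rough sketch of the idea (not necessarily the exact code used in this post), raw write and exit wrappers built on asm! could look like the following. It assumes x86_64 Linux, where write is syscall number 1 and exit is syscall number 60, with arguments passed in rdi, rsi and rdx. The names sys_write and sys_exit are made up for this example.

```rust
use std::arch::asm; // in a no_std crate this would be core::arch::asm

/// Raw write(2) on x86_64 Linux (syscall number 1).
/// Returns the number of bytes written, or a negative errno value.
/// `sys_write` is a hypothetical name used only in this sketch.
fn sys_write(fd: i32, buf: &[u8]) -> isize {
    let ret: isize;
    // SAFETY: `buf` is a valid slice, so the kernel only reads `buf.len()` bytes from it.
    unsafe {
        asm!(
            "syscall",
            inlateout("rax") 1isize => ret, // syscall number in, return value out
            in("rdi") fd as isize,
            in("rsi") buf.as_ptr(),
            in("rdx") buf.len(),
            lateout("rcx") _, // the syscall instruction clobbers rcx and r11
            lateout("r11") _,
            options(nostack),
        );
    }
    ret
}

/// Raw exit(2) on x86_64 Linux (syscall number 60); never returns.
/// `sys_exit` is a hypothetical name used only in this sketch.
fn sys_exit(code: i32) -> ! {
    // SAFETY: exit takes no pointers; the process simply terminates.
    unsafe {
        asm!(
            "syscall",
            in("rax") 60usize,
            in("rdi") code as isize,
            options(noreturn, nostack),
        )
    }
}
```

Calling sys_write(1, b"Hello, world!\n") followed by sys_exit(0) from the entry point function would then be all the program has to do.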
And run it:
|
|
|
|
Amazing! Let’s just confirm with strace, and we are done…
|
|
|
|
Ah, yes, of course… A quick look at the output of:
|
|
Can refresh some memory pages in my head…
|
|
ld.so.preload again, isn’t it? We never actually got to build our no_std binary statically, so it is still dynamically linked.
|
|
|
|
Been there, done that. We know what to do, and after a quick facepalm we can again set target-feature=+crt-static, this time in .cargo/config.toml:
|
|
And run it again:
|
|
|
|
There we have it, mission accomplished! A “Hello, world!” program with just a single syscall, written in Rust (kind of).
It still hurts to look at, though. Perhaps we could use some library to have nice and smooth Rust functions and let someone more fluent speak Assembly for us… You know, a sweep-it-under-the-rug type of thing…
Rust is not JavaScript, but there actually is a crate for that. Let’s add it to our Cargo.toml:
|
|
Our main function parameters are useless anyway, and we no longer need pub extern "C", so we clean that up as well in our last refactor. The final program looks much nicer:
|
|
Conclusion
Ah, what a journey it was! All that hassle for writing a “Hello, world!” program…
We achieved our goal of cutting it down to a single syscall, but there are still a lot of areas where we have only scratched the surface. There is also this kernel thingy that actually performs the action requested by a system call and so on, but that is a story for another day (or maybe a few years’ worth of stories).
In any case, I hope you enjoyed this little exploration, and maybe even learned a thing or two. If you have any questions, comments, or suggestions, feel free to reach out.