# A quick glance
We can start at ext2fs/ext2fs.h and see what's in there.
At first sight, we are presented with the following line:
```c
#include <sys/endian.h>
```
> Endianness is essentially byte order (not bit order). On big-endian systems, 0xABCDEF would be stored as the bytes [AB, CD, EF], most significant byte first. On little-endian systems, the bytes would be [EF, CD, AB]. Each CPU architecture picks one convention, with little difference to the end user. Of course, as low-level developers, this is something we will have to keep in mind whenever we read data from disk.
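
To make the byte order concrete, here is a minimal sketch of decoding a 32-bit value from raw bytes without depending on the host CPU. The helper names `load_le32` and `load_be32` are made up for this writeup and are not taken from the OpenBSD source.

```c
#include <assert.h>
#include <stdint.h>

/* Decode 4 bytes stored least-significant byte first (little-endian). */
uint32_t load_le32(const unsigned char *p)
{
    assert(p != 0);
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode 4 bytes stored most-significant byte first (big-endian). */
uint32_t load_be32(const unsigned char *p)
{
    assert(p != 0);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}
```

Because the shifts operate on values rather than on memory, the same code gives the same answer on big-endian and little-endian machines.
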
Now if we quickly glance at kernel.org, we can see the following:
> All fields in ext4 are written to disk in little-endian order. HOWEVER, all fields in jbd2 (the journal) are written to disk in big-endian order.^[1](https://docs.kernel.org/filesystems/ext4/overview.html)^
Journaling was introduced in ext3, and the journal-related fields in the super block show up in the ext2 source only as incomplete features. This is irrelevant to us for now.
```c
#define BBSIZE 1024
#define SBSIZE 1024
#define BBOFF ((off_t)(0))
#define SBOFF ((off_t)(BBOFF + BBSIZE))
```
These macros are used later in the code. What matters is that the sizes are fixed: the boot block and the super block are 1024 bytes each, and the super block starts right after the boot block, at byte offset 1024.
The super-block is the 1 KB of data that follows the boot block. It contains filesystem-wide information such as the block and inode counts, the block size, how the blocks are divided into groups, feature flags, and state and error information. This is what lets the OS understand the layout of the filesystem without having to scan the whole disk.
The boot-block is the first 1 KB. ext4 leaves this area alone, so no more than 1 KB of boot code can live there. On MBR, the BIOS loads the first 512-byte sector of the disk into memory, and on UEFI, the firmware instead loads a bootloader from a FAT-formatted EFI System Partition. Those are "first-stage" bootloaders. Whatever boot code occupies the remainder of the 1 KB on disk acts as a "second-stage" bootloader. After this, you can load more bootloaders as necessary or get right into the kernel.
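
The layout above can be sketched as a small check on a raw filesystem image held in memory. This is only an illustration: it assumes the ext2 magic number 0xEF53 is stored little-endian at byte 56 of the super block, and the names `E2FS_MAGIC_OFFSET`, `E2FS_MAGIC_VALUE`, and `image_has_ext2_magic` are made up for this writeup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BBSIZE 1024
#define SBSIZE 1024
#define BBOFF  0
#define SBOFF  (BBOFF + BBSIZE)

/* Assumption: the magic sits 56 bytes into the super block. */
#define E2FS_MAGIC_OFFSET 56
#define E2FS_MAGIC_VALUE  0xEF53

/* Return 1 if the image holds a full super block with the ext2 magic. */
int image_has_ext2_magic(const unsigned char *img, size_t len)
{
    assert(img != NULL);
    if (len < (size_t)SBOFF + SBSIZE)
        return 0; /* too short to contain boot block + super block */
    const unsigned char *m = img + SBOFF + E2FS_MAGIC_OFFSET;
    uint16_t magic = (uint16_t)(m[0] | (m[1] << 8)); /* little-endian */
    return magic == E2FS_MAGIC_VALUE;
}
```

Note that the first 1024 bytes are skipped entirely: whatever boot code is there has no effect on whether the filesystem is recognized.
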
The following line:
```c
#define BBLOCK ((daddr_t)(0))
#define SBLOCK ((daddr_t)(BBLOCK + BBSIZE / DEV_BSIZE))
```
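defines the block addresses of the boot and super blocks on the device (not in memory), counted in units of `DEV_BSIZE` bytes. The boot block is at address 0, but the super block's address depends on how many bytes fit in each device block: it is 1024 divided by that amount. `DEV_BSIZE` is a constant which is not defined in this header, so hopefully we can figure out what it is going forward 🙏 (On the BSDs it typically comes from `<sys/param.h>` and is 512, which would put the super block at block 2, but that still needs confirming.)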
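
As a rough sketch of that arithmetic, the function below computes the super block's address for a given device block size. `sblock_for` is a made-up name, and the value 512 in the usage note is an assumption about `DEV_BSIZE`, not something this header tells us.

```c
#include <assert.h>
#include <stdint.h>

#define BBSIZE 1024
#define BBLOCK 0

/* Mirror SBLOCK: the boot block's byte size expressed in device blocks. */
int64_t sblock_for(int64_t dev_bsize)
{
    /* The division only makes sense if device blocks tile the boot block. */
    assert(dev_bsize > 0 && BBSIZE % dev_bsize == 0);
    return BBLOCK + BBSIZE / dev_bsize;
}
```

With an assumed `DEV_BSIZE` of 512, `sblock_for(512)` gives block 2. With 1024-byte device blocks it would give block 1.
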
> Inodes are, like in UFS, 32-bit unsigned integers and therefore ufsino_t.
> Disk blocks are 32-bit, if the filesystem isn't operating in 64-bit mode
> (the incompatible ext4 64BIT flag). More work is needed to properly use
> daddr_t as the disk block data type on both BE and LE architectures.
> XXX disk blocks are simply u_int32_t for now.
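say the OpenBSD developers in a comment. The only point worth noting from this is that disk blocks are handled as 32-bit values for now, so supporting ext4's 64-bit mode is something we will have to implement moving forward.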
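
To give an idea of what 64-bit mode involves: ext4 with the 64BIT feature stores many block numbers as two 32-bit halves, a low field and a high field. The sketch below joins them. The function name and the `has_64bit` parameter are made up here, and this is not how the OpenBSD code currently does it.

```c
#include <assert.h>
#include <stdint.h>

/* Join the low and high 32-bit halves of an on-disk block number. */
uint64_t block_number(uint32_t lo, uint32_t hi, int has_64bit)
{
    assert(has_64bit == 0 || has_64bit == 1);
    uint64_t n = lo;
    if (has_64bit)
        n |= (uint64_t)hi << 32; /* high half only counts in 64-bit mode */
    return n;
}
```

Without the 64BIT flag the high half is ignored, which matches the "disk blocks are simply u_int32_t" situation described in the comment.
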
```c
#define LOG_MINBSIZE 10
#define MINBSIZE (1 << LOG_MINBSIZE)
#define LOG_MINFSIZE 10
#define MINFSIZE (1 << LOG_MINFSIZE)
```
Both the block size and the fragment size are at least `1 << 10`, or 1024 bytes. Storing the size as a power of two suggests the format was designed with the hope that block sizes would increase as disk drive storage increases, and that it would remain extensible for years to come.
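
A minimal sketch of how a log-based size turns into bytes, assuming the super block stores the block size as a shift count on top of the 1024-byte minimum. `block_size_from_log` is a made-up name, and the upper bound in the assertion is only a guard against overflow, not a limit taken from the source.

```c
#include <assert.h>
#include <stdint.h>

#define LOG_MINBSIZE 10
#define MINBSIZE (1 << LOG_MINBSIZE)

/* Block size in bytes for a given shift above the 1024-byte minimum. */
uint32_t block_size_from_log(uint32_t log_bsize)
{
    assert(log_bsize < 22); /* keep the result inside 32 bits */
    return (uint32_t)MINBSIZE << log_bsize;
}
```

A shift of 0 gives the 1024-byte minimum, and a shift of 2 gives the common 4096-byte block.
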
```c
#define MAXMNTLEN 512
```
The maximum length of a mount point path is 512 bytes. You can test this with `mount -t ext2fs <device> <mount point>`, using a very long mount point path. Even on other systems, it's not likely to work beyond 512 bytes.