I have recently been working on a side project of mine, Probe, a userspace implementation of the TCP/IP stack. It started more or less out of an interest in learning the protocols at the various network layers, such as UDP, TCP, IP and ARP. By now, though, this side project has turned into a much better learning experience than I expected.
I am planning to write the entire stack. By that I mean the following: normally we rely on the networking stack in the operating system kernel to establish communication between our device and the servers out there. With Probe, my idea is to replicate exactly that, so that a user can rely on this userspace stack for networking instead.
By the way, let me briefly explain what Probe is. Probe is a lightweight TCP/IP suite that tries to adhere to two basic principles: simplicity and speed. I also plan to release it as a library that follows a “learning by doing” approach in an educational sense, where users of Probe can call the Probe APIs to learn and understand how TCP/IP works under the hood.
While laying out the architecture, I stumbled upon a question: how do I actually carry protocol packets or datagrams through the stack while keeping their order, their overhead, and even their bits and flags intact? Ordinary data structures are not quite suited to this. It is probably possible to make them work, but since I wanted Probe to stay fast and orderly, I did a bit of research and ended up at something called sk_buff, a Linux kernel data structure.
What is sk_buff?
sk_buff, mostly referred to as an skb and short for socket buffer, is a data structure devised in the Linux kernel for the sole purpose of carrying packet buffers.
Above, you can see the entire sk_buff struct. I am not interested in using all of it in my implementation, because right now I only care about handling low-level packets, that is, Ethernet frames at the link layer. The sk_buff struct in the figure above, on the other hand, is designed to serve every protocol in the stack with a single struct.
In a nutshell, skbs are kept in a queue built as a linked list, and each element of that list is an sk_buff structure. The skb data structure can be pictured as in the figure above. I like to think of it as a hybrid of a linked list and a queue. Each buffer starts with a fixed-size header of type struct sk_buff, which also carries the list links. This header points to a variable-length area large enough to hold all or part of the data of a single packet.
To start with the contents of sk_buff, a major part of the structure is the set of buffer content management pointers. They provide the flexibility needed to move around in the buffer area.
head: Points to the first byte of the kmalloc’d component. It is set at buffer allocation time and never adjusted thereafter.
end: Points to the start of the skb_shared_info structure (i.e. the first byte beyond the area in which packet data can be stored.) It is also set at buffer allocation time and never adjusted thereafter.
data: Points to the start of the “data” in the buffer. This pointer may be adjusted forward or backwards as header data is removed or added to a packet.
tail: Points to the byte following the “data” in the buffer. This pointer may also be adjusted.
len: The value of tail - data, that is, the number of bytes of data currently in the buffer.
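To make the pointer layout concrete, here is a minimal userspace sketch in C. It is not the kernel's definition and not necessarily what Probe does: the struct name skbuff and the helpers skbuff_alloc(), skbuff_put() and skbuff_free() are hypothetical names chosen for illustration, and malloc() stands in for kmalloc().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical userspace analogue of the buffer management pointers. */
struct skbuff {
    uint8_t *head;  /* first byte of the allocated buffer, never moves */
    uint8_t *data;  /* start of the packet data, may move */
    uint8_t *tail;  /* one past the last data byte, may move */
    uint8_t *end;   /* one past the last usable byte, never moves */
    size_t   len;   /* always equal to tail - data */
};

/* Allocate a buffer of `size` bytes; the data area starts out empty. */
struct skbuff *skbuff_alloc(size_t size)
{
    struct skbuff *skb = malloc(sizeof(*skb));
    if (skb == NULL)
        return NULL;
    skb->head = malloc(size);
    if (skb->head == NULL) {
        free(skb);
        return NULL;
    }
    skb->data = skb->head;
    skb->tail = skb->head;
    skb->end  = skb->head + size;
    skb->len  = 0;
    return skb;
}

/* Grow the data area by `n` bytes at the tail.
 * Returns the start of the new bytes, or NULL if there is no room. */
uint8_t *skbuff_put(struct skbuff *skb, size_t n)
{
    if ((size_t)(skb->end - skb->tail) < n)
        return NULL;
    uint8_t *old_tail = skb->tail;
    skb->tail += n;
    skb->len  += n;
    return old_tail;
}

/* Release the data area and the header. */
void skbuff_free(struct skbuff *skb)
{
    if (skb != NULL) {
        free(skb->head);
        free(skb);
    }
}
```

After skbuff_alloc(64) followed by skbuff_put(skb, 10), head and data still point at the start of the buffer, tail sits 10 bytes further on, and len is 10, which matches tail - data.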
For our minimalistic stack, we use just a few functions and implement them in our userspace skb data structure.
skbuf_peek(): may be used to obtain a pointer to the first element in a nonempty queue.
skbuff_push(): decrements the data pointer by the len passed in and increments the value of skb->len by the same amount. It is used to extend the data area back toward the head end of the buffer.
skbuff_reserve(): reserves space for headers and points data to where the data from user space should be copied.
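The three functions above could look roughly like this in userspace C. This is only a sketch under assumptions: the struct layout, the skbuff_head queue type, and the helpers skbuff_alloc() and skbuff_queue_tail() are hypothetical and exist only to make the example self-contained, and the actual signatures in Probe may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical buffer header; `next` links buffers into a queue. */
struct skbuff {
    struct skbuff *next;
    uint8_t *head, *data, *tail, *end;
    size_t len;                      /* always tail - data */
};

/* Hypothetical queue head: a singly linked list with a tail pointer. */
struct skbuff_head {
    struct skbuff *first;
    struct skbuff *last;
    size_t qlen;
};

/* Allocate an empty buffer of `size` bytes (helper for the example). */
struct skbuff *skbuff_alloc(size_t size)
{
    struct skbuff *skb = malloc(sizeof(*skb));
    if (skb == NULL)
        return NULL;
    skb->head = malloc(size);
    if (skb->head == NULL) {
        free(skb);
        return NULL;
    }
    skb->next = NULL;
    skb->data = skb->tail = skb->head;
    skb->end  = skb->head + size;
    skb->len  = 0;
    return skb;
}

/* Reserve `n` bytes of headroom; only valid while the buffer is empty. */
void skbuff_reserve(struct skbuff *skb, size_t n)
{
    assert(skb->len == 0);
    assert((size_t)(skb->end - skb->tail) >= n);
    skb->data += n;
    skb->tail += n;
}

/* Extend the data area by `n` bytes toward head, e.g. to prepend a header. */
uint8_t *skbuff_push(struct skbuff *skb, size_t n)
{
    assert((size_t)(skb->data - skb->head) >= n);
    skb->data -= n;
    skb->len  += n;
    return skb->data;
}

/* Append a buffer to the end of the queue (helper for the example). */
void skbuff_queue_tail(struct skbuff_head *q, struct skbuff *skb)
{
    skb->next = NULL;
    if (q->last != NULL)
        q->last->next = skb;
    else
        q->first = skb;
    q->last = skb;
    q->qlen++;
}

/* Return the first buffer without removing it; NULL if the queue is empty. */
struct skbuff *skbuf_peek(const struct skbuff_head *q)
{
    return q->first;
}
```

A typical use would be to call skbuff_reserve(skb, 14) on a fresh buffer to leave room for a 14-byte Ethernet header, copy the payload in behind it, and later call skbuff_push(skb, 14) to get a pointer to where that header should be written.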
This is just the theory and concept; the real code of this minimalistic userspace implementation can be found here: https://gitlab.com/Aniketh01/probe
I will come up with more posts like this soon, especially on extending the Probe implementation.