Network Programming Model
Blocking and Non-blocking I/O
The distinction between “blocking” and “non-blocking” I/O lies in how the user program behaves when data is not ready in the kernel.
Blocking I/O (BIO)
- When trying to read, if no data is available, the kernel blocks the calling thread until data becomes readable.
- When trying to write, if the data cannot be written (e.g., the send buffer is full), the kernel blocks the calling thread until it can write.
- Pros ✔️: Simple to implement. Since the call blocks the thread, one thread is usually allocated to handle one client connection, and it simply sleeps while waiting for data (as sketched below).
- Cons ❌: The number of concurrent connections is limited, because thread resources are finite and context switching among threads has a cost.
- Improvement 🤔: The crux of blocking I/O is the “blocking,” which leads to Non-blocking I/O (NIO) 👇
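A minimal sketch of the one-thread-per-connection model described above (`handle_client` is a hypothetical handler run on its own thread; it assumes `fd` is an already-accepted TCP socket):

```c
// One-thread-per-connection blocking echo handler (sketch).
// Each accepted connection gets its own thread running this loop.
#include <unistd.h>

void handle_client(int fd) {
    char buf[4096];
    for (;;) {
        ssize_t n = read(fd, buf, sizeof(buf)); // blocks until data arrives
        if (n <= 0)                             // 0: peer closed; <0: error
            break;
        write(fd, buf, n);                      // blocks until all n bytes are queued
    }
    close(fd);
}
```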
Non-blocking I/O (NIO)
- Regardless of whether the connection is readable/writable, the user program simply attempts to read/write. If the operation cannot proceed, the kernel returns `EAGAIN` or `EWOULDBLOCK`.
- In blocking `write`, the kernel waits until the buffer can hold all the data before returning, and the return value equals the number of bytes the user passed in. In non-blocking `write`, however, the pattern is copy → return → copy again → return again, with each call returning only the portion of bytes actually written.
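A sketch of how a program enters non-blocking mode and copes with partial writes (`set_nonblocking` and `write_all` are hypothetical helpers; `fd` is assumed to be a connected socket):

```c
// Switch a socket to non-blocking mode and handle partial writes (sketch).
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

int set_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);           // read current flags
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

// Non-blocking write may accept only part of the buffer on each call,
// so the caller loops: copy → return → copy again → return again.
ssize_t write_all(int fd, const char *buf, size_t len) {
    size_t sent = 0;
    while (sent < len) {
        ssize_t n = write(fd, buf + sent, len - sent);
        if (n > 0) {
            sent += n;                           // a portion was written
        } else if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            continue;                            // buffer full; a real program would wait for writability
        } else {
            return -1;                           // genuine error
        }
    }
    return (ssize_t)sent;
}
```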
read & write
The different behaviors of `read` and `write` in blocking and non-blocking modes:
| Operation | Buffer State | Blocking Mode | Non-blocking Mode |
|---|---|---|---|
| `read` | Data available | ↩️ Returns immediately | ↩️ Returns immediately |
| `read` | No data | ⌛ Waits for data | Returns `EWOULDBLOCK`/`EAGAIN` |
| `write` | Space available | Writes all data before returning | Writes as much as possible, then returns |
| `write` | Buffer full | ⌛ Waits for free space | Returns `EWOULDBLOCK`/`EAGAIN` |
Notes:
- `read` returns as soon as there is any data in the buffer, without necessarily waiting for all expected data to arrive.
- A blocking `write` also returns immediately if the peer actively closes the socket.
- Pros ✔️: Non-blocking calls make it possible to poll multiple sockets in a single thread.
- Cons ❌: They waste a lot of CPU time on polling loops and system calls (see the sketch after this list).
- Improvement 🤔: The crux is the user program “actively polling” “each socket,” which leads to I/O Multiplexing, where the kernel notifies the program of ready events in batches 👇
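To make the cost concrete, a sketch of the active-polling pattern (`poll_loop` is hypothetical; `fds[]` is assumed to already hold non-blocking sockets). Most `read` calls return `EAGAIN`, so CPU is burned on loops and fruitless system calls:

```c
// Busy-polling several non-blocking sockets in one thread (sketch).
#include <errno.h>
#include <unistd.h>

void poll_loop(int *fds, int nfds) {
    char buf[4096];
    for (;;) {                                   // busy loop: never sleeps
        for (int i = 0; i < nfds; i++) {
            ssize_t n = read(fds[i], buf, sizeof(buf));
            if (n > 0) {
                /* process buf[0..n) */
            } else if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
                continue;                        // not ready: a wasted system call
            }
        }
    }
}
```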
I/O Multiplexing
- The user program does not need to poll each socket repeatedly; instead, it can use a small number of `select`/`poll`/`epoll` system calls to have the kernel actively notify it of data events.
select & poll
Similarities:
- Both require a full copy of the file descriptor (FD) set on every call:
  - User space ➡️ Kernel space: the user program sets the FDs to be monitored;
  - Kernel space ➡️ User space: the kernel marks the active FDs and copies them back to user space.
- Both require traversing all returned FDs to find which ones are ready (see the sketch after the table below).
Differences:
| | select | poll |
|---|---|---|
| FD storage | A bitmap of fixed size 1024 (`FD_SETSIZE`) | An array of `pollfd` structures |
| Event description | Three FD bitmaps, one per event type (read/write/exception) | The `events` field of each `pollfd` |
| Notification | The kernel modifies the passed-in bitmaps | The kernel sets the `revents` field of each `pollfd` |
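A sketch of an event loop built on `select`, showing both shared costs: the FD set is rebuilt and copied into the kernel on every call, and all FDs are scanned afterwards (`select_loop` is hypothetical; `fds[]` is assumed to hold socket FDs smaller than `FD_SETSIZE`):

```c
// select-based event loop (sketch).
#include <sys/select.h>
#include <unistd.h>

void select_loop(int *fds, int nfds) {
    char buf[4096];
    for (;;) {
        fd_set readfds;
        FD_ZERO(&readfds);
        int maxfd = -1;
        for (int i = 0; i < nfds; i++) {         // rebuild the bitmap every iteration...
            FD_SET(fds[i], &readfds);
            if (fds[i] > maxfd) maxfd = fds[i];
        }
        if (select(maxfd + 1, &readfds, NULL, NULL, NULL) <= 0)
            continue;                            // ...and copy it to the kernel on each call
        for (int i = 0; i < nfds; i++) {         // traverse all FDs to find the ready ones
            if (FD_ISSET(fds[i], &readfds)) {
                ssize_t n = read(fds[i], buf, sizeof(buf));
                (void)n;                         /* process or close */
            }
        }
    }
}
```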
Improvement 🤔: 1. Avoid copying the full FD set on every call; 2. Let the kernel indicate exactly which FDs are ready. Here comes epoll 👇
epoll
Key advantages:
- Red-black tree: FDs are registered once via `epoll_ctl` and kept in a kernel red-black tree, so the full FD set is not copied on every call.
- Ready list: `epoll_wait` returns only the FDs that are ready, so there is no need to traverse the whole set.
- Edge-triggered (ET) mode is supported in addition to level-triggered (LT), as in the sketch below.
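A sketch of the same loop on epoll (Linux-specific; `epoll_loop` is hypothetical, and `fds[]` is assumed to hold non-blocking sockets, which edge-triggered mode effectively requires):

```c
// epoll-based event loop with edge-triggered notification (sketch).
#include <errno.h>
#include <sys/epoll.h>
#include <unistd.h>

void epoll_loop(int *fds, int nfds) {
    int epfd = epoll_create1(0);
    for (int i = 0; i < nfds; i++) {
        struct epoll_event ev = { .events = EPOLLIN | EPOLLET, // edge-triggered
                                  .data.fd = fds[i] };
        epoll_ctl(epfd, EPOLL_CTL_ADD, fds[i], &ev);   // register once; kernel keeps it in a red-black tree
    }
    char buf[4096];
    struct epoll_event events[64];
    for (;;) {
        int n = epoll_wait(epfd, events, 64, -1);      // only ready FDs are returned
        for (int i = 0; i < n; i++) {
            // Edge-triggered mode reports a state change once, so the socket
            // must be drained until EAGAIN or remaining data is never reported.
            ssize_t r;
            while ((r = read(events[i].data.fd, buf, sizeof(buf))) > 0) {
                /* process buf[0..r) */
            }
            if (r == 0 || (r < 0 && errno != EAGAIN && errno != EWOULDBLOCK))
                close(events[i].data.fd);              // peer closed or genuine error
        }
    }
}
```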
Factors Affecting Concurrency
Number of Open Files
```shell
# Check the per-process limit on open files
$ ulimit -n
1024
```

```shell
# Modify /etc/sysctl.conf to raise the system-wide limits
fs.file-max = 10000
net.ipv4.ip_conntrack_max = 10000
net.ipv4.netfilter.ip_conntrack_max = 10000
```
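After editing `/etc/sysctl.conf`, the settings can be applied with `sysctl -p` (note that `ulimit -n` is a per-process limit and is raised separately, e.g., via `/etc/security/limits.conf`):

```shell
$ sysctl -p
```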
Memory
Memory is needed to allocate read/write buffers for each connection.
```shell
# Minimum, default, and maximum allocation values (bytes)
$ cat /proc/sys/net/ipv4/tcp_wmem
4096 16384 4194304
$ cat /proc/sys/net/ipv4/tcp_rmem
4096 87380 6291456
```
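A rough estimate from these defaults: each established connection gets a 16 KB send buffer and an ~85 KB receive buffer, i.e., on the order of 100 KB per connection, so 10,000 concurrent connections can require roughly 1 GB of memory for socket buffers alone.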