Issue
I know that on Linux the kernel often caches read/write access to disk. In code like the example below, the kernel saves the msg to a buffer after the write call and returns, then does the actual writing later, e.g. when the buffer is full.
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>

int
main(void)
{
    /* O_CREAT is needed for the mode argument (0766) to have any effect;
       without it, open() fails if ./test.txt does not exist. */
    int fd = open("./test.txt", O_RDWR | O_CREAT, 0766);
    if (fd < 0) {
        printf("open file failed %d\n", errno);
        return -1;
    }

    char msg[100] = "write this msg to disk";
    if (write(fd, msg, strlen(msg)) < 0) {
        printf("write failed %d\n", errno);
        close(fd);
        return -1;
    }

    close(fd);
    return 0;
}
I do not know the implementation of write(). My question is: if the buffer is in memory and the kernel copies the msg into that buffer like memcpy does, will such a copy be time-consuming?
Solution
Yes; your buffer is copied into kernel memory before write returns; after that you can overwrite the buffer without affecting any background I/O involving the original data.
Copying the data consumes time; it's not free.
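For illustration, here is a minimal sketch (the file name and message strings are made up) of the consequence of that copy: the user buffer can be safely reused as soon as write returns, because the kernel already holds its own copy of the data.

#include <fcntl.h>
#include <string.h>
#include <unistd.h>

int
main(void)
{
    int fd = open("./test.txt", O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    char buf[100] = "first message";
    write(fd, buf, strlen(buf));   /* kernel copies buf before returning */

    /* The original bytes are now in kernel memory; reusing buf here
       cannot corrupt the data still waiting to be flushed to disk. */
    strcpy(buf, "second message");
    write(fd, buf, strlen(buf));

    close(fd);
    return 0;
}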
write is a generic function that works on any kind of file descriptor: you can write to a serial console, inter-process pipe, network socket, file, block device, ...
write has to look at the file descriptor and dispatch to the right implementation for that kind of device.
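As a rough sketch (the file name is hypothetical), the very same write call can target a regular file and a pipe; the kernel picks the appropriate implementation from the descriptor it is given.

#include <fcntl.h>
#include <string.h>
#include <unistd.h>

int
main(void)
{
    const char msg[] = "hello";

    /* write to a regular file */
    int fd = open("./test.txt", O_WRONLY | O_CREAT, 0644);
    if (fd >= 0) {
        write(fd, msg, strlen(msg));   /* dispatched to the filesystem code */
        close(fd);
    }

    /* write to a pipe with identical calling code */
    int p[2];
    if (pipe(p) == 0) {
        write(p[1], msg, strlen(msg)); /* dispatched to the pipe implementation */
        close(p[0]);
        close(p[1]);
    }

    return 0;
}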
When you write
some bytes to a file, those bytes overlap one or more blocks (equal sized units) of the file. If those blocks already exist, they have to be brought into memory and put into buffers. The buffers are then edited with the new bytes, and marked "dirty" (modified).
write requests do not align with blocks. They are often larger or smaller than blocks, and do not begin and end on block boundaries; writing is actually an editing activity which has to combine existing data with new data. When you write some bytes to a file, unless you write an exact block, some of the bytes written to storage will be bytes that didn't come from your program: such as zero-padding bytes in the block if there is no other data, or existing bytes in the block that your write didn't touch.
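To make the block idea concrete, the sketch below (Linux/POSIX assumed, hypothetical file name) queries the filesystem's preferred I/O block size via fstat and then performs a write far smaller than a block, so the kernel has to merge it into an existing or zero-filled block buffer.

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

int
main(void)
{
    int fd = open("./test.txt", O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) == 0)
        printf("preferred block size: %ld bytes\n", (long)st.st_blksize);

    /* This 5-byte write is much smaller than a block; the kernel edits
       the in-memory copy of the containing block and marks it dirty. */
    write(fd, "small", 5);

    close(fd);
    return 0;
}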
Modified buffers have to be written to storage, though that action can be delayed. Delaying the action allows for optimization: if that program (or any other) performs more write operations on the same blocks, those edits can be incorporated before anything is actually written. Not to mention that the filesystem can order the writes in a way that makes them faster, like minimizing the head movements of a spinning-platter hard drive.
The key idea is that flushing data from the operating system's buffers to the underlying file is generally not driven by buffers becoming full, but by timer-based activity which monitors dirty buffers.
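On Linux specifically, that timer-driven writeback can be observed through the vm sysctls shown in this read-only sketch (the /proc paths are Linux-specific and may not exist on other systems):

#include <stdio.h>

/* Print one /proc/sys/vm writeback tunable, if readable. */
static void
show(const char *path)
{
    FILE *f = fopen(path, "r");
    if (f) {
        long val;
        if (fscanf(f, "%ld", &val) == 1)
            printf("%s = %ld\n", path, val);
        fclose(f);
    }
}

int
main(void)
{
    /* how often the writeback threads wake up (centiseconds) */
    show("/proc/sys/vm/dirty_writeback_centisecs");
    /* how old dirty data may get before it must be written out */
    show("/proc/sys/vm/dirty_expire_centisecs");
    return 0;
}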
The fsync function can be used to delay the execution of a program until the previous writes on a file descriptor have been committed to storage (with certain documented caveats). The function possibly promotes the flushing as well (makes it happen sooner than it otherwise would).
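A minimal sketch of using fsync after the write from the question (error handling abbreviated; the file name is the same hypothetical ./test.txt):

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int
main(void)
{
    int fd = open("./test.txt", O_RDWR | O_CREAT, 0766);
    if (fd < 0)
        return -1;

    char msg[100] = "write this msg to disk";
    write(fd, msg, strlen(msg));   /* data copied into kernel buffers */

    /* Block until the kernel has pushed the dirty data (and metadata)
       for this descriptor out to the storage device. */
    if (fsync(fd) < 0)
        printf("fsync failed %d\n", errno);

    close(fd);
    return 0;
}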
Answered By - Kaz Answer Checked By - Marilyn (WPSolving Volunteer)