You've probably never accessed a file like this before. (Async IO example)

  37,575 views

Jacob Sorber

1 day ago

Patreon ➤ / jacobsorber
Courses ➤ jacobsorber.thinkific.com
Website ➤ www.jacobsorber.com
---
You've probably never accessed a file like this before. (Async IO example) // Asynchronous input and output is cool, but rarely used. In this video, I'll show you how to split-phase your file reads so you can get other things done in the meantime. This can be a useful alternative to threads, for those of you trying to squeeze every last bit of performance out of your programs.
Related Videos:
***
Welcome! I post videos that help you learn to program and become a more confident software developer. I cover beginner-to-advanced systems topics ranging from network programming, threads, processes, operating systems, embedded systems and others. My goal is to help you get under-the-hood and better understand how computers work and how you can use them to become stronger students and more capable professional developers.
About me: I'm a computer scientist, electrical engineer, researcher, and teacher. I specialize in embedded systems, mobile computing, sensor networks, and the Internet of Things. I teach systems and networking courses at Clemson University, where I also lead the PERSIST research lab.
More about me and what I do:
www.jacobsorber.com
people.cs.clemson.edu/~jsorber/
persist.cs.clemson.edu/
To Support the Channel:
+ like, subscribe, spread the word
+ contribute via Patreon --- [ / jacobsorber ]
Source code is also available to Patreon supporters. --- [jsorber-youtube-source.heroku...]
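The split-phase read mentioned in the description can be sketched roughly as follows. This is a minimal illustration, assuming a POSIX system that provides <aio.h> (older glibc versions need -lrt at link time). The function name read_split_phase and the spin counter are hypothetical and are not taken from the video's source code.

```c
#include <aio.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Start an asynchronous read of up to len bytes from the start of path,
 * count loop iterations while the request is in flight, then collect the
 * result. Returns the number of bytes read, or -1 on error.
 * *spins receives the iteration count. */
ssize_t read_split_phase(const char *path, char *buf, size_t len, uint64_t *spins)
{
    assert(path != NULL && buf != NULL && spins != NULL);

    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    struct aiocb cb;
    memset(&cb, 0, sizeof cb);      /* unused fields must be zero */
    cb.aio_fildes = fd;
    cb.aio_buf = buf;
    cb.aio_nbytes = len;
    cb.aio_offset = 0;

    if (aio_read(&cb) == -1) {      /* phase 1: submit the request */
        close(fd);
        return -1;
    }

    *spins = 0;
    while (aio_error(&cb) == EINPROGRESS)
        (*spins)++;                 /* stand-in for useful work */

    ssize_t n = aio_return(&cb);    /* phase 2: collect the result */
    close(fd);
    return n;
}
```

The counter only stands in for real work; anything that does not need the file contents could run inside that loop.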

Comments: 120
@michalski9141 2 years ago
I'd LOVE for you to talk about file systems!
@vor946 2 years ago
I'd love any longer, theoretical video from Jacob
@empireempire3545 2 years ago
++
@jonweinraub 2 years ago
Thank you, thank you, THANK YOU! I've been working on a side project that went into the unfinished pile when I started something else, which many of us are guilty of. I need to refactor a lot of it because I was coding whilst bedridden post-surgery, and this makes me keen on finishing it. Now it will be far better and far faster than I thought possible. Looking forward to ticking a project off as complete soon enough. Much gratitude!
@cygmoid 2 years ago
Nice video, really helpful. Would really love a video or videos on Filesystems. Nice work
@apostolisparga 2 years ago
Didn't know asynchronous operations were possible in C. Brilliant!
@ahmadhadwan 2 years ago
Anything that your hardware + OS can do is possible in C.
@amir3515 2 years ago
Excited for this one!
@willkessner2674 2 years ago
Definitely would love to hear more about file systems. I’m still confused about COW, and how a file system can be better for NVMe (something like BTRFS) than others (EXT4, ZFS, etc).
@ed_iz_ed 2 years ago
Mostly a bunch of optimizations. For instance, with implicit compression at the FS level, you reduce the overall number of bytes written to and read from the actual drive (at the expense of more CPU usage), which usually means better overall performance, since I/O tends to be the biggest bottleneck.
@reptilicusrex4748 2 years ago
Very well explained. Thanks.
@oscareriksson9414 2 years ago
Cool! And yes filesystems would be nice to hear more about
@oj0024 2 years ago
Could you compare aoi to other forms of doing asynchronous/non blocking io, like poll, epoll, io_uring, ...
@coletoentertainment5985 1 year ago
Tbh, poll and epoll are the old way to do it. AIO was the latest approach before io_uring was released. A video about io_uring would be much more interesting.
@31redorange08 1 year ago
What is aoi?
@AlFredo-sx2yy 1 year ago
@@31redorange08 did you not watch the video?
@31redorange08 1 year ago
@@AlFredo-sx2yy I did. Did you?
@AlFredo-sx2yy 1 year ago
@@31redorange08 I did, but unlike you, I paid attention. Rewatch it and you'll see what I'm talking about...
@90_98 6 months ago
I've always wondered how to use async io, thanks, this was incredibly useful
@cernejr 2 years ago
Nice. Good technique to know about.
@gatty. 2 years ago
File systems?! Yes please! Keen as to hear you talk about file systems! :D Also, I just did a test during the first few minutes of your video. It made me wonder: is reading from a file treated the same way as iterating through memory, via a memory_location++ sort of thing? And I found that it is! That's AMAZING! :D :D :D That's super exciting. So you can just read however many bytes you want, do things with them, then read some more, etc. Sooooo cool!!! :D
@edgeeffect 2 years ago
This is quite fascinating, seeing this done in C for file access. I've done it before in assembly, waiting for a microcontroller's ADC to work its magic, or in JavaScript, waiting for a network call to complete. Interesting to see how it's done in "the middle ground".
@robertturner7090 2 years ago
Comment for the algorithm, your audio is fine!
@shashankmshanbhag7970 2 years ago
It would be great to learn in detail about filesystems and various mechanisms that they use.
@Sahilbc-wj8qk 2 years ago
I've used it before when I needed to do other work while some data was loading, so I wouldn't block execution. It was amazing. Unix books teach a lot of stuff that is hard to discover by yourself.
@agnag1 2 years ago
and the 6100 checks and increments were in a best case scenario for file access speed, where the file should already be cached in RAM.
@naconazov1 2 years ago
If I may suggest something it would probably be cleaning up memory after ourselves (potential memleak). Here it is definitely not an issue, but having thousands or more little async reads may consume a significant amount of memory on some systems.
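To illustrate the cleanup this comment asks for, here is a hedged sketch of a heap-allocated request that is released once its result has been collected. The helper names start_read, wait_read, and release_read are hypothetical; the video's actual code is not reproduced on this page.

```c
#include <aio.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Open path and submit an asynchronous read of up to len bytes.
 * Returns a heap-allocated control block, or NULL on failure. */
struct aiocb *start_read(const char *path, size_t len)
{
    assert(path != NULL && len > 0);

    struct aiocb *cb = calloc(1, sizeof *cb);   /* zero-initialized */
    void *buf = malloc(len);
    int fd = open(path, O_RDONLY);
    if (cb == NULL || buf == NULL || fd == -1)
        goto fail;

    cb->aio_fildes = fd;
    cb->aio_buf = buf;
    cb->aio_nbytes = len;
    if (aio_read(cb) == -1)
        goto fail;
    return cb;

fail:
    if (fd != -1)
        close(fd);
    free(buf);
    free(cb);
    return NULL;
}

/* Poll until the request completes and return its result. */
ssize_t wait_read(struct aiocb *cb)
{
    assert(cb != NULL);
    while (aio_error(cb) == EINPROGRESS)
        ;                           /* other work would go here */
    return aio_return(cb);
}

/* Release everything start_read acquired. Call only after wait_read. */
void release_read(struct aiocb *cb)
{
    assert(cb != NULL);
    close(cb->aio_fildes);
    free((void *)cb->aio_buf);      /* cast drops the volatile qualifier */
    free(cb);
}
```

With thousands of small requests in flight, pairing every start_read with a release_read keeps the control blocks and buffers from piling up.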
@AndreasWilfer 2 years ago
Hey Jacob, love the topics and videos as always! I do have a thought though.. At around the 10 minute mark, for the aio pointer member: I would consider it a lot more readable if it wasn't in these... unicode(?) characters.. but rather the actual two characters you typed. I seem to recall there being a point when you asked about it before and I can't remember what I thought at that time, but seeing this right now confuses me. Anyway, thank you for your great work explaining the topics you bring up!
@JacobSorber 2 years ago
Ah, yes, the ligatures. They're definitely a point of contention. Some people love them. Some hate them. Thanks for the feedback.
@pierreabbat6157 2 years ago
Did you free the aiocb after reading the file?
@31redorange08 1 year ago
Yes.
@coletoentertainment5985 1 year ago
It would be great if you could show how to use io_uring. Nowadays this approach is becoming more and more popular; even Microsoft implemented this way of communicating with the kernel in Windows 11.
@packmandudefake 1 year ago
2:38 char is guaranteed to take 1 byte at all times.
@katiagalkina4607 1 year ago
Perhaps it would be better to compare them on two separate files, eliminating the caching by the OS.
@user-jx8uz6tb6k 2 years ago
Cool! I have never done any HTTP requests in C (if that makes sense) or any socket programming. So does async programming exist there? I suppose everything is a file in Linux, so socket programming is writing to or reading from a socket "file". Correct me, please, if I am wrong.
@user-sl6gn1ss8p 2 years ago
what exactly does async io do though? Like, if you have multiple cores, does it use that? Or is the "counting" happening while we wait for the disk's info to get into memory and available to the cpu? If neither, what runs parallel to what?
@michalski9141 2 years ago
up
@dcorderoch 2 years ago
in this example, while the program is waiting for the file to be read, the counter is increased in the while loop, basically, it's about not needing to wait for the read to finish to do other stuff (though it has to be something that doesn't need the data from the file)
@user-sl6gn1ss8p 2 years ago
@@dcorderoch Yes, that makes sense, but my question is more along the lines of "what, if anything, in the actual hardware is running in parallel?" On one hand, the other computation might be running in parallel to the disk doing its thing. On the other hand, it might be just time-sharing (as in threads running on a single core), and I guess there are a lot more possibilities (maybe even depending on where the file is). Come to think of it, this might depend on the architecture as well and not be standardized?
@haathimmunas1563 2 years ago
@Jacob Sorber would love to see an explanation of how async IO works under the hood
@timtreichel3161 2 years ago
As far as I can tell aio_read() calls __aio_enqueue_request (), which uses another thread to read the file. aio_read(): code.woboq.org/userspace/glibc/sysdeps/pthread/aio_read.c.html Where the magic happens: __aio_enqueue_request (): code.woboq.org/userspace/glibc/sysdeps/pthread/aio_misc.c.html#__aio_enqueue_request
@MrJasbur1 2 years ago
3:36-3:41 “and of course let me know if you’d like me to talk more about file systems in future videos”. Sounds good yes.
@knecht3000 2 years ago
If a more complicated task than incrementing a counter is performed while the async read is executing, how would it be possible to execute it partially and continue executing it after the read operation has returned?
@Tremoneck 2 years ago
A server, for example, could be reading files to serve and sending data asynchronously, and while those two are busy it compresses 1024-byte blocks of data and checks between each block whether the I/O is finished.
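The block-at-a-time idea in this reply can be sketched as below. It is only an illustration under assumptions: a simple byte sum stands in for compression, and the names sum_task, sum_task_step, and run_until_io_done are hypothetical. The key point is that the task's progress lives in a struct rather than on the call stack, so it can be paused and resumed between checks of the request.

```c
#include <aio.h>
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK 1024

/* Progress of a long-running task, kept outside the call stack so the
 * task can be paused and resumed between blocks. */
struct sum_task {
    const unsigned char *data;
    size_t len;
    size_t pos;      /* next byte to process */
    uint32_t sum;    /* running byte sum (stand-in for real work) */
};

/* Process at most one block. Returns true while work remains. */
bool sum_task_step(struct sum_task *t)
{
    assert(t != NULL);
    size_t end = t->pos + BLOCK < t->len ? t->pos + BLOCK : t->len;
    for (; t->pos < end; t->pos++)
        t->sum += t->data[t->pos];
    return t->pos < t->len;
}

/* Advance the task one block at a time while the request in cb is in
 * flight. Returns true if the request has completed (successfully or
 * not), false if the task ran out of work first. */
bool run_until_io_done(struct sum_task *t, struct aiocb *cb)
{
    assert(t != NULL && cb != NULL);
    while (aio_error(cb) == EINPROGRESS) {
        if (!sum_task_step(t))
            break;   /* task finished first; caller may now wait on cb */
    }
    return aio_error(cb) != EINPROGRESS;
}
```

After run_until_io_done returns true, the caller can handle the data and later call sum_task_step again to pick up exactly where the task left off.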
@sriramvatala9159 2 years ago
Hi Jacob, can you speak about memory barriers in your upcoming videos
@JacobSorber 2 years ago
Yeah, probably. I'll add it to the list.
@MECHANISMUS 2 years ago
Everyone who is having trouble with the audio: try a loudness-limiter browser extension for dynamic equalization.
@laxmikantbotkewar4191 1 year ago
can you please shed some light on OS topics like memory allocators, buddy, slab, kzalloc in future videos ??
@pooladkhay 2 years ago
File systems please 🙏🏼
@Uerdue 2 years ago
Interesting, didn't know about `aio_read`, thank you! How does this relate to opening the file with the `O_NONBLOCK` flag, then calling `read` and checking errno for `EAGAIN` / `EWOULDBLOCK`? Are they doing the same thing under the hood?
@jonshouse1 2 years ago
Yep, I thought exactly the same. I don't normally bother for basic file IO but I do often use read() rather than recvfrom() with non blocking sockets, I have also used non blocking IO with Async file sources such as video capture devices, that way I can do some processing with partial data without much added complexity
@D0Samp 2 years ago
For Linux, according to the aio(7) man page this is all still handled in userspace using non-blocking file descriptors (and with the occasional help of threads) because the io_* syscalls provided by Linux 2.6+ only implement a fraction of the POSIX standard. On the other hand, on FreeBSD or macOS aio_read is a native syscall. The biggest difference is that AIO allows you to queue multiple requests at once.
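For comparison, the O_NONBLOCK plus EAGAIN pattern asked about in this thread can be sketched as follows. The helper names set_nonblocking and try_read are hypothetical. The sketch is meant for pipes, sockets, and similar descriptors; on Linux, open(2) documents that O_NONBLOCK has no effect on regular files, which is one reason AIO exists for file reads.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Put fd in non-blocking mode. Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Try to read without waiting. Returns bytes read (0 means EOF),
 * -1 on a real error, or -2 if no data is available right now. */
ssize_t try_read(int fd, void *buf, size_t len)
{
    assert(buf != NULL);
    ssize_t n = read(fd, buf, len);
    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;
    return n;
}
```

Unlike aio_read, nothing is queued here: each try_read either transfers data that is already available or reports that the caller should come back later.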
@hstrinzel 3 months ago
I guess this would be particularly good for single-tasking single-threading servers, right?
@JacobSorber 2 months ago
Single threading, yes. This could be a way to avoid the overhead of threads. You can do some types of multi-tasking with asynchronous I/O. It just sometimes takes a lot of care to keep your code from becoming super complicated.
@kishorekumar3393 2 years ago
How does this compare to reading files via mmap? Performance wise / use cases
@n00blamer 2 years ago
mmap is usually slower than sync or async block read, but is way more convenient in some use cases. The fread-like stuff is optimized really well, and the page faults on mmap usually add up eventually. Even if you just memcpy to a buffer you allocated with mmap, it should recognize the linear access pattern and pre-populate the pages so that you don't hit a page miss every time you advance 4k, but it still ends up losing to block read. mmap is a pain in the ass if you still have to support 32-bit systems, as the address space may not be available for the mapping to succeed when trying to map a larger chunk at a time. Mapping over ~200 MB may already fail on a 32-bit system depending on the address space consumption of the current process.. kind of sucks.. 64 bit
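For readers who have not used mmap for file access, a minimal read-only sketch follows. The function name count_byte_mmap is hypothetical, and the example makes no claim about performance relative to read or AIO; it only shows the convenience of treating file contents as ordinary memory.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Count occurrences of byte c in the file at path by mapping it into
 * memory. Returns the count, or -1 on error. */
long count_byte_mmap(const char *path, unsigned char c)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    struct stat st;
    if (fstat(fd, &st) == -1) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {          /* mmap rejects a zero length */
        close(fd);
        return 0;
    }

    unsigned char *p = mmap(NULL, (size_t)st.st_size, PROT_READ,
                            MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return -1;

    long count = 0;
    for (off_t i = 0; i < st.st_size; i++)
        if (p[i] == c)
            count++;                /* pages are faulted in on first touch */

    munmap(p, (size_t)st.st_size);
    return count;
}
```

Mapping the whole file at once is exactly the step that can fail on a 32-bit system with a fragmented address space, as the comment above points out.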
@wubbalubbadubdub6489 2 years ago
Not related to this video, but I bought an Arduino (because of Jacob) just to learn embedded SW.
@JacobSorber 2 years ago
Cool. Best of luck.
@bart2019 2 years ago
When looking into Linux backup systems, like borg and bup, I noticed that these use alternative ways to access files, like mmap. Could you make a video on how to use mmap?
@JacobSorber 2 years ago
kzbin.info/www/bejne/o2ioaqOffciZqJY
@milasudril 2 years ago
The poll loop is a bit annoying. It is much easier to kick off a worker thread, and sync at the end, though it would have more overhead.
@JacobSorber 2 years ago
Keep in mind that you don't really need a poll loop. I used it as a convenient way to measure the time between starting and finishing the request, but if you put something in here that is definitely going to take longer than the request, you could just grab the result when you finished. To be safe, most people would still use a loop of some sort - just in case, but in many cases you can write the code so you never really loop.
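One way to collect the result without a busy poll loop is aio_suspend, which is part of POSIX AIO. The sketch below is an assumption about how that could look, not the code from the video; wait_for_aio and read_then_collect are hypothetical names.

```c
#include <aio.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Sleep until the request in cb completes, then return its result
 * (bytes transferred, or -1 if the operation failed). */
ssize_t wait_for_aio(struct aiocb *cb)
{
    assert(cb != NULL);
    const struct aiocb *const list[] = { cb };
    while (aio_error(cb) == EINPROGRESS) {
        /* Sleeps instead of spinning. If a signal interrupts the call,
         * the loop simply checks the request again. */
        (void)aio_suspend(list, 1, NULL);
    }
    return aio_return(cb);
}

/* Submit a read, do unrelated work, then collect the result. */
ssize_t read_then_collect(const char *path, void *buf, size_t len)
{
    assert(path != NULL && buf != NULL);

    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    struct aiocb cb;
    memset(&cb, 0, sizeof cb);
    cb.aio_fildes = fd;
    cb.aio_buf = buf;
    cb.aio_nbytes = len;
    if (aio_read(&cb) == -1) {
        close(fd);
        return -1;
    }

    /* ... work that does not need the file contents goes here ... */

    ssize_t n = wait_for_aio(&cb);  /* usually returns at once if the
                                       work above took long enough */
    close(fd);
    return n;
}
```

If the intervening work outlasts the read, the loop body never runs, which matches the "you never really loop" case described above.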
@fburton8 2 years ago
I remember doing asynchronous I/O on the PDP11 with READ$/WRITE$ macros and event flags - a long, long time ago. 👴
@andre0baskin 2 years ago
Yes, and then moving on to VMS with QIO and ASTs. Although I believe QIO was supported on some of the PDP-11 OSes as well.
@fburton8 2 years ago
@@andre0baskin I think you're right. I do remember QIO too, with and without a W, in RSX-11M.
@lawrencedoliveiro9104 2 years ago
That is being done fully preemptively, with software interrupts. Threading can be considered a higher-level abstraction around that. But for applications which are not CPU intensive, this is all needlessly complicated. That’s why higher-level languages are now adopting the old “coroutine” idea in the form of async/await constructs, e.g. in JavaScript and Python.
@sc5shout 2 years ago
What if I already have a threading system. Can I use aio.h or it'd be too much overhead?
@maciejgo 13 days ago
Is line 17 ok? Shouldn't it look like memset(aio, 0, sizeof(aio))?
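A note on the sizeof question raised here, since line 17 of the video's code is not reproduced on this page: if aio is a pointer to a struct aiocb, then sizeof(aio) is the size of the pointer, while sizeof(*aio) is the size of the struct it points to. A minimal sketch, with the hypothetical helper name new_zeroed_aiocb:

```c
#include <aio.h>
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Allocate a control block and zero all of it.
 * Returns NULL if the allocation fails. */
struct aiocb *new_zeroed_aiocb(void)
{
    struct aiocb *aio = malloc(sizeof *aio);
    if (aio == NULL)
        return NULL;

    /* sizeof(aio) is the size of the pointer (typically 8 bytes on a
     * 64-bit system); sizeof(*aio) is the size of the whole struct.
     * Using the pointer size would clear only the first few bytes. */
    assert(sizeof(*aio) > sizeof(aio));
    memset(aio, 0, sizeof(*aio));
    return aio;
}
```

Calling calloc(1, sizeof *aio) would achieve the same zeroing in a single step.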
@ujjawalKhadanga 6 months ago
Wouldn't the whole process context-switch while performing synchronous I/O operations? If this happens, the CPU will not always be idle. Could anyone shed some light on this?
@pajeetsingh 6 months ago
C programming language has inbuilt structures and functions to do asynchronous work, using external library like aio is overkill also it make C similar to “package” based programming language.
@pajeetsingh 6 months ago
Also this is not asynchronous because you are still waiting on file reading completion albeit using a function from another library. You need a “wake up” feature to be called asynchronous. signals and condition variable with wait would do.
@grenadier4702 1 year ago
Is this AIO approach any good for network sockets?
@soniablanche5672 6 months ago
most modern web servers use asynchronous code to listen to connections
@grenadier4702 6 months ago
@@soniablanche5672 Are you sure? I think it's non-blocking multiplexed I/O that is popular now, for example using `select` or something like that. Or the classic: multithreading is used instead of non-blocking I/O.
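The multiplexed style mentioned in this reply can be sketched with select on a single descriptor. This is a simplified illustration; the function name read_when_ready is hypothetical, and a real server would watch many descriptors at once, often with poll or epoll instead.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Wait up to timeout_ms for fd to become readable, then read once.
 * Returns bytes read (0 means EOF), -1 on error, or -2 on timeout. */
ssize_t read_when_ready(int fd, void *buf, size_t len, int timeout_ms)
{
    assert(fd >= 0 && fd < FD_SETSIZE && timeout_ms >= 0);

    fd_set readable;
    FD_ZERO(&readable);
    FD_SET(fd, &readable);

    struct timeval tv;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    int ready = select(fd + 1, &readable, NULL, NULL, &tv);
    if (ready == -1)
        return -1;
    if (ready == 0)
        return -2;                  /* nothing arrived in time */
    return read(fd, buf, len);      /* will not block: data or EOF is ready */
}
```

Here the program asks the kernel which descriptors are ready and then performs an ordinary read, whereas with POSIX AIO the read itself is handed off and completes in the background.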
@frotzecht3461 2 years ago
In this brave new world of Node.js, asynchronous I/O is definitely no longer rare. OTOH POSIX aio is fairly unwieldy especially if you want to do more complex things.
@TheWeepingCorpse 2 years ago
it seems you only call free on an error. when is free called after the file is completed reading?
@soniablanche5672 6 months ago
everything is automatically freed when you exit the program
@empireempire3545 2 years ago
Wait, aio.h is not part of the standard if i see correctly? Is it open source?
@JacobSorber 2 years ago
According to the man pages, aio_read and aio_write are both part of the POSIX standard.
@gblargg 2 years ago
7:56 that printf specifier looks suspect. Probably best to cast the parameter to the expected type (unsigned long) rather than assume that uint64_t happens to be an unsigned long. 13:54 If you wanted to squeeze as much processing out of your program as possible, you'd need to make it multi-threaded. Doing things with async I/O will only allow one core to be utilized maximally. And if you're making it multi-threaded, the other threads can do the CPU-bound work while this one does I/O and blocks.
@ahmadhadwan 2 years ago
on x86_64 architecture I don't think it matters that much to cast from uint64_t to uint32_t since it's going to be passed through a 64 bit register
@gblargg 2 years ago
@@ahmadhadwan I thought it was portable code. No need to implicitly tie it to a specific compiler and architecture.
@ahmadhadwan 2 years ago
@@gblargg true, but I also think that on other architectures any good modern compiler will implicitly cast the parameters to their proper types.
@gblargg 2 years ago
@@ahmadhadwan Unless they've very recently added that, C requires that you pass the proper parameter types to varargs functions like printf.
@JojOatXGME 2 years ago
As far as I know, the specification does not intend that. Note that the compiler would have to parse the string given as the first argument in order to know the expected parameter types.
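The portable way to settle this thread is the PRIu64 macro from <inttypes.h>, which expands to the correct conversion specifier for uint64_t on the current platform. A small sketch follows; the function name format_counter and the "count = " prefix are illustrative and not taken from the video.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Format a uint64_t portably into out. Returns what snprintf returns:
 * the number of characters the full text needs, excluding the
 * terminating null byte. */
int format_counter(char *out, size_t out_len, uint64_t count)
{
    assert(out != NULL);
    /* PRIu64 expands to "lu", "llu", or whatever matches uint64_t here.
     * An equally valid alternative is an explicit cast:
     *   snprintf(out, out_len, "count = %llu", (unsigned long long)count); */
    return snprintf(out, out_len, "count = %" PRIu64, count);
}
```

Because printf is a variadic function, the argument is not converted to match the format string, so the specifier and the argument type have to agree.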
@dickheadrecs 2 years ago
why come noone made a Remi Malloc() joke during Mr Robot 🤖
@vaclavzeleny5717 2 years ago
Why don't we free up allocated memory? Will aio_return do that?
@FanatiQS 2 years ago
What would be the benefit of using aio over MSG_DONTWAIT on read in the while loop?
@JojOatXGME 2 years ago
MSG_DONTWAIT only works on sockets, not on files. Besides that, I expect that it is basically the same. aio is just some abstraction. (For files, the abstraction will probably use O_NONBLOCK to achieve the same effect as using MSG_DONTWAIT on sockets.)
@embeddedbastler6406 2 years ago
I think it comes down to the question of whether my task is IO-bound or CPU-bound. If it is IO-bound, AsyncIO is a great option. If it is CPU-bound, AsyncIO does not really give any advantages over just using something like a thread pool. The real beauty of AsyncIO shines when an asynchronous runtime is used. Then it's the runtime's job to run the tasks that can currently make progress while other tasks are waiting on IO input. Of course I cannot end this comment without mentioning the awesome async integration into the Rust Programming Language. Together with a runtime like Tokio it is really pleasant to write async programs.
@soniablanche5672 6 months ago
javascript developers be like : oh yes we did
@cipherxen2 1 year ago
You forgot to free the aio in main.
@Swedishnbkongu 2 years ago
I don't get why everyone is crying about the audio, there's nothing too messy or quiet.
@pelajarankimia1503 2 years ago
Memory leaks detected
@gloverelaxis 2 years ago
so instead of AsyncIORequest or Request or AsyncRequestInfo or anything remotely useful, C calls it "aiocb"
@JacobSorber 2 years ago
I think this is less of a C thing, and more of a reflection of the *nix culture (as opposed to the more verbose/expressive microsoft culture, for example). If you look at at the Windows C API, you'll see what I'm talking about. Might be an interesting topic for a future video. There is reason behind the different cultures and their decisions-not that the other side will find those reasons compelling.
@gloverelaxis 2 years ago
@@JacobSorber sorry for the aggression in my earlier comments; i appreciate your videos. i just feel really strongly about the Linux world's hostility to improvement and self-reflection. Windows is *also* excruciatingly bad and inconsistent at naming things, just in sometimes different ways. i think it's totally unacceptable that these names (and conventions) are still present as anything other than back-compat aliases. it simply makes the language harder to learn and understand for newcomers. if people are naming things according to the number of keypresses it takes to write, that's a big waving red flag that they're neglecting to use proper tools (same-project autocomplete; pre-baked language/library completions; snippets etc.), and that they're prioritising the most immediate and transient convenience over maintainability. the Linux world is absolutely rife with really, really poor coding practices (not to mention acutely bad for certain broader social problems like misogyny) and a kind of ossified, hierarchical, rote-learned set of conventions which can only be described as "tradition", which i use in the most pejorative sense.
@JacobSorber 2 years ago
@@gloverelaxis No worries. I think I agree with everything here. Cultural inertia is a super powerful force. Of course, it's a much bigger issue than Linux/Unix (cue discussions about qwerty keyboards and carriage returns), but the *nix world definitely has a lot of room for improvement.
@avirukbasak 2 years ago
man I've been watching your videos and your audio is quite soft. Can you have your audio slightly amplified during editing or something? love your videos a lot though
@filips7158 2 years ago
I bet that async read creates a thread under the hood. There is literally no other way to do this.
@JacobSorber 2 years ago
I'm not 100% sure what Linux is doing under the hood (and it could be different from one OS to another), but there are definitely other ways to do this (besides creating OS threads under the hood). It would be hard to get too deep into it in a comment, but modern operating systems try to avoid threads and synchronous operation whenever possible, favoring a more interrupt-driven/event-style approach.
@filips7158 2 years ago
@@JacobSorber Well, you're right with that one. I just checked. It defaults to a user-space threaded approach if no interrupt routine for that I/O is available. Otherwise it uses kernel-space interrupt-driven code.
@lawrencedoliveiro9104 2 years ago
There are some OSes where it’s the other way round: all I/O calls are inherently async, and threading is an abstraction built on top of that.
@NamasenITN 1 year ago
Commenting on dead times without establishing that both programs really executed over similar time intervals, seems not conclusive.
@oldold6636 8 months ago
C sucks, C++'s great
@joe_rogan 2 years ago
You've been youtubing for a couple of years and still haven't fixed your volume level. Your audio is too quiet and the intro is way too loud.
@Swedishnbkongu 2 years ago
Your headphones suck