i feel that the grammar of a programming language is among the least appropriate of all possible facets of its behavior to start off with.
-
Systems programs are strongly driven by bulk I/O performance.
he keeps talking about multiple aliasing being problematic and of course this is why i decided to never share anything and instead have layered i/o queues
In systems code, the effect of representation and data placement can be extreme. Bonwick et al. discuss some of these effects [5], noting that the performance of system-level benchmarks can change by 50% through careful management of cache residency and collisions.
but how do you manage something "carefully" if all the interfaces allow for is urgency???
This tends to penalize the performance of automatic storage reclamation strategies. To make matters more interesting, there are caches.
yeah it rly annoys me how the filesystem has its own caches and the kernel has its own caches but there's this assumption that persistence is always the final destiny of all writes
It follows that user-managed storage is a requirement, but perhaps not in fully general form.
that's exactly it!
The facts say otherwise. The annual cost to operate a large banking data center today is $150,000 per square foot. It is by far the most expensive real estate in the world, and more than one third of that cost is the cost of cooling the data center.
2006!!!!
-
wait shit he had a point:
The general rule of thumb is that power is proportional to V²F: the square of the voltage times the frequency. Most of this power is wasted as heat. To a system’s programmer, the cost of doubling the clock rate is $50,000 per square foot per machine room.
do i detect an IETF hater???
Raising the clock rate decidedly isn’t free, and walking into the network distribution closet at your business or school will quickly convince you that current power usage is excessive.
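a quick sketch of that rule of thumb, in case the arithmetic helps. `relative_power` and the 20% voltage bump are made up for illustration; the only thing taken from the quote is that power scales with V²F.

```python
def relative_power(voltage_scale: float, frequency_scale: float) -> float:
    """Power relative to a baseline under the P ∝ V²·F rule of thumb."""
    return voltage_scale ** 2 * frequency_scale


# Doubling the clock at unchanged voltage doubles power under this rule.
same_voltage = relative_power(1.0, 2.0)

# Assumption for illustration only: the faster clock also needs 20% more voltage.
bumped_voltage = relative_power(1.2, 2.0)

print(same_voltage, round(bumped_voltage, 2))
```

the voltage term is squared, so any voltage increase needed to sustain a higher clock costs more than the clock increase itself.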
-
oh oops he glazes up tcp/ip immediately after. this is "Programming Language Challenges in Systems Codes" by jonathan shapiro
-
on the internet:
Large block processing costs are dominated by memory bandwidth, not software overheads.
that makes sense. the difficulty with fitting network i/o into my beautiful symphony of data locality is that the network is "necessary global" in some sense, and can't do multi-level queueing or w/e because you can't dictate to network resources how fast or slow to send data to you!
As Blackwell discusses [4], processing overhead on smaller packets is necessarily much higher.
hmmmm
-
vaguely interesting microsoft research paper https://research.cs.wisc.edu/areas/os/Seminar/schedules/papers/Deconstructing_Process_Isolation_final.pdf
A software isolated process is a collection of memory pages and a language safety mechanism that ensures that code in a process cannot access another process’s pages. A SIP replaces hardware memory protection with static verification of program safety.
DEEPLY suspicious to hear "replaces hardware memory protection" coming from microsoft lmao
They rely on verifying code’s safe behavior to prevent it from accessing another process’s (or the kernel’s) instructions or data.
LMAO
-
However, language safety offers important benefits not provided by hardware process protection, for example, detecting in-process errors such as buffer overruns.
literally nothing in this paper makes any sense
-
just read a liedtke paper for the first time https://cgi.cse.unsw.edu.au/~cs9242/19/papers/Liedtke_93.pdf i think this guy is crazy for still trying to make ipc faster but this was actually cool to read. should have thought to learn that context first before hating on all the modern microkernel stuff =\
and he completely blew my fucking mind with this lmao:
5.3.5 Direct Process Switch
For a remote procedure call it is natural to switch the flow of control directly to the called thread, donating the current timeslice to it (as also LRPC does).
This is also the most efficient method, since it only involves changing stack pointer and address space.
i don't think i would ever have thought of that myself and i can see why all-consuming focus on a hopeless task can actually get you places sometimes if you don't half-ass it
guy seems cool
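a toy model of the direct process switch quoted above, to make the difference from going through the run queue concrete. everything here (`ToyKernel`, `rpc_direct`, `rpc_via_queue`, tick counts) is hypothetical and only does bookkeeping; it is not how L4 is written and it switches no real stacks or address spaces.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Thread:
    name: str
    slice_left: int = 0  # ticks remaining in the current timeslice


class ToyKernel:
    """Bookkeeping-only model: counts scheduler passes, runs no real threads."""

    def __init__(self, timeslice: int = 10):
        self.timeslice = timeslice
        self.ready = deque()
        self.current = None
        self.scheduler_passes = 0

    def schedule(self):
        # Slow path: pick the next ready thread and give it a fresh timeslice.
        self.scheduler_passes += 1
        self.current = self.ready.popleft()
        self.current.slice_left = self.timeslice

    def rpc_via_queue(self, callee):
        # Baseline: the callee is queued behind whatever is already ready.
        self.ready.append(callee)
        self.schedule()

    def rpc_direct(self, callee):
        # Direct switch: the callee inherits the caller's remaining timeslice
        # and runs immediately, without a pass through the scheduler.
        callee.slice_left = self.current.slice_left
        self.current.slice_left = 0
        self.current = callee


kernel = ToyKernel()
client, server, other = Thread("client"), Thread("server"), Thread("other")
kernel.ready.extend([client, other])
kernel.schedule()               # client gets the CPU
kernel.current.slice_left -= 3  # client burns 3 ticks before calling
kernel.rpc_direct(server)
print(kernel.current.name, kernel.current.slice_left, kernel.scheduler_passes)
```

with `rpc_via_queue` in place of `rpc_direct`, the scheduler would run again and pick `other` first, so the server would wait behind an unrelated thread.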
-
ipc performance is not only determined by the kernel algorithms, but also by the user/kernel interface. It is important to support typical usage and permit compilers to optimize code.
clearly we agree on the important things??? lol
Since there are no compilers (as far as we know) which permit interfaces to be specified at register level and basic block sequences to be optimized by programmer supplied usage information, we had to use hand coding for the critical ipc related parts.
see i love this guy lmao
-
oh amoeba is so cool lmao https://dl.acm.org/doi/abs/10.1145/54289.54291
6. THE FAST AMOEBA FILE SERVER
Like the Amoeba communication primitives, the Amoeba file server, called the bullet server was designed for extremely high performance.
you're allowed to say stuff like this if you can back it up. let's see:
In particular, the decrease in the cost of disk and RAM memories over the past decade has allowed to use a radically different design than is used in UNIX and most other operating systems. In particular, we have abandoned the idea of storing files as a collection of fixed size disk blocks.
HELL yes i win again
All files are stored contiguously, both on the disk and in the server's (16 MB) main memory
16 mb lmao
-
@hipsterelectron I have no fucking clue what any of this means but this guy seems chill and I love these types of threads where you liveblog the nerd shit you're reading anyways
-
The bullet server is an immutable file store, with as principal operations READ-FILE and CREATE-FILE.
this is how pants works and how my shared memory ipc worked, it's cool
(For garbage collection purposes there is also a DELETE-FILE operation.)
love this!
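a minimal in-memory sketch of a store with that same three-operation shape (create, read, delete, no in-place writes). `ImmutableStore` and its method names are hypothetical, and using a content hash as the handle is an assumption made for the sketch, not how the paper names files.

```python
import hashlib


class ImmutableStore:
    """Whole-file, write-once store: create, read, and delete only."""

    def __init__(self):
        self._files = {}

    def create_file(self, data: bytes) -> str:
        # Each file is kept as one contiguous bytes object and never modified.
        handle = hashlib.sha256(data).hexdigest()
        self._files[handle] = bytes(data)
        return handle

    def read_file(self, handle: str) -> bytes:
        return self._files[handle]

    def delete_file(self, handle: str) -> None:
        # Only here so unreferenced files can be garbage collected.
        self._files.pop(handle, None)


store = ImmutableStore()
handle = store.create_file(b"hello bullet server")
print(store.read_file(handle))
```

"changing" a file means creating a new one and dropping the old handle, which is why nothing ever needs locking for in-place updates.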
-
@hipsterelectron 16.... Huh????? Whuh????? That's a typo that's gotta be a typo
-
the cache kernel is sick. closest thing to the macrokernel i've found so far https://dl.acm.org/doi/10.1145/504390.504414
research sponsored by ARPA
wish ARPA did more locality-centric memory motion stuff
-
SPIN kernel rox my sox!!! https://www.cs.cornell.edu/people/egs/papers/spin-tr94-03-03.pdf they're literally just saying "yeah so turns out applications have highly structured resource dependencies and you can just ask them for that shit"
In terms of memory resources, multimedia applications use large amounts of data (audio and video streams) with access patterns that interact poorly with locality-based page replacement algorithms [Anderson 93, Nakajima et al. 92]. Application-specific virtual memory management policies can solve this problem.
yes!!!!!!! but they go deeper:
High-level information about media direction, edit cuts, and temporal constraints are directly relevant to page replacement decisions.
yes!!!!!!!!!
When presenting a video stream, for example, an application can sequentially prefetch video frames directly from disk into memory-resident buffers. Information about synchronization between media streams can also be specified to prevent unnecessary replacement of pages that are interdependent.
literally the application knows what they want lmao
Filesystem performance can benefit from application-specific information in several ways.
TRUTHNUKE
The application can provide hints about future usage to the filesystem to help it schedule disk traffic [Gibson et al. 92]. This can result in more effective prefetching policies and lower buffer cache miss rates.
amazing
An effective prefetching policy can also remove virtual memory remapping operations from the critical path, since disk blocks are already mapped into the application address space when they are needed.
i think this is prob what i'm doing
In addition, the application can inform the kernel about how it will use the buffer cache, so that the kernel can make informed decisions about physical memory allocation [Stonebraker 81]
y e s
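for a present-day taste of "the application provides hints about future usage": a sketch using `os.posix_fadvise`, which Python exposes on Unix platforms that have it. this is a modern analogue, not anything from SPIN, and `read_sequentially` is a hypothetical helper. the hint is advisory only; the kernel is free to ignore it.

```python
import os
import tempfile


def read_sequentially(path: str, chunk_size: int = 1 << 16) -> int:
    """Read a file front to back after hinting the access pattern; return bytes read."""
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        # posix_fadvise is not available everywhere, so guard the call.
        if hasattr(os, "posix_fadvise"):
            # Offset 0 with length 0 covers the whole file.
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_SEQUENTIAL)
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:
                break
            total += len(chunk)
    finally:
        os.close(fd)
    return total


# Usage: write a scratch file, read it back with the hint applied.
with tempfile.NamedTemporaryFile(delete=False) as scratch:
    scratch.write(b"x" * 200_000)
print(read_sequentially(scratch.name))
os.remove(scratch.name)
```

compared with what the quotes describe, this is a very coarse hint: one flag for the whole file, with no way to say anything about edit cuts or stream synchronization.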
-
Extensible interprocess communication
An extensible IPC interface enables applications and servers to define their own semantics for interprocess communication enabling the best tradeoff between performance and functionality.
of course but also yes!!!!!!!!
-
Some systems rely on “little languages” to safely extend the operating system interface through the use of interpreted code that runs in the kernel [Lee et al. 94, Mogul et al. 87, Yuhara et al. 94].
i think it's a cute idea but it shouldn't be code it should be data describing a set of access patterns for an isolated application process
These systems suffer from three problems. First, the languages, being little, make the expression of arbitrary control and data structures cumbersome, and therefore limit the range of possible extensions.
this is why you never make your own language for a specific problem and then force people to use it!!!!
Second, the interface between the language’s programming environment and the rest of the system is generally narrow, making system integration difficult.
great to hear how bazel and nix were by no means the first to make this mistake
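one way the "data, not code" idea above might look, purely as a sketch: the application hands over a list of plain records describing its access patterns, and the receiving side checks them with an ordinary loop instead of interpreting a program. every name here (`Pattern`, `AccessDecl`, `validate`, the field names) is hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Pattern(Enum):
    SEQUENTIAL = "sequential"
    RANDOM = "random"
    STREAMING_ONCE = "streaming_once"


@dataclass(frozen=True)
class AccessDecl:
    resource: str             # e.g. a file path or queue name
    pattern: Pattern          # how the application promises to touch it
    max_bytes_in_flight: int  # upper bound the application commits to


def validate(decls):
    """Check declarations with a plain loop; nothing supplied by the app is executed."""
    errors = []
    seen = set()
    for decl in decls:
        if decl.max_bytes_in_flight <= 0:
            errors.append(f"{decl.resource}: max_bytes_in_flight must be positive")
        if decl.resource in seen:
            errors.append(f"{decl.resource}: declared twice")
        seen.add(decl.resource)
    return errors


decls = [
    AccessDecl("video.raw", Pattern.STREAMING_ONCE, 8 << 20),
    AccessDecl("index.db", Pattern.RANDOM, 1 << 20),
]
print(validate(decls))
```

the trade-off is the one the quote names: a fixed record format cannot express arbitrary control flow, which is the point here rather than the limitation.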
-
a professor i follow on here who has been way more annoying on here recently and i didn't know why......anyway happened to find a paper of his from last year and he's just doing literal LLM slop now. RIP in peace
-
sloperating system
