
Rule 2 – Random Access Should Be Avoided, Sequential Access Should Be Encouraged

Applicability: Mostly low-level, performance-oriented code.

Justification: Due to internal mechanisms on many levels, including RAM and processor cache designs, sequential access is simply faster. Reaching main memory (DRAM) costs the CPU far more cycles than reaching its cache. The processor loads data in 64-byte blocks called cache lines, so any access that uses less than 64 bytes of the loaded line wastes expensive resources. What’s more, random access patterns make it unlikely that the cache prefetching mechanism will help: the processor has no chance of discovering any predictable pattern in random memory accesses. Importantly, by randomness we do not mean total randomness, but rather access that does not follow any ordered, detectable pattern.

How to apply: Obviously, the opposite of random access is sequential access, so try to use it whenever you can. If you are operating on a large amount of data, consider packing it into arrays, which take care of memory contiguity. Iterating over doubly-linked lists is a typical example of such unstructured access. We will look closer at this aspect of memory access in Chapter 13 when describing so-called data-oriented design.
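
To make the difference concrete, here is a minimal C++ sketch (the book itself targets .NET; the function names below are ours, chosen only for illustration). Both functions compute the same sum, but the vector walk streams through contiguous cache lines while the list walk chases pointers to unpredictable heap addresses.

```cpp
#include <cstdint>
#include <list>
#include <vector>

// Contiguous storage: consecutive elements share cache lines and the
// hardware prefetcher can predict the next address to load.
std::int64_t sum_sequential(const std::vector<std::int32_t>& data) {
    std::int64_t sum = 0;
    for (std::int32_t value : data)
        sum += value;
    return sum;
}

// Node-based storage: each element may live anywhere on the heap, so
// most reads miss the cache and the prefetcher has no pattern to follow.
std::int64_t sum_pointer_chasing(const std::list<std::int32_t>& data) {
    std::int64_t sum = 0;
    for (std::int32_t value : data)
        sum += value;
    return sum;
}
```

On large inputs the sequential version is typically several times faster, even though both loops do the same arithmetic.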

Rule 3 – Improve Spatial and Temporal Data Locality

Applicability: Mostly low-level, performance-oriented code.

Justification: Spatial and temporal locality are the pillars of the cache. If they are present, the cache is used effectively and helps to achieve better performance. On the contrary, if we interfere with temporal and spatial locality, we cause a significant decrease in performance.

How to apply: Design your data structures in such a way as to take care of your data’s locality and to maximize its reuse over time. As we have seen in the examples given, scattered, random access to data is very unfavorable in terms of performance and can be several times slower. Sometimes, in very advanced and high-performance parts of the program, this means applying non-intuitive changes like those presented with data-oriented design in Chapter 13. Sometimes it only comes down to ensuring that our data structures are reasonably small, preallocated, and used repeatedly.
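
The following C++ sketch (again ours, not from the book) shows one such non-intuitive change: keeping the fields a hot loop actually reads in their own contiguous arrays, so every 64-byte cache line is filled with useful data, and reusing the same preallocated arrays call after call for temporal locality.

```cpp
#include <cstddef>
#include <vector>

// Array-of-structures layout: hot and cold fields are interleaved, so a
// loop that only needs positions still drags mass, charge, and debug
// data through the cache. Shown only for contrast; not used below.
struct ParticleAoS {
    float x, y, z;       // hot: read every frame
    float mass, charge;  // cold: rarely touched
    char  debugName[48]; // cold: pads out the cache line
};

// Structure-of-arrays layout: the hot data is packed together.
struct ParticlesSoA {
    std::vector<float> x, y, z;
};

void move_all(ParticlesSoA& particles, float dx) {
    // Spatial locality: this loop streams through one tightly packed array.
    // Temporal locality: the same preallocated vectors are reused on every
    // call instead of allocating fresh structures each time.
    for (std::size_t i = 0; i < particles.x.size(); ++i)
        particles.x[i] += dx;
}
```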

Rule 4 – Consume More Advanced Possibilities

Applicability: Extremely low-level, performance-oriented code.

Justification: The .NET runtime environment is written in the most generic way, to ensure proper operation in a variety of possible scenarios. However, when writing our application, we know our needs perfectly. We may need to write extremely fast-performing fragments of memory-related code. If so, we may consider using some more advanced, operating system-specific mechanisms. Such mechanisms will probably be needed by about 0.0001% of .NET developers in the world. If you are writing a memory-related library like a serializer, a messaging buffer, or any kind of extremely fast event processor, maybe you can benefit from using some of the low-level system APIs mentioned here (like non-temporal memory access).

How to apply: This will require writing really hard code. This code will be a pain to manage and probably no one will want to maintain it. Except you. Because it will use the low-level API of the operating system, it may also cause problems after updating or changing operating system versions. It is also very unlikely that you need such low-level memory management at all, because it will require extreme caution in coding. And it’s very easy to make a mistake, which, instead of increasing performance, will drastically reduce it.

Read this book carefully. Then carefully read specific operating system books about its internals. And then try to use advanced mechanisms like large pages, non-temporal operations, and the others mentioned in this chapter.
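
As one example of such a mechanism, here is a minimal C++ sketch of non-temporal (streaming) stores using x86 SSE2 intrinsics; it assumes an x86 CPU, a 16-byte-aligned destination, and a count that is a multiple of 4, and the function name is ours, not an API from the book. Streaming stores bypass the cache hierarchy, so filling a large buffer does not evict the hot cache lines the rest of the program still needs.

```cpp
#include <immintrin.h>
#include <cstddef>
#include <cstdint>

// Fill a large buffer with non-temporal stores that bypass the cache.
// Precondition (assumed, not checked): dst is 16-byte aligned and
// count is a multiple of 4.
void fill_non_temporal(std::int32_t* dst, std::size_t count, std::int32_t value) {
    const __m128i v = _mm_set1_epi32(value);
    for (std::size_t i = 0; i < count; i += 4) {
        // Write 16 bytes directly toward memory, skipping the caches.
        _mm_stream_si128(reinterpret_cast<__m128i*>(dst + i), v);
    }
    _mm_sfence();  // make the streaming stores visible to other cores
}
```

Note that for small or soon-to-be-reread buffers this is usually slower than ordinary stores; it pays off only when the written data will not be touched again for a long time.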