- Monitoring memory counters and using tools like RAMMap, VMMap, WPR, and WPA allows for the detection of anomalous cache growth and memory leaks in Windows.
- Adjusting parameters such as RemoteFileDirtyPageThreshold and avoiding misuse of FILE_FLAG_RANDOM_ACCESS helps contain the system cache and prevent timeouts on remote writes.
- Identifying whether the leak affects virtual allocation or the heap is key to tracing the responsible process and correcting the software involved.
- Optimizing the cache at the web application level through appropriate TTLs, increased capacity, and suitable replacement policies reduces cache misses and improves performance.

When Windows starts to stutter, takes ages to open applications, or runs out of memory seemingly "out of nowhere", the root of the problem often lies in how the system manages its cache (both file and memory) and how certain applications use that memory. Understanding what happens under the hood is key to preventing crashes, performance drops, and even the risk of data loss in certain workloads.
The good news is that Windows includes performance counters, diagnostic tools, and specific settings that let you detect anomalous cache behavior, identify memory leaks, and fine-tune critical parameters such as the dirty page threshold for remote writes. With a little methodology and the right utilities, it's possible to pinpoint which component is consuming the most memory and take preventive measures.
What is the cache in Windows and why can it cause problems?
A cache, both in hardware and in software, is basically a fast storage area where data is kept to speed up later access. In processors we talk about L1, L2, or L3 caches; in operating systems like Windows, the system file cache stores disk data that is likely to be read again, avoiding slow accesses to physical storage.
In the specific case of Windows, the system file cache is managed by the cache manager, which decides which pages to keep in memory, which to discard, and when to free up space. Under certain workloads (millions of files, highly random access, servers with very fast remote clients, etc.), this cache can grow too large and leave the system with virtually no available memory.
When the cache takes up a large portion of physical RAM and is not freed in time, the result is a sluggish machine, with processes that are slow to respond and, in extreme cases, errors stemming from a lack of virtual memory. This doesn't only affect performance: if storage is slow and many writes are pending in the cache, timeouts may occur on remote connections, or there may be huge delays when flushing data to disk.
In other scenarios, the problem is not the system file cache but memory leaks in specific processes. These leaks manifest as sustained, ever-growing virtual memory consumption (commit size) that the process never releases, which ends up exhausting system resources and triggering warning events such as event ID 2004 from the Resource Exhaustion Detector.
At the web level there is also the concept of a cache miss, which occurs when an application (for example, a WordPress site with caching plugins) requests data that isn't in the cache and has to retrieve it from the database or the origin. These misses increase latency, slow down pages and, if frequent, negate many of the benefits of caching.

Key counters for monitoring file cache in Windows
Before Windows Server 2012, there were two typical problems that caused the file cache to grow uncontrollably until available memory was practically exhausted. Although the architecture has improved in recent versions, it is still useful to know which counters to watch to detect similar situations.
The most important performance counters for monitoring these scenarios are:
- Memory\Long-Term Average Standby Cache Lifetime (s): Average long-term lifetime of standby memory. If this value remains below 1800 seconds (30 minutes), it indicates that standby pages are being recycled too quickly because the system is running low on memory.
- Memory\Available (Bytes / KBytes / MBytes): reflects the memory available to the system. Persistently low values combined with high system cache usage are usually a warning sign.
- Memory\System Cache Resident Bytes: indicates how much physical memory the system file cache is using. When this value is a very large fraction of total RAM and available memory is low, the cache may be oversized.
If you notice that Memory\Available is low and, at the same time, Memory\System Cache Resident Bytes occupies a considerable portion of RAM, the next step is to find out exactly what that cache is being used for. For that, RAMMap is an essential tool.
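As a rough illustration of how these three counters can be combined, here is a minimal Python sketch. Only the 1800-second rule comes from the text; the 5% and 50% cutoffs are illustrative assumptions, and on a real system the values would be sampled with typeperf or Get-Counter rather than passed in by hand.

```python
# Hypothetical helper applying the rules of thumb above to sampled counter
# values. The 5% and 50% cutoffs are illustrative, not Microsoft guidance.

def diagnose_cache_pressure(standby_lifetime_s, available_mb,
                            cache_resident_mb, total_ram_mb):
    """Return a list of warning strings derived from the three counters."""
    warnings = []
    # Standby pages recycled in under 30 minutes -> memory pressure
    if standby_lifetime_s < 1800:
        warnings.append("standby cache lifetime below 30 min")
    # Low available memory is the first half of the warning sign
    if available_mb < 0.05 * total_ram_mb:
        warnings.append("available memory below 5% of RAM")
    # ...combined with a cache holding a large fraction of RAM
    if cache_resident_mb > 0.5 * total_ram_mb:
        warnings.append("system cache holds over half of RAM")
    return warnings

# 16 GB box: fast standby recycling, little free memory, huge cache
print(diagnose_cache_pressure(900, 300, 9000, 16384))
```

If all three warnings appear together, the next step, as described below with RAMMap, is to find out what the cache is actually holding.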
Use RAMMap to identify what fills up the file cache
RAMMap is a Sysinternals utility that shows graphically and in detail how physical memory is being used. Among other things, it lets you see how many pages are assigned to NTFS metafiles, to mapped files, to the system cache, and so on.
One of the historical problems on heavily loaded servers was the massive accumulation of NTFS metafile pages in the cache. This occurred especially on systems with millions of files and intensive access: metafile data (file system structure information, not file contents) was not properly released from the cache, causing consumption to skyrocket.
In the RAMMap output, this scenario shows up as a very high number of active metafile pages: visually, it's clear that this category consumes a disproportionate share of memory. Historically, this was mitigated with tools like DynCache, which adjusted the system cache limits to contain the problem.
Starting with Windows Server 2012, the internal architecture of cache management was redesigned precisely to prevent this kind of uncontrolled metafile growth, so it shouldn't appear on modern systems... although it is still useful to recognize the symptom in order to diagnose similar behaviors.
Another scenario RAMMap helps uncover is a system file cache holding a large volume of memory-mapped files. The tool shows a high number of active mapped file pages, often associated with applications that open numerous large files.
Impact of FILE_FLAG_RANDOM_ACCESS and application best practices

A typical pattern of file cache oversizing occurs when an application opens many large files using CreateFile with the FILE_FLAG_RANDOM_ACCESS flag. This flag is basically a hint to the cache manager: it tells it to keep the mapped views in memory for as long as possible, because highly random access to the data is expected.
With FILE_FLAG_RANDOM_ACCESS set, the cache manager tries to keep the data in memory until the memory manager declares a low memory condition. Furthermore, that same flag disables prefetching of file data, since in theory the accesses will not follow a sequential pattern.
This behavior, combined with many large file openings and truly random accesses, can cause the system cache to grow excessively. Although Windows Server 2012 and later versions introduced improvements to working set trimming, the application vendor is ultimately responsible for correcting the issue.
The strong recommendation for developers is to avoid FILE_FLAG_RANDOM_ACCESS unless strictly necessary. Alternatively, they can access files with low memory priority, so that used pages can be evicted from the working set more aggressively when memory is needed for other processes.
This low memory priority can be configured using the SetThreadInformation API, with which threads that perform disk accesses can be marked as low memory priority, reducing the impact of their activity on the overall system working set.
In Windows Server 2016, the cache manager goes a step further and ignores the FILE_FLAG_RANDOM_ACCESS hint when trimming, treating these files the same as any other for cache trimming purposes (although the flag still disables prefetching, as intended). This partially mitigates the problem, but doesn't eliminate the risk of a disproportionate cache if the application keeps opening many files with that flag and performs very scattered accesses.
Dirty page threshold for remote files and the risk of timeouts
Another specific problem that can affect both performance and the stability of remote connections occurs when a system receives very fast writes from a remote client destined for relatively slow storage.
In versions prior to Windows Server 2016, when the threshold of cached dirty pages for a remote file was reached, additional writes were handled as write-through. This resulted in massive amounts of data being flushed to disk, and if the storage couldn't keep up, significant latency and even network connection timeouts occurred.
To address this problem, Windows Server 2016 introduced a separate dirty page threshold for remote writes. When that threshold is exceeded, the system flushes the data inline, that is, it writes it to disk gradually to avoid huge spikes that would saturate the storage subsystem.
This mechanism may cause some occasional slowdown during periods of very intensive writing, but in return it greatly reduces the likelihood of a remote client experiencing a timeout because too much data is waiting to be written.
The default value for this remote threshold is 5 GB per file. Depending on the hardware configuration and workload, it may be advisable to adjust it: in some environments a higher limit yields better results; in others it is preferable to keep the default.
How to adjust RemoteFileDirtyPageThreshold in the Registry
If the 5 GB limit doesn't suit your workload, you can modify the remote dirty page threshold through the Windows Registry. This setting is controlled by the RemoteFileDirtyPageThreshold value.
- Key path: HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management
- Type: DWORD
- Value name: RemoteFileDirtyPageThreshold
- Data: number of pages, where each page corresponds to the page size managed by the cache manager (usually 4096 bytes).
To calculate the correct value, convert the desired number of bytes into a number of pages. For example, to set the threshold to 10 GiB, calculate 10,737,418,240 bytes / 4,096 = 2,621,440 pages; that is the decimal value to enter in the DWORD.
General recommendations suggest staying within a safe range between 128 MB and up to 50% of available memory, always translated into pages. Ideally, increase the limit in 256 MB increments, measuring performance after each change, until you find the equilibrium point.
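The bytes-to-pages conversion above is easy to get wrong by hand, so here is a small Python sketch that performs it and prints the corresponding reg.exe command (reg.exe and the key path are as documented in this section; run the command from an elevated prompt, and remember that a reboot is required afterwards):

```python
PAGE_SIZE = 4096  # page size assumed by the cache manager

def bytes_to_threshold_pages(n_bytes):
    """Convert a desired threshold in bytes into the page count that
    RemoteFileDirtyPageThreshold expects."""
    return n_bytes // PAGE_SIZE

# The 10 GiB example from the text
gib10 = 10 * 1024**3              # 10,737,418,240 bytes
pages = bytes_to_threshold_pages(gib10)
print(pages)                      # 2621440

# The matching reg.exe invocation (elevated prompt; reboot afterwards)
print(r'reg add "HKLM\SYSTEM\CurrentControlSet\Control\Session Manager'
      r'\Memory Management" /v RemoteFileDirtyPageThreshold '
      f'/t REG_DWORD /d {pages}')
```

The same function can be reused when stepping the limit up in 256 MB increments, since the Registry always expects the value expressed in pages, never in bytes.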
Note that any modification of RemoteFileDirtyPageThreshold requires restarting the computer for the new value to take effect. Without the restart, the system will continue to use the previously active threshold.
It is also possible to disable this threshold completely by setting the value to -1, but this is not recommended: doing so again raises the risk that large bursts of remote writes fill the cache with dirty pages and trigger wait times and connection problems for clients.
Memory exhaustion detection and 2004 events in Windows
In addition to its caching mechanisms, Windows has a resource exhaustion detector that monitors virtual memory usage. When a low virtual memory condition is reached, event 2004 is logged in the System log, with detailed information about the processes consuming the most memory.
A typical 2004 event (Microsoft-Windows-Resource-Exhaustion-Detector) includes a message indicating that Windows has diagnosed a low virtual memory condition and lists the programs that consumed the most memory at that moment, with their executable name, PID, and number of committed bytes.
It's important to understand that the memory column shown by default in many Task Manager views is not the most relevant one for this kind of diagnosis. It usually reflects private memory backed by RAM (the working set), whereas what matters here is the total virtual memory the system has committed to each process.
The metric to watch in these cases is the commit size. It represents the virtual memory Windows has committed to the process, regardless of whether it is backed by physical RAM or by the page file. When the system's total commit approaches its limit, 2004 events are triggered.
This behavior can occur with both Microsoft processes and third-party applications. For Microsoft's own processes there are usually public symbols that allow deeper analysis; for third-party software it is often necessary to contact the vendor for additional information if the call stacks do not show clearly named functions.
Using VMMap to classify the type of memory being leaked
When a memory leak is suspected, the first step is to determine what type of memory is growing uncontrollably. For this, VMMap is a very useful tool, as it breaks down a process's memory usage into categories: private data, heaps, images, stacks, and so on.
If the leak occurs in generic virtual allocation memory, VMMap will show it as a sustained increase in the Private Data category. That memory is not directly associated with a managed heap, but with address space reservations and commits that the process never releases.
When a memory dump of the process is available, WinDbg can be used to run the command !address -summary. In the summary, problematic virtual allocation memory usually appears under the <unknown> category, indicating large unclassified memory regions that account for a very significant share of the committed space.
If the leak is related to the process heap, it will show up in the Heap categories in VMMap: heap memory usage will grow steadily over time, with no appreciable releases.
Again, with a dump and !address -summary in WinDbg, in these cases a very high percentage will be labeled as Heap32 or Heap64, depending on the process architecture. The remaining categories (images, stacks, others) will represent a much smaller fraction of total usage.
Correctly identifying whether we are dealing with a virtual allocation leak or a heap leak is crucial, because it determines what type of monitoring to activate next and which tools will help find the culprit.
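As an illustrative aid (not a Microsoft tool), the trend that distinguishes a leak, a commit size that only ever grows, can be checked over sampled values with a few lines of Python. The 50 MB growth floor is an arbitrary assumption; the samples would come from Task Manager, typeperf, or similar tools at regular intervals.

```python
# Sketch: flag a process whose sampled commit size never decreases and
# keeps growing -- the leak signature described above. Thresholds are
# illustrative assumptions, not fixed rules.

def looks_like_leak(samples_mb, min_growth_mb=50):
    """True if commit size never drops between samples and grows by at
    least min_growth_mb over the whole sampling window."""
    never_released = all(b >= a for a, b in zip(samples_mb, samples_mb[1:]))
    total_growth = samples_mb[-1] - samples_mb[0]
    return never_released and total_growth >= min_growth_mb

print(looks_like_leak([120, 180, 260, 340, 420]))  # steady growth -> True
print(looks_like_leak([120, 180, 90, 150, 130]))   # memory released -> False
```

A process that periodically releases memory can still leak slowly, so this kind of check only prioritizes candidates; the WPR/WPA tracing described next is what actually identifies the responsible code.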
Collect traces with WPR for virtual allocation memory
Once it's clear that there's a virtual allocation leak, the next step is to collect reproducible data that reveals who is making those allocations. For this you can use Windows Performance Recorder (WPR), which is included natively in Windows 10 and Windows Server 2016.
The basic procedure consists of starting a tracing session focused on virtual allocation, letting the process behave naturally while its memory usage grows, and stopping the capture once enough information about the allocations has accumulated.
C:\>wpr -start VirtualAllocation
With the trace running, the growth of the process's commit size is monitored, which can be done with Task Manager, Resource Monitor, or similar tools. The advantage of this WPR profile is that, by focusing solely on virtual allocation, the resulting .etl file doesn't usually grow too large, so the capture can stay active for several minutes.
C:\>wpr -stop virtalloc.etl
That virtalloc.etl file will contain the virtual allocation events, which are then analyzed with Windows Performance Analyzer (WPA) to see which functions and modules are behind the memory allocations that are never released.
Collect traces with WPR for heap memory
In the case of leaks associated with the process heap, the next step is to collect heap-specific traces. To do this, WPR can be configured to log heap events for the target executable, usually via a Registry entry that enables tracing.
You can modify the Registry manually or let WPR configure the target binary. For example, to enable heap tracing for VirtMemTest32.exe, run the following from an elevated command prompt:
C:\>wpr -heaptracingconfig VirtMemTest32.exe enable
This command creates a configuration key under HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Image File Execution Options\VirtMemTest32.exe, with a DWORD value called TracingFlags normally set to 1, which enables heap tracing for that executable.
After the diagnostic phase, it is important to disable tracing again, either by removing the key or by setting TracingFlags to 0, to avoid unnecessary runtime overhead.
With the applied configuration, heap tracing capture is performed with:
C:\>wpr -start Heap
As with virtual allocation, the process is allowed to run for a few minutes while the commit size is observed. Since only heap events are logged, the trace size remains manageable in most scenarios.
C:\>wpr -stop Heap.etl
The result is a Heap.etl file containing the heap allocation and free events, which is then studied with WPA to locate the call stacks responsible for the leaks.
Trace analysis with Windows Performance Analyzer (WPA)
The final step in the diagnostic process is to open the .etl traces with Windows Performance Analyzer, a tool included in the Windows Performance Toolkit, which in turn is part of the ADK (Assessment and Deployment Kit).
Before examining the data, it is essential to configure the symbol paths correctly so that call stacks resolve to human-readable function names. In WPA this is done from the Trace > Configure Symbol Paths menu, adding a path similar to:
srv*C:\LocalPubSymbols*https://msdl.microsoft.com/download/symbols
With the symbol path configured, symbols are loaded from the Trace menu and the corresponding .etl file is opened. For virtual allocation leaks, build a view centered on the problematic process, making sure the Commit Stack column sits to the left of the gold/yellow divider line.
Expanding the commit stacks identifies the functions that originate the virtual allocations that are never released. The module that appears immediately above the allocation function on the stack is usually the one repeatedly requesting memory.
For heap leaks the approach is similar, but the view must include columns such as Handle and Stack to the left of the divider line. Exploring these stacks reveals the functions that allocate from the heap without a corresponding free.
In both cases, once the module and function involved have been identified, the next step is usually to check for software updates or known patches or, if it's a custom application, to debug and correct the allocation pattern to eliminate the leak.
Web application cache failures and how to reduce them
Beyond the operating system, the concept of cache misses is highly relevant in web applications like WordPress. A cache miss occurs when the data requested by the system or application is not available in the cache and has to be retrieved from the database or the origin.
In contrast, a cache hit occurs when content is served directly from the cache, whether in memory, on disk, or at the server level. The more hits there are, the faster the responses; the more misses, the higher the latency and the heavier the load on the server and database.
Typical causes of a cache miss include data that was never cached in the first place, data evicted for lack of space, a cache that has been purged manually or automatically, or an expired time-to-live (TTL) for that data.
When a cache miss occurs, the system makes a second attempt, this time against the data source. If the requested resource exists, it is read from the database or primary storage, returned to the client and, in most cases, cached again to speed up future requests.
The problem is that every step down the hierarchy (from L1 cache to L2, from L2 to main memory, from memory to disk, etc.) introduces a miss penalty. On busy sites, an excess of misses significantly degrades response times.
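The hit/miss/repopulate cycle just described can be sketched in a few lines of Python. This is a minimal illustration with an injectable clock so expiry is deterministic; in production this role is played by the WordPress object cache, Redis, Memcached, or a server-level cache, and all names here are illustrative.

```python
import time

# Minimal TTL cache sketch illustrating hits, misses and expiry.
class TTLCache:
    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock          # injectable for deterministic tests
        self.store = {}             # key -> (value, expiry timestamp)
        self.hits = self.misses = 0

    def get_or_load(self, key, loader):
        now = self.clock()
        entry = self.store.get(key)
        if entry and entry[1] > now:     # present and not expired: hit
            self.hits += 1
            return entry[0]
        self.misses += 1                 # miss: fall back to the source
        value = loader(key)              # e.g. a database query
        self.store[key] = (value, now + self.ttl)  # repopulate the cache
        return value

# Fake clock so the example is reproducible
t = [0.0]
cache = TTLCache(ttl_seconds=60, clock=lambda: t[0])
load = lambda k: f"content of {k}"  # stands in for the origin/database

cache.get_or_load("home", load)   # miss: never cached before
cache.get_or_load("home", load)   # hit: served from the cache
t[0] = 120.0                      # TTL expired, entry is stale
cache.get_or_load("home", load)   # miss again, entry is rebuilt
print(cache.hits, cache.misses)   # 1 2
```

Note how the single TTL parameter controls the trade-off discussed next: a longer TTL turns the third call into a hit, at the cost of potentially serving stale content.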
Practical strategies to reduce cache failures in web environments
To reduce the frequency of cache misses on a website, the goal is to keep relevant data in the cache for as long as possible, always balancing performance against content freshness.
One basic tactic is to set an appropriate time-to-live (TTL) for the cache. Each time an item is purged, it must be recalculated and re-cached on the next request, which produces misses. Extending the TTL when the content doesn't change frequently reduces invalidations and, therefore, misses.
In managed hosting, it is common practice to purge only specific sections of the cache when relevant changes are detected. For example, plugins used by some specialized hosting providers clear only the cache of the modified post or of certain areas of the site, instead of wiping the entire server cache.
Another way to reduce cache misses is to increase the size of the cache itself or the available RAM. The greater the capacity, the more objects can be stored simultaneously without evicting the least used ones, reducing the number of forced evictions.
This comes at a cost: more RAM or scalable hosting plans mean additional expense. However, in high-traffic projects it can be one of the most cost-effective optimizations in terms of performance and stability.
Finally, it is very useful to choose cache replacement policies that match the application's access pattern. Among the most common are FIFO (First In, First Out), LIFO (Last In, First Out), LRU (Least Recently Used), and MRU (Most Recently Used), each with advantages and disadvantages depending on the workload.
Applying a sensible combination of these policies lets you control which cache objects are evicted first when room must be made, keeping in memory those that are most valuable or most frequently used, even when further increasing the cache size is not feasible.
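As a concrete example of one of the policies listed above, here is a compact LRU sketch in Python (capacity and keys are illustrative); object caches and backends like Redis implement variants of the same idea.

```python
from collections import OrderedDict

# Sketch of an LRU (Least Recently Used) replacement policy: when the
# cache is full, the entry touched least recently is evicted first.
class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()   # insertion order doubles as recency

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)  # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict least recently used

cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")          # touch "a" so "b" becomes the LRU entry
cache.put("c", 3)       # capacity exceeded: evicts "b"
print(cache.get("b"))   # None
print(cache.get("a"))   # 1
```

Swapping `popitem(last=False)` for `popitem(last=True)` would turn this into an MRU policy, which illustrates why the right choice depends entirely on the access pattern.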
In practice, coordinating the site's caching configuration with the hosting provider is usually the best way to tune TTLs, policies, and capacity, especially in managed environments where many cache decisions are made at the server level.
Understanding how Windows manages cache and memory, how to diagnose leaks and problematic thresholds, and how to optimize cache usage in applications like WordPress lets you avoid bottlenecks, reduce latency, and minimize the risk of data loss or timeouts, both on local servers and in demanding web environments.