AdobeDispatcherHacks ".statfile"

AEM DISPATCHER STATFILE UNDERSTANDING & CACHE INVALIDATION:-

AEM developers and infrastructure engineers regularly struggle to decode the statfile and use it efficiently. The statfile becomes especially relevant in a multi-tenant environment where different project teams control different sites. This article explains, in a simple way, how the statfile mechanism works and how it can be used in a multi-tenant environment.
Here is an image for your reference, giving a quick overview of the data flow before we take a deep dive.



This article covers:
1 - Why the dispatcher may serve an old version of the content, and how to avoid it.
2 - The cache invalidation mechanism.

Assumption - If you are reading this article, I believe you have a basic understanding of Dispatcher and its configuration.
First, let's set up the initial configuration for the cache invalidation section of the dispatcher configuration file, then study how invalidation works at the low level, and finally return to tools for invalidation.

Invalidation section initial settings:-

Open the dispatcher.any file or any other *farm.any file.

Inside the /cache section there is an /invalidate block, which determines the cached files that may be automatically invalidated when content is updated. For example, the following configuration invalidates all HTML pages and .jpeg images.

/cache
{
    /invalidate
    {
        /0000  { /glob "*" /type "deny" }
        /0001  { /glob "*.html" /type "allow" }
        /0002  { /glob "*.jpeg" /type "allow" }
    }
}

With automatic invalidation, the dispatcher doesn't delete cached files after a content update but checks their validity when they are next requested. Documents in the cache that are not auto-invalidated remain in the cache until a content update explicitly deletes them.
Do not forget to reload or restart the Apache/IIS server after configuration changes so that they take effect.
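To make the /invalidate rules above easier to reason about, here is a minimal Python sketch of how such a rule list could be evaluated. It assumes that rules are applied in order and that the last matching rule wins, and it uses Python's fnmatch as an approximation of dispatcher glob matching. The function name is_auto_invalidated is hypothetical and is not part of the dispatcher.

```python
from fnmatch import fnmatchcase

# Mirrors the /invalidate block above: (glob, type) pairs in file order.
INVALIDATE_RULES = [
    ("*", "deny"),
    ("*.html", "allow"),
    ("*.jpeg", "allow"),
]


def is_auto_invalidated(path, rules=INVALIDATE_RULES):
    """Return True if the last rule matching `path` is an allow rule.

    Assumption: rules are evaluated top to bottom and the last match wins.
    """
    decision = False
    for glob, rule_type in rules:
        if fnmatchcase(path, glob):
            decision = (rule_type == "allow")
    return decision
```

With these rules, a cached .html or .jpeg file is a candidate for automatic invalidation, while something like a .png file matches only the initial deny rule and stays in the cache until it is explicitly deleted.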

My Dispatcher is serving the old Content - 

The following points should be noted:

  • Content Updates are typically used in conjunction with an authoring system which "knows" what must be replaced.
  • Files that are affected by a content update are removed, but not replaced immediately. The next time such a file is requested, the Dispatcher fetches the new file from the AEM instance and places it in the cache.
  • You may have several statfiles, for example one per language folder. If a page is updated, AEM looks for the next parent folder containing a statfile, and touches that file. 

Now we will understand the stat file in detail, as it is a vital part of cache invalidation. 

Cache invalidation mechanism & Understanding of stat file – 

At the low level, the dispatcher uses empty files which are named ".stat" by default. The default setting is /statfileslevel "0", which means that only one stat-file is used, placed at the root of the htdocs directory (the cache root directory).

The dispatcher module checks whether the modification time of the stat-file is newer than the modification time of the cached resource. If it is, the dispatcher considers that resource obsolete, that is, invalidated.
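The comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the dispatcher's actual code. The function name is_stale is hypothetical, and treating a missing stat-file as "nothing is invalidated" is an assumption.

```python
import os


def is_stale(cached_file, stat_file):
    """Return True if the stat-file was modified after the cached file."""
    if not os.path.exists(stat_file):
        # Assumption: without a stat-file nothing marks the cache as obsolete.
        return False
    return os.path.getmtime(stat_file) > os.path.getmtime(cached_file)
```

A cached page is served as-is while is_stale returns False. Once the stat-file becomes newer, the next request for that page would be fetched again from AEM.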

For example, suppose we have cached resources after requesting the page "http://myhost:port/content/geometrixx/en/products.html".
Let's invalidate them using the low-level mechanism of the stat-files. Create an empty file named ".stat" at the root of your htdocs directory.

You can see that the stat-file modification time is newer than the modification time of the cached resources. For the dispatcher, that means all of these resources are obsolete. This is the invalidation mechanism at the lowest level. If you visit the page http://host:port/content/geometrixx/en/products.html again after creating the stat-file, the requested cached resources will be updated.
This example demonstrates the default invalidation scheme with /statfileslevel "0".
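The manual step of creating the stat-file can also be scripted. The following is a minimal sketch, assuming you have write access to the docroot. The helper name touch_stat_file is hypothetical, and the docroot path is whatever your htdocs or cache root is.

```python
from pathlib import Path


def touch_stat_file(docroot):
    """Create <docroot>/.stat if missing, otherwise refresh its modification time."""
    stat_file = Path(docroot) / ".stat"
    # touch() creates an empty file, or updates the mtime of an existing one.
    stat_file.touch(exist_ok=True)
    return stat_file
```

Calling touch_stat_file on the cache root has the same effect as creating the empty ".stat" file by hand: every cached file older than that moment is treated as invalidated under /statfileslevel "0".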

Let's study how we can configure invalidation in more detail with the help of the /statfileslevel setting.

/statfileslevel

You can use the /statfileslevel property of the dispatcher configuration file to selectively invalidate cached files according to their path. The /statfileslevel mechanism follows these rules:
  1. Dispatcher creates .stat files in each folder from the docroot folder down to the level that you specify. The docroot folder is level 0.
  2. When a file is updated, dispatcher locates the folder on the file path that is at the statfileslevel and invalidates all files below that folder.
  3. If the level of the updated file is less than the statfileslevel, all files in that file's folder are invalidated. Files in folders below it are not invalidated.
  4. When a file is updated, the .stat files in every folder from that folder up to the root level inclusive are touched, so the files directly in those folders are invalidated as well.
  For a better understanding of the /statfileslevel rules, let's consider a couple of examples.
- statfileslevel 0 - A single .stat file is created at the root of the cache directory:
       /.stat
- statfileslevel 1 - A .stat file at the root of the cache and under each direct child directory:
       /.stat
       /content/.stat
- statfileslevel 2 - A .stat file at the root of the cache, under each direct child directory, and under each of their direct child directories:
       /.stat
       /content/.stat
       /content/foo/.stat
       /content/bar/.stat
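The listings above can be reproduced with a small Python sketch that computes which .stat files would be touched for an updated file. It assumes that .stat files are touched from the docroot down to the level of the updated file or the configured statfileslevel, whichever is smaller. The function name stat_files_to_touch is hypothetical, and paths are taken relative to the docroot.

```python
from pathlib import PurePosixPath


def stat_files_to_touch(updated_path, statfileslevel):
    """List the .stat files touched when `updated_path` is invalidated.

    `updated_path` is relative to the docroot, which counts as level 0.
    """
    folders = PurePosixPath(updated_path.lstrip("/")).parent.parts
    # Never go deeper than the file's own folder or the configured level.
    depth = min(len(folders), statfileslevel)
    touched = []
    for level in range(depth + 1):
        folder = "/".join(folders[:level])
        touched.append("/" + folder + "/.stat" if folder else "/.stat")
    return touched
```

For example, for /content/foo/page.html the sketch yields only /.stat at level 0, and /.stat, /content/.stat and /content/foo/.stat at level 2, matching the layouts listed above.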

Let's see what the default /statfileslevel "0" looks like:

- There is only one stat-file, at the root folder of our docroot/cache root.
- The scope of responsibility of that stat-file is the whole file tree under htdocs.
- If any file in this tree has a modification time older than the stat-file's modification time, the dispatcher considers that file invalidated.
- statfileslevel = 0 is only suitable for a learning demo. It is not recommended in a live production environment, as it may end up invalidating the whole cache every time any file gets modified.


Now let's set /statfileslevel "4", a more realistic scenario, and see how invalidation works:

- There will be stat files at 5 levels of the folder structure, from 0 (root) to 4 inclusive.


- Stat-files at levels less than 4 are responsible only for the files directly inside the directory that contains the stat-file. That means that if the stat-file inside the content/geometrixx/en folder is newer than a file in this folder, that file is invalidated, but the validity of files in all other folders is determined by other stat-files.

- Stat-files at the level equal to the statfileslevel value (level 4 in our case) are responsible for the whole underlying tree, which begins at the folder containing the stat-file and extends down to the lower levels of the file tree. That means that if the stat-file inside the content/geometrixx/en/products folder has a modification time newer than a file in the underlying tree, including the products folder itself, the dispatcher considers that file invalidated. The validity of all files outside this tree is determined by other stat-files.
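The scope rules above boil down to one question: which stat-file decides whether a given cached file is still valid? The following Python sketch answers it under the model described in this article. The function name governing_stat_file is hypothetical, and paths are relative to the docroot.

```python
from pathlib import PurePosixPath


def governing_stat_file(cached_path, statfileslevel):
    """Return the .stat file whose modification time governs `cached_path`."""
    folders = PurePosixPath(cached_path.lstrip("/")).parent.parts
    # Above the configured level, each folder's own .stat file applies.
    # At or below it, the .stat file at the configured level applies.
    depth = min(len(folders), statfileslevel)
    folder = "/".join(folders[:depth])
    return "/" + folder + "/.stat" if folder else "/.stat"
```

With statfileslevel 4, /content/geometrixx/en/products.html is governed by /content/geometrixx/en/.stat, while any file inside or below the products folder is governed by /content/geometrixx/en/products/.stat. With statfileslevel 0, every file is governed by /.stat.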

Hope you now understand the invalidation mechanism and the use of the .stat file.



