An interesting idea: if there is a need (due to performance) to cache a large number of objects in memory, what if, instead of keeping the objects on the Java heap, the elements were stored in NIO native byte buffers?
In theory this should (a) reduce Java heap usage, (b) allow better use of the process memory space (i.e. use native memory that is available to the process but not included in the Java heap), and (c) make elements in the cache ”unshared”, since they would be stored separately in native memory (basically this makes cached elements truly immutable).
The cons of this approach are that (a) access to the cache is slower (however, if the cache exists to reduce DB access, it is still significantly faster than the database), (b) elements are not shared, since the cache never gives out shared instances (which can reduce the effectiveness of the cache), and (c) the cache requires some complex extra logic (indexing to locate any single object in the cache, updating/deleting/inserting elements, etc.).
Technically, such a cache would use a number of DirectByteBuffer instances, in which the elements are stored as serialized objects.
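As a minimal sketch of that storage idea (OffHeapSlot is an illustrative name I made up, not an existing API), the following class serializes one object into a direct buffer and deserializes a fresh copy on every read:

import java.io.*;
import java.nio.ByteBuffer;

public final class OffHeapSlot {
    private final ByteBuffer buffer; // allocated outside the java heap

    public OffHeapSlot(Serializable value) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(value);
        }
        byte[] bytes = bos.toByteArray();
        buffer = ByteBuffer.allocateDirect(bytes.length); // native memory
        buffer.put(bytes);
        buffer.flip();
    }

    public Object get() throws IOException, ClassNotFoundException {
        byte[] bytes = new byte[buffer.remaining()];
        buffer.duplicate().get(bytes); // duplicate() keeps the position intact for reuse
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return ois.readObject(); // always a fresh, unshared instance
        }
    }
}

Note that get() pays the full deserialization cost on every call, which is exactly the speed trade-off described above.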
For example, a draft implementation could look something like this (a sketch of the index structures follows the list):
(1) On the Java heap there is a ”Map of ObjectId to CacheIndex”, where ObjectId is a *light weight* object identifier, and CacheIndex indicates which native buffer the element resides in and its location within that buffer.
(2) Deleting an element from the cache would happen by setting a bit in a ”deleted” mask, which lets the system know how many entries in each buffer are still in use. If a significant number of entries have been deleted, the remaining entries can be relocated into another buffer (to compact memory) and the old buffer discarded.
(3) Updating is based on the same principle as deleting: each update marks the old entry as ”deleted”, and the new entry is written either to the end of the buffer (if the buffer can grow) or into a new buffer.
(4) Inserting an element into a buffer works the same as updating, except that the delete step is skipped.
(5) A possible ”caveat emptor” is that every element must be handled as a separately serialized object in the byte buffer. This can cause memory leaks if there are strong dependencies between separate ”top level” cacheable elements, since Java serialization pulls the whole reachable object graph into each serialized form. To avoid this problem, there must not be any hard references from ”top level” elements to each other; some indirection logic (e.g. referring to other elements by their ObjectId) must be used instead.
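The index structures from steps (1)-(3) could be sketched roughly as follows; all class and field names here are assumptions chosen for illustration, not an existing library:

import java.util.BitSet;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public final class OffHeapCacheIndex {

    // Step (1): location of one serialized entry inside the native buffers.
    static final class CacheIndex {
        final int bufferId;  // which DirectByteBuffer the entry lives in
        final int offset;    // byte offset of the serialized form
        final int length;    // length of the serialized form in bytes
        CacheIndex(int bufferId, int offset, int length) {
            this.bufferId = bufferId;
            this.offset = offset;
            this.length = length;
        }
    }

    // Steps (2)-(3): per-buffer bookkeeping with a "deleted" mask and
    // live/dead counters, used to decide when compaction pays off.
    static final class BufferState {
        final BitSet deleted = new BitSet();
        int live;
        int dead;
        boolean worthCompacting() { return dead > live; } // example threshold only
    }

    // Heap-side map: light weight object id -> location in native memory.
    final Map<Long, CacheIndex> index = new ConcurrentHashMap<>();

    // Update = mark the old slot deleted, then point the index at the
    // freshly written entry (appended to a buffer's end, or to a new buffer).
    void update(long id, CacheIndex freshEntry, BufferState oldBuffer, int oldSlot) {
        oldBuffer.deleted.set(oldSlot);
        oldBuffer.live--;
        oldBuffer.dead++;
        index.put(id, freshEntry);
    }
}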
Some notes:
(1) Now, on top of the cache itself, some intelligence is also required in the client logic to be cache aware, i.e. the logic must understand that every access to the cache results in a new, unshared element instance. If the logic doesn’t take this into account, memory usage can increase very sharply (see the demo after these notes).
(2) Reading objects causes some overhead due to serialization. If very large objects are cached, Java serialization can temporarily require a significant amount of extra memory, both when putting an element into the cache and when getting it out. It’s up to the profiler to decide whether this is a significant problem.
(3) The fact that elements coming from the cache are never shared can help ensure the robustness of the application logic, since sharing cached elements is always risky: care is needed to avoid accidentally modifying a shared cache element (and since the cache is transparent, this can be rather tricky in application logic). A not insignificant benefit is also more robust concurrency logic. Namely, if any cached element uses ”lazy initialization” and the application code runs multiple threads, the synchronization for such lazy initialization can easily become improper; i.e. the lazy initialization logic must be safe for concurrent thread access (which, by the way, doesn’t necessarily mean use of the synchronized keyword). Since elements from this cache are not shared, the occurrence of such problems is greatly reduced.
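To make notes (1) and (3) concrete, here is a small demo reusing the hypothetical OffHeapSlot sketch from above (Customer is an illustrative type): every read produces a distinct instance, and a local mutation never leaks back into the cache.

import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class UnsharedDemo {
    static class Customer implements Serializable {
        String name;
        Customer(String name) { this.name = name; }
    }

    public static void main(String[] args) throws Exception {
        OffHeapSlot slot = new OffHeapSlot(new Customer("Alice"));

        Customer a = (Customer) slot.get();
        Customer b = (Customer) slot.get();
        System.out.println(a != b); // true: every get() deserializes a fresh copy

        a.name = "Bob";             // local mutation only
        System.out.println(((Customer) slot.get()).name); // still "Alice"

        // the flip side (note 1): naive repeated reads multiply heap usage
        List<Customer> copies = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            copies.add((Customer) slot.get()); // 1000 separate instances
        }
        System.out.println(copies.size());
    }
}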