Tuesday, October 29, 2013

Paper Review


One thing that confuses many programmers is how the variables we declare and use are managed in memory. It is also a hot topic that has come up countless times in the CS371P class. For example, Downing likes to ask us whether a new variable is stored on the stack or on the heap. Sometimes we have to guess, because there are so many scenarios in both Java and C++.

Before discussing what the paper presents, we need to pin down what memory management literally means. It is the process of recognizing when allocated objects are no longer needed, deallocating the memory used by such objects, and making it available for subsequent allocations.

There are two main approaches to memory management, and conveniently C++ and Java cover one each. Java uses automatic memory management, while C++ uses explicit memory management.

The author of the paper begins with the Java garbage collector. As we know, Java keeps local primitives (and references to objects) on the stack, and they exist only within the scope of the function they are created in. Objects, on the other hand, are created on the heap using the keyword new. It is also common knowledge that the garbage collector is responsible for deallocating memory that is no longer used. What we are interested in is how the gc does this job. According to the paper, the JVM has a full state diagram with a total of eight possible object states. Among them, the garbage collector is interested in the reachability state (are there still references in the program to this object?) and the finalization state. The garbage collector moves each object through this state diagram. Once an object becomes finalizer-reachable, it will eventually be garbage collected.

C++, on the other hand, is quite different. The author first introduces the locations where objects can be placed: static memory, the stack, and the heap. As in Java, objects located on the stack are fully managed by the compiler, and there is no need to manually allocate memory for them or delete them when they are no longer used. The limitation is that the size of such objects must be known at compile time. Objects on the heap are dynamically allocated at runtime, and this is what we will focus on, since we have to manage that memory manually using new/delete. This is the cost of flexibility without a garbage collector.
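To make the three locations concrete, here is a minimal sketch. The names callCount, stackSum, and heapSum are made up for this illustration and do not come from the paper.

```cpp
// Hypothetical names, chosen only for this illustration.
int callCount = 0;               // static storage: lives for the whole program

int stackSum(int a, int b) {
    int sum = a + b;             // automatic (stack) storage: released at return
    ++callCount;
    return sum;
}

int heapSum(int a, int b) {
    int* sum = new int(a + b);   // dynamic (heap) storage: lives until delete
    int result = *sum;
    delete sum;                  // forgetting this line would leak the int
    ++callCount;
    return result;
}
```

Both functions compute the same value. The only difference is who is responsible for releasing the temporary: the compiler in stackSum, the programmer in heapSum.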

As mentioned in our lecture, the author also points out a potential problem during memory deallocation: exceptions. The paper presents different techniques in Java and C++ to address this problem.
For Java, the trick is finally blocks. Although C++ also supports try/catch blocks to handle exceptions, of the two languages only Java has finally blocks. A finally block always runs last: after the try block and, if an exception is caught, after the matching catch block.
Code is a better example than words.
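void JavaFinally(){
       // Resource stands for any class that has a cleanup() method
       Resource resource = new Resource();
       try{
              // something which could throw an exception
       }finally{
              resource.cleanup(); // runs whether or not the try block threw
       }
}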
Regardless of how the try block is left, the finally block is executed and calls the cleanup method.
For C++, the techniques include 1) avoiding the need for manually managed dynamic memory, 2) using smart pointer types, and 3) possibly using a custom garbage collector.
Method 1 is straightforward: we just use the STL, which provides many generic class implementations that meet our common requirements. Vector is the example we discussed in class.
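As a rough sketch of method 1, the hypothetical helper below (makeSquares is not from the paper) needs storage whose size is only known at run time, yet contains no new or delete because std::vector owns the heap buffer.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: builds the squares 0, 1, 4, ... for a size known only at run time.
std::vector<int> makeSquares(std::size_t n) {
    std::vector<int> squares;
    squares.reserve(n);          // one heap allocation, handled by the vector
    for (std::size_t i = 0; i < n; ++i) {
        squares.push_back(static_cast<int>(i * i));
    }
    return squares;              // the vector frees its buffer in its destructor
}
```

The caller receives an ordinary value and never has to remember to release anything.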
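The second method is interesting. The basic idea is to implement a class that behaves like a normal pointer and has a destructor that performs the cleanup. As we discussed in class, auto_ptr<> from the standard library is such a class (it has been deprecated since C++11 in favor of std::unique_ptr). Compare the manual version with the auto_ptr<> version: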
void HeapAlloc(){
       int* x = new int(0);
       int* z;
       {
              *x = 3;
              int& y = *new int(*x); // bind a reference to a second heap int
              y++;
              z = &y;                // keep its address so it can be deleted later
       }
       (*x)++;
       delete x;
       ++(*z);
       delete z;
}

#include <memory>
void CPPauto_ptr(){
       std::auto_ptr<int> x(new int(0));
       std::auto_ptr<int> z;
       {
              *x = 3;
              std::auto_ptr<int> hlp(new int(*x));
              int& y = *hlp;
              y++;
              z = hlp;       // ownership moves from hlp to z
       }
       (*x)++;
       ++(*z);
}                            // x and z delete their ints automatically here
Comparing the two sections of code, we can see that the delete statements disappear once auto_ptr is used. Another point worth mentioning is that if an exception happens in std::auto_ptr<int> hlp(new int(*x));, the resources occupied by the previously allocated object will be cleaned up properly. This is also the reason why each allocated object is passed directly to an auto_ptr<>, as in the declarations of x and hlp. Using this technique guarantees that nothing leaks if an exception happens.
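The no-leak guarantee can be checked with a small experiment. This sketch swaps in std::unique_ptr (C++11) for auto_ptr<>, since auto_ptr<> was removed in C++17; the Tracked type and both functions are hypothetical and exist only for this illustration.

```cpp
#include <memory>
#include <stdexcept>

// Hypothetical type that counts how many instances are currently alive.
struct Tracked {
    static int alive;
    Tracked() { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Allocates one Tracked object, then throws if asked to.
void mayThrow(bool shouldThrow) {
    std::unique_ptr<Tracked> owner(new Tracked());
    if (shouldThrow) {
        throw std::runtime_error("simulated failure");
    }
}   // owner's destructor deletes the object on both exit paths

// Returns the number of live Tracked objects after an exception was thrown and caught.
int liveAfterException() {
    try {
        mayThrow(true);
    } catch (const std::runtime_error&) {
        // stack unwinding has already destroyed owner at this point
    }
    return Tracked::alive;
}
```

If owner were a raw pointer with a delete at the end of mayThrow, the throw would skip that delete and the counter would stay at one.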

The author doesn't provide many details about the third method and only says that there are systems that implement garbage collection for C++ similar to Java's. The limitation of most of them is also obvious: they don't destroy objects properly and simply reuse the memory an object occupied.

In conclusion, although many topics in the paper were discussed in class, it is still worthwhile to review them together, because doing so connects the pieces of knowledge to each other. The comparison of memory management in Java and C++ shows us both the similarities and the differences, and it strengthens our programming abilities in both languages.

