Java equivalents of malloc(), new, free() and delete


13 February 2013


http://www.javamex.com/java_equivalents/memory_management.shtml

C++ provides a few ways to allocate memory from the heap¹:

the new operator will allocate and initialise an instance of a particular object or array;
the malloc() function, inherited from C, will allocate an arbitrary block of memory which can then be used for any purpose, such as an array or struct.

In either case, C++ also requires the programmer to free allocated memory once it is no longer needed (or else leak memory). The free() function recovers memory allocated via malloc(). The delete keyword destroys an object: it runs the object's destructor, which could involve some cleanup code, and then deallocates the memory.

To understand the Java equivalents of these various operations, it's worth first seeing a general overview of memory management in the two languages, which we'll do in the next section. Then in the following sections, we'll look at details of each of the four abovementioned operations in turn.

The basics: comparing memory management in C++ and Java

For a C++ program to function correctly, the programmer is effectively required to allocate memory in one of the above ways if it is to escape the function in which it is allocated. For example, the following is incorrect:

// Warning: do not try this at home!
char *makeString() {
  char str[] = "Wibble";  // array lives in this function's stack frame
  return str;             // pointer dangles as soon as the function returns
}

The problem is that str will be allocated off the stack, and this memory will become invalid (or at least, liable to be overwritten) when the function exits. To correct the above function, we need to make it allocate the space for the string via malloc() (or new):

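#include <stdlib.h>
#include <string.h>

char *makeString() {
  char *str = (char *) malloc(200);  // heap block outlives this function
  strcpy(str, "Wibble");
  return str;                        // caller must eventually free() it
}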

It would then be up to the caller to call free() on the resulting string as and when it had finished with it. Things pretty much have to be this way in C/C++: in all but the most trivial cases, it would be virtually impossible for the compiler to determine that an object was "finished with". (Consider, for example, that in C/C++ any pointer could be made to point to that object at some arbitrary moment.)

In Java, things are a little different. The JVM sees Java objects as "objects", not simply as accesses to memory locations. And indeed, Java doesn't allow access to any memory that isn't part of an object. If there are no more references to a particular object, the Java Virtual Machine (JVM) knows that there won't be some pointer lurking about waiting to access that object's memory space.

This all means that in Java, memory allocation can be managed much more by the runtime system:

the JVM can work out whether an object escapes the method or not, and so the JVM can decide where to allocate the memory (stack, heap, possibly even registers...);
the JVM can handle garbage collection: that is, it can work out when an object is no longer referenced and can thus be "deleted";
Java basically has no such thing as a pointer² in the true sense: all objects are accessed by a reference, but this is not an actual memory location.

We've said that the JVM can decide to allocate memory for an object on the stack (if it determines this is safe and desirable). But this is really an implementation detail. In terms of how it looks to the programmer, in Java, all objects are effectively allocated from the heap.
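For comparison with the C makeString() example, here is a minimal Java sketch of the same idea. The class name MakeStringDemo is made up for illustration. The method returns a reference to a newly allocated array; the caller has nothing to free, and the array stays valid for as long as something still references it.

```java
class MakeStringDemo {
    // Allocates a new char array and returns a reference to it.
    // Unlike the broken C version, the array cannot become invalid
    // while the caller still holds the reference.
    static char[] makeString() {
        char[] str = new char[] {'W', 'i', 'b', 'b', 'l', 'e'};
        return str;
    }

    public static void main(String[] args) {
        char[] s = makeString();
        // The array is still valid here; no free() is needed afterwards.
        System.out.println(new String(s));
    }
}
```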

The `new` keyword

The new operator is somewhat similar in the two languages. The main difference is that every object and array must be allocated via new in Java. (And indeed arrays are actually objects in Java.) So whilst the following is legal in C/C++ and would allocate the array from the stack...

// C/C++ : allocate array from the stack
void myFunction() {
  int x[2];
  x[0] = 1;
  ...
}

...in Java, we would have to write the following:

// Java : have to use 'new'; JVM allocates
//        memory where it chooses.
void myFunction() {
  int[] x = new int[2];
  ...
}

Internally, the JVM chooses where the memory is actually allocated (stack, "young generation" heap, "large object" heap etc). A super-optimised JVM could even allocate the two-element array to two registers if circumstances permitted. But to the programmer, the array is allocated from "the heap".
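The point that arrays are themselves objects in Java can be checked with a small sketch like the following. The class and method names are illustrative only.

```java
class ArrayIsObjectDemo {
    // Allocates a two-element int array with 'new'; elements start at 0.
    static int[] makePair() {
        int[] x = new int[2];
        x[0] = 1;
        return x;
    }

    public static void main(String[] args) {
        int[] x = makePair();
        Object o = x;  // legal: every array is also an Object
        System.out.println(o.getClass().isArray());
        System.out.println(x.length);
    }
}
```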

The `delete` operator in Java...?

From the above, you'll have gathered that Java has no direct equivalent of delete. That is, there is no call that you can make to tell the JVM to "deallocate this object". It is up to the JVM to work out when an object is no longer referenced, and then deallocate the memory at its convenience.

Now if you're a C++ programmer, at this point you might be thinking "that's cool, but what about destructors?" In C++, if a class has a destructor, it is called at the point of invoking delete and allows any cleanup operations to be carried out. In Java, things are slightly different:

there's no way to guarantee that any cleanup function will be called at the point of deallocation...
...but given Java's automatic garbage collection, cleanup functions are generally much less necessary;
Java provides a mechanism called finalization, which is essentially an "emergency" cleanup function with a "weak" guarantee of being called;
Java's finally mechanism provides another means of performing certain common cleanup functions (such as closing files).

An explicit destructor isn't needed in Java to deallocate "subobjects". In Java, if object A references object B, but nothing else does, then object B will also be eligible for automatic garbage collection at the same time as A. So there's no need for a finalizer (nor would it be possible to write one) to explicitly deallocate object B.

Garbage collection and finalization

The end of an object's life cycle in Java generally looks as follows:

at some point, the JVM determines that the object is no longer reachable: that is, there are no more references to it, or at least, no more references that will ever be accessed; at this point, the object is deemed eligible for garbage collection;
if the object has a finalizer, then it is scheduled for finalization;
if it has one, then at some later point in time, the finalizer may be called by some arbitrary thread allocated by the JVM for this purpose;
on its "next pass", the garbage collector will check that the finalized object is still unreachable³;
then at some arbitrary point in time in the future (but always after any finalization), the memory occupied by the object may actually be deallocated.

You'll notice that there are a few "mays" and "arbitraries" in this description. In particular, there are a few implications that we need to be aware of if we use a finalizer:

the finalizer may never actually get called, even if the object becomes unreachable;
having a finalizer adds some extra steps to an object's life cycle so that in general, finalizers delay object deallocation;
because the object needs to be accessed later, finalizers prevent optimisations that can be made to allocation/deallocation: for example, our notion that the JVM can put an object on the stack goes out the window if the object will later need to be accessed by a separate finalizer thread at some arbitrary point in time⁴.

So finalizers should really only be considered an "emergency" cleanup operation used on exceptional objects. A typical example is something like a file stream or network socket. Although they provide a close() method, and we should generally try to use it, in the worst case, it's better than nothing to have the stream closed at some future point after the stream object has become unreachable. And we know that in the very worst case, the stream or socket will be closed when our application exits, at least on sensible operating systems⁵. This last point is important: the finalizer may not get called at all even in "normal" operation⁶, so it's really no good putting an operation in a finalizer whose execution is crucial to the system behaving correctly once the application exits.

Another typical use of a finalizer is for objects linked to some native code that allocates resources outside the JVM. The JVM can't automatically clean up or garbage collect such resources.

How to create a finalizer

So despite all the warnings, if you still want to create a finalizer, then essentially you need to override the finalize() method of the given class. (Strictly speaking, every class otherwise inherits an empty finalize() method from Object, but good VMs can "optimise away" the default empty finalizer.) A typical finalizer looks as follows:

@Override
protected void finalize() throws Throwable {
  try {
    // ... cleanup code, e.g. deleting a temporary file,
    //     calling a native deallocate() method etc...
  } finally {
    super.finalize();  // always give the superclass its chance to clean up
  }
}

General points about coding finalizers:

you should always call the superclass finalize() in case it needs to do some additional cleanup;
you should always synchronize on shared resources, as finalizers can be run concurrently;
you should make no assumptions about the ordering of finalizers: if two objects are garbage collectable, there is no fixed order in which their finalizers will be run (and they could be run concurrently);
finalizers should run quickly, as they could otherwise delay the finalization (and hence garbage collection) of other objects;
you should not rely on seeing any exception thrown by a finalizer: the JVM will generally swallow them (but they will still interrupt the finalization of the object in question, of course).

Alternatives to finalizers

In general, it's better to use an alternative to finalize() methods where possible: they really are a last resort. Safer alternatives include:

for objects with a well-defined lifetime, use an explicit close or cleanup method;
where useful, call the cleanup method in a finally block;
use a shutdown hook (see Runtime.addShutdownHook()) to perform actions such as deleting temporary files that just need to be done "when the application exits".
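The first two alternatives might look something like the following sketch. TempResource and CleanupDemo are hypothetical classes invented for this illustration, standing in for something like a file stream; the point is only that close() sits in a finally block, so it runs whether or not the work in the try block throws.

```java
// Hypothetical resource with an explicit cleanup method.
class TempResource {
    private boolean open = true;

    void write(String data) {
        if (!open) {
            throw new IllegalStateException("resource already closed");
        }
        // ... a real class would write to a file, socket etc. here ...
    }

    // Explicit cleanup method, called by whoever owns the resource.
    void close() {
        open = false;
    }

    boolean isOpen() {
        return open;
    }
}

class CleanupDemo {
    // Uses the resource and closes it in a finally block, so close()
    // runs on both the normal path and the exceptional path.
    static void process(TempResource r, boolean fail) {
        try {
            r.write("some data");
            if (fail) {
                throw new RuntimeException("simulated failure");
            }
        } finally {
            r.close();
        }
    }

    public static void main(String[] args) {
        TempResource r = new TempResource();
        try {
            process(r, true);
        } catch (RuntimeException e) {
            System.out.println("caught: " + e.getMessage());
        }
        System.out.println("still open? " + r.isOpen());
    }
}
```

From Java 7 onwards, a class that implements AutoCloseable can be used in a try-with-resources statement, which generates the equivalent of this finally block automatically.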

And of course, don't expect miracles. If the VM segfaults or there's a power cut⁷, there's no cleanup method or shutdown hook that's going to help you. If you are writing some critical application that requires "no data loss", you will have to take other steps (such as using a before-and-after transaction log) to minimise potential problems caused by abnormal shutdown.

A Java equivalent of the `malloc()` function...?

If you've read the earlier sections on memory management and new in Java, you may be wondering why the current section on malloc() even exists. We've just stated that in Java, all memory has to be accessed via well-defined objects. In C/C++, on the other hand, malloc() gives us a pointer to an arbitrary block of memory. And in Java, there's no such thing, right?

Well, strictly speaking this is true: there isn't a way in Java to access "raw" memory via its address (pointer), or at least, not in a way where the address is visible to your Java program. But Java does provide the following rough equivalents to an area of memory allocated via malloc():

if you just want a block of bytes, e.g. in order to buffer input from a file or stream, then a common solution is to use a byte array;
the Java NIO (New I/O) package provides various buffer classes that allow you to manipulate an array or area of memory more flexibly: with methods to get/set a given primitive at a particular offset in the buffer, for example.

So, for example, the equivalent of the following C code:

unsigned char *memarea = (unsigned char *) malloc(200);
*(memarea + 10) = 200;
*((int *) (memarea + 16)) = 123456;
unsigned char ch = memarea[4];

would be something along the following lines in Java, using an NIO ByteBuffer object:

ByteBuffer bb = ByteBuffer.allocate(200);
bb.put(10, (byte) 200);
bb.putInt(16, 123456);
int ch = bb.get(4) & 0xff;

Notice a subtlety of buffer access in Java: we must deal with sign conversions ourselves. To read an unsigned byte from the ByteBuffer, we must store it in an int (or some primitive big enough to handle values between 128 and 255: basically, anything but a byte!), and must explicitly "AND off" the bits holding the sign.
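To make the sign issue concrete, here is a small sketch that stores the value 200 in a buffer and reads it back both ways. The class name and the readUnsigned() helper are our own, not part of the NIO API.

```java
import java.nio.ByteBuffer;

class UnsignedByteDemo {
    // Reads the byte at 'index' as an unsigned value in the range 0..255
    // by widening to int and masking off the sign-extended upper bits.
    static int readUnsigned(ByteBuffer bb, int index) {
        return bb.get(index) & 0xff;
    }

    public static void main(String[] args) {
        ByteBuffer bb = ByteBuffer.allocate(200);
        bb.put(10, (byte) 200);

        System.out.println(bb.get(10));            // signed view: -56
        System.out.println(readUnsigned(bb, 10));  // unsigned view: 200
    }
}
```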

Performance

Note that in Sun's JVM (and probably other JVMs with a good JIT compiler), accesses to ByteBuffers usually compile directly to MOV instructions at the machine code level. In other words, accessing a ByteBuffer really is usually as efficient as accessing a malloced area of memory from C/C++⁸.

Direct byte buffers

In the above example, the byte buffer would actually be "backed" by an ordinary Java byte array. It is even possible to call bb.array() to retrieve the backing array. An alternative that is perhaps closer to malloc() is to create what is called a direct ByteBuffer in Java: a ByteBuffer that is backed by a "pure block of memory" rather than a Java byte array⁹. To do so requires us simply to change the initial call:

ByteBuffer bb = ByteBuffer.allocateDirect(200);
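As a rough sketch of the difference, the two kinds of buffer can be told apart from Java with isDirect() and hasArray(), while the get/put calls stay the same. The class and helper method names here are illustrative. Whether a direct buffer reports a backing array is left unasserted, since that is up to the JVM implementation.

```java
import java.nio.ByteBuffer;

class DirectBufferDemo {
    // Buffer backed by an ordinary Java byte array on the Java heap.
    static ByteBuffer heapBuffer() {
        return ByteBuffer.allocate(200);
    }

    // Buffer backed by memory outside the Java heap.
    static ByteBuffer directBuffer() {
        return ByteBuffer.allocateDirect(200);
    }

    public static void main(String[] args) {
        ByteBuffer heap = heapBuffer();
        ByteBuffer direct = directBuffer();

        System.out.println("heap:   direct=" + heap.isDirect()
                + " hasArray=" + heap.hasArray());
        System.out.println("direct: direct=" + direct.isDirect()
                + " hasArray=" + direct.hasArray());

        // The access methods are identical for both kinds of buffer.
        direct.putInt(16, 123456);
        System.out.println(direct.getInt(16));
    }
}
```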

Note that from Java, we still don't see or have control over the actual memory address of the buffer. Direct byte buffers have the advantage of accessibility via native code: although the address of the buffer isn't available or controllable in Java, it is possible for your Java program to interface with native code (via the Java Native Interface). From the native side, you can:

query the address of a direct buffer;
allocate a direct buffer around a determined address range.

This very last point is quite significant, as it effectively allows things like device drivers to be written in Java (OK, in "Java plus a couple of lines of C" at any rate...).

Note, however, that you generally shouldn't use direct ByteBuffers "by default":

because they're essentially not Java objects (or at least, the actual data isn't), the direct buffer data is not allocated from the Java heap, but rather from the process's remaining memory space;
in practice, direct buffers are difficult to garbage collect, if even possible: Sun's Java 1.4 VM, for example, appears never to reclaim the memory of a direct byte buffer.

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

1. A heap is a block of contiguous memory set aside that can have smaller portions of it allocated out as and when necessary. In the eyes of the programmer, in both C/C++ and Java, we usually talk about allocating from "the heap" in an abstract way. But in neither language is there usually one single block of contiguous memory from which all allocations are made. For performance reasons, good malloc() implementations will internally manage several heaps (e.g. one for large allocations, one for small allocations...); and modern JVMs often have several heaps internally. In most of this discussion, we'll continue to use the term "the heap", as that is how memory is presented to the programmer.
2. The only slight exception in Java is that using the Java Native Interface (JNI), it is possible to wrap a direct ByteBuffer around a block of memory allocated from native code (whose address is thus known). But the memory address still isn't exposed at the Java level, and from Java, the memory must still be accessed via that ByteBuffer object.
3. It's not a terribly useful thing to do, but in principle you could "resurrect" an object during the finalizer by creating a new reference to it. However, a finalizer is only ever called once; once resurrected, the object will never be finalized again (though when all references disappear again, it could be garbage collected).
4. As an example, Hans Boehm in his 2005 JavaOne talk on Finalization, Threads and the Java Memory Model reported an 11-fold reduction in performance of a binary tree test when the nodes had finalizers.
5. Anyone else remember RISC OS...?
6. On the other hand, if an object has a finalizer, the JVM won't garbage collect (deallocate) that object before its finalizer has been run, at least the first time (see note 3).
7. Buying a UPS tends to reduce the frequency of power cuts, just as taking an umbrella out reduces the likelihood of rain.

8. The same is often true of accessing object fields: the JIT compiler can compile accesses to object fields into MOV instructions that "know" the offset of the field in question.
9. You can actually create buffers backed by other types of primitive array, such as an IntBuffer backed by an int array.


Author: mislead

This post was published by mislead in the General category 11 years ago; last updated 13 February 2013.