Time Measurements in Different CPU Views | ![]() ![]() |
Most likely you'll be interested in the wall clock time, that is the duration between the entry and the exit of a method as measured with a clock. For the profiling agent this is a straightforward measurement. While it might seem at first glance that measuring times should not have any significant overhead, this is not so if you need a high resolution measurement. Operating systems offer different timers with different performance overheads.
For example, on Microsoft Windows, the standard timer with a granularity of 10 milliseconds is very fast, because the operating system "caches" the current time. However, the duration of method calls can be as low as a few nanoseconds, so a high resolution timer is needed. A high resolution timer works directly with a special hardware device and carries a noticeable performance overhead. In JProfiler, CPU recording is disabled by default, however, method call recording is always enabled. If you compare the duration of the startup sequence of an application server with and without CPU recording, you will notice the difference.
Wall clock time is measured separately for each thread. In CPU views where the thread selection
includes multiple threads, the displayed times can be larger than the total execution time of
the application. If you have 10 parallel threads of the same class MyThreadClass
whose run()
method take 1 second and "All threads" is selected in the call tree,
the MyThreadClass.run()
node in the call tree will display 10 seconds, even though
only one second has passed.
Since the CPU might be handling many threads with different priorities, the wall clock time is not the time the CPU has actually spent in that method. The scheduler of the operating system can interrupt the execution of a method multiple times and perform other tasks. The real time that was spent in the method by the CPU is called the CPU time. In extreme cases, the CPU time and the wall clock time can differ by a large factor, especially if the executing thread has a low priority.
The standard time measurement in JProfiler is wall clock time. If you wish to see the CPU time in the CPU views, you can change the measurement type in the profiling settings. The problem with CPU time measurement is that most operating systems provide this information with the granularity of the standard timer - high resolution measurements would carry to much overhead. This means the CPU times are only statistically valid for method that have a CPU time bigger that the typical granularity of 10 milliseconds.
The notion of time measurement must be refined further, since not all times are equally interesting. Imagine a server application with a pool of threads that waiting to perform a task. Most of the time would then be in the method that keeps the threads waiting while the actual task will only get a small part of the overall time and will be hard to spot. The necessary refinement is done with the concept of thread status. There are 4 different thread statuses in JProfiler:
Object.wait()
and the current thread
will only become runnable again when some other thread calls Object.notify()
on the same object.
When looking for performance bottlenecks, you're mostly interested in the "Runnable" thread state although it's always a good idea to have a look at the "Net I/O" and "Blocking" thread states in order to check if the network or synchronization issues are reducing the performance of your application.
Nodes in the call tree (methods, classes, packages or Java EE components, depending on the selected aggregation level) are sorted by total time. This is the sum of all execution times of this node on the particular call path as given by the ancestor nodes. Only threads in the current thread selection are considered and only measurements with the currently selected thread status are shown.
Optionally, the call tree offers the possibility to show the inherent time of a node. The inherent time is defined as the total time of of a method minus the time of its child nodes. Since child nodes can only be in unfiltered classes, calls into filtered classes go into the inherent time. If you change your method call recording filters, the inherent times in the call tree can change.
While the call tree view shows all call stacks in your application, the hotspot view shows the methods that take most of the time. Each method can potentially be called through many different call stacks, so the invocation counts in the call tree and the hotspots view do not have to match. The hotspot view shows the inherent time rather than the total time. In addition, the hotspot view offers the option to include calls to filtered classes into the inherent time. Please see the article on hotspots and filters for a thorough discussion of this topic.
When you open a hotspot node, you see a reverse call tree. However, the times that are displayed in those backtraces do not have the same meaning as those in the call tree, since they do not express a time measurement for the corresponding node. Rather, the time displayed at each node indicates how much time that particular call tree contributes to the hot spot. If there is only one backtrace, you will see the hotspot time at each node.
The times that are shown for nodes (methods, classes, packages or Java EE components, depending on the selected aggregation level) in the call graph are the same as those in the hotspots view. The times that are associated with the incoming arrows are the same as those in the first level of the hot spot backtrace, since they show all calling nodes and the cumulated duration of their calls. The time on the outgoing arrows is a measurement that cannot be found in the call tree. It shows the cumulated duration of calls from this node, while the call tree shows the cumulated duration of calls from the current call stack.
![]() ![]() |