Today, GlassFish 3.1 was released.  This is an important milestone for our team and for me as I’ve been working on this release for the last 10 months. As the technical lead for the administration infrastructure area, I’ve been involved with the implementation of many parts of the clustering feature for GlassFish 3.1, which was one of the main release drivers.

The clustering functionality in 3.1 is very much like that of GlassFish 2.1, but the design and implementation are quite different in several areas.  Specifically, the data synchronization that runs when an instance is started uses a more efficient algorithm for determining which files need to be transferred to the instance, and dynamic reconfiguration is now based on command replication rather than state replication.  The remainder of this article focuses on how dynamic reconfiguration works in GlassFish 3.1.  The design and implementation for this was primarily developed by Vijay Ramachandran, but toward the later part of the project, I inherited responsibility for this part of the code as Vijay moved on to another project within Oracle.

The goal of dynamic reconfiguration is to ensure that as administrative actions are taken on clusters and instances, the results of those actions are propagated to all affected instances.  In the 3.1 design, this is accomplished using command replication.  As each asadmin command (or console action via the REST interface) is executed, the command is first executed on the domain administration server (DAS), and then, if necessary, it is replicated to one or more instances. To do this:

  1. commands are annotated to indicate where they need to be executed, for example, on all instances, on all instances in a cluster, or on just one instance,
  2. given the target for the command, the infrastructure determines the specific instances on which to execute the command,
  3. depending on the state of each instance (up or down), either the command is executed on it, or information about the command that was not executed is preserved.

To accomplish the first step, a new annotation, @ExecuteOn, has been developed. The @ExecuteOn annotation tells the framework where the command should be executed. Many of the list-* commands only need to be executed on the DAS because the DAS has all of the information that is necessary.  When a new instance in a cluster is created, the registration command has to be executed on all of the instances in the cluster, so that each instance knows about the other instances in the cluster.

In GlassFish 3.1, many subcommands now take a “--target” option. For example, the --target option for the deploy subcommand specifies the cluster to which the application is to be deployed.  The deploy command uses the @TargetType annotation (also new for 3.1) to specify the valid target types, such as cluster, config, clustered instance, or stand-alone instance.  For example, with deploy, cluster and stand-alone instance are valid targets, but deploying an application to just one clustered instance is not allowed.  The framework (the GlassFishClusterExecutor class) uses this information and the name of the target to determine on which instances to execute the command.
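
To make this concrete, here is a minimal sketch of what a command class using these annotations might look like. The subcommand name, class, and option handling are invented for illustration, and the package names and annotation values are from my reading of the 3.1 sources, so verify them against the actual tree:

import org.glassfish.api.ActionReport;
import org.glassfish.api.Param;
import org.glassfish.api.admin.AdminCommand;
import org.glassfish.api.admin.AdminCommandContext;
import org.glassfish.api.admin.ExecuteOn;
import org.glassfish.api.admin.RuntimeType;
import org.glassfish.config.support.CommandTarget;
import org.glassfish.config.support.TargetType;
import org.jvnet.hk2.annotations.Service;

// Hypothetical subcommand used only to illustrate the annotations.
@Service(name = "update-widget-config")   // the asadmin subcommand name
@ExecuteOn(RuntimeType.ALL)               // run on the DAS, then replicate to instances
@TargetType({CommandTarget.CLUSTER, CommandTarget.STANDALONE_INSTANCE})
public class UpdateWidgetConfigCommand implements AdminCommand {

    // The --target option; "server" (the DAS) is the conventional default.
    @Param(optional = true, defaultValue = "server")
    private String target;

    @Override
    public void execute(AdminCommandContext context) {
        // Executed first on the DAS; the framework then replicates the
        // command to each instance selected by the target.
        context.getActionReport().setActionExitCode(ActionReport.ExitCode.SUCCESS);
    }
}

With this in place, “asadmin update-widget-config --target mycluster” would run once on the DAS and then be replicated to every instance in mycluster.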

Once the framework has the list of instances, the ClusterOperationUtil class takes care of executing the command on each instance. If the instance is down, a state file on the DAS (config/.instancestate) is updated with information about what command failed to execute.  The information in this file is used in the output of the list-instances subcommand to report whether an instance needs to be restarted because it missed an update. Once an instance has missed an update, no further updates are sent to that instance until it is restarted.
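
Conceptually, the per-instance handling looks something like the following sketch. This is not the actual ClusterOperationUtil code; the Instance type and the two helper methods are hypothetical names used only to show the up/down logic described above:

import java.util.List;
import java.util.Map;

// Illustrative sketch only -- not the real ClusterOperationUtil logic.
class ReplicationSketch {

    // Hypothetical view of an instance, as known to the DAS.
    interface Instance {
        boolean isRunning();
        boolean hasMissedUpdate();  // tracked via config/.instancestate on the DAS
    }

    void replicate(String command, Map<String, String> params, List<Instance> targets) {
        for (Instance instance : targets) {
            if (instance.hasMissedUpdate()) {
                // The instance already missed an update; send nothing further
                // until it has been restarted and resynchronized.
                continue;
            }
            if (instance.isRunning()) {
                sendCommand(instance, command, params);   // replicate the command now
            } else {
                recordFailedCommand(instance, command);   // note the miss in .instancestate
            }
        }
    }

    // Hypothetical helpers standing in for the real transport and state file.
    void sendCommand(Instance i, String command, Map<String, String> params) { }
    void recordFailedCommand(Instance i, String command) { }
}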

When you execute commands that are replicated to instances, the command output may include warnings about being unable to replicate commands if a target instance is down. This output is suppressed if the instance has never been started.  The command replication framework implements state transitions for the instances, which are recorded in the .instancestate file.

Because commands are replicated only to those instances that need the information, the domain.xml files for the instances will not be identical over time.  However, each instance does have the information that it needs. When an instance is restarted, the synchronization process copies the domain.xml from the DAS to the instance, bringing them back into exact synchronization.

When implementing an additional asadmin subcommand for GlassFish 3.1, it may be necessary to use the @ExecuteOn and @TargetType annotations to make the command execute correctly.

For years I’ve known that Solaris DTrace is a very useful and powerful tool. However, I always had the (wrong) perception that it was difficult to use, so I never took the time to learn it.  Today I finally took the time to learn enough about DTrace to be able to find the Java code that makes a particular system call. Here’s how to do it.

First, how did I get to this point?  Recently I’ve been analyzing the performance of the GlassFish application server, working on improving the performance of the upcoming 3.1 release. I was wondering which JAR files were being accessed by the server, so I used truss on Solaris to trace the system calls and looked for the open64 calls. This nicely displays the filename of every file that is opened. In the list, I noticed some files that I did not expect to be opened for this particular test, which raised the question of what Java code is actually triggering the opening of those files.  So in steps DTrace to provide the answer.

DTrace uses scripts to define probes in the software. When a probe fires, a sequence of actions can be executed.  One of the available actions is to dump the user-level stack trace, and DTrace supports printing Java stack frames.  So here is the script I used to get the stack trace for the code that was calling open64:

syscall::open64:entry
/pid == $target && 
copyinstr(arg0) ==
  "/home/trm/test/3.1-web/glassfish3/glassfish/modules/cluster-ssh.jar"/
{
  ustack(50, 5000);
}

The first line specifies the probe, i.e., the entry point of the open64 system call. Lines 2-4 specify a predicate that determines when to trigger the probe.  In this case, the process identifier (pid) needs to match that of the target process, and the first argument to the system call needs to match the given filename. The “copyinstr” call copies the argument string from user space into the kernel, where the DTrace predicate is evaluated. The ustack function prints the user stack trace; its arguments request up to 50 frames and reserve 5000 bytes of string space, which allows the JVM’s DTrace helper to translate the Java frames.

To use this script, I ran GlassFish (essentially a Java program) using DTrace. First, I copied the Java command line from the GlassFish server.log file into a script.  Fortunately, GlassFish logs its startup command to the log file. Then I added the dtrace command that is needed to run the script, e.g.,

pfexec dtrace -s trace.d -c 'java -cp ...'

The pfexec command causes dtrace to run with superuser privileges. The script above is in a file called trace.d, which is specified with the -s option. The command in quotes after the -c option is the java command used to start GlassFish (this is the part taken from the server.log file); the command run by DTrace is the $target that is referenced in the script.  The next step was to run the script, which starts the GlassFish server with the probe enabled.  From another window, I triggered the behavior that causes the unexpected open to happen, and here is the beginning of the output:

CPU     ID                    FUNCTION:NAME
  0   2239                     open64:entry
              libc.so.1`__open64_syscall+0x7
              libc.so.1`open64+0xd1
              libhpi.so`sysOpen+0x3a
              libjvm.so`JVM_Open+0x3d
              libzip.so`Java_java_util_zip_ZipFile_open+0x95
              java/util/zip/ZipFile.open(Ljava/lang/String;IJ)J
              java/util/zip/ZipFile.<init>(Ljava/io/File;I)V
              java/util/jar/JarFile.<init>(Ljava/io/File;ZI)V
              java/util/jar/JarFile.<init>(Ljava/lang/String;Z)V
              org/apache/jasper/compiler/JspUtil.expandClassPath(Ljava/util/List;)Ljava/util/List;
              org/apache/jasper/compiler/Jsr199JavaCompiler.setClassPath(Ljava/util/List;)V
              org/apache/jasper/compiler/Compiler.setJavaCompilerOptions()V
              org/apache/jasper/compiler/Compiler.generateClass()V
              org/apache/jasper/compiler/Compiler.compile(Z)V
              org/apache/jasper/JspCompilationContext.compile()V
              org/apache/jasper/servlet/JspServletWrapper.service(Ljavax/servlet/http/HttpServletRequest;Ljavax/servlet/http/HttpServletResponse;Z)V
              org/apache/jasper/servlet/JspServlet.serviceJspFile(Ljavax/servlet/http/HttpServletRequest;Ljavax/servlet/http/HttpServletResponse;Ljava/lang/String;Ljava/lang/Throwable;Z)V
              org/apache/jasper/servlet/JspServlet.service(Ljavax/servlet/http/HttpServletRequest;Ljavax/servlet/http/HttpServletResponse;)V
              javax/servlet/http/HttpServlet.service(Ljavax/servlet/ServletRequest;Ljavax/servlet/ServletResponse;)V

This shows that open64 was called as part of compiling the code that is produced from a JSP.

Looking at the system calls that are generated by a program and then understanding why the code is generating those system calls is a useful performance analysis technique. DTrace is useful in making the jump from seeing an unexpected system call to understanding what code triggers it.

Recently I worked on making sure that GlassFish 3.1 can use a multihomed server effectively.  This article talks about what it takes to configure a GlassFish domain to use multihoming. Unfortunately, this takes some detailed configuration at this point; there is an RFE filed to make this easier.

Briefly, a multihomed server has multiple IP addresses.  There are two primary use cases for multihomed servers:

1. Multiple distinct installations and/or domains of GlassFish are being operated on a server, with the intent of having one domain use one network and another domain use a different network. For a particular DAS or instance, all of the ports for that instance are bound to the same host name.

2. A server is attached to multiple networks, e.g., a front-end network for web requests, a back-end administrative network, and a back-end database network. The HTTP/S listeners are bound to the front-end network, while the admin-listener, GMS traffic, etc., are bound to the administrative network. Presumably in this case, the nodes would be defined to use the administrative network.

To configure either of these cases, the “address” attributes for all of the listeners must be configured to use a specific address rather than “0.0.0.0”. The address can either be an IP address or a DNS name. In each case the attribute is called “address”, but for some listeners, the default of “0.0.0.0” doesn’t show up in the domain.xml, so it has to be added.  The easiest way to find all of the addresses that need to be set is to search for “port” attributes.

For example, in the default domain.xml file, the “server-config” (which is used for the DAS) has the following entry:

<network-listener port="8080" protocol="http-listener-1" 
    transport="tcp" name="http-listener-1"
    thread-pool="http-thread-pool"/>

To configure this listener to bind to a specific address, such as 192.168.0.5, set it as follows:

<network-listener address="192.168.0.5" port="8080"
    protocol="http-listener-1" transport="tcp" 
    name="http-listener-1" thread-pool="http-thread-pool"/>

Now, when this domain starts, the http-listener-1 will listen only on the address 192.168.0.5 rather than all addresses. By doing this for all of the ports on which the server listens, either of the use cases above can be supported.
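
To see what this change means at the socket level, here is a small standalone Java example (not GlassFish code, and it assumes 192.168.0.5 is one of the host’s addresses): binding to the wildcard address accepts connections on every interface, while binding to a specific address restricts the listener to one network.

import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;

public class BindExample {
    public static void main(String[] args) throws IOException {
        // Equivalent of address="0.0.0.0": listen on all interfaces.
        ServerSocket wildcard = new ServerSocket(8080);

        // Equivalent of address="192.168.0.5": listen on one network only.
        ServerSocket specific =
            new ServerSocket(8081, 50, InetAddress.getByName("192.168.0.5"));

        System.out.println("wildcard: " + wildcard.getLocalSocketAddress());
        System.out.println("specific: " + specific.getLocalSocketAddress());
    }
}

Conceptually, the address attribute simply controls which local address each listener passes down to the socket layer when it binds.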

To make this configuration change from the command line, the asadmin command is:

asadmin set configs.config.server-config.network-config.network-listeners.network-listener.http-listener-1.address=192.168.0.5

After making this change, restart the server; using the netstat command, you can then verify that it is listening only on the specified address.

Here is the list of addresses that need to be set in the default configuration:

  • JMX System Connector
  • JMS Provider
  • HTTP Listener
  • HTTP/SSL Listener
  • Administration Listener
  • IIOP Listener
  • IIOP/SSL Listener
  • IIOP/SSL Mutual Authentication

The Java debugger port and the OSGi Shell port are bound to localhost by default, so typically they do not need to be changed.

The multihomed server support is working in GlassFish 3.1 as of the MS4 build. If you have any interesting experiences using this feature, please let us know.

My friend, Chris, recently recommended a QNAP network attached storage (NAS) device for our home.  He uses an earlier device, but the QNAP TS-119 looked good for our needs.  We also have an HP Dreamscreen wireless digital photo frame, and the goal was to be able to store our pictures on the NAS and display the pictures on the Dreamscreen.

However, doing this isn’t quite as simple as just plugging everything in.  The Twonkymedia software on the NAS doesn’t recognize the HP Dreamscreen, so even though Twonkymedia is enabled, the NAS doesn’t show up when one browses for it via the PC option in the Dreamscreen photo application. After some Internet searching and some trial and error, here is the trick for getting photos to display.

First, enable the Twonkymedia software on the NAS by selecting UPnP Media Server under the Applications menu in the QNAP administration console. This provides a link to get to the Twonkymedia admin console where one can configure the location of the pictures.

Second, in the Twonkymedia administration console, select Clients/Security and look for the MAC address of the Dreamscreen. It should be there if the Dreamscreen is on and the PC menu has been used to search for media streaming servers. By default, Twonkymedia selects Generic Media Server. Select “Generic DLNA 1.5” from the drop-down menu and select Save Changes.  With this simple change, the Dreamscreen will find the NAS device and will be able to display pictures from the QNAP NAS. Nice!

Welcome to my blog. This is my place for writing about random topics – some related to software development (my job), others related to my family or just other thoughts on my mind. I hope the information here is useful to someone.

About the blog title – it comes from the expression “Work is what I do when I would rather be doing something else.”  If we are lucky, we never have to work a day in our lives.