
Tuesday, June 04, 2013

Deploy Application Binaries (*.war) to OpenShift

Red Hat OpenShift is a PaaS that provides cloud hosting for your applications.

I'd like to share a practice that I use to deploy my Java application to OpenShift.

I only have experience with the Tomcat 7 (JBoss EWS 2.0) cartridge and non-scalable applications, so that is what I will describe here. However, the same approach may apply to other environments.

I use GitHub to store my application codebase, and I also use Gradle as a build tool.

If you use Maven for your builds and all of your dependencies are in public Maven repositories (or in repositories accessible from OpenShift), then this blog post is likely not for you.

As of today OpenShift does not support Gradle as a build tool, and some of my dependencies live in private/local repositories that are not reachable from OpenShift. That is why I build my application locally and deploy only the binaries to OpenShift.

When you create an OpenShift application, you get a Git repository that you may use to deploy your code. You could also use this repository as your primary source storage (or synchronize it with your GitHub repo), but I don't do this.

This Git repo has a specific directory structure that OpenShift's auto-deployment relies on. This is one of the reasons I don't use it as my primary code base: I have multiple deployment targets for my project, and OpenShift is only one of them.

The directory structure contains a /webapps folder where you can put your *.war file, and OpenShift will deploy it when you push.

If you do this, however, you will soon find that your Git repository eats up all your server-side disk quota (which is only 1GB on the free plan). This is because the remote Git repository keeps every revision of your binaries. My *.war file is about 50MB, which is typical for a small-to-medium Java application, so after roughly 20 deployments you will be out of free space.

Usually you don't need all these revisions of your binaries, so to fix the situation you should first delete your remote Git history and then adopt a different deployment practice.

Here is how I do this.

Delete old revisions of your binaries from your remote OpenShift Git repo

  1. First do a git clone or a git pull to fetch the latest version of your remote repo. Let's call the folder you've cloned into OLD_REPO. You will need it to restore your Git hooks from the .openshift subfolder, and possibly some other configs -- but not your binaries (see step 8 below). 
  2. SSH connect to your OpenShift instance.
  3. cd ~/git/.git/objects
  4. rm -rf *
  5. cd ..
  6. rm refs/heads/master
  7. Do a fresh git clone from the remote OpenShift Git. It will tell you that you've cloned an empty repository -- this is correct; your remote repository is now clean. Let's call the new clone folder NEW_REPO.
  8. Copy the contents of OLD_REPO to NEW_REPO. Copy everything except the .git folder, because NEW_REPO already contains its own .git folder.
  9. Delete NEW_REPO/webapps/*.war -- these are your previous binaries:
    git rm webapps/*.war
At this stage you have an empty remote Git repository and a local clone containing the latest revision of everything that was in the remote before the cleanup, except your binaries.

Deploying new binaries

To deploy new binaries you have to copy them to OpenShift manually. I do this using the scp command.

I created a shell script, upload-war.sh, with the following content:

scp $PROJECT_X_WORKSPACE_DIR/project-x/project-x-web/build/libs/*.war $PROJECT_X_OPENSHIFT_ADDRESS:~/app-root/data/webapps/ROOT.war

As you can see, I use environment variables to tell the script where the local binary is located and where to place it on the remote OpenShift instance. PROJECT_X_OPENSHIFT_ADDRESS is the username@address you use when connecting to OpenShift over SSH.

My project has only one target *.war artifact, and I copy it to the remote data folder under the name ROOT.war. In OpenShift, the data folder is where you store your custom files.

After copying the file I have to tell OpenShift to deploy it.
To do this I modify the build action hook located at NEW_REPO/.openshift/action_hooks/build to make it look like this:

#!/bin/bash
# This is a simple build script and will be executed on your CI system if
# available.  Otherwise it will execute while your application is stopped
# before the deploy step.  This script gets executed directly, so it
# could be python, php, ruby, etc.

cp $OPENSHIFT_DATA_DIR/webapps/* $OPENSHIFT_REPO_DIR/webapps/

Here $OPENSHIFT_DATA_DIR and $OPENSHIFT_REPO_DIR are OpenShift built-in environment variables.

Add this script to version control, commit, and push to the OpenShift remote.

When you push, this hook copies the binary you uploaded earlier and deploys it. So the next time you release a new version, just run upload-war.sh and make a dummy commit/push to the OpenShift remote -- that's it.

Sunday, April 07, 2013

Render Tapestry5 Block to a string from code

Recently I integrated the Select2 component into my Tapestry5 application.

Select2 can load data from the server using AJAX when the user scrolls through the drop-down.

The AJAX response can use any data format, provided that the developer implements a JavaScript results() function that parses the response into the format expected by Select2.

Select2 provides two more callback functions that developers usually implement to post-process the returned data:
  • an id() function that retrieves the id from a choice object;
  • a formatResult() function that builds the HTML markup used to display a choice object in the drop-down.
I used Tynamo's tapestry-resteasy to build the REST service that returns data to Select2, and my REST service returned everything needed to implement the id() and formatResult() functions.

I could have implemented formatResult() to build the HTML markup for Select2, but I already had a Tapestry5 component that builds the same markup, and I wanted to reuse that code.

To do that I built a Tapestry5 service that does just this -- EventResponseRenderer (see code below).

To use it:
  1. Declare the block you want to render, as you usually do. In my example I created a separate page -- internal/CompanyBlocks.tml -- and that is where my addressBlock lives;
  2. Declare an event handler method that handles the "Render" event and returns the block you want to render (note that you may also use Tapestry5's AjaxResponseRenderer.addRender());
    1. In my example this is:
      • public Block onRenderFromCompanyAddress(Company company)
    2. Note that you must specify the id of a Tapestry5 component in the event handler method name. This is a limitation of my EventResponseRenderer implementation, required only to conform to the Tapestry5 API. It can be the id of any component on the page;
    3. You will probably want to declare some parameters to initialize the page properties required by the block renderer. You don't have to implement type coercers for them, because you will pass the values for these parameters from code;
  3. @Inject an instance of EventResponseRenderer and call its render() method, passing it an instance of RenderEvent. The RenderEvent constructor accepts the pageName where the event handler method is declared, the componentId used in the event handler method name (see the previous step), and eventArgs -- the list of objects that will be passed as parameters to the event handler method. See the method CompanyResourceImpl.createMatch();
  4. That's it.

Usage Example
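
The gist that used to be embedded here is not reproduced, but a minimal sketch of the usage side, following the steps above, might look like the following. The class names are the ones mentioned in this post; the exact render() signature and the eventArgs type are assumptions.

// internal/CompanyBlocks.java -- the page whose template declares addressBlock
public class CompanyBlocks
{
    @Property
    private Company company;

    @Inject
    private Block addressBlock;

    // "Render" event handler; "companyAddress" is the id of some component on this page
    public Block onRenderFromCompanyAddress(Company company)
    {
        this.company = company;
        return addressBlock;
    }
}

// Inside the REST service (see CompanyResourceImpl.createMatch())
@Inject
private EventResponseRenderer eventResponseRenderer;

public String createMatch(Company company)
{
    // pageName, componentId, eventArgs -- as described in step 3 above
    return eventResponseRenderer.render(
            new RenderEvent("internal/CompanyBlocks", "companyAddress", new Object[] { company }));
}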



EventResponseRenderer Implementation


Sunday, March 11, 2012

Serving Tapestry5 Assets As Static Resources

In Tapestry5 you use assets to reference *.js, *.css, or image files from your templates/code. The reference may look like:

    <link rel="stylesheet" type="text/css" href="${context:/css/all.css}" />

During the render phase Tapestry5 converts the ${context:/css/all.css} part to an asset URL, which may look like the following (see the Asset URLs section here):

    <link rel="stylesheet" type="text/css" href="/assets/stage-20120310/ctx/css/all.css" />

Here "stage-20120310" -- is an application version string, which Tapestry5 adds to asset URLs to manage assets versioning. When running in production Tapestry5 adds a far future expires header for the asset, which will encourage the client browser to cache it.

When you change one of your assets you have to change the application version in your AppModule.java, so that Tapestry5 generates new asset URLs and browsers fetch the new assets instead of using the cached ones.
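
For reference, the application version is a symbol contributed in AppModule.java; a minimal sketch, assuming a Tapestry 5.3-style module and the version string from the example above:

public class AppModule
{
    public static void contributeApplicationDefaults(MappedConfiguration<String, String> configuration)
    {
        // Bump this value whenever any asset changes, so that Tapestry5 generates new asset URLs
        configuration.add(SymbolConstants.APPLICATION_VERSION, "stage-20120310");
    }
}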

One disadvantage of this approach is that the client browser has to fetch all the assets again, not just the one that changed.

For the majority of assets the asset URL is generated by Tapestry5. The exceptions are assets referenced from *.css files by relative URLs, like this (file all.css):

a.external {
background: transparent url(../images/external.png) no-repeat scroll right center;
display: inline-block;
margin-left: 2px;
height: 11px;
width: 11px;
zoom: 1;
}

In this case the browser forms the URL itself, relative to "/assets/stage-20120310/ctx/css/all.css", and the resulting URL will be "/assets/stage-20120310/ctx/images/external.png".

So you have to change the application version in AppModule.java if you ship a new version of "external.png".

But for the majority of assets it would be enough to append an MD5/SHA1/... checksum as a GET parameter to the asset URL, making it look like:

    <link rel="stylesheet" type="text/css" href="/assets/stage-20120310/ctx/css/all.css?5ef25ac1ec38f119e283f338e6c120a4e53127b1" />

In Tapestry5 you can provide your own implementation of the AssetPathConverter service and append this checksum manually. But in this interface you only have the original asset URL; you don't have the resource itself to calculate the checksum from.

There are several ways this may be implemented. Ideally, I'd like this to be implemented in Tapestry5 core.

There's one thing I don't like about Tapestry5 asset handling, though, even if the above solution were implemented: assets are not static.

This means every asset URL is handled by Java code, and in most cases asset handling is just streaming existing files from the filesystem to the browser (with optional minification and gzip compression).

Once an asset has been handled, Tapestry5 caches the response and reuses it for further requests, but this is still all done in Java.

In Ping Service we've implemented "assets precompilation", and placed all the rendered assets as static files in the web app root folder.

This is done using a custom implementation of org.apache.tapestry5.internal.services.ResourceStreamer, which is responsible for streaming every asset to the client. During resource streaming we calculate the asset checksum and store it in a static.properties file, with the asset URL as the key and the checksum as the value:

#Static Assets For Tapestry5 Application
#Sat Mar 10 19:42:38 UTC 2012
/assets/stage-20120310/ctx/css/all.css=5ef25ac1ec38f119e283f338e6c120a4e53127b1
/assets/stage-20120310/ctx/css/analytics.css=ee470432c344820e43995fb4632ab4bee3b92e38
/assets/stage-20120310/tapestry/t5-prototype.js=95e30b840a5654b82e6a0334a14a2766c57c4d99
...

Our implementation of AssetPathConverter uses this property file to modify asset URLs.
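
For illustration, an AssetPathConverter built on top of such a property file could look roughly like this (a sketch that assumes the checksums have already been loaded into a Map; not our exact production code):

public class ChecksumAssetPathConverter implements AssetPathConverter
{
    // Asset URL -> checksum, loaded from static.properties
    private final Map<String, String> checksums;

    public ChecksumAssetPathConverter(Map<String, String> checksums)
    {
        this.checksums = checksums;
    }

    @Override
    public String convertAssetPath(String assetPath)
    {
        String checksum = checksums.get(assetPath);
        // Leave the path untouched if we have no checksum for it
        return checksum == null ? assetPath : assetPath + "?" + checksum;
    }

    @Override
    public boolean isInvariant()
    {
        // The same input always maps to the same output, so Tapestry5 may cache the result
        return true;
    }
}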

We run our implementation of ResourceStreamer only in production mode, since Google App Engine doesn't allow writing to the filesystem.

Also, we've implemented it to work only if a special HTTP header is passed with the request. To pass this header and to hit every asset we have in our application, we use a Selenium-powered integration test that queries every single page. We run this test before deploying a new version to production.

Now Tapestry5 asset URLs and the URLs of the static files are the same in our application, so the Google App Engine runtime won't even pass the request to Java. It also uses its own facilities to serve static files, e.g. gzip compression, etc.

Saturday, January 14, 2012

Simple Sorting Facade for Java (SSF4J)

In Java, to sort two or more lists together you have to write a custom solution.

Say you have a list of names and a corresponding list of weights. There is no API that allows you to sort the names by the weights (at least none that I know of). However, this is a very common use case, especially when you are analyzing data in your programs.

To achieve this, you most likely implement one of the sorting algorithms with custom swap logic.
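
For illustration, the manual approach usually ends up looking something like this -- a hand-rolled sort whose swap logic keeps both lists in sync (plain JDK code, nothing SSF4J-specific):

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SortTwoListsTogether
{
    // Insertion sort that orders weights ascending and applies every swap to names too
    static void sortByWeight(List<String> names, List<Double> weights)
    {
        for (int i = 1; i < weights.size(); i++) {
            for (int j = i; j > 0 && weights.get(j - 1) > weights.get(j); j--) {
                Collections.swap(weights, j, j - 1);
                Collections.swap(names, j, j - 1);   // the custom swap logic
            }
        }
    }

    public static void main(String[] args)
    {
        List<String> names = new ArrayList<String>(Arrays.asList("alice", "bob", "carol"));
        List<Double> weights = new ArrayList<Double>(Arrays.asList(2.5, 0.5, 1.0));

        sortByWeight(names, weights);

        System.out.println(names);    // [bob, carol, alice]
        System.out.println(weights);  // [0.5, 1.0, 2.5]
    }
}

SSF4J's goal is to hide exactly this kind of boilerplate behind a facade.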

A simple sorting facade is a pattern that already contains the sorting algorithm implementation(s) and only requires the developer to specify the source list, its bounds, and the compare and swap logic.

You can explore SSF4J on GitHub and contribute your implementations of sorting algorithms.

Here's an example of using SSF4J:


Monday, September 05, 2011

Running BIRT Reports in Tomcat

Context: You have a database and you need to do data analysis: draw charts, build tables, calculate totals, etc. You want it all to be available over the web and secured with a password.

Your database is any JDBC-supported database (I use MySQL 5.1.49).
Your server is running any OS where Java can run (I use Ubuntu Linux 10.10 available through SSH).

I will show how to implement this using Eclipse BIRT 3.7.

Developer Environment
  1. Download "Eclipse IDE for Java and Report Developers" package here.
    Unzip to install.
  2. Design a new report (I created sales.rptdesign). This is really straightforward.
JDBC Driver. You will need the MySQL JDBC driver to create a Data Source for the report. I got mysql-connector-java-5.1.17-bin.jar here.
Installing the driver in the BIRT designer is easy using the "Manage drivers..." dialog. But you will probably have problems deploying it to the runtime.

Fonts. You probably won't have any problems with fonts in the BIRT designer. But again, you will likely have problems with fonts at runtime.

Connection Profile Store. With BIRT 3.7 you can use Connection Profiles to hold database connections. After you've finished designing and testing your report, double-click the report Data Source to bring up the properties dialog and create a new connection profile store there. Save it to a file (I saved it to planet33_v2.xml).
Here's what I have in there:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<DataTools.ServerProfiles version="1.0">
    <profile autoconnect="No" desc=""
        id="ecc3bc60-d4fd-11e0-957a-e0e31b9a34ee" name="planet33_v2"
        providerID="org.eclipse.datatools.enablement.mysql.connectionProfile">
        <baseproperties>
            <property
                name="org.eclipse.datatools.connectivity.db.connectionProperties"
                value="" />
            <property name="org.eclipse.datatools.connectivity.db.savePWD"
                value="true" />
            <property name="org.eclipse.datatools.connectivity.drivers.defnType"
                value="org.eclipse.datatools.enablement.mysql.5_1.driverTemplate" />
            <property name="jarList"
                value="/usr/local/share/mysql-connector-java-5.1.17-bin.jar" />
            <property name="org.eclipse.datatools.connectivity.db.username"
                value="your_username" />
            <property name="org.eclipse.datatools.connectivity.db.driverClass"
                value="com.mysql.jdbc.Driver" />
            <property name="org.eclipse.datatools.connectivity.db.databaseName"
                value="planet33_v2" />
            <property name="org.eclipse.datatools.connectivity.db.password"
                value="your_password" />
            <property name="org.eclipse.datatools.connectivity.db.version"
                value="5.1" />
            <property name="org.eclipse.datatools.connectivity.db.URL"
                value="jdbc:mysql://127.0.0.1:3306/planet33_v2" />
            <property name="org.eclipse.datatools.connectivity.db.vendor"
                value="MySql" />
        </baseproperties>
        <org.eclipse.datatools.connectivity.versionInfo>
            <property name="server.version" value="5.1.49" />
            <property name="technology.name.jdbc" value="JDBC" />
            <property name="server.name" value="MySQL" />
            <property name="technology.version.jdbc" value="4.0.0" />
        </org.eclipse.datatools.connectivity.versionInfo>
        <driverreference>
            <property name="driverName" value="MySQL JDBC Driver" />
            <property name="driverTypeID"
                value="org.eclipse.datatools.enablement.mysql.5_1.driverTemplate" />
        </driverreference>
    </profile>
</DataTools.ServerProfiles>

Note: I bet you can use JNDI data sources here (and I suppose this is even preferable because of connection pooling, etc.). Please drop a few lines in the comments below with instructions on how you do this.

You can now edit the XML source of your report and replace /report/data-sources with something like this:

<data-sources>
    <oda-data-source extensionID="org.eclipse.birt.report.data.oda.jdbc.dbprofile"
        name="Planet33 V2 Data Source" id="359">
        <property name="OdaConnProfileName">planet33_v2</property>
        <property name="OdaConnProfileStorePath">../conf/planet33_v2.xml</property>
    </oda-data-source>
</data-sources>

Several things to mention here:
  • planet33_v2.xml (Connection Profile Store)
    • Check all properties and change them according to your connection.
    • Note the jarList property: there you should specify the path(s) to where your JDBC driver(s) are located (I copied the downloaded driver to /usr/local/share/mysql-connector-java-5.1.17-bin.jar).
    • When you create the connection profile store file from the designer, it adds a property with name="org.eclipse.datatools.connectivity.driverDefinitionID". You should remove this property because of this issue.
  • sales.rptdesign (The Report)
    • You should keep the value of the oda-data-source@id attribute the same as it was in your design.
    • The value of OdaConnProfileName should match the value of the DataTools.ServerProfiles/profile@name attribute from planet33_v2.xml.
    • Note that OdaConnProfileStorePath is a relative path (see below), but you can make it absolute if you want.

Server Environment

(Note: I recommend configuring the Tomcat instance on your developer machine first to make it easier to verify the report settings, and then transferring the entire $CATALINA_HOME to the production server. Of course, you can do all these steps directly on the production server.)
  1. Download Apache Tomcat (any Java application server should be fine).
    Unzip to some folder (I used /usr/local/share/apache-tomcat-5.5.33/) -- this will be $CATALINA_HOME.
  2. Download BIRT "Runtime" package here.
    Copy birt.war (BIRT Web Viewer application) to $CATALINA_HOME/webapps.
  3. Edit $CATALINA_HOME/bin/catalina.sh and paste these lines somewhere after the JAVA_OPTS variable is initialized (this prepares a workspace for the DTP plugin):
    
    
    java_io_tmpdir=$CATALINA_HOME/temp
    org_eclipse_datatools_workspacepath=$java_io_tmpdir/workspace_dtp
    mkdir -p $org_eclipse_datatools_workspacepath
    
    JAVA_OPTS="$JAVA_OPTS -Dorg.eclipse.datatools_workspacepath=$org_eclipse_datatools_workspacepath"
    
    
  4. Start Tomcat by running $CATALINA_HOME/bin/startup.sh. After this the BIRT Report Viewer application should be available at http://localhost:8080/birt. Also, birt.war should now be extracted to $CATALINA_HOME/webapps/birt -- this will be $BIRT_HOME. You can now delete $CATALINA_HOME/webapps/birt.war.
  5. Copy planet33_v2.xml to $CATALINA_HOME/conf (remember the OdaConnProfileStorePath property in the sales.rptdesign file?).
  6. Copy your sales.rptdesign file to $BIRT_HOME.
At this point you should be able to execute the report by simply following the address http://localhost:8080/birt/frameset?__report=sales.rptdesign&__dpi=600.

Note the __dpi URL parameter -- it controls the DPI of chart images rendered in HTML/PDF. You will probably want to set it like this to increase image quality. Also note that if you set the chart output format to SVG you will get vector-quality graphics in the PDF output.

Security

There are obvious reasons why you may want to keep your reports secure.

Besides that, keep in mind that in the BIRT Web Viewer application all report sources (*.rptdesign files) are available to the user on request. Try navigating to http://localhost:8080/birt/sales.rptdesign and you'll see what I mean. I think this is a good reason to use a connection profile store (or at least JNDI data sources), because those files are not available over HTTP.

Implementing simple (HTTP basic auth) security with Tomcat is pretty straightforward.

First, modify $BIRT_HOME/WEB-INF/web.xml, by adding this (you can change role names as you want):
<!-- Define a security constraint on this application -->
<security-constraint>
  <web-resource-collection>
    <web-resource-name>Entire Application</web-resource-name>
    <url-pattern>/*</url-pattern>
  </web-resource-collection>
  <auth-constraint>
    <!-- This role is not in the default user directory -->
    <role-name>manager</role-name>
  </auth-constraint>
</security-constraint>             
<!-- Define the login configuration for this application -->
<login-config>
  <auth-method>BASIC</auth-method>
  <realm-name>BIRT Report Viewer</realm-name>
</login-config>
<!-- Security roles referenced by this web application -->
<security-role>
  <description>
    The role that is required to log in to the BIRT Report Viewer
  </description>
  <role-name>manager</role-name>
</security-role>

You may also do the same for $CATALINA_HOME/conf/web.xml to secure all applications in this Tomcat instance.
Second, you should edit $CATALINA_HOME/conf/tomcat-users.xml to define user login and password.

That's all, you're secured :) This should be fine for most cases, but I would recommend reading about HTTPS if your data is especially sensitive.

Deploy to server

  1. Copy Tomcat to the server:
    Tip: Use scp command in terminal to transfer files from your machine to the server over SSH:
    scp -r /usr/local/share/apache-tomcat-5.5.33 dmitrygusev@planet33.ru:/usr/local/share/
  2. Copy JDBC Driver to the server:
    • Copy this driver to the same path as specified in the jarList property from planet33_v2.xml file.
    • DO NOT COPY this driver to $BIRT_HOME/WEB-INF/lib, because it may lead to ClassNotFoundException.
Fix file permissions. When you copy files over scp you may need to chmod them to grant read/execute access. This should fix it:

chmod a+r /usr/local/share/mysql-connector-java-5.1.17-bin.jar
chmod -R a+r /usr/local/share/apache-tomcat-5.5.33/
chmod a+x /usr/local/share/apache-tomcat-5.5.33/bin/*.sh

Now you should be able to start tomcat and run reports on the server.

Fonts. BIRT PDF output doesn't work well with Russian fonts out of the box, because of font licensing issues. One simple way to fix this:
  1. Get the *.ttf font files you need (you can copy them from any Windows installation, look in c:\Windows\Fonts). These 8 files should be enough in most cases (these are "Arial" and "Times New Roman" fonts):

    arialbd.ttf  arialbi.ttf  ariali.ttf  arial.ttf
    timesbd.ttf  timesbi.ttf  timesi.ttf  times.ttf
  2. Copy these files to /usr/share/fonts/truetype (or any other place that is referenced from fontsConfig.xml).
  3. Don't forget to fix file permissions:
    chmod a+r /usr/share/fonts/truetype/*.ttf
  4. Reference the fonts from *.rptdesign (or configure font-aliases):
    /report/styles
    <style name="report" id="4">
        <property name="fontFamily">"Arial"</property>
        <property name="fontSize">9pt</property>
    </style>
    
  5. Restart Tomcat:
    • $CATALINA_HOME/bin/shutdown.sh
    • $CATALINA_HOME/bin/startup.sh
Russian Localization. I used this method to do it. Although you may want to try BIRT Language Packs.


Troubleshooting.

  • Neither the JAVA_HOME nor the JRE_HOME environment variable is defined
    At least one of these environment variable is needed to run this program

    You should define the JAVA_HOME variable. Execute this command before running Tomcat's *.sh files in the terminal:

    export JAVA_HOME=/usr/bin/java
  • If you get an OutOfMemoryError you may want to give the JVM more memory. Edit $CATALINA_HOME/bin/catalina.sh to include this (see this thread on Stack Overflow, and read more about JVM memory settings):

    JAVA_OPTS="$JAVA_OPTS -Xms512m -Xmx512m -XX:MaxPermSize=256m"
  • If you got an OutOfMemoryError, you most likely won't be able to restart Tomcat using the $CATALINA_HOME/bin/shutdown.sh script.

    To kill the Tomcat instance, use the htop command in the terminal. In the htop interface select the Tomcat process (this is /usr/lib/java), press 'k', select 9 SIGKILL in the "Send signal" area, and press Enter. To exit htop press 'q'.
  • Executing a report never finishes and the Tomcat process consumes all CPU resources.

    I've seen this situation when charts in the report landed on a page break. I fixed it by moving the charts elsewhere (away from the page break). Changing the page size to avoid page breaks also fixes this issue.

Tuesday, September 14, 2010

Tapestry5: Caching Method Results

Assume you have methods that (almost) always return the same result for the same input arguments. If computing the result is a heavy and/or time-consuming operation, it is reasonable to cache these results.

One way of building a method cache in Tapestry5 is by implementing the MethodAdvice interface like this:

public class CacheMethodResultAdvice implements MethodAdvice {

    private static final Logger logger = LoggerFactory.getLogger(CacheMethodResultAdvice.class);

    private final Cache cache;
    private final Class<?> advisedClass;
    private final Object nullObject = new Object();

    public CacheMethodResultAdvice(Class<?> advisedClass, Cache cache) {
        this.advisedClass = advisedClass;
        this.cache = cache;
    }

    @Override
    public void advise(Invocation invocation) {
        String invocationSignature = getInvocationSignature(invocation);

        String entityCacheKey = String.valueOf(invocationSignature.hashCode());

        Object result;

        if (cache.containsKey(entityCacheKey))
        {
            result = cache.get(entityCacheKey);

            logger.debug("Using invocation result ({}) from cache '{}'", invocationSignature, result);

            invocation.overrideResult(result);
        }
        else
        {
            invocation.proceed();

            if (!invocation.isFail())
            {
                result = invocation.getResult();

                cache.put(entityCacheKey, result);
            }
        }
    }

    private String getInvocationSignature(Invocation invocation) {
        StringBuilder builder = new StringBuilder(150);
        builder.append(advisedClass.getName());
        builder.append('.');
        builder.append(invocation.getMethodName());
        builder.append('(');
        for (int i = 0; i < invocation.getParameterCount(); i++) {
            if (i > 0) {
                builder.append(',');
            }
            Class<?> type = invocation.getParameterType(i);
            builder.append(type.getName());
            builder.append(' ');

            Object param = invocation.getParameter(i);
            builder.append(param != null ? param : nullObject);
        }
        builder.append(')');

        return builder.toString();
    }

}


The implementation of getInvocationSignature(...) is not ideal, but you may improve it to match your requirements. One issue I see here is building the invocation signature for null-valued parameters in a clustered environment (which GAE is). In this implementation, nullObject.toString() returns something like java.lang.Object@33aa9b, and this value will vary across different instances of your application. You may replace nullObject with just the "null" string; just keep in mind that "null" != null.

To make this advice work you should declare it in your AppModule.java:

    @SuppressWarnings("unchecked")
    @Match("IPResolver")
    public static void adviseCacheIPResolverMethods(final MethodAdviceReceiver receiver, Logger logger, PerthreadManager perthreadManager) {
        try {
            Map props = new HashMap();

            // IP address of URL may change, keep it in cache for one day
            props.put(GCacheFactory.EXPIRATION_DELTA, 60 * 60 * 24);

            CacheFactory cacheFactory = CacheManager.getInstance().getCacheFactory();
            Cache cache = cacheFactory.createCache(props);

            LocalMemorySoftCache cache2 = new LocalMemorySoftCache(cache);

            // We don't want local memory cache live longer than memcache
            // Since we don't have any mechanism to set local cache expiration
            // we will just reset this cache after each request
            perthreadManager.addThreadCleanupListener(cache2);

            receiver.adviseAllMethods(new CacheMethodResultAdvice(IPResolver.class, cache2));
        } catch (CacheException e) {
            logger.error("Error instantiating cache", e);
        }
    }

    @Match("LocationResolver")
    public static void adviseCacheLocationResolverMethods(final MethodAdviceReceiver receiver, Cache cache) {
        // Assume that location of IP address will never change,
        // so we don't have to set any custom cache expiration parameters
        receiver.adviseAllMethods(new CacheMethodResultAdvice(LocationResolver.class, cache));
    }


These declarations tell Tapestry5 to add our advice to all methods of services that implement the IPResolver and LocationResolver interfaces.

Note that we are able to use caches with different settings for different methods/services, as in the example above (see the comments in the code).

See also:

Monday, September 13, 2010

How To Determine Client TimeZone In A Web Application

In web applications where the client and server are located in different time zones, we need a way to determine the client's time zone to display date/time-sensitive information. This is almost always the case on Google App Engine, where the default server time zone is UTC.

There are several ways to determine the client's time zone.

One of them is resolving the client's IP address to a location:

  1. Get client IP

  2. Get client location (latitude, longitude) by the IP-address

  3. Get information about timezone by the location coordinates


Every web framework provides an API to get the client IP. For instance, in Java there is the ServletRequest.getRemoteAddr() method for this purpose.

To resolve an IP address to location information you can use one of the numerous web services available online.

For instance, to resolve IP to location, Ping Service uses the IP-whois.net service.

Another service, Geonames.org, provides a web service API to get time zone information for a latitude/longitude pair.

Here's an implementation of the described approach in Java:


    private TimeZone getTimeZoneByClientIP() {
        TimeZone timeZone = UTC_TIME_ZONE;
        
        try {
            String clientIP = globals.getHTTPServletRequest().getRemoteAddr();
            
            if (!Utils.isNullOrEmpty(clientIP)) {
                Location location = locationResolver.resolveLocation(clientIP);
                
                if (!location.isEmpty()) {
                    timeZone = timeZoneResolver.resolveTimeZone(location.getLatitude(), location.getLongitude());
                }
                
                if (timeZone == null) {
                    timeZone = UTC_TIME_ZONE;
                }
            }
            
            logger.debug("Resolved timeZoneId is {}", timeZone.getID());
        } catch (Exception e) {
            logger.error("Error resolving client timezone by ip " 
                    + globals.getHTTPServletRequest().getRemoteAddr(), e);
        }
        
        return timeZone;
    }


The disadvantages of this approach are:

  • Your code becomes dependent on third-party online services that are not 100% reliable

  • Calling third-party services online takes time (up to several seconds), which may result in long response times

Note: according to Ping Service statistics, IP-Whois.net availability is close to 100% with an average response time of ~270 ms, while Geonames.org availability is only around 80% with an average response time of ~1100 ms. Geonames.org's low availability in our case is due to GAE hosting: Geonames.org restricts free access to its API to 3000 requests per IP per hour, and GAE applications share a pool of outgoing IP addresses.

On the other hand, this is a really simple solution to implement, and it allows you to determine the client's time zone on the very first request, so you can display all date/time-sensitive data using the client's local time.

See also:
Update: GAE 1.6.5 introduces request headers that already contain the lat/lng pair for the incoming request: https://developers.google.com/appengine/docs/java/runtime#Request_Headers
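
With these headers the whole lookup can be skipped when the value is present. A minimal sketch -- the X-AppEngine-CityLatLong header name comes from the GAE documentation, while the parsing and the resolver call are assumptions based on the code above:

    // X-AppEngine-CityLatLong arrives as "latitude,longitude", e.g. "50.45,30.52"
    private TimeZone getTimeZoneFromGaeHeaders(HttpServletRequest request) {
        String latLng = request.getHeader("X-AppEngine-CityLatLong");
        if (latLng == null || latLng.isEmpty()) {
            return null;   // header not present -- fall back to the IP-based resolution above
        }
        String[] parts = latLng.split(",");
        return timeZoneResolver.resolveTimeZone(
                Double.parseDouble(parts[0]), Double.parseDouble(parts[1]));
    }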

Thursday, September 02, 2010

Profiling GAE API calls

While optimizing the performance of a GAE application it's convenient to measure GAE API calls.

I'm using the following implementation of com.google.apphosting.api.ApiProxy.Delegate to do this:

public class ProfilingDelegate implements Delegate<Environment> {

    private static final Logger logger = LoggerFactory.getLogger(ProfilingDelegate.class);

    private final Delegate<Environment> parent;
    private final String appPackage;

    public ProfilingDelegate(Delegate<Environment> parent, String appPackage) {
        this.parent = parent;
        this.appPackage = appPackage;
    }

    public void log(Environment env, LogRecord logRec) {
        parent.log(env, logRec);
    }

    @Override
    public byte[] makeSyncCall(Environment env, String pkg, String method, byte[] request) throws ApiProxyException {
        long start = System.currentTimeMillis();
        byte[] result = parent.makeSyncCall(env, pkg, method, request);
        StringBuilder builder = buildStackTrace(appPackage);
        logger.info("GAE/S {}.{}: ->{} ms<-\n{}", new Object[] { pkg, method, System.currentTimeMillis() - start, builder });
        return result;
    }

    /**
     * @param appPackage
     *            Only classes from this package would be included in trace.
     * @return
     */
    public static StringBuilder buildStackTrace(String appPackage) {
        StackTraceElement[] traces = Thread.currentThread().getStackTrace();
        StringBuilder builder = new StringBuilder();
        int length = traces.length;
        StackTraceElement traceElement;
        String className;
        for (int i = 3; i < length; i++) {
            traceElement = traces[i];
            className = traceElement.getClassName();
            if (className.startsWith(appPackage)) {
                if (builder.length() > 0) {
                    builder.append('\n');
                }
                builder.append("..");
                builder.append(className.substring(className.lastIndexOf('.')));
                builder.append('.');
                builder.append(traceElement.getMethodName());
                builder.append(':');
                builder.append(traceElement.getLineNumber());
            }
        }
        if (builder.length() == 0) {
            for (int i = 1; i < length; i++) {
                traceElement = traces[i];
                className = traceElement.getClassName();
                if (builder.length() > 0) {
                    builder.append('\n');
                }
                builder.append(className);
                builder.append('.');
                builder.append(traceElement.getMethodName());
                builder.append(':');
                builder.append(traceElement.getLineNumber());
            }
        }
        return builder;
    }

    @Override
    public Future<byte[]> makeAsyncCall(Environment env, String pkg, String method, byte[] request, ApiConfig config) {
        long start = System.currentTimeMillis();
        Future<byte[]> result = parent.makeAsyncCall(env, pkg, method, request, config);
        StringBuilder builder = buildStackTrace(appPackage);
        logger.info("GAE/A {}.{}: ->{} ms<-\n{}", new Object[] { pkg, method, System.currentTimeMillis() - start, builder });
        return result;
    }
}


To register this delegate, add the following code prior to any API calls, e.g. in a filter's init() method:

    public void init(FilterConfig config) throws ServletException
    {
        this.config = config;
        // Note: Comment this off to profile Google API requests
        ApiProxy.setDelegate(new ProfilingDelegate(ApiProxy.getDelegate(), "dmitrygusev"));
    }


Here's an example of log output:

02.09.2010 0:22:19 dmitrygusev.tapestry5.gae.ProfilingDelegate makeSyncCall
INFO: GAE/S datastore_v3.BeginTransaction: ->1076 ms<-
...LazyJPATransactionManager$1.assureTxBegin:48
...LazyJPATransactionManager$1.createQuery:137
...AccountDAOImpl.findByEmail:36
...AccountDAOImpl.getAccount:26
...AccountDAOImplCache.getAccount:36
...Application.getUserAccount:395
...Application.trackUserActivity:400
...AppModule$1.service:229
...AppModule$2.service:291
...LazyTapestryFilter.doFilter:62
02.09.2010 0:22:19 dmitrygusev.tapestry5.gae.LazyJPATransactionManager$1 assureTxBegin
INFO: Transaction created (1200 ms) for context ...AccountDAOImpl.findByEmail:36
...AccountDAOImpl.getAccount:26
...AccountDAOImplCache.getAccount:36
...Application.getUserAccount:395
...Application.trackUserActivity:400
...AppModule$1.service:229
...AppModule$2.service:291


See also GAE and Tapestry5 Data Access Layer

Wednesday, September 01, 2010

GAE and Tapestry5 Data Access Layer

GAE provides two ways of communicating with its datastore from Java:


  1. Using low-level API

  2. Using JDO/JPA (with DataNucleus appengine edition)

In this post I will try to explain some performance improvements for JPA usage. Of course, there's always some overhead in using a high-level API, but I use JPA in Ping Service and think it's worth it.

Update (17.09.2010): There is another way to communicate with GAE datastore from Java: Objectify

Spring vs. Tapestry-JPA



It's good practice to use JPA in conjunction with an IoC container and inject the EntityManager into your services. At the very beginning of development I used Spring 3.0 as the IoC container and for transaction management. It worked, but it took too much time to initialize during loading requests, and every time a user opened their first web page, they ended up with a DeadlineExceededException.

Then I tried tapestry-jpa from Tynamo and it fit perfectly. It runs pretty fast and allows you to:

  • inject the EntityManager into DAO classes (as regular T5 services)

  • manage transactions using the @CommitAfter annotation


DAO and Caching



Since the GAE datastore can't operate on multiple entities in a single transaction, I've added the @CommitAfter annotation to every method of each DAO class.

Datastore access is an expensive operation in GAE, so I've implemented DAO-level caching:

DAO interface

public interface JobDAO {

    // ...

    @CommitAfter
    public abstract Job find(Key jobKey);

    @CommitAfter
    public abstract void update(Job job, boolean commitAfter);

}

DAO implementation

public class JobDAOImpl implements JobDAO {

    // ...

    @Override
    public Job find(Key jobKey) {
        return em.find(Job.class, jobKey);
    }

    @Override
    public void update(Job job, boolean commitAfter) {
        if (!em.getTransaction().isActive()) {
            // see Application#internalUpdateJob(Job)
            logger.debug("Transaction is not active. Begin new one...");

            // XXX Rewrite this to handle transactions more gracefully
            em.getTransaction().begin();
        }
        em.merge(job);

        if (commitAfter) {
            em.getTransaction().commit();
        }
    }

}

DAO cache

public class JobDAOImplCache extends JobDAOImpl {

    // ...

    @Override
    public Job find(Key jobKey) {
        Object entityCacheKey = getEntityCacheKey(Job.class, getJobWideUniqueData(jobKey));
        Job result = (Job) cache.get(entityCacheKey);
        if (result != null) {
            return result;
        }
        result = super.find(jobKey);
        if (result != null) {
            cache.put(entityCacheKey, result);
        }
        return result;
    }

    @Override
    public void update(Job job, boolean commitAfter) {
        super.update(job, commitAfter);

        Object entityCacheKey = getEntityCacheKey(Job.class, getJobWideUniqueData(job.getKey()));

        Job cachedJob = (Job) cache.get(entityCacheKey);

        if (cachedJob != null) {

            if (!cachedJob.getCronString().equals(job.getCronString())) {
                abandonJobsByCronStringCache(cachedJob.getCronString());
                abandonJobsByCronStringCache(job.getCronString());
            }

            cache.put(entityCacheKey, job);
        } else {
            abandonJobsByCronStringCache();
        }

        updateJobInScheduleCache(job);
    }

}


Notice how the update method is implemented in JobDAOImplCache. If a DAO method changes an object in the database, it is responsible for updating all cached instances of that object in the entire cache. Such an implementation may be difficult to maintain; on the other hand, it may be very effective because you have full control over the cache.

Each *DAOImplCache class uses a two-level, JSR-107-based cache:

  • Level-1: Local memory (appserver instance, request scoped)

    provides quick access to objects that were "touched" during current request


  • Level-2: Memcache (cluster wide)

    allows application instances to share cached objects across entire appengine cluster



Note that the local memory cache should be request-scoped, or it may lead to stale data across appserver instances. To reset the local cache after each request, it should be registered as a ThreadCleanupListener:

    public static Cache buildCache(Logger logger, PerthreadManager perthreadManager) {
        try {
            CacheFactory cacheFactory = CacheManager.getInstance().getCacheFactory();
            Cache cache = cacheFactory.createCache(Collections.emptyMap());

            LocalMemorySoftCache cache2 = new LocalMemorySoftCache(cache);

            // perthreadManager may be null if we creating cache from AbstractFilter
            if (perthreadManager != null) {
                perthreadManager.addThreadCleanupListener(cache2);
            }

            return cache2;
        } catch (CacheException e) {
            logger.error("Error instantiating cache", e);
            return null;
        }
    }


Here's what the LocalMemorySoftCache implementation looks like:

public class LocalMemorySoftCache implements Cache, ThreadCleanupListener {

    private final Cache cache;

    private final Map<Object, Object> map;

    @SuppressWarnings("unchecked")
    public LocalMemorySoftCache(Cache cache) {
        this.map = new SoftValueMap(100);
        this.cache = cache;
    }

    @Override
    public void clear() {
        map.clear();
        cache.clear();
    }

    @Override
    public boolean containsKey(Object key) {
        return map.containsKey(key)
                || cache.containsKey(key);
    }

    @Override
    public Object get(Object key) {
        Object value = map.get(key);
        if (value == null) {
            value = cache.get(key);
            map.put(key, value);
        }
        return value;
    }

    @Override
    public Object put(Object key, Object value) {
        map.put(key, value);
        return cache.put(key, value);
    }

    @Override
    public Object remove(Object key) {
        map.remove(key);
        return cache.remove(key);
    }

    // ...

    /**
     * Reset in-memory cache but leave original cache untouched.
     */
    public void reset() {
        map.clear();
    }

    @Override
    public void threadDidCleanup() {
        reset();
    }
}


Make Tapestry-JPA Lazy



On every request Tapestry-JPA creates a new EntityManager and starts a new transaction on it. At the end of the request, if the current transaction is still active, it gets rolled back.

But if all the data was served from the cache, there is no interaction with the database at all. In this case the EntityManager creation and the transaction begin/rollback were not needed, yet they still consumed time and other resources.

Moreover, Tapestry-JPA creates the EntityManagerFactory instance on application load, which is very expensive, even though you might not need it at all (because of the DAO cache, or simply because the request doesn't use the datastore).

To avoid this I created lazy implementations of JPAEntityManagerSource, JPATransactionManager, and EntityManager; you can find them here: LazyJPAEntityManagerSource and LazyJPATransactionManager.
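
The links above point to the actual code; the underlying idea is plain lazy initialization -- don't touch the EntityManagerFactory until the first real datastore access. A minimal sketch of that idea (not the actual LazyJPAEntityManagerSource):

import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;

public class LazyEntityManagerHolder
{
    private final String persistenceUnitName;
    private EntityManagerFactory emf;
    private EntityManager em;

    public LazyEntityManagerHolder(String persistenceUnitName)
    {
        this.persistenceUnitName = persistenceUnitName;
    }

    // Neither the factory nor the EntityManager is created until the first real use
    public synchronized EntityManager get()
    {
        if (em == null) {
            if (emf == null) {
                emf = Persistence.createEntityManagerFactory(persistenceUnitName);
            }
            em = emf.createEntityManager();
        }
        return em;
    }
}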