Enabling large file transfers through SOAP with Spring WS, MTOM/XOP and JAXB2

June 13th, 2012

Some people might say that using a SOAP web service to transfer large binary files is the wrong way to do it. I agree. If you can handle large file transfers some other way, do it. But using SOAP to send files has some advantages, so I will describe how I got it working using Spring Web Services on the server and Apache Axis 2 or Apache CXF on the client.

I am going to assume that you already know how to set up web services with Spring.

The technology stack that I used is:

  • Spring WS (2.0.4)
  • SAAJ
  • JAXB2
  • Axis 2 or Apache CXF (for the client)

What is MTOM/XOP?

MTOM stands for Message Transmission Optimization Mechanism. It is a method of sending binary data along with an XML SOAP request.

When sending smaller files it is much simpler to base64 encode the file and include the base64 encoded string inside an XML node as text. The receiving server would then decode the string and obtain the binary data and use it as it sees fit.

The problem with this approach when sending a large amount of binary data is threefold:

  1. base64 encoding data takes time.
  2. base64 encoding data increases its size by 33%.
  3. Most SOAP implementations will try to load the entire SOAP message into memory causing out of memory errors.

Concerning point number 3: MTOM does not actually solve this by itself (MTOM is a mechanism, not an implementation), but it allows implementations to cache the binary data on disk instead of loading it into memory. I will come back to this later.
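To make points 1 and 2 concrete, here is a small stand-alone sketch (plain JDK, nothing SOAP-specific) demonstrating the size increase:

```java
import java.util.Base64;

public class Base64Overhead {
    public static void main(String[] args) {
        // Pretend this is the raw content of a file being attached.
        byte[] raw = new byte[3_000_000];

        // Every 3 input bytes become 4 output characters.
        String encoded = Base64.getEncoder().encodeToString(raw);

        System.out.println("raw bytes:     " + raw.length);       // 3000000
        System.out.println("encoded chars: " + encoded.length()); // 4000000, ~33% larger
    }
}
```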

When a SOAP message has an MTOM attachment, the binary data is carried outside the XML, as a separate MIME part of the message with its own MIME headers. This frees the SOAP envelope from the burden of carrying a very large base64-encoded string. But how do we get the binary data if it isn’t located in the XML itself? This is where XOP comes in.

XOP is a method of referencing the binary data in your message from inside the SOAP XML envelope. When the server or client receives a SOAP message it needs to know where in the message the binary data is. By looking at the XOP node we can find the binary data from the Content ID. It looks something like this:

<xop:Include href="cid:0123456789" xmlns:xop="http://www.w3.org/2004/08/xop/include"/>

The CID lets the SOAP unmarshaller/receiver know where in the overall payload the binary data lives, so it can be referenced later. It also means the XML can be created and processed without the binary data inline.
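Putting the pieces together, an MTOM message on the wire looks roughly like this (the boundary and content ID here are made up for illustration):

```
Content-Type: multipart/related; type="application/xop+xml"; boundary="MIME_boundary"

--MIME_boundary
Content-Type: application/xop+xml; charset=UTF-8

<soap:Envelope ...>
  ... XML containing <xop:Include href="cid:0123456789"/> ...
</soap:Envelope>

--MIME_boundary
Content-Type: application/zip
Content-Transfer-Encoding: binary
Content-ID: <0123456789>

...raw binary bytes of the attachment...
--MIME_boundary--
```

The XML part stays small; the binary bytes travel untouched in the second MIME part.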

Streaming binary data

Enabling MTOM/XOP sometimes isn’t enough to send large attachments. Certain implementations of SOAP message factories and marshallers still try to load the entire message into memory when it is sent or received. When the data you need to send starts to exceed hundreds of megabytes, this becomes a problem. MTOM avoids the 33% base64 overhead, but it doesn’t stop the client/server from trying to load the data into memory.

So how do we send a 10 GB file with a SOAP request? The answer is to leverage technologies that support streaming/caching the data without loading it into memory first. There are a few frameworks that offer this functionality; a very popular option is Apache CXF. In my case I needed to use Spring Web Services inside Tomcat.

Using Spring and JAXB2 to stream and cache large MTOM requests

I will begin with the server configuration.

So the ultimate goal here is to use SOAP to send an XML document (with ordinary XML nodes and information) and include with it VERY large binary attachments, without the client or server running out of memory during or after the transfer.

There is A LOT of misinformation about this on the internet. Some people say that Spring does not support streaming attachments; some say you need to use the AxiomSoapMessageFactory instead of the default SaajSoapMessageFactory; some swear by Apache CXF. In this case we are going to use plain Spring Web Services and JAXB2.

Proper XSD usage

I am going to assume you already have one or more XSD files being auto-generated into a WSDL. This is basic Spring Web Services material. If you haven’t gotten this far yet, you should turn around and do more reading about Spring before continuing here.

When using JAXB to marshal/unmarshal XML documents we need to specify datatypes. For strings, for instance, you might use xs:string. In our case we need a datatype that maps to a Java object that can read/write files without loading them into memory completely: javax.activation.DataHandler. Here is an example XSD that makes the auto-generated Java source use a DataHandler instead of a byte array to hold the binary data:

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
	xmlns:tns="http://example.com/ws/file" 
	xmlns:xmime="http://www.w3.org/2005/05/xmlmime"
	targetNamespace="http://example.com/ws/file">
 
	...
 
	<xs:complexType name="FileData">
		<xs:all>
			<xs:element name="fileData" type="xs:base64Binary"
				xmime:expectedContentTypes="*/*" />
		</xs:all>
	</xs:complexType>
 
	...
</xs:schema>

As you can see, we declare the xmlmime namespace and use the xmime:expectedContentTypes="*/*" attribute on the node that will contain the binary data (or rather the XOP reference to our MIME binary data). With the above XSD, JAXB creates a Java object representation of our document that uses a DataHandler instead of a byte array.
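For reference, the class JAXB generates from the schema above looks roughly like this. This is a sketch, not verbatim xjc output; the important parts are the DataHandler field type and the @XmlMimeType annotation:

```java
import javax.activation.DataHandler;
import javax.xml.bind.annotation.*;

@XmlAccessorType(XmlAccessType.FIELD)
@XmlType(name = "FileData")
public class FileData {

    // Generated as DataHandler (not byte[]) because of xmime:expectedContentTypes.
    @XmlElement(required = true)
    @XmlMimeType("*/*")
    protected DataHandler fileData;

    public DataHandler getFileData() {
        return fileData;
    }

    public void setFileData(DataHandler value) {
        this.fileData = value;
    }
}
```

If your generated class has a byte[] field here instead, the xmime attribute did not take effect and the whole attachment will be materialized in memory.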

Spring configuration

Spring needs to know that we are using MTOM. Specifically, the JAXB2 marshaller needs to know: if we do not tell it we are using MTOM, we will get the dreaded out-of-memory error. Here is an example Spring configuration that enables the Jaxb2Marshaller with MTOM turned on:

<bean id="messageReceiver"
	class="org.springframework.ws.soap.server.SoapMessageDispatcher">
	<property name="endpointAdapters">
		<list>
			<ref bean="defaultMethodEndpointAdapter" />
		</list>
	</property>
</bean>
 
<bean id="marshaller" class="org.springframework.oxm.jaxb.Jaxb2Marshaller">
	<property name="classesToBeBound">
		<list>
			<!-- When you generate your JAXB classes from your XSD. Place them here. -->
			<value>com.example.ws.MyGeneratedJavaClass</value>
		</list>
	</property>
 
	<!-- This is the important part! -->
	<property name="mtomEnabled" value="true" />
</bean>
 
<bean id="marshallingPayloadMethodProcessor"
	class="org.springframework.ws.server.endpoint.adapter.method.MarshallingPayloadMethodProcessor">
	<constructor-arg ref="marshaller" />
	<constructor-arg ref="marshaller" />
</bean>
 
<bean id="defaultMethodEndpointAdapter"
	class="org.springframework.ws.server.endpoint.adapter.DefaultMethodEndpointAdapter">
	<property name="methodArgumentResolvers">
		<list>
			<!-- Be careful here! You might need to add more processors if you do more than webservices! -->
			<ref bean="marshallingPayloadMethodProcessor" />
		</list>
	</property>
	<property name="methodReturnValueHandlers">
		<list>
			<ref bean="marshallingPayloadMethodProcessor" />
		</list>
	</property>
</bean>

If you have a better way of doing this, please let me know. This is not a perfect configuration, but it works for a server that will only be doing web services.
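For completeness, here is a sketch of what the receiving endpoint can look like. The class, namespace and target path are hypothetical; the key point is streaming the DataHandler's InputStream to disk in chunks instead of doing anything that materializes a byte[]:

```java
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;

import javax.activation.DataHandler;

import org.springframework.ws.server.endpoint.annotation.Endpoint;
import org.springframework.ws.server.endpoint.annotation.PayloadRoot;
import org.springframework.ws.server.endpoint.annotation.RequestPayload;

@Endpoint
public class FileEndpoint {

    @PayloadRoot(localPart = "FileData", namespace = "http://example.com/ws/file")
    public void handleUpload(@RequestPayload FileData request) throws Exception {
        DataHandler handler = request.getFileData();

        // Copy from SAAJ's temp file to the final location in small chunks;
        // the full attachment is never held in memory.
        try (InputStream in = handler.getInputStream();
             OutputStream out = new FileOutputStream("/data/incoming/upload.bin")) {
            byte[] buf = new byte[8192];
            int len;
            while ((len = in.read(buf)) > 0) {
                out.write(buf, 0, len);
            }
        }
    }
}
```

As long as you only ever pull from getInputStream() like this, the attachment stays on disk.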

SAAJ MimePull System Property

In order for SAAJ to be able to stream attachments and save them to the hard drive we need to enable MimePull. To do so set a JVM system property like so:

-Dsaaj.use.mimepull=true

There is a JIRA entry describing this option.

Temporary folder to store files being received

Since we are not loading the contents of the file into memory, we need to store it somewhere while the transmission takes place. The server, or more specifically SAAJ, will use the currently set Java temp file location to store the files temporarily. The default location differs by OS. Here is how you configure the temporary file location for the JVM:

-Djava.io.tmpdir=/path/to/tmpdir

Just add it to your JVM system properties. Do not forget to manually clear this temporary folder periodically. The files are not automatically removed!
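Since nothing cleans that directory for you, a small scheduled task can do it. Here is a stand-alone sketch (the directory and age threshold are examples, not anything SAAJ mandates) that deletes files older than a given age:

```java
import java.io.File;

public class TempDirCleaner {

    // Delete regular files in dir whose last-modified time is older than
    // maxAgeMillis. Returns the number of files deleted.
    public static int cleanOldFiles(File dir, long maxAgeMillis) {
        File[] files = dir.listFiles();
        if (files == null) {
            return 0; // not a directory, or an I/O error occurred
        }
        long cutoff = System.currentTimeMillis() - maxAgeMillis;
        int deleted = 0;
        for (File f : files) {
            if (f.isFile() && f.lastModified() < cutoff && f.delete()) {
                deleted++;
            }
        }
        return deleted;
    }

    public static void main(String[] args) throws Exception {
        // Demo on a scratch subdirectory rather than the real tmpdir.
        File dir = new File(System.getProperty("java.io.tmpdir"), "mtom-cleaner-demo");
        dir.mkdirs();
        new File(dir, "stale.tmp").createNewFile();
        // A negative age makes everything count as "old".
        System.out.println("deleted " + cleanOldFiles(dir, -1) + " file(s)");
    }
}
```

In production you would point it at your configured java.io.tmpdir and run it from cron or a scheduled executor with a sensible age, such as 24 hours.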

The server should now be complete.

Configuring the client (Axis 2)

For the client portion I am using Axis 2 and/or Apache CXF. I am not going to cover generating the client stub classes in this post.

Since the client also needs to receive files without running out of memory, we need to enable file caching and specify a folder to temporarily store files as they are downloaded (this folder also requires manual cleaning!). Below we enable MTOM, set the caching threshold (the size above which a file is cached to disk), set the folder for temporary files, and set a timeout large enough for the file size we are uploading:

// Axis 2 configuration
 
// First we set our MTOM settings and options.
Options mtomOptions = new Options();
mtomOptions.setProperty( Constants.Configuration.ENABLE_MTOM, Constants.VALUE_TRUE );
mtomOptions.setProperty( Constants.Configuration.ATTACHMENT_TEMP_DIR, "c:/temp/axisclient/" );
mtomOptions.setProperty( Constants.Configuration.CACHE_ATTACHMENTS, Constants.VALUE_TRUE );
mtomOptions.setProperty( Constants.Configuration.FILE_SIZE_THRESHOLD, "1024" );
mtomOptions.setTimeOutInMilliSeconds( TIMEOUT );
 
MyServiceStub service = new MyServiceStub();
 
// Set the options on the service stub.
service._getServiceClient().setOptions( mtomOptions );
 
// Set the endpoint URL.
EndpointReference wRef = new EndpointReference();
wRef.setAddress( "http://localhost:8080/ws/" );
service._getServiceClient().setTargetEPR( wRef );
 
//At this point you would set the data into your stub object.
MyRequest wRequest = new MyRequest();
 
DataHandler wHldr1 = new DataHandler( new URL(
				"file:///c:/verylargefile.zip" ) );
 
// Stick wHldr1 into your generated request class.
wRequest.setFileData( wHldr1 );
 
// Send the request.

A word of warning: at the time of writing this post I fell victim to a bug in Axis 2 where the client is unable to download and cache MTOM files correctly. To fix it (and I hate to do this) I needed to comment out one line from the auto-generated client stub.

You can find it inside your generated service stub class:

if (_messageContext.getTransportOut() != null) {
    // Comment out this line.
    //_messageContext.getTransportOut().getSender().cleanup(_messageContext);
}

Configuring the client (Apache CXF)

Apache CXF is slightly more complicated to set up when generating the classes and stubs, but it works out of the box without any bugs. Here is an Apache CXF example for uploading a large attachment to a service:

MyService wService = new MyService(new URL("http://localhost:8080/my.wsdl"));
 
// Get the port.
My wMyClient = wService.getMySoap11();
 
// Set client receive timeout to unlimited.
// If we are sending really large files a timeout would be bad.
Client cl = ClientProxy.getClient( wMyClient );
HTTPConduit http = (HTTPConduit)cl.getConduit();
HTTPClientPolicy httpClientPolicy = new HTTPClientPolicy();
httpClientPolicy.setConnectionTimeout( 0 );
httpClientPolicy.setReceiveTimeout( 0 );
http.setClient( httpClientPolicy );
 
// Set MTOM enabled.
javax.xml.ws.BindingProvider bp = (javax.xml.ws.BindingProvider)wMyClient;
SOAPBinding binding = (SOAPBinding)bp.getBinding();
binding.setMTOMEnabled( true );
 
//At this point you would set the data into your stub object.
MyRequest wRequest = new MyRequest();
 
DataHandler wHldr1 = new DataHandler( new URL(
				"file:///c:/verylargefile.zip" ) );
 
// Stick wHldr1 into your generated request class.
wRequest.setFileData( wHldr1 );
 
// Send the request
MyResponse wResponse = wMyClient.my( wRequest );

At this point you should be able to send and receive very large files. I have tested (against a local server) an 11 gigabyte file without any memory issues.

Another small note: sending and receiving files on the client side behave differently. Usually an Axis or CXF client can receive an MTOM file without many modifications. The examples above are aimed more at sending large files than at receiving them; it might take some tuning to work for you.

Categories: Java, Programming, Spring, Tomcat

Bandwidth monitoring with vnStat

February 24th, 2011

Monitoring bandwidth usage with vnStat is extremely easy. There are two ways you can monitor bandwidth with vnStat; in this post I will show you the cron job method as opposed to the daemon method.

Installation

vnStat can be installed on Debian-based distros by typing:

sudo apt-get install vnstat vnstati

Creating the database

Once installed we need to create a database where we can store the bandwidth usage information. In this step you need to know the network interface name of the interface you want to monitor. If you want to monitor more than one interface you need to run this command once per interface you want to monitor.

How do you find the right interface? It depends on your system and what exactly you want to monitor. To get a list of interfaces on your machine, type:

ifconfig

You might see something that looks similar to this:

eth0      Link encap:Ethernet  HWaddr 00:14:5E:5A:BE:75
          inet addr:XXX.XXX.XXX.XXX  Bcast:XXX.XXX.XXX.XXX  Mask:255.255.255.0
          inet6 addr: **************** Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:12533815559 errors:0 dropped:0 overruns:0 frame:0
          TX packets:13465552400 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:6190479498840 (5.6 TiB)  TX bytes:2713637024693 (2.4 TiB)

eth0 is the main interface in my example and is the standard interface on most Linux machines. For the rest of this post we will be using eth0 as the interface we want to monitor.

Now that we know the name of the interface we want to monitor we need to create a database for the vnStat data. We can create one for eth0 by typing:

vnstat -u -i eth0

This will create a new database file in /var/lib/vnstat/ called eth0.

Periodically updating the database

Now that we have a database to store the bandwidth transfer information, we need to periodically update this database with new data. The method this post will describe is the cron job method of updating the database. Calling the vnstat update command line with a cron job is a very simple and effective way of updating the database.

One thing to understand: if your server sees very heavy traffic, the way vnstat detects usage might not work properly, because the kernel's interface counter can roll over too quickly between updates. If you think your usage is very high, consider using vnstatd, the daemon, instead of the cron method.

Creating a cron entry varies depending on which operating system you are using. I will not go into detail on how to do this for each distribution, so here is the cron entry to run vnstat and update the database every hour:

0 * * * * vnstat -u

Most distributions have an /etc/crontab file which can be edited. Adding the line above to that file will run vnstat every hour and update all the databases located in /var/lib/vnstat. If you only want to update the database for a specific interface, add the -i switch and the interface name like so:

0 * * * * vnstat -u -i <interface>

Viewing your bandwidth usage

To view your usage at the console just type:

vnstat

Here is a list of arguments to be able to see different statistics on your bandwidth usage:

$ vnstat --help
 vnStat 1.10 by Teemu Toivola 
 
         -q,  --query          query database
         -h,  --hours          show hours
         -d,  --days           show days
         -m,  --months         show months
         -w,  --weeks          show weeks
         -t,  --top10          show top10
         -s,  --short          use short output
         -u,  --update         update database
         -i,  --iface          select interface (default: eth0)
         -?,  --help           short help
         -v,  --version        show version
         -tr, --traffic        calculate traffic
         -ru, --rateunit       swap configured rate unit
         -l,  --live           show transfer rate in real time
 
See also "--longhelp" for complete options list and "man vnstat"

Visualizing your bandwidth usage

When we installed vnStat we also installed vnstati. vnStati is a program that reads the database created by vnstat and generates really cool graphs and tables to easily view your bandwidth usage.

There are many different visualization presets in vnStati; check out the man page for more. A simple summary can be created by using the command:

vnstati -s -i eth0 -o /my/output/folder/eth0-summary.png

This will create a 500×200 png image that contains a simple summary of your bandwidth usage. By saving this image in a location that is accessible by a web server you can access this image from anywhere to check your bandwidth usage remotely.

You can also create a cron job to generate the image every hour after you have updated the database. Here is an example of the summary output:

vnStat summary example

vnStat is a great tool for someone looking for a simple bandwidth monitor with no hassle installation.


Categories: Linux, Networking

SSH password prompt takes a long time to appear

February 23rd, 2011

Sometimes after a fresh install of an SSH server on Linux you will notice that when you try to connect to your server using an SSH client, the password prompt takes a LONG time to show up after entering your username. This is usually because the server attempts a reverse DNS lookup on the connecting client; you need to adjust your SSH configuration.

Once logged in to your server, edit the file sshd_config, which can be found in the /etc/ssh/ (Linux/BSD) folder.

Add this to the bottom of the file:

UseDNS no

Save the file, then restart the SSH service (or reboot) and try to log in again to see if the problem is still there.

Categories: BSD, Linux Tags:

How to find out what Unix version / flavor is running?

February 23rd, 2011

Type the commands below to see exactly which version of Unix is running.

To see which version/flavor is running: cat /etc/issue

To see which kernel version is running: uname -a

Categories: Linux Tags:

Switch uplink with Cisco not working

February 23rd, 2011

Sometimes after configuring your switch port uplink,  you might not get a link.

Check whether the cable is straight-through or cross-over, and make sure the cable pin-out is correct.

Check to see that both ports are enabled/on/working.

What happens a lot is that the device might not support auto-MDIX. Without auto-MDIX, a switch-to-switch uplink needs a cross-over cable; if both devices support auto-MDIX, either cable type should work.

Check to see if your switch needs the speed to be set manually or set on auto.

Another issue to look at is when one end is not a Cisco device, say a Foundry for example. The Foundry might be using a 10/100/1000 port while your Cisco device uses a 10/100 port. In some cases you will get no link even if you don’t need 1000 speed and have the port set to 100; both ports may need to be 10/100/1000. Example: a Cisco 10/100 port connected to a Foundry 10/100/1000 port can give NO LINK.

If your switch has a GBIC slot you need to get yourself a 10/100/1000 GBIC (fiber or copper). If you are getting a fiber GBIC make sure you get the right one: there are multimode and single-mode modules depending on the fiber you are using. There are also different types of fiber connectors such as LC, SC and ST; find out which connector is on BOTH ends of the fiber. GBIC modules also come in different speeds, so research your switch/router model and the matching GBIC connectors/speeds.

Categories: Cisco, Networking Tags:

Configure a range of ports on a Cisco switch

February 23rd, 2011

The commands below will allow you to configure a range of ports instead of configuring one port at a time.

Enter enable mode by typing “en” and enter your password. Type:

config t

to enter configuration mode.

Once in:

Switch(config)#

Type:

int range fastEthernet 0/1 - 24

Now the ports range 1-24 are in configuration mode. You will see this:

Switch(config-if)#

Now enter the command you wish to apply and it will be applied to all the ports in the specified range:

Switch(config-if)# switchport port-security (to enable port security)
Switch(config-if)# switchport access vlan 100 (to set port(s) on vlan 100)

Categories: Cisco, Networking Tags:

Windows 7 corrupted user profile

February 23rd, 2011

If you ever get a message that your user profile is corrupted in Windows 7, here is how to fix it.

First, create a new user account with a NEW name:

Click on the Windows button on the bottom left corner.

Select “Control Panel”.

Select “Add or remove user accounts” under “User Accounts and Family Safety”. (Default Windows layout/theme)

Select “Create a new account”.

Type in the new account name and choose “Administrator” then click on Create account on the bottom.

A new account has been created. Proceed to Step 2.

2) Copy your old user files to the new account folder.

For this step you need to copy the files from one user profile to another. You cannot do the copying while logged in as the “source” or “destination” account, so please create a third user account by following the steps above again if you don’t have three accounts. Once completed, follow the steps below.

Log in to the account that is neither the destination nor the source.

Click on the Windows button on the bottom left corner and then select Computer.

Double-click on your Local C: drive and open the Users folder.

Open the folder with your OLD username and then open the “My Documents” folder.

Press ALT+T to open the Tools menu and select Folder Options.

Select the “View” tab on top and select “Show hidden files, folders, and drives”.

Make sure the “Hide protected operating system files” check box is NOT selected. Select Yes to confirm the changes and then click OK.

Now open your Local C: drive again and open your OLD username folder in C:\Users\OLD_USERNAME\.

Copy all the files and folders inside except these three files: Ntuser.dat, Ntuser.dat.log, Ntuser.ini.

To do this, highlight all the files, then hold CTRL and click the three unwanted files to deselect them. While the rest are still highlighted, press CTRL+C to copy them to the system clipboard.

Now we need to paste these files into your NEW user account (not the account you are logged in with).

Open your new account’s folder in C:\Users\New_Useraccount\.

Press CTRL+V to paste the files from the old account into your new account.

Log off this temporary account and try logging in to the newly created account you just copied the files into.

NOTE: Please keep in mind that email profiles (Outlook) and some other software dependent on the user profile might not work and will need to be recreated.

 

Categories: Windows, Windows 7 Tags:

Icons not working in Windows 7

February 23rd, 2011

Click on the Windows menu button on the bottom left corner and select “Computer”.

Go to your Local Disk C:\ and go to the following location: C:\Users\(User Name)\AppData\Local

Right click on IconCache.db file and choose Delete. (This file keeps your icon details)

Empty your recycle bin and restart your computer.

If your icons still do not show up properly, please view our “Fix user profile in Windows 7” post.

Categories: Windows, Windows 7 Tags:

Google Web Toolkit Firefox Plugin Download

August 30th, 2010

I spent about 15 minutes trying to find a place to download the plugin since going to my GWT app page wasn’t allowing me to download the plugin automatically. All I saw was a missing-plugin/ folder. I asked Firefox to search for possible plugins to download but it was unable to find any.

To get the plugin installed via other means, just go here:

http://gwt.google.com/samples/MissingPlugin

Follow the instructions and you’re all set.


Categories: Java, Programming

Zip an entire folder using Java

July 28th, 2010

There are some good code samples out there for doing this, but I put my own simple twist on it: all the code I found online for zipping a directory copied the ENTIRE directory structure, full path included, into the zip file.

private static void zipDir( String origDir, File dirObj, ZipOutputStream out )
				throws IOException {
	File[] files = dirObj.listFiles();
	byte[] tmpBuf = new byte[1024];
 
	for ( int i = 0; i < files.length; i++ ) {
		if ( files[i].isDirectory() ) {
			zipDir( origDir, files[i], out );
			continue;
		}
		String wAbsolutePath =
			files[i].getAbsolutePath().substring( origDir.length(),
				files[i].getAbsolutePath().length() );
		FileInputStream in = 
			new FileInputStream( files[i].getAbsolutePath() );
		out.putNextEntry( new ZipEntry( wAbsolutePath ) );
		int len;
		while ( (len = in.read( tmpBuf )) > 0 ) {
			out.write( tmpBuf, 0, len );
		}
		out.closeEntry();
		in.close();
	}
}

origDir = String version of the directory you want to zip.
dirObj = File object of the dir you want to zip.
out = ZipOutputStream you are zipping to.

Example usage:

try {
	ZipOutputStream out =
		new ZipOutputStream(
			new FileOutputStream( "c:\\outputfolder\\archive.zip" ) );
	zipDir( "c:\\mypath\\long\\long", new File( "c:\\mypath\\long\\long" ), out );
	out.close();
} catch ( FileNotFoundException e ) {
	e.printStackTrace();
} catch ( IOException e ) {
	e.printStackTrace();
}

As each file is found recursively, we strip the long leading path so that the root of the zip file is the top folder of the path you specified. So instead of a zip file containing folders like “mypath\long\long\myfile”, we just get “myfile” and all subdirectories below the path specified.

It’s not perfect but it works.


Categories: Java, Programming Tags: