
Java FTP client libraries reviewed

Let's imagine a situation where we want to write a pure Java application that must download files from a remote computer running an FTP server. We also want to filter downloads on the basis of remote file information like name, date, or size.

Although it is possible, and maybe fun, to write a protocol handler for FTP from scratch, doing so is also hard, time-consuming, and potentially risky. Since we'd rather not spend the time, effort, or money writing a handler on our own, we prefer to reuse an existing software component. Plenty of libraries are available on the World Wide Web. With an FTP client library, downloading a file can be written in Java as simply as:

FTPClient ftpClient = new FTPClient();
ftpClient.connect("ftp.foo.com", "user01", "pass1234");
ftpClient.download("C:\\Temp\\", "README.txt");
// Possibly other operations here ...
ftpClient.disconnect();

 

Looking for a quality Java FTP client library that matches our needs is not as simple as it seems; it can be quite painful. It takes some time to find a Java FTP client library. Then, after we find all the existing libraries, which one do we select? Each library addresses different needs. The libraries are unequal in quality, and their designs differ fundamentally. Each offers a different set of features and uses different types of jargon to describe them.

Thus, evaluating and comparing FTP client libraries can prove difficult and confusing. Reusing existing components is a commendable process, but in this case, starting out can be discouraging. And this is a shame: after choosing a good FTP library, the rest is routine.

This article aims to make that selection process short, easy, and worthwhile. I first list all available FTP client libraries. Then I define and describe a list of relevant criteria that the libraries should address in some way. Finally, I present an overview matrix that gives a quick view of how the libraries stack up against each other. All this information provides everything we need to make a fast, reliable, and long-lasting decision.

FTP support in JDK

The reference specification for FTP is Request for Comments: 959 (RFC959). Sun Microsystems provides an RFC959 implementation in the JDK, but it is internal, undocumented, and no source is provided. While RFC959 lies in the shadows, it is actually the back end of a public interface implementing RFC1738, the URL specification, as illustrated in Figure 1.

Figure 1. FTP support in JDK

An implementation of RFC1738 is offered as standard in the JDK. It does a reasonable job for basic FTP transfer operations. It is public and documented, and source code is provided. To use it, we write the following:

URL url = new URL("ftp://user01:pass1234@ftp.foo.com/README.txt;type=i");
URLConnection urlc = url.openConnection();
InputStream is = urlc.getInputStream(); // To download
OutputStream os = urlc.getOutputStream(); // To upload

 

FTP client support in JDK strictly follows the standard recommendation, but it has several downsides:

  • It fundamentally differs from the third-party FTP client libraries; these implement RFC959 rather than RFC1738.
  • RFC959 is implemented in most desktop FTP-client tools. Many Java programmers use these tools to connect to FTP servers. As a matter of taste, these programmers most likely prefer RFC959-like libraries.
  • The URL and URLConnection classes only open streams for communication. The Sun library offers no direct support for structuring the raw FTP server responses into more usable Java objects like String, File, RemoteFile, or Calendar. So we have to write more code just to write data into a file or to make use of a directory listing.
  • As explained in section 3.2.5 of RFC1738, "Optimization," FTP URLs require that the (control) connection close after every operation. This is wasteful and inefficient for transferring many small files. Furthermore, extremely restrictive FTP servers may treat such communication overhead as a network attack or abuse and deny further service.
  • Finally, it lacks several useful features.
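
To give an idea of the extra code mentioned in the list above, here is a minimal sketch that copies a download stream into a local file. The class and method names (StreamToFile, save) are hypothetical and belong to no library; in real use the InputStream would be the one returned by urlc.getInputStream().

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of the JDK FTP support or of any library reviewed here.
final class StreamToFile {

    // Copies everything from 'in' into 'target', replacing an existing file,
    // closes the stream, and returns the number of bytes written.
    static long save(InputStream in, Path target) throws IOException {
        try (InputStream source = in) {
            return Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

A directory listing obtained the same way arrives as raw text and still has to be parsed by hand.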


For all or any of these reasons, using a third-party library is preferable. The following section lists the available third-party alternatives.

Library comparison

The list below outlines the libraries I compare throughout this article. They all follow the reference FTP specification. Below, I mention the provider name and the library name (in italics). Resources includes links to each product Website. To jumpstart library use, I also mention the main FTP client class.

  1. JScape, iNet Factory: com.jscape.inet.ftp.Ftp
  2. /n software, IP*Works: ipworks.Ftp
  3. Enterprise Distributed Technologies, Java FTP Client Library: com.enterprisedt.net.ftp.FTPClient
  4. IBM alphaWorks, FTP Bean Suite: com.ibm.network.ftp.protocol.FTPProtocol
  5. SourceForge, JFtp: net.sf.jftp.net.FtpConnection
  6. The Jakarta Project, Jakarta Commons/Net: org.apache.commons.net.ftp.FTPClient
  7. JavaShop JNetBeans: jshop.jnet.FTPClient
  8. Sun, JDK: sun.net.ftp.FtpClient
  9. Florent Cueto, JavaFTP API: com.cqs.ftp.FTP
  10. Bea Petrovicova, jFTP: cz.dhl.ftp.Ftp
  11. The Globus Project, Java CoG Kit: org.globus.io.ftp.FTPClient


Notes:

  • At the time of this writing, IBM is evaluating the suitability of offering its alphaWorks FTP Bean Suite on its Website. For now, download is closed for all users.
  • Jakarta Commons/Net is a drop-in replacement for Savarese NetComponents, which is no longer developed.
  • JavaShop JNetBeans seems to have been abandoned. At the time of this writing, the site has been off-line for more than a month, and I never received any answers to my support requests.


Criteria

So far, I have introduced the context and listed the available libraries. Now, I list the relevant criteria against which each library will be evaluated. I enumerate possible values for each criterion, along with the abbreviation (in bold) used in the final comparison matrix.

Product support

The libraries provide support to users through product documentation, compiled Javadocs, sample code, and an example application that can include comments and explanations. Additional support can be offered to users through forums, mailing lists, a contact email address, or an online bug tracking system. /n software offers extensive support for an additional fee.

A support administrator's motivation is an important factor for fast support. Support administrators can be:

  • A voluntary individual (I)
  • A voluntary group (G)
  • A professional entity paid to provide support (P)


License

For commercial projects, a product license is an important matter to consider from the beginning. Some libraries can be freely redistributed in commercial products and others cannot. For instance, GPL (GNU General Public License) is a strong, limiting license, while the Apache Software license only requires a mention in redistributed products.

Commercial licenses limit the number of development workstations programming with the library, but distribution of the library itself is not restricted.

For noncommercial projects, the license is more a matter of philosophy; a free product is always welcome.

Licenses can be:

  • Commercial (C)
  • GPL (G)
  • Free (F); however, check a free license for limitations


Some library providers provide alternate, less-restrictive licenses on demand.

Source code provided

A closed-sourced, black-box software library can be irritating. Having source code can be more comfortable for the following reasons:

  • When debugging application code execution, stepping into the library code source can help you understand library behavior
  • The source code has useful comments
  • Source code can be quickly tweaked to match special needs
  • Exemplary source code can be inspiring


Age

Libraries have been tested, debugged, and supported since their first public release. As version numbering varies among libraries, I base this criterion on the year of the earliest public release.

Directory listing support

Retrieving remote file information (name, size, date) from the server is important in most applications. The FTP protocol offers the NLST command to retrieve the file names only; the NLST command is explicitly designed to be exploited by programs. The LIST command offers more file information; as RFC959 notes, "Since the information on a file may vary widely from system to system, this information may be hard to use automatically in a program, but may be quite useful to a human user." No other standard method retrieves file information; therefore, client libraries try to exploit the LIST response. But this is not an easy task: since no authoritative recommendation is available for the LIST response format, FTP servers have adopted various formats:

  • Unix style: drwxr-xr-x 1 user01 ftp 512 Jan 29 23:32 prog
  • Alternate Unix style: drwxr-xr-x 1 user01 ftp 512 Jan 29 1997 prog
  • Alternate Unix style: drwxr-xr-x 1 1 1 512 Jan 29 23:32 prog
  • A symbolic link in Unix style: lrwxr-xr-x 1 user01 ftp 512 Jan 29 23:32 prog -> prog2000
  • Weird Unix style (no space between user and group): drwxr-xr-x 1 usernameftp 512 Jan 29 23:32 prog
  • MS-DOS style: 01-29-97 11:32PM <DIR> prog
  • Macintosh style: drwxr-xr-x folder 0 Jan 29 23:32 prog
  • OS/2 style: 0 DIR 01-29-97 23:32 PROG


Unix style, then MS-DOS style, are the most widespread formats.
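
To make the parsing problem concrete, here is a rough sketch of how a client might recognize the two most widespread formats with regular expressions. It is an illustration only, not the code of any library reviewed here: the class name ListLineParser, its Entry type, and the patterns are my own assumptions, and lines in the other styles listed above (the Macintosh style, for example) are simply reported as unrecognized.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for single LIST lines in Unix or MS-DOS style.
final class ListLineParser {

    static final class Entry {
        final String name;
        final long size;
        final boolean directory;

        Entry(String name, long size, boolean directory) {
            this.name = name;
            this.size = size;
            this.directory = directory;
        }
    }

    // type+permissions, link count, owner, group, size, "Jan 29 23:32" or "Jan 29 1997", name
    private static final Pattern UNIX = Pattern.compile(
            "^([-dl])[-rwxsStT]{9}\\s+\\d+\\s+\\S+\\s+\\S+\\s+(\\d+)\\s+"
            + "\\w{3}\\s+\\d{1,2}\\s+(?:\\d{4}|\\d{1,2}:\\d{2})\\s+(.+)$");

    // date, time with AM/PM, "<DIR>" or size, name
    private static final Pattern MSDOS = Pattern.compile(
            "^\\d{2}-\\d{2}-\\d{2}\\s+\\d{1,2}:\\d{2}[AP]M\\s+(<DIR>|\\d+)\\s+(.+)$");

    static Optional<Entry> parse(String rawLine) {
        String line = rawLine.trim();
        Matcher unix = UNIX.matcher(line);
        if (unix.matches()) {
            String type = unix.group(1);
            String name = unix.group(3);
            int arrow = name.indexOf(" -> ");
            if (type.equals("l") && arrow >= 0) {
                name = name.substring(0, arrow); // keep the link name, drop its target
            }
            return Optional.of(new Entry(name, Long.parseLong(unix.group(2)), type.equals("d")));
        }
        Matcher dos = MSDOS.matcher(line);
        if (dos.matches()) {
            boolean dir = dos.group(1).equals("<DIR>");
            long size = dir ? 0L : Long.parseLong(dos.group(1));
            return Optional.of(new Entry(dos.group(2), size, dir));
        }
        return Optional.empty(); // unknown format: the caller falls back to the raw line
    }
}
```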

Java FTP client libraries try to understand and auto-detect as many formats as possible. In addition, they offer various alternatives for handling unexpected format answers:

  • An additional method returning a raw FTP response as one string (S)
  • An additional method returning a collection of raw strings, one string per line/file (C)
  • A framework supporting pluggable parsers (P)


Most libraries parse LIST responses and structure raw file information into Java objects. For example, with JScape iNet Factory, the following code retrieves and exploits file information received in a directory listing:

java.util.Enumeration files = ftpClient.getDirListing();
while (files.hasMoreElements()) {
   FtpFile ftpFile = (FtpFile) files.nextElement();
   System.out.println(ftpFile.getFilename());
   System.out.println(ftpFile.getFilesize());
   // etc. other helpful methods are detailed in Javadoc
}


Section "Solutions for Remaining Problems" further considers directory listings.

Timestamp retrieval

In many cases, we are interested in a remote file's latest modification timestamp. Unfortunately, no RFC introduces a standard FTP command to retrieve this information. Two de facto methods exist:

  1. Retrieve this information from the LIST response by parsing the server answer. Unfortunately, as you learned in the previous section, the LIST response varies among FTP servers, and the timestamp information is sometimes incomplete. In the Unix format, imprecision occurs when the remote file is old (typically more than about six months, following the usual ls convention): only the date and year are given, not the hours or minutes.
  2. Use the nonstandard MDTM command, which specifically retrieves a remote file's last modification timestamp. Unfortunately, not all FTP servers implement this command.


When a library lacks built-in MDTM support, a more laborious alternative is to send a raw MDTM command and parse the response ourselves. Most libraries provide a method for sending a raw FTP command, something like:

String timeStampString = ftpClient.command("MDTM README.txt");
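
Parsing the returned string could look like the sketch below. It assumes the commonly seen reply form "213 YYYYMMDDHHMMSS", expressed in GMT, and that the raw-command method returns the whole reply line; both are assumptions to verify against the target server and library. The class name MdtmReply is hypothetical.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Optional;

// Hypothetical parser for a raw MDTM reply such as "213 20030129233205".
final class MdtmReply {

    private static final DateTimeFormatter STAMP = DateTimeFormatter.ofPattern("yyyyMMddHHmmss");

    static Optional<Instant> parse(String reply) {
        if (reply == null) {
            return Optional.empty();
        }
        String text = reply.trim();
        // "213 " plus 14 digits; any fractional seconds after that are ignored.
        if (!text.startsWith("213 ") || text.length() < 18) {
            return Optional.empty();
        }
        try {
            LocalDateTime stamp = LocalDateTime.parse(text.substring(4, 18), STAMP);
            return Optional.of(stamp.toInstant(ZoneOffset.UTC)); // assumption: server answers in GMT
        } catch (DateTimeParseException e) {
            return Optional.empty();
        }
    }
}
```

An error reply, such as a 550 for a missing file, yields an empty result, which leaves the caller free to fall back on the LIST response.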


Another possible concern is that FTP servers return time information in GMT (Greenwich Mean Time). If the server time zone is known apart from FTP communication, the java.util.TimeZone.getOffset() method can help adjust a date between time zones. See JDK documentation for further information about this method.
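
As a small illustration of that adjustment, the sketch below uses TimeZone.getOffset(long) to shift a GMT timestamp to the server's wall-clock time. The class and method names are hypothetical, and the server zone is assumed to be known by some other means.

```java
import java.util.TimeZone;

// Hypothetical helper around java.util.TimeZone.getOffset(long).
final class ServerTime {

    // Milliseconds to add to GMT to obtain the server's local time at that instant
    // (daylight saving time included when the zone defines it).
    static int offsetMillis(long gmtMillis, TimeZone serverZone) {
        return serverZone.getOffset(gmtMillis);
    }

    // GMT milliseconds shifted to the server's wall-clock time, for display or comparison only.
    static long toServerWallClock(long gmtMillis, TimeZone serverZone) {
        return gmtMillis + offsetMillis(gmtMillis, serverZone);
    }
}
```

For example, with TimeZone.getTimeZone("GMT+02:00") the offset is two hours, so toServerWallClock adds 7,200,000 milliseconds to its argument.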

Section "Solutions for Remaining Problems" further considers file timestamp retrieval.

Firewalls

Typically, a firewall is placed between a private enterprise network and a public network such as the Internet. Access is managed from the private network to the public network, but access is denied from the public network to the private network.

Socks is a publicly available protocol developed for use as a firewall gateway for the Internet. The JDK supports Socks 4 and Socks 5 proxies, which can be controlled by some of the libraries. As an alternative, the JVM command line can set the Socks proxy parameters: java -DsocksProxyPort=1080 -DsocksProxyHost=socks.foo.com -Djava.net.socks.username=user01 -Djava.net.socks.password=pass1234 ...
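
The same settings can also be applied from code. The sketch below shows two options under stated assumptions: setting the JVM-wide properties programmatically, and building a java.net.Proxy for a single connection. The class name SocksSettings is hypothetical, and whether a given FTP library accepts a custom proxy or socket is library-specific.

```java
import java.net.InetSocketAddress;
import java.net.Proxy;

// Hypothetical helper showing two ways to point the JDK at a Socks proxy.
final class SocksSettings {

    // JVM-wide: equivalent to passing -DsocksProxyHost and -DsocksProxyPort on the command line.
    static void applyGlobally(String host, int port) {
        System.setProperty("socksProxyHost", host);
        System.setProperty("socksProxyPort", Integer.toString(port));
    }

    // Per connection: the returned Proxy can be handed to new java.net.Socket(proxy).
    static Proxy perConnection(String host, int port) {
        // Unresolved address: the host name is looked up only when the proxy is used.
        return new Proxy(Proxy.Type.SOCKS, InetSocketAddress.createUnresolved(host, port));
    }
}
```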

Another common alternative to Socks proxy support is to "socksify" the underlying TCP/IP layer on the client machine. A product like Hummingbird can do that job.

The JDK also supports HTTP tunnels. These widespread proxies do not allow FTP uploads. /n software's IP*Works allows you to set HTTP tunnel parameters.

Most libraries support both active and passive connections. A passive connection is useful when the client is behind a firewall that blocks incoming connections to higher ports. RFC1579 discusses this firewall-friendly functionality in more detail. Some product documentation refers to active and passive connections by their FTP commands, PORT and PASV, respectively.

Parallel transfer

In a desktop application, when a transfer starts in the main single thread, everything freezes. Some libraries automatically service the event loop for parallel transfers in separate threads so we do not have to create and manage our own threads.
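
When a library does not offer this, we can do the threading ourselves. Here is a minimal sketch, assuming the transfer is wrapped in a Callable; the class name BackgroundTransfers is hypothetical, and the actual FTP call inside the Callable depends on the chosen library.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical wrapper that runs transfers on worker threads so the caller's thread stays responsive.
final class BackgroundTransfers implements AutoCloseable {

    private final ExecutorService pool;

    BackgroundTransfers(int maxParallel) {
        this.pool = Executors.newFixedThreadPool(maxParallel);
    }

    // Queues one transfer; the Future gives access to its result or failure later.
    <T> Future<T> submit(Callable<T> transfer) {
        return pool.submit(transfer);
    }

    // Stops accepting new transfers; transfers already submitted still run to completion.
    @Override
    public void close() {
        pool.shutdown();
    }
}
```

Note that many FTP client objects hold a single control connection and may not be safe to share between threads, so a cautious design gives each parallel transfer its own client instance.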

JavaBean specification support

Some libraries implement the JavaBean specification. JavaBean compliance allows visual programming, which is featured in major Java IDEs.

/n software's IP*Works JavaBean design is event-based (for example, see the ipworks.Ftp.listDirectory() method). Although it remains synchronous and is perfectly safe, some programmers may find it odd or awkward in server-side applications.

Progress monitoring

Some libraries implement progress monitoring. Progress monitoring support makes it easy to implement event listeners that track any FTP transfer's progress. This feature is useful when developing a friendly user interface.
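
Where a library exposes only plain streams, a similar effect can be approximated by wrapping the stream. The sketch below is one possible approach, not the mechanism of any reviewed library; ProgressListener and ProgressInputStream are hypothetical names.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical callback notified with the running byte count.
interface ProgressListener {
    void bytesTransferred(long totalSoFar);
}

// Hypothetical stream wrapper that reports every successful read to a listener.
final class ProgressInputStream extends FilterInputStream {

    private final ProgressListener listener;
    private long total;

    ProgressInputStream(InputStream in, ProgressListener listener) {
        super(in);
        this.listener = listener;
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b >= 0) {
            report(1);
        }
        return b;
    }

    // read(byte[]) in FilterInputStream delegates here, so bytes are counted once.
    @Override
    public int read(byte[] buffer, int offset, int length) throws IOException {
        int n = super.read(buffer, offset, length);
        if (n > 0) {
            report(n);
        }
        return n;
    }

    private void report(long count) {
        total += count;
        listener.bytesTransferred(total);
    }
}
```

In a user interface, the listener would typically forward the count to a progress bar on the event thread.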

Transmission types

RFC959 section 3.1.1 specifies several transmission types, among which two are common: ASCII nonprint (default) and image (also called binary). Some libraries can be set in auto mode, according to the file extension. Such a method is rarely useful in modern information systems. Other transmission types have become obsolete and are not supported by any of the Java libraries.
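
For readers curious about what auto mode amounts to, here is a simplified sketch that picks a transmission type from the file extension. The class name and the extension list are arbitrary assumptions; libraries offering an auto mode use their own rules.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical extension-based choice between the two common transmission types.
final class TransferTypes {

    enum Type { ASCII, BINARY }

    // Arbitrary sample of extensions treated as text.
    private static final Set<String> TEXT_EXTENSIONS =
            new HashSet<>(Arrays.asList("txt", "html", "htm", "xml", "csv", "java"));

    static Type forFileName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return Type.BINARY; // no extension: binary never alters the bytes
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return TEXT_EXTENSIONS.contains(extension) ? Type.ASCII : Type.BINARY;
    }
}
```

Defaulting to binary is the conservative choice, since an ASCII transfer may rewrite line endings and corrupt non-text files.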

Other criteria notes

All libraries run on at least JDK 1.2.x and later; most should run on JDK 1.1.x, and maybe JDK 1.0.x.

All libraries are pure Java.

The comparison matrix lists other obvious criteria.

Java FTP client libraries: Comparison matrix

Now comes the final comparison matrix. It displays libraries on top against criteria on the left. In cells, Y means Yes; other abbreviations are explained in the criterion lists above (see letters in bold) and in the table's key.

FTP Comparison Matrix

When choosing a library, I have a few recommendations:

  • For server-side applications, I recommend Jakarta Commons/Net library
  • I found JScape's iNetFactory the easiest library to use
  • /n software's IP*Works is part of a wide family of products, which includes encrypted FTP support
  • Java CoG Kit also implements GridFTP, an interesting new-generation file transfer system
  • At the time of this writing, I recommend not using IBM alphaWorks FTP Bean Suite and JavaShop JNetBeans in their current condition
  • Other libraries are absolutely decent and may suit your needs; please refer to the matrix


Change management

Most likely, at some point in our project, especially at the end when thorough testing occurs, we might want to change our library. Such a change affects all our calling code: our classes do not compile anymore, and some application parts must be recoded to match different method names and the new library's different design.

Since managing such a change can prove annoying, especially at a project's end when time is a critical resource, we should limit changes to one single class. Typically, we can apply the Façade pattern, with the FTP library as the back end, as illustrated in Figure 2.

Figure 2. The Façade pattern applied to an FTP library

A beneficial side effect of applying the Façade pattern to the FTP library is that we can add value to the library itself. For instance, we can write a Façade method that downloads an entire remote directory tree into a local zip file or a method that implements any basic feature lacking in the library.
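
A Façade along these lines might look like the following sketch. All names (FtpFacade, FtpFacadeException, RecordingFtpFacade) are hypothetical, the operations mirror the opening example of this article, and the back end shown merely records calls; a real one would delegate to the chosen library and translate its exceptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical exception type so that library-specific exceptions never leak to callers.
class FtpFacadeException extends RuntimeException {
    FtpFacadeException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Hypothetical facade: the only FTP type the rest of the application depends on.
interface FtpFacade {
    void connect(String host, String user, String password);
    void download(String localDir, String remoteFile);
    void disconnect();

    // Added value on top of the library: download several files in one call.
    default void downloadAll(String localDir, List<String> remoteFiles) {
        for (String remoteFile : remoteFiles) {
            download(localDir, remoteFile);
        }
    }
}

// Stand-in back end used here in place of a real library adapter.
final class RecordingFtpFacade implements FtpFacade {
    final List<String> calls = new ArrayList<>();

    @Override
    public void connect(String host, String user, String password) {
        calls.add("connect " + host);
    }

    @Override
    public void download(String localDir, String remoteFile) {
        calls.add("download " + remoteFile);
    }

    @Override
    public void disconnect() {
        calls.add("disconnect");
    }
}
```

Switching libraries then means writing one new implementation of the interface, while the calling code stays untouched.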

Finally, two libraries that have the same signature do not necessarily have the same runtime behavior. Thus, switching from one library to another can also affect our application runtime. Such an impact is unpleasant and uncomfortable because discovering runtime differences is much more difficult, although detailed test cases can help.

Solutions for remaining problems

In the explanations of the above criteria, I briefly described several unsolved problems. In this section, I further discuss and address them; I suggest both long-term solutions and short-term workarounds.

Directory listing

The lack of any authoritative specification for the LIST response has led to many different FTP server implementations. This diversity is the biggest problem for FTP client programmers and is still an open issue.

As the problem's root lies in the protocol definition, I recommend that the concerned authoritative entity, the Internet Engineering Task Force (IETF), define the LIST response structure specification in a new reference document (an RFC).

This process can be long. In the meantime, the most flexible solution is to use a library offering a framework for pluggable format parsers.
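
The idea behind such a framework can be sketched in a few lines: each parser either recognizes a raw line or declines, and a chain tries them in order. The names ListFormatParser and ParserChain are hypothetical and do not reflect the API of any library in the matrix.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical plug-in point: returns the file name if the line is in this parser's format.
interface ListFormatParser {
    Optional<String> fileName(String rawLine);
}

// Hypothetical registry that asks each registered parser in turn.
final class ParserChain {

    private final List<ListFormatParser> parsers = new ArrayList<>();

    ParserChain register(ListFormatParser parser) {
        parsers.add(parser);
        return this;
    }

    Optional<String> fileName(String rawLine) {
        for (ListFormatParser parser : parsers) {
            Optional<String> result = parser.fileName(rawLine);
            if (result.isPresent()) {
                return result;
            }
        }
        return Optional.empty(); // no parser recognized the line
    }
}
```

Supporting a server with an unusual format then only requires registering one more parser, without touching the library or the calling code.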

File timestamp retrieval

As I discussed earlier, no method retrieves a remote file's last modification timestamp through FTP. I suggest two long-term solutions for transferring that timestamp from the server to the client:

  1. Include the precise and complete timestamp representation in the LIST response
  2. Standardize the MDTM command and response


For both solutions, server time zone should be considered in the communication.

Again, as the root of the problem lies in the protocol definition, I recommend that the IETF define one or both of the above solutions as an authoritative specification.

In the meantime, the most generic workaround is to use a library supporting both LIST and MDTM response parsing and exploit a combination of these two features.
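
One way to combine the two features is to try the sources in order of precision and keep the first answer. The sketch below assumes each source is wrapped in a Supplier that returns an empty Optional when the server or library cannot provide the value; TimestampResolver is a hypothetical name.

```java
import java.time.Instant;
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical helper: queries timestamp sources lazily, in the order given.
final class TimestampResolver {

    @SafeVarargs
    static Optional<Instant> firstAvailable(Supplier<Optional<Instant>>... sources) {
        for (Supplier<Optional<Instant>> source : sources) {
            Optional<Instant> value = source.get();
            if (value.isPresent()) {
                return value; // later sources are never queried
            }
        }
        return Optional.empty();
    }
}
```

Passing the MDTM lookup first and the LIST lookup second means the less precise LIST timestamp is requested only when MDTM is unavailable.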

Change management

In the related section above, I recommended the Façade pattern to reduce change effort in case of library replacement. As I mentioned, the pattern does not serve as a panacea, because the diversity in behaviors among libraries can still affect our entire application at runtime, which is difficult to control.

As this concern is a pure programming matter, I recommend that Sun publish a standard well-designed API, defining precise method signatures and behaviors. Anyone, including Sun, could implement it. Programmers could use the interface methods and back them up with their preferred implementation. And any switch from one library to another would have minimal impact on the rest of the application. JavaMail and JDBC (Java Database Connectivity) APIs are exemplary precedents.

The Java FTP API Standardization project aims to organize a consortium of users, developers, and providers to introduce a Request For Enhancement as a Java Specification Request in the Java Community Process. Your support would certainly be useful to this project, the homepage of which can be found in Resources.

Check out the best library for your needs

In this article, I explained how to write FTP client code in Java and presented FTP client support in the JDK and third-party libraries. I presented important criteria to consider when evaluating various libraries and compared the criteria across libraries. I hope decision-makers facing the choice of a Java FTP client library find useful indications in this objective study to make the best decision.

Finally, I presented different problems common to all FTP libraries and suggested short-term workarounds as well as long-term solutions that could be adopted by authoritative entities like IETF and Sun. I hope these leads and actions will help forge the future of Java FTP client libraries.

Author Bio

Jean-Pierre Norguet holds an engineering degree in computer science from the Universite Libre de Bruxelles and a Socrates European master's degree from the Ecole Centrale Paris. After three years of full-time Java development with IBM on mission-critical e-business applications, as team leader and coach, his areas of expertise grew to include the entire application development life cycle. He now works as a research fellow in Brussels, Belgium, writing a PhD thesis about Internet audience analysis. His outside interests include artistic drawing, French theater acting, and well-being massage.

Resources



Comment by Tim Archer

Re: Java FTP client libraries reviewed

Jakarta Commons Net is also a pretty good library for performing FTP transfers. I did a writeup on how to use Jakarta Commons Net to perform FTP operations at the following URL: http://timarcher.com/?q=node/56 It has a pretty good example on sending and retrieving files, deleting files, changing directories, etc.
