Fishing for POI

Creating Excel or Word files from Java...

Have you ever needed to create a Microsoft Excel or Word file from Java? If you have, did you try to do it from scratch yourself? If you were working with Excel, did you end up writing the data out as a comma-separated values (CSV) file instead?

CSV files work very well as long as all you are interested in is the raw data. But what if you are interested in including formulas in your data or need to format your spreadsheet appropriately (with centring, colours, bold, italics etc)? Did you give up in the end, or work with a compromise solution?

As it happens I was recently asked this very question by some Java developers. They were working with a web-based application and wanted to create both Excel and Word files for senior management to access. These files would hold dynamically generated data relevant to their organisation. They therefore needed to be generated programmatically as and when required.

The POI Project

Creating Excel and Word files is hard, not least because of the complex file formats Microsoft uses for them, which are based upon its OLE 2 Compound Document format. However, one of the Apache projects does all the hard work for you and makes it very easy to create, read and update Excel files and, soon, Word files. This project is called POI. It has been in development for several years, starting in April 2001, and is currently at version 2.5. It can be downloaded from the Apache POI project site.

POI is really a collection of related sub-projects, which together allow you to create both Word and Excel format files. The main ones are:

* POIFS, the oldest and most stable part of the project, which provides facilities for reading and writing OLE 2 Compound Document files.

* HWPF, a port of the Microsoft Word 97 file format to pure Java.

* HSSF, a port of the Microsoft Excel 97(-2002) file format (BIFF8) to pure Java.

In this column we will focus on the use of POI to create Excel files using HSSF.

HSSF for Excel files

You may wonder what HSSF stands for. Rather provocatively it stands for Horrible SpreadSheet Format (indeed many of the elements of POI have quite provocative names, e.g. DDF - Dreadful Drawing Format, which is the Microsoft Office Drawing format, otherwise known as Escher format).

HSSF provides a way to create Excel spreadsheets as well as to read, modify and write existing spreadsheets. Altogether it provides:

* low level structures for those with special needs

* an event model API for efficient read-only access

* a full user model API for creating, reading and modifying XLS files

Creating an Excel file with POI

Let's look at the basics of what is needed to create an Excel file. First, we need to create a workbook and add a sheet to it. We will then need to add values for cells, formulas and the like. The program presented in Figure 1 illustrates how we can do this using POI.

Figure 1: The code required to create an Excel file using POI

The first thing that this program does is create a new HSSFWorkbook (in line 14). We then create a sheet from this workbook (in line 15). Finally, we obtain the first row in the sheet (in line 16). Note that in POI rows and columns are numbered from zero. Thus, cell A1 in Excel is obtained from row 0, cell 0 (in that row).
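As a rough sketch of these three steps (this is not the program from Figure 1, so its line numbers do not match the ones cited above), the setup might look like the following. The class name WorkbookSetupSketch is illustrative, and the sheet is named explicitly here so that it matches the name used in the text.

```java
import org.apache.poi.hssf.usermodel.HSSFRow;
import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;

// Illustrative sketch only; the class name is an assumption, not part of POI.
class WorkbookSetupSketch {

    // Builds a workbook holding one sheet, one row and one empty cell.
    static HSSFWorkbook build() {
        HSSFWorkbook workbook = new HSSFWorkbook();
        HSSFSheet sheet = workbook.createSheet("Sheet1"); // name chosen to match the text
        HSSFRow firstRow = sheet.createRow(0);            // row 0 is the row Excel labels 1
        firstRow.createCell((short) 0);                   // cell 0 in row 0 is Excel's A1
        return workbook;
    }

    public static void main(String[] args) {
        HSSFWorkbook workbook = build();
        System.out.println("Sheets in workbook: " + workbook.getNumberOfSheets());
    }
}
```

Nothing is written to disk at this point; the workbook exists only in memory until it is written to an output stream.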

We now have a workbook object, with a single sheet in it (called Sheet1). In turn the sheet contains a single row.

Lines 18 to 25 now create a set of cells in that row to hold the headings for each column in the sheet. Lines 27 to 35 provide data for our very simple spreadsheet. In all cases we obtain the cells in the sheet by accessing the appropriate row element and retrieving a cell from within that row (note that the createCell(short) method takes a short value rather than an int - we thus need to cast to a short when calling this method using an integer literal).

To set the value within the cell we use the setCellValue method. This is an overloaded method, which can take a boolean, a String, a double, a Date or a Calendar object. It can thus represent most types of data held in a spreadsheet. It also helps define the type of the data (e.g. cells set with a string will be textual, whereas cells set with a number will be numeric).
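A minimal sketch of the overloads in use, assuming the HSSF user model classes described above. The class name CellValueSketch and the sample values are made up for illustration. The (short) casts follow the createCell(short) signature discussed in this article; later POI releases are understood to accept a plain int as well.

```java
import org.apache.poi.hssf.usermodel.HSSFRow;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;

// Illustrative sketch only; class name and sample values are assumptions.
class CellValueSketch {

    // Creates row 1 (Excel's row 2) and fills three cells using different overloads.
    static HSSFRow fillRow() {
        HSSFWorkbook workbook = new HSSFWorkbook();
        HSSFRow row = workbook.createSheet("Sheet1").createRow(1);
        row.createCell((short) 0).setCellValue("Widget"); // String overload: textual cell
        row.createCell((short) 1).setCellValue(4.0);      // double overload: numeric cell
        row.createCell((short) 2).setCellValue(true);     // boolean overload: boolean cell
        return row;
    }

    public static void main(String[] args) {
        HSSFRow row = fillRow();
        System.out.println(row.getCell((short) 0).getStringCellValue());
        System.out.println(row.getCell((short) 1).getNumericCellValue());
        System.out.println(row.getCell((short) 2).getBooleanCellValue());
    }
}
```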

One cell deviates from this: cell 3 in row 1. For this cell we use the method setCellFormula(String), passing it the string "B2*C2". The cell then holds a formula, and its value is calculated by evaluating that formula.

In our case, the formula is very simple - it multiplies the value held in cell B2 (the second cell in row 1) by the value held in cell C2 (the third cell in row 1). Notice that the cells we obtain are from row 1, cells 1 and 2, but the formula references them as cells B2 and C2. Also notice that the formula does not include the "=" at the start - this will be automatically added by POI.
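The mapping between POI's zero-based indices and the references used inside a formula string can be made concrete with a small helper. This is plain Java and not part of POI; the class and method names (CellRefSketch, toA1) are hypothetical.

```java
// Hypothetical helper, not part of POI: converts zero-based row and column
// indices into the A1-style reference used inside a formula string.
class CellRefSketch {

    static String toA1(int rowIndex, int columnIndex) {
        StringBuilder letters = new StringBuilder();
        int n = columnIndex + 1;              // switch to 1-based column numbering
        while (n > 0) {
            int remainder = (n - 1) % 26;     // 0 maps to 'A', 25 maps to 'Z'
            letters.insert(0, (char) ('A' + remainder));
            n = (n - 1) / 26;
        }
        return letters.toString() + (rowIndex + 1); // Excel rows are 1-based
    }

    public static void main(String[] args) {
        // Row 1, cells 1 and 2 are the operands of the formula in the text.
        System.out.println(toA1(1, 1) + "*" + toA1(1, 2));
    }
}
```

With such a helper, the formula string for row 1, cells 1 and 2 comes out as "B2*C2", matching the string passed to setCellFormula.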

Once the spreadsheet has been defined it can be written out to file. This is done in lines 37-39, which create a FileOutputStream to a file called "text1.xls" and use the write method on the workbook object to write its contents to that stream. The end result is that the file saved to the file system is an Excel file that is indistinguishable from any other Excel file. In Figure 2 I have opened this file using Excel:
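The final step can be sketched as follows. Again this is not the Figure 1 program: the class name WriteWorkbookSketch and the two sample numbers are assumptions, and only the formula string and the file name "text1.xls" are taken from the text.

```java
import java.io.FileOutputStream;
import java.io.IOException;

import org.apache.poi.hssf.usermodel.HSSFRow;
import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;

// Illustrative sketch only; class name and sample values are assumptions.
class WriteWorkbookSketch {

    // Builds a workbook whose row 1 holds two numbers and a formula over them.
    static HSSFWorkbook build() {
        HSSFWorkbook workbook = new HSSFWorkbook();
        HSSFSheet sheet = workbook.createSheet("Sheet1");
        HSSFRow row = sheet.createRow(1);
        row.createCell((short) 1).setCellValue(4.0);        // B2 (sample value)
        row.createCell((short) 2).setCellValue(2.5);        // C2 (sample value)
        row.createCell((short) 3).setCellFormula("B2*C2");  // D2, no leading "="
        return workbook;
    }

    public static void main(String[] args) throws IOException {
        HSSFWorkbook workbook = build();
        // Creates text1.xls in the current working directory.
        FileOutputStream out = new FileOutputStream("text1.xls");
        try {
            workbook.write(out);
        } finally {
            out.close(); // always release the file handle, even if write fails
        }
    }
}
```

Closing the stream in a finally block is a design choice made here so that the file handle is released even when the write fails.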

Figure 2: What the Excel file generated by the program in Figure 1 looks like in Excel
