This post is part of a series. If you can’t understand what is detailed below, or can’t find the information you’re looking for, take a look at the index page.
As explained previously, each Worksheet object Win32OLE uses contains many rows, columns, and cells. You can access these by calling them directly (assuming you already have a ‘ws’ variable referring to a worksheet as explained here):
second_row = ws.Rows(2) # the range containing the worksheet's second row
second_column = ws.Columns(2) # the range containing the worksheet's second columns
a_cell = ws.Cells(1,2) # the cell on the first row, second column
cell_b2 = ws.Range("B2") # cell B2 (more on ranges below)
Take the time to note a few things:
- the ‘Rows’, ‘Columns’, and ‘Cells’ attributes are all plural
- indexes start at 1: ws.Rows(1) will correspond to the topmost row in the worksheet
- the code above will return a reference to a row/column/cell, and not the actual data contained
Acessing the data
So how do we actually access the data? Use the ‘Value’ attribute:
cell_content = ws.Cells(1,2).Value # put the cell's data into a variable
new_content = "Data test"
ws.Cells(1,2).Value = new_content # set the cell's data to "Data test"
The same can be done for ranges. For example, let’s copy the header information contained in the first row of one worksheet to another:
header_data = ws.Rows(1).Value
# you could do some other computation here
a_different_ws.Rows(1).Value = header_data
The UsedRange attribute
Before we move onto other things we can do to our cells, let me quickly tell you about a useful attribute (especially for loops): UsedRange. This contains the area of the work sheet that is actually used (i.e. the biggest row and column number with data in it). So to iterate along each row, we would do the following:
for row in 1..ws.UsedRange.Rows.Count do # note that we are including the last value, due to the way Excel indexes rows
ws.Cells(row,1).Value = "This is row number #{row}."
end
Remember the ‘Count’ attribute we used earlier? You can use it here to count the rows that are used in the worksheet.
Manipulating cells and ranges
Let’s get back on track: what else can you do with cells and ranges? Plenty of useful things:
ws.Rows(1).Font.Bold = true # set all font in the topmost row to bold
ws.Columns(1).Interior.Color = 6 # make the first column yellow (see comment after code)
ws.Columns(3).NumberFormat = "#,##0" # set the 3rd column's format to display numbers with commas
ws.Columns(4).NumberFormat = "mm/dd/yyyy" # set the 4th column's format to display dates in a specific format
ws.Columns(5).NumberFormat = "0.0%" # set the 5th column's format to display percentages with 1 decimal
ws.Rows(5).Delete # will delete the 5th row (see comment after code)
Note that the background color number are documented here. Also, bear in mind that after deleting the 5th row, row number 5 is what was previously the 6th row. This detail is particularly important if you are deleting rows or columns in a loop: you might have to adjust the row pointer that is used to fetch the next range. For example, to delete all rows where the first cell contains ‘zz’:
correction = 0
for i in 1..ws.UsedRange.Rows.Count do
row = i - correction
if ws.Cells(row,1).Value =~ /zz/
ws.Rows(row).Delete
correction += 1 # correct the row pointer to account for the deleted row
end
end
Working with formulas
Win32OLE will also give you access to the formula a cell contains. Therefore, you could add a sum to each column in a worksheet this way:
last_row = ws.UsedRange.Rows.Count
for col in ws.UsedRange.Columns.Count do
ws.Cells(last_row+1, col)['Formula'] = "=SUM(#{ws.Cells(1,col).Address}:#{ws.Cells(last_row,col).Address})"
end
As you’ve probably guessed, the ‘Address’ attribute will return a given cell’s address: ws.Cells(2,2).Address will return “$B$2”.
Working with ranges
We touched upon ranges earlier, and they are handy when dealing with blocks of data. If you wish data from ‘first_row’ to ‘last_row’ from the ‘ws_data’ worksheet to another worksheet called ‘ws_special’, for example, you can simply use a range:
ws_special.Range(ws_special.Rows(1), ws_special.Rows(last_row - first_row)).Value = ws_data.Range(ws_data.Rows(first_row), ws_data.Rows(last_row)).Value # copy the values from one range to another
ws_data.Range(ws_data.Rows(first_row), ws_data.Rows(last_row)).Delete # delete the original range
Notice that to use the range object, you give to arguments referencing objects of the same type: two rows (as in our example), two columns, or two cells. If two cells are supplied as arguments, the range will be the rectangular area having the first cell as its upper left corner and the second cell as its lower right corner.
Ranges can also be sorted:
ws.UsedRange.Sort('Header'=>1, 'Key1'=>ws.Range("B1") , 'Order1'=>1)
In this example, we’re sorting the entire worksheet (as indicated by the range being UsedRange), according to the data in the second column (key1 argument). This column has a header and we are sorting it in ascending order. (Note: in Excel 2003, you can sort using up to 3 criteria. Simply add the appropriate ‘Key’ and ‘Order’ arguments to the method call: ‘Key2’, ‘Order2’, etc.)
Microsoft’s documentation indicates the argument for the order is ‘xlSortOrder’: not very helpful… To be able to use functions in your Win32OLE code, you’ll need to use the numerical equivalents to these constants. You have several options to get them:
- Semi-automatically loading Excel’s constants into a class
- Creating a macro and setting a cell’s value to the constant for which you want to know the numerical equivalent
- Searching the web
By now, you should be well on your way to doing cool stuff with Win32OLE and Ruby. We can now move on to formatting the worksheets for the screen and printing.