Win32OLE and Ruby: working with cells and data
This article is part of a series on processing data in Excel with Ruby and Win32ole.
As explained previously, each Worksheet object Win32OLE uses contains many rows, columns, and cells. You can access these by calling them directly (assuming you already have a ‘ws’ variable referring to a worksheet as explained here):
# the range containing the worksheet's second row
second_row = ws.Rows(2)
# the range containing the worksheet's second columns
second_column = ws.Columns(2)
# the cell on the first row, second column
a_cell = ws.Cells(1,2)
# cell B2 (more on ranges below)
cell_b2 = ws.Range("B2")
Take the time to note a few things:
-
the
Rows
,Columns
, andCells
attributes are all plural -
indexes start at 1:
ws.Rows(1)
will correspond to the topmost row in the worksheet - the code above will return a reference to a row/column/cell, and not the actual data contained
Acessing the data
So how do we actually access the data? Use the Value
attribute:
# put the cell's data into a variable
cell_content = ws.Cells(1,2).Value
new_content = "Data test"
# set the cell's data to "Data test"
ws.Cells(1,2).Value = new_content
The same can be done for ranges. For example, let’s copy the header information contained in the first row of one worksheet to another:
header_data = ws.Rows(1).Value
# you could do some other computation here
a_different_ws.Rows(1).Value = header_data
The UsedRange attribute
Before we move onto other things we can do to our cells, let me quickly tell you about a useful attribute (especially for loops): UsedRange
. This contains the area of the work sheet that is actually used (i.e. the biggest row and column number with data in it). So to iterate along each row, we would do the following:
# note that we are including the last value,
# due to the way Excel indexes rows
for row in 1..ws.UsedRange.Rows.Count do
ws.Cells(row,1).Value = "This is row number #{row}."
end
Remember the Count
attribute we used earlier? You can use it here to count the rows that are used in the worksheet.
Manipulating cells and ranges
Let’s get back on track: what else can you do with cells and ranges? Plenty of useful things:
# set all font in the topmost row to bold
ws.Rows(1).Font.Bold = true
# make the first column yellow (see comment after code)
ws.Columns(1).Interior.Color = 6
# set the 3rd column's format to display numbers with commas
ws.Columns(3).NumberFormat = "#,##0"
# set the 4th column's format to display dates in a specific format
ws.Columns(4).NumberFormat = "mm/dd/yyyy"
# set the 5th column's format to display percentages with 1 decimal
ws.Columns(5).NumberFormat = "0.0%"
# will delete the 5th row (see comment after code)
ws.Rows(5).Delete
Note that the background color number are documented here. Also, bear in mind that after deleting the 5th row, row number 5 is what was previously the 6th row. This detail is particularly important if you are deleting rows or columns in a loop: you might have to adjust the row pointer that is used to fetch the next range. For example, to delete all rows where the first cell contains ‘zz’:
correction = 0
for i in 1..ws.UsedRange.Rows.Count do
row = i - correction
if ws.Cells(row,1).Value =~ /zz/
ws.Rows(row).Delete
# correct the row pointer to account for the deleted row
correction += 1
end
end
Working with formulas
Win32OLE will also give you access to the formula a cell contains. Therefore, you could add a sum to each column in a worksheet this way:
last_row = ws.UsedRange.Rows.Count
for col in ws.UsedRange.Columns.Count do
ws.Cells(last_row+1, col)['Formula'] =
"=SUM(#{ws.Cells(1,col).Address}:#{ws.Cells(last_row,col).Address})"
end
As you’ve probably guessed, the Address
attribute will return a given cell’s address: ws.Cells(2,2).Address
will return "$B$2"
.
Working with ranges
We touched upon ranges earlier, and they are handy when dealing with blocks of data. If you wish data from
first_row
to last_row
from the ws_data
worksheet to another worksheet called ws_special
, for example, you can simply use a range:
# copy the values from one range to another
ws_special.Range(ws_special.Rows(1), ws_special.Rows(last_row - first_row)).Value =
ws_data.Range(ws_data.Rows(first_row), ws_data.Rows(last_row)).Value
# delete the original range
ws_data.Range(ws_data.Rows(first_row), ws_data.Rows(last_row)).Delete
Notice that to use the range object, you give to arguments referencing objects of the same type: two rows (as in our example), two columns, or two cells. If two cells are supplied as arguments, the range will be the rectangular area having the first cell as its upper left corner and the second cell as its lower right corner.
Ranges can also be sorted:
ws.UsedRange.Sort('Header'=>1, 'Key1'=>ws.Range("B1") , 'Order1'=>1)
In this example, we’re sorting the entire worksheet (as indicated by the range being UsedRange
), according to the data in the second column (key1
argument). This column has a header and we are sorting it in ascending order. (Note: in Excel 2003, you can sort using up to 3 criteria. Simply add the appropriate ‘Key’ and ‘Order’ arguments to the method call: ‘Key2’, ‘Order2’, etc.)
Microsoft’s documentation indicates the argument for the order is ‘xlSortOrder’: not very helpful… To be able to use functions in your Win32OLE code, you’ll need to use the numerical equivalents to these constants. You have several options to get them:
- Semi-automatically loading Excel’s constants into a class
- Creating a macro and setting a cell’s value to the constant for which you want to know the numerical equivalent
- Searching the web
By now, you should be well on your way to doing cool stuff with Win32OLE and Ruby. We can now move on to formatting the worksheets for the screen and printing.
This article is part of a series on processing data in Excel with Ruby and Win32ole.
Would you like to see more Elixir content like this? Sign up to my mailing list so I can gauge how much interest there is in this type of content.