I have already written an article about how fast arrays can be compared to working directly on Excel tables. You know that. You also know that arrays need some RAM to exist. But can we increase array performance by splitting the array, and decrease RAM usage at the same time? In this article I will show you whether it is possible.
In this post I will go through data preparation, the first approach (one big array), the second approach (a split array), the test results and a summary.
Random data for tests
First of all, I generated some random data for myself.
Option Explicit

Sub random_data()
    ' Fill the first sheet with test data, cell by cell
    Dim i As Long, j As Long
    Dim ws As Worksheet
    Application.ScreenUpdating = False
    Set ws = ThisWorkbook.Sheets(1)
    With ws
        For i = 1 To 100000
            For j = 1 To 20
                .Cells(i, j).Value = i + j * 10000 - 10000
            Next
        Next
    End With
    Application.ScreenUpdating = True
End Sub
For the tests I started with 20 columns and extended the area up to 120 columns.
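As a side note, the cell-by-cell writes above are exactly the slow pattern the rest of this article avoids. A minimal sketch (same values, same first-sheet assumption, not part of the original tests) fills a Variant array first and writes it to the sheet in a single assignment:

Sub random_data_array()
    ' Sketch: same values as random_data, but the array is built
    ' in memory and written to the sheet in one assignment
    Dim arr() As Variant
    Dim i As Long, j As Long
    Dim ws As Worksheet
    Set ws = ThisWorkbook.Sheets(1)
    ReDim arr(1 To 100000, 1 To 20)
    For i = 1 To 100000
        For j = 1 To 20
            arr(i, j) = i + j * 10000 - 10000
        Next
    Next
    ' one write instead of 2 000 000 single-cell writes
    ws.Range(ws.Cells(1, 1), ws.Cells(100000, 20)).Value = arr
End Sub

This way there is one interaction with the worksheet instead of two million, which is usually dramatically faster.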
One big array approach
In this approach the code measures the size of the table, loads it into an array and divides each value by 2, cell by cell. After all the calculations, the new values in the array are written back into the table area.
Sub big_array_test()
    Dim i As Long, j As Long, ws As Worksheet, arr As Variant
    Dim lastRow As Long, lastCol As Long, tajm As Double
    tajm = Timer
    Set ws = ThisWorkbook.Sheets(1)
    Application.ScreenUpdating = False
    With ws
        ' measure the used range
        lastRow = .Cells(.Rows.Count, 1).End(xlUp).Row
        lastCol = .Cells(1, .Columns.Count).End(xlToLeft).Column
        ' load the whole range into one array
        arr = .Range(.Cells(1, 1), .Cells(lastRow, lastCol))
        For i = LBound(arr, 1) To UBound(arr, 1)
            For j = LBound(arr, 2) To UBound(arr, 2)
                arr(i, j) = arr(i, j) / 2
            Next
        Next
        ' write the results back in one shot
        .Range(.Cells(1, 1), .Cells(lastRow, lastCol)) = arr
    End With
    Application.ScreenUpdating = True
    Debug.Print Format(Timer - tajm, "0.00")
End Sub
Split array approach
In the second method I introduced two additional variables. The first one, x, holds the number of chunks the whole range is split into, and the second, lx, is the chunk height, calculated by dividing lastRow by x. There is also an additional For loop that processes the x smaller arrays taken out of the main data area.
Sub split_array_test()
    Dim i As Long, j As Long, k As Long, x As Long, lx As Long
    Dim ws As Worksheet, arr As Variant
    Dim lastRow As Long, lastCol As Long, tajm As Double
    tajm = Timer
    Set ws = ThisWorkbook.Sheets(1)
    Application.ScreenUpdating = False
    With ws
        lastRow = .Cells(.Rows.Count, 1).End(xlUp).Row
        lastCol = .Cells(1, .Columns.Count).End(xlToLeft).Column
        x = 5               ' number of chunks
        lx = lastRow / x    ' chunk height; assumes lastRow is divisible by x
        For k = 1 To x
            ' load one chunk of lx rows into the array
            arr = .Range(.Cells(1 + (k - 1) * lx, 1), .Cells(lx + (k - 1) * lx, lastCol))
            For i = LBound(arr, 1) To UBound(arr, 1)
                For j = LBound(arr, 2) To UBound(arr, 2)
                    arr(i, j) = arr(i, j) * 2
                Next
            Next
            ' write the chunk back
            .Range(.Cells(1 + (k - 1) * lx, 1), .Cells(lx + (k - 1) * lx, lastCol)) = arr
        Next
    End With
    Application.ScreenUpdating = True
    Debug.Print Format(Timer - tajm, "0.00")
End Sub
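The chunking above assumes lastRow divides evenly by x; otherwise the trailing rows are silently skipped (and VBA's floating-point division assigned to a Long can round unexpectedly). A hedged variant, not part of the original tests, uses integer division and lets the last chunk absorb the remainder:

Sub split_array_safe()
    ' Sketch: same split idea, but safe when lastRow is not divisible by x
    Dim i As Long, j As Long, k As Long, x As Long, lx As Long
    Dim rowStart As Long, rowEnd As Long
    Dim ws As Worksheet, arr As Variant
    Dim lastRow As Long, lastCol As Long
    Set ws = ThisWorkbook.Sheets(1)
    With ws
        lastRow = .Cells(.Rows.Count, 1).End(xlUp).Row
        lastCol = .Cells(1, .Columns.Count).End(xlToLeft).Column
        x = 5
        lx = lastRow \ x                 ' integer division: base chunk height
        If lx = 0 Then lx = lastRow      ' fewer rows than chunks
        For k = 1 To x
            rowStart = 1 + (k - 1) * lx
            rowEnd = k * lx
            If k = x Then rowEnd = lastRow   ' last chunk takes the remainder
            arr = .Range(.Cells(rowStart, 1), .Cells(rowEnd, lastCol)).Value
            For i = LBound(arr, 1) To UBound(arr, 1)
                For j = LBound(arr, 2) To UBound(arr, 2)
                    arr(i, j) = arr(i, j) / 2
                Next
            Next
            .Range(.Cells(rowStart, 1), .Cells(rowEnd, lastCol)).Value = arr
        Next
    End With
End Sub

The `\` operator is VBA's integer division, so lx is always an exact Long, and clamping rowEnd on the last pass guarantees the whole range is covered.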
Time results
I tested these two methods on a data range of 100 000 rows and 20, 60 and 120 columns, with x = 5. Additionally, for 120 columns, I tested different x values: 10, 20, 25 and 100.
For fewer than 100 000 rows and 120 columns the time differences were really small, only in the hundredths of a second, so publishing those results would be a waste of space and time.
For 100 000 rows, 120 columns and x = 5 the results were more visible. I am publishing only the averaged results here.
| method | I | II |
| --- | --- | --- |
| time [s] | 9.08 | 8.63 |
The second approach, splitting the big array into smaller ones, is faster. The code itself seems more complex and longer, but in reality it works a little faster. I could notice the same tendency on smaller data sets.
At that point I wondered how other values of the x variable would impact the time result of the second method. The best way to find out was to test it, for x equal to 10, 20, 25 and 100.
| x | 10 | 20 | 25 | 100 |
| --- | --- | --- | --- | --- |
| time [s] | 8.53 | 8.22 | 8.34 | 8.79 |
You can clearly see that the smallest averaged time was for x = 20. The difference between method I and the best result of method II is almost 1 second (0.86 s), which is almost 10%!
And it may grow further for more complex data.
RAM memory results
What's more, I also checked RAM usage. The baseline value of Microsoft Excel with this workbook was 268 MB (right after opening, with no action taken). I checked the usage for a single array (with a breakpoint set on the line creating the array) and during the run of the code, for both methods.
| Method | Single array | RUN |
| --- | --- | --- |
| I | 534 MB | 799 MB |
| II | 276 MB | 295 MB |
Not only was the 2nd method faster, it also consumed less RAM! While the 2nd approach took only 8 MB of additional memory for a single array, the first took 266 MB. During the run, the first method grew to almost 800 MB in total, while the 2nd did not even reach 300 MB.
Let’s summarize that
Splitting an array can increase performance! Yes, it can.
I think that for more complex data the time difference can improve a lot. The split array approach can also save your RAM, so not only will it increase the performance of your code, but it will also work on slower, weaker or more heavily loaded machines.
Hello Tomasz, great article, thank you! I was wondering if you had any advice for me. I have a file with around 470,000 lines and 30 or so columns. I have a VBA script that sorts and deletes data in two different columns. Currently it takes 30-45 minutes to complete… I have tried optimizing as much as possible but I cannot reduce it more. Do you have any suggestions? Thank you!
Hello Paul
I'd love to, but it would be helpful if you gave a bigger picture here, for example what kind of actions you perform, what kind of loop you use, or anything else that shows what the situation is.
Sure! One of the columns is made up of codes in the form 60000. Some start with 1, some with 9, etc. For this column I have to delete all rows that do not start with the number 6, with the exception of 620400. I do not use a loop for this. The second column is made up of different names, and I created a list of the ones I need to keep, telling the code to delete anything that is not on the list. I don't know if this is enough information. If you need more, please let me know! Thank you for any help!
You do not use a loop for this, but what do you use?