Split array to increase performance

I already made article about how fast arrays can be comparing to Excel tables. You know that. You know also that arrays need some RAM memory to exist. But can we increase array performance with its split and descrease RAM usage? In this article I will show You if it is possible.

In this post I will go through data preparation, first approach (1 big array), second approach (split array), results of tests and results summary.

Random data for tests

First of all, I generated for myself random data.

Option Explicit

Sub random_data()

Dim i As Long, j As Long
Dim ws As Worksheet

Application.ScreenUpdating = False

Set ws = ThisWorkbook.Sheets(1)

With ws
    For i = 1 To 100000
        For j = 1 To 20
            .Cells(i, j).Value = i + j * 10000 - 10000
        Next
    Next
End With

Application.ScreenUpdating = True

End Sub

For the tests I started with 20 columns and ended with the area of 120 columns.

1 big array approach

In this approach the code is measuring the size of the table, putting it into array and cell by cell divide their values by 2. After all calculations new values in array are coming back into the table area.

Sub big_array_test()

Dim i As Long, j As Long, ws As Worksheet, arr As Variant, lastRow As Long, lastCol As Long, tajm As Double

tajm = Timer

Set ws = ThisWorkbook.Sheets(1)

Application.ScreenUpdating = False

With ws
    lastRow = .Cells(.Rows.Count, 1).End(xlUp).Row
    lastCol = .Cells(1, .Columns.Count).End(xlToLeft).Column
    
    arr = .Range(.Cells(1, 1), .Cells(lastRow, lastCol))
    
    For i = LBound(arr, 1) To UBound(arr, 1)
        For j = LBound(arr, 2) To UBound(arr, 2)
            arr(i, j) = arr(i, j) / 2
        Next
    Next
    
    .Range(.Cells(1, 1), .Cells(lastRow, lastCol)) = arr
    
End With

Application.ScreenUpdating = True

Debug.Print Format(Timer - tajm, "0.00")

End Sub

split array approach

In the second method I implemented 2 additional variables. The first one – x – stands for the number of loops to iterate the whole range and the second – lx – is constant value made of lastRow and x division. There is also additional for loop for the x small arrays out of main data area.

Sub split_array_test()

Dim i As Long, j As Long, k As Long, x As Long, lx As Long, ws As Worksheet, arr As Variant, lastRow As Long, lastCol As Long, tajm As Double

tajm = Timer

Set ws = ThisWorkbook.Sheets(1)

Application.ScreenUpdating = False

With ws
    lastRow = .Cells(.Rows.Count, 1).End(xlUp).Row
    lastCol = .Cells(1, .Columns.Count).End(xlToLeft).Column
    
    x = 5
    lx = lastRow / x
    
    For k = 1 To x
    
        arr = .Range(.Cells(1 + (k - 1) * lx, 1), .Cells(lx + (k - 1) * lx, lastCol))
        
        For i = LBound(arr, 1) To UBound(arr, 1)
            For j = LBound(arr, 2) To UBound(arr, 2)
                arr(i, j) = arr(i, j) * 2
            Next
        Next
        
        .Range(.Cells(1 + (k - 1) * lx, 1), .Cells(lx + (k - 1) * lx, lastCol)) = arr
        
    Next
    
End With

Application.ScreenUpdating = True

Debug.Print Format(Timer - tajm, "0.00")

End Sub

Time results

I tested those 2 methods with the data range of 100 000 rows and 20, 60 & 120 columns for x =5. Also for 120 columns with different x values – 10,20,25 & 100.

For less than 100 000 rows and 120 columns the time differences were really small, hundredth places differences, so publishing the results for them would be waste of place and time.

For the 100 000 rows, 120 columns and x = 5 the results were “more visible”. And I’m putting here only the averaged results.

methodIII
time [s]9.088.63

Time result for the second approach, to split big array into smaller ones, is faster. The code itself seemed to be more complex and longer, but in reality it works a little bit faster. I could notice that tendency in smaller data.

In that point I considered about how other x variable values will impact on the time result of 2nd method. So the best way was to test that, for x equals 10,20,25 and 100.

x102025100
time [s]8.538.228.348.79

You can clearly see that the smallest avaraged time was for x = 20. Also the difference between I method and best result for II method is almost 1 second (0.86s) which is almost 10%!

And it may grow in case of more complex data.

RAM memory results

Whats more, I checked as well RAM memory usage. The basic value for Microsoft Excel for this workbook was 268 MB (after opening, no action). I checked usage for single array (breakpoint on array creation) and during work of code for both methods.

 MethodsSingle arrayRUN
I534 MB799 MB
II276 MB295MB

Not only the 2nd method was faster, it was less RAM memory consuming too! While the 2nd approach took only 8 MB of additional memory, the first took 266 MB in Single array. In RUN mode the first method increased to almost 800 MB all together, while 2nd got not even to 300 MB.

Let’s summarize that

Splitting array can increase performance! Yes, it can do it.

I think that in case of more complex data that time result can improve, a lot. Also to split array approach can save your RAM memory, so not only it will increase the performance of your code, but also it will work also on the slower/weaker or more loaded units.

Author: Tomasz Płociński

I'm very advanced in VBA, Excel, also easily linking VBA with other Office applications (e.g. PowerPoint) and external applications (e.g. SAP). I take part also in RPA processes (WebQuery, DataCache, IBM Access Client Solutions) where I can also use my SQL basic skillset. I'm trying now to widen my knowledge into TypeScript/JavaScript direction.

4 thoughts on “Split array to increase performance”

  1. Hello Tomasz, great article thank you! I was wondering if you had any advice for me I have a file with around 470,000 lines and 30 or so columns. I have a VBA script that sorts and deletes data in two different columns. Currently it takes 30-45 minutes to complete… I have tried optimizing as much as possible but I cannot reduce it more. Do you have any suggestions? Thank you!

    1. Hello Paul
      I’d love to, but if You provide here kinda bigger picture, for example what kind of action do You perform there, what kind of loop or anything else it would be helpful to know what is the situation.

      1. Sure! One of the columns is made up of codes in form 60000. Some start with 1 some with 9 etc. For this column I have to delete all rows that do not start with number 6 with the exception of 620400. I do not use a loop for this. The second column is made up of different names and I created a list of the ones I need to keep telling the code to delete anything that is not on the list. I don’t know if this is enough information. If you need more please let me know! Thank you for any help!

Leave a Reply

Your email address will not be published. Required fields are marked *