excel – How to get MATLAB xlsread to read until a last row of a contiguous <<data-range>>?

Posted by: admin May 14, 2020

Questions:

I want to use xlsread in MATLAB to read an Excel file.

While I know which columns I want to read from, and which row I want to start reading from, the file could contain any number of rows.

Is there a way to do something like:

array = xlsread( 'filename', 'D4:F*end*' );              %% OR ANY SIMILAR SYNTAX

Where F*end* is the last row in column F?

Answers:

Yes. Try this:

FileFormat = '.xlsx';                                   % or '.xls'; choose one
filename   = strcat( 'Filename you desire', FileFormat );

array      = xlsread( filename );                       % With this syntax MATLAB
                                                        % imports all numeric
                                                        % data from the first
                                                        % worksheet of the file

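Then you can look at the size of the matrix to refine the import. Note that xlsread ignores outer rows and columns that contain no numeric data, so nRows equals the sheet's last row number only when the numeric data starts in row 1.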

[nRows,nCols] = size( array );

Then, if you want to import just part of the matrix, you can do this:

NewArray = xlsread( filename, strcat( 'initial cell',   ...
                                      ':',              ...
                                      'ColumnLetter',   ...
                                      num2str( nRows )  ...
                                      )                 ...
                   );
% for your case:

NewArray = xlsread( filename, strcat( 'D4', ':', 'F', num2str( nRows ) ) );
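As a small worked example of the row arithmetic, the sketch below assumes the numeric data starts in sheet row 4 and that the first import returned 120 rows. Both numbers and the names firstNumericRow, lastRow and rangeStr are made up for illustration; sprintf is used here simply as an alternative to strcat plus num2str.

```matlab
% Hypothetical values for illustration only.
firstNumericRow = 4;      % assumed: sheet row where the numeric data starts
nRows           = 120;    % e.g. taken from size( array, 1 )

% Last sheet row that holds data, counting from the first numeric row.
lastRow  = firstNumericRow + nRows - 1;

% Build the range string for columns D to F.
rangeStr = sprintf( 'D%d:F%d', firstNumericRow, lastRow );

% NewArray = xlsread( filename, rangeStr );   % needs a real workbook
```

If the numeric data really does start in row 1, firstNumericRow is 1 and lastRow reduces to nRows, which is the case the answer above relies on.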

Hope this helps.

Answer:

In .xls files, 65536 appears to be the maximum number of rows. You can use this number with F, which effectively tells MATLAB to read until the end of the sheet. That is all I could gather from a little digging, and this trick/hack seems to work all right.

To sum up, this seems to do the trick for xls files:

array = xlsread('filename', 'D4:F65536')  

For xlsx files, the limit seems to be 1048576, so the code would change to:

array = xlsread('filename', 'D4:F1048576')  

External source to confirm the limit on the number of rows:

Excel versions 97-2003 (Windows) have a file extension of XLS and the
worksheet size is 65,536 rows and 256 columns. In Excel 2007 and 2010
the default file extension is XLSX and the worksheet size is 1,048,576
rows and 16,384 columns.
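Reading down to the format limit returns everything up to the last numeric row, including any data that sits below a blank gap. If only the first contiguous block is wanted, one possible post-processing step is sketched below. It assumes blank cells come back as NaN in the numeric output and uses a hand-written matrix (block) in place of a real xlsread result.

```matlab
% Stand-in for the result of xlsread( 'filename', 'D4:F65536' ):
% two data rows, one blank row (NaN), then unrelated data further down.
block = [   1   2   3
            4   5   6
          NaN NaN NaN
            7   8   9 ];

% A row counts as empty when every cell in it is NaN.
emptyRows = all( isnan( block ), 2 );
firstGap  = find( emptyRows, 1 );

% Keep only the rows above the first empty row, if there is one.
if isempty( firstGap )
    contiguous = block;
else
    contiguous = block( 1:firstGap-1, : );
end
```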

Answer:

You could read column by column:

col1= xlsread( 'filename', 'D:D' );
col2= xlsread( 'filename', 'E:E' );
col3= xlsread( 'filename', 'F:F' );
...

Don’t provide row numbers (such as D12:D465); MATLAB will deal with D:D as you would expect. col1, col2 and col3 will have different sizes depending on how much data was extracted from each column.

I haven’t tried something like this, though, so I don’t know if it would work:

    colAll= xlsread( 'filename', 'D:F' );
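Because the per-column results can differ in length, they cannot be concatenated directly. A minimal sketch of padding them with NaN into one matrix follows; the three short vectors stand in for real xlsread output, and it assumes every column starts at the same sheet row.

```matlab
% Stand-ins for the per-column results of xlsread.
col1 = [ 1; 2; 3 ];
col2 = [ 4; 5 ];
col3 = [ 6; 7; 8; 9 ];

cols   = { col1, col2, col3 };
maxLen = max( cellfun( @numel, cols ) );    % length of the longest column

% Pre-fill with NaN, then copy each column into the top of its slot.
combined = NaN( maxLen, numel( cols ) );
for k = 1:numel( cols )
    combined( 1:numel( cols{k} ), k ) = cols{k};
end
```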

Answer:

No, but…

MATLAB has neither a documented nor an undocumented feature for doing this directly.

The closest you can get with direct MATLAB support is:

___ = xlsread(filename,-1) opens an Excel window to interactively select data.

      Select the worksheet, drag and drop the mouse over the range you want,
      and click OK.
      This syntax is supported only on Windows systems with Excel software.

Still, how can one approach the task efficiently and in a future-proof way?

The “blind” black-box approach is to first find the boundary of the contiguous area where your data is present. Start with a forward-stepping phase: test the cell at aRowToTEST = ( aRowToStartFROM + aRowNumberDistanceToTEST ) and, if it contains a number, set aLastNonEmptyROW = aRowToTEST; double aRowNumberDistanceToTEST and repeat.

If aRowToTEST points beyond the format-specific maximum row number, set aRowToStartFROM = aLastNonEmptyROW; reset the step to aRowNumberDistanceToTEST = 1; and continue the doubling iterations from there. If this hits the limit again with the step == 1, your sheet contains data down to its last row ( it ends on the format-specific “edge” ).

Once the tested cell is empty/NaN, stop the forward-stepping phase, set aFirstEmptyROW = aRowToTEST; and start a standard back-stepping ( bisection ) phase that halves the interval between aFirstEmptyROW and aLastNonEmptyROW, the last cell known to contain a number.

In each back-stepping iteration, set aBackSteppingSTEP = ( aFirstEmptyROW - aLastNonEmptyROW )/2; aRowToTEST = aFirstEmptyROW - aBackSteppingSTEP; ( rounded to a whole row ) and test that cell.

If the cell under test contains a valid value, move the aLastNonEmptyROW boundary to aRowToTEST; if not, move the aFirstEmptyROW boundary there instead.

Iterate until the step is < 1; at that point you have found the boundary of the contiguous data area.

This needs only on the order of log2( N ) single-cell tests, which can be far more efficient than a blind import of the whole sheet, and it works with the 64k limit, the 1M limit, or any future upper limit on the row number.

Having the boundary, simply array = xlsread( 'filename', sprintf( 'D4:F%d', aLastNonEmptyROW ) );
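The forward-doubling and back-stepping phases described above can be sketched roughly as follows. This is an illustration under stated assumptions, not a tested import routine: isFilled is a hypothetical stand-in for a single-cell test, simulatedLastRow fakes a sheet whose data ends in row 137, and the data is assumed to be contiguous from startRow downwards. As a simplification, the sketch does not restart the doubling with a step of 1 when a probe overshoots the row limit; it treats the row just past the limit as empty and bisects from there.

```matlab
% --- Assumptions for illustration (not from a real workbook) -------------
simulatedLastRow = 137;                    % pretend the data ends here
maxRow           = 1048576;                % xlsx row limit; 65536 for xls
startRow         = 4;                      % first data row, known to be filled

% Hypothetical single-cell test. With a real file this might look like
%   isFilled = @( r ) ~isempty( xlsread( filename, sprintf( 'F%d:F%d', r, r ) ) );
% (without a sheet argument, the range needs both corners and a colon).
isFilled = @( r ) r <= simulatedLastRow;

% --- Forward-stepping phase: double the step until an empty cell ---------
aLastNonEmptyROW = startRow;
step             = 1;
while true
    aRowToTEST = aLastNonEmptyROW + step;
    if aRowToTEST > maxRow
        aFirstEmptyROW = maxRow + 1;       % nothing can exist past the limit
        break
    end
    if isFilled( aRowToTEST )
        aLastNonEmptyROW = aRowToTEST;
        step             = 2 * step;
    else
        aFirstEmptyROW = aRowToTEST;
        break
    end
end

% --- Back-stepping phase: bisect between last filled and first empty -----
while aFirstEmptyROW - aLastNonEmptyROW > 1
    aRowToTEST = floor( ( aLastNonEmptyROW + aFirstEmptyROW ) / 2 );
    if isFilled( aRowToTEST )
        aLastNonEmptyROW = aRowToTEST;
    else
        aFirstEmptyROW = aRowToTEST;
    end
end

rangeStr = sprintf( 'D%d:F%d', startRow, aLastNonEmptyROW );
% array = xlsread( filename, rangeStr );   % needs a real workbook
```

Keep in mind that each real xlsread call has its own overhead, so whether this beats a single large import depends on the file and the platform.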