
I have an excel sheet of 85,038 rows, how do I randomly select 10% of these?

Posted by: admin March 9, 2020

Questions:

There are 5 columns (first name, email, userid, app name), and I want to randomly select 10% of these rows and eventually export them to a CSV file while keeping the column headers I listed above. Thanks a million.

How to & Answers:

I don’t know how random you need this to be, but you could add a helper column containing =RANDBETWEEN(1,85038), copy it down all the rows, sort by that column, and take the first 8,504 rows. That should give quite an ‘arbitrary’ result.
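The helper-column idea above can be sketched outside Excel as well. The following is a minimal Python illustration, not the answerer's method: it swaps RANDBETWEEN for random floats (so repeated keys are practically impossible), and the function name `sample_by_random_key` and the stand-in row data are assumptions made for the example.

```python
import random

def sample_by_random_key(rows, fraction=0.10, seed=None):
    """Tag each row with a random key, sort by the key, keep the first fraction."""
    rng = random.Random(seed)
    # Helper column: one random float per row. The index i breaks any tie,
    # so the rows themselves are never compared during the sort.
    keyed = [(rng.random(), i, row) for i, row in enumerate(rows)]
    keyed.sort()  # equivalent to sorting the sheet by the helper column
    k = round(len(rows) * fraction)  # 85,038 rows -> 8,504
    return [row for _, _, row in keyed[:k]]

rows = list(range(1, 85039))  # stand-in for the 85,038 data rows
picked = sample_by_random_key(rows, seed=0)
print(len(picked))  # 8504
```

Passing a seed makes the selection repeatable, which a volatile worksheet function does not offer.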

Answer:

Are you familiar with SQL and the Microsoft Query functionality in Excel (Data ->…-> From Microsoft Query)?

If so, use this query:

( SELECT "first name, email, userid, app name" )
UNION   
( SELECT TOP 8503 t.[first name] & "," & t.[email] & "," &  t.[userid] & "," & t.[app name] 
FROM [Sheet1$] AS t ORDER BY RND() )

Then copy and paste the result into an empty text file and save it as CSV.

You can also use my SQL add-in for this: http://blog.tkacprow.pl/?page_id=130

EDIT 1: I assumed that “Sheet1” is the name of your worksheet
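For comparison, the same "header row plus a random 10% of rows" output can be produced with a short script once the sheet has been exported. This is a rough sketch under assumptions: the function name `write_sample_csv` and the made-up rows are hypothetical, and the sample size is truncated the way TOP 8503 is for 85,038 rows.

```python
import csv
import io
import random

HEADER = ["first name", "email", "userid", "app name"]

def write_sample_csv(rows, out, fraction=0.10, seed=None):
    """Write the header plus a random fraction of rows to a file-like object."""
    rng = random.Random(seed)
    k = int(len(rows) * fraction)  # truncates: 85,038 rows -> 8,503
    writer = csv.writer(out)
    writer.writerow(HEADER)
    writer.writerows(rng.sample(rows, k))  # sampling without replacement
    return k

# Made-up rows purely for illustration.
rows = [[f"name{i}", f"user{i}@example.com", str(i), "app"] for i in range(50)]
buf = io.StringIO()
written = write_sample_csv(rows, buf, seed=1)
print(written)  # 5
```

To write a real file, pass `open("sample.csv", "w", newline="")` instead of the in-memory buffer. Unlike joining fields with "," by hand, `csv.writer` quotes any field that itself contains a comma.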

Answer:

Here’s a possible solution using an array formula.
Suppose you have data in Column A (in this example I used only 100 values).


Now, in C2, type the following formula (credit to Oscar):

=IF(ROW(A1)<=0.1*COUNTA($A$2:$A$101),INDEX($A$2:$A$101, LARGE(MATCH(ROW($A$2:$A$101), ROW($A$2:$A$101))*NOT(COUNTIF($C$1:C1, $A$2:$A$101)), RANDBETWEEN(1,ROWS($A$2:$A$101)-ROW(A1)+1))),"")

Confirm the formula with Ctrl+Shift+Enter so that it is entered as an array formula.
Pressing just Enter will return #N/A.
Then, to get the rest of the values, drag the formula down.
In this example, I auto-filled down to C20.

Note: RANDBETWEEN is volatile, so the formula recalculates every time you change something. If you need to return about 8,000 values, that is a lot of recalculation, and it may take a while.
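The logic of the array formula, as I read it, is to pick one not-yet-used value at random for each output row. A minimal Python sketch of that idea follows; the function name `draw_without_replacement` is an assumption, and the sketch tracks used positions rather than used values (the formula's COUNTIF excludes by value, so the two differ if the source list contains duplicates).

```python
import random

def draw_without_replacement(values, fraction=0.10, seed=None):
    """Repeatedly pick a random item from those not yet chosen."""
    rng = random.Random(seed)
    remaining = list(values)          # copy, so the caller's list is untouched
    k = int(len(values) * fraction)   # 100 values -> 10 picks
    picked = []
    for _ in range(k):
        # Comparable to RANDBETWEEN(1, number of values still available).
        idx = rng.randrange(len(remaining))
        picked.append(remaining.pop(idx))
    return picked

data = [f"item{i}" for i in range(1, 101)]  # stand-in for A2:A101
result = draw_without_replacement(data, seed=42)
print(len(result))  # 10
```

Because the picks are computed once, nothing is recalculated afterwards, which sidesteps the volatility issue noted above.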

Answer:

I personally used a handy add-on for Microsoft Excel 2016 (64-bit). It is called Kutools.
You can freely download and use it via this link:

Download Link (for both 32 Bit & 64 Bit)
Kutools Website

After downloading and installing it, you can select a random set of rows from the Kutools tab -> Range -> Sort Range Randomly -> Select.
There you can enter the number of rows you want to select, and that’s it.

Fig of Kutool tab
Fig of Select Tab

Answer:


I think this will help to generate any percentage from a list.

If Col A has your list (here in A2:A37):
In Col B: =RAND() and fill down.
In Col C:
=IF(ROW()-1>10%*COUNTA($A$2:$A$37),"",INDEX($A$2:$A$37,RANK.AVG(B2,$B$2:$B$37,0),1)) and fill down. ROW()-1 numbers the output rows from 1, because the data starts in row 2. Only the proportion you want will appear in the list.
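The reason this approach yields no repeats is that the ranks of the random numbers form a permutation of 1..N, so the first few rows each index a different item. A small Python sketch of that reasoning is below; the function name `sample_by_rank` is hypothetical, and ties between random values are assumed not to occur.

```python
import random

def sample_by_rank(items, fraction=0.10, seed=None):
    """Use each row's rank among random numbers as an index into the list."""
    rng = random.Random(seed)
    rand_col = [rng.random() for _ in items]  # plays the role of Col B
    # Rank 1 goes to the largest value (descending order).
    order = sorted(range(len(items)), key=lambda i: rand_col[i], reverse=True)
    rank = [0] * len(items)
    for r, i in enumerate(order, start=1):
        rank[i] = r
    k = int(len(items) * fraction)  # 36 items -> 3 output rows
    # Output row i shows the item at position rank[i]; ranks never repeat.
    return [items[rank[i] - 1] for i in range(k)]

items = list("abcdefghijklmnopqrstuvwxyz0123456789")  # 36 stand-in entries
chosen = sample_by_rank(items, seed=3)
print(len(chosen))  # 3
```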