I have a bunch of numbers or text values in different cells, e.g.:
2 50 900 1000 6 10 10 30
a b c d e
I need to reorder them according to a starter number and a divider. For example, if the starter is 3, I start with the third value, which is 900 in the numbers above and “c” in the letters.
Then, from the starter, I need to step forward by a fixed number each time, which is the divider. For example, if the divider is 3, I need every third value. In the numbers the next value to pick is 10, and in the letters the next value to pick is “a”.
When the search reaches the end of the range, it needs to wrap around and continue counting from the beginning.
If the value has been picked before, then I have to select the next one that has not been used.
Here are more examples, using the number or letter sequences above:
starter:3 – divider:3
- 900 10 2 1000 10 50 6 30
- c a d b e
starter:2 – divider:2
- 50 1000 10 30 900 6 10 2
- b d a c e
Note that in the number example here, after 30 I would land on 50, but because I’ve already selected it, I select the next unused number, in this case 900. Sometimes two or three consecutive values may already have been used, so the selection algorithm should jump to the first unused one!
starter:4 – divider:2
- 1000 10 30 50 6 10 2 900
- d a c e b
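To pin down the rule the examples above follow, here is a minimal sketch in Python rather than Excel. The function name `pick_order` and its parameters are made up for illustration; it assumes the starter is a 1-based position and that "next unused" means sliding forward one cell at a time, wrapping at the end.

```python
def pick_order(values, starter, divider):
    """Reorder values by stepping `divider` places at a time from the
    1-based position `starter`, wrapping around and skipping used cells."""
    n = len(values)
    used = [False] * n
    pos = (starter - 1) % n          # convert the 1-based starter to 0-based
    order = []
    for _ in range(n):
        while used[pos]:             # already picked: slide to the next unused cell
            pos = (pos + 1) % n
        used[pos] = True
        order.append(values[pos])
        pos = (pos + divider) % n    # step forward, wrapping at the end
    return order


numbers = [2, 50, 900, 1000, 6, 10, 10, 30]
letters = ["a", "b", "c", "d", "e"]
print(pick_order(numbers, 3, 3))  # [900, 10, 2, 1000, 10, 50, 6, 30]
print(pick_order(letters, 2, 2))  # ['b', 'd', 'a', 'c', 'e']
```

The two printed lines match the starter:3 – divider:3 numbers and the starter:2 – divider:2 letters listed above.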
Anyway, I’m not totally sure how to do it in Excel. I tried to use OFFSET, INDEX, and LOOKUP.
The data type doesn’t matter. I just wanted to give two examples, one in numbers and one in text, since the rules for any type of data should be the same.
Is there any simple way to solve this or do I have to get my hands dirty and write a macro?
Fun little mathematical exercise 🙂
As an example, I put the range of values in the top row:
     A    B    C    D    E
1    aa   bb   cc   dd   ee
2    bb   dd   aa   cc   ee
In A2 I put the formula and then just dragged it to the right. The example shows a “starter” of 2 and a “divider” of 2.
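The A2 formula itself is not shown in this copy of the answer, so here is a sketch of the same computation in Python, assembled from the pieces described in the key points. The function name `formula_order` is hypothetical, and the Excel line in the comment is a reconstruction under those assumptions, not necessarily the original formula.

```python
from math import gcd

# Assumed Excel equivalent for starter 2, divider 2 (a reconstruction):
# =INDEX($A$1:$E$1,MOD(2+(COLUMN()-1)*2+INT((COLUMN()-1)*GCD(COLUMNS($A$1:$E$1),2)/COLUMNS($A$1:$E$1))-1,COLUMNS($A$1:$E$1))+1)

def formula_order(values, starter, divider):
    """Compute each output cell directly, with no record of used cells."""
    n = len(values)                  # array-length, COLUMNS(...) in Excel
    g = gcd(n, divider)              # number of separate repeating sequences
    out = []
    for k in range(n):               # k is the zero-based output column
        base = starter + k * divider # where we would be ignoring repeats
        shift = (k * g) // n         # adds 1 each time a sequence is exhausted
        index = (base + shift - 1) % n + 1   # 1-based, as INDEX expects
        out.append(values[index - 1])
    return out


print(formula_order(["aa", "bb", "cc", "dd", "ee"], 2, 2))
# ['bb', 'dd', 'aa', 'cc', 'ee']
```

The output is row 2 of the example sheet. Because each cell depends only on its own column number, the formula can be dragged across without helper cells.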
A few key points
We’re using index to choose one member of the range:
=INDEX($A$1:$E$1, ... )
The formula also needs to know which column we are in (counting from zero) and the number of columns in the range (the array-length). The latter is COLUMNS($A$1:$E$1).
The index is modulo the number of columns, but then we switch to 1-based indexes for the INDEX function, hence the subtracting and adding:
MOD( ... -1,COLUMNS($A$1:$E$1))+1
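A small Python illustration of that subtract-then-add trick (the helper name `wrap_one_based` is made up for this sketch). A plain modulo would return 0 for the last column, which INDEX cannot use as a position, so the value is shifted to 0-based before the modulo and back to 1-based afterwards.

```python
def wrap_one_based(i, n):
    """Map any positive integer i onto 1..n, the way MOD(i-1, n)+1 does."""
    return (i - 1) % n + 1


print([wrap_one_based(i, 5) for i in range(1, 12)])
# [1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1]
```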
One portion of the index tells us where we’d be if we weren’t worried about repeating values (or, more precisely, it’s a number congruent to the desired index modulo the array-length): the starter plus the divider times the zero-based column number.
Another portion adds 1 every time we repeat.
This last part works because the GCD of the array-length and the “divider”, as you call it, is equal to the number of non-overlapping, repeating sequences that exist as you add multiples of the divider mod the array-length. (You can only be on one of these repeating sequences at a time.) So the array-length / the GCD is the length of such a sequence, and once you’ve used that number of values you’ll need to skip 1 to get to the next repeating sequence. We just divide our position in the output by the number of values in a repeating sequence:
current position / (array-length / GCD) = current position * GCD / array-length
rounding down (using INT), to see how much of an offset we need.
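To see the GCD argument in action, here is a rough Python check, using the question's 8-value range with a divider of 2 as the assumed inputs. The helper `cycles` is hypothetical; it simply walks the positions to list each repeating sequence.

```python
from math import gcd

def cycles(n, divider):
    """Split positions 0..n-1 into the sequences traced by repeatedly
    adding `divider` modulo n."""
    seen, result = set(), []
    for start in range(n):
        if start in seen:
            continue
        cycle, pos = [], start
        while pos not in seen:
            seen.add(pos)
            cycle.append(pos)
            pos = (pos + divider) % n
        result.append(cycle)
    return result


n, divider = 8, 2
g = gcd(n, divider)
print(cycles(n, divider))                # [[0, 2, 4, 6], [1, 3, 5, 7]]
print(len(cycles(n, divider)) == g)      # True
print([(k * g) // n for k in range(n)])  # [0, 0, 0, 0, 1, 1, 1, 1]
```

There are two sequences of four positions each, so the offset stays 0 for the first four outputs and becomes 1 for the last four, which is exactly the skip to 900 in the starter:2 – divider:2 number example.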