
Make an equivalent to Excel's =PERCENTRANK.EXC function in R?

Posted by: admin April 23, 2020

Questions:

I was wondering how to reproduce Excel's percent rank exclusive function (=PERCENTRANK.EXC) in R. I found a technique here that looks like this:

true_df <- data.frame(some_column= c(24516,7174,13594,33838,40000))

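percentilerank<-function(x){
  # run-length encode the sorted values so duplicates share one entry
  rx<-rle(sort(x))
  # number of values strictly smaller than each distinct value
  smaller<-cumsum(c(0, rx$lengths))[seq(length(rx$lengths))]
  # number of values strictly larger than each distinct value
  larger<-rev(cumsum(c(0, rev(rx$lengths))))[-1]
  rxpr<-smaller/(smaller+larger)
  rxpr[match(x, rx$values)]
}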
dfr<-percentilerank(true_df$some_column)

#output which is similar to =PERCENTRANK.INC and NOT =PERCENTRANK.EXC
#[1] 0.50 0.00 0.25 0.75 1.00

But this is the R equivalent of =PERCENTRANK.INC. According to the info popup in Excel, =PERCENTRANK.INC takes (array, x, [significance]), where significance is optional, and returns the percentage rank inclusive of the first (0%) and last (100%) values in the array.

=PERCENTRANK.EXC is similar to its counterpart, but it returns the percentage rank exclusive of the first and last values in the array, meaning it never returns 0% or 100%.
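As a rough sketch of the difference, both variants can be written in base R from the count of values strictly smaller than each element. The function names pr_inc and pr_exc are hypothetical, and the formulas are inferred from the outputs quoted in this post. The sketch only covers values that are members of the array and does not reproduce Excel's interpolation or its significance argument.

```r
# Sketch only: pr_inc / pr_exc are illustrative names, not library functions.
pr_inc <- function(x) {
  n <- length(x)
  below <- rank(x, ties.method = "min") - 1  # values strictly smaller than each element
  below / (n - 1)                            # smallest value -> 0, largest -> 1
}

pr_exc <- function(x) {
  n <- length(x)
  below <- rank(x, ties.method = "min") - 1
  (below + 1) / (n + 1)                      # never reaches 0 or 1
}

x <- c(24516, 7174, 13594, 33838, 40000)
pr_inc(x)
pr_exc(x)
```

The only differences are the +1 in the numerator and the n + 1 in the denominator, which keep the exclusive result strictly between 0 and 1.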

Here is a small example using Excel to show difference:

[Image: Excel screenshot comparing PERCENTRANK.INC and PERCENTRANK.EXC; not reproduced here]

When I apply the above R function, the output matches the PERCENTRANK.INC($A$32:$A$36,A32) column. How can I get the PERCENTRANK.EXC result instead? I'm new to R.

How to & Answers:

Using dplyr:

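library(dplyr)

# x is the numeric vector, e.g. true_df$some_column

# inclusive (like =PERCENTRANK.INC)
percent_rank(x)

# exclusive (like =PERCENTRANK.EXC): pad with -Inf and Inf, then drop them
percent_rank(c(-Inf, Inf, x))[-(1:2)]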
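Why the padding works, as a sketch: dplyr's percent_rank() rescales the minimum rank to [0, 1], that is (min_rank - 1) / (n - 1). The code below spells that arithmetic out in base R on the question's data and, only if dplyr happens to be installed, checks it against percent_rank(). The variable names padded and exc are illustrative.

```r
x <- c(24516, 7174, 13594, 33838, 40000)  # data from the question

padded <- c(-Inf, Inf, x)

# -Inf takes rank 1 and Inf takes the top rank, so each real value's
# minimum rank is (count of smaller values) + 2 and the length is n + 2.
# (rank - 1) / (length - 1) therefore becomes (count + 1) / (n + 1).
manual <- (rank(padded, ties.method = "min") - 1) / (length(padded) - 1)
exc <- manual[-(1:2)]  # drop the two sentinel entries
exc

# Optional cross-check against dplyr, skipped when dplyr is not available
if (requireNamespace("dplyr", quietly = TRUE)) {
  stopifnot(isTRUE(all.equal(exc, dplyr::percent_rank(padded)[-(1:2)])))
}
```

The two sentinels themselves get 0 and 1, which is why they are dropped and the real values can never reach either extreme.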

Answer:

I messed around with the code and got this:

true_df <- data.frame(some_column= c(24516,7174,13594,33838,40000))

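percentilerank<-function(x){
  rx<-rle(sort(x))
  # !0 is TRUE, which is coerced to 1, so the counts start at 1 instead of 0
  smaller<-cumsum(c(!0, rx$lengths))[seq(length(rx$lengths))]
  # [-1] removed; the index keeps larger the same length as smaller,
  # which avoids a recycling warning
  larger<-rev(cumsum(c(0, rev(rx$lengths))))[seq(length(rx$lengths))]
  # smaller + larger is now n + 1 for every value
  rxpr<-smaller/(smaller+larger)
  rxpr[match(x, rx$values)]
}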

dfr<-percentilerank(true_df$some_column)

#output now matches =PERCENTRANK.EXC
#[1] 0.5000000 0.1666667 0.3333333 0.6666667 0.8333333

Since 0% and 100% must not appear in the result, I changed the line smaller<-cumsum(c(0, ... to smaller<-cumsum(c(!0, ... (!0 is TRUE, i.e. 1), which gets rid of the 0%. Similarly, to get rid of the 100%, I took out the [-1] from the line larger<-...[-1].
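To see what those two changes do, here is a small sketch that prints the intermediate vectors for the question's data. The names smaller_inc, larger_inc, smaller_exc, larger_exc and k are illustrative. Trimming larger_exc with [k] is an assumption made here so that both vectors have the same length.

```r
x  <- c(24516, 7174, 13594, 33838, 40000)
rx <- rle(sort(x))
k  <- seq_along(rx$lengths)

# Inclusive version from the question
smaller_inc <- cumsum(c(0, rx$lengths))[k]             # counts start at 0
larger_inc  <- rev(cumsum(c(0, rev(rx$lengths))))[-1]  # counts end at 0

# Exclusive version: counts start at 1, and the leading element of
# larger is kept instead of being dropped with [-1]
smaller_exc <- cumsum(c(1, rx$lengths))[k]
larger_exc  <- rev(cumsum(c(0, rev(rx$lengths))))[k]

smaller_inc + larger_inc   # n - 1 for every value
smaller_exc + larger_exc   # n + 1 for every value
smaller_exc / (smaller_exc + larger_exc)
```

The denominator moves from n - 1 to n + 1 while the numerator is shifted up by one, so the smallest value maps to 1/(n + 1) rather than 0 and the largest to n/(n + 1) rather than 1.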