r – Extracting columns from an Excel table and merging them into another table

Posted by: admin May 14, 2020

Questions:

I have many (hundreds of) Excel documents, each with around 10 columns and 10 rows.

My objective is to create separate txt files: one containing the first and second columns, another containing the first and third, and so on, and then to do the same for the rest of the Excel files.

Is there any way of doing this in Excel? Alternatively, would it be possible to run a batch command in R over the Excel files (previously exported to CSV or a similar format) to produce separate txt files containing each pairing of columns?

How to & Answers:

Here is one possible way to do it in R. It handles a single CSV file, but it can easily be adapted to many files.

##Simulate data
write.csv(matrix(rnorm(100),ncol=10),file="test.csv",row.names=FALSE)
data1<-read.csv("test.csv")

##Create the matrix containing the column numbers for exporting.
##combn() returns each pair as a column, so transpose it to get one pair per row.
rr<-t(combn(ncol(data1),2))

##Write the columns in separate files
for(i in 1:nrow(rr))
{
    outname<-paste("output1_",paste(rr[i,],collapse="_"),".csv",sep="")
    write.csv(data1[,rr[i,]],file=outname,row.names=FALSE)
}

This code takes one file named test.csv and produces files named output1_coln1_coln2.csv, where coln1 and coln2 are the column numbers.

For many files, wrap this in a function and loop over all CSV files.
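As a rough illustration of that last step, here is a minimal sketch of such a function. The name export_column_pairs, its outdir argument, the temporary demo directory and the test1.csv/test2.csv input names are all assumptions made up for this example; they are not part of the answer above.

```r
# Hypothetical helper: writes every pair of columns of one CSV file to its own file
export_column_pairs <- function(infile, outdir = dirname(infile)) {
  dta <- read.csv(infile)
  # Input file name without directory and extension, used as the output prefix
  prefix <- tools::file_path_sans_ext(basename(infile))
  # combn() returns one pair of column indices per column of the result
  pairs <- combn(ncol(dta), 2)
  outfiles <- character(ncol(pairs))
  for (k in seq_len(ncol(pairs))) {
    idx <- pairs[, k]
    outfiles[k] <- file.path(outdir, paste0(prefix, "_", idx[1], "_", idx[2], ".csv"))
    write.csv(dta[, idx], file = outfiles[k], row.names = FALSE)
  }
  invisible(outfiles)
}

# Usage: simulate two input files in a temporary directory (assumed names), then loop over them
indir <- file.path(tempdir(), "pairs_demo")
dir.create(indir, showWarnings = FALSE)
for (n in 1:2) {
  write.csv(matrix(rnorm(100), ncol = 10),
            file = file.path(indir, paste0("test", n, ".csv")), row.names = FALSE)
}
# The pattern is a regular expression; it matches the inputs but not the generated pair files
infiles <- list.files(indir, pattern = "^test[0-9]+\\.csv$", full.names = TRUE)
written <- unlist(lapply(infiles, export_column_pairs))
```

Returning the output paths invisibly is just a convenience for checking afterwards which files were written; the function would work equally well without it.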

Answer:

And with looping over the files:

#pattern is a regular expression, not a shell glob
fnames<-list.files(pattern = "^myFile.*\\.csv$")
fnums<-as.integer(sub(".csv", "", sub("myFile", "", fnames, fixed=TRUE), fixed=TRUE))

for(i in seq_along(fnums))
{
    dta<-read.csv(fnames[i])
    #halfnumcols<-dim(dta)[2] %/% 2
    #for(j in (seq(halfnumcols)-1))
    #{
    #   write.csv(dta[,j*2+c(1,2)], paste("resultFile", i, ".", (j+1), ".csv", sep=""))
    #}
    #EDIT: instead of neighbor pairs, run over all pairs
    numcols<-dim(dta)[2]
    apply(combn(seq(numcols), 2), 2, function(curcomb){
        write.csv(dta[,curcomb], paste("resultFile", i, ".", curcomb[1], ".", curcomb[2], ".csv", sep=""))
    })
}
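
Both answers export every pair of columns, whereas the question asks only for the first column paired with each of the others, written to txt files. A minimal sketch of that narrower variant might look like the following. The function name export_first_column_pairs, the tab-delimited output format and the demo file name are assumptions chosen for illustration, not something stated in the question or the answers.

```r
# Hypothetical helper: pairs the first column with each remaining column
export_first_column_pairs <- function(infile, outdir = dirname(infile)) {
  dta <- read.csv(infile)
  # Input file name without directory and extension, used as the output prefix
  prefix <- tools::file_path_sans_ext(basename(infile))
  outfiles <- character(0)
  # seq_len(ncol(dta))[-1] is 2, 3, ..., ncol(dta); it is empty for a one-column file
  for (j in seq_len(ncol(dta))[-1]) {
    outfile <- file.path(outdir, paste0(prefix, "_1_", j, ".txt"))
    # Tab-delimited text without quotes or row names (an assumed txt layout)
    write.table(dta[, c(1, j)], file = outfile, sep = "\t",
                quote = FALSE, row.names = FALSE)
    outfiles <- c(outfiles, outfile)
  }
  invisible(outfiles)
}

# Usage: simulate one 10-column input file in a temporary directory
demo_dir <- file.path(tempdir(), "first_col_demo")
dir.create(demo_dir, showWarnings = FALSE)
demo_file <- file.path(demo_dir, "test.csv")
write.csv(matrix(rnorm(100), ncol = 10), file = demo_file, row.names = FALSE)
txt_files <- export_first_column_pairs(demo_file)
```

For a file with 10 columns this writes 9 files instead of the 45 produced by the all-pairs versions.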