Hi everyone, I received data in an Excel (xls) spreadsheet, formatted as in the first table, illustrated above.
I am trying to rearrange this data into the format shown in the table just below. Any help would be greatly appreciated.
First, save it to a CSV file, then:
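    import csv

    curr = []  # merged row currently being built; empty until the first row is read
    with open('file.csv', newline='') as infile, open('path/to/output', 'w', newline='') as fout:
        outfile = csv.writer(fout)
        for area, pop10, pop20, pop50 in csv.reader(infile):
            if not curr or curr[0] != area:
                # A new area starts: write out the finished row, if there is one
                if curr:
                    outfile.writerow(curr)
                curr = [area, pop10, pop20, pop50]
                continue
            # Same area as before: fill in whichever cells this row provides
            if pop10: curr[1] = pop10
            if pop20: curr[2] = pop20
            if pop50: curr[3] = pop50
        # Write out the last area, which is never followed by a different one
        if curr:
            outfile.writerow(curr)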
You can do this pretty succinctly using Pandas:
    import pandas as pd

    dataframe = pd.read_excel("in.xlsx")
    merged = dataframe.groupby("AREA").sum()
    merged.to_excel("out.xlsx")
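To see what the groupby step does without touching any files, here is a minimal sketch on an in-memory table. The column names POP10 and POP20 and the values are made up for illustration; only AREA comes from the question. It assumes the population columns are numeric with blank cells read as missing values.

```python
import pandas as pd

# Hypothetical data: each area is spread over several rows,
# with one filled-in cell per row and the rest blank (None).
df = pd.DataFrame({
    "AREA": ["A", "A", "B", "B"],
    "POP10": [100.0, None, 40.0, None],
    "POP20": [None, 120.0, None, 55.0],
})

# sum() skips missing values, so each area collapses to one row
# holding the single non-blank value from each column.
merged = df.groupby("AREA").sum()
print(merged)
```

Note that if an area has two non-blank values in the same column, sum() adds them together, whereas the csv loop keeps the later one.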
So, if the CSV has 11 columns where 'AREA' is the second column, would the code be:
    def CompressRow(in_csv, out_file):
        curr = []
        with open(in_csv) as infile, open(out_file, 'w') as fout:
            outfile = csv.writer(fout)
            for a,b,c,d,e,f,g,h,i,j,k in csv.reader(infile):
                if curr and curr[1] != b:
                    outfile.writerow(curr)
                    curr = [a,b,c,d,e,f,g,h,i,j,k]
                    continue
                if a: curr[0] = a
                if c: curr[2] = c
                if d: curr[3] = d
                if e: curr[4] = e
                if f: curr[5] = f
                if g: curr[6] = g
                if h: curr[7] = h
                if i: curr[8] = i
                if j: curr[9] = j
                if k: curr[10] = k

    #execute
    CompressRow(in_csv, out_file)

I tried executing it and it gives me:

    if a: curr[0] = a
    IndexError: list assignment index out of range
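The IndexError most likely comes from the very first row: curr is still an empty list at that point, so the "curr and ..." test is false, the code falls through, and assigning to curr[0] fails because an empty list has no index 0. Below is a minimal sketch of one way around it that also avoids naming all 11 columns. The function name compress_rows and the key_index parameter are my own, not from the posts above, and it assumes every row has the same number of columns and that rows for the same area are adjacent.

```python
import csv

def compress_rows(in_csv, out_file, key_index=1):
    """Merge consecutive rows that share the same value in column key_index."""
    curr = []  # merged row being built; empty until the first row is read
    with open(in_csv, newline="") as infile, open(out_file, "w", newline="") as fout:
        writer = csv.writer(fout)
        for row in csv.reader(infile):
            if not row:
                continue  # skip completely blank lines
            if not curr or curr[key_index] != row[key_index]:
                # First row, or a new key: write the finished row and start over
                if curr:
                    writer.writerow(curr)
                curr = list(row)
                continue
            # Same key: copy over every non-empty cell from this row
            for i, value in enumerate(row):
                if value:
                    curr[i] = value
        # The last group is never followed by a new key, so write it here
        if curr:
            writer.writerow(curr)
```

With 'AREA' in the second column, compress_rows(in_csv, out_file) with the default key_index=1 should do; for the original four-column file with the area first, pass key_index=0. A header row, if present, is simply written out as its own row because its key differs from the first data row.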