
# python – Calculate new columns from existing

Posted by: admin, May 14, 2020

Questions:

I have Excel files with anywhere from 30 to 35 columns and 50 to 1,500 rows of data, depending on the company. The columns in question are USED, REMAIN, and REFUND. These three columns are calculated from other columns.

USED is each row of GAL added up as it goes along (a running total), so the Excel calculation starts with `=W2`, then the next row is `W2+W3`, then `W3+W4`, and so on.

REMAIN is ASSIGNED-USED

REFUND is GAL*CREDIT

Would something like this even be possible? Currently I’m doing all the calculations in Excel, which is time-consuming, and after some research I figure it would be easier to code something to automate this.
I would be grateful for any help, even if it’s just the calculations for one column.

I’ve looked for some ideas online and figure pandas is the best way of going about it, but I’m open to other suggestions.

```python
import pandas as pd

# The file path has to be a quoted string
filename = "home/itdept/Documents/BestWines.xlsx"
df = pd.read_excel(filename)
df['Refund'] = df['QUANTITY IN GAL'] * df['CBMA Credit']
df.head(5)
df.to_excel("path to save")
```

This is what I came up with for the one column, Refund. I wasn’t sure how (or if) I could incorporate the other columns in the code as well.

How to & Answers:

Consider `Series.cumsum` for the cumulative sum:

```python
df['USED'] = df['GAL'].cumsum()
```
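As a quick check of what `cumsum` produces, here is a minimal sketch on a small made-up frame. The `GAL` values are invented for illustration and are not taken from the asker's workbook. Each row of `USED` is the sum of all `GAL` values up to and including that row.

```python
import pandas as pd

# Hypothetical sample data; the real values would come from the Excel file.
sample = pd.DataFrame({"GAL": [10.0, 5.0, 20.0, 2.5]})

# Running total: row n holds GAL[0] + GAL[1] + ... + GAL[n].
sample["USED"] = sample["GAL"].cumsum()

print(sample)
```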

From there, basic arithmetic such as subtraction and multiplication can be run directly on columns:

```python
# SUBTRACTION
df['REMAIN'] = df['ASSIGNED'] - df['USED']

# MULTIPLICATION
df['REFUND'] = df['QUANTITY IN GAL'] * df['CBMA Credit']
```

Or use their method forms, `sub` and `mul` (among other similar methods):

```python
# SUBTRACTION
df['REMAIN'] = df['ASSIGNED'].sub(df['USED'])

# MULTIPLICATION
df['REFUND'] = df['QUANTITY IN GAL'].mul(df['CBMA Credit'])
```
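One practical difference with the method forms is that they accept a `fill_value` argument, which can matter if the sheet has blank cells. The sketch below uses invented numbers and the column names `GAL` and `CREDIT` from the question's description. It assumes a blank credit should count as zero, which may not be the right rule for your data.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one missing credit value.
sample = pd.DataFrame({
    "GAL": [10.0, 5.0, 20.0],
    "CREDIT": [0.5, np.nan, 0.25],
})

# Operator form: a missing input gives a missing result.
refund_op = sample["GAL"] * sample["CREDIT"]

# Method form: fill_value=0 treats the missing credit as zero
# (an assumption made for this example only).
refund_filled = sample["GAL"].mul(sample["CREDIT"], fill_value=0)

print(refund_op.tolist())
print(refund_filled.tolist())
```

If blanks should stay blank in the output, the plain operator form is probably the safer default.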

Altogether, consider `assign` for one compact statement:

```python
import pandas as pd

filename = "home/itdept/Documents/BestWines.xlsx"
df = (pd.read_excel(filename)
        .assign(USED = lambda x: x['GAL'].cumsum(),
                REMAIN = lambda x: x['ASSIGNED'].sub(x['USED']),
                REFUND = lambda x: x['QUANTITY IN GAL'].mul(x['CBMA Credit'])
               )
     )

df.head(5)
df.to_excel("path to save")
```
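To try the `assign` approach without an Excel file, the same chain can be run on an in-memory frame. This is only a sketch: the helper name `add_calculated_columns` and the sample numbers are made up, and it assumes the columns are called `GAL`, `ASSIGNED` and `CREDIT` as in the question's description, so rename them to match your sheet. The `REMAIN` lambda can refer to `USED` because `assign` evaluates its keyword arguments in order.

```python
import pandas as pd


def add_calculated_columns(df):
    """Return a copy of df with USED, REMAIN and REFUND added.

    Hypothetical helper; assumes GAL, ASSIGNED and CREDIT columns exist.
    """
    return df.assign(
        USED=lambda x: x["GAL"].cumsum(),
        # USED already exists at this point, so it can be used here.
        REMAIN=lambda x: x["ASSIGNED"] - x["USED"],
        REFUND=lambda x: x["GAL"] * x["CREDIT"],
    )


# Invented sample data standing in for pd.read_excel(...).
sample = pd.DataFrame({
    "GAL": [10.0, 5.0, 20.0],
    "ASSIGNED": [100.0, 100.0, 100.0],
    "CREDIT": [0.5, 0.5, 0.25],
})

result = add_calculated_columns(sample)
print(result)
```

Because `assign` returns a new DataFrame, `sample` itself is left unchanged. When saving, passing `index=False` to `to_excel` keeps the row index out of the output sheet.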

### Answer:

```python
# Import the package used in this code
import pandas as pd

# Load the Excel content into the df DataFrame
df = pd.read_excel('sflowone.xlsx', sheet_name='Sheet1')

# Build the updated USED column in a list
newlist = []  # empty list to collect the running totals

x = 0  # running total of "GAL"
for value in df["GAL"]:  # loop over every value in "GAL"
    x = value + x        # add the current value to all earlier entries
    newlist.append(x)    # store the running total for this row

# Delete the old "USED" column, if it exists, before updating
df.drop("USED", axis=1, inplace=True, errors="ignore")
df.insert(3, "USED", newlist)  # insert the updated "USED" column at position 3

# Update REMAIN and REFUND
df["REMAIN"] = df["ASSIGNED"] - df["USED"]
df["REFUND"] = df["GAL"] * df["CREDIT"]

# View the first 5 entries
df.head(5)

# Save to an Excel sheet
df.to_excel("sflowfinal.xlsx")

# Code is tested and running; for queries, please reply
```
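The explicit loop in this answer and `Series.cumsum` should produce the same running total. A small sketch with invented `GAL` values checks that the two agree:

```python
import pandas as pd

# Hypothetical sample data for the comparison.
sample = pd.DataFrame({"GAL": [3, 7, 1, 4]})

# Loop version: accumulate the running total by hand.
running = 0
loop_totals = []
for value in sample["GAL"]:
    running += value
    loop_totals.append(running)

# Vectorized version: let pandas compute the running total.
vector_totals = sample["GAL"].cumsum().tolist()

print(loop_totals == vector_totals)
```

On sheets with up to 1,500 rows either version is fast enough; `cumsum` is mainly shorter and avoids the separate drop and insert steps.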