How to match data in dataframe Python Pandas

Posted by: admin February 24, 2020

Questions:

I’m trying to match rows against other rows in the same dataframe, and my current approach is not working. After some research, I believe I’m selecting rows that are either an open or a close, rather than rows that have both. I have transactions whose opening and closing trades I want to match, disregarding everything else. The results still show unclosed transactions.

Code:

# import important stuffs
import pandas as pd

# open file and sort through options only and pair opens to closes
with open('TastyTrades.csv'):
    trade_reader = pd.read_csv('TastyTrades.csv')  # create reader
    options_frame = trade_reader.loc[(trade_reader['Instrument Type'] == 'Equity Option')]  # sort for options only
    BTO = options_frame[options_frame['Action'].isin(['BUY_TO_OPEN', 'SELL_TO_CLOSE'])]  # look for BTO/STC
    STO = options_frame[options_frame['Action'].isin(['SELL_TO_OPEN', 'BUY_TO_CLOSE'])]  # look for STO/BTC
    paired_frame = [BTO, STO]  # combine
    results = pd.concat(paired_frame)  # concat
    results_sorted = results.sort_values(by=['Symbol', 'Call or Put', 'Date'], ascending=True)  # sort by symbol
    results_sorted.to_csv('new_taste.csv')  # write new list

Results:

310,2019-12-19T15:47:24-0500,Trade,SELL_TO_OPEN,APA   200117P00020000,Equity Option,Sold 1 APA 01/17/20 Put 20.00 @ 0.33,33,1,33.0,-1.0,-0.15,100.0,APA,1/17/2020,20.0,PUT
296,2019-12-31T09:30:07-0500,Trade,BUY_TO_CLOSE,APA   200117P00020000,Equity Option,Bought 1 APA 01/17/20 Put 20.00 @ 0.08,-8,1,-8.0,0.0,-0.14,100.0,APA,1/17/2020,20.0,PUT
8,2020-02-14T12:19:30-0500,Trade,BUY_TO_OPEN,AXAS  200918C00002500,Equity Option,Bought 2 AXAS 09/18/20 Call 2.50 @ 0.05,-10,2,-5.0,-2.0,-0.28,100.0,AXAS,9/18/2020,2.5,CALL
172,2020-01-28T10:05:14-0500,Trade,SELL_TO_OPEN,BAC   200320C00033000

As you can see here, I have one full transaction (APA), one half of a transaction (AXAS), and the first half of a full transaction (BAC). I don’t want to see AXAS in there, but AXAS and the other unmatched rows keep showing up no matter how I try to get rid of them.

Answers:

Right now you’re just selecting all opens and all closes and then stacking them; there’s no actual pairing going on. If I’m understanding you correctly, you only want to include transactions that have both an open and a close in the dataset? If that’s the case, I’d suggest taking the set intersection of the transaction IDs found among the opens and among the closes, and using it to select the paired transactions. It’d look something like the code below, which assumes that the fifth column in your data (e.g. “APA 200117P00020000”) identifies the transaction and is named TransactionID. If your column has a different name, substitute it.

import pandas as pd

trade_reader = pd.read_csv('TastyTrades.csv')

# keep only the option trades
options_frame = trade_reader.loc[
    trade_reader['Instrument Type'] == 'Equity Option'
]

# split into opening and closing trades
opens = options_frame[
    options_frame['Action'].isin(['BUY_TO_OPEN', 'SELL_TO_OPEN'])
]
closes = options_frame[
    options_frame['Action'].isin(['BUY_TO_CLOSE', 'SELL_TO_CLOSE'])
]

# IDs that appear among both the opens and the closes
# ('TransactionID' is a placeholder; use your file's actual column name)
paired_ids = set(opens['TransactionID']) & set(closes['TransactionID'])

# select every row that belongs to a paired transaction
paired_transactions = options_frame[
    options_frame['TransactionID'].isin(paired_ids)
]

# sort by symbol, then option type, then date
results = paired_transactions.sort_values(
    by=['Symbol', 'Call or Put', 'Date'],
    ascending=True
)
results.to_csv('NewTastyTransactions.csv')
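
To check the idea without the real CSV, here is a minimal sketch on a toy dataframe modeled on the rows in the question. The helper name keep_paired is hypothetical, and it assumes the contract identifier sits in a column called Symbol, which the question’s output suggests but does not confirm. Most columns are omitted.

```python
import pandas as pd

OPEN_ACTIONS = ["BUY_TO_OPEN", "SELL_TO_OPEN"]
CLOSE_ACTIONS = ["BUY_TO_CLOSE", "SELL_TO_CLOSE"]


def keep_paired(frame, id_col="Symbol"):
    """Return only the rows whose ID has both an open and a close.

    `id_col` is an assumption: pass whichever column identifies
    the contract in your own file.
    """
    opened = set(frame.loc[frame["Action"].isin(OPEN_ACTIONS), id_col])
    closed = set(frame.loc[frame["Action"].isin(CLOSE_ACTIONS), id_col])
    return frame[frame[id_col].isin(opened & closed)]


# Toy data: APA is opened and closed, AXAS and BAC are only opened.
sample = pd.DataFrame(
    {
        "Action": [
            "SELL_TO_OPEN",
            "BUY_TO_CLOSE",
            "BUY_TO_OPEN",
            "SELL_TO_OPEN",
        ],
        "Symbol": [
            "APA   200117P00020000",
            "APA   200117P00020000",
            "AXAS  200918C00002500",
            "BAC   200320C00033000",
        ],
    }
)

paired = keep_paired(sample)
print(paired)
```

Only the two APA rows should survive here, since AXAS and BAC have no closing trade in the toy data. Note that this matches on the identifier alone; it does not check that quantities net to zero, so a partially closed position would still count as paired.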