While learning Python, I was working on a dataset like this:

| Col1 | Col2 | Col3 |
|---|---|---|
| dog | Z | st02 |
| dog,cat | Z | st02 |
| dog,bat,cat | Z | st02 |
| bat,cat,elephant | Y | st02 |
| dog,bat,cat,elephant | Y | st02 |
| tiger | Z | st01 |
| pigeon | Z | st01 |
| pigeon,parrot | Z | st01 |
| dove,parrot | Z | st01 |
| pigeon,parrot | Z | st01 |
| pigeon,parrot,dove | Z | st01 |
| lion,leopard,cheetah | Z | st01 |
| tiger,lion,leopard,cheetah | Z | st01 |
| dog,tiger,cheetah | Y | st01 |
| dog,tiger,leopard,cheetah | Y | st01 |
| eagle,jaguar,Kangaroo,zebra | Z | st02 |
| cheetah,eagle,jaguar,Kangaroo,zebra | Z | st02 |

The expected output is:

| Col1 | Col2 | Col3 |
|---|---|---|
| dog,bat,cat | Z | st02 |
| dog,bat,cat,elephant | Y | st02 |
| tiger,lion,leopard,cheetah | Z | st01 |
| dog,tiger,leopard,cheetah | Y | st01 |
| cheetah,eagle,jaguar,Kangaroo,zebra | Z | st02 |
| pigeon,parrot,dove | Z | st01 |

To extract the above rows as output, I tried tracing the patterns and used the logic below:
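import pandas as pd
data = pd.read_excel("data.xlsx")
# Number of commas in Col1, i.e. the item count minus one
data['Col4'] = data['Col1'].str.count(',')
# The first row has no predecessor, so both conditions start as False
v1 = [False]
v2 = [False]
for i in range(0, data.shape[0] - 1):
    # Condition 1: this row and the next one share Col2 and Col3
    x = data['Col2'][i]
    y = data['Col2'][i + 1]
    t1 = data['Col3'][i]
    t2 = data['Col3'][i + 1]
    g1 = (x == y) & (t1 == t2)
    d1 = data['Col1'][i]
    d2 = data['Col1'][i + 1]
    c1 = data['Col4'][i]
    c2 = data['Col4'][i + 1]
    # Condition 2: every character of d1 occurs in d2, and d2 has more items
    flag = 0
    if all(x in d2 for x in d1):
        flag = 1
    g2 = (flag == 1) & (c2 > c1)
    # Both results are stored against the next row (index i + 1)
    v1.append(g1)
    v2.append(g2)
data['new_cond1'] = v1
data['new_cond2'] = v2
data['Final_flag'] = (data['new_cond1'] == True) & (data['new_cond2'] == True)
data_output = data[data['Final_flag'] == True]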
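
One detail of the loop above may be worth checking: because `d1` and `d2` are strings, `all(x in d2 for x in d1)` walks over the characters of `d1` and tests each one as a substring of `d2`. It does not compare the comma-separated items. The sketch below contrasts the two checks. The helper names `chars_contained` and `items_contained` are made up for illustration and are not part of the original code.

```python
def chars_contained(d1: str, d2: str) -> bool:
    # Same test as in the loop: every character of d1 appears somewhere in d2.
    return all(ch in d2 for ch in d1)


def items_contained(d1: str, d2: str) -> bool:
    # Item-level test: every comma-separated item of d1 is also an item of d2.
    return set(d1.split(",")) <= set(d2.split(","))


# "tab" is not an item of "bat,cat", but all of its letters occur in that string.
print(chars_contained("tab", "bat,cat"))   # True
print(items_contained("tab", "bat,cat"))   # False
```

Splitting on the comma first makes the comparison independent of which letters happen to be shared between different animal names.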
But I didn't get the expected output; a few additional rows are also present in the output. Could someone please help me extract the rows mentioned in the expected output?
Thanks in advance!
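
Judging from the expected output, the rows to keep appear to be those whose item list is not contained in a longer list with the same Col2 and Col3. Comparing only adjacent rows flags every row that extends its predecessor, including intermediate ones such as `dog,cat`, which may explain the extra rows. Below is a minimal sketch under that assumption. The data is built inline instead of being read from `data.xlsx`, and the helper `maximal_rows` is a hypothetical name introduced here.

```python
import pandas as pd

# Sample rows copied from the question; column names follow its table.
rows = [
    ("dog", "Z", "st02"),
    ("dog,cat", "Z", "st02"),
    ("dog,bat,cat", "Z", "st02"),
    ("bat,cat,elephant", "Y", "st02"),
    ("dog,bat,cat,elephant", "Y", "st02"),
    ("tiger", "Z", "st01"),
    ("pigeon", "Z", "st01"),
    ("pigeon,parrot", "Z", "st01"),
    ("dove,parrot", "Z", "st01"),
    ("pigeon,parrot", "Z", "st01"),
    ("pigeon,parrot,dove", "Z", "st01"),
    ("lion,leopard,cheetah", "Z", "st01"),
    ("tiger,lion,leopard,cheetah", "Z", "st01"),
    ("dog,tiger,cheetah", "Y", "st01"),
    ("dog,tiger,leopard,cheetah", "Y", "st01"),
    ("eagle,jaguar,Kangaroo,zebra", "Z", "st02"),
    ("cheetah,eagle,jaguar,Kangaroo,zebra", "Z", "st02"),
]
data = pd.DataFrame(rows, columns=["Col1", "Col2", "Col3"])


def maximal_rows(group: pd.DataFrame) -> pd.DataFrame:
    """Keep rows whose item set is not a proper subset of another row's set."""
    sets = [frozenset(s.split(",")) for s in group["Col1"]]
    # `s < other` is True only when s is a proper subset of other.
    keep = [not any(s < other for other in sets) for s in sets]
    return group.loc[keep]


# Compare rows only within the same (Col2, Col3) combination,
# then restore the original row order.
data_output = pd.concat(
    [maximal_rows(g) for _, g in data.groupby(["Col2", "Col3"], sort=False)]
).sort_index()

print(data_output)
```

On the sample data this keeps the six rows listed in the expected output, in their original row order, so `pigeon,parrot,dove` appears third instead of last. Two identical rows that are both maximal would both be kept; a `drop_duplicates()` call could be added if that is not wanted.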