How can I implement the backtrack (some books call it backtrace) function in Python for an NLP project?

Posted by: admin, March 1, 2020

Questions:

Here's the code I got from a class on GitHub. I wrote some functions on top of it and have been stuck on it for a few days.

In this code I have to use maximal matching and then backtrack through the resulting table.

thai_vocab = ["ไ","ป","ห","า","ม","เ","ห","ส","ี","ไป","หา","หาม","เห","สี","มเหสี","!"]


from math import inf  # infinity

def maximal_matching(c):
    # Initialize an empty 2D list
    d = [[None]*len(c) for _ in range(len(c))]

    #FILL CODE HERE
    for i in range(len(d)):
        for j in range(len(d[0])):
            if (i == 0) and (c[i:j+1] in thai_vocab):
                d[0][j] = 1
            elif (j > 0) and (c[i:j+1] in thai_vocab):
                # Column i-1 holds the costs of every word ending just before i
                res = [k for k in zip(*d)][i-1]
                temp = []
                for val in res:
                    if val is not None:
                        temp.append(val)
                d[i][j] = 1 + min(temp)
            elif c[i:j+1] != "":
                d[i][j] = inf

    return d

def backtrack(d):
    eow = len(d)-1  # End of Word position
    word_pos = []   # Word position

    #FILL CODE HERE
    row_pos = len(d)-1
    while eow >= 0:
        res = [k for k in zip(*d)][eow]
        temp = []
        for val in res:
            if val is not None:
                temp.append(val)
        min_col = min(temp)
        while row_pos >= 0:
            if (d[row_pos][eow] == min_col) and (d[row_pos][eow-1] is None):
                word_pos.append((row_pos,eow))
            elif (d[row_pos][eow] == min_col) and (d[row_pos][eow-1] == inf):
                eow -= 1
            elif (d[row_pos][eow] == inf) and (d[row_pos-1][eow] == inf):
                eow -= 1
            elif ((d[row_pos][eow] == min_col) and (d[row_pos][eow-1] == min_col) or (d[row_pos][eow-1] == inf)):
                eow -= 1
            elif (d[row_pos][eow] == inf) and (d[row_pos][eow-1] is None) and (isinstance(d[row_pos-1][eow], int) == False):
                word_pos.append((row_pos,eow))
            else:
                row_pos -= 1
        eow -= 1

    word_pos.reverse()
    return word_pos

Now I run the code below to get the result from maximal_matching():

input_text = "ไปหามเหสี!"
out = maximal_matching(input_text)
for i in range(len(out)):
    print(out[i],input_text[i])

The result is

[1, 1, inf, inf, inf, inf, inf, inf, inf, inf] ไ
[None, 2, inf, inf, inf, inf, inf, inf, inf, inf] ป
[None, None, 2, 2, 2, inf, inf, inf, inf, inf] ห
[None, None, None, 3, inf, inf, inf, inf, inf, inf] า
[None, None, None, None, 3, inf, inf, inf, 3, inf] ม
[None, None, None, None, None, 3, 3, inf, inf, inf] เ
[None, None, None, None, None, None, 4, inf, inf, inf] ห
[None, None, None, None, None, None, None, 4, 4, inf] ส
[None, None, None, None, None, None, None, None, 5, inf] ี
[None, None, None, None, None, None, None, None, None, 4] !

The final step is to recover the words that the algorithm tokenized (in this case the text separates into 4 words with my dictionary).
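One way to sanity-check the 4-word count, assuming each entry d[i][j] holds the number of words needed to cover the text up to position j when the last word starts at position i: the smallest value in the last column is then the word count, and its row is where the last word starts. The list below is copied by hand from the last column of the table printed above; the names last_column, word_count and last_word_start are only illustrative.

```python
from math import inf

# Last column of the printed table, one entry per row (copied by hand).
last_column = [inf, inf, inf, inf, inf, inf, inf, inf, inf, 4]

# Fewest words needed to cover the whole text.
word_count = min(last_column)

# Row index of that minimum = start position of the last word.
last_word_start = last_column.index(word_count)

print(word_count, last_word_start)  # 4 9
```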

def print_tokenized_text(d, input_text):
    tokenized_text=[]
    for pos in backtrack(d):
        #print(pos)
        tokenized_text.append(input_text[pos[0]:pos[1]+1])

    print("|".join(tokenized_text))

print_tokenized_text(out,input_text)

The result should be

ไป|หา|มเหสี|!

but in this case I get an error. My backtrack function doesn't work with the rest of my code, and I don't know how to fix or optimize it. Could you suggest which algorithm would be best for searching the table and printing the result?

Thank you in advance

Answers:
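A possible approach, offered as a sketch rather than the intended course solution: since every column of the table already holds the cost of each word that ends there, backtracking may only need one pass from the last column to the first. For the current end position, pick the row with the smallest value in that column (that row is the word's start), record the pair, then jump to the column just before that start. The version of maximal_matching below is a compact restatement that I assume builds the same table as the one in the question; the tie-breaking rule (prefer the earliest start, i.e. the longest word) and the ValueError are my own additions.

```python
from math import inf

thai_vocab = ["ไ", "ป", "ห", "า", "ม", "เ", "ห", "ส", "ี",
              "ไป", "หา", "หาม", "เห", "สี", "มเหสี", "!"]


def maximal_matching(c):
    """d[i][j] = fewest words covering c[0..j] when the last word is c[i..j]."""
    n = len(c)
    d = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            if c[i:j + 1] not in thai_vocab:
                d[i][j] = inf          # c[i..j] is not a dictionary word
            elif i == 0:
                d[i][j] = 1            # a single word covers the prefix
            else:
                # one more word than the best segmentation ending at i-1
                d[i][j] = 1 + min(d[k][i - 1] for k in range(i))
    return d


def backtrack(d):
    """Recover (start, end) word positions by walking the columns backwards."""
    eow = len(d) - 1                   # end index of the word being recovered
    word_pos = []
    while eow >= 0:
        # Rows 0..eow are the only possible starts for a word ending at eow.
        # min() returns the first minimum, so ties go to the longest word.
        start = min(range(eow + 1), key=lambda i: d[i][eow])
        if d[start][eow] == inf:
            raise ValueError(f"text cannot be segmented up to position {eow}")
        word_pos.append((start, eow))
        eow = start - 1                # previous word ends right before this one
    word_pos.reverse()
    return word_pos


input_text = "ไปหามเหสี!"
table = maximal_matching(input_text)
tokens = [input_text[s:e + 1] for s, e in backtrack(table)]
print("|".join(tokens))  # ไป|หา|มเหสี|!
```

Because each step moves eow strictly to the left, the loop runs at most once per character, and there is no need for a separate row_pos counter or for special cases on None and inf neighbours.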