Wednesday, May 8, 2019

Is there a Python library or method to determine the best match?

I am trying to pick the "most likely" value out of 5 records. I think the fuzzywuzzy package might work, but I am wondering whether this package or another one can work even when no search string is supplied.

I tried regular expressions, but I don't think they work for this at all. Then I ran into this fuzzywuzzy code: https://www.datacamp.com/community/tutorials/fuzzy-string-python

from fuzzywuzzy import process

str2Match = "apple inc"
strOptions = ["Apple Inc.","apple park","apple incorporated","iphone"]
Ratios = process.extract(str2Match,strOptions)
print(Ratios)

# You can also select the string with the highest matching percentage
highest = process.extractOne(str2Match,strOptions)

print(highest)


# The results are:
# print(Ratios)
[('Apple Inc.', 100), ('apple incorporated', 90), ('apple park', 67), ('iphone', 30)]

# print(highest)
('Apple Inc.', 100)

The results above make sense because a search string (str2Match) is supplied. What I am wondering is whether the code can automatically work out what the best (highest) value should be, e.g. auto-generate some value once it recognizes that 'Apple Inc' and 'apple incorporated' are similar enough. Thanks.
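
To make the question concrete, here is a minimal sketch of one way "best match without a search string" could be interpreted: score every record against all the other records and keep the one with the highest average similarity. This is only an assumption about what "most likely" should mean, not something fuzzywuzzy does on its own. The helper names `similarity` and `most_representative` are hypothetical, and the standard-library `difflib.SequenceMatcher` stands in for fuzzywuzzy so the snippet runs without extra installs; `fuzz.ratio` from fuzzywuzzy (which returns 0 to 100 rather than 0 to 1) could be swapped in.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity score between two strings, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def most_representative(options):
    """Hypothetical helper: pick the option most similar, on average, to the rest.

    Returns a (best_option, mean_score) tuple. Ties go to the earliest option.
    """
    if not options:
        raise ValueError("options must not be empty")
    if len(options) == 1:
        return options[0], 1.0

    def mean_score(i):
        # Average similarity of options[i] to every other option.
        others = [similarity(options[i], o) for j, o in enumerate(options) if j != i]
        return sum(others) / len(others)

    best = max(range(len(options)), key=mean_score)
    return options[best], mean_score(best)


records = ["Apple Inc.", "apple park", "apple incorporated", "iphone"]
best, score = most_representative(records)
print(best, round(score, 2))
```

Comparing all pairs is quadratic in the number of records, which is fine for 5 records but would need clustering or blocking for large lists. It also returns one of the existing values rather than generating a new canonical one.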
