In data analysis, it is common to work with large datasets that need cleaning and manipulation. One problem that comes up often is filtering rows by approximate (fuzzy) matches rather than exact ones, particularly when the data is text. Pandas, a popular Python library for data manipulation, makes this straightforward. In this article, we look at how to use Pandas to filter rows on fuzzy values, walk through the code step by step, and discuss the libraries and functions that help with this kind of problem.
To tackle this problem, we will use the Pandas library together with the fuzzywuzzy library, which computes how similar two strings are. fuzzywuzzy is based on the Levenshtein distance, a similarity measure defined as the number of single-character edits (insertions, deletions, or substitutions) needed to turn one string into another.
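To make the idea of edit distance concrete, here is a minimal pure-Python sketch of the classic dynamic-programming computation. The function name levenshtein is a hypothetical helper written for illustration; it is not part of fuzzywuzzy, and the library's own scoring may be implemented differently.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits that turn a into b."""
    # previous[j] is the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into an empty string takes i deletions
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute ch_a with ch_b (or keep a match)
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))     # 3
print(levenshtein("t shirt", "tee shirt"))  # 2
```

A smaller distance means the strings are closer. fuzzywuzzy reports similarity as a score from 0 to 100 instead, where a higher score means a closer match.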
Installing and Importing the Required Libraries
To get started, we need to install and import the required libraries. You can use pip to install Pandas and fuzzywuzzy:
pip install pandas
pip install fuzzywuzzy
Once they are installed, import the libraries in your Python code:
import pandas as pd
from fuzzywuzzy import fuzz, process
Filtering Rows Based on Fuzzy Values
Now that we have imported the required libraries, let's create a small made-up dataset and show how to filter its rows on fuzzy values. In this example, the dataset contains garment names and their styles.
data = {
    'Garment': ['T-shirt', 'Polo shirt', 'Jeans', 'Leather jacket', 'Winter coat'],
    'Style': ['Casual', 'Casual', 'Casual', 'Biker', 'Winter'],
}
df = pd.DataFrame(data)
Suppose we want to keep only the rows whose garment name is similar to "Tee shirt". We can use the fuzzywuzzy library to do this.
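search_string = "Tee shirt"
threshold = 70  # minimum similarity score (0-100) a row must reach

def filter_rows(df, column, search_string, threshold):
    # Score every value in the column against the search string
    scores = df[column].apply(lambda x: fuzz.token_sort_ratio(x, search_string))
    # Keep only the rows whose score meets the threshold
    return df[scores >= threshold]

filtered_df = filter_rows(df, 'Garment', search_string, threshold)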
In the code above, we define a function filter_rows that takes four parameters: the DataFrame, the column name, the search string, and the similarity threshold. It returns the DataFrame filtered to the rows whose similarity score meets or exceeds the threshold. The score, an integer from 0 to 100, is computed with the fuzz.token_sort_ratio function from the fuzzywuzzy library.
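To give a feel for what a token-sort comparison does, the sketch below approximates it with the standard library only: lowercase the text, turn punctuation into spaces, sort the words, then compare the resulting strings. The name token_sort_score is a hypothetical helper, and difflib.SequenceMatcher is used here as a stand-in scorer. This is an illustration of the idea, not fuzzywuzzy's actual implementation, and its scores may differ from the library's in some cases.

```python
import re
from difflib import SequenceMatcher


def token_sort_score(a: str, b: str) -> int:
    """Approximate a token-sort similarity score between 0 and 100."""
    def normalize(text: str) -> str:
        # Lowercase, replace runs of non-alphanumeric characters with a space,
        # then sort the words so that word order no longer matters.
        tokens = re.sub(r"[^a-z0-9]+", " ", text.lower()).split()
        return " ".join(sorted(tokens))

    ratio = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
    return round(100 * ratio)


print(token_sort_score("shirt polo", "Polo shirt"))  # 100: same words, different order
print(token_sort_score("T-shirt", "Tee shirt"))      # 88 with this sketch
print(token_sort_score("Jeans", "Tee shirt"))        # well below a threshold of 70
```

Sorting the words first is what makes this kind of score insensitive to word order, which is useful when the same item may be written as "shirt polo" or "Polo shirt".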
Understanding the Code Step by Step
- First, we create a DataFrame called df that holds our dataset.
- Next, we define the search string as "Tee shirt" and set the threshold to 70. You can raise or lower the threshold depending on how close a match you require.
- Then we define a function called filter_rows, which filters the DataFrame by the similarity score between the search string and each row's value in the specified column.
- Finally, we call filter_rows to obtain the filtered DataFrame, filtered_df.
In conclusion, Pandas combined with the fuzzywuzzy library is a practical tool for filtering rows on fuzzy values. Understanding these libraries and their functions lets us handle text data efficiently and tackle more complex data-processing tasks.