Solved: pandas filter rows by fuzzy values

In data analysis, it is common to encounter large datasets that need cleaning and manipulation. One problem that comes up often is filtering rows based on fuzzy (approximate) values, especially when dealing with text data. Pandas, a popular Python library for data manipulation, offers a good foundation for solving this problem. In this article, we will look at how to use Pandas to filter rows by fuzzy values, walk through the code step by step, and discuss the libraries and functions that help with this kind of problem.

To solve this problem, we will use the Pandas library together with the fuzzywuzzy library, which computes the similarity between strings. The fuzzywuzzy library uses the Levenshtein distance, a measure of similarity based on the number of edits (insertions, deletions, or substitutions) required to transform one string into another.
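
To make the idea of edit distance concrete, here is a minimal pure-Python sketch of the Levenshtein distance. The function name levenshtein is chosen for illustration only. This is not fuzzywuzzy's own code, which additionally turns such comparisons into a 0 to 100 similarity score.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn `a` into `b`."""
    # previous[j] is the distance between the prefix of `a` processed
    # so far and the first j characters of `b`.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into "" takes i deletions
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or keep
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

The smaller the distance, the more similar the two strings are. A distance of 0 means they are identical.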

Installing and importing the required libraries

To begin, we need to install and import the necessary libraries. You can use pip to install Pandas and fuzzywuzzy:

pip install pandas
pip install fuzzywuzzy

Once installed, import the libraries in your Python code:

import pandas as pd
from fuzzywuzzy import fuzz, process

Filtering Rows Based on Fuzzy Values

Now that we have imported the required libraries, let's create a made-up dataset to demonstrate how to filter rows based on fuzzy values. In this example, our dataset consists of garment names and their styles.

data = {'Garment': ['T-shirt', 'Polo shirt', 'Jeans', 'Leather jacket', 'Winter coat'],
        'Style': ['Casual', 'Casual', 'Casual', 'Biker', 'Winter']}
df = pd.DataFrame(data)

Suppose we want to keep only the rows whose garment names are similar to "Tee shirt". We can use the fuzzywuzzy library to accomplish this.

search_string = "Tee shirt"
threshold = 70

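def filter_rows(df, column, search_string, threshold):
    # Score each value in the column against search_string (0-100),
    # then keep only the rows whose score is at or above the threshold.
    scores = df[column].apply(lambda x: fuzz.token_sort_ratio(x, search_string))
    return df[scores >= threshold]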

filtered_df = filter_rows(df, 'Garment', search_string, threshold)

In the code above, we define a function filter_rows that takes four parameters: the DataFrame, the column name, the search string, and the similarity threshold. It returns the DataFrame filtered to the rows whose similarity score, computed with the fuzz.token_sort_ratio function from the fuzzywuzzy library, meets or exceeds the given threshold.
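
To give a feel for what a token-sort comparison does, the sketch below approximates the idea using only the standard library. The helper approx_token_sort_ratio is hypothetical and is not fuzzywuzzy's implementation. It assumes the score is a difflib similarity ratio computed after lowercasing, stripping punctuation, and sorting the words, so its numbers may differ from what fuzz.token_sort_ratio returns.

```python
import re
from difflib import SequenceMatcher


def approx_token_sort_ratio(first: str, second: str) -> int:
    """Rough stand-in for a token-sort similarity score (0 to 100)."""
    def normalize(text: str) -> str:
        # Lowercase, replace non-alphanumeric characters with spaces,
        # then sort the tokens so that word order no longer matters.
        tokens = re.sub(r"[^0-9a-z]+", " ", text.lower()).split()
        return " ".join(sorted(tokens))

    ratio = SequenceMatcher(None, normalize(first), normalize(second)).ratio()
    return round(100 * ratio)


garments = ["T-shirt", "Polo shirt", "Jeans", "Leather jacket", "Winter coat"]
matches = [g for g in garments if approx_token_sort_ratio(g, "Tee shirt") >= 70]
print(matches)
```

Because the tokens are sorted before comparison, "shirt tee" and "Tee shirt" are treated as the same string. This makes a token-sort score a reasonable choice when word order varies between records.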

Understanding the Code Step by Step

  • First, we create a DataFrame called df containing our dataset.
  • Next, we define our search string as "Tee shirt" and set a similarity threshold of 70. You can adjust the threshold according to the level of similarity you want.
  • Then we create a function named filter_rows, which filters the DataFrame based on the Levenshtein-based similarity score between the search string and each row's value in the specified column.
  • Finally, we call the filter_rows function to get the filtered DataFrame, filtered_df.
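
Since the right threshold depends on the data, it can help to see which values survive at several cut-offs before settling on one. The sketch below does this for the article's garment names. The similarity function is a stand-in built on the standard library's difflib, not fuzzywuzzy's scorer, and the helper kept_at and the thresholds tried are arbitrary choices for illustration.

```python
from difflib import SequenceMatcher

garments = ["T-shirt", "Polo shirt", "Jeans", "Leather jacket", "Winter coat"]
search_string = "Tee shirt"


def similarity(first: str, second: str) -> int:
    # Stand-in scorer: case-insensitive difflib ratio scaled to 0-100.
    matcher = SequenceMatcher(None, first.lower(), second.lower())
    return round(100 * matcher.ratio())


# Score every garment once, then reuse the scores for each threshold.
scores = {g: similarity(g, search_string) for g in garments}


def kept_at(threshold: int) -> list:
    # Same rule as filter_rows: keep values scoring at or above the threshold.
    return [g for g, score in scores.items() if score >= threshold]


for threshold in (30, 50, 70, 90):
    print(threshold, kept_at(threshold))
```

Raising the threshold can only shrink the result, so a practical approach is to start low, inspect the borderline matches, and increase the value until the false positives disappear.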

In conclusion, Pandas, combined with the fuzzywuzzy library, is a good tool for filtering rows based on fuzzy values. Understanding these libraries and their functions lets us manipulate data effectively and handle complex data-processing tasks.
