Solved: Convert Any .pdf File to Audio (dev.to)

The world of technology moves quickly, and one development that has been drawing attention for some time is converting .pdf files to audio. This can be useful for many purposes, such as study material, accessibility, or simply enjoying a book or paper without needing a screen. In this article, we walk through a Python approach to the problem and explain the steps needed to build a working script that converts your .pdf files to audio. We also discuss some of the key libraries and functions involved. So, let's get started!

Python Solution for Converting PDF Files to Audio

Python offers a wide range of libraries and tools that let developers handle many tasks, including file conversion. One such library is PyPDF2, which lets us extract text from .pdf files. To convert the extracted text to audio, we can use another library called gTTS (Google Text-to-Speech). It calls Google Translate's text-to-speech API to generate an audio file from text, so it needs an internet connection.

Here is a step-by-step walkthrough of the code for converting a .pdf file to an audio file using Python:

  1. First, install the required libraries by running the following command in your terminal or command prompt:
      pip install PyPDF2 gtts
      
  2. Next, import the necessary libraries at the top of your Python script by adding these lines:
      import PyPDF2
      from gtts import gTTS
      
  3. Create a function to extract text from the .pdf file:
      def extract_text_from_pdf(pdf_path):
          # PdfReader replaces PdfFileReader, which was removed in PyPDF2 3.0
          pdf_file = PyPDF2.PdfReader(pdf_path)
          
          # Extract text from each page
          full_text = ""
          for page in pdf_file.pages:
              # Image-only pages may yield no text, so fall back to ""
              text = page.extract_text() or ""
              full_text += text
    
          return full_text
      
  4. Create another function to convert the extracted text to an audio file:
      def text_to_audio(text, output_audio_file):
          # Initialize the gTTS object
          tts = gTTS(text=text, lang='en', slow=False)
          
          # Save the audio file
          tts.save(output_audio_file)
      
  5. Finally, use the functions to convert your chosen .pdf file to audio:
      pdf_file_path = "example.pdf"
      audio_output_file = "output_audio.mp3"
    
      extracted_text = extract_text_from_pdf(pdf_file_path)
      text_to_audio(extracted_text, audio_output_file)
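
Text extracted from a PDF often carries hard line breaks and words hyphenated across lines, which can make the spoken result sound choppy. The following is a minimal sketch, not part of the original script, that tidies the text before handing it to gTTS and stops early when the PDF yields no text. The names `clean_extracted_text` and `pdf_to_audio` are hypothetical helpers introduced here for illustration, and the cleaning rules are assumptions that may need tuning for your documents.

```python
import re


def clean_extracted_text(raw_text):
    """Tidy text pulled from a PDF so it reads more naturally aloud."""
    # Rejoin words split across a line break: "conver-\nsion" -> "conversion".
    # Limitation: a real compound broken at its hyphen is also joined.
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", raw_text)
    # Collapse every remaining run of whitespace (newlines included) to one space.
    text = re.sub(r"\s+", " ", text)
    return text.strip()


def pdf_to_audio(pdf_path, audio_path, lang="en"):
    """Hypothetical wrapper: extract, clean, then synthesize with gTTS."""
    # Imported lazily so the cleaning helper works without these packages.
    import PyPDF2
    from gtts import gTTS

    reader = PyPDF2.PdfReader(pdf_path)
    raw_text = "\n".join(page.extract_text() or "" for page in reader.pages)
    text = clean_extracted_text(raw_text)
    if not text:
        # Scanned or image-only PDFs often contain no extractable text.
        raise ValueError(f"No extractable text found in {pdf_path}")
    gTTS(text=text, lang=lang, slow=False).save(audio_path)
```

Calling `pdf_to_audio("example.pdf", "output_audio.mp3")` would then cover steps 3 to 5 in one call, assuming PyPDF2 and gTTS are installed and a network connection is available.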
      

Now that we have covered the essential steps of our Python script, let's look at some related libraries and functions.

Alternative PDF and Text-Processing Tools in Python

While our example uses PyPDF2 and gTTS, other libraries in the Python ecosystem handle similar tasks:

  • PDFMiner: A library designed for extracting information from PDF files, such as text, images, metadata, and even form data (maintained today as pdfminer.six). It offers more extensive tools for text extraction and manipulation than PyPDF2.
  • Textract: A library that simplifies extracting text from many file types, including PDF and Microsoft Office files. Textract can be a good alternative if you need to extract text from several file formats.
  • pyttsx3: An offline, cross-platform text-to-speech library for Python. While gTTS depends on Google's API, pyttsx3 uses your system's own text-to-speech engine, which allows offline use and keeps your text on your machine.
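
To make the comparison concrete, here is a rough sketch of the same conversion done entirely offline, pairing pdfminer.six for extraction with pyttsx3 for speech. It is one possible combination rather than a recommended setup. The function name `pdf_to_audio_offline` and its `rate` parameter are hypothetical, and the audio format that pyttsx3 actually writes depends on the speech engine installed on your system.

```python
def pdf_to_audio_offline(pdf_path, audio_path, rate=150):
    """Hypothetical helper: convert a PDF to audio without network calls."""
    # Both packages are third-party: pip install pdfminer.six pyttsx3
    from pdfminer.high_level import extract_text
    import pyttsx3

    # pdfminer.six returns the text of the whole document as one string.
    text = extract_text(pdf_path).strip()
    if not text:
        raise ValueError(f"No extractable text found in {pdf_path}")

    engine = pyttsx3.init()
    # Speaking rate in words per minute.
    engine.setProperty("rate", rate)
    # Queue the synthesis, then block until the engine has finished.
    engine.save_to_file(text, audio_path)
    engine.runAndWait()
```

For example, `pdf_to_audio_offline("example.pdf", "output_audio.wav")` would write the spoken text to a local file using the system voice. The `.wav` extension is only an assumption here; check what your platform's engine produces.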

These alternatives may offer additional features and options, depending on your specific needs. Feel free to explore them further and choose the one that best fits your project.

In this article, we presented a Python solution for converting .pdf files to audio, explained the steps needed to build a working script, and discussed various libraries and functions related to our solution. By following these guidelines and understanding the logic behind the code, you can extend your knowledge and adapt this solution to other file formats or different use cases. Happy coding!
