Extracting the value of an attribute from tags using BeautifulSoup in Python

إياد أحمد · 21 November 2021

I have the following page:
https://www.imdb.com/title/tt5648202/
I am trying to extract the values of the class attribute on the main tag in that page. How can I get them?

Ali Haidar Ahmad · 21 November 2021

You can do it as follows:

# Import Beautiful Soup
from bs4 import BeautifulSoup

htmlDoc = '''
<html>
    <h2 class="first second third"> Heading 1 </h2>
    <h1> Heading 2 </h1>
</html>
'''
# Parse the document (the "lxml" parser requires the lxml package)
soup = BeautifulSoup(htmlDoc, "lxml")
# Get the first <h2> tag
tag = soup.h2
# Get the value of the desired attribute
attribute = tag['class']
# Print it
print(attribute)
# ['first', 'second', 'third']
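
The result above is a list because class is a multi-valued attribute in HTML. As a minimal sketch (the id attribute and the variable names are added here for illustration, and "html.parser" is used only to avoid the lxml dependency), this shows how single-valued attributes, the full attribute dictionary, and missing attributes behave:

```python
from bs4 import BeautifulSoup

html_doc = '<h2 id="title" class="first second third">Heading 1</h2><h1>Heading 2</h1>'
soup = BeautifulSoup(html_doc, "html.parser")

h2 = soup.h2
class_list = h2["class"]             # multi-valued attribute: a list of strings
class_string = " ".join(class_list)  # rebuild the original attribute text
tag_id = h2["id"]                    # single-valued attribute: a plain string
all_attrs = h2.attrs                 # dict holding every attribute of the tag

# The <h1> has no class: indexing would raise KeyError, .get() does not
missing = soup.h1.get("class")
missing_default = soup.h1.get("class", [])

print(class_list, class_string, tag_id, missing, missing_default)
```

Using tag.get() is usually the safer choice when you are not sure every tag carries the attribute.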

If you have more than one tag of the same type, use find_all:

# Import Beautiful Soup
from bs4 import BeautifulSoup

htmlDoc = '''
<html>
    <h2 class="v0"> Heading 1 </h2>
    <h2 class="v1"> Heading 2 </h2>
    <h2 class="v2"> Heading 3 </h2>
    <h1> Heading 2 </h1>
</html>
'''
# Parse the document
soup = BeautifulSoup(htmlDoc, "lxml")
# Get every <h2> tag
tags = soup.find_all('h2')
for tag in tags:
    attribute = tag['class']
    print(attribute)
# Output:
# ['v0']
# ['v1']
# ['v2']

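When some of the matching tags lack the attribute, indexing with tag['class'] raises KeyError. The sketch below (sample markup and variable names are illustrative assumptions) shows two ways to handle that: a default value with .get(), or filtering with class_=True so that only tags carrying a class are returned.

```python
from bs4 import BeautifulSoup

html_doc = """
<h2 class="v0">Heading 1</h2>
<h2 class="v1">Heading 2</h2>
<h2>Heading 3 (no class)</h2>
<h1>Heading 4</h1>
"""
soup = BeautifulSoup(html_doc, "html.parser")

# Option 1: keep every <h2>, using an empty list when class is absent
all_classes = [tag.get("class", []) for tag in soup.find_all("h2")]

# Option 2: ask find_all for only the <h2> tags that have a class attribute
with_class = [tag["class"] for tag in soup.find_all("h2", class_=True)]

print(all_classes)
print(with_class)
```
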
So for your example, you can do it as follows:

# Import the modules
from bs4 import BeautifulSoup
import requests

# Set the URL
url = "https://www.imdb.com/title/tt5648202/"
# Send a GET request
page = requests.get(url)
# Parse the page with BeautifulSoup, using the lxml parser
soup = BeautifulSoup(page.content, "lxml")
# Get every tag named main
tags = soup.find_all('main')
# Go through them one by one
for tag in tags:
    # Get the value of the desired attribute (None if the tag has no class)
    attribute = tag.get('class')
    # Print it
    print(attribute)
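
To try the same logic without a network call, it can be wrapped in a small function that takes HTML text. This is only a sketch: main_classes is a hypothetical helper name, and the sample markup is a stand-in, not the real structure or class names of the IMDb page.

```python
from bs4 import BeautifulSoup

def main_classes(html):
    """Return the class list of every <main> tag found in the given HTML."""
    soup = BeautifulSoup(html, "html.parser")
    # .get() with a default keeps <main> tags that have no class attribute
    return [tag.get("class", []) for tag in soup.find_all("main")]

# Stand-in markup (assumption): the real page's class names are not known here
sample_html = (
    '<html><body>'
    '<main class="page-wrapper page-wrapper--base" role="main"><h1>Title</h1></main>'
    '</body></html>'
)

print(main_classes(sample_html))
```

With a live page you would pass page.content from requests.get(url) to the function. Calling page.raise_for_status() first makes HTTP errors visible, and some sites may refuse requests that lack a browser-like User-Agent header.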

Ahmed Sharshar · 26 November 2021

In addition to the approaches above, if your data is in an XML file you can read it and extract attribute values with find_all, as follows:

from bs4 import BeautifulSoup

# Read the whole XML file into a string
with open('conf//test1.xml', 'r') as xmlFile:
    xmlData = xmlFile.read()

# html.parser lowercases tag names, hence the lowercase name in find_all below
xmlSoup = BeautifulSoup(xmlData, 'html.parser')

repElemList = xmlSoup.find_all('repeatingelement')

for repElem in repElemList:
    print("Processing repElem...")
    # .get() returns None when the attribute is missing
    repElemID = repElem.get('id')
    repElemName = repElem.get('name')

    print("Attribute id = %s" % repElemID)
    print("Attribute name = %s" % repElemName)
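
The same idea can be tried without a file. In this sketch the XML content is made up for illustration (the real conf//test1.xml is not shown in the thread); it also demonstrates why the search uses a lowercase tag name when the document is parsed with html.parser.

```python
from bs4 import BeautifulSoup

# Illustrative XML (assumption): element and attribute values are invented
xml_data = """
<root>
  <repeatingElement id="11" name="Joe">first</repeatingElement>
  <repeatingElement id="12" name="Mary">second</repeatingElement>
</root>
"""
xml_soup = BeautifulSoup(xml_data, "html.parser")

# html.parser lowercases tag names, so the lowercase name matches
pairs = [(el.get("id"), el.get("name"))
         for el in xml_soup.find_all("repeatingelement")]

# The original camelCase spelling finds nothing under html.parser
camel_case_hits = xml_soup.find_all("repeatingElement")

print(pairs)
print(len(camel_case_hits))
```

If preserving the original tag-name case matters, parsing with the "xml" parser (which needs the lxml package) is the usual alternative.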

If instead you want the value of an attribute on a specific element, you can also use find_all to fetch the matching elements, as follows:

input_tag = soup.find_all(attrs={"name" : "stainfo"})

Then pick the element you want from the returned results:

output = input_tag[0]['value']

Or use find to fetch a single element and then read its value:

input_tag = soup.find(attrs={"name": "stainfo"})
output = input_tag['value']
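
Note that find returns None when nothing matches, so indexing its result directly can fail. A minimal sketch of a guarded lookup follows; attr_value is a hypothetical helper and the form markup is invented for illustration.

```python
from bs4 import BeautifulSoup

html_doc = '<form><input name="stainfo" value="ok"><input name="other"></form>'
soup = BeautifulSoup(html_doc, "html.parser")

def attr_value(soup, name, attr="value", default=None):
    """Return `attr` of the first tag whose name attribute equals `name`."""
    # attrs={...} is needed because the keyword `name` means the tag name
    tag = soup.find(attrs={"name": name})
    if tag is None:
        return default            # no such element in the document
    return tag.get(attr, default) # element exists but may lack the attribute

found = attr_value(soup, "stainfo")
no_attr = attr_value(soup, "other")
no_tag = attr_value(soup, "missing", default="")

print(found, no_attr, repr(no_tag))
```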
