Selenium with Python: XPath not retrieving expected values for sibling elements

I’m having trouble using XPath expressions in Selenium with Python to retrieve numeric values from sibling elements. The HTML consists of “h3” elements, each optionally followed by a “figure” element that contains a “p” element. Despite trying different XPath expressions, the values returned do not match expectations.

HTML Structure:

<div>
  <h3>Title</h3>
  <figure>
    <p>1</p>
  </figure>
  
  <h3>Title</h3>
  
  <h3>Title</h3>
  <figure>
    <p>3</p>
  </figure>
  
  <h3>Title</h3>
  <figure>
    <p>4</p>
  </figure>
</div>

Test code:

import pytest
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

@pytest.fixture
def driver():
    driver = webdriver.Chrome()
    yield driver
    driver.quit()

def test_extract_numbers_from_elements(driver):
    html_content = """
    <div>
      <h3>Title</h3>
      <figure>
        <p>1</p>
      </figure>
      
      <h3>Title</h3>
      
      <h3>Title</h3>
      <figure>
        <p>3</p>
      </figure>
      
      <h3>Title</h3>
      <figure>
        <p>4</p>
      </figure>
    </div>
    """

    driver.get('data:text/html;charset=utf-8,' + html_content)

    h3_elems = WebDriverWait(driver, 10).until(
        EC.presence_of_all_elements_located((By.XPATH, "//h3"))
    )

    data = []

    for i, ele in enumerate(h3_elems):
        try:
            # Select the <p> inside the first <figure> that follows this <h3> as a sibling
            num = ele.find_element(By.XPATH, "following-sibling::figure[1]/p").text
            
            print(f"Index: {i}, Text: {num}")

            # Check if the text can be converted to a float
            try:
                num = float(num)
                data.append(num)
            except ValueError:
                data.append(None)
        except Exception as e:
            print(f"Index: {i}, Error: {e}")
            data.append(None)

    print("Actual data:", data)

    assert data == [1.0, None, 3.0, 4.0]

I have not found an XPath expression that produces the expected list. A version of the script that uses JavaScript execution works, but I would like to understand why the XPath expressions do not behave as expected.

Any insights or alternative approaches would be greatly appreciated. Thank you!
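
One likely explanation, sketched here under stated assumptions: "following-sibling::figure[1]" selects the nearest later sibling that is a figure, not the element immediately after the h3. For the second h3, which has no figure of its own, that would be the figure belonging to the third h3. An expression such as "following-sibling::*[1][self::figure]/p" instead requires the very next sibling to be a figure; in Selenium, find_element would then raise NoSuchElementException for the second h3, which the existing except block turns into None. The sketch below does not use Selenium. It emulates both selection rules with the standard library, because xml.etree.ElementTree does not support the following-sibling axis. The helper names (first_following_figure, adjacent_figure, extract) are hypothetical and exist only for illustration.

```python
import xml.etree.ElementTree as ET

# Same structure as the HTML in the question, parsed here as XML.
HTML = """
<div>
  <h3>Title</h3>
  <figure><p>1</p></figure>
  <h3>Title</h3>
  <h3>Title</h3>
  <figure><p>3</p></figure>
  <h3>Title</h3>
  <figure><p>4</p></figure>
</div>
"""


def first_following_figure(siblings, index):
    """Emulate following-sibling::figure[1]: the nearest later sibling that is a <figure>."""
    for node in siblings[index + 1:]:
        if node.tag == "figure":
            return node
    return None


def adjacent_figure(siblings, index):
    """Emulate following-sibling::*[1][self::figure]: the very next sibling, only if it is a <figure>."""
    nxt = siblings[index + 1] if index + 1 < len(siblings) else None
    return nxt if nxt is not None and nxt.tag == "figure" else None


def extract(picker):
    """Collect one value per <h3>, using the given sibling-selection rule."""
    siblings = list(ET.fromstring(HTML))
    values = []
    for i, node in enumerate(siblings):
        if node.tag != "h3":
            continue
        figure = picker(siblings, i)
        p = figure.find("p") if figure is not None else None
        values.append(float(p.text) if p is not None else None)
    return values


print(extract(first_following_figure))  # [1.0, 3.0, 3.0, 4.0]
print(extract(adjacent_figure))         # [1.0, None, 3.0, 4.0]
```

If this reading is right, the first rule explains why the second entry comes back as 3.0 rather than None, and the second rule matches the list the assertion in the test expects.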
