Say you have HTML similar to the following:
<div style="background-image: url('https://some.domain/image')"></div>
and you want to extract https://some.domain/image
using XPath. With XPath 2.0, you can select the URL with something like
select-before(select-after(//div/@style, "background-image: url("), ")")
but, when using XPath 1.0, this fails — I think it’s due to nested functions not being supported in XPath 1.0, but I have been unable to find documentation to confirm that. Is there a way to accomplish this using XPath 1.0?
Asking just because I’m curious… why are you using XPath?
Also, is this for a website you control or for someone else’s website?
If you’re rendering the page (in a browser, e2e test runner, spider bot, etc.), have you considered running some JS on the page to get the image? Something like:
const imagePath = document.getElementById('exampleIdOnElement').style.backgroundImage
I’m using a service called FreshRSS that automatically fetches RSS feeds. It has a feature that lets you create custom feeds for sites by scraping the HTML with user-specified XPath expressions.
I know that this isn’t exactly “web development”, but it uses webdev tools, and I wasn’t entirely sure where else to post this.
JS is, unfortunately, not possible here. I can only use XPath expressions.
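For what it’s worth, XPath 1.0 does allow function calls to be nested, and its core function library includes substring-before and substring-after (select-before and select-after are not part of that library). When a node-set such as //div/@style is passed to a string function, XPath 1.0 uses the string value of the first node. So one expression that might work here is
substring-before(substring-after(//div/@style, "url('"), "')")
I have not verified how FreshRSS evaluates expressions, so treat that as an assumption. The Python sketch below only mirrors the semantics of those two XPath 1.0 functions to show what the nested call would return for the example style value; the helper names are hypothetical and not part of any library.

```python
def substring_after(s: str, sep: str) -> str:
    """Mimic XPath 1.0 substring-after: text after the first sep, or "" if absent."""
    if sep == "":
        return s  # XPath 1.0 returns the whole string for an empty separator
    _head, found, tail = s.partition(sep)
    return tail if found else ""


def substring_before(s: str, sep: str) -> str:
    """Mimic XPath 1.0 substring-before: text before the first sep, or "" if absent."""
    if sep == "":
        return ""  # XPath 1.0 returns the empty string for an empty separator
    head, found, _tail = s.partition(sep)
    return head if found else ""


# String value of the style attribute from the example HTML
style = "background-image: url('https://some.domain/image')"

# Same nesting as the XPath expression: strip everything up to url(' and from ') onward
url = substring_before(substring_after(style, "url('"), "')")
print(url)  # https://some.domain/image
```

Matching on "url('" and "')" rather than "url(" and ")" keeps the single quotes out of the result. If the site writes the URL without quotes or with double quotes, the two separator strings would need adjusting.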