How to parse HTML on the backend?

Hi! I’m working on a site that imports content from another site, but that site’s API returns HTML. I need to extract values from that HTML and set them on Enonic objects. Since Enonic has no DOM on the server, I can’t use DOMParser or document.getElementById. How can I read HTML elements on the backend? I tried a library called htmlparser2, but it doesn’t work. Can you help me, please?

I was about to suggest Cheerio, which is described as “jQuery designed specifically for the server”, but it looks like it wraps htmlparser2, which would leave you with the same problem as before :-/


It would be a really ugly solution, but maybe a regex could work for you? The imported HTML can be read as plain text, and a regex should be able to extract specific parts, assuming the markup is consistent enough.

I could also imagine the XSL library being useful, but that requires XHTML. (HTML5 wouldn’t work)
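
To illustrate the regex idea, here is a minimal sketch in plain ES5-style JavaScript, so it does not depend on a DOM or on any library. The helper names (escapeRegExp, innerHtmlById, stripTags) and the sample markup are made up for this example, not part of any Enonic API. It assumes the id attribute is quoted and that the target element does not contain a nested element with the same tag name; markup that breaks those assumptions needs a real parser.

```javascript
// Escape characters that have a special meaning inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: return the inner HTML of the first element whose id
// attribute equals `id`, or null if no such element is found.
// Limitation: the lazy match stops at the first closing tag with the same
// name, so nested elements of the same tag are not handled.
function innerHtmlById(html, id) {
  var pattern = new RegExp(
    '<([a-zA-Z][a-zA-Z0-9]*)\\b[^>]*?\\sid\\s*=\\s*["\']' +
      escapeRegExp(id) +
      '["\'][^>]*>([\\s\\S]*?)<\\/\\1\\s*>',
    'i'
  );
  var match = pattern.exec(html);
  return match ? match[2] : null;
}

// Hypothetical helper: drop any remaining tags and collapse whitespace.
function stripTags(fragment) {
  return fragment.replace(/<[^>]*>/g, '').replace(/\s+/g, ' ').trim();
}

// Usage with made-up markup standing in for the imported HTML.
var sampleHtml =
  '<div class="post"><h1 id="title">Hello <em>world</em></h1>' +
  '<p id="body">Some text</p></div>';

var titleHtml = innerHtmlById(sampleHtml, 'title');
var titleText = titleHtml === null ? null : stripTags(titleHtml);
```

The extracted string can then be set on the content object in the usual way. Returning null for a missing id keeps the "element not found" case explicit, so the import code can decide whether to skip the field or fail.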

RegEx worked well. Thanks!
