You could mean one of two things here:
- Either you are parsing an HTML file from within the main document
- Or you want access between two windows
--1--
For 1, Nicolas already gave you an answer. Additionally, in legacy browsers where DOMParser is not available, you can create a 'dummy' HTML element with an unknown tag:
var mock = document.createElement("mock");
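If the HTML is trusted:
mock.innerHTML = someRawHTML;
If not, strip src attributes and url(...) references first so that nothing gets fetched (a crude filter that only handles double-quoted attributes, not a full sanitizer):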
mock.innerHTML = someRawHTML.replace(/src\s*=\s*"[^"]*\s*"|url\s*\([^)]*\s*\)/gi,"");
You can add href to the regex as well if you want.
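As a rough sketch of the filtering step above, extended to href and to single-quoted or unquoted attribute values: the function name stripResourceRefs is made up for illustration, and this is still only a crude filter, not a real sanitizer (inline event handlers, srcset and so on are left in place).

```javascript
// Hypothetical helper: removes src/href attributes and CSS url(...) references
// from untrusted markup before it is assigned to innerHTML.
// Assumption: a crude regex filter is acceptable; this is not a full sanitizer.
function stripResourceRefs(rawHTML) {
  return rawHTML
    // src=... / href=... with double quotes, single quotes, or no quotes
    .replace(/\b(?:src|href)\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+)/gi, "")
    // url(...) inside inline styles or <style> blocks
    .replace(/url\s*\([^)]*\)/gi, "");
}

var cleaned = stripResourceRefs('<img src="http://example.com/a.png" alt="a">');
// cleaned === '<img  alt="a">'
```

You would then assign the result to the dummy element, e.g. mock.innerHTML = stripResourceRefs(someRawHTML);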
--2--
For 2, you need one of two things. If you opened the page via window.open:
targetWindow = window.open("somepage.com");
targetWindow refers to the new page's window object. From inside that page, window.opener points back to you (the opener). Either side can reach into the other's document only if the origins are the SAME.
Another way is to open the other page within your current page as an iframe.
For this to work, the server of the framed page must not forbid framing. In its htaccess/main Apache configuration or equivalent, this header allows framing by pages of the same origin:
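A minimal sketch of checking whether the other window is actually reachable: browsers throw a SecurityError when you touch the document of a cross-origin window, so the access can be wrapped in try/catch. The helper name getDocumentIfSameOrigin is hypothetical.

```javascript
// Hypothetical helper: returns the other window's document, or null when the
// same-origin policy blocks access (the property access throws in that case).
function getDocumentIfSameOrigin(otherWindow) {
  try {
    return otherWindow.document || null;
  } catch (e) {
    return null;
  }
}
```

In the opener you would call getDocumentIfSameOrigin(targetWindow) once the page has loaded. Inside the opened page, getDocumentIfSameOrigin(window.opener) works the same way in the other direction.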
Header set X-Frame-Options SAMEORIGIN
If the header is not present at all, framing is allowed from all domains by default. Then you do:
var iframe = document.createElement("iframe");
iframe.src = "somepage.com";
document.body.appendChild(iframe);
// once the iframe has loaded, this is the framed page's window object
iframe.contentWindow
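The framed page's contentWindow is only useful once the page has loaded, so a small wrapper can wait for the iframe's load event. This is just a sketch; openInFrame and its parameters are names made up for the example, and reading the framed document still requires the same origin.

```javascript
// Hypothetical helper: appends an iframe to parentDoc and calls onReady with
// its contentWindow once the framed page has finished loading.
function openInFrame(parentDoc, url, onReady) {
  var iframe = parentDoc.createElement("iframe");
  // Register the listener before setting src so the load event is not missed
  iframe.addEventListener("load", function () {
    onReady(iframe.contentWindow);
  });
  iframe.src = url;
  parentDoc.body.appendChild(iframe);
  return iframe;
}
```

In a browser this would be called as openInFrame(document, "somepage.com", function (win) { /* use win.document here */ });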
If the same-origin policy is a problem for you, I suggest you use PHP's file_get_contents or curl, or an equivalent, to first scrape the site on the server and then parse it.