Static analysis
To start, you may use CTRL+U, or right-click and select "View page source".
HTML tags
A webpage often has hidden or disabled HTML tags.
The sample script below fetches every hidden HTML tag, aside from some uninteresting ones.
➡️ Find hidden input fields, hidden content...
Array.from(document.querySelectorAll('*')).filter(x => {
    // These were not displayed to the user in the first place
    if (x.nodeName === "HEAD" || x.nodeName === "META"
        || x.nodeName === "LINK" || x.nodeName === "STYLE"
        || x.nodeName === "SCRIPT" || x.nodeName === "TITLE") return false
    // Attribute: hidden, hidden="", hidden="hidden"
    if (x.hidden === true) return true
    // Computed style (visibility/display/font-size)
    const style = window.getComputedStyle(x)
    if (style.visibility === 'hidden') return true
    if (style.display === 'none') return true
    if (style.fontSize === "0px") return true
    return false
}).map(x => x.outerHTML)
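Disabled form controls are usually still visible, so a filter based on visibility will not report them. Here is a minimal sketch to list them, assuming the `:disabled` selector is enough for your target. `listDisabled` is a hypothetical helper name, and `root` is anything exposing `querySelectorAll`, such as `document`.

```javascript
// Hypothetical helper: list every disabled form control under `root`.
// `:disabled` also matches controls disabled through a parent <fieldset disabled>.
function listDisabled(root) {
    return Array.from(root.querySelectorAll(':disabled'), x => x.outerHTML)
}

// In the browser console:
// listDisabled(document)
```

Removing the `disabled` attribute in the inspector is then often enough to interact with the control.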
Links
To ensure you visited every page, you may want to check every link on every page. You can use this command:
// list every element with a href
document.querySelectorAll('*[href]')
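The raw list usually contains duplicates, anchors, and external links. The sketch below is one way to reduce it to unique same-origin pages. `uniqueInternalLinks` is a hypothetical helper, and dropping the `#fragment` part is an assumption that may not suit single-page applications using hash routing.

```javascript
// Hypothetical helper: turn raw href values into a sorted list of unique,
// same-origin URLs. `hrefs` and `baseUrl` are supplied by the caller.
function uniqueInternalLinks(hrefs, baseUrl) {
    const base = new URL(baseUrl)
    const seen = new Set()
    for (const href of hrefs) {
        let url
        try {
            url = new URL(href, base) // resolve relative links against the page URL
        } catch {
            continue // skip values that cannot be parsed as a URL
        }
        if (url.origin !== base.origin) continue // external, mailto:, javascript:, ...
        url.hash = '' // "#section" anchors point to the same page
        seen.add(url.href)
    }
    return [...seen].sort()
}

// In the browser console:
// uniqueInternalLinks(
//     Array.from(document.querySelectorAll('[href]'), x => x.getAttribute('href')),
//     location.href
// )
```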
Since search engines crawl links the same way, websites may have a sitemap.xml file listing every page of the website, but there is no guarantee that all pages were added to it.
Common URL if present: https://example.com/sitemap.xml
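To compare the sitemap with the pages you already visited, you can pull out its `<loc>` entries. This is a rough sketch, not a real XML parser: `sitemapUrls` is a hypothetical helper, it does not decode entities such as `&amp;`, and it ignores CDATA sections.

```javascript
// Hypothetical helper: extract the content of every <loc> element
// from the text of a sitemap.
function sitemapUrls(xml) {
    return Array.from(xml.matchAll(/<loc>\s*([^<]+?)\s*<\/loc>/g), m => m[1])
}

// In the browser console, on the target website:
// fetch('/sitemap.xml').then(r => r.text()).then(sitemapUrls).then(console.log)
```

A sitemap index file uses `<loc>` too, but its entries point to other sitemaps, which you would then fetch the same way.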
There is also a robots.txt file listing the pages that robots should not crawl/index, which may indicate vulnerable/sensitive pages.
Common URL if present: https://example.com/robots.txt
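The `Disallow` lines are usually the interesting part. Below is a minimal sketch to list them, assuming you do not care which user agent each rule applies to. `disallowedPaths` is a hypothetical helper name.

```javascript
// Hypothetical helper: list the unique paths found in "Disallow:" rules.
function disallowedPaths(robotsTxt) {
    const paths = new Set()
    for (const rawLine of robotsTxt.split(/\r?\n/)) {
        const line = rawLine.split('#')[0].trim() // drop trailing comments
        const match = /^disallow\s*:\s*(.*)$/i.exec(line)
        // an empty "Disallow:" means nothing is disallowed, so skip it
        if (match && match[1] !== '') paths.add(match[1])
    }
    return [...paths]
}

// In the browser console, on the target website:
// fetch('/robots.txt').then(r => r.text()).then(disallowedPaths).then(console.log)
```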
Comments
I'm using this snippet to grab every HTML comment.
document.querySelector('html').innerHTML.replaceAll('\n', ' ').match(/<!--.*?-->/g)
Since newlines are removed, you can use the modified version below if you still need them. It matches across lines directly, so the text inside the comments is left untouched:
document.querySelector('html').innerHTML.match(/<!--[\s\S]*?-->/g)
You may append this to the snippet to remove empty comments:
[...].filter(r => r !== "<!---->")
There are more complex regexes if you want, such as rt/96517.
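If you saved the page source to a string, the same idea works outside of the DOM. This is a sketch under the assumption that a regex is good enough: it will also match comment-like text inside scripts or attribute values. `extractComments` is a hypothetical helper name.

```javascript
// Hypothetical helper: return the trimmed text of every non-empty
// HTML comment found in a string of markup.
function extractComments(html) {
    return (html.match(/<!--[\s\S]*?-->/g) || []) // match() returns null when nothing is found
        .map(c => c.slice(4, -3).trim())          // strip "<!--" and "-->"
        .filter(text => text !== '')
}

// In the browser console:
// extractComments(document.documentElement.outerHTML)
```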
Analyze the JavaScript
If needed, you may add a breakpoint in the JavaScript and use the browser debugger to analyze the code.
Some developers might give their scripts a well-known name, such as jquery.js, to hide them.
➡️ It's hard, so feel free to explore other techniques first.