I have the following questions:
1. What uniquely identifies your test agent? For example, its user agent string or IP address.
2. Can you briefly describe how the agent works, how long it takes, etc.? Please also describe any limitations of the agent or the test methodology.
3. If a URL is not listed in the sitemap.xml file, does your agent parse anchor tags on pages in order to build a larger index of links to traverse?
1. I will have a follow-up for you on this question, but I believe we use PhantomJS to make the page calls. PhantomJS is a headless WebKit-based browser. As I say, I will confirm this answer for you.
2. The length of time is determined by the number of URLs in sitemap.xml, but overall the process does not take very long. Site Scan is unable to access pages that are behind a login page; this is where you would need to use our Chrome Extension and continue the process manually.
3. No, is the simple answer. Site Scan has no knowledge of additional pages linked in anchor tags. As I mentioned above, these could be added through the Chrome Extension.
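The behavior described above (crawling only the URLs listed in sitemap.xml, never following anchor tags) can be illustrated with a minimal sketch. This is not Tealium's implementation; it simply shows why a page absent from the sitemap is never visited:

```python
# Minimal illustrative sketch (not Tealium's actual code): a sitemap-driven
# scan builds its URL list solely from <loc> entries in sitemap.xml, so any
# page linked only via anchor tags is never discovered.
import xml.etree.ElementTree as ET

def urls_from_sitemap(sitemap_xml: str) -> list[str]:
    """Extract the <loc> entries from a sitemap.xml document."""
    ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
    root = ET.fromstring(sitemap_xml)
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", ns)]

# Hypothetical sitemap for demonstration purposes only.
SITEMAP = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>"""

print(urls_from_sitemap(SITEMAP))
# → ['https://example.com/', 'https://example.com/about']
```

A page such as https://example.com/hidden, linked only from an anchor tag on the homepage, would not appear in this list and so would not be scanned.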
If you have any further questions please contact your account manager who will be happy to help.
The user agent that the bot/crawler uses is 'Tealium-Sitemap-Audit/2.0.0', and the IP address is 18.104.22.168. For some parts of the Site Scan it reverts to a generic user agent, so using the IP address is the best approach to correctly identify the crawler in the meantime.
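Given that the user agent is not always reliable, a server-side check might combine both signals. A hedged sketch (the function name and the check-both-signals approach are illustrative, not Tealium's guidance; the IP and UA values are the ones quoted above):

```python
# Illustrative sketch: classify an incoming request as the Site Scan crawler.
# Because some requests arrive with a generic user agent, the source IP
# (18.104.22.168, per the answer above) is the more reliable signal; the
# user-agent prefix is checked as a fallback.
CRAWLER_IP = "18.104.22.168"
CRAWLER_UA_PREFIX = "Tealium-Sitemap-Audit/"

def is_site_scan(remote_ip: str, user_agent: str) -> bool:
    """Return True if the request appears to come from the Site Scan crawler."""
    return remote_ip == CRAWLER_IP or user_agent.startswith(CRAWLER_UA_PREFIX)

print(is_site_scan("18.104.22.168", "Mozilla/5.0"))               # → True (IP match)
print(is_site_scan("203.0.113.5", "Tealium-Sitemap-Audit/2.0.0")) # → True (UA match)
print(is_site_scan("203.0.113.5", "Mozilla/5.0"))                 # → False
```

The same logic could be expressed as a web-server access rule or analytics filter; matching on IP first mirrors the advice above.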