It’s quite easy to use PhantomJS to muck around with pages, especially if you use CasperJS, which gives you a somewhat nicer API.
It’s also pretty easy to take screenshots, but there are a few things you need to take into account if you want the results to be accurate…
Image loading
If the page contains images (a very common thing you might find!), you need to take some steps to ensure the images are fully loaded before taking a screenshot.
Depending on the circumstances, you might be able to get away with simply using an onload listener. However, when that doesn’t suffice, you have to resort to something slightly more complex.
There are two ways to detect whether all images have been loaded:
- Check for img.complete
- Use the img.onload event
You may be able to get away with a simple waitFor condition:
    casper.waitFor(function() {
        return this.evaluate(function() {
            var images = document.getElementsByTagName('img');
            return Array.prototype.every.call(images, function(i) {
                return i.complete;
            });
        });
    }, doSomething);
The above will wait until all images appear to have been successfully loaded, but depending on the page in question, a different, slightly more complex approach may work better:
    casper.evaluate(function() {
        var images = document.getElementsByTagName('img');
        // Only images that have not finished yet need a listener.
        images = Array.prototype.filter.call(images, function(i) {
            return !i.complete;
        });
        window.imagesNotLoaded = images.length;
        Array.prototype.forEach.call(images, function(i) {
            // Count failures too, otherwise a broken image stalls the wait.
            i.onload = i.onerror = function() {
                window.imagesNotLoaded--;
            };
        });
    });

    casper.waitFor(function() {
        return this.evaluate(function() {
            return window.imagesNotLoaded === 0;
        });
    }, doSomething);
This approach reduces the number of times your code calls getElementsByTagName, which can be useful on very image-heavy sites.
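One thing to keep in mind with either approach: img.complete also becomes true for an image that failed to load, so "complete" does not always mean "visible". The sketch below separates the two cases by also looking at naturalWidth, which stays at 0 when no usable image data arrived. The helper name summarizeImages and its optional doc parameter are my own inventions for illustration, and the wiring at the bottom assumes casper and doSomething exist as in the snippets above.

```javascript
// Hypothetical helper: counts pending, loaded and broken images.
// `doc` exists only so the function can be exercised outside a browser;
// inside casper.evaluate it falls back to the page's own `document`.
function summarizeImages(doc) {
    var images = (doc || document).getElementsByTagName('img');
    var summary = { pending: 0, loaded: 0, broken: 0 };
    Array.prototype.forEach.call(images, function(img) {
        if (!img.complete) {
            summary.pending++;      // still loading
        } else if (img.naturalWidth > 0) {
            summary.loaded++;       // finished with usable image data
        } else {
            summary.broken++;       // finished, but nothing to show
        }
    });
    return summary;
}

// Wiring sketch: only runs when a CasperJS instance named `casper` exists.
if (typeof casper !== 'undefined') {
    casper.waitFor(function() {
        return this.evaluate(summarizeImages).pending === 0;
    }, function() {
        var summary = this.evaluate(summarizeImages);
        if (summary.broken > 0) {
            this.echo(summary.broken + ' image(s) did not load');
        }
        doSomething.call(this);
    });
}
```

Note that an img element without a src also ends up in the broken bucket here, so treat the count as a hint rather than a verdict.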
CSS animations
If the page in question has CSS animations and you wish to skip them, for example to make sure the page is at the end of its animations when you take a screenshot, you just need to inject a little bit of CSS into the document:
    casper.evaluate(function() {
        var style = document.createElement('style');
        style.innerHTML = '* { -webkit-animation-delay: 0.01s !important; -webkit-animation-duration: 0.01s !important; }';
        document.body.appendChild(style);
    });
Unless the CSS on the page has been written poorly, the above CSS snippet will make sure any animations are finished almost instantly.
You may be tempted to simply put 0s instead of 0.01s, but I found a plain zero is unreliable and would sometimes screw up the animations entirely.
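If you want to experiment with the value, it may be handier to build the rule from a parameter and pass it into the page as an argument. The following is only a sketch: buildAnimationOverride is a hypothetical helper, and adding the unprefixed animation-delay and animation-duration properties next to the -webkit- ones is my own addition for engines that do not need the prefix.

```javascript
// Hypothetical helper: builds the override rule for a duration in seconds.
function buildAnimationOverride(seconds) {
    var value = seconds + 's !important;';
    return '* { ' + [
        '-webkit-animation-delay: ' + value,
        '-webkit-animation-duration: ' + value,
        'animation-delay: ' + value,
        'animation-duration: ' + value
    ].join(' ') + ' }';
}

// Wiring sketch: only runs when a CasperJS instance named `casper` exists.
// The CSS string is passed to the page as an argument of evaluate.
if (typeof casper !== 'undefined') {
    casper.evaluate(function(css) {
        var style = document.createElement('style');
        style.innerHTML = css;
        document.body.appendChild(style);
    }, buildAnimationOverride(0.01));
}
```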
Dealing with dynamic elements
If you want to take screenshots of specific elements instead of the entire page, you need to provide Casper with a selector.
But what if the element has been generated dynamically, or you just don’t know its selector ahead of time?
The solution is simple: you can probably find a way to query for the element using querySelectorAll, so you can write something like the following:
    casper.evaluate(function() {
        var els = document.querySelectorAll('.some-selector > div');
        var ids = 0;
        Array.prototype.forEach.call(els, function(el) {
            el.id = 'foo' + ids++;
        });
    });
And there you go: each element can now be selected with an ID in the format you assigned to it (#foo0, #foo1, and so on). If you don’t know the number of elements, you could return the number from the evaluate call and use that.
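Returning the count could look roughly like this. It is a sketch under a few assumptions: tagElements and idSelectors are hypothetical helpers, the optional doc parameter is only there for testing outside a browser, and the element-N.png file names are made up. Like the snippet above, it overwrites any id the elements already had.

```javascript
// Hypothetical helper: gives every match an id of the form prefix + index
// and returns how many elements were tagged.
function tagElements(selector, prefix, doc) {
    var els = (doc || document).querySelectorAll(selector);
    Array.prototype.forEach.call(els, function(el, index) {
        el.id = prefix + index;
    });
    return els.length;
}

// Hypothetical helper: turns a count into the matching id selectors.
function idSelectors(prefix, count) {
    var selectors = [];
    for (var i = 0; i < count; i++) {
        selectors.push('#' + prefix + i);
    }
    return selectors;
}

// Wiring sketch: only runs when a CasperJS instance named `casper` exists.
if (typeof casper !== 'undefined') {
    casper.then(function() {
        var count = this.evaluate(tagElements, '.some-selector > div', 'foo');
        idSelectors('foo', count).forEach(function(selector, i) {
            // One screenshot per tagged element.
            this.captureSelector('element-' + i + '.png', selector);
        }, this);
    });
}
```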
In closing
I ran into these three cases while implementing a tool to automatically do screengrabs of some pages. I hope this helps you avoid some of the pitfalls!