Alright, so you’ve written an insightful article and are ready to publish it on your blog. Now, you’re wondering whether you should publish the article directly in PDF format or take the time to convert it to an HTML-formatted blog post.
I’ve been there. I had an article and wasn’t sure which format to publish it in. So I decided to conduct some research.
What follows below is 1300+ words and a couple of screenshots on why you should publish as HTML. If you’re the trusting sort, skip the reading and just go ahead and publish your case study on an HTML page. Be assured that the click-through rate (CTR) on the search engine result page (SERP) is higher for HTML-formatted case studies than for PDF-formatted ones.
If you would like to know a bit more, stay with me.
Table of contents
- The backstory, aka the two voices in my head
- Study: Comparing the click-through rate of PDF and HTML format on SERP
- The result: CTR for HTML version is higher
- “Hate PDF” and other real-life quotes on why users prefer HTML content over PDF
- Bonus chapter: Is publishing case studies as both PDF and HTML format considered as duplicate content?
- What to consider when publishing the same content both as PDF and HTML
The backstory, aka the two voices in my head
After completing a UX Writing assignment last year, I had a ready-to-publish case study sitting in my Dropbox. I had put considerable thought into this piece, and I was ready to upload the case study as a PDF file on my website.
But then, a sneaky voice in my head suggested, “Well, but if you publish the content as a PDF, you can’t track whether you helped even one user”. True. I would have no insight into the resulting user behaviour except the number of PDF downloads. I wouldn’t know if the user spent one second or five minutes on my file.
A second voice in my head countered, “But it is common practice to publish case studies in PDF. The user expects case studies in PDF. For sure, the click-through rate on PDFs is higher.”
I had no answer to my doubt, and a quick Google search didn’t turn one up either. So, I decided to run a small survey.
Study: Comparing the click-through rate of PDF and HTML format on SERP
I recruited 88 users across Europe and divided them into two cohorts.
Why did I choose 88 users, you may wonder? The Nielsen Norman Group recommends 40 participants as a minimum sample size for UX research. As I wanted to examine two cohorts, I doubled the number so that I would survey at least 80 participants.
The first cohort saw the search snippet below.
I asked the participants, “If you were looking for a UX Writing case study, which result would you rather click?”
- 45% would click on the PDF case study
- 55% would click on the HTML case study
So far, so good. But, as the position of the search result has a strong impact on the click-through rate, I also needed to test the same snippet with the inverse sequence.
I showed the search snippet below to the second cohort.
It is, again, a search snippet for the query “ux writing case study”, but the sequence of the search entries is inverted. The search result snippet shows the HTML case study ranking at position 1, and the PDF case study ranking at position 2.
I also asked the second cohort: “If you were looking for a UX Writing case study, which result would you rather click?”
- 48% would click on the PDF case study
- 52% would click on the HTML case study
The result: CTR for HTML version is higher
Averaging the results of the two cohorts gives the following overall preference:
- 46.5% would click on the PDF case study
- 53.5% would click on the HTML case study
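If you’d like to retrace the arithmetic, here is a minimal Python sketch of the averaging step. It assumes the two cohorts were the same size (the split isn’t spelled out above), so a plain unweighted mean applies. The names `cohort_1`, `cohort_2`, and `average_preference` are made up for this illustration.

```python
# Percentages reported for each cohort in the survey above.
# Assumption: cohort 1 saw the PDF result first, since cohort 2 saw the inverted order.
cohort_1 = {"PDF": 45.0, "HTML": 55.0}
cohort_2 = {"PDF": 48.0, "HTML": 52.0}


def average_preference(*cohorts):
    """Return the unweighted mean share per format.

    Assumption: all cohorts are equally sized, so each one counts the same.
    """
    formats = cohorts[0].keys()
    return {fmt: sum(c[fmt] for c in cohorts) / len(cohorts) for fmt in formats}


overall = average_preference(cohort_1, cohort_2)
print(overall)  # {'PDF': 46.5, 'HTML': 53.5}
```

If the cohorts had differed in size, a weighted mean using the participant count per cohort would be the more accurate choice.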
If you’re more the visual type:
To summarise, in this survey the HTML version of the case study was more likely to be clicked than the PDF version, whichever position it held on the results page. A slim majority of participants (53.5%) preferred to consume the content on an HTML page.
Off I went and published my case study about UX writing on an HTML site.
“Hate PDF” and other real-life quotes on why users prefer HTML content over PDF
I asked the survey participants why they preferred HTML over PDF content. Here are some literal quotes from the study:
- “It seems to be a page in the site and not a PDF document. I prefer to read case studies on a responsive page rather than on a PDF”.
- “I don’t want to download a PDF when doing Google research unless it is necessary”.
- “I prefer a web link so that I can translate directly into my mother tongue with the Google extension. With a PDF, I cannot”.
- “Hate PDF”.
- “The second result is a PDF, and I only click on them if absolutely necessary and if I feel I can trust the source”.
- “I do not like reading PDFs on Google”.
Having screened all 50+ comments, I see the following pattern:
- User experience: an HTML site has a responsive design.
- Convenience: there is no need to download a file, and browser add-ons work perfectly fine.
- Trust: an HTML site is perceived as more secure than a PDF.
Bonus chapter: Is publishing case studies as both PDF and HTML format considered as duplicate content?
Now, you’ve learned that publishing your case study in HTML leads to more traffic.
What if you decide to provide the case study as a PDF as well? Kind of killing two birds with one stone.
Would this be considered duplicate content and lead to a penalty? The short answer is, “No”.
I asked none other than John Müller himself during one of his SEO office hours. Here’s what he said:
“We wouldn’t see it as duplicate content. It’s different content. One is an HTML page, and one is a PDF, even if the primary piece of content is the same. The whole thing around it is different. From that level, we wouldn’t see it as duplicate content.”
As it is not considered duplicate content, you won’t be penalised.
What to consider when publishing the same content both as PDF and HTML
If you publish both HTML and PDF formats with the same content, they will both be indexed. John Müller confirms my hypothesis that they could compete against each other. He adds that HTML sites tend to outperform PDFs:
“For the most part, PDFs will probably be less visible because they’re less tied in with the rest of your website. In the internal linking, you’ll link to the web pages and then from one of those web pages you’ll link to the PDF so they’ll be a little bit kind of ‘deemphasised’.”
Hear the full dialogue between John Müller and me:
That’s about it for today.
What’s your experience with comparing click-through rates between PDF and HTML formats? I’m excited to hear about your findings.