How Are PDF Files Performing on Google? An SEO Case Study Comparing Click-Through Rate for PDF and HTML Files on SERP.

Last updated on August 25th, 2023.

Alright, so you’ve written an insightful article and are ready to publish it on your blog. Now, you’re wondering whether you should publish it directly in PDF format or take the time to convert it to an HTML-formatted blog post.

I’ve been there. I had an article and wasn’t sure which format to publish it in. So I decided to conduct some research.

What follows is 1300+ words and a couple of screenshots on why you should publish as HTML. If you’re the trusting sort, skip the reading and just go ahead and publish your case study on an HTML page. Be assured that the click-through rate (CTR) on the search engine results page (SERP) is higher for HTML-formatted case studies than for PDF-formatted ones.

If you would like to know a bit more, stay with me.

The backstory, aka the two voices in my head

After completing a UX Writing assignment last year, I had a ready-to-publish case study sitting in my Dropbox. I had put considerable thought into this piece, and I was ready to upload the case study as a PDF file on my website.

But then, a sneaky voice in my head suggested, “Well, but if you publish the content as a PDF, you can’t track whether you helped even one user”. True. I would have no insight into the resulting user behaviour except the number of PDF downloads. I wouldn’t know if the user spent one second or five minutes on my file.

A second voice in my head countered, “But it is common practice to publish case studies in PDF. The user expects case studies in PDF. For sure, the click-through rate on PDFs is higher.”

I had no answer, and neither did a quick Google search. So, I decided to run a small survey.

Study: Comparing the click-through rate of PDF and HTML format on SERP

I recruited 88 users across Europe and divided them into two cohorts.

Why did I choose 88 users, you may wonder? The Nielsen Norman Group recommends 40 participants as a minimum sample size for quantitative UX research. As I wanted to examine two cohorts, I doubled the number to survey at least 80 participants.

The first cohort saw the search snippet below.

[Screenshot: search results page for the query “ux writing case study”]
Search snippet for the first cohort. The first search result entry is the PDF version of the case study.

I asked the participants, “If you were looking for a UX Writing case study, which result would you rather click?”

  • 45% would click on the PDF case study
  • 55% would click on the HTML case study

So far, so good. But, as the position of the search result has a strong impact on the click-through rate, I also needed to test the same snippet with the inverse sequence.

I showed the search snippet below to the second cohort.

[Screenshot: search results page for the query “ux writing case study”]
Search snippet for the second cohort. The first search result entry is the HTML version of the case study.

It is, again, a search snippet for the query “ux writing case study”, but the sequence of the search entries is inverted. The search result snippet shows the HTML case study ranking at position 1, and the PDF case study ranking at position 2.

I also asked the second cohort: “If you were looking for a UX Writing case study, which result would you rather click?”

  • 48% would click on the PDF case study
  • 52% would click on the HTML case study

The result: CTR for HTML version is higher

Averaging the results of the two cohorts gives the following overall preference:

  • 46.5% would click on the PDF case study
  • 53.5% would click on the HTML case study
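
The averaging step can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the tooling used for the survey. The variable and function names are made up for this example, and the unweighted mean assumes both cohorts were the same size, which the numbers above do not state explicitly.

```python
# Survey results from the two cohorts, in percent (taken from the text above).
cohort_pdf_first = {"pdf": 45, "html": 55}   # PDF result shown at position 1
cohort_html_first = {"pdf": 48, "html": 52}  # HTML result shown at position 1


def average_preference(*cohorts):
    """Return the unweighted mean per format.

    Assumption: all cohorts have the same number of participants,
    so a simple mean of the percentages is appropriate.
    """
    formats = cohorts[0].keys()
    return {f: sum(c[f] for c in cohorts) / len(cohorts) for f in formats}


overall = average_preference(cohort_pdf_first, cohort_html_first)
print(overall)  # {'pdf': 46.5, 'html': 53.5}
```

If the cohorts had differed in size, a mean weighted by the number of participants per cohort would be the safer choice.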

To summarise, the HTML version of the case study has a higher probability of being clicked than the PDF version. A slim majority of the surveyed users prefer to consume content on an HTML site.

Off I went and published my case study about UX writing on an HTML site.

“Hate PDF” and other real-life quotes on why users prefer HTML content over PDF

I asked the survey participants why they preferred HTML over PDF content. Here are some literal quotes from the study:

  • “It seems to be a page in the site and not a PDF document. I prefer to read case studies on a responsive page rather than on a PDF”.
  • “I don’t want to download a PDF when doing Google research unless it is necessary”.
  • “I prefer a web link so that I can translate directly into my mother tongue with the Google extension. With a PDF, I cannot”.
  • “Hate PDF”.
  • “The second result is a PDF, and I only click on them if absolutely necessary and if I feel I can trust the source”.
  • “I do not like reading PDFs on Google”.

Having screened all 50+ comments, I see the following patterns:

  • User experience: an HTML site has a responsive design.
  • Convenience: no need to download a file, and browser add-ons work perfectly fine.
  • Trust: an HTML site is perceived as more secure than PDF.

Bonus chapter: Is publishing a case study in both PDF and HTML format considered duplicate content?

Now you’ve learned that publishing your case study in HTML is likely to earn more clicks on the SERP.

What if you decide to provide the case study as a PDF as well? Kind of killing two birds with one stone.

Would this be considered duplicate content and lead to a penalty? The short answer is, “No”.

I asked none other than John Müller himself during one of his SEO office hours. Here’s what he said:

“We wouldn’t see it as duplicate content. It’s different content. One is an HTML page, and one is a PDF, even if the primary piece of content is the same. The whole thing around it is different. From that level, we wouldn’t see it as duplicate content.”

As it is not considered duplicate content, you won’t be penalised.

What to consider when publishing the same content both as PDF and HTML

If you publish both HTML and PDF formats with the same content, they will both be indexed. John Müller confirms my hypothesis that they could compete against each other. He adds that HTML sites tend to outperform PDFs:

“For the most part, PDFs will probably be less visible because they’re less tied in with the rest of your website. In the internal linking, you’ll link to the web pages and then from one of those web pages you’ll link to the PDF so they’ll be a little bit kind of ‘deemphasised’.”

That’s about it for today.

What’s your experience on comparing click-through rates between PDF and HTML formats? I’m excited to hear about your findings.

Corina Burri

Corina is an SEO professional from Zürich. She has worked in SEO since 2016 and has contributed to publications such as SEOFOMO, Tech SEO Tips, and iPullRank. When not grinding, she enjoys exploring Switzerland with her family.
