GIF vs. WebM on 4chan – Part 1: The Data

One year ago I conducted a study about GIF and WebM usage on 4chan and presented the first results at 32C3 (see here). After that, I continued to collect data for 8 more months, until August 2015*.

Now I am finally publishing the findings. As this is a spare-time project, I hope you understand that it took some time to finish. It is divided into three separate articles to make it more accessible:

PART 1: The Data (this article)

PART 2: Quantitative Analysis & Method

PART 3: Qualitative Analysis (will follow)

Okay, before you scroll down for all the shiny graphs and tables, take a second for some basic information about the study:

What’s the idea?
The hypothesis was that the 4chan /gif board, as it contains material that is considered “not suitable for work”, is more active in posting images than the “worksafe” /wsg board. I also assumed that /gif would use a higher share of WebM files relative to GIFs than /wsg, because WebMs provide better image quality at a smaller file size, which, as I thought, might be highly appreciated for sharing (mostly) pornographic content.

*Why this time period?
I wanted to inspect the changes since April 2014, when WebM could be used on 4chan for the first time. A second paradigm shift came at the end of January 2015, when WebMs on 4chan were allowed to have sound. To judge a change, it helps to look at older data, too. That’s why I included data going back to 2012 (at least for the /wsg board). The end of the study in August 2015 is due to the circumstances explained in the next paragraph…

Why are there holes in the dataset?
I extracted the data from 4chan archives, because 4chan itself deletes threads after a while. But so far, every single one of the archives I worked with eventually went offline. As a result, I wasn’t able to find a data source for the /gif board for the time before fall 2014, and there is also a gap in the data in spring 2015. In the meantime I have found a new archive covering the time after August 2015, but I will take care of that later.

How was the data extracted?
Using a Python script that read it from the HTML of the threads on the /gif and /wsg boards (in the archives, not on 4chan itself). More details will follow in part 2.
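
To give a rough idea of what such a script can look like, here is a minimal sketch. It is not the script used for the study (that one is described in part 2). It assumes that an archived thread page links every posted file through an ordinary `<a href="…">` tag whose URL ends in `.gif` or `.webm`; the real archives may be structured differently. The names `FileLinkCounter` and `count_animations` are made up for this example.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urlparse


class FileLinkCounter(HTMLParser):
    """Counts distinct links to .gif and .webm files in one thread page.

    Assumption: each posted file appears as an <a href="..."> link.
    """

    def __init__(self):
        super().__init__()
        self.counts = Counter()
        self._seen = set()  # archives often link the same file twice

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href or href in self._seen:
            return
        # Look only at the path, so query strings do not hide the extension.
        path = urlparse(href).path.lower()
        for ext in ("gif", "webm"):
            if path.endswith("." + ext):
                self._seen.add(href)
                self.counts[ext] += 1


def count_animations(html_text):
    """Return a Counter like {'gif': n, 'webm': m} for one thread's HTML."""
    parser = FileLinkCounter()
    parser.feed(html_text)
    return parser.counts
```

Calling `count_animations()` on the HTML of every archived thread and summing the results per day or month would give counts of the kind shown in the tables and graphs below.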

Ok, ready?

Click to enlarge the images.

First, there are two tables showing the monthly number of image files on the /wsg and /gif boards. Note the gaps in the data, as explained above. It becomes very obvious which board is more populated by animations.

absolute amount of images on /wsg

absolute amount of images on /gif

The next two graphs represent the same dataset. This time they show not the absolute numbers, but the average number of files per day in each month.
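
The per-day average is simply the monthly total divided by a number of days. A small sketch of that arithmetic follows; the function name `average_per_day` is made up for this example. By default it divides by the calendar length of the month. For months with gaps in the archive one could instead divide by the number of days that actually have data; which of the two was used for the graphs is not stated here, so the optional parameter is an assumption.

```python
import calendar


def average_per_day(year, month, monthly_total, days_with_data=None):
    """Average number of files per day for one month.

    days_with_data is optional (an assumption of this sketch): pass it to
    divide by the days actually covered instead of the calendar length.
    """
    if days_with_data is None:
        # monthrange() returns (weekday of the 1st, number of days in month)
        days = calendar.monthrange(year, month)[1]
    else:
        days = days_with_data
    if days <= 0:
        raise ValueError("number of days must be positive")
    return monthly_total / days
```

For example, a made-up total of 2800 files in February 2015 gives `average_per_day(2015, 2, 2800)`, which is 100.0 files per day.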

EDIT: The next two images have been updated. The earlier versions were outdated and incorrect. Sorry!



The next graphs show the data from December to August on a daily basis. This way, the details of the development become visible, for example the huge peak at the end of January or some low points during 4chan’s server downtime. More details on that will follow in part 3.
You can see two versions of each graph. The first ones show an overview and the other (flat and greyscale) ones are thumbnails of very large graphs that show the exact data for each day in that time period. Click them to see all the details.

/wsg from January to August

/gif from January to August



A first summary:

  • Users on /gif share more animations than those on /wsg.
  • On /gif WebM files quickly began to outnumber GIF files, while on /wsg this trend started later and the difference increases more slowly.
  • The total number of animations grows slowly but steadily on both boards.

To be continued :)



