ICYMI (ADN) bot analysis - January 2024

Richard Martin-Nielsen · April 3, 2024

In January the bot was moved from one instance (server) to another, principally because, on the older instance, it was no longer seeing toots with relevant hashtags from accounts it did not follow. There is more detail about this in my initial discussion of what I thought was wrong and my plan of how to try to fix it.

This went relatively smoothly, but it is now April and I’m only just polishing off the analysis for January. Why? I’ve been busy with other things, and my priority was getting the bot onto its new instance. There were also some nagging problems with my analysis code which I didn’t find the time to fix to my satisfaction. So I’m going to run the code I have, and probably try applying it to February and March as well. I also still want to have a look at the content of what is being amplified, but that will take some new code, and I may leave that until I have the time.

Scores and time

The bot assigns a score to each toot it sees, then boosts the toots which get the highest score. This can mean that when there aren’t many toots, toots with lower scores might get boosted which otherwise wouldn’t be amplified.
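
As a rough illustration of why a quiet day lets lower-scoring toots through, here is a minimal Python sketch of "boost the top few by score". It is not the bot's actual code: the function name `pick_boosts`, the `limit` parameter and the sample scores are all assumptions made up for the example.

```python
import heapq

def pick_boosts(scored_toots, limit):
    """Return up to `limit` toots with the highest scores, best first.

    `scored_toots` is a list of dicts with a "score" key (a hypothetical
    shape chosen for this sketch, not the bot's real data structure).
    """
    return heapq.nlargest(limit, scored_toots, key=lambda t: t["score"])

# Invented scores: a busy day with many candidates and a quiet day with two.
busy_day = [{"id": n, "score": s} for n, s in enumerate([9, 7, 7, 5, 3, 2, 1])]
quiet_day = [{"id": n, "score": s} for n, s in enumerate([3, 1])]

busy_picks = pick_boosts(busy_day, limit=3)
quiet_picks = pick_boosts(quiet_day, limit=3)

print([t["score"] for t in busy_picks])
print([t["score"] for t in quiet_picks])
```

With the same limit, the toot scoring 5 is left out on the busy day, while the toot scoring 1 is boosted on the quiet day simply because nothing better was available.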

Accounts boosted

Anyone who follows the bot will see that it tends to boost some accounts a lot. This isn’t very surprising since (a) it doesn’t follow many accounts and (b) even those accounts that do post about ADN topics don’t all consistently use the hashtags which the bot listens for.

Still, a histogram shows that bojacobs, nknews and CNDuk are most likely to score well and be boosted.

Scores of toots from regularly boosted accounts

Looking at the scores of toots from the most frequently boosted accounts, there is some variation. The algorithm is very simple and rewards use of hashtags and pays attention to likes and boosts from other users.
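
To make "rewards use of hashtags and pays attention to likes and boosts" concrete, here is a hedged Python sketch of one way such a score could be computed. The weights, the function name `score_toot` and the tracked tags are invented for illustration and are not the bot's real values. The `favourites_count` and `reblogs_count` keys follow the names used by Mastodon's status entity, while `tags` is simplified here to a plain list of strings.

```python
def score_toot(toot, tracked_tags, tag_weight=2.0, fav_weight=1.0, boost_weight=1.5):
    """Weighted sum of tracked hashtags, favourites and boosts.

    All three weights are assumptions for this sketch.
    """
    tags = {t.lower() for t in toot["tags"]}
    tag_hits = len(tags & tracked_tags)  # how many tracked hashtags the toot uses
    return (tag_weight * tag_hits
            + fav_weight * toot["favourites_count"]
            + boost_weight * toot["reblogs_count"])

# Hypothetical tracked hashtags and a made-up toot.
tracked = {"nuclear", "armscontrol"}
example_toot = {"tags": ["Nuclear", "News"], "favourites_count": 3, "reblogs_count": 2}

example_score = score_toot(example_toot, tracked)
print(example_score)
```

A scheme like this would explain some of the variation between accounts: an account that tags consistently and has an engaged following collects points on every term.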

Scoring by day of week

It’s not very interesting to look at the distribution of boosted toots by day of the week alone, but if we break all the boosted toots into four quartile boxes from lowest to highest quartile by score, and then look at which quartiles show up across which weekdays, there’s a bit more information. These charts are made using Bob Rudis’ (@hrbrmstr@mastodon.social) waffle library.

This chart isn’t just colourful: it shows that Wednesday, Thursday and Friday have roughly comparable numbers of boosts. This time, quality is relatively better on Wednesday and on Thursday.

Wednesday, Thursday and Friday have roughly comparable numbers of boosts. Quality is better relatively on Thursday and on Friday.
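
The quartile-by-weekday breakdown behind a chart like this can be sketched as follows. The charts themselves were made in R with the waffle library, so this Python version only illustrates the counting step. The function name, the dates and the scores are invented sample data, and the quartile cut points use Python's default "exclusive" method, which may differ slightly from what the original analysis used.

```python
from bisect import bisect_left
from collections import Counter
from datetime import date
from statistics import quantiles

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

def quartile_by_weekday(boosts):
    """Count boosts per (weekday name, score quartile 1..4).

    `boosts` is a list of (date, score) pairs; needs at least two entries.
    """
    cuts = quantiles([score for _, score in boosts], n=4)  # three cut points
    counts = Counter()
    for day, score in boosts:
        quartile = bisect_left(cuts, score) + 1  # 1 = lowest, 4 = highest
        counts[(WEEKDAYS[day.weekday()], quartile)] += 1
    return counts

# Invented sample data for January 2024 (3 Jan 2024 was a Wednesday).
sample = [
    (date(2024, 1, 3), 1), (date(2024, 1, 3), 7),
    (date(2024, 1, 4), 8), (date(2024, 1, 4), 5),
    (date(2024, 1, 5), 2), (date(2024, 1, 5), 3),
    (date(2024, 1, 10), 4), (date(2024, 1, 11), 6),
]

counts = quartile_by_weekday(sample)
for key in sorted(counts):
    print(key, counts[key])
```

Each (weekday, quartile) count then becomes a run of coloured squares in the waffle chart.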

Sites referred to

Looking at which sites the toots link to, there is greater variation, though www.nknews.org gets 32% of the links. Beyond that, there is a mix of press, NGOs, and other specialised media outlets.

I like to make waffle charts.
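
A share-of-links figure like the one above comes from counting the host part of each linked URL. Here is a minimal Python sketch of that tally. The function name `domain_shares` and the sample URLs are made up for illustration, and the original analysis was done in R, so this is not the code that produced the numbers.

```python
from collections import Counter
from urllib.parse import urlparse

def domain_shares(urls):
    """Map each host to its share of all links, most common first."""
    hosts = Counter(urlparse(u).netloc.lower() for u in urls)
    total = sum(hosts.values())
    return {host: count / total for host, count in hosts.most_common()}

# Invented links; the paths are placeholders, not real articles.
sample_urls = [
    "https://www.nknews.org/a",
    "https://www.nknews.org/b",
    "https://example.org/c",
    "https://example.com/d",
]

shares = domain_shares(sample_urls)
for host, share in shares.items():
    print(f"{host}: {share:.0%}")
```

Note that this treats `www.nknews.org` and `nknews.org` as different hosts, so a real tally may want to normalise them first.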

Keyword frequency and topic analysis

A very very basic keyword search was used to mark all boosted toots based on the content of the toot (not any linked site). This was then used to look at which topics are most frequently referred to, and where they may overlap.

More than half (52%) of all the toots referred to “nuclear”, followed by “Korea” (20%), “Russia” (18%), and “missile” (16%). This month, “Korea” and “missile” together (13%) came slightly more frequently than “Korea” and “nuclear” (11%). 54 of the 256 boosted toots (21%) didn’t fall into any category.

“nuclear” and “power” appear together in 28 toots. The bot may be amplifying toots about nuclear power plants, but nuclear weapons are regularly discussed as tools of state power.
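
The tagging and overlap counts in this section can be sketched in a few lines of Python. This is an assumption about the general approach (plain substring matching on lower-cased text), not the code actually used, which was written in R. The keyword list, function names and sample toots are invented for the example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical keyword list for this sketch.
KEYWORDS = ("nuclear", "korea", "russia", "missile", "power")

def tag_toot(text):
    """Return the keywords found anywhere in the toot text (substring match)."""
    lowered = text.lower()
    return frozenset(k for k in KEYWORDS if k in lowered)

def summarise(toots):
    """Count single keywords, keyword pairs, and toots matching nothing."""
    singles, pairs, untagged = Counter(), Counter(), 0
    for text in toots:
        tags = tag_toot(text)
        if not tags:
            untagged += 1
        singles.update(tags)
        pairs.update(combinations(sorted(tags), 2))  # sorted so pairs are stable
    return singles, pairs, untagged

# Invented toot texts.
sample_toots = [
    "North Korea tests a new missile",
    "Nuclear power plant restarts",
    "Nuclear weapons as tools of state power",
    "Russia suspends treaty talks",
    "Conference registration is open",
]

singles, pairs, untagged = summarise(sample_toots)
print(singles)
print(pairs)
print(untagged)
```

The second and third sample toots both count towards the “nuclear” and “power” pair even though only one is about power plants, which is exactly the ambiguity described above.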

Methodology

This is a rough post-facto analysis of the behaviour of the ICYMI (ADN) bot. The bot spits out some rudimentary logs as it works. These are in fact part of how it builds its list of toots to boost and tracks which of those it has boosted and which it still has to boost. The logs store high-level data about the promoted toots (but not the discarded ones). I use the logs, along with data drawn from the server using the rtoot library, to slice and dice the data and try to present it graphically. This is still largely an exploratory rather than explanatory exercise.
