Session: Testing the Fediverse / Fediverse test suite

/2023-09/session/5-e/

Topics:

Convener: Johannes Ernst (@j12t@social.coop)

Participants who chose to record their names here (18 people total):

Jeremiah Lee (@Jeremiah@alpaca.gold)
Taylor Beseda (@tbeseda@indieweb.social)
Stéphan Kochen (@kosinus@hachyderm.io)
Ryan Barrett (@snarfed.org@snarfed.org)
Ben Pate (@benpate@mastodon.social)

(Note by Johannes: This is a merge of two sets of notes that were apparently taken independently of each other. I took the liberty to merge them.)

Notes

Johannes brought two slides: https://reb00ted.org/tech/20230922-fediverse-testsuite/

Why a test suite? Three distinct sets of requirements:

A standards group like SWICG wants to know how/how well its standards are implemented. e.g. what apps support something correctly?
An app dev wants to know whether they broke interop with some other fediverse app in their latest commit.
A dev starting out implementing fediverse support wants help with the next step.

Criteria for a complete test suite:

Be able to run tests against a staging/dev impl and “get a report card” without having to test said impl against live servers of other codebases
Be able to run those tests in CI rather than manually
Test cases should be describable as plain language and/or a slightly-formalized plain language system like Gherkin
- Additional test cases could be harvested via git issues, esp. by non-technical community members and power-users of today’s fediverse
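
As a rough illustration of the “report card” and CI criteria above, here is a minimal Python sketch. The names `CaseResult` and `report_card` and the scenario texts are made up for this example and do not come from any existing suite; the only assumption is that each test case carries a plain-language description and a pass/fail outcome.

```python
from dataclasses import dataclass


@dataclass
class CaseResult:
    # Plain-language description of the test case (hypothetical wording).
    scenario: str
    passed: bool


def report_card(results):
    """Return (text, exit_code); exit_code is non-zero if any case failed."""
    lines = []
    for r in results:
        mark = "PASS" if r.passed else "FAIL"
        lines.append(f"[{mark}] {r.scenario}")
    failed = sum(1 for r in results if not r.passed)
    lines.append(f"{len(results) - failed}/{len(results)} passed")
    return "\n".join(lines), (1 if failed else 0)


# Hypothetical results from a run against a staging instance.
results = [
    CaseResult("Given a local actor, when it is fetched, then it has an inbox", True),
    CaseResult("Given a Follow, when it is accepted, then an Accept is delivered", False),
]
text, exit_code = report_card(results)
print(text)
# A CI wrapper could end with sys.exit(exit_code) so that a failure breaks the build.
```

Returning an exit code next to the human-readable text is one way to let the same run serve both a person reading the report and a CI job.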

Existing test suite efforts:

Original 2008 AP “test suite”: https://gitlab.com/dustyweb/pubstrate : more like an interactive questionnaire than a test suite. Originally served at test.activitypub.dev. Its unusual language and stack discouraged community support.
- discussion: https://github.com/w3c/activitypub/issues/337
- https://github.com/go-fed/testsuite : re-implementation of original
- Steve’s Python rewrite of the original 2008 AP “test suite”: https://github.com/steve-bate/rocks-testsuite . That test suite was more of a questionnaire than a real test suite.
Steve’s new Python test suite: https://github.com/steve-bate/activitypub-testsuite . An automated regression test suite with ~150 automated tests for both C2S and S2S. It abstracts the server so the same tests can be run against different server implementations and reports can be generated.
https://pubkit.net/ from Pixelfed, coming soon. More toolkit than test suite (mock servers)
Bengo, Juan, and Dimitri Z are working on a test suite, starting with these MUST behaviors extracted from the AP spec: https://socialweb.coop/activitypub/behaviors/
Gherkin tests being worked on by helge (?) for ActivityPub spec, so not behavior
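
To make the idea of abstracting the server more concrete, here is a minimal Python sketch of how one check could run unchanged against different implementations. It is not taken from any of the suites listed above: `ServerDriver`, `FakeServerDriver`, and `missing_actor_properties` are hypothetical names, and the in-memory fake stands in for a real server so the sketch runs without a network. The check itself reflects the ActivityPub requirement that actor objects have `inbox` and `outbox` in addition to `id` and `type`.

```python
from abc import ABC, abstractmethod


class ServerDriver(ABC):
    """Hypothetical adapter that hides how a given server is reached."""

    @abstractmethod
    def fetch_actor(self, name: str) -> dict:
        """Return the actor document for a local account as a dict."""


class FakeServerDriver(ServerDriver):
    """In-memory stand-in for a real implementation (assumption for the sketch)."""

    def __init__(self, actors):
        self._actors = actors

    def fetch_actor(self, name):
        return self._actors[name]


# Actor objects must carry inbox and outbox, plus the id and type of any object.
REQUIRED_ACTOR_PROPERTIES = ("id", "type", "inbox", "outbox")


def missing_actor_properties(driver, name):
    """Return the required properties absent from the actor document."""
    actor = driver.fetch_actor(name)
    return [p for p in REQUIRED_ACTOR_PROPERTIES if p not in actor]


good = FakeServerDriver({"alice": {
    "id": "https://good.example/users/alice",
    "type": "Person",
    "inbox": "https://good.example/users/alice/inbox",
    "outbox": "https://good.example/users/alice/outbox",
}})
bad = FakeServerDriver({"alice": {
    "id": "https://bad.example/users/alice",
    "type": "Person",
    "inbox": "https://bad.example/users/alice/inbox",
}})

for label, driver in (("good", good), ("bad", bad)):
    print(label, missing_actor_properties(driver, "alice"))
```

A real driver would replace the fake with HTTP requests to a staging instance; the test function would not need to change.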

Examples of test suites that do some of the things we want:

OpenID Connect test suite
https://webmention.rocks/ for Webmention. Online protocol test suite: not all-in-one, but helpful interactive online tool
https://micropub.rocks/
Verifiable credentials test suite
https://identity.foundation/JWS-Test-Suite/

Test case ideation:

Behavioral: ActivityPub for WordPress and WriteFreely noted yesterday that Mastodon displays Note and Article objects differently to users. Inline images don’t get displayed by Mastodon and have to be attachments, but that causes duplication of images in another implementation (I forget which one).
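
One way such a behavioral test case might be expressed, as a sketch only: the Python below flags images that appear both inline in an object's `content` and again in its `attachment` list, which is the situation described as causing duplicates. The `content` and `attachment` properties come from the ActivityStreams vocabulary; the function names, the example object, and its URLs are hypothetical, and the sketch assumes attachments carry a plain string `url`.

```python
from html.parser import HTMLParser


class _ImgSrcCollector(HTMLParser):
    """Collects the src attribute of every <img> tag in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def inline_image_urls(obj):
    collector = _ImgSrcCollector()
    collector.feed(obj.get("content", ""))
    return collector.sources


def attachment_urls(obj):
    attachments = obj.get("attachment", [])
    if isinstance(attachments, dict):  # a single attachment instead of a list
        attachments = [attachments]
    return [a["url"] for a in attachments
            if isinstance(a, dict) and isinstance(a.get("url"), str)]


def duplicated_images(obj):
    """Return inline image URLs that are also present as attachments."""
    attached = set(attachment_urls(obj))
    return [u for u in inline_image_urls(obj) if u in attached]


# Hypothetical Article as a blogging implementation might publish it.
article = {
    "type": "Article",
    "name": "Hypothetical post",
    "content": '<p>Hello</p><img src="https://blog.example/cat.png">',
    "attachment": [{"type": "Image", "url": "https://blog.example/cat.png"}],
}
print(duplicated_images(article))
```

Whether a duplicate counts as a failure depends on the receiving implementation, so a check like this would more likely feed an interop report than a strict conformance verdict.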

Use cases

I want to verify a new change to my implementation does not break interop
I want to see what an activity looks like in various implementations
I want to verify an implementation for spec conformance
I want to identify issues / omissions in the spec

Do we need a neutral coalition for hosting the test suite and better documentation (developer network)? https://fedidevs.org/

Democratically governed
Need for governance in conflict resolution, neutrality
Needs money to operate.
- From whom? The biggest users like Meta?
Could individual projects plug in and sponsor their own portion of the test suite?
General concerns about any form of centralization
- Why not a new W3C task force?
Maybe better to just do the thing as a personal project and later bring into a community-governed org if it’s popular?

The current situation is that the software with the largest user base defines the spec, and that’s Very Bad. Mastodon does not conform in many ways, but soon a bigger vendor will eclipse it: Meta. The sooner we assemble something the better.