Accountable data governance


Convener: Aaron Shaw, Northwestern U (@aaronshaw@social.coop)


Participants who chose to record their names here:

Opening discussion:

  • broad picture?
    • data governance as a challenge / failure mode of server-level autonomy
    • each server sets its own policies around access, archiving, public-interest uses, and commercial uses
  • what are the big unsolved questions in this space that you see?
  • political figures saying they opt in to public-interest data archiving (similar to bot opt-in)
    • possible opt-in features at the user level vs. server level
    • the common assumption is that this is a server-level decision
  • protocol layer as well! Bluesky/AT Protocol assumes a total public record
    • Fediverse/ActivityPub seems to assume something less public
  • where in the protocol can this be set/toggled?
    • timelines, posts, profiles
  • tensions for public figures
    • Are they using verification features?
    • Should there be some scrutiny of public figure accounts that don’t use verification features?

Protocol layer encourages deletion propagation, but does not enforce it.

  • is this a problem that could even be solved in the protocol, vs. a norm that is just respectfully followed?
  • how many people turn on the indexable vs. discoverable flags? how many posts are public vs. limited?
  • IEEE P7012 committee (https://digitalprivacy.ieee.org/standards)? Idea: tag data with the expectations for its use, and a service provider can tag what the service requires (https://standards.ieee.org/ieee/7012/7192/)
  • communicating that so end users can understand it would be hard
  • where should we go with this problem / find ways forward
  • lessons to be learned from SBOM / integrity work / verifying what a server is running: it was explained that DCAP (Intel's Data Center Attestation Primitives, CPU-based remote attestation) makes this possible, but it is complex and most legacy hosting providers couldn't prove it. SBOM is cool, but an SBOM alone may not let you prove that a remote server actually computed what you expect.
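
To make the flag question above concrete, here is a minimal sketch of how a respectful archiver or indexer might read these signals from ActivityPub JSON. It assumes the actor document carries Mastodon's boolean `discoverable` and `indexable` properties and that posts use the standard ActivityPub public collection in `to`/`cc`. The function names and the `may_index` rule are hypothetical and were not proposed in the session. As noted above, nothing in the protocol enforces this; it only shows what "respectfully followed" could look like.

```python
# The special "public" collection defined by ActivityPub; the spec also
# permits the two short forms.
PUBLIC = {
    "https://www.w3.org/ns/activitystreams#Public",
    "as:Public",
    "Public",
}


def addressed_ids(value):
    """Addressing fields may be absent, a single string, or a list."""
    if value is None:
        return set()
    items = value if isinstance(value, list) else [value]
    return {item for item in items if isinstance(item, str)}


def actor_flags(actor: dict) -> dict:
    """Read the opt-in flags; a missing flag is treated as 'not opted in'."""
    return {
        "discoverable": actor.get("discoverable") is True,
        "indexable": actor.get("indexable") is True,
    }


def post_audience(note: dict) -> str:
    """Classify a post as 'public', 'unlisted', or 'limited' by its addressing."""
    if PUBLIC & addressed_ids(note.get("to")):
        return "public"
    if PUBLIC & addressed_ids(note.get("cc")):
        return "unlisted"
    return "limited"


def may_index(actor: dict, note: dict) -> bool:
    """Hypothetical conservative rule: only public posts by opted-in actors."""
    return actor_flags(actor)["indexable"] and post_audience(note) == "public"


# Made-up example documents, not real accounts.
actor = {"type": "Person", "discoverable": True, "indexable": False}
note = {"type": "Note", "to": ["https://www.w3.org/ns/activitystreams#Public"]}
print(post_audience(note), may_index(actor, note))
```

Counting how many actors set each flag, and how posts split across the three audiences, would give a rough empirical answer to the "how many people" question.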

Trying to name the several threads of conversation here:

  1. “what are my responsibilities as a user” – for instance, a public figure with specific data obligations
  2. “what are my responsibilities as a system admin” – for instance, verifying the software integrity of my own and federated servers
  3. “what are my responsibilities as someone looking at published data on a given Mastodon instance” – can I archive it? index it? process it? etc.

Broadly, how do we express these things in a form that both humans and machines can understand? How do we support them in practice? Which parts of the solution belong at the protocol layer vs. the app layer, etc.?
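
One way to picture the "human- and machine-understandable" question is a single policy record that software can evaluate and that can also be rendered as a plain sentence. The sketch below is purely illustrative. The `DataUsePolicy` fields, the rule that a user's stated preference overrides the server default, and the deny-by-default fallback are all assumptions, not anything agreed in the session or defined by ActivityPub or IEEE P7012.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical set of uses, taken from the policy areas named in the notes.
USES = ("archiving", "indexing", "commercial_use")


@dataclass(frozen=True)
class DataUsePolicy:
    """Hypothetical policy record; None means 'no preference stated'."""
    archiving: Optional[bool] = None
    indexing: Optional[bool] = None
    commercial_use: Optional[bool] = None


def effective_policy(server: DataUsePolicy, user: DataUsePolicy) -> DataUsePolicy:
    """Assumed rule: user preference wins, then server default, else deny."""
    resolved = {}
    for use in USES:
        user_choice = getattr(user, use)
        server_choice = getattr(server, use)
        if user_choice is not None:
            resolved[use] = user_choice
        elif server_choice is not None:
            resolved[use] = server_choice
        else:
            resolved[use] = False
    return DataUsePolicy(**resolved)


def describe(policy: DataUsePolicy) -> str:
    """Render the same record as a sentence an end user could read."""
    parts = []
    for use in USES:
        value = getattr(policy, use)
        if value is None:
            state = "not stated"
        else:
            state = "allowed" if value else "not allowed"
        parts.append(f"{use.replace('_', ' ')}: {state}")
    return "; ".join(parts)


# A server that forbids archiving and indexing, and a public figure who
# opts in to archiving at the user level.
server = DataUsePolicy(archiving=False, indexing=False)
user = DataUsePolicy(archiving=True)
print(describe(effective_policy(server, user)))
# prints "archiving: allowed; indexing: not allowed; commercial use: not allowed"
```

Whether the user or the server should win a conflict is exactly the user-level vs. server-level question raised in the opening discussion; the precedence here is just one possible choice.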

Cross-cutting political, social, and technical questions to tackle!

  • What are the next steps? Aaron to take the lead on this. Maybe create an email list, or get mutual follows on Mastodon going?