Globally-inclusive Fediverse Handles (i.e. non-ASCII)
/2025-10/session/2-e/
Convener: Jim DeLaHunt (@jdlh@mstdn.ca)
Participants who chose to record their names here:
- Emelia Smith (@thisismissem@hachyderm.io, @thisismissem.social)
- James Marshall (@jamesmarshall@sfba.social, james@jmarshall.com)
Website: https://github.com/swicg/activitypub-webfinger/issues/29
Notes
Intros/table-setting
- unattributed: I’m mainly curious about: Does webfinger need to be tightly coupled to Fediverse? Why not use domain names as handles?
- but ASCII imperialism is also important!
- Emelia: I’ve been adapting some bluesky/AT Protocol technologies for activitypub, trying to separate identity and data from applications. Using bluesky-like handles and portability detached from fediverse platform.
- fun fact, masto uses extremely similar logic to Twitter in the code for how handl
- Hong Minhee from fedify is active within the Korean and Japanese communities, and has advocated for some work to better support those communities
- jm: work on federated software, would love to know why unicode isn’t already standard everywhere?
Discussion
- jdlh: domain names can already be non-ASCII (per URL spec), but Mastodon software is restricting handles to ASCII only. Why?
- em: could use punycode for domains; but trickling that through to all the different parts of the software (has to work in UX, in moderation/blocklist backend, in UX for moderation (userside and moderator side), etc)
- backstory: lots of moderation attacks, i.e. homoglyph attacks (mastodon.soc1al with a turkish dotless i or whatever)
- em: same problem for the username: some databases are set up to be UTF-8 only, ASCII-only, etc.
- particularly since lots of usernames are used as primary keys
- ruby also janky with UTF-8 specifically, i’d imagine in some languages
- libicu (via the charlock_holmes gem) parses the UTF-8 strings in messages. It is a common cause of install/setup headaches, because libicu is a native library but needs to be updated in lockstep with the gems. fedify has an easier time here because JavaScript natively supports Unicode in strings well.
- JDLH: consider the frame around pros and cons.
- For us in Latin-script culture, ASCII works pretty well, i18n gives us few pros and many cons
- For someone in another culture and another script, i18n gives them huge pros and the cons are not so great.
- jdlh: how do we make homoglyph detection visible? who’s thinking about this in UX?
- Metamask, cryptocurrency service, they have a huge homoglyph problem which leads to fraud. Their head of UX cares about fraud prevention.
- The DNS people (ICANN) have come up with rules for what characters can be used in domain. Called Label Generation Rules.
- RFC 9839 Unicode Character Repertoire Subsets https://www.rfc-editor.org/rfc/rfc9839.html might be advice on limits to identifiers.
- BF: It would be good to have a really good implementation as a role model.
- JDLH: Agree. The people using the non-Latin usernames will be using good software.
- Then pay someone to do the PR on the Mastodon code.
- em: it may be hard to find, because everyone wants to federate with Mastodon software.
- em: Mastodon includes user name in the identifier for federation. This is an anti-pattern, because you can’t change your username. Mastodon now allows, for new accounts, identifiers apart from user name.
- bf: somewhat related: Portability User Stories: https://codeberg.org/fediverse/fep/src/branch/main/fep/7952/fep-7952.md
- bf: change the relationship between user names and federation identifiers.
- bf: Bluesky has something like a UUID for users, so that when users migrate, the URLs can change
- bf: you need to change all of it at once.
- em: demonstration of the indirection between handles and data storage, and of DIDs (decentralized identifiers), which are globally unique values, e.g. did:plc: and did:web:
- em: The lemmy software has communities, and they share a handle namespace. This conflicts with Mastodon.
- jdlh: inherit the work from ICANN and domain localization?
- bf: one random precedent: https://discuss.ens.domains/t/ens-name-normalization-2nd/14564/3
- em: maybe start with a primer as groundtruth/reference?
- JDLH is willing to try. em can provide resources for writing that document. Use ReSpec for W3C documents. Venue: FEP or W3C?
- bf: https://www.punycoder.com/idn/ ?
- bf: FEP might be an easier forum than W3C?
- bf: Scope of document might be spelling out user stories and desired outcomes. Audience is the implementor who can satisfy all those user stories in their own implementation.
- JDLH: since I am aware of some of the prior art from ICANN etc., I will attempt a paper listing this prior art for the benefit of Fediverse people.
- bf: suggest write a use-case FEP and send to the SWICG mailing list next week. This starts a conversation.
- em: suggest that if we can decouple user handle from identity from applications, we get to a better world. May be best to advocate for that big breaking change, rather than a smaller breaking change.
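The punycode and homoglyph points from the discussion can be sketched roughly as follows. This is a minimal illustration using only Python's standard library, not how Mastodon or any other server does it. The function names are hypothetical. It assumes Python's built-in "idna" codec (which implements the older IDNA 2003 rules) is acceptable for demonstration, and it uses the first word of each character's Unicode name as a crude stand-in for its script. That heuristic flags Latin/Cyrillic mixes, but it would not catch same-script lookalikes such as a dotless i, and it would wrongly flag legitimate mixes such as Japanese kana with kanji. Real detection would need confusables data such as that described in Unicode UTS #39, or rules like the ICANN Label Generation Rules mentioned above.

```python
import unicodedata


def domain_to_ascii(domain: str) -> str:
    """Convert a Unicode domain to its ASCII (punycode) form, label by label."""
    return domain.encode("idna").decode("ascii")


def domain_to_unicode(ascii_domain: str) -> str:
    """Convert an ASCII (punycode) domain back to its Unicode form."""
    return ascii_domain.encode("ascii").decode("idna")


def scripts_in_label(label: str) -> set[str]:
    """Crude script guess: first word of each letter's Unicode character name.

    Assumption for illustration only; this is not a real script property lookup.
    """
    return {
        unicodedata.name(ch, "UNKNOWN").split(" ")[0]
        for ch in label
        if ch.isalpha()
    }


def review_domain(domain: str) -> dict:
    """Collect what a moderation UI might show side by side for one domain."""
    labels = domain.split(".")
    return {
        "unicode": domain,
        "ascii": domain_to_ascii(domain),
        # Labels whose letters appear to come from more than one script.
        "mixed_script_labels": [
            label for label in labels if len(scripts_in_label(label)) > 1
        ],
    }


if __name__ == "__main__":
    # "m\u0430stodon" contains a Cyrillic "a" that looks like the Latin one.
    for candidate in ("mastodon.social", "m\u0430stodon.social", "bücher.example"):
        print(review_domain(candidate))
```

Showing the Unicode form and the ASCII form together is one way to make a lookalike visible to a moderator, since the two "mastodon" spellings produce different ASCII strings even though they render almost identically.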
Other weird handle FEPs: https://codeberg.org/fediverse/fep/src/branch/main/fep/e3e9/fep-e3e9.md
Weird things with webfinger: https://correct.webfinger-canary.fietkau.software/
EVAN FROM THE FUTURE: https://github.com/swicg/activitypub-webfinger/issues/29
This is on the agenda for the Social CG.
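For readers unfamiliar with how a handle turns into a WebFinger lookup, here is a minimal sketch of building the query URL for a handle that may contain non-ASCII characters. The function name is hypothetical. Several choices in it are assumptions rather than settled practice, and they are exactly the kind of thing the linked issue is meant to decide: NFC-normalizing the username, putting the punycode form of the host inside the acct: URI, and percent-encoding the username as UTF-8. The "/.well-known/webfinger" path and the "resource" parameter come from the WebFinger specification (RFC 7033).

```python
import unicodedata
from urllib.parse import urlencode


def webfinger_url(handle: str) -> str:
    """Build a WebFinger query URL for a handle like "@user@host"."""
    user, _, host = handle.lstrip("@").rpartition("@")
    if not user or not host:
        raise ValueError(f"expected a handle of the form @user@host, got {handle!r}")

    # Assumption: normalize the username so visually identical inputs match.
    user = unicodedata.normalize("NFC", user)
    # Assumption: use the ASCII (punycode) host both in the URL and the acct: URI.
    ascii_host = host.encode("idna").decode("ascii")

    resource = f"acct:{user}@{ascii_host}"
    # urlencode percent-encodes non-ASCII characters as UTF-8 bytes.
    query = urlencode({"resource": resource})
    return f"https://{ascii_host}/.well-known/webfinger?{query}"


if __name__ == "__main__":
    print(webfinger_url("@jdlh@mstdn.ca"))
    print(webfinger_url("@josé@bücher.example"))  # hypothetical handle
```

Whether servers should accept the Unicode form, the punycode form, or both in the resource parameter is left open here.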