> first time law enforcement are sharing actual CSAM with a technology company
It’s very much not: PhotoDNA, which is/was the gold standard for content identification, is a collaboration between a whole bunch of LEOs and Microsoft. The end user only ever gets a yes/no result on a matched hash, but that database was built on real content, with law enforcement working directly with Microsoft.
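For context on what “matched hash” means here, a toy sketch of yes/no matching against a known-content hash database. This is NOT PhotoDNA’s actual algorithm (that one’s proprietary); the average-hash function, the DB entries, and the distance threshold below are all made up purely to show the flow:

```python
# Toy perceptual-hash matching. NOT PhotoDNA; stand-in hash, fake DB
# entries, and an arbitrary threshold, just to illustrate the yes/no flow.

def average_hash(pixels: list[list[int]]) -> int:
    """Toy hash: one bit per pixel, set if the pixel is above the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits

def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")

# Hashes of known material. The real database is built with LEO/NCMEC
# involvement; these values are obviously fake.
known_hashes = {0b1010110010110100, 0b0110100111001010}

def is_match(pixels: list[list[int]], max_distance: int = 3) -> bool:
    """The integrator only ever sees this yes/no, never the source content."""
    h = average_hash(pixels)
    return any(hamming(h, k) <= max_distance for k in known_hashes)

print(is_match([[10, 200], [30, 180]]))  # False for this random input
```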
Disclaimer: below is my experience dealing with this shit from ~2015-2020, so ymmv, take it with some salt, etc.
Law enforcement is rarely the first responder to these issues, either: in the US, at least, reports go to the hosting/service provider first for validation, and only THEN to NCMEC and LEOs, if the provider confirms what the content is. Even reports that come from NCMEC to the provider usually aren’t handled by law enforcement as the first step.
And as for validating reports, that’s done by a human looking at it, without all the ‘access controls and safeguards’ you think there are, beyond a very thin layer of CYA on the part of the company involved. You get a report, and once PhotoDNA says ‘no fucking clue, you figure it out’ (which, IME, was basically 90% of the time), a human is going to look at it, make a determination, and then file a report with NCMEC or whatever, if it turns out to be CSAM.
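Roughly, the triage loop looked like this. Every name here (photodna_match, human_review, file_ncmec_report) is a hypothetical stand-in, not a real API, and pipelines vary by provider:

```python
# Sketch of the triage flow described above. All function names are
# hypothetical stand-ins; real provider pipelines differ.

def photodna_match(image_bytes: bytes) -> bool:
    return False  # stub: in reality, a hash lookup like the sketch earlier

def human_review(image_bytes: bytes) -> bool:
    return False  # stub: a person actually looks and makes the call

def file_ncmec_report(image_bytes: bytes) -> None:
    print("filing report with NCMEC")  # stub

def handle_report(image_bytes: bytes) -> str:
    # Known-hash hit: clear-cut, no human has to look.
    if photodna_match(image_bytes):
        file_ncmec_report(image_bytes)
        return "reported (hash match)"
    # IME this branch was ~90% of reports: a human has to make the call.
    if human_review(image_bytes):
        file_ncmec_report(image_bytes)
        return "reported (human confirmed)"
    return "dismissed"
```

An AI pre-filter slots in right before that second branch, which is exactly where the human cost is.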
Frankly, after having done that for far too fucking long, if this AI tool can reduce the amount of horrible shit someone doing the reviews has to look at, I’m 100% for it.
CSAM is (grossly) a big business, and the ‘new content’ funnel is fucking enormous, which is why an extremely delayed and reactive thing like PhotoDNA isn’t all that effective: there’s a fuckload of children being abused and a fuckload of abusers escaping being caught, simply because there’s too much shit to look at and handle effectively, so any response to anything is super, super slow.
This looks like a solution that lets fewer people be involved in validation, and that could be damn near instant in responding to suspected material that does need validation. That would do a good job of at least pushing the shit out of easy (easier?) availability and out of more public spaces, which, honestly, is probably the best that’s going to be managed unless the countries producing this shit start caring and going after the producers, and I’m not holding my breath on that.
The problem I ran into is that every single platform that primarily interacted with Mastodon (the *keys, etc.) had the exact same set of problems.
While yes, my Firefish instance had search, what was it searching? Local data only, and once I figured out that Mastodon-style replies didn’t federate to all of someone’s followers, it became pretty clear that it was, uh, not very useful.
You can search, but any given server may or may not have the data you actually want, and thus, well, you just plain cannot meaningfully search for shit unless you go to one of the mega instances, or join giant piles of relays and store gigabyte upon gigabyte upon gigabyte of garbage data you do not care about.
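To make that concrete: Mastodon’s real search endpoint is GET /api/v2/search, and it only queries what your instance has already federated. A minimal sketch, assuming a placeholder instance URL and token, and an instance that even has full-text status search enabled (many don’t):

```python
# Sketch of hitting Mastodon's search API. Instance URL and token are
# placeholders; results only cover this instance's LOCAL database.
import requests

INSTANCE = "https://example.social"  # placeholder
TOKEN = "YOUR_ACCESS_TOKEN"          # placeholder

resp = requests.get(
    f"{INSTANCE}/api/v2/search",
    params={"q": "some topic", "type": "statuses"},
    headers={"Authorization": f"Bearer {TOKEN}"},
    timeout=10,
)
resp.raise_for_status()

# Replies that never federated to this server simply don't exist as far
# as this search is concerned, no matter how relevant they are.
for status in resp.json().get("statuses", []):
    print(status["url"])
```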
The whole thing is kinda garbage for search-based discovery, from its very basic design all the way through to everyone’s implementations.