
Delay/wait for confirmation of likely porn images using algorithm detection
Closed, Declined · Public

Description

Author: dwheeler

Description:
There's a risk of inappropriate images being uploaded and appearing;
yes, people notice and fix them, but images can be much more
shocking to many people than words.

One partial solution might be to use an algorithm that tries to
detect "likely pornographic" images and handle them differently.
Like spam filters, my understanding is that such algorithms are
imperfect but often right. I believe they generally work by noticing
a lot of flesh tones in a picture that doesn't appear to be a face.
You could then delay actual viewing of such 'suspect' images for a short time,
placing them on a "please check this" list (where an admin could approve them,
or they simply become visible after some period of time).
It won't be perfect, but this technical and procedural measure could
lower the risk a little. It's worth noting that in almost all cases,
porn images are also copyright violations, so even if you don't care about
porn per se, it's still a reasonable idea to have extra controls relating to images.
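
As a very rough illustration of the flesh-tone idea, something like the following
(using the Pillow imaging library) could flag images with a high fraction of
skin-colored pixels. This is only my own sketch, not POESIA's actual algorithm;
the RGB rule and the 40% cutoff are made-up numbers for illustration:

    # Crude skin-tone-ratio heuristic -- a sketch only, not POESIA's algorithm.
    # The RGB rule and the 0.4 cutoff are illustrative guesses, not tuned values.
    from PIL import Image

    def looks_fleshy(path, cutoff=0.4):
        img = Image.open(path).convert("RGB").resize((64, 64))
        skin = total = 0
        for r, g, b in img.getdata():
            total += 1
            # Very loose "flesh tone" test: reddish, some green, red > blue.
            if r > 95 and g > 40 and b > 20 and r > g and r > b and (r - min(g, b)) > 15:
                skin += 1
        return skin / total > cutoff

An image that trips a heuristic like this would go on the "please check this" list
rather than being shown immediately.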

I did a little searching on filtering out porn images. I found an OSS/FS
implementation of an algorithm to detect porn images,
based on a larger project to detect 'bad' things called POESIA.
You can see an academic paper on POESIA as a whole
(http://www.poesia-filter.org/pdf/Deliverable_1_4_public.pdf).
SourceForge has POESIA software
(http://cvs.sourceforge.net/viewcvs.py/poesia/PoesiaSoft/);
see the "ImageFilter" and "Java" subdirectories for code,
and "Documentation" for - well, you can guess.
Presumably, you could pass an image to this code, which would
tell you if it's likely to be porn or not, and then you could make
other decisions based on that. One interesting thing: POESIA can also
detect certain symbols, like swastikas, if you want it to.
There may be other such tools; this is just the one I found.
If you went with a neural net instead, you'd need to train it,
and make sure that faces are in the "okay" list.
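
In outline, the upload path might look something like the sketch below. The names
here (classify_image, pending_review, REVIEW_DELAY) are all hypothetical;
classify_image stands in for POESIA's ImageFilter or whatever detector is chosen:

    import time

    REVIEW_DELAY = 24 * 3600       # auto-release after 24 hours (arbitrary choice)
    pending_review = {}            # filename -> time the suspect image was uploaded

    def handle_upload(filename, classify_image):
        """Quarantine 'suspect' uploads instead of publishing them right away."""
        if classify_image(filename):          # True means "likely porn"
            pending_review[filename] = time.time()
            return "queued for admin review"
        return "published"

    def release_expired(publish):
        """Auto-publish suspect images nobody objected to within the delay."""
        now = time.time()
        for filename, uploaded in list(pending_review.items()):
            if now - uploaded > REVIEW_DELAY:
                del pending_review[filename]
                publish(filename)

An admin checking the queue would either approve (publish) or delete the file;
release_expired just handles the "becomes visible after some period of time" case.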

It would also be wise to limit the number of images
uploaded per minute, at least for anonymous users.
That would keep people from uploading tons of garbage
to abuse the filter, and really, it would be a
good idea anyway in most cases.
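
A per-address throttle could be as simple as a sliding-window counter; the limit
of five uploads per minute below is just an arbitrary example:

    import time
    from collections import defaultdict, deque

    UPLOAD_LIMIT = 5      # max uploads per window (arbitrary example)
    WINDOW = 60           # window length in seconds

    recent_uploads = defaultdict(deque)   # address -> timestamps of recent uploads

    def allow_upload(address):
        """Return True if this (anonymous) address is still under the limit."""
        now = time.time()
        q = recent_uploads[address]
        while q and now - q[0] > WINDOW:   # drop uploads older than the window
            q.popleft()
        if len(q) >= UPLOAD_LIMIT:
            return False
        q.append(now)
        return True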

It might also be wise to maintain a database of checksums of
deleted images; any new upload that matches should be quarantined.
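
That check is straightforward: hash each upload and compare it against the hashes
of previously deleted images (a sketch; SHA-1 is just one reasonable digest choice):

    import hashlib

    deleted_hashes = set()   # in practice this would live in the database

    def record_deletion(path):
        """Remember the checksum of an image an admin has deleted."""
        with open(path, "rb") as f:
            deleted_hashes.add(hashlib.sha1(f.read()).hexdigest())

    def is_known_deleted(path):
        """True if an upload matches a previously deleted image byte-for-byte."""
        with open(path, "rb") as f:
            return hashlib.sha1(f.read()).hexdigest() in deleted_hashes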

Anyway, thought this might be useful to think about.


Version: unspecified
Severity: enhancement
URL: http://cvs.sourceforge.net/viewcvs.py/poesia/PoesiaSoft/

Details

Reference
bz682

Event Timeline

bzimport raised the priority of this task to Lowest. Nov 21 2014, 7:00 PM
bzimport set Reference to bz682.
bzimport added a subscriber: Unknown Object (MLST).

thalltda wrote:

Sorry, but I completely object to such a filter. Some pornographic images have a place on Wikipedia, e.g.
for articles about pornography.

Not sure why this wasn't closed earlier. This is a non-problem; resolving as WONTFIX.

thalltda wrote:

I should also note that implementing such a feature would lengthen the time of the upload process, even
for legitimate images, as they would all need to be scanned.

thalltda wrote:

Not to mention that MediaWiki is used for some sites that may LIKE to have porn images, e.g. the Wiki Sex
Site. Don't be a fool. Vote NO on bug 682.