Author: dwheeler
Description:
There's a risk of inappropriate images being uploaded and appearing;
yes, people notice and fix them, but images can be much more
shocking to many people than words.
One partial solution might be to use an algorithm that tries to
detect "likely pornographic" images, and handle them differently.
My understanding is that such algorithms, like spam filters, are
imperfect but often right. I believe they generally work by noticing
a lot of flesh tones in a picture that doesn't seem to be a face.
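
To make the flesh-tone idea concrete, here is a minimal Python sketch. It is only an assumption about how such a check might look, not POESIA's actual method: the RGB thresholds are a rough rule of thumb, the names is_skin_tone, skin_fraction and is_suspect are made up for illustration, the 0.4 cutoff is arbitrary, and it does no face detection at all.

```python
def is_skin_tone(r, g, b):
    """Crude RGB rule of thumb for skin-like colours (illustrative thresholds)."""
    return (
        r > 95 and g > 40 and b > 20
        and max(r, g, b) - min(r, g, b) > 15
        and abs(r - g) > 15
        and r > g and r > b
    )


def skin_fraction(pixels):
    """Return the fraction of (r, g, b) pixels that look skin-toned."""
    pixels = list(pixels)
    if not pixels:
        return 0.0
    return sum(1 for p in pixels if is_skin_tone(*p)) / len(pixels)


def is_suspect(pixels, threshold=0.4):
    """Flag an image when at least `threshold` of its pixels look skin-toned.

    The threshold is a hypothetical starting point and would need tuning.
    """
    return skin_fraction(pixels) >= threshold
```

A real filter would also have to rule out faces before flagging anything, which this sketch leaves out.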
You could then briefly delay public display of such 'suspect images',
placing them on a "please check this" list (an admin could okay an image
right away; otherwise it just becomes visible after some period of time).
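
The "please check this" list could be sketched roughly as follows. This is a hypothetical in-memory version, assuming image IDs are strings; the class name SuspectQueue, the one-hour default hold, and the injectable clock are all my own choices for illustration, and a real wiki would keep this in its database.

```python
import time


class SuspectQueue:
    """Hold flagged images until an admin approves them or a timeout passes."""

    def __init__(self, hold_seconds=3600, clock=time.time):
        self.hold_seconds = hold_seconds  # hypothetical default: one hour
        self.clock = clock                # injectable so it can be tested
        self._held = {}                   # image_id -> time it was flagged

    def flag(self, image_id):
        """Put a suspect image on the review list."""
        self._held[image_id] = self.clock()

    def approve(self, image_id):
        """An admin okays the image, so it becomes visible immediately."""
        self._held.pop(image_id, None)

    def is_visible(self, image_id):
        """Unflagged images are visible; flagged ones only after the hold."""
        flagged_at = self._held.get(image_id)
        if flagged_at is None:
            return True
        if self.clock() - flagged_at >= self.hold_seconds:
            del self._held[image_id]  # hold expired, release the image
            return True
        return False

    def pending(self):
        """Image IDs still being held, oldest first, for the admin page."""
        now = self.clock()
        held = sorted(self._held.items(), key=lambda item: item[1])
        return [i for i, t in held if now - t < self.hold_seconds]
```

The page-rendering code would call is_visible() before showing an image, and the admin page would list pending().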
It won't be perfect, but this combination of a technical filter and a
review procedure could lower the risk a little bit. It's worth noting that in almost all cases,
porn images are also copyright violations, so even if you don't care about
porn per se, it's still a reasonable idea to have extra controls relating to images.
I did a little searching on filtering out porn images. I found an OSS/FS
implementation of an algorithm to detect porn images,
based on a larger project to detect 'bad' things called POESIA.
You can see an academic paper on POESIA as a whole
(http://www.poesia-filter.org/pdf/Deliverable_1_4_public.pdf).
SourceForge has POESIA software
(http://cvs.sourceforge.net/viewcvs.py/poesia/PoesiaSoft/);
see the "ImageFilter" and "Java" subdirectories for code,
and "Documentation" for - well, you can guess.
Presumably, you could pass an image to this code, which would
tell you if it's likely to be porn or not, and then you could make
other decisions based on that. One interesting thing: POESIA can also
detect certain symbols, like swastikas, if you want it to.
There may be other such tools; this is just the one I found.
If you went with a neural net instead, you'd need to train it,
and make sure that faces are among the "okay" training examples.
It would also be wise to limit the number of images
uploaded per minute, at least for anonymous users.
That would prevent uploading tons
of garbage and abusing the filter, and really, it'd be a
good idea anyway in most cases.
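
A per-uploader limit along these lines might look like the sketch below. It assumes a simple sliding window keyed by something like the uploader's IP address or account name; the class name UploadRateLimiter and the defaults (3 uploads per 60 seconds) are hypothetical, not anything the software has today.

```python
import time
from collections import defaultdict, deque


class UploadRateLimiter:
    """Allow at most `max_uploads` per uploader within a sliding time window."""

    def __init__(self, max_uploads=3, window_seconds=60, clock=time.monotonic):
        self.max_uploads = max_uploads        # hypothetical default
        self.window_seconds = window_seconds  # hypothetical default
        self.clock = clock                    # injectable so it can be tested
        self._recent = defaultdict(deque)     # uploader key -> upload times

    def allow(self, uploader_key):
        """Return True and record the upload if the uploader is under the limit."""
        now = self.clock()
        recent = self._recent[uploader_key]
        # Forget uploads that have fallen out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_uploads:
            return False
        recent.append(now)
        return True
```

Anonymous users could get a stricter limiter instance than logged-in ones.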
It might also be wise to maintain a database of checksums of
deleted images; a new upload matching one should be quarantined.
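
The checksum idea could be sketched like this, assuming SHA-256 over the raw file bytes and an in-memory set standing in for the database. The class name DeletedImageIndex and its method names are made up for illustration.

```python
import hashlib


class DeletedImageIndex:
    """Remember checksums of deleted images so re-uploads can be quarantined."""

    def __init__(self):
        self._digests = set()  # stand-in for a database table

    @staticmethod
    def digest(image_bytes):
        """SHA-256 hex digest of the raw image bytes."""
        return hashlib.sha256(image_bytes).hexdigest()

    def record_deletion(self, image_bytes):
        """Call this when an admin deletes an image."""
        self._digests.add(self.digest(image_bytes))

    def should_quarantine(self, image_bytes):
        """True if this exact file was deleted before."""
        return self.digest(image_bytes) in self._digests
```

Note that an exact checksum only catches byte-identical re-uploads; a resized or recompressed copy would get a different checksum and slip through.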
Anyway, thought this might be useful to think about.
Version: unspecified
Severity: enhancement
URL: http://cvs.sourceforge.net/viewcvs.py/poesia/PoesiaSoft/