>But have you ever wondered if they made photos that look like you? Let's check!
>We collected the huge dataset of 428526 fake generated photos and extracted their facial parameters with https://github.com/ageitgey/face_recognition. Now you can match your image against fake faces and compare with the closest matches. Enjoy!
Maybe it is just me, but how would I go about it if I wanted to harvest large numbers of "real" photos of "real" persons?
Having curious people uploading them to my website could be an easy way.
https://thispersondoesnotexist.com/ only exists because someone already collected enough real photos of real persons to have a neural network learn their distribution well enough to generate new samples.
Harvesting even more photos is not really necessary at this point, and in any case, scraping them off the web would be faster than creating a novelty website.
>Harvesting even more photos is not really necessary at this point, and in any case, scraping them off the web would be faster than creating a novelty website.
But scraping them from the web says nothing about the source, even if you manage to remove all stock photos.
This way it is, IMHO, more likely that it is a "real" photo, most probably uploaded by a "real" user, and the site also has the IP of the sender.
Moreover, most photos you can find on the web have had their EXIF information removed by the host; that may not be the case for a photo uploaded by a casual user.
As I see it, scraping them off the web is good for quantity but not so much for quality; this (completely hypothetical) approach would give less quantity but, IMHO, better-quality data.
I definitely tried it with more junk images I had lying around than real photos, so there's that. There are moments when you need to know which face a computer would say most resembles a sushi... So I'm not 100% sure about quality.
>Maybe it is just me, but how would I go about it if I wanted to harvest large numbers of "real" photos of "real" persons?
Use APIs or scraping to collect profile photos from facebook, twitter, google, gravatar, etc. You'll get a lot of non-person photos, but havetheyfaked.me probably does too.
Yep, but as said above that would be "big data", with the need to de-duplicate it and with no additional (reliable) parameters.
Then you would need some AI (or whatever) to remove non-human or otherwise unsuitable photos (position, lighting, etc.), whilst this method would almost guarantee only "portrait" or "upper torso" pictures of humans.
What I tried to say wasn't that this is the "only" method, but that it is one of the "easy" ones with a high probability of getting "reliable" data.
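To make that filtering step concrete, here's a minimal sketch of keeping only single-face portraits from a scraped batch. The detector is injected so any real backend (e.g. face_recognition.face_locations) could be plugged in; the stub detector below is purely hypothetical.

```python
# Sketch of the filtering step described above: keep only images in which
# a face detector finds exactly one face. The detector is passed in so a
# real one (e.g. face_recognition.face_locations) can be substituted.

def filter_portraits(images, detect_faces):
    """Return only the images for which the detector finds exactly one face."""
    return [img for img in images if len(detect_faces(img)) == 1]

# Hypothetical stub detector: pretends any filename containing "face"
# holds one face, and everything else holds none.
def stub_detector(image_name):
    return ["bbox"] if "face" in image_name else []

scraped = ["face_001.jpg", "sushi.jpg", "face_002.jpg", "logo.png"]
portraits = filter_portraits(scraped, stub_detector)
print(portraits)  # ['face_001.jpg', 'face_002.jpg']
```

In practice the predicate would also check resolution, lighting, and face size, but the shape of the pipeline stays the same.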
There seem to be plenty of less-than-pleasant outputs from this net. I had a couple of strange ones when I landed: [1][2]. If those were generated with a GAN, the adversarial network could really use another training pass with examples like this (and if GANs are not used, a simple network could filter the first net's output).
I hate to be the one to break the news to you but ... you are a zombie (it's the only way they'd match your photo to pictures of computer-generated undead).
I love this comment mood. Everyone loses it when some random small website asks people to upload their photo if they want to find out some random information. But everyone's cool with sharing constantly on FB and IG, uploading their photos and locations to massive companies, etc. etc. The internet has distorted everyone's perspective on reality. Orwell would have something to say about it. Everyone loves Big Brother (FAANG etc.) but distrusts each other. Divide and conquer much?
It's almost as if big companies were regulated but personal websites were not.
> when some random small website
It could be some random website. It could also be a three-letter agency's website training airport-security face detection, a social-engineering website, or a random website storing images in a public S3 bucket.
> The internet has distorted everyone's perspective on reality.
You give all your personal / biometric info when you get a new passport. If some rando were asking for your biometric data on the street, would you give it to him?
>> It could also be a three-letter agency's website training airport-security face detection
Three-letter agencies will kindly ask/force Facebook to share its face-detection data and algorithms; no need to do this themselves.
In fact, I think the biggest co-conspirator of the NSA and other agencies is Apple. First they put a fingerprint sensor inside the Home button, so you can't avoid using it and have to share your fingerprints whether you like it or not. Now that they have captured enough fingerprints, they have moved to FaceID, whose sensors are so conveniently placed that you can't avoid having your face scanned, even if you never set up FaceID for your own use.
This is a good question. The thing is that, besides your photo, I have to extract its vector of facial parameters to match the fakes. So with this 128-dimensional vector I could find you in other photos (if I had some).
All the pics and their vectors stay on this small server.
If you want your data deleted, please drop me a line at wastemaster@gmail.com
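For anyone curious how the matching could work under the hood, here's a toy sketch: each photo is reduced to a 128-dimensional encoding (face_recognition's face_encodings produces such vectors), and the closest fakes are the ones at the smallest Euclidean distance, which is what face_recognition.face_distance computes. All the data below is synthetic; no real encodings or photos are involved.

```python
import math
import random

# Toy version of the matching step: rank stored "fake" encodings by
# Euclidean distance to a query encoding. Real encodings would come from
# face_recognition.face_encodings(); here they are random 128-d vectors.
random.seed(0)
fake_encodings = [[random.gauss(0, 1) for _ in range(128)] for _ in range(1000)]

# Pretend the uploaded photo's encoding is very close to fake #42.
your_encoding = [x + random.gauss(0, 0.01) for x in fake_encodings[42]]

def distance(a, b):
    # Euclidean distance, the same metric face_recognition.face_distance uses.
    return math.dist(a, b)

ranked = sorted(range(len(fake_encodings)),
                key=lambda i: distance(fake_encodings[i], your_encoding))
print(ranked[:5][0])  # 42, the planted nearest neighbour
```

With ~430k fakes a real service would likely use an approximate nearest-neighbour index rather than a full sort, but the distance computation is the same.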
> All the pics and their vectors stay on this small server.
This should be mentioned on the website.
> If you want your data deleted, please drop me a line at wastemaster@gmail.com
There should be an easier mechanism to request the deletion of our photos. Better still, request permission from the user to store the photo on your servers before actually storing it.
I think this is the bare minimum of transparency that should be required before letting people upload personal data, especially in this day and age.
Not to mention that the website is accessible from the EU, and you're required (by law) to obtain consent to store this personal data, and to tell people what exactly you're going to do with it and with whom you're sharing it (if anyone).
I know everyone's used to the wild west but I'm glad that's changing, because of comments like yours - this transparency should NOT be something done out of the website owner's good heart (because as we've seen, most will just give us the finger), but enforced by law.
Edit: For the record, wastemaster's actually quite nice, and this is not directed at them, just websites in general.
How exactly would anyone in the EU prosecute someone outside of it for running their own website, if that individual does so outside of the EU and has no affiliated organization or company? Just because something is accessible from the EU does not make it the EU's jurisdiction to police.
The EU claims jurisdiction based on the fact that part of the interaction occurred in the EU, so they can fine you (it should be noted that the GDPR applies to data related to people in the EU, not related to EU citizens living elsewhere). Whether they can collect on those fines is a different matter.
How do they intend to fine non EU residents hosting a website outside of the EU? I could see if it was a company but if someone is running a server with a not for profit site on it with no way to identify the site owner and an EU resident visits it, good luck trying to fine anyone. The EU does not own or even control the internet outside of their borders.
Why do you care what happens with a photo of your face? Many thousands of them exist; you probably have a profile photo on gravatar, or linkedin, or twitter somewhere anyway, to say nothing of the many thousands upon thousands of pictures of your face captured in frames on surveillance camera footage.
You provide this information (a picture of your face) to every convenience store, casino, bank, airport, and office building you walk into, many hundreds of times per day, for permanent storage. What is the threat model here from someone with a webserver having a single picture of your face with no other associated identifying information about it?
The fact that you don't want to immediately delete the data after processing is a cause for concern.
I can't think of any reason not to immediately delete the data, other than that you intend to use it for something else in the future.
That said, I appreciate your honesty. If you had actually nefarious intentions, you would presumably just claim that you deleted the data when you didn't.
> All the pics and their vectors stay on this small server.
Not to take a jab at you specifically, but the fact that someone who can make websites like these is ignorant enough about privacy (law) to casually drop this line marks a worrying development in the accessibility of AI tech.
For impactful technologies, we probably want the required domain knowledge to come with some structural social disciplining, so that we can collectively steer that impact in the right direction (whatever that is). Clearly, AI libraries have become so easy to use that professional ethics are not part of the curriculum.
>This is a good question. The thing is that, besides your photo, I have to extract its vector of facial parameters to match the fakes. So with this 128-dimensional vector I could find you in other photos (if I had some).
And, with all due respect, an alternative would be to provide an open-source program that computes this "128-dimensional vector" offline, so that only the vector is uploaded and NOT the photo.
Sending this vector is arguably even worse than sending just the photo. The vector alone would allow someone to match your face without ever having your original photo! (And an array of 128 floats requires less storage and offers less transparency.)
For example, if I extract a color map from your photo, is that still your personal data?
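To illustrate both sides of this exchange (how small the client-side payload would be, and that it is still matchable data), a hypothetical client could serialize the 128 floats like this. The all-zero encoding is a dummy placeholder, not a real face encoding.

```python
import json

# Sketch of the client-side alternative proposed above: compute the
# 128-dimensional encoding locally (face_recognition.face_encodings would
# do this offline) and upload only the vector, never the photo.
encoding = [0.0] * 128  # dummy placeholder for a real face encoding

payload = json.dumps(encoding)

# As the reply points out, this is hardly more private: the payload is
# tiny (well under a kilobyte as JSON, 512 bytes as raw float32) yet
# still sufficient to match the same face on other photos.
print(len(payload) < 1024)  # True
```

That is the crux of the privacy argument: the vector is a compact biometric identifier, so shipping it instead of the photo changes the bandwidth, not the risk.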
Everyone's getting absolutely bloody terrified about uploading their photos... What do you think the creator's gonna do, drive to your house and kill you, knowing nothing but what you look like?
None of the matches looked a ton like me, imo, but I tried five photos of myself from 2011-2018 and a couple of matches kept recurring -- not as the top match, but in the top few, so that was interesting.
I didn't trim my beard or mustache from 2009 - 2012, when I shaved it off entirely, and I didn't cut my hair from 1996 - 2017, so there's some good variation in these pictures.
Haha, well, I was looking at them too. I'm not a native speaker, so I considered at least using their naming scheme: haveibeenfaked.co or something. Fortunately I chose the current option.
Got two GDPR complaints, so I added automated removal of uploaded photos and the metadata extracted from them.
Now photos and data are only stored for 3 minutes, then deleted (the app architecture requires server-side processing of the uploaded file, and to compare the results with your photo I have to store it briefly, to be able to show it).
I have also reflected this information on the website itself.
Thank you for your interest in this project!
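A 3-minute retention policy like the one described could be implemented as a periodic cleanup job along these lines. The directory layout and schedule here are assumptions for illustration, not the site's actual code.

```python
import time
from pathlib import Path

# Sketch of a retention job: delete any uploaded file older than the
# cutoff (180 seconds matches the 3-minute policy described above).
# This would be run periodically, e.g. from cron or a background thread.
def purge_old_uploads(upload_dir, max_age_seconds=180):
    """Delete files in upload_dir older than max_age_seconds; return their names."""
    cutoff = time.time() - max_age_seconds
    removed = []
    for path in Path(upload_dir).iterdir():
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return removed
```

The same job would also need to drop the extracted vectors from whatever store holds them, since (as discussed above) the vector is as sensitive as the photo.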
Well done. Does this apply to all images that have ever been uploaded to the website, or only to the ones uploaded after your update? If a user uploaded an image 2 hours ago, will their image be stored or has it been deleted already?