digiKam GSoC progress: libkface is almost ready

Ah well, today marks a month since the official GSoC coding period started.

Brief intro: I’m working on implementing automatic tagging of faces for digiKam, with face detection and recognition. For that, I’ve been working on a library named libface that does the detection and training/recognition.

At some point, it was deemed necessary to make a Qt wrapper for libface, so that Qt/KDE programs could easily use it. This is now (almost) done, and the resulting library is named libkface. libkface will be put somewhere in kdegraphics/libs as soon as the API freeze is over (I think), and will be usable not only by digiKam, but also by other programs like KPhotoAlbum.

We’re also going to keep the lib(k)face database in a shared directory, so that all programs that may use it can share the same training data. That way, a user migrating from digiKam to KPhotoAlbum (hypothetical, with face recog) would not have to retrain libkface on their entire collection.

In branches/extragear/graphics/digikam/libkface/test/gui/, I have made (with help from Alex) a simple GUI for playing with libkface:

Two-Face Guy

It has a few buttons for the usual operations. Mind you, this is NOT going to be the way digiKam will present its interface to you, so don’t be afraid :D. With my last commit, I have added support in libface for “tunable accuracy” for face detection. Look at the slider above the photo: you can set the accuracy of detection on a five-point scale. 1 gives you very low accuracy, but insanely high speed. Setting the accuracy to 5 gives you very high accuracy, but very low speed.

From what I’ve observed:

Accuracy of 1: ~0.8 seconds
Accuracy of 2: ~2.1 seconds
Accuracy of 3: ~4 seconds
Accuracy of 4: ~7 seconds
Accuracy of 5: ~13-15 seconds

The default accuracy is kept as 3.

This is a slight improvement from when I blogged about it last time. A nice use-case for this: if you have a weak/slow computer, you can set the accuracy to 2. 1 is waaay too low, and only for waaay slow computers, but in that case you probably shouldn’t be using face detection anyway. My laptop isn’t exactly powerful, and 3 works best. For those of you with newer laptops, you can go for 4. 5 is for those who have obscenely large amounts of processing power.
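To make the trade-off concrete, here is a rough sketch of how a client could pick an accuracy level from a per-image time budget, using the timings listed above. The function pickAccuracy and the table name are made up for illustration and are not part of libface or libkface, and the numbers come from my own laptop, so treat them as assumptions.

```cpp
#include <array>
#include <cassert>

// Rough per-image detection times in seconds, taken from the observations
// above and indexed by (accuracy level - 1). Level 5 uses the upper end
// of the ~13-15 second range.
constexpr std::array<double, 5> kObservedSeconds = {0.8, 2.1, 4.0, 7.0, 15.0};

// Hypothetical helper, not a libface/libkface function: returns the highest
// accuracy level whose observed time fits the budget, or 1 if none fits.
int pickAccuracy(double budgetSeconds) {
    assert(budgetSeconds >= 0.0);
    int level = 1;
    for (int i = 0; i < static_cast<int>(kObservedSeconds.size()); ++i) {
        if (kObservedSeconds[i] <= budgetSeconds) {
            level = i + 1;
        }
    }
    return level;
}
```

With these numbers, a budget of 5 seconds per photo lands on level 3, which matches the default.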

In the screenshot above, the red frame and the nice little black rectangle you can see are the embryo of the tagging widget I’m making. You can edit the name and all. In the next few days, I’ll be adding Tick/Cross buttons near the text, and making the face rectangle resizable.

In all, this week was quite productive, and I must also thank Marcel (patches, code review), Gilles (porting libkface from qmake to CMake), and Alex (CMake porting and some code) for their help.

digiKam GSoC progress : libface with a ‘k’

This is a progress report after a long time.

Since my last post, my work has involved hours of staring at debug outputs and fixing bugs in libface. I guess I can now say that libface is very usable. Apart from fixing certain segfaults, I also expanded the API a bit.

Till now, you could only give images for detection by passing a filename string to LibFace::detectFaces as:

LibFace libFace = LibFace(DETECT, "."); // Only for Detection
vector<Face> result;
result = libFace.detectFaces(string(argv[i]));

which is from one of the programs in the examples directory. Not a very good idea to load an image from the filesystem if digiKam already has it in memory.

Fortunately, this has now been overloaded with:

result = libFace.detectFaces(img->imageData, img->width, img->height, img->widthStep, img->depth, img->nChannels);

Here img is a pointer to an IplImage, the OpenCV native image type. This is useful because now digiKam et al won’t have to use IplImage at all: the new overload only takes a raw pixel buffer plus its dimensions, so it is API-agnostic.
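For anyone unsure what those six arguments describe, here is a small sketch of the buffer layout they imply. The RawImage struct and the byteOffset helper are hypothetical names used only for illustration and do not exist in libface. The sketch assumes interleaved channels and a depth given as a plain bit count such as 8.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical plain description of an image buffer, mirroring the six
// arguments passed to the detectFaces overload above.
struct RawImage {
    const char* data;  // first byte of the pixel data
    int width;         // pixels per row
    int height;        // number of rows
    int widthStep;     // bytes per row, including any padding
    int depth;         // bits per channel (assumed to be a plain count, e.g. 8)
    int nChannels;     // interleaved channels per pixel (e.g. 3)
};

// Byte offset of channel c of pixel (x, y) in an interleaved, row-padded buffer.
std::size_t byteOffset(const RawImage& img, int x, int y, int c) {
    assert(x >= 0 && x < img.width);
    assert(y >= 0 && y < img.height);
    assert(c >= 0 && c < img.nChannels);
    const std::size_t bytesPerChannel = static_cast<std::size_t>(img.depth) / 8;
    const std::size_t rowStart =
        static_cast<std::size_t>(y) * static_cast<std::size_t>(img.widthStep);
    const std::size_t channelIndex =
        static_cast<std::size_t>(x) * static_cast<std::size_t>(img.nChannels) +
        static_cast<std::size_t>(c);
    return rowStart + channelIndex * bytesPerChannel;
}
```

The point is that widthStep, not width, decides where each row starts, which is why it has to be passed separately.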

Another thing I modified, which does not change the interface but is very useful nevertheless, is the management of face IDs. Every ID has a one-to-one correspondence with a person.

This is how a libface database can be updated:

1. Updating with new faces, whose IDs are specified by digiKam. These are people whose faces have never been added to the DB before, but digiKam wants to decide the IDs for them.

2. Updating with faces that have been given no IDs. These are totally new people. The Face objects whose IDs have not been specified will automatically be given the next available ID.

3. Updating with faces that already have matching IDs in the DB. These are people whose faces have already been added to the DB before. In this case, the new and old face images are averaged and the result is treated as that ID’s face. This way, every person has only one single face image stored for him/her in the libface database.

This was quite complicated to code, because we store the Face images sequentially in a chronological order in the DB.

What I had to do was link the storage index of a face image with its ID using one std::map, and then use another map to link each ID to the number of times that ID’s face has been averaged. It’s complicated to explain, you’ll need to look at the code.
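Until you do look at the code, here is a simplified sketch of the idea. It is not the actual libface implementation: FaceStore, kNoId, the use of a flat vector of doubles as a face image, starting IDs at 1, and the running-mean formula are all assumptions made for illustration. It does keep the two maps described above and covers the three update cases.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

using Image = std::vector<double>;  // stand-in for a face image (flattened pixels)
const int kNoId = -1;               // hypothetical marker for "no ID given"

class FaceStore {
public:
    // Adds or merges one face and returns the ID it ended up with.
    int update(const Image& face, int id = kNoId) {
        if (id == kNoId) {
            id = nextFreeId();  // case 2: totally new person, no ID given
        }
        const long index = indexOf(id);
        if (index < 0) {
            // Cases 1 and 2: first face for this ID, appended in arrival order.
            indexToId_[images_.size()] = id;
            images_.push_back(face);
            idToCount_[id] = 1;
        } else {
            // Case 3: known ID, fold the new face into a running mean.
            Image& stored = images_[static_cast<std::size_t>(index)];
            assert(stored.size() == face.size());
            int& n = idToCount_[id];
            for (std::size_t i = 0; i < stored.size(); ++i) {
                stored[i] = (stored[i] * n + face[i]) / (n + 1);
            }
            ++n;
        }
        return id;
    }

    // Number of faces averaged into this ID so far (0 if unknown).
    int count(int id) const {
        const auto it = idToCount_.find(id);
        return it == idToCount_.end() ? 0 : it->second;
    }

    const Image& image(int id) const {
        const long index = indexOf(id);
        assert(index >= 0);
        return images_[static_cast<std::size_t>(index)];
    }

    std::size_t size() const { return images_.size(); }

private:
    // Linear scan of the index -> ID map; returns -1 if the ID is not stored.
    long indexOf(int id) const {
        for (const auto& entry : indexToId_) {
            if (entry.second == id) {
                return static_cast<long>(entry.first);
            }
        }
        return -1;
    }

    // Assumption: "next available" means largest known ID + 1, starting at 1.
    int nextFreeId() const {
        return idToCount_.empty() ? 1 : idToCount_.rbegin()->first + 1;
    }

    std::vector<Image> images_;             // stored sequentially
    std::map<std::size_t, int> indexToId_;  // storage index -> ID
    std::map<int, int> idToCount_;          // ID -> times averaged
};
```

Keeping the count per ID is what lets the stored image stay a true mean of everything seen so far, instead of drifting towards the most recent photo.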

This is stored at the end of the DB in this way :


Looking near the end, you can see that the person with ID 8 has 10 occurrences.

Also, the recognise function now returns the “accuracy” of each recognition. This would be very helpful if digiKam wants to present “suggestions” for tagging.
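As a rough idea of how a client might turn those values into suggestions, here is a sketch that filters and ranks candidate matches. The Match type, the suggestions function and the threshold are hypothetical, and the sketch assumes a higher value means a better match, which may not be how libface reports it.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// (person ID, confidence) pair; a hypothetical shape for a recognition result.
using Match = std::pair<int, double>;

// Drops matches below minConfidence and orders the rest best-first.
// Assumes a higher confidence value means a better match.
std::vector<Match> suggestions(std::vector<Match> matches, double minConfidence) {
    assert(minConfidence >= 0.0);
    matches.erase(
        std::remove_if(matches.begin(), matches.end(),
                       [minConfidence](const Match& m) {
                           return m.second < minConfidence;
                       }),
        matches.end());
    std::stable_sort(matches.begin(), matches.end(),
                     [](const Match& a, const Match& b) {
                         return a.second > b.second;
                     });
    return matches;
}
```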

What I’m working on right now is a new KDE-friendly wrapper for libface, named “libkface”, which presents a Qt-only interface to client apps, so that digiKam, Krita, Gwenview, KPhotoAlbum etc can use it very easily if they wish.

Oh, and as much as I want to, I’ll not be coming to Akademy this year because the issuance of my passport has been delayed.

Before ending this post, here’s the obligatory screenshot:

Detected Doggy
I iz detected doggy?? OMG!