If it were up to me, all the services of the world would be available to us via REST APIs. In keeping with this grand vision, I stumbled across Microsoft Cognitive Services – a series of REST APIs that let you identify the joy (or grief) in a person's face via an easy-to-use API. Wonderful!
For me, this is definitely an API in need of a problem to solve. I don’t have a specific use case but I can think of a few creepy things this would be great for:
- A social network where you can only post and comment and respond with selfies.
- Snapchat-like filter stuff on images and videos.
- Profiling (yes, the public creepy kind.)
- Customer Research (also the creepy kind.)
- Any place where you need to count people (like a movie theater? Or a conference?)
I’ve only dug into the three vision APIs (Computer Vision, Emotion, and Face), but they are super easy to use. Pass them an image and get some JSON back. I didn’t even realize until late into my experiments that if you scroll down to the bottom of the docs, they give you code examples in almost any modern language you can think of.
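To give a feel for the pass-an-image, get-JSON-back flow, here is a rough sketch of building a Face API detect request in Python. The endpoint region and the subscription key are placeholders (not real values), so treat this as an outline of the shape of the call, not a copy-paste recipe — the official docs have the real endpoints and samples.

```python
import urllib.request

# Placeholder values: swap in your own region and subscription key.
FACE_ENDPOINT = "https://westus.api.cognitive.microsoft.com/face/v1.0/detect"
API_KEY = "your-subscription-key"

def build_detect_request(image_bytes: bytes) -> urllib.request.Request:
    """Build a Face API detect request: raw image bytes in, JSON back out."""
    return urllib.request.Request(
        FACE_ENDPOINT,
        data=image_bytes,  # the image goes in the POST body as binary
        headers={
            "Content-Type": "application/octet-stream",
            "Ocp-Apim-Subscription-Key": API_KEY,
        },
        method="POST",
    )

# Opening this request (with a real key) returns JSON along the lines of:
# [{"faceRectangle": {"top": 43, "left": 72, "width": 91, "height": 91}, ...}]
```

The other two APIs follow the same pattern: a POST with the image bytes and a subscription-key header, JSON in the response.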
All that to say, I created a really quick demo to see what all the vision APIs can do! You upload an image and it submits it to the Computer Vision, Face, and Emotion APIs. In return, I draw a white rectangle on all the faces returned and dump the JSON on the page for you to check out. At some point I’ll put this up on GitHub, but this is less “code that I’m proud of writing” and more “how quickly can I move code from Stack Overflow to my sample page to get this thing working.”
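For the white rectangles: the Face API’s detect response includes a `faceRectangle` per face, and drawing the boxes is just a matter of pulling those four numbers out. A minimal sketch (the field names are real; the coordinate values below are made up for illustration):

```python
import json

# Illustrative sample of the detect response shape -- field names match
# the Face API, the numbers are invented.
sample_response = """
[
  {"faceRectangle": {"top": 54, "left": 120, "width": 95, "height": 95}},
  {"faceRectangle": {"top": 60, "left": 310, "width": 88, "height": 88}}
]
"""

def face_boxes(response_json: str) -> list:
    """Extract (left, top, width, height) tuples, one per detected face."""
    faces = json.loads(response_json)
    return [
        (f["faceRectangle"]["left"], f["faceRectangle"]["top"],
         f["faceRectangle"]["width"], f["faceRectangle"]["height"])
        for f in faces
    ]
```

Each tuple maps straight onto a canvas `strokeRect(left, top, width, height)` call on the page.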
A couple things I discovered:
- Apparently I love unordered lists. I just discovered that right now when I started my second unordered list in a single post.
- The results on the emotion API are really odd. Surprise and contempt seem to rank really high in photos where I thought happiness would rank higher. Maybe the results are more poignant than I thought?
- It does well with photographs but not so much with drawings. Sometimes the description of an illustrated portrait would read something crazy like “two giraffes in a fenced-in area.” Also, any time it sees a computer it adds “coffee” and “pizza” as tags.
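For context on those odd emotion rankings: the Emotion API scores each face across eight named emotions, and the “winner” is just the highest score. A small sketch of picking the dominant one (the emotion names are the real ones; the numbers here are invented):

```python
# Illustrative scores dict shaped like an Emotion API result.
# The eight emotion names are real; the values are made up.
scores = {
    "anger": 0.01, "contempt": 0.42, "disgust": 0.0, "fear": 0.0,
    "happiness": 0.30, "neutral": 0.20, "sadness": 0.02, "surprise": 0.05,
}

def dominant_emotion(scores: dict) -> str:
    """Return the emotion label with the highest score."""
    return max(scores, key=scores.get)
```

With numbers like these you can see how contempt ends up “winning” even when happiness scores respectably.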
Anyway, well done, Microsoft. There is some really interesting stuff here, and it’s always nice to work with a solid, well-documented, and consistent API.
(Note: If you visit the page on a phone you can take a picture to upload…)
(Another note: I don’t store any of the pictures that you upload anywhere, but I do send the image over to Microsoft, and I’m not sure what they do with it.)