Friendly Image Captions using Azure Computer Vision Image Analysis service

The Azure Computer Vision service is an API in the Vision category of Azure Cognitive Services. It gives access to advanced algorithms that process images and return information about them. Image Analysis is one of the Computer Vision services: it analyses an image and extracts visual features such as faces, objects and colours. It can also give you helpful tags for the image, a friendly description of it, and even detect adult, racy or gory content. I am going to keep my use case simple and use the service to generate a friendly caption for an image in the media library. The caption is generated when an image without a caption is saved.

To begin with, I add a property with the alias caption to the Image media type in my Umbraco install.

To use the Computer Vision service I am using the client library, although you can also call the REST API if you wish. The client library, Microsoft.Azure.CognitiveServices.Vision.ComputerVision, is available as a NuGet package, and I have installed it in my solution. I also need a Computer Vision resource, which I have created in my Azure subscription. Each resource has an API key and an endpoint URL, both of which the client library needs to create a ComputerVisionClient object.
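The handler shown later in this post reads the key and endpoint from the configuration keys CognitiveServices:SubscriptionKey and CognitiveServices:EndPoint. As a rough sketch, the matching section in appsettings.json could look like the fragment below. Both values are placeholders that I have made up for illustration; use the ones shown for your own resource in the Azure portal, and keep the real key out of source control (for example in user secrets or environment variables).

```json
{
  "CognitiveServices": {
    "SubscriptionKey": "<your-api-key>",
    "EndPoint": "https://<your-resource-name>.cognitiveservices.azure.com/"
  }
}
```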

I want to call the Computer Vision service whenever an image without a caption is saved. Umbraco 9 uses Notifications to let you hook into the backoffice workflow, for example to do some processing when content is saved, content is published or media is saved. Notifications exist in pairs, with a "before" and an "after" notification. You would use the "before" notification if you want the ability to cancel an operation, and the "after" notification if you want to do some processing once the operation has succeeded. I am using the MediaService notification called MediaSavedNotification, which is published after the media has been saved and the data has been persisted. To do some processing when a notification occurs, you write a handler that subscribes to it.
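To make the "before" half of the pair concrete, here is a minimal sketch of a handler for MediaSavingNotification, the counterpart of MediaSavedNotification, that cancels a save. It is not part of my caption feature: the class name and the "tmp_" naming rule are hypothetical and only there to show where CancelOperation fits.

```csharp
using System;
using Umbraco.Cms.Core.Events;
using Umbraco.Cms.Core.Notifications;

namespace Umbraco.Demo.Notifications
{
    // Hypothetical example: block media whose name starts with "tmp_" before it is saved.
    public class MediaSavingNotificationHandler : INotificationHandler<MediaSavingNotification>
    {
        public void Handle(MediaSavingNotification notification)
        {
            foreach (var media in notification.SavedEntities)
            {
                // The rule itself is made up for illustration.
                if (media.Name != null && media.Name.StartsWith("tmp_", StringComparison.OrdinalIgnoreCase))
                {
                    // Cancelling is only possible in the "before" notification;
                    // the message is shown to the editor in the backoffice.
                    notification.CancelOperation(
                        new EventMessage("Media", "Temporary files cannot be saved to the media library.", EventMessageType.Error));
                }
            }
        }
    }
}
```

A handler like this is registered in the same way as any other notification handler.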

using Microsoft.Azure.CognitiveServices.Vision.ComputerVision;
using Microsoft.Extensions.Configuration;
using Umbraco.Cms.Core.Events;
using Umbraco.Cms.Core.IO;
using Umbraco.Cms.Core.Notifications;
using Umbraco.Cms.Core.Services;
using Umbraco.Extensions;
using Umbraco.TechCommunityDayDemo.Models;

namespace Umbraco.Demo.Notifications
{
    public class MediaSavedNotificationHandler : INotificationHandler<MediaSavedNotification>
    {
        private readonly MediaFileManager _mediaFileManager;
        private readonly IMediaService _mediaService;
        private readonly IConfiguration _configuration;

        public MediaSavedNotificationHandler(MediaFileManager mediaFileManager, IMediaService mediaService, IConfiguration configuration)
        {
            _mediaFileManager = mediaFileManager;
            _mediaService = mediaService;
            _configuration = configuration;
        }

        // INotificationHandler.Handle returns void, so the async SDK call is awaited synchronously here.
        // With "async void", an exception thrown after the first await could not be caught by the caller.
        public void Handle(MediaSavedNotification notification)
        {
            var apiKey = _configuration.GetValue<string>("CognitiveServices:SubscriptionKey");
            var endpoint = _configuration.GetValue<string>("CognitiveServices:EndPoint");

            // create a Computer Vision client
            using (var client = new ComputerVisionClient(new ApiKeyServiceClientCredentials(apiKey)) { Endpoint = endpoint })
            {
                foreach (var media in notification.SavedEntities)
                {
                    // we want to call the Computer Vision service only for images where the caption field is empty
                    if (media.ContentType.Alias == Image.ModelTypeAlias && media.GetValue<string>("caption").IsNullOrWhiteSpace())
                    {
                        // get the image file from local disk as a stream using the Umbraco MediaFileManager
                        using (var imageStream = _mediaFileManager.GetFile(media, out string mediaPath))
                        {
                            // get the image description
                            var results = client.DescribeImageInStreamAsync(imageStream).GetAwaiter().GetResult();

                            // take the first caption, if there is one, and save the media item once
                            if (results?.Captions != null && results.Captions.Count > 0)
                            {
                                var mediaItem = _mediaService.GetById(media.Id);

                                mediaItem.SetValue("caption", results.Captions[0].Text);

                                // this save publishes MediaSavedNotification again, but the caption
                                // is no longer empty, so the item is skipped the second time round
                                _mediaService.Save(mediaItem);
                            }
                        }
                    }
                }
            }
        }
    }
}

When the notification is published, my code creates a ComputerVisionClient and uses it to call DescribeImageInStreamAsync() if the saved entity is an image without a caption. The response is an ImageDescription object whose Captions property holds the generated captions, and I save the caption text to the caption property of my media item using the MediaService.
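Each entry in Captions is an ImageCaption with a Text and a Confidence value. If you ever ask the service for more than one candidate caption, a small helper along these lines could pick the most confident one. This is only a sketch: CaptionPicker and BestCaption are names I have made up, and they are not part of the client library.

```csharp
using System.Linq;
using Microsoft.Azure.CognitiveServices.Vision.ComputerVision.Models;

// Hypothetical helper, not part of the Computer Vision client library.
public static class CaptionPicker
{
    // Returns the text of the caption with the highest confidence,
    // or null when the description or its caption list is missing or empty.
    public static string BestCaption(ImageDescription description)
    {
        return description?.Captions?
            .OrderByDescending(c => c.Confidence)
            .Select(c => c.Text)
            .FirstOrDefault();
    }
}
```

In the handler, the result of CaptionPicker.BestCaption(results) could then be passed to SetValue("caption", ...) after checking that it is not null.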

I also need to register my handler. There are multiple ways to do this, but I am keeping it simple and registering it in the ConfigureServices method of Startup.cs.

services.AddUmbraco(_env, _config)
    .AddBackOffice()
    .AddWebsite()
    .AddComposers()
    .AddNotificationHandler<MediaSavedNotification, MediaSavedNotificationHandler>()
    .Build();
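One of the other ways to register the handler is a composer, which Umbraco discovers through the AddComposers() call above. The sketch below assumes the handler class from this post; the composer class name is my own invention. Use it instead of the line in Startup.cs rather than in addition to it.

```csharp
using Umbraco.Cms.Core.Composing;
using Umbraco.Cms.Core.DependencyInjection;
using Umbraco.Cms.Core.Notifications;

namespace Umbraco.Demo.Notifications
{
    // Hypothetical composer name; Umbraco finds IComposer implementations at startup.
    public class MediaNotificationsComposer : IComposer
    {
        public void Compose(IUmbracoBuilder builder)
        {
            // Same registration as in Startup.cs, moved next to the handler it belongs to.
            builder.AddNotificationHandler<MediaSavedNotification, MediaSavedNotificationHandler>();
        }
    }
}
```

Keeping the registration in a composer means Startup.cs stays untouched as you add more handlers.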

And did it work? Yes!!! Look at the friendly caption the service gave me :-)

Want to know more about Notifications in Umbraco? Head over to the official documentation about Notifications and how to Subscribe to Notifications.
