
Azure AI Vision SDK for image analysis

When I visit a place, people send me messages asking whether I'm the one describing what I see. I answer no, it's computer generated, and they don't believe me. I'm tired of explaining every time that there is no magic and no Harry Potter travelling with me, so I decided to write this article to explain. Anyone can do the same thing with a little software development knowledge: you can learn by yourself on Microsoft Learn, and GitHub Copilot even explains whatever we don't understand. Let's go!

 

[Screenshot: Azure AI Vision analysis of a street scene with a person, a taxi, and a dog]

 

This works with the Microsoft Azure cloud: in your subscription, you'll need to provision an Azure AI Vision resource.
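If you prefer the command line, the same resource can be provisioned with the Azure CLI. A minimal sketch, assuming a resource group named my-rg and the free F0 tier (the resource name, group, and tier are placeholders to adapt):

az cognitiveservices account create --name my-vision --resource-group my-rg --kind ComputerVision --sku F0 --location francecentral --yes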

 

[Screenshot: Azure subscription and plan selection]

 

You can also use Azure AI Vision services in an Azure AI Services multi-service resource, either directly or in an Azure AI Foundry project.

 

[Screenshot: region selection, France Central]

 

To learn what GitHub is, check these guides: Git & Github for beginners and Utiliser Git et GitHub.

 

[Screenshot: Azure AI Vision resource deployment]

 

Azure AI Vision SDK for image analysis explained with Github Copilot : git clone

git clone https://github.com/MicrosoftLearning/mslearn-ai-vision

Or use Visual Studio Code:

[Screenshot: "Clone Git Repository" in Visual Studio Code]

"Clone Git Repository"

Select a repository, "ai-vision"...

Cloning into 'mslearn-ai-vision'...

remote: Enumerating objects: 1787, done.

remote: Counting objects: 100% (653/653), done.

remote: Compressing objects: 100% (294/294), done.

remote: Total 1787 (delta 497), reused 424 (delta 354), pack-reused 1134 (from 1)

Receiving objects: 100% (1787/1787), 37.22 MiB | 39.25 MiB/s, done.

Resolving deltas: 100% (856/856), done.
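Then move into the lab folder before the next steps (the exact path below is an assumption; it may differ between versions of the lab):

cd mslearn-ai-vision/Labfiles/analyze-images/Python/image-analysis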

 

Azure AI Vision SDK for image analysis explained with Github Copilot : create a venv

python -m venv labenv 
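Activate the virtual environment so that the pip and python commands that follow run inside it:

labenv\Scripts\activate   (Windows)

source labenv/bin/activate   (Linux/macOS)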

 

 

Azure AI Vision SDK for image analysis explained with Github Copilot : requirements

 

pip install -r requirements.txt azure-ai-vision-imageanalysis==1.0.0

Collecting azure-ai-vision-imageanalysis==1.0.0

  Downloading azure_ai_vision_imageanalysis-1.0.0-py3-none-any.whl.metadata (22 kB)

Collecting dotenv (from -r requirements.txt (line 1))

  Downloading dotenv-0.9.9-py2.py3-none-any.whl.metadata (279 bytes)

Collecting matplotlib (from -r requirements.txt (line 2))

  Downloading matplotlib-3.10.6-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (11 kB)

Collecting pillow (from -r requirements.txt (line 3))

  Downloading pillow-11.3.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.metadata (9.0 kB)

Collecting isodate>=0.6.1 (from azure-ai-vision-imageanalysis==1.0.0)

  Downloading isodate-0.7.2-py3-none-any.whl.metadata (11 kB)

Collecting azure-core>=1.30.0 (from azure-ai-vision-imageanalysis==1.0.0)

  Downloading azure_core-1.35.0-py3-none-any.whl.metadata (44 kB)

Collecting typing-extensions>=4.6.0 (from azure-ai-vision-imageanalysis==1.0.0)

  Downloading typing_extensions-4.15.0-py3-none-any.whl.metadata (3.3 kB)

Collecting python-dotenv (from dotenv->-r requirements.txt (line 1))

  Downloading python_dotenv-1.1.1-py3-none-any.whl.metadata (24 kB)

Collecting contourpy>=1.0.1 (from matplotlib->-r requirements.txt (line 2))

  Downloading contourpy-1.3.3-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.metadata (5.5 kB)

Collecting cycler>=0.10 (from matplotlib->-r requirements.txt (line 2))

  Downloading cycler-0.12.1-py3-none-any.whl.metadata (3.8 kB)

Collecting fonttools>=4.22.0 (from matplotlib->-r requirements.txt (line 2))

  Downloading fonttools-4.59.2-cp312-cp312-manylinux1_x86_64.manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_5_x86_64.whl.metadata (109 kB)

Collecting kiwisolver>=1.3.1 (from matplotlib->-r requirements.txt (line 2))

  Downloading kiwisolver-1.4.9-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (6.3 kB)

Collecting numpy>=1.23 (from matplotlib->-r requirements.txt (line 2))

  Downloading numpy-2.3.2-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.metadata (62 kB)

Collecting packaging>=20.0 (from matplotlib->-r requirements.txt (line 2))

  Downloading packaging-25.0-py3-none-any.whl.metadata (3.3 kB)

Collecting pyparsing>=2.3.1 (from matplotlib->-r requirements.txt (line 2))

  Downloading pyparsing-3.2.3-py3-none-any.whl.metadata (5.0 kB)

Collecting python-dateutil>=2.7 (from matplotlib->-r requirements.txt (line 2))

  Downloading python_dateutil-2.9.0.post0-py2.py3-none-any.whl.metadata (8.4 kB)

Collecting requests>=2.21.0 (from azure-core>=1.30.0->azure-ai-vision-imageanalysis==1.0.0)

  Downloading requests-2.32.5-py3-none-any.whl.metadata (4.9 kB)

Collecting six>=1.11.0 (from azure-core>=1.30.0->azure-ai-vision-imageanalysis==1.0.0)

  Downloading six-1.17.0-py2.py3-none-any.whl.metadata (1.7 kB)

Collecting charset_normalizer<4,>=2 (from requests>=2.21.0->azure-core>=1.30.0->azure-ai-vision-imageanalysis==1.0.0)

  Downloading charset_normalizer-3.4.3-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl.metadata (36 kB)

Collecting idna<4,>=2.5 (from requests>=2.21.0->azure-core>=1.30.0->azure-ai-vision-imageanalysis==1.0.0)

  Downloading idna-3.10-py3-none-any.whl.metadata (10 kB)

Collecting urllib3<3,>=1.21.1 (from requests>=2.21.0->azure-core>=1.30.0->azure-ai-vision-imageanalysis==1.0.0)

  Downloading urllib3-2.5.0-py3-none-any.whl.metadata (6.5 kB)

Collecting certifi>=2017.4.17 (from requests>=2.21.0->azure-core>=1.30.0->azure-ai-vision-imageanalysis==1.0.0)

  Downloading certifi-2025.8.3-py3-none-any.whl.metadata (2.4 kB)

Downloading azure_ai_vision_imageanalysis-1.0.0-py3-none-any.whl (64 kB)

Downloading dotenv-0.9.9-py2.py3-none-any.whl (1.9 kB)

Downloading matplotlib-3.10.6-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (8.7 MB)

   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 8.7/8.7 MB 67.4 MB/s eta 0:00:00

Downloading pillow-11.3.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (6.6 MB)

   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.6/6.6 MB 82.6 MB/s eta 0:00:00

Downloading azure_core-1.35.0-py3-none-any.whl (210 kB)

Downloading contourpy-1.3.3-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (362 kB)

Downloading cycler-0.12.1-py3-none-any.whl (8.3 kB)

Downloading fonttools-4.59.2-cp312-cp312-manylinux1_x86_64.manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_5_x86_64.whl (4.9 MB)

   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.9/4.9 MB 70.0 MB/s eta 0:00:00

Downloading isodate-0.7.2-py3-none-any.whl (22 kB)

Downloading kiwisolver-1.4.9-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (1.5 MB)

   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.5/1.5 MB 34.8 MB/s eta 0:00:00

Downloading numpy-2.3.2-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (16.6 MB)

   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 16.6/16.6 MB 92.7 MB/s eta 0:00:00

Downloading packaging-25.0-py3-none-any.whl (66 kB)

Downloading pyparsing-3.2.3-py3-none-any.whl (111 kB)

Downloading python_dateutil-2.9.0.post0-py2.py3-none-any.whl (229 kB)

Downloading typing_extensions-4.15.0-py3-none-any.whl (44 kB)

Downloading python_dotenv-1.1.1-py3-none-any.whl (20 kB)

Downloading requests-2.32.5-py3-none-any.whl (64 kB)

Downloading six-1.17.0-py2.py3-none-any.whl (11 kB)

Downloading certifi-2025.8.3-py3-none-any.whl (161 kB)

Downloading charset_normalizer-3.4.3-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl (151 kB)

Downloading idna-3.10-py3-none-any.whl (70 kB)

Downloading urllib3-2.5.0-py3-none-any.whl (129 kB)

Installing collected packages: urllib3, typing-extensions, six, python-dotenv, pyparsing, pillow, packaging, numpy, kiwisolver, isodate, idna, fonttools, cycler, charset_normalizer, certifi, requests, python-dateutil, dotenv, contourpy, matplotlib, azure-core, azure-ai-vision-imageanalysis

Successfully installed azure-ai-vision-imageanalysis-1.0.0 azure-core-1.35.0 certifi-2025.8.3 charset_normalizer-3.4.3 contourpy-1.3.3 cycler-0.12.1 dotenv-0.9.9 fonttools-4.59.2 idna-3.10 isodate-0.7.2 kiwisolver-1.4.9 matplotlib-3.10.6 numpy-2.3.2 packaging-25.0 pillow-11.3.0 pyparsing-3.2.3 python-dateutil-2.9.0.post0 python-dotenv-1.1.1 requests-2.32.5 six-1.17.0 typing-extensions-4.15.0 urllib3-2.5.0

 

[notice] A new release of pip is available: 24.3.1 -> 25.2

[notice] To update, run: pip install --upgrade pip
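As the pip log shows, the lab's requirements.txt lists three packages, one per line:

dotenv
matplotlib
pillow

The Azure AI Vision SDK itself is pinned directly on the command line (azure-ai-vision-imageanalysis==1.0.0).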

Azure AI Vision SDK for image analysis explained with Github Copilot : modify the env

code .env

(to set your key and endpoint)
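The .env file holds the two settings the script reads with os.getenv. A minimal sketch with placeholder values; paste the key and endpoint from your resource's "Keys and Endpoint" page in the Azure portal:

AI_SERVICE_ENDPOINT=https://<your-resource>.cognitiveservices.azure.com/
AI_SERVICE_KEY=<your-key>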

Azure AI Vision SDK for image analysis explained with Github Copilot : open code in visual studio code

 

 

To learn the Python language, check this: Cours python en ligne gratuit français. Base du langage.

 

 

code image-analysis.py

from dotenv import load_dotenv
import os
from PIL import Image, ImageDraw
import sys
from matplotlib import pyplot as plt
from azure.core.exceptions import HttpResponseError
import requests

# import namespaces
from azure.ai.vision.imageanalysis import ImageAnalysisClient
from azure.ai.vision.imageanalysis.models import VisualFeatures
from azure.core.credentials import AzureKeyCredential

def main():
    # Clear the console
    os.system('cls' if os.name=='nt' else 'clear')

    try:
        # Get Configuration Settings
        load_dotenv()
        ai_endpoint = os.getenv('AI_SERVICE_ENDPOINT')
        ai_key = os.getenv('AI_SERVICE_KEY')

        # Get image
        image_file = 'images/street.jpg'
        if len(sys.argv) > 1:
            image_file = sys.argv[1]

        # Authenticate Azure AI Vision client
        cv_client = ImageAnalysisClient(
            endpoint=ai_endpoint,
            credential=AzureKeyCredential(ai_key))

        # Analyze image
        with open(image_file, "rb") as f:
            image_data = f.read()
        print(f'\nAnalyzing {image_file}\n')

        result = cv_client.analyze(
            image_data=image_data,
            visual_features=[
                VisualFeatures.CAPTION,
                VisualFeatures.DENSE_CAPTIONS,
                VisualFeatures.TAGS,
                VisualFeatures.OBJECTS,
                VisualFeatures.PEOPLE],
        )

        # Get image captions
        if result.caption is not None:
            print("\nCaption:")
            print(" Caption: '{}' (confidence: {:.2f}%)".format(result.caption.text, result.caption.confidence * 100))

        if result.dense_captions is not None:
            print("\nDense Captions:")
            for caption in result.dense_captions.list:
                print(" Caption: '{}' (confidence: {:.2f}%)".format(caption.text, caption.confidence * 100))

        # Get image tags
        if result.tags is not None:
            print("\nTags:")
            for tag in result.tags.list:
                print(" Tag: '{}' (confidence: {:.2f}%)".format(tag.name, tag.confidence * 100))

        # Get objects in the image
        if result.objects is not None:
            print("\nObjects in image:")
            for detected_object in result.objects.list:
                # Print object tag and confidence
                print(" {} (confidence: {:.2f}%)".format(detected_object.tags[0].name, detected_object.tags[0].confidence * 100))
            # Annotate objects in the image
            show_objects(image_file, result.objects.list)

        # Get people in the image
        if result.people is not None:
            print("\nPeople in image:")
            for detected_person in result.people.list:
                if detected_person.confidence > 0.2:
                    # Print location and confidence of each person detected
                    print(" {} (confidence: {:.2f}%)".format(detected_person.bounding_box, detected_person.confidence * 100))
            # Annotate people in the image
            show_people(image_file, result.people.list)

    except Exception as ex:
        print(ex)

def show_objects(image_filename, detected_objects):
    print("\nAnnotating objects...")

    # Prepare image for drawing
    image = Image.open(image_filename)
    fig = plt.figure(figsize=(image.width/100, image.height/100))
    plt.axis('off')
    draw = ImageDraw.Draw(image)
    color = 'cyan'

    for detected_object in detected_objects:
        # Draw object bounding box
        r = detected_object.bounding_box
        bounding_box = ((r.x, r.y), (r.x + r.width, r.y + r.height))
        draw.rectangle(bounding_box, outline=color, width=3)
        plt.annotate(detected_object.tags[0].name, (r.x, r.y), backgroundcolor=color)

    # Save annotated image
    plt.imshow(image)
    plt.tight_layout(pad=0)
    objectfile = 'objects.jpg'
    fig.savefig(objectfile)
    print('  Results saved in', objectfile)

def show_people(image_filename, detected_people):
    print("\nAnnotating people...")

    # Prepare image for drawing
    image = Image.open(image_filename)
    fig = plt.figure(figsize=(image.width/100, image.height/100))
    plt.axis('off')
    draw = ImageDraw.Draw(image)
    color = 'cyan'

    for detected_person in detected_people:
        if detected_person.confidence > 0.2:
            # Draw person bounding box
            r = detected_person.bounding_box
            bounding_box = ((r.x, r.y), (r.x + r.width, r.y + r.height))
            draw.rectangle(bounding_box, outline=color, width=3)

    # Save annotated image
    plt.imshow(image)
    plt.tight_layout(pad=0)
    peoplefile = 'people.jpg'
    fig.savefig(peoplefile)
    print('  Results saved in', peoplefile)

if __name__ == "__main__":
    main()

Azure AI Vision SDK for image analysis explained with Github Copilot : choose the model

[Screenshot: GitHub Copilot model picker]

We can choose the model: Claude Sonnet 3.5, Gemini 2.0 Flash, GPT-4.1, GPT-4o, GPT-5 mini (Preview), o3-mini.

Azure AI Vision SDK for image analysis explained with Github Copilot : explain import namespaces

 

[Screenshot: GitHub Copilot explaining the imported namespaces]

Azure AI Vision SDK for image analysis explained with Github Copilot : @workspace explain

[Screenshot: GitHub Copilot @workspace explain]

Next, GitHub Copilot answers:

"You have several issues:

Expected indented block: This usually means Python expects an indented code block after a statement like if, for, def, or class. Make sure your code structure is correct.

Import could not be resolved: This means the required packages are not installed in your environment.

Unused imports: You have imported modules or classes that are not used in your code.

How to fix

1. Install missing packages

Open your VS Code terminal and run:

pip install python-dotenv pillow azure-ai-vision azure-core matplotlib requests"

Azure AI Vision SDK for image analysis explained with Github Copilot : explain Get image

 # Get image
        image_file = 'images/street.jpg'
        if len(sys.argv) > 1:
            image_file = sys.argv[1]

Github Copilot: "This code snippet is responsible for determining which image file will be analyzed by the program. By default, it sets image_file to 'images/street.jpg', meaning that if the user does not provide any additional input, the program will use this image. However, the next two lines check if any command-line arguments were provided when the script was run. The sys.argv list contains all command-line arguments passed to the script, with the first element (sys.argv[0]) being the script name itself. If the length of sys.argv is greater than 1, it means the user has supplied at least one argument after the script name. In that case, the code assigns the first argument (sys.argv[1]) to image_file, allowing the user to specify a different image file to analyze. This approach makes the script flexible, supporting both a default image and user-specified images via the command line."

Azure AI Vision SDK for image analysis explained with Github Copilot : explain Analyze image

 # Analyze image
        with open(image_file, "rb") as f:
            image_data = f.read()
        print(f'\nAnalyzing {image_file}\n')

        result = cv_client.analyze(
            image_data=image_data,
            visual_features=[
                VisualFeatures.CAPTION,
                VisualFeatures.DENSE_CAPTIONS,
                VisualFeatures.TAGS,
                VisualFeatures.OBJECTS,
                VisualFeatures.PEOPLE],
        )

Github Copilot: "This section of code is responsible for sending the selected image to the Azure AI Vision service for analysis. First, it opens the image file specified by image_file in binary read mode ("rb"), reads its contents into the variable image_data, and then closes the file automatically thanks to the with statement. This ensures the image data is ready to be sent to the analysis service. Next, the code prints a message to the console indicating which image is being analyzed, providing feedback to the user. The core operation follows: the cv_client.analyze method is called, passing in the image data and a list of visual features to extract. These features include captions, dense captions, tags, objects, and people, as specified by the VisualFeatures enumeration. The result of this analysis is stored in the result variable, which can then be used to access the detected features and their details. This approach allows the program to flexibly analyze various aspects of the image in a single API call."

Azure AI Vision SDK for image analysis explained with Github Copilot : generate doc

 

[Screenshot: GitHub Copilot generating documentation]

Github Copilot: "

    Main function to analyze an image using Azure AI Vision services.

    This function performs the following steps:

    1. Clears the console for better readability.

    2. Loads configuration settings (endpoint and key) from environment variables.

    3. Determines the image file to analyze, either from command-line arguments or a default path.

    4. Authenticates the Azure AI Vision client using the provided credentials.

    5. Reads the image file and sends it to the Azure AI Vision service for analysis, requesting captions, dense captions, tags, objects, and people detection.

    6. Prints the analysis results, including captions, dense captions, and tags with their confidence scores.

    7. Handles and prints any exceptions that occur during execution.

    ""

Azure AI Vision SDK for image analysis explained with Github Copilot : run the code with an image

[Image: input street photo with a person, a taxi, and a dog]

python image-analysis.py image.jpg

Azure AI Vision SDK for image analysis explained with Github Copilot : basic results

 

Caption:
 Caption: 'a man walking a dog on a leash on a street' (confidence: 82.07%)

Dense Captions:
 Caption: 'a man walking a dog on a leash on a street' (confidence: 82.06%)
 Caption: 'a man walking on a street' (confidence: 69.02%)
 Caption: 'a yellow car on the street' (confidence: 78.22%)
 Caption: 'a black dog walking on the street' (confidence: 75.31%)
 Caption: 'a blurry image of a blue car' (confidence: 82.01%)
 Caption: 'a yellow taxi cab on the street' (confidence: 72.44%)

Tags:
 Tag: 'outdoor' (confidence: 99.87%)
 Tag: 'land vehicle' (confidence: 99.02%)
 Tag: 'vehicle' (confidence: 98.89%)
 Tag: 'building' (confidence: 98.55%)
 Tag: 'road' (confidence: 95.98%)
 Tag: 'wheel' (confidence: 95.14%)
 Tag: 'street' (confidence: 94.71%)
 Tag: 'person' (confidence: 93.01%)
 Tag: 'clothing' (confidence: 91.19%)
 Tag: 'taxi' (confidence: 90.95%)
 Tag: 'car' (confidence: 84.01%)
 Tag: 'dog' (confidence: 82.68%)
 Tag: 'yellow' (confidence: 77.08%)
 Tag: 'walking' (confidence: 74.11%)
 Tag: 'city' (confidence: 64.80%)
 Tag: 'woman' (confidence: 57.53%)

Objects in image:
 car (confidence: 72.40%)
 taxi (confidence: 77.00%)
 person (confidence: 78.10%)
 dog (confidence: 54.40%)

Annotating objects...
  Results saved in objects.jpg

People in image:
 {'x': 241, 'y': 109, 'w': 155, 'h': 399} (confidence: 94.99%)
 {'x': 396, 'y': 264, 'w': 23, 'h': 58} (confidence: 24.93%)
 {'x': 699, 'y': 262, 'w': 20, 'h': 33} (confidence: 22.45%)

Annotating people...
  Results saved in people.jpg

Each detected person is reported as a bounding box: x and y are the pixel coordinates of the top-left corner, w and h its width and height.

Running the same script on other sample images gives, for example:

Caption:
 Caption: 'a man in a suit' (confidence: 78.55%)

Dense Captions:
 Caption: 'a man in a suit' (confidence: 78.55%)
 Caption: 'a man in a suit giving a thumbs up' (confidence: 80.96%)

Tags:
 Tag: 'outdoor' (confidence: 99.93%)
 Tag: 'cloud' (confidence: 99.79%)
 Tag: 'grass' (confidence: 99.76%)
 Tag: 'tree' (confidence: 99.09%)
 Tag: 'sky' (confidence: 98.95%)
 Tag: 'building' (confidence: 96.15%)
 Tag: 'dome' (confidence: 91.38%)
 Tag: 'presidential palace' (confidence: 91.03%)
 Tag: 'landmark' (confidence: 89.98%)
 Tag: 'palace' (confidence: 89.63%)
 Tag: 'plant' (confidence: 88.77%)
 Tag: 'courthouse' (confidence: 84.78%)
 Tag: 'field' (confidence: 82.64%)
 Tag: 'city' (confidence: 76.11%)
 Tag: 'large' (confidence: 68.73%)
 Tag: 'park' (confidence: 54.24%)

 

[Image: annotated results with bounding boxes around the person, taxi, and dog]

 

 

Azure AI Vision SDK for image analysis explained with Github Copilot : describe the image with natural language processing (NLP)

 

Now, with the tags outdoor, land vehicle, vehicle, building, road, wheel, street, person, clothing, taxi, car, dog, yellow, walking, city, we can ask Azure OpenAI to describe the image in natural language; see AI Azure Open AI Intégration Chatbot avec RAG CosmosDB depuis fork GIT POC MVP.

 

We can use, for example, the model "Phi-4-multimodal-instruct" or "GPT-4o mini", etc.
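A minimal sketch of that call with the openai Python package and an Azure OpenAI resource (the endpoint, key, deployment name, and prompt wording are placeholders and assumptions):

from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://<your-openai-resource>.openai.azure.com/",
    api_key="<your-key>",
    api_version="2024-02-01")

# Tags returned by the image analysis above
tags = "outdoor, land vehicle, vehicle, building, road, wheel, street, person, clothing, taxi, car, dog, yellow, walking, city"

response = client.chat.completions.create(
    model="gpt-4o-mini",  # your deployment name
    messages=[
        {"role": "system", "content": "You write natural-language descriptions of images."},
        {"role": "user", "content": f"Describe, in a few paragraphs, an image with these tags: {tags}"}])

print(response.choices[0].message.content)

The description below is the kind of output such a call returns: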

 

 " In the image, a person is seen walking a black dog along a bustling street in an urban city environment. The individual is dressed in a smart suit, which adds a touch of professionalism to the scene. He appears to be enjoying a leisurely stroll, perhaps on his way to work or a meeting, as he holds a smartphone in one hand while guiding the dog with the other.

 

The road is lined with several yellow taxis, iconic to many metropolitan areas, indicating a vibrant atmosphere where transportation is readily available. These land vehicles are parked along the curb, showcasing their distinct color and design, which contrasts with the more muted tones of the surrounding buildings. The architecture in the background features classic city elements, with tall structures that hint at the historical significance of the area.

 

The wheel of the taxis is visible, emphasizing the movement and activity typical of city life. The scene captures a moment of interaction between the man and his dog, highlighting the companionship that exists in urban settings. The presence of the dog adds a layer of warmth to the otherwise busy environment, suggesting a balance between work and leisure.

 

Overall, this image encapsulates the essence of outdoor life in a city, where people navigate their daily routines amidst the hustle and bustle. The combination of the person, dog, vehicles, and buildings creates a dynamic snapshot of urban living, reflecting both the challenges and joys of city life. "

 

 

A question? Ask it here

 

Need help with Azure AI Vision SDK for image analysis?

 

Fill this form
