Azure AI Vision SDK for image analysis
When I visit a place, people send me messages asking whether I'm really the one describing what I see. I answer no, it's computer generated, and they don't believe me. I'm tired of explaining every time that there is no magic (and no Harry Potter travelling with me), so I decided to write this article. Anyone can do the same thing with a little software development knowledge: you can learn by yourself on Microsoft Learn, and GitHub Copilot can even explain anything you don't understand. Let's go!
This works with the Microsoft Azure cloud: in your subscription, you'll need to provision an Azure AI Vision resource.
You can also use Azure AI Vision services in an Azure AI Services multi-service resource, either directly or in an Azure AI Foundry project.
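If you prefer the command line, here's a minimal sketch with the Azure CLI (resource name, resource group, SKU, and region are placeholders to adapt):
az cognitiveservices account create \
    --name my-ai-vision \
    --resource-group my-rg \
    --kind ComputerVision \
    --sku F0 \
    --location eastus \
    --yes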
New to GitHub? Check these guides: Git & Github for beginners and Utiliser Git et GitHub
Azure AI Vision SDK for image analysis explained with Github Copilot : git clone
git clone https://github.com/MicrosoftLearning/mslearn-ai-vision
Or use Visual Studio Code:
"Clone Git Repository"
Select a repository, "ai-vision"...
Cloning into 'mslearn-ai-vision'...
remote: Enumerating objects: 1787, done.
remote: Counting objects: 100% (653/653), done.
remote: Compressing objects: 100% (294/294), done.
remote: Total 1787 (delta 497), reused 424 (delta 354), pack-reused 1134 (from 1)
Receiving objects: 100% (1787/1787), 37.22 MiB | 39.25 MiB/s, done.
Resolving deltas: 100% (856/856), done.
Azure AI Vision SDK for image analysis explained with Github Copilot : create a venv
python -m venv labenv
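Then activate it; the command depends on your OS:
source labenv/bin/activate   # Linux / macOS
labenv\Scripts\activate      # Windows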
Azure AI Vision SDK for image analysis explained with Github Copilot : requirements
pip install -r requirements.txt azure-ai-vision-imageanalysis==1.0.0
Collecting azure-ai-vision-imageanalysis==1.0.0
Downloading azure_ai_vision_imageanalysis-1.0.0-py3-none-any.whl.metadata (22 kB)
Collecting dotenv (from -r requirements.txt (line 1))
Downloading dotenv-0.9.9-py2.py3-none-any.whl.metadata (279 bytes)
Collecting matplotlib (from -r requirements.txt (line 2))
Downloading matplotlib-3.10.6-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (11 kB)
Collecting pillow (from -r requirements.txt (line 3))
Downloading pillow-11.3.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.metadata (9.0 kB)
Collecting isodate>=0.6.1 (from azure-ai-vision-imageanalysis==1.0.0)
Downloading isodate-0.7.2-py3-none-any.whl.metadata (11 kB)
Collecting azure-core>=1.30.0 (from azure-ai-vision-imageanalysis==1.0.0)
Downloading azure_core-1.35.0-py3-none-any.whl.metadata (44 kB)
Collecting typing-extensions>=4.6.0 (from azure-ai-vision-imageanalysis==1.0.0)
Downloading typing_extensions-4.15.0-py3-none-any.whl.metadata (3.3 kB)
Collecting python-dotenv (from dotenv->-r requirements.txt (line 1))
Downloading python_dotenv-1.1.1-py3-none-any.whl.metadata (24 kB)
Collecting contourpy>=1.0.1 (from matplotlib->-r requirements.txt (line 2))
Downloading contourpy-1.3.3-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.metadata (5.5 kB)
Collecting cycler>=0.10 (from matplotlib->-r requirements.txt (line 2))
Downloading cycler-0.12.1-py3-none-any.whl.metadata (3.8 kB)
Collecting fonttools>=4.22.0 (from matplotlib->-r requirements.txt (line 2))
Downloading fonttools-4.59.2-cp312-cp312-manylinux1_x86_64.manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_5_x86_64.whl.metadata (109 kB)
Collecting kiwisolver>=1.3.1 (from matplotlib->-r requirements.txt (line 2))
Downloading kiwisolver-1.4.9-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (6.3 kB)
Collecting numpy>=1.23 (from matplotlib->-r requirements.txt (line 2))
Downloading numpy-2.3.2-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.metadata (62 kB)
Collecting packaging>=20.0 (from matplotlib->-r requirements.txt (line 2))
Downloading packaging-25.0-py3-none-any.whl.metadata (3.3 kB)
Collecting pyparsing>=2.3.1 (from matplotlib->-r requirements.txt (line 2))
Downloading pyparsing-3.2.3-py3-none-any.whl.metadata (5.0 kB)
Collecting python-dateutil>=2.7 (from matplotlib->-r requirements.txt (line 2))
Downloading python_dateutil-2.9.0.post0-py2.py3-none-any.whl.metadata (8.4 kB)
Collecting requests>=2.21.0 (from azure-core>=1.30.0->azure-ai-vision-imageanalysis==1.0.0)
Downloading requests-2.32.5-py3-none-any.whl.metadata (4.9 kB)
Collecting six>=1.11.0 (from azure-core>=1.30.0->azure-ai-vision-imageanalysis==1.0.0)
Downloading six-1.17.0-py2.py3-none-any.whl.metadata (1.7 kB)
Collecting charset_normalizer<4,>=2 (from requests>=2.21.0->azure-core>=1.30.0->azure-ai-vision-imageanalysis==1.0.0)
Downloading charset_normalizer-3.4.3-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl.metadata (36 kB)
Collecting idna<4,>=2.5 (from requests>=2.21.0->azure-core>=1.30.0->azure-ai-vision-imageanalysis==1.0.0)
Downloading idna-3.10-py3-none-any.whl.metadata (10 kB)
Collecting urllib3<3,>=1.21.1 (from requests>=2.21.0->azure-core>=1.30.0->azure-ai-vision-imageanalysis==1.0.0)
Downloading urllib3-2.5.0-py3-none-any.whl.metadata (6.5 kB)
Collecting certifi>=2017.4.17 (from requests>=2.21.0->azure-core>=1.30.0->azure-ai-vision-imageanalysis==1.0.0)
Downloading certifi-2025.8.3-py3-none-any.whl.metadata (2.4 kB)
Downloading azure_ai_vision_imageanalysis-1.0.0-py3-none-any.whl (64 kB)
Downloading dotenv-0.9.9-py2.py3-none-any.whl (1.9 kB)
Downloading matplotlib-3.10.6-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (8.7 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 8.7/8.7 MB 67.4 MB/s eta 0:00:00
Downloading pillow-11.3.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (6.6 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.6/6.6 MB 82.6 MB/s eta 0:00:00
Downloading azure_core-1.35.0-py3-none-any.whl (210 kB)
Downloading contourpy-1.3.3-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (362 kB)
Downloading cycler-0.12.1-py3-none-any.whl (8.3 kB)
Downloading fonttools-4.59.2-cp312-cp312-manylinux1_x86_64.manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_5_x86_64.whl (4.9 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.9/4.9 MB 70.0 MB/s eta 0:00:00
Downloading isodate-0.7.2-py3-none-any.whl (22 kB)
Downloading kiwisolver-1.4.9-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (1.5 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.5/1.5 MB 34.8 MB/s eta 0:00:00
Downloading numpy-2.3.2-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (16.6 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 16.6/16.6 MB 92.7 MB/s eta 0:00:00
Downloading packaging-25.0-py3-none-any.whl (66 kB)
Downloading pyparsing-3.2.3-py3-none-any.whl (111 kB)
Downloading python_dateutil-2.9.0.post0-py2.py3-none-any.whl (229 kB)
Downloading typing_extensions-4.15.0-py3-none-any.whl (44 kB)
Downloading python_dotenv-1.1.1-py3-none-any.whl (20 kB)
Downloading requests-2.32.5-py3-none-any.whl (64 kB)
Downloading six-1.17.0-py2.py3-none-any.whl (11 kB)
Downloading certifi-2025.8.3-py3-none-any.whl (161 kB)
Downloading charset_normalizer-3.4.3-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl (151 kB)
Downloading idna-3.10-py3-none-any.whl (70 kB)
Downloading urllib3-2.5.0-py3-none-any.whl (129 kB)
Installing collected packages: urllib3, typing-extensions, six, python-dotenv, pyparsing, pillow, packaging, numpy, kiwisolver, isodate, idna, fonttools, cycler, charset_normalizer, certifi, requests, python-dateutil, dotenv, contourpy, matplotlib, azure-core, azure-ai-vision-imageanalysis
Successfully installed azure-ai-vision-imageanalysis-1.0.0 azure-core-1.35.0 certifi-2025.8.3 charset_normalizer-3.4.3 contourpy-1.3.3 cycler-0.12.1 dotenv-0.9.9 fonttools-4.59.2 idna-3.10 isodate-0.7.2 kiwisolver-1.4.9 matplotlib-3.10.6 numpy-2.3.2 packaging-25.0 pillow-11.3.0 pyparsing-3.2.3 python-dateutil-2.9.0.post0 python-dotenv-1.1.1 requests-2.32.5 six-1.17.0 typing-extensions-4.15.0 urllib3-2.5.0
[notice] A new release of pip is available: 24.3.1 -> 25.2
[notice] To update, run: pip install --upgrade pip
Azure AI Vision SDK for image analysis explained with Github Copilot : modify the env
code .env
(to set the key and endpoint of your Azure AI Vision resource)
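The script below reads two environment variables, so the .env file looks like this (placeholder values to replace with your own endpoint and key):
AI_SERVICE_ENDPOINT=https://<your-resource-name>.cognitiveservices.azure.com/
AI_SERVICE_KEY=<your-key>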
Azure AI Vision SDK for image analysis explained with Github Copilot : open code in visual studio code
To learn the basics of the Python language, check this: Cours python en ligne gratuit français. Base du langage
code image-analysis.py
from dotenv import load_dotenv
import os
from PIL import Image, ImageDraw
import sys
from matplotlib import pyplot as plt
from azure.core.exceptions import HttpResponseError
import requests

# import namespaces
from azure.ai.vision.imageanalysis import ImageAnalysisClient
from azure.ai.vision.imageanalysis.models import VisualFeatures
from azure.core.credentials import AzureKeyCredential


def main():
    # Clear the console
    os.system('cls' if os.name == 'nt' else 'clear')

    try:
        # Get Configuration Settings
        load_dotenv()
        ai_endpoint = os.getenv('AI_SERVICE_ENDPOINT')
        ai_key = os.getenv('AI_SERVICE_KEY')

        # Get image (default path, or the one passed on the command line)
        image_file = 'images/street.jpg'
        if len(sys.argv) > 1:
            image_file = sys.argv[1]

        # Authenticate Azure AI Vision client
        cv_client = ImageAnalysisClient(
            endpoint=ai_endpoint,
            credential=AzureKeyCredential(ai_key))

        # Analyze image
        with open(image_file, "rb") as f:
            image_data = f.read()
        print(f'\nAnalyzing {image_file}\n')

        result = cv_client.analyze(
            image_data=image_data,
            visual_features=[
                VisualFeatures.CAPTION,
                VisualFeatures.DENSE_CAPTIONS,
                VisualFeatures.TAGS,
                VisualFeatures.OBJECTS,
                VisualFeatures.PEOPLE],
        )

        # Get image captions
        if result.caption is not None:
            print("\nCaption:")
            print(" Caption: '{}' (confidence: {:.2f}%)".format(result.caption.text, result.caption.confidence * 100))

        if result.dense_captions is not None:
            print("\nDense Captions:")
            for caption in result.dense_captions.list:
                print(" Caption: '{}' (confidence: {:.2f}%)".format(caption.text, caption.confidence * 100))

        # Get image tags
        if result.tags is not None:
            print("\nTags:")
            for tag in result.tags.list:
                print(" Tag: '{}' (confidence: {:.2f}%)".format(tag.name, tag.confidence * 100))

        # Get objects in the image
        if result.objects is not None:
            print("\nObjects in image:")
            for detected_object in result.objects.list:
                # Print object tag and confidence
                print(" {} (confidence: {:.2f}%)".format(detected_object.tags[0].name, detected_object.tags[0].confidence * 100))
            # Annotate objects in the image
            show_objects(image_file, result.objects.list)

        # Get people in the image
        if result.people is not None:
            print("\nPeople in image:")
            for detected_person in result.people.list:
                if detected_person.confidence > 0.2:
                    # Print location and confidence of each person detected
                    print(" {} (confidence: {:.2f}%)".format(detected_person.bounding_box, detected_person.confidence * 100))
            # Annotate people in the image
            show_people(image_file, result.people.list)

    except Exception as ex:
        print(ex)


def show_objects(image_filename, detected_objects):
    print("\nAnnotating objects...")

    # Prepare image for drawing
    image = Image.open(image_filename)
    fig = plt.figure(figsize=(image.width / 100, image.height / 100))
    plt.axis('off')
    draw = ImageDraw.Draw(image)
    color = 'cyan'

    for detected_object in detected_objects:
        # Draw object bounding box
        r = detected_object.bounding_box
        bounding_box = ((r.x, r.y), (r.x + r.width, r.y + r.height))
        draw.rectangle(bounding_box, outline=color, width=3)
        plt.annotate(detected_object.tags[0].name, (r.x, r.y), backgroundcolor=color)

    # Save annotated image
    plt.imshow(image)
    plt.tight_layout(pad=0)
    objectfile = 'objects.jpg'
    fig.savefig(objectfile)
    print(' Results saved in', objectfile)


def show_people(image_filename, detected_people):
    print("\nAnnotating people...")

    # Prepare image for drawing
    image = Image.open(image_filename)
    fig = plt.figure(figsize=(image.width / 100, image.height / 100))
    plt.axis('off')
    draw = ImageDraw.Draw(image)
    color = 'cyan'

    for detected_person in detected_people:
        if detected_person.confidence > 0.2:
            # Draw person bounding box
            r = detected_person.bounding_box
            bounding_box = ((r.x, r.y), (r.x + r.width, r.y + r.height))
            draw.rectangle(bounding_box, outline=color, width=3)

    # Save annotated image
    plt.imshow(image)
    plt.tight_layout(pad=0)
    peoplefile = 'people.jpg'
    fig.savefig(peoplefile)
    print(' Results saved in', peoplefile)


if __name__ == "__main__":
    main()
Azure AI Vision SDK for image analysis explained with Github Copilot : choose the model
We can choose the model: Claude 3.5 Sonnet, Gemini 2.0 Flash, GPT-4.1, GPT-4o, GPT-5 mini (Preview), o3-mini.
Azure AI Vision SDK for image analysis explained with Github Copilot : explain import namespaces
Azure AI Vision SDK for image analysis explained with Github Copilot : @workspace explain
Next, Github Copilot answers:
"You have several issues:
Expected indented block: This usually means Python expects an indented code block after a statement like if, for, def, or class. Make sure your code structure is correct.
Import could not be resolved: This means the required packages are not installed in your environment.
Unused imports: You have imported modules or classes that are not used in your code.
How to fix
1. Install missing packages
Open your VS Code terminal and run:
pip install python-dotenv pillow azure-ai-vision azure-core matplotlib requests"
Azure AI Vision SDK for image analysis explained with Github Copilot : explain Get image
Github Copilot: "This code snippet is responsible for determining which image file will be analyzed by the program. By default, it sets image_file to 'images/street.jpg', meaning that if the user does not provide any additional input, the program will use this image. However, the next two lines check if any command-line arguments were provided when the script was run. The sys.argv list contains all command-line arguments passed to the script, with the first element (sys.argv[0]) being the script name itself. If the length of sys.argv is greater than 1, it means the user has supplied at least one argument after the script name. In that case, the code assigns the first argument (sys.argv[1]) to image_file, allowing the user to specify a different image file to analyze. This approach makes the script flexible, supporting both a default image and user-specified images via the command line."
Azure AI Vision SDK for image analysis explained with Github Copilot : explain Analyze image
Github Copilot: "This section of code is responsible for sending the selected image to the Azure AI Vision service for analysis. First, it opens the image file specified by image_file in binary read mode ("rb"), reads its contents into the variable image_data, and then closes the file automatically thanks to the with statement. This ensures the image data is ready to be sent to the analysis service. Next, the code prints a message to the console indicating which image is being analyzed, providing feedback to the user. The core operation follows: the cv_client.analyze method is called, passing in the image data and a list of visual features to extract. These features include captions, dense captions, tags, objects, and people, as specified by the VisualFeatures enumeration. The result of this analysis is stored in the result variable, which can then be used to access the detected features and their details. This approach allows the program to flexibly analyze various aspects of the image in a single API call."
Azure AI Vision SDK for image analysis explained with Github Copilot : generate doc
Github Copilot: "
Main function to analyze an image using Azure AI Vision services.
This function performs the following steps:
1. Clears the console for better readability.
2. Loads configuration settings (endpoint and key) from environment variables.
3. Determines the image file to analyze, either from command-line arguments or a default path.
4. Authenticates the Azure AI Vision client using the provided credentials.
5. Reads the image file and sends it to the Azure AI Vision service for analysis, requesting captions, dense captions, tags, objects, and people detection.
6. Prints the analysis results, including captions, dense captions, and tags with their confidence scores.
7. Handles and prints any exceptions that occur during execution.
""
Azure AI Vision SDK for image analysis explained with Github Copilot : run the code with an image
python image-analysis.py image.jpg
Azure AI Vision SDK for image analysis explained with Github Copilot : basic results
Caption:
Caption: 'a man walking a dog on a leash on a street' (confidence: 82.07%)
Dense Captions:
Caption: 'a man walking a dog on a leash on a street' (confidence: 82.06%)
Caption: 'a man walking on a street' (confidence: 69.02%)
Caption: 'a yellow car on the street' (confidence: 78.22%)
Caption: 'a black dog walking on the street' (confidence: 75.31%)
Caption: 'a blurry image of a blue car' (confidence: 82.01%)
Caption: 'a yellow taxi cab on the street' (confidence: 72.44%)
Caption:
Caption: 'a man in a suit' (confidence: 78.55%)
Dense Captions:
Caption: 'a man in a suit' (confidence: 78.55%)
Caption: 'a man in a suit giving a thumbs up' (confidence: 80.96%)
Tags:
Tag: 'outdoor' (confidence: 99.87%)
Tag: 'land vehicle' (confidence: 99.02%)
Tag: 'vehicle' (confidence: 98.89%)
Tag: 'building' (confidence: 98.55%)
Tag: 'road' (confidence: 95.98%)
Tag: 'wheel' (confidence: 95.14%)
Tag: 'street' (confidence: 94.71%)
Tag: 'person' (confidence: 93.01%)
Tag: 'clothing' (confidence: 91.19%)
Tag: 'taxi' (confidence: 90.95%)
Tag: 'car' (confidence: 84.01%)
Tag: 'dog' (confidence: 82.68%)
Tag: 'yellow' (confidence: 77.08%)
Tag: 'walking' (confidence: 74.11%)
Tag: 'city' (confidence: 64.80%)
Tag: 'woman' (confidence: 57.53%)
Tags:
Tag: 'outdoor' (confidence: 99.93%)
Tag: 'cloud' (confidence: 99.79%)
Tag: 'grass' (confidence: 99.76%)
Tag: 'tree' (confidence: 99.09%)
Tag: 'sky' (confidence: 98.95%)
Tag: 'building' (confidence: 96.15%)
Tag: 'dome' (confidence: 91.38%)
Tag: 'presidential palace' (confidence: 91.03%)
Tag: 'landmark' (confidence: 89.98%)
Tag: 'palace' (confidence: 89.63%)
Tag: 'plant' (confidence: 88.77%)
Tag: 'courthouse' (confidence: 84.78%)
Tag: 'field' (confidence: 82.64%)
Tag: 'city' (confidence: 76.11%)
Tag: 'large' (confidence: 68.73%)
Tag: 'park' (confidence: 54.24%)
Objects in image:
car (confidence: 72.40%)
taxi (confidence: 77.00%)
person (confidence: 78.10%)
dog (confidence: 54.40%)
Annotating objects...
Results saved in objects.jpg
People in image:
{'x': 241, 'y': 109, 'w': 155, 'h': 399} (confidence: 94.99%)
{'x': 396, 'y': 264, 'w': 23, 'h': 58} (confidence: 24.93%)
{'x': 699, 'y': 262, 'w': 20, 'h': 33} (confidence: 22.45%)
Annotating people...
Results saved in people.jpg
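The bounding boxes above are pixel coordinates (x, y, width, height). As an illustration (not part of the lab), PIL can crop each detected person directly from them:
from PIL import Image

image = Image.open('images/street.jpg')
boxes = [{'x': 241, 'y': 109, 'w': 155, 'h': 399}]  # first box from the output above
for i, b in enumerate(boxes):
    # Image.crop takes (left, upper, right, lower)
    person = image.crop((b['x'], b['y'], b['x'] + b['w'], b['y'] + b['h']))
    person.save(f'person_{i}.jpg')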
Azure AI Vision SDK for image analysis explained with Github Copilot : describe the image with natural language processing (NLP)
Now, with the tags outdoor, land vehicle, vehicle, building, road, wheel, street, person, clothing, taxi, car, dog, yellow, walking, city, we can ask Azure OpenAI to describe the image in natural language. For the integration side, check AI Azure Open AI Intégration Chatbot avec RAG CosmosDB depuis fork GIT POC MVP.
We can use, for example, the model "Phi-4-multimodal-instruct" or "GPT-4o mini", etc.
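Here is a minimal sketch of that call with the openai Python package (the environment variable names, API version, and deployment name are assumptions to adapt to your own Azure OpenAI resource):
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
    api_key=os.getenv("AZURE_OPENAI_KEY"),
    api_version="2024-02-01",
)

tags = ("outdoor, land vehicle, vehicle, building, road, wheel, street, "
        "person, clothing, taxi, car, dog, yellow, walking, city")

response = client.chat.completions.create(
    model="gpt-4o-mini",  # your deployment name (assumption)
    messages=[
        {"role": "system", "content": "You describe photos from a list of detected tags."},
        {"role": "user", "content": f"Describe, in a few paragraphs, an image with these tags: {tags}"},
    ],
)
print(response.choices[0].message.content)
The result looks like this: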
" In the image, a person is seen walking a black dog along a bustling street in an urban city environment. The individual is dressed in a smart suit, which adds a touch of professionalism to the scene. He appears to be enjoying a leisurely stroll, perhaps on his way to work or a meeting, as he holds a smartphone in one hand while guiding the dog with the other.
The road is lined with several yellow taxis, iconic to many metropolitan areas, indicating a vibrant atmosphere where transportation is readily available. These land vehicles are parked along the curb, showcasing their distinct color and design, which contrasts with the more muted tones of the surrounding buildings. The architecture in the background features classic city elements, with tall structures that hint at the historical significance of the area.
The wheel of the taxis is visible, emphasizing the movement and activity typical of city life. The scene captures a moment of interaction between the man and his dog, highlighting the companionship that exists in urban settings. The presence of the dog adds a layer of warmth to the otherwise busy environment, suggesting a balance between work and leisure.
Overall, this image encapsulates the essence of outdoor life in a city, where people navigate their daily routines amidst the hustle and bustle. The combination of the person, dog, vehicles, and buildings creates a dynamic snapshot of urban living, reflecting both the challenges and joys of city life. "
A question? Ask it here
Need help with Azure AI Vision SDK for image analysis?
Fill this form