Azure AI Engineer Study Guide

Below are questions whose answers you should know in order to pass the Azure AI Engineer (AI-102) exam.

Azure AI Services

What are Azure AI services?

A suite of cloud-based products that deliver AI capabilities.

What are the two types of Azure AI service resources, and when might you use each?
  • A single-service resource provides access to a single Azure AI service. This is a good strategy when you only need one service for a project or when you want to see cost information separately.
  • A multi-service resource provides access to multiple Azure AI services with a single key and endpoint. With this strategy all services are billed together. This is helpful when you need several Azure AI services or are exploring capabilities.
What are two reasons to provision separate resources for model training and model prediction?
  1. This allows you to separate and monitor costs for training and inference separately.
  2. This allows you to train on a service-specific resource, but host the model for predictions on a general AI-services resource.
What pieces of information are necessary to consume Azure AI services provisioned through a resource?

Once you provision an Azure AI resource, it creates an endpoint you target to consume the service. To do this you will need:

  1. The endpoint URI.
  2. An API key for the endpoint (two are provisioned and can be refreshed at any time).
  3. The resource location (e.g., westus2).
Through which mechanisms can you consume Azure AI resources?
  • Via REST APIs
  • Using SDKs (e.g. Python or C#)
  • In many cases, via cloud-based Studios (e.g., Azure AI Studio, Azure Speech Studio)
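
For example, a minimal REST sketch with the `requests` library (the endpoint, key, and API path shown are placeholders for illustration; check the docs of the specific service for its current path and version):

```python
import requests

# Placeholder values: use your resource's endpoint and one of its two keys.
endpoint = "https://<your-resource>.cognitiveservices.azure.com"
api_key = "<your-api-key>"

# Example: language detection via the Text Analytics REST path.
url = f"{endpoint}/text/analytics/v3.1/languages"
headers = {
    "Ocp-Apim-Subscription-Key": api_key,  # key-based authentication
    "Content-Type": "application/json",
}
body = {"documents": [{"id": "1", "text": "Hello, world!"}]}

response = requests.post(url, headers=headers, json=body)
print(response.json())
```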

Securing Azure AI Services

What are three ways you can secure your Azure AI services?
  1. By refreshing and managing subscription/API keys according to best practices.
  2. By limiting access to applications and roles with the appropriate permissions.
  3. By using network security to limit the range of IP addresses that can access the services.
What are best practices for keeping API/subscription keys secure?
  1. Refresh subscription keys often.
  2. Protect keys with Azure Key Vault.
What are best practices for using the two provided API keys?

Use one for production and another for development. Refresh the development API key more often.

How can you refresh API keys without production service interruption?

Since each resource provides two API keys, you can:

  1. Switch production applications to reference API key 2.
  2. Refresh API key 1.
  3. Switch production applications back to using API key 1.
  4. Refresh API key 2.
How does Azure Key Vault improve service security? How does it work?

It allows you to consume Azure services without hardcoding the API key in an application’s codebase.

You define a managed identity for the application that needs access, and Key Vault administrators grant that security principal permission to read secrets. At runtime, the application authenticates as its managed identity, retrieves the API key from the key vault, and uses it to consume the corresponding Azure AI service.
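
A minimal sketch of that runtime flow, assuming the azure-identity and azure-keyvault-secrets packages (the vault URL and secret name are placeholders you would define yourself):

```python
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

# The app authenticates as its managed identity, so no credentials live in code.
credential = DefaultAzureCredential()

# Placeholder vault URL and secret name.
vault_url = "https://<your-key-vault>.vault.azure.net"
secret_client = SecretClient(vault_url=vault_url, credential=credential)

# Retrieve the AI service API key stored as a secret, then use it as usual.
ai_service_key = secret_client.get_secret("AI-Services-Key").value
```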

What is token-based authentication? When is it required?

Token-based authentication means the API key is presented in an initial request to obtain an access token; subsequent requests are then authenticated with that token for 10 minutes, after which a new token must be obtained.

Token-based authentication is supported (and sometimes required) for REST API requests. The SDKs handle token authentication for you.
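
A hedged sketch of obtaining a token over REST (the issueToken path below is the commonly documented one for Azure AI services, but confirm it for your resource; the endpoint and key are placeholders):

```python
import requests

endpoint = "https://<your-resource>.cognitiveservices.azure.com"
api_key = "<your-api-key>"

# Exchange the API key for a short-lived access token (valid roughly 10 minutes).
token_url = f"{endpoint}/sts/v1.0/issueToken"
token = requests.post(token_url, headers={"Ocp-Apim-Subscription-Key": api_key}).text

# Subsequent REST calls present the token instead of the key.
auth_header = {"Authorization": f"Bearer {token}"}
```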

What is Microsoft Entra ID? How can you, and when is it appropriate to, use it for authentication?

Microsoft Entra ID is a cloud-based identity and access management (IAM) service, which allows you to provision access to certain service principals and managed identities on Azure.

You can use it to assign permissions to a role, then add that role to users within your Azure subscription. It is good to use when designing applications meant for use within your organization.
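
For example, once an appropriate role has been assigned, an SDK client can authenticate with Entra ID instead of a key. A sketch assuming the azure-identity and azure-ai-textanalytics packages (the endpoint and role name are placeholders):

```python
from azure.identity import DefaultAzureCredential
from azure.ai.textanalytics import TextAnalyticsClient

# Keyless authentication: the identity running this code must hold a suitable
# role (e.g., Cognitive Services User) on the Azure AI resource.
endpoint = "https://<your-language-resource>.cognitiveservices.azure.com"
client = TextAnalyticsClient(endpoint=endpoint, credential=DefaultAzureCredential())
```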

What is Network Security?

Network security is limiting the range of IP addresses that can access an Azure AI service.

By default, which IP addresses can consume Azure AI services?

All of them! This is why network security is so important.

Monitoring Azure AI Services

Why is monitoring Azure AI services so important?

Like any cloud-based service, monitoring is essential to track costs, identify utilization trends, and detect potential issues.

How can you estimate the cost for your AI applications?

You can view cost structures in service documentation, and you can use the Azure Pricing Calculator!

How can you view costs for Azure AI services?

In the Azure portal, you can view costs for your subscription with the Cost Analysis tab. You can add a filter to only view costs for AI (i.e., Cognitive) services.

How does Azure support alerts?

Alert rules can be implemented for each service. These rules are based on event or metric thresholds.

What are the five levels of alerts? How should each be monitored or used?
  1. Critical alerts mean there is an urgent problem that needs attention.
  2. Errors also indicate urgent problems that need immediate attention.
  3. A warning means something needs attention very soon.
  4. Informational alerts are useful to check once a week or so.
  5. Verbose alerts are only helpful when connected to a dashboard or something similar.
What pieces of information need to be specified when creating an alert rule?
  1. The alert rule’s scope (the resource it is monitoring).
  2. An alert condition, meaning the event it is looking for or the metric threshold that must be exceeded.
  3. An action, like sending an email to notify someone of the alert.
  4. Alert details, such as the name of the alert and the resource group that defines it.
What does Azure Monitor automatically track for AI services?
  • Endpoint requests
  • Data submitted
  • Data returned
  • Errors
By default where can you view metrics related to a resource?

Metrics can be viewed for an individual resource on that resource’s Metrics page. You can create charts of different aggregation levels and add them to the Metrics page.

How can you view metrics for multiple resources at once?

Create a dashboard in the Azure portal. You can have up to 100 named dashboards, in which you can add metrics from multiple resources.

What service allows you to log information about your resources and applications? How can you view and analyze logged data?

You can log information with diagnostic logging. Typically, logs are sent to an Azure Log Analytics workspace, where you can query and analyze the data. You can also send older log data to Azure Storage for archiving and occasional retrieval, or stream events to Azure Event Hubs for use in external telemetry solutions.

When setting up diagnostic logs, what attributes do you need to specify?
  • A name for the settings.
  • The categories of event data that you want to capture.
  • Retention policy for the logged data.
  • Details of the destinations for the data.
For setting up diagnostic logging, where can I find more details?

At the Azure diagnostic logging page.

How long does it take after logging setup before data can be viewed and queried in Azure Log Analytics?

About one hour.

Containerizing Azure AI Services

Why might you want to deploy an Azure AI service on prem?
  • To keep sensitive data on the same network without sending it to the cloud.
  • To reduce latency between the AI service and the local data.
What is a container? Why are they useful?

A container comprises an application and the runtime components needed to run it. Containers are great because they are portable across hosts, meaning they can be run on many different types of operating systems or hardware. Also, a single host can support running multiple containers at once.

What are common ways of deploying a container?

  • Locally or on a private network running a Docker server.
  • On an Azure Container Instance (ACI).
  • On an Azure Kubernetes Service (AKS) cluster.

What is the process of hosting an AI Service container locally? How does Azure take care of billing?
  1. Find and download the container image for the service you need from Microsoft Container Registry.
  2. Run the container (it will host endpoints) and use it as you would use the service on the cloud.
  3. Periodically utilization metrics will be sent to Azure AI Services for billing purposes.
What details must be passed when deploying an Azure AI Service container?
  1. An API Key for billing purposes.
  2. The URI endpoint (for billing).
  3. A value of “Accept” for the EULA.
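
These values are typically passed as arguments when the container is started with docker run (Eula=accept, Billing=<endpoint URI>, ApiKey=<key>). Once it is running, your application calls the container's local endpoint instead of the cloud endpoint. A hedged sketch, assuming a Language/Text Analytics container listening on port 5000 (the REST path mirrors the cloud API, but confirm it for the container you pull):

```python
import requests

# The container hosts the same REST surface as the cloud service, but locally.
local_url = "http://localhost:5000/text/analytics/v3.1/languages"
body = {"documents": [{"id": "1", "text": "Bonjour tout le monde"}]}

# No API key is needed for calls to the local container;
# billing happens via the metrics the container reports back to Azure.
response = requests.post(local_url, json=body)
print(response.json())
```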

Azure AI Content Safety

What is Azure AI Content Safety?

An Azure AI service designed to make it easy to include advanced content safety capabilities into applications.

What are four drivers of the need for online content safety?
  1. Human- and AI-generated content continues to exponentially increase online, including harmful content.
  2. There is growing regulatory pressure to regulate online content.
  3. Users need transparency in content moderation standards and enforcement.
  4. Content is increasingly more complex (i.e. multi-modal).
How can you easily play with Content Safety features?

In Azure AI Foundry! Content Safety is a tab in the application.

What are some features of Content Safety?

With Content Safety you can:

  • moderate text content
  • detect hallucinated content from LLMs
  • identify protected (i.e. copyrighted) material in LLM output
  • implement prompt shields (guard against jailbreak and indirect attacks)
  • moderate image content
  • customize your own content filtering
In what four categories does Content Safety classify content?

It classifies content into four categories: hate, sexual, self-harm, and violence.

What are the severity levels for each category?

For text moderation, each class is given an integer from 0 (no risk) to 6 (high, urgent risk). For image moderation each class is given a score of safe, low, or high.

What do specified thresholds accomplish for content moderation tasks?

The threshold level determines what content is automatically allowed in an application. For example, if an image content moderation threshold is set at “medium”, and someone submits an image that has a “high” rating on any of the four categories, that image won’t be allowed.
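
To get the category and severity scores that such thresholds are compared against, you can call the text moderation API. A hedged REST sketch (the contentsafety path and api-version reflect the documented pattern at the time of writing; the endpoint and key are placeholders):

```python
import requests

endpoint = "https://<your-content-safety-resource>.cognitiveservices.azure.com"
api_key = "<your-api-key>"

url = f"{endpoint}/contentsafety/text:analyze?api-version=2023-10-01"
headers = {"Ocp-Apim-Subscription-Key": api_key, "Content-Type": "application/json"}
body = {"text": "Text to be moderated goes here."}

# The response contains a severity score per category (hate, sexual,
# self-harm, violence); your app compares these against its thresholds.
result = requests.post(url, headers=headers, json=body).json()
print(result)
```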

What are limitations of content safety?

Content safety depends on AI algorithms, and they aren’t perfect. To ensure they are working as well as possible, the algorithms should be evaluated (using classification metrics such as precision, recall, and F1 score). We might want to prioritize recall to avoid false negatives!

Even though content safety isn't perfect, what is a main advantage of using it?

A primary advantage is its scale. It can check all incoming content and flag content that needs further moderation from humans.

Azure AI Vision

Image Analysis

What is image analysis?

Image analysis is all about extracting information from images.

What different image analysis tasks can you accomplish with Azure AI Vision?
  • Captions and dense captions – generating captions for an image and for the objects it contains
  • Tags – Identifying tags that are fitting for an image
  • Detecting people and objects in images
  • Determining the format and size of an image
  • Classifying an image, and determining if it contains known celebrities and/or landmarks
  • Detecting and removing the background of an image.
  • Image moderation – determining if an image contains adult or violent content.
  • Optical character recognition (OCR)
  • Smart thumbnail generation, which looks at what part of an image would be best for a thumbnail, then generates a thumbnail from that image.
Generally how does the AI Vision API or SDK work?

You make a call to Azure AI Vision, including the image to analyze and the visual features to include in the analysis (one or many of the tasks AI Vision can handle).
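
A hedged sketch of such a call using the Image Analysis 4.0 REST pattern (the path, api-version, and feature names follow the documented shape but should be verified; the endpoint, key, and image URL are placeholders):

```python
import requests

endpoint = "https://<your-vision-resource>.cognitiveservices.azure.com"
api_key = "<your-api-key>"

url = f"{endpoint}/computervision/imageanalysis:analyze"
params = {"api-version": "2023-10-01", "features": "caption,tags,read,people"}
headers = {"Ocp-Apim-Subscription-Key": api_key, "Content-Type": "application/json"}
body = {"url": "https://example.com/some-image.jpg"}

# Each requested visual feature gets its own section in the response.
analysis = requests.post(url, params=params, headers=headers, json=body).json()
print(analysis.get("captionResult"), analysis.get("tagsResult"))
```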

Azure AI Custom Vision

What three types of custom Azure AI Vision models are there?
  • Image classification models (multi-class or multi-label)
  • Object detection models
  • Product recognition models (object detection, but specifically trained for product labels and brand names)
What are the core components/steps of a custom vision project?
  • Create a blob storage container and upload the training images.
  • Create a dataset, specifying what type of custom vision project it is.
  • Label the data, which creates a COCO file.
  • Connect the COCO file to the images in the dataset.
  • Train the model (specifying the model type and training budget).
  • Evaluate performance and make predictions!
What is a COCO file and what are its key attributes?

A COCO file is a JSON file with a specific format:

  • the images attribute defines the location of images in blob storage and has the file name, width, height, etc.
  • the annotations attribute defines the classifications (or objects) for the image and the bounding box/area for the classification (in the case of object detection)
  • the categories attribute defines the classification categories
What Azure service allows you to label training images?

Azure Machine Learning studio! You can use it to create a Data Labeling project

What is the training budget?

An upper bound of time for how long the training will run.

How many images per class do you need to train a custom image classification model?

At least 3-5 per class/label, but the more the better!

How can you access your custom image model after it has been trained?

Through APIs or in Vision Studio.

What resources need to be provisioned to train and serve a Custom Vision model?
  • An Azure AI multi-service resource will take care of both training and prediction, or …
  • An Azure AI Custom Vision training resource and a …
  • Azure AI Custom Vision prediction resource

Object detection

What are the two core components in object detection prediction?
  1. The class label of each object in the image.
  2. The location of each object in the image, represented as coordinates of a bounding box that encloses the object.
What services can you use for labeling images for object detection?
  • The Azure AI Custom Vision portal
  • Azure Machine Learning Studio Data Labeler
  • Microsoft Visual Object Tagging Tool (VOTT)
How are the bounding box values expressed?

They are expressed as four values, each representing a proportion relative to the image size:

  • left – the left coordinate of the bounding box
  • top – the top coordinate of the bounding box
  • width – the percentage of the image that the object’s width takes
  • height – the percentage of the image that the object’s height takes
If you want to use the smart labeler, what do you have to do first?

Tag some images with objects of each class and train an initial object detection model.

Facial analysis, detection, and recognition

What are common tasks when it comes to facial detection?
  • Detect when a person is present
  • Identify a person’s facial location
  • Recognize individuals
What two services provide facial detection capabilities?
  • Azure AI Vision can detect people in an image and will return a bounding box for the location of a face.
  • The Face service is best suited for the task and offers comprehensive facial analysis capabilities.
What are key considerations for facial analysis software when it comes to responsible AI?
  • Data privacy and security – facial data is PII!
  • Transparency – need to make sure users know how their facial data is used and who will be able to access it.
  • Fair and inclusive – need to ensure the AI system isn’t used in a manner that is prejudiced or unfairly targets individuals.
If a person is detected using the Analyze Image function of the Azure AI Vision service, what attributes will be returned in the API response?

The response includes a peopleResult attribute containing a list of detected people, each with a bounding box and a confidence score for the prediction.

What capabilities does the Face service provide?
  • Face detection
  • Face attribute analysis (head pose, glasses, blur, exposure, noise, occlusion, face accessories, quality for recognition)
  • Facial landmark location (eye corners, pupils, tip of the nose, etc.)
  • Face comparison (compare faces across many images for similarity)
  • Face verification (see if a face in one image is the same one in another)
  • Facial recognition (identify specific individuals)
  • Facial liveness (determine if a video stream is real or fake)
Which Face service features require approval through a Limited Access policy? Why?

Facial recognition, comparison, and verification since those services are rich with PII.

How does facial comparison/verification work? Do these features preserve anonymity?

When a face is detected, a GUID is assigned to the face and retained (cached) for 24 hours. Subsequent images can be compared to the cached data to determine if they are similar (comparison) or the same person (verification).

The way this works allows for comparison/verification anonymously, since the identity of the person doesn’t actually need to be known.

How is facial recognition implemented?

In facial recognition, you create a person group (e.g., employees or family members), add examples of each person to this group (ideally in multiple poses), and identify/label each person. Then you train. The identities of these individuals persist.

What can you do with a trained facial recognition model?

A trained facial recognition model is stored in a Face/Azure AI service resource. It can be used to:

  • Identify individuals in images
  • Verify the identity of a detected face
  • Analyze images to find faces that are similar to known faces.

Optical Character Recognition

What two Azure services are helpful for extracting text from images and documents?
  • Image Analysis (OCR) in Azure AI Vision is great for extracting text from images and handwritten notes.
  • Document Intelligence extracts text from PDFs and documents that are more orderly (e.g., receipts, invoices)
What visual feature needs to be specified when calling Image Analysis for an OCR task? What is returned?

You need to pass VisualFeatures.READ. In the response you will get an object for each line of text in the provided image. Within each object will be the text it found, the bounding polygon (x/y coordinates), and the confidence of the word extracted.

Analyzing Video

What are features of Azure Video Indexer?

Video Indexer is your one-stop-shop for extracting information from videos. It can do:

  • Facial recognition for people in the video
  • OCR for text in the video
  • Speech transcription for the video’s audio
  • Topics – identify key topics for the video
  • Sentiment analysis on themes in the video
  • Labels – identify tags for the video
  • Moderate adult, harmful, or violent themes in the video
  • Segment the video into individual themes
What custom models can you train to extend Video Indexer's capabilities?
  • You can train the model to recognize certain people in the video, such as employees!
  • Domain-specific language models can be trained to detect and transcribe specialized terminology.
  • Detect specific brands, products, companies, etc.
What are the two ways you can integrate Video Indexer into applications?
  • You can embed a Video Indexer widget onto your website!
  • You can use the Video Indexer API
Before you analyze a video with Video Indexer, what do you need to do first?

You need to upload the video to Azure Video Indexer and index it!

Natural Language Processing

Azure AI Language

What features does Azure AI Language provide?
  • Language detection for a passage of text
  • Key phrase extraction – pulling out important words and phrases from text
  • Sentiment analysis
  • Named entity recognition (NER) – detecting references to people, locations, time periods, organizations, etc.
  • Entity linking – identifying specific entities and linking them to the appropriate Wikipedia articles
When making a call to Azure AI Language for language detection, what do you provide in the request and what is the response?

You provide a list of passages of text (documents) and an identifier for each passage. The service responds with an object that provides an identified language for each passage and a confidence score from 0 to 1.
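
A minimal sketch of that call with the azure-ai-textanalytics SDK (the endpoint and key are placeholders):

```python
from azure.core.credentials import AzureKeyCredential
from azure.ai.textanalytics import TextAnalyticsClient

client = TextAnalyticsClient(
    endpoint="https://<your-language-resource>.cognitiveservices.azure.com",
    credential=AzureKeyCredential("<your-api-key>"),
)

documents = ["Hello world", "Bonjour tout le monde"]
results = client.detect_language(documents)

for doc in results:
    if not doc.is_error:
        # Each result includes the detected language and a 0-1 confidence score.
        print(doc.primary_language.name, doc.primary_language.confidence_score)
```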

What will Azure AI Language do if you submit a single passage that contains multiple languages?

It will try and detect the dominant language in the passage. It only returns one language classification per passage.

If Azure AI Language can't determine the language of a passage, what classification is returned?

It will return (Unknown) and a confidence score of 0.0.

When making a call to Azure AI Language for key phrase extraction, what do you provide in the request and what is the response?

The request includes a list of passages and their language. In the response there will be a keyPhrases object, which is a list of strings, one entry per key phrase.

How does the sentiment analysis feature respond to an API call?

It responds with the sentiment (negative, neutral, or positive) for each sentence in the passage. It also gives confidence scores for each classification for each sentence. It also provides an overall sentiment and confidence scores for the entire passage based on the scores of the individual sentences.

How does the entity recognition feature respond?

For each passage provided, it will give a list of entities. For each entity it returns an object that specifies the entity name, category, location, and confidence score for its classification.

What is returned by the entity linking feature?

For each passage it gives the name of the entity, its location within the text, a confidence score for the entity identification, and the Wikipedia URL for the matched entity.

How does entity linking handle ambiguous entities? For example, people with the same name as cities?

The entity linking service can handle ambiguous entities automatically.

Question Answering systems

How does question answering differ from a conversational language understanding task?

In question answering, a user submits a question and an answer is returned. NLP is used to help accomplish the task. In conversational language understanding, a user submits an utterance (can be a question), but they expect an action to be performed (e.g., Hey Siri, play Imagine Dragons on Spotify)

What are data sources that can be used to populate a QA knowledge base?
  • An FAQ document or a URL that contains an FAQ.
  • Files containing text from which questions and answers can be extracted.
  • QA chat sessions that include common questions and answers.
What are multi-turn responses, and how can you implement them in the question answering feature of AI Language?

A multi-turn response is one in which follow-up questions might be needed in order to provide the correct answer. When setting up your knowledge base you can specify follow-up prompts.

What attributes are sent in a request to the knowledge base API?
  • The question that needs to be answered.
  • top, the number of answers to be returned
  • A scoreThreshold for the answers that can be returned.
  • strictFilters to limit the response to answers that contain certain metadata.
What is returned in the knowledge base API's response?
  • A score for the perceived quality of the answer.
  • The answer
  • The question in the knowledge base that is associated with the answer.
  • Metadata attached to the answer.
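
Putting the request and response together, here is a hedged sketch using the azure-ai-language-questionanswering SDK (the project and deployment names are placeholders, and the keyword arguments reflect the documented client at the time of writing, so verify them against the current SDK):

```python
from azure.core.credentials import AzureKeyCredential
from azure.ai.language.questionanswering import QuestionAnsweringClient

client = QuestionAnsweringClient(
    endpoint="https://<your-language-resource>.cognitiveservices.azure.com",
    credential=AzureKeyCredential("<your-api-key>"),
)

response = client.get_answers(
    question="How do I reset my password?",
    project_name="<your-qna-project>",
    deployment_name="production",
    top=3,                     # number of answers to return
    confidence_threshold=0.5,  # roughly equivalent to scoreThreshold
)

for answer in response.answers:
    print(answer.confidence, answer.answer)
```
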
How can you improve question answering performance?

Through active learning and defining synonyms.

  • Active learning, enabled by default, suggests related phrasing for the questions you have in your QA database. Including these related phrases can improve the odds of a good answer match.
  • Defining synonyms means providing synonyms to keywords in your questions and answers to improve the likelihood of a good match.

Conversational Language Understanding

What is a common design pattern for a natural language understanding solution?
  1. A user inputs natural language into an application.
  2. A language model determines the user’s intent.
  3. The app performs an appropriate action
What is the difference between pre-built and learned features of Azure AI Language? What are the examples of learned features?

Pre-built features require no custom data, but learned features do and enable you to train your own custom models.

Azure AI Language learned features are:

  • Conversational language understanding
  • Custom named entity recognition
  • Custom text classification
  • Question answering
When providing a training set of utterances, what guidelines will help improve performance?
  • Provide a lot of alternative ways of saying the same thing.
  • Vary the length of utterances (short, medium, and long examples)
  • Vary where the noun or subject of the utterance is in the sentence (beginning, middle, and end).
  • Give examples with correct and incorrect grammar.
  • Give a lot of examples for each intent.
When labeling data for CLU applications, what are three key factors to improve performance?
  • Be precise in your labeling. Label each entity to its right type, and only include what you want extracted.
  • Be consistent in your labeling across examples.
  • Label completely – label all the instances of an entity in all the utterances.
What is the role of entities in CLU tasks? What key types of entities are there?

Entities add context to user intents. For example, “turn the fan on” and “turn the light on” share the same intent, but only the entity differs.

The three key types of entities are:

  • Learned entities are the most common; you define a component and associate words and phrases with it in training data.
  • List entities include a specific set of possible values (think days of the week).
  • Prebuilt entities include numbers, datetimes, and names.
What prebuilt entities does Azure AI Language support out of the box?

Age, Number, Percentage, Dimensions, Temperature, Currency, Number ranges, DateTimes, Names, Email addresses, Phone numbers, URLs, Companies and corporations, Locations (cities, states, countries), and IP addresses.

What four step process can you iterate through to build a CLU model?
  1. Train a model to learn intents and entities from example utterances.
  2. Test the model interactively or using a test dataset.
  3. Deploy the model for use.
  4. Review the performance of your model so you can include more training data and improve it.

Text Classification

What types of text classification projects are there?

There is single-label classification (assigning one label to a body of text) and multi-label classification (assigning one or more labels to a body of text).

Why is improving multi-label classification model performance more difficult than single-label?

When performance is poor that means you need more training data specific to the classes your model is having a hard time predicting. For multi-label projects finding quality data is hard due to the different combinations of labels that can be assigned.

Custom NER

When might you build a custom NER solution?

If the entities you want to extract aren’t part of the built-in service or if you only want to extract specific entities.

What considerations should you make to boost performance of a custom NER solution?

It’s all about the data! Get training data that is:

  • Representative of what the model will see in the real world.
  • As diverse as possible (including different document types), matching the distribution the model will see in the wild.
  • Built around entity types that are as distinct as possible (not always easy or practical).
How many entity types can you define for the model to recognize?

Up to 200 entity types!

What is the best way to label NER training data?

Using Azure Language Studio!

Translation

Azure AI Translator supports translation between how many supported languages?

90!

What translation tasks can you use Azure AI Translation for?
  • Language detection
  • One-to-one translation
  • One-to-many translation
  • Transliteration, which is the process of converting written text from one writing system (script) to another.
What kinds of resource must be provisioned to use Azure AI Translator?

Either a single-service Azure AI Translator resource, or a multi-service Azure AI Services resource (Translator is one of the included services).

When hitting the `detect` function of the translator API, what is returned?
  • The detected language for the input text.
  • The confidence score for the detection.
  • Whether translation is supported.
  • Whether transliteration is supported.
When hitting the `translate` endpoint, what must you provide?
  • The text to translate.
  • The language to translate from.
  • The language(s) to translate to.
When using the `transliterate` endpoint, what must you provide?
  • The text to transliterate.
  • The input script.
  • The desired output script.
What does the word alignment parameter do in the `translate` endpoint when set to true?

It will provide a numerical mapping of how the characters in the input text relate to the characters in the output text. For example, “Good morning” to “Buenos días” would show 0:3-0:5 5:11-7:10 to show that “good” maps to “buenos” and “morning” maps to “días”.

What does `includeSentenceLength` in the `translate` endpoint do when set to true?

In the translation response it will include the character length of the input and translated texts. This can be helpful when determining how to display the translated text.
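
A minimal sketch tying these parameters together against the global Translator REST endpoint (the key and region are placeholders; the region header is needed for regional or multi-service resources):

```python
import requests

url = "https://api.cognitive.microsofttranslator.com/translate"
params = {
    "api-version": "3.0",
    "from": "en",
    "to": ["es", "fr"],               # one-to-many translation
    "includeAlignment": "true",       # word alignment mapping
    "includeSentenceLength": "true",  # character lengths of input/output
}
headers = {
    "Ocp-Apim-Subscription-Key": "<your-translator-key>",
    "Ocp-Apim-Subscription-Region": "<your-resource-region>",
    "Content-Type": "application/json",
}
body = [{"text": "Good morning"}]

print(requests.post(url, params=params, headers=headers, json=body).json())
```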

What profanity filtering options are there in the translation service?

There is a profanityAction parameter, which can be set to the following:

  • NoAction will translate profanity along with the rest of the text.
  • Deleted will remove profanities from the translation.
  • Marked replaces translated profanities with asterisks.
When might you need to train a custom translation model? What do you need to include as your training data?

When the default translation model doesn’t cut it. This might happen if you are in an industry with a very specific vocabulary. Your training data will be parallel documents, where one document (the target) is a translation of the other (the source).

How many training examples are recommended for training a custom model?

Azure recommends 10,000 parallel words to train a model.

When you train a custom translation model, how can you use it to make translations?

You can use the translate endpoint. When the model is trained it will be assigned a unique categoryId, which you will include in the request to the translation service.

Speech Recognition (speech to text)

What are the various APIs that Azure AI Speech provides for various speech-related tasks?
  • Speech to text
  • Text to speech
  • Speech translation
  • Speaker recognition (determine which individual is talking based on their voice)
  • Intent recognition (determine the semantic meaning of spoken input)
What type of resource must be provisioned to use Azure AI Speech capabilities?

Either an Azure AI Speech resource or a multi-service Azure AI Services resource.

When using the speech-to-text SDK, what objects will you have to set up?
  • A SpeechConfig object will contain info to authenticate with Azure (the API key).
  • An AudioConfig object will house details on the input audio. It will default to the microphone, but can be changed to a third party microphone or an audio file.
  • A SpeechRecognizer object is a proxy client for the API.
  • The speech recognizer object has functions you can call, like RecognizeOnceAsync(), which will transcribe a single utterance (see the sketch below).
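
A minimal sketch wiring these objects together, assuming the azure-cognitiveservices-speech package (the key, region, and file name are placeholders):

```python
import azure.cognitiveservices.speech as speechsdk

speech_config = speechsdk.SpeechConfig(subscription="<your-key>", region="<your-region>")
audio_config = speechsdk.AudioConfig(filename="speech.wav")  # omit to use the default microphone

recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_config)
result = recognizer.recognize_once_async().get()  # transcribe a single utterance

if result.reason == speechsdk.ResultReason.RecognizedSpeech:
    print(result.text)
```
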
What attributes are returned by the `RecognizeOnceAsync()` function?
  • The duration of the recognized speech.
  • The reason for the result, either RecognizedSpeech (yay!), NoMatch, meaning no speech was recognized in the audio, or Cancelled, meaning an error occurred.
  • The transcribed text if the transcription was successful.
What two APIs support speech recognition (speech to text)?

The speech to text API, which is meant for live inference, and the speech to text short audio API, which supports audio less than 60 seconds. You can use the speech to text API for live and batch operations.

Speech Synthesis (text to speech)

What two APIs support speech synthesis (text to speech)?

The text to speech API, and the batch synthesis API, which is optimized for batch operations.

What type of object will handle speech synthesis in the Azure AI Speech SDK?

You must instantiate a SpeechSynthesizer object, which is a proxy client for the text to speech API.

What will be returned by the speech synthesizer's `SpeakTextAsync()` method?
  • AudioData contains the synthesized speech output, which goes to the device’s speaker or to a file depending on the configuration.
  • The Reason will be either Cancelled (error) or SynthesizingAudioCompleted if successful.
What options can be specified in the SpeechConfig object for a speech synthesis task?

You can specify:

  • The audio format, such as the file type and sample rate.
  • The voice to use in the spoken speech. These can be synthetic or more natural sounding.
What is Speech Synthesis Markup Language (SSML) and when/why might you use it?

SSML is an XML-based syntax that offers greater control over how the spoken output sounds. It is helpful if you want to specify the following options for your spoken output: a speaking style (e.g., cheerful), inserting pauses, certain phonemes, or inserting recorded speech or audio.

How can you change the voice used in speech synthesis?

You need to set the SpeechSynthesisVoiceName property of the SpeechConfig object to the desired voice name.
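
A minimal sketch, again assuming the azure-cognitiveservices-speech package (the key, region, and voice name are example placeholders):

```python
import azure.cognitiveservices.speech as speechsdk

speech_config = speechsdk.SpeechConfig(subscription="<your-key>", region="<your-region>")
speech_config.speech_synthesis_voice_name = "en-US-JennyNeural"  # any supported voice

# Defaults to the device speaker; pass an audio config to write to a file instead.
synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config)
result = synthesizer.speak_text_async("Hello from Azure AI Speech!").get()

if result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
    print("Synthesis complete.")
```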

Speech Translation

When using Azure AI Speech SDK for speech translation, what does the SpeechTranslationConfig object specify?
  • The location and key for your Azure AI Speech resource.
  • The speech recognition language
  • The target language(s) into which the speech should be translated
What object acts as a proxy client for the Azure AI Speech translation API, and what helpful function will translate a single spoken utterance?

The TranslationRecognizer object, and its helpful function is RecognizeOnceAsync()!

What attributes will be in the response from Azure AI Speech for a speech translation task?
  • The Duration of the recognized speech.
  • The Text of the recognized speech.
  • The Translations of the recognized speech in the specified target languages.
  • The Reason for the result. RecognizedSpeech if successful, NoMatch if no spoken word was detected, or Cancelled if there was an error.
If you want to synthesize the translated transcription (i.e., speech-to-speech translation), how can you accomplish this?

For 1:1 translation, you can specify the desired voice for the translated speech in the SpeechTranslationConfig object. Then create an event handler for the TranslationRecognizer’s Synthesizing event. In the event handler result, GetAudio() will retrieve the byte stream of translated audio.

For one-to-many translations you need to synthesize the speech manually, meaning you’ll have to take the translated text and pass it through a SpeechSynthesizer object.
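
A minimal sketch of one-to-many speech translation with the same Speech SDK (the key, region, and languages are placeholders):

```python
import azure.cognitiveservices.speech as speechsdk

translation_config = speechsdk.translation.SpeechTranslationConfig(
    subscription="<your-key>", region="<your-region>"
)
translation_config.speech_recognition_language = "en-US"
translation_config.add_target_language("fr")
translation_config.add_target_language("es")

recognizer = speechsdk.translation.TranslationRecognizer(translation_config=translation_config)
result = recognizer.recognize_once_async().get()

if result.reason == speechsdk.ResultReason.TranslatedSpeech:
    print(result.text)                # recognized source text
    print(result.translations["fr"])  # translation per target language
```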

Azure AI Search

What are the pricing tiers for Azure AI Search resources, and what do they offer?
  • The free tier (F) is great for trying out the product.
  • The basic tier (B) supports up to 15 indexes and 5 GB of indexed data.
  • The standard tier (S) is for enterprise-scale solutions. There are multiple variants in this tier for larger index sizes and quantity.
In the context of search services, what are replicas and partitions?

Replicas are the number of nodes in a cluster. Increasing the number of replicas increases the number of concurrent queries the service can handle. Partitions divide an index into multiple storage locations, allowing queries and index rebuilds to be distributed.

What are the components of an Azure AI Search solution?
  • A data source – the data to index, which can be files in Azure Blob Storage, Azure SQL tables, or Cosmos DB documents; the indexer works with this data as JSON.
  • AI skills to enrich data with additional insights.
  • An indexer, which can run at regular intervals or on demand.
  • An index (the product of an indexer)
What are some examples of AI skills that you might use in an AI Search solution?
  • Detecting the language of text.
  • Extracting key phrases from text.
  • Determining the sentiment of text.
  • Identifying entities in the text.
  • Generating descriptions of images or extracting text from images.
  • Custom skills.
What attributes can be configured for the fields in an index?
  • key - fields that define a unique key
  • whether the field should be searchable
  • whether the field should be filterable
  • whether the field should be sortable
  • whether the field should be facetable, which gives the user the ability to drill-down/filter results based on values.
  • whether the field should be retrievable, or included in results
What are common parameters included in a full text search in Azure AI Search? (Lucene query syntax)
  • The search term
  • Whether the query is using the simple or full query syntax (queryType)
  • searchFields are the fields to be searched
  • Specify the fields to return in the result with select
  • For multi-word search terms, searchMode will allow you to specify if you want to match documents according to Any or All of the terms.
What are the four stages for how a query is processed?
  1. The search term is evaluated and reconstructed as multiple subqueries.
  2. Query terms are refined and normalized (e.g., converted to lowercase, stopwords are removed, words are trimmed down to their root, etc.). This is called lexical analysis.
  3. Terms are matched against the indexed documents, and matching documents are identified.
  4. Matching documents are scored, sorted, and returned.
How can you present facet options to a user?

You can search for all documents (*) and facet by the appropriate field. This is best done on fields with a smallish number of discrete values. When the user selects one of these facets, include that in the filter parameter of the next search.

What syntax would you use if you wanted to filter an AI Search index to include books printed in 2024 and sort them by the latest print date?
$filter=print_year eq '2024'
$orderby=print_date desc
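
A hedged sketch of that query with the azure-search-documents SDK (the service, index, field names, and key are placeholders):

```python
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

client = SearchClient(
    endpoint="https://<your-search-service>.search.windows.net",
    index_name="books",
    credential=AzureKeyCredential("<your-query-key>"),
)

results = client.search(
    search_text="*",
    filter="print_year eq '2024'",
    order_by=["print_date desc"],
    facets=["genre"],  # optional: return facet counts for drill-down
    select=["title", "print_date"],
)

for doc in results:
    print(doc["title"], doc["print_date"])
```
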
What functionality is provided when you add a suggester to an index?
  • Suggestions allow you to get a list of suggested results as the user types in the search box (without submitting the search query)
  • Or you can autocomplete partially typed search terms based on the values in indexed fields.
What is custom scoring?

The default relevance scoring algorithm is term-frequency/inverse-document-frequency (TF/IDF). But you can customize how the score is calculated!

You can also boost results as you need. For example, you can increase the relevancy score for more recent documents.

How can you appropriately handle synonyms in a search index?

You can define synonym maps to link related terms together.

Custom Skills

What are some use cases where a custom skill could be helpful for enriching an AI Search index?
  • You need to use Document Intelligence to extract data from forms.
  • Using an Azure Machine Learning model to get predicted values into an index.
  • Perform text classification!
How do you define a custom skill, and what must be included?

Custom skills are defined in JSON. They must include:

  • The endpoint and parameters for calling the inference service.
  • The context – the point in the document hierarchy at which the skill should be called.
  • The input values (fields from the data source).
  • The output fields in which to store the results.
What are the different hierarchy levels of skill context, and when are they appropriate?
  • The default is document, which means the skill will be applied across the entire document.
  • Another common one is document/pages/*, which will be applied to each chunk of text on each page individually. This is great for sentiment use cases.
  • Another common context is document/normalized_images/*, which will be applied for each image. This is great for image analysis use cases.
What do the timeout and degreeOfParallelism settings do when creating an ML skill?

Timeout sets the number of seconds before a skill will move on to the next document. Degree of parallelism controls how many documents are processed by the skill at once. Best practice is to begin at “one” and scale up as necessary (and as resources allow).

To use an Azure ML model as a skill, how must it be deployed?

Currently, it has to be deployed as a web service endpoint, and the endpoint must be an Azure Kubernetes Service (AKS) cluster. ML Studio can create and manage the cluster for you.

Once you have an ML model to use as a custom skill, how do you add the output of that skill to an index?
  1. Add a field to your index where you will store the output from the model.
  2. Update the index skillset and add the #Microsoft.Skills.Custom.AmlSkill.
  3. Change the indexer to map the output from the custom skill to the field you just created.
  4. Rerun the indexer!
What built-in skills are available?

There are lots of them! In general, the following types of skills are built in:

  • Text analysis/NLP
  • Image analysis
  • Translation
  • Document intelligence
  • OpenAI Embedding

Knowledge Stores

What is a knowledge store?

A knowledge store is a table-like output for a skill-enriched search index.

What is a projection? What is best practice for defining a projection's schema?

Projections define the schema for tables, objects, and files in a knowledge store. Since the schema for individual documents can vary based on the skills applied, the output schema can vary. To help with this, there is a Shaper skill which allows you to map skill output to a well-structured schema for use in a knowledge store.

How do you create a knowledge store?

Create a knowledgeStore object in the skillset that defines the projections (tables) and specifies storage account information. Projections can be tables, files, and objects. Each type needs to be defined separately.

Advanced AI Search Features

What is term boosting, and how is it implemented?

Term boosting gives higher relevancy scores to certain terms in the index. You can specify the fields in which these terms should reside. This is implemented with the Lucene query syntax caret (^) (e.g., Category:luxury^3 to triple the score for items with “luxury” in the category field).

What are scoring profiles and how can you apply them?

A scoring profile allows you to customize how the final scoring of terms is implemented. You can set a custom scoring profile as a default, or you can specify a scoring profile using the Lucene query syntax.

What are functions that you can use in a scoring profile?
  • distance to boost search results that are nearer to the search location
  • freshness to boost newer or older results
  • magnitude to boost results based on the value in a numeric field
  • tag to boost results based on tags that are in the data
What is an analyzer in AI Search, and what options do you have if you don't want to use the default?

An Analyzer is what breaks the data in the index into more useful terms, such as normalizing text and removing stopwords. By default the Lucene analyzer is used, which is best for most use cases. You can also choose:

  • A language analyzer, which includes capabilities such as lemmatization, word compounding, and entity recognition for 50 languages.
  • Specialized analyzers for fields like zip codes and product identifiers.
  • You can also define and test your own!
How can you augment an index to include multiple languages?
  1. Identify the fields that need translation.
  2. Duplicate those fields for each language you want to support.
  3. Use Azure AI Services to translate the text and store the output in the created fields.
  4. During search, limit the fields that you will search to the appropriate language.
What geospatial functions does AI Search include and how do you use them?

The two functions are:

  • geo.distance, which returns the distance (in km) in a straight line between the input point and search results.
  • geo.intersects, which returns true if the location of a search result is within an input polygon.

To use these, there should be a location field in the index with type Edm.GeographyPoint. You can use these functions in a filter, in a sort, or in a search term.
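
For example, a sketch assuming an index with a location field of type Edm.GeographyPoint (the service, index, field names, key, and coordinates are placeholders; POINT takes longitude then latitude):

```python
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

client = SearchClient(
    endpoint="https://<your-search-service>.search.windows.net",
    index_name="<your-index>",
    credential=AzureKeyCredential("<your-query-key>"),
)

# Results within 10 km of a point, sorted nearest-first.
point = "geography'POINT(-122.131577 47.678581)'"
nearby = client.search(
    search_text="*",
    filter=f"geo.distance(location, {point}) le 10",
    order_by=[f"geo.distance(location, {point}) asc"],
)
```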

External Data Sources

How can you get data from sources outside of Azure into an AI Search Index?
  • Bring those sources into Azure using Azure Data Factory! You can specify a Search Index as a sink in a Data Factory pipeline.
  • Add data directly to an index using the AI Search REST API.
  • Use an SDK like C#!
What are limitations of directly sinking ADF data into an AI Search Index?

Complex data types such as arrays and objects aren’t supported.

AI Search Monitoring, Security, and Maintenance

What data are encrypted when using AI Search?

Data are encrypted in transit and at rest. This includes indexes, data sources, skillsets, and indexers.

What three areas of focus should be considered when thinking about solution security?
  1. Incoming search requests
  2. Outbound requests from the search solution to other services (e.g., external data sources)
  3. Restricting access at the document level to certain searchers
How can you restrict inbound access to the search service?
  • Implement a firewall to restrict who can access the service. This can be done even if the service is used in a public-facing application or website.
  • Require authentication using admin or query keys
How can you determine who can search which documents?

You need to update each document with a role or group that specifies permissions. When this is done, you can use these roles/groups in the filter of the search syntax. This requires the user to be authenticated.

How do you enable diagnostic logging for a search service?
  • In the Azure portal, under Diagnostic settings, select + Add diagnostic settings.
  • Select allLogs and AllMetrics, and choose Send to Log Analytics workspace as the destination.
How will you know if the search service is being throttled?

If user searches are being throttled it will be captured in Log Analytics as a 503 response. If the indexes are being throttled they will show up as 207 responses.

How can you check the performance of individual queries?

The best way is with a client request tool like Postman.

What typically makes searches take longer?

Searches take longer for indexes that are larger (more fields, more records) and complex (more fields searchable, facetable, etc.). To reduce search time, reduce the number of records and/or the complexity.

What are ways to improve query performance?
  • Only specify the fields that you need to be searchable.
  • Only return the fields that are absolutely necessary.
  • Avoid prefix searches or regular expression since they are more computationally expensive.
  • Avoid high skip values.
  • Limit facetable and filterable fields to those with a small number of discrete values.
  • Use search functions instead of individual values in filter criteria.
Which perform better, a smaller tier service with more partitions or a bigger tier service with fewer partitions, even if they cost the same?

The larger tier service will have more powerful compute resources, memory, and provides opportunities to scale for future growth. Go with that one!

When estimating costs for an AI Search solution, what are all the different components that can contribute to costs?
  • The search service itself.
  • Data storage costs (e.g., Blob Storage or Azure SQL)
  • Skills.
What SLAs are guaranteed for search solutions?

If you have two replicas there is a 99.9% availability guarantee for queries. Three or more replicas give a 99.9% guarantee for queries and indexing.

What is the best way to manage service costs?

Monitor and set budget alerts for search resources.

Which three metrics can be viewed in graphs without any other configuration?

Search latency, queries per second, and percentage of throttled queries.

Semantic Ranking

What is semantic reranking?

Semantic reranking leverages natural language understanding to re-rank the results after the initial BM25 ranking function.

What are semantic captions and answers?

Semantic captions extract summary sentences from the document and highlight the most relevant text in those summary sentences.

If the search query appears to be a question, and the search results contain text that appears to be the answer, a semantic answer will be created and returned.

How does semantic ranking work?

The top 50 results from the BM25 ranking are grabbed. These results are converted into strings, trimmed to 256 tokens, then passed to an ML model to determine the semantic caption/answer. Then the results are ranked based on the relevance of the caption/answer and returned in descending order of relevance.

Will semantic ranking result in returning results not originally grabbed by the BM25 algorithm?

No.

How many semantic ranking queries per month do you get for free?

One thousand! Any more and you should choose standard pricing.

How do you add semantic ranking to an index?

In the Azure portal, select the index of interest and select the Semantic configurations tab. Then go from there!

Vector Search and Retrieval

What is vector search?

Vector search is the capability to index, store, and retrieve vector embeddings from a search index.

What types of data can benefit from vector searches?

Whatever you can make embeddings out of! This includes text, video, images, and audio data sources.

Misc questions based on the practice exams

You use Azure AI services on an app that is deployed to an Azure VM. Firewall rules are enabled. What should you do to ensure the app can access the AI service through a service endpoint?

There are two potential routes. One is adding an IP range to the firewall rules to include the virtual machine. The better answer is allowing access to a virtual network, which is where the virtual machine resides.

When making a request to Azure OpenAI, what pieces of information need to be included in the header?
  • The OpenAI resource name,
  • the API version, and
  • the deployment ID.
When making a request to Azure OpenAI, what is the only body attribute that is required?

The prompt!
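
A hedged REST sketch of such a call (the URL pattern follows the documented Azure OpenAI shape; the resource name, deployment ID, api-version, and key are placeholders):

```python
import requests

resource = "<your-openai-resource>"
deployment = "<your-deployment-id>"
api_version = "2024-02-01"  # placeholder; use a currently supported version

url = (
    f"https://{resource}.openai.azure.com/openai/deployments/"
    f"{deployment}/completions?api-version={api_version}"
)
headers = {"api-key": "<your-api-key>", "Content-Type": "application/json"}

# 'prompt' is the only required body attribute for the completions call.
body = {"prompt": "Write a haiku about studying for AI-102.", "max_tokens": 60}

print(requests.post(url, headers=headers, json=body).json())
```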

What does a "mismatch" mean when using an API key for an Azure AI Services container application?

It means the API key is valid, but it is for the wrong type of resource!

With conversational language understanding do you need to train different models to support multilingual use cases?

Why no! CLU models are multilingual by default!

If you have an Azure App Services web application, and you want to authenticate the AI services it uses via Microsoft Entra ID, what should you do?

Enable a managed identity from the application and assign role-based access control (RBAC) permissions to Azure AI Services.

What are two prerequisites to enabling diagnostic logging on an Azure AI Services resource?
  1. A Log Analytics workspace.
  2. An Azure Storage account.
Can an API key be invalid if it is for the correct type of resource, but for the wrong region?

Yes!

When will Azure OpenAI automatically update a model version even if auto-update is disabled?

This can happen when the model version reaches its retirement date.

In the DALL-E model's response, how does it provide images and thumbnails, and in what response object?

These will be found in the result element, and it will contain a collection of URLs that link to the PNG image(s) generated from the prompt.

What are valid attributes to include in the body of a call to the DALL-E 3 Azure OpenAI model?
  • The user’s prompt
  • The quality of the generated images
  • The style of the generated images
  • There are a ton of them.
What is Azure OpenAI's "use your data" feature?

It is a REST API/SDK that helps with RAG use cases. You can upload files to Azure OpenAI studio, and data will be cracked, chunked, and embedded. Then you can develop and deploy an Azure OpenAI model for prompting.

What are helpful parameters with Azure OpenAI on your data?
  • You can limit responses to only include those relevant to the data you provided. For example, if the user asks about the weather, the model won’t respond.
  • You can specify the number of documents to retrieve (3, 5, 10, or 20).
  • You can specify how strict (strictness) the model is in filtering out search documents based on their similarity scores.
What are three best practices for prompt engineering strategies?
  1. Be Descriptive – the more details the better!
  2. Be Specific – tell the model exactly what to expect and what you want it to do.
  3. Order Matters – the order in which you present information will affect the output!
When running an Azure AI Search indexer for the first time, what exact steps are occurring behind the scenes?
  • Document cracking (opening files and extracting content)
  • Field mapping
  • Skillset execution
  • Output field mapping
  • Pushing to an index
When defining a skillset for Azure AI Search, what are the minimum sections that need to be included in the definition?

The name, description, and skills. If you are sending skill output to a knowledge store then you must include the knowledgeStore parameter. If you are using some billable skills then you need to include cognitiveServices.

What three document formats does the Azure AI Document Intelligence pre-built read model support?
  • Excel
  • Word
  • PDFs
In a text analysis-based app you are performing sentiment analysis. What attributes can you include in the API requests the app makes and what do those attributes do?
  • opinionMining will make sentiment very granular (i.e., identifying both positive and negative sentiments in a single sentence)
  • loggingOptOut will opt out of logging
What are different types of PII that the PII detection feature will detect automatically?
  • Person – names
  • Age – people’s ages
  • DateTime – dates and time values
  • PhoneNumber – phone numbers
  • PersonType – job types/roles
  • Organization – companies, political groups, bands, sports teams, etc.
  • Email
  • URL
  • IPAddress
What are common errors with speech-to-text, and how do you resolve them?
  • A substitution error is when a word is different in the text than in the speech. This typically happens when domain-specific terms aren’t in the corpus and you need to provide examples of these terms.
  • An insertion error occurs when words are added to the text that aren’t in the speech. These can happen in noisy environments, where words from other conversations get picked up and included in the text.
  • A deletion error occurs when words in the speech aren’t in the text. This typically means weak audio signal strength, and you should get the microphone closer to the source.
In speech-to-text tasks, what is the word error rate (WER), and how is it calculated?

It is the number of insertions, deletions, and substitutions divided by the total number of words.
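
For example, if a 100-word reference transcript yields 3 substitutions, 2 insertions, and 1 deletion, then WER = (3 + 2 + 1) / 100 = 6%.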

What is a good WER in speech-to-text tasks?

5% to 10% is considered good quality and ready to use.

Does the base speech-to-text model support noisy environments?

Not really. If you have a lot of ambient noise in your data you might want to train a custom speech-to-text model.

What is pattern matching in CLU?

Pattern matching uses the Speech SDK and is helpful when you are only interested in matching strictly what the user said.

What is the Bilingual Evaluation Understudy (BLEU) score, what does it do, and what is its scale?

BLEU is an algorithm for evaluating machine translation. It ranges from 0 to 100, with 100 being perfect. A score between 40 and 60 indicates a high-quality translation.

Which colors can the Image Analysis API detect as the dominant background color of an image?

black, blue, brown, gray, green, orange, pink, purple, red, teal, white, and yellow.

What two types of CLU models are there?

Standard and advanced. Standard is the default and works for English language only. It is provided free of charge. Advanced training leverages the latest in ML technology, will result in higher scores, and will enable multilingual capabilities. It is more expensive to do it this way.

What is the orchestration workflow feature in Azure AI Language?

Orchestration workflows allow you to connect CLU, custom question answering, and LUIS (Language Understanding) applications. They can be developed in Language Studio.

If a multilingual CLU app is performing poorly, what can you do?

Provide more utterances in languages that are performing poorly. Reminder … CLU applications are multilingual by default.

What is active learning in the context of a Q&A app?

Active learning will automatically generate suggestions of data based on user queries that don’t have great answers. It takes at least 30 minutes for these suggestions to start showing up once it has been enabled (it is enabled by default for custom QA models).

What influences the price of a QA service?

Pricing depends on the throughput (utilization), the size of the knowledge base, and the number of knowledge bases.

For the Azure AI Face service, what are the different detection models, and what are their strengths?
  • detection_01 is the default model that works best overall.
  • detection_02 improves accuracy on small, side-view and blurry faces. It doesn’t return face landmarks.
  • detection_03 has even further improved accuracy, can detect masks, but not other accessories.
What does the spatial analysis feature of Azure AI Vision do?

It detects the presence of people in a video feed.

If users of a Document Intelligence app note that they can't process some documents, what could be the problem?

An S0 instance can handle documents up to 500 MB and 2,000 pages (i.e., most documents), so the likely issue is password-protected files.