Asking Multilingual Questions To AI Skills Using Azure AI Services
I continue to test AI Skills in Fabric. I have written two earlier blog posts on how you can use the AI Skills endpoint in a notebook. In this blog, I extend that further to highlight the prebuilt Azure AI services in Fabric. With prebuilt AI services, you can use a host of AI features without needing any additional subscriptions. Some of the services, such as Azure OpenAI, require an F64 or higher capacity, while the text analytics services like the one below have no such limitation.
AI Skills currently supports queries in English only. In this blog, I want to show how you can use the text analytics services to translate a question from any language to English and pass it to the AI Skills endpoint to get the answer. Of course, this will not be needed once multilingual queries are supported, but hopefully it will prompt you to take a look at the existing AI services offerings within Fabric.
Translate
You can use the REST API or the Python SDK, but I will use SynapseML because you can extend it to large dataframes as part of your pipeline.
import synapse.ml.core
from synapse.ml.services import Translate
from pyspark.sql.functions import col, explode

def translate(text):
    # `spark` is the session that a Fabric notebook provides automatically
    df = spark.createDataFrame([(text,)], ["text"])
    model = (Translate()
        .setTextCol("text")
        .setToLanguage(["en"])  # change to another language if needed, check docs. e.g. "de" for German
        .setOutputCol("response"))
    # One response row per input text; take the text of its first translation
    result = (model.transform(df)
        .withColumn("response_exploded", explode(col("response")))
        .select(col("response_exploded.translations").getItem(0).getField("text").alias("text"))
        .collect()[0][0]
        ).replace("\\n", "").strip()  # drop literal "\n" sequences and surrounding whitespace
    return result

text = "नमस्ते मित्रांनो"  # Marathi, my mother tongue
translated_text = translate(text)
translated_text
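The chain of explode, getItem(0) and getField("text") in the translate function can be hard to picture. Below is a minimal plain-Python sketch of the same lookup, assuming the response is a list of results that each carry a "translations" list of entries with a "text" field. The sample_response value and the first_translation helper are made up for illustration and are not service output.

```python
# Hand-written sample that mimics the assumed response shape (not real output).
sample_response = [
    {
        "translations": [
            {"text": "  What is the average order amount?\n", "to": "en"},
        ],
    }
]


def first_translation(response):
    """Return the text of the first translation of the first result."""
    first_result = response[0]                     # like explode(...) on a one-row frame
    first_entry = first_result["translations"][0]  # like getItem(0)
    return first_entry["text"].strip()             # like getField("text"), then strip()


print(first_translation(sample_response))
```

If you translate into several languages at once, the "translations" list would presumably hold one entry per target language, so index 0 would only be the first of them.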
To test this, I used a question I wanted to ask AI Skills, written in three languages besides English.
English:
Overall average order amount and average amount for items that start with Mountain for all customers and Laura Lin. So there should be four columns - two for all customers for all products and for Mountain products and the other two for Laura Lin.
Japanese (Thanks Eiki Sui):
全顧客とLaura Linに対して、全体の平均注文金額と、Mountainで始まる商品の平均注文金額。したがって、4つの列が必要です。全顧客に対しての全製品とMountain製品用の列が2つ、Laura Linに対しての列が2つです。
Chinese (Thanks David):
总体平均订单金额和以 "Mountain" 开头的商品平均金额(针对所有客户和 Laura Lin)。因此,应有四列 - 两列是针对所有客户的所有产品和 Mountain 产品,另外两列是针对 Laura Lin 的。
Spanish (Thanks Chuy Verela):
Obtener la Cantidad Total Promedio por Orden y la Cantidad Promedio para artículos que comienzan con “Montaña” para todos los clientes, y también para Laura Lin. Con lo cual deberíamos obtener 4 columnas:
-2 columnas para todos los clientes y todos los productos que comienzan con “Montaña”,
-las otras 2 columnas para Laura Lin.
All three were translated correctly. I passed the translations to the function I shared in the last blog and got correct answers in all three languages.
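The flow described above is just two steps chained together: translate to English, then ask. Here is a rough sketch of that glue, assuming both steps are plain functions that take and return a string. The names ask_in_any_language, fake_translate and fake_ask are hypothetical; in a Fabric notebook you would pass in the translate function from this post and the AI Skills function from the earlier post.

```python
from typing import Callable


def ask_in_any_language(
    question: str,
    translate_fn: Callable[[str], str],
    ask_fn: Callable[[str], str],
) -> str:
    """Translate the question to English, then forward it to the AI Skill."""
    english_question = translate_fn(question)
    return ask_fn(english_question)


# Stand-ins so the sketch runs anywhere (hypothetical, no service calls).
def fake_translate(text: str) -> str:
    return "What is the average order amount?"


def fake_ask(english_question: str) -> str:
    return f"[AI Skill would answer: {english_question}]"


answer = ask_in_any_language(
    "¿Cuál es el monto promedio por orden?", fake_translate, fake_ask
)
print(answer)
```

Passing the two steps in as arguments keeps the glue testable without a Fabric capacity, and lets you swap the translation service later without touching the rest.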
Notes
You certainly don't have to use Azure AI services. There are plenty of free and cheap translation APIs, but your queries stay within Azure if you use Fabric.
You can also use the Azure Text Translation API, which has higher rate limits than Fabric's prebuilt AI services. Refer to the docs for rate limits and other limitations.
As always, validate before you publish to production.
Prebuilt AI services in Fabric are currently (as of 08-15-2024) in preview and offered for free. I am not sure how AI Skills is billed.
Written by Sandeep Pawar