Text Analysis with Power BI in different languages

According to the msdn article „How to integrate Text Analysis into Power BI„, I needed to detect the language of an comment and make a sentiment analysis of it. The reason to make it parametrized and not change the language key is easy, mostly in Germany I have comments in English, German Spain etc.

So, after I completed the Howto above, i created a new M Function named „Language“ with this code:

// Returns the two-letter language code (for example, 'en' for English) of the text
(text) => let
     apikey      = "YOUR_API_KEY_HERE",
    endpoint    = "https://<your-custom-subdomain>.cognitiveservices.azure.com/text/analytics" & "/v2.1/languages",
    jsontext    = Text.FromBinary(Json.FromValue(Text.Start(Text.Trim(text), 5000))),
    jsonbody    = "{ documents: [ { id: ""0"", text: " & jsontext & " } ] }",
    bytesbody   = Text.ToBinary(jsonbody),
    headers     = [#"Ocp-Apim-Subscription-Key" = apikey],
    bytesresp   = Web.Contents(endpoint, [Headers=headers, Content=bytesbody]),
    jsonresp    = Json.Document(bytesresp),
    language    = jsonresp[documents]{0}[detectedLanguages]{0}[iso6391Name]
in  language

The code above is 1:1 from the msdn webpage to get a two letter code of the language of the key merged subject and body.

This is also similar to the other steps, but it must be the first invoke function. The next step is to get the Key Phrases, but depending of the language of the column which are created before.
// Returns key phrases from the text in a comma-separated list
(text,lang as text) => let
     apikey      = "YOUR_API_KEY_HERE",
    endpoint    = "https://<your-custom-subdomain>.cognitiveservices.azure.com/text/analytics" & "/v2.1/keyPhrases",
    jsontext    = Text.FromBinary(Json.FromValue(Text.Start(Text.Trim(text), 5000))),
    jsonbody    = "{ documents: [ { language: """ & lang & """, id: ""0"", text: " & jsontext & " } ] }",
    bytesbody   = Text.ToBinary(jsonbody),
    headers     = [#"Ocp-Apim-Subscription-Key" = apikey],
    bytesresp   = Web.Contents(endpoint, [Headers=headers, Content=bytesbody]),
    jsonresp    = Json.Document(bytesresp),
    keyphrases  = Text.Lower(Text.Combine(jsonresp[documents]{0}[keyPhrases], ", "))
in  keyphrases

The JSon Body has no longer the hard coded en, is uses a new parameter language which is given on the function header. So you must edit the „invoke custom Function“-call with our new language column:

Now, we have our Key Phrases depending on the detected language and we can get the sentiment score also depending on the language with this code:
// Returns the sentiment score of the text, from 0.0 (least favorable) to 1.0 (most favorable)
(text,lang as text) => let
     apikey      = "YOUR_API_KEY_HERE",
    endpoint    = "https://<your-custom-subdomain>.cognitiveservices.azure.com/text/analytics" & "/v2.1/sentiment",
    jsontext    = Text.FromBinary(Json.FromValue(Text.Start(Text.Trim(text), 5000))),
    jsonbody    = "{ documents: [ { language: """ & lang & """, id: ""0"", text: " & jsontext & " } ] }",
    bytesbody   = Text.ToBinary(jsonbody),
    headers     = [#"Ocp-Apim-Subscription-Key" = apikey],
    bytesresp   = Web.Contents(endpoint, [Headers=headers, Content=bytesbody]),
    jsonresp    = Json.Document(bytesresp),
    sentiment   = jsonresp[documents]{0}[score]
in  sentiment

That’s it, after we make another invoke function call, we get the sentiment score based on the given language. The order after that in the M Query Editor:

Categorized: Allgemein

Schreibe einen Kommentar