
Introduction to Artificial Intelligence - Microsoft - DAT263x on edX


Recently, my company encouraged us to learn more about Azure ML Studio.  A little Googling led me to a course on edx.org offered by Microsoft called Introduction to Artificial Intelligence.  While it was at least 50% advertisement for some of Microsoft's cloud-based AI offerings, it was definitely worth the 10 or so hours I put into it over the last 24 hours.

This was a fairly high-level course that discussed statistical techniques like regression, categorization and clustering, as well as more advanced concepts including artificial intelligence (AI), natural language processing (NLP), and chatbots.  Overall, what I learned wasn't so much about these topics as about how much Microsoft's ML Studio has to offer in the way of getting you up and running quickly and easily with these technologies.

Having said that, it's Microsoft, and as I expected, several things didn't work as they should have.  Full disclosure:  I'm an AWS certified developer and solutions architect (associate level) and a Linux / Mac user running Google Chrome, so I wasn't really too surprised.  Here's a list of the workarounds I needed to get the labs working properly.

Lab 1 worked pretty much out of the gate.  However, labs 2 and 3 needed some tweaking.  To start, the Jupyter notebook for Lab 2 had a plus sign in its name (Text+Speech.ipynb).  I had to remove the plus sign (renamed the file Text_Speech.ipynb) before it would upload properly.

Lab 2 also bombed when running the following command:

!pip install -U textblob

This was the first line in the first code cell under the heading Get TF-IDF Values for the top three words in each document.  To fix this, create a new, blank cell (Esc, then "b" to create one (b)elow the currently-selected cell) and then run the command in it without the -U, like so:

!pip install textblob

Then comment out the original line (the one with the -U in it) and run that cell again.  It may still bomb, but I was able to continue after that.
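If you want to confirm that the install actually took before moving on, here's a minimal sketch of a check you could run in another blank cell.  The helper name is_installed is my own, not something from the lab notebook, and it only tells you whether Python can find the package, not whether every feature of it works.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


# "textblob" is the package the lab tries to install with pip.
print("textblob installed:", is_installed("textblob"))
```

If this prints False, the pip command didn't land in the environment the notebook kernel is using.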


Also in Lab 2 (and Lab 3), you'll probably get an error stating "the JSON object must be str, not 'bytes'".  In Lab 2, this happens under the section titled Call the Text Analytics Service to Determine Key Phrases in the Documents.  Change the assignment of the variable parsed to:


parsed = json.loads(data.decode('utf-8'))

You'll also need to make the same change in the section named Perform Sentiment Analysis, on line 18 of the code under the section named Consume the LUIS App, and on line 24 of the final code cell for Lab 2.

Lab 3 only had this issue in a section named Use the Computer Vision API to Get Image Features - it's the same symptom and cure - just add that .decode('utf-8') and you should be good to go.
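The error comes up because the HTTP response body arrives as bytes, and json.loads in Python versions before 3.6 only accepts str.  Here's a small sketch of the fix wrapped in a helper so it works either way.  The function name parse_response and the sample payload are made up for illustration; the real services return their own JSON.

```python
import json


def parse_response(data):
    """Parse a JSON body that may arrive as bytes or as str."""
    if isinstance(data, (bytes, bytearray)):
        # Decode first so older Python 3 versions don't raise
        # "the JSON object must be str, not 'bytes'".
        data = data.decode('utf-8')
    return json.loads(data)


# Hypothetical payload standing in for what an HTTP read() call returns.
raw = b'{"documents": [{"id": "1", "keyPhrases": ["hello world"]}]}'
parsed = parse_response(raw)
print(parsed["documents"][0]["keyPhrases"])
```

In the notebooks themselves, the one-line change to the parsed assignment is all you need; the helper just saves repeating it.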

Also in Lab 3, when copying the URI for the facial recognition API (by clicking the "copy" button per the directions), you'll need to add a trailing slash to it before it will work in the notebook.
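If you'd rather not remember to do that by hand, a tiny helper like the one below would take care of it.  This is just a sketch: the function name is mine, and the URI shown is a stand-in, not the real endpoint you copy from the portal.

```python
def ensure_trailing_slash(uri):
    """Return the URI unchanged if it ends with '/', otherwise append one."""
    return uri if uri.endswith('/') else uri + '/'


# Hypothetical URI used only to show the behavior.
copied_uri = "https://example.invalid/face/v1.0"
print(ensure_trailing_slash(copied_uri))
```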

Lab 4 worked pretty much seamlessly.  I really didn't want to have to write a C# application just to do a simple POST to a Web service API though.  To that end, here's the format to use so you can call it with curl (replace YOUR_KEY_HERE, REGION_NAME, and YOUR_ACCOUNT_ID to fit your setup):

curl -X POST -H "Ocp-Apim-Subscription-Key: YOUR_KEY_HERE" -H "Content-Type: application/json" https://REGION_NAME.api.cognitive.microsoft.com/qnamaker/v2.0/knowledgebases/YOUR_ACCOUNT_ID/generateAnswer -d '{"question":"hi"}'
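If you're already sitting in a notebook, the same POST can be sketched in Python with nothing but the standard library.  The function names below are my own, and I'm assuming the key, region, and account ID slot into the header and URL exactly where the three placeholders mentioned above go.  I haven't verified the shape of the response, so the sketch just hands back whatever JSON comes down.

```python
import json
import urllib.request


def build_request(key, region, account_id, question):
    """Build (but do not send) the POST request that the curl command makes."""
    url = ("https://{0}.api.cognitive.microsoft.com/qnamaker/v2.0/"
           "knowledgebases/{1}/generateAnswer").format(region, account_id)
    body = json.dumps({"question": question}).encode("utf-8")
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def ask(key, region, account_id, question):
    """Send the request and return the parsed JSON reply."""
    request = build_request(key, region, account_id, question)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Placeholders only; swap in your own values before calling ask().
req = build_request("YOUR_KEY_HERE", "REGION_NAME", "YOUR_ACCOUNT_ID", "hi")
print(req.get_method(), req.full_url)
```

Nothing goes over the wire until you call ask() with real values, so you can eyeball the URL first.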

Overall, this was a solid intro to AI course.  More than that, it was a real eye-opener to how many services Microsoft has created (or purchased, as the case often is).  The fact that you had to go to a different Web site to create / link all the different services back to your Azure subscription made the experience less than fluid, but overall, I was pretty impressed with how far Microsoft has come, and how much ground AWS has to make up in this arena.

