Posts

Showing posts from February, 2021

Rapid Automatic Keyword Extraction (RAKE) in Power BI

This post can now be found here.

I like Python but it needs more acronyms

In this post I am going to go through the basic use of Rapid Automatic Keyword Extraction (RAKE) using the Natural Language Toolkit (NLTK), as documented here.

So before we get into RAKE, what is NLTK? It is a set of Natural Language Processing tools that do similar things to the spaCy Part-of-Speech (POS) tagging and Named Entity Recognition (NER) tools that are explored here. Essentially, the tools break text down into its component parts so that scripts like RAKE, POS and NER can return results based on analysis of those components.

What is RAKE? This tool extracts key words and phrases based on word frequency, without any understanding of the context of the text being analysed. This makes it quite a general tool that will not work well with all text sources but, like POS and NER, it is a good exploratory tool and may work well when joined up with other information about the text data. As an exampl...
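The frequency-based scoring described in this excerpt can be sketched in plain Python. This is a simplified illustration of one common form of RAKE scoring (word degree divided by word frequency, summed per phrase), not the NLTK-based package the post uses. The tiny `STOPWORDS` set and the function names `candidate_phrases` and `rake_scores` are assumptions made up for this sketch.

```python
import re
from collections import defaultdict

# Tiny illustrative stop-word list (an assumption; a real run would use a
# fuller list, such as the one shipped with NLTK).
STOPWORDS = {"a", "an", "and", "the", "of", "is", "are", "for", "to",
             "in", "on", "with", "that", "it", "this", "be", "as", "by"}


def candidate_phrases(text):
    """Split text into candidate phrases at punctuation and stop words."""
    phrases = []
    # Punctuation ends a phrase, so split the text into fragments first.
    for fragment in re.split(r"[^\w\s'-]+", text.lower()):
        current = []
        for word in fragment.split():
            if word in STOPWORDS:
                # A stop word also ends the phrase being built.
                if current:
                    phrases.append(tuple(current))
                    current = []
            else:
                current.append(word)
        if current:
            phrases.append(tuple(current))
    return phrases


def rake_scores(text):
    """Return (phrase, score) pairs, highest score first."""
    phrases = candidate_phrases(text)
    freq = defaultdict(int)    # how often each word appears
    degree = defaultdict(int)  # summed length of the phrases containing it
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)
    word_score = {word: degree[word] / freq[word] for word in freq}
    scores = {" ".join(phrase): sum(word_score[w] for w in phrase)
              for phrase in set(phrases)}
    # Sort by score (descending), then alphabetically for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    sample = ("Keyword extraction is fun. "
              "Rapid automatic keyword extraction is rapid.")
    for phrase, score in rake_scores(sample):
        print(f"{score:5.1f}  {phrase}")
```

Because nothing here looks at meaning, longer phrases built from frequently co-occurring words float to the top, which is why the results are best treated as exploratory.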

Natural Language Processing in Power BI

This post can now be found here.

With great Power BI comes great responsibility

It is time: Power BI + Python = Amazing!

It worked, and it is relatively easy to convert the code to run within Power BI. The main thing that changes is the emphasis on using print() or exporting data; in some ways things are easier, in that you just need to update or create data frames, so 🐼s is key... Pandas.

In terms of a step-by-step guide, the first thing I did was get the code working in Jupyter as proof that it would execute successfully.

Following on from using spaCy, when you add a Python script to a Power BI query it basically takes the data as it stands in the previous step and converts it to a data frame called dataset. This means that for the code I had for spaCy, the name "df" needed to be changed to "dataset". The other thing that needs to be different is what you do with the outputs of your code. I was previously printing out the values to test that it was working whic...
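As a rough sketch of the change described in this excerpt, a script written against "df" in Jupyter becomes a script that works on "dataset" and writes its results into data frames rather than printing them. The column names `Comment` and `WordCount`, the `summary` frame and the sample rows are hypothetical. The `dataset` frame is built by hand here only so the sketch runs outside Power BI, where it would normally be supplied by the previous query step.

```python
import pandas as pd

# Inside Power BI the previous query step arrives as a data frame named
# `dataset`. It is created here (with made-up rows) only so that the sketch
# can run on its own.
dataset = pd.DataFrame(
    {"Comment": ["Power BI is great", "Python in Power Query"]}
)

# Instead of print(), store the result as a new column on the data frame.
dataset["WordCount"] = dataset["Comment"].str.split().str.len()

# A second data frame can hold a separate result, such as a summary.
summary = pd.DataFrame({"TotalWords": [int(dataset["WordCount"].sum())]})
```

Keeping every output in a data frame means the same code can be checked in a notebook first and then pasted into the Power BI script step with only the name change.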

Lost in spaCy

This post can now be found here.

In spaCy no one can hear you scream... when you get a Syntax error.

Experiments with spaCy are going well in the appropriately named Jupyter notebooks. There are two spaCy Natural Language Processing (NLP) components I am working with: 1. Part-of-Speech (POS) tagging and 2. Named Entity Recognition (NER).

This tutorial was really helpful in generalising the experiments I have been doing and looking at different aspects and capabilities of NLP. The idea is to get a general set of tools developed in a notebook before looking at adapting the code to be used in Power BI, running NLP as a step in Power Query. The big question is whether you can load the NLP language library into Python running in Power BI; if not, it is going to be tricky.

Starting Point

So I am starting from an example here where NLP tools have been used to get an overview count of the contents of a csv of text values, which is great, but not what I need for a Power BI report to be able...

Starting Py-Fun

This post can now be found here.

💡The Idea

I wanted a way to document my experiments with Python, and to encourage myself to keep practising them, while integrating it with Azure ML and Power BI, which is a major project I am currently working on.

📋The Plan

I am going to blog as regularly as possible, covering the Python exploration and investigation that I am doing. The plan is to document the challenges and experience of trying to bring Python into the Microsoft Power Platform and the processes that are used in my team. I have not found many resources about this area in my work so far, so I am conscious that some of this may be of help to others, which is another reason to put it into a blog.

👣Next Steps

1) Publish this post
2) Start looking at Natural Language Processing in Python (using Jupyter notebooks)
3) Convert Jupyter code to Power BI for use in data models