Maven Challenge – Harry Potter Scripts

Using Power Query to match number of particular spells to characters

When I first saw the new Maven Magic Challenge description, I was drawn to the words “share a…visual that captures the MAGIC. And so, when I started looking at the data I had little real interest in the budgets, box office and run times. It was all about the characters, their journey, and the spells they cast… both in the movies and on the movie goers.

Connecting the Places and Spells to the Characters

So, I started to look at how I could connect the story characters and their dialogues to the magical places around the Potter universe, and the data as it was arranged allowed me to do that without too much hassle.

I then came to look at the data associated with the spells. There was no path in the dataset as provided that allowed me to directly link the spells both to the characters, the dialogues and the places. All it would allow me to do is look at them in isolation, which was no good!!

The challenge of this data set was officially set – how can I link the spells to the dialogue, and then onto the characters and places, and how can I pull this weaving story into a single page visual???

I came up with two methods to try and achieve this

My first method would be to count the number of spells cast by a character. The next would be to also identify the type and name of each individual spell cast.

It was time to cast those “spells”…..Powerus Querio!!

Method 1 – Calculating the Number of Spells Cast by a Character

Step 1 – Create a List of Spells

I needed to create a separate list containing only the incantations from the spell file. So, I imported a copy of the “Spells” csv file. I then selected the Incantation column and the Transform tab, selected Convert to List (shown below) and muttered “Reducio” under my breath.

Voila! The file was reduced to a single list of the spell incantations, as shown below. I would use this list as a reference to look at instances where the incantations were mentioned in the character dialogue. Keep reading to find out more!

Step 2 – Searching the Dialogue for Incantations

The next step was to search the Dialogue for any of the line items from my newly acquired list. I imported the Dialogue csv, then set to work deploying a combination of List.Count and Splitter.SplitTextByAnyDelimeter. The line of M code I used was:

= Table.AddColumn(#"Changed Type", "No. of Spells - Splitter", each List.Count(Splitter.SplitTextByAnyDelimiter(Spells)([Dialogue])) -1)

This would count the number of times a spell was mentioned in the line of dialogue. Let me explain a little how this works, starting from the inside out.

Splitter.SplitTextByAnyDelimeter(Spells)[Dialogue]

This code allows us to work through the lines of dialogue, and when it meets a spell from our list, it effectively stops and splits that line into a list. This new mini list will contain a separate line for each time a spell is encountered, and will have a final line to dump the text from the dialogue that does not appear in the spell list.

This new mini list does not capture which spell it has encountered, only that it has encountered one. In fact, it returns a blank line each time it finds a spell. Confused? I was a little. The below picture may help visualize it.

List.Count – Wrap Around

The Text Splitter does not do a whole lot on it’s own. Therefore, the next piece of the puzzle was to wrap it in a List.Count. This would count the number of items in each mini list we have created. However, as the above shows, it would count once for the spell, and once for the remaining text.

The next action would be to add “-1”. The result would be to give us a count of the number of times an incantation from our list had been used in the dialogue. See below example highlighting where a count has been made of an incantation.

Step 3 – Tidy Up

The final step in this initial tidy up is simply to filter the column for values greater than zero. You have now identified all the lines of dialogue that contain an incantation, as well the numbers of incantations used – sometimes there was more than 1 – see below.

Not Satisfied Yet!!!!!!

Method 2 – Extracting the name of the spells cast by each character

Step 1 – Custom Column with List

Create a list of Spells in the same way as Method 1, Step 1 above – Easy as!

Step 2 – Add Spells List to Dialogue Line Items

Add a custom column to your dialogue sheet in Power Query called “Find Spell”, and simply add “= Spells” and hit ok. You will see a new column added containing the word List in each cell. This list contains the full list of spells.

Step 3 – Expanded List

This next step may not be recommended for very large data sets, like the Maven Taxi Challenge, but it performs fine here.

We will now expand the list. This is done by clicking on the double arrows and selecting “Expand to New Rows” as shown below.

This will then duplicate each line of dialogue to equal the number of spells in our list. Below is an example, where we see the line of dialogue repeated, with each spell listed beside it. The reason for doing this will become clear below.

Step 4 – Find the Spell

Next we can use a Boolean calculation in a custom column to find if our line of dialogue contains the spell that is living in the “Find Spell” column.

I have used Text.Contains([Dialogue],[Find Spell]) . This will look at the text in the Dialogue column and see if it contains the text shown in the corresponding Find Spell column. If it finds it, it will return TRUE, and if not it will return FALSE. See the below example, where if I can recall correctly Hermione repairs Harry’s glasses with an Occulus Reparo spell!!

Oculus Reparo - theQuotes.me

Step 5 – Filter for TRUE values

Once the above step is taken, it is a matter of applying a simple filter on your custom column to only show TRUE values. Looking at the below, I can now see only the lines of dialogue that contain a spell, and now also the name of the actual spell that has been used!! This was what I originally had in mind.

Using this Information

As the Dialogue data sheet contains the movie, chapter, place and character IDs, I can now link the spells to these people and locations!

Next Step

The next step is to now build the visualisation I had in mind to work with this new insight.

Watch this space…..

Expecto Patronum!

15 thoughts on “Maven Challenge – Harry Potter Scripts

  1. It’s actually a great and useful piece of information. I’m happy that
    you shared this useful info with us. Please keep us up to date
    like this. Thank you for sharing.

    Like

  2. It is in point of fact a great and useful piece of info.
    I am glad that you shared this useful info with us.
    Please stay us informed like this. Thank you for sharing.

    Like

  3. Great blog you have here but I was curious if you
    knew of any discussion boards that cover the same topics discussed here?
    I’d really like to be a part of group where I can get advice from other experienced individuals that share the same interest.
    If you have any recommendations, please let me know. Cheers!

    Like

    • You can follow Maven Analytics on LinkedIn.

      I also just shared a new blog on the Super Bowl challenge, sharing my approach to creating that dash report.

      Are you looking for technical support, or interested in the planning and strategy that goes on behind the scenes?

      Like

  4. Hi, good work on this, i just want to share my thought how i did it, please reply with your opinion. what possible issue we might face. I did the inner left join between dialogue[Dialogue] and spells[Incantation] with fuzzy logic threshold of .99 in powerbi. i got the same result, i just want to confirm whether it has any possible flaws in doing so.

    Like

    • Hi,
      Thanks for your comment. I am not an expert in use of fuzzy logic, but I think you will get the same result as it is very unlikely there will be spelling mistakes or typos etc that would give an incorrect match.

      Did it pick up lines where there was more than one spell/incantation spoken in that dialogue line?

      It’s something I will look into a bit more now – thanks for pointing it out👍

      Like

  5. I blog often and I really thank you for your content. The article has truly peaked my interest. I’m going to bookmark your site and keep checking for new details about once a week. I opted in for your RSS feed too.

    Like

Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s