PHD: Quant analysis rough draft 1

First, some context. This is going to be rough. I am trying to wrap my brain around what is important, what is not, and how to frame the important parts to tell the story. Here are the three research questions I am addressing:

RQ1: What are the network behaviors of participants in TMathC in 2017?
RQ2: What is the human activity of conference participation of TMathC in 2017?
RQ3: How are the network behaviors and professional development activity of the participants interrelated in TMathC in 2017?

This blog post is a first attempt at answering the first question only. The second question is a qualitative question that will be answered by actually reading each of the tweets and blog posts from TMC17. The third question will be answered by putting the results of the first and second together.

The software I am using to do this analysis is NodeXL. NodeXL is created by the Social Media Research Foundation (https://www.smrfoundation.org/nodexl/) and is available in both free and pro versions. I am using the pro version to do the analysis you will find in this and later posts.

To collect the data for this analysis, every day from 4 days before TMC17 through the end of August I downloaded the Twitter activity using a search on the hashtag #TMC17. This only collected public tweets that contained the hashtag. That means that follow-up tweets between people may not have been collected unless they used the hashtag in their replies. People rarely keep the hashtag in their replies, so all of the analysis below should be understood as underestimating the actual patterns of communication. This cannot be stressed enough.

First off, is the pattern of communication different from a “typical” math teacher conference (whatever that may be)? The easy answer is yes. Working from the public list of TMC attendees (https://twitter.com/TmathC/lists/tmc17), the number of attendees was 189. The data set I downloaded and compiled has 1348 unique accounts in it. Comparing the two lists, 169 of the 189 listed show up among the 1348. This means that 89.4% of the participants at the conference had some kind of Twitter activity. That is a huge percentage of the attendees. I will call them the attending participants (AP), and the accounts that took part without attending the remote participants (RP). A little arithmetic (1348 - 189) shows the number of remote participants is 1159. This means the ratio of AP to RP is 169/1159 = 0.1458, or 14.6%.
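The arithmetic in the paragraph above can be checked with a few lines of Python. This is only a sketch built from the counts quoted in the text; the variable names are made up for illustration. The last two values are an added comparison that is not part of the original analysis: they subtract only the 169 attendees who actually appear in the data set, rather than all 189 on the list.

```python
# Counts quoted in the text above.
listed_attendees = 189     # accounts on the public TMC17 attendee list
accounts_in_data = 1348    # unique accounts in the downloaded data set
attendees_in_data = 169    # listed attendees who also appear in the data set

# Share of attendees with some kind of Twitter activity.
participation_rate = attendees_in_data / listed_attendees

# Remote participants as computed in the text: all accounts minus listed attendees.
remote_participants = accounts_in_data - listed_attendees
ap_to_rp_ratio = attendees_in_data / remote_participants

# Added comparison (an assumption, not from the original analysis):
# subtract only the attendees who are present in the data set.
remote_alt = accounts_in_data - attendees_in_data
ratio_alt = attendees_in_data / remote_alt

print(f"participation rate: {participation_rate:.1%}")
print(f"RP as in the text: {remote_participants}, AP/RP = {ap_to_rp_ratio:.4f}")
print(f"RP alternative:    {remote_alt}, AP/RP = {ratio_alt:.4f}")
```

The two ways of counting RPs differ by the 20 listed attendees who never show up in the data, which moves the ratio only slightly (roughly 14.6% versus 14.3%).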

These 1348 nodes (they are not all people, some are accounts like NCTM) created the following network map.

The labels G1 through G25 refer to the clusters the software segments the nodes into based upon their patterns of communication. So, for example, the G1 cluster has a tight group of nodes in the center, with a radiating pattern of nodes around it. The small central group communicated with each other frequently, and the radiating nodes communicated less. The words are the top words used by the nodes within each cluster. I set the software to ignore the TMC17, MTBoS, and ITeachMath hashtags. Otherwise they showed up in every cluster.
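To make the "top words per cluster" idea concrete, here is a minimal Python sketch of counting words inside each cluster while skipping the three ignored hashtags. It is not how NodeXL does it internally (that is not described here); the function name, the tokenizing rule, and the sample tweets are all invented for illustration.

```python
from collections import Counter
import re

# Hashtags left out of the word counts, compared case-insensitively.
IGNORED = {"tmc17", "mtbos", "iteachmath"}

def top_words(tweets_by_cluster, n=3):
    """Return the n most common words per cluster, skipping ignored hashtags."""
    result = {}
    for cluster, tweets in tweets_by_cluster.items():
        counts = Counter()
        for text in tweets:
            # Lowercase and split into simple word tokens; '#' and '@' are dropped.
            for word in re.findall(r"[a-z0-9']+", text.lower()):
                if word not in IGNORED:
                    counts[word] += 1
        result[cluster] = [word for word, _ in counts.most_common(n)]
    return result

# Hypothetical sample tweets, made up purely for this example.
sample = {
    "G1": ["#TMC17 desmos session was great", "Loved the desmos demo #MTBoS"],
    "G8": ["Wish I was there #TMCJealousyCamp #TMC17"],
}
print(top_words(sample, n=1))
```

Without the ignore list, the conference hashtag would tie for or take the top spot in nearly every cluster, which is exactly the problem described above.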

This graph tells me that there is a great deal of communication between the different clusters. In fact, almost every single cluster has strong communication ties with clusters 1, 2, 3, and 4. Cluster 8 has none, but given that TMCJealousyCamp shows up in that cluster, it makes sense.

But which of these nodes are AP, and which are RP? When I asked that question, I realized there are actually four different communication patterns which must be explored:
AP to AP, AP to RP, RP to AP, and RP to RP.
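The four patterns amount to labeling each directed edge by whether its sender and its receiver are on the attendee list. A minimal sketch in Python follows, assuming the edges are available as (source, target) pairs of account names; the attendee set, edge list, and function name are hypothetical.

```python
from collections import Counter

def classify_edge(source, target, attendees):
    """Label a directed edge as APAP, APRP, RPAP, or RPRP (sender first)."""
    src = "AP" if source.lower() in attendees else "RP"
    dst = "AP" if target.lower() in attendees else "RP"
    return src + dst

# Hypothetical attendee set and edge list, for illustration only.
attendees = {"alice", "bob"}
edges = [
    ("alice", "bob"),    # attendee to attendee
    ("alice", "carol"),  # attendee to remote
    ("carol", "bob"),    # remote to attendee
    ("carol", "dave"),   # remote to remote
]

counts = Counter(classify_edge(s, t, attendees) for s, t in edges)
print(counts)
```

Because the label keeps the sender first, AP to RP and RP to AP stay distinct, which is what lets direction be shown on a map.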

It is very difficult to see all four on one map. The tightness of the communication patterns means that the colors overlap and wash each other out. What if I only look for RP to RP communication in the data, and hide all other communication? Is it reasonable to think that remote participants talk to each other about TMC17?

Surprisingly, the answer is YES! In fact, in some clusters (notice I changed the default “G1” notation to “C1” to align with the vocabulary I am using) there is a tremendous amount of RP to RP communication. In addition, every one of the clusters has some RP to RP communication. C9 has a strong grouping of communication, which aligns with the fact that it is a cluster of GlobalMathDepartment communications. I also limited this graph to only the top 10 clusters, and added the count of unique nodes to each label. This helps give further context to the graph (I hope).
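The two steps described above, hiding everything except RP to RP edges and relabeling the largest clusters with their node counts, can be sketched in Python like this. It is a rough illustration only, not the NodeXL workflow: the data structures (a list of edge pairs and a dictionary from account to cluster id), the function names, and the sample values are assumptions.

```python
from collections import Counter

def rp_to_rp_edges(edges, attendees):
    """Keep only directed edges where neither endpoint is an attending participant."""
    return [(s, t) for s, t in edges if s not in attendees and t not in attendees]

def top_cluster_labels(cluster_of, k=10):
    """Relabel the k largest clusters from 'G<n>' to 'C<n> (<node count>)'."""
    sizes = Counter(cluster_of.values())
    return {
        cid: f"{cid.replace('G', 'C', 1)} ({size})"
        for cid, size in sizes.most_common(k)
    }

# Hypothetical data, for illustration only.
attendees = {"alice", "bob"}
edges = [("alice", "bob"), ("carol", "bob"), ("carol", "dave"), ("dave", "erin")]
cluster_of = {"alice": "G1", "carol": "G1", "dave": "G1", "bob": "G2", "erin": "G2"}

print(rp_to_rp_edges(edges, attendees))
print(top_cluster_labels(cluster_of, k=1))
```

Filtering on both endpoints is what removes the attendee traffic that otherwise washes out the remote-only conversations.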

Do the RPs engage with the APs?

Again, yes. The RPs engage not just with other RPs, but to a large extent with the APs. This graph shows RP to RP and RP to AP directed activity. The graph shows that the RPs engage with the APs more than with other RPs, which is to be expected. But do the APs engage with the RPs?

Adding in the AP to RP directed activity creates more edges where edges already existed, resulting in more lines between the participants and a darker red. The communication between RPs and APs definitely worked both ways. When the AP to AP communication is added back in, the result is:

This graph shows in blue the AP to AP communication along with any RP communication in red. At this point, it is clear that TMC17 was not just a conference for the people in attendance, but it was a conference for a much larger number of people who were participating remotely. The numbers also show this:

Total number of APAP dyads: 7592
Total number of APRP dyads: 1620
Total number of RPRP dyads: 1559
Total number of RPAP dyads: 2778
APAP/(all others) = 7592/5957 = 1.274
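
The totals above can be re-added as a quick check. The sketch below uses only the four dyad counts listed; the dictionary and variable names are illustrative.

```python
# Dyad counts as listed above.
dyads = {"APAP": 7592, "APRP": 1620, "RPRP": 1559, "RPAP": 2778}

# Every dyad with a remote participant on at least one end.
with_rp = sum(count for kind, count in dyads.items() if "RP" in kind)

# AP-to-AP dyads relative to all dyads that involve an RP.
ratio = dyads["APAP"] / with_rp

print(with_rp, round(ratio, 3))  # 5957 1.274
```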

A dyad is a two-part communication pattern. For example, if I were to tweet (not a real tweet, mind you):

Hey @TMathC, you should check out the blog post on the initial analysis of TMC17! @cheesemonkeySF, @druinok, @lmhenry, you should too!

This tweet creates the following four dyads:

gwaddellnvhs to TMathC
gwaddellnvhs to cheesemonkeysf
gwaddellnvhs to druinok
gwaddellnvhs to lmhenry
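
Here is a minimal Python sketch of how one tweet turns into dyads, using the example tweet above. The rules in it are assumptions for illustration rather than a description of NodeXL: handles are lowercased (Twitter handles are not case-sensitive), repeated mentions count once, and self-mentions are skipped. The function name is made up.

```python
import re

def tweet_dyads(author, text):
    """Return (author, mentioned) pairs, one per unique @mention in the tweet."""
    author = author.lower()
    mentioned = []
    for handle in re.findall(r"@(\w+)", text):
        handle = handle.lower()
        # Assumption: ignore self-mentions and count each account once per tweet.
        if handle != author and handle not in mentioned:
            mentioned.append(handle)
    return [(author, handle) for handle in mentioned]

tweet = (
    "Hey @TMathC, you should check out the blog post on the initial analysis "
    "of TMC17! @cheesemonkeySF, @druinok, @lmhenry, you should too!"
)

for source, target in tweet_dyads("gwaddellnvhs", tweet):
    print(source, "to", target)
```

Each pair keeps the author first, so the direction of the communication is preserved.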

Should one of those individuals reply, it would create an additional four dyads, one from the replier to each of the other accounts in the conversation. This is how the directionality of the tweets is maintained, and how I managed to show the directionality in the graphs above. It also means that while the largest number of dyads is between AP and AP, the ratio of APAP dyads to dyads which contain an RP is only about 1.3. This Remote Conference Participation Ratio (I think I just created a new metric) is interesting. A ratio of 1 would mean that there are as many dyads involving a remote participant as there are dyads between attendees alone. It isn’t a ‘clean’ ratio, because AP shows up in the RPAP and APRP categories, but it does suggest a level of RP participation.

That is enough for today. Hopefully this gives the mtbos and iteachmath community something to challenge and think about.

Please, give me feedback. If there are questions here I have not addressed, ask. I welcome the pushback and opportunity to answer your questions. After all, that will just make my dissertation better.
