Chapter 23: Quiz — Network Analysis of Information Spread

Instructions: Answer all questions. For multiple-choice questions, select the single best answer. Answers are hidden in collapsible sections — attempt each question before revealing the answer.

Question 1

In a directed retweet network, what does a node's in-degree typically represent?

A) The number of accounts that node retweeted
B) The number of accounts that retweeted that node
C) The total number of tweets the account posted
D) The number of communities the account belongs to

Answer

**B) The number of accounts that retweeted that node.** In-degree in a directed retweet network counts incoming edges, that is, how many distinct accounts retweeted this node's content (assuming one edge per retweeter, pointing from the retweeter to the original poster). High in-degree indicates a widely amplified account (a popular original poster). Out-degree, by contrast, counts outgoing edges: how many other accounts this node retweeted, which indicates amplification activity.
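The in-degree and out-degree distinction can be checked on a toy graph. The sketch below assumes edges point from the retweeter to the original poster; the account names and retweet records are hypothetical.

```python
import networkx as nx

# Hypothetical retweet records as (retweeter, original_poster) pairs.
retweets = [
    ("bob", "alice"),
    ("carol", "alice"),
    ("dave", "alice"),
    ("alice", "erin"),
]

G = nx.DiGraph()
G.add_edges_from(retweets)  # edge u -> v means "u retweeted v"

in_deg = dict(G.in_degree())    # how many accounts retweeted each node
out_deg = dict(G.out_degree())  # how many accounts each node retweeted

print(in_deg["alice"], out_deg["alice"])
```

If the edge direction is reversed (poster to retweeter), the roles of in-degree and out-degree swap, so the convention should always be stated alongside the numbers.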

Question 2

Watts and Strogatz (1998) identified the "small-world" property as a combination of which two features?

A) High degree and low diameter
B) High clustering coefficient and short average path length
C) Power-law degree distribution and high betweenness
D) Low modularity and high transitivity

Answer

**B) High clustering coefficient and short average path length.** The small-world property, as defined by Watts and Strogatz, combines high local clustering (your friends tend to know each other, as in regular lattices) with short global average path lengths (you can reach anyone in a few steps, as in random graphs). In their model, this combination arises when a small fraction of the edges in an otherwise regular lattice is randomly rewired, creating a few long-range shortcuts.
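A rough way to see both features at once is to compare a ring lattice with a lightly rewired copy. The sketch below uses NetworkX's `connected_watts_strogatz_graph`; the graph size, neighbour count, rewiring probability, and seed are illustrative assumptions.

```python
import networkx as nx

n, k = 200, 6  # illustrative size and number of ring neighbours per node

# p=0 keeps the regular ring lattice; p=0.1 rewires roughly 10% of edges.
graphs = {
    "lattice": nx.connected_watts_strogatz_graph(n, k, p=0.0, seed=42),
    "small world": nx.connected_watts_strogatz_graph(n, k, p=0.1, seed=42),
}

stats = {}
for name, g in graphs.items():
    clustering = nx.average_clustering(g)
    path_length = nx.average_shortest_path_length(g)
    stats[name] = (clustering, path_length)
    print(f"{name}: clustering={clustering:.2f}, avg path length={path_length:.2f}")
```

The rewired graph should keep much of the lattice's clustering while its average path length drops sharply.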

Question 3

A node has a betweenness centrality score significantly higher than its degree centrality score. What does this likely indicate?

A) The node is a hub with many direct connections
B) The node bridges two otherwise disconnected communities
C) The node's neighbors are all highly connected
D) The node was among the earliest to join the network

Answer

**B) The node bridges two otherwise disconnected communities.** High betweenness with relatively lower degree is the signature of a structural broker or bridge node — an account that sits between communities and carries information (or controls information flow) between them. Such nodes are critical conduits for cross-community information diffusion, even if they do not have many direct connections. Removing them can fragment the network into isolated clusters.
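The broker signature can be reproduced on a small constructed graph. In this sketch, two cliques are connected only through a hypothetical "broker" node, which ends up with the lowest degree but the highest betweenness.

```python
import networkx as nx

# Two 5-node cliques (nodes 0-4 and 5-9) with no edges between them.
G = nx.disjoint_union(nx.complete_graph(5), nx.complete_graph(5))

# A hypothetical broker account linked to one member of each clique.
G.add_edges_from([("broker", 0), ("broker", 5)])

degree = dict(G.degree())
betweenness = nx.betweenness_centrality(G)

top_node = max(betweenness, key=betweenness.get)
print(top_node, degree[top_node])
```

Every shortest path between the two cliques passes through the broker, which is why its betweenness is high despite having only two connections.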

Question 4

In the Barabási-Albert model of network growth, new nodes preferentially attach to existing nodes based on:

A) Geographic proximity
B) Common interests
C) Existing degree (number of connections)
D) Age of the account

Answer

**C) Existing degree (number of connections).** Preferential attachment — the "rich get richer" mechanism — specifies that when a new node joins the network, it connects to existing nodes with probability proportional to their current degree. This generates the power-law degree distributions (with hubs) characteristic of scale-free networks, observed in the World Wide Web, citation networks, and social media platforms.
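The effect of preferential attachment on the degree distribution can be seen with NetworkX's `barabasi_albert_graph`. This is a small sketch; the graph size, the value of `m`, and the seed are arbitrary choices for illustration.

```python
import statistics
import networkx as nx

# Grow a graph by preferential attachment: each new node adds m=2 edges.
G = nx.barabasi_albert_graph(n=2000, m=2, seed=1)

degrees = [d for _, d in G.degree()]
median_degree = statistics.median(degrees)
max_degree = max(degrees)

print(f"median degree={median_degree}, max degree={max_degree}")
```

A typical node keeps a degree close to `m`, while a handful of early nodes accumulate far more connections and become hubs.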

Question 5

What does a modularity score (Q) of 0.72 indicate about a social network?

A) The network has very weak community structure
B) The network has strong, well-separated community structure
C) The network has approximately random edge distribution
D) The network has exactly 72 communities

Answer

**B) The network has strong, well-separated community structure.** Modularity Q ranges from approximately -0.5 to 1. Values above 0.3 indicate meaningful community structure; values above 0.6-0.7 indicate very strong communities with dense within-community connections and sparse between-community connections. A Q of 0.72 suggests highly separated communities — potentially characteristic of echo chamber dynamics in an information network.
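To get a feel for how Q responds to the choice of partition, the sketch below scores two partitions of the same toy graph with NetworkX's `modularity` function. The graph and both partitions are made up for illustration.

```python
import networkx as nx
from networkx.algorithms.community import modularity

# Two 5-node cliques joined by a single bridging edge.
G = nx.disjoint_union(nx.complete_graph(5), nx.complete_graph(5))
G.add_edge(4, 5)

clique_split = [set(range(5)), set(range(5, 10))]  # one community per clique
mixed_split = [{0, 1, 2, 5, 6}, {3, 4, 7, 8, 9}]   # cuts across both cliques

q_clique = modularity(G, clique_split)
q_mixed = modularity(G, mixed_split)
print(f"Q (clique split)={q_clique:.3f}, Q (mixed split)={q_mixed:.3f}")
```

The partition that follows the cliques scores about 0.45, while the partition that cuts across them scores below zero, meaning fewer within-community edges than expected by chance.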

Question 6

In the Independent Cascade Model, once an activated node has attempted to activate a neighbor and failed, what happens?

A) It can try again in the next time step
B) It can try again only after recovering
C) It can never attempt to activate that neighbor again
D) It deactivates and returns to the susceptible state

Answer

**C) It can never attempt to activate that neighbor again.** The IC model's defining feature is that each active node gets exactly one chance to activate each of its neighbors. Whether or not the attempt succeeds, the edge is "used up." This memoryless, one-shot property distinguishes the IC model from the Linear Threshold model (where influence accumulates over multiple time steps) and from the SIR model on a network.
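The one-shot rule is easy to see in code. The following is a minimal sketch of an Independent Cascade simulation with a single uniform activation probability `p`; the graph, function name, and parameters are hypothetical, and real studies usually assign a probability per edge.

```python
import random

def independent_cascade(graph, seeds, p, rng):
    """Simulate one Independent Cascade run; return the set of activated nodes.

    graph maps each node to the list of nodes it can try to activate.
    """
    active = set(seeds)
    frontier = list(seeds)  # nodes activated in the previous step
    while frontier:
        next_frontier = []
        for node in frontier:
            for neighbor in graph.get(node, []):
                # Each newly active node gets exactly one attempt per neighbor.
                if neighbor not in active and rng.random() < p:
                    active.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier  # nodes from earlier steps never try again
    return active

# Hypothetical follower graph: key -> accounts that see the key's posts.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"], "e": []}

rng = random.Random(0)
always = independent_cascade(graph, {"a"}, p=1.0, rng=rng)
never = independent_cascade(graph, {"a"}, p=0.0, rng=rng)
print(sorted(always), sorted(never))
```

Because only the most recently activated nodes are kept in `frontier`, a failed attempt is never retried, which is the property the question asks about.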

Question 7

The Linear Threshold Model is most appropriate for modeling:

A) The spread of a meme that people share on first impulse
B) The adoption of a belief that requires social validation from multiple sources
C) The rapid broadcast of breaking news by media accounts
D) The random diffusion of spam links

Answer

**B) The adoption of a belief that requires social validation from multiple sources.** The LT model captures "complex contagion" — behaviors or beliefs that require multiple reinforcing exposures before adoption. This is appropriate for political belief changes, behavioral shifts, and cultural norm adoption, where people need to see that many of their peers agree before they change their views. Simple content sharing (a meme, a news link) is better modeled by the IC model.
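A simplified threshold process shows why one exposure may not be enough. This sketch assumes equal influence weights (1/degree per neighbor) and fixed thresholds chosen by hand; the standard Linear Threshold model draws thresholds at random and allows arbitrary edge weights.

```python
def linear_threshold(neighbors, thresholds, seeds):
    """Run a simplified Linear Threshold process until no node changes.

    Each neighbor carries equal weight 1/degree, so a node activates once the
    fraction of its active neighbors reaches its threshold.
    """
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for node, nbrs in neighbors.items():
            if node in active or not nbrs:
                continue
            active_fraction = sum(n in active for n in nbrs) / len(nbrs)
            if active_fraction >= thresholds[node]:
                active.add(node)
                changed = True
    return active

# Hypothetical undirected network: c is connected to a, b, and d.
neighbors = {"a": ["c"], "b": ["c"], "c": ["a", "b", "d"], "d": ["c"]}
thresholds = {node: 0.6 for node in neighbors}

one_source = linear_threshold(neighbors, thresholds, {"a"})
two_sources = linear_threshold(neighbors, thresholds, {"a", "b"})
print(sorted(one_source), sorted(two_sources))
```

With a single seed, node `c` sees only one of three neighbors active and does not adopt; with two seeds it crosses its threshold and the belief then reaches `d` as well.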

Question 8

Vosoughi et al. (2018) found that false news was primarily spread by which of the following?

A) Automated bots
B) Foreign state-sponsored accounts
C) Human users making deliberate choices
D) Platform algorithmic amplification

Answer

**C) Human users making deliberate choices.** One of the most significant findings of Vosoughi et al. (2018) was that bots spread true and false news at approximately the same rate — the differential spread of false news was attributable to humans. The researchers hypothesized that false news was more novel and generated stronger emotional responses, making humans more likely to share it. This finding complicates policy narratives focused exclusively on bot regulation.

Question 9

According to Vosoughi et al. (2018), compared to true news, false news on Twitter:

A) Spread more slowly but reached more people
B) Spread faster, farther, and deeper
C) Spread at similar rates but persisted longer
D) Spread more quickly only in political topics

Answer

**B) Spread faster, farther, and deeper.** False news in Vosoughi et al.'s dataset reached 1,500 people approximately six times faster than true news. False news cascades were larger (farther) and involved more retweet levels (deeper). The top 1% of false news cascades reached 1,000–100,000 people, while true news rarely exceeded 1,000 people. These advantages of false news held across all topic categories, not just politics.

Question 10

The Louvain algorithm for community detection works by:

A) Removing high-betweenness edges until the network fragments
B) Propagating labels from node to node until convergence
C) Maximizing a modularity objective through iterative local optimization and network aggregation
D) Finding the minimum cut that separates the network into two equal halves

Answer

**C) Maximizing a modularity objective through iterative local optimization and network aggregation.** The Louvain algorithm operates in two alternating phases: (1) local modularity optimization (moving individual nodes between communities to increase Q) and (2) network aggregation (collapsing communities into single nodes). These phases repeat until Q no longer increases. The algorithm is fast (in practice it scales close to O(n log n) on sparse networks) and effective, making it one of the most widely used community detection algorithms for large social networks.
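A minimal usage sketch, assuming a NetworkX version that ships `louvain_communities` (2.8 or later); the toy graph and the seed are illustrative.

```python
import networkx as nx
from networkx.algorithms.community import louvain_communities, modularity

# Two 5-node cliques joined by one bridging edge.
G = nx.disjoint_union(nx.complete_graph(5), nx.complete_graph(5))
G.add_edge(4, 5)

communities = louvain_communities(G, seed=7)  # list of node sets
q = modularity(G, communities)
print([sorted(c) for c in communities], round(q, 3))
```

On real networks the result can vary between runs because nodes are visited in random order, so fixing `seed` helps make an analysis reproducible.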

Question 11

In the context of echo chambers, a community with a cross-community edge fraction of 3% indicates:

A) Moderate cross-community information exchange
B) Nearly complete isolation from other communities
C) High susceptibility to external misinformation
D) A balanced information diet for community members

Answer

**B) Nearly complete isolation from other communities.** If only 3% of a community's edges cross into other communities, members are almost exclusively sharing information within their own group. This near-total isolation is the network signature of a strong echo chamber — members are unlikely to encounter information that contradicts their community's prevailing narratives, and corrections or counter-information from outside the community will reach very few members.
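One plausible way to compute a cross-community edge fraction is sketched below. The function name, the edge list, and the community labels are hypothetical, and other definitions (for example, normalizing by edge endpoints) are possible.

```python
def cross_community_fraction(edges, membership, community):
    """Fraction of a community's edges whose other endpoint lies outside it."""
    touching = [
        (u, v) for u, v in edges
        if membership[u] == community or membership[v] == community
    ]
    if not touching:
        return 0.0
    crossing = [(u, v) for u, v in touching if membership[u] != membership[v]]
    return len(crossing) / len(touching)

# Hypothetical data: community "red" has 3 internal edges and 1 outgoing edge.
membership = {"a": "red", "b": "red", "c": "red", "x": "blue", "y": "blue"}
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "x"), ("x", "y")]

fraction = cross_community_fraction(edges, membership, "red")
print(fraction)
```

Here one of the four edges touching "red" leaves the community, giving 0.25; a value of 0.03 on real data would mean only about 3 in 100 edges reach outside.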

Question 12

Which centrality measure is most appropriate for identifying accounts that derive their influence from being connected to other influential accounts?

A) Degree centrality
B) Betweenness centrality
C) Closeness centrality
D) Eigenvector centrality

Answer

**D) Eigenvector centrality.** Eigenvector centrality (and its directed-graph variant, PageRank) captures influence that propagates through the network — being connected to highly connected nodes makes you more important. This is appropriate for identifying accounts whose significance comes from being amplified by or connected to other important nodes, even if their raw degree is moderate. An account retweeted by major media organizations or verified political figures would score high on eigenvector centrality.

Question 13

The influence maximization problem is classified as NP-hard. What does this practically mean for researchers?

A) The problem cannot be solved at all
B) There is no efficient algorithm guaranteed to find the exact optimal solution
C) The problem requires quantum computing to solve
D) The problem is only solvable for networks with fewer than 1,000 nodes

Answer

**B) There is no efficient algorithm guaranteed to find the exact optimal solution.** NP-hardness means that no polynomial-time algorithm is known for finding the exact optimal seed set, and none is believed to exist. In practice, researchers use approximation algorithms, most notably the greedy algorithm, which provably achieves at least (1 - 1/e) ≈ 63% of the optimal influence under the standard cascade models. CELF speeds up the greedy algorithm through lazy evaluation of marginal gains without giving up that guarantee, while cheaper heuristics (for example, degree-based seed selection) trade some solution quality for tractability on large networks.
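The greedy approach can be sketched in a few lines: estimate the expected spread of each candidate seed set by Monte Carlo simulation and keep adding the node with the largest estimate. The version below assumes an Independent Cascade process with one uniform probability `p`; all names and the toy graph are hypothetical, and it omits the CELF speed-up.

```python
import random

def simulate_ic(graph, seeds, p, rng):
    """One Independent Cascade run; returns the number of activated nodes."""
    active, frontier = set(seeds), list(seeds)
    while frontier:
        new = []
        for node in frontier:
            for nbr in graph.get(node, []):
                if nbr not in active and rng.random() < p:
                    active.add(nbr)
                    new.append(nbr)
        frontier = new
    return len(active)

def expected_spread(graph, seeds, p, runs, rng):
    """Monte Carlo estimate of the expected cascade size for a seed set."""
    return sum(simulate_ic(graph, seeds, p, rng) for _ in range(runs)) / runs

def greedy_seeds(graph, k, p, runs=200, seed=0):
    """Pick k seeds, each time adding the node with the largest estimated spread."""
    rng = random.Random(seed)
    chosen = []
    for _ in range(k):
        candidates = [n for n in graph if n not in chosen]
        best = max(
            candidates,
            key=lambda n: expected_spread(graph, chosen + [n], p, runs, rng),
        )
        chosen.append(best)
    return chosen

# Hypothetical directed graph: "hub" reaches four accounts, "x" reaches one.
graph = {"hub": ["a", "b", "c", "d"], "x": ["y"],
         "a": [], "b": [], "c": [], "d": [], "y": []}

seeds = greedy_seeds(graph, k=2, p=1.0)
print(seeds)
```

With `p=1.0` the simulation is deterministic, so the example picks the hub first and then the node that reaches a part of the graph the hub cannot. Each greedy step re-simulates every candidate, which is the cost that lazy-evaluation methods try to reduce.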

Question 14

In a scale-free network, what happens when hubs (the highest-degree nodes) are targeted for removal?

A) The network remains largely intact due to its redundant connections
B) The network's average path length increases slightly
C) The network rapidly fragments into disconnected components
D) Information spreads faster because hubs were bottlenecks

Answer

**C) The network rapidly fragments into disconnected components.** Scale-free networks are "robust yet fragile": removing random nodes (most of which have few connections) barely affects connectivity, but removing hubs causes rapid fragmentation. This is because hubs are the connective tissue holding the network together — without them, many nodes become isolated. This property has direct implications for deplatforming: removing high-follower accounts that spread misinformation can significantly disrupt the network's connectivity.
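The "robust yet fragile" contrast can be explored with a small experiment: remove the same number of nodes either by degree or at random and compare the largest connected component that remains. The graph model, sizes, and seeds below are illustrative assumptions.

```python
import random
import networkx as nx

def largest_component_size(g):
    """Number of nodes in the largest connected component."""
    return max((len(c) for c in nx.connected_components(g)), default=0)

G = nx.barabasi_albert_graph(n=1000, m=2, seed=3)
n_remove = 50  # remove 5% of nodes in each scenario

# Targeted attack: delete the highest-degree nodes.
hubs = sorted(G.nodes, key=lambda v: G.degree[v], reverse=True)[:n_remove]
attacked = G.copy()
attacked.remove_nodes_from(hubs)

# Random failure: delete the same number of nodes chosen uniformly.
rng = random.Random(3)
failed = G.copy()
failed.remove_nodes_from(rng.sample(list(G.nodes), n_remove))

size_attacked = largest_component_size(attacked)
size_failed = largest_component_size(failed)
print(size_attacked, size_failed)
```

The targeted removal should leave a noticeably smaller largest component than the random removal, even though both delete the same number of nodes.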

Question 15

In the SIR model, R₀ = β/γ = 0.7. What does this predict about information spread?

A) The information will spread to 70% of the population
B) The information will spread exponentially
C) The information will die out without becoming epidemic
D) The information will spread at a rate of 0.7 new infections per day

Answer

**C) The information will die out without becoming epidemic.** R₀ < 1 means that each "infected" individual (sharer) produces, on average, fewer than one new sharer. The spread declines geometrically and the information dies out before reaching a significant fraction of the population. For an information epidemic to occur, R₀ must exceed 1. Interventions work by pushing R₀ below this threshold: effective counter-messaging raises γ, and platform friction lowers β.
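The geometric decline can be made concrete with a little arithmetic. The sketch below treats early spread as a branching process in which each sharer produces R₀ new sharers on average; this is an approximation that ignores depletion of the susceptible population, and the β and γ values are hypothetical numbers chosen to give R₀ = 0.7.

```python
beta, gamma = 0.35, 0.5  # hypothetical sharing and "recovery" rates
r0 = beta / gamma        # 0.7, matching the question

# Expected new sharers in each generation, starting from one initial sharer.
generations = [r0 ** g for g in range(200)]
expected_total = sum(generations)

print(f"R0={r0:.2f}, generation 5: {generations[5]:.3f}, total: {expected_total:.2f}")
```

Under this approximation the expected total converges to 1 / (1 - R₀), roughly 3.3 sharers per initial sharer, which is why the information fizzles out rather than becoming epidemic.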

Question 16

Cross-platform network analysis is especially challenging because:

A) Network analysis algorithms only work on single-platform data
B) Platforms have incompatible data formats that cannot be reconciled
C) Different platforms provide vastly different levels of data access, and linking accounts across platforms is technically and ethically complex
D) Information rarely crosses platform boundaries in practice

Answer

**C) Different platforms provide vastly different levels of data access, and linking accounts across platforms is technically and ethically complex.** Twitter historically provided relatively open API access; Facebook's data is largely inaccessible; Telegram channels are public but not indexed; 4chan posts are ephemeral. Beyond access, linking the same person's accounts across platforms requires username matching (which fails when users pick different handles on each platform), content matching (which requires sophisticated NLP), or network structure matching, all while respecting user privacy. This asymmetric access creates systematic biases in what researchers can study.

Question 17

A researcher finds that the average clustering coefficient of a network has increased over six months while modularity has also increased. What would this most likely indicate?

A) The network is becoming more random
B) The network is becoming more echo-chambered — communities are growing tighter and more isolated
C) The network is growing in size
D) Cross-community bridges are increasing

Answer

**B) The network is becoming more echo-chambered — communities are growing tighter and more isolated.** Increasing average clustering coefficient indicates that local neighborhoods are becoming more tightly knit (your contacts increasingly know each other). Simultaneously increasing modularity indicates that communities are becoming more clearly separated. Together, these trends suggest echo chamber dynamics intensifying over time — a pattern documented in research on political polarization on social media platforms.

Question 18

The Girvan-Newman algorithm detects communities by:

A) Maximizing modularity using local node moves
B) Propagating community labels from high-degree nodes
C) Iteratively removing edges with the highest betweenness centrality
D) Clustering nodes based on their eigenvector centrality scores

Answer

**C) Iteratively removing edges with the highest betweenness centrality.** Girvan-Newman uses the insight that inter-community edges carry many shortest paths (high betweenness) because they are the only connections between communities. By iteratively removing these high-betweenness edges, the algorithm progressively reveals community structure. It produces a hierarchical dendrogram showing nested community structure at different scales, but it is computationally expensive for large networks.
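NetworkX exposes this procedure as a generator, `girvan_newman`, that yields one level of the dendrogram at a time. The sketch below takes only the first split of an illustrative toy graph.

```python
import networkx as nx
from networkx.algorithms.community import girvan_newman

# A 4-node clique (nodes 0-3) and a 6-node clique (nodes 4-9), one bridge edge.
G = nx.disjoint_union(nx.complete_graph(4), nx.complete_graph(6))
G.add_edge(3, 4)

# Each item from the generator is a tuple of node sets, finer at every step.
first_split = next(girvan_newman(G))
print([sorted(c) for c in first_split])
```

The bridge carries every shortest path between the two cliques, so it is removed first and the graph separates along it. Continuing to call `next` on the generator would keep splitting the communities further.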

Question 19

In NetworkX, which function would you use to compute PageRank for a directed graph G?

A) `nx.degree_centrality(G)`
B) `nx.pagerank(G, alpha=0.85)`
C) `nx.eigenvector_centrality(G)`
D) `nx.betweenness_centrality(G)`

Answer

**B) `nx.pagerank(G, alpha=0.85)`** NetworkX provides `nx.pagerank(G, alpha=0.85)` for computing PageRank on directed graphs. The `alpha` parameter (default 0.85) is the damping factor — the probability that a random walker follows an edge rather than jumping to a random node. PageRank is essentially eigenvector centrality adapted for directed graphs and is appropriate for ranking nodes by their influence through incoming links.
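To illustrate what the damping factor does, here is a plain-Python power-iteration sketch. It is not NetworkX's implementation, and the function name and toy graph are hypothetical; in practice one would simply call `nx.pagerank(G, alpha=0.85)`.

```python
def pagerank_sketch(out_links, alpha=0.85, iterations=100):
    """Power-iteration PageRank on a dict mapping node -> list of successors."""
    nodes = list(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # With probability 1 - alpha the walker jumps to a uniformly random node.
        new_rank = {v: (1.0 - alpha) / n for v in nodes}
        for v in nodes:
            targets = out_links[v] or nodes  # dangling nodes spread rank evenly
            share = alpha * rank[v] / len(targets)
            for t in targets:
                new_rank[t] += share
        rank = new_rank
    return rank

# Hypothetical retweet graph: edge u -> v means "u retweeted v".
out_links = {"bob": ["alice"], "carol": ["alice"],
             "dave": ["alice", "bob"], "alice": []}

scores = pagerank_sketch(out_links)
top_account = max(scores, key=scores.get)
print(top_account, round(sum(scores.values()), 6))
```

The scores form a probability distribution over accounts, and the most-retweeted account receives the largest share. Lowering `alpha` makes random jumps more frequent and flattens the ranking.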

Question 20

A misinformation cascade has depth = 12, breadth = 5, and size = 45. How should these statistics be interpreted?

A) The information spread wide but stayed shallow
B) The information propagated through many levels of retweet but with limited branching at each level
C) The information reached 12 different communities
D) The information was retweeted by 12 bots and 33 humans

Answer

**B) The information propagated through many levels of retweet but with limited branching at each level.** Depth = 12 means the cascade reached 12 levels deep (original tweet → retweet → retweet → ... twelve levels down). Breadth = 5 means the maximum number of nodes at any single level was only 5. Size = 45 is the total number of accounts in the cascade. This deep but narrow pattern suggests a chain-like spread through sequential sharing rather than explosive branching. Vosoughi et al. found false news tends to form deeper cascades than true news, suggesting this pattern may indicate the "depth advantage" of false content.
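These three statistics can be computed from a cascade tree with a breadth-first traversal. The sketch below assumes the original tweet sits at depth 0 and that size counts every account including the originator; the tree itself is a hypothetical example.

```python
from collections import deque

def cascade_stats(children, root):
    """Return (depth, max breadth, size) of a cascade tree via breadth-first search."""
    level_counts = {}
    queue = deque([(root, 0)])
    while queue:
        node, level = queue.popleft()
        level_counts[level] = level_counts.get(level, 0) + 1
        for child in children.get(node, []):
            queue.append((child, level + 1))
    depth = max(level_counts)             # retweet hops from the original tweet
    breadth = max(level_counts.values())  # widest single level
    size = sum(level_counts.values())     # all accounts, including the root
    return depth, breadth, size

# Hypothetical cascade: each key was retweeted by the accounts in its list.
children = {"origin": ["a", "b"], "a": ["c"], "c": ["d"]}

depth, breadth, size = cascade_stats(children, "origin")
print(depth, breadth, size)
```

For this toy tree the result is depth 3, breadth 2, and size 5. Since level 0 holds only the original tweet, a cascade's size can never exceed 1 + depth × breadth.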

Question 21

Which of the following would be the strongest evidence of echo chamber dynamics in a social network community?

A) High average degree within the community
B) Low modularity score for the overall network
C) High clustering coefficient, low cross-community edges, and content homogeneity within the community
D) Short average path length within the community

Answer

**C) High clustering coefficient, low cross-community edges, and content homogeneity within the community.** Echo chambers require multiple converging signals: structural isolation (low cross-community edges), internal cohesion (high clustering), and functional homogeneity (members share similar content reflecting shared views). Any one measure alone is insufficient — a tightly knit sports fan community might have high clustering and low cross-community edges but would not constitute an echo chamber in the politically relevant sense without content homogeneity.

Question 22

In NetworkX, what does the method `G.to_undirected()` do, and why might a researcher need it?

A) It removes all node attributes, simplifying the graph
B) It converts a directed graph to an undirected graph by treating all edges as bidirectional
C) It removes nodes with no incoming edges
D) It converts the graph to a bipartite representation

Answer

**B) It converts a directed graph to an undirected graph by treating all edges as bidirectional.** Many community detection algorithms — including the Louvain algorithm as typically implemented — require undirected graphs. A directed retweet network (where A→B and B→A are distinct) must be converted to an undirected graph for these algorithms. `G.to_undirected()` performs this conversion, collapsing any pairs of directed edges into single undirected edges. Researchers must be aware that this conversion loses directional information.
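A short check of what the conversion does to reciprocal edges, using a made-up three-node retweet graph:

```python
import networkx as nx

D = nx.DiGraph()
# a and b retweet each other; a also retweets c.
D.add_edges_from([("a", "b"), ("b", "a"), ("a", "c")])

U = D.to_undirected()

print(D.number_of_edges(), U.number_of_edges())
print(U.has_edge("c", "a"))
```

The three directed edges become two undirected ones, and the edge between `a` and `c` can now be traversed from either end, which is the directional information that gets lost.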

Question 23

The concept of "complex contagion" (Centola & Macy, 2007) implies that for some information to spread, it requires:

A) A single highly connected hub to broadcast it
B) Exposure from multiple independent sources in a person's network
C) Algorithmic amplification by platform recommendation systems
D) Visual rather than textual presentation

Answer

**B) Exposure from multiple independent sources in a person's network.** Complex contagion describes behaviors and beliefs that spread only when a person receives reinforcing exposure from multiple independent sources — mere exposure once is insufficient. This distinguishes complex contagion from simple contagion (like disease spread or meme sharing), where a single exposure can be sufficient. The distinction is important because complex contagion spreads more readily through clustered networks (echo chambers) than through random networks, since clustered networks provide the redundant exposures needed.

Question 24

A researcher wants to identify accounts that could most efficiently disseminate a correction across an entire politically polarized network. Which centrality measure should they prioritize?

A) Degree centrality within one community
B) Betweenness centrality (accounts bridging communities)
C) Clustering coefficient
D) In-degree from bots

Answer

**B) Betweenness centrality (accounts bridging communities).** To disseminate a correction across a polarized network where communities are largely isolated from each other, you need accounts that bridge communities — those with high betweenness centrality. These "bridge" accounts can carry the correction into communities that would not encounter it through within-community sharing alone. Simply targeting the highest-degree accounts within one community would leave other communities unreached.

Question 25

Vosoughi et al. (2018) used fact-checking organizations' verdicts as ground truth for classifying news as true or false. What is the main selection bias concern with this approach?

A) Fact-checkers tend to verify stories from only one country
B) Fact-checkers select stories that are already circulating widely and are politically salient, potentially missing most false stories that circulate very little
C) Fact-checkers are more likely to identify bot-generated content
D) Fact-checkers can only assess visual content, not text

Answer

**B) Fact-checkers select stories that are already circulating widely and are politically salient, potentially missing most false stories that circulate very little.** Fact-checking organizations prioritize content that is already viral or politically impactful — they are not attempting to exhaustively catalog all misinformation. This means the Vosoughi et al. sample is not representative of all false content on the platform but is biased toward high-salience, high-circulation content. Altay et al. (2022) subsequently showed that most individual false news stories circulate very little, suggesting the "fastest and farthest" framing may overstate the typical case.