Many business and technology problems are naturally “network problems”. Customers refer other customers, payments flow between accounts, servers depend on services, and suppliers connect to factories and warehouses. Graph analytics helps you model these relationships as nodes and links, then quantify which nodes matter most. Two of the most practical centrality measures are PageRank (influence through incoming links) and betweenness centrality (importance as a bridge between groups). If you are learning applied graph thinking through data analytics courses in Delhi NCR, these measures are often the first step from “drawing a network” to “making decisions from it”.
Modelling Networked Data as a Graph
Before computing centrality, you need a clear graph definition.
Nodes and edges
- Nodes (vertices): entities such as users, products, bank accounts, web pages, devices, or departments.
- Edges (links): relationships such as “follows”, “purchased”, “transfers to”, “depends on”, or “co-occurs with”.
Directed, undirected, weighted
- Directed graphs matter when the relationship has a direction (payments, hyperlinks, dependencies).
- Undirected graphs fit mutual relationships (co-purchase, collaboration).
- Weights capture strength (transaction value, frequency, reliability score).
A common mistake is mixing edge meanings within one graph. For example, treating a “payment” as undirected erases who paid whom, so an account that only receives money looks the same as one that only sends it, and PageRank scores change accordingly. Keep your modelling consistent with the decision you want to support.
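To make the modelling choices concrete, here is a minimal sketch in plain Python. The payment records, account names, and helper functions are all hypothetical and exist only for illustration. It builds a directed, weighted graph and then shows what is lost when the same data is symmetrised.

```python
from collections import defaultdict

# Hypothetical payment records: (payer, payee, amount).
payments = [
    ("acct_A", "acct_B", 120.0),
    ("acct_A", "acct_B", 80.0),
    ("acct_C", "acct_B", 50.0),
    ("acct_B", "acct_D", 200.0),
]

def build_directed(records):
    """Directed, weighted adjacency: graph[payer][payee] = total amount."""
    graph = defaultdict(lambda: defaultdict(float))
    for src, dst, amount in records:
        graph[src][dst] += amount
        graph[dst]  # touch the key so pure receivers also appear as nodes
    return {node: dict(nbrs) for node, nbrs in graph.items()}

def in_degree(graph):
    """Number of distinct payers pointing at each node."""
    counts = {node: 0 for node in graph}
    for nbrs in graph.values():
        for dst in nbrs:
            counts[dst] += 1
    return counts

def to_undirected(graph):
    """Symmetrise the graph; the payer/payee distinction is lost."""
    und = {node: {} for node in graph}
    for src, nbrs in graph.items():
        for dst, weight in nbrs.items():
            und[src][dst] = und[src].get(dst, 0.0) + weight
            und[dst][src] = und[dst].get(src, 0.0) + weight
    return und

directed = build_directed(payments)
undirected = to_undirected(directed)
```

In the directed version, `acct_B` has two incoming edges and `acct_D` has no outgoing ones. In the undirected version every edge appears in both directions, so that information can no longer be recovered.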
PageRank: Influence That Flows Through Links
PageRank was designed to rank web pages, but it generalises well to any directed network where “being pointed to” signals importance.
Intuition
PageRank imagines a random walker moving through the graph. A node’s score is the long-run share of time the walker spends there, so it rises when the node receives links from other high-scoring nodes. This captures network endorsement: not just how many links you have, but who those links come from.
Key mechanics (in simple terms)
- A damping factor (often written as d) is the probability that the walker follows an outgoing link; with probability 1 − d it jumps to a random node instead. A value around 0.85 is a common default.
- Scores are updated iteratively until they stabilise.
- It works best when the direction of edges represents “attention”, “trust”, or “flow”.
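The mechanics above can be sketched as a short power iteration. This is an illustrative version, not a production implementation: the function name, the tolerance, the example graph, and the choice to spread the score of dead-end nodes evenly over all nodes are assumptions made for the sketch.

```python
def pagerank(graph, d=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank on a dict {node: [successors]}.

    Assumes every node appears as a key. Nodes without outgoing links
    (dangling nodes) spread their score evenly over all nodes.
    """
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        dangling = sum(rank[v] for v in nodes if not graph[v])
        # Random-jump share plus the redistributed dangling score.
        new_rank = {v: (1.0 - d) / n + d * dangling / n for v in nodes}
        for v in nodes:
            out = graph[v]
            if out:
                share = d * rank[v] / len(out)
                for w in out:
                    new_rank[w] += share
        change = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if change < tol:  # scores have stabilised
            break
    return rank

# Hypothetical link graph: "c" is pointed to by three other nodes.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(links)
top_node = max(scores, key=scores.get)
```

Here `c` ends up with the highest score, and `a` comes second even though only one node links to it, because that one node is `c`. Node `d` has no incoming links, so its score is just the random-jump share, (1 − d) / n.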
Practical uses
- Fraud and risk networks: identify accounts that receive funds from many connected entities, especially when those entities are themselves well-connected.
- Content and engagement graphs: find creators whose followers are influential, not just numerous.
- IT dependency graphs: rank systems whose failure would affect many downstream services, provided each edge points from a service to the system it depends on.
When taught well, data analytics courses in Delhi NCR usually stress that PageRank is not a universal “importance score”. It is importance under a specific flow assumption, so your edge definition must match the real-world process.
Betweenness Centrality: The Power of Being a Bridge
Betweenness centrality measures how often a node lies on the shortest paths between other nodes. It highlights nodes that connect clusters, act as brokers, or control access between groups.
Intuition
A node can have modest degree (few connections) but still be critical if it sits between communities. In an organisation network, a single integration service between two business units may have high betweenness, even if it connects to only a few systems.
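The following sketch illustrates this with a small hypothetical graph: two tightly connected clusters joined by a single node named `bridge`. It uses Brandes' algorithm, a standard exact method for unweighted graphs. The node names and the function name are made up for the example.

```python
from collections import deque

def betweenness(graph):
    """Exact betweenness (Brandes' algorithm) for an unweighted graph.

    graph: dict {node: list of neighbours}. For an undirected graph,
    list each edge in both directions; every pair of endpoints is then
    counted twice, so the caller should halve the scores.
    """
    score = {v: 0.0 for v in graph}
    for s in graph:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}
        dist = {v: -1 for v in graph}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies.
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score

# Two triangles joined only through the "bridge" node.
edges = [("a1", "a2"), ("a1", "a3"), ("a2", "a3"), ("a3", "bridge"),
         ("bridge", "b1"), ("b1", "b2"), ("b1", "b3"), ("b2", "b3")]
graph = {}
for u, v in edges:
    graph.setdefault(u, []).append(v)
    graph.setdefault(v, []).append(u)

scores = {v: b / 2 for v, b in betweenness(graph).items()}
degree = {v: len(nbrs) for v, nbrs in graph.items()}
```

The `bridge` node has only two connections, fewer than `a3` or `b1`, yet it gets the highest betweenness because every shortest path between the two clusters passes through it.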
Why it matters
- Bottleneck detection: nodes with high betweenness can become points of failure.
- Influence across communities: in social or collaboration graphs, brokers can spread information between otherwise separate groups.
- Supply chain resilience: a logistics hub or supplier that connects multiple product lines can have high betweenness, signalling concentration risk.
Computational reality
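Exact betweenness can be expensive on large graphs because it needs a shortest-path search from every node. Even Brandes' algorithm, the standard exact method, takes time proportional to nodes × edges on an unweighted graph. In practice, teams often use:
- Approximation by running the shortest-path search from a random sample of source nodes and scaling the result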
- Constraints like considering only paths up to a maximum length
- Distributed implementations in big data environments
This is a key engineering lesson: centrality is valuable, but it must be computed within your runtime and cost limits.
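As a rough illustration of the sampling approach, the sketch below runs the shortest-path pass from only `k` randomly chosen source nodes and scales the totals by n / k. The function names, the fixed seed, and the path-shaped test graph are assumptions for the example. Real libraries may sample and scale differently.

```python
import random
from collections import deque

def source_dependencies(graph, s):
    """One Brandes pass: each node's share of shortest paths starting at s."""
    order = []
    preds = {v: [] for v in graph}
    sigma = {v: 0 for v in graph}
    dist = {v: -1 for v in graph}
    sigma[s], dist[s] = 1, 0
    queue = deque([s])
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in graph[v]:
            if dist[w] < 0:
                dist[w] = dist[v] + 1
                queue.append(w)
            if dist[w] == dist[v] + 1:
                sigma[w] += sigma[v]
                preds[w].append(v)
    delta = {v: 0.0 for v in graph}
    for w in reversed(order):
        for v in preds[w]:
            delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
    delta[s] = 0.0  # the source itself is an endpoint, not an intermediary
    return delta

def approx_betweenness(graph, k, seed=0, undirected=True):
    """Estimate betweenness from k sampled sources (without replacement)."""
    nodes = list(graph)
    k = min(k, len(nodes))
    rng = random.Random(seed)
    score = {v: 0.0 for v in nodes}
    for s in rng.sample(nodes, k):
        for v, dep in source_dependencies(graph, s).items():
            score[v] += dep
    scale = len(nodes) / k
    if undirected:
        scale /= 2.0  # each pair of endpoints would be seen from both ends
    return {v: total * scale for v, total in score.items()}

# Hypothetical path graph 0 - 1 - 2 - 3 - 4 (edges listed in both directions).
path = {i: [j for j in (i - 1, i + 1) if 0 <= j < 5] for i in range(5)}
exact = approx_betweenness(path, k=5)     # sampling every node gives the exact scores
estimate = approx_betweenness(path, k=2)  # cheaper, but noisier
```

With `k` equal to the number of nodes the estimate matches the exact scores. Smaller `k` trades accuracy for runtime, so it is worth checking how stable the top-ranked nodes are across several seeds.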
Putting Centrality Measures to Work
Centrality scores become useful when tied to a business action.
A simple workflow
- Define the decision: detect fraud hubs, prioritise outreach, harden critical systems, or optimise routing.
- Build the graph carefully: choose node types, edge direction, time window, and weights.
- Compute PageRank and betweenness: start with defaults, then test sensitivity to parameters like damping factor or weight scaling.
- Validate: compare top-ranked nodes with known outcomes (incidents, churn, historical fraud cases, expert review).
- Act: create thresholds, alerts, investigation queues, or resilience plans.
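For the validation step, one simple check is precision at k: of the k top-ranked nodes, what fraction are already known to matter? The sketch below is only one possible check. The scores and the set of confirmed cases are invented for illustration.

```python
def precision_at_k(scores, known_positives, k):
    """Fraction of the k highest-scoring nodes that are known positives."""
    ranked = sorted(scores, key=scores.get, reverse=True)[:k]
    if not ranked:
        return 0.0
    hits = sum(1 for node in ranked if node in known_positives)
    return hits / len(ranked)

# Hypothetical centrality scores and historically confirmed cases.
centrality = {"a": 0.40, "b": 0.25, "c": 0.15, "d": 0.10, "e": 0.06, "f": 0.04}
confirmed = {"a", "c"}

p_at_2 = precision_at_k(centrality, confirmed, k=2)
baseline = len(confirmed) / len(centrality)  # expected precision of a random pick
```

Comparing `p_at_2` with `baseline` shows whether the ranking does better than picking nodes at random. On real data the same comparison should be repeated for several values of k and several time windows.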
Use both measures together
- High PageRank + high betweenness can indicate nodes that are both influential and structurally critical.
- High PageRank but low betweenness may signal strong endorsement within a community.
- High betweenness but moderate PageRank can reveal quiet connectors that deserve monitoring.
Learners in data analytics courses in Delhi NCR often find this combined view more actionable than relying on a single metric.
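The combined view can be sketched as a simple labelling rule. Everything here is an assumption for illustration: the top-fraction cut-off, the example scores, and the extra “peripheral” label for nodes that rank high on neither measure. Both score dictionaries are assumed to cover the same nodes.

```python
def classify(pagerank_scores, betweenness_scores, top_fraction=0.25):
    """Label each node by whether it ranks in the top fraction of each measure."""
    def top_set(scores):
        k = max(1, int(len(scores) * top_fraction))
        return set(sorted(scores, key=scores.get, reverse=True)[:k])

    high_pr = top_set(pagerank_scores)
    high_bc = top_set(betweenness_scores)
    labels = {}
    for node in pagerank_scores:
        if node in high_pr and node in high_bc:
            labels[node] = "influential and structurally critical"
        elif node in high_pr:
            labels[node] = "endorsed within a community"
        elif node in high_bc:
            labels[node] = "quiet connector"
        else:
            labels[node] = "peripheral"
    return labels

# Hypothetical scores for four nodes.
pr = {"hub": 0.40, "star": 0.30, "broker": 0.20, "leaf": 0.10}
bc = {"hub": 10.0, "star": 1.0, "broker": 8.0, "leaf": 0.0}
labels = classify(pr, bc, top_fraction=0.5)
```

A fixed top fraction is only a starting point. In practice the cut-offs would be tuned during validation so that each label maps to a concrete action.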
Conclusion
Centrality measures turn complex networks into prioritised, explainable signals. PageRank is strong for identifying influence that flows through directed links, while betweenness centrality surfaces bridges, bottlenecks, and structural risk. The real value comes from correct graph modelling, practical computation choices, and validation against outcomes. If your goal is to build job-ready analytics skills, applying these measures to real networked datasets is one of the most direct ways to move from theory to impact, which is why data analytics courses in Delhi NCR increasingly include graph analytics in hands-on projects.
