Social Semantic Web Paper Summaries 4: Social Network Analysis: A Handbook by John Scott

Hi!

In 2018, I took a master’s course at Bogazici University called Social Semantic Web (CMPE 58H), taught by Suzan Üsküdarlı (https://twitter.com/uskudarli).

I wrote summaries of a few papers for that course, and I have decided to share them with you. I am doing this both to share my understanding of these papers and to show how I approach reading papers. The summary of the fourth paper is below.

Title:

Social Network Analysis: A Handbook

https://www.amazon.com/Social-Network-Analysis-John-Scott/dp/0761963391

Citation:

John Scott

https://en.wikipedia.org/wiki/John_Scott_(sociologist)

About Author:

John Scott is a respected sociologist and a member of the British Academy, the Royal Society of Arts and the Academy of Social Sciences. He has worked on social network analysis, social class and the history of sociology.

He is the author of Sociological Theory and Social Theory. Some keywords for his research would be: class, ownership, business decision making, and the power structures of different countries.

Abstract

The author begins in Chapter 1 by defining the types of data found in networks.

Then, in Chapter 2, he summarizes the history of Social Network Analysis by dividing it into three main branches.

After that, in Chapter 4 and Chapter 5, he presents the most important concepts of Social Network Analysis along with their historical background.

Chapter 4 covers the more basic concepts, while Chapter 5 is about more complex concepts such as centrality and absolute density. (The method for finding absolute density mentioned at the end is pretty interesting.)

Issues

Chapter 1: Networks and Relations

Purpose of the paper: Because the concepts around networks are hard to understand, the author tries to explain them. In his own words: “Identifying the key concepts about networks.”

Relations and Attributes:

There are three types of information to consider in case of networks:

  • Attribute data: Properties of agents.
  • Relational data: Properties of the system of agents, i.e. the ties and connections between agents.
  • Ideational data: Meanings of the typifications themselves. (Metadata)

Chapter 2: The development of social network analysis

SNA can be divided mainly into three branches:

  • Sociometric analysts: Provided technical advances via graph theory.
  • The Harvard researchers: Introduced the study of cliques.
  • Manchester anthropologists: Built on the others’ studies to examine real-life environments.

Some scientists:

  • Moreno invented the sociogram. This was the first attempt to systematize the concept of ‘network’ into an analytical scheme.

Topological approach: Points are connected by paths, and the space is also divided into regions.

  • Lewin: Mathematical modelling of group relations
  • Cartwright and Harary: Studied models of group cohesion, social pressure, cooperation, power and leadership, using signed and directed graphs.

Balance of a social network: A signed, directed graph (particularly a graph of liking and disliking relations) is in a balanced state when it splits into two sub-groups that dislike each other, while all the members within each sub-group like one another.
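
To make the balance idea concrete, here is a minimal sketch of my own (not from the book). It assumes ties are undirected and carry a sign of +1 (like) or -1 (dislike); the function name `is_balanced` and the sample data are hypothetical. It tries to sort everyone into two camps so that likes stay inside a camp and dislikes run between camps.

```python
from collections import deque

def is_balanced(nodes, signed_edges):
    """Return True if the nodes split into two camps with positive ties
    inside each camp and negative ties between the camps."""
    neighbours = {n: [] for n in nodes}
    for a, b, sign in signed_edges:
        neighbours[a].append((b, sign))
        neighbours[b].append((a, sign))

    camp = {}
    for start in nodes:
        if start in camp:
            continue
        camp[start] = 0
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for other, sign in neighbours[current]:
                # A friend must share the camp, a rival must be in the other one.
                expected = camp[current] if sign > 0 else 1 - camp[current]
                if other not in camp:
                    camp[other] = expected
                    queue.append(other)
                elif camp[other] != expected:
                    return False
    return True

# Two friendly pairs (A-B and C-D) that dislike each other: balanced.
friends_and_rivals = [("A", "B", +1), ("C", "D", +1), ("A", "C", -1), ("B", "D", -1)]
print(is_balanced("ABCD", friends_and_rivals))
```

A triangle with two likes and one dislike fails this check, which matches the intuition that "my friend's friend is my enemy" is an unstable situation.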

  • Rapoport: Studied spread of ideas and innovations

Interpersonal Configurations and Cliques

  • Hawthorne studies: Observing a bank-wiring room, the researchers examined friendships, antagonisms, etc.

Techniques for drawing graphs: Members of the same sub-group should be drawn close together, line crossings should be kept to a minimum, and the lengths of the lines should not differ much.

  • Warner Clique: A clique is ‘an intimate non-kin group, membership in which may vary in numbers from two to thirty or more people’
  • Davis et al. (Warner’s colleagues): Cliques have three layers: the core, the primary circle and the secondary circle. As you move outwards from the core, you become less of a member of the clique.
  • Homans: Studied matrix rearrangement (a spectacular achievement).

Homans divides any group into internal and external systems.

Networks : Total and Partial

  • Manchester anthropologists: John Barnes, Clyde Mitchell, Elizabeth Bott and Max Gluckman. Their focus was on conflict and change.
  • Gluckman: Negotiation, bargaining and coercion.
  • Mitchell: Carried on after Nadel. Worked on two conceptualizations of networks: ego-centric and global (based on features of the whole network). The ego-centric conceptualization is used a lot later in the text.

Quality of relations depends on three features: reciprocity (whether edges run in both directions), intensity and durability.

Density: Completeness of the network

Reachability: How easy it is for something to spread in the network

  • Harrison White (Harvard):

Algebraic models of groups using set theory.

Multidimensional scaling: When mapping, use the “distance” on the map to model relationships (such as placing weakly related points further apart).

  • Granovetter:

Studied how people find work. Found that informal (personal) contacts are especially important. Rational choice, rewards, etc. do not matter that much. The main issue is the flow of information.

  • Lee: Studied how people find an abortionist. The average number of steps needed to find one is 5.8.

Chapter 4: The basic building blocks of social networks: Points, lines, distance, density, direction

Graph Theory: Graph theory differs from a sociogram in that it can be applied to a broader range of problems.

In a graph, the pattern of connections is what matters. The positions of the edges and nodes on the page are not important.

Directed graph = Digraph

In a graph, intensity is represented by the values of the edges. (The most common example is the multiplicity of the edges.)

  • Adjacency: Being connected by a line.
  • Neighbourhood of point X: The set of all points that are adjacent to point X.
  • Degree of point X: Number of points that are adjacent to point X.

Sum of the degrees of the points in a graph = twice the number of edges.
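
A small sketch of these definitions, written by me for illustration (the edge list and the helper name `neighbourhoods` are made up). It builds each point's neighbourhood from an undirected edge list, counts degrees, and checks that the degrees add up to twice the number of edges.

```python
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]

def neighbourhoods(edge_list):
    """Map every point to the set of points adjacent to it."""
    adjacent = {}
    for a, b in edge_list:
        adjacent.setdefault(a, set()).add(b)
        adjacent.setdefault(b, set()).add(a)
    return adjacent

adjacent = neighbourhoods(edges)
degree = {node: len(adjacent[node]) for node in adjacent}

print(degree)
# Every edge contributes to the degree of both of its endpoints.
print(sum(degree.values()) == 2 * len(edges))
```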

  • Walk: A sequence of edges traversed when going from one node to another.
  • Path: A walk in which each edge and node is distinct. This means that there are no repetitions, cycles, etc.
  • Length of a path: Number of edges in the path.
  • Distance between two points: The length of the shortest path between them.
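
Distance, as defined above, can be found with a breadth-first search. This is my own minimal sketch, assuming an unweighted, undirected graph stored as a dictionary of neighbour sets; the function name `distance` and the example graph are hypothetical.

```python
from collections import deque

def distance(adjacent, source, target):
    """Length of the shortest path from source to target, or None if unreachable."""
    seen = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        if current == target:
            return seen[current]
        for nxt in adjacent.get(current, ()):
            if nxt not in seen:
                seen[nxt] = seen[current] + 1
                queue.append(nxt)
    return None

adjacent = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
print(distance(adjacent, "A", "D"))  # shortest route is A-C-D
```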

Digraphs have two types of degree: indegree and outdegree.

In digraphs, all edges in a path must point in the same direction. This rule can sometimes be ignored; when it is, the path is usually called a semi-path.

  • Density: A measure of how many of the possible connections have actually been made, i.e. the completeness of the graph. It depends on two things:

-Inclusiveness: (Number of all points - Number of isolated points) / Number of all points

-Sum of the degrees

Density = edges / ([nodes * (nodes - 1)] / 2)

Density for digraphs = edges / [nodes * (nodes - 1)]
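
The formulas above translate almost directly into code. The following is only a sketch with names I made up (`inclusiveness`, `density`); it assumes at least two nodes, otherwise the denominator is zero.

```python
def inclusiveness(node_count, isolated_count):
    """Share of points that are connected to at least one other point."""
    return (node_count - isolated_count) / node_count

def density(edge_count, node_count, directed=False):
    """Actual edges divided by the maximum possible number of edges."""
    possible = node_count * (node_count - 1)
    if not directed:
        possible /= 2  # an undirected edge covers both directions
    return edge_count / possible

print(inclusiveness(5, 1))
print(density(4, 4))                 # 4 edges out of 6 possible
print(density(4, 4, directed=True))  # 4 edges out of 12 possible
```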

  • Barnes contrasted two approaches to SNA: ego-centric and socio-centric.

Density of an ego-centric network: It is more useful to ignore the “ego” node and its edges entirely. The reason is that when you count the edges connecting the ego to the other nodes, the density always appears high.
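
Here is a small illustration of why the ego is left out. It is my own sketch: the function name `ego_density`, its `include_ego` flag and the sample edges are all hypothetical, and ties are assumed to be undirected.

```python
def ego_density(edges, ego, include_ego=False):
    """Density of the ego's network, by default with the ego itself left out."""
    alters = set()
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        if a == ego:
            alters.add(b)
        elif b == ego:
            alters.add(a)
    members = (alters | {ego}) if include_ego else alters
    ties = {frozenset((a, b)) for a, b in edges
            if a != b and a in members and b in members}
    n = len(members)
    return len(ties) / (n * (n - 1) / 2) if n > 1 else 0.0

edges = [("ego", "A"), ("ego", "B"), ("ego", "C"), ("A", "B")]
print(ego_density(edges, "ego"))                    # ties among A, B and C only
print(ego_density(edges, "ego", include_ego=True))  # inflated by the ego's own ties
```

In this toy network only one of the three possible ties among the contacts exists, yet counting the ego's own ties makes the network look twice as dense.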

Density of valued graphs: When calculating the density of valued graphs, it is useful to weight the density formula by the multiplicities of the edges. However, extra information about the network is needed to decide on the theoretical maximum of the multiplicities (the denominator).

  • Problem of the negative correlation between size and density: Larger networks mostly have lower density, so comparing the densities of networks of different sizes is misleading. In practice, a node can sustain only a certain number of relationships; beyond that, sustaining all of them becomes too expensive. Mayhew and Levinger argue that the maximum density likely to be found in a real-life network is 0.51.

The ability to sustain relationships also depends on the type of the relationship.

The way to calculate the density from samples found in real life: [(mean degree of nodes * number of nodes) / 2] / number of maximum possible connections. The division by 2 is there because the degree sum counts every edge twice.
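
A quick sketch of the mean-degree route to density, under my own assumptions: the graph is undirected, so the degree sum counts each edge twice and is halved before dividing by the n(n-1)/2 possible connections. The function name and the degree list are made up.

```python
def density_from_mean_degree(degrees):
    """Estimate density from a list of node degrees (undirected graph)."""
    n = len(degrees)
    mean_degree = sum(degrees) / n
    edge_count = mean_degree * n / 2  # each edge appears in two degrees
    possible = n * (n - 1) / 2
    return edge_count / possible

degrees = [2, 2, 3, 1]  # four nodes, four edges
print(density_from_mean_degree(degrees))
```

After simplification this is just the mean degree divided by (n - 1), which is handy when only degree data is available.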

  • Another way to calculate density that is more reliable for population studies (Granovetter’s solution): (sum of the average densities of some randomly chosen sub-group samples / number of those samples)
  • Wellman: Used Granovetter’s solution on East Yorkers to study whether urbanization resulted in the disappearance of community. The density of East Yorkers’ networks turned out to be low (0.33 on average), which means the community is not very tight.
  • Smith: Using historical documents about an English village and the ego-centric networks of 112 individuals, he found that even then the community was not so tight (densities between 0.2 and 0.4).

He also found that there is almost no correlation between network density and group size, contrary to what had been asserted before. This was a very surprising discovery. On this issue, he concluded that:

“the variations in network density which were observed were not a mere artefact of network size, but reflected real variations in the quality of interpersonal relations.” (?)

Smith also found that village communities assumed to be tightly organized were much looser than predicted (the median density being between 0.2 and 0.4).
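
The sampling idea attributed to Granovetter above (average the densities of several sub-group samples) could look roughly like this. It is only my reading of the summary, not the procedure from the original study; the function names, the edges and the hand-picked samples are hypothetical, and in practice the samples would be drawn at random.

```python
from statistics import mean

def sample_density(edges, members):
    """Density of the undirected sub-graph spanned by the sampled members."""
    members = set(members)
    ties = {frozenset((a, b)) for a, b in edges
            if a != b and a in members and b in members}
    n = len(members)
    return len(ties) / (n * (n - 1) / 2) if n > 1 else 0.0

def estimated_density(edges, samples):
    """Average the densities of the sampled sub-groups."""
    return mean([sample_density(edges, sample) for sample in samples])

edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]
samples = [("A", "B", "C"), ("C", "D", "E"), ("A", "D", "E")]
print(estimated_density(edges, samples))
```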

  • Concatenation of networks: Grieco’s study showed that when the person at the centre of an ego-centric network (the ego) passes information about job opportunities to other people in the network, either directly or indirectly, the ego and a person not directly connected to the ego may form a direct link. Therefore, providing help (solidarity and obligation) increases the density of the global network.

Chapter 5: Centrality, Centralization

SNA analysts agree with each other about point centrality.

  • Local Centrality: Is high if the node has a large neighbourhood.
  • Global Centrality: Is high if the node has a type of strategical importance in the network.
  • Centralization: Indicates the overall cohesion/integration of the graph. This is different than ‘centrality’. Centrality is a feature of a point. Centralization is a feature of a network.
  • Nieminen has a systematic study about degree-based centrality.
  • It is also possible to measure local centrality in directed graphs. There are two types of centrality in this case: Indegree and Outdegree.
  • Degree-based centrality can be extended by including connections at distance 2 (the second-level neighbourhood of the node). However, extending it to distance 3 or more does not add much information, since most nodes in real-life graphs are no more than 3–4 steps away from each other. With that much extension, most points end up with the same centrality.
  • Comparing the centrality of nodes that belong to different graphs is risky. Since the size of a graph affects the degree of every node in it, such comparisons are not reasonable.
  • Freeman proposed a relative centrality: (neighbourhood size of the node / (number of all nodes in the graph - 1)), since a node cannot be connected to itself.
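
As an illustration of relative (size-independent) local centrality, here is a sketch of my own. It assumes the normalisation by n - 1, the largest neighbourhood a node can have, as in Freeman's degree centrality; the function name and the star-shaped example are made up.

```python
def relative_degree_centrality(adjacent):
    """Neighbourhood size divided by the largest neighbourhood a node could have."""
    n = len(adjacent)
    return {node: len(neighbours) / (n - 1) for node, neighbours in adjacent.items()}

star = {"hub": {"A", "B", "C"}, "A": {"hub"}, "B": {"hub"}, "C": {"hub"}}
print(relative_degree_centrality(star))
```

Because the scores always fall between 0 and 1, they can be compared across graphs of different sizes, which raw degrees cannot.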

Freeman also suggested a global centrality: a point is central when it lies at a short distance from the other points in the graph.
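
One simple way to express this "closeness" idea is the sum of distances from a point to all the others, where a lower sum means a more central point. This is my own sketch; it assumes a connected, undirected, unweighted graph, and the function names and example are hypothetical.

```python
from collections import deque

def distances_from(adjacent, source):
    """Breadth-first search distances from source to every reachable point."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for nxt in adjacent[current]:
            if nxt not in dist:
                dist[nxt] = dist[current] + 1
                queue.append(nxt)
    return dist

def sum_of_distances(adjacent):
    """Lower values mean the point is closer to everyone else."""
    return {node: sum(distances_from(adjacent, node).values()) for node in adjacent}

path = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C", "E"}, "E": {"D"}}
print(sum_of_distances(path))  # the middle point C has the smallest sum
```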

There is also in-centrality and out-centrality in directed graphs.

Freeman: Betweenness: The betweenness of a point X increases as X lies on the shortest paths between more pairs of other nodes A and B.

Local Dependency: Point A is dependent on point B for reaching C if B is on the way from A to C.

Pair Dependency: The sum of all dependency values of point A on point B, in terms of reaching all other nodes P1, P2, P3 … PN.
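
Betweenness can be sketched by counting, for every pair of points, the share of their shortest paths that pass through a third point. The code below is my own unnormalised illustration for small undirected, unweighted graphs; the function names and the example are made up, and for large graphs a library implementation would be a better choice.

```python
from collections import deque
from itertools import combinations

def shortest_path_counts(adjacent, source):
    """Distances from source and the number of shortest paths to each point."""
    dist = {source: 0}
    count = {source: 1}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for nxt in adjacent[current]:
            if nxt not in dist:
                dist[nxt] = dist[current] + 1
                count[nxt] = 0
                queue.append(nxt)
            if dist[nxt] == dist[current] + 1:
                count[nxt] += count[current]
    return dist, count

def betweenness(adjacent):
    info = {node: shortest_path_counts(adjacent, node) for node in adjacent}
    score = {node: 0.0 for node in adjacent}
    for a, b in combinations(adjacent, 2):
        dist_a, count_a = info[a]
        dist_b, count_b = info[b]
        if b not in dist_a:
            continue  # a and b are not connected at all
        for x in adjacent:
            if x in (a, b) or x not in dist_a or x not in dist_b:
                continue
            # x is on a shortest a-b path when the two legs add up exactly.
            if dist_a[x] + dist_b[x] == dist_a[b]:
                score[x] += count_a[x] * count_b[x] / count_a[b]
    return score

path = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C", "E"}, "E": {"D"}}
print(betweenness(path))
```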

  • Bonacich Centrality: The centrality of a point depends on the centralities of the points it is connected to. So, a recursive process must be applied when calculating centrality.

c_i = Σ_j r_ij c_j -> centrality(i) = sum over j of [value_of_connection(i,j) x centrality(j)]

i is the node that we’re interested in. j is another point connected to i whose centrality is significant enough that we cannot ignore the connection.

  • Generalized version of Bonacich Centrality: c_i = Σ_j r_ij (α + β c_j). The weight β acts as a limit on how far away (which neighbourhood level) connections are taken into account while measuring centrality. If it is zero, no indirect nodes are taken into account when measuring the centrality of a point. It is a constant.

The problem is that it can be hard to decide what value to assign to this constant for a given type of graph.
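
The basic recursion (a point's centrality is the weighted sum of its neighbours' centralities) can be approximated by repeating the update and rescaling until the scores settle. This is my own sketch, not Bonacich's procedure: it treats the recursion as holding up to a scaling constant, adds each point's own score in every round as a common trick to avoid oscillation, and uses made-up names and data. The generalized version with the extra constant is not covered here.

```python
def bonacich_centrality(weights, iterations=100):
    """Iterate c_i <- c_i + sum_j r_ij * c_j, rescaling so the top score is 1."""
    nodes = list(weights)
    c = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        updated = {i: c[i] + sum(r * c[j] for j, r in weights[i].items())
                   for i in nodes}
        largest = max(updated.values())
        c = {i: value / largest for i, value in updated.items()}
    return c

# A triangle A-B-C with one extra point D attached to C; all ties have value 1.
weights = {
    "A": {"B": 1, "C": 1},
    "B": {"A": 1, "C": 1},
    "C": {"A": 1, "B": 1, "D": 1},
    "D": {"C": 1},
}
print(bonacich_centrality(weights))
```

In this example D has a single tie, but that tie goes to the best-connected point, so D still picks up a fair share of centrality from C.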

Least central points = Peripheral points

Density: Cohesion of a graph

Centralization: Indicates if this cohesion is concentrated on some particular points.

Density and centralization are complementary measures.

  • Centralization of a graph: If N is the number of nodes, the most central point is called P0 and the others P1, P2, etc., and C(P) is the centrality score of a point:

Centralization(Graph) = ( [C(P0)-C(P1)] + [C(P0)-C(P2)] + [C(P0)-C(P3)] + … + [C(P0)-C(P(N-1))] ) / (N-1)

N-1 is the maximum possible sum of differences: the number of nodes minus one, multiplied by the maximum difference of 1. [(N-1) * (1-0)]
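
The centralization formula above can be sketched as follows. This is my own illustration with made-up scores; it assumes the centrality scores are already scaled between 0 and 1, so that N-1 really is the largest possible sum of differences.

```python
def centralization(scores):
    """Sum of gaps between the top score and every score, divided by N - 1."""
    values = list(scores.values())
    top = max(values)
    n = len(values)
    return sum(top - value for value in values) / (n - 1)

scores = {"P0": 1.0, "P1": 0.5, "P2": 0.5, "P3": 0.0}
print(centralization(scores))
```

The most central point contributes a gap of zero, so including it in the sum does not change the result.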

  • Stokman and Snijders: Nuclear Centrality: Most central point on a graph. They place it in the middle when mapping the graph.

Margin of the Centre: Points that are neither peripheral nor central, but close to the nuclear centre.

Absolute Centre: The point which is closest to the other points. (This does not seem different from Freeman’s global centrality.)

  • Christofides defines the eccentricity of a point A as the longest distance between A and any other point X on the graph. The first idea was to define the centre as the point with the lowest eccentricity.

Second idea: Consider an imaginary point X placed somewhere along one of the existing paths. The point X is the absolute centre of the graph if it has the lowest eccentricity. In some graphs the absolute centre coincides with an actual point; in others it does not.

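To define an absolute density, we need measures of radius, diameter and circumference. For all of these, we need an absolute centre.

Diameter of a graph: Distance between the two most distant points. The radius is the eccentricity of the absolute centre.

Circumference of a graph: Length of the longest path on the graph. (This looks the same as the diameter at first, but the diameter is the largest of the shortest-path distances, while the circumference is the longest path of any kind.)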
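
Eccentricity and the first idea of a centre are easy to sketch. The code below is my own illustration and uses the usual graph-theory names as an assumption: diameter for the largest eccentricity and radius for the smallest. It only considers actual points, so it does not find an imaginary absolute centre lying between two points. The graph is assumed to be connected, undirected and unweighted.

```python
from collections import deque

def eccentricities(adjacent):
    """For each point, the distance to the point farthest away from it."""
    result = {}
    for source in adjacent:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            for nxt in adjacent[current]:
                if nxt not in dist:
                    dist[nxt] = dist[current] + 1
                    queue.append(nxt)
        result[source] = max(dist.values())
    return result

path = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C", "E"}, "E": {"D"}}
ecc = eccentricities(path)
centre = min(ecc, key=ecc.get)  # point with the lowest eccentricity
diameter = max(ecc.values())    # largest distance between any two points
radius = min(ecc.values())      # eccentricity of the centre
print(ecc, centre, diameter, radius)
```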

  • With these, we can calculate the volume of a graph and then the absolute density, by dividing the mass of the graph (the number of edges) by its volume. This method can also be generalized to higher dimensions.
  • Schwartz applied point centrality to the corporate environment to illustrate the power structure. He also applied Granovetter’s concepts of strong and weak ties to the corporate environment. His theory was that the events more important to the company would be interlocked by the nodes (people) that had strong ties with the company (i.e. full-time employees).

Approach

The author discusses measures that can be used to analyze graphs, with examples of their use and their historical background. He explains mathematical concepts in simpler language.

Author’s conclusions:

The author states that SNA metrics are mostly defined by mathematicians and that many sociologists find them hard to understand. He says his aim is to explain the concepts of Social Network Analysis in a comprehensible way while also mentioning their creators.

My conclusions

I think the author’s aim is undermined by the outline of the text (titles, etc.).

The layout and paragraphing of the text are a disaster. The author has laid it out as one big story. There is no problem with the organisation of the content; however, better use of titles is certainly needed.

It is also easy to get lost in the historical background parts. I think it would be better to state the definitions under titles first and then cover their historical background.

Rating

The text has fantastic content. However, its structure makes it very hard to follow. Therefore I would rate the text as moderate.

5/10

NLP Engineer at PragmaCraft. Former Researcher at Bogazici University Medical Imaging Lab. twitter:ahmetmeleq /// website:ahmetmelek.com