archethic-node icon indicating copy to clipboard operation
archethic-node copied to clipboard

Network patch calculation look not accurate

Open Neylix opened this issue 1 year ago • 5 comments

Describe the problem you discovered

On the mainnet, each node has a network patch calculated during the beacon summary. This network patch is used to determine which node is near to another node regarding the network latency between them. Actually all nodes have a great internet connection with low latency Here is the result of the function P2P.nearest_nodes from one of the node in mainnet:

image

I annoted each node with their geolocalisation and it's looks like the network patch is not accurate: The node in green should be the nearest ones from the node in blue, and the red one should be far from it.

For exemple new york has a patch next to sydney but far from toronto while the latency between NY and sydney is far more than the ping between NY and toronto:

  • ping NY => toronto = 22.5
  • ping NY => sydney = 215

This result with a non efficient fetching of data used through the quorum.

Describe the solution you'd like

Network patch should be more accurate and the nearest node from a node should be the one with less latency and not the one which are literally in the opposite side on the world.

Epic

No response

Neylix avatar Oct 25 '23 10:10 Neylix

Solution: Please change the formula/code to calculate the (x, y) coordinates. (This helps in correcting the above error)

To transform a distance matrix (Dij) and subsequently derive x, y coordinates, below are the steps to get the coordinates:

Step 1: Direct Transformation Method

  • Transformation: Pasted Graphic 2

  • Here, mean(D[i]) and mean(D[j]) are the averages of the ithrow and jth column of D, respectively, and mean(D) is the average of all elements in D.

After transforming the distance matrix D into the matrix B using either method, follow these steps to obtain x, y coordinates:

Step 2: Eigenvalue Decomposition

  • Perform eigenvalue decomposition on B: Pasted Graphic 5

  • Here, V is the matrix of eigenvectors and λ is the diagonal matrix of eigenvalues.

Step 3: Select the Top Two Eigenvectors

  • Choose the eigenvectors corresponding to the two largest eigenvalues (say, λ1 and λ2).

Step 4: Form the Coordinates

  • The x, y coordinates are then given by: Pasted Graphic 6

  • v1 and v2 are the eigenvectors corresponding to λ1 and λ2, respectively.

These steps will give you the x, y coordinates in a 2D space that best approximate the original distances in your matrix D. The accuracy of this representation depends on how well the original distances can be modeled in two dimensions.

internet-zero avatar Nov 15 '23 07:11 internet-zero

Here is sample of the data used to do the calculation stats.csv

This sample is coming from the summary of 14/11/2023 (JJ/MM/AAAA)

It looks like some latency are not accurate, for example a node at Embrun (in France) as 49 ms latency to a node at Singapore. When I connect to this node and send a ping to Singapore node it takes 280 ms.

Neylix avatar Nov 15 '23 09:11 Neylix

image

Here is the result of the above plot, it is looking accurate to the network latency.

internet-zero avatar Nov 16 '23 06:11 internet-zero

Please checkout the full python code

import pandas as pd import numpy as np import matplotlib.pyplot as plt

def classical_mds(distance_matrix_df): # Squaring the distance matrix D_squared = np.square(distance_matrix_df)

# Calculate the mean of rows, mean of columns, and mean of all elements
mean_Di = np.mean(D_squared, axis=1)
mean_Dj = np.mean(D_squared, axis=0)
mean_D = np.mean(D_squared.values)

# Apply the transformation to get matrix B
n = D_squared.shape[0]
B = np.empty((n, n))
for i in range(n):
    for j in range(n):
        B[i, j] = -0.5 * (D_squared.iloc[i, j] - mean_Di[i] - mean_Dj[j] + mean_D)

# Eigen Decomposition
eigenvalues, eigenvectors = np.linalg.eigh(B)

# Sorting the eigenvalues in descending order
idx = np.argsort(eigenvalues)[::-1]
eigenvalues_sorted = eigenvalues[idx]
eigenvectors_sorted = eigenvectors[:, idx]

# Selecting top 2 eigenvectors and eigenvalues
top_eigenvalues = np.diag(np.sqrt(eigenvalues_sorted[:2]))
top_eigenvectors = eigenvectors_sorted[:, :2]

# Compute the x, y coordinates
coordinates = top_eigenvectors @ top_eigenvalues

# Create a DataFrame for better visualization
coordinates_df = pd.DataFrame(coordinates, columns=['x', 'y'])
coordinates_df.index = distance_matrix_df.index

return coordinates_df

Load the distance matrix from the CSV file

distance_matrix = pd.read_csv('/mnt/data/stats.csv', header=None)

Extracting the numeric part of the matrix and using the location names as index

numeric_distance_matrix = distance_matrix.iloc[1:, 1:].astype(float) numeric_distance_matrix.index = distance_matrix.iloc[1:, 0] numeric_distance_matrix.columns = range(numeric_distance_matrix.shape[1])

Call the classical MDS function and get the coordinates

classical_mds_coordinates_df = classical_mds(numeric_distance_matrix)

Plot the coordinates

def plot_classical_mds(coordinates_df): plt.figure(figsize=(10, 8)) for idx, row in coordinates_df.iterrows(): plt.scatter(row['x'], row['y']) plt.text(row['x'], row['y'], str(idx), fontsize=9) plt.xlabel('x Coordinate') plt.ylabel('y Coordinate') plt.title('2D Representation of Locations (Classical MDS)') plt.grid(True) plt.show()

plot_classical_mds(classical_mds_coordinates_df)

internet-zero avatar Nov 16 '23 07:11 internet-zero

Also, please check if the input data in the Distance matrix is correct, it is very important that the RTT/Latency time input into the distance matrix is accurate. @Neylix @samuelmanzanera @bchamagne @redDwarf03

internet-zero avatar Nov 16 '23 07:11 internet-zero