archethic-node
archethic-node copied to clipboard
Network patch calculation look not accurate
Describe the problem you discovered
On the mainnet, each node has a network patch calculated during the beacon summary.
This network patch is used to determine which node is near to another node regarding the network latency between them.
Actually all nodes have a great internet connection with low latency
Here is the result of the function P2P.nearest_nodes
from one of the node in mainnet:
I annoted each node with their geolocalisation and it's looks like the network patch is not accurate: The node in green should be the nearest ones from the node in blue, and the red one should be far from it.
For exemple new york has a patch next to sydney but far from toronto while the latency between NY and sydney is far more than the ping between NY and toronto:
- ping NY => toronto = 22.5
- ping NY => sydney = 215
This result with a non efficient fetching of data used through the quorum.
Describe the solution you'd like
Network patch should be more accurate and the nearest node from a node should be the one with less latency and not the one which are literally in the opposite side on the world.
Epic
No response
Solution: Please change the formula/code to calculate the (x, y) coordinates. (This helps in correcting the above error)
To transform a distance matrix (Dij) and subsequently derive x, y
coordinates, below are the steps to get the coordinates:
Step 1: Direct Transformation Method
-
Transformation:
-
Here,
mean(D[i])
andmean(D[j])
are the averages of theith
row andjth
column ofD
, respectively, andmean(D)
is the average of all elements inD
.
After transforming the distance matrix D
into the matrix B
using either method, follow these steps to obtain x, y
coordinates:
Step 2: Eigenvalue Decomposition
-
Perform eigenvalue decomposition on
B
: -
Here,
V
is the matrix of eigenvectors andλ
is the diagonal matrix of eigenvalues.
Step 3: Select the Top Two Eigenvectors
- Choose the eigenvectors corresponding to the two largest eigenvalues (say,
λ1
andλ2
).
Step 4: Form the Coordinates
-
The
x, y
coordinates are then given by: -
v1
andv2
are the eigenvectors corresponding toλ1
andλ2
, respectively.
These steps will give you the x, y
coordinates in a 2D space that best approximate the original distances in your matrix D
. The accuracy of this representation depends on how well the original distances can be modeled in two dimensions.
Here is sample of the data used to do the calculation stats.csv
This sample is coming from the summary of 14/11/2023 (JJ/MM/AAAA)
It looks like some latency are not accurate, for example a node at Embrun (in France) as 49 ms latency to a node at Singapore. When I connect to this node and send a ping to Singapore node it takes 280 ms.
Here is the result of the above plot, it is looking accurate to the network latency.
Please checkout the full python code
import pandas as pd import numpy as np import matplotlib.pyplot as plt
def classical_mds(distance_matrix_df): # Squaring the distance matrix D_squared = np.square(distance_matrix_df)
# Calculate the mean of rows, mean of columns, and mean of all elements
mean_Di = np.mean(D_squared, axis=1)
mean_Dj = np.mean(D_squared, axis=0)
mean_D = np.mean(D_squared.values)
# Apply the transformation to get matrix B
n = D_squared.shape[0]
B = np.empty((n, n))
for i in range(n):
for j in range(n):
B[i, j] = -0.5 * (D_squared.iloc[i, j] - mean_Di[i] - mean_Dj[j] + mean_D)
# Eigen Decomposition
eigenvalues, eigenvectors = np.linalg.eigh(B)
# Sorting the eigenvalues in descending order
idx = np.argsort(eigenvalues)[::-1]
eigenvalues_sorted = eigenvalues[idx]
eigenvectors_sorted = eigenvectors[:, idx]
# Selecting top 2 eigenvectors and eigenvalues
top_eigenvalues = np.diag(np.sqrt(eigenvalues_sorted[:2]))
top_eigenvectors = eigenvectors_sorted[:, :2]
# Compute the x, y coordinates
coordinates = top_eigenvectors @ top_eigenvalues
# Create a DataFrame for better visualization
coordinates_df = pd.DataFrame(coordinates, columns=['x', 'y'])
coordinates_df.index = distance_matrix_df.index
return coordinates_df
Load the distance matrix from the CSV file
distance_matrix = pd.read_csv('/mnt/data/stats.csv', header=None)
Extracting the numeric part of the matrix and using the location names as index
numeric_distance_matrix = distance_matrix.iloc[1:, 1:].astype(float) numeric_distance_matrix.index = distance_matrix.iloc[1:, 0] numeric_distance_matrix.columns = range(numeric_distance_matrix.shape[1])
Call the classical MDS function and get the coordinates
classical_mds_coordinates_df = classical_mds(numeric_distance_matrix)
Plot the coordinates
def plot_classical_mds(coordinates_df): plt.figure(figsize=(10, 8)) for idx, row in coordinates_df.iterrows(): plt.scatter(row['x'], row['y']) plt.text(row['x'], row['y'], str(idx), fontsize=9) plt.xlabel('x Coordinate') plt.ylabel('y Coordinate') plt.title('2D Representation of Locations (Classical MDS)') plt.grid(True) plt.show()
plot_classical_mds(classical_mds_coordinates_df)
Also, please check if the input data in the Distance matrix is correct, it is very important that the RTT/Latency time input into the distance matrix is accurate. @Neylix @samuelmanzanera @bchamagne @redDwarf03