How do I run a cluster on Linux (CentOS)?

Open wuyanxing opened this issue 5 years ago • 35 comments

I can't find any explanation or method.

wuyanxing avatar Mar 27 '19 03:03 wuyanxing

I am having the same problem. As I understand the documentation, it should be sufficient to add the servers to the config file. GE then at least sends messages between these servers, but it does not run the script. The script ran successfully in embedded mode. I also used Global.CloudStorage.SaveXYZ to access memory. What can I do to set it up correctly?

This is the config I was trying to use:

<Trinity ConfigVersion="2.0">
  <Local>
    <!-- Add any configuration the client might need -->
  </Local>

    <section name="Application">
     <entry name="ConfigOutputOn">true</entry>
     <entry name="CurrentRunningMode">Distributed</entry>
    </section>

    <section name="Network">
     <entry name="ClientBufferSize">1048576</entry>
     <entry name="ClientMaxBufferSize">134217728</entry>
     <entry name="ServerSocketBufferSize">8192</entry>
     <entry name="ClientSocketBufferSize">8192</entry>
     <entry name="ServerMaxConn">512</entry>
     <entry name="ServerMaxAcceptOps">512</entry>
     <entry name="PreferedNetworkMask"/>
    </section>

    <Cluster>
        <Server Endpoint="<IPADRESS_SERVER1>:8133" />
        <Server Endpoint="<IPADRESS_SERVER2>:8133" />
    </Cluster>
</Trinity>
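
One detail worth checking in the listing above: a literal "<" inside an attribute value (as in the two Endpoint placeholders) is not well-formed XML, so the placeholders have to be replaced with bare addresses before the file can be parsed. The section/entry blocks also appear to follow an older configuration layout, and this thread does not establish whether a ConfigVersion 2.0 file honors them. A minimal cluster-only sketch, with SERVER1_IP and SERVER2_IP as hypothetical placeholders and no claim that this alone resolves the problem, might look like:

```xml
<?xml version="1.0" encoding="utf-8"?>
<Trinity ConfigVersion="2.0">
  <Cluster>
    <!-- SERVER1_IP and SERVER2_IP are placeholders (assumption); use real addresses -->
    <Server Endpoint="SERVER1_IP:8133" />
    <Server Endpoint="SERVER2_IP:8133" />
  </Cluster>
</Trinity>
```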

jkliss avatar May 20 '19 09:05 jkliss

@jkliss could you attach the log for the two instances?

yatli avatar May 20 '19 09:05 yatli

[ INFO    ] EchoOnConsole set to ON
[ INFO    ] Log: changing logging directory to /home/jkliss/BFS/BFSClient/bin/Debug/netcoreapp2.2/trinity-log.
[ INFO    ] Loading Graph Engine Extensions.
[ INFO    ] Scanning for TSL storage extension.
[ INFO    ] TSL storage extension loaded.
[ INFO    ] Scanning for MemoryCloud extensions.
[ INFO    ] No MemoryCloud extension found.
[ INFO    ] Scanning for startup tasks.
[ INFO    ] EventLoop: Starting.
[ INFO    ] *****************************************************
[ INFO    ] ServerCount: 2
[ INFO    ]     IPADRESS_SERVER1:8133
[ INFO    ]     IPADRESS_SERVER2:8133
[ INFO    ] ProxyCount: 0
[ INFO    ] *****************************************************

If I replace IPADRESS_SERVER1 with localhost, I get the following additional lines (and it starts sending network packets, but the script doesn't proceed any further):

[ INFO    ] LocalMemoryStorage is initialized in read-write mode
[ INFO    ] Initializing logging facility
[ INFO    ] Reading log file.
[ INFO    ] Write-ahead-log successfully loaded. Recovered 0 records.
[ INFO    ] Creating write-ahead log file /home/jkliss/BFS/BFSClient/bin/Debug/netcoreapp2.2/storage/B/write_ahead_log/primary_storage_log_20.dat

jkliss avatar May 20 '19 11:05 jkliss

@yatli Is there anything I need to incorporate in the code to enable communication between servers, or is there a function that makes a server wait for another server so that they synchronize?

If you have an example of how to set up and run a distributed system on Linux, it would be very helpful for me and probably for others too.

jkliss avatar May 20 '19 13:05 jkliss

I'm having the same problem as @jkliss. Documentation on how to set up GraphEngine in a distributed way would be very useful.

ToxicJojo avatar May 20 '19 13:05 ToxicJojo

I'm also looking for a working example of a distributed GraphEngine with multiple servers.

edouardpoitras avatar Aug 30 '23 18:08 edouardpoitras

@edouardpoitras Hi. Here is an example of how to configure a Graph Engine Availability Group (Cluster). This is the XML configuration on my head server (only):

<!--Declare and Define the Head (Primary) Graph Engine Cluster-->
<!-- <Local Template="primary-rub-truespark-sf-cluster-template"/>-->
<!-- <Remote Template="rub-truespark-ontology-taxonomy-cluster-template"/>-->

<!--A Cluster node contains configurations for servers and proxies of a Graph Engine cluster. 
    There can be multiple Cluster nodes as long as they have different identifiers. 
    A Cluster node can have an optional attribute Id.-->

<Cluster RunningMode="Server">
	<!-- IKW-GE-APPSVR03.inknowworksdev.net -->
	<Server Endpoint="10.1.100.5:7001" AssemblyPath="D:\Trinity TripleStore Server Deployment\">
		<Network HttpPort="-1" 
				ClientMaxConn="2" 
			    ClientSendRetry="5" 
			    ClientReconnectRetry="5" 
			    Handshake="TRUE"/>

		<Logging LogDirectory="D:\GraphEngine-Log\"
				LogLevel="Verbose" 
				LogToFile="TRUE" 
             	EchoOnConsole="TRUE"/>

		<Storage TrunkCount="256"
				ReadOnly="FALSE" 
				StorageCapacity="Max16G" 
				StorageRoot="D:\GraphEngine-Storage\" 
				DefragInterval="600"/>

		<LIKQ Timeout="90000" />
	</Server>

	<!-- IKW-GE-APPSVR03.inknowworksdev.net -->
	<Server Endpoint="10.30.10.23:7001" AssemblyPath="D:\Trinity TripleStore Server Deployment\">
		<Network HttpPort="-1" 
				ClientMaxConn="2" 
			    ClientSendRetry="5" 
			    ClientReconnectRetry="5" 
			    Handshake="TRUE"/>

		<Logging LogDirectory="D:\GraphEngine-Log\"
				LogLevel="Verbose" 
				LogToFile="TRUE" 
             	EchoOnConsole="TRUE"/>

		<Storage TrunkCount="256"
				ReadOnly="FALSE" 
				StorageCapacity="Max16G" 
				StorageRoot="D:\GraphEngine-Storage\" 
				DefragInterval="600"/>

		<LIKQ Timeout="90000" />
	</Server>

	<!-- IKW-GE-APPSVR03.inknowworksdev.net -->
	<Server Endpoint="10.30.10.34:7001" AssemblyPath="D:\Trinity TripleStore Server Deployment\">
		<Network HttpPort="-1" 
				ClientMaxConn="2" 
			    ClientSendRetry="5" 
			    ClientReconnectRetry="5" 
			    Handshake="TRUE"/>

		<Logging LogDirectory="D:\GraphEngine-Log\"
				LogLevel="Verbose" 
				LogToFile="TRUE" 
             	EchoOnConsole="TRUE"/>

		<Storage TrunkCount="256"
				ReadOnly="FALSE" 
				StorageCapacity="Max16G" 
				StorageRoot="D:\GraphEngine-Storage\" 
				DefragInterval="600"/>

		<LIKQ Timeout="90000" />
	</Server>

	<!-- IKW-GE-APPSVR03.inknowworksdev.net -->
	<Server Endpoint="10.30.10.108:7002" AssemblyPath="D:\Trinity TripleStore Server Deployment\">
		<Network HttpPort="-1" 
				ClientMaxConn="2" 
			    ClientSendRetry="5" 
			    ClientReconnectRetry="5" 
			    Handshake="TRUE"/>

		<Logging LogDirectory="D:\GraphEngine-Log\"
				LogLevel="Verbose" 
				LogToFile="TRUE" 
             	EchoOnConsole="TRUE"/>

		<Storage TrunkCount="256"
				ReadOnly="FALSE" 
				StorageCapacity="Max16G" 
				StorageRoot="D:\GraphEngine-Storage\" 
				DefragInterval="600"/>

		<LIKQ Timeout="90000" />
	</Server>

	<!-- IKW-GE-APPSVR03.inknowworksdev.net -->
	<Server Endpoint="10.30.10.71:7002" AssemblyPath="D:\Trinity TripleStore Server Deployment\">
		<Network HttpPort="-1" 
				ClientMaxConn="2" 
			    ClientSendRetry="5" 
			    ClientReconnectRetry="5" 
			    Handshake="TRUE"/>

		<Logging LogDirectory="D:\GraphEngine-Log\"
				LogLevel="Verbose" 
				LogToFile="TRUE" 
             	EchoOnConsole="TRUE"/>

		<Storage TrunkCount="256"
				ReadOnly="FALSE" 
				StorageCapacity="Max16G" 
				StorageRoot="D:\GraphEngine-Storage\" 
				DefragInterval="600"/>

		<LIKQ Timeout="90000" />
	</Server>
</Cluster>

image

TaviTruman avatar Aug 30 '23 22:08 TaviTruman

For the other machines in the GE Cluster you only need to specify the local machine:

<!--Declare and Define the Head (Primary) Graph Engine Cluster-->
<!-- <Local Template="primary-rub-truespark-sf-cluster-template"/>-->
<!-- <Remote Template="rub-truespark-ontology-taxonomy-cluster-template"/>-->

<!--A Cluster node contains configurations for servers and proxies of a Graph Engine cluster. 
    There can be multiple Cluster nodes as long as they have different identifiers. 
    A Cluster node can have an optional attribute Id.-->
<Cluster RunningMode="Server">
	<!-- IKW-GE-APPSVR03.inknowworksdev.net -->
	<Server Endpoint="10.30.10.23:7001" AssemblyPath="D:\Trinity TripleStore Server Deployment\">
		<Network HttpPort="-1" 
			ClientMaxConn="2" 
			ClientSendRetry="5" 
			ClientReconnectRetry="5" 
			Handshake="TRUE"/>

		<Logging LogDirectory="D:\GraphEngine-Log\"
			LogLevel="Verbose" 
			LogToFile="TRUE"
			EchoOnConsole="TRUE"/>

		<Storage TrunkCount="256"
			ReadOnly="FALSE" 
			StorageCapacity="Max16G" 
			StorageRoot="D:\GraphEngine-Storage\" 
			DefragInterval="600"/>

		<LIKQ Timeout="90000" />
	</Server>
</Cluster>

image

TaviTruman avatar Aug 30 '23 23:08 TaviTruman

Hey @TaviTruman, thanks for the quick reply.

I've tried your configs and am still scratching my head :)

To simplify things, I have three servers in the cluster (127.0.0.1:700[0-2]). I've stripped out all of the config options except the endpoint values.
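
A stripped-down cluster config of the kind described might look like the sketch below. It is an illustration only: the actual file was not posted in this comment, and the element and attribute names are taken from the configs earlier in the thread.

```xml
<?xml version="1.0" encoding="utf-8"?>
<Trinity ConfigVersion="2.0">
  <Cluster>
    <!-- Three server instances on one machine, distinguished only by port -->
    <Server Endpoint="127.0.0.1:7000" />
    <Server Endpoint="127.0.0.1:7001" />
    <Server Endpoint="127.0.0.1:7002" />
  </Cluster>
</Trinity>
```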

I run the first "head" node with the cluster config:

...
[ INFO    ] My IPEndPoint: 127.0.0.1:7000
...
[ INFO    ] ServerCount: 3
[ INFO    ]     127.0.0.1:7000
[ INFO    ]     127.0.0.1:7001
[ INFO    ]     127.0.0.1:7002
[ INFO    ] ProxyCount: 0
...

Great!

Then I run a 2nd instance with a config that only contains the server 127.0.0.1:7001:

[ INFO    ] My IPEndPoint: 127.0.0.1:7001
...
[ INFO    ] ServerCount: 1
[ INFO    ]     127.0.0.1:7001
[ INFO    ] ProxyCount: 0
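
For comparison, a config that lists only this instance's endpoint would look roughly like the following sketch (an assumed shape, since the actual file was not posted). The log above is consistent with it: the instance reports ServerCount: 1, which suggests it has no knowledge of the other servers.

```xml
<?xml version="1.0" encoding="utf-8"?>
<Trinity ConfigVersion="2.0">
  <Cluster>
    <!-- Only the local instance is listed; the other two endpoints are absent -->
    <Server Endpoint="127.0.0.1:7001" />
  </Cluster>
</Trinity>
```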

And then the last node:

[ INFO    ] My IPEndPoint: 127.0.0.1:7002
[ INFO    ] ServerCount: 1
[ INFO    ]     127.0.0.1:7002
[ INFO    ] ProxyCount: 0

At this point, all three instances have reported:

...
[ INFO    ] Scanning for MemoryCloud extensions.
[ INFO    ] No MemoryCloud extension found.
...
[ INFO    ] Server 0 is successfully started.

Is that correct? Not sure how to get the MemoryCloud extension working...

Also, when running a client, I reused the first cluster config and set TrinityConfig.CurrentRunningMode = RunningMode.Client; in code. That seems to work, in that the DistributedHashtable sample runs, but it's unclear to me whether it's working as expected.

Thanks again for your help!

edouardpoitras avatar Aug 31 '23 20:08 edouardpoitras

@edouardpoitras Hi. Looks like you've made great progress. If you don't mind, please upload your trinity.xml configuration file and your GE server and client code. I'll get back to you ASAP. It looks like you are trying to run the distributed hash table demo code, right?

TaviTruman avatar Aug 31 '23 21:08 TaviTruman

Hey @TaviTruman, thanks again for your time. I created a new repository to create a working minimal cluster example: https://github.com/edouardpoitras/TrinityResearch

It's basically the DistributedHashtable sample with a few code tweaks in the Program.cs. Nothing changed in the DistributedHashtableServer.cs or DistributedHashtable.tsl.

The README.md file has the manual steps I'm trying. I want to eventually get this working with docker-compose.

I've tweaked a few things since we last talked. The closest I've gotten now is to use the same config but shuffle the server endpoint definitions around so the first one chosen is different for each config. Pretty sure I'm doing something wrong.

Let me know what you think/spot.

edouardpoitras avatar Sep 01 '23 14:09 edouardpoitras

Hi, @edouardpoitras, I will take a look at this today and get back to you shortly.

TaviTruman avatar Sep 01 '23 16:09 TaviTruman

I will upload a diagram that depicts the GE Cluster along with the trinity configuration for each server. The GE Client configuration is quite simple: typically all you need to specify is TrinityConfig.CurrentRunningMode = RunningMode.Client;

TaviTruman avatar Sep 04 '23 19:09 TaviTruman

Hi, @edouardpoitras. I have written a new and improved version of the "Distributed Hash Table" sample, and it works well in the GE Availability Group of three servers. It is late here so I will post it for you in the morning. I have written a new set of documentation as well describing the API sets.

TaviTruman avatar Sep 05 '23 06:09 TaviTruman

@TaviTruman excellent! Looking forward to it :+1:

edouardpoitras avatar Sep 05 '23 13:09 edouardpoitras

image

TaviTruman avatar Sep 05 '23 14:09 TaviTruman

GE DHT Server (Head): Running on my Windows 11 Desktop

image

TaviTruman avatar Sep 05 '23 14:09 TaviTruman

GE DHT Server (Secondary Server in GE Cluster): Running in Hyper-V Windows Server 2022

image

In the GE log you can see that some of the work has been distributed to this DHT Server instance.

TaviTruman avatar Sep 05 '23 15:09 TaviTruman

GE DHT Server (Secondary Server in GE Cluster): Running in Hyper-V Windows Server 2022. This is the 3rd server in the GE Cluster

image

You can see here that this server in my test never received any work.

TaviTruman avatar Sep 05 '23 15:09 TaviTruman

Here is my DHT Client doing work!

image

TaviTruman avatar Sep 05 '23 15:09 TaviTruman

Here are the trinity.xml config files:

  1. GE DHT Server Cluster Config (see attached file)
  2. 2nd GE DHT Server Instance config
  3. 3rd GE DHT Server Instance config

primary GE DHT Cluster trinity.zip

image

image

TaviTruman avatar Sep 05 '23 15:09 TaviTruman

I will create a PR and submit the new demonstration. In the meantime, I will upload the Visual Studio project for you here. FYI, I do have a new Discord Channel coming up in October; the channel is dedicated to all things Graph Engine, Knowledge Graphs, Ontology Driven Software Design (ODSD) using Graph Engine, and much, much more.

TaviTruman avatar Sep 05 '23 15:09 TaviTruman

Here is the VS Solution and Project structure I use:

image

TaviTruman avatar Sep 05 '23 15:09 TaviTruman

Amazing @TaviTruman - giving it a shot now. I noticed that the projects use GraphEngine v4.0, which doesn't seem to be available in the repo. I tried changing the references to use v3.0 instead, but I get an "Unable to find package GraphEngine.Client" error. Presumably GraphEngine.Client is new in v4?

Thanks again!

Edit: Looking over the code and configs - this is exactly what I was looking for and is making a lot of sense to me now. Just need to work out how to get v4.

edouardpoitras avatar Sep 05 '23 17:09 edouardpoitras

Yikes - I forgot about that - sorry! I maintain my own public repo of the Graph Engine and have a lot of updates. Let me upload the new NuGet packages for you. FYI - GraphEngine.Client was removed from this repo, but I have been using it for a few years now.

TaviTruman avatar Sep 05 '23 17:09 TaviTruman

Let me get you everything you need and then you can have a lot of fun :-) I have also updated the code so that it saves the Local Memory Cloud and then restores or reloads it.

Distributed Hash Table on GE Cluster -V2.zip

TaviTruman avatar Sep 05 '23 17:09 TaviTruman

Here is the Windows Ready Deployment

DHT server Deployment.zip

TaviTruman avatar Sep 05 '23 17:09 TaviTruman