emergent-comms-negotiation
Reproduce ICLR2018 submission "Emergent Communication through Negotiation"
"Emergent Communication through Negotiation"
Reproduce https://openreview.net/forum?id=Hk6WhagRW&noteId=Hk6WhagRW , "Emergent Communication through Negotiation", ICLR 2018 anonymous submission.
To install
- install pytorch 0.2, https://pytorch.org
- download this repo:
git clone https://github.com/asappinc/emergent_comms_negotiation
To run
python ecn.py [--disable-comms] [--disable-proposal] [--disable-prosocial] [--enable-cuda] [--term-entropy-reg 0.5] [--utterance-entropy-reg 0.0001] [--proposal-entropy-reg 0.01] [--model-file model_saves/mymodel.dat] [--name gpu3box]
Where options are:
- `--enable-cuda`: use NVIDIA GPU, instead of CPU
- `--disable-comms`: disable the comms channel
- `--disable-proposal`: disable the proposal channel (i.e. an agent can create proposals, but the other agent can't see them)
- `--disable-prosocial`: disable prosocial reward
- `--term-entropy-reg VALUE`: termination policy entropy regularization
- `--utterance-entropy-reg VALUE`: utterance policy entropy regularization
- `--proposal-entropy-reg VALUE`: proposal policy entropy regularization
- `--model-file model_saves/FILENAME`: where to save the model to, and where to look for it on startup
- `--name NAME`: used in the logfile name, just to make it easier to find and distinguish logfiles; it has no other purpose
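As an illustration of how these options fit together, here is a minimal `argparse` sketch. It is not the parser from `ecn.py`. The flag names come from the usage line above, but the defaults are assumptions that simply echo the example values shown there, and the real defaults may differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags of ecn.py."""
    parser = argparse.ArgumentParser(prog='ecn.py')
    # Boolean switches: absent means False, present means True.
    parser.add_argument('--enable-cuda', action='store_true')
    parser.add_argument('--disable-comms', action='store_true')
    parser.add_argument('--disable-proposal', action='store_true')
    parser.add_argument('--disable-prosocial', action='store_true')
    # Entropy regularization weights. Defaults are ASSUMED from the usage example.
    parser.add_argument('--term-entropy-reg', type=float, default=0.5)
    parser.add_argument('--utterance-entropy-reg', type=float, default=0.0001)
    parser.add_argument('--proposal-entropy-reg', type=float, default=0.01)
    # Model path and run name. Both defaults are ASSUMED placeholders.
    parser.add_argument('--model-file', type=str, default='model_saves/mymodel.dat')
    parser.add_argument('--name', type=str, default='')
    return parser


args = build_parser().parse_args(['--disable-comms', '--term-entropy-reg', '0.05'])
print(args.disable_comms, args.term_entropy_reg)
```

Note that `argparse` turns the dashes into underscores, so `--term-entropy-reg` is read back as `args.term_entropy_reg`.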
Stdout layout
eg if we have:
000000 4:4/0 7:5/5 9:4/4
000000 4:5/0 6:1/5 7:2/4
000000 4:0/0 7:0/5 9:1/4
ACC
r: 0.91
Then:
- each of the first 4 lines here is the action of a single agent
- the `ACC` line is the agent accepting the previous proposal
- each proposal line is laid out as:
[utterance] [utility 0]:[proposal 0]/[pool 0] ... etc ...
- if the agents run out of time, the last line will be:
[out of time]
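The proposal line layout described above can be parsed with a few string splits. The sketch below is not part of the repo; the names `ItemField`, `ProposalLine` and `parse_proposal_line` are hypothetical, and it assumes that every field is a non-negative integer, as in the example output.

```python
from typing import List, NamedTuple


class ItemField(NamedTuple):
    utility: int
    proposal: int
    pool: int


class ProposalLine(NamedTuple):
    utterance: str
    items: List[ItemField]


def parse_proposal_line(line: str) -> ProposalLine:
    """Parse '[utterance] [utility]:[proposal]/[pool] ...' into a ProposalLine."""
    utterance, *fields = line.split()
    items = []
    for field in fields:
        # Each field looks like '7:5/5', i.e. utility:proposal/pool.
        utility, rest = field.split(':')
        proposal, pool = rest.split('/')
        items.append(ItemField(int(utility), int(proposal), int(pool)))
    return ProposalLine(utterance, items)


parsed = parse_proposal_line('000000 4:4/0 7:5/5 9:4/4')
print(parsed.utterance, parsed.items)
```

The `ACC`, `r: ...` and `[out of time]` lines do not follow this layout, so a caller would need to filter them out before using this function.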
One negotiation is printed out every 3 seconds or so, using the training set; the other negotiations are executed silently. There is no test set for now.
Results so far, summary
| Agent sociability | Proposal | Linguistic | Both | None |
|---|---|---|---|---|
| Self-interested, random term | | | >=0.80 | |
| Prosocial, random term | ~0.91 | ~0.83 | ~0.96 | >= 0.90 |
Notes:
- prosocial runs all use termreg=0.5, uttreg=0.0001, propreg=0.01
- self-interested run uses: termreg=0.05, uttreg=0.0001, propreg=0.005
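For convenience, the two settings in these notes can be turned into command lines. This is only a sketch: it assumes that `termreg`, `uttreg` and `propreg` correspond to `--term-entropy-reg`, `--utterance-entropy-reg` and `--proposal-entropy-reg`, and the helper `build_command` is hypothetical, not part of the repo.

```python
import shlex
from typing import Dict, Sequence

# Regularization values taken from the notes; the flag mapping is an ASSUMPTION.
PROSOCIAL = {
    'term-entropy-reg': 0.5,
    'utterance-entropy-reg': 0.0001,
    'proposal-entropy-reg': 0.01,
}
SELF_INTERESTED = {
    'term-entropy-reg': 0.05,
    'utterance-entropy-reg': 0.0001,
    'proposal-entropy-reg': 0.005,
}


def build_command(regs: Dict[str, float], extra_flags: Sequence[str] = ()) -> str:
    """Build a shell command line for ecn.py from a dict of regularization values."""
    argv = ['python', 'ecn.py', *extra_flags]
    for flag, value in regs.items():
        argv += ['--' + flag, str(value)]
    # Quote each argument so the result is safe to paste into a shell.
    return ' '.join(shlex.quote(arg) for arg in argv)


print(build_command(PROSOCIAL))
print(build_command(SELF_INTERESTED, ['--disable-prosocial']))
```

The other prosocial scenarios would add `--disable-comms` and/or `--disable-proposal` through `extra_flags`.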
Scenario details
| Prop? | Comm? | Soc? | Rand term? | Term reg | Utt reg | Prop reg | Subjective variance | Reward | Greedy ratios |
|---|---|---|---|---|---|---|---|---|---|
| Y | Y | Y | Y | 0.5 | 0.0001 | 0.01 | Low | ~0.96 | term=0.7345 utt=0.7635 prop=0.8304 |
| Y | - | Y | Y | 0.5 | 0.0001 | 0.01 | Medium-High | ~0.91 | term=0.6965 utt=0.0000 prop=0.8741 |
| - | Y | Y | Y | 0.5 | 0.0001 | 0.01 | High | ~0.83 | term=0.6889 utt=0.7849 prop=0.8222 |
| - | - | Y | Y | 0.5 | 0.0001 | 0.01 | Very low | >= 0.90 (climbing) | term=0.7781 utt=0.0000 prop=0.6006 |
| Y | Y | - | Y | 0.5 | 0.0001 | 0.01 | Very High | ~0.25 | term=0.7467 utt=0.9284 prop=0.8137 |
| Y | Y | - | Y | 0.05 | 0.0001 | 0.005 | Very Low | >= 0.80 (climbing) | term=0.9820 utt=0.7040 prop=0.6523 |
Training curves
Proposal, comms, prosocial
Three training runs, identical settings:
Proposal, no comms, prosocial
No proposal, comms, prosocial
No proposal, no comms, prosocial
Proposal, comms, no social
Run 1, same entropy regularization as prosocial graphs:
Run 2, with reduced entropy regularization:
Unit tests
- install pytest, e.g. `conda install -y pytest`, and then run:
py.test -svx
- there are also some additional tests in:
python net_tests.py
(these allow close examination of specific parts of the network, policies, and so on, but aren't really unit tests as such, since they have neither termination criteria nor success criteria)
Plotting graphs
Assumptions:
- running the training on remote Ubuntu 16.04 instances
- `ssh` access, as user `ubuntu`, to these instances
- remote has home directory `/home/ubuntu`
- logs are stored in subdirectory `logs` of current local directory
- the location of `logs` relative to `~` is identical on local computer and remote computer
Setup/configuration:
- copy `instances.yaml.templ` to `~/instances.yaml`, on your own machine
- configure `~/instances.yaml` with:
  - name and ip of each instance (names are arbitrary)
  - the path to your private ssh key, that can access these instances
Procedure
- run:
python merge.py --hostname [name in instances.yaml] [--logfile logs/log_20171104_1234.log] \
[--title 'my graph title'] [--y-min 75 --y-max 85]
This will:
- `rsync` the logs from the remote instance identified by `--hostname`
- if `--logfile` is specified, load the results from that logfile
  - else, will look for the most recent logfile, ordered by name
- plots the graph into `/tmp/out-reward.png`
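The "most recent logfile, ordered by name" rule works because the timestamp in names such as `log_20171104_1234.log` sorts lexicographically in chronological order. The sketch below illustrates that idea only; it is not the code from `merge.py`, and the function name `most_recent_logfile` is hypothetical.

```python
from typing import List, Optional


def most_recent_logfile(filenames: List[str]) -> Optional[str]:
    """Return the last '.log' filename in name order, or None if there is none."""
    # Names like log_YYYYMMDD_HHMM.log sort chronologically as plain strings.
    logs = sorted(name for name in filenames if name.endswith('.log'))
    return logs[-1] if logs else None


example = [
    'log_20171104_1234.log',
    'log_20171105_0001.log',
    'log_20171103_2359.log',
    'notes.txt',
]
print(most_recent_logfile(example))
```

In practice the input list could come from `os.listdir('logs')`, with the result joined back onto the `logs` directory.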