Introducing BarterValue for games
What problem does this feature address?
As a casual trader, it is difficult to assess an offer or to compare game values. This causes problems in two ways:
- I make offers which are too greedy, and which are:
  - immediately rejected by traders who realize they are greedy, or
  - accepted by novice traders, which treats them unfairly.
- Or I make offers which are unfair to myself, by offering a valuable game for a low-value game.
The other way out of this problem is to learn how to value games as a trader, but that requires a lot of time investment.
Describe a solution
My proposal is to create a BarterValue for every game. While these values could be calculated by many machine learning approaches, my initial proposal is to apply a simple logistic regression. The initial design can be found below:
- We use the features of the game, such as the number of wishlists, the number of tradeables, current Steam price, lowest-ever price, whether it was ever given away, Steam rating, Steam review count, etc.
- This list can be extended to include grey-market prices.
- We also use trader features, such as the number of completed offers, the number of received-and-declined offers, and the number of sent-and-declined offers.
For the sake of simplicity, I will use only two values per game in my description, nW (number of wishlists) and nT (number of tradeables), and two values per trader, nS (number of sent offers) and nR (number of received offers).
The assumption we follow is: if user A offers game X for game Y to user B and the offer is accepted, we interpret that as:
- User A says Value(X) >= Value(Y), since they offered the trade
- User B says Value(Y) >= Value(X), since they accepted the trade

On the other hand, if user A offers game W for game Z to user B and the offer is rejected, we interpret that as:
- User A says Value(W) >= Value(Z), since they offered the trade
- User B says Value(Z) < Value(W), since they rejected the trade
With these assumptions, we create data rows as follows:
nS(A), nR(A), nW(X), nT(X), nS(B), nR(B), nW(Y), nT(Y) -> TRUE
nS(B), nR(B), nW(Y), nT(Y), nS(A), nR(A), nW(X), nT(X) -> TRUE
nS(A), nR(A), nW(W), nT(W), nS(B), nR(B), nW(Z), nT(Z) -> TRUE
nS(B), nR(B), nW(Z), nT(Z), nS(A), nR(A), nW(W), nT(W) -> FALSE
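These rows could be generated mechanically from each offer. The sketch below is illustrative only; the dict keys for trader and game features are assumptions, not Barter's actual data model:

```python
# Sketch: build training rows from an offer, per the assumptions above.
# Trader dicts carry nS/nR, game dicts carry nW/nT (assumed field names).

def feature_row(sender, sender_game, receiver, receiver_game):
    """Concatenate sender-side and receiver-side features into one row."""
    return [sender["nS"], sender["nR"], sender_game["nW"], sender_game["nT"],
            receiver["nS"], receiver["nR"], receiver_game["nW"], receiver_game["nT"]]

def rows_from_offer(a, x, b, y, accepted):
    """Each offer yields two rows: the offerer's view (always TRUE)
    and the receiver's view (TRUE if accepted, FALSE if declined)."""
    return [
        (feature_row(a, x, b, y), True),      # A offered X for Y: Value(X) >= Value(Y)
        (feature_row(b, y, a, x), accepted),  # B's view: accepted -> Value(Y) >= Value(X)
    ]
```

A declined offer thus contributes one TRUE row and one FALSE row, matching the four rows listed above.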
We then run a simple logistic regression. To give the equations for the rejected trade above:
sig(C0 + C1s*nS(A) + C2s*nR(A) + C3s*nW(W) + C4s*nT(W) - C1r*nS(B) - C2r*nR(B) - C3r*nW(Z) - C4r*nT(Z)) = 1
sig(C0 + C1s*nS(B) + C2s*nR(B) + C3s*nW(Z) + C4s*nT(Z) - C1r*nS(A) - C2r*nR(A) - C3r*nW(W) - C4r*nT(W)) = 0
In the equations above, C1s is the coefficient assigned to the number of sent offers for the sender, and C1r is the coefficient assigned to the number of sent offers for the receiver. Since we want each such pair of coefficients to be almost the same, we can add a regularization step to make sure they don't drift too far apart.
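One way this regularization could be written, as a sketch: add a penalty on the gap between each sender/receiver coefficient pair, where λ is a hypothetical strength hyperparameter and CE is the usual cross-entropy loss:

```latex
\mathcal{L} = \sum_{i} \mathrm{CE}\!\left(y_i, \hat{y}_i\right) + \lambda \sum_{k=1}^{4} \left(C_{ks} - C_{kr}\right)^2
```

A large λ drives the paired coefficients together; in the limit they become a single shared coefficient per feature.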
As soon as our logistic regression is trained and we have the coefficients, we can easily calculate the value of a game from its features. For example, the value of game X becomes: C3*nW(X) + C4*nT(X)
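As a proof-of-concept sketch: if the sender and receiver coefficients are tied exactly (the limiting case of the regularization above), the model reduces to logistic regression on feature *differences*, and a game's value is the dot product of the learned coefficients with its features. The feature layout [nS, nR, nW, nT] and the hand-rolled gradient descent below are illustrative assumptions, not a prescribed implementation:

```python
import numpy as np

# With sender/receiver coefficients tied, each training row is the difference
# between sender-side and receiver-side features:
#   sig(C0 + C . (f_sender - f_receiver)) -> label
# Per-side feature layout is assumed to be [nS, nR, nW, nT].

def train(rows, lr=0.1, epochs=2000):
    """Plain batch gradient descent for logistic regression.

    rows: list of (sender_features, receiver_features, label) tuples."""
    X = np.array([np.subtract(s, r) for s, r, _ in rows], dtype=float)
    y = np.array([float(lbl) for _, _, lbl in rows])
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted acceptance probability
        w -= lr * (X.T @ (p - y)) / len(y)
        b -= lr * float(np.mean(p - y))
    return w

def game_value(w, game_features):
    """BarterValue of a game: only the game-feature coefficients (nW, nT) apply."""
    return float(np.dot(w[2:], game_features))
```

After training on real offers, `game_value(w, [nW, nT])` would give the proposed C3*nW + C4*nT score for any game.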
Having this feature would not only help people evaluate offers more easily; it could also be used to automatically generate the fairest offers on BarterVG.
Examples of similar features
Similar approaches have been used for predicting real estate prices.
I had a similar idea: using machine learning, predict how likely a proposed trade is to be accepted. The problem with that, though, is that there is no historical game data for past trades, meaning the value/properties of a game in a past trade may have changed by now.
Anyway, a lot of data is currently available through barter's API. Coincidentally, I'm developing some visual analytics tools for barter.vg data for a uni project (whose deadline is this Sunday, btw). If you want, we can collab and develop something nice over the weekend. https://game-data-explorer.glitch.me/
We've had this discussion before multiple times and it comes up again every once in a while.
"Value" is not a hard and fast number that can be calculated; people take into account so many factors and may even determine value subjectively. Besides, your proposal doesn't address potential arguments such as "why is this counted/why is this not counted?" and the issue of calculations being potentially "wrong", causing annoyance for "veteran traders" and new users whose trades constantly get declined.
If you want to have a "value tracker", it should be at worst opt-in, or preferably via a userscript.
I had a similar idea: using machine learning, predict how likely a proposed trade is to be accepted. The problem with that, though, is that there is no historical game data for past trades, meaning the value/properties of a game in a past trade may have changed by now.
That's a good point, I haven't considered that. I think for the proof of concept, we can assume the properties of the games are considered constant at the time of training. I know this is a big assumption, but for now I don't have a better solution.
Anyway, a lot of data is currently available through barter's API. Coincidentally, I'm developing some visual analytics tools for barter.vg data for a uni project (whose deadline is this Sunday, btw). If you want, we can collab and develop something nice over the weekend. https://game-data-explorer.glitch.me/
I will check the API and what can be done for the training.
We've had this discussion before multiple times and it comes up again every once in a while.
"Value" is not a hard and fast number that can be calculated; people take into account so many factors and may even determine value subjectively. Besides, your proposal doesn't address potential arguments such as "why is this counted/why is this not counted?" and the issue of calculations being potentially "wrong", causing annoyance for "veteran traders" and new users whose trades constantly get declined.
I am not sure what you mean by "why this is counted/not counted". About the veteran/novice users: a side product of our model will be the user properties and their behaviors; those values can be used in a future iteration. About the subjectivity concern: since the model is trained on the accumulated trade data, it will be as objective as possible. Subjectivity will always be a factor, just as it is in the real world, but we should still be able to get an objective value as common ground.
EDIT: this will also pre-eliminate a lot of trades which are destined to be declined, saving a lot of annoyance for both new and veteran traders.
If you want to have a "value tracker", it should be at worst opt-in, or preferably via a userscript.
I work as a backend developer, so having this as an opt-in option or a userscript is not an area I am an expert in. But in my personal opinion, the more common this feature is, the more useful it will be, because the main reason I made this proposal is to establish common ground for all users when evaluating games.
Re: @antigravities I think you should see it as a tool. It's equivalent to asking a third party trader what (s)he thinks of the trade. In this case, it's the ML model's opinion.
I am not sure what you mean by "why this is counted/not counted".
Someone could say "why is [metric I consider to be important in a trade] not counted in the 'BarterValue' calculator?", i.e. "why is the price of the game on my obscure store not counted?"
About the veteran/novice users: a side product of our model will be the user properties and their behaviors; those values can be used in a future iteration. About the subjectivity concern: since the model is trained on the accumulated trade data, it will be as objective as possible. Subjectivity will always be a factor, just as it is in the real world, but we should still be able to get an objective value as common ground.
EDIT: this will also pre-eliminate a lot of trades which are destined to be declined, saving a lot of annoyance for both new and veteran traders.
If you want to have a "value tracker", it should be at worst opt-in, or preferably via a userscript.
I work as a backend developer, so having this as an opt-in option or a userscript is not an area I am an expert in. But in my personal opinion, the more common this feature is, the more useful it will be, because the main reason I made this proposal is to establish common ground for all users when evaluating games.
It doesn't matter how "objective" your trade metric is. How each individual person values each individual item is sentimental, subjective and mutable. I can guarantee you that how you value your copy of Pesterquest is entirely different to how I or anyone else values theirs.
Re: @antigravities I think you should see it as a tool. It's equivalent to asking a third party trader what (s)he thinks of the trade. In this case, it's the ML model's opinion.
If it's built into the site, it will be seen by new users as the "objective" and "only" way to determine game value, regardless of the intention.
I am not sure what you mean by "why this is counted/not counted".
Someone could say "why is [metric I consider to be important in a trade] not counted in the 'BarterValue' calculator?", i.e. "why is the price of the game on my obscure store not counted?"
For the internal data, my approach to machine learning is: just feed in all the data you have and let the algorithm decide what is important and what is not.
For the data from external stores, I believe the same rule applies, with some extra steps. If we set the system up so that introducing new data is easy, we can try their obscure store prices. If the machine learning algorithm says they are helpful, great; if it says they are not helpful, there is your answer for the people asking why their store isn't used.
About the veteran/novice users: a side product of our model will be the user properties and their behaviors; those values can be used in a future iteration. About the subjectivity concern: since the model is trained on the accumulated trade data, it will be as objective as possible. Subjectivity will always be a factor, just as it is in the real world, but we should still be able to get an objective value as common ground. EDIT: this will also pre-eliminate a lot of trades which are destined to be declined, saving a lot of annoyance for both new and veteran traders.
If you want to have a "value tracker", it should be at worst opt-in, or preferably via a userscript.
I work as a backend developer, so having this as an opt-in option or a userscript is not an area I am an expert in. But in my personal opinion, the more common this feature is, the more useful it will be, because the main reason I made this proposal is to establish common ground for all users when evaluating games.
It doesn't matter how "objective" your trade metric is. How each individual person values each individual item is sentimental, subjective and mutable. I can guarantee you that how you value your copy of Pesterquest is entirely different to how I or anyone else values theirs.
Re: @antigravities I think you should see it as a tool. It's equivalent to asking a third party trader what (s)he thinks of the trade. In this case, it's the ML model's opinion.
If it's built into the site, it will be seen by new users as the "objective" and "only" way to determine game value, regardless of the intention.
About the objectivity/subjectivity concern, I definitely agree with you that the values of games will differ for every person. But in my opinion, the same thing applies in the real world. For example, the value of an apple changes in a person's perception based on whether they are vegan, how hungry they are, whether they are allergic to apples, and so on... But that doesn't make having a market price for apples any less useful.
If it's built into the site, it will be seen by new users as the "objective" and "only" way to determine game value, regardless of the intention.
This just depends on how it's implemented. It certainly can be implemented that way, but it can also be implemented in such a way that it's only suggestive.
Relevant steam discussion: https://steamcommunity.com/groups/bartervg/discussions/0/405692758726921809
I started working on a simple model ( even if it ends up being thrown away, I am fine with it ) , but the data collection is quite slow since I am having to hit the API for every offer. Is there a bulk API, that I can get the offer details from?
We ran into this problem too. There's only these (using the cdn domain):
- Most recent global trades: https://bartervg.com/o/json/
- User trade list: https://bartervg.com/u/<USER_HEX_ID>/o/json/
- User trade: https://bartervg.com/u/<USER_HEX_ID>/o/<TRADE_OFFER_ID>/json/
Which offers would you like, completed (312760) and declined (1367632) offers? Or declined due to a specific reason, such as not worth it to me (299169)?
Do declined include countered?
Yes, remember #251? There are 130333 declined offers with the reason countered.
Good. A data dump for accepted and declined trades would be nice, with the data of the included games as well.
accepted and declined trades
Accepted is a temporary status. It should either be completed or failed (or expired). I'm not sure how much of a difference it would make, but not all accepted offers lead to completed offers.
If I understand correctly, most declined offers should not be used as inputs. There isn't enough information if someone declines without a reason, and there is wrong information if someone declines due to "already own" or "no longer have". These declines do not reflect the offer recipient's sense of value.
Silly example: You offer Cyberpunk 2077 in exchange for my The Haunted Island, a Frog Detective Game. I decline because, whoops, I forgot to update my tradable collection and no longer have The Haunted Island, a Frog Detective Game to trade. If the model uses this offer, it will naively compute that the value of The Haunted Island, a Frog Detective Game is greater than that of Cyberpunk 2077 (either to me, or more dangerously, in general).
I had a similar idea: using machine learning, predict how likely a proposed trade is going to be accepted. The problem for that though, is there is no historical game data with past trades. Meaning the value/properties of a game in a past trade can be changed by now.
Do you have anything to address this? I asked you, I think about a year ago, to start logging this historical data. I assume that still has not happened yet?
https://github.com/bartervg/barter.vg/issues/128
We ran into this problem too. There's only these (using the cdn domain):
- Most recent global trades: https://bartervg.com/o/json/
- User trade list: https://bartervg.com/u/<USER_HEX_ID>/o/json/
- User trade: https://bartervg.com/u/<USER_HEX_ID>/o/<TRADE_OFFER_ID>/json/
I am currently going down this path:
-> Collect all users from https://barter.vg/u/json
-> Collect all offer IDs from https://bartervg.com/u/<USER_HEX_ID>/o/json/
-> Collect all offer details from https://bartervg.com/u/<USER_HEX_ID>/o/<TRADE_OFFER_ID>/json/
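A minimal sketch of that crawl, assuming each endpoint returns JSON and that the user offer list can be iterated for offer ids (the exact JSON shape is an assumption); the sleep is there to throttle requests:

```python
import json
import time
from urllib.request import urlopen

BASE = "https://bartervg.com"

def offer_url(user_hex, offer_id):
    """Build the per-offer endpoint URL quoted in this thread."""
    return f"{BASE}/u/{user_hex}/o/{offer_id}/json/"

def fetch_json(url):
    with urlopen(url) as resp:
        return json.load(resp)

def crawl_user_offers(user_hex, delay=1.0):
    """Fetch a user's offer ids, then each offer's details, politely throttled."""
    offer_ids = fetch_json(f"{BASE}/u/{user_hex}/o/json/")
    details = []
    for offer_id in offer_ids:  # assumed: the list endpoint yields offer ids
        details.append(fetch_json(offer_url(user_hex, offer_id)))
        time.sleep(delay)       # throttle to be kind to the server
    return details
```

Because every offer needs its own request, this sequential version is exactly the slow path described above.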
Which offers would you like, completed (312760) and declined (1367632) offers? Or declined due to a specific reason, such as `not worth it to me` (299169)?
I am marking `Completed` and `Accepted` as positive samples, and `Declined` as negative samples. I guess I am at the "the more data, the better" point.
accepted and declined trades
`Accepted` is a temporary status. It should either be completed or failed (or expired). I'm not sure how much of a difference it would make, but not all accepted offers lead to completed offers.
If I understand correctly, most declined offers should not be used as inputs. There isn't enough information if someone declines without a reason, and there is wrong information if someone declines due to "already own" or "no longer have". These declines do not reflect the offer recipient's sense of value.
Silly example: You offer Cyberpunk 2077 in exchange for my The Haunted Island, a Frog Detective Game. I decline because, whoops, I forgot to update my tradable collection and no longer have The Haunted Island, a Frog Detective to trade. If the model uses this offer, it will naively compute that the value of The Haunted Island, a Frog Detective Game is greater than Cyberpunk 2077 (either to me, or more dangerously, in general).
I believe most people just don't set a reason because they are lazy. Even the declined offers without reasons give us a lot of information, and they should be used. Of course, what I believe is not very important; what we should do is try a few different methods (including and excluding the no-reason declines) to see how accurate our model gets, and make this call based on the model's output. I agree that there can be cases where the decline doesn't make any sense (such as your example), but the ML algorithms should be robust enough not to be fooled by those, as long as we have enough data to recognize an outlier.
I had a similar idea: using machine learning, predict how likely a proposed trade is going to be accepted. The problem for that though, is there is no historical game data with past trades. Meaning the value/properties of a game in a past trade can be changed by now.
Do you have anything to address this? I did ask you I think like a year ago to start logging this historical data. I assume that still has no happened yet?
I am seeing some of the game details in the offer detail API, but I haven't checked yet whether that data comes from the historical state of the game or whether the offer data is joined with the current state of the game data.
-> Collect all users from https://barter.vg/u/json
-> Collect all offer IDs from https://bartervg.com/u/<USER_HEX_ID>/o/json/
-> Collect all offer details from https://bartervg.com/u/<USER_HEX_ID>/o/<TRADE_OFFER_ID>/json/
I think you can minimize the data-collecting with logbase2 by skipping the same trades from the other side perspective.
I agree that there can be some cases where the decline doesn't make any sense (such as your example) but the ML algorithms should be strong enough to be not fooled by those, as long as we have enough data to realize that it was an outlier.
Yes, barter had a point. There are some 'invalid' decline reasons you should filter out, like the 'no longer have' reason. Sure, ML algorithms may be resilient to outliers, but it is best to clean the data to avoid any (slight) biases.
I am seeing some of the game details in the offer detail api, but I didn't get to check if that data comes from the historical state of the game or the offer data is joined with the current state of the game data.
A trade-off has to be made: either use less but more recent/accurate data, or use more but more outdated data.
-> Collect all users from https://barter.vg/u/json
-> Collect all offer IDs from https://bartervg.com/u/<USER_HEX_ID>/o/json/
-> Collect all offer details from https://bartervg.com/u/<USER_HEX_ID>/o/<TRADE_OFFER_ID>/json/
I think you can minimize the data-collecting with logbase2 by skipping the same trades from the other side perspective.
Yes, I am doing that, but I think it is not log2; it just halves the data collection.
I agree that there can be some cases where the decline doesn't make any sense (such as your example) but the ML algorithms should be strong enough to be not fooled by those, as long as we have enough data to realize that it was an outlier.
Yes, barter had a point. There are some 'invalid' decline reasons you should filter out, like the 'no longer have' reason. Sure, ML algorithms may be resilient to outliers, but it is best to clean the data to avoid any (slight) biases.
That's right, but I still don't think that should be part of the data dump; it should be a data preprocessing step.
In my opinion, even the offers which were rejected with the reason "no longer have" carry some information. If I offer you Game A for Game B and you reject because you "no longer have" it, by making the offer I am still creating a row in the dataset saying B is more valuable than A. Ignoring your rejection is probably a good idea, but we can still use the fact that I made the offer.
I don't get how that makes B more valuable than A. With "no longer have" you don't know if the trade offer would have been declined or accepted.
By making an offer to you saying "give me B and I will give you A", I am intrinsically claiming B is more valuable than A (for me). When you reject because of "no longer have", we don't get any information from you that helps compare the values of A and B.
EDIT: If you check the equations in my initial proposal, for every offer we create two rows: one for the offerer's evaluation and one for the receiver's evaluation. In this case we still have the offerer's evaluation, but we won't be able to create the second equation, the one for the receiver's evaluation.
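The asymmetric handling described above could look like this sketch: every offer contributes the offerer's row, while the receiver's row is emitted only when the outcome actually reflects a value judgement. The status and reason strings, and the feature fields, are assumptions for illustration:

```python
# Decline reasons that invalidate the receiver's evaluation (assumed strings).
NON_VALUE_REASONS = {"no longer have", "already own"}

def rows_from_offer(offer):
    """Turn one offer into training rows per the two-equation scheme."""
    s, r = offer["sender_features"], offer["receiver_features"]
    rows = [(s + r, True)]                # offerer's equation: always TRUE
    accepted = offer["status"] == "completed"
    if accepted or offer.get("decline_reason") not in NON_VALUE_REASONS:
        rows.append((r + s, accepted))    # receiver's equation, when meaningful
    return rows
```

A "no longer have" decline thus yields only the offerer's row, while a "not worth it to me" decline yields both rows.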
By making an offer
Excellent point. This would mean that even expired offers could provide some information.
By making an offer
Excellent point. This would mean that even expired offers could provide some information.
Yes, though I have to admit that I didn't consider expired offers; that is a good catch. That is why I still think the data dump should be as comprehensive as possible.
Also, I have been running the data collection for over 16 hours now and I still haven't reached 100k offers. So it would be great to know if there will be a data dump soon; otherwise, getting all the offers will probably take me weeks or months.
Also I have been running the data collection for over 16 hours
There was a massive traffic spike around 10 hours ago. Right now though, no problems.
if we will have some data dump soon
What would that look like? There are ~3M offers; however, ~1M are cancelled, and I assume cancelled offers have no value and can be excluded. That leaves ~2M offers in this https://bartervg.com/u/<USER_HEX_ID>/o/<TRADE_OFFER_ID>/json/ format? Combined into one big file to download?
I guess we can ignore cancelled offers. It can be one file, or if it is too large to handle, some partitioning would also work. The offers I have downloaded so far are about ~5 KB/offer, so I assume the dump will be about 10 GB.
P.S.: I just introduced some parallelization to the code, and now it is going way, way faster. I will let it run for an hour and recalculate how long it would take to download everything. So if creating the data dump will take much time, don't bother for the moment.
The parallelized version of the data collector downloads ~100K offers an hour, so I should have the data ready by tomorrow (hopefully). I hope I am not creating spikes on your servers.
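One way the parallelized collector could be structured is with a bounded thread pool, which keeps the concurrency (and the load on the server) capped. This is a sketch; the worker count is a guess, and the endpoint URL is the one quoted earlier in this thread:

```python
import json
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

def fetch_offer(pair):
    """Download one offer's JSON; pair is (user_hex, offer_id)."""
    user_hex, offer_id = pair
    url = f"https://bartervg.com/u/{user_hex}/o/{offer_id}/json/"
    with urlopen(url) as resp:
        return json.load(resp)

def fetch_all(pairs, fetch=fetch_offer, workers=8):
    """Fetch offers concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch, pairs))
```

Taking the fetch function as a parameter also makes the pipeline easy to test without hitting the network.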
I am done downloading the offers ( ~1.8 million). Now I can work on them.