Introducing BarterValue for games

Open osmanyucel opened this issue 4 years ago • 87 comments

What problem does this feature address?

As a casual trader, it is difficult to assess an offer or compare game values. This causes problems in 2 ways:

  • I either make offers which are too greedy:
    • Which are immediately rejected by traders who realize it is greedy
    • Or accepted by novice traders, which creates unfair treatment against them
  • Or I make offers which are unfair against myself, by offering a valuable game for a low value game.

The other way to get out of this problem is to learn how to value games as a trader, but that requires a lot of time investment.

Describe a solution

My proposal is to create a BarterValue for every game. While these values can be calculated by many machine learning approaches, my initial proposal is applying a simple Logistic Regression algorithm. Initial design can be found below:

  • We use the features of the game, such as number of wish listed, number of tradeables, current steam price, lowest ever price, was ever given away, steam rating, steam review count, etc.
    • This list can be improved by including grey market prices.
  • We also use the trader features, such as number of completed offers, number of received and declined offers, number of sent and declined offers

For the sake of simplicity, my description uses only 2 values per game, nW (number of wishlists) and nT (number of tradeables), and 2 values per trader, nS (number of sent offers) and nR (number of received offers).

The assumption we follow is if user A offered game X for game Y to user B and the offer is accepted we interpret that as:

  • User A says Value(X)>=Value(Y), since they offered the trade
  • User B says Value(Y)>=Value(X), since they accepted the trade

On the other hand, if user A offered game W for game Z to user B and the offer is rejected we interpret that as:

  • User A says Value(W)>=Value(Z), since they offered the trade
  • User B says Value(Z)<Value(W), since they rejected the trade

With these assumptions we create data rows as

nS(A),nR(A), nW(X), nT(X),nS(B),nR(B), nW(Y), nT(Y) -> TRUE
nS(B),nR(B), nW(Y), nT(Y),nS(A),nR(A), nW(X), nT(X) -> TRUE
nS(A),nR(A), nW(W), nT(W),nS(B),nR(B), nW(Z), nT(Z) -> TRUE
nS(B),nR(B), nW(Z), nT(Z),nS(A),nR(A), nW(W), nT(W) -> FALSE
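
A minimal sketch of how these rows could be built from one offer. The function name, the dictionary layout, and all numbers are made up for illustration; only the column order and the TRUE/FALSE labels come from the rows above.

```python
from typing import Dict, List, Tuple

# Hypothetical feature stores; keys and numbers are invented for illustration.
TraderFeatures = Dict[str, Tuple[int, int]]  # trader -> (nS, nR)
GameFeatures = Dict[str, Tuple[int, int]]    # game   -> (nW, nT)
Row = Tuple[List[int], bool]


def rows_for_offer(
    sender: str,
    receiver: str,
    offered: str,
    requested: str,
    accepted: bool,
    traders: TraderFeatures,
    games: GameFeatures,
) -> List[Row]:
    """Return the two labeled rows that a single offer contributes."""
    sender_side = [*traders[sender], *games[offered]]
    receiver_side = [*traders[receiver], *games[requested]]
    return [
        # The sender asserts the inequality simply by making the offer.
        (sender_side + receiver_side, True),
        # The receiver asserts the mirrored inequality only by accepting.
        (receiver_side + sender_side, accepted),
    ]


traders = {"A": (10, 4), "B": (3, 7)}
games = {"W": (50, 20), "Z": (5, 80)}
rejected_rows = rows_for_offer("A", "B", "W", "Z", False, traders, games)
```

Here `rejected_rows` corresponds to the last two rows above: the sender's row is labeled TRUE and the receiver's mirrored row is labeled FALSE.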

We then run a simple logistic regression. Just to give the equations for the rejected trade above:

sig(C0 + C1s*nS(A) + C2s*nR(A) + C3s*nW(W) + C4s*nT(W) - C1r*nS(B) - C2r*nR(B) - C3r*nW(Z) - C4r*nT(Z)) = 1
sig(C0 + C1s*nS(B) + C2s*nR(B) + C3s*nW(Z) + C4s*nT(Z) - C1r*nS(A) - C2r*nR(A) - C3r*nW(W) - C4r*nT(W)) = 0

In the equations above, C1s is the coefficient assigned to the number of sent offers for the sender, and C1r is the coefficient for the number of sent offers for the receiver. Since we want those coefficients to be almost the same, we can add a regularization step to make sure they don't drift too far apart.
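
One possible way to write the training objective with such a regularization step is sketched below. The squared-difference penalty and its weight \lambda are assumptions for illustration; the proposal only asks that the sender-side and receiver-side coefficients stay close. Here f_k and g_k stand for the four features (nS, nR, nW, nT) of the first and second side of row i, and y_i is the row's TRUE/FALSE label written as 1/0.

```latex
z_i = C_0 + \sum_{k=1}^{4} C_{ks}\, f_k^{(i)} - \sum_{k=1}^{4} C_{kr}\, g_k^{(i)}

\mathcal{L} = -\sum_i \Big[ y_i \log \sigma(z_i) + (1 - y_i) \log\big(1 - \sigma(z_i)\big) \Big]
              + \lambda \sum_{k=1}^{4} \big( C_{ks} - C_{kr} \big)^2
```

A larger \lambda would pull each pair C_{ks}, C_{kr} closer together, at the cost of fitting the trade data less tightly.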

As soon as we have our logistic regression trained and we have the coefficients, we can easily calculate the value of a game by using its features. For example, the value of game X becomes: C3*nW(X)+C4*nT(X)
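
As a rough sketch, the value calculation for the simplified two-feature case could look like this. The coefficient values are invented purely for illustration (no model has been trained here), and `barter_value` is a hypothetical helper name.

```python
def barter_value(n_wishlist: int, n_tradeable: int, c_wish: float, c_trade: float) -> float:
    """Value of a game from its two illustrative features (nW and nT)."""
    return c_wish * n_wishlist + c_trade * n_tradeable


# Made-up coefficients: wishlists raise the value, tradeable copies lower it.
C_WISH, C_TRADE = 0.5, -0.25

value_x = barter_value(120, 30, C_WISH, C_TRADE)   # heavily wishlisted, scarce
value_y = barter_value(40, 200, C_WISH, C_TRADE)   # rarely wishlisted, abundant
x_is_worth_more = value_x > value_y
```

With these invented numbers, the scarce and heavily wishlisted game comes out ahead, which is the kind of comparison a casual trader could use to sanity-check an offer.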

Having this feature would not only help people make easier evaluations, but it could also be used to generate the fairest offers on BarterVG automatically.

Examples of similar features

Similar approaches have been used for predicting real estate prices.

osmanyucel avatar Apr 15 '21 21:04 osmanyucel

I had a similar idea: using machine learning, predict how likely a proposed trade is going to be accepted. The problem for that though, is there is no historical game data with past trades. Meaning the value/properties of a game in a past trade can be changed by now.

Anyway, a lot of data is currently available through barter's API. Coincidentally, I'm developing some visual analytics tools for barter.vg data for a uni project (whose deadline is this Sunday, btw). If you want, we can collab and develop something nice over the weekend. https://game-data-explorer.glitch.me/

Revadike avatar Apr 15 '21 22:04 Revadike

We've had this discussion before multiple times and it comes up again every once in a while.

"Value" is not a hard and fast number that can be calculated; people take into account so many factors and may even determine value subjectively. Besides, your proposal doesn't address potential arguments such as "why is this counted/why is this not counted?" and the issue of calculations being potentially "wrong", causing annoyance for "veteran traders" and new users whose trades constantly get declined.

If you want to have a "value tracker", it should be at worst opt-in, or preferably via a userscript.

antigravities avatar Apr 16 '21 00:04 antigravities

I had a similar idea: using machine learning, predict how likely a proposed trade is going to be accepted. The problem for that though, is there is no historical game data with past trades. Meaning the value/properties of a game in a past trade can be changed by now.

That's a good point, I haven't considered that. I think for the proof of concept, we can assume the properties of the games are considered constant at the time of training. I know this is a big assumption, but for now I don't have a better solution.

Anyway, a lot of data is currently available through barter's API. Coincidentally, I'm developing some visual analytics tools for barter.vg data for a uni project (whose deadline is this Sunday, btw). If you want, we can collab and develop something nice over the weekend. https://game-data-explorer.glitch.me/

I will check the API and what can be done for the training.

We've had this discussion before multiple times and it comes up again every once in a while.

"Value" is not a hard and fast number that can be calculated; people take into account so many factors and may even determine value subjectively. Besides, your proposal doesn't address potential arguments such as "why is this counted/why is this not counted?" and the issue of calculations being potentially "wrong", causing annoyance for "veteran traders" and new users whose trades constantly get declined.

I am not sure what you mean by "why this is counted/not counted". About the veteran/novice users, a side product of our model will be the user properties and their behaviors. Those values can be used in a future iteration. About the subjectivity concern, since the model is trained using the accumulated trade data, it will be as objective as possible. The subjectivity will always be an aspect, but it is also an aspect in the real world, and we should be able to get an objective value to get a common ground.

EDIT : this will also pre-eliminate a lot of trades which are destined to be declined, so it will save a lot of annoyance to both new and veteran traders.

If you want to have a "value tracker", it should be at worst opt-in, or preferably via a userscript.

I work as a backend developer, so having this as an opt-in option or userscript is not an area that I am an expert on. But as my personal opinion the more common this feature is, the more useful it will be. Because the main reason I made this proposal is for getting a common ground for all the users to evaluate games.

osmanyucel avatar Apr 16 '21 01:04 osmanyucel

Re: @antigravities I think you should see it as a tool. It's equivalent to asking a third party trader what (s)he thinks of the trade. In this case, it's the ML model's opinion.

Revadike avatar Apr 16 '21 02:04 Revadike

I am not sure what you mean by "why this is counted/not counted".

Someone could say "why is [metric I consider to be important in a trade] not counted in the 'BarterValue' calculator?", i.e. "why is the price of the game on my obscure store not counted?"

About the veteran/novice users, a side product of our model will be the user properties and their behaviors. Those values can be used in a future iteration. About the subjectivity concern, since the model is trained using the accumulated trade data, it will be as objective as possible. The subjectivity will always be an aspect, but it is also an aspect in the real world, and we should be able to get an objective value to get a common ground.

EDIT : this will also pre-eliminate a lot of trades which are destined to be declined, so it will save a lot of annoyance to both new and veteran traders.

If you want to have a "value tracker", it should be at worst opt-in, or preferably via a userscript.

I work as a backend developer, so having this as an opt-in option or userscript is not an area that I am an expert on. But as my personal opinion the more common this feature is, the more useful it will be. Because the main reason I made this proposal is for getting a common ground for all the users to evaluate games.

It doesn't matter how "objective" your trade metric is. How each individual person values each individual item is sentimental, subjective and mutable. I can guarantee you that how you value your copy of Pesterquest is entirely different to how I or anyone else values theirs.

Re: @antigravities I think you should see it as a tool. It's equivalent to asking a third party trader what (s)he thinks of the trade. In this case, it's the ML model's opinion.

If it's built in to the site, it will be seen by new users as the "objective" and "only" way to determine game value, regardless of what the intention is.

antigravities avatar Apr 16 '21 04:04 antigravities

I am not sure what you mean by "why this is counted/not counted".

Someone could say "why is [metric I consider to be important in a trade] not counted in the 'BarterValue' calculator?", i.e. "why is the price of the game on my obscure store not counted?"

For the internal data, my approach to machine learning is, just feed all the data you have, and let the algorithm decide what is important and what is not.

For the data from external stores, I believe the same rule applies with some extra steps. If we can set the system up so that introducing new data is easy, we can try their obscure store prices. If the machine learning algorithm says they are helpful, great; if it says they are not helpful, there is your answer to the people asking why their store is not used.

About the veteran/novice users, a side product of our model will be the user properties and their behaviors. Those values can be used in a future iteration. About the subjectivity concern, since the model is trained using the accumulated trade data, it will be as objective as possible. The subjectivity will always be an aspect, but it is also an aspect in the real world, and we should be able to get an objective value to get a common ground. EDIT : this will also pre-eliminate a lot of trades which are destined to be declined, so it will save a lot of annoyance to both new and veteran traders.

If you want to have a "value tracker", it should be at worst opt-in, or preferably via a userscript.

I work as a backend developer, so having this as an opt-in option or userscript is not an area that I am an expert on. But as my personal opinion the more common this feature is, the more useful it will be. Because the main reason I made this proposal is for getting a common ground for all the users to evaluate games.

It doesn't matter how "objective" your trade metric is. How each individual person values each individual item is sentimental, subjective and mutable. I can guarantee you that how you value your copy of Pesterquest is entirely different to how I or anyone else values theirs.

Re: @antigravities I think you should see it as a tool. It's equivalent to asking a third party trader what (s)he thinks of the trade. In this case, it's the ML model's opinion.

If it's built in to the site, it will be seen by new users as the "objective" and "only" way to determine game value, regardless of what the intention is.

About the objectivity/subjectivity concern, I definitely agree with you that the value of a game will differ for every person. But, in my opinion, the same thing applies to the real world as well. For example, the value of an apple changes in a person's perception based on whether they are vegan, how hungry they are, whether they are allergic to apples, and so on... But that doesn't make having a market price on apples any less useful.

osmanyucel avatar Apr 16 '21 06:04 osmanyucel

If it's built in to the site, it will be seen by new users as the "objective" and "only" way to determine game value, regardless of what the intention is.

This just depends on how it's implemented. It sure can be implemented that way, but it can also be implemented in such a way it's only suggestive.

Revadike avatar Apr 16 '21 12:04 Revadike

Relevant steam discussion: https://steamcommunity.com/groups/bartervg/discussions/0/405692758726921809

Revadike avatar Apr 16 '21 12:04 Revadike

I started working on a simple model (even if it ends up being thrown away, I am fine with it), but the data collection is quite slow since I have to hit the API for every offer. Is there a bulk API that I can get the offer details from?

osmanyucel avatar Apr 18 '21 08:04 osmanyucel

We ran into this problem too. There's only these (using cdn domain):

  • Most recent global trades: https://bartervg.com/o/json/
  • User trade list: https://bartervg.com/u/<USER_HEX_ID>/o/json/
  • User trade: https://bartervg.com/u/<USER_HEX_ID>/o/<TRADE_OFFER_ID>/json/

Revadike avatar Apr 18 '21 13:04 Revadike

Which offers would you like, completed (312760) and declined (1367632) offers? Or declined due to a specific reason, such as not worth it to me (299169)?

bartervg avatar Apr 18 '21 13:04 bartervg

Do declined include countered?

Revadike avatar Apr 18 '21 13:04 Revadike

Yes, remember #251? There are 130333 declined offers with the reason countered.

bartervg avatar Apr 18 '21 13:04 bartervg

Good. A data dump for accepted and declined trades would be nice, with the games data included.

Revadike avatar Apr 18 '21 14:04 Revadike

accepted and declined trades

Accepted is a temporary status. It should either be completed or failed (or expired). I'm not sure how much of a difference it would make, but not all accepted offers lead to completed offers.

If I understand correctly, most declined offers should not be used as inputs. There isn't enough information if someone declines without a reason. There is the wrong information if someone declines due to already own or no longer have. These declines do not reflect the offer recipient's sense of value.

Silly example: You offer Cyberpunk 2077 in exchange for my The Haunted Island, a Frog Detective Game. I decline because, whoops, I forgot to update my tradable collection and no longer have The Haunted Island, a Frog Detective to trade. If the model uses this offer, it will naively compute that the value of The Haunted Island, a Frog Detective Game is greater than Cyberpunk 2077 (either to me, or more dangerously, in general).
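
A minimal sketch of that filtering step. The `status` and `reason` field names and the exact reason strings are assumptions written the way this thread spells them; the real offer JSON may use different keys or codes.

```python
from typing import Dict, List, Optional

# Reasons that say nothing about how the recipient values the games.
# Spelled as in this thread; the actual API values are not confirmed here.
NON_VALUE_REASONS = {"already own", "no longer have"}


def reflects_value(status: str, reason: Optional[str], keep_unexplained: bool = False) -> bool:
    """True if the recipient's response carries information about game value."""
    if status == "completed":
        return True
    if status == "declined":
        if reason is None:
            # A decline without a reason is ambiguous; excluded unless requested.
            return keep_unexplained
        return reason not in NON_VALUE_REASONS
    return False


offers: List[Dict[str, Optional[str]]] = [
    {"status": "completed", "reason": None},
    {"status": "declined", "reason": "not worth it to me"},
    {"status": "declined", "reason": "no longer have"},
    {"status": "declined", "reason": None},
]
usable = [o for o in offers if reflects_value(o["status"], o["reason"])]
```

The `keep_unexplained` switch is there so that both variants (with and without no-reason declines) can be compared rather than decided up front.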

bartervg avatar Apr 18 '21 14:04 bartervg

I had a similar idea: using machine learning, predict how likely a proposed trade is going to be accepted. The problem for that though, is there is no historical game data with past trades. Meaning the value/properties of a game in a past trade can be changed by now.

Do you have anything to address this? I did ask you, I think like a year ago, to start logging this historical data. I assume that still has not happened yet?

Revadike avatar Apr 18 '21 14:04 Revadike

https://github.com/bartervg/barter.vg/issues/128

bartervg avatar Apr 18 '21 15:04 bartervg

We ran into this problem too. There's only these (using cdn domain):

  • Most recent global trades: https://bartervg.com/o/json/
  • User trade list: https://bartervg.com/u/<USER_HEX_ID>/o/json/
  • User trade: https://bartervg.com/u/<USER_HEX_ID>/o/<TRADE_OFFER_ID>/json/

I am currently going the path of -> Collect all users from https://barter.vg/u/json -> Collect all offer IDs from https://bartervg.com/u/<USER_HEX_ID>/o/json/ -> Collect all offer details from https://bartervg.com/u/<USER_HEX_ID>/o/<TRADE_OFFER_ID>/json/
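
The three steps can be sketched roughly as follows. The URLs are the ones listed above; everything else is an assumption, in particular that each listing endpoint returns a JSON object keyed by ID. The fetcher is passed in as a parameter so the traversal can be tried without touching the network.

```python
from typing import Any, Callable, Dict, Iterator, Tuple

USERS_URL = "https://barter.vg/u/json"


def offer_list_url(user_hex_id: str) -> str:
    return f"https://bartervg.com/u/{user_hex_id}/o/json/"


def offer_url(user_hex_id: str, offer_id: str) -> str:
    return f"https://bartervg.com/u/{user_hex_id}/o/{offer_id}/json/"


def crawl(fetch_json: Callable[[str], Dict[str, Any]]) -> Iterator[Tuple[str, str, Dict[str, Any]]]:
    """Yield (user, offer id, offer details) for every offer of every user.

    Assumes (unverified) that each listing is a JSON object whose keys are IDs.
    """
    for user in fetch_json(USERS_URL):
        for offer_id in fetch_json(offer_list_url(user)):
            yield user, offer_id, fetch_json(offer_url(user, offer_id))


# Stand-in responses so the traversal can be exercised offline.
fake_api = {
    USERS_URL: {"a1": {}, "b2": {}},
    offer_list_url("a1"): {"10": {}},
    offer_list_url("b2"): {"10": {}, "11": {}},
    offer_url("a1", "10"): {"id": "10"},
    offer_url("b2", "10"): {"id": "10"},
    offer_url("b2", "11"): {"id": "11"},
}
collected = list(crawl(fake_api.__getitem__))
```

Against the live site, `fetch_json` could be a small wrapper around `urllib.request.urlopen` and `json.load`. Note that this naive traversal makes one detail request per offer per user.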

Which offers would you like, completed (312760) and declined (1367632) offers? Or declined due to a specific reason, such as not worth it to me (299169)?

I am marking `Completed` and `Accepted` as positive samples, and `Declined` as negative samples. I guess I am at the `the more data, the better` point.

accepted and declined trades

Accepted is a temporary status. It should either be completed or failed (or expired). I'm not sure how much of a difference it would make, but not all accepted offers lead to completed offers.

If I understand correctly, most declined offers should not be used as inputs. There isn't enough information if someone declines without a reason. There is the wrong information if someone declines due to already own or no longer have. These declines do not reflect the offer recipient's sense of value.

Silly example: You offer Cyberpunk 2077 in exchange for my The Haunted Island, a Frog Detective Game. I decline because, whoops, I forgot to update my tradable collection and no longer have The Haunted Island, a Frog Detective to trade. If the model uses this offer, it will naively compute that the value of The Haunted Island, a Frog Detective Game is greater than Cyberpunk 2077 (either to me, or more dangerously, in general).

I believe most people would just not set a reason because they are lazy. Even the declined offers without reasons give us a lot of information and they should be used. Of course, what I believe is not very important. What we should do is try a few different methods (including and excluding the no-reason declines) to see how accurate our model gets, and make this call based on what the output of the model says. I agree that there can be some cases where the decline doesn't make any sense (such as your example) but the ML algorithms should be strong enough to be not fooled by those, as long as we have enough data to realize that it was an outlier.

I had a similar idea: using machine learning, predict how likely a proposed trade is going to be accepted. The problem for that though, is there is no historical game data with past trades. Meaning the value/properties of a game in a past trade can be changed by now.

Do you have anything to address this? I did ask you, I think like a year ago, to start logging this historical data. I assume that still has not happened yet?

I am seeing some of the game details in the offer detail api, but I didn't get to check if that data comes from the historical state of the game or the offer data is joined with the current state of the game data.

osmanyucel avatar Apr 18 '21 18:04 osmanyucel

-> Collect all users from https://barter.vg/u/json -> Collect all offer IDs from https://bartervg.com/u/<USER_HEX_ID>/o/json/ -> Collect all offer details from https://bartervg.com/u/<USER_HEX_ID>/o/<TRADE_OFFER_ID>/json/

I think you can minimize the data-collecting with logbase2 by skipping the same trades from the other side perspective.

Revadike avatar Apr 18 '21 18:04 Revadike

I agree that there can be some cases where the decline doesn't make any sense (such as your example) but the ML algorithms should be strong enough to be not fooled by those, as long as we have enough data to realize that it was an outlier.

Yes, barter had a point. There are some 'invalid' decline reasons you should filter out, like the "no longer have" reason. Sure, ML algorithms may be resilient to outliers, but it is best to clean the data to avoid any (slight) biases.

Revadike avatar Apr 18 '21 18:04 Revadike

I am seeing some of the game details in the offer detail api, but I didn't get to check if that data comes from the historical state of the game or the offer data is joined with the current state of the game data.

A trade-off has to be made between using less, but more recent/accurate, data and using more, but more outdated, data.

Revadike avatar Apr 18 '21 18:04 Revadike

-> Collect all users from https://barter.vg/u/json -> Collect all offer IDs from https://bartervg.com/u/<USER_HEX_ID>/o/json/ -> Collect all offer details from https://bartervg.com/u/<USER_HEX_ID>/o/<TRADE_OFFER_ID>/json/

I think you can minimize the data-collecting with logbase2 by skipping the same trades from the other side perspective.

Yes, I am doing that, but I think it is not log2, it just halves the data collection.
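
A small sketch of why this is a halving: every offer shows up once in the sender's list and once in the receiver's, so skipping IDs that were already seen removes at most one of the two detail requests per offer. It assumes offer IDs are unique across users; the user names and IDs below are made up.

```python
from typing import Dict, Iterator, List, Tuple


def unique_offer_requests(offers_by_user: Dict[str, List[int]]) -> Iterator[Tuple[str, int]]:
    """Yield one (user, offer id) pair per distinct offer, skipping repeats."""
    seen = set()
    for user, offer_ids in offers_by_user.items():
        for offer_id in offer_ids:
            if offer_id not in seen:
                seen.add(offer_id)
                yield user, offer_id


# Each offer appears in exactly two users' lists (sender and receiver).
offers_by_user = {"a": [1, 2], "b": [1, 3], "c": [2, 3]}
naive_requests = sum(len(ids) for ids in offers_by_user.values())
deduplicated = list(unique_offer_requests(offers_by_user))
```

In this toy example the six listed offers collapse to three detail requests, i.e. half, not a logarithmic reduction.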

I agree that there can be some cases where the decline doesn't make any sense (such as your example) but the ML algorithms should be strong enough to be not fooled by those, as long as we have enough data to realize that it was an outlier.

Yes, barter had a point. There are some 'invalid' decline reasons you should filter out, like the "no longer have" reason. Sure, ML algorithms may be resilient to outliers, but it is best to clean the data to avoid any (slight) biases.

That's right, but I still don't think that should be part of the data dump. It should be a data preprocessing step.

In my opinion, even the offers which were rejected with the reason "no longer have" carry some information. If I offer you Game A for Game B and you reject because "no longer have", by making the offer I am still creating a row in the dataset saying B is more valuable than A. Probably ignoring your rejection is a good idea, but we can still use the fact that I am offering.

osmanyucel avatar Apr 18 '21 23:04 osmanyucel

I don't get how that makes B more valuable than A. With "no longer have" you don't know if a trade offer would have been declined or accepted.

Revadike avatar Apr 18 '21 23:04 Revadike

By making an offer to you and saying give me B and I will give you A, I am intrinsically claiming B is more valuable than A (for me). When you reject because of "no longer have" we don't get any information from you which helps to compare values of A and B.

EDIT : If you check the equations from my initial proposal, for every offer we create 2 rows: 1 for the offerer's evaluation, and one for receiver's evaluation. In this case we still have the offerer's evaluation. But we won't be able to create the second equation, which is for receiver's evaluation.
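
Sketched in code, the idea is that the offerer's row is always emitted, while the receiver's row is emitted only when their response says something about value. The status and reason strings are assumptions spelled as in this thread, and `labels_for_offer` is a hypothetical helper.

```python
from typing import Optional, Tuple

# Reason strings follow the thread's wording; the real API values may differ.
UNINFORMATIVE_REASONS = {"no longer have", "already own"}


def labels_for_offer(status: str, reason: Optional[str] = None) -> Tuple[bool, Optional[bool]]:
    """Return (offerer_label, receiver_label); None means "emit no row"."""
    offerer_label = True  # making the offer is itself a value claim
    if status == "completed":
        return offerer_label, True
    if status == "declined" and reason not in UNINFORMATIVE_REASONS:
        # Includes declines without a reason, which this sketch treats as informative.
        return offerer_label, False
    # The response tells us nothing about value: keep only the offerer's row.
    return offerer_label, None


no_longer_have = labels_for_offer("declined", "no longer have")
```

For the "no longer have" case this yields the first equation only, with the second one left out.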

osmanyucel avatar Apr 18 '21 23:04 osmanyucel

By making an offer

Excellent point. This would mean that even expired offers could provide some information.

bartervg avatar Apr 19 '21 00:04 bartervg

By making an offer

Excellent point. This would mean that even expired offers could provide some information.

Yes, though I have to admit that I didn't consider expired offers, that is a good catch. That is why I still think the data dump has to be as comprehensive as possible.

Also, I have been running the data collection for over 16 hours now and I still haven't reached 100k offers. So it would be great to know if we will have some data dump soon. Otherwise getting all the offers will probably take weeks/months for me.

osmanyucel avatar Apr 19 '21 01:04 osmanyucel

Also I have been running the data collection for over 16 hours

There was a massive traffic spike around 10 hours ago. Right now though, no problems.

if we will have some data dump soon

What would that look like? There are ~3M offers. However, ~1M are cancelled, and I assume cancelled have no value and can be excluded. Therefore ~2M offers in this https://bartervg.com/u/<USER_HEX_ID>/o/<TRADE_OFFER_ID>/json/ format? Combined into one big file to download?

bartervg avatar Apr 19 '21 01:04 bartervg

I guess we can ignore cancelled. It can be one file, or if it is too large to handle, some partitioning would also work. Offers I have downloaded so far are about ~5 KB/offer, so I assume that the dump will be about 10 GB.
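
The back-of-the-envelope arithmetic behind that estimate, using the ~2M non-cancelled offers mentioned above and decimal units (1 GB = 1,000,000 KB). The function name is just for illustration.

```python
def estimate_dump_gb(offer_count: int, kb_per_offer: float) -> float:
    """Rough dump size in GB, using decimal units (1 GB = 1,000,000 KB)."""
    return offer_count * kb_per_offer / 1_000_000


dump_gb = estimate_dump_gb(2_000_000, 5)
print(dump_gb)  # 10.0
```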

P.S : I just introduced some parallelization to the code, and now it is going way, way faster. I will let it run for an hour and recalculate how long it would take to download everything. So if it will take much time to create the data dump, don't bother at the moment.

osmanyucel avatar Apr 19 '21 01:04 osmanyucel

Parallelized version of the data collector downloads ~100K offers an hour. So I should have the data ready by tomorrow (hopefully). I hope I am not creating spikes in your servers.
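
The thread doesn't say how the collector was parallelized; one common approach, shown here purely as an assumption, is a small thread pool around the per-offer request. The `fetch` argument is any function that takes a URL and returns the parsed offer, and a modest `max_workers` keeps the load on the server bounded.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def fetch_all(urls: Iterable[str], fetch: Callable[[str], T], max_workers: int = 8) -> List[T]:
    """Fetch every URL concurrently and return the results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, urls))


def hours_needed(total_offers: int, offers_per_hour: int) -> float:
    """Time to download everything at a steady rate."""
    return total_offers / offers_per_hour


# Offline stand-in for a real HTTP fetcher.
results = fetch_all(["a", "b", "c"], str.upper, max_workers=2)
eta_hours = hours_needed(2_000_000, 100_000)
```

At ~100K offers an hour, the ~2M non-cancelled offers work out to roughly 20 hours, which matches the "by tomorrow" estimate.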

osmanyucel avatar Apr 19 '21 06:04 osmanyucel

I am done downloading the offers ( ~1.8 million). Now I can work on them.

osmanyucel avatar Apr 20 '21 02:04 osmanyucel