Analyzing and predicting virtual goods auctions in an online game

Written by Dr. Ulrich Sigmund on Monday, 22 April 2013.

Many massively multiplayer online games have their own virtual economy in which virtual goods are traded for a virtual (or sometimes real) currency. These trades happen either directly, e.g. player to player or player to NPC, or anonymously through an auction. The latter allows price discovery and represents an almost perfect market. A distinguishing feature of these virtual auctions is the limited set of merchandise.

In this article I will look at the virtual goods auctions in the browser game Horizon. The game is set in a sci-fi space environment, and the goods in question are ancient alien devices. These devices can be found or harvested in various ways and may be traded among players directly or through a common, anonymous auction place.

ScreenshotAuction

It is important for players as well as the game administrators to know the value of these goods, and with the help of the auction results it should be possible to predict the outcome of an auction before it is started.

Getting the Data

Unfortunately, the game keeps no log of auction results; only user interactions are logged. The game server stores a complete backup of the game's state, including all user account data, every 20 minutes, and these backups are kept for debugging and recovery purposes for approximately 14 days. We will therefore harvest these state files for auction results. The game code dates back to 2004, and the state files had to be extended frequently in the past, so they are now in a hard-to-parse, non-standard text form. The ANKHOR lexer and EBNF parser were used to build a grammar that parses these files and extracts the interesting information.

After parsing the state files for the full two weeks (11.5 GByte), the sheet had harvested 2550 auction entries, which yielded 155 completed auctions.

A quick glance at the data shows that there are significantly more players selling than buying items. This is expected considering that the number of hardcore collectors is small.

Sellers Stats Buyers Stats
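The seller and buyer statistics can be reproduced with a simple tally over the completed auctions. The following is a minimal sketch, not the ANKHOR sheet used for this article; the record layout with `seller` and `buyer` keys and the player names in the usage note are assumptions made for illustration.

```python
from collections import Counter

def participant_stats(completed_auctions):
    """Count how many completed auctions each player took part in.

    `completed_auctions` is assumed to be a list of dicts with the
    hypothetical keys "seller" and "buyer".
    """
    sellers = Counter(a["seller"] for a in completed_auctions)
    buyers = Counter(a["buyer"] for a in completed_auctions)
    return sellers, buyers
```

Calling `participant_stats` on records such as `{"seller": "alice", "buyer": "dana"}` returns two counters; comparing `len(sellers)` with `len(buyers)` gives the number of distinct players on each side of the market.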

Analyzing the Data

Each traded device is a combination of five features: a quality level that specifies the power of the artifact, a color that provides factors when combining it with other devices, an alien race that provides some basic powers, a type that defines what the device does (e.g. generating energy or helping to construct buildings), and an extension that has a positive or negative effect on other buildings or units in the player's account.

Each feature will be used to predict the device's value. The simple assumption we make is that the price is a product of factors determined by the features, so we will use a factorizing analysis. A decision tree would not give us a continuous price spectrum and would result in an overfitted predictor because of the small number of training auctions compared to the number of possible items. A linear regression is problematic because all features are discrete.
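To make the multiplicative assumption concrete, here is a minimal sketch of one way to fit such a model. It is not the ANKHOR operator used for this article: it takes the logarithm of each price, so that the product of factors becomes a sum, and then repeatedly sets each feature value's log-factor to the mean residual of the auctions that share that value (a simple backfitting scheme). The function names, the dict-based record layout and the `price` key are assumptions for illustration.

```python
import math
from collections import defaultdict

def fit_factors(auctions, features, rounds=50):
    """Fit price ~ base * product of one factor per feature value.

    `auctions` is assumed to be a list of dicts holding a positive
    "price" and one entry per name in `features`.
    """
    logs = [math.log(a["price"]) for a in auctions]
    base = sum(logs) / len(logs)  # log of the geometric mean price
    # One log-factor per (feature, value); 0.0 means a neutral factor of 1.
    log_f = {f: defaultdict(float) for f in features}
    for _ in range(rounds):
        for f in features:
            sums, counts = defaultdict(float), defaultdict(int)
            for a, lp in zip(auctions, logs):
                # Residual after removing the base and all other features.
                other = sum(log_f[g][a[g]] for g in features if g != f)
                sums[a[f]] += lp - base - other
                counts[a[f]] += 1
            for value in sums:
                log_f[f][value] = sums[value] / counts[value]
    factors = {f: {v: math.exp(x) for v, x in log_f[f].items()}
               for f in features}
    return math.exp(base), factors

def predict(base, factors, item):
    """Predict a price; feature values never seen in training count as 1."""
    price = base
    for f, table in factors.items():
        price *= table.get(item[f], 1.0)
    return price
```

The factors of a single feature are only meaningful relative to each other, so it is the ratio between two values of the same feature (for example two quality levels) that should be read off, not the absolute numbers.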

Using the prediction on the training set and the geometric mean of the modified price for each device results in a root mean square log error of 0.45, which is much better than the currently built-in prediction, which has 3.05.
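For readers who want to reproduce this kind of score, the sketch below shows one common definition of a root mean square log error together with a geometric mean. The article does not state which logarithm base or exact variant was used, so the natural logarithm of the plain price ratio is an assumption here, as are the function names.

```python
import math

def geometric_mean(prices):
    """Geometric mean of positive prices, computed via the mean of logs."""
    return math.exp(sum(math.log(p) for p in prices) / len(prices))

def rmsle(predicted, actual):
    """Root mean square of log(predicted / actual), natural log assumed."""
    squared = [(math.log(p) - math.log(a)) ** 2
               for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))
```

Under the natural-log assumption, an error of e translates into a typical multiplicative deviation of exp(e): 0.45 would mean predictions are off by a factor of roughly 1.6, while 3.05 would mean a factor of roughly 21.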

Factorize Graph

We will now look at the individual features.

The first feature we look at is the device quality. It is possible to craft more powerful devices by merging three similar devices of a lower quality level into one. Thus we would expect each level to be three times as expensive as the level below it.

Factors Quality

While this assumption holds for the last step (there are more valuable items in the game, but they are not traded at the current state of the universe), we observe a factor of roughly two for the lower levels. The most likely reason for this unexpected valuation is that lower-quality devices can provide some features to higher-quality devices when used in crafting.
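The comparison between the naive crafting expectation and the fitted quality factors amounts to taking the ratio between neighbouring levels. The sketch below shows that arithmetic; the level numbers and factor values are invented purely to mirror the pattern described above (roughly two for the lower steps, three for the last one) and are not the measured results.

```python
def step_ratios(level_factors):
    """Price ratio between each quality level and the level below it."""
    levels = sorted(level_factors)
    return [level_factors[hi] / level_factors[lo]
            for lo, hi in zip(levels, levels[1:])]

# Invented example factors, NOT the values measured from the auctions.
illustrative = {1: 1.0, 2: 2.0, 3: 4.0, 4: 12.0}
NAIVE_STEP = 3.0  # three lower-level devices merge into one

ratios = step_ratios(illustrative)
# Values below 1.0 mark steps that are cheaper than crafting would suggest.
relative_to_naive = [r / NAIVE_STEP for r in ratios]
```

A step ratio below three means that buying the higher-level device is cheaper than buying three devices of the level below it and merging them.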

The next feature is color. Color in itself has no direct value, but there are different saturations and six different hues. For crafting and combining one needs similar hues and saturations and for other game purposes a balancing of hues. Devices with a stronger color should thus be more valuable. The hue factor might be influenced by the total number of devices of this hue in the universe.

Colors Frequency

Colors Factors

This assumption seems to be correct. The more intense colors achieve a higher price, and the abundance of green devices reduces the price.

Each of the five races has a different bonus factor, so the price factor of a race reflects the players' preference for its type of bonus. This information is valuable when balancing those bonuses during game design so that each race ends up with similar power.

Factors Race

It appears that the players are willing to pay a premium for the “Diggren” race device, which provides a bonus for mining and similar activities. The factor of two compared to the other races indicates that some rebalancing should take place.

A similar argument can be made for the type of device, but here the value should be used to moderate the frequency of the device – resulting in a lower frequency for more valuable types.

Factors Type

It appears that players have a preference for three types of devices; the others are considered more or less equal.

Finally we have the device extension, and here we also see extensions with a negative effect: using such a device would have a negative impact on the player's account, which should reduce its value.

Factors Extension

The positive extensions (blue) show a slightly higher factor on average, but the difference is not as strong as would have been expected. There are various reasons why this might be the case. First, the small sample size compared to the large number of possible extensions might simply result in noise instead of knowledge. Second, players may not consider extensions very important, because they can be eliminated or replaced during crafting.