[Design Doc] Probabilistic Tile Map


So, we need to build the platforms on which the AI agents will stand. We will call a grid of these platforms a “plate,” taking inspiration from tectonic plates.
Pictorially, we will have the following setup, where a plate is a 2D grid of tiles that each hold a set of probabilities.
If the probability of water is high, the tile will most likely be a water tile; if the probability of dirt is high, it will most likely be a ground tile.
Given this set of probabilities and the neighboring tiles, our ML model will create a beautiful-looking tile that takes both the probabilities and the neighboring patches into account.
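As a minimal sketch of the plate idea (the terrain names and the per-tile dict representation are my own assumptions, not a settled design):

```python
import random

# Hypothetical sketch: each tile holds a probability distribution over
# terrain types, and we either sample a concrete tile or take the argmax.
TERRAIN_TYPES = ["water", "dirt", "grass", "stone"]  # assumed type names

def sample_tile(probs, rng=random):
    """Pick a terrain type from a {type: probability} dict."""
    types = list(probs)
    weights = [probs[t] for t in types]
    return rng.choices(types, weights=weights, k=1)[0]

def most_likely_tile(probs):
    """Deterministic variant: take the highest-probability type."""
    return max(probs, key=probs.get)

# A tiny 2x2 plate: mostly water on the top row, mostly dirt below.
plate = [
    [{"water": 0.9, "dirt": 0.1}, {"water": 0.8, "dirt": 0.2}],
    [{"water": 0.1, "dirt": 0.9}, {"water": 0.2, "dirt": 0.8}],
]

rendered = [[most_likely_tile(t) for t in row] for row in plate]
# rendered == [["water", "water"], ["dirt", "dirt"]]
```

In the real system the ML model, not an argmax, would turn these distributions into visuals, but the data structure would look roughly like this.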
So far, I only know how to do this with autoregressive models, not with Stable Diffusion, so I will probably try an autoregressive model first.
From my AI coursework, my understanding is that it is possible to put a ControlNet on top of a pre-trained Stable Diffusion model and fine-tune it using LoRA and generated data.
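One hedged sketch of how the probability grid could feed a ControlNet: since ControlNet expects an image-shaped conditioning input, we could encode per-tile probabilities as color channels and upsample the grid to the generator's resolution. The channel assignment and the helper name are assumptions for illustration, not a real training pipeline:

```python
import numpy as np

def probs_to_conditioning(prob_grid, tile_px=32):
    """Encode a (H, W, C) grid of per-tile probabilities (C <= 3) as a
    (H*tile_px, W*tile_px, 3) uint8 image usable as a conditioning input.
    Hypothetical encoding: one probability channel per color channel."""
    h, w, c = prob_grid.shape
    img = np.zeros((h, w, 3), dtype=np.float32)
    img[..., :c] = prob_grid
    # Upsample each tile to a tile_px x tile_px block of identical pixels.
    img = np.repeat(np.repeat(img, tile_px, axis=0), tile_px, axis=1)
    return (img * 255).astype(np.uint8)

grid = np.array([[[0.9, 0.1], [0.2, 0.8]]])  # 1x2 grid: (water, dirt) probs
cond = probs_to_conditioning(grid)
# cond.shape == (32, 64, 3)
```

The actual ControlNet/LoRA fine-tuning would happen in a framework like diffusers; this only shows one plausible shape for the conditioning signal.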
The hardest part of this, again, will be data collection. Collecting artificially created 2D game maps won’t be easy. What I’m going to do instead is try to generate them using AI tools.
First Step: Benchmark AI tools for tile map generation
Midjourney can generate some intriguing visuals with consistent designs. The composition is kinda ingrained in the model so it really generates something that looks like it is out of the movies.
ChatGPT can generate pixel-art style maps as well. This one has a little too much MapleStory vibe, though, and the composition is lacking compared to Midjourney.
Synthopic was a little hit or miss. It was hard to get it to generate traditional 2D platformer maps.
Label Generation
More important than generating stunning visuals is the ability to auto-label data. The data needs to be labeled with probabilities in order to fine-tune the model.
I tried a bunch of prompts on ChatGPT to see if it could segment images into different elements, but I haven’t had much success so far.
Midjourney has an Edit feature, but so far it doesn’t seem to do anything, at least in my tests.
Segment Anything seems to do a decent job at segmenting things, but labels are not generated automatically. Perhaps we can first run Segment Anything, and then separately label the patches using a CLIP model.
Alternatively, we can randomly crop the platformer map image and feed each crop into CLIP to get information on what the tile is. CLIP models are not all that good at tiny images, but this should be better than nothing. We can then cross-check the result against Segment Anything to see if the probabilities it assigned are accurate.
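The crop-and-classify idea can be sketched with a stand-in classifier. In the real pipeline the classifier would be a CLIP call; `toy_classifier` below is purely hypothetical, and exists only so the plumbing is concrete:

```python
def label_map(image, tile_px, classify_patch):
    """Slide a tile-sized window over the map image and ask the classifier
    for label probabilities at each position.
    image: 2D list of pixel rows; returns a grid of {label: prob} dicts."""
    h, w = len(image), len(image[0])
    grid = []
    for y in range(0, h, tile_px):
        row = []
        for x in range(0, w, tile_px):
            patch = [r[x:x + tile_px] for r in image[y:y + tile_px]]
            row.append(classify_patch(patch))
        grid.append(row)
    return grid

# Toy stand-in for CLIP: call anything bright "dirt", anything dark "water".
def toy_classifier(patch):
    flat = [p for r in patch for p in r]
    mean = sum(flat) / len(flat)
    return {"dirt": mean, "water": 1 - mean}

img = [[0.9, 0.9, 0.1, 0.1],
       [0.9, 0.9, 0.1, 0.1]]
labels = label_map(img, 2, toy_classifier)
# labels[0][0] is mostly "dirt", labels[0][1] is mostly "water"
```

Swapping `toy_classifier` for a real CLIP zero-shot scorer (and cross-checking against Segment Anything masks) is where the actual work would be.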
Edges
Let’s say we generated maps that all look good visually; detecting the platform edges that agents can walk on will be another challenge before this approach can be used.
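Assuming we already had per-tile labels, edge detection could be reduced to a grid rule. The solidity rule below is my own assumption (a tile is walkable if it is solid with empty space above; an edge is a walkable tile missing a walkable horizontal neighbor), sketched only to show the shape of the problem:

```python
SOLID = {"dirt", "stone", "grass"}  # assumed solid terrain types

def walkable(grid, y, x):
    """A tile is walkable if it is solid and the tile above it is not."""
    above_empty = y == 0 or grid[y - 1][x] not in SOLID
    return grid[y][x] in SOLID and above_empty

def platform_edges(grid):
    """Return (y, x) of walkable tiles lacking a walkable left or right neighbor."""
    h, w = len(grid), len(grid[0])
    edges = []
    for y in range(h):
        for x in range(w):
            if not walkable(grid, y, x):
                continue
            left = x > 0 and walkable(grid, y, x - 1)
            right = x < w - 1 and walkable(grid, y, x + 1)
            if not left or not right:
                edges.append((y, x))
    return edges

grid = [["air", "air", "air"],
        ["dirt", "dirt", "dirt"]]
# The two ends of the platform are edges:
# platform_edges(grid) == [(1, 0), (1, 2)]
```

The hard part is getting from a generated image to a labeled grid reliable enough for this rule to mean anything.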
Where do we go from here?
All in all, it looks like we will first need to hand-create the game, then use the game to fine-tune a model. If there is a game, then we can generate many different images and labels.
I think we need to scale down and simplify the problem. At this time, there is no good off-the-shelf AI solution to achieve this. We can, however, create stunning visuals and use them as the map. Perhaps we should focus on that.
