
1PL-Husar-1Esk
Posted (edited)
39 minutes ago, Aapje said:

Again, this is not true if you teach the AI to mimic human behavior. Then the reward function is how closely it matches the real behavior. And secondly, you don't necessarily have to judge individual actions, since you would know all the actions that belong to the same pilot, so you could judge their entire flight record.

 

You need to judge individual actions because it's combat; for each BFM or ACM problem there are correct and incorrect responses. A fight is about solving the problems your opponent creates and creating problems for the opponent.

That determines the entire fight. You can make good and bad decisions along the way and still win, lose or disengage (hence the saying that the one who makes fewer mistakes wins). How you and your opponent react benefits one or both of you; each action has its own value for the further outcome, it's a chain of actions or of not acting. To simulate human behavior you must artificially alter the AI so it doesn't always pick the best decision, or make it fail to execute it correctly.

 

 

Edited by 1PL-Husar-1Esk
1PL-Husar-1Esk
Posted (edited)
1 hour ago, Aapje said:

I see people here demanding realistic AI pilots, but I strongly doubt that this is actually representative of all or most SP players. People often think they want something, but then actually reject it in reality. An example is that (AFAIK) all shooters, even the ones that aim for realism like Arma and Hell Let Loose, have sped up reloads, because people tend to greatly dislike long reloads while they are getting shot at.

Bad example; HLL is not a realistic shooter, and people who choose that genre want action and Rambo gameplay: less communication, teamplay and tactics in favor of more individual skill. For example OWI, the owner of the milsim Squad game, made a change - the ICO - which produces a more realistic outcome from an unrealistic simulation. In virtual worlds you sometimes need to add an unrealistic / artificial layer to produce a realistic outcome. This is sometimes necessary because there are limits to what you can simulate when it comes to human behavior.

BTW, this change caused a huge outrage in the community, splitting it into two opposing camps, but in the end the game grew and the number of people playing rose.

 

Edited by 1PL-Husar-1Esk
Posted

I don't envy the people programming the AI. When they tweaked the AI a week or so ago, for the first few sorties I actually took hits on the merge and was confounded when the AI held its altitude--for a day or so. Then I adjusted. End result? Dead AI pilots.

 

I'm not a programmer. But to put the issue into a framework I understand as a historian, I imagine a flight instructor trying to teach a trainee how to take on Erich Hartmann. 

 

Am I right? 

Posted
2 hours ago, 1PL-Husar-1Esk said:

To simulate human behavior you must artificially alter the AI so it doesn't always pick the best decision, or make it fail to execute it correctly.

 

If you train the AI to mimic human players, it will also mimic human errors, in a realistic way (at least compared to those players). For example, a controlled flight into terrain would then not only be mostly limited to low level dogfights, but also would be more likely if you force the AI into situations where they have to choose between playing it safe and losing in the dogfight, or risking not being able to avoid the ground.

 

This is then an inherent feature of this kind of AI. It's not something that has to be added to it.

 

Quote

This change caused a huge outrage in the community, splitting it into two opposing camps, but in the end the game grew and the number of people playing rose.

 

This is exactly my point though. That game is already on the more realistic side and yet a change to make it more realistic caused a lot of players to be unhappy. My idea is that the game could cater to different desires by having a flexible AI that makes way more people happy and definitely plays in a way that is more realistic to what real people do than now.

1PL-Husar-1Esk
Posted
5 minutes ago, Aapje said:

 

If you train the AI to mimic human players, it will also mimic human errors, in a realistic way (at least compared to those players). For example, a controlled flight into terrain would then not only be mostly limited to low level dogfights, but also would be more likely if you force the AI into situations where they have to choose between playing it safe and losing in the dogfight, or risking not being able to avoid the ground.

 

This is then an inherent feature of this kind of AI. It's not something that has to be added to it.

But that is not the purpose of AI training; its purpose is to make the AI execute the best decision based on thousands of human examples. To mimic people's behavior you must do the opposite and make the AI choose badly. Why can people still win against an AI whose purpose is to win based on the best outcome? Because people observe and adapt - they learn on the fly. For people to have fun we need to judge individual actions and make the AI fail, and do it in a manner where people don't easily recognize the pattern.

18 minutes ago, Aapje said:

This is exactly my point though. That game is already on the more realistic side and yet a change to make it more realistic caused a lot of players to be unhappy. My idea is that the game could cater to different desires by having a flexible AI that makes way more people happy and definitely plays in a way that is more realistic to what real people do than now.

This could be done with different generations of AI. A new generation tends to be better at solving the problems, but sometimes the AI engineer needs to regress and alter the AI to make it even better.

Posted
4 hours ago, 1PL-Husar-1Esk said:

But that is not the purpose of AI training; its purpose is to make the AI execute the best decision based on thousands of human examples. To mimic people's behavior you must do the opposite and make the AI choose badly. Why can people still win against an AI whose purpose is to win based on the best outcome? Because people observe and adapt - they learn on the fly. For people to have fun we need to judge individual actions and make the AI fail, and do it in a manner where people don't easily recognize the pattern.

You have an incorrect idea of how machine learning works that's causing you to say things that are not true.

 

AI has no inherent concept of winning or losing, or good or bad. The only thing it knows is the reward function, which is how you tell it that it is doing a good job. It will then try as best it can to gain those rewards. You can make a reward function that rewards the AI for crashing as quickly as possible. You can make a reward function that rewards it for getting as many kills as possible.

 

But you can also make a reward function that rewards it for mimicking real people as closely as possible, or, a little more complex, for mimicking people as closely as possible based on certain characteristics. So then it would fly differently if you tell it to mimic a pilot with a high kill ratio than if you tell it to mimic a pilot with a low kill ratio, because it has learned that those pilots fly differently.

 

And a basic implementation would actually be less predictable than a real pilot, because if one pilot of that skill level tends to turn left and another pilot of that skill level tends to turn right in the same situation, the AI would learn to alternate rather than have a distinct bias like a real pilot. You can add such biases to the AI as well, although it would be a bit of extra work.

1PL-Husar-1Esk
Posted
3 hours ago, Aapje said:

You have an incorrect idea of how machine learning works that's causing you to say things that are not true.

I can say the same about you; neither of us works with AI learning professionally, we only use common sense and available information. Reward function - now you're learning something, great progress, since you started with a language model for a combat simulator.

Posted (edited)
17 hours ago, 1PL-Husar-1Esk said:

I can say the same about you; neither of us works with AI learning professionally, we only use common sense and available information.

 

Well, it was a topic I was taught about during my university days and that I made and administered a workshop for, as a teacher's assistant. But it is true that I don't do AI work right now and what I was taught is of course not the latest tech.

 

Quote

great progress, since you started with a language model for a combat simulator.

 

At that point I was simplifying things a bit for a lay audience that is unlikely to be familiar with AI systems other than LLMs. I didn't expect the discussion to get to this level of detail.

Edited by LukeFF
watch the comments towards others
AEthelraedUnraed
Posted (edited)

  

On 3/5/2024 at 12:14 PM, Aapje said:

Again, this is not true if you teach the AI to mimic human behavior. Then the reward function is how closely it matches the real behavior. And secondly, you don't necessarily have to judge individual actions, since you would know all the actions that belong to the same pilot, so you could judge their entire flight record.

What I mean is that "closely matches" is not a phrase that a computer inherently understands. You need to come up with some numeric value (let's say between 0 and 1) for how "closely" it matches human behaviour. So you have to come up with some way to judge, during training, to what extent what the AI just did classifies as "intended behaviour".
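
For illustration, a toy Python sketch of what such a 0-to-1 score could look like (the axis names and the scoring formula are invented for the example, not a proposal for the real metric):

def closeness_score(ai_controls, human_controls):
    # Toy 0..1 score for how closely the AI's control inputs match a recorded
    # human's inputs at the same moment; both are dicts of axis -> deflection in [-1, 1].
    diffs = [abs(ai_controls[axis] - human_controls[axis]) for axis in human_controls]
    mean_diff = sum(diffs) / len(diffs)     # 0 = identical, 2 = opposite extremes
    return max(0.0, 1.0 - mean_diff / 2.0)  # map the difference onto a 0..1 "reward"

# Example: the AI rolled a bit less and pulled a bit harder than the human did.
ai    = {"stick_x": 0.4, "stick_y": 0.7, "rudder": 0.0}
human = {"stick_x": 0.6, "stick_y": 0.5, "rudder": 0.1}
print(closeness_score(ai, human))  # ~0.92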

 

And yes, it is true you can judge the AI through some simple, objective measures like getting hit or shot down. However, this will complicate training, since you're effectively shrinking and mixing up your training dataset. An AI might make 10 fantastic decisions but completely screw up on the 11th one and get killed. Ideally, you do not want to penalise those 10 good decisions simply because of the one mistake afterwards. In other words, training the way you describe is possible but will take longer and you will need more data.

 

On 3/5/2024 at 12:14 PM, Aapje said:

If you look at what LLMs do, then you see that they don't actually generate a single answer, but a probability distribution for various answers (well, tokens actually). They intentionally don't pick the most likely answer all the time, so the LLM becomes 'creative' instead of merely always producing the blandest, most predictable answer. This is actually an issue for factual questions, because the creativity then consists of wrong answers (known as hallucinating, although I consider that a bit of a misnomer).

 

An advantage of using a probability-generating AI for AI pilots is that there are no objectively correct behaviors if your goal is to mimic human pilots, although people in games especially can do some excessively dumb stuff. You can probably fix the worst issues with weird behaviors by simply only considering behaviors above a certain threshold of probability.

 

And then you can use a random number generator similar to how Dungeons & Dragons uses dice together with a probability chart. So the behavior that humans do 20% of the time then would also happen 20% of the time by the AI.

 

But of course you can play with this as well. For example, if the user prefers a predictable AI, you can weigh the more common actions more heavily or when the predictability slider is maxed out, only do the highest scoring action.

You are correct that Neural Nets (whether LLMs or otherwise) inherently generate a probability distribution across possible outcomes, simply because the whole training process is based on stochastic concepts. But the rest of your post is actually more analogous to how a "traditional" AI might work. EDIT: I forgot to add that a Neural Net is in itself completely deterministic, always generating the exact same output if you provide it with the exact same input.

 

(BTW, I like your Dungeons & Dragons analogy; DnD player of six years here :))
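
Since we both like that analogy, here is a rough Python sketch of the dice roll plus the predictability slider described above; the frequencies and the sharpening scheme are just assumptions for the example:

import random

def pick_action(action_probs, predictability=0.0):
    # action_probs: how often human pilots did each action in this situation (sums to 1).
    # predictability 0.0 = roll the dice on the human frequencies, like the D&D chart;
    # predictability 1.0 = always take the most common action.
    if predictability >= 1.0:
        return max(action_probs, key=action_probs.get)
    exponent = 1.0 + 4.0 * predictability            # assumed sharpening scheme
    weights = {a: p ** exponent for a, p in action_probs.items()}
    roll = random.uniform(0, sum(weights.values()))
    for action, w in weights.items():
        roll -= w
        if roll <= 0:
            return action
    return action  # floating-point safety net

behaviours = {"break_left": 0.2, "break_right": 0.5, "climb": 0.3}
print(pick_action(behaviours, predictability=0.0))  # ~20/50/30 split over many calls
print(pick_action(behaviours, predictability=1.0))  # always "break_right"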

 

On 3/5/2024 at 12:14 PM, Aapje said:

I think that our fundamental disagreement is that your goal is to have a single AI that mimics a specific kind of pilot, while my goal is to have an AI that can be used to mimic different kinds of semi-realistic pilots. You are correct that it massively complicates things if you take the programmatic approach to do this, but a big strength of the AI approach (that uses the lessons learned from recent AI advances like LLMs) is that it can generate different behaviors organically based on the input parameters, aka what you ask it to do.

 

[...]

 

But with an AI that can produce different (semi-)realistic kinds of pilots, players could pick the kind of pilots they want to fight against. 1GCS could then experiment with the prompts for the AI to see what generates a type of pilot that the player may want to choose to fight against and then they could offer a very much simplified interface to the player.

The issue with one AI that generates different "kinds of pilot" based on a certain input parameter, is that you have to train it using this parameter. Meaning you now do not only have to quantify how well it matches human behaviour, but how well it matches a certain kind of human behaviour.

 

That said, some differentiation is possible. You can have some kind of "skill level" input parameter that would e.g. occasionally penalise good decisions during training. Or you could have some "aggressiveness" slider that penalises defensive actions, as long as you find a way to quantify the aggressiveness of the AI's actions during training.

 

 

Anyhow, I think there is actually more we agree on than would seem at first sight but that we're just approaching the issue from a different angle - you from the viewpoint of the network itself, I from the viewpoint of the reward function. I think we can both agree that a (Convolutional) Neural Net can only ever be as good as your reward function is able to describe the desired behaviour, and that is the crux of my argument: coming up with a good quantification of what you want your AI to do is often much more difficult than the training process itself :)

Edited by AEthelraedUnraed
Posted
On 3/6/2024 at 12:05 PM, AEthelraedUnraed said:

You need to come up with some numeric value (let's say between 0 and 1) for how "closely" it matches human behaviour. So you have to come up with some way to judge, during training, to what extent what the AI just did classifies as "intended behaviour".

 

LLMs are already typically trained this way, where they get points for guessing the missing value in a sentence correctly. So if they get trained on the sentence: "I am hungry," then we remove a word from the sentence and make them guess it: "I am X".

 

Then they get a point if they guess hungry correctly. Of course, there are many possibilities for X, so if you train them with a lot of different sentences, you teach them all the common words that are used as the X. Although in reality they get way more context of course, than just two preceding words, which is why they do so well now.
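
As a minimal sketch of that guessing setup (nothing more than the "remove a word, guess it back" idea; real LLM training is far more involved):

def make_guessing_examples(sentence):
    # Turn one sentence into (context, missing word) training pairs,
    # exactly the "I am X" -> "hungry" idea from above.
    words = sentence.split()
    return [(words[:i], words[i]) for i in range(1, len(words))]

for context, target in make_guessing_examples("I am hungry"):
    print(context, "->", target)
# ['I'] -> am
# ['I', 'am'] -> hungry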

 

I don't see why you can't train a pilot AI similarly, where you have input data like: There is an airplane on my six. He is shooting at me. His speed is 400 mph. My speed is 200 mph. He is 300 feet away from me. I roll right and pull hard on the stick to force the overshoot.

 

Then you remove the roll to the right and the pull of the stick and make the AI guess the right action, teaching it to do the right thing.

 

Of course, my example is way too high level and already contains too much knowledge, to make it easier to understand. In reality it would consist of a lot of parameters and then some control inputs that it needs to guess.

 

So it would be more like: plane speed = 200, distance to enemy 1 is 300, enemy 1 speed = 300, bullet distance to aircraft = 20 & then 1000's more parameters; control inputs: stick max right for x.x seconds, rudder right for x.x seconds.

 

Because you can train the model on a very limited set of control inputs, rather than the whole dictionary, where it is hard to judge how correct a mistake is, you should be able to make the reward function a bit smarter. For example, if the model generates half a second of roll, where the real pilot used 1 second of roll, you can still give partial credit for how close the AI got.
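
A minimal sketch of that partial-credit idea (the exact formula is an assumption):

def partial_credit(ai_seconds, human_seconds):
    # The AI rolled for ai_seconds where the real pilot rolled for human_seconds;
    # give credit for how close the durations are instead of all-or-nothing.
    longer = max(ai_seconds, human_seconds, 1e-6)
    return 1.0 - abs(ai_seconds - human_seconds) / longer

print(partial_credit(0.5, 1.0))  # half a second vs one second -> 0.5 credit
print(partial_credit(1.0, 1.0))  # exact match -> 1.0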

 

On 3/6/2024 at 12:05 PM, AEthelraedUnraed said:

You are correct that Neural Nets (whether LLMs or otherwise) inherently generate a probability distribution across possible outcomes, simply because the whole training process is based on stochastic concepts. But the rest of your post is actually more analogous to how a "traditional" AI might work.

 

That is more what I was taught in college, but I think that you can mix the two to some extent. The technological innovations of LLMs have been successfully translated to non-language domains like image generation.

 

In a way, 'what are the RGB values of a pixel' seems pretty similar to 'what are the input values of a joystick'. In both cases, you need the model to generate a bunch of values that describe the desired output. In this analogy, each pixel would then be one of the inputs of the plane, like the stick, rudder pedals, gun firing button, flaps lever, etc. As long as you can describe each plane control element with a very limited set of numbers, the model should be able to generate a value for each.

 

On 3/6/2024 at 12:05 PM, AEthelraedUnraed said:

EDIT: I forgot to add that a Neural Net is in itself completely deterministic, always generating the exact same output if you provide it with the exact same input.

 

Yes, this is why they intentionally put noise into the LLM inputs, to diversify the results. This seems like one of the LLM lessons that can be applied to a pilot AI model.

 

In fact, you can argue that it is very realistic for human pilots to suffer from judgement errors, so if a plane is 100 feet away, they may guess anywhere between 70 and 130 depending on the conditions and such. So if IL-2 AI-ovik  would change the inputs before presenting it to the AI, you can argue that this makes it more human-like.
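
A tiny sketch of that judgement-error idea (uniform noise is an assumption; a real implementation might scale the error with visibility, workload, and so on):

import random

def fuzz_observation(true_distance_ft, error_fraction=0.3):
    # Perturb the true range before handing it to the AI, so a plane 100 ft away
    # may be perceived as anywhere between roughly 70 and 130 ft.
    return true_distance_ft * random.uniform(1 - error_fraction, 1 + error_fraction)

print(fuzz_observation(100.0))  # e.g. 83.7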

 

On 3/6/2024 at 12:05 PM, AEthelraedUnraed said:

The issue with one AI that generates different "kinds of pilot" based on a certain input parameter, is that you have to train it using this parameter. Meaning you now do not only have to quantify how well it matches human behaviour, but how well it matches a certain kind of human behaviour.

 

Yes, although you can just experiment with this after the fact. I would argue that if 1GCS would offer an option to configure the AI pilot parameters in obscene detail, there would be a bunch of IL-2 fans who would put a huge number of hours into experimenting with this and figuring out combinations of settings that makes for good AI opponents.

 

You don't necessarily have to understand in detail how the pilot behavior or game results that you measure actually impact their decisions. The AI model will automatically figure it out, and then you can literally play with the model to see what it found.

 

On 3/6/2024 at 12:05 PM, AEthelraedUnraed said:

Anyhow, I think there is actually more we agree on than would seem at first sight but that we're just approaching the issue from a different angle - you from the viewpoint of the network itself, I from the viewpoint of the reward function. I think we can both agree that a (Convolutional) Neural Net can only ever be as good as your reward function is able to describe the desired behaviour, and that is the crux of my argument: coming up with a good quantification of what you want your AI to do is often much more difficult than the training process itself :)

 

Well, I think that the reward function is essentially a solved problem (at least in the abstract, the details would need some tuning) by using the method I described above, where you reward the model for guessing correctly what the real pilots did in the collected data from multiplayer.

 

I see much bigger issues in figuring out what kinds of things you can measure about the pilot & figuring out which of those measurements actually help the most and should be used.

 

And especially how you make it all run fast, because you'd run this model a lot and for multiple planes (although it should parallelize very well so that 12900K will now really be used).

 

And for higher-level things like longer-term decision making, for instance whether to continue the fight or which plane to attack, it may be better to have a separate model that you probably run less often, but that generates tactical choices that you then feed back into the model that actually flies the plane. Otherwise I fear that you might get a rather inconsistent AI at times, one that decides to flee and then to re-engage, but then decides to flee again, etc.

 

That separate AI could then also be used to model the AI outside of combat. In that case you could not run the combat AI, but directly translate the tactical AI into behavior with an additional non-AI algorithm. So an AI that decided to fly to home base would then be assumed by the non-AI algorithm to fly straight home with the standard cruising speed (unless damaged) and at a logical altitude. But if you then intercept the plane, it would switch over to the proper combat AI, which takes more computer power, but results in far more complex behavior than just flying to a waypoint.
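
Structurally, the split I have in mind would look something like this rough sketch (all class names and numbers are hypothetical):

import random

class TacticalAI:
    # Runs rarely; picks a goal such as "engage" or "fly_home".
    def decide(self, enemy_nearby):
        return "engage" if enemy_nearby else "fly_home"

class CombatAI:
    # Runs often, but only while in combat; produces actual control inputs.
    def fly(self):
        return {"stick_x": random.uniform(-1, 1), "stick_y": random.uniform(-1, 1)}

tactical, combat = TacticalAI(), CombatAI()
for second in range(10):
    enemy_nearby = second >= 5             # the player intercepts halfway through
    goal = tactical.decide(enemy_nearby)   # cheap decision, made once per second here
    if goal == "engage":
        print(second, combat.fly())        # expensive combat model
    else:
        print(second, "non-AI autopilot: fly straight home at cruise speed")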

Posted

At that rhythm I see a book coming on this topic. Authors: Aapje and AEthelreaedUnraed.

 

 

Posted
3 hours ago, IckyATLAS said:

At that rhythm I see a book coming on this topic. Authors: Aapje and AEthelreaedUnraed.

Yeah, most people surely have been scared off by now :P

 

@AEthelraedUnraed

 

On second thought, the separate tactical decision making part may work better as a procedural system, not AI.

Posted

What follows below is the text from a Rise of Flight Lua file for the Alb Va. This text was modded by VonS in one of his AI improvement Mods. (1.37)

 

I don't understand what the implications of all of the numbers are. But if you compare those of the different categories of pilots, you can see how they change, for example with regard to engagement distances or when they open fire. 

 

In Il-2 the equivalent are cvc files, and they are editable, but they aren't labeled in the same way, and I have no idea what any of them relate to. 

 

But someone knows! 

 

[dogfight]
    [novice]
        RefNoseAngle         = 0.0
        MinTAS1              = 25.0
        MaxTAS2              = 40.0
        MaxPitchRate         = 50.0
        AttackDistance       = 450.0
        EngageDistance       = 3800.0
        MinFireDistance      = 75.0
        MaxFireDistance      = 150.0
        MinOpenFireAngle     = 10.0
        MaxOpenFireAngle     = 5.0
        MinStopFireAngle     = 15.0
        MaxStopFireAngle     = 7.5
        CollisionTimeTreshold  = 2.0
        HBTNegativeRoll        = -50.0
        HBTPositiveBTRoll      = 50.0
    [end]
    [normal]
        RefNoseAngle      = 0.0
        MinTAS1           = 20.0
        MaxTAS2           = 35.0
        MaxPitchRate      = 65.0
        AttackDistance    = 400.0
        EngageDistance    = 3800.0
        MinFireDistance   = 75.0
        MaxFireDistance   = 150.0
        MinOpenFireAngle  = 7.5
        MaxOpenFireAngle  = 2.5
        MinStopFireAngle  = 10.0
        MaxStopFireAngle  = 5.0
        CollisionTimeTreshold  = 1.5
        HBTNegativeRoll        = -40.0
        HBTPositiveBTRoll      = 55.0
    [end]
    [high]
        RefNoseAngle      = 0.0
        MinTAS1           = 15.0
        MaxTAS2           = 30.0
        MaxPitchRate      = 80.0
        AttackDistance    = 325.0
        EngageDistance    = 5400.0
        MinFireDistance   = 100.0
        MaxFireDistance   = 200.0
        MinOpenFireAngle  = 7.5
        MaxOpenFireAngle  = 2.5
        MinStopFireAngle  = 10.0
        MaxStopFireAngle  = 5.0
        CollisionTimeTreshold  = 1.25
        HBTNegativeRoll        = -35.0
        HBTPositiveBTRoll      = 60.0
    [end]
    [ace]
        RefNoseAngle      = 0.0
        MinTAS1           = 15.0
        MaxTAS2           = 30.0
        MaxPitchRate      = 90.0
        AttackDistance    = 250.0
        EngageDistance    = 5400.0
        MinFireDistance   = 125.0
        MaxFireDistance   = 250.0
        MinOpenFireAngle  = 5.0
        MaxOpenFireAngle  = 1.0
        MinStopFireAngle  = 7.5
        MaxStopFireAngle  = 2.0
        CollisionTimeTreshold  = 1.0
        HBTNegativeRoll        = -25.0
        HBTPositiveBTRoll      = 75.0
    [end]
 

AEthelraedUnraed
Posted
18 hours ago, IckyATLAS said:

At that rhythm I see a book coming on this topic. Authors: Aapje and AEthelreaedUnraed.

Nah, no plans for a book here. Perhaps a journal article. Æ. Unræd and A. Apje, "Applications of neural networks in simulating human behaviour in world war two-era combat flight", Nat. Pub. Fl. AI. Etc. Blah., March 2024. I like the ring of that.

 

19 hours ago, Aapje said:

I don't see why you can't train a pilot AI similarly, where you have input data like: There is an airplane on my six. He is shooting at me. His speed is 400 mph. My speed is 200 mph. He is 300 feet away from me. I roll right and pull hard on the stick to force the overshoot.

 

Then you remove the roll to the right and the pull of the stick and make the AI guess the right action, teaching it to do the right thing.

 

Of course, my example is way too high level and already contains too much knowledge, to make it easier to understand. In reality it would consist of a lot of parameters and then some control inputs that it needs to guess.

 

So it would be more like: plane speed = 200, distance to enemy 1 is 300, enemy 1 speed = 300, bullet distance to aircraft = 20 & then 1000's more parameters; control inputs: stick max right for x.x seconds, rudder right for x.x seconds.

 

Because you can train the model on a very limited set of control inputs, rather than the whole dictionary, where it is hard to judge how correct a mistake is, you should be able to make the reward function a bit smarter. For example, if the model generates half a second of roll, where the real pilot used 1 second of roll, you can still give partial credit for how close the AI got.

The problem is in gathering the data. I imagine that things like speed and heading can be easily gathered, but that's where it ends. An action such as "roll to the right" is actually much too low level; rolling to the right can be either a great or a stupid thing to do based on what you do after. Combat maneuvers aren't made of single actions like rolling, but of larger combined actions. E.g. the rolling to the right might be the start of a Split-S, Scissors, plain old turn, etc. So you need to have already-classified data in order to train the AI. Given that you need many thousands of examples, you can either hire some cheap labour in India, give them tens of thousands of hours of recorded combat flight and have them manually classify it, or you need to build an AI to do it for you in which case we've now got a chicken and egg problem.

 

Of course, you *could* have the AI automatically learn complicated maneuvers based only on feedback about simple maneuvers like rolls and a chain of previous events. But then you need a much more complicated architecture (i.e. more "memory") and hence even more training data (since the backpropagation is now less direct).

 

20 hours ago, Aapje said:

Yes, although you can just experiment with this after the fact. I would argue that if 1GCS would offer an option to configure the AI pilot parameters in obscene detail, there would be a bunch of IL-2 fans who would put a huge number of hours into experimenting with this and figuring out combinations of settings that makes for good AI opponents.

 

You don't necessarily have to understand how pilot behavior or game results from pilots that you measure actually impacts their decisions in detail. The AI model will automatically figure it out and then you can literally play with the model to see what it found.

Agreed, although this again complicates gathering your dataset. Now you don't just need to classify a certain maneuver as Split-S, Yo-Yo, Break, Looping, etc. but also how "aggressive" the action was.

 

20 hours ago, Aapje said:

Well, I think that the reward function is essentially a solved problem (at least in the abstract, the details would need some tuning) by using the method I described above, where you reward the model for guessing correctly what the real pilots did in the collected data from multiplayer.

I think the things I described above re pre-processing your data are a bit more than "details of an essentially solved problem that need some tuning" ;)

To summarize, the problem is that, starting from recordings of dogfights, it's not immediately apparent how to quantify "what the real pilots did".

 

20 hours ago, Aapje said:

I see much bigger issues in figuring out what kinds of things you can measure about the pilot & figuring out which of those measurements actually help the most and should be used.

This is actually pretty easy. You just train your model with all the data you can gather, and if it turns out some inputs barely make a difference, you can take them out.

 

20 hours ago, Aapje said:

And especially how you make it all run fast, because you'd run this model a lot and for multiple planes (although it should parallelize very well so that 12900K will now really be used).

Hence the importance of training it as high-level as possible; i.e. deciding to fly an Immelmann and then leaving its execution to other AI is much faster than deciding to pull on the stick, pull some more, enter a roll, stop the roll, etc.

 

That said, if they're not too complicated, Neural Nets would perform just fine. Relatively complicated image processing networks quite often run in real time (i.e. something like 50FPS) on a normal home laptop. Especially if you optimise them a bit. E.g. you can sometimes drastically reduce the number of parameters for little quality loss by completely leaving out those whose weights are nearly zero.
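
For instance, the crudest form of that pruning could look like this toy sketch (real pruning tooling in the major frameworks is considerably smarter):

def prune_weights(weights, threshold=0.01):
    # Zero out weights whose magnitude is below the threshold so they can be skipped.
    kept = [w if abs(w) >= threshold else 0.0 for w in weights]
    removed = sum(1 for w in weights if abs(w) < threshold)
    return kept, removed

weights = [0.73, -0.002, 0.0005, 0.41, -0.008, 0.19]
kept, removed = prune_weights(weights)
print(kept, f"({removed} of {len(weights)} weights pruned)")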

 

16 hours ago, Aapje said:

On second thought, the separate tactical decision making part may work better as a procedural system, not AI.

Procedural AI is surely much easier to make ?

Posted (edited)

 

2 hours ago, AEthelraedUnraed said:

Of course, you *could* have the AI automatically learn complicated maneuvers based only on feedback about simple maneuvers like rolls and a chain of previous events. But then you need a much more complicated architecture (i.e. more "memory") and hence even more training data (since the backpropagation is now less direct).

 

Yes, that was my idea. To have the AI learn that a roll is often followed by a pull on the stick, to make a sharp turn. Or to learn the scissors itself, by mimicking human behavior. It would indeed require more parameters. Basically, you would provide the AI with what controls the pilot put in previously, with a limit to that, of course.

 

My idea was to simplify the control input by simply recording the strength of the input and the duration, so if the pilot is in a 2 circle fight, they may be constantly pulling on the stick for minutes, so instead of generating a lot of parameters for the model, explaining what the pilot did every millisecond in the past, it would be something like: stick pulled in direction A with force X for Y seconds starting Z seconds ago.

 

This would probably work decently, although it would smooth out varying control inputs a bit. But if you use proper thresholds for when you generate a new control input, it should work fine, I think. And then you provide something like 50? previous control inputs to the model. Something to be tuned.

 

You don't need to classify anything manually this way. The computer can itself compress sequences of control inputs into a single control input sequence.

 

Then the model can also output these sequences: pull on the stick with strength X for Y seconds. Of course, other controls can be used at the same time. Especially the throttle, rudder pedals and the firing button(s). So the model has to (be able to) output multiple control sequences.
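
To make that compression idea concrete, a rough sketch (thresholds, sample rate and numbers are all made up):

def compress_inputs(samples, threshold=0.05, dt=0.01):
    # Collapse per-frame stick samples into (value, duration) segments, starting a
    # new segment only when the input changes by more than the threshold.
    segments = []
    start_value, length = samples[0], 1
    for value in samples[1:]:
        if abs(value - start_value) > threshold:
            segments.append((round(start_value, 2), round(length * dt, 2)))
            start_value, length = value, 1
        else:
            length += 1
    segments.append((round(start_value, 2), round(length * dt, 2)))
    return segments

# Three seconds of "pull hard" then easing off: 400 samples become two segments.
samples = [0.9] * 300 + [0.3] * 100
print(compress_inputs(samples))  # [(0.9, 3.0), (0.3, 1.0)]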

 

2 hours ago, AEthelraedUnraed said:

Agreed, although this again complicates gathering your dataset. Now you don't just need to classify a certain maneuver as Split-S, Yo-Yo, Break, Looping, etc. but also how "aggresssive" the action was.

 

No, you'd determine aggressiveness based on something that you can just measure. But I would personally not concern myself too much with that at first, and would just put some already measured or easy to measure pilot attributes like k/d, kills per hour of flying, amount of damage sustained per hour, etc into the model at first and would experiment with what the model does when you tell it to mimic a pilot who varies by those measurements.

 

You don't have to measure aggression itself, but only need a measurement that correlates well with aggression.
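
So the "description" of the pilot to be mimicked could be as simple as a handful of numbers fed in alongside the flight situation; a sketch with assumed attributes and scaling:

def pilot_profile(kills, deaths, hours_flown, damage_taken):
    # Describe the pilot with a few measurable statistics; the model then learns
    # on its own how pilots with these numbers tend to fly.
    return [
        kills / max(deaths, 1),            # kill/death ratio
        kills / max(hours_flown, 0.1),     # kills per flying hour
        damage_taken / max(hours_flown, 0.1),
    ]

cautious_ace = pilot_profile(kills=40, deaths=2, hours_flown=80, damage_taken=5)
reckless_rookie = pilot_profile(kills=3, deaths=12, hours_flown=20, damage_taken=60)
print(cautious_ace, reckless_rookie)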

 

2 hours ago, AEthelraedUnraed said:

I think the things I described above re pre-processing your data are a bit more than "details of an essentially solved problem that need some tuning" ;)

 

I specifically referred to the reward function that mimics behavior when I talked about a problem already being solved, not the pre-processing step. 

 

2 hours ago, AEthelraedUnraed said:

To summarize, the problem is that, starting from recordings of dogfights, it's not immediately apparent how to quantify "what the real pilots did".

 

I completely disagree on that. It is actually very easy to do that, since the only impact the human pilot has on the plane is the control inputs, which are all provided to the sim in a completely clear way.

 

The difficulty is providing that data to the AI model in a way that is performant. And that goes both for the training phase, but even more so for the actual running of the model.

 

2 hours ago, AEthelraedUnraed said:

That said, if they're not too complicated, Neural Nets would perform just fine. Relatively complicated image processing networks quite often run in real-time (i.e. something like 50FPS) on a normal home laptop. Especially if you optimise them a bit. E.g. you can sometimes drastically reduce the amount of parameters for little quality loss by completely leaving out those whose weights nearly zero.

 

Indeed, you would need something that is fast like that (while leaving plenty of CPU for the game itself). It doesn't have to run at the same FPS as the game though, just often enough for the AI pilot not to be slow in responding. The average human response time to visual stimuli is more like 4 reactions per second, so something like 10 per second should be plenty. Probably less rather than more, since those human reaction times are for relatively simple tasks like catching a ball. Human decision making in combat should typically be considerably slower than that.

 

But you can just tune this fairly easily, by experimenting with running the model less and less often until you start to notice the AI pilot getting too slow to react.

 

2 hours ago, AEthelraedUnraed said:

Procedural AI is surely much easier to make ?

 

I don't really like easier/harder in this context, since it is always dependent on what you want to do with it. Something like ChatGPT is way, way, way, way harder to make procedurally than with LLMs. Similarly, I think that you can achieve a much better AI pilot with machine learning than with a procedural AI, assuming a realistic amount of developer effort and developer ability.

 

It's always about the proper tool for the job. Combining procedural AI for the decisions on what goal to pursue and machine learning AI for the actual combat seems like it could match the strengths of the respective approaches.

Edited by Aapje
Posted

Excellent, the pages of the book keep coming.  ?

AEthelraedUnraed
Posted
On 3/8/2024 at 3:36 PM, Aapje said:

Yes, that was my idea. To have the AI learn that a roll is often followed by a pull on the stick, to make a sharp turn. Or to learn the scissors itself, by mimicking human behavior. It would indeed require more parameters. Basically, you would provide the AI with what controls the pilot put in previously, with a limit to that, of course.

 

My idea was to simplify the control input by simply recording the strength of the input and the duration, so if the pilot is in a 2 circle fight, they may be constantly pulling on the stick for minutes, so instead of generating a lot of parameters for the model, explaining what the pilot did every millisecond in the past, it would be something like: stick pulled in direction A with force X for Y seconds starting Z seconds ago.

 

This would probably work decently, although it would smooth out varying control inputs a bit. But if you use proper thresholds for when you generate a new control input, it should work fine, I think. And then you provide something like 50? previous control inputs to the model. Something to be tuned.

 

You don't need to classify anything manually this way. The computer can itself compress sequences of control inputs into a single control input sequence.

 

Then the model can also output these sequences: pull on the stick with strength X for Y seconds. Of course, other controls can be used at the same time. Especially the throttle, rudder pedals and the firing button(s). So the model has to (be able to) output multiple control sequences.

I'm not sure you appreciate exactly how much harder it is to train the AI for something like this.

 

Essentially, what you're doing is teaching the AI how to fly in addition to combat tactics. These are unrelated tasks and even the human brain has separate areas for them (flying is mostly "muscle memory"; tactics is a conscious decision). Doing both simultaneously unnecessarily complicates things. I'm not exaggerating if I say that you'd need a decently-sized server farm, thousands of hours of recorded flights and a generous amount of time to get decent results. Let alone that the network would be of such a complexity that you'd need a pretty high-spec PC to even run the resulting network at a fast enough framerate. Even assuming your figure of 10FPS is fast enough, if you'd have 10 aircraft in a furball, that means you'd need to run it 100 times per second in addition to the "normal" GPU work.

 

Your solution is basically what someone from a Data Sciences or Software Sciences background might come up with: "Let's just take a bunch of data and let the "AI" figure out the rest. If it doesn't work, just gather more data." But unfortunately, data is usually in short supply or of an unsuitable format for efficient training. That's where people like me with a Signal Processing background come in to analyse and pre-process the data :). The whole idea of that pre-processing - in my described case by extracting annotated sequences of manoeuvres - is to facilitate training by leaving out all unnecessary parts. Once you have annotated data like that, I could design, train and run the network on my mid-spec home laptop.

 

One other disadvantage of using "raw" data (I think I already mentioned it) is that you cannot adjust the data for whatever you want. You train the AI with multiplayer furball recordings; you get an AI that fights in a multiplayer furball way. The whole reason people like me don't play multiplayer is exactly to avoid that kind of playing ;)

 

On 3/8/2024 at 3:36 PM, Aapje said:

No, you'd determine aggressiveness based on something that you can just measure. But I would personally not concern myself too much with that at first, and would just put some already measured or easy to measure pilot attributes like k/d, kills per hour of flying, amount of damage sustained per hour, etc into the model at first and would experiment with what the model does when you tell it to mimic a pilot who varies by those measurements.

 

You don't have to measure aggression itself, but only need a measurement that correlates well with aggression.

Again this would needlessly complicate things since you're now not only teaching the AI to fly the plane and apply tactics, but teaching the AI to fly the plane and apply certain tactics according to a certain attribute. Remember that with a neural network of the complexity your AI would require, your backpropagation function is basically 0 at the inputs - it's going to be hard enough to learn anything at all even for parameters that have a pretty direct correlation with the output (pulling up when fired on -> not getting hit). Training for input parameters that are even more disconnected from the end result is next to impossible.

 

On 3/8/2024 at 3:36 PM, Aapje said:

I specifically referred to the reward function that mimics behavior when I talked about a problem already being solved, not the pre-processing step. 

Then please describe this reward function. "The percentage of players that had exactly 0.2137264 of joystick deflection with an aircraft 123.5435 metres away at a 174.324 angle?" Perhaps add some tolerance: "The percentage of players that had a 0.2137264 +- 0.1 joystick deflection at 174.324 +- 5 degrees angle at 123.5435 +- 10m?" Are you going to search your entire database with millions of recorded frames to calculate these percentages on each of your training inputs?

 

I don't think the problem of the reward function is already solved just yet ;)

 

On 3/8/2024 at 3:36 PM, Aapje said:

I completely disagree on that. It is actually very easy to do that, since the only impact the human pilot has on the plane is the control inputs, which are all provided to the sim in a completely clear way.

 

The difficulty is providing that data to the AI model in a way that is performant. And that goes both for the training phase, but even more so for the actual running of the model.

You would never ever want to train a tactical combat AI based on the control inputs. Besides the reasons I already mentioned above; if the Devs want to add a new aircraft to the sim later on (that Fokker D.XXI we've all been waiting for :P), what are they gonna do? Hire the server farm again to re-train the network for the new aircraft? Use the AI for a similar aircraft and just accept that it either underperforms or continuously stalls because the inputs need to be slightly different?

 

You want to decouple unrelated functionality. Training one single AI to regulate both the control inputs and the tactical decisions is just waiting for trouble to happen.

 

On 3/8/2024 at 3:36 PM, Aapje said:

Indeed, you would need something that is fast like that (while leaving plenty of CPU for the game itself). It doesn't have to run at the same FPS as the game though, just often enough for the AI pilot not to be slow in responding. The average human response time to visual stimuli is more like 4 reactions per second, so something like 10 per second should be plenty. Probably less rather than more, since those human reaction times are for relatively simple tasks like catching a ball. Human decision making in combat should typically be considerably slower than that.

 

But you can just tune this fairly easily, by experimenting with running the model less and less often until you start to notice the AI pilot getting too slow to react.

Mostly agree, except that you already want to calculate this before you start designing the network since it's a pretty fixed design requirement (that depends on the lowest PC specs that you want to support). Based on the network architecture, you can calculate how many operations this requires and hence make a guesstimate of how fast it'll run.
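
A back-of-the-envelope version of that guesstimate for a plain fully-connected network (all the sizes below are invented, not a proposal for the real AI):

def estimate_mlp_cost(layer_sizes, updates_per_second, aircraft_count):
    # Roughly 2 * inputs * outputs operations per fully-connected layer.
    ops_per_pass = sum(2 * a * b for a, b in zip(layer_sizes, layer_sizes[1:]))
    ops_per_second = ops_per_pass * updates_per_second * aircraft_count
    return ops_per_pass, ops_per_second

per_pass, per_second = estimate_mlp_cost([2000, 512, 256, 16], 10, 10)
print(f"{per_pass:,} ops per pass, {per_second:,} ops per second")
# ~2.3 million ops per pass, ~232 million ops per second for 10 aircraft at 10 Hz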

 

Data on how often the AI needs to run their routines is probably already available, since the current AI obviously also doesn't do everything every game update.

 

On 3/8/2024 at 3:36 PM, Aapje said:

I don't really like easier/harder in this context, since it is always dependent on what you want to do with it. Something like ChatGPT is way, way, way, way harder to make procedurally than with LLMs. Similarly, I think that you can achieve a much better AI pilot with machine learning than with a procedural AI, assuming a realistic amount of developer effort and developer ability.

 

It's always about the proper tool for the job. Combining procedural AI for the decisions on what goal to pursue and machine learning AI for the actual combat seems like it could match the strengths of the respective approaches.

I in general agree with what you say, but in this case "what I want to do with it" is pretty clear: an AI that does the tactical decision-making for AI aircraft in IL2. Within the constraints of a realistic amount of developer effort and developer ability, I am of the opinion that a "traditional" AI is by far the better choice. Note that other people even in this thread have pointed to other simulators that they perceive as having a much better AI than IL2 - and none of those use Neural Networks.

1PL-Husar-1Esk
Posted

I think wishful thinking is not what will get us better AI in a CFS from learning AI. The resources needed even to attempt that task are tremendously high and don't promise the desired results. Better general old-fashioned AI programming, with a better understanding of applied BFM and ACM, would solve the problem quicker and without tremendous resources.

AEthelraedUnraed
Posted
44 minutes ago, 1PL-Husar-1Esk said:

I think wishful thinking is not what will get us better AI in a CFS from learning AI. The resources needed even to attempt that task are tremendously high and don't promise the desired results. Better general old-fashioned AI programming, with a better understanding of applied BFM and ACM, would solve the problem quicker and without tremendous resources.

Thanks for the TLDR summary of my post ?

1PL-Husar-1Esk
Posted
4 minutes ago, AEthelraedUnraed said:

Thanks for the TLDR summary of my post ?

You're welcome ?

Posted (edited)
8 hours ago, AEthelraedUnraed said:

I'm not sure you appreciate exactly how much harder it is to train the AI for something like this.

 

I freely admit that I'm not qualified to judge this.

 

8 hours ago, AEthelraedUnraed said:

Essentially, what you're doing is teaching the AI how to fly in addition to combat tactics. These are unrelated tasks and even the human brain has separate areas for them (flying is mostly "muscle memory"; tactics is a conscious decision).

 

I'm unsure whether you are using tactics as this word is commonly used. In a military context, everything related to winning or surviving the dogfight itself, and just flying, would be the operational level.

 

(Basic) tactics would be more something like deciding whether to go after bombers or the escorting fighters. I already argued that this level would probably be better served with a procedural AI.

 

Quote

The whole idea of that pre-processing - in my described case by extracting annotated sequences of manoeuvres - is to facilitate training by leaving out all unnecessary parts.

 

You do leave out a lot of very important information with that solution, since a crucial part of flying in warbirds is how well you execute the manoeuvres, not just when. So your solution would not result in realistic flying, but an AI that would just always execute manoeuvres the same way, which is boring and unrealistic. 

 

8 hours ago, AEthelraedUnraed said:

But unfortunately, data is usually in short supply or of an unsuitable format for efficient training.

 

I've already explained how I think the data could be condensed into way more compact and thus more manageable chunks. I haven't seen you argue that this could not work, other than by dismissing control inputs altogether, without any explanation of why this could not work.

 

8 hours ago, AEthelraedUnraed said:

You train the AI with multiplayer furball recordings; you get an AI that fights in a multiplayer furball way.

 

Again, I've already argued that this is not necessarily the case, similar to how ChatGPT and DALL·E are not limited to reproducing what it was fed, but their outcome can be heavily altered using smart 'prompting'. It should be possible to make the AI mimic pilots that are less prone to such behavior and more prone to historic behavior. Whether that is possible with a model that is light enough to be trained by 1GCS and run in IL-2 is a separate question.

 

But frankly, this entire comment of yours is a bit disappointing (in the context of your earlier, much better comments, at least), because you seem to constantly be ignoring that I already addressed most of the concerns that you brought up, and are mostly just restating what you said before and what I addressed before, without engaging with my arguments. So I'm not going to react to most of the rest of your post, aside from one new & good argument you make.

 

8 hours ago, AEthelraedUnraed said:

if the Devs want to add a new aircraft to the sim later on (that Fokker D.XXI we've all been waiting for :P), what are they gonna do? Hire the server farm again to re-train the network for the new aircraft? Use the AI for a similar aircraft and just accept that it either underperforms or continuously stalls because the inputs need to be slightly different?

 

That depends, if you train the model with the parameters of the flight model, then the model would presumably learn that climbing is a good strategy in planes with a strong climb rate, turning a good strategy in planes with a good turn rate, scissors a good strategy in planes with a good roll rate. And it would also learn how to deal with the (relative) characteristics of other planes, and could have 'Ace' AI do this kind of stuff way better than amateur pilots. So an amateur AI would be far more likely to do a turn fight in a BF109 than an ace AI and the ace AI would adapt its flying way more to the strengths of the opponent's plane.

 

So if they then introduce a new plane, the model should be able to adapt to it well, unless the new plane has some fundamentally different FM characteristic, but I don't think that IL-2 has such a complex flight model.

Edited by Aapje
AEthelraedUnraed
Posted
23 hours ago, Aapje said:

I'm unsure whether you are using tactics as this word is commonly used. In a military context, everything related to winning or surviving the dogfight itself, and just flying, would be the operational level.

 

(Basic) tactics would be more something like deciding whether to go after bombers or the escorting fighters. I already argued that this level would probably be better served with a procedural AI.

I'm using the word "tactics" as in BFM: someone with a certain energy level has a certain relative position to you, so you should fly this and that manoeuvre for optimum results. I'm unaware if air forces use other terminology, but this usage is covered by the general definition of the word "tactics" and I believe on this site too "tactics" is generally understood to refer to BFM as well :)

 

Yes, more high-level decisions such as whether or not to go after an enemy would definitely be better done procedurally. However, I'm arguing that at this point in time, with the current state of AI and hardware, and with an even remotely realistic budget, the IL2 Developers should do the entirety of AI procedurally and not waste any time with Neural Networks. (This coming from someone who works with CNNs).

 

23 hours ago, Aapje said:

You do leave out a lot of very important information with that solution, since a crucial part of flying in warbirds is how well you execute the manoeuvres, not just when. So your solution would not result in realistic flying, but an AI that would just always execute manoeuvres the same way, which is boring and unrealistic.

And that's exactly my point - a good AI should be able to both think of the right manoeuvres as well as be able to execute them. Do either of those badly, and your whole AI is flawed. With your solution, you combine the "planning" and execution phases. That's two unrelated tasks. If there's anything (C)NNs are not good at, it's doing separate tasks. You're going to end up with an AI that can neither fly very well nor make very good decisions.

 

Note that I'm not making any statements about the flying AI. You could absolutely train a second CNN to fly the aircraft; i.e. translate for instance a target position into the flight controls needed to arrive at that position. Or you could do it procedurally like they do currently. Even with a procedural AI, it isn't very complicated to program the AI such that it executes manoeuvres slightly differently each time. Anyhow, for my argument it doesn't really matter which solution you choose. What I'm saying is that to train a single NN to do both the flying as well as make tactical decisions, is a fool's errand.

 

23 hours ago, Aapje said:

I've already explained how I think that the data could be condensed in way more compact and thus manageable chunks. I haven't seen you argue that this could not work, other than by dismissing control inputs altogether, without any explanation of why this could not work.

Alright then. I assume you know how NNs work "under the hood", but I'll give a quick summary for other people that might be reading this. I will also mostly use some analogies since it keeps things understandable for the general public, and also because I can't be bothered to write down the mathematics. Since it's going to be quite a wall of text, I've put it under the spoiler tag.
 

Spoiler

Everything a computer does is mathematics. Bytes are just numbers after all. This means that when we want a computer to do something with a certain input and produce a certain output, what we're really looking for is a mathematical function that transforms the input numbers into the desired output numbers. We call this a "transfer function". Now whatever our usecase - whether that's a calculator app, flying an aircraft or generating AI art - this always holds true.

Such transfer functions can be immensely complex, to a point where they get absolutely impossible to work out as a human. For instance, there's no way anyone's ever going to create a ChatGPT with traditional programming.

 

This means that it is impossible to exactly determine this transfer function!

 

However, we can approximate them. NNs use nonlinear activation layers, and it can be mathematically proven that using a combination of several such nonlinear functions (which NNs do), every mathematical function can be approximated, regardless of how complex it is.

 

The problem is that we generally don't know what the "real" transfer function looks like. So we just create a "generic" function consisting of multiplications, additions and some nonlinearity, using a great multitude of numeric variables ("parameters" in NN lingo). Which parameters get added or multiplied and how many operations take place is determined beforehand by your network architecture. We then initialise all those variables with random values, and see what the output is. Next, we take the difference between the calculated output and the desired output, using the so-called "loss function" (or "reward function"; the term we've used so far). This "loss" can then be used to improve our guesses for the variables using mathematical techniques like Stochastic Gradient Descent. This is optimisation, and in the context of NNs called backpropagation.

 

Now, since the target transfer function is so complex, the relation between the variables and the loss looks like somewhat of a mountain landscape, with high mountains alternating with deep valleys. We aim for the lowest loss, meaning the generated output is as close as possible to the desired output. In the mountainscape analogy this is the very deepest valley. But the problem is that if we arrive in a valley, the mountains block the view to the next valley so there is no way to see if other valleys might be deeper. We can see whether or not we are at the deepest point of our current valley, but there is simply no way to know if the current valley is also the deepest valley of all.

 

Exactly how many hills and valleys there are depends on the complexity of the transfer function. If there is a very straightforward relationship (let's say the colour of a traffic light, and whether or not we should stop), there might be as little as one valley. In that case, training is fast and easy, we can do with a relatively simple network architecture, and we can be pretty certain that our approximated transfer function looks a lot like the real thing. However if the transfer function is very complex, the loss function will have thousands or even millions of valleys. This has three major effects, being 1) it slows down training, 2) requires a more complex network architecture, and 3) it is extremely likely that whatever valley we end up choosing isn't the deepest one.

 

The first two problems we have a(n inconvenient) solution for, namely more training. The third one, we don't. There does not exist any mathematical way to know whether the local optimum we found is also the global optimum.* So although we will still find an approximated transfer function that looks somewhat like the actual one, it will likely look less like the real one than if we hadn't included the extra input variables in the first place.** In terms of results, this means less accurate and/or more erratic output.
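As a toy illustration of the valley problem (again purely illustrative; the loss function here is simply made up):

```python
# Toy illustration of the "mountain landscape": gradient descent on a 1-D non-convex loss,
# started from different random points, ends up in different valleys.
import numpy as np

def loss(p):
    return np.sin(3 * p) + 0.1 * (p - 1) ** 2   # two valleys of different depth (deepest near p ~ 1.6)

def grad(p):
    return 3 * np.cos(3 * p) + 0.2 * (p - 1)

rng = np.random.default_rng(1)
for start in rng.uniform(-2, 4, size=5):
    p = start
    for _ in range(500):
        p -= 0.01 * grad(p)                     # plain gradient descent
    print(f"start {start:+.2f} -> ends in valley at p = {p:+.2f}, loss = {loss(p):.3f}")
```

Depending on where it starts, the descent settles into different valleys, and nothing in the procedure itself tells you whether the valley you ended up in is the deepest one.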

 

At this point, we should take note of a certain paradox that arises. While adding more input variables than necessary decreases the likelihood of finding the best possible approximation of the transfer function, removing input variables decreases how well you can fit the transfer function in the first place. In other words, by adding more kinds of data, you increase the maximum achievable performance of your network, while at the same time you decrease your chances of ever achieving that. Now, there isn't any way to know beforehand whether the effect of adding the parameters you propose will be for the better or for the worse. But my - admittedly entirely subjective - gut feeling tells me that the effect in this case would probably not be for the better.

 

* There do exist tricks to "jump out of" a local optimum, but they're just that: tricks. They are of little use in finding the global optimum.

** Note that this isn't true for adding input values that are uncorrelated to the output. By orthogonality, the loss function necessarily has a derivative of 0 with respect to them, and hence they won't have any influence on the presence or shape of hills and valleys. So if we'd add the "current surface temperature on the planet Mars" as an input, it might slow down training but not necessarily influence the end result. However the core assumption in your method is that "aggressiveness variables" do influence the end result, and therefore they do also decrease the expected relative accuracy of the end result.

TLDR: adding additional types of data, even if "condensed into compact chunks" not only slows down training, but may even lead to less accurate results for your Neural Net.

 

On 3/12/2024 at 9:11 PM, Aapje said:

Again, I've already argued that this is not necessarily the case, similar to how ChatGPT and DALL·E are not limited to reproducing what it was fed, but their outcome can be heavily altered using smart 'prompting'. It should be possible to make the AI mimic pilots that are less prone to such behavior and more prone to historic behavior.

That is very much untrue. The only reason Dall-E generates a different picture if you ask for a "Van Gogh style monkey" than if you ask for a "Ukiyo-e monkey" is that it already was trained with annotated pictures of both styles. An AI may come up with original applications, combinations or variations of styles, but will never do something it was not trained for. If you want something else than "generic multiplayer behaviour", you need to have people with a different flying style among your dataset, as well as a way to differentiate these people from the outset.

 

One good illustration of my point are so-called "deep dreams". If you use an AI trained to detect dogs, you'll end up seeing (parts of) dogs everywhere. You'll never ever see any horses because the network doesn't even know of the concept "horse". Just dogs.


 

On 3/12/2024 at 9:11 PM, Aapje said:

But frankly, this entire comment of yours is a bit disappointing (in the context of your earlier, much better comments, at least), because you seem to constantly be ignoring that I already addressed most of the concerns that you brought up, and are mostly just restating what you said before and what I addressed before, without engaging with my arguments. So I'm not going to react to most of the rest of your post, aside from one new & good argument you make.

Yes, I was restating some of my earlier assertions, since you never sufficiently addressed those. The things I wrote above should give plenty of reason why I still believe my concerns to be mostly unaddressed.

 

Look, you obviously know a thing or two about CNNs. But CNNs are pretty complicated marvels of engineering. These days, everyone and their mother knows how to train a neural network, but it takes a lot of knowledge to train a good neural network. Then you need the hardware to train it, as well as the data itself in an easy to read and memory efficient format. Add to this the restrictions of the hardware you're going to do the inference on. At this point in time, there is no way that a relatively small company like 1CGS is going to do all of that with sufficient quality - let alone in a cost-efficient way.

 

On 3/12/2024 at 9:11 PM, Aapje said:

That depends, if you train the model with the parameters of the flight model, then the model would presumably learn that climbing is a good strategy in planes with a strong climb rate, turning a good strategy in planes with a good turn rate, scissors a good strategy in planes with a good roll rate. And it would also learn how to deal with the (relative) characteristics of other planes, and could have 'Ace' AI do this kind of stuff way better than amateur pilots. So an amateur AI would be far more likely to do a turn fight in a BF109 than an ace AI and the ace AI would adapt its flying way more to the strengths of the opponent's plane.

 

So if they then introduce a new plane, the model should be able to adapt to it well, unless the new plane has some fundamentally different FM characteristic, but I don't think that IL-2 has such a complex flight model.

This would work for an architecture like the one I described, with decoupled tactical and flight parts. However in your proposed model, where the model itself controls things like joystick deflection, it won't work without re-training the model for the new controls. Take a Fokker D.VII and you can yank the stick fully back without any issues. Give the exact same joystick inputs to a Fw-190 and you'll be upside down and halfway towards the earth before you know it.
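To make the distinction concrete, here is a rough, purely hypothetical sketch of what I mean by decoupling (all class and function names are invented; this is not how IL-2 or any real implementation looks):

```python
# Hypothetical sketch: a tactical module picks a manoeuvre, a separate per-aircraft flight
# module translates it into stick/rudder/throttle. A new plane then only needs a new flight
# module rather than retraining the tactical part.
from dataclasses import dataclass

@dataclass
class ControlInputs:
    elevator: float   # -1 .. +1
    aileron: float    # -1 .. +1
    rudder: float     # -1 .. +1
    throttle: float   #  0 .. 1

def tactical_module(own_state: dict, bandit_state: dict) -> str:
    """Placeholder for the decision-making part (could be a NN or procedural code)."""
    return "break_turn" if bandit_state["range_m"] < 600 else "extend"

def flight_module_fw190(maneuver: str, own_state: dict) -> ControlInputs:
    """Placeholder for the aircraft-specific execution part."""
    if maneuver == "break_turn":
        # aircraft-specific limits live here, not in the tactical module
        pull = min(1.0, 0.5 + own_state["ias_kmh"] / 800.0)
        return ControlInputs(elevator=pull, aileron=0.6, rudder=0.1, throttle=1.0)
    return ControlInputs(elevator=0.05, aileron=0.0, rudder=0.0, throttle=1.0)

own = {"ias_kmh": 450}
bandit = {"range_m": 400}
controls = flight_module_fw190(tactical_module(own, bandit), own)
print(controls)
```

The point being that a Fokker D.VII and a Fw-190 would each get their own flight module, while the tactical module stays untouched.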

 

Posted
2 hours ago, AEthelraedUnraed said:

However, I'm arguing that at this point in time, with the current state of AI and hardware, and with an even remotely realistic budget, the IL2 Developers should do the entirety of AI procedurally and not waste any time with Neural Networks. (This coming from someone who works with CNNs).

 

Perhaps, although the current state of hardware is rapidly changing, with companies adding NPUs to new processors and such.

 

Also, you did agree with me that you'd need quite a bit of data. By starting early, you have more time to collect data.

 

2 hours ago, AEthelraedUnraed said:

Even with a procedural AI, it isn't very complicated to program the AI such that it executes manoeuvres slightly differently each time.

 

Yes and no. It's fairly easy to randomize the maneuvers. However, it is very hard to simulate realistic variations, where the different maneuvers have different error rates and where pilots make more errors in certain situations (cognitive load, solar blinding, multiple opponents, etc, etc).
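To illustrate why this gets out of hand, here is a deliberately naive, entirely made-up sketch of such a hand-tuned error model (all factors and numbers are invented for illustration):

```python
# Purely hypothetical sketch of a hand-tuned execution-error model.
import random

def execution_error(skill: float, cognitive_load: float, sun_in_eyes: bool, bandits: int) -> float:
    """Return a control-input error magnitude in the range 0..1."""
    base = 0.25 * (1.0 - skill)              # better pilots start with less error
    base += 0.15 * cognitive_load            # task saturation
    base += 0.10 if sun_in_eyes else 0.0     # solar blinding
    base += 0.05 * max(0, bandits - 1)       # multiple opponents
    return max(0.0, min(1.0, random.gauss(base, 0.05)))

print(execution_error(skill=0.8, cognitive_load=0.6, sun_in_eyes=True, bandits=2))
```

Every extra situational factor means another hand-picked coefficient, and none of them are grounded in real pilot data.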

 

The logical result of having a procedural AI where every extra nuance and variation has to be programmed by hand, is that a procedural AI will always remain fairly shallow, since realistic complexity is just too much work. There is a crossover point where achieving a similar level of complexity procedurally becomes more expensive than using a machine-learning AI. But you are correct that this crossover point may be at a price point that 1CGS cannot afford anyway.

 

2 hours ago, AEthelraedUnraed said:

Anyhow, for my argument it doesn't really matter which solution you choose. What I'm saying is that to train a single NN to do both the flying as well as make tactical decisions, is a fool's errand.

 

I'm not convinced this is true.

 

2 hours ago, AEthelraedUnraed said:

If you want something else than "generic multiplayer behaviour", you need to have people with a different flying style among your dataset, as well as a way to differentiate these people from the outset.

 

It is a given that people in MP servers fly differently, or we would not see such differing results.

 

2 hours ago, AEthelraedUnraed said:

One good illustration of my point are so-called "deep dreams". If you use an AI trained to detect dogs, you'll end up seeing (parts of) dogs everywhere. You'll never ever see any horses because the network doesn't even know of the concept "horse". Just dogs.

 

Yes, but in reality we are talking about variations that already show up in the data set, not pilots that ride on horses, rather than fly planes. For example, even if the average pilot will be more aggressive in MP than real life pilots were, because they could really die, that doesn't mean that you won't have MP pilots that are much more aggressive than average and those that are far less aggressive.

 

2 hours ago, AEthelraedUnraed said:

These days, everyone and their mother knows how to train a neural network, but it takes a lot of knowledge to train a good neural network. Then you need the hardware to train it, as well as the data itself in an easy to read and memory efficient format. Add to this the restrictions of the hardware you're going to do the inference on. At this point in time, there is no way that a relatively small company like 1CGS is going to do all of that with sufficient quality - let alone in a cost-efficient way.

 

Yeah, probably. Although the typical workflow seems to be that developers train and test a partially trained model on a single video card & then hire a data farm to do the more extensive training.

 

2 hours ago, AEthelraedUnraed said:

This would work for an architecture like the one I described, with decoupled tactical and flight parts. However in your proposed model, where the model itself controls things like joystick deflection, it won't work without re-training the model for the new controls. Take a Fokker D.VII and you can yank the stick fully back without any issues. Give the exact same joystick inputs to a Fw-190 and you'll be upside down and halfway towards the earth before you know it.

 

You ignore that I've explained that you can address this by giving the flight model to the model.

Posted

What happened? Did you take a pause before the next chapter? The whole community is waiting for the book to be published.

Back to writing please, we miss you.

AEthelraedUnraed
Posted
15 hours ago, IckyATLAS said:

What happened? Did you take a pause before the next chapter? The whole community is waiting for the book to be published.

Back to writing please, we miss you.

Next chapter incoming!

 

On 3/14/2024 at 1:07 AM, Aapje said:

Also, you did agree with me that you'd need quite a bit of data. By starting early, you have more time to collect data.

This is true. If they ever plan to do anything like this, it never hurts to start gathering already.

 

On 3/14/2024 at 1:07 AM, Aapje said:

Yes and no. It's fairly easy to randomize the maneuvers. However, it is very hard to simulate realistic variations, where the different maneuvers have different error rates and where pilots make more errors in certain situations (cognitive load, solar blinding, multiple opponents, etc, etc).

 

The logical result of having a procedural AI where every extra nuance and variation has to be programmed by hand, is that a procedural AI will always remain fairly shallow, since realistic complexity is just too much work. There is a crossover point where achieving a similar level of complexity procedurally becomes more expensive than using a machine-learning AI. But you are correct that this crossover point may be at a price point that 1CGS cannot afford anyway.

You're correct about this "crossover point". However, I not only doubt that 1CGS can afford training a Neural Net; I also doubt the claim that a realistic AI can't be programmed "by hand".

 

On 3/14/2024 at 1:07 AM, Aapje said:

I'm not convinced this is true.

I've just given you a very lengthy explanation why it is, which you entirely fail to address in your post. At least enlighten me why you think it isn't true. (Although I guess strictly speaking what I said was in response to a slightly different issue, it equally applies to this one.)

 

Mathematically speaking, training a NN is just regression; "curve fitting" one might say. So I'll give a very simple mathematical example to illustrate the issue. Let's say you've got two actions that the human brain performs, in this case flying and decision making. Our fictitious decision-making transfer function f() has a single output y (the intended manoeuvre) and two input parameters x1 and x2 (it doesn't really matter what they denote here). Our flying transfer function g() has an output w (the flight controls) and inputs z1 and z2. Let's say z1 is the intended manoeuvre while z2 is, for example, the flight model.

y = f(x1, x2)
w = g(z1, z2); z1=y

 

So far so good, right? You calculate the intended manoeuvre, then input this into g() along with some additional data to arrive at the desired flight control position.

However what you're saying is that one should merge these two functions into a combined transfer function (let's call it h()) with inputs x1, x2, and z2 (we don't need y/z1 since it's now an internal variable).

w = h(x1, x2, z2)

 

By definition, this must be equal to the above function with f() in place of z1:

w = h(x1, x2, z2) = g(f(x1, x2), z2)

 

Suppose that we know that f() and g() are second-order polynomials (quadratic functions). Meaning f() can be written as a*x1^2 + b*x2 + c, and g() as d*z1^2 + e*z2 + f. Now, we want to figure out what a, b, c, d, e and f are here (this is what training your NN does). Let's write out h() (or rather, let WolframAlpha do it for us):

h(x1, x2, z2) = g(f(x1, x2), z2)
= d*c^2 + d*2*a*c*x1^2 + d*a^2*x1^4 + d*2*b*c*x2 + d*2*a*b*x1^2*x2 + d*b^2*x2^2 + e*z2 + f

 

We've now got a complete mess of a fourth-order polynomial! Now what do you suppose is easier to fit accurately - two quadratic functions with two inputs and three parameters each, or one quartic function with three inputs and six parameters? Of course, the quadratic functions are much easier to fit. That means that in order to fit the quartic function equally well, we'd need more data and more time. Given that usually at least one of those is limited by real-world considerations, in practice this means worse results.
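For anyone who wants to double-check the algebra, here is a quick sympy check (illustrative only; the terms may print in a different order):

```python
# Verify the expansion of the combined function h() = g(f(x1, x2), z2).
import sympy as sp

x1, x2, z1, z2, a, b, c, d, e, f = sp.symbols("x1 x2 z1 z2 a b c d e f")
f_expr = a * x1**2 + b * x2 + c          # decision-making: y = f(x1, x2)
g_expr = d * z1**2 + e * z2 + f          # flying: w = g(z1, z2)
h_expr = g_expr.subs(z1, f_expr)         # the combined function h(x1, x2, z2)
print(sp.expand(h_expr))
# a**2*d*x1**4 + 2*a*b*d*x1**2*x2 + 2*a*c*d*x1**2 + b**2*d*x2**2 + 2*b*c*d*x2 + c**2*d + e*z2 + f
```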

 

However, this isn't all. Because quadratic functions (which I admit I have chosen for this exact reason) are convex, we can prove that SGD (stochastic gradient descent; an optimisation function at the heart of NNs) will converge to the global optimum. So we know that the trained AI makes the absolute best decision given the situation, then applies the absolute best controls to fly the chosen manoeuvre. On the other hand, a fourth-order polynomial is not convex. Besides its global minimum, it may have an additional local minimum. If we use SGD, there is a 50% chance we end up in this local minimum. So by doing our training on the combined version of the two transfer functions, we have not only made the training itself harder, but we now have a 50% chance that our solution isn't even the optimal one!

 

Real transfer functions don't usually consist of second-order polynomials, of course, but that doesn't make a difference for the outcome. Adding complexity in any form will require more data and longer training (which leads to worse results if limitations exist on either of those), while reducing the chance of ending up in a good local minimum.

 

 

For one additional "illustration", I'd like to draw your attention to nature. Evolution, which has a tendency to come to very great solutions, agrees that specialisation is the way to go. Our visual cortex is separate from the auditory cortex, which is separate from the primary motor cortex, etc. etc. If it was a better idea to just put hearing, vision and movement together in a massive function, I'm sure our brain would look very different :)

 

On 3/14/2024 at 1:07 AM, Aapje said:

It is a given that people in MP servers fly differently, or we would not see such differing results.

 

[...]

 

Yes, but in reality we are talking about variations that already show up in the data set, not pilots that ride on horses, rather than fly planes. For example, even if the average pilot will be more aggressive in MP than real life pilots were, because they could really die, that doesn't mean that you won't have MP pilots that are much more aggressive than average and those that are far less aggressive.

You've ignored the second part of the phrase you quoted: "as well as a way to differentiate these people from the outset." Being able to generate certain behaviours (e.g. "aggressiveness") isn't enough - we need to generate those behaviours on demand. If we don't tell the AI already during training whether or not a pilot exhibits "aggressive" or "historical" behaviour, the AI will never be able to differentiate between all possible "behaviour types".

 

On 3/14/2024 at 1:07 AM, Aapje said:

You ignore that I've explained that you can address this by giving the flight model to the model.

So train an AI to move the flight controls because that gives more realistic results than a procedural AI, but then have the entirety of how the AI moves the flight controls depend on very simplified parameters like "stall speed"? That just doesn't make any sense. Then you're just as well off just procedurally programming the whole thing.

 

Unless of course you intend to give the AI the *entire* flight model as input rather than some simplified parameters. But then you're adding massive complexity to the model, which I've just shown is very detrimental to the eventual quality.

Posted (edited)
1 hour ago, AEthelraedUnraed said:

but we now have a 50% chance that our solution isn't even the optimal one!

 

No real pilot flies optimally, so unless the goal is to have an 'unbeatable' setting (in the same plane, at least), I don't see why this is even a question. I also believe that even with relatively limited training, the AI will be much better than what we have now, where AI pilots just fly into each other when they aren't even in a dogfight.

 

But the issue in my eyes is that you simplify things way too much, to a point where the AI will be robotic and the player will sense that the AI is robotic. For example, better pilots will use rudder more often and in general, will coordinate their various actions better.

 

If you want to mimic realistic differences in how pilots fly maneuvers based on skill level and other attributes, it seems to me that your solution needs a maneuver-to-control-input mapping per skill level and per plane. So then you have to tune each one, which is a lot of work, and still results in a subpar solution. Because even then you run into the issue that (good) pilots will not just execute a maneuver identically or for x seconds. And they will do things differently based on their plane, their airspeed, their height, etc.

 

So to do this right, your maneuver-to-control-input mapping will need to be some very complicated procedural code that generates different results based on the AI skill level, the plane characteristics, the state of the plane itself, the state of the other planes, the environment, etc.

 

How else will you ever get the AI to perform maneuvers in a somewhat realistic way if you do as you suggest, where the AI merely tells you: 'do scissors now'?

 

Now, it may be or may not be true that my solution is unfeasible in the short term, but I don't think you recognize the huge problems that your solution has. For example, your solution is far more likely to result in things like planes flying into each other, because a real pilot is often smart enough to realize that he is scissoring straight into the enemy airplane and will for example pull up while doing the scissor.

 

So then your solution may be 85% 'optimal' where my solution is 80% 'optimal' with the same training, but optimal in this context is subjective to the limitations of the neural model you chose in the first place and has no clear relationship with whether the gamer perceives the pilot as being good. Then in reality, your 'more optimal' model is probably going to make way more completely shitty choices like flying straight into the enemy with no attempt to evade, which is a way worse mistake than using a little less or more rudder than a real pilot would probably do. Because my AI would usually still attempt to evade and the mistakes would thus seem way more human-like.

 

1 hour ago, AEthelraedUnraed said:

You've ignored the second part of the phrase you quoted: "as well as a way to differentiate these people from the outset." Being able to generate certain behaviours (e.g. "aggressiveness") isn't enough - we need to generate those behaviours on demand. If we don't tell the AI already during training whether or not a pilot exhibits "aggressive" or "historical" behaviour, the AI will never be able to differentiate between all possible "behaviour types".

 

Yes, that's why you provide extra parameters to the model that you either keep track of for the pilot (like the kill/death ratio, kills/hour and deaths/hour), or things that you extract from the flying data (which requires preprocessing). I've already addressed this.
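As a hedged sketch of what I mean (all feature names here are invented): the per-pilot statistics simply become extra conditioning inputs alongside the per-frame situation, so the same trained model could be steered at inference time.

```python
# Hypothetical sketch: condition the model on per-pilot statistics so behaviour
# like "aggressiveness" can be dialled in when the mission is generated.
import numpy as np

def build_input(frame_features: np.ndarray, pilot_stats: dict) -> np.ndarray:
    """Concatenate the per-frame situation with per-pilot conditioning features."""
    conditioning = np.array([
        pilot_stats["kills_per_hour"],
        pilot_stats["deaths_per_hour"],
        pilot_stats["kill_death_ratio"],
    ])
    return np.concatenate([frame_features, conditioning])

# During training, the stats come from the logged MP pilot; at inference time the
# mission designer could set them to request a more cautious or more aggressive AI.
frame = np.random.default_rng(0).normal(size=32)   # placeholder situation vector
cautious = build_input(frame, {"kills_per_hour": 0.2, "deaths_per_hour": 0.1, "kill_death_ratio": 2.0})
aggressive = build_input(frame, {"kills_per_hour": 1.5, "deaths_per_hour": 1.2, "kill_death_ratio": 1.25})
```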

 

1 hour ago, AEthelraedUnraed said:

So train an AI to move the flight controls because that gives more realistic results than a procedural AI, but then have the entirety of how the AI moves the flight controls depend on very simplified parameters like "stall speed"? That just doesn't make any sense. Then you're just as well off just procedurally programming the whole thing.

 

I'm not really sure why it doesn't make sense to base the AI off the same parameters that the FM uses. If the FM is realistic enough, then why wouldn't an AI that uses the same parameters be just as realistic when it comes to behaviors that depend on the flight model?

 

And the reason to not procedurally program the thing is that everything is interconnected. A real pilot doesn't decide that a certain maneuver is right separate from the flight models of the involved airplanes, but he chooses to one-circle or two-circle or scissor, etc based on the FMs. And similarly, once a pilot decided to do a certain maneuver, he doesn't do the maneuver the same way in each aircraft and against each opponent.

 

You create this completely artificial distinction between tactics and executing the maneuvers, where you keep ignoring that they depend on the same data and that they constantly interact with each other. A poorly executed maneuver means that the pilot may need to choose a different maneuver to try to save the situation.

 

1 hour ago, AEthelraedUnraed said:

Unless of course you intend to give the AI the *entire* flight model as input rather than some simplified parameters. But then you're adding massive complexity to the model, which I've just shown is very detrimental to the eventual quality.

 

If you put the parameters to the current flight model into the AI, then it will learn the flight model, if you train it on real pilot data.

Edited by Aapje
AEthelraedUnraed
Posted
5 minutes ago, Aapje said:

No real pilot flies optimally,

I don't think you fully understand how training a neural network functions mathematically.

 

"Optimal" in my post never refers to any specific behaviour. It refers to how well our calculated transfer function is able to approach the real transfer function. The "real transfer function" being the hypothetical one that is able to exhibit exactly the behaviour you want, at exactly the right moments. Whatever "the behaviour you want" entails isn't really important here, but will include making and executing decisions and manoeuvres without any difference to how a real human would do it. That may not be achievable, but that doesn't matter since this is after all a hypothetical function. We are however able to evaluate the output of this "hypothetical function" when given a certain input (this input/output relationship being our training data). Therefore we are able to approach this function with some mathematical model we design (the neural net).

 

As I already explained in my post, your method is mathematically no different from my method. They are identical. If both are properly trained towards the global optimum, one does not have any different behaviour from the other in whatever way, whether that's "flying into other planes," "robotic" behaviour or "identical manoeuvres."

 

The only difference between the two methods is where you do the function fitting ("training"). I use two separate functions that I fit separately (f and g in my example), whereas you fit the combined function (h). As I explained, because SGD (and similar methods) is a numerical rather than an exact optimisation method,* in your case you won't be able to approach the "hypothetical transfer function" as well as I can. For the reasons why, I refer to my previous post.

 

* Remember that exact methods to solve a NN optimisation problem do in general not exist! If they did, we didn't need to go through the lengthy process of training and could instead just calculate all the parameters of a Neural Network.

 

Possible effects of a "suboptimal" fit include less realistic behaviour and/or erratic choices. So the method that is able to more closely approach the "hypothetical transfer function" will be the method that exhibits the most human-like behaviour.

 

Again, please consider evolution. A human uses different parts of the brain to decide what manoeuvre to fly and to do the actual flying. Do you really think this would've been the case if using the same "brain part" to do both was the better solution?

 

50 minutes ago, Aapje said:

Yes, that's why you provide extra parameters to the model that you either keep track of for the pilot (like the kill/death ratio, kills/hour and deaths/hour), or things that you extract from the flying data (which requires preprocessing). I've already addressed this.

...And now we're back at my earlier argument that doing so complicates training. Which as I have shown very comprehensively now and which you still haven't addressed, leads to worse results.

 

59 minutes ago, Aapje said:

I'm not really sure why it doesn't make sense to base the AI off the same parameters that the FM uses. If the FM is realistic enough, then why wouldn't an AI that uses the same parameters be just as realistic when it comes to behaviors that depend on the flight model?

The problem is that the FM is much more complicated than just some parameters like "stall speed" or "critical AoA". One part of your wing might be in a stall while 10cm to the left of it there's airflow as normal. So that's why providing the AI with "simple" parameters doesn't work as well. The AI would need to know everything from the shape of the aircraft to the amount of fuel in the left-wing fuel tank.

 

Providing the AI with the entire FM would work, theoretically. But given how complicated a FM is, you're now introducing a gazillion extra input parameters to your transfer function. Which makes training more difficult, which is a very bad idea as I've already covered many times now.

 

1 hour ago, Aapje said:

You create this completely artificial distinction between tactics and executing the maneuvers, where you keep ignoring that they depend on the same data and that they constantly interact with each other. A poorly executed maneuver means that the pilot may need to choose a different maneuver to try to save the situation.

Yes they do partly depend on the same input data, and the results of the execution of the manoeuvre do influence decision making. I've never said anything to the contrary. In fact, I don't remember saying anything at all about which input data you should or should not choose.

 

All I'm saying is that making this distinction facilitates training your NN which therefore leads to better results. And this holds regardless of which input data you do or do not use.

 

Furthermore, I argue that this distinction isn't artificial at all since this distinction already exists in our brain.

 

1 hour ago, Aapje said:

If you put the parameters to the current flight model into the AI, then it will learn the flight model, if you train it on real pilot data.

Only if you provide the full flight model. Which I have already covered above is a bad idea.

 

Saying this is still true for simplified parameters, is analogous to stating that one can fit a quadratic function using a straight line.

Posted
19 minutes ago, AEthelraedUnraed said:

Whatever "the behaviour you want" entails isn't really important here, but will include making and executing decisions and manoeuvres without any difference to how a real human would do it.

 

Yes, but your choice is to not output the actual human behavior (the control inputs), but an abstract simplification (the maneuvers). So then you introduce the problem of translating those maneuvers into actual control inputs.

 

The issue is that your decision to not output the control inputs means that you inherently limit yourself to either rather unrealistic behavior, or a very complicated mapping of maneuvers to control inputs, where you effectively have to redo part of the work that you did for generating the maneuvers in the first place.

 

And because you do not in fact output the same thing that a real human outputs, you also introduce the problem of how to score your model against human behavior. You have to then either measure the performance of the combination of the machine learning AI that outputs the maneuvers and the code that translates this to control inputs, or you have to grade the maneuvers itself somehow.

 

The issue with the first solution is that you effectively have to build out the entire system fully first and cannot test the two parts separately, which is a very bad idea from a programming point of view. What part of your system has failed if the final control outputs are wrong? Did it output the wrong maneuvers? Is the translation wrong? A bit of both? You can't really know how to fix things if you can't judge which part is not working correctly.

 

A system you can't debug, and thus can't fix, is trash.

 

And if you grade the maneuvers on their own, then how do you know that your grading is accurate? You are almost certainly going to have to depend on your preprocessed data that you used for training, but if you have bugs in the preprocessing that cause you to determine the maneuvers wrongly or too simplistically, then you will use the exact buggy data to grade the system. So the system will pass inspection if it outputs the same garbage that you put in. But then you actually try to fly with the model and find out that it is garbage. So then you have to fix the preprocessing, run the preprocessing again over all the data, then start over completely with the training and then fly again. Rinse and repeat.

 

In my opinion, there is no way that outputting maneuvers can achieve the same quality as outputting control inputs. If you grade your entire system based on whether the final control inputs have the smallest loss function and thus are most similar on average to the real pilots, you are going to find structural problems where your translation layer from maneuver to actual control inputs will fail to consider the very same data that you need to determine the proper maneuver in the first place. You cannot fix these issues without reintroducing a lot of complexity that you removed, and then you still have all the complexity that you added to remove the other complexity, that you put back in!

 

19 minutes ago, AEthelraedUnraed said:

Again, please consider evolution. A human uses different parts of the brain to decide what manoeuvre to fly and to do the actual flying.

 

I do not believe that this is how the brain works. Brain research suggests that some decision making is in fact a post-action rationalization of subconscious behavior that gets perceived as a conscious choice. Also, I believe that there is constant mutual feedback between the mostly subconscious operation of the controls and conscious decision making.

 

This is part of the reason why force feedback and motion platforms improve not just the sensation of flying, but also allow for better decision making.

 

So if you implement your system with a one-directional flow, where the decision making precedes operation of the controls, you are not in fact mimicking nature. So your analogy seems rather flawed.

 

My solution can partially handle the feedback loop problem, because the data that gets fed into the system is going to include things that a human pilot would perceive a bit later. This will then both impact the decision making and the actual flying, while your model cannot do this.

 

19 minutes ago, AEthelraedUnraed said:

...And now we're back at my earlier argument that doing so complicates training.

 

Yes and no. My model would indeed have to deal with more parameters and more outputs, but your choice comes at the expense of having to do way more and very complicated preprocessing, having to do more and very complicated post-processing and making it much harder to check how well your system works. And I believe that your solution means that you will forever suffer from significant quality deficiencies. I think that it is far from obvious that your choice would result in an overall system that is cheaper to make, quicker to get working and would offer superior quality with the same resources (for example, a similar load on the CPU).

 

This debate between us seems very similar to the debate between those who attempt to create strong AI by trying to write a lot of clever code that they think is able to interpret or generate human behavior, versus the Jeremy Clarkson method (POWERRRRRRRRRRRRRRRR).

 

So far, the Jeremys of the world seem to be winning. And this is not just relevant to what the best approach is, but also to what hardware and software will become available to train and execute large models with lots of parameters ever more efficiently. And in the world of large AI models, even with all the parameters I suggest, we would still be very, very far from truly large AI models. So even a relatively modest NPU could do wonders.

 

19 minutes ago, AEthelraedUnraed said:

Which as I have shown very comprehensively now and which you still haven't addressed, leads to worse results.

 

Only if you handwave away the fact that you have simply excised a lot of that complexity from your proposal and have never actually explained how you plan to make that work. Of course, if you pretend that you can do some actually very complicated parts perfectly and very easily, you can argue that your solution would be superior in theory. But your theory is flawed, as it often is. Einstein also noticed how often this is the case when he said: "In theory, theory and practice are the same. In practice, they are not."

 

How do you plan to make your translation of the maneuvers to the actual control inputs take the environment into account? The status and FM of your own plane? Of the other planes? How will you model the impact of stress on how well the maneuvers are executed? How will you model the impact of being blinded by the sun? How will you model losses of spatial awareness, loss of sight, misjudgements of the behavior of the opponent, etc. that cause the maneuvers to be executed differently?

 

19 minutes ago, AEthelraedUnraed said:

The problem is that the FM is much more complicated than just some parameters like "stall speed" or "critical AoA". One part of your wing might be in a stall while 10cm to the left of it there's airflow as normal. So that's why providing the AI with "simple" parameters doesn't work as well. The AI would need to know everything from the shape of the aircraft to the amount of fuel in the left-wing fuel tank.

 

All of this stuff is already in IL-2 and is simulated for AI planes. So why would this suddenly be overly complicated if we make the AI do it?

 

19 minutes ago, AEthelraedUnraed said:

All I'm saying is that making this distinction facilitates training your NN which therefore leads to better results. And this holds regardless of which input data you do or do not use.

 

And yet they put a trillion parameters into ChatGPT 4 and it works out. A combat flight AI would be way, way simpler and you could still feed it with quite a lot of data.

Posted
On 3/14/2024 at 11:04 PM, Aapje said:

@IckyATLAS

 

We are waiting for you to actually build the thing...slacker

 

;)

OK, once the debate ends I will try to group it all in a document.

Posted
On 3/16/2024 at 9:33 AM, IckyATLAS said:

OK, once the debate ends I will try to group it all in a document.

Please do, and then share with us.

This is a very fine discussion thread.

 

---

 

Forgive me if I have missed it, but... was it already suggested to break "the AI" into multiple smaller AIs, each using a different method? Each one would handle a different aspect of the pilot's job, and the quality of doing it. It could be cheaper (and perhaps easier; not sure about it not taking longer*) to implement than one single AI, using one single method, to make it all work.
 

For example: one AI just for piloting one single plane, with varying degrees of quality, perhaps regulated by how many cycles the training was done for each class of piloting skill (a recruit would have been trained for fewer cycles than an ace at any given plane).

 

Then, one AI for combat skill with that plane (using each class of piloting skill from the previous AI).

Then, one AI for coordinated flight with other pilots. And then another AI for various squads, with or without different roles and planes.

And so on, and so on.

 

And, for different nations in the conflict, enforcing different rules with different rewards regarding formations, tactics, strategies.

 

Heavily inspired by the AI&Games video on Alien: Isolation previously posted here, and this other video:

 

 

---

*Maybe recruiting select members of the community to help would alleviate the work

 

Posted
4 hours ago, Araosam said:

For example: one AI just for piloting one single plane, with varying degrees of quality, perhaps regulated by how many cycles the training was done for each class of piloting skill (recruit would have been trained for less cycles than an ace at any given plane).

 

Then, one AI for combat skill with that plane (using each class of piloting skill from the previous AI).

 

Why and how do you think that pilot quality and combat skill are different?

 

Also, I want to point out that a less skilled pilot doesn't necessarily do the same things as a better pilot, more poorly, but can use different tactics. For example, a common complaint by the better pilots was that the less skilled pilots would shoot from too far out.

 

4 hours ago, Araosam said:

Then, one AI for coordinated flight with other pilots. And then another AI for various squads, with or without different roles and planes.

And so on, and so on.

 

Yes, we've discussed separating out decisions like who/what to attack. When to escape for home. What route to fly. Etc.

 

A possibility is then to supply these decisions to the AI model that flies the plane, as an input parameter/goal. However, a complicating factor is then how you teach the AI what behavior belongs to what goal.

Posted
1 hour ago, Aapje said:

 

Why and how do you think that pilot quality and combat skill are different?

 

 

Because, to my understanding, how well one controls the machine is not related to how well one perceives the situation they are in and plans/acts accordingly, although they live together within the pilot.

An Ace would probably perform well in both, a rookie not so much in one, the other, or both.

 

So, by having different aspects of the pilot's job controlled by different AIs, I do think (pure conjecture on my part, mind you) it would be simpler to achieve something more akin to what we would expect a pilot to do. At least, simpler than cramming everything into one single AI model (although I admit that that too could be possible), or simpler to do fine adjustments.

 

1 hour ago, Aapje said:

Also, I want to point out that a less skilled pilot doesn't necessarily do the same things as a better pilot, more poorly, but can use different tactics. For example, a common complaint by the better pilots was that the less skilled pilots would shoot from too far out.

I do agree with you here, and think we should question why that was the case. Were the different tactics because they had poorer combat skill, poorer piloting skill, or both? Another characteristic altogether?
 Or perhaps fear? (Another variable to consider!)

 

1 hour ago, Aapje said:

A possibility is then to supply these decisions to the AI model that flies the plane, as an input parameter/goal. However, a complicating factor is then how you teach the AI what behavior belongs to what goal.

 

Maybe giving them some kind of personality could solve this.
The "Rogue" would be more willing to stay out of formation to rack in kills.

The "Paladin" would do everything by the book and obey every order given to him.

The "Oportunist" would favor maintaining energy advantage unless there is a sure prize.

 

All of these should be multiplying factors in the weights/prizes related to each goal.

 

 

 

Posted
1 hour ago, Araosam said:

Because, to my understanding, how well one controls the machine is not related to how well one perceives the situation they are in and plans/acts accordingly, although they live together within the pilot.

 

I think that these are too closely related to be separated. AFAIK, science tends to find that a lot of supposed perception and planning is really often subconscious behavior with perception and 'planning' only happening after taking an action. So even the human mind seems to blur these, probably to produce a coherent sense of self.

 

IMO, one of the biggest benefits of training is moving behavior from the slow conscious part of the brain to the much faster subconscious part, which means that the part done by the subconscious AI would have to differ based on the ability at that level, but then without clashing with the other AI potentially trying to do the same thing on the conscious level.

 

1 hour ago, Araosam said:

Or perhaps fear? (Another variable to consider!)

 

Fear is the mind killer. Oh wait, wrong genre.

 

Yes and this is a hard thing to mimic as virtual pilots have to worry more about back problems than getting killed.

 

1 hour ago, Araosam said:

Maybe giving them some kind of personality could solve this.
The "Rogue" would be more willing to stay out of formation to rack in kills.

The "Paladin" would do everything by the book and obey every order given to him.

The "Opportunist" would favor maintaining energy advantage unless there is a sure prize.

 

All of these should be multiplying factors in the weights/prizes related to each goal.

 

My idea for a mimicking AI would attempt to do this by finding some measurements for real pilots that match up with such behaviors.

 

Although testimony suggests that fighter pilots were not that good at Paladin behavior.

Posted
1 hour ago, Aapje said:

I think that these are too closely related to be separated. AFAIK, science tends to find that a lot of supposed perception and planning is really often subconscious behavior with perception and 'planning' only happening after taking an action. So even the human mind seems to blur these, probably to produce a coherent sense of self.

 

IMO, one of the biggest benefits of training is moving behavior from the slow conscious part of the brain to the much faster subconscious part, which means that the part done by the subconscious AI would have to differ based on the ability at that level, but then without clashing with the other AI potentially trying to do the same thing on the conscious level.

In regards to the human mind, sure! Much like driving, once you 'get in the zone'.
Maybe this is doable with some sort of deep learning, but it seems rather excessive to use such a model in the context of an addon/revamp for this sim. (That said, it would be awesome nonetheless.)

 

1 hour ago, Aapje said:

Fear is the mind killer. Oh wait, wrong genre.

 

Yes and this is a hard thing to mimic as virtual pilots have to worry more about back problems than getting killed.

heh!?

 

1 hour ago, Aapje said:

My idea for a mimicking AI would attempt to do this by finding some measurements for real pilots that match up with such behaviors.

 

Although testimony suggests that fighter pilots were not that good at Paladin behavior.

Real pilots as in the ones from the 1940s? Or virtual pilots?

I do think that the latter ones would be even less adept at the 'paladin' personality.

Posted
9 hours ago, Araosam said:

Real pilots as in the ones from the 1940s? Or virtual pilots?

I do think that the latter ones would be even less adept at the 'paladin' personality.

 

The only way to train the system with human pilots is to use modern pilots playing games, since real dogfights with warbirds are no longer a thing and we have no pilots who can fly in a way that was truly realistic to WW II.

 

Attempting to truly mimic the WW II pilots would probably require some sort of hybrid approach, but we don't have the technology to do that, AFAIK.

Posted

Not that this conversation hasn't been discussed before but here we are.

 

1.  I am assuming the AI queries are based on QMB flights - my experience has been that there is a difference  between AI in QMB and in Career flights.

 

2. The AI in IL-2 uses the same flight models and engine as we do, and also pilot physiology. This is based on data from WW2 and takes into consideration the issues pilots of that era suffered.

 

We cannot assume WW2 pilots (whom the AI is trying to replicate) were of the same level of competence as sim pilots nigh on a century later. We also are not being subjected physically to the combat conditions that we are flying in simulation.

Anyway, for me the AI in Career does a reasonable job, and a better option would be an improved communication system for coordinating with the AI. It would be nice to tell wingmen to stick close or drag, with other options like telling individual planes in the flight to bug out, and maybe have them and a wingman fly home depending on the frontline situation. Maybe we see this in the next version of the sim - Korea.

 

With AI in QMB scenarios, I always end up telling my flight to do as I do, to stop them flying off. Then I command them to attack once the objective is in range. Still not as good as in Career, but it helps keep them on task and not leave you solo against objectives.

 

As for LLM AI - interesting times - Claude etc.

1PL-Husar-1Esk
Posted (edited)

Interestingly, humans are good at writing the reward function when the task is simple, but when it's complicated and rooted in a physics-world simulation - as one imagines a flight sim is - AI agents are better than humans at generating reward functions that lead to the desired results (an AI that learns to correctly perform the given task).

 


 

Interesting read

 

https://eureka-research.github.io/

 

If anyone is interested, today at GTC:

Generally Capable Agents in Open-Ended Worlds [S62816] - 
Tuesday, Mar 19, 4:00 PM - 4:50 PM CET

Edited by 1PL-Husar-1Esk
Posted
2 hours ago, 1PL-Husar-1Esk said:

Interestingly, humans are good at writing the reward function when the task is simple, but when it's complicated and rooted in a physics-world simulation - as one imagines a flight sim is - AI agents are better than humans at generating reward functions that lead to the desired results (an AI that learns to correctly perform the given task).

 

Yes, although here the reward function is even harder to judge, since the goal is not to perform as well as possible, but to perform in a realistic or fun way, depending on the goal of the human player who faces the AI.

Posted
6 hours ago, Aapje said:

Attempting to truly mimic the WW II pilots would probably require some sort of hybrid approach, but we don't have the technology to do that, AFAIK.

 Training on event servers could do the trick, no?
Not at all perfect, but with more people engaged in realistic behaviour, following orders, formations and such.

 

And since training material for this is scarce, having the independent AI modules try to achieve what is seen/collected on these servers could prove a good middle ground between cost and quality for the new AI.
